BigML’s Winter 2017 Release is here! Join us on Tuesday, March 21, at 10:00 AM PDT (Portland, Oregon. GMT -07:00) / 06:00 PM CET (Valencia, Spain. GMT +01:00) for a FREE live webinar to discover the enhanced version of BigML! We’ll be announcing BigML’s Boosted Trees, the third ensemble-based strategy that BigML provides to help you easily solve your classification and regression problems.
Together with Bagging and Random Decision Forests, Boosted Trees make for a powerful tool for both the BigML Dashboard and our REST API. With Boosted Trees, tree outputs are additive rather than averaged (or decided by majority vote). Individual trees in a Boosted Tree differ from trees in bagged or random forest ensembles, since they do not try to predict the objective field directly. Instead, they try to fit a ‘gradient’ to correct mistakes made in previous iterations. This unique technique, where each tree improves on the imperfect predictions of the previously grown tree, lets you predict both categorical and numeric fields.
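To make the "additive rather than averaged" idea concrete, here is a minimal sketch of gradient boosting for a numeric objective. It is an illustration only, not BigML's implementation: it assumes squared-error loss (where the gradient to fit is simply the residual), uses one-split "stumps" instead of full trees, and all function names are hypothetical.

```python
# Illustrative gradient-boosting sketch (not BigML's implementation).
# Assumptions: squared-error loss, a single numeric input field, and
# one-split stumps standing in for full decision trees.

def fit_stump(xs, residuals):
    """Find the single split that best fits the residuals.

    Returns (threshold, left_value, right_value).
    Assumes xs contains at least two distinct values.
    """
    best = None
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    return best[1:]


def stump_output(stump, x):
    threshold, left_value, right_value = stump
    return left_value if x <= threshold else right_value


def fit_boosted(xs, ys, n_trees=50, learning_rate=0.3):
    """Grow stumps one at a time, each fitted to the current residuals."""
    base = sum(ys) / len(ys)  # start from the mean of the objective
    stumps = []
    predictions = [base] * len(xs)
    for _ in range(n_trees):
        # For squared error, the negative gradient is the residual.
        residuals = [y - p for y, p in zip(ys, predictions)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        # Outputs are added to the running prediction, not averaged.
        predictions = [p + learning_rate * stump_output(stump, x)
                       for p, x in zip(predictions, xs)]
    return base, stumps, learning_rate


def predict(model, x):
    base, stumps, learning_rate = model
    return base + learning_rate * sum(stump_output(s, x) for s in stumps)
```

Calling `fit_boosted(xs, ys)` on a small list of inputs and targets, then `predict(model, x)`, shows how each stump only corrects what the previous ones got wrong, and how the final prediction is the sum of all those corrections.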
This latest addition to BigML’s toolset is visualized with a Partial Dependence Plot (PDP) chart, a graphic representation of the marginal impact of a set of variables (input fields) on the ensemble predictions irrespective of the rest of the input variables. It is a common method for visualizing and interpreting the marginal impact of the variables on ensemble predictions as well as their interactions with the rest of the input fields. BigML’s Boosted Trees will also contain an importance attribute that lists each field’s importance in the same format used by the rest of our models and ensemble types. This option allows you to inspect and analyze the features that matter most when predicting the objective field.
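For readers who want to see what a partial dependence curve computes, the following is a rough sketch for a single input field. It assumes a generic `predict` function that takes a row as a dictionary; that function, the field names, and the helper name are hypothetical and are not part of the BigML API.

```python
# Illustrative partial dependence sketch for one input field.
# `predict` is any function mapping a row (dict) to a numeric prediction;
# it is a hypothetical stand-in for a trained ensemble.

def partial_dependence(predict, rows, field, grid):
    """Average the model's prediction with `field` forced to each grid value.

    The other fields keep their observed values, so the averages show the
    marginal effect of `field` on the predictions.
    """
    curve = []
    for value in grid:
        total = 0.0
        for row in rows:
            modified = dict(row)      # copy so the original row is untouched
            modified[field] = value   # override only the field of interest
            total += predict(modified)
        curve.append((value, total / len(rows)))
    return curve
```

Plotting the returned `(value, average prediction)` pairs gives a one-field curve. Repeating the same idea over a grid of two fields gives a heat-map style view of their joint effect.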
Just like the other BigML supervised learning models, Boosted Trees offer Single Predictions to predict a given single instance and Batch Predictions to predict multiple instances simultaneously. And now all our classification ensembles, from single trees to Boosted Trees, will return not just a single class along with its confidence, but also a set of probabilities for the rest of the classes in the objective field. What is more, each class probability will be shown in the predictions histogram.
Would you like to find out exactly how Boosted Trees work? Join us on Tuesday, March 21, at 10:00 AM PDT (Portland, Oregon. GMT -07:00) / 06:00 PM CET (Valencia, Spain. GMT +01:00). Be sure to reserve your FREE spot today as space is limited! Following our tradition, we will also be giving away BigML t-shirts to those who submit questions during the webinar. Don’t forget to request yours!
PreSeries, the joint venture between Telefónica Open Future_ and BigML, staged the fourth edition of the Artificial Intelligence Startup Battle yesterday (Tuesday, February 28). The event took place at the main stage of the 4 Years From Now (4YFN) event, the startup focused platform of the Mobile World Congress that enables investors and corporations to connect with successful entrepreneurs to launch new ventures together. More than 500 attendees witnessed this unique battle, where no humans were involved in assessing the contestants. Instead, the Machine Learning algorithm of PreSeries chose the winner.
This fourth edition followed in the footsteps of the previous AI Startup Battles: the first held in Valencia on March 15, 2016, the second in Boston on October 12, 2016, and the third in Sao Paulo on December 9, 2016. In each battle, PreSeries’ algorithm asks each contender a number of dynamically selected questions in order to provide a score between 0 and 100. The startup with the highest score won the contest, as the system deemed it the most likely to succeed in the future. The predictions are based on historical data from more than 350,000 companies from around the world.
Pixoneye, the winner of the Artificial Intelligence Startup Battle at the 4 Years From Now conference. Ana Segurado, the Global Manager of Telefónica Open Future_, presents the award (left) to Pixoneye’s Erin Bronstein (right).
With a score of 96.63, the winner was announced as Pixoneye. Pixoneye is based in London and Tel Aviv, and offers the ability to analyze the untapped power of mobile users’ photo galleries on behalf of their clients. The second place finisher, with a score of 94.00, was Action.ai, an English company that develops chatbots that revolutionise customer interactions for businesses. Third place, with 67.82 points, went to people.io of London. people.io gives people ownership of their data to enable the next phase in the evolution of human connectivity. Finally, the fourth placed contestant (with a score of 61.23) was Descifra of Mexico, which helps businesses understand the characteristics of the markets around them through easy-to-understand charts, tables, and maps.
As in previous battles, the audience enthusiastically warmed up to the idea of an AI system judging the contestants after the pitch sessions and was excited to witness Pixoneye being crowned the latest AI Startup Battle winner.
Over the weekend, we saw an eventful 89th Academy Awards ceremony wrap up. Despite underwhelming viewership of the televised event, it will likely be remembered for a long time for the remarkable mishap at the end.
As for the performance of our predictions, we got 5 out of 8 right. While not phenomenal, it wasn’t such a bad performance given that this was our first stab at this domain. That said, it is important to see where and why we failed so we can improve next year.
Best Movie: Moonlight
The best movie prediction is hard to explain within the confines of our dataset. We surely were not the only ones who did not see this one coming, given that even PricewaterhouseCoopers fumbled.
We still feel that La La Land had all the ingredients that have historically given a film the win in this category. Perhaps (though there is no guarantee) if we had included the Independent Film Awards data, we might have predicted Moonlight as the winner with a slight edge. But the most plausible reason why we didn’t predict Moonlight is that we didn’t have any variables accounting for the socio-psychological aspects of the awards. The changes to the voting body of The Academy in response to campaigns like #OscarsSoWhite and the sustained criticism of the Academy’s conservatism may have indeed made just enough of a difference in the final decision between two well-deserving candidates.
Best Actor: Casey Affleck, Manchester by the Sea
Our best actor prediction miss can be explained by the fact that Casey Affleck and Denzel Washington split between them the prizes that historically have the highest predictive power. Denzel Washington won the Screen Actors Guild Award and was nominated for all the other awards considered by the model, while Affleck won the Golden Globes, BAFTA, Critics Choice, and Online Film & TV Association awards, among four others. Although Casey Affleck won many more awards, the Screen Actors Guild had a particularly high importance in all models. Two of our models with different weights for each of those prizes gave us different predictions. Both models had exactly the same evaluation performance (100% accuracy). Denzel Washington’s prediction had a higher confidence so we went with Denzel Washington, but we could have just as easily chosen Casey Affleck.
Best Adapted Screenplay: Moonlight
This was the most difficult category to predict, as it was hard to infer a general pattern that applied consistently over history. Arrival was nominated for the BAFTA (Lion won), which shows up as the most important variable, and it also won the Critics Choice and the Writers Guild awards. However, the eventual winner, Moonlight, was nominated for Best Original Screenplay at BAFTA, and not in this category. Go figure!
Finally, Moonlight sneaked in and pulled this one off from our pick of Arrival, perhaps as a result of the halo effect of the overall popular support for this year’s low budget wonder that could. We don’t have any variables that account for such correlations among the various categories, but it is less common to see extreme fragmentation of the awards in a given year, i.e., if one thinks a given movie is the best of the season, she’s more likely to attribute that to multiple related factors that together make for a good movie.
We’ll chew on these lessons for next year’s predictions. In the meantime, happy movie watching and Machine Learning modeling endeavors to all of you!
PreSeries, the joint venture between Telefónica Open Future_ and BigML, has been invited to join the exclusive Four Years From Now (4YFN) event. 4YFN is the startup focused platform of the Mobile World Congress that enables investors and corporations to connect with the chosen entrepreneurs to launch new ventures together. As part of this year’s 4YFN agenda, we will also be presenting the fourth edition of the Artificial Intelligence Startup Battle, which takes place on Tuesday, February 28, at the main stage of 4YFN in Barcelona, Spain. More than 500 technologists and decision makers will witness the power of the PreSeries Machine Learning algorithms, which predict the probability of success of any startup even in its early stages. There won’t be any humans involved in deciding the winner: PreSeries AI is the sole jury, as we have showcased in previous battles.
If you plan on going to 4YFN, do not hesitate to come by the PreSeries booth at the Telefónica Open Future_ stand, where we’ll be ready to answer all your questions regarding our technology, product, new features, etc., from today, Monday, February 27, until Wednesday, March 1.
Meet the contenders!
The four contenders will each give a four-minute pitch to the audience, followed by PreSeries asking them a number of questions in order to provide them with a score between 0 and 100. At this point, you may be wondering who will be competing in the battle, so let’s get to know the contenders!
Action.ai, from London, develops smart Chatbots that revolutionize interaction. They transform business processes or user experiences by changing the way people achieve tasks that meet their needs both in their personal and professional lives. Action.ai’s technology enables industry-leading services to be launched without huge Capex or expertise in AI or Chatbots.
Descifra, from Mexico City, is an online service that helps businesses to understand the characteristics of the markets around them through easy to grasp charts, tables, and maps. They advise other companies on where to build their business by taking into consideration the level of competition and the market characteristics in a certain geographic area.
people.io, also from London, gives people ownership of their data to enable the next phase in the evolution of human connectivity. Users of people.io earn credits each time they take an action such as answering a question or connecting a new data source like their email or bank accounts. As the type and amount of data associated with a given user increases, people.io starts matching users with relevant brands and advertisers. Each time a user is “matched”, they receive a brief update from the advertiser, which offers the user extra credits for viewing or engaging with their content. However, at no point in the whole process does a brand or advertiser get direct access to the users’ profile data.
Pixoneye, from London and Tel Aviv, is a unique company with the ability to analyze the untapped power of users’ mobile photo galleries on behalf of their clients. In doing so, the company provides best-in-market real-time behavioral understanding and targeting capabilities for their clients’ ads, offers and services. The Pixoneye team consists of some of the leading minds in computer vision and deep learning, and has recently been named one of the five AI companies to watch out for in 2017.
These four contesting startups will be showcased on Tuesday, February 28, at the main stage of 4YFN. For those who can’t make it to the live event, our subsequent blog posts will share the results of the fourth edition of the AI Startup Battle. Good luck to all contenders!
About PreSeries events and battles
PreSeries was born in March 2016 and officially started its journey of re-imagining early stage technology venture investing by taking an unorthodox, AI-driven approach showcased via the world premiere of Artificial Intelligence Startup Battle at the PAPIs Connect Conference in Valencia. Thanks to the continuation of AI Startup Battles in Boston and Brazil, as well as our collaboration at WIRED London 2016, the startup community has a new-found appreciation of how Machine Learning can be utilized to better allocate venture capital.
Be sure to follow the upcoming PreSeries battles in 2017 as they will be announced on the BigML events page as well as on Preseries.com. Among other venues, these will include the battles that we will present at PAPIs ’17 and PAPIs Connect. For more details, please stay tuned by following us on: LinkedIn, Google+, Facebook, or Twitter. The countdown starts now!
Machine Learning is accelerating its transition from academia to industry. We see more and more media outlets reporting about it, but most of the time they focus exclusively on the final results and not on all the human-powered tasks that happen behind the scenes and that really make the magic possible. So for most people Machine Learning continues to be some sort of elusive magic. We were recently approached by One, the Vodafone-sponsored section of El Pais, to explain how Machine Learning works and, after giving it some thought, we decided to explain it using a simple example in a domain everyone is familiar with. As the 89th annual Academy of Motion Picture Arts and Sciences Award ceremony draws near and movie fans all over the world are getting ready for their office pools, we couldn’t resist the temptation to take a stab at predicting the 2017 Oscars by applying some BigML-powered Machine Learning courtesy of our own Teresa Álvarez and Cándido Zuriaga.
Of course, picking Oscar winners remains a favorite pastime for many people every winter. As usual, there’s no shortage of opinions ranging from those of movie critics to POTUS Donald J. Trump to established media outlets that publish more data-driven analysis. One thing that many of these crystal balls have in common is the fact that none of them give the reader access to the underlying data, logic or models. Time for us to change that for the better!
No model is perfect, so before we go ahead and reveal our picks, a word of caution is in order.
- Our main objective with this exercise is to demonstrate the process that is usually followed in order to make a prediction by using Machine Learning.
- The Oscars nominees and winners are selected by vote of the Academy members. To properly model this problem, we should also model the Academy members and all the factors that influence how they pick and choose their favorite movies. However, we have restricted our effort to publicly available information about the movies and not about the Academy members.
- Furthermore, the nature of the problem itself is ever evolving. The Academy is not a monolithic structure and the body of membership, the rules that apply to the nomination, and voting processes are subject to change over the years. A great example of that is the recent introduction of a new batch of Academy members in response to the complaints on the lack of diversity. So past behavior is not always the best predictor of future behavior.
- Finally, tastes change. One can argue that “The Academy” has stronger roots in tradition than other movie industry awards, but we can’t deny that what works in one era is not guaranteed to translate to another without any changes. In our digital age, it is no longer a pipe dream to imagine a small budget art house movie released with the right timing at the right festivals and riding a big wave of word-of-mouth promotion on social media to finally steal the show from blockbusters that major movie studios bankroll. The times they are a-changin’!
So let’s begin!
1. Problem Definition and Context Understanding
Stating the predictive problem is the most critical step in any Machine Learning workflow, as it totally shapes the rest of our solving process. Predicting an Oscar winner can be modeled as a classification task, that is, we need to create a predictive model that given a movie released in 2016 will output ‘yes’ when it predicts that the movie will win the Oscar and ‘no’ otherwise. In predicting this year’s Oscar winners, we decided to limit our predictions to only 8 out of the 24 awarded categories.
The next step is to collect and prepare some data about movies and what made them winners in those categories in the past, as well as those same attributes for 2016 movies. The more context and business understanding of the problem you have, the more prepared you are to decide what data to collect. A couple of business insights guided our data collection process:
- Of the approximately 600 films nominated since 2000, 62% are from the USA, with an average budget of $50M. That is more than 3 times higher than the average European budget and 20 times higher than that of Latin American countries.
- The budget amount is correlated with subsequent income from the movie, but it does not seem to be strongly correlated to winning an Oscar. Moreover, for the analyzed period, the difference in the average budget between films that win Oscars and those that don’t varies wildly. So we are not expecting budget to be a significant factor in our models.
2. Data Collection and Data Transformations
In virtually all Machine Learning projects, the most time-consuming task is collecting and structuring data. In our case, due to time constraints, we anticipate that we have left out a lot of data that could be very valuable in making predictions. For example: actresses and actors with previous nominations or awards, or the number of Oscars previously received by the nominated director, scriptwriter, etc. It is also very important to select how far back in time your training data should go; not going far enough might mean missing something useful, but going too far back is going to pick up patterns that are probably no longer relevant (in the business we sometimes call this bad practice “doing archeology”). We decided to use movies between the years 2000 and 2016.
For that period of time, we have compiled a dataset that combines:
- Movie metadata such as genre, year, budget etc. as well as user ratings and reviews from IMDb for the 50 most popular movies of each year
- Each year’s nominations and winners of 20 key industry awards, including The Academy Awards, Golden Globes, BAFTA, Screen Actors Guild, Critics Choice, Directors Guild, Producers Guild, Art Directors Guild, Writers Guild, Costume Designers Guild, Online Film Television Association, People’s Choice, London Critics Circle, American Cinema Editors, Hollywood Film, Austin Film Critics Association, Denver Film Critics Society, Boston Society of Film Critics, New York Film Critics Circle, and Los Angeles Film Critics Association.
An added complexity is that, for data like ratings and reviews, it is difficult to determine if they were impacted by the fact that the movie was nominated for an Oscar or not. In other words, we don’t have the ability to reconstruct the exact timeline of our data’s construction.
We must also note that despite our best efforts to cleanse the dataset, there may still be some inaccuracies in the data itself. The final dataset that we have compiled is a wide one with a fairly small number of rows, due to the nature of the problem (after all, this is a once-a-year event with a different set of contestants each year). This makes models prone to noise and overfitting, even though the selection of ensembles as our algorithm mitigates this risk to some extent. You can get access to our input dataset via this BigML shared link or via BigML’s dataset gallery here.
3. Data Exploration
A good warm-up exercise in any predictive task is a visual perusal of the data. One such fishing expedition using BigML’s Association Discovery capabilities netted some interesting associations:
- Nominees for best film are usually dramas and biographies and seldom action films. Among the winners, we did not find a strong correlation with genre since the nominees already belong to a tight group of genres.
- When using Association Discovery to find the most important correlations between the Oscars and other awards, we saw especially notable correlations with the Golden Globes or the Critics Choice awards, among a few others.
- As shown in the scatter plot below, the movies with the highest box office or the most votes do not always win the Oscar for Best Picture.
- Something similar happens with user and critic reviews:
4. Feature Engineering
Most datasets can be enriched with some extra features, derived from existing ones, that can increase the predictive power of the data. In our case, since the unstructured movie reviews can be challenging to analyze, we ran all the IMDb user review data through a Topic Model analysis, which automatically discovers a set of topics that can be used to characterize each data row. Then, for each movie, the set of topics and their associated probabilities were added as new features to our dataset. All told, our Machine Learning-ready dataset is composed of 256 different fields for 1,152 movies produced since 2000. Once the dataset is ready, the modeling and evaluation tasks become easy-peasy-lemon-squeezy with BigML.
Usually predictive modeling involves comparing and selecting the appropriate classification algorithms and their specific parameters. Most of this process can be fully automated, although you need to be aware of the hype around full automation. In our case, after a few tries and given our limited historical data and the need to avoid overfitting, we opted for tree ensembles over a decision tree or logistic regression. So we created 8 separate binary classification models (one per award category) with the objective field (the column we want to predict) being “winner”.
To assess the predictive impact of each group of variables per category (e.g., metadata, ratings, reviews to unveil the Best Picture Winner), we took a stepwise approach, where we made different predictions based on different ensembles built on different subsets of our dataset. This approach showed us the contribution of different types of data to the final combined prediction, and helped guide our efforts to pull together more data on certain aspects when needed. For instance, discovering that focusing on award data yields better results, translated into the collection of even more historical award data for better final predictions.
To evaluate our classification models, we used the period between 2000-12 as the training period, and 2013-15 as the test period. We then input the data for 2016 nominees to the already validated ensembles to arrive at our final predictions.
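The time-based split described above can be sketched in a few lines. This is only an illustration under assumptions: rows are plain dictionaries, the field name `year` and the function name are hypothetical, and the real work was done with BigML datasets rather than Python lists.

```python
# Illustrative time-based train/test split (field and function names are
# hypothetical; rows are plain dicts standing in for dataset instances).

def temporal_split(rows, train_years=(2000, 2012), test_years=(2013, 2015),
                   year_field="year"):
    """Split rows by year so the test period comes strictly after training."""
    train = [r for r in rows
             if train_years[0] <= r[year_field] <= train_years[1]]
    test = [r for r in rows
            if test_years[0] <= r[year_field] <= test_years[1]]
    return train, test
```

Rows from 2016 fall into neither set, which matches the workflow above: those nominees are only scored once the ensembles have been validated on 2013-15. Splitting by time rather than at random also keeps the evaluation closer to the real task of predicting a future ceremony from past ones.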
Evaluation results demonstrated that combining all available variables yields the best results by essentially reducing the False Positives, while maintaining a very high True Positive hit rate.
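For readers less familiar with these evaluation terms, here is a small sketch of how false positives, true positives, and the F-measure relate to each other for a binary “winner” objective. The function name and the label values are assumptions made for the example; BigML evaluations report these measures for you.

```python
# Illustrative binary evaluation metrics for a yes/no objective field.
# The function name and label values are assumptions for this example.

def binary_metrics(actual, predicted, positive="yes"):
    """Count confusion-matrix cells and derive precision, recall, F-measure."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    fp = sum(1 for a, p in pairs if a != positive and p == positive)
    fn = sum(1 for a, p in pairs if a == positive and p != positive)
    tn = sum(1 for a, p in pairs if a != positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if (precision + recall) else 0.0)
    return {"tp": tp, "fp": fp, "fn": fn, "tn": tn,
            "precision": precision, "recall": recall, "f_measure": f_measure}
```

Fewer false positives raise precision, and keeping the true positive hit rate high keeps recall high; the F-measure is the harmonic mean of the two, so it only stays high when both do.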
7. The Predictions: Drum rolls please…
So let’s finally see what our models found out!
It seems everyone’s sentimental favorite musical La La Land will have a big prize to show for the record 14 nominations it received this year. When we delve into the major drivers behind this prediction, we observe that the Critics Choice nomination and award it received along with its other award performances (with the Producers Guild, Screen Actors Guild and BAFTA especially sticking out) explain why La La Land is the favorite with a pretty high F-measure to boot.
However, if we dig deeper and look at how non-award data have played into the predictions, we see a different picture, where IMDb user reviews are favoring Fences, and award nominations are accentuating the small budget wonder Moonlight. But when actual award winnings are added and all the factors are combined, La La Land emerges as the #1 pick. Could we be in for a surprise Sunday night? Maybe so, but it would have to be one epic upset, so the chances are rather low.
Now that we have lifted the veil of mystery regarding the most anticipated award of the night, let’s quickly recap the remaining categories.
No big surprises here with Damien Chazelle expected to pick up the Best Director award consistent with his success in the awards circuit—especially Directors Guild.
Best Actress will go to Emma Stone carrying the tally for La La Land even higher. Again, Golden Globe and Screen Actors Guild pickups are the biggest forces pushing her nomination forward.
This year’s Best Actor category is a close call between Casey Affleck (Manchester by the Sea) and Denzel Washington (Fences), thus the lower confidence prediction here. Screen Actors Guild award going to Denzel Washington definitely makes a difference for him, but Casey Affleck has picked up more awards in total and, per our model, that seems to help him get a slight edge. We’ll only find out on Sunday whether The Academy’s judgement was at all clouded by the personal troubles of Mr. Affleck despite what most agree was a stellar acting performance.
Viola Davis (Fences) is the popular favorite for Best Supporting Actress thanks mainly to her performance in BAFTA, Screen Actors Guild and New York Critics Circle awards.
Best Supporting Actor award is another category with a number of viable choices. Our pick is Mahershala Ali (Moonlight). Most critics seem to be highlighting Mahershala Ali and Dev Patel as the favorites in this category as well. Mahershala Ali is somewhat handicapped by the model since he wasn’t able to pick up the Golden Globe for this category, which translates into a rather low confidence value for this prediction. Interestingly, the Nocturnal Animals actor Aaron Taylor-Johnson won the Golden Globe, but he is not even a nominee for the Oscars. Actually, in one of our preliminary models we made the mistake of inputting the Golden Globes win for Michael Shannon and, as he has been nominated for Best Supporting Actor, our model predicted him as the winner. This was a helpful reminder of how careful you need to be not only when collecting and cleaning the right data for training your models, but also when making sure you input the right data at prediction time.
Best Original Screenplay is also very much in play with La La Land and Manchester by the Sea vying for the award. La La Land seems to have a very slight edge on the back of its BAFTA success, but don’t count out Manchester by the Sea just yet.
Finally, Best Adapted Screenplay will likely go to Arrival since the movie did quite well within the award circuit for the category.
Besides being a fun undertaking, this exercise has been further testament to the power and importance of working with the right dataset and paying due attention to feature engineering. Being able to construct the best features remains the biggest return on time invested, especially in the presence of a solid Machine Learning platform like BigML where:
- Some of the most versatile algorithms ever invented are offered via an intuitive interface (as well as a thorough API).
- Scalability concerns are abstracted away for the end-user to concentrate on the analytical task at hand.
- Flexible deployment options make it a breeze to operationalize chosen models working with the right data.
At the end of the day, feature engineering is the reflection of true expertise in a given domain into the models you build.
We hope you will find these predictions useful as you grab a glass of wine and follow Jimmy Kimmel kicking off the ceremonies on Sunday. Good luck with your picks!
The investment industry is an extremely competitive one, where fund managers work hard to demonstrate a strong track record that beats their respective benchmarks in order to justify their fees and to partake in the profits from their assets under management. For retail investors, recent years have been characterized by a significant shift towards passive investing instruments such as index funds, at the expense of actively managed funds. The latter have been struggling to justify their higher expense ratios against the backdrop of high-volatility markets and the easy-money policies that interventionist central banks worldwide adopted in response to the Great Recession.
Meanwhile, the abundance of financial market data has given birth to a new wave of startups looking to put Machine Learning to good use in order to create a sustainable market edge at lower costs. One such exciting company is STATS4TRADE out of France. We have caught up with the CEO and Founder of STATS4TRADE to see how his company is innovating with advanced analytics.
BigML: Congrats on launching your startup Jean-Marc. Can you tell us what was the motivation behind founding STATS4TRADE?
Jean-Marc Guillard: It really starts with my conviction that the financial services industry is faced with drastic change in the coming years and actively-managed equity funds are not immune to that. Investors are rightfully questioning high fees in the face of continued poor performance compared to passive funds with much lower fees. Similar to the disruptive changes now occurring in the transport industry, active-fund managers must contemplate an “Uber-ization” of their business model with software driving innovation to provide investors promised returns at lower cost.
BigML: I understand that active managers are between a rock and a hard place, but what’s wrong with the good old buy-and-hold?
Jean-Marc Guillard: Consider putting your money into a diversified index fund and waiting decades. Would such a traditional buy-and-hold approach yield decent returns with low volatility? Normally yes – but beware. This strategy can yield poor results with rather high volatility for some indices. For example, just consider the performance of the French CAC 40 over the past two decades.
The CAC 40 index is a diverse weighted stock price average of France’s 40 largest public companies including such internationally famous names as Airbus, L’Oreal and Michelin. As a result, the CAC 40 should serve as an ideal index for the risk averse small investor in France with a buy-and-wait strategy over the long run. But the performance of the CAC 40 between 1990 and 2015 has been a dismal 3.3% without dividends and 6.5% with dividends. Moreover, these returns came with rather high volatility upward of 22%.
Overall we strongly believe that a buy-and-hold strategy is absolutely valid for the risk averse small investor – especially if one considers the cost of active funds. However, the recent advent of no-fee brokerages like Robinhood in the United States and Deziro in Europe offers investors the ability to actively manage their own investments at costs approaching those of index funds. We want to encourage this democratization process by offering investors an objective way to automatically select stocks that yields better results while bypassing high fees.
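As an editorial aside on the return and volatility figures quoted in this answer, the sketch below shows one common way such numbers are computed. It assumes the figures are annualized (the interview does not say so explicitly), uses a 252 trading-day convention, and the function names are hypothetical; no actual CAC 40 data is used.

```python
# Illustrative return and volatility calculations (editorial sketch).
# Assumptions: figures are annualized, a year has 252 trading days, and
# volatility is the sample standard deviation of daily returns.
import math


def annualized_return(start_value, end_value, years):
    """Compound annual growth rate between two index levels."""
    return (end_value / start_value) ** (1.0 / years) - 1.0


def annualized_volatility(daily_returns, trading_days=252):
    """Scale the daily standard deviation up to a yearly figure."""
    mean = sum(daily_returns) / len(daily_returns)
    variance = (sum((r - mean) ** 2 for r in daily_returns)
                / (len(daily_returns) - 1))
    return math.sqrt(variance) * math.sqrt(trading_days)
```

For instance, an index that goes from 100 to 121 over two years has an annualized return of about 10%, regardless of how bumpy the path was; the volatility function is what captures the bumpiness.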
BigML: That’s very interesting. How is STATS4TRADE’s approach to this problem different? How can the risk averse small investor earn decent returns – say in the range of 6-9% including all fees – with low volatility over shorter time periods than decades?
Jean-Marc Guillard: STATS4TRADE is uniquely positioned to help investors navigate this coming change. With the aid of Machine Learning and cloud computing technologies, we offer investors a new approach for selecting stocks and making buy/sell decisions – a data-driven approach that not only yields consistently better-than-index performance but also minimizes volatility and decreases operational costs while protecting capital.
Our trading applications leverage the power of BigML’s Machine Learning tools and allow investors both private and professional the opportunity to not only select but also simulate different investment strategies based on short-term price forecasts. Once an investor has selected a strategy corresponding to her particular risk-profile, the application automatically provides daily buy/sell signals for trading on no-fee platforms like Robinhood and Deziro.
Of course none of this is magic and our approach is not without its limitations. For example, anyone who expects to become rich quickly will be sorely disappointed, for no forecast is completely accurate. Normally one needs about six months to begin seeing the benefits of our method. Nonetheless the message is clear: statistics and data-driven approaches like ours ultimately yield better results at lower cost. The results certainly speak for themselves!
BigML: Thanks for the detailed explanation. Can you also tell a bit about specifically how Machine Learning comes into play?
Jean-Marc Guillard: Someone once said that predicting the future is a fool’s errand. We agree. However, one can still use statistics to estimate the likelihood of future events based on past data and an underlying statistical model. In fact, statistical methods have been used extensively for years in activities like consumer research, weather forecasting and of course finance. In our case we use Machine Learning methods powered by BigML to estimate the probability of short-term price movements of selected securities and indices, currencies and commodities. Namely, we aim to identify underlying statistical patterns for a given security, basket of securities, or an index and thereby accurately forecast upcoming movements in price.
BigML: What made you choose to build your models on the BigML platform?
Jean-Marc Guillard: A big part of it was the drastically faster iterative experimentation that the BigML Dashboard enables, which in turn allowed us to achieve faster time-to-market. One usually doesn’t know what the final Machine Learning workflow will look like when setting out to explore the large hypothesis space that a complex problem like ours requires. So it is essential that the tools you use afford you very quick and easy iterative exploration. BigML excels on this front.
In addition, the automation options made available on the BigML platform let us decrease ongoing operational costs to a minimum level that can compete with passive index funds while further differentiating from actively-managed funds that rely on manual processes. Lastly, we have had phenomenal support from the BigML team throughout our evaluation, exploration and solutions implementation phases.
BigML: Thanks Jean-Marc. It is very impressive to see how you have been able to ramp up your Machine Learning efforts in such a limited time period despite constrained resources. We hope stories like yours inspire many more startups to realize that they too can turn their data and know-how into sustainable competitive advantages.
For our readers’ benefit, a downloadable PDF version of the STATS4TRADE case study is also available.
There’s a lot of buzz lately around “Automating Machine Learning”. The general idea here is that the work done by a Machine Learning engineer can be automated, thus freeing potential users from the tyranny of needing to have specific expertise.
Presumably, the ultimate goal of such automations is to make Machine Learning accessible to more people. After all, if a thing can be done automatically, that means anyone who can press a button can do it, right?
Maybe not. I’m going to make a three-part argument here that “Machine Learning Automation” is really just a poor proxy for the true goal of making Machine Learning useable by anyone with data. Furthermore, I think the more direct path to that goal is via the combination of automation and interactivity that we often refer to in the software world as “abstraction”. By understanding what constitutes a powerful Machine Learning abstraction, we’ll be in a better position to think about the innovations that will really make Machine Learning more accessible.
Automation and Interaction
I had the good fortune to attend NIPS in Barcelona this year. In particular, I enjoyed the (in)famous NIPS workshops, in which you see a lot of high quality work out on the margins of Machine Learning research. The workshops I attended while at NIPS were each excellent, but were, as a collection, somewhat jarringly at odds with one another.
In one corner, you had the workshops that were basically promising to take Machine Learning away from the human operator and automate as much of the process as possible. Two of my favorites:
- Towards an Artificial Intelligence for Data Science – What it says on the box, basically trying to turn Machine Learning back around on itself and learn ways to automate various phases of the process. This included an overview of an ambitious multi-year DARPA program to come up with techniques that automate the entire model building pipeline from data ingestion to model evaluation.
- Bayesopt – This is a particular subfield in Machine Learning, where we try to streamline the optimization of any parameterized process that you’d usually figure out via trial and error. The central learning task is, given all of the parameter sets you’ve tried so far, trying to choose the next one to evaluate so that you have the best shot at finding the global maximum. Of course, Machine Learning algorithms themselves are parameterized processes tuned by trial and error, so these techniques can be used on them. My own WhizzML script, SMACdown, is a toy version of one of these techniques that does exactly this for BigML ensembles.
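To make the "choose the next parameter set given everything tried so far" loop concrete, here is a minimal sketch in Python. It is not real Bayesian optimization and it is not SMACdown: the surrogate is a crude nearest-neighbor guess with a distance bonus, and `validation_score`, `propose_next` and `optimize` are hypothetical names invented for this illustration. A real implementation would swap in a proper probabilistic model and acquisition function.

```python
import random


def validation_score(param):
    # Hypothetical stand-in for "train a model with this parameter and
    # measure how good it is". The made-up score peaks at param = 0.7.
    return 1.0 - (param - 0.7) ** 2


def propose_next(history, rng, n_candidates=200, explore=0.5):
    """Pick the next parameter to try, given every (param, score) so far."""

    def acquisition(x):
        # Crude surrogate: predict the score of the nearest evaluated point,
        # then add a bonus for being far from anything already tried.
        nearest_x, nearest_score = min(history, key=lambda h: abs(h[0] - x))
        return nearest_score + explore * abs(nearest_x - x)

    candidates = [rng.random() for _ in range(n_candidates)]
    return max(candidates, key=acquisition)


def optimize(objective, n_iterations=30, seed=0):
    rng = random.Random(seed)
    first = rng.random()
    history = [(first, objective(first))]
    for _ in range(n_iterations):
        x = propose_next(history, rng)
        history.append((x, objective(x)))
    # Return the best (param, score) pair seen during the search.
    return max(history, key=lambda h: h[1])


best_param, best_score = optimize(validation_score)
```

The shape of the loop is the point: each expensive evaluation feeds back into the choice of what to evaluate next, so promising regions get revisited while unexplored ones still get a look.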
In the other corner, you had several workshops on how to further integrate people into the Machine Learning pipeline, either by inserting humans into the learning process or finding more intuitive ways of showing them the results of their learning.
- The Future of Interactive Learning Machines – This workshop featured a panoply of various human-in-the-loop learning settings, from humans giving suggestions to Machine Learning algorithms, to machine-learned models trying to teach humans. There was, in particular, an interesting talk on using reinforcement learning to help teachers plan lessons for children, which I’ll reference below.
- Interpretable Machine Learning for Complex Systems – This workshop featured a number of talks on ways to allow humans to better understand what a classifier is doing, why it makes the predictions it does, and how best to understand what data the classifier needs to do its job better.
So what is going on here? It seems like we want Machine Learning to be automatic . . . but we also want to find ways to keep people closely involved? It is a strange pair of ideas to have at the same time. Of course, people want things automated, but why do they want to stay involved, and how do those two goals co-exist?
A great little call-and-response on this topic happened between two workshops as I attended them. Alex Wiltschko from Twitter gave an interesting talk on using Bayesian parameter optimization to optimize the performance of their Hadoop jobs (among other things) and he made a great point about optimization in general: If there’s a way to “cheat” your objective, so that the objective increases without making things intuitively “better”, the computer will find it. This means you need to choose your objective very carefully so the mathematical objective always matches your intuition. In his case, this meant a lot of trial and error, and a lot of consultations with the people running the Hadoop cluster.
An echo and example came from the other side of the “interactivity divide”, in the workshop on interactive learning. Emma Brunskill had put together a system that optimized the presentation of tutorial modules (videos, exercises, and so on) being presented to young math students. The objective the system was trying to optimize was something like the performance on a test at the end of the term. Simple enough, right? Except that one of the subjects being taught was particularly difficult. So difficult that few of the tutorial modules managed to improve the students’ scores. The optimizer, sensing this futility, promptly decided not to bother teaching this subject at all. This answer is of course unsatisfying to the human operator; the curriculum should be a constraint on the optimization, not a result of it.
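The difference between the curriculum being a result of the optimization and a constraint on it can be sketched with a toy allocator. Everything here is made up for illustration: the subjects, the gain numbers, the diminishing-returns model and the `allocate_hours` function are assumptions, not a description of the actual tutoring system.

```python
def allocate_hours(base_gain, total_hours, min_hours=0, decay=0.8):
    """Greedily assign teaching hours to maximize estimated test-score gain.

    base_gain: dict mapping subject -> estimated gain from its first hour.
    min_hours: hours every subject must receive (the curriculum constraint).
    decay: each further hour on a subject is worth `decay` times the last.
    """
    spare = total_hours - min_hours * len(base_gain)
    if spare < 0:
        raise ValueError("minimum hours exceed the total budget")
    # Start every subject at the required floor, then hand out the spare
    # hours one at a time to whichever subject has the best marginal gain.
    plan = {subject: min_hours for subject in base_gain}
    for _ in range(spare):
        best = max(base_gain, key=lambda s: base_gain[s] * decay ** plan[s])
        plan[best] += 1
    return plan


# Hypothetical numbers: fractions is the subject few modules can improve.
gains = {"arithmetic": 2.0, "geometry": 1.5, "fractions": 0.05}

unconstrained = allocate_hours(gains, total_hours=10)
constrained = allocate_hours(gains, total_hours=10, min_hours=2)
```

With these numbers the unconstrained plan gives fractions no hours at all, which is the optimizer "cheating" in exactly the sense described above. Passing `min_hours` encodes what the human operator knows and the objective does not.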
Crucially though, there’s no way the computer could know this is the case without the human operator telling it so. And there’s no way for the human to know that the computer needs to know this unless the human is in the loop.
Herein lies the tension between interactivity and automation.
On one hand, people want and need many of the tedious and unnecessary details around Machine Learning to be automated away; often such details require expertise and/or capital to resolve appropriately and end up as barriers to adoption.
On the other, people still want and need to interact with Machine Learning so they can understand what the “Machine” has learned and steer the learning process towards a better answer if the initial one is unsatisfactory. Importantly, we don’t need to invoke a luddite-like mistrust of technology to explain this point of view. The reality is that people should be suspicious of the first thing a Machine Learning algorithm spits out, because the numerical objective that the computer is trying to optimize often does not match the real-world objective. Once the human and machine agree precisely on the nature of the problem, Machine Learning works amazingly well, but it sometimes takes several rounds of interaction to generate an agreement of the necessary precision.
Said another way, we don’t need Machine Learning that is “automatic”. We need Machine Learning that is comfortable and natural for humans to operate. Automating away laborious details is only a small part of this process.
If this sounds familiar to those of you in the software world, it’s because we’re here all the time.
From Automation to Abstraction
In the software world, we often speak in terms of abstractions. A good software library or programming language will hide unnecessary details from the user, exposing only the modes of interaction necessary to operate the software in a natural way. We say that the library or language is a layer of abstraction over the underlying software.
For those of you unfamiliar with the concept, consider the C programming language. In C, we can write a statement like this:
x = y + 3;
The C compiler converts this operation to machine code, which requires knowing where in memory the x and y variables are, loading these variables into registers, loading the binary value for “3” into a register, summing the values to a new register, assigning that result to a new variable, and so on.
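For readers who want to see those hidden steps spelled out, here is a toy register machine written in Python. The instruction names (`LOAD`, `LOAD_CONST`, `ADD`, `STORE`) are invented for this sketch and do not correspond to any real instruction set or to what a particular C compiler emits.

```python
def run(program, memory):
    """Execute a list of (opcode, operands...) tuples against a memory dict."""
    registers = {}
    for opcode, *operands in program:
        if opcode == "LOAD":          # copy a memory cell into a register
            reg, address = operands
            registers[reg] = memory[address]
        elif opcode == "LOAD_CONST":  # put a literal value into a register
            reg, value = operands
            registers[reg] = value
        elif opcode == "ADD":         # sum two registers into a third
            dest, left, right = operands
            registers[dest] = registers[left] + registers[right]
        elif opcode == "STORE":       # copy a register back to memory
            reg, address = operands
            memory[address] = registers[reg]
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return memory


# Roughly the work hidden behind the single statement "x = y + 3".
program = [
    ("LOAD", "r1", "y"),
    ("LOAD_CONST", "r2", 3),
    ("ADD", "r3", "r1", "r2"),
    ("STORE", "r3", "x"),
]
memory = run(program, {"y": 4})
```

Four instructions and three registers stand behind one line of source, and the programmer never has to name any of them.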
The language hides machine code and registers from us so we can think in terms of operators and variables, the primitives of higher level problems. Moreover, it exposes an interface (mathematical expressions, functions, structs, and so on) that allows us to operate the layer underneath in a way that’s more useful and natural than if we worked on the layer directly. In this sense, the C language is a very good abstraction: It hides many of the things we’re almost never concerned about, and exposes the relevant functionality in an easier-to-use way.
It’s helpful to think about abstractions in the same way we think about compression algorithms. They can be “strong”, so that they hide a lot of details, or “weak” so they hide few. They can also be “very lossy”, so that they expose a poor interface, up to “lossless”, where the interface exposed can do everything that the hidden details can do. The devil of creating a good abstraction is rather the same as creating a good compression algorithm: You want to hide as many unimportant details from your users as possible, while hiding as little as possible that those same users want to see or use. The C language as an abstraction over machine code is both quite strong (hides virtually all of the details of machine code from the user) and near-lossless (you can do the vast majority of things in C that are possible directly via machine code).
The astute reader can likely see the parallel to our view of Machine Learning: we have the same sort of tension here between the hiding of drudgeries and complexities while still providing useful modes of interaction between tool and user. Where, then, does “Machine Learning Automation” stand on our invented scale of abstractions?
Automations Are Often Lossy and Weak Abstractions
The problem (as I see it) with some of the automations on display at NIPS (and indeed in the industry at large) is that they are touted using the language of abstraction. There are often claims that such software will “automate data science” or “allow non-experts to use Machine Learning”, or the like. This is exactly what you might say about the C language; that it “automates machine code generation” or “allows people who don’t know assembly to program”, and you would be right.
As an example of why I find this a bit disingenuous, consider using Bayesian parameter optimization to tune the parameters of Machine Learning algorithms, one of my favorite newish techniques. It’s a good idea, people in the Machine Learning community generally love it, and it certainly has the power to produce better models from existing software. But how good of an abstraction is it, on the basis of drudgery avoided and the quality of the interface exposed?
Put another way, suppose we implemented some of these parameter optimizations on top of, say, scikit-learn (and some people have). Now suppose there’s a user who wants to use this on data she has in a CSV file to train and deploy a model. Here’s a sample of some of the other details she’s worried about:
- Installing Python
- How to write Python code
- Loading a CSV in Python
- Encoding categorical / text / missing values
- Converting the encoded data to appropriate data structures
- Understanding something about how the learned model makes its predictions
- Writing prediction code around the learned model
- Writing/maintaining some kind of service that will make predictions on-demand
- Getting a sense of the learned model’s performance
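To give a feel for just two items on that list (loading a CSV and encoding categorical and missing values), here is a sketch using only the Python standard library. The CSV contents, column names and the `load_rows` and `encode` helpers are all hypothetical. It assumes every numeric column has at least one non-missing value, and it stops well short of training, serving or evaluating anything.

```python
import csv
import io

# Hypothetical CSV contents; the columns and values are made up.
RAW = """age,plan,churned
34,basic,no
,premium,yes
51,basic,no
29,,yes
"""


def load_rows(text):
    # Every value arrives as a string, including the missing ones ("").
    return list(csv.DictReader(io.StringIO(text)))


def encode(rows, numeric, categorical):
    """Turn string rows into numeric feature vectors.

    Missing numeric values become the column mean. Each categorical column
    is one-hot encoded, with missing values treated as their own category.
    """
    means = {}
    for col in numeric:
        present = [float(r[col]) for r in rows if r[col] != ""]
        means[col] = sum(present) / len(present)
    levels = {
        col: sorted({r[col] or "<missing>" for r in rows})
        for col in categorical
    }
    vectors = []
    for r in rows:
        vec = [float(r[col]) if r[col] != "" else means[col] for col in numeric]
        for col in categorical:
            value = r[col] or "<missing>"
            vec.extend(1.0 if value == level else 0.0 for level in levels[col])
        vectors.append(vec)
    return vectors


rows = load_rows(RAW)
features = encode(rows, numeric=["age"], categorical=["plan"])
labels = [1 if r["churned"] == "yes" else 0 for r in rows]
```

None of this touches the learning algorithm or its parameters, which is the point: a tuned optimizer on top still leaves all of it for the user to write.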
Of course, things get even more complicated at scale, as is their wont:
- Get access to / maintain a cluster
- Make sure that all cluster nodes have the necessary software
- Load your data onto the cluster
- Write cluster specific software
- Deal with cluster machine / job limitations (e.g., lack of memory)
This is what I mean when I say Machine Learning automations are often weak abstractions: They hide very few details and provide little in the way of a useful interface. They simply don’t usually make realistic Machine Learning much easier to use. Sure, they prevent you from having to hand-fit maybe a couple dozen parameters, but the learning algorithm is already fitting potentially thousands of parameters. In that context, automated parameter tuning, or algorithm selection, or preprocessing doesn’t seem like it’s the thing that suddenly makes the field accessible to non-experts.
In addition, the abstraction is also “lossy” under our definition above; it hides those parameters, but usually doesn’t provide any sort of natural way for people to interact with the optimization. How good is the solution? How well does that match the user’s internal notion of “good”? How can you modify it to do better? All of those questions are left unanswered. You are expected to take the results on faith. As I said earlier, that might not be a good idea.
A Better Path Forward
So why am I beating on Bayesian parameter optimization? I said that I think it’s awesome and I really do. But I don’t buy that it’s going to be the thing that brings the masses to Machine Learning. For that, we’re going to need proper abstractions; layers that hide details like those above from the user, while providing them with novel and useful ways to collaborate with the algorithm.
This is part of the reason we created WhizzML and Flatline, our DSLs for Machine Learning workflows and feature transformation. Yes, you do have to learn the languages to use them. But once you do, you realize that the languages are strong abstractions over the concerns above. Hardware, software, and scaling issues are no longer any concern as everything happens on BigML-maintained infrastructure. Moreover, you can interact graphically with any resources you create via script in the BigML interface.
The goal of making Machine Learning easier to use by anyone is a good one. Certainly, there are a lot of Machine Learning sub-tasks that could bear automating, and part of the road to more accessible Machine Learning is probably paved with “one-click” style automations. I would guess, however, that the larger part is paved with abstractions; ways of exposing the interactions people want and need to have with Machine Learning in an intuitive way, while hiding unnecessary details. The research community is right to want both automation and interactivity: If we’re clever and careful we can have it both ways!
2016 has proven a whirlwind year for BigML with substantial growth in users, customers and the team riding on the realization by businesses and experts that Machine Learning has transformational power in the new economy where data is in abundance but actionable insights have not been able to keep pace with improvements in storage, computational power and lowered costs. When things happen so fast, one can sometimes find it a challenge to stop and reflect on milestones and achievements. So below are the highlights of what made 2016 a special year for BigML.
Releases and Product Updates
In 2016, BigML users were greeted by many new capabilities that they were asking for. As a result, the platform is now more mature and versatile than ever. Logistic Regression (Summer 2016 Release) and Topic Modeling (Fall 2016 Release) techniques beefed up existing supervised and unsupervised learning resources, while Workflow Automation with WhizzML (Spring 2016 Release) gave the platform a whole new dimension that can deliver huge productivity boosts to any analytics team in the form of reduced solution maintenance and mitigated model management risks.
Events, Certifications and Awards
2016 has seen BigML being represented at 4YFN, Machine Learning Prague, PAPIs 2016 and PAPIs Connect, Legal Management Forum, IEEE Rock Stars of Emerging Tech, WIRED, Mobey Day, Data Wrangling Automation and other industry events around the world with flattering reception and genuine enthusiasm that keeps pushing the team to innovate. Most notably, we have created a new and very hands-on BigML Certification Program that teaches participants how to solve practical real-life Machine Learning problems. The next wave starts on January 19th, 2017!
After conducting the 2nd Valencian Summer School in Machine Learning followed by a special lecture by BigML advisor Professor Geoff Webb, BigML gave its first Brazilian Summer School in Machine Learning in São Paulo. Look for more education events to follow in 2017 as BigML has joined forces with CICE in Madrid to take its educational efforts to the next level to capitalize on the great hunger for Machine Learning from developers, analysts and scientists.
Although the biggest awards for us are the compliments we receive from our users and customers, in 2016 we were also pleased to be recognized by DIA Barcelona and Zapier for best advanced analytics for insurance companies and BigML for Google Sheets respectively.
Popular Posts of 2016
Some of the Machine Learning veterans on our team were also able to make time to share their career experiences over multiple posts that were well-received.
As a recap, here is a good selection to revisit for those who would like to gain new perspectives on the current market landscape and on what has worked in real-life situations, straight from the horse’s mouth.
Looking Forward to 2017
Now that the awareness of Machine Learning in general, and cloud-born Machine Learning platforms in particular, has reached a critical threshold, our go-to-market strategy will double down on communicating positive examples to the entire community rather than having to explain “Why Machine Learning Matters” to the uninitiated. In that regard, we must also thank Google, Apple, Uber, Airbnb, Facebook, Amazon and Microsoft for putting Machine Learning squarely in the business lexicon.
In 2017, we also intend to intensify our educational efforts that promote learning by doing, while expanding the breadth and depth of capabilities to enable Agile Machine Learning at any organization in any industry. A big part of this will manifest itself through our active participation in technology events. We are kicking off the year with a trio of events, where BigML speakers will be on stage:
- Anomaly Detection: Principles, Benchmarking, Explanation, and Theory
Anomaly detection algorithms are widely applied in data cleaning, fraud detection, and cybersecurity. This talk will begin by defining various anomaly detection tasks and then focus on unsupervised anomaly detection. It will present a benchmarking study comparing eight state-of-the-art methods. Then it will discuss methods for explaining anomalies to experts and incorporating expert feedback into the anomaly detection process. The talk will conclude with a theoretical (PAC-learning) framework for formalizing a large family of anomaly detection algorithms based on discovering rare patterns.
Speaker: Thomas G. Dietterich, Co-Founder and Chief Scientist.
FiturtechY is an event organized by the Instituto Tecnológico Hotelero (ITH), where innovation and technology meet to improve the tourism industry. FiturtechY will host four forums to discuss different topics: business, destination, sustainability, and trends. BigML will be presenting at the #techYnegocio forum, the meeting point for professionals who seek to learn about the latest tools that help revolutionize the tourism industry.
Speaker: Dario Lombardi, VP of Predictive FinTech.
- Computer-Supported Cooperative Work and Social Computing
CSCW is the premier venue for presenting research in the design and use of technologies that affect groups, organizations, communities, and networks. Bringing together top researchers and practitioners from academia and industry, CSCW explores the technical, social, material, and theoretical challenges of designing technology to support collaborative work and life activities.
Speaker: Poul Petersen, Chief Infrastructure Officer.
We also intend to put together the first BigML User Conference later in the year. So stay tuned for further event updates.
Hope this post gave a good crash course tour (especially for those of you that have recently joined BigML) of what’s been happening around our neck of the woods. Powered by your support, we’re hungrier than ever to bring to the market the best Machine Learning software platform there ever was. We’d also highly encourage you to take a look at our 2017 predictions, which will guide our roadmap in the remainder of the year. As always, be sure to reach out to us with your ideas no matter how crazy they seem!
As each year wraps up, experts pull their crystal balls from their drawers and start peering into them for a glimpse of what’s to come in the next one. At BigML, we have been following such clairvoyance carefully this past holiday season to compare and contrast with our own take on what 2017 will have in store, which can come across as quite unorthodox to some experts out there.
For the TL;DR crowd, our crystal ball is showing us a cloudy (no pun intended) 2017 Machine Learning market forecast with some sunshine behind the clouds for good measure. To put it more directly, enterprises need to look beyond the AI hype for practical ways to incorporate Machine Learning into their operations. This starts with the right choice of internal platform that will help them build on smaller, low hanging fruit type projects that leverage their proprietary datasets. In due time, those projects add up to create positive feedback effects that ultimately not only introduce decision automation on the edges, but help agile Machine Learning teams transform their industries.
Jumping back to our regularly scheduled programming, let’s start with a quick synopsis of the road traveled so far:
But digesting, adopting and profiting from 36 years of Machine Learning advances and best practices has been a very bumpy ride that few businesses have managed to navigate so far.
There are many “New Experts” that read a couple of books or take a few online classes and are suddenly in a position to “alter” things just because they have access to cheap capital. While top technology companies have been “collecting” as much experienced Machine Learning talent as possible to get ready for the up and coming AI economy, other businesses are at the mercy of Machine Learning-newbie investors and inexperienced recent graduates with unicorn ambitions. It is wishfully assumed that versatile, affordable and scalable solutions based on a magical new algorithm will materialize out of these ventures.
In 2017, we suspect that the ecosystem is going to start converging around the right approach, albeit after otherwise avoidable roadkills.
Before we get to the specific predictions, we must note that 2016 was a special year in that it presented us with a watershed event: the planet’s Top 5 most valuable companies are all technology companies for the first time in history. All five share the common traits of large scale network effects, highly data-centric company cultures and new economic value-added services built atop sophisticated analytics. What’s more, they have been heavily publicizing their intent to make Machine Learning the fulcrum of their future evolution. With the addition of revenue generating unicorns like Uber and Airbnb, the dominance of the tech sector, which will benefit immensely from the wholesale digitization of the world economy, is likely to continue in the coming years.
However, the trillion dollar question is how legacy companies (i.e., non-tech firms with rich data plus smaller technology companies) can counteract and become an integral part of the newly forming value chains to be able to not only survive, but thrive in the remainder of the decade. Today, these firms are stuck with rigid rear view mirror business intelligence systems and archaic workstation-based traditional statistical systems running simplistic regression models that fail to capture the complexity of many real life predictive use cases.
At the same time, they sit on growing heaps of hard to replicate proprietary datasets that go underutilized. The latest McKinsey Global Institute report named The Age of Analytics: Competing in a Data-driven World reveals that less than 30% of the potential of modern analytics technologies outlined in their 2011 report has been realized, and that is without counting the new opportunities made possible by the advent of the same technologies in the last five years. To make matters worse, the progress looks very unbalanced across industries (i.e., as low as 10% in U.S. Healthcare vs. up to 60% in the case of Smartphones) at a time when analytics prowess is correlated with competitive differentiation more than ever.
Even if it may be hidden behind polished marketing speak pushed by major vendors and research firms (e.g., “Cognitive Computing”, “Machine Intelligence” or even doomsday-like “Smart Machines”), the Machine Learning genie is out of the bottle without a doubt as its wide-ranging potential across the enterprise has already made it part of the business lexicon. This newfound appetite for all things Machine Learning means many more legacy firms and startups will begin their Machine Learning journeys in 2017. The smart ones will separate themselves from the bunch by learning from others’ mistakes. Nonetheless, some old bad habits are hard to kick cold turkey, so let’s dive in with some gloomier predictions and end on a higher note:
- PREDICTION #1:
“Big Data” soul searching leads to the gates of #MachineLearning.
The soul searching in the “Big Data” movement will continue as experts recognize the level of technical complexity that aspiring companies must navigate to piece together useful “Big Data” solutions that fit their needs. At the end of the day “Big Data” is tomorrow’s data but nothing else. The recent removal of the “Big Data” entry from the Gartner Hype Cycle is further testament to the same realization. All this will only hasten the pivot to analytics and specifically to Machine Learning as the center of attention so as to recoup the sunk costs from those projects via customer touching smart applications. Moreover, the much maligned sampling remains a great tool to rapidly explore new predictive use cases that will support such applications.
- PREDICTION #2:
VCs investing in algorithm-based startups are in for a surprise.
The education process of VCs will continue, albeit slowly and through hard lessons. They will keep investing in algorithm-based startups with the marketable academic founder resumes, while perpetuating myths and creating further confusion, e.g., portraying Machine Learning as synonymous with Deep Learning, or completely misrepresenting the differences between Machine Learning algorithms and machine-learned models, or between model training and predicting from trained models. A deeper understanding of the discipline with the proper historical perspective will remain elusive in the majority of the investment community that is on the lookout for quick blockbuster hits. On a slightly more positive note, a small subset of the VC community seems to be waking up to the huge platform opportunity Machine Learning presents.
- PREDICTION #3:
#MachineLearning talent arbitrage will continue at full speed.
The media frenzy around AI and Machine Learning will continue at full steam as humored by Rocket AI type parties, where young academics will be courted and ultimately funded by aforementioned investors. Ensuing portfolio companies will find it hard to compete on algorithms, as few algorithms are really widely useful in practice although some do slightly better than others for very niche problems. Most will be cast as brides at shotgun weddings with corporate development teams looking to beef up on Machine Learning talent strictly for internal initiatives. In some nightmare scenarios, the acquirers will have no clear analytics charter, yet they will be in a frantic hunt to grab headlines to generate the illusion that they too are on the AI/Machine Learning bandwagon.
- PREDICTION #4:
Top down #MachineLearning initiatives built on PowerPoint slides will end with a whimper.
Legacy company executives that opt for getting expensive help from consulting companies in forming their top-down analytics strategy and/or making complex “Big Data” technology components work together before doing their homework on low hanging predictive use cases will find that actionable insights and game-changing ROI will be hard to show. This is partially due to the requirement to have the right data architecture and flexible computing infrastructure already in place, but more importantly, outperforming 36 years of collective achievements by the Machine Learning community with some novel approach is just a tall order regardless of how relatively cheap computing has become.
- PREDICTION #5:
#DeepLearning commercial success stories will be few and far between.
Deep Learning’s notable research achievements such as the AlphaGo challenge will continue generating media interest. Nevertheless, its advances in certain practical use cases such as speech recognition and image understanding will be the real drivers for it to find a proper spot in the enterprise Machine Learning toolbox alongside other proven techniques. Interpretability issues, dearth of experienced specialists, its reliance on very large labeled training datasets and significant computational resource provisioning will limit mass corporate adoption in 2017. In its current form, think of it as the Polo of Machine Learning techniques, a fun time perhaps that will let you rub elbows with the rich and famous provided that you can afford a well-trained horse, the equestrian services and upkeep, the equipment and a pricey club membership to go along with those. Nevertheless, not quite an Olympic sport. So short of a significant research breakthrough in the unsupervised flavors of Deep Learning, most legacy companies experimenting with Deep Learning are likely to come to the conclusion that they can get better results faster if they pay more attention to areas like Reinforcement Learning and the bread and butter Machine Learning techniques such as ensembles.
- PREDICTION #6:
Exploration of reasoning and planning under uncertainty will pave the way to new #MachineLearning heights.
Of course, Machine Learning is only a small part of AI. More attention to research, and to the resulting applications from startups, in the fields of reasoning and planning under uncertainty, and not only learning, will help cover truly new ground beyond the better understood pattern recognition. Not surprisingly, Facebook’s Mark Zuckerberg reached similar conclusions in his assessment of the state of AI/Machine Learning after spending nearly a year coding his intelligent personal assistant “Jarvis”, which was loosely modeled after the assistant of the same name in the Iron Man series.
- PREDICTION #7:
Humans will still be central to decision making despite further #MachineLearning adoption.
Some businesses will see early shoots of faster, evidence-based decision making powered by Machine Learning; however, humans will still be central to the decision making. Early examples of smart applications will emerge in certain industry pockets, adding to the uneven distribution of capabilities due to differences in regulatory frameworks, innovation management approaches, competitive pressures, end customer sophistication and demand for higher quality experiences, as well as conflicting economic incentives in some value chains. Despite the talk about the upcoming singularity and robots taking over the world, cooler heads in the space point out that it will take a while to create truly intelligent systems. In the meantime, businesses will slowly learn to trust models and their predictions as they realize that algorithms can outperform humans in many tasks.
- PREDICTION #8:
Agile #MachineLearning will quietly take hold beneath the cacophony of AI marketing speak.
A more practical and agile approach to adopting Machine Learning will quietly take hold next year. Teams of doers not afraid to get their hands dirty with unruly yet promising corporate data will completely bypass the “Big Data” noise and carefully pick low-hanging predictive problems that they can solve with well-proven algorithms in the cloud, using smaller sampled datasets that have a favorable signal-to-noise ratio. As they build confidence in their abilities, the desire to deploy what they have built in product as well as to add more use cases will mount. No longer bound by data access issues and complex, hard-to-deploy tools, these practitioners will not only start improving their core operations but also start thinking about predictive use cases with higher risk-reward profiles that can serve as the enablers of brand new revenue streams.
- PREDICTION #9:
MLaaS platforms will emerge as the “AI-backbone” for enterprise #MachineLearning adoption by legacy companies.
MLaaS platforms will emerge as the “AI Backbone” in accelerating the adoption of Agile Machine Learning practices. Consequently, commercial Machine Learning will get cheaper and cheaper thanks to a new wave of applications built on MLaaS infrastructure. Cloud Machine Learning platforms in particular will democratize Machine Learning by
- significantly lowering costs by eliminating complexity or front-loaded vendor contracts
- offering preconfigured frameworks that package the most effective algorithms
- abstracting the complexities of infrastructure setup and management from the end user
- providing easy integration, workflow automation and deployment options through REST APIs and bindings.
- PREDICTION #10:
Data Scientists or not, more Developers will introduce #MachineLearning into their companies.
2017 will be the year when developers start carrying the Machine Learning banner, easing the talent bottleneck for thousands of businesses that cannot compete with the Googles of the world in attracting top research scientists with over a decade of experience in AI/Machine Learning. Such experience doesn’t automagically translate to smart business applications that deliver business value anyway. Developers will start rapidly building and scaling such applications on MLaaS platforms that abstract painful details (e.g., cluster configuration and administration, job queuing, monitoring and distribution) that are better kept underground in the plumbing. Developers just need a well-designed and well-documented API. They no more need to know what Information Gain or the Wilson Score are to solve a predictive use case based on a decision tree than they need to know what an LR(1) parser is to compile and execute their Java code.
We are still in the early innings of “The Age of Analytics”, so there is much more to feel excited about vs. dwelling on bruises from past false starts. Here’s to keeping calm and carrying on with this exciting endeavor that will take business as we know it by storm by perfecting the alchemy between mathematics, software and management best practices. Happy 2017 to you all!
1: The A16Z presenter seems to think every self-driving car has to learn what a stop sign is by itself, thus reinventing the wheel many times over instead of relying on tons of historical sensor data from an entire fleet of such vehicles. In reality, few Machine Learning use cases require a continuously trained algorithm e.g., handwriting recognition.
Four Years From Now, the startup business platform of Mobile World Congress that enables startups, investors and corporations to connect and launch new ventures together, goes to Barcelona, Spain, from February 27 to March 1, 2017. We could not think of a better context to run the fourth edition of our series of Startup Battles.
Telefónica has invited PreSeries, the joint venture between Telefónica Open Future_ and BigML, to participate in the 4YFN event and showcase its early stage venture investing platform on the main stage on February 28 in front of an audience of over 500 technologists. In a nutshell, PreSeries provides insights and many other metrics to help investors make objective, data-driven decisions when investing in tomorrow’s leading technology companies.
In rapid-fire execution mode, Valencia was the first city to witness the world’s premier Artificial Intelligence Startup Battle, on March 15, 2016. On October 12, the PreSeries Team travelled to Boston to celebrate the second edition at the PAPIs ‘16 conference. Less than two months later we celebrated the third edition in São Paulo, as part of BSSML16. The fourth edition of our series of startup battles will be hosted in Barcelona, Spain. The distinguished audience and press members in Catalonia will discover how an algorithm is able to predict the success of a startup without any human intervention.
To recap the process, five startups from the Wayra Academy, Telefónica’s startup accelerator, will present their projects on stage through five-minute, to-the-point pitches. Afterwards, PreSeries will ask each contender a number of questions in order to provide a score between 0 and 100. The startup with the highest score will win the battle. Having the opportunity to participate in the battle is key for participating startups, as it will give them excellent exposure to potential corporate sponsors, strategic partners and the venture investment community. Stay tuned for future announcements, where we will reveal the contenders of the fourth edition of our Startup Battle, as it may just prove to be the most competitive one so far.