
Investigating Real-World Data with Time Series

In this blog post, the second in our six-post series on Time Series, we will bring the power of Time Series to a specific example. As we have previously posted, a BigML Time Series is a sequence of time-ordered data that has been processed using exponential smoothing, which applies up to three smoothing filters to dampen high-frequency noise and reveal the underlying trend of the data. With BigML’s simple and beautiful Dashboard visualizations, we’ll investigate the number of houses sold in the United States.

The Data

We will be examining the number of houses sold (in millions) in the United States by month and year from January 1963 to December 2016. 

Just looking at a scatterplot of the data, we see the number of houses sold goes generally up and down until early 1991, after which the trend is mostly upward. It reaches a peak in early 2005, then goes generally downward again until 2011, when it once more begins to climb. Within each of these years, there is a noticeable seasonal trend, with more houses sold in the summer months and fewer in the winter. But these are all subjective impressions. Can we create a quantifiable model to predict house volume?

The Chart

First, let’s create a Time Series model from the 1-click action menu by using our raw dataset.


We can see in the chart that our Time Series data is represented by the black line, and the plot of our best-fit model by the purple line. The model with the lowest AIC (one measure of fit) is labeled “M,A,N”. By clicking on the Select more models: dropdown, we can see this means the model is using Holt’s linear method with multiplicative errors, additive trend, and no seasonality. If we wished, we could select some other model, perhaps optimizing for some other measure of fit.
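To make this selection criterion concrete, here is a minimal sketch of how candidate models can be ranked by AIC. For Gaussian errors, AIC can be approximated as n·ln(SSE/n) + 2k; the candidate labels, error sums, and parameter counts below are purely hypothetical illustrations, not BigML's actual internals:

```python
import math

def aic(sse, n, k):
    """Akaike Information Criterion for a Gaussian-error model:
    n * ln(SSE / n) + 2k, where k is the number of free parameters.
    Lower is better: the 2k term penalizes extra parameters."""
    return n * math.log(sse / n) + 2 * k

# Hypothetical candidates: (label, sum of squared residuals, parameter count).
candidates = [
    ("A,N,N", 40.0, 2),   # simple exponential smoothing
    ("M,A,N", 22.0, 4),   # Holt's linear method with multiplicative error
    ("M,A,M", 21.5, 16),  # adds a 12-period seasonal component
]

n = 648  # monthly observations, Jan 1963 - Dec 2016
best = min(candidates, key=lambda c: aic(c[1], n, c[2]))
print(best[0])  # the seasonal model fits slightly better but pays a complexity penalty
```

Note how the seasonal candidate's marginally lower error does not compensate for its extra parameters, so the simpler "M,A,N" wins under this toy scoring.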

By sliding the Forecast slider, we can see what the model predicts for dates in the future. This model predicts that the volume of houses sold will continue to rise linearly. Because this model does not use seasonality, it doesn’t display the up-and-down pattern we would expect. Let’s create another Time Series, this time configuring the parameters so we can add seasonality.


This time the model with the lowest AIC is labeled “M,N,M” for multiplicative error, no trend, and multiplicative seasonality. It captures the ebb and flow of the seasonal sales, but no longer indicates that volume will continue to go up. Since 1963, housing volume has indeed been overall relatively flat.


Another Look at the Data

Perhaps we aren’t interested in the behavior housing volume has shown since 1963, but rather in what it has been doing recently. We may use our domain knowledge to reason that the housing bubble and the crash that followed were very unusual events, justifying our decision to focus on data from 2011 onwards. How has housing sales volume been changing during these years?

So we start by filtering our data to only include the months between January 2011 and December 2016. We want to capture seasonality, so we choose Configure Time Series from the configuration menu and on the advanced options, set Seasonality to All and Seasonal Periods to 12 (twelve months in a year). Now we can see both the upward trend and cyclic seasonality that we expect. One interesting and unexpected thing our model has discovered is that the cyclic trend is not completely smooth. It seems that there is a little uptick in housing volume in October of each year. Perhaps this can be explained by people wanting to buy before the busy holiday season!
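The filtering step amounts to keeping only rows whose date falls inside the chosen window. A minimal sketch, assuming hypothetical (year, month, value) rows rather than the actual dataset:

```python
# Keep only rows between January 2011 and December 2016 (inclusive).
# Rows are hypothetical (year, month, houses_sold_millions) tuples.
rows = [
    (2009, 6, 0.037), (2010, 11, 0.029),
    (2011, 1, 0.021), (2013, 7, 0.039),
    (2016, 12, 0.048), (2017, 2, 0.052),
]

recent = [r for r in rows if 2011 <= r[0] <= 2016]
print(len(recent))  # 3 rows survive the filter
```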


This has been our second blog post on the new Time Series resource. We’ve quickly put Time Series through its paces and used it to better understand sequential trends in our data. Please join us again next time for the third blog post in this series, which will cover a detailed Dashboard tutorial for Time Series.

Want to know more about Time Series?

Please visit the dedicated release page for further learning. It includes a series of six blog posts about Time Series, the BigML Dashboard and API documentation, the webinar slideshow as well as the full webinar recording.

Introduction to Time Series

We are proud to present Time Series as a new resource brought to the BigML platform. On July 20, it will be available via the BigML Dashboard, API and WhizzML. Time Series is a sequentially indexed representation of your historical data that can be used to solve classification and segmentation problems, in addition to forecasting future values of numerical properties, e.g., air pollution level in Madrid for the last two days. This is a very versatile method often used for predicting stock prices, sales forecasting, website traffic, production and inventory analysis, or weather forecasting, among many other use cases.

Following our mission of democratizing Machine Learning and making it easy for everyone, we will provide new learning material for you to start with Time Series from scratch and become a power user over time. We start by publishing a series of six blog posts that will progressively dive deeper into the technical and practical aspects of Time Series, with an emphasis on Time Series models for forecasting. Today’s post sets the tone by explaining the basic Time Series concepts. We will follow with an example use case. Then there will be several posts focused on how to use and interpret Time Series through the BigML Dashboard, API, and WhizzML to make forecasts for new time horizons. Finally, we will complete this series with a technical view of how Time Series models work behind the scenes.

Let’s get started!

Why Bring Time Series to BigML?

There are times when historical data can inform behavior in the short or longer term. However, unlike general classification or regression problems, Time Series needs your data to be organized as a sequence of snapshots of your input fields at various points in time. For example, the chart below depicts the variation of sales during a given month. Can we forecast the sales for future days or even months based on this data?

The answer is a resounding “Yes”, since BigML has implemented the exponential smoothing algorithm to train Time Series models. In this type of model, the data is modeled as a combination of exponentially-weighted moving averages. Exponential smoothing is not new; it was proposed in the late 1950s. Some of the most relevant early work was Robert Goodell Brown’s in 1956, and the field was later expanded by Charles C. Holt in 1957 and Peter Winters in 1960. Contrary to other methods, forecasts produced using exponential smoothing do not weight past instances equally: recent instances are given more weight than older instances. In other words, the more recent the observation, the higher the associated weight.
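To see the exponentially decaying weights in action, here is a minimal sketch of simple exponential smoothing. The sales figures and the α value are illustrative, not taken from BigML:

```python
def exp_smooth(series, alpha):
    """Simple exponential smoothing: each smoothed value is a weighted
    sum of all past observations, with weights alpha * (1 - alpha)**k
    that decay exponentially as observations age."""
    smoothed = [series[0]]  # initialize the level with the first value
    for x in series[1:]:
        smoothed.append(alpha * x + (1 - alpha) * smoothed[-1])
    return smoothed

sales = [10, 12, 11, 15, 14, 18, 17, 21]
level = exp_smooth(sales, alpha=0.5)
# The one-step-ahead forecast is simply the last smoothed level.
print(round(level[-1], 2))  # 18.69
```

The recursive form `alpha * x + (1 - alpha) * previous` is exactly the exponentially-weighted sum unrolled: each older observation's weight shrinks by a factor of (1 − α) per step.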

From Zero to Time Series Forecasts


In BigML, a regular Time Series workflow is composed of training your data, evaluating it and forecasting what will happen in the future. In that way, it is very much like other modeling methods available in BigML. But what makes Time Series different?

1. The training data structure: The instances in the training data need to be sequentially arranged. That is, the first instance in your dataset will be the first data point in time and the last one will be the most recent one. In addition, the interval between consecutive instances must be constant.

2. The objective fields: Time Series models can only be applied to numeric fields, but a single Time Series model can produce forecasts for all the numeric fields in the dataset at once (as opposed to classification or regression models, which only allow one objective field per model). In other words, a Time Series model accepts multiple objective fields, and in fact you can use all the numeric fields in the input dataset as objectives at once.

3. The Time Series models: BigML automatically trains multiple models for you behind the scenes and lists them according to different criteria, which is a big boost in productivity as compared to hand tuning all the different combinations of the underlying configuration options. BigML’s exponential smoothing methodology models Time Series data as a combination of different components: level, trend, seasonality, and error (see Understanding Time Series Models section for more details).

When creating a Time Series, we have several options regarding whether to model each component additively or multiplicatively, or whether to include a component at all. To alleviate this burden, BigML computes in parallel a model for each applicable combination, allowing you to explore how your Time Series data fits within the entire family of exponential smoothing models. Naturally, we need some way to compare the individual models. BigML computes several different performance measures for each model, allowing you to select the model that best fits your data and gives the most accurate forecasts. These measures and their corresponding formulas are described in depth in the Dashboard and API documentation.

4. Forecast: BigML lets you forecast the future values of multiple objective fields in short or long time horizons with a Time Series model. You will be able to separately train a different Time Series model for each objective field in just a few clicks. Your Time Series Forecasts come with forecast intervals: a range of values within which the forecast is contained with a 95% probability. Generally, these intervals grow as the forecast period increases, since there is more uncertainty when the predicted time horizon is further away.

5. Evaluation: Evaluations for Time Series models differ from supervised learning evaluations in that the training-test splits must be sequential. For an 80-20 split, the test set is the final 20% of rows in the dataset. Forecasts are then generated from the Time Series model with a horizon equal to the length of the test set. BigML computes the error between the test dataset values and forecast values and represents the evaluation performance in the form of several metrics: Mean Absolute Error (MAE), Mean Squared Error (MSE), R Squared, Symmetric Mean Absolute Percentage Error (sMAPE), Mean Absolute Scaled Error (MASE), and Mean Directional Accuracy (MDA). These metrics are fully covered in the Dashboard documentation.  Every exponential smoothing model type contained by a BigML Time Series model is automatically evaluated in parallel, so the end result is a comprehensive overview of all models’ performance.
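The sequential split and the error metrics described above can be sketched in a few lines. This is an illustrative stand-alone implementation with a naive flat forecast standing in for a real model; it is not BigML's API:

```python
def sequential_split(series, test_fraction=0.2):
    """Time Series train/test splits must be sequential, not random:
    the test set is the final fraction of rows."""
    cut = int(len(series) * (1 - test_fraction))
    return series[:cut], series[cut:]

def mae(actual, forecast):
    """Mean Absolute Error: average magnitude of the forecast errors."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)

def smape(actual, forecast):
    """Symmetric Mean Absolute Percentage Error, in percent; 0 is perfect."""
    return 100 / len(actual) * sum(
        2 * abs(f - a) / (abs(a) + abs(f)) for a, f in zip(actual, forecast))

series = [20, 22, 21, 25, 24, 28, 27, 31, 30, 34]
train, test = sequential_split(series)  # final 20% of rows become the test set
forecast = [train[-1]] * len(test)      # naive flat forecast as a stand-in model
print(mae(test, forecast), round(smape(test, forecast), 2))
```

A real evaluation would generate the forecast from a fitted Time Series model with a horizon equal to the test-set length, then compute these metrics between the forecast and the held-out values.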

Understanding Time Series Models

As mentioned in our description above, Time Series models are characterized by these four components:

  • Level: Exponential smoothing, as the name suggests, reduces noisy variation in a Time Series’ value, resulting in a gentler, smoother curve. To understand how it works, let us first consider a simple moving average filter: for a given order m, the filtered value at time t is simply the arithmetic mean of the m preceding values, or in other words, an equally-weighted sum of the past m values. In exponential smoothing, the filtered value is a weighted sum of all the preceding values, with the weights being highest for the most recent values, and decreasing exponentially as we move towards the past. The rate of this decrease is controlled by a single smoothing coefficient α, where higher values mean the filter is more responsive to localized changes. In the following figure, we compare the results of filtering stock price data with moving average and exponential smoothing. Both smoothing techniques attenuate the sharp peaks found in the underlying Time Series data. The filtered values from exponential smoothing are what we call the level component of the Time Series model. Because the behavior of exponential smoothing is governed by a continuous parameter α, rather than an integer m, the number of possible solutions is infinitely greater than for the moving average filter, and it is possible to achieve a superior fit to the data by performing parameter optimization. Moreover, the exponential smoothing procedure for the level component may be analogously applied to the remaining Time Series components: trend and seasonality.
  • Trend: While the level component represents the localized average value of a Time Series, the trend component of a Time Series represents the long term trajectory of its value. We represent trend as either the difference between consecutive level values (additive trend, linear trajectory), or the ratio between them (multiplicative trend, exponential trajectory). As with the level component, the sequence of local trend values is considered to be a noisy series which we can again smooth in an exponential fashion. The following series exhibits a pronounced additive (linear) trend.
  • Seasonality: The seasonality component of a Time Series represents any variation in its value which follows a consistent pattern over consecutive, fixed-length intervals. For example, sales of alcohol may be consistently higher during summer months and lower during winter months year after year. This variation may be modeled as a relatively constant amount independent of the Time Series’ level (additive seasonality), or as a relatively constant proportion of the level (multiplicative seasonality). The following series is an example of multiplicative seasonality on a yearly cycle. Note that the magnitude of variation is larger when the level of the series is higher.
  • Error: After accounting for the level, trend, and seasonality components, there remains some variation not yet captured by the model. Like seasonality, error may be modeled as an additive process (independent of the series level), or a multiplicative process (proportional to the series level). Parameterizing the error component is important for computing confidence bounds for Time Series Forecasts.
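The components above combine through exponential smoothing recurrences. Below is a minimal, illustrative sketch of additive Holt-Winters (additive trend and seasonality, just one of the combinations BigML explores); the initialization is deliberately crude and the data is invented:

```python
def holt_winters_additive(y, m, alpha, beta, gamma):
    """Minimal additive Holt-Winters: a level l, a trend b, and m
    seasonal indices s are each updated by exponential smoothing.
    Needs at least two full seasons (2*m points) for initialization."""
    l = sum(y[:m]) / m                            # level: first-season mean
    b = (sum(y[m:2 * m]) - sum(y[:m])) / (m * m)  # trend: season-over-season slope
    s = [y[i] - l for i in range(m)]              # additive seasonal indices
    for t, x in enumerate(y):
        l_new = alpha * (x - s[t % m]) + (1 - alpha) * (l + b)
        b = beta * (l_new - l) + (1 - beta) * b
        s[t % m] = gamma * (x - l_new) + (1 - gamma) * s[t % m]
        l = l_new
    # Forecast h = 1..m steps ahead: extend the trend, reuse seasonal indices.
    n = len(y)
    return [l + h * b + s[(n + h - 1) % m] for h in range(1, m + 1)]

# Two seasonal cycles of period-4 data with an upward trend.
y = [10, 14, 8, 12, 13, 17, 11, 15]
forecast = holt_winters_additive(y, m=4, alpha=0.3, beta=0.1, gamma=0.2)
print([round(f, 1) for f in forecast])
```

The forecast continues the upward trend while repeating the seasonal shape: the historically high second period stays high and the low third period stays low.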

In Summary

To wrap up, BigML’s Time Series models:

  • Help solve use cases such as predicting stock prices, sales forecasting, website traffic, production and inventory analysis as well as weather forecasting, among many others.
  • Are used to characterize the properties of ordered sequences, and to forecast their future behavior.
  • Implement exponential smoothing, where the data is modeled as a combination of exponentially-weighted moving averages. That is, the recent instances are given more weight than older instances.
  • Train the data with a different split compared to other modeling methods. The split needs to be sequential rather than random, so you can test your model against the latter period in your dataset.
  • Are always trained with a level component and can include other components such as trend (optionally damped), seasonality, and error.
  • Let you forecast one or multiple objective fields. For multiple objectives, you can train a separate set of Time Series models for each objective field.

You can create Time Series models, interpret and evaluate them, as well as forecast short and longer horizons with them via the BigML Dashboard, our API and bindings, plus WhizzML (to automate your Time Series workflows). All of these options will be showcased in future blog posts.

Want to know more about Time Series?

At this point, you may be wondering how to apply Time Series to solve real-world problems. Rest assured, we’ll cover specific examples in the coming days. For starters, in our next post, we will show a use case where we will be examining a dataset with the number of houses sold in the United States since January 1963 to see if we can predict general or even seasonal trends. Stay tuned!

We hope this post whetted your appetite to learn more about Time Series. Please visit the dedicated release page for further learning. It includes a series of six blog posts about Time Series, the BigML Dashboard and API documentation, the webinar slideshow as well as the full webinar recording.

BigML Spring 2017 Release and Webinar: Time Series!

BigML’s Spring 2017 Release is here! Join us on Thursday July 20, 2017, at 10:00 AM US PDT (Portland, Oregon. GMT -07:00) / 07:00 PM CEST (Valencia, Spain. GMT +02:00) for a FREE live webinar to discover the updated version of BigML’s platform. We’ll be showcasing Time Series, the latest supervised learning method added to our toolset for analyzing time-based data when historical patterns can explain future behavior.


Our new capability brought to the BigML Dashboard, API and WhizzML is Time Series, a well-known supervised learning method commonly used for predicting stock prices, sales forecasting, website traffic, production and inventory analysis as well as weather forecasting, among many other use cases. In BigML, a Time Series model is trained with Time Series data, that is, a field that contains a sequence of equally spaced data points in time. BigML implements exponential smoothing to train Time Series models, where the data is modeled as a combination of exponentially-weighted moving averages.


Because Time Series is a supervised learning method, you can evaluate its performance. As usual, prior to training your model you will need to split your dataset into two subsets: one for training and the other for testing. However, the split for Time Series has to be sequential rather than random, which means that you will test your model against the most recent instances in your dataset, representing the latter period. BigML offers a special option (via the API or Dashboard) for this type of sequential split. You can then easily interpret the results of your model by visually comparing them against the corresponding test data in a chart view.


As with every BigML resource, you can make predictions with your model. With Time Series Forecasts you can easily forecast events over short or longer time horizons. You can also employ a Time Series model to forecast the future values of multiple objective fields. Additionally, BigML offers the ability to generate your forecast in real time on your preferred local device at no cost, which is ideal for making faster predictions.

Are you ready to discover this new BigML resource? Please visit the dedicated release page for further learning. It includes a series of six blog posts about Time Series, the BigML Dashboard and API documentation, the webinar slideshow as well as the full webinar recording.


Joint Webinar Video: BigML Machine Learning meets Trifacta Data Wrangling

Yesterday we hosted a joint webinar with Trifacta to showcase how seamlessly both platforms fit together in turning raw data into real-life predictive use cases. What makes these tools special is their emphasis on ease of use, making Machine Learning viable for significantly more professionals than ever before. These developers, analysts and business experts routinely work with critical business data sources, yet they lack the deep data engineering and/or Machine Learning technical skills that have been darn hard to acquire and retain for organizations that are not named Google or Facebook.

To solidify these benefits, Poul Petersen, BigML’s Chief Infrastructure Officer, and Victor Coustenoble, Technical Regional Manager EMEA at Trifacta, presented a live demo on how to solve a loan risk analysis use case. Special thanks to hundreds of curious minds that registered, attended and asked questions during the webinar. We know some of you couldn’t make it due to conflicts and others found out after the deadline. Don’t fret, you can now watch the full webinar recording on the BigML Youtube channel.

The accompanying presentations are also accessible on the BigML SlideShare page. As you will also find out in the recording, it doesn’t take much to leave behind the inertia and make a dash for sharpening your data wrangling and Machine Learning skills since both Trifacta and BigML offer FREE versions.

Stay tuned for future webinars with concrete examples of how to transform your data to actionable business insights. As always, let us know if there is a specific topic or technique you’d like to see covered next.

Data Wrangling meets Machine Learning

Imagine a world where a business expert can quickly generate actionable insights starting from raw data and ending up with powerful predictive models.

To avoid Garbage-in-Garbage-out situations, transformation and mapping of raw data into Machine Learning-ready data must be done properly. Data wrangling is a key process that enables Machine Learning to make a big impact. Filtering, joining, cleansing, and stacking are necessary steps to get your data Machine Learning-ready. Lowering the barriers for non-experts to work with data autonomously, increasing the productivity of experts, and offering a beautifully designed, easy-to-use platform: BigML and Trifacta share these key values in their respective fields.

These two disruptive forces of the analytics industry are coming together to host a FREE, joint webinar on July 6, 2017 at 09:00 AM PDT (Portland, Oregon. GMT -07:00) / 06:00 PM CEST (Valencia, Spain. GMT +02:00).

Within 30 minutes both companies are going to demonstrate how seamlessly their platforms fit together in turning raw data into real-life predictive use cases. You’ll witness the interplay of two leading analytics platforms devoted to user friendliness, simplicity and beautiful design as Victor Coustenoble from Trifacta and Poul Petersen from BigML will demo how they use BigML and Trifacta Wrangler tools to solve a loan risk analysis use case.

Please save the date and reserve your spot today; space is limited!

PreSeries’ VC-in-a-Box Crowned Dataholics.io at the 6th AI Startup Battle in São Paulo

The 6th Artificial Intelligence Startup Battle came to an end on June 21 in São Paulo in fully automated fashion. The jury of this unique battle, PreSeries’ algorithms, predicted with a score of 96.50 out of 100 that Dataholics is the startup most likely to succeed among the contenders. Dataholics captures and structures millions of data points about people on social networks such as Facebook, Linkedin, Google, Twitter, Google search results, blogs, web portals and online services. Their algorithm creates a unified profile for each person based on behavioral, professional and demographic indicators from their email, cell phone, name or ID.

From left to right: Renato Valente – Country Manager – Telefonica Open Future_ & Wayra Brasil; João Gabriel Souza – Co-Founder & CEO – Mr. Descartes; Eduardo D. Martucci – Founder and CEO – Voice Commerce; Daniel Mendes – Founder and CEO – Dataholics; Dhiogo Corrêa – Data Architect – Itera; Rafael Libardi – Public Relations Executive – Data H.

In the battle, all five startups had the chance to introduce their company during a 5-minute pitch. Later on, PreSeries’ AI asked all the contenders questions about key aspects of their business. The exchange was made possible through a voice-assistant device present on stage (thus the name ‘VC-in-a-box’).

Itera came in 2nd with a score of 86.81. Itera is a technology company founded in 2008 and established in São Carlos/SP, always aiming to build innovative solutions for its clients. They are now investing in a machine learning platform for text mining named ALICE. The platform is currently focusing on finance and marketing case studies.

Mr. Descartes got the 3rd position with a score of 62.13. This company provides a chatbot to help cities improve their waste management and sustainability. They work in collaboration with local governments, businesses, and people from the community in order to generate data, educate the public and build lasting partnerships.

Voice Commerce was the 4th startup in the ranking with a score of 62.12. Voice Commerce is a voicebot that provides anyone with a simple, objective and secure online purchase experience through voice commands. It creates the perfect solution for people with visual impairment when buying goods and services online.

Finally, Data H achieved the 5th position with a score of 62.10. DATA H is a startup focused on creating intelligent products and artificial intelligence outsourcing of research and development. DATA H has created its own ecosystem to enable artificial intelligence projects for a diverse set of sectors.

After the event, BigML’s CEO & Co-founder and President of PreSeries, Francisco J. Martin, said: “Having organized our 6th AI Startup Battle in only the last year and a half across the globe, it is amazing to us that humans are surprisingly open and adaptable in trusting PreSeries algorithms to assess the future prospects of startups. What started as a crazy idea has come to be seen as an obvious need. This can be attributed to the investment professionals being overwhelmed with mountains of new data created every day, which in turn highlights the acute need for objective assistance and automation.”

This edition of the battle took place on June 21 in São Paulo, Brazil, at the PAPIs Connect conference, Latin America’s 1st conference on real-world Machine Learning applications. Our next AI Startup Battle will be in Boston (Microsoft N.E.R.D. – MIT) for PAPIs ’17 (Oct. 24-25), stay tuned on Twitter with #AIStartupBattle and @PreSeries.

Machine Learning: Past, Present and Future by Tom Dietterich

BigML Chief Scientist, Professor Tom Dietterich gave one of the keynotes at the recent 2ML event held in Madrid, Spain. The event was jointly organized by the consultancy Barrabés and BigML, and gathered an audience of 400 decision makers, technology professionals, and industry practitioners. Based on popular demand, we have posted on our YouTube channel the video recording of Dr. Dietterich’s presentation that covers the evolution of Machine Learning since its inception:

The corresponding slide deck can be accessed on the BigML SlideShare page. It goes over present-day Machine Learning challenges such as Automated Decision Making, Perceptual Tasks, and Anomaly Detection. It concludes with key future themes that will keep the discipline occupied for years to come: Detecting and Correcting for Bias, Risk-sensitive Optimization, Explanation of Black Box Systems, Verification and Validation, and Integrating ML Components into larger software systems. Enjoy!

AI Startup Battle in São Paulo – Meet the Contenders!

PreSeries, the joint venture between Telefónica Open Future_ and BigML, will be hosting a brand new AI Startup Battle at PAPIs Connect on June 21 in São Paulo. PAPIs Connect is Latin America’s 1st conference on real-world Machine Learning applications and will feature talks from BigML, Nubank, Uber, IBM and many more.

But what makes the AI Startup Battle so special? Well, it is the absence of human involvement in selecting the eventual winner. Indeed, a human jury is no longer needed thanks to PreSeries’ AI. Our voice-controlled AI communicates with the contenders live on stage and generates scores to rank the startups and choose the winner. In our AI Startup Battles, our Artificial Intelligence is made available through a little device on stage. Our little “VC-in-a-box” asks the contenders a set of questions and chooses its follow-up questions based on answers given to previous ones. It will naturally focus on questions that have the most predictive power in its own bias-free opinion. In the end, the startup with the highest score is announced as the winner.

Meet the contenders!

At this point, you may be wondering who will be competing in the battle, so let’s get to know the contenders.

Dataholics

Dataholics captures and structures millions of data points about people on social networks such as Facebook, Linkedin, Google, Twitter, Google search results, blogs, web portals and online services. Their algorithm creates a unified profile for each person based on behavioral, professional and demographic indicators from their email, cell phone, name or ID.

Voice Commerce

Voice Commerce is a voicebot that provides anyone with a simple, objective and secure online purchase experience through voice commands. It creates the perfect solution for people with visual impairment when buying goods and services online.

Data H

DATA H is a company focused on creating intelligent products and artificial intelligence outsourcing of research and development. DATA H has created its own ecosystem to enable artificial intelligence projects for a diverse set of sectors.

Mr. Descartes

Mr. Descartes provides a chatbot to help cities improve their waste management and sustainability. They work in collaboration with local governments, businesses, and people from the community in order to generate data, educate the public and build lasting partnerships.

Itera

Itera is a technology company founded in 2008 and established in São Carlos/SP, always aiming to build innovative solutions for its clients. They are now investing in a machine learning platform for text mining named ALICE. The platform is currently focusing on finance and marketing case studies.

Stay tuned!

Be sure to stay tuned as the winner will be announced right after the event on social media (on Twitter with #AIStartupBattle) as well as on our blog. For more details, please follow us on: LinkedIn, Google+, Facebook, or Twitter. The countdown starts now!

Results of 5th AI Startup Battle in Palma de Mallorca

PreSeries hosted the 5th edition of the AI Startup Battle at the exclusive Global Tourism Innovation Summit, which took place in Palma de Mallorca, Spain on June 9. The event served as the meeting point of top international professionals discussing topics such as Tourism Innovation / AI / IoT / VR / Smart Destinations / Hotels / OTA / Tour Operator / Airlines / Connectivity / Big Data / New Technologies / Machine Learning / Airports / Ports / Infrastructure / Smart Mobility / Smart Management & Solutions / Dynamic And Immersive / Video Wall Experiences / Digital Signage / Mobile / Apps. The summit was organized by Agora Next Telefónica Open Future_, the first global tourism innovation program from Telefónica, a strategic partner oriented towards companies and entrepreneurs who aspire to become a global reference of Tourism 4.0. A large crowd of tourism experts and decision makers came to witness the power of the PreSeries Machine Learning algorithms.

Sr. D. Iñigo Valenzuela (center) – CEO of Smartvel – Winner of the 5th edition, receiving his prize alongside Valentín Fernández, Global Director of Business Development and Partnerships at Telefónica Open Future_ (center left) and Kemel Kharbichi, CEO and President of Agora Next (center right).

With a score of 96.51, Smartvel, a SaaS Supplier of Digital Destination Content, was crowned winner of the 5th edition of the AI Startup Battle. They have built a unique tool combining three types of content:

  • dynamic content like up-to-date travel agenda
  • traditional destination content gathered through geolocalization, e.g., restaurants, points of interest, sights and attractions
  • geocoded layers of content recommended by their clients to promote and cross-sell passes, places or events.

The second place finisher was Apartool with a score of 86.82. Apartool is the bridge between apartment blocks, aparthotels, and tour operators. This intermediation project aims to give travel agencies the smartest way to offer a global service to all their clients. Through its platform, Apartool offers the first booking solution specialized in reserving entire buildings for touristic purposes.

Finally, the third position, with 67.82 points, was for MallorcaWifi.com, which has been working for more than a decade in the implementation of a Wi-Fi network in Palma that allows citizens to connect at any time and at no cost. The project, which has been carried out jointly with the City of Palma, has allowed the simultaneous connection of up to ten million mobile devices in the last year.

From left to right: PreSeries’ VC-in-a-Box, Fabien Durand (Marketing Manager at PreSeries), Sr. D. Iñigo Valenzuela (CEO at Smartvel), Sr. D. Mauricio Socias (Founder & CEO at MallorcaWifi.com), Sr. D. Marc Vilar (CEO at Apartool) and Julian Vinué (Director of Wayra Barcelona).

We’ve been proud to host another impressive group of startups during this latest edition of our AI battle.  If you are interested in competing in our next AI Startup Battle in São Paulo, Brazil (June 21), please apply here and stay tuned!

Machine Learning Challenges and Impact: an interview with Thomas Dietterich

BigML’s Co-founder and Chief Scientist, Professor Thomas Dietterich was recently interviewed by National Science Review, a peer-reviewed journal aimed at reviewing cutting-edge developments across science and technology in China and around the world.

ML for Ecosystem Management

The piece touches on many contemporary topics that are a source of much debate in the AI/Machine Learning community, as well as his own projects:

  • Expanding application areas of Machine Learning e.g., anomaly detection techniques that can identify unusual transactions and present them to a human analyst for law enforcement or improving the management of forest fires in Oregon by applying reinforcement learning.

  • The impact of deep learning and its pros and cons, with specific emphasis on the brain drain caused by academics specialized in the topic migrating to large technology companies.

  • Interpretation of alternative future scenarios involving advanced AI systems, technological singularity and the (so called) superintelligence i.e., impact on humanity as a whole from economical, cultural and moral perspectives.

For extra credit, we also highly recommend the presentation below, which Professor Dietterich gave in Valencia at The Age Of Machine Learning event sponsored by BigML. It does an excellent job of bringing everyone up to speed in understanding the roots and evolution of the discipline of Machine Learning and the future challenges facing technologists like us and the society as a whole. Enjoy!
