Introduction to OptiML: Automatic Model Optimization

BigML’s upcoming release on Wednesday, May 16, 2018, will bring a new resource to the platform: OptiML. In this post, we’ll give a quick introduction to OptiML before moving on to the rest of our series of six blog posts (including this one), which offer a detailed perspective on the model optimization part of the release. Today’s post explains the basic concepts and will be followed by an example use case. Then, three more blog posts will focus on how to use OptiML through the BigML Dashboard, API, and WhizzML for automation. Finally, we will complete the series with a technical look at how OptiML works behind the scenes.

Understanding OptiML

At BigML, we are believers in human-in-the-loop Machine Learning and the importance of feature engineering, which is driven by subject matter expertise in real-life situations. As such, we have been treading carefully when it comes to ML automation, as it is all too easy these days to overpromise and deliver solutions that overfit or strike unacceptable tradeoffs between bias and variance.


BigML already offers a variety of highly effective supervised learning algorithms including deepnets, logistic regressions, models (decision trees), and ensembles. Thanks to our 1-click modeling capability, these can be executed with intelligent defaults to quickly form baseline models before you iterate on your project with different configuration options that can better solve your ML problem. Over time, based on popular demand, we have also made available a number of complementary WhizzML scripts that you can easily clone and execute to perform automated hyperparameter tuning or feature selection for specific algorithms such as ensembles.

We have been witnessing clear interest from our users in further automating model selection for the classification or regression problems they are tackling via BigML’s built-in automation options. The drive for more productivity is not surprising. However, the issue boils down to this: is it possible to create a generalized automation approach whereby all applicable algorithms offered on the platform can be compared and contrasted with as little as a few clicks? The obvious benefit is the time saved in deciding which direction of the hypothesis space to explore further to find an optimum model, as you avoid exhaustive trial-and-error experimentation with different algorithms and their parameter configurations.

Well, we have some good news to share on this very front! BigML’s OptiML capability is taking the automation of model selection to the next level.

In essence, OptiML is an automatic optimization option that will allow you to find the best supervised learning model for your data.

  • It can be used for both classification and regression problems.
  • It works by automatically creating and evaluating multiple models with multiple configurations (decision trees, ensembles, logistic regressions, and deepnets) by using Bayesian parameter optimization.
  • When the process finishes, you get a list of the best models so you can compare them and select the one that best suits your use case.
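To make this concrete, here is a minimal sketch of the arguments such a request might carry when an OptiML is created programmatically. The field names (`model_types`, `metric`, `max_training_time`) and the placeholder dataset id are illustrative assumptions, not the authoritative argument list; consult the BigML API documentation for the actual fields.

```python
# Hypothetical sketch of an OptiML creation request. Field names are
# assumptions for illustration; check the BigML API docs for the
# authoritative arguments.
optiml_args = {
    "dataset": "dataset/<your-dataset-id>",  # source dataset (placeholder id)
    "model_types": ["model", "ensemble",     # restrict the search to a subset
                    "logisticregression",    # of algorithms if desired
                    "deepnet"],
    "metric": "area_under_roc_curve",        # metric guiding the optimization
    "max_training_time": 1800,               # time budget in seconds
}

# With the BigML Python bindings, this would typically be submitted as:
#   from bigml.api import BigML
#   api = BigML()
#   optiml = api.create_optiml("dataset/<your-dataset-id>", optiml_args)
```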

 

OptiML Automates Model Optimizations

The OptiML menu option on the BigML Dashboard attempts to find the best model for a given dataset by sequentially trying groups of parameters, training models with them, evaluating those models, and trying a new group of parameters based on the results of the previous tries. In many cases, this process converges to a good solution faster thanks to its ability to reason about the expected outcome of a new set of parameters before executing them. Furthermore, the search can be guided by a user-specified performance metric (e.g., ROC AUC or F-measure) that steers the optimization process.
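The propose-train-evaluate loop described above can be sketched with a toy, self-contained example. This is not OptiML's actual implementation: the `evaluate` function stands in for a real train/evaluate cycle, and `propose` is a crude hill-climbing stand-in for the Bayesian parameter optimization OptiML uses (real Bayesian optimization fits a surrogate model over the history of results instead).

```python
import random

def evaluate(params):
    """Stand-in for training and evaluating a model with `params`.
    Toy objective: best score near learning_rate=0.1, depth=6."""
    lr, depth = params["learning_rate"], params["depth"]
    return 1.0 - abs(lr - 0.1) - 0.02 * abs(depth - 6)

def propose(history):
    """Stand-in for Bayesian parameter proposal: sample near the best
    configuration seen so far, using past results to guide the next try."""
    if not history:
        return {"learning_rate": random.uniform(0.01, 0.5),
                "depth": random.randint(1, 12)}
    best_params, _ = max(history, key=lambda h: h[1])
    return {"learning_rate": max(0.01, best_params["learning_rate"]
                                 + random.gauss(0, 0.05)),
            "depth": max(1, best_params["depth"]
                         + random.choice([-1, 0, 1]))}

random.seed(0)
history = []  # (params, score) pairs from previous tries
for _ in range(30):
    params = propose(history)
    history.append((params, evaluate(params)))

best_params, best_score = max(history, key=lambda h: h[1])
```

The key idea the sketch shares with OptiML is that each new candidate configuration is informed by the scores of earlier ones, rather than being drawn blindly as in random or grid search.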

OptiML can be configured to let the search try all applicable model types (deepnets, logistic regressions, models, and ensembles) or only a subset of them. However, if deepnets are selected, OptiML won’t iterate over their parameters because deepnets already come with two automatic optimization options: automatic structure suggestion and automatic network search. In those instances, two deepnets, one from a network search and one from a structure suggestion, will automatically be executed as part of the model optimization.

On a related note, even though we consider them part of the supervised learning toolbox, Time Series are not included in the scope of OptiML, as time series datasets present a different type of data structure that is best treated differently from the other supervised tasks mentioned.

Finally, for completeness’ sake, in addition to finding the best supervised model among several algorithms with OptiML, we have also enabled the Automatic Optimization option for models, ensembles, logistic regressions, and deepnets separately. This means that you no longer need to manually tune any of your supervised models to achieve the best results. Instead, you can simply select the Automatic Optimization option and BigML will execute this task for your chosen algorithm only. Once complete, it will similarly return the top-performing model along with its related parameter values.

The Algorithm

The OptiML algorithm is split into two phases. The first, the “parameter search” phase, uses a single holdout set to iteratively find promising sets of parameters. The second, the “validation” phase, iteratively performs Monte Carlo cross-validation on those parameter sets that come reasonably close to the best.

For this second phase, the algorithm iteratively performs new train/test splits for the top half of the remaining candidates. Thus, the best models will typically have more than one evaluation associated with them.
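Monte Carlo cross-validation, as used in the validation phase, simply means evaluating on repeated random train/test splits rather than the fixed folds of k-fold cross-validation. A minimal sketch (the `train_and_score` callback here is a hypothetical stand-in for a real training/evaluation cycle):

```python
import random

def monte_carlo_cv(rows, train_and_score, iterations=5,
                   test_fraction=0.2, seed=42):
    """Repeated random train/test splits: each iteration shuffles the
    data, holds out a fresh test set, and records the score, so the
    final estimate averages over several independent evaluations."""
    rng = random.Random(seed)
    scores = []
    for _ in range(iterations):
        shuffled = rows[:]
        rng.shuffle(shuffled)
        cut = int(len(shuffled) * (1 - test_fraction))
        train, test = shuffled[:cut], shuffled[cut:]
        scores.append(train_and_score(train, test))
    return sum(scores) / len(scores)

# Toy usage: the "model" just predicts the training mean, scored by
# negative mean absolute error on the held-out rows.
def train_and_score(train, test):
    mean = sum(train) / len(train)
    return -sum(abs(x - mean) for x in test) / len(test)

data = list(range(100))
avg_score = monte_carlo_cv(data, train_and_score)
```

Averaging over several random splits gives a more stable performance estimate for the surviving candidates than the single holdout set used in the parameter search phase.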

Both phases are governed by an argument specifying the maximum training time allowed: BigML halts a given phase of the algorithm when it exceeds the time allotted to that phase. It does, however, guarantee that at least one iteration of each phase completes before returning. Thus, in extreme cases, such as massive datasets coupled with very short maximum training times, the process may significantly overrun the specified maximum training time.

Want to know more about OptiML?

If you have any questions or you would like to learn more about how OptiML works, please visit the release page. It includes a series of blog posts, the BigML Dashboard and API documentation, the webinar slideshow as well as the full webinar recording.
