Automating Deepnets with WhizzML and The BigML Python Bindings

Posted by

This blog post, the fifth of our series of six posts about Deepnets, focuses on those users that want to automate their Machine Learning workflows using programming languages. If you follow the BigML blog, you may know WhizzML, BigML’s domain-specific language for automating Machine Learning workflows, implementing high-level Machine Learning algorithms, and easily sharing them with others. WhizzML helps developers to create Machine Learning workflows and execute them entirely in the cloud. This avoids network problems, memory issues and lack of computing capacity while taking full advantage of WhizzML’s built-in parallelization. If you are not familiar with WhizzML yet, we recommend that you read the series of posts we published this summer about how to create WhizzML scripts: Part 1, Part 2 and Part 3 to quickly discover their benefits.

In addition, in order to easily automate the use of BigML’s Machine Learning resources, we maintain a set of bindingswhich allow users to work in their favorite language (Java, C#, PHP, Swift, and others) with the BigML platform.

Screen Shot 2017-03-15 at 01.51.13

Let’s see how to use Deepnets through both the popular BigML Python Bindings and WhizzML, but note that the operations described in this post are also available in this list of bindings.

We start creating Deepnets with the default settings. For that, we need to start from an existing Dataset to train the network in BigML, so our call to the API will need to include the Dataset ID we want to use for training as shown below:

;; Creates a deepnet with default parameters
(define my_deepnet (create-deepnet {"dataset" training_dataset}))

The BigML API is mostly asynchronous, that is, the above creation function will return a response before the Deepnet creation is completed. This implies that the Deepnet information is not ready to make predictions right after the code snippet is executed, so you must wait for its completion before you can predict with it. You can use the directive “create-and-wait-deepnet” for that:

;; Creates a deepnet with default settings. Once it's
;; completed the ID is stored in my_deepnet variable
(define my_deepnet
  (create-and-wait-deepnet {"dataset" training_dataset}))

If you prefer the BigML Python Bindings, the equivalent code is:

from bigml.api import BigML
api = BigML()
my_deepnet = api.create_deepnet("dataset/59b0f8c7b95b392f12000000")

Next up, we will configure a Deepnet with WhizzML. The configuration properties can be easily added in the mapping by using property pairs such as <property_name> and <property_value>. For instance, to create a Deepnet with a dataset, BigML automatically fixes the number of iterations to optimize the network to 20,000, but if you prefer the maximum number of gradient steps to take during the optimization process, you should add the property “max_iterations and set it to 100,000. Additionally, you might want to set the value used by the Deepnet when numeric fields are missing. Then, you need to set thedefault_numeric_valueproperty to the right value. In the example shown below, it is replaced by the mean value. Property names always need to be between quotes and the value should be expressed in the appropriate type. The code for our example can be seen below:

;; Creates a deepnet with some settings. Once it's
;; completed the ID is stored in my_deepnet variable
(define my_deepnet
  (create-and-wait-deepnet {"dataset" training_dataset
                            "max_iterations" 100000
                            "default_numeric_value" "mean"}))

The equivalent code for the BigML Python Bindings is:

from bigml.api import BigML
api = BigML()
args = {"max_iterations": 100000, "default_numeric_value": "mean"}
training_dataset ="dataset/59b0f8c7b95b392f12000000"
my_deepnet = api.create_deepnet(training_dataset, args)

For more details about these and other properties, please check the dedicated API documentation (available on October 5.)

Once the Deepnet has been created, we can evaluate how good its performance is. Now, we will use a different dataset with non-overlapped data to check the Deepnet performance.  The “test_dataset” parameter in the code shown below represents the second dataset. Following WhizzML’s philosophy of “less is more”, the snippet that creates an evaluation has only two mandatory parameters: a Deepnet to be evaluated and a Dataset to use as test data.

;; Creates an evaluation of a deepnet
(define my_deepnet_ev
 (create-evaluation {"deepnet" my_deepnet "dataset" test_dataset}))

Similarly, in Python the evaluation is done as follows:

from bigml.api import BigML
api = BigML()
my_deepnet = "deepnet/59b0f8c7b95b392f12000000"
test_dataset = "dataset/59b0f8c7b95b392f12000002"
evaluation = api.create_evaluation(my_deepnet, test_dataset)

After evaluating your Deepnet, you can predict the results of the network for new values of one or many features in your data domain. In this code, we demonstrate the simplest case, where the prediction is made only for some fields in your dataset.

;; Creates a prediction using a deepnet with specific input data
(define my_prediction
 (create-prediction {"deepnet" my_deepnet
                     "input_data" {"sepal length" 2 "sepal width" 3}}))

And the equivalent code for the BigML Python bindings is:

from bigml.api import BigML
api = BigML()
input_data = {"sepal length": 2, "sepal width": 3}
my_deepnet = "deepnet/59b0f8c7b95b392f12000000"
prediction = api.create_prediction(my_deepnet, input_data)

In addition to this prediction, calculated and stored in BigML servers, the Python Bindings allow you to instantly create single local predictions on your computer. The Deepnet information is downloaded to your computer the first time you use it, and the predictions are computed locally on your machine, without any costs or latency:

from bigml.deepnet import Deepnet
local_deepnet = Deepnet("deepnet/59b0f8c7b95b392f12000000")
input_data = {"sepal length": 2, "sepal width": 3}

It is pretty straightforward to create a Batch Prediction from an existing Deepnet, where the dataset named “my_dataset” represents a set of rows with the data to predict by the network:

;; Creates a batch prediction using a deepnet 'my_deepnet'
;; and the dataset 'my_dataset' as data to predict for
(define my_batchprediction
 (create-batchprediction {"deepnet" my_deepnet
                          "dataset" my_dataset}))

The code in Python Bindings to perform the same task is:

from bigml.api import BigML
api = BigML()
my_deepnet = "deepnet/59d1f57ab95b39750c000000"
my_dataset = "dataset/59b0f8c7b95b392f12000000"
my_batchprediction = api.create_batch_prediction(my_deepnet, my_dataset)

Want to know more about Deepnets?

Our next blog post, the last one of this series, will cover how Deepnets work behind the scenes, diving into the most technical aspects of BigML’s latest resource. If you have any questions or you’d like to learn more about how Deepnets work, please visit the dedicated release page. It includes a series of six blog posts about Deepnets, the BigML Dashboard and API documentation, the webinar slideshow as well as the full webinar recording.


Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s