Programming Deepnets with the BigML API

So far, we have introduced BigML’s Deepnets, how they are used, and how to create one in the BigML Dashboard. In this post, the fourth in our series of blog posts about Deepnets, we will see how to work with them programmatically through the BigML API. So let’s start!

The API workflow to create Deepnets includes five main steps: first, upload your data to BigML, then create a dataset, create a Deepnet, evaluate it and finally make predictions. Note that any resource created with the API will automatically be created in your Dashboard too so you can take advantage of BigML’s intuitive visualizations at any time.

Authentication

The first step in using the API is setting up your authentication. This is done by setting the BIGML_USERNAME and BIGML_API_KEY environment variables. Your username is the same as the one you use to log into the BigML website. To find your API key, go to your user account page on the website and click on ‘API Key’ on the left. To set your environment variables, you can add lines like the following to your .bash_profile file:

export BIGML_USERNAME=my_name
export BIGML_API_KEY=13245 
export BIGML_AUTH="username=$BIGML_USERNAME;api_key=$BIGML_API_KEY;"
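
Before sending any requests, it can help to confirm that both variables are actually set in the current shell. The following is a minimal sketch of such a check; the function name bigml_auth_string is hypothetical and is not part of any BigML tooling.

```shell
# Hypothetical helper: build the authentication query string from the
# BIGML_USERNAME and BIGML_API_KEY environment variables described above.
# Prints the string on success; returns non-zero if either variable is unset.
bigml_auth_string() {
  if [ -z "${BIGML_USERNAME:-}" ] || [ -z "${BIGML_API_KEY:-}" ]; then
    echo "BIGML_USERNAME and BIGML_API_KEY must both be set" >&2
    return 1
  fi
  printf 'username=%s;api_key=%s;' "$BIGML_USERNAME" "$BIGML_API_KEY"
}
```

You could then set the variable with BIGML_AUTH="$(bigml_auth_string)" and stop early if the command fails.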

Once your authentication is set up, you can begin your workflow.

1. Upload your Data

Data can be uploaded to BigML in many different ways. You can use a local file, a remote URL, or inline data. To create a source from a remote URL, use the curl command:

curl "https://bigml.io/source?$BIGML_AUTH" \
-X POST \
-H 'content-type: application/json' \
-d '{"remote": "https://static.bigml.com/csv/iris.csv"}'
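
The commands in the following steps refer to resources by ID (for example, source/59c14dce2ba7150b1500fdb5). Here is a minimal sketch of one way to pull that ID out of a JSON response. It assumes the response contains a "resource" key whose value has the form type/24 hexadecimal characters, as the IDs in this post do. The function name extract_resource_id is hypothetical.

```shell
# Hypothetical helper: read a JSON response on stdin and print the value of
# its "resource" key, e.g. source/59c14dce2ba7150b1500fdb5.
# Assumption: IDs look like <type>/<24 hex characters>, as in this post.
# Prints nothing if no such key is found.
extract_resource_id() {
  grep -oE '"resource": *"[a-z_]+/[0-9a-f]{24}"' \
    | head -n 1 \
    | grep -oE '[a-z_]+/[0-9a-f]{24}'
}
```

For example, piping the output of the curl command above (run with curl's -s flag to silence the progress meter) into extract_resource_id lets you store the result in a shell variable and reuse it in the next step.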

2. Create a Dataset

Now that you have a source, you will need to process it into a dataset. Use the curl command below, replacing the source ID with the ID of the source you just created:

curl "https://bigml.io/dataset?$BIGML_AUTH" \
-X POST \
-H 'content-type: application/json' \
-d '{"source": "source/59c14dce2ba7150b1500fdb5"}'

If you plan on running an evaluation later (and you will want to evaluate your results!), you will want to split this dataset into a testing and a training dataset. You will create your Deepnet using the training dataset (commonly 80% of the original dataset) and then evaluate it against the testing dataset (the remaining 20% of the original dataset). To make this split using the API, you will first run the command:

curl "https://bigml.io/dataset?$BIGML_AUTH" \
-X POST \
-H 'content-type: application/json' \
-d '{"origin_dataset": "dataset/59c153eab95b3905a3000054", 
     "sample_rate": 0.8, 
     "seed": "myseed"}'

using the sample rate and seed of your choice. This creates the training dataset.

Now, to make the testing dataset, run:

curl "https://bigml.io/dataset?$BIGML_AUTH" \
-X POST \
-H 'content-type: application/json' \
-d '{"origin_dataset": "dataset/59c153eab95b3905a3000054", 
     "sample_rate": 0.8,
     "out_of_bag": true,
     "seed": "myseed"}'

By setting “out_of_bag” to true while keeping the same sample rate and seed, you select exactly the rows that were not chosen for the training dataset. This will be your testing dataset.
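
Because the two requests must share the same origin dataset, sample rate, and seed, it can be convenient to generate both request bodies from a single set of arguments. The following is a minimal sketch; the function name split_payload and its argument order are hypothetical, and it assumes the arguments contain no characters that need JSON escaping.

```shell
# Hypothetical helper: print the JSON body for one half of a train/test split.
# Arguments: origin dataset ID, sample rate, seed, and "train" or "test".
# The "test" body is identical except that it adds "out_of_bag": true.
split_payload() {
  if [ "$4" = "test" ]; then
    printf '{"origin_dataset": "%s", "sample_rate": %s, "out_of_bag": true, "seed": "%s"}' \
      "$1" "$2" "$3"
  else
    printf '{"origin_dataset": "%s", "sample_rate": %s, "seed": "%s"}' \
      "$1" "$2" "$3"
  fi
}
```

You could pass the output to the -d option of the curl commands above, for example -d "$(split_payload dataset/59c153eab95b3905a3000054 0.8 myseed test)".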

3. Create a Deepnet

Now that you have the datasets you need, you can create your Deepnet. To do this, use the command:

curl "https://bigml.io/deepnet?$BIGML_AUTH" \
-X POST \
-H 'content-type: application/json' \
-d '{"dataset": "dataset/59c15634b95b3905a1000032"}' 

being sure to use the dataset ID of your training dataset. This will create a Deepnet by using the default settings.

You can also modify the settings of your Deepnet in various ways. For example, to set the maximum number of gradient steps to ten and name your Deepnet “my deepnet”, you could run:

curl "https://bigml.io/deepnet?$BIGML_AUTH" \
-X POST \
-H 'content-type: application/json' \
-d '{"dataset": "dataset/59c15634b95b3905a1000032",
     "max_iterations": 10,
     "name": "my deepnet"}' 

The full list of Deepnet arguments can be found in our API documentation, which will be fully available on October 5.

4. Evaluate your Deepnet

Once you have created a Deepnet, you may want to evaluate it to see how well it is performing. To do this, create an Evaluation using the resource ID of your Deepnet and the dataset ID of the testing dataset you created earlier.

curl "https://bigml.io/evaluation?$BIGML_AUTH" \
-X POST \
-H 'content-type: application/json' \
-d '{"deepnet": "deepnet/59c157cfb95b390597000085",
     "dataset": "dataset/59c1568ab95b3905a0000040"}'

Once you have your Evaluation, you may decide that you want to change some of your Deepnet parameters to improve its performance. If so, just repeat step three with different parameters.
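
If you want to compare several settings, the request body from step three can be generated rather than typed by hand. This is a minimal sketch: the function name deepnet_payload is hypothetical, the candidate values in the commented loop are arbitrary examples, and it assumes the name contains no characters that need JSON escaping.

```shell
# Hypothetical helper: print the JSON body for a Deepnet creation request.
# Arguments: training dataset ID, max_iterations value, Deepnet name.
deepnet_payload() {
  printf '{"dataset": "%s", "max_iterations": %s, "name": "%s"}' "$1" "$2" "$3"
}

# Example (not run here): create one Deepnet per candidate setting.
# The values 10, 100 and 1000 are arbitrary illustrations.
# for steps in 10 100 1000; do
#   curl "https://bigml.io/deepnet?$BIGML_AUTH" \
#     -X POST \
#     -H 'content-type: application/json' \
#     -d "$(deepnet_payload dataset/59c15634b95b3905a1000032 "$steps" "my deepnet $steps")"
# done
```

Each resulting Deepnet can then be evaluated against the same testing dataset, as in step four.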

5. Make Predictions

When you are satisfied with your Deepnet, you can begin to use it to make predictions. For example, suppose you want a prediction for a new instance in which the field “000001” has the value 3. To do this, use the command:

curl "https://bigml.io/prediction?$BIGML_AUTH" \
-X POST \
-H 'content-type: application/json' \
-d '{"deepnet": "deepnet/59c157cfb95b390597000085", 
     "input_data" : {"000001": 3}}'

Want to know more about Deepnets?

Stay tuned for more blog posts! In the next post, we will explain how to automate Deepnets with WhizzML and the BigML Python Bindings. Additionally, if you have any questions or you’d like to learn more about how Deepnets work, please visit the dedicated release page. It includes a series of six blog posts about Deepnets, the BigML Dashboard and API documentation, the webinar slideshow as well as the full webinar recording.

 
