WhizzML, the powerful Domain-Specific Language (DSL) developed by BigML, is used not only to automate Machine Learning (ML) workflows but also to implement high-level ML algorithms. As part of our recent release, BigML now supports a feature that allows users to configure optional inputs for WhizzML scripts on the BigML Dashboard.
Before we dive into the specifics, let’s briefly recap the different ways to create WhizzML scripts.
When using the BigML API, there are three ways to create a WhizzML script programmatically: make HTTP POST requests directly to the BigML API (for example, with curl commands); use BigMLer, the BigML command-line tool; or use the BigML bindings for different programming languages. Each script requires a corresponding metadata.json file that specifies its inputs and outputs, as well as other meta information such as its name and description.
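To give a rough idea of what such a metadata file might contain, here is a minimal sketch that builds one in Python and serializes it to JSON. The key names ("kind", "source_code", "inputs", "outputs") and the type strings ("dataset-id", "source-id", "supervised-model-id") are assumptions based on the layout commonly seen in public script examples, and the output entry is purely hypothetical. The input names are borrowed from the example script discussed later in this post. How an input is flagged as optional in the file is deliberately left out, because the post does not describe it.

```python
import json

# Assumed layout: every key and type string below is an illustration,
# not a confirmed specification. Check the API documentation before use.
metadata = {
    "name": "Example script",
    "description": "Illustrative script with three resource inputs",
    "kind": "script",
    "source_code": "script.whizzml",
    "inputs": [
        {"name": "dataset-input", "type": "dataset-id",
         "description": "Dataset the script works on"},
        {"name": "source-input", "type": "source-id",
         "description": "Source the script may use"},
        {"name": "supervised-model-input", "type": "supervised-model-id",
         "description": "Supervised model the script may use"},
    ],
    "outputs": [
        # Hypothetical output, included only to show the shape of an entry.
        {"name": "result", "type": "string",
         "description": "Hypothetical output of the script"},
    ],
}

# Serialize to the text that would be saved as metadata.json.
metadata_text = json.dumps(metadata, indent=2)
print(metadata_text)
```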
On the BigML Dashboard, there are also several ways to create a WhizzML script: clone a script from the Gallery; automatically generate a script that recreates a BigML resource by using the “Scriptify” menu option; import a WhizzML script from GitHub; or write your own from scratch in the WhizzML editor. You can easily specify the inputs and outputs for your WhizzML script on the Dashboard as seen below:
For a given WhizzML script, some inputs may be mandatory and others optional. You can also provide default values for inputs. When the input type is a “resource” (Sources, Datasets, Models, etc.), BigML now provides checkboxes on the Dashboard so that users can toggle each input between “mandatory” and “optional”. Users can also choose to provide default values for those inputs or leave them empty.
Let’s see an example script to go over how you can configure its inputs.
After validating the source code of a script as usual, here’s how the section for specifying inputs looks:
In this script, dataset-input is a mandatory resource while source-input is an optional resource, as shown by the checked optional checkbox. Note that source-input doesn’t have a default value. The third resource input, supervised-model-input, is also optional but a default value is provided. So if a user does not assign a value at execution, the default value will be used.
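The fallback behavior described above can be modeled in a few lines of Python. This is only a sketch of the logic, not BigML's implementation: the ScriptInput class, the resolve_inputs function, and the resource IDs are all hypothetical. It also assumes that a mandatory input with a default would fall back to that default, which the post does not state.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class ScriptInput:
    """Hypothetical model of one resource input of a script."""
    name: str
    optional: bool = False
    default: Optional[str] = None


def resolve_inputs(inputs: List[ScriptInput],
                   supplied: Dict[str, str]) -> Dict[str, str]:
    """Return the value each input would take at execution time."""
    resolved = {}
    for item in inputs:
        if item.name in supplied:
            # A value assigned at execution always wins.
            resolved[item.name] = supplied[item.name]
        elif item.default is not None:
            # No value assigned: fall back to the configured default.
            resolved[item.name] = item.default
        elif not item.optional:
            raise ValueError(f"mandatory input {item.name!r} has no value")
        # Optional input with no value and no default: simply left out.
    return resolved


# The three inputs from the example, with made-up resource IDs.
inputs = [
    ScriptInput("dataset-input"),
    ScriptInput("source-input", optional=True),
    ScriptInput("supervised-model-input", optional=True,
                default="model/000000000000000000000001"),
]
resolved = resolve_inputs(
    inputs, {"dataset-input": "dataset/000000000000000000000002"})
print(resolved)
```

Here source-input is absent from the result because it is optional and has no default, while supervised-model-input picks up its default.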
Even after the script has been created, it’s still possible to modify input configuration by clicking on the pencil icon in the input box:
This provides an opportunity after script creation to set a resource input as optional or mandatory, as well as to set its default value (if applicable), as shown below:
In the following screen, an input, source-input, is set as optional without a default value:
Here, the same input, source-input, is set as optional, and the drop-down list is used to select an available resource as its default value:
Once all changes are ready, you should click on the save button to save the input configuration.
When configuring an execution, optional resources are displayed as enabled, as shown by their green icons, even if you don’t provide a value for them, because they are not required to launch the execution. Mandatory resources appear disabled, as shown by their gray icons, until a valid value is provided for them. Consequently, the “Execute” button is enabled only when all inputs are displayed as enabled.
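As a sketch of that rule, the following Python models when an input counts as enabled and when the execution can be launched. The function names and the resource ID are hypothetical, and the check for a "valid" value is simplified to "a non-empty value was provided".

```python
from typing import Dict, Optional


def input_enabled(optional: bool, value: Optional[str]) -> bool:
    """Optional inputs are always enabled; mandatory ones need a value."""
    return optional or bool(value)


def can_execute(optional_flags: Dict[str, bool],
                values: Dict[str, Optional[str]]) -> bool:
    """The execution can be launched only when every input is enabled."""
    return all(
        input_enabled(is_optional, values.get(name))
        for name, is_optional in optional_flags.items()
    )


# True means the input is optional, False means it is mandatory.
flags = {"dataset-input": False, "source-input": True}

before = can_execute(flags, {})
after = can_execute(
    flags, {"dataset-input": "dataset/000000000000000000000002"})
print(before, after)
```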
Resources come and go, and a resource used as a default value for a WhizzML script input may be deleted down the line. When this happens, a default value that was previously defined for an input won’t be valid anymore. In those cases, you will still see the default value, which is the resource ID string, in the execution configuration. But a warning icon on top of it will indicate that the resource used as the default cannot be found.
However, you can use the drop-down box to select a new default from the list of available resources. Alternatively, because this is an optional input, you can ignore the warning and continue with the script execution.
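The check behind that warning can be pictured as a simple membership test. The sketch below is an illustration only: default_status, its return values, and the resource IDs are hypothetical, and the set of available IDs stands in for whatever lookup the Dashboard actually performs.

```python
from typing import Optional, Set


def default_status(default_id: Optional[str],
                   available_ids: Set[str]) -> str:
    """Classify an input's default value against the existing resources."""
    if default_id is None:
        return "no-default"
    if default_id in available_ids:
        return "ok"
    # The ID is still stored as the default, but the resource is gone.
    return "missing"


# Made-up resource IDs; the second default points at a deleted model.
available = {"model/000000000000000000000001"}
status_valid = default_status("model/000000000000000000000001", available)
status_stale = default_status("model/000000000000000000000009", available)
print(status_valid, status_stale)
```

A "missing" status corresponds to the case where the Dashboard still shows the resource ID string but flags it with a warning icon.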
A good example of optional resource inputs is BigML’s AutoML functionality. AutoML helps automate the complete Machine Learning pipeline, which includes both feature selection and model selection. Because the input datasets for training, validation, and testing are optional, users can choose not to pass one of them. For instance, if you only want to train a new AutoML model and evaluate it, but you don’t want to make predictions yet, then you don’t need to pass the test dataset to the execution. For more information on AutoML, please check out our blog post, AutoML: BigML Automated Machine Learning.
With this new feature, we hope to enhance the power and versatility of WhizzML for all our users looking to automate their workflows. Now, BigML users have even more options to create and manage different ML workflows and higher-level algorithms based on WhizzML. To learn more, please visit the release page, where you will find complete documentation on WhizzML and related resources.