Enhancing your Web Browsing Experience with Machine Learned Models (part I)
The other day, I was showing off the Kiva model from BigML’s gallery to my wife. I got the comment that while it’s super easy to do the predictions in the BigML dashboard, it would be even better if the model results could appear directly in the Kiva loan page, without needing to flip between browser tabs. This got me thinking: this sounds like a job for a browser extension. I had never created a browser extension before, and it struck me as an intriguing project. Turns out, injecting predictions into the Kiva webpages was a piece of cake, thanks to BigML’s feature-ful API and export capabilities. In this blog post, I’ll walk you through two versions of the BigML-powered browser extension: the first using actionable models, and the second using the BigML API to grab predictions from a live model.
Kiva is a microfinance website that connects lenders to individuals and groups in developing countries who are seeking loans to improve their quality of life. Historical records about the loans created throughout the site’s history can be accessed via Kiva’s RESTful API. The vast majority of Kiva loans are successfully repaid, but using data gathered through the Kiva API, we can train a BigML model to identify the circumstances which are more likely to result in a defaulted loan. This model was discussed at length in a previous blog post. Loans which are currently in the fundraising stage can also be queried through the Kiva API, and the resulting response can be fed to the BigML model to predict whether the final status of the loan will be defaulted. Our goal is to create a browser extension which runs while viewing the page for an individual loan or the list of loans, and which will insert some sort of indicator for the predicted outcome of the loan.
Laying the Groundwork
We’ll be creating a Chrome browser extension here, but the same code could easily be ported to a Greasemonkey script for Firefox. The Chrome developer guide can tell you all about building extensions, but hopefully I can communicate everything you need to know in this tutorial. Also, the source code for this project can be found on GitHub. Feel free to grab a copy and follow along.
Every Chrome extension starts with a JSON manifest file, which gives the browser some info about what kind of extension we have, and which files it will need access to. In our case, the manifest is pretty short and sweet. Here are its contents in their entirety:
The first four items are just metadata about the extension. The next item is the more interesting bit. Chrome extensions are split into multiple categories depending on their behavior. We want an extension that modifies the content of a webpage, so we need to create a content script. Within the definition of the content script, we give a URL match pattern (a glob-like syntax rather than a full regular expression) to specify the URLs at which we want the script to run. This pattern matches both the pages for viewing individual loans and the pages for browsing a list of loans. Next we state the scripts and stylesheets which comprise the extension. Note that I’ve bundled jQuery with the extension, as I lean on it for DOM manipulation and AJAX calls. The last item in the manifest specifies that the extension will need to access some image files located in the given directory.
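For readers who want to see the shape described above, here is a minimal sketch of such a manifest, written as a JavaScript object so it can be serialized to manifest.json. It is not the extension's actual manifest: the extension name, file names, image directory, and URL pattern are all assumptions chosen for illustration, and it uses the manifest version 3 layout that current Chrome releases expect.

```javascript
// Sketch of a content-script manifest, built as a plain object.
// ASSUMPTION: every name, path, and pattern here is illustrative,
// not copied from the extension's real manifest.
const manifest = {
  // Four metadata items
  manifest_version: 3,
  name: "Kiva Loan Predictions (example)",
  version: "0.1",
  description: "Shows a predicted loan status next to each Lend button.",

  // The content script: where it runs and which files it loads
  content_scripts: [
    {
      // Match pattern, not a regex: covers /lend and /lend/<id>
      matches: ["*://www.kiva.org/lend*"],
      js: ["jquery.min.js", "kiva.js"],
      css: ["kiva.css"]
    }
  ],

  // Image files that pages on kiva.org may load from the extension
  web_accessible_resources: [
    { resources: ["images/*"], matches: ["*://www.kiva.org/*"] }
  ]
};

// Serialize to the text that would be saved as manifest.json
const manifestJson = JSON.stringify(manifest, null, 2);
console.log(manifestJson);
```

In the older manifest version 2 layout, the last item would instead be a plain list of path strings.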
With the manifest squared away, we can move on to writing the script.
One of the strengths of the classification and regression tree models created by BigML is that they can be readily represented by a series of nested if statements. This is precisely what we get when we export a model as an actionable model.
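To give a feel for what "a series of nested if statements" looks like, here is a toy sketch in the same spirit. The input fields, thresholds, and outcomes are invented for illustration and say nothing about real Kiva loans; a model exported from BigML will have its own fields and split points, and the "paid" and "defaulted" labels are assumed here.

```javascript
// Toy decision tree in the style of an exported model.
// ASSUMPTION: the parameters, thresholds, and labels are made up;
// they do not come from the real Kiva model.
function predictStatus(loanAmount, repaymentTermMonths) {
  // Missing input: fall back to the majority class
  if (loanAmount === null || loanAmount === undefined) {
    return "paid";
  }
  if (loanAmount > 1500) {
    // Large loans: split again on the repayment term
    if (repaymentTermMonths > 18) {
      return "defaulted";
    }
    return "paid";
  }
  // Small loans: a single leaf
  return "paid";
}

console.log(predictStatus(2000, 24));
console.log(predictStatus(500, 24));
```

Because the whole model is plain control flow, it needs no network access and no libraries, which is what makes it so easy to drop into a content script.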
The actionable model accepts as parameters some data about a loan and returns the predicted status as a string. Our job now is to find the loan data that the model needs to do its thing. Kiva loans all have a unique ID number, so we’ll create a function which looks up a particular loan ID with the Kiva API, uses the returned information to make a prediction, and creates a status indicator which we will insert into the loan page’s DOM.
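The lookup-predict-indicate flow might be sketched roughly as follows. This is not the extension's actual code: the endpoint URL, the `{ loans: [...] }` response shape, the icon file names, and all helper names are assumptions made for illustration. The model is passed in as a `predict` function, and the HTTP call is passed in as `fetchJson` so that it can be any function returning a promise of parsed JSON.

```javascript
// ASSUMPTION: endpoint, response shape, and icon names are illustrative.
const KIVA_API_BASE = "https://api.kivaws.org/v1/loans/";

// Build the lookup URL for a single loan ID.
function loanApiUrl(loanId) {
  return KIVA_API_BASE + encodeURIComponent(loanId) + ".json";
}

// Pull the loan record out of the response and run the model on it.
// Returns null when the response contains no loan.
function statusFromResponse(response, predict) {
  const loan = response && response.loans && response.loans[0];
  return loan ? predict(loan) : null;
}

// Describe the indicator to draw for a predicted status.
function indicatorFor(status) {
  if (status === null) {
    return null;
  }
  const isDefault = status === "defaulted";
  return {
    icon: isDefault ? "images/red.png" : "images/green.png",
    title: "Predicted status: " + status
  };
}

// Tie the steps together: look up the loan, predict, describe the icon.
async function indicatorForLoan(loanId, predict, fetchJson) {
  const response = await fetchJson(loanApiUrl(loanId));
  return indicatorFor(statusFromResponse(response, predict));
}
```

In a content script, the returned icon path would still need to be resolved to an extension URL (for example with `chrome.runtime.getURL`) and wrapped in an image element before being inserted into the page.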
Finally, we need to differentiate the script’s behavior between the pages for viewing an individual loan and the pages for browsing a list of loans. The URLs for individual loan pages end with the loan ID, which we can extract with a regular expression. In the list view, each item contains a hyperlink to the individual loan page; we can grab that link with jQuery and again get the loan ID from the destination URL. From that point, it’s just a simple matter of calling predictStatus and inserting our indicator next to the “Lend” button.
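The ID extraction for both page types can be sketched like this. It assumes loan pages have URLs of the form `https://www.kiva.org/lend/<numeric id>`; the pattern and the helper names are illustrative rather than taken from the extension's source.

```javascript
// ASSUMPTION: loan page URLs end in /lend/<numeric id>, optionally
// followed by a trailing slash, query string, or fragment.
const LOAN_URL_PATTERN = /\/lend\/(\d+)\/?(?:[?#].*)?$/;

// Individual loan page: return the ID as a string, or null if the
// URL is not a loan page (for example, the list view itself).
function extractLoanId(url) {
  const match = LOAN_URL_PATTERN.exec(url);
  return match ? match[1] : null;
}

// List view: given the hrefs of the links on the page, keep the
// loan IDs, dropping non-loan links and duplicates.
function loanIdsFromLinks(hrefs) {
  const ids = hrefs.map(extractLoanId).filter((id) => id !== null);
  return Array.from(new Set(ids));
}

console.log(extractLoanId("https://www.kiva.org/lend/123456"));
console.log(extractLoanId("https://www.kiva.org/lend"));
```

A content script could call `extractLoanId(window.location.href)` first and fall back to scanning the page's links when it returns null, which gives one code path per page type.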
Running the extension
To run the extension, you must first install it through Chrome’s extension configuration screen. Once there, ensure that Developer mode is selected, then click “Load unpacked extension”, select the directory which contains your manifest.json, and you’re good to go. If all goes according to plan, you will now see a green or red indicator icon beside every “Lend” button on kiva.org.
If you decide to do any tinkering with the extension’s source code, you will need to reload the extension to see the effect of your changes.
Coming up: Using the BigML API
Using actionable models is arguably the easiest way to include BigML models in a browser extension, but having the model baked into the script can be inconvenient if the model changes frequently. New loans are constantly being posted on Kiva, so new data is always becoming available through the Kiva API. With BigML’s multi-dataset capabilities, we can continuously refine our model with a growing body of training data. Keeping our browser extension up to date with our model building efforts would involve pasting in a new version of predictStatus every time we create a new model. In the next installment of this tutorial, I’ll show how we can use BigML’s RESTful API to ensure that our extension is always using the freshest models. Stay tuned!