Enhancing your Web Browsing Experience with Machine Learned Models (part I)


The other day, I was showing off the Kiva model from BigML’s gallery to my wife. I got the comment that while it’s super easy to do the predictions in the BigML dashboard, it would be even better if the model results could appear directly in the Kiva loan page, without needing to flip between browser tabs. This got me thinking: this sounds like a job for a browser extension. I had never created a browser extension before, and it struck me as an intriguing project. Turns out, injecting predictions into the Kiva webpages was a piece of cake, thanks to BigML’s feature-ful API and export capabilities. In this blog post, I’ll walk you through two versions of the BigML-powered browser extension: the first using actionable models, and the second using the BigML API to grab predictions from a live model.

The Vision

Kiva is a micro-financing website which connects lenders to individuals and groups in developing countries who are seeking loans to improve their quality of life. Historical records about the loans created throughout the site’s history can be accessed via Kiva’s REST-ful API. The vast majority of Kiva loans are successfully repaid, but using data gathered through the Kiva API, we can train a BigML model to identify the circumstances which are more likely to result in a defaulted loan. This model was discussed at length in a previous blog post. Loans which are currently in the fundraising stage can also be queried by the Kiva API, and the resulting response can be fed to the BigML model to give a prediction of whether the final status of the loan will be paid or defaulted. Our goal is to create a browser extension which runs while viewing the page for an individual loan or the list of loans, and which will insert some sort of indicator for the predicted outcome of the loan.

Laying the Groundwork

We’ll be creating a Chrome browser extension here, but the same code could easily be ported to a Greasemonkey script for Firefox. The Chrome developer guide can tell you all about building extensions, but hopefully I can communicate everything you need to know in this tutorial. Also, the source code for this project can be found on GitHub. Feel free to grab a copy and follow along.

Every Chrome extension starts with a JSON manifest file, which gives the browser some info about what kind of extension we have, and which files it will need access to. In our case, the manifest is pretty short and sweet. Here are its contents in their entirety:


{
    "manifest_version": 2,
    "name": "Kiva Predictor",
    "description": "Predict the success of Kiva loans using BigML models",
    "version": "0.1",
    "content_scripts": [
        {
            "matches": ["http://www.kiva.org/lend*"],
            "js": ["jquery-2.1.1.min.js", "kivapredict.js"],
            "css": ["kivapredict.css"]
        }
    ],
    "web_accessible_resources": ["images/*.png"]
}


The first four items are just metadata about the extension. The next item is the more interesting bit. Chrome extensions are split into multiple categories depending on their behavior. We want an extension that modifies the content of a webpage, so we need to create a content script. Within the definition of the content script, we give a match pattern (a URL glob with * wildcards, rather than a full regular expression) to specify the URLs at which we want the script to run. This pattern matches both the pages for viewing individual loans and the pages for browsing a list of loans. Next we list the scripts and stylesheets which comprise the extension. Note that I’ve bundled jQuery with the extension, as I lean on it for DOM manipulation and AJAX calls. The last item in the manifest makes the image files in the given directory accessible from the web page, which the script needs in order to display its indicator icons.
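To get a feel for which URLs the pattern in the manifest selects, here is a rough sketch that turns a pattern into a regular expression. The helper name matchPatternToRegExp is hypothetical, and the translation is a simplified approximation of Chrome's matching rules rather than the browser's actual implementation. The loan ID in the sample URL is made up.

```javascript
// Simplified approximation of a Chrome match-pattern test: escape regex
// metacharacters, then let each "*" match any run of characters.
// This ignores special cases such as "*" as a scheme or "<all_urls>".
function matchPatternToRegExp(pattern) {
  var escaped = pattern.replace(/[.+?^${}()|[\]\\]/g, "\\$&");
  return new RegExp("^" + escaped.replace(/\*/g, ".*") + "$");
}

var lendPages = matchPatternToRegExp("http://www.kiva.org/lend*");
console.log(lendPages.test("http://www.kiva.org/lend"));       // true: list of loans
console.log(lendPages.test("http://www.kiva.org/lend/12345")); // true: individual loan
console.log(lendPages.test("https://www.kiva.org/lend"));      // false: scheme differs
```

Note that the scheme is part of the pattern, so pages served over https would need their own entry in "matches".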

With the manifest squared away, we can move on to writing the script.

Making Predictions

One of the strengths of the classification and regression tree models created by BigML is that they can be readily represented by a series of nested if statements. This is precisely what we get when we export a model as an actionable model.
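As a purely illustrative sketch of what "nested if statements" means here, consider the toy function below. The function name, split fields, thresholds, and outcomes are all invented for this example; they are not taken from the Kiva model, whose exported code is considerably larger.

```javascript
// Hypothetical toy tree: the splits and outcomes are invented for
// illustration and do not come from the BigML Kiva model.
function toyPredictStatus(loanAmount, sector) {
  if (loanAmount === null || loanAmount === undefined) {
    return "paid"; // fall back to the majority class when the input is missing
  }
  if (loanAmount > 1000) {
    if (sector === "Retail") {
      return "defaulted";
    }
    return "paid";
  }
  return "paid";
}

console.log(toyPredictStatus(500, "Agriculture")); // paid
console.log(toyPredictStatus(1500, "Retail"));     // defaulted
```

Each path from the first if down to a return corresponds to one branch of the tree, which is why a tree model translates so directly into plain code with no library dependencies.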

We will be using the publicly available Kiva model that is in the BigML gallery. Selecting Node.js as the export language gives us a JavaScript function which we can copy and paste, and call from within our content script. Here is the signature:


function predictStatus(fundedAmount, country, loanAmount, sector, fundedDateDayOfWeek, fundedDateDayOfMonth, fundedDateMonth, fundedDateYear)

The actionable model accepts as parameters some data about a loan and returns the predicted status as a string. Our job now is to find the loan data that the model needs to do its thing. Kiva loans all have a unique ID number, so we’ll create a function which looks up a particular loan ID with the Kiva API, and uses the returned information to make a prediction, and create a status indicator which we will insert into the loan page’s DOM.


function makeStatusIndicator(loan_id){
    // Query the Kiva API for one loan ID, feed the result through the actionable
    // model, and return a green light or red light icon depending on the result.
    // The icon is returned right away; its image is filled in once the request completes.

    // create DOM object for icon
    var img = $('<img class="indicator">') ;
    // Kiva API call
    var url = "http://api.kivaws.org/v1/loans/" + loan_id + ".json" ;
    $.get(url,function(data){
        // pass response to actionable model; the four funded-date arguments are left undefined
        var loan = data.loans[0] ;
        var status = predictStatus(loan.funded_amount,loan.location.country,loan.loan_amount,loan.sector) ;
        // create the indicator. use chrome.extension.getURL to resolve path to image resource
        if (status == "paid"){
            img.attr("src",chrome.extension.getURL("images/green_light.png")) ;
        }else{
            img.attr("src",chrome.extension.getURL("images/red_light.png")) ;
        }
        img.attr("title","The predicted status for this loan is: " + status.toUpperCase()) ;
    }) ;
    return img ;
}
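If you want to check the field mapping without hitting the network, one option is to pull it out into a small pure function. This is a sketch under assumptions: the helper name loanToModelInputs is hypothetical, the response shape simply mirrors the properties read in the script, and the sample values are made up.

```javascript
// Extract the four fields that the content script passes to predictStatus.
// Returns null when the response contains no loan.
function loanToModelInputs(response) {
  var loan = response && response.loans && response.loans[0];
  if (!loan) {
    return null; // nothing to predict on
  }
  return [
    loan.funded_amount,
    loan.location ? loan.location.country : undefined,
    loan.loan_amount,
    loan.sector
  ];
}

// Made-up sample in the shape the script expects from the Kiva API.
var sample = {
  loans: [{
    funded_amount: 250,
    loan_amount: 500,
    sector: "Food",
    location: { country: "Kenya" }
  }]
};
console.log(loanToModelInputs(sample)); // [ 250, 'Kenya', 500, 'Food' ]
```

With the exported model in scope, the result could be passed along with predictStatus.apply(null, loanToModelInputs(sample)).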

Finally, we need to differentiate the script’s behavior between the pages for viewing individual loans and the pages for browsing a list of loans. The URLs for individual loan pages end with the loan ID, which we can extract with a regular expression. In the list view, each item in the list contains a hyperlink to the individual loan page; we can grab that link with jQuery and again get the loan ID from the destination URL. From that point, it’s just a simple matter of calling makeStatusIndicator and inserting our indicator next to the “Lend” button.


// check if a loan ID is at the end of the url
var re = /lend\/(\d+)/ ;
var result = re.exec(window.location.href) ;
if (result !== null){
    // individual loan page
    var loan_id = result[1] ;
    var img = makeStatusIndicator(loan_id) ;
    $("#lendFormWrapper").append(img) ;
}else{
    // list of loans
    $("article.borrowerQuickLook").each(function(idx,element){
        var result = re.exec($(this).find("a.borrowerName").attr("href")) ;
        if (result === null){
            return ; // skip list items without a loan link
        }
        var img = makeStatusIndicator(result[1]) ;
        $(this).find("div.fundAction").append(img) ;
    }) ;
}
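The ID extraction can be tried on its own outside the browser. Below is a minimal sketch using the same regular expression; the helper name extractLoanId and the loan ID in the sample URL are made up for illustration.

```javascript
// Return the loan ID captured after "lend/" as a string, or null if the
// URL does not contain one.
function extractLoanId(url) {
  var match = /lend\/(\d+)/.exec(url || "");
  return match === null ? null : match[1];
}

console.log(extractLoanId("http://www.kiva.org/lend/12345")); // 12345
console.log(extractLoanId("http://www.kiva.org/lend"));       // null
```

Returning null for a missing ID makes it easy for the caller to tell an individual loan page apart from the list view, or to skip a list item that has no loan link.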


Running the extension

To run the extension, you must first install it through Chrome’s extension configuration screen. Once there, ensure that Developer mode is selected, then click “Load unpacked extension”, select the directory which contains your manifest.json, and you’re good to go. If all goes according to plan, you will now see a green or red indicator icon beside every “Lend” button on kiva.org.


If you decide to do any tinkering with the extension’s source code, you will need to reload the extension to see the effect of your changes.

Coming up: Using the BigML API

Using actionable models is arguably the easiest way to include BigML models in a browser extension, but having the model baked into the script can be inconvenient if the model changes frequently. New loans are constantly being posted on Kiva, so new data is always available through the Kiva API. With BigML’s multi-dataset capabilities, we can continuously refine our model with a growing body of training data. Keeping our browser extension up to date with our model building efforts would involve pasting in a new version of predictStatus every time we create a new model. In the next installment of this tutorial, I’ll show how we can use BigML’s REST-ful API to ensure that our extension is always using the freshest models. Stay tuned!
