Recently, perhaps the biggest news in tech was Microsoft’s acquisition of GitHub, the code hosting and version control company, for $7.5 billion USD. The move will almost certainly affect the workflow of developers and engineers, as GitHub is widely used by software development teams at companies small and large. Given GitHub’s association with the open source community and Microsoft’s uneven record with acquisitions (such as Skype), initial reactions were understandably mixed. But exactly how optimistic or pessimistic is the tech community about Microsoft taking ownership of this product?
There is perhaps no better place to take the pulse of people’s opinions in real time than Twitter. However, the sheer volume of information makes anecdotal or manual assessments intractable. The Twitter API conveniently allows a reasonable number of tweets to be queried, and each tweet can be treated as an individual document ripe for simple Natural Language Processing (NLP), such as sentiment analysis. Here we will demonstrate how the new BigML Zapier app allows you to seamlessly integrate Twitter queries related to GitHub with a predictive model of sentiment on BigML, and to send those results to another platform, such as a Google spreadsheet, for real-time updates and monitoring.
In this tutorial, we walk through the process of setting up an automated Zapier app, or “Zap”, that consists of three different integrations and zero lines of code:
- Twitter “trigger” to retrieve tweets containing keyword(s) of interest
- BigML prediction of sentiment for the retrieved tweets
- Google spreadsheet to record the text and sentiment of each tweet
Before beginning any Zapier integration with BigML predictions, it is important to ensure that a trained Machine Learning model for the problem at hand exists. To predict the sentiment of tweets, we first trained an ensemble model using the Sentiment140 dataset, made available by Stanford University researchers and hosted on Kaggle. This dataset contains 1.6 million tweets and two labels: 4 for positive sentiment and 0 for negative sentiment. While human sentiment can be much more subtle than two classes, the sheer number of labeled tweets makes this dataset compelling and useful for the analysis of Twitter data specifically. For more information on how to train and evaluate classification models, please check out our extensive documentation and material on supervised learning.
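For readers who want to inspect the training data before uploading it to BigML, the label scheme described above can be sketched in a few lines of Python. This is only an illustration: it assumes a headerless six-column CSV with the polarity in the first column and the tweet text in the last, and the two sample rows are made up.

```python
import csv
import io

# Sentiment140 encodes polarity numerically: 0 = negative, 4 = positive.
POLARITY_NAMES = {"0": "negative", "4": "positive"}


def read_labeled_tweets(handle):
    """Yield (text, label_name) pairs from a Sentiment140-style CSV."""
    for row in csv.reader(handle):
        # Assumed layout: polarity is the first field, tweet text the last.
        polarity, text = row[0], row[-1]
        yield text, POLARITY_NAMES[polarity]


# Two made-up rows in the assumed six-column layout.
sample = io.StringIO(
    '"4","1","Mon Jun 04 2018","NO_QUERY","alice","Great news for developers"\n'
    '"0","2","Mon Jun 04 2018","NO_QUERY","bob","Not happy about this"\n'
)
labeled = list(read_labeled_tweets(sample))
print(labeled)
```

With the real file you would pass an open file handle instead of the in-memory sample; the file's encoding is something to check before reading it.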
Step 1: Connecting a Twitter account and “trigger”
The first step in constructing this Zapier integration is selecting the desired “trigger” from a specific app, which in our case will be Twitter. Connecting an account is as simple as entering an existing username and password.
In Zapier, a trigger refers to an event that sets your Zap in motion. Our Zap will be triggered whenever any user on Twitter creates a new tweet that contains our specific search term of interest: “github microsoft”. By default, Twitter uses an “AND function” to combine multiple words, but more sophisticated options are available.
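To make the “AND” behavior concrete, here is a minimal Python sketch of what it means for a tweet to match every word in the search term. It is not how Twitter implements search (which also handles hashtags, operators, and more); the function name is hypothetical and serves only to illustrate the logic.

```python
import re


def matches_all_terms(tweet_text, query):
    """Return True when every whitespace-separated query term appears in the tweet."""
    # Lowercase and split the tweet into word tokens for a case-insensitive match.
    words = set(re.findall(r"\w+", tweet_text.lower()))
    return all(term in words for term in query.lower().split())


print(matches_all_terms("Microsoft buys GitHub for $7.5B", "github microsoft"))
print(matches_all_terms("GitHub is down again", "github microsoft"))
```

The first tweet mentions both words and would trigger the Zap; the second mentions only one and would not.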
Step 2: Making a prediction of sentiment using BigML
As outlined previously, it’s necessary to first train a model outside of Zapier if you would like to utilize any “actions” from the BigML Zapier app. In Zapier, an “action” is defined as an event that’s completed in a second (or third) party app automatically. The BigML app has several action options, and here we select the option to “Create Prediction”.
After connecting your BigML account with Zapier, you will see a long menu of options for setting up a BigML prediction. Most of these are optional; the essential ones are the BigML resource ID of the trained model (e.g. ensemble/12345678) and an example input in the “text” field. At this point, you should be able to successfully run a test.
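Behind the form, the two required pieces of information amount to a small request: which trained resource to use and what input to score. The Python sketch below builds such a payload without sending it anywhere. The exact shape (the resource type as a key plus an "input_data" object) is our assumption about how a prediction request looks, and the helper name is hypothetical, so check the BigML API documentation before relying on it.

```python
import json


def build_prediction_request(resource_id, text):
    """Build an assumed prediction payload from a resource ID and input text."""
    # A resource ID is expected to look like "<type>/<id>", e.g. "ensemble/12345678".
    resource_type, _, identifier = resource_id.partition("/")
    if not resource_type or not identifier:
        raise ValueError(f"Expected '<type>/<id>', got {resource_id!r}")
    # Assumed payload shape: resource type as the key, inputs under "input_data".
    return {resource_type: resource_id, "input_data": {"text": text}}


payload = build_prediction_request("ensemble/12345678", "GitHub joins Microsoft")
print(json.dumps(payload, indent=2))
```

The ID used here is the placeholder from the text, not a real resource.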
Step 3: Recording results in a Google spreadsheet
Multiple actions can be added in sequence for a Zap, and here we add an additional action using Google Sheets in order to collect our queried tweets and their corresponding sentiment. This process is essentially identical to the previous process of choosing BigML as an app action.
Before moving on to the next step, you should first set up a spreadsheet with columns dedicated to the elements that you wish to record. In this example, we are only opting to record a small amount of information: the date and time of the tweet, the Twitter handle of the Tweeter, the text of the tweet, and the predicted sentiment.
Because we will be collecting data from Twitter along with the corresponding BigML predictions, we want to create a new row for each tweet every time the Zap is triggered. “Create Spreadsheet Row” is one of the many actions available with the Google Sheets app, and the following page will ask you to select the specific spreadsheet and worksheet where you would like to store the data.
To record the output from Zapier properly, you must then connect each column in the spreadsheet to data collected through either the Twitter trigger or the BigML action. The first three fields (Time, User, Tweet) all come from the Twitter “Search Mention” trigger, while the final field (Sentiment) comes from the BigML “Create Prediction” action. If the setup is done correctly, each new row in the spreadsheet should contain the time, user, and text of a tweet along with its predicted sentiment.
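For readers who think in code, the column mapping just described can be sketched in Python as follows. Zapier does this through its form fields, so none of this is required; the dictionary keys ("created_at", "user", "text", "output") are hypothetical stand-ins for the trigger and action fields, and the sample values are made up.

```python
import csv
import io

# Column headers matching the spreadsheet set up earlier.
COLUMNS = ["Time", "User", "Tweet", "Sentiment"]


def build_row(tweet, prediction):
    """Combine Twitter trigger fields with the BigML prediction into one row."""
    # The first three values come from the tweet, the last from the prediction.
    return [tweet["created_at"], tweet["user"], tweet["text"], prediction["output"]]


# Made-up example values, for illustration only.
tweet = {
    "created_at": "2018-06-04 15:00",
    "user": "@example_user",
    "text": "Excited about GitHub and Microsoft",
}
prediction = {"output": "4"}

buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(COLUMNS)
writer.writerow(build_row(tweet, prediction))
print(buffer.getvalue())
```

Each run of the Zap corresponds to one call to the row-building step, appending a single line to the sheet.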
Your Zap will now be complete and you will have the option to turn it on. Depending on how many applications you integrate, and how many actions you want to perform, you may be prompted to upgrade your Zapier account.
Once all of these triggers and actions are set up properly, you are ready to launch your Zap. The Zap outlined here will update every 15 minutes and record the sentiment of new tweets matching our desired subject of “GitHub and Microsoft.” In our limited evaluation of ~600 tweets, we found that 81% contained a positive sentiment. There are a number of reasons why this result may misrepresent actual opinions, however. The main concern is that it was not possible to separate “announcements”, which are common among media accounts and often retweeted, from “reactions”, which are more likely to reflect the opinions of personal accounts. Regardless, the excitement surrounding the acquisition likely drove the positive sentiment overall.
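Once the spreadsheet has filled up, a summary figure like the one above is simply the fraction of rows labeled positive. A minimal Python sketch, assuming the Sentiment column holds the same labels as the training data ("4" for positive, "0" for negative) and using a toy list rather than our actual results:

```python
from collections import Counter


def positive_share(labels, positive_label="4"):
    """Return the fraction of labels equal to the positive label."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("No labels to summarize")
    return counts[positive_label] / total


# Toy data for illustration only: three positive tweets and one negative.
share = positive_share(["4", "4", "4", "0"])
print(f"{share:.0%} positive")
```

The same calculation could be done with a spreadsheet formula; the code only spells out the arithmetic.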
The seamless and low-effort integration of multiple web services made possible by Zapier is extremely useful, but adding a layer of Machine Learning to the mix makes it even more powerful. The BigML Zapier integration has been in the works for a while, as a natural extension of our mission to make Machine Learning easy and accessible. We launched a beta version of the BigML Zapier app in May 2017, and we’ve continued to add new features to the app based on feedback from early adopters (thanks if you are one of them!).
Now, with the official launch of the BigML Zapier app, you can get even more bang for your buck on these two platforms. Simply sign up and start using the app for FREE. We encourage you to find clever ways to integrate Machine Learning models into your existing workflows – all made possible in just a few clicks!