In my work, which is predominantly information technology, the need for Machine Learning is everywhere. And I don’t mean just in the somewhat obvious ways like security or log file analysis. Consider that my work experience goes back to the tail end of the mainframe era. In fact, one of my first jobs was networking together what were at the time considered powerful desktop computers as part of a pilot project to replace a mainframe. Since then I’ve seen the birth of the internet and the gradual transition of computing resources to the cloud. The difference between those two extremes is that adding capacity used to mean messing around in ceiling tiles for hours, whereas now, when I need additional capacity, I run a few lines of code and instantiate machines nearly instantly in the cloud.
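To make that concrete, here is a minimal sketch of what “a few lines of code” can look like, using the AWS SDK for Python (boto3). The region, AMI id, and instance type are placeholders, not values from any particular setup.

```python
# A minimal sketch of programmatic capacity: launch a virtual machine in the cloud.
# Assumes boto3 is installed and AWS credentials are configured; the AMI id and
# instance type below are placeholders, not recommendations.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

response = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",  # placeholder AMI id
    InstanceType="t3.medium",
    MinCount=1,
    MaxCount=1,
)

for instance in response["Instances"]:
    print("Launched instance:", instance["InstanceId"])
```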
This ability to scale up quickly to meet rising demand and, I would argue even more importantly, to scale down to save costs is a huge advantage. Everyone who is familiar with the cloud knows this, but there is more. Once the allocation of resources is programmatic, the next logical step is to make that allocation intelligent: not just to respond to current demand, but to predict demand in real time and have an optimal infrastructure running at the exact instant it is needed.
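As a rough illustration of that idea, the sketch below fits a simple trend to recent request rates and sizes a fleet for the next interval. The numbers, the per-instance capacity, and the linear extrapolation are all hypothetical; a real system would use a proper forecasting model and a cloud autoscaling API.

```python
# A toy sketch of predictive scaling: forecast the next interval's demand from
# recent history and derive a target instance count. All numbers are hypothetical.
import numpy as np

# Requests per minute observed over the last 12 intervals (hypothetical data).
history = np.array([420, 450, 470, 510, 560, 600, 640, 700, 760, 830, 900, 980])
capacity_per_instance = 250  # requests per minute one instance can absorb (assumed)

# Fit a simple linear trend and extrapolate one step ahead.
t = np.arange(len(history))
slope, intercept = np.polyfit(t, history, 1)
forecast = slope * len(history) + intercept

# Size the fleet for the forecast, with a little headroom.
target_instances = int(np.ceil(1.2 * forecast / capacity_per_instance))
print(f"Forecast demand: {forecast:.0f} req/min -> target fleet: {target_instances} instances")
```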
Indeed, the need for Machine Learning is everywhere!
But what is driving this demand for Machine Learning, and why now? One factor, as in the previous example, is the ready availability of cheap machine power, in particular the cloud, which keeps making computation cheaper. As computation gets cheaper, the impossible becomes possible. And as things become possible, people tend to start doing them, and you end up with people like me building intelligent infrastructure.
But another, related factor is the explosive growth of data in the last few years. The interesting thing is that there has been a lot of focus on collecting, storing, and even doing computation with so-called Big Data, but much less discussion of how to actually derive useful insights from it. As it turns out, Machine Learning is particularly well suited to the task.
As companies with big data initiatives realize this, they are driving a big demand for Machine Learning. How big? To answer that, I think it is important to realize that this is not just about mining data to improve the company bottom line, although that’s certainly a probable outcome. No, this is a question of survival. We often see startups using a data-centric approach to disrupt existing markets. Entrenched companies that don’t understand this trend risk extinction.
So we have a growing demand for Machine Learning, but what work needs to be done? Do we need new and better algorithms? Probably. It’s always possible that someone will invent a new algorithm tomorrow that blows everything else away, and it would be hard to argue that this is a bad thing. On the other hand, there are already a lot of really good algorithms available, and lots of problems ready to be solved.
In fact, I recently read a paper titled “Do we need hundreds of classifiers to solve real world problems?”. In it, the authors evaluated the performance of 179 classifiers on a variety of datasets. That’s 179 different algorithms available just to handle a classification problem. Interestingly, the best performer in that paper was the RDF (random decision forest), which is hardly new.
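The paper’s full benchmark is far more extensive than anything that fits in a blog post, but a minimal sketch of the same kind of comparison might look like this, with scikit-learn and one of its bundled datasets standing in for the paper’s actual experimental setup.

```python
# A minimal sketch of comparing a few off-the-shelf classifiers, in the spirit
# of the benchmark described above (not a reproduction of it).
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)

classifiers = {
    "decision tree": DecisionTreeClassifier(random_state=0),
    "logistic regression": LogisticRegression(max_iter=5000),
    "random forest": RandomForestClassifier(n_estimators=200, random_state=0),
}

for name, clf in classifiers.items():
    scores = cross_val_score(clf, X, y, cv=5)
    print(f"{name:20s} mean accuracy: {scores.mean():.3f}")
```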
Now don’t get me wrong, the need for new algorithms to advance the science of Machine Learning will never go away. But there is a huge amount of work that needs to be done right now to move existing algorithms out of the lab and into the practical world: to make them more robust and consumable. This work is more important than the elusive perfect algorithm, because the reality is that for most projects a perfect model will not be the final product; the model will form only one piece of the entire application that most companies need to implement, and it’s often more important to deliver results. After all, data often has value only for a limited time.
The need for Machine Learning is everywhere, and BigML is here to deliver it.
Interesting observations on the increasing demand for ML, and the increasing supply of data and problems (or opportunities) for which ML is well suited.
The paper is interesting, too, though in the context of practical ML I think it’s too bad that it focuses exclusively on accuracy. Although this focus is common in the ML [academic] community, and among UCI datasets, all the interesting problems to which I’ve applied machine learning have had datasets that are extremely skewed with respect to the distribution of class labels (the majority class often accounts for 90%+ of the data), and so accuracy is often a poor choice of metric.
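To see why, consider a hypothetical dataset where the majority class is 95% of the examples: a classifier that always predicts the majority class scores 95% accuracy while never detecting the minority class at all. A minimal sketch, with made-up numbers:

```python
# Why accuracy misleads on skewed data: a classifier that always predicts the
# majority class looks accurate but never finds the minority class.
# The class balance below is hypothetical.
import numpy as np

y_true = np.array([0] * 950 + [1] * 50)   # 95% majority class, 5% minority class
y_pred = np.zeros_like(y_true)            # always predict the majority class

accuracy = (y_pred == y_true).mean()
minority_recall = (y_pred[y_true == 1] == 1).mean()

print(f"accuracy:        {accuracy:.2%}")         # 95.00%
print(f"minority recall: {minority_recall:.2%}")  # 0.00%
```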
I don’t see any reference to the data from the extensive experiments reported in that paper being publicly available. If the data set of results – TP, FP, FN, TN counts – were made available, it would enable others to investigate other performance metrics (e.g., precision and recall, sensitivity and specificity, ROC, AUC …).
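Several of those metrics fall out of a few lines of arithmetic once the four counts per experiment are available (ROC and AUC would additionally require the classifiers’ scores, not just counts). A sketch with hypothetical counts:

```python
# Metrics derivable from the four confusion-matrix counts alone.
# The counts here are hypothetical stand-ins for a published results table.
tp, fp, fn, tn = 40, 30, 10, 920

precision   = tp / (tp + fp)
recall      = tp / (tp + fn)          # a.k.a. sensitivity, true positive rate
specificity = tn / (tn + fp)          # true negative rate
accuracy    = (tp + tn) / (tp + fp + fn + tn)

print(f"precision={precision:.3f} recall={recall:.3f} "
      f"specificity={specificity:.3f} accuracy={accuracy:.3f}")
```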
FWIW, I recommend Steven Salzberg’s paper “On Comparing Classifiers: Pitfalls to Avoid and a Recommended Approach” (Data Mining and Knowledge Discovery, 1997, Volume 1, Issue 3, pp. 317-328) to anyone interested in comparing the performance of classifiers … especially on the UCI data sets.
Thanks for the thoughtful comment, Joe. Valid point on skewed class labels. One thing that may help balance those instances is applying different weights depending on the label. Luckily, BigML has several ways of applying weights as a model configuration option.
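As a generic illustration of the idea (this uses scikit-learn’s class_weight as a stand-in; BigML’s own weight options are configured differently and aren’t detailed here):

```python
# A minimal sketch of weighting by label to counter class imbalance, using
# scikit-learn's class_weight as a generic stand-in for the same idea.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.RandomState(0)
X = rng.randn(1000, 5)
y = (rng.rand(1000) < 0.05).astype(int)   # ~5% positives: a skewed label

# Upweight the rare class so its errors count for more during training.
clf = RandomForestClassifier(n_estimators=100,
                             class_weight={0: 1, 1: 20},
                             random_state=0)
clf.fit(X, y)
print("training class distribution:", np.bincount(y))
```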