Perspectives on Self-serve Machine Learning for Rapid Insights in Healthcare

BigML users keep inspiring us with their creativity every day. Many take on Machine Learning with little to no background or education in the field. Why? Because they have access to relevant data and they are smart professionals with the kind of intuition only years of study in a certain field can bring about. It’s no surprise, then, that many of them come to suspect that there must be a better way than the good old descriptive spreadsheet analytics or plain vanilla business intelligence tool reports to solve their business problems. A natural curiosity and a self-starter attitude to actively experiment don’t hurt either!

Dr. Patrick Gladding

Longtime BigML user Dr. Patrick Gladding is no exception. Dr. Gladding practices medicine in New Zealand, which is, fortunately, doing pretty well in its fight to eradicate COVID-19 these days. We approached Dr. Gladding to find out his motivation for picking up BigML in the first place, as well as his informed opinions on how healthcare professionals can transition into more self-sufficient “man + machine” teams leveraging the full power of Machine Learning in their day-to-day routines at hospitals, clinics, and healthcare research institutions across the globe.

  1. Can you please tell us about your background and how you first developed an interest in Machine Learning?
    • I’m a clinical cardiologist trained in echocardiography, with an interest in genetics. I was involved in some pharmacogenetics studies years ago, where a combinatorial approach was needed to analyze the influence of a number of genetic variants. We used simple neural networks then, well over ten years ago, so that would have been my first experience with Machine Learning. Over time, the terminology has changed from predictive analytics to Machine Learning and artificial intelligence, and the technology has evolved to become more capable and accessible. It once took a Ph.D. and access to a computer science lab with proprietary algorithms, but now many of the most effective techniques are freely available and usable by anyone. At the same time, health records have been digitized, which means it is much easier to aggregate patient data and link it to outcomes. This, along with the availability of excellent, user-friendly Machine Learning tools like BigML, has made it very easy to do some high-quality projects in AI.
  2. Can you tell us (more) about one of your Machine Learning projects and how it made an impact in your field?
    • Given the ample availability of data now, there are almost infinite questions that can be posed. We’ve looked at many conditions and outcomes in cardiovascular medicine to see whether they can be predicted with existing clinical data. This includes predicting mortality from routine blood tests and echocardiography results and we’ve had very similar results to studies in the USA performed at Geisinger Health.
    • Some things, such as atrial fibrillation, seem very hard to predict, as this is a more chaotic, stochastic condition. There are some very strong signals for heart failure, both in predicting the presence of the condition and in predicting outcomes. Heart failure is a common condition that has very poor health outcomes and is a diagnosis that is sometimes missed. With BigML, we’ve shown the possibility of predicting the presence of heart failure using simple inputs such as a complete blood count and an electrocardiogram.
    • What is really appreciated about BigML is that it provides a wide array of Machine Learning methods including more traditional logistic regression, decision trees as well as deep neural networks. It’s great that BigML provides explainability and transparency of the features that make up the predictive model. This allows me to vet these for things that shouldn’t be included and verifies the importance of features that have previously been shown to predict outcomes. For instance, albumin is a strong feature in models of mortality and this has been shown in previous population studies using standard biostatistics.
    • Similarly, hematocrit is a feature of heart failure predictions, which has been shown to be important in other studies. Using the unsupervised Machine Learning clustering features in BigML, we’ve been able to validate work by others (Shah et al.) in the clustering of heart failure subtypes, e.g., heart failure with preserved ejection fraction (HFpEF). This opens up the potential for mass screening of hospital patients and then subtyping patients who might benefit from different treatments. This could improve the diagnosis and management of heart failure considerably, but it does carry implications for the downstream use of diagnostic testing, therapies, and cost, as well as the potential mislabelling of patients, as no predictive model is perfect. There are a lot of ethical and service issues we have yet to work through, but the benefit of using BigML is that everything is transparent, explainable, verifiable, and easily validated. It is quite easy, for instance, to upload a new large dataset, apply a Machine Learning algorithm, and get some results within minutes, without the need for buying an expensive high-performance computer.
  3. What are the top three challenges for doctors and healthcare professionals in adopting Machine Learning? What’s the easiest path to go from data to insights and predictions? Any words of caution?
    • First comes understanding what Machine Learning can do and its limitations, which includes not completely buying into the hype. Conventional biostatistics are still really important.
    • Second, data is sometimes difficult to access. It takes time, persistence, and good ethics, as well as privacy and security measures such as anonymizing data. Getting the data may require some coding abilities, e.g., SQL queries or Python scripting, but we have integrated a system called Qlik into our hospital, which means that data is more easily linked and exportable as a CSV file. However, the overall benefit of BigML is that no coding experience is required to use it.
    • Third, a lot of data is required to generate models that cover the wide variability of human disease and interactions. This means collaborating with others, sharing data, and getting insight into a particular field. Most doctors have a wealth of knowledge and expertise. This means they know what the important questions are but working with Machine Learning still requires talking to data scientists, statisticians, and others to ensure that it’s applied appropriately. Combining a domain expert (doctor) with experts in Machine Learning is a very potent force, and BigML is a great platform to take on some of that role. Sharing health data can be a challenge, but it is really the only way forward for this field.
    • What is really important is to obtain quality data and spend time tidying it up. The axiom “garbage in, garbage out” still applies regardless of whatever fancy new Machine Learning method is used. Machine Learning is a powerful tool, but the ease of use and automation in BigML should not become an excuse for laziness in evaluating, retesting, and validating models. The exciting thing about predictive algorithms is that they could have a very large impact and improve health. At the same time, a biased, poorly fitting model that makes bad predictions could worsen outcomes and health disparities. This includes models that carry racial bias, which has been demonstrated before by Google and others. The scale to do good is equal to the scale to do bad, so caution must be applied. This would also be relevant to non-healthcare predictions such as loan defaults.
  4. Once you built models and uncovered key insights from your data, how did you use them to mobilize your organization around findings? What is the best way to collaborate with others that may not be much into Machine Learning?
    • BigML has great visualization features and built-in model evaluations that demonstrate sensitivity and specificity for predictions. These outputs are much more understandable to clinicians than measures such as phi, F-measure, and recall. There is a bit of jargon in this field that doctors need to get their heads around. Mobilizing our organization was easy once we showed them the BigML visualizations. BigML has an enterprise version, which means it can be run on-site within a hospital network, so as to preserve the data security measures already in place.
    • Collaboration through BigML is a really good way of having a kind of escrow for health data, as there is not a lot of data sharing that goes on. The reluctance to share is probably due to a number of reasons, like confidentiality and privacy around the use of health data, but also the desire to commercialize results or preserve proprietary algorithms. Basically, if you have data, it is now incredibly valuable, so even if you are not into Machine Learning, there is no doubt that someone will want to talk to you about it if you have a lot of high-quality data. I would love to see a portal through BigML where healthcare professionals could share either data or models or both.
  5. Do you have anything else to share regarding your experience with the BigML platform?
    • BigML is a great system for non-coding doctors and others who want an easy-to-use and understandable system. It took only about two hours of one-on-one tutoring, as well as watching YouTube videos, to get my head around the platform. The customer service is excellent, with friendly experts willing to give you help with any application. It’s great that BigML has integrated several Machine Learning methods, including logistic regression, which has been used for decades in the medical literature. It’s something doctors are very familiar with, and it is worth noting that it often predicts better than deep neural networks. So despite the hype around deep learning, it is not the perfect tool for everything. Deep learning, however, is very good when applied to images and unstructured data, so I am looking forward to new applications coming from BigML, including AI image analysis.
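
For readers less familiar with the evaluation jargon Dr. Gladding mentions in his answer to question 4, here is a minimal Python sketch of how sensitivity, specificity, recall, F-measure, and phi can be derived from the four counts of a binary confusion matrix. It is an illustration only, not BigML code: the function name `evaluation_metrics` and the patient counts are made up for this example and do not come from Dr. Gladding's studies.

```python
import math


def safe_div(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def evaluation_metrics(tp, fp, tn, fn):
    """Compute common binary-classification metrics from confusion-matrix counts.

    tp: true positives  (condition present, predicted present)
    fp: false positives (condition absent,  predicted present)
    tn: true negatives  (condition absent,  predicted absent)
    fn: false negatives (condition present, predicted absent)
    """
    sensitivity = safe_div(tp, tp + fn)   # share of true cases that were caught
    specificity = safe_div(tn, tn + fp)   # share of healthy cases correctly cleared
    precision = safe_div(tp, tp + fp)     # share of positive calls that were right
    recall = sensitivity                  # recall is another name for sensitivity
    f_measure = safe_div(2 * precision * recall, precision + recall)
    # Phi coefficient: correlation between predicted and actual labels.
    phi = safe_div(
        tp * tn - fp * fn,
        math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
    )
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
        "phi": phi,
    }


# Hypothetical counts for a heart failure screening model (illustrative only).
metrics = evaluation_metrics(tp=80, fp=30, tn=870, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Recall and sensitivity are the same quantity under two names, which is one reason the jargon can look heavier than it is. Phi, also known as the Matthews correlation coefficient, ranges from -1 to 1 and uses all four counts, so it is harder to flatter with a rare outcome than accuracy alone.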

We hope you enjoyed this interview and found something useful to apply directly in your projects. Do you have a similar Machine Learning adoption story you’d like to share with the BigML community? We’d be more than happy to spread the lessons learned for the greater good. Please don’t hesitate to contact us at info@bigml.com, and stay safe!
