PlusVitech Uses Machine Learning for Drug Treatment to Win the #EUvsVirus Hackathon

This guest post was authored by Fran Guillen, CBO of PlusVitech, and Vicente Salinas, CEO of PlusVitech.

A Little Bit of Context

PlusVitech is a Spanish company, founded in 2013, with the key mission of improving people’s quality of life by finding solutions for high-impact diseases such as cancer. In particular, our strategy has always been to search for treatments already on the market that can be repurposed for cancer. This strategy has many advantages: the cost of development is much lower than that of new drugs, candidate drugs have already been shown to be safe when administered to humans, and they can be immediately available for the new indication after approval. In the pharma world this strategy is called repositioning, and it has already happened with Viagra and Propecia, among others. In our case, we have very promising evidence, with complete remissions in different types of cancer, even at more advanced stages.

However, this past March, when the COVID-19 pandemic broke out, we realized that some of the solutions we had for cancer could also be useful in treating COVID-19 infections in some people. It is not the virus itself that causes fatalities, but rather the reactions that take place inside our body. After all, the mechanisms the human body activates in different situations are very similar. In particular, COVID-19 triggers a cascading lung inflammation, very similar to an allergic reaction or to the pulmonary inflammation that occurs in lung cancer. Often, it is this inflammation that generates the severe lung damage and pneumonia that lead to death from the coronavirus. Therefore, according to our thesis, if we were able to resolve the lung inflammation, we could also prevent deaths from COVID-19 and any of its mutations, which is the approach we have patented worldwide.

PlusVitech and BigML against COVID-19

About a month after our discovery, the European Commission, in collaboration with the EU Member States, held the pan-European hackathon #EUvsVirus to identify effective proposals for countering the adverse effects of the pandemic. We decided to answer this call from the EU with our PVT-COVID project.

The PVT-COVID Project

The weekend was quite intense. For starters, our PlusVitech team was joined by two more hackathon participants who volunteered their time, as well as by various mentors and experts. We developed a business model based on licensing the patent to pharma companies that already produce this type of drug, to ensure its availability immediately after approval by the regulatory agencies of each country. We also worked diligently on defining the clinical trial protocol needed to approve the treatment for COVID-19 and on the contacts to be made with hospitals and the Spanish Drug Agency.

However, in the process of designing the protocol, we found that every COVID-19 patient is different, each presenting a different clinical state that requires a different treatment to address individual needs. Some of these patients are at home, others are hospitalized, and the most severe ones are in the ICU, with varying levels of oxygen saturation, cough, or fever. This scenario is quite challenging for health professionals, as treatments need to be personalized in order to be effective. For instance, some patients are so severe that they are intubated and therefore cannot take medication orally. This simple idea made us realize that we could do better than a single one-size-fits-all treatment. Instead, we focused on personalized treatments, with factors such as dosage, timing, and even combinations with other treatments to adjust. Taking this idea into account, ideally a hospital doctor could enter the patient’s data into an online system and obtain the most appropriate treatment for that patient in real time.

Preparing such a system, even as a prototype, seemed like too much for the few hours that remained before the hackathon closed at 9:00 AM on Monday. That night, while sleeping, Fran Guillen, CBO of PlusVitech, had a dream about using BigML to solve this problem! So he got up at 5:00 AM, opened a free account in BigML, and, in a few hours, prepared an initial table in Google Sheets with the characteristics and clinical states of patients, crossing them with the preliminary results we had from our treatment.

Sample of the table of anonymous patient data imported directly from Google Sheets.

Fran has almost no background in Machine Learning, but he was able to upload the table to BigML, generate a dataset with the 1-click option, and then build a Model, again with the 1-click option.

One of the COVID-19 treatment models that we generated with the 1-click option.

When the rest of the team woke up a few hours later, they were amazed! The system allowed us to predict the best drug treatment to apply to different COVID-19 patient cases, taking each patient’s health characteristics into account.

Prediction from the COVID-19 model generated with the 1-click option.
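
For readers curious about what this looks like outside the Dashboard, the same 1-click flow can be reproduced with a few calls to the BigML Python bindings. The sketch below is only illustrative: the CSV file name and the patient fields are hypothetical stand-ins for our anonymized table.

    # Minimal sketch of the same source -> dataset -> model -> prediction flow
    # using the BigML Python bindings. File and field names are hypothetical.
    from bigml.api import BigML

    api = BigML()  # reads BIGML_USERNAME / BIGML_API_KEY from the environment

    source = api.create_source("covid19_patients.csv")  # table exported from Google Sheets
    api.ok(source)                                       # wait until the source is ready

    dataset = api.create_dataset(source)                 # equivalent to the 1-click dataset
    api.ok(dataset)

    # Equivalent to the 1-click model; assumes the recommended treatment is the
    # last column, which BigML uses as the default objective field
    model = api.create_model(dataset)
    api.ok(model)

    # Predict the treatment for a new patient (illustrative field names only)
    prediction = api.create_prediction(model, {
        "age": 67, "oxygen_saturation": 88, "hospitalized": True, "intubated": False})
    api.ok(prediction)
    print(prediction["object"]["output"])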

After that sleepless Sunday night, we presented the project a couple of hours before the hackathon deadline, including in the pitch deck an explanation of the predictive system we had created with the BigML Dashboard.

The #EUvsVirus Hackathon Results

The #EUvsVirus hackathon has been the largest hackathon in history, surpassing even those previously held by Google or Facebook, among others. More than 20,900 participants presented over 2,100 solutions to fight the COVID-19 pandemic, judged on the potential of their social impact, their scalability prospects, the feasibility of launching the prototype, and the coherence of their business plan. Fortunately, our project, PVT-COVID, turned out to be one of the winners in the Life and Health category and the sole winner in the pharmacological area! The award has also had an extraordinary reception in the Spanish media, appearing in print and digital newspapers, on radio, and on television.

Subsequently, PVT-COVID was selected among the winners for the Demo Day held last Thursday, May 21, where we presented our project to hundreds of European institutions and investors with the aim of finding partners and financing for the clinical trial that would allow us to approve our treatment against COVID-19. PlusVitech is seeking about €500,000 to fund the phase 2 clinical trial, which we could have approved in just 2 months, plus another €500,000 to continue with the approval of the cancer treatment.

For all this, we want to thank the BigML team for enabling so many projects like ours to become reality — especially in a field as crucial as healthcare, which presents humanity with complex challenges like cancer and COVID-19. We hope you enjoyed our story on how Machine Learning helped us better predict the ideal drug treatment for COVID-19 patients. We will soon be authoring another article explaining the cancer treatment prediction system we are developing on top of BigML. So please stay tuned!

A1 Digital and BigML join forces to Support COVID-19 Research

We’re happy to report that BigML, along with our partner A1 Digital, an expert in digitization and part of Telekom Austria Group, is taking the initiative to extend Machine Learning capacity to research institutions free of charge. With this timely offer, A1 Digital is making the full potential of BigML’s state-of-the-art analytics capabilities available to combat the COVID-19 pandemic.

The Machine Learning-as-a-Service offering, hosted in A1 Digital’s secure, EU-GDPR-compliant Exoscale Cloud, is available to medical as well as commercial and non-profit research institutions dealing with the economic and social consequences of the pandemic. It is limited to a number of qualified institutions in Europe. Interested institutions and research groups can find more information and an application form here.

A1 Digital COVID-19

Francis Cepero, Director Vertical Market Solutions at A1 Digital, commented “The current exceptional situation confronts us all with completely new challenges of a medical, political, economic and social nature. To get the pandemic under control, research institutions are showing unprecedented efforts. In doing so, they are generating large amounts of data that they must analyze as quickly as possible in order to arrive at relevant results. This is where our Machine Learning Platform powered by BigML offers the necessary support. We have analyzed how we can best support the efforts to overcome the current crisis and are therefore making the Machine Learning-as-a-Service offering available to qualified groups and institutions free of charge.”

BigML’s CEO, Francisco Martin, shared “When A1 Digital approached us with a proposal to make our Machine Learning platform available to qualified organizations free of charge to help fight the COVID-19 pandemic, we were thrilled. Our platform is particularly suited for this context as the need for streamlining Machine Learning workflows from raw data to insights and production models is paramount given the time pressure public healthcare professionals are under. BigML requires no programming knowledge or prior Machine Learning experience to produce interpretable models. It also fosters collaboration to engage domain experts, who need to weigh in on those results before models are deployed in the field. Finally, because the platform runs in A1 Digital’s Exoscale Cloud, it can be used immediately and users do not need to worry about data security and compliance with data protection regulations.”

In summary, we’re looking forward to positive contributions from interested research institutions towards alleviating the adverse effects of the COVID-19 pandemic on the world population as we believe Machine Learning is the perfect 21st-century tool to accelerate serendipitous discoveries in these challenging times.

Machine Learning in Industrial Chemicals: Process Quality Optimization

This post is the last in our series of 5 blog posts highlighting use case presentations from the 2nd Edition of Seville Machine Learning School (MLSEV). You may also check out the previous posts about the 6 Challenges of Machine Learning, Predicting Oil Temperature Anomalies in a Tunnel Boring Machine, Optimization of Passenger Waiting Time for Elevators, or Applying Topic Modeling to improve Call Center Operations.

Today, we delve into a use case from the chemicals industry originally presented by José Cárdenas, Technical Services Manager at Indorama Ventures. Headquartered in Bangkok, Thailand, Indorama Ventures started its journey in 1994 specializing in the production of worsted wool yarn, which is typically used in tailored garments and textiles such as suits. Gradually, the company completed its global expansion with acquisitions in the United States and Europe, eventually becoming a global PET (Polyethylene Terephthalate) producer as well as a sizable player in the PTA (Purified Terephthalic Acid) business. Today, Indorama operates production sites in 31 countries on five continents – in Africa, Americas, Asia, Europe & Eurasia.

Indorama Ventures

PET is manufactured in the form of pellets that are then melted to produce packaging material (food and beverage containers, bottles) and various polyester fibers consumed in industries from automotive to medical. The specific Machine Learning project Indorama tackled involved the carboxylic acid process, which has a critical role in PET production of various grades, e.g., hot fill, high/low intrinsic viscosity, quick heat, general grade.

Carboxylic Acid Process

The Carboxylic Acid Process from Xylene to PET

In this project, the Indorama team was mainly concerned with the optimization of the above process, which can be translated as achieving cost reductions without compromising on key PET quality parameters. In other words, it’s the age-old “do more with less” challenge, whereby the inputs to the process are used more efficiently than in the status quo.

Fortunately, the project team had access to 2.5 years worth of detailed chemical process data containing more than 6 million data points. This highly technical dataset described many aspects of the complex chemical process such as throughput, catalyst material concentration, feed to air ratio, oxygen level, and more. In order to pick out the best signals, technical domain experts turned to anomaly detection, outlier elimination, and association discovery by applying BigML’s handy unsupervised methods. Understanding the variable correlations during the data exploration phase was also key in feature selection to further eliminate noise.
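
As a rough illustration of that clean-up step, here is a minimal sketch using the BigML Python bindings that scores a process dataset with an anomaly detector and keeps only the low-scoring rows. The dataset ID, the score threshold, and the filter expression are placeholders rather than Indorama’s actual configuration.

    # Hedged sketch: flag and filter outliers in a process dataset with a
    # BigML anomaly detector. The ID and the 0.6 threshold are illustrative.
    from bigml.api import BigML

    api = BigML()
    process_data = "dataset/5ec51b2b2e4b3a5da3000010"   # hypothetical process dataset

    anomaly = api.create_anomaly(process_data)          # unsupervised anomaly detector
    api.ok(anomaly)

    # Score every row and materialize the scores as a new dataset; the appended
    # column is named "score" by default
    batch = api.create_batch_anomaly_score(anomaly, process_data,
                                           {"output_dataset": True,
                                            "all_fields": True})
    api.ok(batch)
    scored = batch["object"]["output_dataset_resource"]

    # Keep only the rows whose anomaly score falls below the chosen threshold
    filtered = api.create_dataset(scored, {"lisp_filter": '(< (f "score") 0.6)'})
    api.ok(filtered)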

The custom BigML workflow for the project

The Indorama team went through multiple iterations to improve their classification model metrics, such as recall, to acceptable levels. The team of experts used BigML’s Partial Dependence Plot (PDP) visualizations to analyze the fine-grained impact of combinations of process variables on PET quality and yield. In return for all the hard work, such close-up model inspection resulted in discoveries that even long-time chemical process experts were not previously familiar with. These days they are hard at work making the necessary changes and upgrades to the underlying chemical processes to mimic the higher-efficiency modes of operation predicted by their best-performing BigML models, some of which were built with OptiML — BigML’s popular ‘AutoML’ capability.
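
For reference, kicking off that kind of OptiML search programmatically can be as simple as the hedged sketch below; the dataset ID and the time budget are illustrative assumptions, and the objective is assumed to be the dataset’s last field (BigML’s default).

    # Hedged sketch: launch an OptiML search over candidate models.
    # The dataset ID and the time budget are placeholders.
    from bigml.api import BigML

    api = BigML()
    training_data = "dataset/5ec51b2b2e4b3a5da3000020"  # hypothetical training dataset

    optiml = api.create_optiml(training_data,
                               {"max_training_time": 1800})  # 30-minute search budget
    api.ok(optiml)
    print(optiml["resource"])  # the ranked candidate models can then be inspected
                               # on the Dashboard or fetched through the API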

With that, let’s jump into the corresponding MLSEV video describing exactly how Indorama went about implementing their custom Machine Learning workflow and how the subsequent iterations helped them gain actionable insights:

Do you have a similar Industrial Process Optimization challenge?

Depending on your specific needs, BigML provides expert help and consultation services in a wide range of formats, from Customized Assistance to turn-key smart application delivery — all built on BigML’s market-leading Machine Learning platform. Do not hesitate to reach out to us anytime at info@bigml.com to discuss your specific use case.

Machine Learning in Business Services: Applying Topic Modeling to improve Call Center Operations

This post is a continuation of our series of blog posts highlighting presentations from the 2nd Edition of Seville Machine Learning School (MLSEV). You may also check out the previous posts about the 6 Challenges of Machine Learning, Predicting Oil Temperature Anomalies in a Tunnel Boring Machine, or Optimization of Passenger Waiting Time for Elevators.

Today, we will zoom in on an example application of Machine Learning in the labor-intensive enterprise functions of customer service and call center operations. This presentation was given by Andrés González of CleverData.io.  Located in Barcelona, CleverData is an IT services and consulting firm delivering Machine Learning solutions for a variety of corporate customers such as Mazda, Penguin Random House, iT Now (CaixaBank), and Sorli.

Ricoh Machine Learning

The project in focus was implemented for the European branch of the Japanese multinational imaging and electronics giant, Ricoh. Among other business activities, Ricoh operates a profitable turn-key ‘Managed Print Services’ business, where they actively manage the entire stock of printers and supplies as well as any scheduled maintenance of those printers on behalf of their customers. This helps businesses lighten the upfront cost hits to their capital structure. At the same time, outsourcing such business services helps them more easily adapt to changes in their operations, as they can flexibly dial the variable managed services costs up or down — much to CFOs’ liking.

As part of their service agreements, Ricoh is also responsible for solving any after-sale service or printer issues with the help of their call center operations. Naturally, Ricoh is very interested in increasing the productivity of its call center operations in general and of the associated incident tracking and resolution process in particular. This is best achieved by maximizing the likelihood of resolving issues remotely, thus avoiding dispatching technicians to the customer’s office site. In this project, CleverData helped Ricoh build a ‘Dispatching Bot’ to automate this critical decision point in their incident management process.

The raw historical incidents dataset CleverData employed covered 19 months of incident reports and had around 150,000 records spanning some 61 columns, such as printer characteristics, incident description, contract characteristics, and, most importantly, the labels capturing issue resolution outcomes.

Since the incident descriptions contained rich textual data, they were ideal targets for feature engineering. Consequently, CleverData utilized BigML’s unsupervised Topic Models to do exactly that. In the following steps, the enriched datasets were fed to BigML supervised learning models to arrive at the final predictions deployed in production. During the 7 months following the initial deployment, incidents managed by the ‘Dispatching Bot’ saw the remote incident resolution rate increase from 61% to 81% on average — a rather dramatic improvement in saved costs and productivity!
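
A hedged sketch of that enrichment pattern with the BigML Python bindings is shown below; the dataset ID, the objective field ID, and the argument values are hypothetical placeholders rather than CleverData’s actual setup.

    # Hedged sketch: enrich an incidents dataset with topic distributions derived
    # from the free-text incident descriptions, then train a classifier on the
    # enriched data. IDs and field references are illustrative only.
    from bigml.api import BigML

    api = BigML()
    incidents = "dataset/5ec51b2b2e4b3a5da3000030"      # hypothetical incidents dataset

    topic_model = api.create_topic_model(incidents)     # built from the text fields
    api.ok(topic_model)

    # Append one probability column per topic and keep the original fields
    batch = api.create_batch_topic_distribution(topic_model, incidents,
                                                {"output_dataset": True,
                                                 "all_fields": True})
    api.ok(batch)
    enriched = batch["object"]["output_dataset_resource"]

    # Train the supervised model on the enriched dataset; "000042" stands in for
    # the ID of the field holding the resolution outcome label
    model = api.create_model(enriched, {"objective_field": "000042"})
    api.ok(model)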

Now, let’s hear, in more detail, how CleverData went about implementing both the custom BigML Machine Learning workflow and the eventual production application for this successful project:

Do you have a Customer Service or Call Center challenge?

Depending on your specific needs, BigML provides expert help and consultation services in a wide range of formats, from Customized Assistance to turn-key smart application delivery — all built on BigML’s market-leading Machine Learning platform. Do not hesitate to reach out to us anytime at info@bigml.com to discuss your specific use case.

Machine Learning in Industrial Machinery: Optimization of Passenger Waiting Time for Elevators

This post is a continuation of our series of blog posts highlighting presentations from the 2nd Edition of Seville Machine Learning School (MLSEV). You may also check out the previous posts about the 6 Challenges of Machine Learning or Predicting Oil Temperature Anomalies in a Tunnel Boring Machine.

Given that Machine Learning has been established as a foundational piece of technology similar to RDBMS, it can touch every business or engineering process to power data-driven automated decision making in near real-time. Indeed, in our everyday life as consumers, we interact with Machine Learning systems unknowingly in many familiar contexts.

Elevators and Machine Learning

In this vein, as part of the real-world use case presentations during MLSEV, Delio Tolivia from Talento Corporativo, a Spanish consulting firm providing a wide array of business solutions, presented their optimization project to minimize elevator wait times. The primary goal of the project was to improve on the standard passenger experience by making elevators smarter. Their client for this project was Thyssen Krupp AG, which is a German multinational conglomerate with a focus on industrial engineering and steel production. With millions of elevators around the world, it’s not too hard to see how the scale of this problem and its potential impact can be very large in its future iterations.

To begin, the Talento team gathered data from 5 different elevator systems and split those into 3 groups based on the types of buildings they operated in, the usage patterns they exhibited (such as traffic levels during weekdays vs. weekends), and the frequency of trips between floor pairs.

Based on extensive data exploration efforts, the Talento team decided to go with a simpler rule-based approach to program the behavior of those elevators with very consistent usage patterns over the time period analyzed. This is very much in line with our philosophy at BigML, which has driven us to preach the value of trying simpler approaches first, as those can save a lot of time and energy while forming meaningful baselines for future Machine Learning endeavors on the same business problem.

They achieved a 12% reduction in energy consumption for one of the rule-based subgroups of elevators with more straightforward operating characteristics. Then, they proceeded to build Machine Learning models on the BigML platform to better understand and optimize the behavior of the elevators with more complex operational characteristics. Their early results were very promising, as they observed an 8% reduction in passenger wait times, even though the initial set of experiments was deployed for hotel elevators that did not perfectly match the type of elevator producing the data their OptiML and Classification Ensemble models were originally trained on.
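
As an illustration of that kind of supervised workflow, here is a minimal sketch with the BigML Python bindings covering an 80/20 train/test split, a classification ensemble, and an evaluation. The dataset ID, seed, ensemble size, and the assumption that the prediction target is the dataset’s last field are all placeholders for the example.

    # Hedged sketch: 80/20 split, ensemble training, and evaluation.
    # IDs and configuration values are illustrative only.
    from bigml.api import BigML

    api = BigML()
    trips = "dataset/5ec51b2b2e4b3a5da3000040"        # hypothetical elevator trips dataset

    # Deterministic 80/20 split using the same seed for both samples
    train = api.create_dataset(trips, {"sample_rate": 0.8, "seed": "mlsev"})
    test = api.create_dataset(trips, {"sample_rate": 0.8, "seed": "mlsev",
                                      "out_of_bag": True})
    api.ok(train)
    api.ok(test)

    # Classification ensemble; assumes the target is the dataset's last field,
    # which BigML uses as the default objective
    ensemble = api.create_ensemble(train, {"number_of_models": 32})
    api.ok(ensemble)

    evaluation = api.create_evaluation(ensemble, test)
    api.ok(evaluation)
    print(evaluation["object"]["result"]["model"]["average_f_measure"])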

Without further ado, let’s take a more in-depth look into this industrial use case. You can click on the Youtube video below and/or access the slides freely on our SlideShare channel:

Do you have an Industrial Machinery challenge?

Depending on your specific needs, BigML provides expert help and consultation services in a wide range of formats, from Customized Assistance to turn-key smart application delivery — all built on BigML’s market-leading Machine Learning platform. Do not hesitate to reach out to us anytime at info@bigml.com to discuss your specific use case.

 

How to Configure Optional Inputs for WhizzML Scripts

WhizzML, the powerful Domain-Specific Language (DSL) developed by BigML, is used not only to automate Machine Learning (ML) workflows but also to implement high-level ML algorithms. As part of our recent release, BigML now supports a feature that allows users to configure optional inputs for WhizzML scripts on the BigML Dashboard.

Before we dive into the specifics, let’s briefly recap the different ways to create WhizzML scripts.

When using the BigML API, there are three ways to create a WhizzML script programmatically: make HTTP POST requests directly to the BigML API (with curl commands, for example), use BigMLer, the BigML command-line tool, or use the BigML bindings for different programming languages. Each script requires a corresponding metadata.json file to specify its inputs and outputs, as well as other meta information such as its name, description, etc.
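
For instance, here is a hedged sketch of passing that same kind of inputs and outputs specification directly in the creation request with the Python bindings; the WhizzML body and the default resource ID are placeholders, and the input names simply mirror the Dashboard example discussed further below. The mandatory/optional toggles themselves are set on the Dashboard, as shown in the following screenshots.

    # Hedged sketch: declaring script inputs and outputs programmatically with
    # the BigML Python bindings. The WhizzML body, types, and the default
    # resource ID below are illustrative assumptions only.
    from bigml.api import BigML

    api = BigML()

    source_code = "(define result dataset-input)"   # placeholder WhizzML body

    script = api.create_script(source_code, {
        "name": "My configurable script",
        "inputs": [
            {"name": "dataset-input", "type": "dataset-id",
             "description": "Training dataset"},
            {"name": "source-input", "type": "source-id",
             "description": "Extra source, no default"},
            {"name": "supervised-model-input", "type": "model-id",
             "description": "Model with a default value",
             "default": "model/5ec51b2b2e4b3a5da3000000"}],   # hypothetical ID
        "outputs": [
            {"name": "result", "type": "dataset-id"}]})
    api.ok(script)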

On the BigML Dashboard, there are also several ways to create a WhizzML script: clone a script from the Gallery, automatically generate a script that recreates a BigML resource by using the “Scriptify” menu option, import a WhizzML script from GitHub, or write your own from scratch in the WhizzML editor. You can easily specify the inputs and outputs for your WhizzML script on the Dashboard as seen below:

script-inputs-outputs

For a given WhizzML script, some inputs may be mandatory and some optional. You can also provide default values for inputs. When the input type is a resource (Sources, Datasets, Models, etc.), BigML now provides checkboxes on the Dashboard for users to toggle between “mandatory” and “optional” as they specify their preference. Users can also choose to provide default values for those inputs or leave them empty.

Let’s see an example script to go over how you can configure its inputs.

After validating the source code of a script as usual, here’s how the section for specifying inputs looks:

script-inputs-config

In this script, dataset-input is a mandatory resource while source-input is an optional resource, as shown by the checked optional checkbox. Note that source-input doesn’t have a default value. The third resource input, supervised-model-input, is also optional but a default value is provided. So if a user does not assign a value at execution, the default value will be used.

Even after the script has been created, it’s still possible to modify input configuration by clicking on the pencil icon in the input box:

script-inputs-after-edit

This provides an opportunity after script creation to set a resource input as optional or mandatory, as well as to set its default value (if applicable), as shown below:

script-inputs-after-toggle

In the following screen, an input, source-input, is set as optional without a default value:

script-inputs-without-default

Here, the same input as above, source-input, is set as optional, with the drop-down list used to select an available resource as its default value:

script-inputs-with-default

Once all changes are ready, you should click on the save button to save the input configuration.

script-inputs-save

When configuring an execution, optional resources will be displayed as enabled, as shown by their green icons, even if you don’t provide a value for them, because they are not required to launch the execution. The mandatory resources will appear disabled, as shown by their gray icons until a valid value is provided for them. Based on this, the “Execute” button will only appear enabled when all inputs are displayed as enabled.

script-inputs-execute-disabled

Resources come and go, and a resource used as a default value for a WhizzML script input may be deleted down the line. When this happens, a default value that was previously defined for an input won’t be valid anymore.  In those cases, you will still see the default value, which is the resource ID string, in the execution configuration. But a warning icon on top of it will indicate that the resource used as the default cannot be found.

script-inputs-missing-warning

However, you can use the drop-down box to select a new default from the list of available resources. Of course, you also can ignore the warning and continue with the script execution since this is an optional input.

A good example of the optional resource inputs is BigML’s AutoML functionality. AutoML helps automate the complete Machine Learning pipeline, which includes both feature selection and model selection. Because input datasets for training, validation or testing are optional, users can choose not to pass one of them. For instance, if you only want to train a new AutoML model and evaluate it, but you don’t want to perform predictions yet, then you don’t need to pass the test dataset to the execution. For more information on AutoML, please check out our blogpost — AutoML: BigML Automated Machine Learning.

With this new feature, we hope to enhance the power and versatility of WhizzML for all our users looking to automate their workflows. Now, BigML users have even more options to create and manage different ML workflows and higher-level algorithms based on WhizzML. To learn more, please visit the release page, where you will find complete documentation on WhizzML and related resources.

Door-to-Door Data Delivery with External Data Sources

A common step when working with BigML is extracting data from a database or document repository for uploading as a BigML Data Source. Have you ever wished you could skip that step and create a BigML Data Source directly from your data store? Well, now you can!

Both the BigML Dashboard and the API allow you to provide connection information along with a table or query specifying the data you wish to extract. BigML will then connect to your data store and create the Data Source in BigML’s server.

Importing External Data with the BigML Dashboard

In the Dashboard, go to the Sources tab and you will see a new database icon with a dropdown for external sources as shown here:

The external data sources dropdown in the Sources tab

Choose your desired data store and you will have the opportunity to select a connector to a particular instance. Or, you can create a new connector by providing the necessary information. This can vary according to the data store. Here we see the Create New Connector dialog for MySQL:

Once you have selected your connector, you will be presented with the tables and views (where applicable) from your data store. Here you have two options. First, you can simply select one or more tables and immediately import them into your BigML account as Data Sources. Each table will be imported into a separate source.

If you’d like to first take a look at a bit of the data from a given table you can click on it for a preview. That way you can remind yourself of the columns and see some sample data before importing. Here we see a preview of a table containing the well-known Iris data set:

Sometimes the simplest table import is too blunt an instrument. That’s where your second option comes in — the ability to select the exact data you want by writing an SQL select statement. If you only wish to import a subset of columns, for example, the query can be as simple as

       select sepal_width, petal_width, species from iris

The preview button will verify that the query is valid in your data store and show you the initial result set, allowing you to confirm your intentions before importing into a BigML Data Source. Be assured you can take advantage of your data store’s full query language. A more advanced example, below, shows a select statement with both a join and a group-by clause. The typically normalized data has one table with school district information and another containing individual teacher statistics. Here we are creating a Data Source with information about school districts inclusive of the average teacher salary in each district:
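
A query along those lines might look like the sketch below; the table and column names are hypothetical, so adapt them to your own schema:

       select d.district_id,
              d.district_name,
              avg(t.salary) as avg_teacher_salary
         from districts d
         join teachers t on t.district_id = d.district_id
        group by d.district_id, d.district_name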

Importing External Data via the BigML API

As is the case with most BigML features, external data sources can be utilized via the API. Again, this is done by providing connection information along with either a table, for a simple import, or a custom query for more control. Here’s an example using curl that imports a “sales” table as a BigML Data Source. (See the BigML documentation for how to construct your BIGML_AUTH string.)

    curl "https://bigml.io/source?${BIGML_AUTH}" \
      -X POST \
      -H 'content-type: application/json' \
      -d '{"external_data": {
                "source": "sqlserver",
                "connection": {
                    "host": "db.bigbox.com",
                    "port": 1433,
                    "database": "biztel",
                    "user": "autosource",
                    "password": "********"
                },
                "tables": "sales"}}'

In the case of the API, you have a few ways to control the imported data without resorting to a query. You can specify which fields to include or which to exclude, as well as limiting the number of records to import. You can also specify an offset along with an order to put that offset in context. All of this is explained in detail in our API docs.

Ready to Get More From Your Data?

We hope that being able to import directly from your external data stores makes it even easier to get the most out of your data with BigML. Currently supported are the MySQL, PostgreSQL, and Microsoft SQL Server relational databases, as well as the Elasticsearch analytics engine. If you have suggestions for other data stores to support, please let us know. To learn more about External Data Sources, please visit our release page, where you’ll find all the relevant documentation.

 

Fundación ONCE and BigML Join Forces to Create the World’s First Accessible Machine Learning Platform and Drive New Applications

BigML, the leading Machine Learning platform, and Fundación ONCE, a Spanish foundation with a long history of improving the quality of life of people with disabilities, have agreed to jointly evolve the BigML platform to allow all people, with or without disabilities, to effectively use Machine Learning as they build predictive applications that make life more accessible for citizens.

The strategic alliance will promote the creation of new applications that increase the capacities of professionals with cognitive or physical challenges. This collaboration will enable the adoption of Machine Learning among all types of professionals. As such, BigML will have access to the extensive experience of Fundación ONCE to make its platform more accessible and inclusive. In return, BigML will help Fundación ONCE to train its employees and collaborators and will support them in the process of creating their own Machine Learning applications. The ultimate goal is to create the world’s first and foremost accessible Machine Learning platform that will result in new smart applications making a positive impact in a variety of industries and businesses.

Machine Learning is already penetrating many corporations and institutions that apply it daily towards business cases in which the large volume of data makes it impossible for humans to make decisions efficiently. Examples of the Machine Learning applications in the real world include a wide array of use cases like predicting customer demand for a service or product, being able to anticipate when a machine is going to need a repair, detecting fraudulent transactions, increasing energy savings, improving customer support, and predicting illnesses, among others.

This agreement aims to ensure that these types of projects can just as easily be developed by people with disabilities. Therefore, this initiative will play a key role in promoting Machine Learning while providing access to equal employment opportunities for disabled professionals. To facilitate the inclusion of professionals with disabilities, Fundación ONCE and BigML will co-organize conferences, congresses, seminars, and training activities that will help improve occupational know-how, entrepreneurship, and overall gainful employment.

BigML’s CEO Francisco Martín points out, “Our mission for more than 9 years has been to democratize Machine Learning and make it easy and accessible to everyone. This new alliance allows us to count on ONCE’s vast experience to make BigML much more inclusive by accelerating the development of new high-impact applications. The entire BigML team is very excited and motivated with the collaboration.”

Jose Luis Martínez Donoso, CEO of Fundación ONCE, states: “This agreement places Fundación ONCE in a leading position within an important field that will allow us to advance the inclusion of people with disabilities. Innovation and new technologies must take into account people with disabilities so that they are not excluded, and further opportunities are generated for their full social inclusion.”

Introducing New Data Connectors and BigML Dashboard Enhancements

We’re excited to share that we have just released a trio of new capabilities to the BigML platform. In this post, we’ll give a quick introduction to them, followed by two more blog posts early next week that will dive deeper with some examples of how you can best utilize the new features. Without further ado, here they are.

New Data Connectors

BigML Data Connectors

Every Machine Learning project starts with data and data can come from many sources. This is especially true for complex enterprise computing environments. Naturally, many BigML users look to import data directly from external databases to streamline Machine Learning workflows. No sweat, BigML now supports MySQL, SQL Server, Elasticsearch, and Spark SQL in addition to PostgreSQL.

Both the BigML Dashboard and the API allow you to establish a connector to your data store by providing relevant connection and authentication information, which are encrypted and stored in BigML for future access. BigML can then connect to your data store and immediately create the ‘Source’ in its server(s). You have the option to import data from individual tables or to do it selectively via custom queries by specifying data spanning multiple tables. Moreover, in an organization setting, administrators can easily create connectors to be used by other members of the same organization.

API Request Preview Configuration Option

API Request Review

As a rule of thumb, anything you create on the BigML Dashboard, you can replicate with the BigML API. Now, BigML has added the ability to preview an API request as part of the configuration of unsupervised and supervised models — also available for Fusions on the Dashboard. This handy feature visually shows the user how to create a given resource programmatically including the endpoint for the REST API call as well as the corresponding JSON file that specifies the arguments to be configured.

WhizzML Scripting Enhancements in BigML Dashboard

WhizzML Inputs

When you use WhizzML scripts, some inputs may be set as mandatory while others are optional. You may also provide default values for inputs. You can specify those in the corresponding JSON metadata files. Now, you can also do this on the BigML Dashboard when inputs are resources like Sources, Datasets, and Models. BigML provides checkboxes so users can easily toggle those inputs between mandatory and optional. Similarly, users also have the option to provide default values for those inputs or leave them empty in the BigML Dashboard.

Want to know more about these features?

If you have any questions or would like to find out how the above features work, please visit the release page. It includes useful links to the BigML Dashboard and API documentation already and we’ll add the links for the upcoming blog posts as we publish them.

Claire, the Smart B2B Marketplace for the Food Industry powered by BigML

The food trading industry is one of those greenfields where digitalization hasn’t really taken off in full force. Against this backdrop, about a year ago, Ramón Sánchez-Ocaña and Angeles Vitorica, Co-founders of Claire Global, contacted BigML with an idea that could turn the industry upside down. After more than 20 years as owners of a food trading business, they had all the domain knowledge about how to best achieve digitization in their industry and provide a valuable service to their peers. Now, their collaboration with BigML is helping them turn those ideas into reality.

Claire Global

Claire is a marketplace devoted to the B2B food trading business. It is purpose-built to implement Machine Learning-driven solutions to optimize the buying and selling processes that are core to the wide-reaching global food industry. Since its launch in January 2020, the marketplace has been in ‘Open BETA’ with interest from a diverse set of companies. Check out the introductory video below (or this one in Spanish) to find out more about this innovative project pushing the envelope as far as B2B marketplaces can go.

The project is focused on increasing customer conversions and, most importantly, customer engagement. This becomes possible by adding valuable new functionality to the platform while actively supporting a highly heterogeneous group of user personas. All this to promote more activity on the platform and capture all the relevant inputs from customers, products, and transactions facilitated. This data, in turn, allows the team to implement the Machine Learning capabilities that add further value to the platform and its users.

Here are some of the most interesting optimization Machine Learning use cases we are exploring in this next-generation commerce environment:

  • Automated Product Recommendations: Selections based on customer data (e.g., prior transactions, web navigation patterns, user segmentation) and product data (e.g., product attributes, product similarities, purchase history) are key in making personalized offers to customers. By better “knowing” your users, you can provide them with highly relevant B2B information to further enhance their purchasing experience. This results in more repeat usage and customer loyalty over time. 
  • Optimal Pricing Suggestions for Sellers: Finding the optimal price point is a common problem in retail due to the high amount of parameters that can be considered when doing so, e.g., competitive dynamics, customer feedback, seasonality, current demand. There are also a wide variety of pricing strategies that can be chosen depending on the objectives of the retailer. For instance, maximizing profitability, accessing a new market, implementing dynamic pricing, etc. The use of predictive models for price optimization is quite attractive to cover all these possible different pricing scenarios.
  • Stock Management for Buyers and Sellers: Historical sales data can be very useful in order to extract sales trends and seasonality effects. This, together with some external data such as upcoming events or geographical location, can provide a producer with the best information on how to distribute its products among different warehouse locations according to the quantities it will sell in different areas. Supermarkets or hotel chains can also benefit by predicting the optimal quantity of a certain product they must acquire per location.
  • Anomaly Detection: This is a key technique for tasks such as flagging suspicious customer behavior to prevent fraud and checking for data inconsistencies by spotting pricing mistakes and other data integrity issues that would otherwise go unnoticed (see the sketch after this list).
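
To make the last item a bit more concrete, here is a minimal sketch with the BigML Python bindings that trains an anomaly detector on historical marketplace transactions and scores a new one. Every ID and field name is a hypothetical placeholder rather than actual Claire data.

    # Hedged sketch: score a new transaction against an anomaly detector trained
    # on historical marketplace transactions. IDs and fields are placeholders.
    from bigml.api import BigML

    api = BigML()
    transactions = "dataset/5ec51b2b2e4b3a5da3000050"   # hypothetical transaction history

    detector = api.create_anomaly(transactions)
    api.ok(detector)

    # Score one incoming transaction; higher scores mean more anomalous
    score = api.create_anomaly_score(detector, {
        "buyer_country": "ES", "product_category": "olive oil",
        "units": 900, "unit_price": 0.10})
    api.ok(score)
    print(score["object"]["score"])   # e.g., flag for manual review above a threshold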

In short, we enthusiastically invite those of you in the food industry to actively participate in this new and exciting digital endeavor!
