BigML and Las Naves are getting ready to host the 2nd Machine Learning Summer School in Valencia (September 8-9), which is fully booked. Although we are not able to extend any new invitations for the Summer School, we are happy to share that BigML’s Strategic Advisor Professor Geoff Webb (Monash University, Melbourne) will be giving an open talk on September 8th at the end of the first day of the Summer School. All MLVLC meetup members are cordially invited to attend this talk, which will start promptly at 6:30 PM CEST, in Las Naves. After Professor Webb’s talk, there will be time allocated for free drinks and networking. Below are the details of this unique talk.
A multiple test correction for streams and cascades of statistical hypothesis tests
Statistical hypothesis testing is a popular and powerful tool for inferring knowledge from data. For every such test performed, there is always a non-zero probability of making a false discovery, i.e. rejecting a null hypothesis in error. Family-wise error rate (FWER) is the probability of making at least one false discovery during an inference process. The expected FWER grows exponentially with the number of hypothesis tests that are performed, almost guaranteeing that an error will be committed if the number of tests is big enough and the risk is not managed; a problem known as the multiple testing problem. State-of-the-art methods for controlling FWER in multiple comparison settings require that the set of hypotheses be predetermined. This greatly hinders statistical testing for many modern applications of statistical inference, such as model selection, because neither the set of hypotheses that will be tested, nor even the number of hypotheses, can be known in advance.
Subfamilywise Multiple Testing is a multiple-testing correction that can be used in applications for which there are repeated pools of null hypotheses from each of which a single null hypothesis is to be rejected and neither the specific hypotheses nor their number are known until the final rejection decision is completed.
To demonstrate the importance and relevance of this work to current machine learning problems, Professor Webb and co-authors further refine the theory to the problem of model selection and show how to use Subfamilywise Multiple Testing for learning graphical models.
They assess its ability to discover graphical models on more than 7,000 datasets, studying the ability of Subfamilywise Multiple Testing to outperform the state-of-the-art on data with varying size and dimensionality, as well as with varying density and power of the present correlations. Subfamilywise Multiple Testing provides a significant improvement in statistical efficiency, often requiring only half as much data to discover the same model, while strictly controlling FWER.
Please RSVP for this talk soon and be sure to take advantage of this unique chance to learn more about theis cutting edge technique, while joining our Summer School attendees from around the world for a stimulating session of networking afterwards.