TITLE: Machine Learning for Statisticians
SPEAKERS: Andy Liaw and Junshui Ma, Merck & Co., Inc.
MODERATOR: Ivan S. F. Chan
Abstract:
Machine Learning (ML), which is loosely called Artificial Intelligence in the media, and Statistics are both fields concerned with learning from data. The fact that they share many underlying mathematical theories and computational tools overshadows the fact that they are based on different philosophies. Ignoring these differences has caused confusion among some statisticians and prevented them from making effective use of some ML technologies.
This tutorial is designed specifically as an introduction to ML for statisticians. It avoids dwelling on topics with which statisticians are already familiar. Instead, it emphasizes the areas unique to ML, and draws connections between the two fields in areas where they are superficially similar.
The presentation has four sections: (1) What is ML? This section shows similarities and differences between ML and Statistics and provides an overview of ML. (2) Supervised learning workflow and (3) methods. These two sections explain the workflow, along with key concepts, related to supervised learning tasks, and introduce popular ML methods such as support vector machines (SVM), boosting machines, and random forests. (4) Model inference. This section explains how to use trained models to predict, to select or rank variables, and to gain insights into the data. Throughout the tutorial, examples from drug development will be used to illustrate the points.
Instructors’ Biography: