SC19 Proceedings

The International Conference for High Performance Computing, Networking, Storage, and Analysis

Machine Learning and HPC in Pharma Research and Development

Authors: Mohammad Shaikh (Bristol-Myers Squibb Company), Harsha Gurukar (Merck & Company Inc)

Abstract: Machine learning (ML) and deep learning (DL) methods are increasingly being applied in the pharmaceutical industry. Usage is growing exponentially and expanding to more business areas. Discussions will focus on current trends and on the many challenges and approaches in applying ML and DL to build insightful models. The ever-increasing scale and velocity of data create challenges for model accuracy, model refinement and computational scale. Key challenges include network latency and multi-terabyte data sets. HPC is vital to building and running the models in a meaningful timeframe. The interactive panel discussion will focus on the experiences and insights of panelists and audience members.

Long Description: The goal of the BoF session is to bring together pharma industry HPC professionals to discuss current topics of interest and the challenges we face in implementing AI within our environments. The expected outcome is a sharing of ideas, approaches and lessons learned among all participants. Similar past BoFs have been well attended and have provided very valuable exchanges of ideas and information.

Machine learning and deep learning methods have been increasingly applied in the pharmaceutical business in recent years. Usage is growing exponentially and expanding across the industry, including early drug discovery research, drug development, clinical research, manufacturing, marketing and financial forecasting. Many business areas are accelerating their timelines and increasing their overall throughput using deep learning methodologies.

The development and use of ML/DL algorithms is producing a better all-around understanding of business processes and insight into vital decisions. One key to producing meaningful and insightful models is possessing the necessary data. This is a significant challenge: data formats and sources vary widely, and data sets can be extremely large (multiple terabytes). High performance computing plays a vital role in transforming these data sets with ML algorithms and frameworks to provide meaningful insights and results.

The panel discussion will start with brief overviews of a few use cases and challenges.

1. Deep learning is being used in 3D biomedical image processing, providing functionality not previously within reach. Deep learning models have proven very effective at accurately identifying thousands of 3D tumors. Model generation is an iterative process that requires a great deal of fine-tuning; HPC and GPU resources allow this stage to be completed in a reasonable amount of time, which would not otherwise be possible.

2. Application of ML to predict relapse of depression in patients who are at risk for major depressive disorder. A machine learning algorithm was implemented to help doctors predict relapse ahead of time in patients who had stopped taking medications. Using traditional sequential programming, the algorithm would take roughly 300 days to run to completion. Using high performance computing, we were able to cut processing time to approximately 20 hours. These simulations were run frequently to fine-tune the modeling algorithm, which resulted in greater accuracy in identifying high-risk patients, thus providing quality health care.

3. Classification of protein crystallization outcomes using deep convolutional neural networks. ML models have been developed to classify crystal structures. These models can be applied broadly using transfer learning techniques.

4. Challenges of AI in pharma: a perspective from an HPC provider. Exponential data growth, constant application change, ever-increasing dependencies and a myriad of emerging technologies. How do you build for reproducibility?
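
The timing figures quoted in use case 2 imply a speedup that can be worked out directly. The following is a minimal sketch of that arithmetic only; the function name `speedup` and the constant `HOURS_PER_DAY` are illustrative and do not come from the session material, and both input figures are the approximate values given in the description.

```python
HOURS_PER_DAY = 24


def speedup(sequential_days: float, parallel_hours: float) -> float:
    """Return how many times faster the parallel run is than the sequential one."""
    sequential_hours = sequential_days * HOURS_PER_DAY
    return sequential_hours / parallel_hours


# Approximate figures quoted in use case 2 (both are rough estimates).
factor = speedup(sequential_days=300, parallel_hours=20)
print(f"Approximate speedup: {factor:.0f}x")
```

With the quoted figures, 300 days is 7,200 hours, so the reduction to about 20 hours corresponds to a speedup of roughly 360 times.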

