Building scalable health analytic platforms

Computational phenotyping and cloud-based predictive modelling

Groups related to this event

Centre for Health Informatics
Centre for Health Systems and Safety Research
Centre for Healthcare Resilience and Implementation Science

Event date

Monday, 10 August 2015

Abstract

As the adoption of electronic health records (EHRs) has grown, EHRs are now composed of a diverse array of data, including structured information and unstructured clinical progress notes. Two unique challenges need to be addressed in order to utilize EHR data in clinical research and practice:

  1. How to turn complex and messy EHR data into meaningful clinical concepts or phenotypes?
  2. How to efficiently construct and validate clinical predictive models from EHR data?

In this talk, we discuss our approaches to these challenges. For computational phenotyping, we represent EHR data as inter-connected high-order relations, i.e. tensors (e.g. tuples of patient-medication-diagnosis, patient-lab, and patient-symptoms), and then develop expert-guided sparse nonnegative tensor factorization to extract multiple phenotype candidates from EHR data. Most of the phenotype candidates are considered clinically meaningful and have predictive power.
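To make the tensor idea concrete, the following is a minimal sketch of a sparse nonnegative CP factorization of a small patient-medication-diagnosis tensor. It is an illustration only, not the speaker's method: it uses standard multiplicative updates with an L1 penalty for sparsity, it omits the expert-guidance step entirely, and the function names (`nonneg_cp`, `reconstruct`), the parameter `l1`, and the toy data are all hypothetical.

```python
import numpy as np


def nonneg_cp(X, rank, n_iter=200, l1=0.0, seed=0, eps=1e-9):
    """Nonnegative CP factorization of a 3-way tensor X (I x J x K).

    Uses multiplicative updates, which keep every factor entry >= 0.
    `l1` is a hypothetical sparsity weight added to each denominator.
    """
    rng = np.random.default_rng(seed)
    I, J, K = X.shape
    A = rng.random((I, rank))  # patient loadings
    B = rng.random((J, rank))  # medication loadings
    C = rng.random((K, rank))  # diagnosis loadings
    for _ in range(n_iter):
        A *= np.einsum("ijk,jr,kr->ir", X, B, C) / (
            A @ ((B.T @ B) * (C.T @ C)) + l1 + eps
        )
        B *= np.einsum("ijk,ir,kr->jr", X, A, C) / (
            B @ ((A.T @ A) * (C.T @ C)) + l1 + eps
        )
        C *= np.einsum("ijk,ir,jr->kr", X, A, B) / (
            C @ ((A.T @ A) * (B.T @ B)) + l1 + eps
        )
    return A, B, C


def reconstruct(A, B, C):
    """Rebuild the tensor as a sum of rank-one components."""
    return np.einsum("ir,jr,kr->ijk", A, B, C)


# Toy data (made up): 6 patients, 4 medications, 3 diagnoses,
# generated from two planted "phenotypes".
A_true = np.array([[1, 0], [1, 0], [1, 0], [0, 1], [0, 1], [1, 1]], dtype=float)
B_true = np.array([[2, 0], [1, 0], [0, 1], [0, 3]], dtype=float)
C_true = np.array([[1, 0], [0, 1], [1, 1]], dtype=float)
X = reconstruct(A_true, B_true, C_true)

A, B, C = nonneg_cp(X, rank=2, n_iter=300, l1=0.01)
rel_err = np.linalg.norm(X - reconstruct(A, B, C)) / np.linalg.norm(X)
print("relative reconstruction error:", rel_err)
```

Each column triple (one column each from `A`, `B`, and `C`) can be read as one phenotype candidate: the medications and diagnoses that tend to co-occur, together with the patients who exhibit them.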

For predictive modeling, we introduce CloudAtlas, a cloud-based parallel predictive modeling system built on big data infrastructure including Hadoop and Spark. Besides parallel model building, CloudAtlas can accurately estimate the running time and cost of a predictive modeling workflow and then provision the proper cluster on demand in the cloud. In particular, we demonstrate that CloudAtlas can achieve a 40X speedup plus a 40% cost saving compared to traditional sequential execution on large EHR datasets.
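The idea of estimating running time and cost before choosing a cluster can be sketched as follows. This is a simplified illustration under stated assumptions, not CloudAtlas's estimator: it assumes an Amdahl-style time model with a per-node coordination overhead, a flat price per node-hour, and a hypothetical `choose_cluster` helper. All names and numbers are made up, and the model is not intended to reproduce the speedup or cost figures reported in the abstract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    """Hypothetical description of a modeling workflow."""
    serial_hours: float             # work that cannot be parallelized
    parallel_hours: float           # work that divides evenly across nodes
    overhead_hours_per_node: float  # assumed coordination cost per node


def estimated_hours(w: Workload, nodes: int) -> float:
    """Assumed time model: serial part + parallel part / nodes + overhead."""
    return w.serial_hours + w.parallel_hours / nodes + w.overhead_hours_per_node * nodes


def estimated_cost(w: Workload, nodes: int, price_per_node_hour: float) -> float:
    """Every node is billed for the whole estimated running time."""
    return nodes * price_per_node_hour * estimated_hours(w, nodes)


def choose_cluster(w: Workload, price_per_node_hour: float,
                   deadline_hours: float, max_nodes: int = 64):
    """Return the cheapest node count that meets the deadline, or None."""
    feasible = [n for n in range(1, max_nodes + 1)
                if estimated_hours(w, n) <= deadline_hours]
    if not feasible:
        return None
    return min(feasible, key=lambda n: estimated_cost(w, n, price_per_node_hour))


# Example with made-up numbers: about 40 hours of parallelizable work.
workload = Workload(serial_hours=0.5, parallel_hours=40.0,
                    overhead_hours_per_node=0.01)
nodes = choose_cluster(workload, price_per_node_hour=0.5, deadline_hours=3.0)
print("nodes to provision:", nodes)
```

A real system would fit the time model from profiling runs rather than fixed constants; the sketch only shows how a time estimate and a price combine into a provisioning decision.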

Speaker profile

Jimeng Sun is an Associate Professor in the School of Computational Science and Engineering at the College of Computing, Georgia Institute of Technology. Prior to joining Georgia Tech, he was a research staff member at IBM TJ Watson Research Center. His research focuses on health analytics using electronic health records and data mining, especially designing novel tensor analysis and similarity learning methods and developing large-scale predictive modeling systems. He has published over 70 papers and filed over 20 patents (5 granted). He received the ICDM best research paper award in 2008, the SDM best research paper award in 2007, and the KDD Dissertation runner-up award in 2008. Dr. Sun received his B.S. in Computer Science from Hong Kong University of Science and Technology in 2002, and his PhD in Computer Science from Carnegie Mellon University in 2007.

Date: Monday 10 August 2015

Time: 12-1pm

Venue: Seminar Room Level 1, 75 Talavera Road, Macquarie University

Chairperson: Dr Blanca Gallego-Luxan

Content owner: Australian Institute of Health Innovation

Last updated: 11 Mar 2024 6:24pm
