2nd Workshop on Data Mining for Medical Informatics: Predictive Analytics

Nov 14, 2015, San Francisco, CA

To be held in conjunction with AMIA 2015 Annual Symposium

Data Mining for Medical Informatics (DMMI) is a series of workshops that focus on the use of data mining techniques to address today's challenges in health informatics. The 2015 Workshop on Data Mining for Medical Informatics provides an opportunity for participants to discuss state-of-the-art data mining techniques and review how such techniques can be applied to clinical data.

The main theme this year is predictive analytics, which focuses on computational techniques for predicting clinical outcomes under uncertainty, such as risk assessment, diagnosis, prognosis and treatment effects. Predictive analytical methods are currently subject to intense debate in scientific and applied communities for various reasons:
  • Existing methods are geared towards clean, uniform, cross-sectional datasets and are not suited to prediction from routinely collected EHR data, which are longitudinal, heterogeneous, and messy.
  • Risk prediction models tend to become quickly outdated because of rapid changes in treatment regimens and populations.
  • Existing methods are not suited for exploring "what if" scenarios (causal reasoning).
  • There are increasing concerns over privacy risks and proper consent for data sharing.
The objectives of the workshop are: 
  • Bring together researchers (from both academia and industry) as well as practitioners to present their experience and ideas.
  • Attract healthcare providers who have access to interesting sources of data and problems but lack the expertise in data mining to use the data effectively. 
  • Enhance interactions between data mining and medical informatics communities working on problems from medicine and healthcare.
This year's DMMI workshop will be co-located with the 2015 American Medical Informatics Association (AMIA) Annual Symposium.

Topics and Scope

Topics of interest include but are not limited to:
  • Discussion on different data mining techniques for predictive analytics
  • Text mining for predictive analytics 
  • Visualizations to aid or explain predictive analytics
  • Data quality assessment and improvement
  • Novel architectures for large scale predictive analytics
  • Pattern detection and hypothesis generation from observational data
  • Privacy and security issues in predictive analytics
  • Information fusion and knowledge transfer in healthcare
  • Evolutionary and longitudinal patient and disease models
  • Evaluation and validation of predictive analytics

Paper Submission and Format Guidelines

We encourage a diverse range of submissions and demonstrations from academia, healthcare organizations, and industry that address any of the topics listed above. Submissions can be for (1) paper / podium presentations, or (2) abstract / podium presentations.
  1. Paper submissions must be no more than six pages in length, inclusive of figures and references. 
  2. Abstract submissions are limited to two pages.
Papers should follow the AMIA formatting style. Manuscripts must be submitted as Adobe Portable Document Format (PDF) files. Other file formats will not be accepted.

Full papers and abstracts must be submitted electronically through the EasyChair system at the following link: https://easychair.org/conferences/?conf=dmmi2015

Important Dates

Deadline for submission: September 20th, 2015
Notification of acceptance: October 10th, 2015
Camera-ready Papers Due: October 19th, 2015
Workshop: November 14th, 2015

Workshop Chairs


Fei Wang, University of Connecticut
Gregor Stiglic, University of Maribor
Niels Peek, University of Manchester
Nigam Shah, Stanford University
Adam Perer, IBM T.J. Watson Research Center

Program Committee

Adam Davey, Temple University

Joydeep Ghosh, University of Texas, Austin

Tudor Groza, The University of Queensland

Joyce Ho, University of Texas, Austin

John Holmes, University of Pennsylvania

Siddhartha Jonnalagadda, Northwestern University

Jin-Dong Kim, Database Center for Life Science

Robert Moskovitch, Columbia University

Zoran Obradovic, Temple University

Mykola Pechenizkiy, Eindhoven University of Technology

Mattia Prosperi, University of Manchester

Michael Rothman, PeraHealth

David Sontag, New York University

Lucia Sacchi, University of Pavia

Nicholas Tatonetti, Columbia University

Workshop Schedule

Nov 14, Saturday

8:30 – 8:45

Workshop Opening

8:45 – 9:30

Opening Keynote

Riccardo Bellazzi: Predictive analytics to face precision medicine challenges: integrating and fusing data, re-engineering methods and technologies [Slides]


9:30 – 10:30

Paper Session (10-minute presentations for long papers, 5 minutes for short* papers)


Katherine Niehaus, Joshua Knowles and Nigam Shah: FIND FH – A phenotype model to identify patients with familial hypercholesterolemia [PDF]

Linda Zhang, Daniel Fabbri and Colin Walsh: A Data Driven System for Clinical Preventive Order Recommendations [PDF]

Wuyang Dai, Theodora Brisimi, Tingting Xu, Taiyao Wang, Venkatesh Saligrama and Ioannis Paschalidis: A Joint Clustering and Classification Approach for Healthcare Predictive Analytics [PDF]

Nirav Shah, Vivek Vegi, Ankit Dhingra, Rema Padman, Daniel Nagin and Ari Robicsek: What is a “normal” postoperative temperature? Group based trajectory modeling in postoperative knee arthroplasty patients in a large health system [PDF]

*Narges Razavian and David Sontag: Temporal Convolutional Models of Biomarkers for Disease Diagnosis [PDF]

10:30 – 11:00

Coffee Break

11:00 – 12:15

Predictive Analytics in Practice - short invited talks (12 min) with panel discussion (25 min)


Ziad Obermeyer

David Buckeridge [Slides]

Michael Rothman

Enrico Bertini [Video]


12:15 – 13:15

Lunch Break

13:30 – 14:15

Industry Perspective


Walter (Buzz) Stewart: Application of Predictive Analytics in Health Care Delivery


14:15 – 15:00

Invited talk on open challenges


Peter Szolovits: Predictive Analytics: The Promise and Problems of Change [Slides]


15:00 – 15:30

Coffee Break

15:30 – 16:15

Closing Keynote


Gregory Cooper: Clinical Alerting of Unusual Care that Is Based on Machine Learning from Past EMR Data


16:15 – 16:30

Closing Remarks

Invited Speakers

Enrico Bertini, New York University

Enrico Bertini is Assistant Professor at the NYU Polytechnic School of Engineering in the Department of Computer Science and Engineering. His research focuses on the study of effective data visualization methods and techniques to explore and make sense of large and often high-dimensional data. He also studies how to communicate complex ideas effectively with visualization. His research has been applied to several application domains including: biochemistry, cybersecurity, development finance and climate science. His work is currently applied to healthcare, data journalism and human rights.
Professor Bertini earned his PhD degree in Computer Engineering at Sapienza University of Rome in Italy. Before joining NYU he was a Research Scientist at the University of Fribourg, Switzerland and the University of Konstanz, Germany. He is part of the organizing and program committee of the IEEE VIS conference, the premier conference in the field, and he is one of the founders of the BELIV workshop series on evaluation methods in visualization. He is also the editor of fellinlovewithdata.com and datastori.es, respectively a popular blog and podcast on visualization and data analysis.

Riccardo Bellazzi, University of Pavia

Riccardo Bellazzi is Full Professor of Bioengineering and Medical Informatics at the University of Pavia, Italy. Prof. Bellazzi is the chair of the Interdepartmental Centre for Health Technologies and the director of the Biomedical Informatics Labs “Mario Stefanelli” of the University of Pavia, as well as the director of the Laboratory of Informatics and Systems science of the IRCCS Fondazione S. Maugeri hospital of Pavia. He is a Fellow of the American College of Medical Informatics, a past Vice-President of the International Medical Informatics Association (IMIA), and a past and current member of the program committees of several international conferences in biomedical informatics. He is a member of the editorial boards of four medical informatics journals and an associate editor of the Journal of Biomedical Informatics. His research interests include clinical data mining (with a focus on temporal and probabilistic aspects), translational bioinformatics, and clinical research informatics. Prof. Bellazzi is the author of more than 150 publications in peer-reviewed journals.

Invited talk abstract:

Predictive analytics to face precision medicine challenges: integrating and fusing data, re-engineering methods and technologies

The long-term goal of the precision medicine initiative is to predict optimal therapies for patients and optimal health interventions for citizens by exploiting all available biomedical information, ranging from molecular to clinical variables, and from behavioral to environmental measurements. Predictive modeling is a key component in this scenario, providing methods and tools to derive actionable decision rules of wide clinical applicability.
Starting from a number of projects currently running at the University of Pavia, the talk will present current challenges and potential solutions in predictive modeling to help accomplish the precision medicine agenda. Modeling issues related to “big data” integration and fusion will be discussed, and ideas to re-engineer existing methods and tools for the joint handling of genomics, clinical and exposomics data will be presented.

David Buckeridge, McGill University

David Buckeridge is an Associate Professor of Epidemiology and Biostatistics at McGill University in Montreal where he holds a Canadian Institutes of Health Research (CIHR) Chair in Applied Public Health Research. A Fellow of the Royal College of Physicians and Surgeons of Canada with specialty training in Public Health and Preventive Medicine, Dr Buckeridge practices Public Health as a Medical Consultant to the Montreal Public Health Department and the Quebec Public Health Institute. As a clinician-scientist in public health, his research and practice focus on the informatics of public health surveillance and disease control. 
At McGill, Dr Buckeridge directs the Surveillance Lab, an interdisciplinary group of over twenty students and staff with a mission to develop, implement, and evaluate novel computational methods for public health surveillance. Laboratory research activities are funded by the Canadian Institutes of Health Research, the Natural Sciences and Engineering Research Council, the Canada Foundation for Innovation, and the Bill and Melinda Gates Foundation. Dr Buckeridge has consulted on surveillance to groups such as the Public Health Agency of Canada, Canada Health Infoway, the US Institute of Medicine, the US and Chinese Centers for Disease Control, the European Centers for Disease Control, and the World Health Organization. He has an M.D. from Queen's University, an M.Sc. in Epidemiology from the University of Toronto, and a Ph.D. in Biomedical Informatics from Stanford University.

Gregory Cooper, University of Pittsburgh

Dr. Gregory Cooper is Professor of Biomedical Informatics at the University of Pittsburgh. His research focuses on the application of probabilistic modeling, machine learning, Bayesian statistics, and artificial intelligence to biomedical informatics problems. He is best known for his research on Bayesian networks, especially work on learning Bayesian networks from data.  Current research projects include causal modeling and discovery from big biomedical datasets, machine-learning-based clinical alerting, computer-aided medical diagnosis and prediction, and methods for detecting and characterizing infectious disease outbreaks. 

Invited talk abstract:

Clinical Alerting of Unusual Care that Is Based on Machine Learning from Past EMR Data

Medical errors remain a significant problem in healthcare. Electronic medical records (EMRs) have shown great promise in helping health care providers to identify and reduce medical errors. Computer-based monitoring and alerting systems play a key role in this effort. We have developed a method for alerting that is based on machine learning from EMRs. In particular, the method uses data in an EMR system to learn a probabilistic model of the usual care of past patients. For a current patient, it derives the probability of each clinical care action that the patient has recently received (e.g., the administration of a given medication). A care action that has a very low probability will trigger a clinical alert that the action is anomalous. We hypothesize that anomalous actions correspond to medical errors often enough to make such alerting worthwhile. This approach has several advantages: it provides broad coverage of clinical care, is completely data driven, can readily adapt to new clinical environments and locations, and can be continually updated over time. This talk discusses the implementation and laboratory evaluation of a version of this system that sends alerts on patients in the intensive care unit (ICU). The results suggest that this approach is promising.

Ziad Obermeyer, Harvard Medical School and Brigham and Women's Hospital

Ziad Obermeyer is an Assistant Professor of Emergency Medicine at Brigham & Women's Hospital and Assistant Professor of Health Care Policy at Harvard Medical School. He is a faculty affiliate of the Harvard Institute for Quantitative Social Science and Ariadne Labs at the Harvard School of Public Health. His research combines insights from clinical medicine with methods from biostatistics, computer science, and econometrics, to translate large observational datasets into meaningful inferences at the patient and provider levels. He holds an A.B. (magna cum laude) from Harvard and an M.Phil. from Cambridge, where he was a Frank Knox fellow in the history and philosophy of science. He also holds an MD from Harvard Medical School, and completed a residency in emergency medicine at the Brigham & Women's, Massachusetts General, and Boston Children’s Hospitals.

Michael Rothman, PeraHealth

Dr. Rothman holds a PhD in Physical Chemistry and has over 30 years’ experience in data analysis and mathematical modeling. Over the last ten years, following the avoidable death of his mother in a hospital, he has focused on leveraging the data stored in hospitals’ medical records to improve the quality of care. Dr. Rothman is the developer of the patient acuity models known as the (Florence A.) Rothman Index (RI) and the Pediatric Rothman Index, which are currently being used to assist in patient care at more than 40 hospitals. He is currently the Chief Science Officer of PeraHealth, the company founded to integrate the RI into hospital EMRs. The RI has helped clinicians improve the quality of care for more than 3 million patients. Prior to this work, he was an independent consultant. He spent 18 years at IBM, part of which was at the TJ Watson Research Labs.
Dr. Rothman received his PhD from the University of Michigan, and his ScM and ScB from Brown University.
His work in biomedical informatics has been published in BMJ Open, the Journal of Biomedical Informatics, BMJ Quality and Safety, and the Journal of Hospital Medicine.

Talk title: Excess Risk and the Rothman Index – Building and Validating a General Measure of Patient Acuity

Walter (Buzz) Stewart, Sutter Health

Walter (Buzz) F. Stewart, PhD, MPH, is VP and Chief Research Officer at Sutter Health, which he joined in 2012 with a mission to improve quality, efficiency, and the patient experience using digital health, data analytics, and rapid learning. He oversees the health system-wide strategy for achieving transformational change through research in patient care, population health management, and community health. Prior to joining Sutter Health, Buzz served as Associate Chief Research Officer at Geisinger, where he founded the Center for Health Research in 2003. Buzz is also an entrepreneur, having founded and led several companies in health care and education. From 1983 to 1995, Buzz was on the faculty of the Johns Hopkins Bloomberg School of Public Health, where he continues to serve as an adjunct professor. He has authored more than 350 published articles and book chapters. He received his PhD in Epidemiology from the Johns Hopkins University Bloomberg School of Public Health.

Invited talk abstract: 

Application of Predictive Analytics in Health Care Delivery

The opportunities to apply data mining and medical informatics to clinical care data seem limitless, especially in guiding clinical decision making, facilitating patient management, and improving patient outcomes.  In this presentation, Buzz will focus, in particular, on use of electronic health record data and cover: 1) areas of application for predictive analytics and performance requirements; 2) the diversity of data and challenges with using these data, including dominant sources of noise; 3) data driven versus knowledge driven approaches to modeling; 4) considerations in testing models for use in clinical practice; and 5) barriers to translating models to clinical practice.

Peter Szolovits, Massachusetts Institute of Technology

Peter Szolovits is Professor of Computer Science and Engineering at MIT and a professor in the Harvard/MIT Health Sciences and Technology (HST) program.  He heads the Clinical Decision-Making Group at the MIT Computer Science and Artificial Intelligence Laboratory (CSAIL). His research centers on the application of AI methods to problems of medical decision making, natural language processing to extract meaningful data from clinical narratives to support translational medicine, and the design of information systems for health care institutions and patients. He received his bachelor's degree in physics and his PhD in information science, both from Caltech. Prof. Szolovits was elected to the Institute of Medicine of the National Academies and is a Fellow of the American Association for Artificial Intelligence, the American College of Medical Informatics and the American Institute for Medical and Biological Engineering. He was the 2013 recipient of the Morris F. Collen Award of Excellence from the American College of Medical Informatics.

Invited talk abstract:

Predictive Analytics: The Promise and Problems of Change

New technologies introduce new capabilities but also often highlight unanticipated difficulties. Our ability to make statistical predictions from huge data sets enables new scoring, warning and decision support systems. Yet we are mostly unsure of what to predict, how to present those predictions to users, how accurate our predictions need to be, how they can fit into the workflow of health care, or whether they will in fact improve outcomes. I will review our successful methods of building predictive models, highlight some needed improvements, and speculate on how we can most positively impact health care.