3rd Workshop on Data Mining for Medical Informatics: Learning Health
Nov 12, 2016, Chicago, IL
To be held in conjunction with the AMIA 2016 Annual Symposium
The DMMI 2016 workshop is sponsored by the AMIA Knowledge Discovery and Data Mining Working Group
The life and biomedical sciences are contributing massively to the big data revolution, driven by advances in genome sequencing technology and digital imaging, the growth of clinical data warehouses, the increased role of patients in managing their own health information, and the rapid accumulation of biomedical knowledge. In this context, data mining and machine learning techniques, which aim to discover knowledge and derive data-driven insights from various data sources, have played an increasingly important role in medical informatics. Effective data mining approaches have been applied to many medical problems, including drug development, personalized medicine, disease modeling, cohort studies, and comparative effectiveness research.
The main theme of this year's workshop is learning health, which aims to derive actionable and timely insights from the real-world experience of millions of patients and make them useful to clinicians, patients, and all other healthcare stakeholders. This topic has recently attracted considerable interest and debate. We invite researchers from both academia and industry who are interested in this topic to participate in the workshop, share their opinions and experience, and discuss future directions.
KDDM WG Data Competition
This year's workshop includes a new one-hour session: the KDDM WG Data Competition winner presentation. The competition task is surgical site infection prediction using a dataset extracted from a cohort of 7725 patients undergoing gastrointestinal surgery, with more than 4.5 million blood tests in total. The data sponsor is the University Hospital of North Norway (UNN). The data will contain all blood tests performed on these patients close to the time of surgery, including their numerical or categorical values.
Eighty percent of the data (the training data) will be released to participants for model development; the remaining data will be held out for evaluation. Participants will use the training data to construct a predictive model that identifies high-risk patients susceptible to surgical site infection (SSI). Submissions will be evaluated on quantitative predictive performance on the held-out evaluation dataset and on qualitative clinical relevance. The winner will be announced at the workshop, along with the competition presentations.
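For participants who want to mimic this protocol locally, the hold-out setup might be sketched as follows. This is only an illustration under stated assumptions: the competition does not say whether the split is made per patient or which metric is used, so the patient-level split, the function names `split_patients` and `auc`, and the choice of area under the ROC curve are all hypothetical.

```python
import random


def split_patients(patient_ids, train_fraction=0.8, seed=0):
    """Shuffle patient IDs and split them into training and held-out sets.

    Assumption: the split is made per patient, so that all blood tests of
    one patient end up on the same side of the split.
    """
    ids = sorted(set(patient_ids))  # deduplicate; sort for reproducibility
    rng = random.Random(seed)
    rng.shuffle(ids)
    cut = round(len(ids) * train_fraction)
    return ids[:cut], ids[cut:]


def auc(labels, scores):
    """Area under the ROC curve (assumed metric, not stated by the organizers).

    Computed as the probability that a randomly chosen positive case scores
    higher than a randomly chosen negative case; ties count as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Placeholder IDs standing in for the 7725 patients in the cohort.
train_ids, heldout_ids = split_patients(range(7725))
```

A model would be fitted on the records of `train_ids` only, and `auc` would then be applied to its predicted risk scores for the `heldout_ids` patients.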
This year's DMMI workshop will be co-located with the 2016 American Medical Informatics Association (AMIA) Annual Symposium. The 1st DMMI Workshop was held in 2014 (Washington, DC) and the 2nd in 2015 (San Francisco, CA).
Topic areas for the workshop include (but are not limited to) the following:
- Comparative study of different data mining methodologies in learning health
- Text mining and natural language processing in learning health
- Visual analytics and learning health
- Novel architectures for learning health systems
- Data quality assessment and improvement
- Pattern detection and hypothesis generation from observational data
- Privacy and security issues in learning health systems
- Information fusion and knowledge transfer in healthcare
- Evaluation and validation of learning health methods
- Mining temporal data for guiding timely decision making
- Methods for personalized diagnosis and treatment
Paper Submission and Format Guidelines
We encourage a diverse range of submissions and demonstrations from academia, healthcare organizations, and industry that address any of the topics listed above. Submissions can be for (1) paper / podium presentations, or (2) abstract / podium presentations.
Papers should follow the AMIA format style. Manuscripts must be submitted as Adobe Portable Document Format (PDF) files. Other file formats will not be accepted.
Full papers and abstracts must be submitted electronically through the EasyChair system. Selected submissions will be invited to the International Journal of Big Data and Analytics in Healthcare (IJBDAH) and the Journal of Health Informatics Research.
Workshop Chairs
Program Committee
- Marzyeh Ghassemi. MIT.
- Joyce Ho. Emory University.
- Xia Hu. Texas A&M University.
- Ying Li. IBM T. J. Watson Research Center.
- Zitao Liu. Pinterest.
- Inci M. Baytas. Michigan State University.
- Robert Moskovitch. Ben-Gurion University.
- Ioakeim Perros. Georgia Institute of Technology.
- Narges Razavian. New York University.
- Yiye Zhang. Cornell University.
- Jiayu Zhou. Michigan State University.
Workshop Schedule
Type | Time | Presenter | Title |
Opening Remarks | 8:30-8:35 | Jianying Hu | KDDM WG Opening Remarks |
Invited Talk | 8:35-9:20 | Jane Snowdon | The Power of Data in the Era of Cognitive Computing: The Next Frontier for Healthcare |
Long Paper Presentation | 9:20-9:35 | Shao Fen Liang, Talya Porat, Archana Tapuria, Brendan Delaney and Vasa Curcin | A Dynamic Medical Terminology Mapping System – MeTMapS |
Long Paper Presentation | 9:35-9:50 | Carlo Combi, Pietro Sala and Matteo Mantovani | Approximate Functional Dependencies for expressing Trend-Event correlations: proposal and applications in the clinical domain |
Long Paper Presentation | 9:50-10:05 | Fabrício Kury and Olivier Bodenreider | Desiderata for Drug Classification Systems for their Use in Analyzing Large Drug Prescription Datasets |
Break | 10:05-10:30 | Break | Break |
Invited Talk | 10:30-11:15 | Justin Starren | Mining Clinical Data: Why integrated repositories are the future |
Long Paper Presentation | 11:15-11:30 | Joseph Finkelstein and In Cheol Jeong | Mining Temporal Telemonitoring Data for Advanced Prediction of Asthma Exacerbations |
Long Paper Presentation | 11:30-11:45 | Naresh Sundar Rajan, Ramkiran Gouripeddi and Julio Facelli | Measuring Validity of Phenotyping Algorithms across Disparate Data using a Data Quality Assessment Framework |
Lunch | 11:45-13:00 | Lunch Break | Lunch Break |
Invited Talk | 13:00-13:45 | Rema Padman | Paving the COWPath: Data-driven Service Innovations in Healthcare Delivery |
Data Competition Presentation | 13:45-13:50 | Eileen Koski | Introduction of the Data Challenge |
Data Competition Presentation | 13:50-14:05 | Prabhu RV Shankar, Anupama Kesari, Kamalashree N, Priya Shalini, Charan Bharadwaj, Nitika Raj, Sowrabha Srinivas, Manu Shivkumar, MS, Anand Raj Ulle, MTech, Nagabhushan Tagadur | Predictive Modeling of Surgical Site Infections Using Sparse Laboratory Data |
Data Competition Presentation | 14:05-14:20 | Prathyusha Mandagani, Shaun Coleman, Anam Zahid, Annie Pugel Ehlers, Senjuti Basu Roy, Martine De Cock | Machine Learning Models for Surgical Site Infection Prediction |
Data Competition Presentation | 14:20-14:35 | Kendall Park | Evolving clinically-relevant decision trees to predict surgical site infections |
Data Competition Presentation | 14:35-14:45 | All Audience and Presenters | Q & A |
Break | 14:45-15:15 | Break | Break |
Short Paper Presentation | 15:15-15:25 | Lisiane Pruinelli, Bonnie Westra, Karen Monsen and Gyorgy Simon | A Novel Clustering Methodology to Address Liver Transplant Population Heterogeneity |
Short Paper Presentation | 15:25-15:35 | Bisakha Ray | Automated Topic Detection of Messages in Online Health Forums |
Short Paper Presentation | 15:35-15:45 | Bo Jin, Haoyu Yang, Cao Xiao, Ping Zhang, Xiaopeng Wei and Fei Wang | Multitask Dyadic Prediction and Its Application in Prediction of Adverse Drug-Drug Interaction |
Invited Talk | 15:45-16:30 | Nitesh Chawla | Leveraging big healthcare data to answer important population health management questions |
Invited Talks
Dr. Jane L. Snowdon is the Director, Watson Health Partnerships, for IBM. She is responsible for building a partner ecosystem that aims to transform the medical field and improve both patient care and individual wellness by creating new solutions using Watson Health, Apple ResearchKit and Apple HealthKit. She is an Advisory Board member for The Georgia Institute of Technology. Prior to this role, Jane L. Snowdon was Chief Innovation Officer, IBM U.S. Federal Government, in Washington DC. She was responsible for developing and driving innovation strategy and defining offerings that combine client mission requirements with IBM products, services, IBM Research's technology investments, and Federal Systems Integrator partners. Jane was the Director of IBM's Federal Cloud Innovation Center in Washington DC. She co-chaired the Cyber Security Education and Workforce Development Working Group with the Department of Homeland Security (DHS) and the National Institute of Standards and Technology (NIST). Jane was a member of the Intelligence and National Security Alliance (INSA) Council on Technology and Innovation and DHS's Innovation in Acquisitions Working Group. Jane served as an Advisory Board member for the Center of Innovation and Entrepreneurship at George Mason University.
Title: The Power of Data in the Era of Cognitive Computing: The Next Frontier for Healthcare
Abstract: Researchers and clinicians are increasingly aware that speeding up the quest for cures may hinge on the ability to make sense of vast, complex, and ever-changing information. Diagnosis and treatment require a tremendous understanding of medical literature, population health trends, patient histories, genetics, social determinants, and more. Cognitive systems can empower researchers and clinicians to deliver insights to their patients faster and more easily than previously possible.
This talk will (a) introduce the basic concepts of cognitive computing and informatics in healthcare decision support, and (b) describe case studies where cognitive computing assists doctors in developing individualized, evidence-based treatment options for patients; enhances clinicians’ ability to find clinical trials for which their patients may be eligible; and helps oncologists and radiologists quickly and accurately analyze medical images to improve diagnosis and treatment.
Dr. Nitesh Chawla is the Frank M. Freimann Professor of Computer Science & Engineering and Director of The Interdisciplinary Center for Network Science & Applications (iCeNSA) at the University of Notre Dame. He is passionate about Big Data for the Common Good. His research is making fundamental advances in network science and data science, especially in the areas of link prediction and co-evolution in networks, inter-genre networks, anomaly detection, learning from imbalanced data, non-stationary data, and evaluation issues for machine learning and data mining algorithms. His research is bridging disciplinary boundaries for transformative applications in healthcare, education, environment, and national security, where technology meets society to augment human intelligence and creativity.
Title: Leveraging big healthcare data to answer important population health management questions
Abstract: The availability of big data in healthcare and medicine presents unprecedented opportunities to advance both personalized healthcare and population health management. In this talk, I will provide two examples of leveraging electronic medical records and claims data to draw insights into population health from both resource management and procedure perspectives. First, I will discuss a network-based analysis drawing on nationwide healthcare data, which includes a novel metric to identify diagnosis comorbidity pairs between two generalized population subgroups and which can be particularly valuable for resource planning and targeted care for individuals from specific populations. Second, I will demonstrate how aggregate population-level Medicare data can provide value for physicians themselves, showing how big data can be aggregated from multiple sources to provide insights into highly complex matters such as procedure choice.
Dr. Rema Padman is a Professor of Management Science & Healthcare Informatics, Heinz College of Carnegie Mellon University. Professor Padman's research addresses problems at the interface of healthcare, information technology and management science, particularly healthcare information systems, operational planning and management, and data mining and decision support methods. Her current research in the healthcare domain investigates data mining methods for healthcare decision support; evaluates the use and impact of information technology and systems in healthcare environments, particularly for point-of-care disease management; and examines tradeoffs between access and confidentiality in large multidimensional public-use and healthcare databases. Her research on these topics has been funded by the National Science Foundation, National Library of Medicine, DARPA, and the Army Research Office.
Title: Paving the COWPath: Data-driven Service Innovations in Healthcare Delivery
Abstract: Assessing and responding to many patients’ risks of chronic diseases, their related complications, and their progression is a complex, high-dimensional information processing problem faced by time-constrained clinicians. This talk presents recent research on data-driven service innovations that show promise for delivering substantial, cognitively guided information to clinicians and patients for improving health care delivery and outcomes. We combine statistical machine learning, information visualization, and electronic health data to find (1) informative, contextualized, two-dimensional projections of disease risk assessment, (2) longitudinal trajectories of disease progression, and (3) clinical pathways of the co-progression of multiple clinical events that are associated with chronic disease management.
Insights from these studies can potentially result in new evidence to support clinicians in providing patient-centered treatment approaches and empower patients with chronic conditions to better manage the disease and its complications.
Dr. Justin Starren is the Chief of Health and Biomedical Informatics in the Department of Preventive Medicine and Associate Professor of Preventive Medicine (Health and Biomedical Informatics) and Medical Social Sciences, Northwestern University Feinberg School of Medicine. His current research continues to focus on new ways to make health care computing more useful. This includes developing intuitive, novel Human Computer Interfaces (HCI) for health care, including work on the design of graphical icons for clinical applications, addressing data overload for clinicians, and issues in affective computing. A related line of research is developing methods for the integration of clinical research computing into clinical care.
Title: Mining Clinical Data: Why integrated repositories are the future
Abstract: Many institutions are creating data warehouses to support research and clinical operations. In most instances, research warehouses are partial copies of the operational data. Alternatively, researchers can only access the operational data by working through clinical IT staff. The Northwestern Medicine Enterprise Data Warehouse uses a different model. It is a single, integrated repository of clinical and research data on six million patients. From its initial design, it has served as a single, common repository for both research and clinical operations. This talk will discuss the structure and governance of this unusual model and present its benefits and challenges in practice. Having a common repository allows research results to move more rapidly into practice. This talk will also discuss a number of projects that demonstrate the model.