Research overview

My research is in machine learning/data mining and natural language processing, with an emphasis on applications in health informatics.

For example, one of my core ongoing research aims concerns optimizing the processes of evidence-based medicine using novel natural language processing and machine learning methods. The goal is to reduce the (human) workload involved in conducting systematic reviews, so that we can realize the promise of evidence-based care in an era of information overload.

More broadly, I am interested in core machine learning/natural language processing issues: e.g., structured and unstructured classification techniques; semi-supervised learning methods; learning from imbalanced data; and learning from alternative forms of supervision. Finally, I have recently been involved with projects that involve statistical models for novel problems in NLP, including narrative and sociolinguistic structures.


My work has been supported by grants from the National Institutes of Health, the National Science Foundation, the Army Research Office, Seton hospital and Amazon, and by seed funds from Brown University.

Machine Learning in Healthcare (MLHC)

I'm co-chairing MLHC (formerly MUCMD) this year; please check out the call for papers and consider submitting something.


06/01/2016 Talk @ U Lisbon/INESC-ID

I'll be giving a talk at the University of Lisbon this June.

05/19/2016 NIH grant funded

Our NIH "Big Data to Knowledge" proposal, "Crowdsourcing Mark-up of the Medical Literature to Support Evidence-Based Medicine and Develop Automated Annotation Capabilities," has been selected for funding! This is a collaborative effort with Ani Nenkova and Zachary Ives.

05/01/2016 Joining Northeastern CCIS

I'll be joining the College of Computer and Information Science (CCIS) at Northeastern University this coming fall!

A random sample of semi-recent publications

Byron C. Wallace, Kevin Small, Carla E. Brodley and Thomas A. Trikalinos. Active learning for biomedical citation screening. ACM SIGKDD International Conference on Knowledge Discovery and Data Mining; 2010.

Byron C. Wallace, Kevin Small, Carla E. Brodley, Joseph Lau, Christopher H. Schmid, Lars Bertram, Christina M. Lill, Joshua T. Cohen and Thomas A. Trikalinos. Toward modernizing the systematic review pipeline in genetics: efficient updating via data mining. Genetics in Medicine; 2012.

Mengqi Jin, Hongli Li, Christopher H. Schmid and Byron C. Wallace. Using Electronic Medical Records and Physician Data to Improve Information Retrieval for Evidence-Based Care. International Conference on Healthcare Informatics (ICHI); 2016.