Health Data Science and Systems Meeting 10:30am – 12:00pm

There is a big difference between learning Natural Language Processing (NLP) and executing your first full-cycle NLP project. This talk grew out of our reflections on several years of effort building out a continuous-improvement NLP lifecycle. We will show you what it means to ‘zoom out’ and consider the true effort an NLP project requires. We will also discuss the extensive ramifications of executing inside a clinical environment. All of this will be done within the context of an example use case: imagine you have been tasked with running all 300 million notes at your health system through a binary classification exercise. The talk will be a mix of slideshow-guided lecture and Jupyter-based code demonstrations to help solidify a few key points. At the end of the talk, you will receive a copy of the slideshow and the code to review and practice with on your own.

Bill Riedl graduated from the UC Davis Health Informatics program in 2011 and has been employed at the health system ever since, holding a variety of roles. Bill specializes primarily in Natural Language Processing and time series analysis, has a growing interest in multi-modal representation strategies, and builds many web applications and APIs. His career has covered a wide range of clinical domains, including inpatient and critical care medicine, chronic disease management, mobile health, and cancer. Most recently, Bill has produced a continuous-improvement lifecycle and associated technologies for training, deploying, and maintaining Natural Language Processing models for production use at UCDHS. He is currently working on a massive effort to de-identify all 300 million notes in the UCDHS EMR to create the world’s largest de-identified clinical notes repository.