The Health Data Science (HDS) cluster is one of UC Davis DataLab’s Research and Learning Clusters (RLCs), which operate independently with DataLab support. HDS focuses on computing systems, data security technology, and analytic methods that advance research on health data. We asked the graduate students involved in the cluster about the work they are doing to advance their own research and enrich the cluster community.

Charlie Fornaca

Acquiring enough data for downstream analyses is often the first hurdle in any data science problem. Healthcare data is not only scarce but also particularly hard to obtain, because it is sensitive and contains personal identifiers. Synthetic medical data can help researchers speed up the development of statistical methods and analytics. Many solutions for generating synthetic medical data exist, with different combinations of features, costs, and user-friendliness. My goal is to explore and compare the existing solutions for generating synthetic data, both closed-source and open-source.

Chitrabhanu Gupta

I am working on data access technologies for researchers in healthcare. My goal is to facilitate research by simplifying the process of gaining access to sensitive data and research computing environments. To accomplish this, I have been studying the scientific workflows of various researchers and analyzing how well state-of-the-art secure research computing frameworks and tools serve those workflows. I am also testing data management tools and researching how to integrate Differential Privacy into them. Differential Privacy is a method for sharing sensitive datasets by describing the patterns of groups within the dataset while withholding information about individuals in the dataset. This is typically achieved by adding calibrated random noise to the data or to the results of queries on it. A successful implementation would allow researchers to gain access to sensitive data, share such data easily within the differentially private framework, and thus advance their research.
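
To make the noise-injection idea concrete, here is a minimal sketch of one common approach, the Laplace mechanism, applied to a counting query. It is an illustration only, not the tooling being evaluated in this project: the function names, the toy list of ages, and the choice of epsilon are all assumptions made for the example.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from a Laplace(0, scale) distribution.

    The difference of two independent exponential variables with
    mean `scale` follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon, rng=None):
    """Return a noisy count of the records that satisfy `predicate`.

    Adding or removing one person changes a count by at most 1
    (sensitivity 1), so Laplace noise with scale 1 / epsilon is used.
    Smaller epsilon means more noise and stronger privacy.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    if rng is None:
        rng = random.Random()
    true_count = sum(1 for record in records if predicate(record))
    return true_count + laplace_noise(1.0 / epsilon, rng)


# Hypothetical toy data: patient ages (not real records).
ages = [34, 51, 67, 45, 72, 29, 58, 63]

# How many patients are 60 or older? The true answer is 3; the
# released answer is the true count plus random noise.
noisy_answer = private_count(ages, lambda age: age >= 60, epsilon=0.5)
print(f"Noisy count of patients aged 60+: {noisy_answer:.2f}")
```

Any single released answer can be off by a few units, but the noise averages out over large groups, which is why this kind of release describes group patterns far better than it describes any one individual. Each additional query also spends more of the privacy budget, so a real deployment would need to track the total epsilon used.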

Chris Lucas

I am a Health Informatics graduate student with a background in biological science and clinical research, with an emphasis on oncology and microbiology. As part of HDS, I serve as a liaison between the health sciences and data science groups. My goal within the group is to make data science tools less intimidating for the average user. UC Davis Health and the UC Davis main campus have worked hard on multiple data science tools and projects that could assist UC Davis Health researchers in many of their research projects. However, these tools are underutilized and, in my experience, poorly understood by medical professionals when they are used. As a health informatics student, I attempt to straddle both worlds in order to bring data scientists and health researchers together.

If you would like to learn more about the Health Data Science & Systems RLC, you can visit their cluster page here.