CA Election 2022 Data Challenge

November 7th Public Symposium

November 7th (5-7pm) was the finale event of the CA Election 2022 Data Challenge! We hosted keynote speakers MacKenzie Smith (Vice Provost of Digital Scholarship at UC Davis), Jesse Salinas (Yolo County’s Assessor/Clerk-Recorder/Chief Elections Official), and CalMatters data journalist Jeremia Kimelman and data reporter Erica Yee, and announced this year's Challenge winners: Team A-to-ZEV! Congratulations to A-to-ZEV and to Jean Ji, this year's collegiality award winner.

MacKenzie Smith, Vice Provost of Digital Scholarship and University Librarian at UC Davis

MacKenzie Smith creates and leads the strategic vision for the UC Davis libraries and oversees operations for the university’s four main libraries (Shields, Physical Sciences & Engineering, Carlson Health Sciences, and Blaisdell Medical). She leads the library’s efforts to integrate digital resources and information technology to serve the university’s academic community. She has published on information technology and digital knowledge management, and works toward creating and sustaining efforts that enhance efficient and effective access to knowledge. Smith leads the DataLab advisory group.

Jesse Salinas, Assessor/Clerk-Recorder/Registrar of Voters for Yolo County

Jesse Salinas has extensive experience working in government and nonprofit leadership. He has served Yolo County as the Assessor/Clerk-Recorder/Registrar of Voters since 2016, and also serves on the Election, Recorder and Assessor Association’s Legislative Committees. Previously he served a decade as principal management analyst in Yolo County’s Administrator’s Office and Department of Financial Services. He worked for the CA School Boards Association as its first research analyst, helping to shape the state’s education policy discussion. In the nonprofit sector, he served as an Executive Director of Communities in Schools of CA. Jesse holds a bachelor’s degree from UC Santa Cruz with a major in Psychology and minor in Computer Science, and a Master’s degree in Public Policy from the Goldman School of Public Policy at UC Berkeley.

Jeremia Kimelman, Data Journalist, CalMatters

Jeremia Kimelman is a data journalist who uses code and data to make policy and politicians easier to understand. He was previously a graphics editor at the COVID Tracking Project and a data journalist at NBC News covering elections and national politics. He is a UC Davis alum who is excited to be back home in California after an extended time as a New Yorker. If he isn’t on the computer you can find him out in the garden or on a bicycle.

Erica Yee, Data Reporter, CalMatters

Erica Yee is the data reporter at CalMatters helping develop data-driven graphics. As a California native, she is excited to contribute to important coverage of state issues through engaging projects. She is a recent graduate of Northeastern University, where she majored in journalism and information science. Erica previously interned at the San Francisco Chronicle, CNBC, and Boston.com.

Challenge Results

Six teams of UC Davis students, scholars, and alumni submitted final projects to the Challenge. The Challenge engaged over 100 participants, many of whom were undergraduates. Participants' backgrounds reflected the diversity of our campus, hailing from many domains ranging from computer science, statistics and applied math to psychology, sociology, cognitive science, and political science. Teams developed a research question related to a proposition on the California November 2022 ballot, and leveraged public datasets to create a reproducible project and data visualization to communicate their findings.

Projects were shared via an online asynchronous showcase and reviewed by a dozen judges including UC Davis faculty, data scientists, and invited external reviewers. The judges were universally impressed with the quality and creativity of the projects completed for this short competition. The Challenge mentors also praised the teams’ curiosity and interest in asking meaningful questions to help inform civic dialogue. One judge remarked, “While I know this wasn’t an explicit goal of the Challenge, the projects were really interesting and I now have a better idea of how I’m going to vote.” Congratulations to all of our participating teams! Learn more about their projects below.

The winning projects best epitomized the Challenge’s goals of promoting data literacy and employed reproducible research methods to make data-driven storytelling accessible. These teams used innovative approaches to investigate thoughtful, civically relevant questions.

Join us at the public Symposium on November 7th for the 2022 competition results!

DataVis

Team “DataVis” explored Prop 29, which would impose new rules on dialysis clinics including requiring at least one physician or nurse practitioner on site or remotely available during operating hours. By using CA state data on specialty care clinics and physician salaries, this team calculated the net expenditures of dialysis clinics and investigated whether adding a physician to a clinic could cause it to shut down. While they acknowledge limitations of the data and approach used, their preliminary findings indicate that some clinics will likely lose revenue if the proposition passes, with potential impacts for patient costs and accessibility, although quantifying those was outside the scope of their analysis. Exploring the potential operational improvements if the proposition passes, including benefits to patient quality of care, would be an interesting next step for this research.

Team members include undergraduate students Sai Sindura Vuppu (Statistics), Srihita Ramini (Cognitive Science & Computer Science), Vibha Raju (Computer Science), and Vaishnavi Kulkarni (Applied Mathematics).

A-to-ZEV

Team “A-to-ZEV” explored Prop 30, which proposes additional taxation of California’s multimillionaires to fund zero-emissions vehicle programs and Wildfire Response and Prevention Programs. They utilized state personal income tax and median home price data combined with data from the American Community Survey and CA Energy Commission to explore the proportion and location of CA residents who would be affected by the increased taxation. They also used these data to identify where new electric vehicle chargers would potentially need to be built to increase equitable access across the state. They conclude that only a very small number of Californians would be affected by the increased income tax, and that a significant number of locations experiencing a “charger gap” would be likely to receive new EV chargers as a result of the taxation, although funds for those specific locations aren’t explicitly earmarked in the proposition.

Team members include graduate students Tisura Gamage (Transportation Technology and Policy), Jean Ji (Energy Systems), and Trisha Ramadoss (Transportation Technology and Policy).

GitData

Team “GitData” explored Prop 30, which proposes additional taxation of California’s multimillionaires to fund zero-emissions vehicle programs and Wildfire Response and Prevention Programs. Using CA state data on vehicle and wildfire emissions combined with the CA annual personal income tax report, this team calculated the amount of greenhouse gases that could potentially be reduced if this proposition is implemented. They also investigated whether the goals of these programs are achievable in the timeline proposed. Their preliminary findings suggest that even if this proposition passes, it is unlikely that the state will achieve its goal of reducing greenhouse gas emissions to 80% below the 1990 level—85.371 million metric tons—by the year 2050 with this proposition alone. They suggest effective future partner initiatives should also aim to reduce other sources of greenhouse gas emissions, such as cargo ships, industrial pollution, planes, and public transportation.

Team members include undergraduate students Weilin Chen (Statistics), Hengyuan Liu (Statistics), Kathy Mo (Statistics), and Li Yuan (Data Science, University of Michigan).

Occam's Razor

Team “Occam’s Razor” explored Prop 30, which proposes additional taxation of California’s multimillionaires to fund zero-emissions vehicle programs and Wildfire Response and Prevention Programs. They used state infrastructure data on zero-emissions vehicle sales and charging station locations across California to investigate the potential effectiveness of Prop 30 if it passes. They found that locations with large EV sales, such as Los Angeles and San Bernardino counties, tend to have large numbers of available chargers as well, and conclude that if Prop 30 fails, private networks (which cost more but need less maintenance over time) could potentially benefit more than public networks (which utilize more power and have slower charging speeds).

Team members include Statistics undergraduate students Lukas Barrett, Kaleem Ezatullah, Andrew Muench, Michelle Tsang, and Connor Young.

Deadline Warriors

Team “Deadline Warriors” explored Prop 27, which proposes the legalization of online sports betting for ages 21+. This team utilized sports betting data gathered over the course of a year to investigate the impact of legalizing online sports betting on existing demographics. Their project considers the broader online sports betting market and finds that there are gender and age discrepancies in risky gambling behaviors, online participation in sports betting, and win-loss variance, with men ages 20-40 exhibiting the most frequent and riskiest betting behaviors.

Team members include undergraduate students Owen Levinthal (Statistics), Kevin Gui (Data Science), and Karissa Ning (Data Science), and alumnus Nicholas Goray (Communications and Computer Science).

Unnamed

Team “Unnamed” explored Prop 27, which proposes the legalization of online sports betting for ages 21+. Team members utilized state population, stadium location, and stadium capacity data to explore Californians’ investment in sports and sports gambling and interrogate whether the passing of Prop 27 would indeed improve circumstances for the unhoused populations in the state. In comparing CA state data with that of other states, the team concluded that the economic benefits would contribute to solving the homelessness crisis in the state, which has been exacerbated by the pandemic.

Team members include undergraduate students Geyang Guo (Artificial Intelligence), Siyu Liu (Economics), and Xinwei Song (Data Science).

About the Challenge

Overview

California voters are presented with several ballot initiatives each election year. These propositions are an important way for Californians to shape the future of our state.

But many voters say there are too many of them and that they’re too complicated and confusing to understand. Voters also often worry about their ability to make an informed decision. The UC Davis DataLab runs Election Data Challenges to leverage public data to help us understand each election’s ballot initiatives, grow our data science community, and encourage participation in the civic process.

For the CA 2022 Election Data Challenge, participants working in teams of two or more selected one of the November 8th, 2022 California ballot initiatives and, using at least one publicly available dataset, created a project culminating in data visualizations that explored or analyzed an aspect of the issue. Multiple teams could choose to work on the same ballot initiative, but each team needed its own unique research question and project.

The 2022 Challenge built upon DataLab’s “PropFest 2018” and “CA 2020 Election Data Challenge,” where successful projects included pursuits that:

  • Analyzed potential impacts of a proposed initiative on specific regions, sectors, and/or populations;
  • Tracked and summarized the historical development of a proposition, including its supporters and opponents;
  • Uncovered trends in public response to the issue; and
  • Fact-checked rhetoric or claims on both sides of the debate.

DataLab provided support in helping match participants into teams and get started on their projects. We also hosted weekly open work sessions, technical office hours, and mentor Q&A sessions (see detailed Timeline, below). By October 24th all competing teams uploaded a short (< 10 minute) video presentation of their project and data visualization(s) (along with the link to the project’s public GitHub repository that includes a README and brief report) to a Virtual Showcase which ran asynchronously on Slack from October 25th-26th. All teams were encouraged to submit their project (even if unfinished), review each other’s visualizations, and offer helpful, supportive, and constructive comments and questions. By observing the progress of the other teams, participants not only grew their network and skill set, but also gained insights to help improve their final project. The most collegial individuals also won a prize!

Judges from DataLab and across the University reviewed all submissions. Selected finalists received additional mentorship and won up to $500 as well as the opportunity to present their project to the broader campus community at an online public Symposium on Monday, November 7th (5-7pm).

Prizes

Anyone affiliated with UC Davis or the wider UC community is invited to participate on a team. In past years we’ve had teams composed of undergraduates, graduate students, postdocs, staff and even high school students engage in the Data Challenge. Prizes are only awarded to teams whose project focuses on an issue relating to a single initiative on the November 8th, 2022 CA ballot. Only current UC Davis students and postdoctoral scholars are eligible to win monetary prizes; teams without a lead who is a current UC Davis student or postdoc are welcome to submit a project to earn a certificate of participation and win swag packs.

Prior team prize categories have included:

  • Most accessible
  • Most innovative
  • Most data-licious

Individual prizes are also awarded to the participants who demonstrate great collegiality, perseverance, and high engagement throughout the Data Challenge. This includes providing helpful and supportive feedback and resources to other participants and teams on the Slack workspace, during the Showcase, and during other Challenge-related activities.

Expectations

The goal of the Data Challenge is to support data literacy and explore data visualization applications to promote quantitatively informed civic dialogue. The emphasis of this challenge is on the process of working with data to uncover insights and provide an experience for applying data science to address real-world challenges. Projects and data visualizations can encompass anything related to the ballot initiative, but this challenge will not support political agendas. The goal is not to convince people how to vote, but to help yourself and the wider community understand how to use data to investigate civic topics including the ballot initiatives.

Full transparency of the data, code, outputs, and interpretations is expected from all participants. In addition:

  • Projects must use at least one publicly available dataset.
  • All data visualizations must be reproducible. 
  • All projects must include a summary report and detailed documentation.

Teams must provide access to all materials used to produce their data visualization through a public GitHub repository. Best practices are expected for the organization of the repository, which should include all data, code, and outputs, along with a detailed README explaining the files and links to the source datasets.
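
As a rough illustration of the repository expectations above, the sketch below checks whether a project folder contains the pieces mentioned: data, code, outputs, and a README. The folder names (`data`, `code`, `outputs`) and the accepted README file names are assumptions made for this example only; the Challenge does not prescribe a specific layout, and the function name `missing_items` is hypothetical.

```python
from pathlib import Path

# Assumed layout for illustration only; the Challenge does not mandate these names.
EXPECTED_DIRS = ("data", "code", "outputs")
README_NAMES = ("README.md", "README.txt", "README")


def missing_items(repo_root):
    """Return the expected folders or files that are absent from repo_root."""
    root = Path(repo_root)
    # Each expected folder is reported with a trailing slash if it is missing.
    missing = [name + "/" for name in EXPECTED_DIRS if not (root / name).is_dir()]
    # Any one of the accepted README file names satisfies the README check.
    if not any((root / name).is_file() for name in README_NAMES):
        missing.append("README")
    return missing
```

Calling `missing_items(".")` from the top level of a cloned repository returns an empty list when everything in the assumed layout is present, and otherwise lists what is still missing.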

For both the Showcase and Symposium presentations, teams should explain their data visualization, and highlight the process used for its development. Template slides will be provided to all registered teams. At a minimum, presentations should include:

    • Brief overview of the issue (your research question) and its relevance for the given ballot initiative;
    • Where and how the data were obtained;
    • What tools, technologies, and techniques were used to analyze and visualize the data;
    • How the findings were interpreted;
    • What the data illuminate about a given issue pertaining to the ballot initiative; and
    • Limitations of the source data or resulting visualization for understanding the issue.

This Challenge provides an opportunity to learn and practice the process of developing a data science project. For the Showcase and Symposium, teams are encouraged to share any challenges they faced developing the project, how they overcame those challenges, and ask for suggestions and advice from others.

Timeline

Friday Sept 23: Challenge website goes live!
Tuesday Sept 27, 6-7pm: Virtual Challenge Kickoff. We will discuss goals of the challenge, introduce the timeline and resources, provide details about the showcase and symposium, and answer participant questions. We’ll then open up breakout rooms to facilitate team match-making for interested participants. Register now for the Zoom link.
Wednesday Sept 28, 1-2:30pm: DataLab Technical Drop-in Office Hours
Thursday Sept 29, 3:30-4pm: In-person Kickoff (Zoom link also available).
Thursday Sept 29, 4-6pm: Open in-person work session and team match-making (DataLab Classroom, Shields 360). Individuals who do not have a team should submit the registration form in advance to receive help getting matched with a team and to join the Data Challenge Slack workspace.
Tuesday Oct 4, 4-6pm: Mentor Q&A session (registered teams must RSVP for Zoom link)
Wednesday Oct 5, 1-2:30pm: DataLab Technical Drop-in Office Hours
Thursday Oct 6, 4-6pm: Open in-person work session (DataLab Classroom, Shields 360)
Tuesday Oct 11, 4-6pm: Mentor Q&A session (registered teams must RSVP for Zoom link)
Wednesday Oct 12, 1-2:30pm: DataLab Technical Drop-in Office Hours
Thursday Oct 13, 4-6pm: Open work session (moved to Zoom; see Slack for link)
Tuesday Oct 18, 4-6pm: Mentor Q&A session (registered teams must RSVP for Zoom link)
Wednesday Oct 19, 1-2:30pm: DataLab Technical Drop-in Office Hours
Thursday Oct 20, 4-6pm: Open work session (moved to Zoom; see Slack for link)
Friday Oct 21, 10am-12pm: Final check-in Q&A with Challenge organizers before the virtual showcase. Zoom link will be distributed to all registered teams and posted on the Challenge Slack workspace.
Monday Oct 24, 12pm: Deadline to submit to the Showcase (see Slack workspace for instructions). All teams must be registered by this date.
Tuesday-Wednesday Oct 25-26: Virtual Showcase for registered teams, mentors and judges!
Monday-Tuesday Oct 31-Nov 1: Judges announce finalists to present at the Symposium. Finalists meet with Organizers for presentation refinement.
Monday Nov 7, 5-7pm: Public Symposium featuring keynote speakers and presentations by Challenge finalists. Free and open to the public, but registration is required to receive the link.
Tuesday Nov 8: Election Day. GET OUT AND VOTE!

Watch the Orientation Video

 

Challenge Mentors

Thank you to all of the UC Davis faculty, staff and other volunteers who helped mentor this year’s challenge teams:

Krishnakumar Balasubramanian, Assistant Professor, Statistics

Colin Cameron, Professor, Economics

Shizhe Chen, Assistant Professor, Statistics

Katherine Florey, Martin Luther King Jr. Professor of Law, School of Law

Dahlia Garas, Research Program Director, Plug-In Hybrid & Electric Vehicle Research Center, Institute of Transportation Studies

Peter Kramlinger, Visiting Assistant Professor, Statistics

Can Lee, Assistant Professor, Statistics

David Michalski, Social and Cultural Services Librarian, Shields Library

Christy Navarro, Health Informatics Research Data Officer, Clinical and Translational Science Center

Adam Siegel, Languages and Literatures Librarian, Shields Library

Amy Studer, Health Sciences Librarian, Blaisdell Medical Library

Aaron Tang, Professor of Law, School of Law

Megan Van Noord, Health Sciences Librarian, Carlson Health Sciences Library

Xiner Zhou, Ph.D. Candidate, Biostatistics

Registration Info

Virtual Kickoff and Team Matching Event

Join us on September 27th from 6-7pm (Zoom only) and September 29th from 3:30-6pm (in person + livestream) to learn about the Challenge goals, timeline and resources, and connect with potential teammates. Register here for the Zoom link.

Register Your Team

Have a team in mind and want to get started? Register now to participate in the challenge events and be added to our Slack workspace. Don’t have a team but want to participate? Complete the individual registration form by September 28th at noon and we will provide support with team matching. All teams must be registered by October 24th to be eligible to compete in the Virtual Showcase to win prizes and the chance to present at the Symposium! Interested but not sure if you qualify to participate? Contact us.

Get Help

Mentor Q&A Sessions

Experts from the UC Davis community have volunteered to meet with participating teams periodically throughout the Data Challenge. Zoom links for these sessions will be distributed in advance through the Challenge’s Slack workspace.

More information coming soon! 

Technical Mentor Office Hours

DataLab’s data science team hosts weekly drop-in office hours on Wednesdays from 1-2:30 pm. They can provide support with developing research questions and approaches, troubleshooting your code, and finding learning resources. To join the virtual office hours see this page for details and how to obtain the Zoom link. DataLab’s technical experts at these drop-in sessions include:

Wesley Brooks, Research Data Scientist
Oliver Kreylos, Virtual Reality Data Specialist
Pamela Reynolds, DataLab Associate Director
Tyler Shoemaker, Postdoctoral Scholar of Digital Humanities
Michele Tobias, Geospatial Data Specialist

Final Check-In with Challenge Organizers

A final optional check-in session for all teams is scheduled for Friday, October 21st on Zoom from 10am-noon. The Challenge organizers will go over guidelines for project submissions to the virtual showcase and answer any questions. Materials and Zoom link will be distributed in advance through the Challenge’s Slack workspace.

Challenge Slack Workspace

All registered individuals and teams are invited to join the Challenge’s Slack workspace. Check out channels #getting_started, #resources, #team_formation and #help_me to ask for and share helpful tips.

More Resources

Find a Ballot Initiative

The November 8th, 2022 CA ballot initiatives cover topics including health care, human rights, and tax reform.

Ballot Initiative Number  Name of Initiative                               Topic
Prop 1                    Guarantee Abortion Rights in State Constitution  Human Rights, Health
Prop 26                   Legalize Sports Betting at Tribal Casinos        Taxes
Prop 27                   Allow Online Sports Betting                      Taxes
Prop 28                   Guarantee Funding for Arts and Music Education   Education
Prop 29                   Impose New Rules on Dialysis Clinics             Health
Prop 30                   Tax Millionaires for Electric Vehicle Programs   Taxes, Climate Change
Prop 31                   Uphold Ban on Flavored Tobacco Products          Health

Discover Open Data

Teams are expected to use at least one open data set for their project. Not sure where to start? Check out this portal made available for California. Need to find more publicly accessible datasets? Come to the mentoring sessions and/or reach out to your Research Librarians!

Setting Up Your Project

Documenting Your Project

Learning New Data Science Skills

Check out the recordings, slides and code repositories from previous DataLab workshops, which cover everything from getting started with git and GitHub to working in R, Python, SQL, and QGIS, as well as topics such as machine learning, data visualization, and Bayesian statistics. Interested in a specific topic? Send us an email to suggest a topic for a future workshop!

Research Librarians