This quarter in the DSI, we will be teaching the following workshops. They are open to any student, professor, or researcher. If you have any questions, please contact us at digitalscholarship@ucdavis.edu.

 

February 3: 10:30am-12:00pm

Minimal Command Line and Git

A hands-on practicum on the Unix/Linux command line and Git version control, during which we’ll learn the basics of getting around in a command line shell and using Git to manage files. No prior programming or Unix command line experience is necessary. We’ll download and install all the needed software on our machines together, have some directed play on the command line, connect to a remote Git repository, and push a repo and commits to our own remote Git repositories. We’ll spend very little time talking about what’s under the hood or investigating the many powerful tools that Git offers advanced users. The workshop is a boots-on-the-ground, get-your-hands-dirty practicum designed to get you up and running on your own computer and give you the skills necessary to start using command line Git. (Note: if you don’t know what the command line is, you should definitely take this workshop.) Participants must bring a laptop on which they have administrative privileges (the ability to install software).

 

February 3: 12:00-1:30pm

Text Mining Fundamentals

This workshop will introduce basic concepts in text mining and Natural Language Processing through discussion and hands-on coding of the text processing functions that lay the groundwork for nearly all text mining processes. No prior programming experience is necessary to take the workshop; however, familiarity with the command line and Git is a prerequisite. (Participation in the February 3rd workshop on Minimal Command Line and Git, from 10:30am to 12:00pm, will prepare you well for this workshop.) We’ll code together as a group, leaving no text-miner behind. Topics covered will include: word frequency analysis, basic chunking/tokenization, token distribution, and keyword in context (KWIC) analysis. Please come to the workshop with a working R development environment and Git already installed and operational on your system.

 

February 17: 12:00-1:30pm

Natural Language Processing: Text Normalization and Entity Extraction

This hands-on workshop will introduce a collection of basic practices in Natural Language Processing. Topics covered will include: named entity extraction, part-of-speech tagging, stemming, and lemmatization. Participants must have some programming experience in R and familiarity with the Unix command line and Git. (Participation in the February 3rd workshops on Minimal Command Line and Git and on Text Mining Fundamentals will prepare you well for this workshop.) Please come to the workshop with a working R development environment and Git already installed and operational on your system.

 

March 3: 12:00-1:30pm

Natural Language Processing: Form and Meaning

This workshop will cover NLP methods for assessing syntactic and semantic complexity and for identifying key concepts represented in texts. Specific topics covered will include: hapax richness, author attribution, and Term Frequency-Inverse Document Frequency (TF-IDF) weighting. Participants must have some programming experience in R and familiarity with the Unix command line and Git. (Participation in the February 3rd workshops on Minimal Command Line and Git and on Text Mining Fundamentals will prepare you well for this workshop.) Please come to the workshop with a working R development environment and Git already installed and operational on your system.

 

March 10: 12:00-1:30pm

Natural Language Processing: Text Classification

This hands-on workshop will cover the theory and practice of topic modeling as a method of unsupervised text classification. We will run a variety of models on the same corpus, identifying and discussing the function of model parameters and their effect on output. Validation practices will also be discussed. Participants should have moderate experience with R and Git. Familiarity with the R “tm” package will be beneficial but is not required. (Participation in the previous four workshops in this series will prepare you well for this workshop.) Please come to the workshop with a working R development environment and Git already installed and operational on your system.