Updated Tue Mar 7 12:54:00 EST 2023
Feb 1: MM assignment 1 Studio 1 in-class BK assignment 1 required readings for Feb 8 observations on Studio 1
Week 2 readings
Studio 2 (basic Unix commands, week 2)
BK assignment 2 (due Feb 15)
Studio 3 (exploratory data analysis, week 3)
BK assignment 3 (due Feb 22)
Studio 4 (starting Python, week 4)
BK assignment 4 (due Mar 1)
Awk to Python
no studio or assignment
Studio 5 (Pandas, week 6)
BK assignment 5 (due Mar 22)
Quick links and old stuff:
Command-line tips and tricks
Loose ends from studio 2
Google Drive for the class
Google Drive class folder
recommended readings for Feb 1
Overview, guidelines, administration
Some files can't be put on the public Internet because of copyright issues. You can find those on the Google Drive for the class.
barrett.zip, a Zip file of Elizabeth Barrett Browning's Sonnets from the Portuguese, which we will use in the first class or two.
shakespeare.zip, a Zip file of Shakespeare's sonnets for the first take-home programming exercise.
Basic Unix commands, a quick summary.
Ken Church's Unix for Poets, a useful take on Unix commands for non-experts.
This seminar introduces students to basic concepts of working with literary texts and working with data. Crossing the divisional boundaries of literary analysis and quantitative and computational reasoning, we'll learn how to develop a compelling research question, to explore a few of the many methodologies for using computation to analyze literature, and to put our work in the context of the long history of literature conceived of as data. We'll think broadly about the role of the humanities in data science, and learn the importance of interpretation, exploration, iteration, creativity, analysis, and critique in both literary and quantitative work.
Weekly readings, reflections, and in-class studio work will introduce the key concepts, methods, and histories of digital humanities focused particularly on literature and data. Students will explore these methods and concepts through short code assignments, reflection work, exercises in data curation and critique, and final projects. Course meetings will begin with short lectures and discussion, and the second half of class will be studio-based, with visits from practitioners and researchers across and beyond campus.
In this class, you will:
The first half of every class will require active listening to lectures and participation in discussion, while the second half of every class will require active engagement and participation in the studio assignments. It is your responsibility to arrive at the seminar on time, to complete the readings and exercises, to complete your reading reflections and discussion questions, and to be prepared to engage in class discussions and activities. Because this course meets only 12 times, more than one unexcused absence will affect your grade, with the exception of medical or personal emergencies. We are still in a global pandemic, and if you find yourself struggling, please contact one of us as soon as possible so that we can work out an accommodation.
KN95 masks are required. Always bring a mask and wear it properly. We know that this is really a pain in the nose most of the time, but your cooperation will help to keep all of us (and our friends and families) safer.
In-Class Participation: 20%
Weekly Reading Assignments and Short Papers: 20%
Programming Exercises: 30%
Oral Presentation & Final Project: 30%
As discussed above, for the discussion parts of class, be prepared to contribute in positive ways to our classroom culture. During lectures, guest visits, and studio sessions we expect active engagement as well as a willingness to try again. Learning how to fail productively is part of both literary analysis and quantitative reasoning. Admit when you don't know something and ask questions so that you can better judge when you need to adjust and change course. Conversely, help your classmates when you can; people learn things at different speeds and in different ways, and hearing how something works from multiple sources can be good for everyone.
Here are a few guidelines for class participation:
This class is an experiment and you are our collaborators. We rely on your observations to know how the course is going so that we can adjust week by week.
Every week (after every class and before the next class) we'll require you to complete a short reflection on the prior class: what was mysterious or frustrating or engaging, what you would like to know more about. These are written reflections, so pay attention to sentence structure and punctuation, but they are informal, around 300 words (and no more than 500). You may choose to reflect on recommended or required readings if we discuss them in class, or even if we don't, but no pressure.
You'll have five main reading and writing assignments over the 12 weeks of the course:
Reflection papers do not have arguments; they're explanatory papers that document your experience completing the exercise or give a reflective critique of an object (e.g., “this dataset was easy to access and I can see how it related to the graph on the website because the article clearly explained it”). We'll give more information on these later on.
Introductions. Who are we and what do we bring to this class? What is the history of the attempt to turn humanities sources into data? In what ways is tabulating or counting words in literature nothing new? In what ways is it (now) entirely new? What is the data in literature? What does a 19th-century sonnet sequence have to teach us about the math of poetry?
Historical overview of the fields of Computational Humanities and Digital Humanities, and questions about methodology. Introduction to a few examples of exploring literature with data. Using sonnet structures to learn about close reading. Close & distant reading as practices. How do what we choose to count, and the way that we count it, teach us about literature? Do we need a computer for this?
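As a small illustration of how the way we count shapes what we see, here is a minimal Python sketch. The sample sentence and the function names are made up for this example and do not come from the course materials; the two counting rules are only two of many possible choices.

```python
import re
from collections import Counter

# A made-up sentence for illustration only; it is not from the course readings.
TEXT = "The rose is red. The Rose is a rose, the poet says."


def count_raw(text):
    """Count whitespace-separated tokens exactly as they appear."""
    return Counter(text.split())


def count_normalized(text):
    """Count words after lowercasing and dropping punctuation."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


raw = count_raw(TEXT)
norm = count_normalized(TEXT)

# The same text gives different answers depending on the counting rule.
print("raw count of 'rose':", raw["rose"])
print("normalized count of 'rose':", norm["rose"])
```

Under the raw rule, "Rose", "rose," and "rose" are three different tokens; under the normalized rule they are one word. Neither rule is the correct one: a reader interested in capitalization or punctuation might prefer the first.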
What do we learn when we make decisions about a format? What are the formats for data work? How does our choice of formats for data shape what questions we can ask? How does the representation of information work in literary formats and standard data formats? What does all of this have to do with how we find things now and later? Data types and file formats.
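One way to see how a format shapes the data is to write the same records in two formats and read them back. The sketch below uses only Python's standard `csv` and `json` modules; the records and field names are hypothetical, chosen just for this example.

```python
import csv
import io
import json

# Hypothetical records for illustration; the titles and field names are assumptions.
poems = [
    {"title": "Sonnet A", "lines": 14, "rhyme_scheme": "abba abba cdc dcd"},
    {"title": "Sonnet B", "lines": 14, "rhyme_scheme": "abab cdcd efef gg"},
]


def to_csv(records):
    """Serialize a list of dicts to CSV text, with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(records[0].keys()))
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


def from_csv(text):
    """Parse CSV text back into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text)))


csv_round_trip = from_csv(to_csv(poems))
json_round_trip = json.loads(json.dumps(poems))

# CSV has no data types: every value comes back as a string.
print(type(csv_round_trip[0]["lines"]).__name__)
# JSON distinguishes numbers from strings, so the integer survives.
print(type(json_round_trip[0]["lines"]).__name__)
```

The line count comes back from CSV as the text "14" and from JSON as the number 14, so a question like "which poems have more than 12 lines?" needs an extra conversion step in one format and not in the other.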
Do we start with a question or do we start with the data? Methods vs. tools vs. questions in reading literature as data. What does “regular” data look like in the humanities? What kinds of data do we use as scholars and readers of literature?
A tour of CDH data. Visit from CDH staff. What is lost when we translate historical sources into tractable data? Spring Break: pick your dataset.
Let’s look at the data you’ve decided to work with. This class will blend studio and discussion as we work out how your data might lead to a good research question. Students may elect to work together on a particular dataset or to work solo. Either way, this class will be devoted to creating a work plan for the rest of the semester, based on what students decide about their literary data, and we’ll leave class with a research question and a plan of action. Data Biography due.
The final four weeks of class will be responsive to the questions students worked on in week seven; we’ll follow the class needs from here on out, so the following weeks are a rough sketch.
What is Natural Language Processing and what does it have to do with literature?
Programming Assignment 5.
Introduction to basic visualization. What stories do we want to tell with our dataset and what is the best way to tell them?