Introduction to Text Analysis: A Coursebook


[Crossposted on the WLUDH blog]

I am happy to share publicly the initial release of a project that I have been shopping around in various talks and presentations for a while now. This semester, I co-taught a course on “Scandal, Crime, and Spectacle in the 19th Century” with Professor Sarah Horowitz in the history department here at Washington and Lee University. The course counted for digital humanities credit for our students, who were given a quick and dirty introduction to text analysis over the course of the term. In preparing for the class, I knew that I wanted my teaching materials on text analysis to be publicly available for others to use and learn from. One option might have been to blog aggressively during the semester, but I worried that I would let the project slide, particularly once teaching got underway. Early conversations with Professor Horowitz suggested, instead, that we take advantage of time that we both had over the summer and experiment. By assembling our lesson plans far in advance, we could collaboratively author them and share them in a format that would be legible to our students, our colleagues, and a wider audience. I would learn from her, she from me, and the product would be a set of resources useful to others.

At a later date I will write more on the collaboration, particularly on how the co-writing process was a way for both of us to build our digital skill sets. For now, though, I want to share the results of our work: Introduction to Text Analysis: A Coursebook. The materials here served as the backbone of roughly a one-credit introduction to text analysis, but we aimed to make them as modular as possible so that they could be reworked for other contexts. By compartmentalizing text analysis concepts, tool discussions, and exercises that integrate both, we hopefully made it a little easier for an interested instructor to pull out pieces for their own needs. All our materials are on GitHub, so use them to your heart’s content. If you are a really ambitious instructor, you can take a look at our section on Adapting this Book for information on how to clone and spin up your own copy of the text materials. While the current platform complicates this process, as I’ll mention in a moment, I’m working to mitigate those issues. Most importantly to me, the book focuses on concepts and tools without actually introducing a programming language or (hopefully) getting too technical. While there were costs to these decisions, they were meant to make any part of the book accessible to complete newcomers, even if they haven’t read the preceding chapters. The book is really written with a student audience in mind, and we have the cute animal photos to prove it. Check out the Preface and Introduction to the book for more information about the thinking that went into it.

The work is, by necessity, schematic and incomplete. Rather than suggesting that this be the definitive book on the subject (how could anything ever be?), we want to suggest that we always benefit from iteration. More teaching materials always help. Any resource can be a good one; even bad examples can be productive failures. So we encourage you to build upon these materials in your courses, workshops, or otherwise. We also welcome feedback on these resources. If you see something that you want to discuss, question, or contest, please drop us a line on our GitHub issues page. This work has already benefited from the kind feedback of others, whether explicit or implicit, and we are happy to receive any suggestions that can improve the materials for others.

One last thing: this project was an experiment in open and collaborative publishing. In the process of writing the book, it became clear that the platform we used for producing it, GitBook, was becoming a problem. The platform was fantastic for spinning up a quick collaboration, and it really paid dividends in its ease of use for writers new to Markdown and version control. But the service was new and under heavy development. Ultimately, the code was out of our control, and I wanted something more stable and more fully in my hands for long-term sustainability. I am in the process of transferring the materials to a Jekyll installation that will run off GitHub Pages. Rather than wait for this final, archival version of the site to be complete, it seemed better to release the current working version out into the world. I will update all the links here once I migrate things over. If the current hosting site is down, you can download a PDF copy of the most recent version of the book here.

Update: I got around to doing that! You can find the new, improved, and more stable version of the site here.