Brandon Walsh

We Data: An Icebreaking Activity

Posted in: digital humanities, pedagogy
Crossposted to the Scholars' Lab blog.

This semester, I’m teaching an undergraduate course at UVA called “Data for the Rest of Us.” The course is, broadly, an attempt to introduce principles and practices of critical data literacy to students who might otherwise find full-blown data science intimidating. I’ve published the whole course online, and I’ll likely be posting some reflections off and on as the course proceeds. The centerpiece of the course is the group construction of a dataset and associated narrative description based on the students’ research interests, an outcome that I’m largely basing on Responsible Datasets in Context (RDiC). Much more to say about the design of the course as a whole, but I wanted to document a little about the first day today. If RDiC is where the inspiration for the course came from, I’ve had the first day’s activity in my head just as long. I planned to introduce myself and the course in a few minutes before quickly shifting gears into something I called “We Data.” Here’s the prompt:


We Data: An Icebreaking Activity

I want to get to know you. In the spirit of the course, your job is to make a dataset about yourselves to share back with me. You have 45 minutes in which to do so. You have to hit the following buckets.

  • Identify your data
    • Who are you?
  • Describe your data
    • What are you interested in describing as a part of your dataset? What are you not?
  • Collect
    • Gather your data in one place.
  • Clean
    • Edit the dataset so that it’s in a presentable, consistent format.
  • Analyze
    • What stories are in your dataset?
  • Distribute
    • How will you share it back to me?

Some advice

  • Not everyone has to work on the same things! Divide up.
  • Use what you have.
  • Don’t spend so much time talking. Do.

The goal was to introduce the students to the pipeline we will follow in the course as we develop skills in data management and construction: Identification, Description, Collection, Cleaning, Analysis, Distribution. These are the units the course follows, so the activity mirrors the work they will do on their final project while introducing some of the same questions and topics. I also hoped it would serve a lot of the same function as a more traditional set of icebreakers, with the added hope that the activity would feel a little more connected to the course than such things might otherwise.
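For readers who would like to see that pipeline in code, here is a minimal Python sketch. It is not what the students did in class (they worked in Google Sheets), and everything in it is an assumption made for illustration: the rows are invented placeholders, and the column and function names are hypothetical.

```python
import csv
import io
from collections import Counter

# Identify / Describe: we decide the dataset is about a small group,
# and that we will record a name, a major, and a sibling count.
FIELDS = ["name", "major", "siblings"]

# Collect: hypothetical rows, invented purely for illustration.
raw_rows = [
    {"name": " Student A ", "major": "statistics", "siblings": "2"},
    {"name": "Student B", "major": "Statistics ", "siblings": "0"},
    {"name": "student c", "major": "ENGLISH", "siblings": "1"},
]


def clean(row):
    """Clean: trim whitespace, normalize capitalization, parse numbers."""
    return {
        "name": row["name"].strip().title(),
        "major": row["major"].strip().title(),
        "siblings": int(row["siblings"]),
    }


def analyze(rows):
    """Analyze: count majors and average the sibling counts."""
    majors = Counter(r["major"] for r in rows)
    mean_siblings = sum(r["siblings"] for r in rows) / len(rows)
    return majors, mean_siblings


def distribute(rows):
    """Distribute: serialize the cleaned rows as CSV text for sharing."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


cleaned = [clean(r) for r in raw_rows]
majors, mean_siblings = analyze(cleaned)
csv_text = distribute(cleaned)

print(majors.most_common())
print(mean_siblings)
print(csv_text)
```

Each stage is deliberately tiny. The point of the sketch is only that the six steps map onto distinct decisions, and that the cleaning step is where inconsistent entries (stray spaces, mixed capitalization) get reconciled before any story can be told.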

I thought it went pretty well, all things considered! A few thoughts and reflections follow while they are fresh in my mind:

  • I mostly spent the 45 minutes of the activity floating among the groups, using the opportunity to practice names, get to know each group, and engage with students about their responses. I also took care to watch the clock and encourage people to move on to different parts of the activity’s pipeline at certain stages.
  • The students took some prodding to talk as a group rather than spend the time quietly entering things in sheets on their own. That might also be because they spent the first few minutes setting up shared Google Sheets to collaborate on; once everyone had everyone’s email addresses, they were off.
  • Given the way the course was pitched, I expected some folks to have questions about what a dataset was. I was a bit surprised to see how quickly people jumped to Google Sheets. It turned out that the vast majority of my students come from STEM or the social sciences, and this disciplinary background might have something to do with how easily they dove right into working on the project.
  • As I had hoped, the students started with objective facts (siblings, name, major, etc.). As time wore on, they found their way to much more subjective and interesting questions (favorite book, favorite tv show, favorite snack). Some even wound up in philosophical territory: “what would be your answer to the trolley problem?”
  • The students noted that much of their data wound up being a mix of textual and numerical information, though one group shared dog photos in the spreadsheet. I might have had a hand in encouraging this delightful bit of chaos.
  • For the analytical section, the students mostly narrated their way through the stories that their group’s datasets told about themselves. But one group (admittedly the one that happened to have two stats majors) actually produced graphs! They shared some scatter plots and pie charts, and even showed a map on which they had plotted their hometowns.

The activity concluded with the students sharing back their group’s datasets and analysis with each other. I followed this with a general discussion of the activity using the following guiding questions:

  • What is data?
  • What is it not?
  • What was hard?
  • What did you learn?
  • How did it feel to represent yourselves in this way?

One thing that came up was the students’ willingness to share personal data and not worry too much about it. For one thing, nothing they shared was all that sensitive. Beyond that, several students connected this comfort to the general pervasiveness of technological surveillance in their daily lives. If just about anything can be found online, why sweat any small thing too much? Even knowing that this was a specific response to an isolated activity, I found the conversation an interesting window into how differently younger generations view privacy and personal data.

I’ll definitely be using a version of the same activity in future semesters. Onward!