Debriefing: Houston and Audenaert

This post gives you a chance to share your thoughts about Friday evening’s presentation by Natalie Houston and Neal Audenaert.

I noticed at least a couple of issues that have come up in earlier sessions of this class that might be worth elaborating on:

1. Why use digital tools for humanities work? Dr. Houston’s argument here was especially interesting, I thought. On the one hand, she echoed the arguments of Drs. Guldi and Black about the need to use computers just to manage the sheer abundance of digitized sources at our disposal. But she also suggested that digital methods might allow scholars in her field to bypass or reframe some of the major ways of thinking about the field, particularly by upsetting the notion of what constituted the literary “canon” in the nineteenth century.

2. Is digital reading a form of distant reading, or something else entirely? There was an interesting exchange about whether what The Visual Page project is doing counts as “close” reading (after all, it’s trying to look at things like the difference between a semicolon or a comma) or “distant” reading (because of the size of the corpus). Perhaps what we need is some other term, like “digital reading,” to describe what projects like this are doing.

3. Why collaborate, and how? We’ve talked before about the need for collaboration in many digital humanities projects, so it was interesting to be able to hear both from a computer scientist and a literary historian about their experiences working together. Dr. Houston noted that this kind of collaboration can feel unfamiliar at first given that so much of our training in the humanities focuses on individual work; has that been your experience, too?

4. Metadata! We’ve had several speakers now who have been involved in projects that analyze bibliographic metadata to ask new questions: Black uses compression clustering on archival finding aids, Guldi uses Paper Machines to get a sense of huge bureaucracies and what different organizations talk about, and Houston is building a database of published volumes of poetry in the nineteenth century. All of this suggests the importance, and richness, not just of data but of metadata.

Feel free to comment on any of these topics, or something else entirely, in the comments.

9 Responses to Debriefing: Houston and Audenaert

  1. It seems not so much that collaboration is new to historians — we are constantly working with others in the archival and editing phases of our work — but rather it is the kind of collaboration involved with digital humanities. We may know how to talk with other academics (or at the very least other historians), but engaging in a dialogue with someone whose brain works in profoundly different ways than your own seems a bit daunting. Historians need to develop some knowledge of the computer skills necessary for the projects they wish to take on, and (correspondingly) the computer scientists need to have some understanding of the “why” of our projects. The technologically adept folks need not completely grasp the historiographical significance of our scholarship, nor should we necessarily have to master the jargon and conceptual frameworks of what IT people are doing. (Although historians who can master these skills, like our own Caleb McDaniel, have an advantage when it comes to creating a useful dialogue between themselves and computer scientists.) This is what Drs. Houston and Audenaert exemplified: a respect for one another’s work that allowed them to craft a style of discourse that worked for them. After all, what is necessary in the end is finding some way to work together to produce the most effective and efficient digital humanities project possible.

  2. When I first sat down to write my debriefing post, I wanted to question the use of the term “digital reading.” I’d still like to do that, but I’m not sure what path I’m going to go down with it, and if you’ll forgive me, I might present a thoughtful stream-of-consciousness post. (My apologies in advance!) I’m interested in questioning whether or not we can call this a reading in the traditional sense of the word, considering that a computer program is scanning data based on preset codes for symbols, spacing, etc. But then again, once the data set is produced by the program, it is still up to an actual person to read it and analyze it accordingly. John and I briefly discussed this on the way to today’s Brown Bag, and he suggested that it seems like a quantitative rather than qualitative reading, which caused us to question again whether this can be called a “reading.” For me this raises the question: if it isn’t a reading, then what is it? My first inclination is to call it a digital scan, which I think is what’s going on technologically, but what is the scholar doing with the data? (Am I even right to call it data? This seems rather clinical/scientific and therefore foreign and scary to me.)

    This has brought me to a new thought. Is a program like this meant to do more than help scholars select a set of texts to examine? I mean, if you search for specific criteria within the set of poetry books that Dr. Houston has (was it 1,300 or 13,000? I can’t recall), is this different than me selecting texts to read based on a keyword search? Is the computer program providing us with new information, or sorting out/suggesting what books we might want to look at for specific information? I happen to think that it’s a bit of both, particularly when you’re talking about decorative flourishes on pages, but I wonder if the program selects books that the scholar needs to go back and inspect further, or if you can use the “raw data” without going back to the physical copy of the book. My intention is not to reduce projects like this to a search engine, and I hope that my questions don’t come across that way. I think this is a really interesting project that has the potential to do a lot of fascinating work on visual material and the way that producers and consumers both see pages, but I just have a couple of lingering questions about the application of the program.

  3. I think the most interesting thing that the lecture illuminated was how digital humanities tools can process data in ways that historians, limited by human biases, cannot. We can be self-reflective about our flaws and try to recognize them, but at the end of the day, all researchers have biases that are difficult to overcome when doing extensive qualitative analysis. I think the aid of digital tools is extremely helpful because it allows a different level of processing that the human brain cannot do by itself. With Dr. Houston’s work, digital tools allow data to be analyzed objectively by first setting the parameters and then letting the program run by itself. Seeing how computer scientists can aid the digital humanities opens new dimensions of study to be explored.

  4. I’d like to build off of Kelly’s point about what exactly we gain from a project like this. It seems to me that there is some overlap here between this project and the ideas/projects discussed by Chad Black. This type of project allows us to discover and refine new kinds of questions. Rather than finding connections and ideas in finding aids, this project will allow Houston to find connections, themes, etc. within the thousands of works outside what is considered the typical literary canon. Once she finds those, however, isn’t she then applying some kind of more “traditional” reading to a smaller number? That was the impression I got. I kept thinking throughout the presentation about Black saying that he is a traditional historian who uses digital methods to define, and refine, his questions and sources.

  5. Great comments all!

    It sounded like one of the things Houston wants to do with this tool is a different sort of “book history” that would be able to use data about the layouts of the page to draw conclusions about the markets and audiences for certain books, for example, so in that sense it’s possible to imagine uses of this tool other than as a way to select a group of texts for closer reading.

    That said, it seems to me like even as a selection tool the program could be useful. In a well-received talk on topic modeling, David Mimno has said that one way of thinking of the digital humanities is just as “computer-assisted” humanities. That’s not a lesser or less valuable version of digital humanities though. To use his comparison, meteorologists now use “computer-assisted forecasting.” That is, the computer doesn’t do all the work; the person still has to interpret the data. Even so, computer-assisted forecasting is a heck of a lot better than sticking a finger into the wind and sniffing for rain!

  6. I really enjoyed the presentation because we did get to hear from both sides of a “digital humanities” team. What I thought was really interesting was the premise behind what they were doing – I personally just go straight to the text on the page, so I do miss these contextual hints. Hearing about the learning curve that Neal first experienced when introduced to this new project was really fascinating, and I always enjoy seeing such good collaboration! It’s always interesting to see what those fresh to new material notice as interesting.

  7. To me, this presentation showed the “field of dreams” that programming can be; that is, dream it and they can build it. I really liked that Neal was open to suggestions and excited about the new possibilities.

    Whitney, I agree that we do not necessarily need to master the skills of collaborators, but I do believe we should work toward a common language, which oftentimes is filled with “jargon” when working with an interdisciplinary team. I see it as no different than working in an archive in a foreign language.

    Kelly, I think these types of tools enable us to engage with topics on scales much more flexible than in traditional historical scholarship. “Digital reading” could be equated to skimming a book. We can understand the book (or poem) based on our different levels of engagement. Examining poems based on a specific set of criteria is not necessarily to identify specific works, but instead to identify trends in the literature. Selecting texts to read based on keyword is much different from viewing trends in texts at a different scale.

    Sophie, I really like the idea of objectivity, but there are dangers in claiming objectivity through computer-based work. Computer analysis is inherently subjective because programmers have to decide how data are computed, processed, and visualized. Even before this, researchers choose what is included in the dataset and how it is included.

    John, I agree with you on the idea of using these tools to find themes, but I wonder if it is necessary to apply a more “traditional” reading to a defined group of sources to produce “useful” scholarship. Of course there is always a need for ground-truthing data, but is there potential for a model of scholarship on a smaller scale, or must we always include microhistory in our work? In my understanding, this type of scholarship is what Natalie’s tool has the potential to support/create.

  8. Christina Villarreal

    Dr. Houston’s insight on using digital humanities to escape our human biases is especially important when considering the possibilities of digital tools. Her example of exiting the “canon,” as the debriefing mentions, was something that I hadn’t thought of before. This way of using the tool, taking something out of its frame in order to reconsider it, can contribute to and possibly challenge many established histories. Digital reading, as something separate from any other reading, or the idea of different ways of reading text, makes the presentation and delivery of words more relevant. Words provided within books can be evidence of things that other forms of text will lack or display differently. Examining texts in this way can reveal much about the text’s significance. I even think that collaboration is necessary in this process of reading. Unbiased eyes can catch patterns that a well-trained eye can overlook.

  9. I have to agree with you, Christina, about the canon factor—certainly I think I tend to ignore the idea of canonicity in historiography, maybe because we rarely discuss it in undergrad classes, but we have our own problems in sorting past the canonical secondary sources to other analyses that might have been referenced more rarely. I know that in my thesis I’ve actually profited from looking up dissertations, but who knows how many existing theses and dissertations might be lurking out there on topics only tangentially related to what you’re working on? Forbidding to think about sometimes.

    I hear you, Wright, about the idea that microhistory doesn’t have to show up in everything, but one of the interesting themes that seems to be emerging out of this class is the idea that there’s a fairly sharp line between micro and macro. THAT’S something we have to give some thought to, disciplinarily, because specializing more in one scale or style of analysis than the other could really radically change what you were able to do with your career. It seems like it would clearly be more valuable to be able to switch between scales as you saw fit, even if maybe not always within the same project. I can imagine, though—but I could be off base with this—how marketing yourself as more of a specialist in larger-scale information rather than close reading could help you present yourself as having a more distinctive skillset. Not sure if I really have a solution, or if that even counts as a problem, but long story short I think everybody should be cultivating as much versatility as they have the time for.