Thanks to everyone for the good discussion last night at Chad Black’s lecture and workshop! This post contains some thoughts about both, with an invitation to you to share your own reactions.

The Lecture: I thought it was interesting that some of the same challenges that have come up earlier in this course, like how to categorize ambiguous historical realities, came up again in this talk. It was also interesting to hear Chad talk about how digital visualizations of his data were critical to exposing that data's limitations in a way that close readings of a few documents might not.

The Workshop: Much of our conversation centered around how to think of digital history or history “programming” as an approach to problem-solving, a way to address specific obstacles or speed bumps that might arise as part of your workflow as an historian. Chad also shared with us how the problem of “archival abundance” (i.e., returning home with thousands of digital photos) led him into scripting for the first time. Have you also encountered this problem of archival abundance in your work? Are there “problems” in your workflow that you are now thinking about tackling with computational methods? Did Chad’s experience give you a sense of how someone might move from “hacking” workflow problems to using the same tools for preliminary research or substantive questions about historical content?
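Chad's own scripts weren't part of the discussion here, but to make the workflow "hack" concrete: a common first response to archival abundance is a short Python script that sorts a camera dump into dated folders with descriptive filenames. The sketch below is purely illustrative (the function name, the labeling scheme, and the use of file modification time rather than EXIF capture dates are all my assumptions), not Chad's actual code:

```python
import shutil
from datetime import datetime
from pathlib import Path

def organize_photos(source_dir, dest_dir, label):
    """File each photo into dest_dir/YYYY-MM-DD/ and prefix the
    filename with a descriptive label, so the file remains
    identifiable even if it is later moved to another folder."""
    source, dest = Path(source_dir), Path(dest_dir)
    moved = []
    for photo in sorted(source.glob("*.jpg")):
        # Use the file's modification time as a stand-in for a
        # capture date (a real script might read EXIF data instead).
        day = datetime.fromtimestamp(photo.stat().st_mtime).strftime("%Y-%m-%d")
        target_dir = dest / day
        target_dir.mkdir(parents=True, exist_ok=True)
        target = target_dir / f"{label}_{photo.name}"
        shutil.move(str(photo), str(target))
        moved.append(target)
    return moved
```

Called as, say, `organize_photos("camera_dump", "archive", "quito_notarial")`, it leaves each photo findable by date and labeled well enough to survive being separated from its folder.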

Please post your comments about any of this (or anything else that the visit prompted for you). You may also want to read more about Chad’s clustering techniques on Digital Humanities Now.
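For readers who want a feel for what "clustering" means before diving into Chad's post: the core idea is grouping documents by vocabulary similarity. The toy sketch below uses a greedy single-pass grouping over word-count vectors; it is a deliberately simplified stand-in (the sample "documents," the similarity threshold, and the greedy strategy are my inventions, not Chad's method), and a real project would use TF-IDF weighting and a proper clustering library:

```python
from collections import Counter
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def cluster(docs, threshold=0.3):
    """Greedy single-pass clustering: add each document to the
    first cluster whose seed it resembles, else start a new one."""
    vectors = [Counter(d.lower().split()) for d in docs]
    clusters = []  # list of (seed_vector, [doc indices])
    for i, vec in enumerate(vectors):
        for seed, members in clusters:
            if cosine(vec, seed) >= threshold:
                members.append(i)
                break
        else:
            clusters.append((vec, [i]))
    return [members for _, members in clusters]
```

For example, `cluster(["criminal trial theft sentence", "trial theft verdict", "marriage dowry contract"])` groups the two trial records together and leaves the marriage contract on its own.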

5 Responses to Debriefing

  1. I liked how Chad offered perspectives from both the high-tech and low-tech sides of how digital history can be applied to research. Whenever research is completed in a short amount of time (like how he collected thousands of photos in 10 days), there are going to be huge organizational issues to deal with later on. I thought it was interesting that he showed our class not only how to use Python for very specialized functions, but also offered general tips accessible to all digital researchers (e.g., label your files so that, if they are moved to another folder, you still understand what they are). Since I have no prior knowledge of computer science specifics, even basic (but not immediately obvious) tips like the ones we talked about are going to be very useful moving forward.

  2. Since I wasn’t able to attend Chad Black’s talk, I’m trying to piece together what was discussed through this post as well as Twitter; it’s amazing how much information can be put out by a couple of people tweeting in 140 characters! What I’ve found particularly interesting is this tweet: “@parezcoydigo: Is deep archival knowledge/familiarity a predecessor or product of text mining for historians? #dhtopic #ricedh” I’d like to couple this with the idea of “archival abundance” and combating it through computational methods. If you’re creating keywords/categories for your evidence, how important is it to know what you’re looking at, as opposed to what you’re looking for? By simply plucking this tweet out of obscurity, I might have missed another aspect of the conversation, but I’m not convinced that it has to be either/or. I feel as though knowledge of the archive is necessary to make research effective and efficient, but that after you’ve examined your evidence, you’ll know it ten times better. And if you don’t combine what you think your terms/categories should be with what the archive provides you with, how effective is your method?

  3. While I also unfortunately missed the lecture and discussion afterwards, one point that Caleb mentioned in the debriefing caught my eye: the idea that a visualization can reveal limitations on the data. Yet might an image also reveal limitations on the methodology of gathering that data? Correspondingly, could a data visualization reveal additional helpful variations on methodology, variations that would expand, confirm, or deny the assertions gathered by other methods?

    Chad’s clustering techniques, presented on his DH Now blog, are intriguing (if overwhelming), yet, either fortunately or unfortunately, I do not have the “overabundance” of material that clustering seems to require. I wonder whether clustering might still be useful for someone without a surplus of data.

  4. I thought it was really interesting to listen to Chad’s journey: how his organizational methods had to change as the project progressed, and how he came to incorporate scripts into his work.

    I also definitely noticed the similarities to the last presentation, where labeling and categorization were both heavily emphasized, even though they weren’t things I had thought much about before taking this class. It’s fascinating to me to think about how those decisions can take a project in a totally different direction.

  5. Christian Hauser

    Chad was particularly fun to listen to during the workshop because he seemed so comfortable with the material. It was neat to see what initial observations he could make using purely computational methods on the data sets; in fact, that made the topic much more interesting to me than it otherwise might have been. The clustering work got me thinking about applications to other areas of interest as well.

    I was glad to see the ‘computers-are-a-tool, humans-are-the-brain’ refrain repeated as we saw how the historian had to work through the process iteratively to get useful results and interpret them. Despite our science-fiction-trained expectation that the machine takes over more and more of the work, the scholar is as involved as ever (he just might get a coffee break while the data is crunching).