Today in class I briefly mentioned TF-IDF (Term Frequency-Inverse Document Frequency) as a possible way for us to identify "give away" words that might appear more frequently in a particular document. Here are some introductory explanations of the method:
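To make the idea a bit more concrete, here is a minimal sketch of one common TF-IDF variant in plain Python. The three "ads" are made-up placeholder strings, not text from our newspaper, and the function name `tf_idf` and the simple whitespace tokenizer are assumptions for illustration. Real tools usually add smoothing and better tokenization, so treat this as a rough picture of the arithmetic rather than the definitive formula.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {word: score} dict per document (hypothetical helper)."""
    # Very naive tokenizer: lowercase and split on whitespace.
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: in how many documents does each word appear?
    doc_freq = Counter()
    for words in tokenized:
        doc_freq.update(set(words))

    scores = []
    for words in tokenized:
        counts = Counter(words)
        total = len(words)
        scores.append({
            # Term frequency (share of this document's words) times
            # inverse document frequency (log of total docs / docs with word).
            word: (count / total) * math.log(n_docs / doc_freq[word])
            for word, count in counts.items()
        })
    return scores


# Made-up example texts, only for illustration.
ads = [
    "ran away from the subscriber a man named jim",
    "ran away from the plantation a woman named mary",
    "stolen from the subscriber a bay horse",
]

scores = tf_idf(ads)
for ad_scores in scores:
    # Show the three highest-scoring words for each ad.
    top_words = sorted(ad_scores, key=ad_scores.get, reverse=True)[:3]
    print(top_words)
```

Words that appear in every ad (like "the") score zero, while words unique to one ad (like a name) score highest, which is the sense in which TF-IDF surfaces "give away" words.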
I also mentioned Named Entity Recognition in class; this is the same library used by the Rezo Viz tool that Daniel and Alyssa showed us in their Voyant Tools presentation. It may be possible for us simply to use Voyant as an interface for NER and export a list of place and person names from our ads, but we need to look into this further.
If you still feel a bit lost with these examples, don’t worry; we will spend more time clearing up confusion on Friday and throughout the next week. The point of these exercises is to show some of the challenges that come from representing information that is interesting to humanists in formats that computers can more easily digest. On Friday, we’ll also talk about the arguably more challenging task of deciding what information we want to represent!
These are the other links that were discussed today:
It’s also a subject that has come up quite a bit in my Twitter stream lately. Here are some highlights for your perusal:
The editorial board of an academic journal recently resigned over the restrictive licensing of its publisher, sparking discussion about what academic editors could or should do about open-access publishing.
Big issues remain with regard to the evaluation and financial sustainability of these new ideas about digital publishing, but it does seem like some promising conversations are already beginning. Feel free to post your reactions to any of these links in the comments.
Last night, Cameron Blevins, Jeri Wieringa, and Annie Swafford (left to right in the video above) joined us for a fantastic Google Hangout about their experiences as grad students in the digital humanities and digital history. Please post your reactions and follow-up comments here!
Our next speaker, Chad Black, will be here on November 1 to deliver a lecture entitled “Quito Jailed: Institutional Profiling in the 18th Century.” He will also lead a workshop that evening on using simple Python scripting to do preliminary research with an archive finding aid. Cool!
AHA Task Force on Digital Humanities → As mentioned in our discussion last Thursday, here’s the petition that University of Nebraska graduate student Jason Heppler (@jaheppler for those of you on Twitter) and several others put together asking the American …
This website is currently the home for HIST 318, an undergraduate course that will be using digital tools to locate, analyze, and visualize a collection of runaway slave advertisements from a Texas newspaper.