Category Archives: Links

Some Text Mining Resources

Today in class I briefly mentioned TF-IDF (Term Frequency-Inverse Document Frequency) as a possible way for us to identify "give away" words that appear more frequently in a particular document than in the rest of the collection. Here are some introductory explanations of the method:

And here’s a cool visualization experiment using TF-IDF made by Tim Sherratt, who also made the Real Face of White Australia and Headline Roulette sites shown in class today.
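
If you'd like to see the arithmetic behind TF-IDF for yourself, here is a minimal sketch in plain Python. It is only one common variant of the formula (raw term frequency multiplied by the natural log of inverse document frequency), and it assumes that splitting on whitespace is good enough tokenization. The function name `tf_idf` and the three sample sentences are made up for illustration; they are not from our ads.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: score} dictionary per document.

    Assumption: documents are plain strings, tokenized by lowercasing
    and splitting on whitespace (no punctuation handling or stemming).
    """
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: in how many documents does each term appear?
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        scores.append({
            # Term frequency times inverse document frequency.
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores


# Hypothetical sample documents, invented for this example.
sample = ["the cat sat", "the dog sat", "the bird flew"]
for doc, doc_scores in zip(sample, tf_idf(sample)):
    ranked = sorted(doc_scores, key=doc_scores.get, reverse=True)
    print(doc, "->", ranked)
```

A word that appears in every document (like "the" here) scores zero, while a word that appears in only one document scores highest in that document, which is what makes it a candidate "give away" word.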

I also mentioned Named Entity Recognition in class; this is the same library used by the Rezo Viz tool that Daniel and Alyssa showed us in their Voyant Tools presentation. It may be possible for us simply to use Voyant as an interface for NER and export a list of place and person names from our ads, but we need to look into this further.

Mapping Ancient Trade Routes

On Monday we talked a little bit about how a future historian might be able to reconstruct roads using geolocated tweets. One of you shared with me this project showing that a current historian is doing something similar using coins and Google Earth. Thanks for the cool digital history project tip!

JSON Examples and Links

If you’d like to look more closely at the JSON examples discussed in class, here are the exhibits from the handout. To test their validity, you can copy each one to your clipboard and paste it into the JSONLint site and click on "Validate." You may also want to take a look at the JSON specification page that I had up on the screen.
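
If you'd rather check an example on your own computer instead of pasting it into a website, Python's built-in `json` module can do a similar job. This is only a sketch: the helper name `check_json` and the two sample strings are made up for illustration, and Python's parser is slightly more lenient than the specification (it accepts `NaN`, for instance), so treat it as a quick check and not a replacement for a strict validator.

```python
import json


def check_json(text):
    """Return (True, parsed_value) if text parses as JSON,
    otherwise (False, a short description of the first problem)."""
    try:
        return True, json.loads(text)
    except json.JSONDecodeError as err:
        return False, f"line {err.lineno}, column {err.colno}: {err.msg}"


# Hypothetical examples, not the exhibits from the handout.
good = '{"name": "Ada", "tags": ["math", "poetry"]}'
bad = "{'name': 'Ada'}"  # single quotes are not allowed in JSON

print(check_json(good))
print(check_json(bad))
```

The error message points to the line and column where parsing failed, which is usually the quickest way to find a stray comma or a missing quotation mark.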

If you still feel a bit lost with these examples, don’t worry; we will spend more time clearing up confusion on Friday and throughout the next week. The point of these exercises is to show some of the challenges that come from representing information that is interesting to humanists in formats that computers can more easily digest. On Friday, we’ll also talk about the arguably more challenging task of deciding what information we want to represent!

These are the other links that were discussed today:

Finally, after today’s lightning-quick introduction, you may be interested in knowing why historian Ian Milligan thinks that JSON rocks.

Links from Monday

Before coming to class on Wednesday, please be sure to go through the readings for January 15. We will be talking about two big questions in regard to these readings:

  1. Why would an archive of tweets be useful to historians?
  2. How does Twitter work "under the hood"?

You may also be interested in looking more closely at some of the things I introduced in class yesterday, which can be found at the following links:

The Futures of Publishing?

One of the subjects that came up frequently in our roundtable and comments thread, as well as my interview with Jason Heppler, was the future of academic publishing.

This is something I’ve thought about a lot lately, partly because I was asked to do a presentation on online publishing for a series being run by the HRC.

It’s also a subject that has come up quite a bit in my Twitter stream. Here are some highlights for your perusal:

Big issues remain with regard to the evaluation and financial sustainability of these new ideas about digital publishing, but it does seem like some promising conversations are already beginning. Feel free to post your reactions to any of these links in the comments.

Grad Student Roundtable on Digital Humanities

Last night, Cameron Blevins, Jeri Wieringa, and Annie Swafford (left to right in the video above) joined us for a fantastic Google Hangout about their experiences as grad students in the digital humanities and digital history. Please post your reactions and follow-up comments here!

Some of the links mentioned:

Doing Digital History at Harvard →

Reading Digital Sources: A Case Study in Ship’s Logs → An entry point into a fascinating series on digital history by Ben Schmidt, with some good discussion of topic modeling and text analysis

Up Next: Chad Black

Our next speaker, Chad Black, will be here on November 1 to deliver a lecture entitled “Quito Jailed: Institutional Profiling in the 18th Century.” He will also be leading us through a workshop that evening on how to use simple Python scripting to do preliminary research with an archive finding aid. Cool!

In preparation for Professor Black’s visit, you may want to check out his book, The Limits of Gender Domination: Women, the Law, and Political Crisis in Quito, 1765-1830, or his website. If you’d like to learn a little bit more about Python, check out this introductory guide for historians.

AHA Task Force on Digital Humanities → As mentioned in our discussion last Thursday, here’s the petition that University of Nebraska graduate student Jason Heppler (@jaheppler for those of you on Twitter) and several others put together asking the American …