For those history of science types out there, I just finished working on a project with David Hubbard, Anouk Lang, Kathleen Reed and Lyndsay Troyer for the (now completed) IVMOOC, looking at the History of Science Society’s journal, Isis. We ended up with a visualization that tracked changes in authors’ locations from 1913-1937 to 1988-2012, and also mapped the dominant themes in Isis article titles from 1913 to the present. There’s probably still a lot to do with the history of the journal, but I think we made a pretty good start.
One of the secondary questions of my research has been which themes dominated famine reporting in different locales. What, for instance, was the most common framework for famine reporting in New York in 1847, and how did that differ from the frameworks employed in Britain, the American South or Indian Territory? I’ve tried a few really clunky ways of representing this, by tracking the number of iterations of certain themes by place and time. (I should say that these are themes I’ve assigned myself. They differ somewhat from place to place, with major overlaps, and include references to the availability of potatoes (coded as “potato”), appeals for aid (coded as “appeals”) and discussions of American obligation (coded as “American sympathy”). As a result, these themes are somewhat subjective. The next step in this visualization is to mine the text of all of the reports I’ve collected, but that’s for another day.)
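To make the clunky counting approach concrete, here is a minimal Python sketch of tallying hand-coded themes by place and month. The records are hypothetical placeholders, not my actual data, and the function names are just ones I made up for illustration.

```python
from collections import Counter

# Hypothetical hand-coded reports as (place, year-month, theme) tuples.
# These rows are invented for illustration only.
reports = [
    ("New York", "1847-02", "appeals"),
    ("New York", "1847-02", "appeals"),
    ("New York", "1847-02", "potato"),
    ("Britain", "1847-02", "potato"),
    ("Britain", "1847-02", "potato"),
    ("Britain", "1847-02", "American sympathy"),
]


def tally_themes(records):
    """Count how many reports carry each theme in each place and month."""
    counts = Counter()
    for place, month, theme in records:
        counts[(place, month, theme)] += 1
    return counts


def dominant_theme(counts, place, month):
    """Return the most frequent theme for one place and month, or None."""
    themes = {
        theme: n
        for (p, m, theme), n in counts.items()
        if p == place and m == month
    }
    return max(themes, key=themes.get) if themes else None


counts = tally_themes(reports)
print(dominant_theme(counts, "New York", "1847-02"))
print(dominant_theme(counts, "Britain", "1847-02"))
```

A tally like this shows which theme leads in a given month, but it says nothing about whether a theme is rising unusually fast.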
Anyway, as part of this IVMOOC I’m taking while biding time before my defense/trying to grapple with data in a more systematic way, I learned about “burst analysis.” Basically, this is a way of tracking spikes in how often certain words appear in articles/titles/subject headings/whatever over time. Jon Kleinberg, who developed this kind of analysis, describes it as a way of tracking “the appearance of a topic in a document stream [a]s signaled by a ‘burst of activity,’ with certain features rising sharply in frequency as the topic emerges.” In other words, a topic “bursts” when it is discussed with greater and greater frequency (as determined by a set of key words) and the burst ends when that frequency dips. There’s a lot of math involved in figuring out the “burstiness” of any given theme, but the fabulous Sci2 tool thankfully does all that for me.
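For the curious, here is a rough Python sketch of the idea behind the math. It is a simplified two-state version of Kleinberg's batched burst model, not Sci2's own code, and it assumes I have, for each time period, the number of reports mentioning a theme and the total number of reports. The parameter names `s` (how much more frequent a theme must be to count as bursting) and `gamma` (how costly it is to enter a burst) follow Kleinberg's paper; the function names and the sample counts are hypothetical.

```python
import math


def burst_states(relevant, totals, s=2.0, gamma=1.0):
    """Label each period 0 (baseline) or 1 (burst) with a two-state model."""
    n = len(relevant)
    p0 = sum(relevant) / sum(totals)      # baseline rate of the theme
    if not 0 < p0 < 1:
        raise ValueError("theme must appear in some, but not all, reports")
    p1 = min(s * p0, 0.9999)              # elevated rate in the burst state
    rates = (p0, p1)
    up_cost = gamma * math.log(n)         # penalty for entering a burst

    def fit_cost(state, r, d):
        # Negative log of the binomial probability of r hits in d reports.
        p = rates[state]
        return -(math.log(math.comb(d, r))
                 + r * math.log(p) + (d - r) * math.log(1 - p))

    cost = [0.0, math.inf]                # the sequence starts at baseline
    back = []                             # best previous state per period
    for r, d in zip(relevant, totals):
        new_cost, pointers = [], []
        for state in (0, 1):
            # Moving up to the burst state is penalized; moving down is free.
            options = [cost[prev] + (up_cost if state > prev else 0.0)
                       for prev in (0, 1)]
            best_prev = 0 if options[0] <= options[1] else 1
            new_cost.append(options[best_prev] + fit_cost(state, r, d))
            pointers.append(best_prev)
        cost = new_cost
        back.append(pointers)

    # Walk backwards from the cheapest final state to recover the sequence.
    state = 0 if cost[0] <= cost[1] else 1
    states = []
    for pointers in reversed(back):
        states.append(state)
        state = pointers[state]
    states.reverse()
    return states


def burst_intervals(states):
    """Turn a 0/1 state sequence into (start, end) index pairs, inclusive."""
    intervals, start = [], None
    for i, state in enumerate(states):
        if state == 1 and start is None:
            start = i
        elif state == 0 and start is not None:
            intervals.append((start, i - 1))
            start = None
    if start is not None:
        intervals.append((start, len(states) - 1))
    return intervals


# Hypothetical counts: reports mentioning a theme out of 10 reports per period.
relevant = [1, 1, 1, 9, 9, 9, 1, 1, 1]
totals = [10] * 9
states = burst_states(relevant, totals)
print(states)
print(burst_intervals(states))
```

With these made-up counts the model flags the three middle periods as a single burst. Raising `gamma` makes the model more reluctant to enter a burst, so a short blip of interest is less likely to register.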
So, here’s my first attempt to map “bursts” in famine reporting themes:
I think there are a few interesting things about this visualization, which I’ve intuited but never really seen so clearly. The first is that the major themes I’ve highlighted in my dissertation “burst” at very different times. I suspect that this has to do with the speed at which news traveled in the mid-nineteenth century, but the fact that the newspapers of the urban South contained an uptick in discussions about immigration in 1849 is interesting as well. I also love the little blip of interest in nationalism in New York in the middle of 1847 – there’s a much more extended discussion of the problems facing the Irish nation in 1848, but perhaps later references to nationalism didn’t occur rapidly enough to constitute a “burst.”