Famine data

One of the secondary questions of my research has been which themes dominated famine reporting in different locales.  What, for instance, was the most common framework for famine reporting in New York in 1847, and how did that differ from the frameworks employed in Britain, the American South or Indian Territory?  I’ve tried a few really clunky ways of representing this, by tracking the number of iterations of certain themes by place and time.  (I should say that these are themes I’ve assigned myself – they differ somewhat from place to place, with major overlaps – and include references to the availability of potatoes (coded as “potato”), appeals for aid (coded as “appeals”) and discussions of American obligation (coded as “American sympathy”).  As a result, these themes are somewhat subjective – the next step in this visualization is to mine the text of all of the reports I’ve collected, but that’s for another day.)
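For the curious, the tallying described above can be sketched in a few lines of Python.  This is only an illustration: the records are invented placeholders rather than my actual data, and the names coded_reports, tally_themes and dominant_theme are hypothetical.

```python
from collections import Counter

# Invented placeholder records, NOT real data: (place, year, theme code).
coded_reports = [
    ("New York", 1847, "appeals"),
    ("New York", 1847, "appeals"),
    ("New York", 1847, "potato"),
    ("Britain", 1847, "potato"),
    ("Britain", 1847, "potato"),
    ("Britain", 1847, "American sympathy"),
]

def tally_themes(records):
    """Count how often each theme code appears per (place, year)."""
    counts = Counter()
    for place, year, theme in records:
        counts[(place, year, theme)] += 1
    return counts

def dominant_theme(records, place, year):
    """Return the most frequent theme for one place and year, or None."""
    themes = Counter(t for p, y, t in records if p == place and y == year)
    return themes.most_common(1)[0][0] if themes else None

print(tally_themes(coded_reports)[("Britain", 1847, "potato")])
print(dominant_theme(coded_reports, "New York", 1847))
```

Because the theme codes are hand-assigned, a tally like this inherits all of the subjectivity mentioned above.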

Anyway, as part of this IVMOOC I’m taking while biding time before my defense/trying to grapple with data in a more systematic way, I learned about “burst analysis.”  This is a way of tracking increased incidences of certain words in articles/titles/subject headings/whatever over time.  Jon Kleinberg, who developed this kind of analysis, describes it as a way of tracking “the appearance of a topic in a document stream [a]s signaled by a ‘burst of activity,’ with certain features rising sharply in frequency as the topic emerges.”  In other words, a topic “bursts” when it is discussed with greater and greater frequency (as determined by a set of key words), and the burst ends when that frequency dips.  There’s a lot of math involved in figuring out the “burstiness” of any given theme, but the fabulous Sci2 tool thankfully does all that for me.
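To give a feel for the math, here is a minimal Python sketch of a two-state burst detector in the spirit of Kleinberg’s model.  It is a simplification and not a description of what Sci2 does internally: it assumes reports are grouped into time periods, that each period has a count of reports mentioning a theme and a total count, and that a “burst” state has a rate s times the baseline.  The function name detect_bursts and the parameters s and gamma are my own labels for illustration.

```python
import math

def detect_bursts(relevant, totals, s=2.0, gamma=1.0):
    """Label each period 0 (baseline) or 1 (bursting).

    relevant[t] = reports mentioning the theme in period t
    totals[t]   = all reports in period t
    s           = assumed ratio of burst rate to baseline rate
    gamma       = assumed penalty weight for entering a burst
    """
    n = len(relevant)
    p0 = sum(relevant) / sum(totals)          # baseline rate
    if p0 <= 0 or p0 >= 1:
        return [0] * n                        # nothing to distinguish
    rates = (p0, min(0.9999, s * p0))         # (baseline, burst)

    def fit_cost(state, r, d):
        # Negative log-likelihood of r hits in d reports; the binomial
        # coefficient is the same for both states, so it is omitted.
        p = rates[state]
        return -(r * math.log(p) + (d - r) * math.log(1 - p))

    up_cost = gamma * math.log(n)             # cost to enter a burst
    cost = [0.0, math.inf]                    # start in the baseline state
    back = []
    for r, d in zip(relevant, totals):
        new_cost, pointers = [], []
        for state in (0, 1):
            # Moving up (0 -> 1) is penalized; everything else is free.
            options = [cost[0] + (up_cost if state == 1 else 0.0), cost[1]]
            best_prev = 0 if options[0] <= options[1] else 1
            new_cost.append(options[best_prev] + fit_cost(state, r, d))
            pointers.append(best_prev)
        cost = new_cost
        back.append(pointers)

    # Trace the cheapest path backwards (Viterbi).
    state = 0 if cost[0] <= cost[1] else 1
    states = []
    for pointers in reversed(back):
        states.append(state)
        state = pointers[state]
    states.reverse()
    return states

# Made-up counts: a theme that spikes in the middle three periods.
print(detect_bursts([1, 1, 1, 8, 9, 8, 1, 1], [10] * 8))
```

Raising gamma makes the detector more reluctant to declare a burst, which is one way a short flurry of mentions can fail to register while a sustained one does.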

So, here’s my first attempt to map “bursts” in famine reporting themes:

I think there are a few interesting things about this visualization, which I’ve intuited but never really seen so clearly.  The first is that the major themes I’ve highlighted in my dissertation “burst” at very different times.  I suspect that this has to do with the speed at which news traveled in the mid-nineteenth century, but the fact that the newspapers of the urban South contained an uptick in discussions about immigration in 1849 is interesting as well.  I also love the little blip of interest in nationalism in New York in the middle of 1847 – there’s a much more extended discussion of the problems facing the Irish nation in 1848, but perhaps later references to nationalism didn’t occur rapidly enough to constitute a “burst.”

“Are you a math person? You look like a math person.”

Having submitted my dissertation for review, I find myself with some time on my hands.  While many people have suggested that this would be an opportune moment to relax, my father, who is also an academic, suggested that it merely freed up time to begin new projects! Write articles! Learn new skills!  Having taken one morning off this week to drink cocoa and read a novel, I think I’m all done relaxing and ready to get started.

A few years ago, after a thrilling session on network analysis at the AHA, I decided that I was going to teach myself network analysis.  That, much like undergraduate attempts in stat classes on linear regression analysis populated by econ majors, didn’t go quite as planned, and I mostly gave up and began to rely on IBM’s online ManyEyes software, which produces nice, if slightly clunky, visual representations of data.  But just yesterday, I received notice of Indiana University’s free MOOC on information visualization (referred to as IVMOOC, which is really quite fun to say), which is offered just when I need something to occupy my time/keep me from compulsively re-editing a document I’ve already turned in.  The preliminary survey for the course suggests that it’s mostly geared towards people who already have data-driven backgrounds, so for the next eight weeks, I expect to feel much like I did when confronted with Chi-squared problems in my senior year of college – completely over my head, but having loads of fun.

At the same time, I also hope to get acquainted with the open source Quantum GIS software, which seems like it would be a pretty nifty way to deal with the map-making problems I’ve been confronting recently.

Also revising one article.  Also writing another article about the movement of information in the mid-nineteenth century, which hopefully utilizes some of what I’ve picked up from IVMOOC and Quantum GIS.

At any rate, the enthusiasm made possible by my new-found time must have been obvious to the woman sitting next to me during my novel-reading/cocoa-drinking morning off.  As she got up from her seat next to me at the cafe, she turned and said “Are you a math person?  You look like a math person.”  We’ll see.

The more [history] you learn, the more [history] you see

 

Credit: Bill Amend at http://www.foxtrot.com/

I’ve been throwing out variations on this line since I first saw this strip, and I’ve been having quite a few “the more history you learn…” moments in the past few weeks because of the hurricane.

On Saturday, the Press of Atlantic City reported that NOAA classified Sandy as a post-tropical cyclone right before it made landfall in NJ, a decision which is estimated to save homeowners/cost insurance companies millions of dollars in deductibles.  NOAA isn’t a political body, but the classification is a fortuitous one for those facing insurance claims for their destroyed property, and it was echoed by NJ Governor Chris Christie when he issued an executive order prohibiting insurance companies from charging hurricane deductibles.  (For a really fascinating discussion of the relationship between disasters and flood insurance, see parts II and III of Ted Steinberg’s Acts of God.)  Though most of the article was about the impact of this call on insurance claims, the article briefly digresses into talking about what it means for a scientific body to be in charge – however indirectly – of a huge financial decision:

“If this was a court case, you’d have multiple meteorologists on the stand,” said Campbell H. Wallace, an attorney for the Professional Insurance Agents of New Jersey.

There is no court case. Insurance companies in New Jersey, New York and Connecticut have agreed to waive costly hurricane deductibles, which could have run in the millions of dollars along the three-state area.

Wallace said the insurance industry accepts the fact that the National Weather Service is “legally tasked” with making such determinations. He said meteorologists are judged by their peers and credibility is paramount to them.

The Wallace quote reminds me of another apparently ancillary fact about Atlantic hurricanes, this one from the Galveston Hurricane of 1900, which killed upwards of eight thousand people.  Although meteorologists both in the U.S. and in Cuba registered concerns about a storm headed for the Gulf of Mexico, the National Weather Bureau’s policy was to limit the use of the word “hurricane” in official correspondence, because it might engender widespread panic.  On top of all of the other reasons for the high Galvestonian death toll (the misguided belief that hurricanes never struck that part of the Gulf, little way for ships to communicate observations from the middle of a storm, buildings that were particularly susceptible to storm damage), some of the blame must go, and has gone, to whoever made the decision that “hurricane” was just too dangerous a word for the American people.

In some ways, what is happening with insurance companies today is the flipside of what happened with the NWB and Galveston – in defining what counts as a hurricane, and what is “merely” a post-tropical cyclone (the two can be differentiated by as little as a 1 mph difference in maximum wind speeds measured on the ground), NOAA is saving – intentionally or not – thousands of people millions of dollars in total.