Adrea Lawrence

education historian

Walking Through Big Data: One Historian’s Path | History of Education Society, 6 November 2015, St. Louis, MO

Historians of education are known for their intensive, close examinations of documentary evidence.  Through careful “document analysis,” one identifies patterns, anomalies, curious players, key events, policy, and change over time, all in that which is deemed to be “educational.”  Of late, a trend toward publishing what might be called case studies or microhistories has emerged.  This includes my own work on American Indian education.  In the broader fields of history and literary criticism, tools for analyzing large bodies of textual data have become prominent in discussions about the digital humanities.  Our tiny subfield has just a handful of people who have been experimenting with such tools.  Today, I will walk you along my meandering path in an attempt to understand the terrain of nineteenth-century children’s magazines in terms of what they published and, by extension, what their publishers hoped children and families would learn.

I began along this path as I was finishing research for my book Lessons from an Indian Day School several years ago.  As I did a scattershot search in the serial set for anything related to Pueblo Indians, I came across several articles that recounted histories from a couple of Pueblo Indian communities.  Charles Lummis, the notable—and in the BIA’s eyes, notorious—adventurer, writer, librarian, and editor, published a number of these stories in St. Nicholas Magazine. This magazine, in fact, had one of the longest runs of any children’s periodical in the US, from 1873 to 1939, and it had a significant average monthly circulation of 70,000 copies.1  I tabled these stories for a while.

When I went back to them, I began to read each story deeply, and I was captivated by the ethnographic depth with which Lummis recounted the stories and the scenes in which they were told to him when he lived at Isleta Pueblo, not far from Albuquerque, from 1890 to 1892.
I paid close attention to the coyote stories, as Coyote among most, if not all, Pueblo Indian communities is a trickster, and thus a teacher.  I then studied Lummis’ papers, looking at many of his records from the years surrounding his time at Isleta and his subsequent work with Adolph Bandelier.  I learned that Lummis was devoted to the Isleta community and had worked against the BIA to aid Isletan parents in their right to determine the course of formal education for their children.2  I also learned that exegesis as a method had reached its limits for the larger questions I had about colonization as an educative phenomenon.

To examine how colonization might have been instructive (or not) to those who experienced it in the American West, I needed a broader approach.  I needed to look at a much wider and deeper body of evidence.  And, I needed to be prepared for what I might or might not find. I had found the Lummis articles in the HathiTrust repository. I had downloaded each page individually in order to look at what he had written, because an institutional subscription was necessary to download a year’s worth of issues at a time. I felt like I had walked myself right into a rut that was ten years deep. And then, the HathiTrust opened up the HathiTrust Research Center, which allows researchers to create worksets of materials in the repository and run a number of algorithms to analyze the textual data. My rut was flattened.

What shape would my worksets take? In playing around with the tools for researchers, I found that I could identify volumes and issues of periodicals in the HathiTrust and cross-reference those with the Digital Public Library of America and WorldCat. I found, for example, that the HathiTrust has nearly the full run of St. Nicholas Magazine but only a portion of the Youth’s Companion. As I began working my way down the list of prominent children’s magazines in the nineteenth century—those that ran the longest or had a significant circulation—I compiled worksets and ran what initially seemed to be elementary analyses of each: deployable word counts.

As Lummis’s articles piqued my interest in how non-Natives recounted significant tribal histories with a wealth of information about their environments and natural history, I hoped that there would be many such examples of this co-optation. And, I had hoped that I would be able to identify patterns in what I was seeing. So, I initially requested short lists, beginning with the top 2000 words for a workset (or magazine collection). I found that I had to edge up the word counts to 7000 to ensure that “Indian,” “Indians,” “savage,” “native,” and the like actually appeared in the word counts. Just with these word counts, I found that the Youth’s Companion had a much greater frequency of such words than St. Nicholas Magazine and Peter Parley’s and Merry’s Museum magazines. Why was this the case, and would topic models show something that supported or refuted this seemingly buried indicator of interactions with American Indian communities in what is now the continental US?
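For readers curious about the mechanics, whether a word shows up in a top-2000 or a top-7000 list is purely a matter of its rank by frequency. What follows is a minimal Python sketch of that idea, not the HathiTrust Research Center's own algorithm. The function names and the toy sentence are invented for illustration.

```python
import re
from collections import Counter

def word_counts(text):
    """Lowercase the text, split it into words, and count each one."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words)

def rank_of(counts, word):
    """Return the 1-based frequency rank of `word`, or None if it never occurs."""
    ranked = [w for w, _ in counts.most_common()]
    return ranked.index(word) + 1 if word in ranked else None

def in_top_n(counts, word, n):
    """True if `word` would survive a cutoff at the n most frequent words."""
    rank = rank_of(counts, word)
    return rank is not None and rank <= n

# Toy text, invented for illustration only.
sample = "the boy and the dog and the pony saw the river and an indian village"
counts = word_counts(sample)

# A word that occurs only once falls outside a short list
# but appears once the cutoff is raised far enough.
print(in_top_n(counts, "indian", 2))
print(in_top_n(counts, "indian", 10))
```

Raising the cutoff does not change how often a word occurs; it only lets less frequent words onto the list, which is why comparing frequencies across magazines is the more telling step.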

And what, by the way, is topic modeling? Megan Brett, who published an introduction to topic modeling in the Journal of Digital Humanities in 2012, writes, “Topic modeling is a form of text mining, a way of identifying patterns in a corpus. You take your corpus and run it through a tool which groups words across the corpus into ‘topics’.”3 In other words, a “topic” is a group of words that appears repeatedly over a constellation of documents.4 A software program called MALLET, which was developed by Andrew McCallum at the University of Massachusetts Amherst, is the tool used by most humanists engaging in topic modeling, as well as by the HathiTrust. Though one can download MALLET for free through UMass Amherst, it’s not intuitive to use. There is an interface called the Topic Modeling Tool that one can download from GitHub to make MALLET more usable. Or, you can try out a couple of other tools like the HathiTrust’s internal algorithm for topic modeling that will run your worksets as you see fit. Or, there is Paper Machines,5 a Zotero plug-in that spits out beautiful stream graphs. It was important that I find a tool with a relatively intuitive interface for a couple of reasons: 1) I don’t code or understand the inner workings of the backend software my machine runs on, and 2) I’m new to topic modeling, so I’m most interested in understanding how the process works and how it can help me most efficiently direct my energies in reading texts—or their reliefs—closely and contextually.
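For readers who want to peek under the hood, here is a toy Python sketch of how words can be sorted into topics. It is not MALLET's code or the HathiTrust's algorithm. It is a bare-bones version of one common approach (latent Dirichlet allocation fit by collapsed Gibbs sampling), and the function names, settings, and four tiny "documents" are all assumptions made for illustration.

```python
import random
from collections import Counter

def toy_lda(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA over a list of tokenized documents.

    Returns one Counter of word counts per topic.
    """
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    # Start by assigning every word token to a random topic.
    assignments = [[rng.randrange(num_topics) for _ in doc] for doc in docs]
    doc_topic = [Counter(a) for a in assignments]
    topic_word = [Counter() for _ in range(num_topics)]
    topic_total = [0] * num_topics
    for doc, assigned in zip(docs, assignments):
        for word, topic in zip(doc, assigned):
            topic_word[topic][word] += 1
            topic_total[topic] += 1

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, word in enumerate(doc):
                old = assignments[d][i]
                # Take this token out of the counts before resampling it.
                doc_topic[d][old] -= 1
                topic_word[old][word] -= 1
                topic_total[old] -= 1
                # Weight each topic by how much this document and this word favor it.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][word] + beta)
                    / (topic_total[k] + beta * vocab_size)
                    for k in range(num_topics)
                ]
                new = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = new
                doc_topic[d][new] += 1
                topic_word[new][word] += 1
                topic_total[new] += 1
    return topic_word

def top_words(topic_word, n=3):
    """List up to n of the most frequent words in each topic."""
    return [[w for w, c in t.most_common(n) if c > 0] for t in topic_word]

# Four invented "documents": two about landscape, two about schooling.
docs = [
    "river canyon mesa river canyon".split(),
    "mesa river canyon mesa".split(),
    "school lesson teacher school lesson".split(),
    "teacher lesson school teacher".split(),
]
topics = top_words(toy_lda(docs, num_topics=2))
print(topics)
```

The researcher chooses the number of topics in advance, and the procedure involves chance, so two runs with different seeds need not produce identical groupings. The tool also never names a topic; it only returns lists of words that tend to travel together.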

Initially I began tinkering with Paper Machines. I have used Zotero for a number of years, and I had, or could get, loads of pdfs. As I was creating worksets in the HathiTrust that were specific to discrete children’s magazine titles, I also downloaded entire issues and volumes that the HathiTrust had, saving the files on my hard drive, in the cloud, and in Zotero. When it came time to actually run the analyses, I was puzzled. I modified the stoplists, or lists of words not to include in the analysis, to nix words like “the” and other frequently used words as well as personal names that appeared in the topic models that Paper Machines kept spitting out. What I got was no clearer. I was still seeing mostly the use of personal names across years. After reading the Journal of Digital Humanities special issue on topic modeling, it became clear that while Paper Machines was beautiful and promised to deliver an illustrative visualization of topics over time, what it actually produced was not necessarily reliable. Adam Crymble, in his review, notes that Paper Machines is more for getting one’s feet wet with data visualization than for creating robust analyses.6 Crymble attributes its lack of documentation and functionality to the fact that Paper Machines is an ad hoc tool developed by busy faculty members who aren’t necessarily computer scientists. And, in my brief foray with the tool, I found that there may be serious issues with the accuracy of pdf files that rely on hit-or-miss optical character recognition, or the ability of the computer to “read” the text verbatim as we would read the text. This frustration led me right back to the HathiTrust.
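A stoplist is simple enough to sketch in a few lines of Python. This is an illustration of the general idea, not how Paper Machines applies its stoplists. The function name, the stoplist entries, and the sample sentence are invented.

```python
import re

def apply_stoplist(text, stoplist):
    """Return the words of `text`, lowercased, minus anything on the stoplist."""
    words = re.findall(r"[a-z']+", text.lower())
    blocked = {w.lower() for w in stoplist}
    return [w for w in words if w not in blocked]

# Hypothetical stoplist: common function words plus personal names.
stoplist = ["the", "and", "of", "a", "to", "Tom", "Mary"]
sentence = "Tom and Mary walked to the edge of a canyon"
kept = apply_stoplist(sentence, stoplist)
print(kept)  # ['walked', 'edge', 'canyon']
```

A stoplist can only remove words it lists exactly. If optical character recognition has misread a name, the garbled spelling slips past the list and into the analysis.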

At this point, I revisited the handful of secondary sources (literally a handful exist) on children’s magazines in the US in the nineteenth century to re-identify those that ran the longest or had a notable circulation in relation to similar publications.7 I identified four children’s magazines on which to run my analyses: Youth’s Companion; Youth’s Friend; Peter Parley’s, which merged with Merry’s Museum; and St. Nicholas. As a group, these magazines ran from 1821 to 1943. I then created a workset inclusive of these magazines and began running analyses.

Because the corpus I was analyzing includes 721 volumes, each analysis took 45 minutes to an hour. I ran several analyses identifying 10, 10, and 50 topics. I wanted to see how consistent the analyses were (hence the 10 and 10), and how granular they could get (with 50 topics). What I found was some gibberish and some very curious groupings.
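One way to put a number on how consistent two runs are is to measure how many top words each pair of topics shares. The Python sketch below does this with Jaccard overlap. It is an assumed approach for illustration, not a feature of the HathiTrust tools, and the word lists are invented stand-ins for real output.

```python
def jaccard(a, b):
    """Share of words two topics have in common (0 = none, 1 = identical)."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

def best_matches(run_a, run_b):
    """For each topic in run_a, find the most similar topic in run_b."""
    matches = []
    for i, topic in enumerate(run_a):
        scores = [jaccard(topic, other) for other in run_b]
        j = max(range(len(run_b)), key=scores.__getitem__)
        matches.append((i, j, scores[j]))
    return matches

# Invented top-word lists standing in for two runs with the same settings.
run_1 = [["river", "canyon", "mesa"], ["school", "lesson", "teacher"]]
run_2 = [["teacher", "school", "book"], ["mesa", "river", "canyon"]]
matches = best_matches(run_1, run_2)
print(matches)  # [(0, 1, 1.0), (1, 0, 0.5)]
```

Topic numbers carry no meaning across runs, so topic 0 in one run may reappear as topic 1 in the next. Matching by shared words rather than by number avoids mistaking that reshuffling for inconsistency.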

Now, I’ve been writing about American Indian education history for several years. My current project focuses on colonization as an educative phenomenon over multiple generations. Given the detail and reception of Lummis’ articles in the 1890s about Isleta Pueblo and the rapid movement of Euroamericans and African Americans west after the Civil War, I expected to see evidence of some sort of prominent discussion of colonization in the children’s magazines I was analyzing. But I could identify only one topic that could be read with low inference as one that directly addressed colonization: topic 9. This was interesting. I went back to Andrew Goldstone and Ted Underwood’s piece on topic modeling in the Journal of Digital Humanities, and they had this nugget of a reminder: “By forcing us to attend to concrete linguistic practice, topic modeling gives us a chance to bracket our received assumptions about the connections between concepts.”8 What I was seeing was what I was seeing. Why was I seeing this? What does this suggest about the editing process and intent of children’s magazines?

To respond to these questions, I returned to my ethnographic training. I closely examined each of the 50 topics that the HathiTrust analysis produced. I then began looking across topics for supercodes—or supertopics—that could be identified with low inference. Then, I named them. The supertopic with the greatest number of topics I identified as Knowledge, Art, Learning. It included eight topics. The next two supertopics each had six topics: Nature, Landscapes, Animals and Patriotism, War, Legend. The subsequent supertopics each had four topics: Nuclear Family & Home, Manliness, Built Environment, and Time. Finally, the last collection of supertopics each had three topics: Advertisements + Membership, Technological Innovation + $$, and Transportation.
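The tally above can be laid out in a short Python sketch that uses only the counts given in the text. The variable names are invented for illustration. The text does not say where the topics outside these ten supertopics fell, so the sketch simply makes the arithmetic visible.

```python
# Number of topics assigned to each supertopic, as reported above.
supertopics = {
    "Knowledge, Art, Learning": 8,
    "Nature, Landscapes, Animals": 6,
    "Patriotism, War, Legend": 6,
    "Nuclear Family & Home": 4,
    "Manliness": 4,
    "Built Environment": 4,
    "Time": 4,
    "Advertisements + Membership": 3,
    "Technological Innovation + $$": 3,
    "Transportation": 3,
}

total_topics = 50  # topics produced by the 50-topic analysis
assigned = sum(supertopics.values())
unassigned = total_topics - assigned

print(assigned, unassigned)  # 45 5
```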

To check the accuracy of my coding, or naming, I returned to the initial two analyses I ran on the Children’s Magazines workset, comparing my supertopics with the 10 topics produced in each of the first two analyses. Generally, my supertopics were consistent with these analyses. Had I used just the initial two analyses, I wouldn’t have been able to make out the Built Environment or Time, nor would Knowledge, Art, Learning have been prominent. This suggests that the researcher can set the number of topics produced, relative to the number of documents analyzed, to gauge the low-inference nuance with which one can read a body of texts. It also suggests that the topics produced can serve as a type of validity check for the close reading of texts that historians are accustomed to doing.

What the topics and supertopics tell me, at this juncture, is that colonization was not framed as colonization. In children’s magazines, it manifested in specific forms of learning, the recounting of patriotic or legendary conflicts, the assumption of the nuclear family, the built environment, transportation, and what constituted manliness. This might sound flip, but, as I have found in studying federal policies toward American Indian communities in the nineteenth and early twentieth centuries, these very topics pervaded the Office of Indian Affairs schooling system and its curriculum. Outside the school, the narrative of manifest destiny has been an always already “fact” in US history. And this narrative was crafted by Euroamericans for Euroamerican audiences. Colonization, in other words, would appear to be pervasively ambient. How could this be, given the often violent encounters that readers of these magazines must have experienced or heard about? And, what about discussions in adult literary magazines? Did they also background direct discussion of Euroamerican interactions with American Indian communities? What did this look like?

At this point, I’m finding that I need to do two things. First, I need to run similar analyses of prominent adult literary magazines to see what topics emerge. This might tell me whether or not there were parallels between the intended audiences. One of the underlying questions I now have, thanks to Don Warren and AJ Angulo’s work on agnotology in education history, is whether or not discussion of colonization was actively masked for children who might well have experienced confrontation first hand. Second, I’m beginning to think about colonization and its associated policies differently. After conversations with colleagues in psychology and counseling, it seems worth examining trauma as a multigenerational phenomenon that has major implications for both policy formation and learning at the genetic level.9 This has been echoed experientially by my father-in-law, who was on the front lines in the Vietnam War. In a recent conversation I had with him, he remarked, “After ground combat, what is there to be afraid of?” This comment stopped me cold. Surely, he was not the first person to have this realization. What, then, are the educational implications of this? And, how might this have manifested in popular literary magazines and policy in the nineteenth and early twentieth centuries?

1. R. Gordon Kelly, Children’s Periodicals of the United States, Historical Guides to the World’s Periodicals and Newspapers (Westport, CT: Greenwood Press, 1984), 378.

2. See also John Gram, Education at the Edge of Empire: Negotiating Pueblo Identity in New Mexico’s Indian Boarding Schools (Seattle, WA: University of Washington Press, 2015).

3. Megan R. Brett, “Topic Modeling: A Basic Introduction,” Journal of Digital Humanities 2, no. 1 (2012).

4. See also Miriam Posner, “Very Basic Strategies for Interpreting Results from the Topic Modeling Tool,” Miriam Posner’s Blog, October 29, 2012; Ted Underwood, “Topic Modeling Made Just Simple Enough,” The Stone and the Shell, accessed July 14, 2015; Scott Weingart, “Topic Modeling for Humanists: A Guided Tour,” The Scottbot Irregular, accessed October 21, 2015.

5. Chris Johnson-Roberson and Jo Guldi, Paper Machines: Visualize Your Zotero Collections, 2012.

6. Adam Crymble, “Review of Paper Machines, Produced by Chris Johnson-Roberson and Jo Guldi,” Journal of Digital Humanities 2, no. 1 (April 4, 2013): 77–80.

7. Mabel F. Altstetter, “Early American Magazines for Children,” Peabody Journal of Education 19, no. 3 (1941): 131–36; M. O. Grenby, “The Origins of Children’s Literature,” in The Cambridge Companion to Children’s Literature, ed. M. O. Grenby and Andrea Immel (Cambridge: Cambridge University Press), 3–18, accessed March 16, 2012; Peter Hunt, Children’s Literature, Blackwell Guides to Literature (Oxford, UK; Malden, MA: Blackwell Publishers, 2001).

8. R. Gordon Kelly, Mother Was a Lady: Self and Society in Selected American Children’s Periodicals, 1865-1890, Contributions in American Studies, No. 12 (Westport, CT: Greenwood Press, 1974); R. Gordon Kelly, Children’s Periodicals of the United States, Historical Guides to the World’s Periodicals and Newspapers (Westport, CT: Greenwood Press, 1984); Betty Longenecker Lyon, “A History of Children’s Secular Magazines Published in the United States from 1789 to 1899” (Ph.D., The Johns Hopkins University, 1942).

9. See, for example, Rachel Yehuda et al., “Holocaust Exposure Induced Intergenerational Effects on FKBP5 Methylation,” Biological Psychiatry, 2015.

Modifications, Legal Analysis, and the Influence of Music While Writing

As I am well into the revising and rewriting of my book manuscript, I’m finding that I’m having an increasingly difficult time compartmentalizing “reading” and “writing.” The last chapter I revised dealt largely with legal issues around the theme of “citizen” that were contradictory, dense, and not a lot of fun to sift through again, largely because (I think) I’m not trained as a legal scholar.  Comment bubbles didn’t make a whit of difference for me, and I found myself toggling between pdfs of court cases, legislation, and the working draft of my chapter, rewriting sentence by painful sentence. At one point, I had to physically draw out the relationships between all of the cases, legislation, and executive orders I read and included in the chapter, and to draw a timeline of cases to match up against and check the correspondence I’m using from Office of Indian Affairs officials and Pueblo Indians.  The images that surfaced—particularly the first—were a convoluted mess.  But as I returned to Vine Deloria’s legal scholarship and a number of law review articles by other folks, I was reminded that this is the nature of federal Indian law.  That made me feel better for about 10 seconds until it dawned on me that Indigenous folks have to live with the ramifications of this convolution every day.

As this chapter dealt a lot with the relationship between land and taxes, I found myself in an altogether different headspace than I was in writing about disease or just land or learning writ broadly.  My usual writing tunes, especially those written and performed by Miles Davis, John Coltrane, and Thelonious Monk, weren’t resonating.  For a month all I listened to was Thievery Corporation and Radiohead.  I recognize the symmetry of writing about taxes and listening to Thievery Corporation; but I cannot reconcile the calm of listening to Radiohead and the intellectual pain of writing this chapter.  I know I will have to revise it in a month or so in light of the other chapters, but man am I glad to have this particular chapter done.

On the bright side, I have a sketch of a timeline that accounts for policies and important events cumulatively.  I can’t wait for the license code for Adobe Illustrator to arrive, so I can begin refining it on the computer.

Revising and Rewriting

This summer my time is wholly devoted to revising my book manuscript.  The easy part has been to set a firm schedule based on the date I have to deliver the manuscript to the publisher and the beginning of fall classes.  The more difficult part has been the actual revision.  Normally, I do all of my revisions in longhand with my favorite pencil.  But this option has proved untenable because of the volume of source materials I’m working with and because my hand just gets too tired too quickly.  Also, I’ve quit EndNote and have migrated all of my sources to Zotero, meaning all of my footnotes would have to be redone.  So, I’m doing something I never thought I would do; I’m retyping—and in the process significantly rewriting—each chapter.  I told my writing group about this after they read my most recently revised chapter, and each person expressed surprise or horror.  They thought this was a lot of additional work.  I think it’s actually a rather streamlined process.  Here’s how I’ve been going about it:

  1. As I incorporate new evidence and analysis, I do two things:  add comment bubbles with the new evidence and new prose to the working draft, and if the comment bubbles become unwieldy because there are so many for a particular theme, I migrate them to an outline in a separate file.
  2. When I have all of my new evidence and analysis together, I print out the working draft with comment bubbles and the outline.  These are the only documents I use when I’m rewriting.
  3. Then, I write.  I’ve found that beginning each writing episode by retyping gets me into a conversation with what I’ve already written, and I’m able to begin rewriting within the space/time of a paragraph.  I’ve also found it much easier to reorganize sections in big chunks.  If I think I’m going to get lost or digress from my train of thought, I’ll quickly draw my argument on a piece of paper next to my computer.  In the last 6 years or so, I’ve begun to construct arguments in a meandering, rather than a linear, manner, toggling between the beginning, middle, and end of an argument instead of plowing on through.
  4. Finally, I share my work with my writing group and ask them what they are seeing as the major arguments, what they think about the organization and prose, what problems remain unsolved, what they think I should add or omit, etc.  This is probably the most exciting step, the one where I learn to check my work, so to speak.

I count myself lucky in that I have a stellar writing group of colleagues from different fields and that the reviewers for my book manuscript generously provided detailed feedback in their reports to the publisher.  With the myopia and isolation that come with working independently on a big project for years, this type of outside attention is critical…at least for me.

The Blog Redux

I started (and quickly dropped) a blog a couple of years ago when I realized that it would not “count” as part of my tenure file.  Peer review à la journals and books is the focus of all tenure talk around research at my institution.  And since I am not yet tenured, I dropped the blog.  Now I’ve set up this site.  I read a lot of scholarly blogs and blogs about interesting things.  The folks at the Center for History and New Media across the river and the people I follow on Twitter have convinced me—through their examples—that I need to have a publicly accessible digital voice, especially since I’m so interested in digital humanities and pedagogically experimenting with new technologies.  So, here it goes.  Once again.

© 2016 Adrea Lawrence
