Encountering Walden

Paul Schacht, State University of New York at Geneseo
Elizabeth Witherell, University of California, Santa Barbara
Rebecca Nesvet, University of Wisconsin, Green Bay
Elisa Beshero-Bondar, Penn State Erie, The Behrend College
Fiona Coll, University of Toronto
Nikolaus Wasmoen, University at Buffalo

Scholarly Editing, Volume 39

In 2019 the authors obtained a grant from the State University of New York to develop open educational resources that would teach the fundamentals of digital scholarly editing by drawing examples from the manuscript of Henry David Thoreau’s Walden, first published in 1854 by Ticknor and Fields. A portion of the grant funding enabled the Huntington Library to digitize the Walden manuscript (HM 924) following standards of the International Image Interoperability Framework and to make the images publicly accessible from their website. These high-resolution scans make possible new pedagogical engagements with Walden; in particular, they offer students of scholarly editing a close-up encounter with the material traces of drafting and revision.

The grant’s principal investigator, Paul Schacht, is director of the Digital Thoreau initiative, which in 2014 published a “fluid-text” edition of Walden broadly based on editorial principles elaborated in John Bryant’s The Fluid Text: A Theory of Revision and Editing for Book and Screen. The fluid-text Walden enables readers to follow the evolution of Thoreau’s book over the course of its seven extant manuscript versions, first identified by J. Lyndon Shanley, who worked with the manuscript in the 1940s and 1950s. Using the open-source Versioning Machine, the edition provides a web interface for visualizing the manuscript versions in parallel columns. The Versioning Machine generates this interface by transforming source files encoded in TEI (Text Encoding Initiative) into HTML. Digital Thoreau’s source files draw on Ronald E. Clapper’s dissertation, “The Development of Walden: A Genetic Text,” and apply the TEI’s critical apparatus tag set to Clapper’s transcription of textual witnesses.
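
For readers who have not seen the TEI's critical apparatus tag set, a minimal sketch may help. It is an illustration only, not Digital Thoreau's source encoding and not the Versioning Machine's own transformation code. The sample applies `<app>` and `<rdg>` elements to a sentence that this article quotes from versions E and F; the sigla `#E` and `#F`, the placement of the apparatus entries, and the Python helper `witness_text` are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"

# Hypothetical encoding: two witnesses (E and F) recorded inline with <app>/<rdg>.
SAMPLE = """
<p xmlns="http://www.tei-c.org/ns/1.0">She
  <app><rdg wit="#E">lead</rdg><rdg wit="#F">led</rdg></app>
  a hard
  <app><rdg wit="#E">life &amp;</rdg><rdg wit="#F">life, and</rdg></app>
  somewhat inhumane</p>
"""


def witness_text(xml_text, siglum):
    """Return the running text of one witness, e.g. siglum="E"."""
    root = ET.fromstring(xml_text.strip())
    parts = []

    def walk(node):
        if node.text:
            parts.append(node.text)
        for child in node:
            if child.tag == f"{{{TEI_NS}}}app":
                # Inside an apparatus entry, keep only the requested witness's reading.
                for rdg in child.findall(f"{{{TEI_NS}}}rdg"):
                    if f"#{siglum}" in rdg.get("wit", "").split():
                        walk(rdg)
            else:
                walk(child)
            if child.tail:
                parts.append(child.tail)

    walk(root)
    # Collapse the whitespace introduced by indentation in the sample.
    return " ".join("".join(parts).split())


if __name__ == "__main__":
    for siglum in ("E", "F"):
        print(siglum, "|", witness_text(SAMPLE, siglum))
```

Calling `witness_text(SAMPLE, "E")` and `witness_text(SAMPLE, "F")` recovers one reading per witness from a single source file; a parallel-column interface is, in essence, those per-witness readings displayed side by side.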

Walden is a striking example of the fluidity that Bryant describes as inherent to all textuality but manifested to different degrees in published works. For Thoreau, Walden was very much a text in flux from the fall of 1846, when he began drafting a lecture describing his life at the pond, until the summer of 1854, when he wrote a note to the printer at Ticknor and Fields about where the map of the pond should be placed in the published book. Since there are not seven complete manuscript versions, and since numerous leaves were repurposed from one version to another, we cannot know the precise content and extent of the work at each of its stages. However, to track Thoreau’s revisions in any significant passage that appears in several drafts is to gain a microcosmic stop-motion view of the authorial process that produced the whole book. Thoreau’s manuscript thus provides rich territory for exploring the objectives and challenges of scholarly editing in general and TEI encoding in particular.

Our project has focused especially on the value of the manuscript for helping undergraduates see a classic text through new eyes. In our experience, undergraduates too often approach culturally revered authors as geniuses whose thoughts arrived fully formed in moments of inspiration and were translated instantly from mind to page. Scrutinizing Thoreau’s numerous revisions to a passage on a single leaf or across multiple versions of the manuscript produces a completely different view of authorship. Modeling those revisions carefully in an encoding language such as the TEI leads to a kind of critical interpretation that constructs meaning partly through an analysis of authorial paths not taken; at the same time, it offers budding editors and digital humanists an introduction to the fundamental principles of editorial and digital practice.

In what follows, we describe three encounters with the Walden manuscript that demonstrate the different pedagogical possibilities that this resource offers in different contexts, each with its own audience and time-horizon: workshop (brief), course (extended), and scholarly editing community (indefinite).

Walden in a Workshop

Our grant funding included support for a symposium at Paul Schacht’s home campus in western New York, SUNY Geneseo, on editing and encoding in the undergraduate classroom. Originally scheduled for two days in the spring of 2020, the symposium was intended to attract colleagues in the region who had already brought scholarly editing and encoding into their classrooms or were considering doing so. The literature on this pedagogical practice emphasizes a number of themes: among them, the value of students’ engagement with books and manuscripts as material objects, the distinctive type of attentiveness cultivated when students attempt to model a textual object’s physical and semantic features in an encoding language, and the opportunity for students to understand and participate in the community that maintains one such language, the TEI.1 Our symposium would facilitate exploration of these and other benefits pertaining to undergraduate editing and encoding work while providing an opportunity for participants to share course materials, assignments, and projects. Finally, by introducing our work on the Walden manuscript, the symposium would highlight a pedagogical benefit of fluid-text editing and encoding in particular: the insight it affords students into literary composition as a process.

When the COVID-19 pandemic forced us to abandon our plans for a live event, we collaborated with members of the New York Digital Humanities group to put on a series of three virtual meetings in October 2020. At our first meeting, David Birnbaum, professor and co-chair of Slavic Languages and Literatures at the University of Pittsburgh, delivered a keynote presentation on “Theorizing and Implementing Digital Editions.” He posed several important questions. Just what do we mean by a “digital edition”? Why create one? In creating one, how might we decide which textual data to capture and publish, and which to set aside (if only temporarily)? What factors should we consider in deciding how to present the data in a digital interface? What is the importance of encoding standards and the communities that sustain them (the TEI, for example) in enabling us to create an edition that will find an audience and be useful to it?

Our second meeting was a workshop in which participants engaged directly with the Walden manuscript and explored how it might serve as a “laboratory” for students to evaluate some of the questions and implementation decisions raised in Birnbaum’s presentation. The collection of over 600 full and partial leaves in Thoreau’s hand, purchased by Henry Huntington in 1918 and now identified by the Huntington Library as call number HM 924, comprises the bulk of the surviving manuscripts the author created while writing Walden. It has a complicated history, some understanding of which is necessary for Thoreau’s revisions to be intelligible. We offered our participants a quick synopsis of that history before dividing them into groups to work directly with a few leaves of the manuscript. We trust that our readers will appreciate a similar orientation.

In the Huntington Library, as on the Huntington website, HM 924 is divided into the eight groups originally identified by J. Lyndon Shanley. Seven of these—“Draft A” through “Draft G” (“Huntington Volumes 1–7”)—are dated. The eighth, “Additional Material, separate from drafts” (“Huntington Volume 8”), is undated.

Before the collection arrived at the Huntington, the leaves had been arranged by Thoreau admirer and early editor Franklin Benjamin Sanborn using a published version of Walden as a template. Shanley was interested in how Thoreau created Walden, and he realized that Sanborn’s arrangement could be reordered to reveal the stages of the book as Thoreau developed it. Shanley first grouped leaves according to their physical features (kinds of paper, ink, and handwriting) and their contents (sequential page numbers and the continuation of sentences from one leaf to another). He then organized the groups and established date ranges for them by correlating dated passages in Thoreau’s Journal with the contents of groups, and by following some revision sequences from one group to another.2

It is important to know that only the first group of leaves, Draft A, constitutes a nearly complete version. At each subsequent stage of work, rather than recopy what he had already written, Thoreau revised the text, added new material, and rearranged leaves. Consequently, all the groups following Draft A are incomplete, and the contents are often internally discontinuous.

As presented in our overview, the physical life of HM 924 before and after publication testified to Thoreau’s process of composition as one combining painstaking, incremental change with wholesale reimagining, a process that offers a powerful counterexample to the notion of the author as inspired genius. But beyond the manuscript’s usefulness for deflating this notion, we wanted our participants to see its power for helping students construct critical interpretations of a fluid text by developing what John Bryant calls revision narratives. These are interpretive accounts of particular revision sites that hypothesize possible sequences of change, the forces driving them (including but certainly not limited to authorial intention), and their relation to other revision sites as well as larger textual wholes. In Bryant’s formulation, “editors must be willing to be narrators of revision; that is, they must convert the bewildering array of data in their encoded textual apparatuses into pleasurable revision narratives.” Bryant continues:

Fluid-text editing is critical editing. . . . Fluid texts must be edited critically because the means by which we transcribe manuscripts, distinguish authorial and editorial variants, infer versions, and hypothesize revision sequences are all acts of judgment. But more than this . . . a fluid-text edition is not so much an imagined thing as it is an interpretation, a map for reading shifting intentions as revealed through variant sequentialized versions.3

Our workshop exercise directed participants’ attention to a sample passage comprising paragraphs 3 and 4 in Walden’s fourteenth chapter, “Former Inhabitants; and Winter Visitors.” In this chapter, Thoreau memorializes some past inhabitants of Walden Woods, including the formerly enslaved Cato Ingraham and Brister Freeman and a formerly enslaved woman he calls “Zilpha,” whose name in reality, according to Elise Lemire, was Zilpah.4 This sample passage lent itself well to the practical aims of the session for several reasons. First, it is relatively short: in each version, the two paragraphs are spread over three manuscript pages. The paragraphs were first written in the fifth (E) draft of the manuscript, then rewritten only once, in the sixth (F) draft, making comparison across versions easier than it is for passages in Walden with more complex revision histories. Even so, these two manuscript paragraphs include enough cancellations, corrections, transpositions, and interlineations (in pencil and ink) to provide a glimpse into Thoreau’s complex revision process. And, small as they are, the changes raise compelling critical questions.

To establish a trajectory for their thinking about Thoreau’s revisions, we had participants begin by examining the published version of the passage, an exercise that primed them to inspect each change with an eye toward hypothesizing Thoreau’s path between first inditing and final destination. To mitigate the challenge of deciphering Thoreau’s handwriting, which often seems to reflect the speed of his thought, we attached a few lifelines to the manuscript images from the Huntington: Elizabeth Witherell’s diplomatic transcriptions of the pages in question and links to the more semantic representation of Thoreau’s changes in Walden: A Fluid-Text Edition.

We sent the participants into videoconferencing breakout rooms with a mission to identify one or two changes they found especially intriguing and to discuss them, guided by the following questions developed by Fiona Coll:

  • What seems to be the sequence of changes and what are some possible reasons for them?
  • In what ways do these changes introduce subtle changes in textual meaning within or between revisions?
  • Is there anything else about these images that captures your interest?

Participants gravitated toward Thoreau’s revision of a sentence describing Zilpah. In version E, the sentence “She lead a hard life & somewhat witch-like” was transformed via penciled cancellation into “She lead a hard life & somewhat inhumane,” as seen below. (The writing begins at the bottom of one page and continues onto the next.)

Figure 1: Bottom of HM 924, E (Vol. 5), p. 165.

Figure 2: Top of HM 924, E (Vol. 5), p. 166.

In version F, Thoreau revised again: “She led a hard life, and somewhat inhumane.” The substitution of “inhumane” for “witch-like” occasioned extended if inconclusive discussion about possible reasons for the change as well as speculation about the effect on the passage’s tone and meaning. The correction of “lead” to “led,” it was observed, served as a useful reminder that great authors can be as orthographically challenged as the rest of us.

Figure 3: Excerpt from HM 924, F (Vol. 6), p. 133.
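
The version E revision discussed above can be modeled in TEI with `<subst>`, `<del>`, and `<add>`. The sketch below is one hypothetical encoding, not the project's actual markup: the `rend="pencil"` value, the decision to group the change in `<subst>`, and the Python helper `reading` are assumptions, and the page break between pp. 165 and 166 is omitted for brevity.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"

# Hypothetical encoding of the change within version E.
SAMPLE = """
<seg xmlns="http://www.tei-c.org/ns/1.0">She lead a hard life &amp; somewhat
  <subst><del rend="pencil">witch-like</del><add>inhumane</add></subst></seg>
"""


def reading(xml_text, state):
    """Return the text as it stood "before" or "after" the revision."""
    if state not in ("before", "after"):
        raise ValueError("state must be 'before' or 'after'")
    # Before revision: ignore additions. After revision: ignore deletions.
    skip = "add" if state == "before" else "del"
    root = ET.fromstring(xml_text.strip())
    parts = []

    def walk(node):
        if node.text:
            parts.append(node.text)
        for child in node:
            if child.tag != f"{{{TEI_NS}}}{skip}":
                walk(child)
            if child.tail:
                parts.append(child.tail)

    walk(root)
    return " ".join("".join(parts).split())


if __name__ == "__main__":
    print(reading(SAMPLE, "before"))
    print(reading(SAMPLE, "after"))
```

Because the encoding keeps both states, an interface or a student can recover either one and then ask, as the workshop participants did, what might have prompted the change.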

Participants had much to say in a shared Google Doc about the physical appearance of Thoreau’s marks on the page, noting differences between what one called the “light hashmark” that crosses through “witch-like” in version E and a “much more decisive” cancellation-line on the following page.

Figure 4: Detail from Figure 2.

Figure 5: Excerpt from HM 924, E (Vol. 5), p. 167.

One participant wondered whether the lightness of the first cancellation-line signaled hesitation on Thoreau’s part. Another wanted to know why Thoreau chose to make certain revisions in ink and others in pencil. Might the choice of pencil in some cases indicate a desire to preserve stages of the revision process for his own reference? Such questions sparked a more general curiosity about the manuscript’s material features as evidence of Thoreau’s writing process, and they raised new questions that could only be answered, if at all, by looking beyond the pages in front of them. For example, would a comprehensive examination of the manuscript turn up recurring patterns in the way Thoreau used pencil and pen to add and cancel text? For the purpose of our workshop or the kind of classroom discussion it was designed to model, it was this curiosity, rather than the aptness of the questions or the prospect of determining an answer, that mattered.

For those of our participants who were practicing or aspiring teachers of undergraduates, this short encounter with the Walden manuscript, framed by the idea of textual fluidity, exemplified how examining the material traces of a writer’s revision process might broaden their students’ conception of “literary analysis.” And it paved the way for considering how to take the next logical step with their students: introducing them to the TEI as a community standard for encoding descriptive and interpretive assertions about a text.

Walden in a Class

Rebecca Nesvet closed out our second meeting by describing her two experiments teaching the TEI to undergraduate students using the Walden manuscript as laboratory. In both experiments, her ambitious agenda included not only introducing students to scholarly editing and encoding as activities shaped by a community of practice but also leading them to a deeper understanding of Thoreau’s life and thought in relation to his own time and to ours.

In his essay “Walden,” Richard J. Schneider notes that readers tend to approach the book as an uncritical account of “the hermit” Thoreau “sitting meditatively by Walden Pond”5 in 1845–47, thinking precisely the thoughts and words that he would publish in 1854. This myth, Schneider explains, draws its cultural power from its similarity to the idea of the “return to Eden”; the myth’s celebration of “stasis”—the unchanging thoughts of the unchangeable writer at the eternal pond—is “very appealing . . . to a postindustrial society faced with overwhelming change,” such as our own.6 As we have shown, this myth could not be further from the truth of Walden’s evolution over the course of the surviving and conjectured manuscripts.

In two runs of an undergraduate capstone course at the University of Wisconsin, Green Bay, in the Fall 2019 and Fall 2020 semesters, Rebecca Nesvet debunked this myth by asking her students to use the TEI to document revision in Thoreau’s creative process and in their own creative lives. Notably, the 2020 run took place at the height of the COVID-19 pandemic, necessitating a last-minute shift to virtual (online synchronous) instruction with students who were, for the most part, socially isolated, whether in the university’s residence halls, in private student housing, or with their families across the state.

First Run: 2019

In the Fall 2019 semester, students read and discussed several different texts from the Thoreau canon, including Walden, the essays “Walking” and “A Plea for Captain John Brown,” and brief extracts from Bradley P. Dean’s edition of the unpublished, unfinished work Thoreau called “Wild Fruits.”7 To get a visceral sense of Thoreau’s experiment at the pond, students visited the university arboretum and fanned out to engage in what we would now call socially distant observation, self-reflection, and journaling.

In the classroom, they were introduced to the Text Encoding Initiative as both an open, customizable encoding standard and as the initiative referenced in its name—the ongoing labor of an active global community of practitioners. In teams of two, they used TEI Roma JS to design schemas that could guide their encoding of pages from HM 924. Knowing that their work might contribute to Digital Thoreau, which was similarly engaged in developing a document model for the manuscript, made the work seem immediately relevant and not merely academic. It is neither necessary nor always possible for student encoders to contribute to an ongoing scholarly project, but the opportunity, when available, has the potential to alter radically students’ relationship to their coursework, such that it is no longer merely an assignment undertaken to satisfy a professor but also, and more important, a public contribution to the sum of knowledge.

Navigating the new Roma JS tool’s choices was a challenge for some, as was transcribing in conformity to the schemas they designed. As novice encoders, they struggled to find appropriate tags and committed a fair amount of tag abuse (that is, applying tags for markup purposes other than those indicated in the TEI). They were often unaware of elements or attributes waiting to serve their needs. But these struggles and discoveries made the editing work dynamic, and they heightened students’ awareness of their editorial choices, Thoreau’s authorial decisions, and the critical and creative dimensions of documentary editing.
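
One thing a customized schema does is narrow the vocabulary available to encoders. The sketch below imitates only that one effect, and only at the level of element names; it is not how Roma JS or a generated schema works, and it cannot detect tag abuse in the strict sense, since a permitted element can still be used with the wrong meaning. The `ALLOWED` set, the sample transcription (which marks a cancellation with `<hi>` rather than `<del>`), and the helper `unexpected_elements` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical project vocabulary; a real ODD customization would also
# constrain content models, attributes, and attribute values.
ALLOWED = {"p", "del", "add", "subst", "pb"}

# Hypothetical student transcription using <hi> where <del> was likely intended.
STUDENT = """
<p xmlns="http://www.tei-c.org/ns/1.0">She lead a hard life &amp; somewhat
  <hi rend="strikethrough">witch-like</hi> <add>inhumane</add></p>
"""


def unexpected_elements(xml_text, allowed=ALLOWED):
    """List element names (without namespace) that fall outside the allowed set."""
    root = ET.fromstring(xml_text.strip())
    found = []
    for el in root.iter():
        local = el.tag.split("}", 1)[-1]  # strip the "{namespace}" prefix
        if local not in allowed and local not in found:
            found.append(local)
    return found


if __name__ == "__main__":
    print(unexpected_elements(STUDENT))
```

A report of this kind is best treated as a prompt for discussion: the flagged element invites the question of which permitted element better captures what the encoder saw on the page.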

After confidently presenting their work on a panel at the University of Wisconsin, Green Bay, College of Arts, Humanities, and Social Sciences (CAHSS) undergraduate conference in December 2019, students wrote final reflections on their experience as TEI encoders and documentary editors. “Sauntering” in the arboretum, wrote one student, became a working model for transcription and encoding of manuscript features as well as for designing a markup schema to support those features: “I liked transcription . . . because it’s mostly about detail, and during our walk we needed to use detail in describing things we observe.”8 The TEI exercised her “ability to notice something and focus.” What she noticed was that Thoreau not only cut and revised his work through deletion but also used his own deletion “markup,” circling a word, phrase, or passage and then crossing it out with a single stroke. This student’s editing suggested to her that Thoreau’s deletion process momentarily foregrounds the cut material, as if to honor it before removing it from the evolving text. Attempting to understand the material evidence of Thoreau’s authorial intentions appeared to reframe the student’s understanding of the writing process in much the same way that her visit to the university arboretum reframed her understanding of Thoreau’s engagement with nature.

Second Run: 2020

In the Fall 2020 semester, a new iteration of the course contained significant changes, some of them occasioned by the shift to pandemic pedagogy. With many students living in exile from campus, the arboretum journaling was replaced by journaling anywhere outdoors that a student considered a natural space. The urban naturalist writing of Dara McAnulty, a sixteen-year-old from Northern Ireland, in both his blog and his book, Diary of a Young Naturalist, informed this pedagogical choice.9 Since Digital Thoreau’s own schema for encoding HM 924 was nearly complete, this second run of the course placed less emphasis on TEI customization and more on implementing the TEI Guidelines for Manuscript Description.

Students transcribed passages from HM 924 that had made their way in some fashion into “Former Inhabitants; and Winter Visitors,” in which, as described above, Thoreau confronts the fragmented, mediated legacies of various people—most of them African American, some of them enslaved, all of them workers, none of them belonging to the Concord elite—who once lived at the pond, cultivated food, and exercised creativity, effectively performing aspects of Thoreau’s experiment long before he would, though more from necessity than choice. In these passages, Thoreau introduces two former inhabitants: Zilpah—who sings about bones, whose cottage was burned, whose pets were killed—and Brister Freeman, who planted trees that outlived him. Their encounter with these portions of the Walden manuscript helped students appreciate Thoreau’s developing awareness of his presence at the pond as both historical and geographical. At the same time, it gave them an opportunity to think about the life stories of Walden’s former inhabitants.

Once again, the experience of transcribing Thoreau’s manuscripts and encoding the revisions and other manuscript features in accordance with the TEI Guidelines proved transformative for most. “Doing this transcription gave me a really interesting look at Thoreau’s writing process and made me reconsider the passage I transcribed in a few ways,” one of this cohort’s student editors reflects. She adds:

All of his later edits, added in pencil, made [it] into the published Walden. The longest of these edits added more detail about Brister Freeman’s burial site. This addition is really interesting to me, because it says a lot about Brister Freeman’s status and treatment while he was alive through his placement in death—buried to the side of the cemetery, near enemy soldiers. . . . It’s also interesting to see the additions and deletions from a writer’s perspective. It’s encouraging to see physical evidence that lauded writers whose work has survived generations don’t just sit down and pour out flawless prose right from the start.

This student’s reflection offers evidence that there is more than one way to articulate a critical perspective on a text, and that documentary editing in the TEI offers pathways to critical understanding not available through a traditional essay “about” the text.

Walden Unbound

In discussing the Text Encoding Initiative with our workshop participants in October 2020, Elisa Beshero-Bondar emphasized that the TEI is about community-building, and she remarked on the size and scope of the communities of practice who have engaged with it over three decades. As articulated in its founding Poughkeepsie Principles (1987), the TEI was from its first incarnation intended as a set of guidelines rather than an absolute standard, with an emphasis on decision-making for shared human-readable and machine-readable documents that makes reading and transmission possible, unimpeded by competing computer operating systems and technological change. In this portion of the workshop especially, we wanted to emphasize the TEI’s community as its foundation.

The TEI Guidelines are a community-maintained resource for text-scholarly practice. To publish code is to share with the TEI community at large and the smaller communities of practice formed through editing projects. Working together on a TEI-encoded project involves customizing the TEI to meet specific needs or requirements, and these customizations benefit from background knowledge of editorial theory and practice, attention to the relationship between text and context, experimentation, and refinement of project goals. Customization is one of the most vexing, challenging, and intellectually stimulating aspects of any project. The team that edits a document, whether in the context of a small class or a long-range project, works best when all understand the semantics of the elements and attributes selected from the TEI Guidelines. Using the TEI’s ODD language (“One Document Does it all”), team members articulate their project’s own specific interpretive methods in a meaningful, organized exchange with the larger community of practice behind the published TEI Guidelines. When seen as a form of exchange, between individual editors or projects and the broader TEI community, customization can provide an opportunity to make the work of textual encoding more meaningful and reflective to new scholarly editors, despite the extra effort and skills required.

Students attempting to customize the TEI in the short timeframe of an academic term, particularly if they are new to scholarly editing, are unlikely to do so in a way that meets the strict standards of a rigorous scholarly edition. TEI customization involves an encounter with something immensely complex and very difficult to organize—the TEI itself—that is initially disorienting in much the same way as an encounter with Thoreau’s sprawling manuscript. Yet we believe that this encounter with complexity is valuable for students even, perhaps especially, when there is insufficient time to complete the task decisively. Among other things, the encounter is a kind of introduction to the broad community of scholarly editors and its constitutive methods, questions, and debates with the potential to transform what can easily seem like a dry, technical exercise into an experience of open-ended inquiry.

As a follow-on activity to the ODD customization that Rebecca Nesvet’s students undertook in Fall 2019, we believe it would be valuable to have students examine, discuss, and apply an ODD designed by a scholarly project team for the purpose of encoding the same material. (Our own project ODD, still in development, may be found on GitHub.) The resulting discussion, when properly framed, ought to feel much like those we have with students about a scholarly article they have been assigned to read.

As Rebecca Nesvet’s classroom experiments show, students who engage in scholarly decision-making become disciplined observers and practice a new kind of critical reading, one that foregrounds agency and creativity. As they debate and discuss customization decisions in the shared language of professional scholarly editors, they become a small, project-focused community that understands itself in relation to the larger community of scholarly encoders. The experience can make the most adept and eager students want to work with manuscripts and archives on long-term projects or even pursue further study in order to join the community of professional editors themselves. Meanwhile, it enables all students to experience college education as participants in a collective and creative endeavor, accomplished through mutual deliberation and support, and not merely as individuals seeking self-development or career preparation.


In the “Spring” chapter of Walden, Thoreau famously describes a bank created by a railroad cut near his beloved Walden Pond; he remarks how the moving streams of thawing sand and clay resemble patterns elsewhere in nature, such as rivers, anatomical structures, and vegetation. He writes, “When I see on the one side the inert bank,—for the sun acts on one side first,—and on the other this luxuriant foliage, the creation of an hour, I am affected as if in a peculiar sense I stood in the laboratory of the Artist who made the world and me,—had come to where he was still at work, sporting on this bank, and with excess of energy strewing his fresh designs about.”10 The Walden manuscript itself may serve as a kind of laboratory for learning about manuscripts and encoding; it is equally, of course, the laboratory in which Thoreau the artist strewed about his “fresh designs” as he moved from version to version, or revised within a single version. Nearly every page bears witness to his own “excess of energy,” and on each page we come to where Thoreau is “still at work.” By encoding Thoreau’s work of revision, we try to provide insights into both his creative process and the complex work of art that was its product.

Still at work, too, is the team that has embraced this encoding task in an effort to produce a companion to Walden: A Fluid-Text Edition. In team discussions, we have returned, over and over again, to the questions that David Birnbaum raised in the first of our October 2020 workshop sessions. What do we aim to accomplish through a fresh encoding of Thoreau’s revisions? What will that encoding offer readers that is not already available to them through Clapper’s encoded dissertation? What data must we capture to give those readers new knowledge and new value? What data lie outside our scope, at least for now? Will the new project produce the same kind of “edition” as the existing one, in the familiar shape of a text that comprises the whole of the author’s work and can be read in linear sequence from first word to last? Or will it rather look more like a body of revision snapshots, each attempting to tell the story of some particularly interesting or significant sequence of authorial changes? Might such a collection be allowed to expand organically as, in seeking to tell the story of well-known passages, we stumble unexpectedly on less scrutinized ones with revision histories crying out to be narrated? What kind of interface is needed to make our narration clear and compelling? Should this be a social edition, one that invites readers to contribute revision narratives by applying the project’s ODD to passages that they select from the manuscript themselves?

For now, without question, we are excited to invite colleagues to use the manuscript of Walden with their students in the ways we have described here, and, as they try new experiments in their own laboratories, to let us know the results.

