Annotating CWGK Documents with MashBill

CWGK is working with Brumfield Labs of Austin, Texas, to build an annotation and entity management system that will allow CWGK to locate, identify, and link together every person, place, organization, and geographical feature in every CWGK document. The annotation application, MashBill, has been live since February 2017, and CWGK staff and Graduate Research Associates working remotely from eight university campuses across the country have (as of April 2017) identified nearly 5,000 unique entities which appear over 8,000 times in nearly 700 CWGK texts.

CWGK published a preliminary plan for MashBill in the fall of 2016, but with the system now up and running, this post will move through each step of the annotation process with screenshots.


The first step is to search for and select the assigned document on the CWGK website.

In the document view screen, the annotator activates a browser plugin called Hypothes.is, which enables annotation and commentary on any web page. All CWGK staff and GRAs are members of an invitation-only Hypothes.is group, which collects data and feeds it into the MashBill system.

The next step is to highlight all entities (people, places, organizations, or geographical features) at their first mention in the text of the documents, select annotate when the Hypothes.is icon appears above the text, and click “Post to CWGK”.

Once an annotator completes this process, they can click on the Hypothes.is icon in the browser toolbar to review all of the highlighted entities.

The annotator then moves into MashBill itself, where each user sees a dashboard of their own previous work, a running tab of the latest work in the database, and search fields to find an entity or document. Those search fields allow the annotator to look up in MashBill the number of the document that has just been highlighted.

Each of the character strings highlighted in Hypothes.is appears on the MashBill document screen.

The user selects “identify” to search the database for entity names which are at least a 30% match to the transcribed character string. This degree of proximity suggests likely matches, but still allows flexibility to account for name abbreviations, misspellings, and the use of titles to identify individuals.
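The idea of a 30% match can be illustrated with a short Python sketch. This is not MashBill's actual matching code, which this post does not describe; it assumes a similarity ratio from the standard library's `difflib`, and the entity list and the `suggest_matches` helper are hypothetical.

```python
from difflib import SequenceMatcher

# Hypothetical entity names; the real MashBill database is not shown here.
KNOWN_ENTITIES = [
    "Thomas E. Bramlette",
    "George W. Johnson",
    "Benjamin F. Buckner",
]

def suggest_matches(highlighted, names, threshold=0.30):
    """Return (name, score) pairs scoring at or above the threshold, best first."""
    scored = []
    for name in names:
        # ratio() is 1.0 for identical strings and falls toward 0.0 as they diverge.
        score = SequenceMatcher(None, highlighted.lower(), name.lower()).ratio()
        if score >= threshold:
            scored.append((name, score))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# A title plus surname still surfaces the full entity name.
suggestions = suggest_matches("Gov. Bramlette", KNOWN_ENTITIES)
for name, score in suggestions:
    print(f"{name}: {score:.2f}")
```

A low threshold like this trades precision for recall: the annotator sees more candidates than strictly necessary, but abbreviated or misspelled names are less likely to be missed.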

MashBill suggests known entities, but if the entity in question has not yet been added to the database, the annotator moves to the entity creation screen.

After research in approved, authoritative, and reliable sources, the annotator writes a short entity “biography”, fills out a bibliography section, marks up any textual features including italics and underlining in Markdown, and fills in the metadata fields relevant to the entity type.

 

The annotator confirms the information is correct and creates the entity, which is automatically linked to the character string highlighted in Hypothes.is.

If an entity already exists in the MashBill database, the user simply chooses the correct entity from the suggested list and MashBill automatically links the entity record to the character string.

The annotator proceeds until all of the entities for the document have been identified. They then click “Document Needs Reviewed,” which sends the document into the fact-checking queue.

Once another staff member has checked the work for accuracy and adherence to editorial style, the document will be marked complete, and MashBill will insert reference tags containing the unique identifier for each entity biography into the TEI-XML transcription of the document stored in GitHub. These files will be re-imported into the existing CWGK Omeka site along with the entity biographies, allowing hyperlinked navigation between text and biography.
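As a rough illustration of what inserting a reference tag might look like, here is a minimal Python sketch. The element name `ref`, the `target` attribute, the identifier `N00001`, and the `tag_first_mention` helper are all assumptions made for the example; the post does not specify which TEI markup MashBill actually writes.

```python
import re
from xml.sax.saxutils import escape, quoteattr

def tag_first_mention(tei_text, highlighted, entity_id):
    """Wrap the first occurrence of the highlighted string in a reference tag."""
    # Hypothetical markup: the element and attribute names are assumptions.
    safe_text = escape(highlighted)
    replacement = "<ref target={}>{}</ref>".format(quoteattr("#" + entity_id), safe_text)
    # count=1 tags only the first mention, mirroring the highlighting rule above.
    return re.sub(re.escape(safe_text), lambda match: replacement, tei_text, count=1)

sample = "<p>Granted to Matthew Pace. Matthew Pace then signed.</p>"
tagged = tag_first_mention(sample, "Matthew Pace", "N00001")
print(tagged)
```

A production system would more likely anchor on the exact position recorded by the annotation tool rather than on a text search, since the same string can appear in a document before the mention that was actually highlighted.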

The final step in the current CWGK annotation process is social networking, documenting all of the relationships between individuals and organizations present in the text of the document itself.

Each relationship between entities is classified as one of seven types: familial, political, legal, economic, social, military, and slavery. Entities can have multiple relationships within documents if the relationship between the two is multifaceted or evolves as the document proceeds. Entities can also have the same type of relationship documented in multiple documents, adding weight to the vector between those two nodes. As a result, entities can be involved in a complex network of relationships.
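The weighting described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout, the placeholder entity names, the `edge_weights` helper, and the choice to treat relationships as undirected are all hypothetical, not a description of MashBill's data model.

```python
from collections import Counter

# The seven relationship types named in the text.
RELATIONSHIP_TYPES = {
    "familial", "political", "legal", "economic", "social", "military", "slavery",
}

def edge_weights(relationships):
    """Count how often each (pair of entities, type) relationship is documented."""
    weights = Counter()
    for doc_id, first, second, rel_type in relationships:
        if rel_type not in RELATIONSHIP_TYPES:
            raise ValueError(f"unknown relationship type: {rel_type}")
        # frozenset ignores the order of the pair (assumes undirected relationships).
        weights[(frozenset((first, second)), rel_type)] += 1
    return weights

# Hypothetical records: (document, entity, entity, relationship type).
records = [
    ("doc-1", "Person A", "Person B", "political"),
    ("doc-1", "Person A", "Person B", "familial"),
    ("doc-2", "Person B", "Person A", "political"),
]
weights = edge_weights(records)
pair = frozenset(("Person A", "Person B"))
print(weights[(pair, "political")], weights[(pair, "familial")])
```

Here the political tie between the two placeholder people appears in two documents and so carries twice the weight of the familial tie, which appears in one.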

When the relationships have been identified and created, the annotation stage on this document is complete and the annotator moves on to the next assignment.

New CWGK Document Brings KHS Staff Together

It’s always nice when CWGK documents walk right into our office! While going through old family papers, KHS Head of Reference Services Cheri Daniels found an 1865 land grant to one of her ancestors, Matthew Pace, signed by Governor Thomas E. Bramlette. Land grants such as these are particularly difficult for CWGK to track down because they move administratively from the County Courts briefly to the executive department in Frankfort, and then back into the hands of the grantee. Documents like this one, in short, will likely have to come to CWGK via family holdings like Cheri’s.

As the CWGK staff got out the scanners, the story really took off. Register of the Kentucky Historical Society Associate Editor Stephanie Lang noticed the name of one of her Floyd County ancestors, William J. May, on the grant. KHS’s library collections came to the rescue, and the team quickly pulled maps of Floyd and Magoffin counties to locate the specific plot of land granted in this newly accessioned CWGK document.

Will this document change the way we understand the Civil War era in Eastern Kentucky? Perhaps not. But it does underscore the importance of every document in the CWGK corpus. Each document contains a link to the lives and stories of everyday people from across the Commonwealth and the globe. And bringing these documents together in digital public space allows CWGK researchers to make connections between one another in the context of our shared past.

Look forward to the digital debut of the Matthew Pace collection soon at Discovery.CivilWarGovernors.org!

CWGK on Papers of Abraham Lincoln Review & Planning Team

Civil War Governors of Kentucky project director Patrick Lewis joins a world-class group of scholars and editors on the Papers of Abraham Lincoln Review and Planning Team. The Abraham Lincoln Presidential Library and Museum convened the team to assess over 15 years of editorial work on the Papers of Abraham Lincoln and to consult on digital platforms to publish images, transcriptions, and annotations of documents from throughout Lincoln’s life.

In addition to Lewis, other members of the Review and Planning Team include:

  • Daniel Feller, director of the Papers of Andrew Jackson project at the University of Tennessee-Knoxville
  • Susan Perdue, director of the Documents Compass program at the Virginia Foundation for the Humanities
  • Matthew Pinsker, director of Dickinson College’s House Divided Project
  • Jennifer Stertzer, director of the University of Virginia’s Center for Digital Editing and senior editor for the Papers of George Washington Digital Edition

These projects represent the cutting edge in documentary editing and digital history. The inclusion of CWGK among them is a testament to the importance of the work this project has done since it organized in 2010. In addition to delivering a new perspective on the Civil War to teachers, students, and researchers across the Commonwealth and the United States, CWGK has earned a seat at the table for important discussions about where the history field will go in the twenty-first century.

Read more about the Review and Planning Team in the State Journal-Register:

“These folks that were brought in have worked on different projects around the country, and have many years of experience in different areas,” Lowe said. “They’re all quite skilled in documentary editing and understand that world.”

The Papers of Abraham Lincoln project began in 1985 as the Lincoln Legal Papers Project, dedicated to finding all surviving records from Lincoln’s legal career. When that work was finished, the mission was expanded in 2000 to finding all Lincoln documents and putting them into a digital format.

SHA Graduate Council Features CWGK & Public History

Civil War Governors of Kentucky project director Patrick Lewis and Kentucky Historical Society colleague Mandy Higgins led a #TuesdayTakeover of the Southern Historical Association’s Graduate Council Twitter feed on February 14, 2017.

The SHA Grad Council invites historians to share career advice with emerging professionals in graduate programs across the United States. Lewis and Higgins live tweeted their work day and used their activities to offer tips and advice on managing public history careers, digital history startup and sustainability, and the transferability of graduate skills into the public history workplace.

Preview the day’s advice below, and see the full recap here:

Civil War Governors of Kentucky Editor Hosts Webinar for Kentucky’s Librarians and Archivists

Civil War Governors of Kentucky (CWGK) assistant editor Tony Curtis hosted a webinar on October 14, 2016, entitled “Researching the Civil War Governors of Kentucky” for Kentucky’s librarians and archivists as a part of the Continuing Education program offered through the Kentucky Department for Libraries and Archives (KDLA). The webinar focused on the launch of “Early Access,” the first stage of accessibility, in June 2016, allowing users to browse and keyword search over 10,000 documents.

The next step, “Annotation Beta,” funded by the NHPRC, is to deliver approximately 1,500 documents, annotated and set within dense social and geographic networks. The presentation demonstrated how CWGK will shape the ways researchers, students, and teachers will explore the past in the future.

Click HERE to listen to the webinar.

Voices of the Filson Interview on WXOX 97.1FM (Louisville, Ky.)

Listen to Civil War Governors of Kentucky assistant editor Tony Curtis as he returns to the Filson Historical Society archives to discuss the project and its plans for annotation and social networking with the Filson’s own associate curator of collections, Aaron Rosenblum, on an episode of Voices of the Filson on WXOX 97.1 FM.

Audio provided by Voices of the Filson on WXOX 97.1FM and the Filson Historical Society.

The Rogue Historian Podcast

Listen to #CWGK project director Patrick Lewis discuss the project on an episode of The Rogue Historian with Keith Harris.

We discuss:

  • Digital history and how it is useful
  • A historical “social network” being developed through CWGK annotation
  • The place in digital humanities for early career historians
  • How to use the documentary project’s user guides

Listen to the episode here

Kentucky Ancestors Online Feature

Want to learn how to search the new Civil War Governors site? How to use its features to build a research project for class or for family or local history? Interested in applying these 10,000+ documents to your home town or family tree?

Read our new feature in Kentucky Ancestors Online, the KHS digital magazine devoted to Kentucky families, locations, stories, resources, and migration.

Project Director Patrick Lewis examines the historical roots of a local legend from Trigg County.

Closing Out Grant Year 2015-16

The Civil War Governors of Kentucky staff is wrapping up the grant year for both of our major federal grants, from the NEH and the NHPRC. This is a good time to reflect back on what we have accomplished.

And we are now poised to enter a new grant year. What will Civil War Governors be doing between now and next October?

Civil War Governors is also going live in 2017, hosting a major scholarly conference in Frankfort and presenting at professional organizations and community groups across Kentucky.

A Facet of Early Access: How do I search?

by Tony Curtis

This might seem like an obvious topic, but there are three ways to search the Early Access website: (1) use the “Browse” function; (2) use the “Search Collection” function; or (3) use the “Advanced Search” function. Which function best suits your needs depends on the objective of your search.

The “Browse” function is an appropriate choice for individuals who would like to search a particular repository and/or a particular collection. For example, say you are interested in researching the first Confederate provisional governor of Kentucky—George W. Johnson—and you know that the Kentucky Historical Society houses a collection of personal papers for George W. Johnson. You would click the “Browse” button on the main menu in the top right quadrant of the website. Then scroll down until you see the “Kentucky Historical Society” repository button. CLICK. Then scroll until you find the “George W. Johnson Papers” button. CLICK. And commence your browsing of the collection at the item level.

If you would like to use the quickest search function, then you are in luck. The “Search Collection” function appears on the main screen—and every screen thereafter—for ease of user access. Just plug in your term or terms and commence your search of the entire collection. You can also narrow your search by using simple or Boolean search operators. For example, a search for Benjamin F. Buckner (no quotation marks) returns 180 search results, while “Benjamin F. Buckner” (with quotation marks) returns one search result. Searching Benjamin AND Buckner (with Boolean operators) returns sixteen search results. So try different combinations of words and operators when using the “Search Collection” function.
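To see why the three query styles return such different counts, consider this small Python sketch. It is not the Early Access search engine: it assumes that a query without operators matches documents containing any of its words, and the sample documents and helper functions are hypothetical.

```python
def tokens(text):
    """Lowercase the text and split it into words, trimming punctuation."""
    words = (w.strip(".,;:!?\"'()") for w in text.lower().split())
    return [w for w in words if w]

def simple_search(query, doc):
    """Match if the document contains ANY query word (assumed behavior)."""
    words = set(tokens(doc))
    return any(t in words for t in tokens(query))

def and_search(terms, doc):
    """Match only if the document contains EVERY term."""
    words = set(tokens(doc))
    return all(t in words for term in terms for t in tokens(term))

def phrase_search(phrase, doc):
    """Match only if the words appear together, in order."""
    doc_words, wanted = tokens(doc), tokens(phrase)
    n = len(wanted)
    return n > 0 and any(
        doc_words[i:i + n] == wanted for i in range(len(doc_words) - n + 1)
    )

# Hypothetical document texts.
DOCS = [
    "Benjamin F. Buckner to Thomas E. Bramlette",
    "Benjamin Smith wrote about Captain Buckner",
    "Petition signed by F. Jones",
]
query = "Benjamin F. Buckner"
simple_hits = [d for d in DOCS if simple_search(query, d)]
and_hits = [d for d in DOCS if and_search(["Benjamin", "Buckner"], d)]
phrase_hits = [d for d in DOCS if phrase_search(query, d)]
print(len(simple_hits), len(and_hits), len(phrase_hits))
```

On these three sample texts the simple search matches all three, the AND search matches two, and the quoted phrase matches one, the same narrowing pattern as the 180, sixteen, and one results above.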

The most complex and most effective search function to use is the “Advanced Search” function. This is a faceted search, meaning that it allows the researcher to narrow search results by using many different criteria that have been built into the metadata by project editors. To use this search function, click the “Advanced Search” option underneath the “Search Collection” search box on any page. You will arrive at a page that allows you to target your search by using three specific keyword search fields. You can select from a list of eleven fields to narrow your search: Accession Number, Collection, Date of Creation, Dates Mentioned, Document Genre, Document Title, Editorial Note, Item Location, Place of Creation, Repository, and Transcription. Any combination of these fields will help you narrow your search. For example, say I wanted to find all documents sent to Thomas E. Bramlette from Covington, Kenton County, Kentucky, in 1865. I would conduct the following faceted search: to Thomas E. Bramlette (Field: Document Title); 1865 (Field: Date of Creation); and Covington, Kenton County (Field: Place of Creation).

This faceted search returns twenty-eight search results, while a Boolean search conducted using the “Search Collection” returns fifty-six results. A simple search with all these terms and no operators returns 8,910 documents. Thus we see the benefit of the faceted search function. I would suggest experimenting with all the search functions to see which one best fits your research objectives; the answer may change from search to search. Early Access currently contains just over 10,000 documents, and this number will continue to grow over time.
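For readers curious about how a faceted search narrows results, here is a minimal Python sketch of the Bramlette example. The records, the field names in `snake_case`, and the `faceted_search` helper are hypothetical, and the sketch assumes that “to Thomas E. Bramlette” is the term entered in the Document Title field; the site's real metadata and matching rules may differ.

```python
def faceted_search(records, **facets):
    """Keep records where every named field contains its facet value."""
    results = []
    for record in records:
        # Case-insensitive substring match on each requested field.
        if all(
            value.lower() in record.get(field, "").lower()
            for field, value in facets.items()
        ):
            results.append(record)
    return results

# Hypothetical metadata records with placeholder correspondents.
RECORDS = [
    {"document_title": "John Doe to Thomas E. Bramlette",
     "date_of_creation": "1865-03-02",
     "place_of_creation": "Covington, Kenton County, Kentucky"},
    {"document_title": "Jane Roe to Thomas E. Bramlette",
     "date_of_creation": "1864-07-11",
     "place_of_creation": "Covington, Kenton County, Kentucky"},
    {"document_title": "Thomas E. Bramlette to Richard Roe",
     "date_of_creation": "1865-05-20",
     "place_of_creation": "Frankfort, Franklin County, Kentucky"},
]

hits = faceted_search(
    RECORDS,
    document_title="to Thomas E. Bramlette",
    date_of_creation="1865",
    place_of_creation="Covington, Kenton County",
)
print([hit["document_title"] for hit in hits])
```

Because each term is checked only against its own field, a document written from Covington in 1864 or a letter written by Bramlette in 1865 is excluded, whereas a keyword search across the full text could return both.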

So what do you say, how about a few searches? Bet you can’t search just once.

Tony Curtis is an Assistant Editor of the Civil War Governors of Kentucky Digital Documentary Edition.