Annotating CWGK Documents with MashBill

CWGK is working with Brumfield Labs of Austin, Texas, to build an annotation and entity management system that will allow CWGK to locate, identify, and link together every person, place, organization, and geographical feature in every CWGK document. The annotation application, MashBill, has been live since February 2017, and CWGK staff and Graduate Research Associates working remotely from eight university campuses across the country have (as of April 2017) identified nearly 5,000 unique entities which appear over 8,000 times in nearly 700 CWGK texts.

CWGK published a preliminary plan for MashBill in the fall of 2016, but with the system now up and running, this post will move through each step of the annotation process with screenshots.


The first step is to search for and select the assigned document on the CWGK website.

In the document view screen, the annotator activates a browser plugin called Hypothes.is, which enables annotation and commentary on any web page. All CWGK staff and GRAs are members of an invitation-only Hypothes.is group, which collects data and feeds it into the MashBill system.

The next step is to highlight all entities (people, places, organizations, or geographical features) at their first mention in the text of the documents, select annotate when the Hypothes.is icon appears above the text, and click “Post to CWGK”.

Once an annotator completes this process, they can click on the Hypothes.is icon in the browser toolbar to review all of the highlighted entities.

The annotator then moves into MashBill itself, where each user sees a dashboard of their own previous work, a running tab of the latest work in the database, and search fields to find an entity or document. Those search fields allow the annotator to look up, by document number, the document that has just been highlighted.

Each of the character strings highlighted in Hypothes.is appears on the MashBill document screen.
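For readers curious about what a highlighted character string looks like as data, here is a minimal sketch. It assumes an annotation payload in the Web Annotation style that Hypothes.is uses, where the highlighted text sits in a TextQuoteSelector. The sample URL, group ID, and text are invented for illustration, and this post does not describe how MashBill actually ingests the data.

```python
import json

# Sample payload shaped like a Hypothes.is annotation (Web Annotation style).
# The URL, group ID, and text are hypothetical values for illustration only.
SAMPLE = json.loads("""
{
  "uri": "http://example.org/document/1",
  "group": "hypothetical-group-id",
  "target": [
    {
      "source": "http://example.org/document/1",
      "selector": [
        {"type": "TextPositionSelector", "start": 10, "end": 23},
        {"type": "TextQuoteSelector",
         "exact": "Gov Bramlette",
         "prefix": "signed by ",
         "suffix": " in 1865"}
      ]
    }
  ]
}
""")


def highlighted_strings(annotation):
    """Return the exact text of every TextQuoteSelector in one annotation."""
    found = []
    for target in annotation.get("target", []):
        for selector in target.get("selector", []):
            if selector.get("type") == "TextQuoteSelector":
                found.append(selector["exact"])
    return found


print(highlighted_strings(SAMPLE))
```

The "exact" field carries the transcribed string as the annotator highlighted it, which is the text that would then need to be matched to an entity.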

The user selects “identify” to search the database for entity names which are at least a 30% match to the transcribed character string. This degree of proximity suggests likely matches, but still allows flexibility to account for name abbreviations, misspellings, and the use of titles to identify individuals.
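To make the 30% idea concrete, the sketch below filters a list of entity names by similarity to a highlighted string. The post does not say how MashBill scores a match, so the use of Python's difflib ratio is an assumption, as are the function names and the sample entity list.

```python
from difflib import SequenceMatcher

THRESHOLD = 0.30  # the "at least a 30% match" cutoff described in the text


def similarity(a, b):
    """Case-insensitive similarity between two strings, from 0.0 to 1.0.

    Assumption: difflib's ratio stands in for whatever measure MashBill uses.
    """
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def suggest(highlighted, entity_names, threshold=THRESHOLD):
    """Return entity names at or above the threshold, best match first."""
    scored = [(similarity(highlighted, name), name) for name in entity_names]
    matches = [pair for pair in scored if pair[0] >= threshold]
    matches.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in matches]


# Hypothetical entity list for illustration.
names = ["Thomas E. Bramlette", "Matthew Pace", "Floyd County"]
print(suggest("Gov. Bramlette", names))
```

A cutoff this low is deliberately permissive. It lets an abbreviated or titled form such as "Gov. Bramlette" surface the full name, at the cost of showing some weak candidates that the annotator must rule out by eye.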

MashBill suggests known entities, but if the entity in question has not yet been added to the database, the annotator moves to the entity creation screen.

After research in approved, authoritative, and reliable sources, the annotator writes a short entity “biography”, fills out a bibliography section, marks up any textual features including italics and underlining in Markdown, and fills in the metadata fields relevant to the entity type.

 

The annotator confirms the information is correct and creates the entity, which is automatically linked to the character string highlighted in Hypothes.is.

If an entity already exists in the MashBill database, the user simply chooses the correct entity from the suggested list and MashBill automatically links the entity record to the character string.

The annotator proceeds until all of the entities for the document have been identified. They then click “Document Needs Reviewed” which sends the document into the fact-checking queue.

When another staff member checks work for accuracy and adherence to editorial style, the document will be marked complete, and MashBill will insert reference tags containing the unique identifier for each entity biography into the TEI-XML transcription of the document stored in GitHub. These files will be re-imported into the existing CWGK Omeka site along with the entity biographies, allowing hyperlinked navigation between text and biography.
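As a rough illustration of the tagging step, the sketch below wraps the first mention of an entity in a reference tag that carries its identifier. The post does not name the element MashBill writes, so the TEI rs element with a ref attribute, the identifier format, and the function name are all assumptions. A production tool would also need XML-aware handling rather than plain string matching.

```python
from xml.sax.saxutils import escape, quoteattr


def tag_first_mention(tei_text, mention, entity_id):
    """Wrap the first occurrence of `mention` in an <rs ref="#..."> element.

    Assumptions: <rs> is the reference tag, and the mention appears in the
    transcription as plain text, not split across other tags.
    """
    escaped = escape(mention)  # match the XML-escaped form of the mention
    index = tei_text.find(escaped)
    if index == -1:
        return tei_text  # mention not found, so leave the text unchanged
    tagged = "<rs ref={}>{}</rs>".format(quoteattr("#" + entity_id), escaped)
    return tei_text[:index] + tagged + tei_text[index + len(escaped):]


# Hypothetical transcription snippet and entity identifier.
snippet = "<p>Land grant signed by Thomas E. Bramlette, Governor.</p>"
print(tag_first_mention(snippet, "Thomas E. Bramlette", "entity-0001"))
```

Tagging only the first mention mirrors the annotation rule described earlier, where entities are highlighted at their first appearance in the text.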

The final step in the current CWGK annotation process is social networking, documenting all of the relationships between individuals and organizations present in the text of the document itself.

Each relationship between entities is classified as one of seven types: familial, political, legal, economic, social, military, and slavery. Two entities can have multiple relationships within a document if the relationship between them is multifaceted or evolves as the document proceeds. Entities can also have the same type of relationship documented in multiple documents, adding weight to the vector between those two nodes. Entities can be involved in a complex network of relationships.
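One way to picture how repeated relationships add weight is to count the documented links between each pair of entities. This is a minimal sketch: the record layout, identifiers, and the choice to count every documented relationship as one unit of weight are assumptions, since the post does not describe how MashBill stores or weights its network.

```python
from collections import Counter

# The seven relationship types named in the text.
RELATIONSHIP_TYPES = {
    "familial", "political", "legal", "economic",
    "social", "military", "slavery",
}


def edge_weights(relationships):
    """Count documented relationships for each unordered pair of entities."""
    weights = Counter()
    for doc_id, entity_a, entity_b, rel_type in relationships:
        if rel_type not in RELATIONSHIP_TYPES:
            raise ValueError(f"unknown relationship type: {rel_type}")
        # frozenset makes (A, B) and (B, A) the same edge.
        weights[frozenset((entity_a, entity_b))] += 1
    return weights


# Hypothetical records: (document, entity, entity, relationship type).
records = [
    ("doc-1", "person-A", "person-B", "familial"),
    ("doc-1", "person-A", "person-B", "economic"),
    ("doc-2", "person-B", "person-A", "familial"),
    ("doc-2", "person-A", "org-C", "military"),
]
weights = edge_weights(records)
print(weights[frozenset(("person-A", "person-B"))])
```

Here the pair person-A and person-B accumulates weight both from two relationship types in one document and from the same type recurring in a second document, which are the two cases the paragraph above describes.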

When the relationships have been identified and created, the annotation stage on this document is complete and the annotator moves on to the next assignment.

New CWGK Document Brings KHS Staff Together

It’s always nice when CWGK documents walk right into our office! While going through old family papers, KHS Head of Reference Services Cheri Daniels found an 1865 land grant to one of her ancestors, Matthew Pace, signed by Governor Thomas E. Bramlette. Land grants such as these are particularly difficult for CWGK to track down because they move administratively from the County Courts briefly to the executive department in Frankfort, and then back into the hands of the grantee. Documents like this one, in short, will likely have to come to CWGK via family holdings like Cheri’s.

As the CWGK staff got out the scanners, the story really took off. Register of the Kentucky Historical Society Associate Editor Stephanie Lang noticed the name of one of her Floyd County ancestors, William J. May, on the grant. KHS’s library collections came to the rescue, and the team quickly pulled maps of Floyd and Magoffin counties to locate the specific plot of land granted in this newly accessioned CWGK document.

Will this document change the way we understand the Civil War era in Eastern Kentucky? Perhaps not. But it does underscore the importance of every document in the CWGK corpus. Each document contains a link to the lives and stories of everyday people from across the Commonwealth and the globe. And bringing these documents together in digital public space allows CWGK researchers to make connections between one another in the context of our shared past.

Look forward to the digital debut of the Matthew Pace collection soon at Discovery.CivilWarGovernors.org!

CWGK on Papers of Abraham Lincoln Review & Planning Team

Civil War Governors of Kentucky project director Patrick Lewis joins a world-class group of scholars and editors on the Papers of Abraham Lincoln Review and Planning Team. The Abraham Lincoln Presidential Library and Museum convened the team to assess over 15 years of editorial work on the Papers of Abraham Lincoln and to consult on digital platforms to publish images, transcriptions, and annotations of documents from throughout Lincoln’s life.

In addition to Lewis, other members of the Review and Planning Team include:

  • Daniel Feller, director of the Papers of Andrew Jackson project at the University of Tennessee-Knoxville
  • Susan Perdue, director of the Documents Compass program at the Virginia Foundation for the Humanities
  • Matthew Pinsker, director of Dickinson College’s House Divided Project
  • Jennifer Stertzer, director of the University of Virginia’s Center for Digital Editing and senior editor for the Papers of George Washington Digital Edition

These projects represent the cutting edge in documentary editing and digital history. The inclusion of CWGK among them is a testament to the importance of the work this project has done since it organized in 2010. In addition to delivering a new perspective on the Civil War to teachers, students, and researchers across the Commonwealth and the United States, CWGK has earned a seat at the table for important discussions about where the history field will go in the twenty-first century.

Read more about the Review and Planning Team in the State Journal-Register:

“These folks that were brought in have worked on different projects around the country, and have many years of experience in different areas,” Lowe said. “They’re all quite skilled in documentary editing and understand that world.”

The Papers of Abraham Lincoln project began in 1985 as the Lincoln Legal Papers Project, dedicated to finding all surviving records from Lincoln’s legal career. When that work was finished, the mission was expanded in 2000 to finding all Lincoln documents and putting them into a digital format.

SHA Graduate Council Features CWGK & Public History

Civil War Governors of Kentucky project director Patrick Lewis and Kentucky Historical Society colleague Mandy Higgins led a #TuesdayTakeover of the Southern Historical Association’s Graduate Council Twitter feed on February 14, 2017.

The SHA Grad Council invites historians to share career advice with emerging professionals in graduate programs across the United States. Lewis and Higgins live tweeted their work day and used their activities to offer tips and advice on managing public history careers, digital history startup and sustainability, and the transferability of graduate skills into the public history workplace.

Preview the day’s advice below, and see the full recap here:

Kentucky Ancestors Online Feature

Want to learn how to search the new Civil War Governors site? How to use its features to build a research project for class or for family or local history? Interested in applying these 10,000+ documents to your home town or family tree?

Read our new feature in Kentucky Ancestors Online, the KHS digital magazine devoted to Kentucky families, locations, stories, resources, and migration.

Project Director Patrick Lewis examines the historical roots of a local legend from Trigg County.

Closing Out Grant Year 2015-16

The Civil War Governors of Kentucky staff is wrapping up the grant year for both of our major federal grants, from the NEH and the NHPRC. This is a good time to reflect back on what we have accomplished.

And we are now poised to enter a new grant year. What will Civil War Governors be doing between now and next October?

Civil War Governors is also going live in 2017, hosting a major scholarly conference in Frankfort and presenting at professional organizations and community groups across Kentucky.

Graduate Research Associates 2016-17

Overview

The Kentucky Historical Society seeks eight Graduate Research Associates (GRAs) familiar with 19th century United States history to write short informational entries for the Civil War Governors of Kentucky Digital Documentary Edition (CWG-K). GRAs will receive a stipend of $5,000 each and can work remotely from their home institutions.

Each GRA will annotate 150 assigned documents. Each GRA must be a graduate student in at least the second year of an M.A. program in history or a related humanities discipline. In accordance with its commitment to facilitating relationships between history practitioners and organizations in Kentucky and nationally, KHS hopes that these GRA positions will help advance the professional skills of early-career historians in Kentucky and elsewhere. Preference will be given to candidates who are enrolled in graduate programs in history at Kentucky universities, though applicants worldwide are encouraged to apply. These positions are funded by a grant from the National Historical Publications and Records Commission (NHPRC), a branch of the National Archives.

CWG-K is an annotated, searchable, and freely-accessible online edition of documents associated with the chief executives of the commonwealth, 1860-1865. Yet CWG-K is not solely about the five governors; it is about reconstructing the lost lives and voices of tens of thousands of Kentuckians who interacted with the office of the governor during the war years. CWG-K will identify, research, and link together every person, place, and organization found in its documents. This web of hundreds of thousands of networked nodes will dramatically expand the number of actors in Kentucky and U.S. history, show scholars new patterns and hidden relationships, and recognize the humanity and agency of historically marginalized people. To see the project’s work to date, visit discovery.civilwargovernors.org.

Scope of Work

Each GRA will be responsible for researching and writing short entries on named persons, places, organizations, and geographical features in 150 documents. Each document contains an average of fifteen such entities. This work will be completed and submitted to CWG-K for fact-checking before June 30, 2017.

Research and writing will proceed according to project guidelines concerning research sources and methods, editorial information desired, and adherence to house style. This will ensure 1) that due diligence is done to the research of each entity and 2) that information is recorded for each item in uniform ways which are easy to encode and search.

All research for the entries must be based in primary or credible secondary sources, and each GRA is expected to keep a virtual research file with notes and digital images of documents related to each entry. These will be turned over to CWG-K at the completion of the work. CWG-K will fact-check all entries for research quality and adherence to house style. CWG-K projects an average rate of one document annotated per two hours of work. Each GRA may expect to devote approximately 300 hours to the research—though the actual investment of time may vary.

Each GRA will work remotely. Interaction with the documents and the writing of annotations will take place in a web-based annotation tool developed for CWG-K, which can be dialed into from any location. CWG-K will make use of online research databases to make its work efficient and uniform. Other archival sources may be of value but are not required by the research guidelines. Securing access to the paid databases required by CWG-K (Ancestry.com, Fold3.com, and ProQuest Historical Newspapers: Louisville Courier Journal) is the responsibility of the GRA. If regular institutional access to these databases is not available to the GRA through a university or library, it is the responsibility of the GRA to purchase and use a subscription to these databases. KHS will not reimburse the GRA for any travel, copying, or other expenses incurred in CWG-K research.

In order to maintain quality and consistency as well as to foster a collegial and collaborative work culture, CWG-K will conduct weekly virtual “office hours” via Google Hangouts, during which GRAs are required to dial in, ask questions of staff, share expertise and research methods, and make connections with their peers at other universities. Virtual attendance at these office hours is mandatory, and multiple sessions may be offered to accommodate schedules.

The Kentucky Historical Society will hold copyright for all annotation research as work for hire.

Evaluation Criteria

A proposal should consist of at least a narrative statement of professional ability in the form of a cover letter, a CV, and two letters of recommendation. Additional supplementary materials that demonstrate capacity in the evaluation factors may also be included. Applications are due by September 16, 2016 to Tony Curtis, tony.curtis@ky.gov.

The Kentucky Historical Society will evaluate the proposals based on the following factors:

Research Experience (70 points): Describe your familiarity with research in 19th century U.S. history. Describe some projects you have undertaken. What sources have you used? Have you been published? Have you interpreted historical research in forms other than a scholarly peer-reviewed publication? How does the proposed research project differ from those you have undertaken in the past? Describe your familiarity with the strengths and weaknesses of online research databases such as Ancestry.com, Fold3.com, ProQuest, and Google Books.

Project Experience (30 points): Describe any work you have done in the editing of historical documents. Discuss how a project such as CWG-K maintains balance between thorough research and production schedules. Have you worked on other collaborative projects in the field of history or otherwise? Describe your ability to meet deadlines and regulate workflow. Describe your understanding of and/or experience with the Digital Humanities. From what you know of the CWG-K project, how does it fit with current trends in the field? What do you hope to gain from working on the CWG-K project?

Institutional Affiliation (10 points): Additional points are available to applicants who are enrolled in graduate programs at Kentucky universities. Applicants claiming this status should discuss how they will use this experience to help build and sustain relationships among history organizations across the state and articulate why such relationships are valuable. This does not imply any relationship between KHS and the educational institution.

Civil War Governors Reviewed on HistoryNet

“Easily explored by browsing or keyword search, this superb site offers excellent resources for those whose reading, research and writing interests lay at the crossroads of the battlefield and the home front.”

Read more from University of Southern Mississippi Professor Susannah J. Ural’s review of the new Early Access interface from her Ural on URL column on HistoryNet:

http://www.historynet.com/the-war-on-the-net.htm

SCWH Cross-Post: So, You Want to Create a Digital Project

Have you found a hidden gem of a collection that you want to share with the world? Thinking of creative ways to actively engage your students in the work of history? Want to attract students to your department and develop diverse career skills for history majors?

If you have answered “yes” to any of these questions, a digital project might be in your future. But how exactly do you start?

From the earliest conceptual stages through our Early Access web development, Civil War Governors has learned quite a bit about designing and launching a digital history project—sometimes the hard way.

Read some distilled tips from project director Patrick Lewis at the Society of Civil War Historians blog.