- Secret History of Our Streets. Six ordinary streets, each telling us about how life in London has changed in 150 years. Involves both past and present residents.
- Reel History of Britain. A social history of 20th Century Britain showing how people worked and lived using viewers' personal memories and rare archive newsreel footage.
Wednesday, 30 October 2013
Micro-history for Genealogists
Sunday, 20 October 2013
Do Genealogists Really Need a Database?
- Tree-based. These load an XML file into an internal tree structure and allow a program to navigate around it. The World-Wide Web Consortium’s (W3C) Document Object Model (DOM) is a commonly used version. These are very easy to use but they can be memory hungry. Also, a program may only want access to part of the data, or it may want to convert the DOM into a different tree that’s more amenable to what it is doing. In the latter case, it means there will be two trees in memory before one is discarded in favour of the other.
- Event-based. These perform a lexical scan of the XML file on disk and let the program know of parsing events, such as the start and end of elements, through a call-back mechanism. The program can then decide what it wants to listen for and what it wants to do with it. This is possibly less common because it requires more configuration to be provided in the code. SAX is the best known example.
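To make the contrast between the two parser styles concrete, here is a minimal sketch using Python's standard library (`xml.dom.minidom` for the tree-based style and `xml.sax` for the event-based style). The sample data and the names `names_via_dom`, `names_via_sax`, and `PersonHandler` are my own illustrations and do not come from any real file or tool.

```python
import xml.dom.minidom
import xml.sax

# Hypothetical sample data, invented purely for this illustration.
SAMPLE = "<people><person name='Ann'/><person name='Bob'/><place name='York'/></people>"


def names_via_dom(xml_text):
    """Tree-based: load the whole document into memory, then navigate it."""
    doc = xml.dom.minidom.parseString(xml_text)
    return [el.getAttribute("name") for el in doc.getElementsByTagName("person")]


class PersonHandler(xml.sax.ContentHandler):
    """Event-based: the parser calls back as each element is scanned."""

    def __init__(self):
        super().__init__()
        self.names = []

    def startElement(self, name, attrs):
        # Listen only for the elements of interest and ignore the rest.
        if name == "person":
            self.names.append(attrs.getValue("name"))


def names_via_sax(xml_text):
    handler = PersonHandler()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.names


dom_names = names_via_dom(SAMPLE)
sax_names = names_via_sax(SAMPLE)
print(dom_names, sax_names)
```

Both functions return the same names, but the SAX version never holds the whole document in memory, at the cost of the extra handler class.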
Wednesday, 16 October 2013
Claverley Property Document Transcript
Sue Adams makes a case for transcription being an essential step to assimilating and understanding an historical document at Claverley Property Document Analysis, Part 1: Transcript. She uses the example of a manorial court record of a property transaction in 1844 and has invited me to show how this might be transcribed in STEMMA®.
Sue had already transcribed the raw text and inserted her own annotation relating parts of it to the handwritten original. She had also included explicit line numbers to help make that correlation more easily.
This is a useful exercise for me as STEMMA is still an evolving project that hopes to address transcription as part of its comprehensive narrative support. There are several parts to this exercise so I would like to itemise them for subsequent discussion:
i. Identifying the people, places, events, and dates.
ii. Linking people and place references to any corresponding entities.
iii. Handling marginalia.
iv. Handling uncertain characters.
v. Handling uncertain or unfamiliar words.
vi. Adding line numbers.
vii. Linking multiple page scans with single transcription.
In the interests of clarity I want to approach these items in a stepwise manner. I’ll initially deal with item vii.
Scanned images are represented in STEMMA using a Resource entity. The situation in Sue’s example of having multiple related page-scans but a single transcription occurs frequently and there are several ways of handling it. The common feature is to put those scans in a single folder (or use a common root for the file name) and employ a parameterised Resource entity to represent them all. This is convenient since it also provides a single point to associate the transcribed text.
<Resource Key='rClaverleyPropertyTrans' Abstract='1'>
<Title> Manorial court records for Claverley property transaction, 1844 </Title>
<Type> Document </Type>
<URL> file:mydocuments/ClaverleyProp/P{$Image}.jpg </URL>
<Params>
<Param Name='Image' Type='Integer'/>
<Param Name='Page' Type='Integer'/>
</Params>
<Text>
... transcribed text from below...
</Text>
</Resource>
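As a rough illustration of the parameterisation, the following Python sketch expands a `{$Name}` placeholder in a URL template from a set of parameter values. The helper name `expand_url` and the exact substitution rule are my assumptions for the purpose of the example; they are not taken from the STEMMA specification.

```python
import re

# The template mirrors the <URL> element shown above.
URL_TEMPLATE = "file:mydocuments/ClaverleyProp/P{$Image}.jpg"


def expand_url(template, params):
    """Replace each {$Name} placeholder with the matching parameter value.

    Hypothetical helper: the substitution rule is assumed, not specified.
    """
    def substitute(match):
        name = match.group(1)
        if name not in params:
            raise KeyError(f"No value supplied for parameter '{name}'")
        return str(params[name])

    return re.sub(r"\{\$(\w+)\}", substitute, template)


# 'Page' is supplied but unused by the template, so it has no effect here.
page_url = expand_url(URL_TEMPLATE, {"Image": 1284247, "Page": 544})
print(page_url)
```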
The ‘Page’ parameter is defined more for documentation purposes than for image access. A specific image could be represented using a derivative of this Resource using STEMMA’s inheritance mechanism. For instance:
<Resource Key='rClaverleyPropertyTrans_P544'>
<BaseResourceLink Key='rClaverleyPropertyTrans'/>
<Params>
<Param Name=’Image’> 1284247 </Param>
<Param Name=’Page’> 544 </Param>
</Params>
</Resource>
This creates another named Resource that inherits properties from the base Resource. However, in this instance, I will elect to generate transient unnamed Resource entities for each page using a ResourceRef element since I can do this on-the-fly and place them at the appropriate points in the transcription body. For instance:
…narrative text...
<ResourceRef Key='rClaverleyPropertyTrans' Mode='SynchImage'>
<Param Name=’Image’> 1284247 </Param>
<Param Name=’Page’> 544 </Param>
</ResourceRef>
…narrative text...
The Mode attribute causes the image not to be displayed, but identifies it as related to the current transcription. This allows transcribed lines, paragraphs, columns, etc., to be linked to specific locations in the current image (using x/y attributes), thus allowing parallel scrolling of the image and transcription.
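To show how a tool might pick up these synchronisation points, here is a small Python sketch that walks a transcription fragment and lists each page and image pair in document order. The fragment is a cut-down, hypothetical one that reuses the element names from this post, and `synch_images` is an illustrative name of my own rather than part of any STEMMA software.

```python
import xml.etree.ElementTree as ET

# Cut-down, hypothetical fragment using the same element names as the post.
FRAGMENT = """
<Text Key='tCases'>
  <ResourceRef Key='rClaverleyPropertyTrans' Mode='SynchImage'>
    <Param Name='Image'> 1284248 </Param>
    <Param Name='Page'> 561 </Param>
  </ResourceRef>
  first page of text
  <ResourceRef Key='rClaverleyPropertyTrans' Mode='SynchImage'>
    <Param Name='Image'> 1284249 </Param>
    <Param Name='Page'> 562 </Param>
  </ResourceRef>
  second page of text
</Text>
"""


def synch_images(xml_text):
    """Return (page, image) pairs for each SynchImage reference, in document order."""
    root = ET.fromstring(xml_text)
    pairs = []
    for ref in root.iter("ResourceRef"):
        if ref.get("Mode") != "SynchImage":
            continue
        # Collect the parameter values supplied inside this reference.
        params = {p.get("Name"): (p.text or "").strip() for p in ref.findall("Param")}
        pairs.append((int(params["Page"]), int(params["Image"])))
    return pairs


pages = synch_images(FRAGMENT)
print(pages)
```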
I’ll now add relevant mark-up to the text in order to handle items i and iii-v. This will also include an equivalent to Sue’s annotation linking the text segments to the corresponding page images. STEMMA has support for diplomatic transcription[1] and transcription notes (see Recording Evidence) but there are schemes with more comprehensive sets such as TEI. The strength of STEMMA lies in the way it identifies persons, places, events, etc., and links them to any corresponding entities in your data, which we’ll come to later.
<Narrative Key='nTranscript'>
<!-- Clarifying notes -->
<Text Key='tMessuage'>
A dwelling house together with its outbuildings, curtilage, and the adjacent land appropriated to its use.
</Text>
<Text Key='tAppurtenances'>
Something added to another, more important thing; an appendage.
</Text>
<Text Key='tIngress'>
A way to enter a place or the act of entering a place.
</Text>
<!-- Start of main court session -->
<Text Key='tCourtSession'>
<ms>
<ResourceRef Key='rClaverleyPropertyTrans' Mode='SynchImage'>
<Param Name='Image'> 1284247 </Param>
<Param Name='Page'> 544 </Param>
<Text>
New court session starts half way down the page
</Text>
</ResourceRef>
<Anom Mode='Marginalia' Posn='L'>
<ms>
<b>Manor</b> of <b>Claverley</b>} to wit
<DateRef Value='1844-04-25'>25th April 1844</DateRef>
</ms>
</Anom>
<page id='544'/><line num='1' posn='L'/>
<b>The Court Baron</b> purchased of <PersonRef>Thomas Whitman</PersonRef><br/>
Esquire Lord of this manor held at the dwelling house of<br/>
<PersonRef>John Crowther</PersonRef> called the <PlaceRef>Kings Arms</PlaceRef> situate in <PlaceRef>Claverley</PlaceRef><br/>
within this manor on <DateRef Value='1844-04-25'>Thursdays the twenty fifth day of<br/>
April in the year of our Lord One thousand eight hundred<br/>
and forty four and in the seventh year of the reign of her<br/>
present Majesty Queen Victoria</DateRef> Before <PersonRef>Francis Harrison</PersonRef> deputy<br/>
Steward there and in the presence of <PersonRef>Christopher Gabert</PersonRef> and<br/>
<PersonRef>Edward Crowther</PersonRef> two copyholders of this manor.<br/>
</ms>
</Text>
<!-- Body of our cases of interest -->
<Text Key='tCases'>
<ms>
<ResourceRef Key='rClaverleyPropertyTrans' Mode='SynchImage'>
<Param Name='Image'> 1284248 </Param>
<Param Name='Page'> 561 </Param>
<Text>
Case 1 not transcribed as it does not concern people of interest. I photographed the start of the court session, then skipped to the cases of interest on a later page. Page number query - not the page following the previous image. Case x starting half way down page.
</Text>
</ResourceRef>
<page id='561'/><line num='1'/>
<b>To this Court</b> come <PersonRef>John Wilson</PersonRef> of <PlaceRef>Aston</PlaceRef> within this<br/>
manor Farmer and <PersonRef>Samuel Nicholls</PersonRef> late of <PlaceRef>Catstree</PlaceRef> in the<br/>
<PlaceRef>parish of Worfield</PlaceRef> but now of <PlaceRef>Bridgnorth</PlaceRef> in the <PlaceRef>county of Salop</PlaceRef><br/>
Gentleman Devisees in trust named in the last will and testament<br/>
of <PersonRef>John Felton</PersonRef> heretofore of <PlaceRef>Hopstone</PlaceRef> but late of <PlaceRef>Draycott</PlaceRef> within<br/>
this manor Yeoman late copyholder of this manor deceased<br/>
in their own proper persons and in consideration of the Sum<br/>
of three hundred and fifteen pounds seven shillings of lawful<br/>
British money to them the said <PersonRef>John Wilson</PersonRef> and <PersonRef>Samuel Nicholls</PersonRef><br/>
.in hand well and truly paid by <PersonRef>Sarah Ward Nicholls</PersonRef> of<br/>
<PlaceRef>Catstree</PlaceRef> aforesaid Spinster before the passing of this surrender<br/>
as and for the purchase money for the hereditaments hereinafter<br/>
mentioned surrender into the hands of the Lord of this manor<br/>
by his deputy Steward aforesaid by the rod according to the custom<br/>
<ResourceRef Key='rClaverleyPropertyTrans' Mode='SynchImage'>
<Param Name='Image'> 1284249 </Param>
<Param Name='Page'> 562 </Param>
</ResourceRef>
<page id='562'/>
of this manor <b>All</b> that piece or parcel of land called or known<br/>
by the name of <PlaceRef>Mill Hill</PlaceRef> and all that newly erected <Alt> messuage <FromText Key='tMessuage'/></Alt> or<br/>
dwelling house and outbuildings on the same piece of land or some<br/>
part thereof with the <Alt> appurtenances <FromText Key='tAppurtenances'/></Alt> formerly <PersonRef>Grosvenors</PersonRef> and<br/>
late <PersonRef>Onions's[?]</PersonRef> situate in the <PlaceRef>township of Sleathton</PlaceRef> in the <PlaceRef>manor<br/>
of Claverley</PlaceRef> in the <PlaceRef>county of Salop</PlaceRef> formerly in the occupation<br/>
of <PersonRef>John Felton</PersonRef> and now of <PersonRef>William Ferrington</PersonRef> or his undertennants<br/>
containing by admeasurement three acres one rood and sixteen<br/>
perches or thereabouts being by computation the half of one<br/>
third part of a nook of land <b>To</b> the use and behoof of the<br/>
said <PersonRef>Sarah Ward Nicholls</PersonRef> her heirs and assigns for ever at<br/>
the will of the Lord according to the custom of this manor<br/>
<NoteRef Mode='Inline'>
<Text>
Case y. Undecipherable mark in margin.
</Text>
</NoteRef>
<line num='1'/>
<b>To this Court</b> comes <PersonRef>Sarah Ward Nicholls</PersonRef> of <PlaceRef>Catstree</PlaceRef> in<br/>
the <PlaceRef>parish of Worfield</PlaceRef> in the <PlaceRef>County of Salop</PlaceRef> Spinster in her own<br/>
proper person and by virture of a surrender to her use at this<br/>
Court made by <PersonRef>John Wilson</PersonRef> of <PlaceRef>Aston</PlaceRef> within this manor<br/>
Farmer and <PersonRef>Samuel Nicholls</PersonRef> late of <PlaceRef>Catstree</PlaceRef> aforesaid but now<br/>
of <PlaceRef>Bridgnorth</PlaceRef> in the said <PlaceRef>County of Salop</PlaceRef> Gentleman Devisees in<br/>
trust named in the last will and testament of <PersonRef>John Felton</PersonRef><br/>
heretofore of <PlaceRef>Hopstone</PlaceRef> but late of <PlaceRef>Draycott</PlaceRef> within this manor<br/>
Yeoman late a copyholder of this manor deceases desires to<br/>
be admitted tenant to the Lord of this manor according to the<br/>
custom of this manor of and to <b>All</b> that piece or parcel of land<br/>
called or known by the name of <PlaceRef>Mill Hill</PlaceRef> and all that newly<br/>
erected <Alt> messuage <FromText Key='tMessuage'/></Alt> or dwelling house and outbuildings on the same<br/>
piece of land or some part thereof with the <Alt> appurtenances <FromText Key='tAppurtenances'/></Alt> formerly<br/>
<PersonRef>Grosvenors</PersonRef> and late <PersonRef>Onions's</PersonRef> situate in the <PlaceRef>township of Heathton</PlaceRef><br/>
in the <PlaceRef>manor of Claverley</PlaceRef> in the <PlaceRef>county of Salop</PlaceRef> formerly in the<br/>
occupation of <PersonRef>John Felton</PersonRef> and now of <PersonRef>William Ferrington</PersonRef> or<br/>
his undertenants containing by admeasurement three acres one<br/>
rood and sixteen perches or thereabouts being by computation<br/>
the half of one third part of a nook of land <b>To</b> whom the<br/>
Lord of this manor by his deputy Steward aforesaid by the<br/>
rod according to the custom of this manor hath granted the<br/>
premises aforesaid with the <Alt> appurtenances <FromText Key='tAppurtenances'/></Alt> and seizin thereof<br/>
<b>To</b> have and to hold the same premises with the <Alt> appurtenances<FromText Key='tAppurtenances'/></Alt><br/>
unto the said <PersonRef>Sarah Ward Nicholls</PersonRef> her heirs and assigns<br/>
<b>To</b> the use and behoof of the said <PersonRef>Sarah Ward Nicholls</PersonRef> her heirs<br/>
<ResourceRef Key='rClaverleyPropertyTrans' Mode='SynchImage'>
<Param Name='Image'> 1284250 </Param>
<Param Name='Page'> 563 </Param>
</ResourceRef>
<page id='563'/>
and assigns for ever at the will of the Lord according to the<br/>
custom of this manor by the rents and customary services<br/>
therefore due and of right accustomed and for such estate and<br/>
<Alt>ingress <FromText Key='tIngress'/></Alt> the said <PersonRef>Sarah Ward Nicholls</PersonRef> doth give to the Lord<br/>
for a fine six pence half penny and four sixth parts of a<br/>
farthing and she is admitted tenant thereof in form aforesaid<br/>
and doth to the Lord fealty<br/>
<PersonRef>
<Alt Value='Francis'>Fran</Alt> Harrison
<Text>Signature</Text></PersonRef><br/>
Deputy Steward of the said manor
<NoteRef Mode='Inline'>
<Text>
End of court session, another session follows.
</Text>
</NoteRef>
</ms>
</Text>
</Narrative>
This may look complicated but remember that this is the internal representation. Using an appropriate tool, it would all look just like a fancy word processor (see Structured Narrative for an in-depth presentation).
The Anom element has been used to reference text in the margin, and the Alt element has been used to add both alternatives (as in the signature) and clarifications, for instance to provide definitions for messuage and appurtenances, neither of which is in my day-to-day vocabulary. The NoteRef element has been used to add general transcription notes and annotation.
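As an illustration of how a tool might resolve those Alt elements, the following Python sketch pairs each written form with either its Value attribute or the clarifying Text that a FromText child points at. The fragment is a shortened, hypothetical version of the transcript above, and the function name `alternatives` and the lookup rule are my assumptions, not a description of any existing STEMMA software.

```python
import xml.etree.ElementTree as ET

# Shortened, hypothetical fragment based on the transcript in this post.
FRAGMENT = """
<Narrative Key='nTranscript'>
  <Text Key='tMessuage'>A dwelling house together with its outbuildings.</Text>
  <Text Key='tCases'>
    newly erected <Alt> messuage <FromText Key='tMessuage'/></Alt> or dwelling house
    <PersonRef><Alt Value='Francis'>Fran</Alt> Harrison</PersonRef>
  </Text>
</Narrative>
"""


def alternatives(xml_text):
    """Return (written form, alternative or definition) for each Alt element."""
    root = ET.fromstring(xml_text)
    # Index the keyed Text elements so FromText links can be followed.
    texts = {t.get("Key"): t for t in root.iter("Text") if t.get("Key")}
    results = []
    for alt in root.iter("Alt"):
        written = (alt.text or "").strip()
        source = alt.find("FromText")
        if alt.get("Value") is not None:
            gloss = alt.get("Value")
        elif source is not None and source.get("Key") in texts:
            # Normalise whitespace in the referenced clarifying note.
            gloss = " ".join((texts[source.get("Key")].text or "").split())
        else:
            gloss = None
        results.append((written, gloss))
    return results


resolved = alternatives(FRAGMENT)
print(resolved)
```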
There is a reference to Queen Victoria which I’ve left without any PersonRef mark-up. This was because it made more sense to include it as part of the preceding DateRef element.
In the mark-up so far, I have only identified person and place references by raw PersonRef and PlaceRef elements respectively. These indicate that they are such references but they do not identify the actual person or place being referenced. This difference is part of STEMMA’s evidence-and-conclusion (E&C) support and is referred to as shallow and deep semantics. I do not know whether all the associated persons will be represented individually in Sue’s data, but I am guessing that the places might be. The following is a small example demonstrating how some of the raw place references can be associated with specific places, and how the corresponding Place entities can be used to hold images, documents, maps, or historical narrative (see A Place for Everything for further details).
<Place Key='wShrops'>
<Title> Shropshire </Title>
<Type> County </Type>
<Names>
<Sequences>
<Canonical>Shropshire</Canonical>
<Sequence>
<Tokens>
<Token>Shropshire</Token>
<Token>Salop</Token>
<Token>Shrops</Token>
</Tokens>
</Sequence>
</Sequences>
</Names>
<ParentPlaceLnk Key='wEngland'/>
</Place>
<Place Key='wClaverley'>
<Title> Claverley Parish </Title>
<Type> ParishCivil </Type>
<PlaceName> Claverley </PlaceName>
<ParentPlaceLnk Key='wShrops'/>
<Text>
Claverley became a Royal Manor in 1102
</Text>
</Place>
<Place Key='wClaverleyVillage'>
<Title> Claverley Village </Title>
<Type> Village </Type>
<PlaceName> Claverley </PlaceName>
<ParentPlaceLnk Key='wClaverley'/>
</Place>
<Place Key='wAston'>
<Title> Aston </Title>
<Type> Hamlet </Type>
<PlaceName> Aston </PlaceName>
<ParentPlaceLnk Key='wClaverley'/>
</Place>
<Place Key='wBridgnorth'>
<Title> Bridgnorth Parish </Title>
<Type> ParishCivil </Type>
<PlaceName> Bridgnorth </PlaceName>
<ParentPlaceLnk Key='wShrops'/>
</Place>
<Place Key='wBridgnorthTown'>
<Title> Bridgnorth </Title>
<Type> Town </Type>
<PlaceName> Bridgnorth </PlaceName>
<ParentPlaceLnk Key='wBridgnorth'/>
</Place>
<Place Key='wKingsArms'>
<Title> Kings Arms Public House </Title>
<Type> Building </Type>
<PlaceName> Kings Arms </PlaceName>
<ParentPlaceLnk Key='wClaverleyVillage'/>
</Place>
This subset of the available places was selected to show the depth of the place-hierarchy and to demonstrate one with name variants. A typical reference to a couple of these would be:
<PersonRef>John Crowther</PersonRef> called the <PlaceRef Key='wKingsArms'>Kings Arms</PlaceRef> situate in <PlaceRef Key='wClaverleyVillage'>Claverley</PlaceRef><br/>
within this manor on…
Note that the original place names, as they were written, are preserved in this scheme. The extra Key attribute is effectively a conclusion that associates those units of evidence with real places.
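To illustrate what the Key attribute makes possible, here is a minimal Python sketch that follows the ParentPlaceLnk links upward from a referenced place to build its full hierarchy. The `<Places>` wrapper element and the function name `place_chain` are my own assumptions for the example; the Place entities are abbreviated copies of those listed above.

```python
import xml.etree.ElementTree as ET

# Abbreviated Place entities; the <Places> wrapper is a hypothetical container.
PLACES = """
<Places>
  <Place Key='wShrops'>
    <Title> Shropshire </Title>
    <ParentPlaceLnk Key='wEngland'/>
  </Place>
  <Place Key='wClaverley'>
    <Title> Claverley Parish </Title>
    <ParentPlaceLnk Key='wShrops'/>
  </Place>
  <Place Key='wClaverleyVillage'>
    <Title> Claverley Village </Title>
    <ParentPlaceLnk Key='wClaverley'/>
  </Place>
  <Place Key='wKingsArms'>
    <Title> Kings Arms Public House </Title>
    <ParentPlaceLnk Key='wClaverleyVillage'/>
  </Place>
</Places>
"""


def place_chain(xml_text, key):
    """Follow ParentPlaceLnk links upward, returning titles from child to ancestor."""
    root = ET.fromstring(xml_text)
    places = {p.get("Key"): p for p in root.iter("Place")}
    chain = []
    seen = set()  # guards against accidental cycles in the links
    while key in places and key not in seen:
        seen.add(key)
        place = places[key]
        chain.append(place.findtext("Title").strip())
        parent = place.find("ParentPlaceLnk")
        key = parent.get("Key") if parent is not None else None
    return chain


chain = place_chain(PLACES, "wKingsArms")
print(" < ".join(chain))
```

The walk stops at Shropshire here only because no entity for wEngland is defined in this small sample.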
It’s worth pointing out, at this stage, that there are two broad approaches to creating a proof argument: top-down and bottom-up. The top-down approach is intuitively what most people would think of: you write your main conclusions and cite the appropriate bits of evidence along the way. STEMMA builds the other way, so that pieces of evidence (say, in a transcript) can be highlighted using a NoteRef element and connected to a piece of text proposing a rationale or explanation. Those pieces of text can, in turn, be linked to other items of text to create a hierarchy of conclusions. However, this is a structural issue, and a good tool can present the process in either direction.
In summary, I’ve tried to give a relatively clear picture of how STEMMA would be applied to this transcription, as opposed to simply producing one of potentially many end results to pick through. The code you see in this blog was largely created by hand and so I apologise for any coding errors.
** Post updated on 19 Apr 2017 to align with the changes in STEMMA V4.1 **
[1] A clear discussion on the difference between diplomatic transcription and typographic facsimile may be found in: Mary-Jo Kline and Susan Holbrook Perdue, Guide to Documentary Editing, 3rd edition, chapter 5, section IV (http://gde.upress.virginia.edu/05-gde.html#h2.4 : accessed 16 Oct 2013).