If you think this is about bequests, wills, estate planning,
or probate then you’d be wrong. I’m afraid this is about software inheritance and how it simplifies the creation of one genealogical
entity (e.g. a Citation or an Event) from a similar one. Some amount of code is
inevitable as this is really intended for a software-orientated audience, but I
will try and explain what is happening and what the advantages are.
Anyone with knowledge of Object Orientated Programming (OOP)
will already be familiar with software inheritance.
A programming concept called a ‘class’ is used to describe some real-world entity
(e.g. an employee), including the data associated with it (e.g. name, salary) and
the operations that can be performed on it (e.g. promotion). A ‘derived class’
can then be created from such a generic ‘base class’ in order to describe a
more-specialised entity (e.g. a salesperson, or an engineer). In this small illustration,
that would allow all the common aspects of an employee to be programmed once
and automatically shared by all the various employee types; the derived classes
embracing any extra data or operations associated with specific cases.
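For readers who prefer to see it written down, the employee illustration might be sketched in Python as follows. The class names, the 10% pay rise, and the commission field are all invented for the purpose of the illustration.

```python
class Employee:
    """Generic base class: data and operations common to every employee."""

    def __init__(self, name, salary):
        self.name = name
        self.salary = salary

    def promote(self, rise=0.10):
        # Hypothetical rule: a promotion raises the salary by a fixed fraction.
        self.salary = round(self.salary * (1 + rise), 2)


class Salesperson(Employee):
    """Derived class: inherits name, salary, and promote(); adds commission."""

    def __init__(self, name, salary, commission_rate):
        super().__init__(name, salary)
        self.commission_rate = commission_rate

    def commission(self, sales):
        return sales * self.commission_rate


rep = Salesperson("Ann", 20000, 0.05)
rep.promote()  # the operation was programmed once, in the base class
```

The derived class only states what is different about a salesperson; everything else comes from the base class.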
STEMMA software, for instance, has a base class representing a generic
subject entity corresponding to some subject mentioned in historical sources,
such as a person. That encapsulates all the common aspects such as name
handling (see Game of the Name) and their relationship to events and sources
(see Time-Dependent Attributes). STEMMA also has derived classes that extend that base class in
order to represent specific subject entities, such as a Person, Animal, Place, or
Group; each of which has some slightly different requirements, including the
representation of their respective hierarchies.
What I want to present in this article, though, is the
inheritance mechanism provided in the STEMMA data model itself rather than in the
associated software. This came about because many of my data files were created
by hand in the early days, and I wanted a means to avoid duplication and to enable
the re-use of entities. Little did I know how much I would come to rely on this
feature.
This inheritance mechanism is applicable to each of the
entity types: Event, Citation, and Resource. However, there is an additional
parameterisation mechanism applicable to the latter two that works in
conjunction with inheritance.
Inheritance
Let me pick a very simple example to kick this off. Say
we’re about to create an Event entity for an English household in the 1901
census. We’ll need the census date for this — which many of us would have to
look up — but we’ll very likely have further households to document from that
same census. Wouldn’t it be nice to enter the date and description just
once? The code below creates a base Event entity representing the day of that
census. This merely contains the event type and sub-type, and the specific
date. The ‘Abstract’ attribute imposes certain restrictions to ensure that it
constitutes a sound basis for inheritance. A second Event then inherits the
details in order to describe the census event in a particular household.
<Event Key=’eCensus1901’ Abstract=’1’>
<Type> Survey </Type><SubType> Census </SubType>
<When Value=’1901-03-31’/>
</Event>
<Event Key=’eCensus1901ManningGrove’>
<BaseEventLnk Key=’eCensus1901’/>
<PlaceLnk Key=’wManningGrove’/>
</Event>
Now, you may be thinking that a good software product would
know about the various census events and enter the date, place, or other
details, for you. That’s true but the product can never know all of the events
in your ancestors’ lives, and the more micro-historical your focus then the
more esoteric your required event types will be. What I was doing by hand could
be implemented inside a product as a custom-Event builder, but the bigger
difference is that this dependency wasn’t simply an aid to data entry; the
dependency was modelled in the data file, and any change to the base entity (such
as adding narrative) would be reflected in all dependent entities.
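To make that distinction concrete, here is a minimal Python sketch, not STEMMA's actual implementation, in which entities are held as plain dictionaries and the inherited fields are resolved whenever a derived entity is read. The dictionary layout, the `Base` field, and the `resolve` helper are assumptions made for the illustration; only the keys and values come from the census example above.

```python
# Entities keyed as in the census example; the dictionary layout is hypothetical.
entities = {
    "eCensus1901": {
        "Abstract": True,
        "Type": "Survey",
        "SubType": "Census",
        "When": "1901-03-31",
    },
    "eCensus1901ManningGrove": {
        "Base": "eCensus1901",
        "PlaceLnk": "wManningGrove",
    },
}


def resolve(key):
    """Return the effective fields of an entity: base fields, then its own."""
    entity = entities[key]
    base_key = entity.get("Base")
    inherited = dict(entities[base_key]) if base_key else {}
    inherited.pop("Abstract", None)  # assumed: abstractness is not inherited
    own = {k: v for k, v in entity.items() if k != "Base"}
    return {**inherited, **own}


household = resolve("eCensus1901ManningGrove")

# A later change to the base entity shows up in every dependent entity,
# because nothing was copied at data-entry time.
entities["eCensus1901"]["Text"] = "Narrative added later."
household_after = resolve("eCensus1901ManningGrove")
```

Had the date and types simply been copied into the household Event when it was typed in, the later narrative would have needed adding to each household by hand.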
A previous post, Rock Family Trees, showed an example that built up a custom Event entity to use
as a base for representing musical events. This effectively encapsulated the
use of custom types to describe musical events and, more specifically, changes
in band membership.
<Dataset Name=’RockFamilyTrees’
xmlns:et=’http://familyofrock.com/event-type’
xmlns:est=’http://familyofrock.com/event-subtype’>
<Event Key=’eMusicalBand’ Abstract=’1’>
<Type> et:Musical </Type>
<SubType> est:BandMembership</SubType>
</Event>
<Event Key=’eDannyJoined’>
<PlaceLnk Key=’wBrixton’/>
<When Value=’1968-08’/>
<BaseEventLnk Key=’eMusicalBand’/>
</Event>
</Dataset>
This same mechanism may be used for Resource entities describing
data files, physical artefacts, or both. For instance, the following base entity
might describe a collection of original photographs that also happens to have
been digitised.
<Resource Key=’rElizPhotos’ Abstract=’1’>
<Title> Elizabeth’s Photographic Collection </Title>
<URL ContentType=’image/jpeg’> file:Eliz-Photos/*.jpg </URL>
<Type Artefact=’1’> Photograph </Type>
<DataControl>
<Permission> Elizabeth gave permission to
share with family in 2008 </Permission>
</DataControl>
<Text>
Collection received from Elizabeth Smith on
<DateRef Value=’2008-06-09’/>
</Text>
</Resource>
A simple entity representing one specific digitised
photograph from the collection might appear as:
<Resource Key=’rPhotoSmithFamily’>
<BaseResourceLnk Key=’rElizPhotos’/>
<URL> file:Eliz-Photos/SmithFamily1952.jpg </URL>
</Resource>
This inherits quite a bit from the base entity, including a
permissions notice that software would display when any type of sharing is
attempted. Note that if that notice were modified in any way then it would
automatically affect all the derived entities that depend on it.
However, the following section will indicate how this example
can be improved upon.
Parameterisation
Whereas a Resource entity uniquely identifies a data file
through its URL string, a Citation entity requires both a URI string and a set
of parameter values to uniquely identify an information source.
A Citation entity uses parameters to represent individual
citation-elements, as described in Cite Seeing, and the following example uses
them to describe a published book:
<Citation Key=’cOldNottm’ Abstract=’1’>
<Title>Old Nottingham Notes</Title>
<URI> http://stemma.parallaxview.co/source-type/book/ </URI>
<Params>
<Param Name=’Author’>James Granger</Param>
<Param Name=’Title’>OLD NOTTINGHAM : Its Streets, People, etc</Param>
<Param Name=’Publisher’>Nottingham Daily Express Office</Param>
<Param Name=’Date’>1902</Param>
<Param Name=’Page’ ItemList=’1’/>
</Params>
</Citation>
The URI implies a given set of named and typed parameters
that are relevant to this source type. This base Citation provides parameter
information about the book as a whole, but not the specific page(s) — note that
selected parameters, such as this one, may specify a series of values. That
page information might be provided in a new Citation entity that inherits from
the base one as follows:
<Citation Key=’cHandleysHospital’>
<BaseCitationLnk Key=’cOldNottm’/>
<Params>
<Param Name=’Page’>94</Param>
</Params>
</Citation>
Alternatively, the page information could be provided when
the base entity is referenced; say in some narrative. This effectively creates
an unnamed, transient Citation entity through inheritance:
<CitationRef Key=’cOldNottm’>
<Param Name=’Page’>94</Param>
</CitationRef>
Parameterisation is available in both Citation and Resource
entities, and the values may be inherited from a base entity, declared
explicitly in the body of an entity, or applied to a link from one entity
instance to another. All of these schemes can be used together.
The parameters may also be substituted into selected items by
using ${param-name} markers. For Citation entities, this is available in the
citation-title, the format-string, the values of parameters themselves (e.g.
within a Params element), and narrative elements. For Resource entities, it is
available in the resource-title, URL, parameter values, and narrative elements.
The next example shows a simple parameterised Resource for
accessing individual photographs from a given folder. The base Resource defines
the names and types of the parameters, and derived entities or entity
references can specify the corresponding parameter values.
<Resource Key=’rPhotos’ Abstract=’1’>
<Title>Family photograph:${PhotoName}</Title>
<URL ContentType=’image/jpeg’>file:myphotos/family/${PhotoName}.jpg</URL>
<Params>
<Param Name=’PhotoName’ Type=’Text’/>
</Params>
</Resource>
<ResourceLnk Key=’rPhotos’>
<Param Name=’PhotoName’>Tony</Param>
</ResourceLnk>
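The substitution itself can be sketched in a few lines of Python. This is an illustration rather than STEMMA's own code: it leans on the standard library's `string.Template`, whose `${name}` placeholders happen to match the marker syntax, and the `substitute_markers` helper name is invented.

```python
from string import Template


def substitute_markers(text, params):
    """Replace each ${name} marker in text with its parameter value."""
    # substitute() raises KeyError if a marker has no corresponding value.
    return Template(text).substitute(params)


params = {"PhotoName": "Tony"}  # value supplied on the ResourceLnk
title = substitute_markers("Family photograph:${PhotoName}", params)
url = substitute_markers("file:myphotos/family/${PhotoName}.jpg", params)
print(title)
print(url)
```

Note that `string.Template` only recognises names made of ASCII letters, digits, and underscores, so a real implementation might well need its own marker parser.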
This last, more-involved example will illustrate how the
inheritance and parameterisation mechanisms can be used in conjunction with
both Citation and Resource entities in order to handle online images. It uses a
shorthand source citation for a general census page of England & Wales for
a particular year, e.g. [RG13/3178/51/12].
While not recommended, this catalogue-reference example makes an illustration
easier to read.
<Resource Key=’rCensusImage’ Abstract=’1’>
<Title>1851-1901 Census Images of England and Wales</Title>
<URL>http://www.census.com/image?series=${Series}&amp;piece=${Piece}&amp;folio=${Folio}&amp;page=${Page}</URL>
<Params>
<Param Name=’Series’ Type=’Text’/>
<Param Name=’Piece’ Type=’Integer’/>
<Param Name=’Folio’ Type=’Integer’/>
<Param Name=’Page’ Type=’Integer’/>
</Params>
</Resource>
<Citation Key=’cCensus1901’ Abstract=’1’>
<Title> 1901 Census of England and Wales </Title>
<DisplayFormat Mode=’RefNote’>
<Text Language=’eng’>
[<Subs><i>${Series}/${Piece}/${Folio}/${Page}</i></Subs>]
</Text>
</DisplayFormat>
<URI> http://stemma.parallaxview.co/source-type/census-eng-wales </URI>
<Params>
<Param Name=’Series’>RG13</Param>
<Param Name=’Piece’ Type=’Integer’/>
<Param Name=’Folio’ Type=’Integer’/>
<Param Name=’Page’ Type=’Integer’/>
</Params>
</Citation>
<Source Key=’sCensus1901ManningGrove’>
<Title> 1901 Census for Manning Grove</Title>
<Frame>
<CitationLnk Key=’cCensus1901’>
<Param Name=’Piece’>3178</Param>
<Param Name=’Folio’>51</Param>
<Param Name=’Page’>12</Param>
</CitationLnk>
<ResourceLnk Key=’rCensusImage’>
<Param Name=’Series’>RG13</Param>
<Param Name=’Piece’>3178</Param>
<Param Name=’Folio’>51</Param>
<Param Name=’Page’>12</Param>
</ResourceLnk>
</Frame>
</Source>
Now there’s a lot going on here. The Source entity ’sCensus1901ManningGrove’
represents a specific page in the 1901 census of England & Wales. It
nominates a specific Resource for the census image and an associated Citation,
which inherit a number of items from the base Resource entity (’rCensusImage’)
and base Citation entity (’cCensus1901’) respectively. For the Citation, this
includes the source-type URI, format-string, and parameter names & types. For
the Resource entity, it includes a URL for accessing the associated page images
via a hypothetical Web site.
An important point regarding the application of parameter
substitution is that it always occurs after the inheritance process has
completed. Hence, the following distinct stages may occur:
1. Inheritance of fields from the base entity.
2. Overriding (in memory) with explicit fields from the derived entity.
3. Creation of a transient unnamed entity from the parameter settings in a *Lnk/*Ref element.
4. Substitution of current parameter values, in the source-order of their substitution markers.
Hence, in the last example, stage two hasn’t been employed,
so the CitationLnk element specifies parameter values to create an unnamed
Citation entity in memory, directly from the base entity.
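The four stages can be sketched in Python as below. This is only a sketch under stated assumptions: the entities are reduced to dictionaries holding just the fields needed, the `instantiate` helper and the `Format` field name are invented, and substitution within parameter values themselves is ignored. The marker replacement again borrows the standard `string.Template` class.

```python
from string import Template

# Base entities from the census example, reduced to the fields used here.
# The dictionary layout is hypothetical.
base_citation = {
    "Format": "[${Series}/${Piece}/${Folio}/${Page}]",
    "Params": {"Series": "RG13", "Piece": None, "Folio": None, "Page": None},
}
base_resource = {
    "URL": "http://www.census.com/image?series=${Series}&piece=${Piece}"
           "&folio=${Folio}&page=${Page}",
    "Params": {"Series": None, "Piece": None, "Folio": None, "Page": None},
}


def instantiate(base, derived=None, link_params=None):
    """Apply the four stages and return the fully substituted fields."""
    # Stage 1: inherit fields from the base entity.
    entity = {**base, "Params": dict(base["Params"])}
    # Stage 2: override with explicit fields from a derived entity, if any.
    if derived:
        entity["Params"].update(derived.get("Params", {}))
        entity.update({k: v for k, v in derived.items() if k != "Params"})
    # Stage 3: transient unnamed entity from the *Lnk/*Ref parameter settings.
    entity["Params"].update(link_params or {})
    # Stage 4: substitute the current parameter values into the markers.
    values = {k: v for k, v in entity["Params"].items() if v is not None}
    return {
        k: Template(v).substitute(values) if isinstance(v, str) else v
        for k, v in entity.items()
    }


page = {"Piece": "3178", "Folio": "51", "Page": "12"}
citation = instantiate(base_citation, link_params=page)
resource = instantiate(base_resource, link_params={"Series": "RG13", **page})
```

Because substitution is left until last, the same base entity can serve any number of links, each supplying its own parameter values.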
Conclusion
Most of the STEMMA entity types have their own concept of a structural hierarchy, e.g. lineage for
Persons and Animals, geographical/administrative hierarchies for Places and
Groups, source provenance for Citations, and hierarchical Events. An inheritance hierarchy, though, is
fundamentally different in that it allows sharing of data between related
entities. As stated above, this is more than just a mechanism of convenience
for automatically adding required data to a new entity; the dependency is
represented in the data and so any change to the base entity will automatically
affect all the derived entities.
Although a derivation can only be made from an abstract entity, the
mechanism can be multi-level, i.e. new abstract entities can themselves be
derived from prior ones. This can be used, for instance,
to add parameters for the citation of a specialised source type based on a more
generic one. Some real examples may be found in
JessonLesson.xml.
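As a rough illustration of multi-level derivation, the following Python sketch collects parameters down a three-level chain and refuses to derive from a non-abstract entity. The entity keys, the dictionary layout, and the `resolve_params` helper are all invented for the illustration and are not taken from STEMMA or from that file.

```python
# A hypothetical three-level chain; the keys and dictionary layout are invented.
entities = {
    "cPublication": {"Abstract": True, "Params": {"Title": None, "Date": None}},
    "cBook": {
        "Abstract": True,
        "Base": "cPublication",
        "Params": {"Author": None, "Page": None},
    },
    "cBookPage": {
        "Base": "cBook",
        "Params": {"Title": "An Example Book", "Page": "94"},
    },
}


def resolve_params(key):
    """Collect parameters down the inheritance chain, nearest entity winning."""
    entity = entities[key]
    params = {}
    base_key = entity.get("Base")
    if base_key is not None:
        if not entities[base_key].get("Abstract"):
            raise ValueError(f"{base_key} is not abstract, so cannot be a base")
        params = resolve_params(base_key)  # recurse up through each level
    params.update(entity.get("Params", {}))
    return params


page_params = resolve_params("cBookPage")
```

Here the generic publication contributes the Title and Date parameters, the more specialised book adds Author and Page, and the final entity supplies values for two of them.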
** Post updated on 19 Apr 2017 to align with the changes in
STEMMA V4.1 **