The subject of time-dependent attributes is not one that
I’ve seen discussed very often. I want to illustrate how easy they are to
handle when the underlying data model is event-orientated.
Most readers interested in this issue will interpret the subject
in the context of personal attributes: those relating to a
person. Although in STEMMA® it applies equally to the attributes of a place, or
of a group (including families, organisations, regiments), I will focus on personal
attributes in order to keep things relevant to family historians.
By time-dependent, I mean those attributes that may change
over time, and for which we are likely to see progressive variation across different
sources. Obvious examples would be height and weight, although such items are
rarely recorded in our data. A more familiar example might be a person’s
residential address. Time-dependent attributes contrast with those fixed ones,
usually attributable to our birth, such as our biological parentage, birth sex,
date of birth, and place of birth.
The only in-depth presentation on this subject that I’m
aware of is the paper submitted to FHISO by
Richard Smith: “Expressing time-dependent personal
attributes”. This paper comments that the GEDCOM way of handling them, by
attaching a DATE tag to an item, doesn’t cope with all possibilities, including
a personal name and someone’s effective sex. Unfortunately, because GEDCOM is
primarily a lineage-linked representation, the temporal nature of an
attribute has to be added as an afterthought. Its relationship to the person’s
other attributes, to the date or event it pertains to, to the attributes
of other people associated with the same event, and to the sources supporting that
event is uncoordinated at best.
The proposal in the aforementioned FHISO paper works for a representation
involving people and their lineage (e.g. GEDCOM), and for any representation
where people must have attributes directly associated with
them, but it would be unnecessary if there were a natural representation of time
and events. The core concept that is missing, and which would automatically
unify those uncoordinated items, is the Event entity: a representation of
a moment in time, or a span of time, for which source information exists.
Let’s pick a really obvious case to illustrate the
event-based approach. A person’s age is time-dependent, and we won’t see the
same value in all the sources we consult. We would never think of associating
all these variant ages directly with a Person entity, and the same approach
should apply to other time-dependent attributes. In the case of age, we usually calculate an associated
date-of-birth from the contextual date of the information and record that
instead. The ability to do this calculation is specific to this particular attribute,
and it could not be applied to, say, a residential address, occupation, or
military rank. The calculated date of birth is also a conclusion derived from
the information, not the information itself. We’ll continue
to take the simplistic view of the age
attribute, though, in order to explore the general case.
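As a rough sketch of that calculation, a stated age in whole years on a known date only pins the birth down to a one-year window. The function names below are my own for illustration, the example record is hypothetical, and the code assumes the age was reported accurately and that dates are in the Gregorian calendar.

```python
from datetime import date, timedelta


def _with_year(d: date, year: int, round_up: bool) -> date:
    """Move d to another year, resolving 29 February in a non-leap year."""
    try:
        return d.replace(year=year)
    except ValueError:
        # Only 29 February can fail; pick the neighbouring day requested.
        return date(year, 3, 1) if round_up else date(year, 2, 28)


def birth_date_range(age_years: int, on: date) -> tuple[date, date]:
    """Earliest and latest birth dates consistent with an age in whole years."""
    # Latest: the birthday fell on the contextual date itself.
    latest = _with_year(on, on.year - age_years, round_up=False)
    # Earliest: the next birthday falls on the following day.
    next_day = on + timedelta(days=1)
    earliest = _with_year(next_day, next_day.year - age_years - 1, round_up=True)
    return earliest, latest


# Hypothetical record: someone stated to be 34 on 3 April 1881.
earliest, latest = birth_date_range(34, date(1881, 4, 3))
print(earliest, latest)  # 1846-04-04 1847-04-03
```

The result is a range, not a single date, which is one reason the derived date of birth is better treated as a conclusion than as recorded information.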
Every source of information has a temporal context: a date,
or range of dates, to which it pertains. That source information therefore
supports the concept of an event, and can be represented by an equivalent Event
entity to which those sources are connected. If the source information mentions
specific persons (or places, or groups) then their associated entities can be
connected to that Event. Any attributes — or what STEMMA calls Properties
— given in the information are then associated with the corresponding
connections between the subject entities and the Event.
Using this approach, those attributes are now associated
with the Person (or other subject entity), the relevant date(s), the relevant
place, the relevant sources, and any other Persons (or subject entities)
mentioned in the same source information.
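To make the shape of this concrete, here is a minimal sketch in Python. It is not STEMMA syntax: the class names, field names, and sample data are all hypothetical, and dates are kept as plain ISO strings for simplicity. The point is only that a property lives on the connection between a Person and an Event, so the date, place, and sources come along with it.

```python
from dataclasses import dataclass, field


@dataclass
class Person:
    key: str
    name: str


@dataclass
class Source:
    key: str
    title: str


@dataclass
class Connection:
    """Link between a subject and an Event; the properties hang here."""
    subject: Person
    properties: dict[str, str] = field(default_factory=dict)


@dataclass
class Event:
    key: str
    when: str   # a date or start of a range, as an ISO string
    place: str
    sources: list[Source] = field(default_factory=list)
    connections: list[Connection] = field(default_factory=list)


def timeline(person: Person, events: list[Event], prop: str) -> list[tuple]:
    """Collect (when, value, place, source titles) for one property of one person."""
    rows = []
    for ev in events:
        for conn in ev.connections:
            if conn.subject is person and prop in conn.properties:
                titles = [s.title for s in ev.sources]
                rows.append((ev.when, conn.properties[prop], ev.place, titles))
    return sorted(rows, key=lambda row: row[0])


# Hypothetical data: one person appearing in two dated records.
ann = Person("p1", "Ann Example")
events = [
    Event("e2", "1861-04-07", "Derby", [Source("s2", "Census 1861")],
          [Connection(ann, {"Occupation": "Dressmaker", "Age": "29"})]),
    Event("e1", "1851-03-30", "Nottingham", [Source("s1", "Census 1851")],
          [Connection(ann, {"Occupation": "Lace hand", "Age": "20"})]),
]

for row in timeline(ann, events, "Occupation"):
    print(row)
```

Nothing is stored on the Person itself, so no date needs to be bolted onto an attribute after the fact: the timeline of any property is recovered by walking the Events the person is connected to.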
Someone’s recorded age should be monotonically increasing
with time but that’s not what we see. We will encounter values that do not
progress smoothly, and may even appear contradictory by remaining static or
running backwards. The importance of this is that what is written cannot be
treated as fact, and may need
some clarification, correction, or other annotation. The general Property
mechanism in STEMMA allows such annotation, as well as adding conclusions about
their interpretation such as the identification of a place. This is discussed
further in “Evidence and Where to Stick It”, and in “Is That a Fact?”
(which also mentions some coding examples).
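As an illustration of why the recorded values need annotating and cannot simply be accepted, the following sketch flags consecutive records where a stated age stands still or goes backwards. The function name and the sample figures are hypothetical, and this is only one simple check among the many a researcher might apply.

```python
def flag_age_anomalies(records: list[tuple[int, int]]) -> list[tuple]:
    """Given (year, recorded_age) pairs, flag static or backwards steps."""
    ordered = sorted(records)  # chronological order
    flags = []
    for (y1, a1), (y2, a2) in zip(ordered, ordered[1:]):
        if a2 < a1:
            kind = "backwards"
        elif a2 == a1 and y2 > y1:
            kind = "static"
        else:
            continue  # age increased, or same year and same age
        flags.append(((y1, a1), (y2, a2), kind))
    return flags


# Hypothetical ages for one person across four dated records.
records = [(1841, 30), (1851, 38), (1861, 38), (1871, 35)]
for earlier, later, kind in flag_age_anomalies(records):
    print(earlier, later, kind)
```

The flagged values are still what the sources say, so they stay in the data as recorded. Any correction or explanation belongs in an annotation alongside them.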
One last consideration: the aforementioned FHISO paper introduces a
complication in the form of the following example:
Consider a letter written in 1821 that says “In 1799
I was living in Shrewsbury where my father was a schoolmaster”.
This indirect documentation of attributes may be a
complication for GEDCOM but not for a representation that fully embraces
events. I have already defined an Event entity as a representation of a moment in time, or a span of time, for
which source information exists, and this particular source simply has
information supporting multiple events to different degrees. This is no
different, say, to a military enlistment record from 1899 that mentions a
marriage occurring in 1870. The source directly supports the enlistment event
but also supports the marriage event to a lesser extent. The same applies to a
census. The information, as given to an enumerator on census night, directly
supports the census event, but a recorded age doesn’t support a birth event to
the same degree. In other words, a source as a whole may contain information
supporting multiple events to different degrees.
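One way to picture this is as a set of links from a source to each event it supports, with each link carrying its own strength. The sketch below is an assumption on my part, not a STEMMA structure: the two-level DIRECT/INDIRECT grading, the class names, and the labels are all hypothetical, and the sample links simply restate the letter and enlistment examples above.

```python
from dataclasses import dataclass
from enum import IntEnum


class Support(IntEnum):
    """Hypothetical two-level grading of how strongly a source supports an event."""
    INDIRECT = 1
    DIRECT = 2


@dataclass(frozen=True)
class Citation:
    source: str
    event: str
    support: Support


CITATIONS = [
    Citation("Letter, 1821", "Residence in Shrewsbury, 1799", Support.INDIRECT),
    Citation("Letter, 1821", "Letter written, 1821", Support.DIRECT),
    Citation("Enlistment record, 1899", "Enlistment, 1899", Support.DIRECT),
    Citation("Enlistment record, 1899", "Marriage, 1870", Support.INDIRECT),
]


def events_supported_by(source: str, citations: list[Citation]) -> list[tuple[str, str]]:
    """List the events a source supports, strongest support first."""
    hits = [c for c in citations if c.source == source]
    hits.sort(key=lambda c: c.support, reverse=True)
    return [(c.event, c.support.name) for c in hits]


for event, strength in events_supported_by("Letter, 1821", CITATIONS):
    print(event, strength)
```

Because the strength sits on the link and not on the source, the same letter can be strong evidence for one event and weaker evidence for another without any special handling.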