You may have heard the term Persona (pl. Personae) being
used in a genealogical context, especially by people with a software
background. What is a persona, though? Do you ignore it as a software
aberration? Does it have any value?
A persona is the term used to describe the reference to some
person from one specific source. There are no conclusions in a persona — only information
— and so it is sometimes inaccurately referred to as an “evidence person” in
order to distinguish it from the traditional “conclusion person” that we might
have in a family tree. Hence, a persona is not equated with any actual person.
The use of the term evidence is
inaccurate here as evidence is an intangible mental construct, unlike source
information. It is what we think certain source information means.[1]
In principle — I know we all work in different ways — it is
possible to take a number of similar personae and group them together in order
to form one or more conclusion persons that we can identify with actual people;
more on this process in a moment, though. Some advocates describe a multi-tier
process where this grouping occurs at different levels. For instance, personae
that are obviously similar are grouped first, and those groups are then
tentatively grouped themselves based on less obvious criteria. It’s interesting
to note that these persona groups are not personae themselves since they are
the result of some inference and conclusion, and ideally need some
justification.
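The multi-tier grouping described above can be sketched in a few lines of Python. This is only an illustration, not part of any data model discussed here: the `Persona` and `PersonaGroup` classes and the `group_by_exact_name` function are hypothetical names, and the census years are assumed from the source keys used later in this post. The point it tries to show is that a group carries a justification, whereas a persona carries only source information.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Persona:
    """One reference to a person in one source: information only, no conclusions."""
    source: str
    name: str
    age: int
    year: int  # year of the source (assumed for this sketch)


@dataclass
class PersonaGroup:
    """A grouping of personae, or of other groups, with the reason for grouping."""
    members: list
    justification: str

    def personae(self):
        """Flatten nested groups back to the underlying personae."""
        for member in self.members:
            if isinstance(member, PersonaGroup):
                yield from member.personae()
            else:
                yield member


def group_by_exact_name(personae):
    """Tier 1: group personae whose recorded names match exactly."""
    by_name = {}
    for p in personae:
        by_name.setdefault(p.name, []).append(p)
    return [PersonaGroup(ps, f"identical name '{name}'") for name, ps in by_name.items()]


personae = [
    Persona("sCensusElliott1851", "William Elliott", 10, 1851),
    Persona("sCensusElliott1861", "William Elliott", 20, 1861),
    Persona("sMarriageElliott1862", "William Elliott", 21, 1862),
    Persona("sEveningPost", "Wm. Elliott", 29, 1870),
]

# Tier 1: the obvious matches (same recorded name).
tier1 = group_by_exact_name(personae)

# Tier 2: a tentative grouping of the tier-1 groups, made by the researcher
# on less obvious criteria and recorded with its justification.
tier2 = PersonaGroup(tier1, "tentative: 'Wm.' read as 'William'; implied birth years agree")

for p in tier2.personae():
    print(p.source, p.name, "implied birth year:", p.year - p.age)
```

Note that the tier-2 step is deliberately manual here: the code records the researcher's decision and its reasoning rather than making the decision itself.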
This brief outline embodies the generally accepted nature of
a persona. However, things start to go awry from here and opinions begin to
differ. The persona is a much-debated concept in the Evidence & Conclusion
model of genealogy, and many threads on the subject can be found on the BetterGEDCOM wiki, such as
“Do we need persona?”
As a representation of the reference to a person from a single
source, there cannot be much debate over the concept. However, in practice,
that information is usually distilled down to a number of named properties, as in the illustration
above. I’m using the STEMMA®
terminology of properties here rather than “facts” or “PFACTs”, etc. STEMMA
defines a Property
as extracted and summarised source information,
and acknowledges that they require the same support for uncertain characters,
uncertain interpretations, and other anomalies as do transcriptions. Properties
are valuable as a window onto the supporting information but they do not
replace the raw information since they are only a digested form of it. To do
that would lose the contextual parts of the information such as: what the event
was, who else was there, what parts they played, and how reliable the source itself
is.
The persona concept itself can be traced to a 1959 paper
entitled “Automatic Linkage of Vital Records”[2].
Indeed, there are still those who believe that one of the primary uses of
personae is in their automated combination by software. This might yield a
first-pass result when many records are involved but I would be very concerned
about accepting that result without putting in the real analysis expected of
genealogical research. However, this is straying into the field of how software
might utilise personae rather than their expressive power.
The origin of the term itself is uncertain but at the
meeting that kicked off the GenTech
model, in 1994, Tom Wetmore gave a talk entitled "Structured Flexibility
in Genealogical Data" in which he stressed the need to record evidence
data, and where he used the term persona
in that context. The concept of persona exists in several data models,
including GenTech and more
recently GEDCOM-X.
So, is there any merit in representing personae in our data?
STEMMA records Property values for a person reference, such as their name, age,
and occupation, but it also wants to retain the source context of that information
— the where-and-when. It does this by subdividing its Event entities into a
number of SourceLnk elements, each of which is supported by a distinct source.
Those SourceLnk elements may contain multiple PersonLnk elements corresponding
to person references in that source and these are, therefore, similar to
personae.
<Person Key='pWilliamElliott'>
  <Eventlet>
    <!-- Private event (no other persons involved) -->
    <When Value='1870-11-17'/>
    <SourceLnk Key='sEveningPost'>
      <PersonLnk>
        <Property Name='Name'> Wm. Elliott </Property>
        <Property Name='Age'> 29 </Property>
      </PersonLnk>
    </SourceLnk>
  </Eventlet>
</Person>
<!-- Multi-person events -->
<Event Key='eCensusElliott1851'>
  <SourceLnk Key='sCensusElliott1851'>
    <PersonLnk Key='pWilliamElliott'>
      <Property Name='Name'> William Elliott </Property>
      <Property Name='Age'> 10 </Property>
      <Property Name='Occupation'> Scholar </Property>
      <Property Name='BirthPlace' Key='wUttoxeter'>
        Staffordshire Uttoxeter </Property>
      <Property Name='Relationship'
                Key='pTimothyElliott'> Son </Property>
      <Property Name='Status'/>
    </PersonLnk>
  </SourceLnk>
</Event>
<Event Key='eCensusElliott1861'>
  <SourceLnk Key='sCensusElliott1861'>
    <PersonLnk Key='pWilliamElliott'>
      <Property Name='Name'> William Elliott </Property>
      <Property Name='Age'> 20 </Property>
      <Property Name='Occupation'> Labourer </Property>
      <Property Name='BirthPlace' Key='wUttoxeter'>
        Staffordshire Uttoxeter </Property>
      <Property Name='Relationship'
                Key='pTimothyElliott'> Son </Property>
      <Property Name='Status'> Unmarried </Property>
    </PersonLnk>
  </SourceLnk>
</Event>
<Event Key='eMarriageElliott1862'>
  <SourceLnk Key='sMarriageElliott1862'>
    <PersonLnk Key='pWilliamElliott'>
      <Property Name='Name'> William Elliott </Property>
      <Property Name='Age'> 21 </Property>
      <Property Name='Occupation'> Hammersman </Property>
      <Property Name='ResidencePlace'
                Key='wVictoriaStreet'> Victoria Street Derby </Property>
      <Property Name='Role'> Groom </Property>
      <Property Name='Status'> Unmarried </Property>
    </PersonLnk>
  </SourceLnk>
</Event>
The PersonLnk elements representing the subject references
are assembled from the discrete Property values derived from the supporting
source. When the Properties describe relationships for the subjects then they
can also be represented, and may be inter-person relationships (such as wife-of
or wife-of-brother-of), membership of some group, or ones relative to
referenced places. Putting this information into Events allows the information
to be presented by time (i.e. a timeline), or geography, or both. The Property
values for the Event itself, such as the dates or place, may also be specified
in the SourceLnk element as Event properties.
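As a rough illustration of presenting this information by time, the following Python sketch walks a cut-down version of the sample data and lists one Property for one person in date order. Several things here are assumptions made purely for the sketch: the `<Dataset>` wrapper element, the year-only `<When>` values placed on each Event (inferred from the Event keys), and the `timeline` helper itself. It is not a description of how any STEMMA software works.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical data: the <Dataset> wrapper and the <When> values
# on these Events are assumptions added for this sketch.
DATA = """
<Dataset>
  <Event Key='eMarriageElliott1862'>
    <When Value='1862'/>
    <SourceLnk Key='sMarriageElliott1862'>
      <PersonLnk Key='pWilliamElliott'>
        <Property Name='Occupation'> Hammersman </Property>
      </PersonLnk>
    </SourceLnk>
  </Event>
  <Event Key='eCensusElliott1851'>
    <When Value='1851'/>
    <SourceLnk Key='sCensusElliott1851'>
      <PersonLnk Key='pWilliamElliott'>
        <Property Name='Occupation'> Scholar </Property>
        <Property Name='Status'/>
      </PersonLnk>
    </SourceLnk>
  </Event>
  <Event Key='eCensusElliott1861'>
    <When Value='1861'/>
    <SourceLnk Key='sCensusElliott1861'>
      <PersonLnk Key='pWilliamElliott'>
        <Property Name='Occupation'> Labourer </Property>
      </PersonLnk>
    </SourceLnk>
  </Event>
</Dataset>
"""


def timeline(root, person_key, property_name):
    """Return (date, event key, value) rows for one person, sorted by date.

    Dates are compared as strings, which is adequate for the ISO-style
    values used in this sketch.
    """
    rows = []
    for event in root.iter("Event"):
        when = event.find("When")
        if when is None:
            continue  # undated events cannot be placed on a timeline
        for link in event.iter("PersonLnk"):
            if link.get("Key") != person_key:
                continue
            for prop in link.findall("Property"):
                # Skip empty Properties such as <Property Name='Status'/>.
                if prop.get("Name") == property_name and prop.text:
                    rows.append((when.get("Value"), event.get("Key"), prop.text.strip()))
    return sorted(rows)


root = ET.fromstring(DATA)
for date, event_key, value in timeline(root, "pWilliamElliott", "Occupation"):
    print(date, event_key, value)
```

Because each row keeps its Event key, the digested Property value can always be traced back to the Event and SourceLnk that supplied it.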
OK, so why don’t I describe these sets of Property values as
personae and use them as such? For a start, the interpretation and
summarisation of these items constitute a level of inference, and so they are
one level removed from the persona concept. STEMMA also generalised the concept
so that there are equivalents for all of its subject references, including
places, groups, and animals. Furthermore, as of STEMMA V4.0, there is a much
closer concept that has true value for research and analysis purposes. Its Source
entity allows references to subjects (such as persons), and to dates and
other important details or phrases, to be marked, collected, and built into a
network for a graphic analyser. This allows those references to be analysed in
terms of other context from the source information, and for similar references
— in either a single source or across multiple sources — to be assembled into
multi-tier persona-like entities.
In summary, I believe the concept of personae has merit in
micro-history data, but without the contextual information that surrounded
those references in their respective sources they cannot be used for
research purposes. Similarly, STEMMA’s sets of Property values are merely an extracted
and summarised form of information from a source and are not designed for deep analysis.
Conversely, its Source entity embraces references to more subjects than merely
persons, and to any information that the researcher feels will be important to
their historical analysis. This does not mandate a given research methodology
(avoiding such mandates is a basic premise of STEMMA), but it does provide support for a genuine
approach to handling complex evidence.
** Post updated on 22 Nov 2015 to align with the changes in
STEMMA V4.0 **
[1] “QuickLesson 13: Classes of Evidence—Direct, Indirect &
Negative”, Evidence Explained: Historical
Analysis, Citation & Source Usage (https://www.evidenceexplained.com/content/quicklesson-13-classes-evidence%E2%80%94direct-indirect-negative : accessed 10 Sep 2014).
[2] H. B. Newcombe, J. M. Kennedy, S. J. Axford, and A. P. James,
“Automatic Linkage of Vital Records”, Science,
Vol. 130, No. 3381 (16 Oct 1959): pp. 954–959.