Should our computer software support a single standard of
genealogical proof, and if so then should it be part of the data that we share?
The question of whether a genealogical data model should
support a single standard of proof came up recently on the RootsDev developer group, and the consensus was
‘No’. This may come as a shock to some people but bear with me while we examine
the context more deeply.
The Genealogical Proof Standard (GPS) is held
in high regard amongst genealogists, not just as a standard of professionalism
but also as evidence of their credibility in academic circles. It would
therefore seem reasonable to expect software to support it, and this has
certainly been recommended before[1].
The crux of the issue is not that software shouldn’t support it, but that it
should not involve a mandatory and standardised representation in a data model
designed for sharing.
Although I’m not aware of any alternatives to GPS in the
field of genealogy, other fields do have their own “standards” of proof. This
obviously includes mathematics, but also other investigative fields such as CSI,
archaeology, and heir hunting. Whether this is significant to us is debatable,
but what is clearly significant is that the application of the GPS is not
prescribed by the standard itself. What I mean by that is that the process by
which a user addresses the requirements of the GPS (its five elements), and how a product keeps track of their evidence and its applicability,
would be handled differently by different products. An entry-level product may
take a very literal approach which could result in it appearing more laborious,
but a sophisticated product could make it more palatable through better insight
and visualisation. This has to be the prerogative of the product designer, and
any attempt to impose a straitjacket through standardisation would be the
death of that standard.
Indeed, a product may not be concerned with the GPS at all
if the context is inappropriate, or if its commercial viability would be in doubt.
While we can recommend the GPS, and encourage it through training and
qualifications, a heavyweight product may not have the mass-market appeal to
make it viable.
I briefly covered this topic in some of my early thoughts on
standardisation, in “Musings on Standardisation”, but my lack of clarity there caused at least one
serious rift between a colleague and me. The difference, you see, is that those
thoughts, and those in the recent RootsDev thread, are talking about the data
model used to exchange data, and not about the software products that we use.
I’ve just explained why the designers and vendors of software must be given
free rein to innovate as they choose. The data that those products share may or
may not include the meta-data associated with a particular research process.
Now this is going to raise a whole bunch of questions and potential
responses so let me just recap on things. Our core data will involve both
evidence and conclusions, and so it must indicate how they relate to each other
— independently of the research process used by the creator. Any data used to
support a specific research process is technically meta-data because it’s an
aid to the research rather than a product of the research. It doesn’t matter
whether we consider our data to be lineage-linked, evidence-based, event-based,
or all of the above (as I do).
So what impact does this have on a standardised data model used
for exchange? In computer programming, there is a pragma concept which involves additional statements being
associated with the computer code. These statements are designed purely to
direct a software program how to process the computer code. They are not part
of the code’s language, or its grammar, and those pragma statements are
specific to different programs. This situation is actually quite similar to the
genealogical case since the core data is independent of such meta-data, being
quite valid without it, and that meta-data would be specific to a given
product. If the two contributions are distinct then we have the potential
benefits of both worlds. We can share our core data (including lineage, events,
places, sources, transcriptions, attachments, etc) with anyone, and optionally
share our additional research status with someone using the same product.
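To make the separation concrete, here is a minimal sketch in Python. It is only an illustration of the idea, not a proposed data model: the `SharedRecord` class, the `read_record` function, the product names, and the sample values are all hypothetical. The core data is readable by anyone, while the research meta-data is keyed by product and is simply ignored by a product that does not recognise it.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class SharedRecord:
    """One exchanged item: core data plus optional product-specific extras."""

    # Core data (lineage, events, places, sources, ...) that any product can read.
    core: dict[str, Any]
    # Research meta-data, keyed by a (hypothetical) product identifier,
    # much as a pragma is specific to one program.
    product_metadata: dict[str, dict[str, Any]] = field(default_factory=dict)


def read_record(record: SharedRecord, product_id: str) -> tuple[dict[str, Any], dict[str, Any]]:
    """Return the core data plus only the meta-data this product understands."""
    return record.core, record.product_metadata.get(product_id, {})


# Sample data, invented purely for illustration.
record = SharedRecord(
    core={"person": "John Smith", "event": "Birth"},
    product_metadata={
        "ProductA": {"gps_element": "exhaustive search", "status": "in progress"},
    },
)

# ProductA sees its own research status; ProductB sees only the core data.
core_a, meta_a = read_record(record, "ProductA")
core_b, meta_b = read_record(record, "ProductB")
```

Both products receive identical core data, and the record remains valid for ProductB even though it cannot interpret the other product's research status.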
So is this possible? In fact, it is quite easy since the XML data syntax routinely uses XML namespaces to
distinguish different grammars in the same data contribution. Although I
tried to give an introduction to the benefits of this feature in Digital
Freedom, it’s still quite technical. You might imagine it as a
foreign-language annotation that someone has added to someone else’s text in
order to explain the context of selected words and phrases. You could share the
combined work with someone, or filter out the foreign annotation and leave the
original work. Although this namespace concept is mostly associated with XML,
it is actually something that could be employed with any data syntax, if
necessary, including a GEDCOM-like one.
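The filtering described above can be sketched with Python's standard `xml.etree.ElementTree` module. This is an illustration under stated assumptions: the two namespace URIs, the element and attribute names, and the `strip_namespace` helper are all invented for the example and do not come from any real genealogical standard. The core grammar lives in one namespace and the product-specific research annotations in another, so the annotations can be removed without touching the core data.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URIs, invented for this example.
CORE_NS = "http://example.org/core"
RESEARCH_NS = "http://example.org/research"

# A small document mixing core data with one product's research annotations.
DOC = f"""
<core:dataset xmlns:core="{CORE_NS}" xmlns:res="{RESEARCH_NS}">
  <core:person id="p1">
    <core:name>John Smith</core:name>
    <res:status element="exhaustive-search">in progress</res:status>
  </core:person>
  <core:event id="e1" res:confidence="probable">
    <core:type>Birth</core:type>
  </core:event>
</core:dataset>
"""


def strip_namespace(root: ET.Element, ns: str) -> ET.Element:
    """Remove every element and attribute belonging to namespace `ns`."""
    prefix = "{" + ns + "}"
    # Take a snapshot of the elements first so that removals do not
    # disturb the traversal.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag.startswith(prefix):
                parent.remove(child)
        for attr in [name for name in parent.attrib if name.startswith(prefix)]:
            del parent.attrib[attr]
    return root


root = ET.fromstring(DOC.strip())
strip_namespace(root, RESEARCH_NS)
print(ET.tostring(root, encoding="unicode"))
```

What remains after filtering is still a complete document in the core grammar, which is the point of the analogy: the annotation can travel with the data or be left behind.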
[1] Mark Tucker, “10 Things Genealogy Software Should Do”, Family History Technology Workshop
(Provo, UT: Brigham Young University, 2008), http://fhtw.byu.edu/static/conf/2008/tucker-10-fhtw2008.pdf (accessed 26 Dec 2015).