My previous post, The
Future of Online Trees, prompted a flurry of reaction, most of which was
positive; however, it did suggest, implicitly, that many people find it hard to
think beyond their entrenched views, and that my explanation may have assumed
too much.
This follow-up collects some of the explanatory
comments that I've since posted around the Web, and tries to make a coherent argument for what genealogical research should
entail.
The previous article made several negative comments about
existing online trees, including:
- That assembling their associated conclusions directly from raw digitised information is not always easy, and gets very hard as you go further back in time (e.g. before census returns and civil registration).
- That it's hard to tell naive trees from properly researched ones, and that, no, a bunch of citations is not a useful indicator.
- That naive trees usually persist long after a creator may have abandoned them, and could steer new researchers down the wrong path.
- That proof arguments (i.e. reasoned explanation), as opposed to simple proof statements (i.e. citations), are almost never provided.
The primary issue behind my article was that many
genealogical conclusions require their research work to be written up in order
for them to be assessed, and for that research to be cited either by trees or
other research work. Proof statements alone are only applicable when the
sources offer direct answers and do not conflict with each other, but identity
problems and family reconstruction can require lengthy arguments that examine
multiple sources. The results will often address groups of correlated people
rather than just some specific person, and so it's not realistic to expect that
work to be tucked away in a single person entity (on a tree) or in a single
person page.
An associated issue is that the contributors to online trees
— and probably the users of genealogy software in general — routinely talk
about individual "claims", and the supporting sources for those
claims, as though they're all independent of each other. This is a fallacy! The idea that a specific claim can be justified in isolation, and
linked directly to one or more sources that give a direct answer, is a huge
oversimplification of the research process, and yet this is a mindset that is
hard to argue with.
One of the positive things I suggested in my previous article
(possibly the only one) was that there are researchers who do publish their work
online (e.g. in blogs), and that online trees could reference their research
as "authored works": a recognised source category that supplements
those of "original" and "derivative". There is no issue at
all with representing this in GEDCOM files — the data format most often used to
transfer data between two places — nor any significant issue with online
providers recognising such work as a specific source category.
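To make that concrete, here is a minimal sketch of how an authored work might be written out as a GEDCOM source record. It assumes GEDCOM 5.5.1-style `SOUR` records with the standard `AUTH`, `TITL`, and `PUBL` sub-tags; the `_CATG` tag for the source category is a hypothetical user-defined tag of my own (GEDCOM reserves the leading underscore for such extensions), and the `Source` class, the author name, and the URL are purely illustrative.

```python
from dataclasses import dataclass

# The three source categories mentioned in the text.
CATEGORIES = ("original", "derivative", "authored")


@dataclass
class Source:
    xref: str         # record identifier, e.g. "S1"
    title: str
    author: str
    publication: str  # where the work was published, e.g. a blog URL
    category: str     # one of CATEGORIES

    def to_gedcom(self) -> str:
        """Render this source as a GEDCOM-style SOUR record."""
        if self.category not in CATEGORIES:
            raise ValueError(f"unknown source category: {self.category!r}")
        lines = [
            f"0 @{self.xref}@ SOUR",
            f"1 AUTH {self.author}",
            f"1 TITL {self.title}",
            f"1 PUBL {self.publication}",
            # Hypothetical user-defined tag; not part of the GEDCOM standard.
            f"1 _CATG {self.category}",
        ]
        return "\n".join(lines)


if __name__ == "__main__":
    # Illustrative values only.
    blog_post = Source(
        xref="S1",
        title="An Identity Problem Worked Through",
        author="A. Researcher",
        publication="https://example.org/blog/identity-problem",
        category="authored",
    )
    print(blog_post.to_gedcom())
```

A person or family on a tree would then cite `@S1@` in the usual way, so the written-up argument sits outside the tree but remains attributable and reachable from it.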
Many traditional genealogists write up their work in
academic journals, but this is more about kudos than about helping a community of genealogists; few of us will be
subscribers to these journals. This is a shame because they cannot distance
themselves from online genealogy, nor ignore the associated problems:
we're all tarred with the same brush. If we describe our work as
"genealogy" then it will be linked automatically to the prevailing
impression of its most common form: online trees.
It may be hard to see what I'm getting at if you haven't
participated in research in other fields. All the fields that I am aware of,
such as science and medicine, rely on published works. This could be in
journals or online, but by far their biggest difference from what is currently
considered genealogical research is that newer works cite older works. The
consensus is then built up through layers of research, each of which may
support or refute previous work. There's a saying about standing on the
shoulders of giants, and it makes perfect sense: someone could have spent a
lifetime solving one particular deep mystery, and so to expect someone else
(beginner or otherwise) to find the same answer directly from raw online
information is unrealistic. I cannot think of any other area of research that
works as genealogy currently does, where conclusions are either copied
blindly from those of someone else or constructed independently from raw
information. This is a little like a surgeon creating an independent textbook themselves
by simply dissecting the evidence — a cadaver in this case. Knowledge and
progress come from sharing research, and by building on the research of others.
It's step-wise, progressive, and takes time. And without seeing any written
research, you cannot tell whether someone reached their conclusions in 30
minutes or 30 years.
So what size of work are we talking about? Is it just a
single paragraph? Well, it could be, or it could be a couple of thousand words,
as with several blog articles that I've encountered. I have two unpublished
works of 5000 words, myself, that I want to contribute to the community, and
for posterity, but also a work-in-progress that is already at 10,000 words —
such is the complexity.
A note on the use of wikis as a medium for collaborative
research is necessary because they were mentioned by a few people. It is true
that wikis can be, and are, used for such research purposes, but they have
significant weaknesses. They are often limited in the richness of their
presentation — usually amounting to more of a protracted discussion, as the old
BetterGEDCOM wiki demonstrated — but genealogical research requires support for
rich formatting, images, tables, and citations. Not all blogs offer this, but
there are usually ways of achieving it (see Summarised
Blogger Tips for instance). Wikis have little, if any, editorial control,
and no attribution support beyond their confines. They also constitute a
confined medium, forcing people to contribute outside any personal medium or
prior work, which would put too many people off. By contrast, blogs are not confined,
they may be linked or associated with other work by the author, and their articles
have immediate attribution. Wikipedia was also mentioned as an example of
successful collaboration, but it has strict rules that prevent original work or
theories being presented. It relies on secondary sources, and so implicitly
collects information that is already in the mainstream. This certainly doesn't
prevent edit wars but it does set Wikipedia apart from collaborative wiki-based genealogy.
So, my suggestion is to separate attributable research work
from tree-based conclusions, and to cite such work for the harder cases rather
than just some raw information. This suggestion is not rocket science, so why aren't we doing it?