Friday, 28 March 2014

What to Share, and How

The subject of public family trees frequently does the rounds. In this post, I want to examine what people would like to share — software permitting — and the functional requirements of that sharing.

At the fore of the current posts on this subject is probably James Tanner’s Genealogy’s Star blog. Although there are too many posts there for me to cite separately, they consider the differences between having public trees versus private trees, using online trees versus local genealogy programs[1], and unified online trees versus user-owned ones. I’ll use the following table in order to put these terms into perspective.



In other words, a tree maintained using a local genealogy program is private to you, although you could give a copy to someone else. When such a tree is hosted online then there is a choice. If it is part of a unified tree then it is necessarily public, but if held as a separate user-owned tree then it could be either public or private.

There are both advantages and disadvantages to maintaining local trees, and a comprehensive list was recently posted by Renee Zamora on Renee's Genealogy Blog, but why do people want to create online trees? One reason is simply for use as “cousin bait”: attracting distant relatives with a view to controlled sharing. This is the closest description of my own situation as my definitive data is held on a local machine. It is a position that will become increasingly difficult to maintain as online data is judged more and more by the sources it elects to cite. Some people do not want to share their data publicly for more delicate reasons, and a good case was presented by Kerry Scott at Why Don’t People Post Public Family Trees?

My own reasons for not sharing more data online are deeper than the implication above that I simply want to use a local program. My STEMMA® R&D project is one factor since I am developing software to support that custom data representation. Possibly more important, though, is the fact that my data is far from being a simple family tree. It is a representation of general micro-history that incorporates family history, family trees, pedigrees, timelines, narrative, etc. I cannot, therefore, share everything since there is no standard for this type of data, and no sites (at the time of writing) that are properly structured for this type of data.

James Tanner’s post Family Trees: Unified vs. User Owned caused me to think more about what I would like to share, and how, so I will try to expand on my brief response to him. I detest our industry’s preoccupation with “family trees”, and the way that it leads newcomers into believing that is the be-all and end-all of genealogy. I don’t know of a single experienced genealogist who only wants to collect names, dates, and places associated with biological lineage in order to create a tree. They’re all interested in family history, and all aspects of that history. Although I’m out on a limb by declaring an interest in general micro-history, including the history of places, groups, and non-relatives, this is merely a superset of family history.

A very significant issue with any type of historical work is that it is a creative work. It involves research, thoughtful analysis, and some skill in writing it up accurately and interestingly. This is more than just an assembly of facts that anyone could find in the public domain. Even when a public tree is given source citations, it would be little more than an assembly of such facts. If it were possible to share our data as creative works then our requirements would suddenly align with those of authors of other online works, whether fiction or non-fiction. It struck me how close those requirements are to the issues people currently raise as obstacles to sharing their genealogical data publicly. For instance:

  • Attribution – Ensuring that their authorship is acknowledged. Allowing their work to be cited by the work of others as opposed to having it plagiarised.
  • Integrity – Allowing other researchers to see their work, but not to edit it. Their work could be connected to a central tree for indexing purposes but should not be assimilated entirely into the tree in order to preserve its structure or narrative form.
  • Drafts – Allowing revisions of their work, and possibly the addition of tentative items that they don't want to expose until they're more confident of them.
  • Longevity – Ensuring that their work will persist after they are no longer able to contribute.
  • Privacy – Allowing certain information to be disclosed at some point in the future (e.g. some respectable point after their death).
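As a rough illustration only, four of these requirements could be modelled along the following lines. This is a minimal sketch rather than a description of STEMMA or of any existing site: the names `Revision`, `Contribution`, `revise`, and `public_view` are all hypothetical, and Longevity is left out because it is a question of hosting policy rather than of data structure.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class Revision:
    """One immutable version of a contribution (hypothetical model)."""
    text: str
    is_draft: bool = False                 # Drafts: tentative, not yet exposed
    embargo_until: Optional[date] = None   # Privacy: hidden before this date


@dataclass
class Contribution:
    """A user-owned work that others may read and cite, but not edit."""
    author: str                            # Attribution
    revisions: List[Revision] = field(default_factory=list)

    def revise(self, editor: str, revision: Revision) -> None:
        # Integrity: only the original author may add a revision.
        if editor != self.author:
            raise PermissionError("only the author may revise this work")
        self.revisions.append(revision)

    def public_view(self, today: date) -> Optional[str]:
        # Return the latest revision that is neither a draft nor embargoed,
        # always carrying the author's name with it.
        for rev in reversed(self.revisions):
            if rev.is_draft:
                continue
            if rev.embargo_until is not None and today < rev.embargo_until:
                continue
            return f"{rev.text} (by {self.author})"
        return None
```

In this sketch a reader only ever sees the newest revision that the author has released and whose embargo date has passed, and the attribution travels with the text.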

Obviously I cannot speak for everyone out there, but if this were possible now then I would gladly share all of my research. However, I would clarify that a tree-based site that simply accommodated rich-text notes is not what I’m thinking of. It would have to fully accommodate a structured representation of historical data that includes all of the items I mentioned above, including narrative, and yet could be indexed by a tree, a pedigree chart, or a timeline, etc. This is certainly possible and is one of the goals driving STEMMA development.

I can’t quite work out the dynamics behind the industry advertising and the tools that we’re provided with. As I said, the concept of a family tree is endemic, but whether the advertising influences tool development, or vice versa, is hard to determine. As a software developer myself, I sometimes wonder whether developers see our tools more as a technical challenge than something that has to satisfy the requirements dictated by real genealogy. Collaborative Web sites, where we build a single picture of something, are a good example. Ignoring those sites that are wiki-based collaborations, everything I have seen is related to “unified family trees” rather than anything involving events, places, and narrative. The fact that even these existing sites are problematic supports my view that they are considered to be challenging. Although I demonstrated that other forms of collaboration are possible at Collaboration Without Tears, I also feel that it should be possible to upload “rich” (see above) user-owned data contributions to hang off a unified lineage-based framework. This step would be more significant than it may sound but I’ll defer any detailed presentation until another post — if there’s any interest, of course.

[1] It would be restrictive to term these ‘genealogical database programs’ since a local program does not necessarily have a database. As explained in Do Genealogists Really Need a Database?, a memory-resident database might be constructed on-the-fly from permanent and definitive data.