Tuesday, 25 February 2014

Revisiting the Family Group



In a previous post about genealogical data, I described a generic Group entity designed to model the many varieties of a Family Unit, and more besides. I am now undoing its implementation and re-crafting the Group so that it becomes one of the most fundamental data entities in the recording of history.

In Family Units, after denigrating the precise definition of a ‘family’, and any possibility that we could objectively identify one through records, I described a generic implementation that could mean whatever you wanted it to. This followed in the footsteps of the GENTECH project, which took a similar generic stance[1]. The main difference in my implementation was the use of Set-operations to group people together, and to derive one Set from another.

As a very brief example of how this worked, consider a marriage of two people, both of whom had been previously married and who bring associated children into the new marriage. We would therefore have multiple contributions to the new ‘extended family’, including the conjugal children and two sets of stepchildren.


The idea is that once the three groups of children have each been defined, the extended family is easy to define: it is simply the Set-wise union of those and the two parents. The implementation even addressed the thorny issue of time-dependent membership of a group. For instance, once a child gets married they’re typically viewed as part of a separate family unit, and so they’d leave one Set and enter another.
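As a rough illustration of that Set-wise approach, here is a minimal Python sketch. It is not the STEMMA implementation: the class names (Membership, Group), the method members_on, and the people and dates are all hypothetical, chosen only to show a union of groups with time-dependent membership.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional, Set


@dataclass(frozen=True)
class Membership:
    """One person's membership of a group over a date range (hypothetical)."""
    person: str
    start: Optional[date] = None  # None: member from the outset
    end: Optional[date] = None    # None: membership has not ended

    def active_on(self, when: date) -> bool:
        started = self.start is None or self.start <= when
        not_ended = self.end is None or when < self.end
        return started and not_ended


@dataclass
class Group:
    """A titled set of people, optionally derived from other groups."""
    title: str
    memberships: List[Membership] = field(default_factory=list)
    parts: List["Group"] = field(default_factory=list)

    def members_on(self, when: date) -> Set[str]:
        # Direct members on that date, plus the union of all constituent groups.
        members = {m.person for m in self.memberships if m.active_on(when)}
        for part in self.parts:
            members |= part.members_on(when)
        return members


# Invented people and dates, purely for illustration.
parents = Group("Parents", [Membership("Ann"), Membership("Bob")])
anns_children = Group("Ann's children",
                      [Membership("Cara", end=date(1900, 6, 1))])  # Cara marries
bobs_children = Group("Bob's children", [Membership("Dan")])
conjugal = Group("Conjugal children",
                 [Membership("Eve", start=date(1895, 3, 1))])      # Eve is born
extended = Group("Extended family",
                 parts=[parents, anns_children, bobs_children, conjugal])

before_marriage = extended.members_on(date(1896, 1, 1))
after_marriage = extended.members_on(date(1901, 1, 1))
```

In this sketch the extended family needs no membership list of its own: asking for its members on a given date re-evaluates the union, so Cara drops out once her own membership ends.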

One problem with this general approach is that it’s very people-focused, and that means it is most useful for categorising people. This is not bad, but it’s certainly not the whole story either. Anyone studying history, including genealogy and family history, will appreciate that an organised real-world entity such as a company has a history, or “life”, of its own.

Although I never used these Groups for representing family units, I did intend for them to be used to model things such as military units, societies, clubs, etc. It didn’t take long before I realised that my hasty implementation wasn’t enough. The corresponding Groups would have to share a lot in common with both Persons and Places, a pattern I’d already capitalised on in their case.

So what was missing? In STEMMA® V2.0, Groups only had a title, a type, and a time-dependent Set of Persons. The following features were also required but were only supported by Person and Place entities:

  • Alternative names, variant spellings, and name changes. Persons and Places already shared a common implementation for this.
  • Parentage. Being affiliated to a bigger Group, as opposed to being a subset of some other Group.
  • Association with Events, and hence the ability to cite supporting sources for those Events, and to identify extracted and summarised items of information (aka ‘Properties’) relevant to the Group.
  • Ability to reference resources such as images, photographs, documents, etc., applicable directly to the Group.
  • Handling of Groups being split or merged during their history. This is already a problem for Places, and my eventual solution should apply to Groups too.

The next release of the STEMMA specification (V2.2) will incorporate all this functionality but, surprisingly, it streamlines things rather than complicates them. Software engineers will understand when I say that it allows me to share certain base classes in the implementation. Effectively, Groups then become a top-level entity in the specification, alongside Person, Place, and Event.
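To make the base-class remark concrete, here is a small Python sketch of how Person, Place, and Group might inherit the shared features listed above. Every name in it (Subject, DatedName, name_in, and the example society) is an assumption made for illustration; it does not reproduce the STEMMA schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DatedName:
    """A name and the year it came into use (hypothetical structure)."""
    value: str
    since: Optional[int] = None  # None: date of adoption unknown


@dataclass
class Subject:
    """Hypothetical base holding what Person, Place, and Group share."""
    names: List[DatedName] = field(default_factory=list)
    parent: Optional["Subject"] = None                     # affiliation
    event_ids: List[str] = field(default_factory=list)     # linked Events
    resource_ids: List[str] = field(default_factory=list)  # images, documents

    def name_in(self, year: int) -> Optional[str]:
        # Return the most recent name already in use in the given year.
        current = None
        for name in sorted(self.names, key=lambda n: n.since or 0):
            if name.since is None or name.since <= year:
                current = name.value
        return current


@dataclass
class Person(Subject):
    pass


@dataclass
class Place(Subject):
    pass


@dataclass
class Group(Subject):
    group_type: str = ""  # e.g. "regiment", "society", "club"


# Invented example: a society that is renamed, and an affiliated branch.
society = Group(names=[DatedName("Example Society", 1850),
                       DatedName("Royal Example Society", 1900)],
                group_type="society")
branch = Group(names=[DatedName("Example Society, North Branch", 1860)],
               parent=society, group_type="society")
```

The point of the sketch is that names, parentage, event links, and resources are written once in the base; Group only adds what is specific to it.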

As a real example from my own data, anyone who read A Grave Too Far may remember that it mentioned a British cavalry regiment that one of my ancestors served in: 14th/20th Hussars. This regiment was created through the merger of the 14th King's Hussars and the 20th Hussars in 1922. The honorific "King's" was added back into the title in 1936. This short description already involves a merger, a rename, and corresponding events. I associate historical sources with this regiment-type Group, including the relevant Wikipedia page: 14th/20th_King's_Hussars. My blog-post also cited a newspaper reference to the movements of the regiment: “Cavalry Change at Risalpur” (footnote 6). This was particularly interesting because it was relevant to the life of my ancestor, and yet that reference only described the movement of the regiment and not of any specific people. However, the data relationship between the information source, the Event, and the Group, was identical to a similar situation involving the movement of some individual.

As explained in Evidence and Where to Stick It, events are an anchor point for declaring supporting sources. They are effectively a snapshot of history from which we can derive our evidence, and virtually all sources are relevant to some primary event. The situation, here, is the same for people as for groups.
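The following Python sketch illustrates that anchoring role under some loud assumptions: the Subject and Event classes and the sources_for helper are hypothetical, and attaching the Wikipedia page to the merger event is an illustrative choice rather than a statement of how my data is actually arranged. The regiment names and the 1922 merger are taken from the description above.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class Subject:
    """Anything an Event can involve: a person or a group (hypothetical)."""
    kind: str  # "person" or "group"
    name: str


@dataclass
class Event:
    """A snapshot of history that carries the source citations."""
    title: str
    year: Optional[int] = None  # None: no date recorded in this sketch
    subjects: List[Subject] = field(default_factory=list)
    sources: List[str] = field(default_factory=list)


def sources_for(subject: Subject, events: List[Event]) -> List[str]:
    """Collect every source cited by an Event involving the subject."""
    return [s for e in events if subject in e.subjects for s in e.sources]


kings_14th = Subject("group", "14th King's Hussars")
hussars_20th = Subject("group", "20th Hussars")
regiment = Subject("group", "14th/20th Hussars")

events = [
    Event("Merger", 1922, [kings_14th, hussars_20th, regiment],
          ["Wikipedia: 14th/20th_King's_Hussars"]),
    Event("Regiment moves", None, [regiment],
          ["Cavalry Change at Risalpur"]),
]

regiment_sources = sources_for(regiment, events)
predecessor_sources = sources_for(kings_14th, events)
```

Nothing in sources_for cares whether the subject is a group or a person, which is the sense in which the data relationship is identical in both cases.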

The duality of persons and groups has already been explored by Ronald L. Breiger, of Harvard University, in the context of empirical analysis of interconnected groups and people[2]. In the context of micro-history, the benefit is two-fold: the relationship of people to groups can be modelled, and so can the groups themselves, independently of people. The latter is important for anyone studying the history of organisations, military units, colleges, universities, etc. For too long, this type of research has been neglected from a software point of view, and not considered in relation to genealogy or family history, even though the standards for research should be the same, and many of us have already ventured into such territory.



[1] “GENTECH Genealogical Data Model”, National Genealogical Society: References for Researching (http://www.ngsgenealogy.org/cs/GenTech_Projects : accessed 24 Feb 2014), attachment “Data Model 1.1” (http://members.ngsgenealogy.org/GENTECH_Data_Model/Description_GENTECH_Data_Model_1.1.doc : created 29 May 2000, accessed 24 Feb 2014), sec.5.4.5.
[2] Ronald L. Breiger, "The Duality of Persons and Groups", Social Forces, vol.53, no.2, Special Issue (Oxford University Press, Dec 1974), pp.181-190; online at JSTOR (http://www.jstor.org/stable/2576011 : accessed 24 Feb 2014); full online copy also available at http://www.rci.rutgers.edu/~pmclean/mcleanp_01_920_313_breiger_duality.pdf (accessed 24 Feb 2014).