Returning to Normalised Names and Dates

This post is a follow-up on a familiar subject that James Tanner recently revisited in Returning to the issue of standardized place names and dates: that of software trying to standardise place names and, to a lesser extent, dates.

Figure 1 - Tanner Street, London SE1.[1]

This is a subject that James has mentioned before, and I generally agree with him. However, I wanted to follow up on the subject, rather than indulge in blig-blog[2], because I fear that readers may conflate some different concepts under this same heading. My post is therefore about normalisation rather than standardisation (using my normal UK spelling).

His basic point is that some software forces you to select one specific, standardised name for a place at the expense of the more relevant, and probably more accurate, one found in the record itself. He makes the very clear observation that:

“Genealogically important records are generally created as a result of the occurrence of an event. Such events occur in a particular and very specific geographic location.”

I agree entirely with this! Source information was recorded as a result of some event, and so has a specific time-and-place context that must be captured accurately. I belong to the school of thought that place names should be recorded as they appear in the sources, and not changed to some close alternative or to some modern equivalent. However, I am also a software developer who believes in the use of place-hierarchies (see my previous post at A Place For Everything) so how do I reconcile the two?

The underlying issue is that places may have multiple names. These may be concurrent or the result of some name change in the history of that place, but that does not mean that older or less-used names are invalid. The extent and boundaries of a place may have changed during its existence, and it may even have been divided, or merged with somewhere else, and so a modern name may be totally inappropriate. Finally, the parent place — that larger-scale bounding place — may have similarly changed over time.

Part of the solution requires that a place be recognised as a real-world notion that physically exists, just like a person, and not simply as a name. This then allows the same place entity (as represented in your data) to be referenced by any of its names, and obviates the need for a standardised reference. The history of the place, its various names, its location and extent, any boundary or name changes, and associated images, documents, and maps would all be held as part of that single place entity by the software, with those alternative references all pointing to that same collection of information. We take this approach for granted when dealing with people, and their alternative formal or informal names, but it applies equally to any named entity.
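As a minimal sketch of this idea, the following Python shows one place entity being reachable through any of its names. The class names, field names, and the sample place are all hypothetical illustrations, not STEMMA's actual syntax or data.

```python
from dataclasses import dataclass, field


@dataclass
class PlaceName:
    """One name by which a place has been known (hypothetical structure)."""
    text: str
    note: str = ""  # e.g. "former name" or "variant spelling"


@dataclass
class PlaceEntity:
    """A real-world place, identified by a key rather than by any one name."""
    key: str
    names: list[PlaceName] = field(default_factory=list)


class PlaceIndex:
    """Looks up place entities by any of their recorded names."""

    def __init__(self) -> None:
        self._by_name: dict[str, list[PlaceEntity]] = {}

    def add(self, place: PlaceEntity) -> None:
        for name in place.names:
            self._by_name.setdefault(name.text.casefold(), []).append(place)

    def find(self, text: str) -> list[PlaceEntity]:
        # Distinct places may share a name, so a list is returned.
        return self._by_name.get(text.casefold(), [])


# Invented sample data: one entity, two names.
index = PlaceIndex()
example = PlaceEntity(
    key="place-001",
    names=[PlaceName("Exampleton"), PlaceName("Exampleton Magna", "former name")],
)
index.add(example)
```

Looking up either "Exampleton" or "Exampleton Magna" returns the same entity, so no single name has to be privileged as the standard one.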

STEMMA adopted this approach right from its inception. It was refined in V2.2 to allow connections between places which were not hierarchical, such as when a place was divided, or neighbouring places were merged (see Related Entities). More recently, it was recognised that this enhancement also made it possible to support alternative types of place hierarchy, such as geographical, administrative, census & civil registration, or ecclesiastical, and to cross-link them when relevant. An example might be when a registration district overlaps a particular ecclesiastical parish. They are different places, and with independent hierarchies, but their locations overlap. If you want to relate a civil birth registration to someone’s baptism then this type of correlation is very useful.
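The distinction between hierarchical and non-hierarchical links can be sketched roughly as below. This is an illustration only: the `Place` class, the `relate` helper, the relation label "overlaps", and the sample keys are all assumptions of mine, not STEMMA constructs.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Place:
    key: str
    hierarchy: str                      # e.g. "civil-registration", "ecclesiastical"
    parent: Optional["Place"] = None    # parent within the same hierarchy
    related: dict = field(default_factory=dict)  # other place key -> relation type


def relate(a: Place, b: Place, relation: str) -> None:
    """Record a symmetric, non-hierarchical link between two places."""
    a.related[b.key] = relation
    b.related[a.key] = relation


def lineage(place: Place) -> list:
    """Return the keys from a place up to the top of its own hierarchy."""
    chain = []
    current = place
    while current is not None:
        chain.append(current.key)
        current = current.parent
    return chain


# Invented sample data: two independent hierarchies, cross-linked once.
diocese = Place("diocese-x", "ecclesiastical")
parish = Place("parish-x", "ecclesiastical", parent=diocese)
district = Place("district-x", "civil-registration")
relate(district, parish, "overlaps")
```

Each place keeps its own parent chain, while the cross-link lets software move from a registration district to an overlapping parish without pretending that one contains the other.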

Another part of the solution is a bilateral approach to the recording of information from a source, whether it’s a name, a date, or any piece of data. This means recording both the original form, verbatim, and a separate normalised version of it, if possible (see Is That a Fact?). As well as supporting places with an uncertain identification, this also means that you retain the exact name used in the source, including any spelling errors and transcription issues, but can still connect it to an appropriate place entity in your data. In effect, it is the place entity that is standardised rather than the place name.

I recently dealt with a case of this in my previous blog-post at My Ancestor Changed Their Surname. A woman was recorded with the exact same birth place in three successive census returns (1851, 1861, and 1871): “Barkworth, Lincolnshire”. To my knowledge, there isn’t, and never has been, a place with this name, but there was an East/West Barkwith nearby. This approach means that I can retain the exact spelling used by the census enumerators, and make a tentative association with the standardised place entity representing Barkwith.
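A rough sketch of that bilateral recording, in Python, might look like the following. The `PlaceReference` class, the `tentative` flag, and the key "barkwith" are hypothetical names chosen for illustration; only the verbatim text comes from the case above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PlaceReference:
    verbatim: str                # exactly as written in the source
    place_key: Optional[str]     # key of the linked place entity, if identified
    tentative: bool = False      # True when the identification is uncertain


def search_text(refs, text):
    """Textual search: match on what the source actually says."""
    needle = text.casefold()
    return [r for r in refs if needle in r.verbatim.casefold()]


def search_place(refs, place_key, include_tentative=True):
    """Known-place search: match on the linked place entity."""
    return [
        r for r in refs
        if r.place_key == place_key and (include_tentative or not r.tentative)
    ]


# The key "barkwith" is an invented identifier for the Barkwith place entity.
census_ref = PlaceReference("Barkworth, Lincolnshire", "barkwith", tentative=True)
refs = [census_ref]
```

Both search modes work on the same record: a textual search for "Barkworth" finds it, and so does a place search for the Barkwith entity, unless tentative links are excluded.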

A similar argument applies to dates, too. It has been suggested that software can parse, and so decode, any written date once it has been transcribed. This may be possible for Gregorian dates, assuming that they’re written in English (or some other known language), and that they don’t employ an ambiguous, all-numeric representation, but that’s not going to help in many real-life situations. The recorded date may have uncertain characters in it, or it may be an informal or relative expression of the date. For instance, cases such as “last Sunday” or “two days before my grandmother’s birthday” are also dates but they require a human to decode them by using context from elsewhere, such as the date of recording/publication or the identification of the author.
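The all-numeric ambiguity is easy to demonstrate. The sketch below uses Python's standard `datetime.strptime` to try both the day-first and month-first readings of a string; the function name `candidate_readings` and the sample strings are my own illustrations.

```python
from datetime import date, datetime


def candidate_readings(text: str) -> set:
    """Return every calendar date that an all-numeric x/y/yyyy string could mean."""
    readings = set()
    for fmt in ("%d/%m/%Y", "%m/%d/%Y"):   # day-first, then month-first
        try:
            readings.add(datetime.strptime(text, fmt).date())
        except ValueError:
            pass  # not a valid date under this reading
    return readings


ambiguous = candidate_readings("03/04/1871")     # 3 April or 4 March
unambiguous = candidate_readings("26/06/1897")   # only 26 June is valid
undecodable = candidate_readings("last Sunday")  # no reading at all
```

The first string yields two equally valid dates, and the relative expression yields none, which is why the verbatim text has to be kept alongside whatever normalised value a human eventually assigns.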

So this is good for our recorded data, but what about the user-interface (UI) that software products or Web sites present to us? I know that Ancestry, for instance, has a drop-down list of standardised place names in its search forms, and that alternative names or spellings are not available. One reason for their exclusion may be the sheer size of the resulting list, but it would be quite possible, in principle, to include all accepted variants. In Ancestry’s favour, you can still search on an unknown name, such as the one presented above, and it will perform a textual search rather than a known-place search. Contrast this with findmypast, which has recently adopted an approach to place names similar to Ancestry’s. It also presents a drop-down list of standardised place names, say, in its census search forms, but if you enter an unknown one then it is simply ignored. In fact, if you tried to re-edit your search criteria then you would see (at the time of writing) that your non-standard name has vanished, discarded by the form. This is interesting because when you report a census transcription error to findmypast, the form now includes duplicate fields for “Birth town”, “Birth county”, and “Birth place”, with one set including the suffix “as transcribed”. It looks like they would like to support the same type of search modes as Ancestry, but their UI currently makes this impossible.

One obvious question that I haven’t answered here is why we need a normalised version at all. Well, if you want your software to do something useful, rather than simply help you to put names and dates on a tree (which is rather mundane and trivially easy from a software perspective), then it needs to understand certain contextual data. It needs a normalised, computer-readable representation to work with. If you want to do a proximity-based search then it needs to understand which places are near to other places, or are within other places. If you want to present information using a map then it needs to know the location and boundary extent of the referenced places — things that can be looked up for known place entities.
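To illustrate why a proximity search needs normalised data, here is a minimal sketch using the haversine great-circle formula. The place keys and coordinates are invented for the example, and a real system would look them up from the place entities rather than hard-code them.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, adequate for a rough search


def distance_km(a, b):
    """Great-circle distance between two (latitude, longitude) pairs in degrees."""
    lat1, lon1 = map(radians, a)
    lat2, lon2 = map(radians, b)
    h = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * asin(sqrt(h))


def places_within(places, centre_key, radius_km):
    """Return the keys of all other places within radius_km of the centre."""
    centre = places[centre_key]
    return sorted(
        key for key, position in places.items()
        if key != centre_key and distance_km(centre, position) <= radius_km
    )


# Invented coordinates, purely for illustration.
PLACES = {
    "place-a": (53.0, -0.5),
    "place-b": (53.1, -0.5),
    "place-c": (54.5, -0.5),
}
nearby = places_within(PLACES, "place-a", 20)
```

None of this is possible from a verbatim name alone; the calculation only works once each reference has been tied to a place entity with a known location.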

A similar situation exists with dates. If you want to perform a search between two date limits then it needs to understand what lies within that temporal range and what lies outside of it. If you want to present a timeline then it needs to know what sequence your events occurred in.
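A sketch of how normalised dates support both operations follows. It assumes each date is normalised to an earliest/latest pair so that imprecise values can still be compared; the `NormalisedDate` class, the sample values, and the reading of "Spring" as March to May are all illustrative assumptions.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class NormalisedDate:
    verbatim: str    # text as found in the source
    earliest: date   # first day the text could refer to
    latest: date     # last day the text could refer to


def falls_within(d, start, end):
    """True only if the whole possible range lies between the two limits."""
    return start <= d.earliest and d.latest <= end


def timeline(dates):
    """Order dates by their earliest possible day, then their latest."""
    return sorted(dates, key=lambda d: (d.earliest, d.latest))


# Invented sample values; "Spring" as March to May is an assumption.
EVENTS = [
    NormalisedDate("1861", date(1861, 1, 1), date(1861, 12, 31)),
    NormalisedDate("3 Apr 1851", date(1851, 4, 3), date(1851, 4, 3)),
    NormalisedDate("Spring 1855", date(1855, 3, 1), date(1855, 5, 31)),
]
ordered = [d.verbatim for d in timeline(EVENTS)]
```

Treating an imprecise date as a range is one design choice among several; it keeps the comparison honest because a year-only date is never reported as falling inside a narrower window than it can justify.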

James concludes with a suggestion blaming the issue on programmers and developers, and it’s here that I’m afraid I have to disagree:

“My opinion is that the main reason for this issue of standardization involves the desires of the programmers to regularize their data for search purposes”

Outside of very small teams, programmers have little say in the functionality of a product. Gone are the days when computer-illiterate managers would delegate full responsibility to a programming team. Large organisations, in particular, have product managers whose responsibility it is to meet market needs and who, depending on how modern they are, have some synergy with a software architect.

In summary, place entities have to be standardised because they are representing something in the real world, but not so with place names. Any number of place names may reference the same place entity. If some software component doesn’t accommodate that then it implies a deep lack of appreciation somewhere in that chain of responsibility.

[1] Picture displayed by permission of Fashion and Textile Museum, London SE1 3XF.
[2] My own whimsical term for when two blogs go back-to-back responding to each other. Analogous to ping-pong.

My Ancestor Changed Their Surname

Have you ever had the feeling that all is not as it seems when researching? Maybe I’m just a perennial sceptic but it seems to happen a lot. One instance occurred back in 2011 while researching the Watts family of Nottingham, as introduced in Harsh Times. Although resolved, it didn’t stop there and it happened again when I decided to write things up for this article.

Figure 1 - Stanhope Street, Sneinton, 1910.[1]

Back then, I had got in touch with the descendants of two other branches of the Watts family: Christine and Barbara. When comparing notes, Christine told of a family tale that her great grandfather had changed his name from William Watts to William Thorpe, and that she was looking for possible reasons.

Well, there are the obvious possibilities, such as evading the law or creditors, but when I heard that there were two different marriage registrations then I had my doubts. The first marriage occurred in 1897 and there were three children registered with the surname Watts. The second marriage occurred in 1920 and there were a further three children registered with the surname Thorpe. The obvious interpretation would be that William Watts and William Thorpe were different people, and that the family tale was designed to cover this up, but suppose that this was a bluff. What if it was the same person but he was trying to make it look like he was a different person?

The first thing to do was compare the two marriage registrations. The first was quite straightforward: William Watts married Elizabeth Bond on 26 Jun 1897 at the Nottingham Register Office.[2] William was a ‘Fruit Hawker’ living at 12 Commerce Street, and Elizabeth was living at 21 Commerce Street. William’s father, [William] George Watts (deceased) was a ‘Frame Work Knitter’, and Elizabeth’s father, Henry Bond, was a ‘Fruit Hawker’, the same as the groom. Witnesses were a Richard Bell & Ada Morton.

The second marriage record wasn’t quite accurate. The marriage occurred on 24 May 1920 at Nottingham St Albans.[3] Elizabeth declared that she was a spinster and yet gave her married name of Watts. This resulted in her father’s name being recorded as Henry Watts, occupation ‘Hawk’, rather than Henry Bond. William Thorpe’s year of birth was about 1880, and so significantly different to William Watts’ (1877). The groom’s father, James Thorpe (not William George Watts, and not deceased), was a ‘Dealer’, the same as the groom, and both the bride and groom were living at 2 Wilmot Place, Sneinton Road. Witnesses were a George Scott & Florrie Scott.

I think it’s safe to say that the two grooms were distinct people, but what was the full story?

Elizabeth Bond had an illegitimate child three months before she married William Watts. The child was John Henry Bond, and his father was a George Wright, born 1873 in Kirkby-in-Ashfield. Elizabeth never looked after John Henry, though, as he was raised by his father’s parents in Bulwell.

The 1901 census suggests that the first marriage had already broken down by that time. William Watts, and his son (Francis), were living with his widowed mother at 4 Parliament Terrace.[4] Elizabeth was a boarder in the household of a Mary Davis at 7 West Street.[5] I won’t cover the minute details, but once William Thorpe’s father had been established, it was possible to create a small family tree for the Thorpe family. Using that tree allowed the members of this Davis household to be identified more correctly. In other words, all was not as it seemed.

Name | Relationship to Head | Year of Birth | Notes
Mary Davis | | | Estranged wife of James Thorpe (William Thorpe’s father).
Ben Thorpe | | | Benjamin Thorpe, son of George William Thorpe (James’ brother).
Edwin Davis | | | Edwin Thorpe (brother of William Thorpe).
George Davis | | | George Henry Thorpe, Mary’s grandson by daughter Charlotte; baptised on same day as Edwin.
William Davis | | | William (aka “Willie”) Thorpe, Elizabeth’s future (2nd) husband.
Sarah Ford | | |
Elizabeth Watts | | |

Table 1 – Household of Mary Davis, 1901.

So, Elizabeth and William Thorpe were living with his mother, who had already separated from her own spouse. But where was Elizabeth’s youngest child, Thomas, who was only about one year old? It seems he had been consigned to the workhouse.[6]

Now if the break-up began around 1901, who was the father of Elizabeth’s third child who was registered as George William Watts in 1904? Well, his birth certificate may have indicated that his father was William Watts,[7] a ‘Fruit Hawker’, but his later marriage certificate indicated that his father was William Thorpe,[8] a ‘Hawk’, thus explaining the apparent conflict.

Neither of them shows up in the 1911 census, and this may have been deliberate. In 1861, the offence of bigamy in England & Wales was redefined in section 57 of the Offences against the Person Act 1861, replacing section 22 of the previous Offences against the Person Act 1828. This provided for an exclusion “…to any person marrying a second time whose husband or wife shall have been continually absent from such person for the space of seven years then last past, and shall not have been known by such person to be living within that time,…”. There was no onus on either party to go and find their other half, and so many separating couples, particularly in the poorer classes, would simply avoid each other for seven years.

The birth certificate for the first child to be registered to William Thorpe and Elizabeth, James Thorpe (b.1910), records the mother’s name as “Elizabeth Thorpe, formerly Bond”, and so was a bit premature given that they didn’t marry until 1920. More importantly, though, it gave their address for 1910 as 6 Fairholm Terrace, Storer Street.[9] Looking this up in the 1911 census simply indicated that it was unoccupied. However, switching to the census Enumerator Summary Books (ESB) and navigating to the same address showed the expected occupier as “Mr Thorpe”, which was then struck out.[10] Whether he was hiding or not, this shows that (a) he was the last known person at that address, and (b) he had resided there quite recently.

It was not hard to find that William Thorpe died on 3 Jul 1932, aged 53, at 700 Hucknall Road of ‘Myocardial failure. Acute bronchitis. No PM’. He was recorded as a ‘General Dealer’ of 31 Dennett Street, and the informant was his widow, Elizabeth.[11] This place of death was actually a workhouse, later known as Valebrook Lodge.[12]

Finding the death of William Watts was a little more hit-and-miss because there were several possibilities, and certificate copies are expensive in the UK. It was eventually found that he died on 28 Jun 1925, aged 47, at the General Hospital of ‘Cerebral haemorrhage accelerated by fracture of the skull caused by a fall down the cellar steps at his home the same day’. He was recorded as a ‘Hawker’ of 23 Holland Street.[13]

Things could have stayed like this for a while, but then I received an email from Barbara, in 2013, saying that she’d come across a newspaper report of a John Watts dying with identical — and I mean virtually word-for-word — details.

Yesterday morning John Watts, aged 48, a hawker, of 23, Holland Street, Goose-gate, Nottingham, was found unconscious, with injuries to the head, at the bottom of the cellar steps at his place of residence. He was conveyed in the city ambulance to the General Hospital, where he died in the afternoon.[14]

This was a vital piece of information because it confirmed that William Watts was going under the assumed name of John Watts. It was then possible to return to the 1911 census and identify him living at 2 Holland Court, Holland Street, with a Sarah Ann Hickman (born c1876), and a young daughter, Harriet Ann Hickman (born c1905).[15] William later married Sarah, in 1918, but as John W. Watts.[16]

So what can we tell from these dates? We can see that the first marriage started to break down by 1901; in between the births of Thomas Watts and George William Watts (who later indicated his father was William Thorpe). We know that William Watts remarried in 1918 and Elizabeth remarried in 1920. The penalty for bigamy was up to seven years in prison and so they would have been careful to observe that minimum 7-year duration, and that suggests the couple began their “blind separation” in about 1910–1911; a date substantiated by their names and locations in the 1911 census.
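The arithmetic behind that estimate can be sketched as follows. It works in whole years only, ignores months, and assumes both parties waited out the full seven-year period before remarrying; the names of the constant and function are my own.

```python
ABSENCE_YEARS = 7  # the seven-year absence quoted from section 57


def latest_separation_year(remarriage_year: int) -> int:
    """Latest year the separation could have begun for a remarriage to qualify."""
    return remarriage_year - ABSENCE_YEARS


# Remarriage years as given in the text above.
REMARRIAGES = {"William Watts (as John W. Watts)": 1918, "Elizabeth": 1920}

# The earlier remarriage sets the tighter bound on when the separation began.
bound = min(latest_separation_year(year) for year in REMARRIAGES.values())
```

The result is a bound of 1911, which sits at the end of the 1910–1911 window suggested by the census evidence.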


I was happy with this result, but when I began writing it up for this article I got that same feeling again: all was not as it seemed. I was not a direct descendant of Elizabeth Bond and so I wasn’t intending to research her family. I had noted that all the public trees had recorded that her parents were Henry Bond and Sarah Phillipson (born 13 Jun 1844 in Ulceby, Lincolnshire) and that they married at Nottingham St Mark on 19 Jul 1868. A copy of the marriage certificate supported this, and gave Henry’s father as Samuel Bond (see below), but gave no indication of Sarah’s father.[17]

The implication of the marriage was that Sarah (Phillipson) Bond was the mother of Henry’s first three children: Frances, John Henry, and Elizabeth (Annie was born later, c1880, in Dudley, Worcs). As far as I could see, all the public trees that contained these children were relying on the GRO index (see below) and assumed that Henry and Sarah (Phillipson) were the parents. What caught my eye, though, was that the baptism records for the children explicitly gave their mother as Sarah Hunt, but who was she? Most Church-of-England baptisms don’t include the mother’s maiden name, and if it is present then it usually indicates that the father was unidentified or the couple were not married. However, this was a Roman Catholic parish and I didn’t immediately appreciate that recording the mother’s maiden name was the norm in such parishes. This led me down a blind alley until I had made that connection.

The following is a quick comparison of the data for the children in the parish register and in the civil registration index of births (GRO index).

Baptisms: Nottingham St. Barnabas RC[18] | Civil registration index of Births[19]
Mary Frances; born 30 Nov 1869; bpt: 2 Jul 1876. | Frances; registered 1869-Q4 (7b:221).
John Henry; born 21 Jan 1875; bpt: 20 Feb 1875. | John Henry; birth registered 1875-Q1 (7b:285); death registered 1875-Q3 (7b:179).
Elizabeth; born 30 Oct 1878; bpt: 16 Nov 1878. | Betsy; registered 1878-Q4 (7b:292).
Table 2 – Children of Henry and Sarah Bond.

A copy of the birth certificate for Frances[20] and for John Henry[21] confirmed that their mother’s maiden name was Hunt rather than the expected Phillipson.

Interestingly, Henry Bond was also baptised at Nottingham St Barnabas RC on 30 Mar 1876, at the grand old age of 35, in the spring following his son’s death and three months before the baptism of Frances. His parents were given as Samuel Dobbs (not Bond) and Sarah Bond, thus being another child born out of wedlock.

As an aside, Henry was in serious trouble during March 1871: “Henry Bond, alias Dobbs” was found guilty of “Having counterfeit coin (2 offences. before convicted of felony)”, and sentenced to two years prison with seven years police supervision.[22] In 1874, he gave trial testimony to being bribed, along with others, in municipal elections.[23]

The following table attempts to present a census timeline for the location of Henry’s wife, Sarah, on the basis that she originated from Ulceby, as all the public trees suggested.

9 (Book 20) | Front Street, Ulceby, Lincs. | Sarah’s family. Father (Thomas), b.c1816, is a “Taylor”. Mother, Mary [Dale], b.c1811. Neighbours are Dales.
| Front Street, Ulceby, Lincs. | Sarah b.c1844 in Ulceby, Lincs. Father, b.1813, is a “Taylor & Draper”. Mother b.c1811.
| Holderness Road, Southcoates. | Sarah b.c1846 in Ulceby, Lincs. Sarah is a domestic servant. Surname mis-recorded as “Philipston”.
1 & 2 | High Street, Ulceby, Lincs. | Sarah’s family. Father, b.c1813, is a “Tailor”. Mother, b.c1811.
| 17 Millstone Place, Nottingham. | Sarah b.c1843 Caistor, Lincs. No sign of Henry but see conviction mentioned above. Daughter “Sarah” (ditto’d) is 16 months and so is almost certainly Mary Frances.
| 5&6 Horsefair, Kidderminster, Worcs. | Henry lodging.
| Sarah at 41 B’ham St, Court 7, Dudley (Worcs.). | Sarah b.c1842 in Caistor, [Lincs.]. Daughters [Mary] Frances and Elizabeth present, plus Annie (5 months, b. Dudley). [Annie d. 1887 in Nottingham].
| 7 Trumpet Street, Nottingham. | Sarah b.c1842 Nottingham (ditto’d). [Mary] Frances and Elizabeth present, plus granddaughter Ellen (b. 1889 [to Frances], Nottingham).
| Beech Avenue Workhouse, Broxtowe, Nottingham. | Henry is a widower. [Sarah d. 1899, Nottingham, aged 57] [Henry d. 1910, Nottingham, aged 69]
Table 3 – Timeline for Henry’s wife, Sarah.

The first thing to note here is the discrepancy in Sarah’s details. Up until, and including, her marriage to Henry in 1868, the details indicate that she was born Sarah Phillipson, 1844–1846 in Ulceby, Lincolnshire. Thereafter, from their first-born, the details indicate that she was born Sarah Hunt, 1842–1843 in Caistor, Lincolnshire. Now Caistor is only about 10 miles south of Ulceby, which may not sound much to a US person but these were tiny rural places and so that’s a significant separation. Also, the town of Caistor was in the Registration District of Caistor, while Ulceby was in the Registration District of Glanford Brigg.

Figure 2 - Ulceby, Lincolnshire, England

A check on the birth of Sarah Phillipson[24] confirmed what others had documented: that she was born in Ulceby on 13 Jun 1844, at 5am, to Thomas Phillipson, a tailor, and Mary (Dale) Phillipson. The subsequent baptism on 17 Jun 1844[25] also confirmed the same parentage and father’s occupation.

Clearly there’s a major discrepancy in the details of Henry’s wife that begins around the point of the marriage.

My first thought was that Sarah Phillipson and Sarah Hunt were different people, and that Henry married one and left her for the other. I didn’t believe that was the answer, though, because “switching” to someone else with the same given name, a similar age, and a birthplace in a similar part of a different county (North Lincs.) seemed too much of a coincidence. While it wouldn’t have been absolutely necessary at the time, it is worth mentioning that there was no visible re-marriage of Henry Bond to a Sarah Hunt either.

Remembering that Sarah Phillipson’s marriage certificate gave no details for her father at all, it is possible that she was raised by Thomas but that he was not her natural father. The birth details, including the actual time of delivery, strongly support Mary being the natural mother. The apparent change of Sarah’s details might be explained by certain previously-withheld information being made known to her following her marriage. It might also explain why she left the family and travelled some 90 miles to live in Nottingham.

However, Sarah wasn’t the last child, and her parents were not separated, so it’s hard to understand how someone else might have been the father. I could find no references to the family in the local newspapers. Without wanting to make a marathon of this article, I’ve summarised my search to establish the family of the Ulceby Sarah Phillipson in Table 4. Details came from the parish register images (see note [25]) unless indicated otherwise.

Thomas Phillipson | Bpt: 23 Dec 1812 in Barrow upon Humber, Lincs. Born to John & Ann. | Bur: 10 Dec 1875 in Ulceby.
Mary Dale | Bpt: 28 Dec 1810 in Ulceby. | Bur: 4 Apr 1893 in Ulceby.
| Bpt: 9 Feb 1834 in Ulceby. FamilySearch transcribed the year incorrectly as 1836. | Bur: 17 Apr 1891 in Ulceby.
| Bpt: 13 Dec 1835 in Ulceby. | Bur: 24 Mar 1840 in Ulceby.
| Bpt: 16 Feb 1839 in Ulceby. | Bur: 3 Jan 1887 in Ulceby.
Mary Ann | Bpt: 7 Mar 1841 in Ulceby. |
| Bpt: 17 Jun 1844 in Ulceby. |
| Bpt: 15 Nov 1846 in Ulceby. |
| Bpt: 15 Apr 1849 in Ulceby. | Bur: 25 Dec 1851 in Ulceby.
William Thomas | b. c1854 in Glanford Brigg.[a] | d. 1916 in Gainsborough.[b]
Table 4 – Family of Thomas Phillipson and Mary Dale.
[a] FreeBMD and Census.
[b] FreeBMD.

It’s clear that this family remained quite local to Ulceby, and didn’t have a connection with Caistor.

One way to prove whether we really were looking at two different people, as opposed to one persona which had morphed into another, would be to find them both mentioned in the same set of records, and this happens in the 1871 census. We know that Henry’s wife was in Nottingham at that time (see Table 3), but we also find Sarah Phillipson having returned to her family’s address on Front Street, Ulceby.[26] She is still unmarried but appears to have two young children.

This is a showstopper as far as the public family trees go, but who was Sarah Hunt? Well, a very good candidate may be found in the Caistor Union Workhouse in 1851.[27] She was with her mother, Frances Hunt, born c1811 in “Barksworth” (probably a mis-recording of East/West Barkwith, Lincs.) and who was a house servant. The age of Sarah suggests that she was born c1847, which doesn’t precisely match the age of Henry’s wife, but the mother’s given name is of particular interest. The first-born child of Henry and his wife was Frances, and their first grandchild — born to this daughter and Benjamin Gould before their marriage — was also Frances.

I haven’t proved that Henry married this particular Sarah Hunt, although there is significant evidence that requires following through; for instance, by obtaining the birth certificate for this Sarah Hunt who was born in Caistor, and by establishing her whereabouts in 1861. What I have shown is that Henry did not marry the Sarah Phillipson from Ulceby, and hence that all the public trees claiming this will need re-evaluating. The Sarah on Henry’s marriage certificate did not know who her father was, and I believe that was Sarah Hunt, but it remains a mystery why it records the surname Phillipson. All names are written in the hand of the same clerk, and although there is an “x his/her mark” for the witnesses, there isn’t for the bride or groom. It could, therefore, be a simple clerical error on the original parish record since transcriptions of that match the certificate details.[28] Their marriage was “after Banns” and so I would also check the Banns book to see if a different surname was recorded there.

A corollary to this, and a lesson to us all, is that a single source makes a very tenuous case, and this applies as much to family trees as to historical research. It is also partly why the Genealogical Proof Standard (GPS) involves “reasonably exhaustive research”. We’ve all been caught like this, myself included, where we pin too much on a single source. This may be because too little evidence exists, or because of the expense involved. In England & Wales, for instance, the cost of a certificate for a birth, marriage, or death (which would constitute a certified copy of an original source containing primary information) is currently £9.25, or about US$15.50, and the fact that you cannot buy an unstamped, research-only copy (unlike Ireland) puts a large part of the blame for inaccurate British genealogy on the government’s shoulders. On top of this, the GRO index of civil registrations, most often cited through the FreeBMD site, is a derivative source, and is acknowledged to contain errors and omissions. It does not identify the parents of a child. At best it identifies the father’s surname (if known) and the mother’s maiden name (depending on the date), but without a copy of a certificate there’s a lot of assumption involved. Even the dates are those of the registrations, rather than of the vital events themselves, and are rounded up to the corresponding yearly quarter. Hardly a substitute for a certificate, is it?
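To show how little a quarterly index date pins down, here is a small sketch that converts between a calendar date and a quarter label written in the "1875-Q3" style used in Table 2. The function names are my own, and the labels are simply this article's notation rather than any official GRO format.

```python
from datetime import date, timedelta


def quarter_label(d: date) -> str:
    """Return the yearly quarter containing a date, e.g. '1869-Q4'."""
    return f"{d.year}-Q{(d.month - 1) // 3 + 1}"


def quarter_range(label: str):
    """Return the first and last day covered by a label such as '1875-Q3'."""
    year_text, quarter_text = label.split("-Q")
    year, quarter = int(year_text), int(quarter_text)
    first_month = 3 * (quarter - 1) + 1
    start = date(year, first_month, 1)
    # The quarter ends the day before the next quarter starts.
    if quarter == 4:
        next_start = date(year + 1, 1, 1)
    else:
        next_start = date(year, first_month + 3, 1)
    return start, next_start - timedelta(days=1)


# Birth dates taken from the baptism register entries in Table 2.
frances = quarter_label(date(1869, 11, 30))
john_henry = quarter_label(date(1875, 1, 21))
elizabeth = quarter_label(date(1878, 10, 30))
```

For these three children the quarter of birth happens to match the quarter of registration, but that is not guaranteed: a birth late in one quarter may be registered in the next, so an index entry only says that the registration fell somewhere in a span of about three months.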

Finally, some blame has to be passed to those purveyors of mass-market genealogy who insist on deluging people with name matches — most of which are inappropriate or just plain useless — as though that gives value for money. If genealogy were that easy then 95% of online trees would be correct. I rest my case!

