Monday, 15 August 2016

Blogs as Genealogical Sources


Many people publish their family history stories or research on a blog, or some other type of Web site. Why aren’t these searched by the large genealogical providers? Is there a problem, or simply a lack of vision? This should be a win-win possibility.

Apparently, Ancestry used to search blog material some time ago, but their approach had a number of fundamental problems:

  • The material was searched without the permission of the owners.
  • The content was copied and so diverted traffic away from the original blogs.
  • The search was based on simple name matches, or other word matches.

So isn’t there a way of doing this with the owners’ permission, without copying the material to a separate location, and with better integration into their normal search function? Yes, of course there is!

I actually suggested to them that they allow members to voluntarily list the links to their blog or Web articles, together with relevant meta-data for the referenced individuals and their relationships. This meta-data is essential whenever the search solicits information from the end-user, such as a name, place-of-birth, date-of-birth, parent names, or spouse name.

So what’s different here? Well, the meta-data allows a more functional search operation, and doesn’t require them to copy and pre-index the raw text of the articles. Also, the end-user would be directed to the actual blog rather than a copy of it if they selected an associated match. All Ancestry would retain is the URL link and the associated meta-data, and no copying also means no copyright issues.
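To make the "URL plus meta-data" idea concrete, here is a minimal sketch of the kind of record a provider might retain for each listed article. Everything in it is an assumption for illustration: the class names, the field names, and the placeholder URL are hypothetical and do not describe any existing Ancestry feature. The point it tries to show is simply that the record holds a link and structured facts, and never the article text itself.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PersonRef:
    """One individual referenced in an article (hypothetical schema)."""
    ref_id: str                         # local identifier within this listing
    names: List[str]                    # primary name first, then known variants
    birth_date: Optional[str] = None    # free-form date text
    birth_place: Optional[str] = None
    father: Optional[str] = None        # ref_id of another PersonRef, if listed
    mother: Optional[str] = None        # ref_id of another PersonRef, if listed
    spouses: List[str] = field(default_factory=list)  # ref_ids of spouses


@dataclass
class ArticleListing:
    """What the provider would keep: a link and meta-data, no article text."""
    url: str
    title: str
    people: List[PersonRef] = field(default_factory=list)


# A member voluntarily lists one article (placeholder values throughout).
listing = ArticleListing(
    url="https://example.com/my-article",
    title="An example article",
    people=[PersonRef(ref_id="p1", names=["Jane Example"])],
)
```

A matching search result would then send the researcher to `listing.url`, so the traffic goes to the original blog.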

Surely, this sounds like the win-win that it should be. Ancestry would have more sources to search, at no cost other than providing the corresponding software; members would get increased traffic on their blogs from other researchers, thus increasing collaboration. Lastly, it would be entirely voluntary and so there would be no licensing issues. A search through a new category of, say, “Members’ Articles” (or implied by “Search All Records”) would show links to online articles that are relevant to the current search criteria.

So why can't this be done by simply attaching an article to an existing tree?

  • The author may prefer to write research articles rather than maintain trees.
  • The article may already be published on a separate site or blog.
  • The article may reference people from several distinct families, and so a single tree would not suffice.
  • Some references may be to "incidental people" where no lineage information is available, and yet the article could provide invaluable historical context for them.

Also, you would still need meta-data to support a genealogical search rather than a plain text search. For instance, in my previous blog post, A Sad Career, I could indicate that Ellen Poland also went under the names Helen Polin and Elenor Polin, and that her parents were Owen Polin (aka MacPolin) and Rosanna Polin (aka Poland, and with many given-name variants). As well as providing basic “facts” such as names, date-of-birth, etc., the meta-data would also indicate relationships to other individuals referenced in the corresponding article, such as a spouse or parents. This FOAF (Friend of a Friend) concept will one day be a basic tenet of Web sites and blogs, but for now it would have to be done by the genealogical provider.
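As a rough illustration of why this beats a plain text search, the sketch below matches a query against name variants and parent relationships rather than against article text. It is only a sketch under stated assumptions: the dictionary layout, the function names, and the placeholder URL are all hypothetical, and the full forms "Owen MacPolin" and "Rosanna Poland" are my expansion of the "aka" surnames mentioned above.

```python
def normalise(name):
    """Lower-case and collapse whitespace so comparisons ignore case and spacing."""
    return " ".join(name.lower().split())


# Meta-data for one article. The layout and the URL are illustrative
# assumptions; "Owen MacPolin" and "Rosanna Poland" expand the aka surnames.
ARTICLE = {
    "url": "https://example.com/a-sad-career",  # placeholder, not the real link
    "people": {
        "ellen": {
            "names": ["Ellen Poland", "Helen Polin", "Elenor Polin"],
            "father": "owen",
            "mother": "rosanna",
        },
        "owen": {"names": ["Owen Polin", "Owen MacPolin"]},
        "rosanna": {"names": ["Rosanna Polin", "Rosanna Poland"]},
    },
}


def has_name(person, query):
    """True if the query equals any recorded name variant of the person."""
    return normalise(query) in {normalise(n) for n in person["names"]}


def search(article, name, father=None, mother=None):
    """Return the article URL if a listed person satisfies every given criterion."""
    people = article["people"]
    for person in people.values():
        if not has_name(person, name):
            continue
        matched = True
        for role, wanted in (("father", father), ("mother", mother)):
            if wanted is None:
                continue  # criterion not supplied by the end-user
            parent_id = person.get(role)
            if parent_id is None or not has_name(people[parent_id], wanted):
                matched = False
                break
        if matched:
            return article["url"]
    return None
```

With this meta-data, a query for "Helen Polin" with father "Owen MacPolin" finds the article even though neither spelling needs to appear together in the text, whereas the same name with an unrelated father returns nothing.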

Obviously there would have to be a way of avoiding misuse, such as offensive material or advertising, but that sort of moderation must already occur in their message boards.

This suggestion follows quite naturally from previous work I have been involved in with STEMMA.  Our Days of Future Passed — Part II discussed the importance of narrative generally, and especially of marked-up narrative. Before that, What to Share, and How - Part II tried to explain how a STEMMA file contained different types of data, including lineage and other relationships that could be used as meta-data. However, the watered-down scheme presented here is achievable now, using standard technologies.