Sunday 8 August 2021

Stop Adding Incorrect Details for My Family


This is a woefully common reaction in online genealogy. Whether someone is overwriting your details in a shared tree, or publishing incorrect details about your family in their own tree, it can be frustrating when you see incorrect details published and then blindly copied by others.

But wait a moment, why does your opinion carry more weight than theirs? If you politely tell someone that their details are wrong, and they either ignore you or tell you to go forth and multiply, why would you expect them to change anything? If you provide them with a reference to a census page or parish record (or other) and they say ‘Oh, I already have a John Smith in my tree, so thanks but no thanks’ then you may begin to see the problem.

Yes, there are a few edge-cases, such as someone telling you your own father was Peter rather than Paul, or insisting that you were born in Kathmandu rather than the East End of London, but we are looking at the more general case here.

Far too often, a record by itself does not sufficiently identify a person, or their relationships to other people. Even a group of corroborating records may be insufficient to connect them. But useful evidence is not necessarily the same as a direct answer; to show that John Smith₁ cannot be right, yet without confirming that John Smith₂ is correct, is still a useful product of your research. In a surprisingly large number of cases, there will be no direct evidence, and the naive expectation, fuelled by advertising, that you can build your tree from online records becomes a sad disappointment. To get past such brick walls, you need to get into inferential genealogy, and this then implies that you need to write up the how and why of your research for it to be accepted. In fact, this is a general tenet that applies to all claims, with more details being required where there was more inference. If you need to convince another researcher that their information is incorrect then you need to provide such an explanation.

This is a lesson yet to be learned by the hosts of online records and online trees. Their insistence on the simplistic research model of ‘search here and build it there’, compounded by ‘if you cannot find it, copy it’, has already done untold damage to our collective research.

A few years ago, in Research in Online Trees, I drew an analogy between research in academic fields and research in online trees, pointing out that in all academic fields (including professional and academic genealogy) research is not a direct A-to-B process. There are intermediate steps where researchers may build upon, or even challenge, past work by other researchers. But for that to happen, dedicated researchers need a way to write up and publish their research for others to find.

Note that attaching a write-up to a particular person or family in your tree is not good enough. Even if the text is searchable via search engines such as Google (which it rarely is because it is stuck in a database), the write-up may not be specific to that person or family: it could involve multiple generations in more than one family. As with those other academic fields, a comprehensive write-up will also need basic formatting, and especially a mechanism for footnotes and citations.

Many researchers who fall into this category, and who are not solely interested in academic journals, will resort to blogs. These can be highly effective ways of making such arguments, but they are ignored by the hosts of online trees. Yes, they can be cited (sort of), but you are forced to find them using a search engine. If your surnames also happen to be common words then this is not going to be very productive. Unsurprisingly, I have explained how these hosts can easily fill that gap, allowing their users to search registered blogs along with their own online records, and without diverting traffic away from those blogs or risking any type of copyright infringement: Blogs As Genealogical Sources. Reactions, few as they were, implied the idea was too complicated, or that it would cost too much money to implement, or that it would involve payments to the blog authors, or simply that no one else does it. All of these notions were ill-founded, but the hosts' lack of both vision and responsibility will leave us wanting in terms of a more robust model for sharing and collaboration in general.

Monday 2 August 2021

Is Pinterest a Valid Source?


At the end of 2019, I made a case for online trees being a valid source, although with some caveats. I recently thought about a similar case for Pinterest, but the situation is not the same there so I wanted to dig over the main issues.

Like many people, I started a Pinterest account when it first appeared, and then lost interest when I realised that it was all smoke and mirrors (or images thereof), that my feed could not be tailored to deliver what I really wanted to see, and that I was being deluged by unwanted advertising. In fact, I have just closed my account as it had no value for me.

Pinterest has been criticised for many reasons, including some content being pornographic or obscene, being overtly political, hosting commercial scams, spreading misinformation (especially medical), or focusing on people's eating disorders or weight problems. But what was it supposed to be?

According to Wikipedia, Pinterest is an "American image sharing and social media service designed to enable saving and discovery of information ... and ideas", but the reality is much more mundane. It is now basically an image sharing site with no obvious purpose. You see, images were supposed to be just a taster that encouraged people to pin them, and to click on them to get information; an image on its own (with no caption, link, or accompanying information) is a dead end.

Let me pick a specific case, one that initially encouraged me to look at Pinterest: images of old places. I love to see historical pictures of my home town, but on Pinterest they invariably carry no details or caption. If I want to search for an image of a particular place then I cannot: the search bar simply finds boards of that name shared by other users. If I happen upon a rare or interesting image there then I might, if I'm lucky, recognise the place, but what about the date, or the photographer, or the story behind the picture?

If I were doing this as part of a research project then I would have a deeper problem: provenance. Where did the image come from? Who took the original, and is it in copyright? Pinterest does have a mechanism for a copyright holder to get material taken down, but this is fighting against the tide because the site already makes it so easy for people to share anything and everything that they might find, online or otherwise. At the very least, it should have implemented a mechanism identifying the initial point of entry of an image onto Pinterest, i.e. who first loaded it, and where from.

The situation is more complicated than this, though, and doomed to failure in the hands of people who treat it like stamp collecting. I have several images in my blog posts that I have taken pains to get permission to display from the copyright holders, and I had shared those same posts via Pinterest using those images. And yet I have found these images in isolation on other people's Pinterest boards. That is, pinned images, divorced from my blog, without the associated information, and without any provenance or attribution. The images had been appropriated to sit in someone's "gallery" of images that they like, but that serves no purpose beyond the private pleasure of such hoarders.

So, if Google turns up some image during your research that resides on Pinterest, what do you do? Would there be any point in citing it at all, in the way you might for other social media? Google does provide a search-by-image mechanism through which you might be able to identify a non-Pinterest copy (ideally older and with more details), but then a Google search could equally have found that, so what purpose does Pinterest serve? As a means of sharing, it is naively structured and simply exacerbates sharing issues already present on the Internet. But as a source of information that is worth reading and citing, it is a non-starter.