Friday, 8 February 2019

Tool for Annotating Image Files

MetaProxy is a free little program that allows archival descriptions, search terms, provenance, and any other meta-data (i.e. data about data) to be associated with images and other file types. The program is designed to run under Windows and requires no installation.

It may be downloaded for free from:


This includes a PDF user guide, and even a copy of the source code. A description of the general principle being demonstrated can be found at:


An equivalent version for the Mac is being investigated separately.

So what does it do?

Unlike similar commercial programs, it does not hide your meta-data in a database, or within invisible areas of an image file, or in any other opaque form of storage; it is in plain text files that can be loaded into a simple text editor, and the content of those files is completely dictated by you.

Think of this as taking something that might normally be scribbled on the reverse of a photograph and storing it in a separate text file.

The idea of using a separate file for this purpose is not new — they're known as "buddy files", or sidecar files — but this program shows that they can be effective without having to conform to some data standard, and without having to use specialised programs to see/edit their content. The only convention is that the image and its buddy file share the same name (e.g. picture.jpg and picture.meta) and they must currently reside in the same folder as each other. Such buddy files would be copied with their neighbouring images if you ever move them, or make backups, and would be just as permanent — more permanent than using hidden meta-data inside an image because that is often wiped clean by cavalier Websites and image-editing software.

When double-clicking on a buddy file, this tiny program finds the associated image and launches it in its default viewer, e.g. Microsoft Office Picture Manager, and then displays the content of the buddy file over the bottom third to present that extra information to you. Sound simple?


As well as providing a place for your "back of the picture" annotation, the buddy files are a good place to put search terms. I am using #hashtags in this document (e.g. for relevant surnames, places, dates, etc), but that is not an essential format. Consider, for instance, if you have a digital image of a newspaper clip, or of a handwritten will, and that you've also made a textual transcription of it. Well, if you put the transcription into the buddy file then a Windows Search will find the buddy files relevant to what you're looking for (remember that you cannot search the text inside an image file, but you can search text transcribed into a text file), and then double-clicking on any one of them would show both the original image and your transcription.

The program doesn't care what type of image files you're using (*.jpg, *.jpeg, *.tiff, *.png, etc.) and will even work with other types of data, such as Adobe Acrobat files, HTML files, or SVG family tree images. For instance, if you have an article written in a Word file called article.docx and a corresponding buddy file called article.meta, then double-clicking on the buddy file brings up both, just as with images. Not only that, if you have article.jpg and article.docx then both will be launched in their respective viewers before displaying the content of the buddy file. In other words, when you double-click on a buddy file, it will find all the files in the same folder with a matching name and display each of them.

An accidental feature of the program (one that I didn't anticipate at the start) is an ability to describe and open a collection of images. In a demo given on Mondays with Myrt on 28 Jan 2019, I used an example with the following files:

RisalpurCemetery.1.jpg      First image and its buddy file
RisalpurCemetery.1.meta
RisalpurCemetery.2.jpg      Second image and its buddy file
RisalpurCemetery.2.meta
RisalpurCemetery.3.jpg      Third ...
RisalpurCemetery.3.meta
RisalpurCemetery.4.jpg      Fourth ...
RisalpurCemetery.4.meta
RisalpurCemetery.meta       Buddy file for the whole collection

That collection related to a cemetery in Pakistan, formerly British India. As described above, double-clicking on, say, RisalpurCemetery.1.meta would display that buddy file and the associated image (RisalpurCemetery.1.jpg), etc., but double-clicking on RisalpurCemetery.meta would launch the whole collection. This works because the program replaces "meta" with "*" in the initial buddy file name in order to generate a wildcard for it to find all the associated files. In this case, the generated wildcard also matches the other buddy files: the per-image ones.
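For the curious, the wildcard step can be sketched in a few lines of Python. This is only an illustration of the idea and not the program's actual source code (which is written in C); the function names wildcard_for and associated_files are made up for this sketch, and the matching is done against a plain list of names rather than a real folder.

```python
from fnmatch import fnmatchcase


def wildcard_for(buddy_name: str, buddy_type: str = ".meta") -> str:
    """Turn 'picture.meta' into 'picture.*' (assumed behaviour)."""
    if not buddy_name.lower().endswith(buddy_type):
        raise ValueError(f"{buddy_name!r} is not a {buddy_type} buddy file")
    return buddy_name[: -len(buddy_type)] + ".*"


def associated_files(buddy_name: str, folder_listing: list[str]) -> list[str]:
    """Return every name matched by the generated wildcard,
    leaving out the buddy file that was double-clicked."""
    pattern = wildcard_for(buddy_name).lower()
    return [
        name
        for name in folder_listing
        if name != buddy_name and fnmatchcase(name.lower(), pattern)
    ]


listing = [
    "RisalpurCemetery.1.jpg", "RisalpurCemetery.1.meta",
    "RisalpurCemetery.2.jpg", "RisalpurCemetery.2.meta",
    "RisalpurCemetery.meta",
]
# A per-image buddy file matches only its own image ...
print(associated_files("RisalpurCemetery.1.meta", listing))
# ... whereas the collection-level one matches everything else in the list.
print(associated_files("RisalpurCemetery.meta", listing))
```

The names are compared in lower case here because Windows file names are not case-sensitive; that choice is an assumption of the sketch.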

As of v2.0, the program may also be used to open your image or other data file. I'll describe this in a moment, but if you right-click on an image (say), select metaproxy.exe using "Open with" (but not making it the default), then it will search for an accompanying buddy file and open both, just as it did above. However, if you didn't yet have a buddy file then it will automatically create an empty one in the Notepad editor for you.
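The "find it, or create an empty one" behaviour might look something like the following Python sketch. It is an illustration under stated assumptions rather than the program's real code: the function name buddy_for is hypothetical, and the sketch only creates the file; it does not launch any viewer or editor.

```python
import tempfile
from pathlib import Path


def buddy_for(data_file: Path, create_type: str = ".meta") -> Path:
    """Return the buddy file sitting beside 'data_file', creating an
    empty one if none exists yet. An existing buddy file is left alone."""
    buddy = data_file.with_suffix(create_type)
    if not buddy.exists():
        buddy.touch()
    return buddy


# Demonstration in a throwaway folder, using a made-up image name.
with tempfile.TemporaryDirectory() as tmp:
    image = Path(tmp) / "picture.jpg"
    image.touch()
    buddy = buddy_for(image)
    print(buddy.name, buddy.exists())
```

Only the final file type is swapped, so a name such as RisalpurCemetery.1.jpg would pair with RisalpurCemetery.1.meta.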


Because these buddy files are textual, they're searchable. This can be on specific search terms or on free text, as with transcribed data. This means that in Windows Search, you could type something like: *.meta AND #Proctor AND #1870s, and it would enumerate all the buddy files for that surname in that decade. Then double-clicking on one of these buddy files would show you both the full text from your buddy file and the accompanying image, together.
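Because the buddy files are plain text, the same kind of search needs nothing special outside of Windows Search either. The Python sketch below shows one way an "all of these tags" search could be done; it is not how Windows Search works internally, the function names tags_in and find_buddy_files are invented for the example, and the two buddy files are made-up test data.

```python
import re
import tempfile
from pathlib import Path

TAG = re.compile(r"#\w+")


def tags_in(text: str) -> set[str]:
    """Collect the #hashtags in a piece of text, lower-cased for comparison."""
    return {tag.lower() for tag in TAG.findall(text)}


def find_buddy_files(folder: Path, wanted: list[str], buddy_type: str = ".meta") -> list[str]:
    """Names of buddy files under 'folder' that contain every wanted tag."""
    required = {tag.lower() for tag in wanted}
    hits = []
    for path in sorted(folder.rglob("*" + buddy_type)):
        if required <= tags_in(path.read_text(encoding="utf-8")):
            hits.append(path.name)
    return hits


# Demonstration with two made-up buddy files in a temporary folder.
with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    (folder / "wedding.meta").write_text("Family wedding. #Proctor #Nottingham #1870s", encoding="utf-8")
    (folder / "picnic.meta").write_text("Works outing. #Jesson #1950s", encoding="utf-8")
    print(find_buddy_files(folder, ["#Proctor", "#1870s"]))
```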
 

Getting it going


The program is very simple and so needs no installation. Just download the metaproxy.exe from the Dropbox folder (see above) and save it in a safe place somewhere on your hard drive. The *.c file in that folder is only for developers to look at and is not needed by the program itself.

In order to test this out, find (or create) a place where you have an image file; say image.png. You will then need to create a corresponding buddy file, image.meta, using the Notepad text editor and put some meaningful text in it. Although you can do this manually (see 'Creating Buddy Files Manually' in the PDF guide), the program will do this for you. Simply right-click your image file, select "Open with", then "Choose another app", and browse to the location of this metaproxy.exe program. You may have to follow several of these prompts, but don’t be enticed into the “app store”. Make sure the 'Always use this app to open this type of file' is unchecked before hitting 'OK'. This will create an empty buddy file for you in order to make it easier to catalogue a series of files. NB: after doing this once, the metaproxy program will be immediately available in the "Open with" list for that file type, and you won't have to search for it again.

Now right-click your completed buddy file (not the image file), select "Open with", then "Choose another app", and browse to the location of this metaproxy.exe, as you did before, but now make sure the 'Always use this app to open this type of file' is checked before hitting 'OK'. This is a one-time only task that tells Windows to always use the metaproxy.exe program whenever you double-click on a *.meta buddy file; you won't need to do this again.






From then on, when you double-click any *.meta buddy file, it will automatically open the associated files (i.e. the ones having the same names but different file types) using their registered applications (e.g. some image viewer, Microsoft Word, or whatever). It will also open the buddy file in Notepad and overlay it on the bottom third of the file's view to present any descriptive text you've created.

Configuration

Although the program is self-contained and requires no mandatory configuration, provision was included in v2.1 for an optional INI file. This means that a text file called metaproxy.ini can be placed in the same location as metaproxy.exe in order to change the default functionality if necessary.

The default settings are equivalent to the following INI-file contents:

[metaproxy]
CreateType=.meta

[.meta]
SideBySide=False
Editor=Notepad

The [metaproxy] section defines global settings. Other sections are per buddy-file type, implying that you can define several variant buddy files with different properties.

CreateType specifies the default file type to use when creating new buddy files.

SideBySide specifies that a side-by-side mode, where image and buddy file each take 50% of the screen width, is to be used for a specific buddy-file type. The default is to overlay the buddy file over the bottom third of the image. Having this option allows you to define a buddy file type that's configured to better handle transcriptions.

Editor is the name of the text editor to use for a specific buddy-file type. The editor name must be a full file specification if it is not on the normal system search path.

An example of an extra section for some hypothetical *.tran buddy-file type might be as follows:

[.tran]
SideBySide=True
Editor=Notepad++

NB: whenever a file type is mentioned in this INI file, the leading period is mandatory.
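To make the layering of defaults and per-type sections concrete, here is a small Python sketch that reads the same INI layout with the standard configparser module. It is not the program's own parser; the names load_settings and settings_for are hypothetical, and the fallback used for a buddy-file type with no section of its own (overlay mode, Notepad) is an assumption.

```python
import configparser

# The documented default settings, expressed as INI text.
DEFAULTS = """
[metaproxy]
CreateType=.meta

[.meta]
SideBySide=False
Editor=Notepad
"""


def load_settings(user_ini_text: str = "") -> configparser.ConfigParser:
    """Start from the documented defaults, then overlay the user's INI text."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(DEFAULTS)
    if user_ini_text:
        parser.read_string(user_ini_text)
    return parser


def settings_for(parser: configparser.ConfigParser, buddy_type: str) -> dict:
    """Per-type settings; types without a section fall back to overlay
    mode and Notepad (an assumption of this sketch)."""
    if not buddy_type.startswith("."):
        raise ValueError("file types must include the leading period")
    return {
        "side_by_side": parser.getboolean(buddy_type, "SideBySide", fallback=False),
        "editor": parser.get(buddy_type, "Editor", fallback="Notepad"),
    }


settings = load_settings("[.tran]\nSideBySide=True\nEditor=Notepad++\n")
print(settings_for(settings, ".tran"))
print(settings_for(settings, ".meta"))
```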


[Updated to match small changes in v2.2 release]

Friday, 18 January 2019

Eskaton


More thoughts on the subject of Time, but in a different vein this time. Whether its end-of-days is historical or prophetic reflects the cyclical nature of events.


Figure 1: The Destruction Of Sodom And Gomorrah, 1852, painting by John Martin (1789–1854); public domain; via Wikimedia Commons.



On cold icy countenance did Heaven's hammer fall,
and tumult and turmoil did summon the ocean tall
For all that was, and then,
repaid in myriad gem
Given by the great hand but taken by the small.

For Aion smote the cheek of Gaia, so cool and white
Her mantle of Aegean blue loosed in silent flight
Yet innocent of all crime
The iterant chaos of time
Her fate seduced by blow celeste from indelible night.

Under legion trumpet her children wept, though mused and wise,
as fire, brimstone, and scoria burned their Eden and their eyes
Safe haven in cavern deep
Sanctuary on silver seed
To wait, to slumber, to heal, and like phoenix arise.




Copyright © Tony Proctor
#poetry #time

Friday, 4 January 2019

Organising More Resources


Time for a rant since I keep on seeing the same old suggestions for organising digital resources that fail at "naming 101". Organising by person name, location, or date is not only restrictive but impractical and unhelpful in many circumstances; and trying to put these terms into file or folder names is a road to nowhere.



So, Tony, what is your problem here? Well, there are several issues:

  • There's a huge difference between the physical organisation of material (digital or otherwise) and the indexing according to your mode(s) of access.
  • Coding in a file or folder name forces you to make a limiting choice. The classic example is a group photograph that has several people of different surnames and family connections.
  • Access isn't always by just one category of index-term: you may want, for instance, all resources related to Proctors, in the city of Nottingham, during the 1950s; or, all Jessons who were present at a particular event.

I've said before that physically organising by the nature or provenance of the material is not only the archival way, but it has advantages for maintenance, changes of ownership, and even making inferences (e.g. identifying a person in a photograph). Anyone who splits up an inherited photographic collection should have their fingers taped together. In contrast, indexing helps you access your material according to various categories, such as surname, location, etc.

In Organising Digital Resources, I described the difference between these two concepts, and how indexing for your mode(s) of access can be done according to multiple inclusive categories; you're not forced to choose just one. Unfortunately, although we have real archivists in our community, the tendency is still to miss the analogy between their professional organisational schemes and the digital world, and so oversimplify such things to coding surnames, etc., into file and folder names. Professionals in the digital world do not do this, and their schemes would also be the ones used by archives to implement their own schemes.

Perhaps the best arguments against the way things should be done are: (1) that your software of choice may be rather limited, or (2) that browsing the resources in the absence of any specialised software leaves them hard to understand.

In a much older article, Organising Photographs, I mentioned the use of meta-data, and how this could be used to add important information (visible only to software) to images, or to add index-terms to all file types (including images) in order to aid in their access from a simple Windows search. This is not unlike writing information (or source labels) on the back of physical photographs, except that software could use the digital equivalent to help access them. A major goal of that article was to show that digital resources could be indexed using very simple software technology available on all our computers, in contrast to using some highly specialised genealogical software. Well, some people would still not like the invisible nature of that meta-data, and it's still poorly supported by standards, and hence by different computer operating systems.

So let's explore the analogies with physical artefacts. If you had a photograph album then you would probably have written details underneath each picture. If you didn't have an album — just a biscuit tin on the top of your wardrobe — then you may have written details on the back of each picture. However, I have some WWI photographs of soldiers that were sent to their families as postcards. That means you cannot write on either side without damage to the precious original, and in that case you might just have a separate piece of paper, or better still an envelope, with the salient details written on it.

One alternative for digital resources might be to use a simple non-specialised bit of software such as Excel, which uses multidimensional hierarchical indexes, effectively a scaled-down version of an OLAP database. It's proprietary, yes, and it's opaque, yes, but it can be kept alongside the digital resources.

Because I arrange my own digital material hierarchically, akin to a micro-archive (see Hierarchical Sources), I also use text files to describe the material at each level (e.g. for fonds, series, or even items), and this presents a very simple alternative that is a closer analogy to the "separate paper" idea: buddy files. For instance, in a collection of photographs, I would have a single text file of a fixed name (e.g. Description.txt) with all the details of what the collection is, where it came from, when, and who had it before that. Alongside each photograph (i.e. at the item level) I often have a buddy file of the same name with not just a plain-text description of the where, when, and who, but tags that can be searched on.

For example, suppose I have an image of a family photograph called Picture1.jpg; I might then also have a Picture1.txt text file with tags as follows (descriptive text not shown here).

Figure 1 – Picture1.txt and Picture1.jpg

This has effectively three categories of hierarchical index-terms, and a search through the buddy files for #Proctor, #Nottingham, #1950s would throw up the name Picture1 whose associated image could be viewed unaided by any specialised software.
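A multi-category lookup of this kind amounts to a simple inverted index: each tag points to the items that carry it, and a search on several tags is the intersection of those sets. The Python sketch below illustrates the principle only; the function names build_index and lookup are invented, and the tags shown for Picture1 and Picture2 are made-up stand-ins, not the actual contents of my buddy files.

```python
import re
from collections import defaultdict

TAG = re.compile(r"#\w+")


def build_index(buddy_texts: dict[str, str]) -> dict[str, set[str]]:
    """Map each lower-cased tag to the set of item names that carry it."""
    index = defaultdict(set)
    for name, text in buddy_texts.items():
        for tag in TAG.findall(text):
            index[tag.lower()].add(name)
    return index


def lookup(index: dict[str, set[str]], *tags: str) -> set[str]:
    """Items carrying every one of the given tags (set intersection)."""
    sets = [index.get(tag.lower(), set()) for tag in tags]
    return set.intersection(*sets) if sets else set()


# Made-up buddy-file contents, keyed by item name.
buddies = {
    "Picture1": "#Proctor #Nottingham #1950s",
    "Picture2": "#Jesson #Nottingham #1920s",
}
index = build_index(buddies)
print(lookup(index, "#Proctor", "#Nottingham", "#1950s"))
print(sorted(lookup(index, "#nottingham")))
```

Because each item can sit under any number of tags, nothing forces a choice of one surname, place, or date, which is exactly what file and folder names cannot offer.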

Here's another example for comparison:

Figure 2 – Picture2.txt and Picture2.jpg


How you name the item-level files is partly irrelevant, as long as they're unique at each level of organisation. For photographs, you could invent your own scheme of codes, similar to an archive, or use something more meaningful — it doesn't matter because anyone browsing the images directly would also have the buddy-file details on hand. For images of material obtained from an archive, though, and this would include any census images for England and Wales, I would strongly recommend using the assigned archival codes in the image names.

So, this is probably the simplest scheme possible, and it doesn't rely on hidden meta-data, or databases, or specialised genealogical software. Such software could still index your resources, as I've already explained, but this scheme provides your digital resources with plain-text notes and index-terms of their own — ones that would follow them if ever they were transferred elsewhere or duplicated for someone.

The organisation of your physical artefacts should follow the precedent set by archives, so why not do digital resources in a similar fashion? For instance, my extended family has a large stone chess board, originally seated on a wooden table, dated 1859 with the name of my ancestor carved into it, and with the initials of an in-law who was a stonemason. I may have images of it but the artefact has a fundamental existence of its own, and I would need to index this in my software. This may be unusual, but many of us have letters, certificates, ephemera, other original documents, and photographs. Older photographs were obviously printed and so scans are derivative, but modern photographs are "born digital" and so it's the printed forms that are derivative. Either way, keeping paper-based copies is always wise. Believe it or not, there are people who recommend scanning old photographs so that the paper copies can be thrown out. No taped fingers for them; I recommend those nice white jackets that fasten at the back. :-)


So wouldn't it be a little messy to select one of these textual buddy files from the search results, find the corresponding image file(s), and then open it? Well, no, not at all. It's extremely easy for some programmer to create a tiny program to do this for you, much like the code attached to the aforementioned 'Organising Photographs' article. Put simply, you could right-click on the buddy file you want, and select 'Open with <ProgramName>', and that program would find the image file for you and automatically open it in place of the text file. To be more bullet-proof, it would be best to use a special file type rather than *.txt (e.g. *.meta), in which case the program could be registered as the one to always use for that file type, and you would merely have to double-click on the *.meta buddy file. There must be a commercial opportunity here.


A software tool was developed to demonstrate this basic principle under Windows, and it may be downloaded for free from the Dropbox folder. No installation is necessary, and the folder includes both a PDF user guide and the original source code. A description of what it does may be found at Tool for Annotating Image Files.