In just a couple of weeks now I will present "Organizing and Sharing Digital Images" to our webinar audience (register here). I'm looking forward to sharing some ideas that might resonate with some of you. Since I recognize that my ideas are not the only ones out there, I'd like to share with you some additional ideas that John Zimmerman, one of our Legacy users, shared with me recently about the subject (with his permission of course). Below John describes how he manages his digital image collections....
A few years ago I visited a wealthy friend and neighbor who had pursued his family history for many years. My friend had already published six large-format, hardbound volumes of over 400 pages each, and was contemplating publishing 14 more volumes. He was in the habit of printing 100 copies of each volume, and had donated copies of the books to several libraries. He told me the cost of printing the six volumes, and based on that I projected the cost of future printing to be in excess of $80,000.00! I remarked on that cost to my friend, and on the fact that many libraries no longer had space to store paper copies of family histories. I suggested that publishing the same data on CD would reduce the "printing" cost to around $0.20 per disk and would provide the history in a format acceptable to any library, as well as allowing for inclusion of all six previous volumes on the CD. To my great surprise my friend suggested that I accept the job of making that CD a reality.
Though I was excited by the prospect of the job at hand, making good on my claims presented some daunting challenges. Not the least of those challenges involved organizing the multitude of paper photographs, negatives, pictures in books and digital images. I found that my friend had unpublished images and text spread among over a dozen large binders and in paper file folders. Also there were files on five different computers, two of which were original IBM PCs that were inoperative. Reactivating those two dead PCs required replacement of the chips containing the Basic Input-Output System (BIOS). Once I had rejuvenated the inoperable computers I had to recover text and images from obsolete software and consolidate everything on the two PCs that would provide a primary work environment, and a backup storage location. All of that was accomplished over the next few months and I moved on to designing a menu-driven user interface for the CD and a standardized presentation format that was workable across several operating systems, browsers and physical display formats.
Although I had known my friend and his family for many years, there were aspects of his life with which I was not familiar, and it would take me some time to become comfortable with the thousands of photos and documents I had to work with. My friend had kept a daily journal of at least 1500 words since the 1930s and integrating images with that huge volume of text was an additional challenge. I badly needed an organizational plan for text and image files, and I needed it immediately!
In choosing how best to organize the image files I considered what was important about any given image, and concluded that for a genealogist the most significant information was always the name of the person in the photo, or referred to in documents. I then considered how best to identify image files. I started by reviewing how we had identified and organized physical images in the past.
Typically physical photographs and documents are identified by viewing them directly, hopefully recognizing who or what is shown. Notes might be written on, or filed with, the item to explain its content. However, once more than a few photographs or documents were collected it became necessary to sort them into folders by subject or date, filed in alphabetical or chronological order. Alternatively, items might be tagged with an artificial numbering scheme and filed numerically, which required an external reference list cross-referencing the numbers with information identifying the subject of each item in the file.
Then came computers, and although they allowed us to store and view huge numbers of digitized photographs and documents in a very small space, many people continued to use the same schemes they had used to organize paper files. That was due to two factors. One factor was that the old paper filing systems were familiar to most of us, and therefore we were comfortable with them. The other factor was that until 1995 file names on PCs were limited to eight characters, plus a three-character extension. That limitation made it practically impossible to create unique, descriptive file names.
In August of 1995 Windows 95 introduced long file names. That allowed creation of file names of up to 255 characters (with the full path, including the file name, limited to about 260 characters), so most digital images could have file names describing exactly what the image contained. I decided that if I named each image file so that the content was obvious from the file name alone, then I could minimize the number of folders to somewhere between one and four by using the folder structure shown in Figure 1. Such descriptive naming would also generally allow me to exchange image files with others without accompanying explanatory text, with the exception of files that contained many people or objects that required detailed identification. Simplifying the folder structure makes those folders easy to move if required, and makes it easy to place image files where multiple computers and drives are employed.
Files stored in the Pictures folder would include only photos of individuals. If a photo contained more than one person it would be filed in the Groups folder. The Docs folder was reserved for documents, and the Places folder was for images of both places and things.
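The four-folder rule above can be sketched in a few lines of Python. This is only an illustration: the function name `choose_folder` and the `kind` labels are hypothetical and are not part of John's system, which he applied by hand.

```python
def choose_folder(kind: str, people_count: int = 0) -> str:
    """Pick one of the four folders described above.

    `kind` is a hypothetical label: "photo", "document" or "place_or_thing".
    `people_count` only matters for photos.
    """
    if kind == "document":
        return "Docs"
    if kind == "place_or_thing":
        return "Places"
    if kind == "photo":
        if people_count == 1:
            return "Pictures"  # photos of a single individual
        if people_count > 1:
            return "Groups"    # more than one person in the photo
        raise ValueError("a photo needs at least one person")
    raise ValueError(f"unknown kind: {kind!r}")


print(choose_folder("photo", people_count=1))
print(choose_folder("photo", people_count=5))
print(choose_folder("document"))
```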
Though the use of long file names was emancipating, I realized that making a system work required that I develop, and follow, a naming standard. Because the CD would be based on the HyperText Markup Language (HTML) the file names I used could not include spaces, so I chose to connect components of those file names with the underscore (_) character. That character was one of those permitted to be used in HTML code and also provided for a visual separation of file name components. To help file names stand out from within many lines of HTML code I decided to enter file name components in all uppercase, which also prevented any confusion between letters and numbers. At the time my HTML editor did not use color to distinguish items entered between quotation marks, so the uppercase entries worked well in making those entries stand out (Figure 2).
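For readers who would rather script these two rules (no spaces, all uppercase) than type them by hand, here is a minimal Python sketch. The helper name `join_components` and the sample name are made up for illustration.

```python
import re


def join_components(*components: str) -> str:
    """Join file-name components with underscores, in uppercase, with no spaces."""
    cleaned = []
    for part in components:
        # Replace each run of whitespace inside a component with one underscore.
        part = re.sub(r"\s+", "_", part.strip()).upper()
        if part:  # skip empty components
            cleaned.append(part)
    return "_".join(cleaned)


print(join_components("Smith", "Mary Ann"))  # prints SMITH_MARY_ANN
```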
Since individuals are always central to genealogical records my naming standard would be based on names. File names for photos of individuals would begin with the subject's surname, followed by their given name, allowing file lists to be sorted on those names. Images of ladies would be named with their maiden surnames leading, followed by their given names then their married names (if known) enclosed in dashes (-) as a way of distinguishing those married names. Ladies who married more than once would have their married surnames entered in the order of those marriages (Figure 3).
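The name rules above might be sketched as follows. Because Figure 3 is not reproduced here, the exact layout is an assumption: this sketch wraps each married surname in its own pair of dashes and joins everything with underscores. The function name and sample names are hypothetical.

```python
def person_file_name(surname, given_names, married_surnames=()):
    """Build SURNAME_GIVEN[_-MARRIED-...] for a photo of one individual.

    Assumption: each married surname gets its own pair of dashes, listed
    in the order of the marriages.
    """
    def clean(text):
        # Uppercase, with inner spaces turned into underscores.
        return "_".join(text.upper().split())

    parts = [clean(surname), clean(given_names)]
    parts += [f"-{clean(married)}-" for married in married_surnames]
    return "_".join(parts)


names = [
    person_file_name("Smith", "Mary Ann", ["Jones", "Brown"]),
    person_file_name("Adams", "Henry"),
    person_file_name("Smith", "Albert"),
]
# Sorting the names groups people by surname, then by given name.
for name in sorted(names):
    print(name)
```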
Images of documents would lead with the surnames and given names of those to whom they pertained, followed by the type of document and the date of the document or event. That would group all the like documents together for an individual, and entering dates as YYYY_MMM_DD would group like documents for an individual in chronological order by year (Figure 4).
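A short Python sketch of the document naming rule follows. The function name, the sample person and the document type are invented for illustration, and the component order simply follows the description above rather than Figure 4, which is not reproduced here.

```python
from datetime import date

# Three-letter month abbreviations, spelled out to avoid locale differences.
MONTHS = ["JAN", "FEB", "MAR", "APR", "MAY", "JUN",
          "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"]


def document_file_name(surname, given_names, doc_type, when):
    """Build SURNAME_GIVEN_TYPE_YYYY_MMM_DD for a document image."""
    stamp = f"{when.year:04d}_{MONTHS[when.month - 1]}_{when.day:02d}"
    parts = [surname, given_names, doc_type]
    cleaned = ["_".join(p.upper().split()) for p in parts]
    return "_".join(cleaned + [stamp])


files = [
    document_file_name("Smith", "Mary", "Census", date(1900, 1, 10)),
    document_file_name("Smith", "Mary", "Census", date(1880, 6, 5)),
    document_file_name("Smith", "Mary", "Census", date(1900, 4, 15)),
]
for name in sorted(files):
    print(name)
```

An ordinary alphabetical sort puts the 1880 file ahead of the two 1900 files, which is the grouping by year described above. Within a single year the three-letter months sort alphabetically (APR comes before JAN), not by calendar order.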
Marriage record file names would lead with the name of the groom, followed by an ampersand (&), followed by the name of the bride, then the type of record, date of the marriage then place of the marriage. Transcribed records would have the abbreviation "TRANS" added at the end of the file name (Figure 5).
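The marriage record rule could be sketched like this. The order of components follows the paragraph above, but the details (surname-first names, underscores on either side of the ampersand, the sample couple and place) are assumptions, since Figure 5 is not reproduced here.

```python
def marriage_file_name(groom, bride, record_type, date_part, place,
                       transcribed=False):
    """Build GROOM_&_BRIDE_TYPE_DATE_PLACE, with _TRANS for transcriptions.

    All arguments are plain strings; `date_part` is expected in the
    YYYY MMM DD style used for other documents.
    """
    def clean(text):
        # Uppercase, with inner spaces turned into underscores.
        return "_".join(text.upper().split())

    parts = [clean(groom), "&", clean(bride),
             clean(record_type), clean(date_part), clean(place)]
    if transcribed:
        parts.append("TRANS")
    return "_".join(parts)


print(marriage_file_name("Smith John", "Jones Mary", "Marriage Cert",
                         "1901 Jun 12", "Boston MA", transcribed=True))
```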
PLACES & THINGS
File names for places and things were somewhat more problematic, as not every place or thing was associated with an individual, or even a family name. Therefore though surnames were included where appropriate (for example headstones), some file names of places simply began with the address they represented, or the structure they portrayed. Compromises were required occasionally. For example the image of a headstone for a lady seldom included her maiden name. Therefore those images were named using the surname on the marker and the lady's maiden name entered following her given names and enclosed in dashes (-) as used for married names in other circumstances (Figure 6).
Photos of groups of people have always posed a special challenge for genealogists. If there are only a few people in the photo, and if they all share a surname, then the file name can include all of them (Figure 7). Sometimes it is sufficient to name a group photo for the event it portrays, or simply for the family group shown, then add details of individual names using Summary Comments (for JPG or TIF files only) accessible under file Properties in Windows (Figure 8).
The challenge soon gets out of hand when the number of people exceeds what is reasonable to include in a file name, or where surnames could be confused with given names, and vice versa. Clearly we need a solution that allows us to tag individuals within a photo so that their names pop up on a mouseover, or where such tags can be displayed or hidden with the click of a mouse. The tagging capability should travel with the image file and not depend on the end-user installing any special programs. Presently programs such as FotoTagger from Cogitum LC require that both the originator and receiver of a file have the program installed in order to display/hide tags.
Although we may use up to 255 characters in the naming of files there are some areas where abbreviations might be used to keep file names within a reasonable length. However abbreviations should only be used if they do not create questions about their meaning. In addition to the accepted three-letter abbreviations for months of the year, and the two-letter abbreviations for the 50 United States, here is the list of abbreviations that I currently allow myself to use. I should add that if I send a photo file abroad I change the two-letter state abbreviations to the full spelling of those state names.
- & in place of AND
- CEM for CEMETERY
- CERT for CERTIFICATE
- CO for COUNTY
- CP for COPY
- ENH for ENHANCED (This would apply to images that have been altered for clarity)
- PG for PAGE
- REC for RECORD
- REG for REGISTER
- TRANS for TRANSCRIPTION
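The abbreviation list above fits naturally into a lookup table. The Python sketch below applies it to whole underscore-separated components only, so a word such as COPYRIGHT would be left alone. The function name and the sample file name are hypothetical.

```python
# The abbreviations from the list above, keyed by the full word.
ABBREVIATIONS = {
    "AND": "&",
    "CEMETERY": "CEM",
    "CERTIFICATE": "CERT",
    "COUNTY": "CO",
    "COPY": "CP",
    "ENHANCED": "ENH",
    "PAGE": "PG",
    "RECORD": "REC",
    "REGISTER": "REG",
    "TRANSCRIPTION": "TRANS",
}


def abbreviate(file_name: str) -> str:
    """Replace whole components of an underscore-separated name."""
    parts = file_name.split("_")
    return "_".join(ABBREVIATIONS.get(part, part) for part in parts)


print(abbreviate("SMITH_JOHN_BIRTH_CERTIFICATE_COPY"))
# prints SMITH_JOHN_BIRTH_CERT_CP
```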
To help me remember how to apply the standards I have imposed on myself I have created a single-page quick reference guide (Figure 9) which I keep at my desk.
Details aside, it is clear that some organizational plan is necessary if researchers are to keep track of the wealth of image files that rapidly collect as they search. Hopefully the plan described here will be of some use to you, and to other researchers.