Organizing and Sharing Your Digital Images
February 10, 2012
In just a couple of weeks now I will present "Organizing and Sharing Digital Images" to our webinar audience (register here). I'm looking forward to sharing some ideas that might resonate with some of you. Since I recognize that my ideas are not the only ones out there, I'd like to share with you some additional ideas that John Zimmerman, one of our Legacy users, shared with me recently about the subject (with his permission of course). Below John describes how he manages his digital image collections....
A few years ago I visited a wealthy friend and neighbor who had pursued his family history for many years. My friend had already published six large format, hardbound volumes of over 400 pages each, and was contemplating publishing 14 more volumes. He was in the habit of printing 100 copies of each volume, and had donated copies of the books to several libraries. He told me the cost of printing the six volumes, and based on that I projected the cost of future printing to be in excess of $80,000.00! I remarked on that cost to my friend, and the fact that many libraries no longer had space to store paper copies of family histories. I suggested that publishing the same data on CD would reduce the "printing" cost to around $0.20 per disk and would provide the history in a format acceptable to any library, as well as allowing for inclusion of all six previous volumes on the CD. To my great surprise my friend suggested that I accept the job of making that CD a reality.
Though I was excited by the prospect of the job at hand, making good on my claims presented some daunting challenges. Not the least of those challenges involved organizing the multitude of paper photographs, negatives, pictures in books and digital images. I found that my friend had unpublished images and text spread among over a dozen large binders and in paper file folders. Also there were files on five different computers, two of which were original IBM PCs that were inoperative. Reactivating those two dead PCs required replacement of the chips containing the Basic Input-Output System (BIOS). Once I had rejuvenated the inoperable computers I had to recover text and images from obsolete software and consolidate everything on the two PCs that would provide a primary work environment, and a backup storage location. All of that was accomplished over the next few months and I moved on to designing a menu-driven user interface for the CD and a standardized presentation format that was workable across several operating systems, browsers and physical display formats.
Although I had known my friend and his family for many years, there were aspects of his life with which I was not familiar, and it would take me some time to become comfortable with the thousands of photos and documents I had to work with. My friend had kept a daily journal of at least 1500 words since the 1930s and integrating images with that huge volume of text was an additional challenge. I badly needed an organizational plan for text and image files, and I needed it immediately!
In choosing how best to organize the image files I considered what was important about any given image, and concluded that for a genealogist the most significant information was always the name of the person in the photo, or referred to in documents. I then considered how best to identify image files. I started by reviewing how we had identified and organized physical images in the past.
Typically physical photographs and documents are identified by viewing them directly, hopefully recognizing who or what was there. Notes might be written on, or filed with the item explaining the content. However, once more than a few photographs or documents were collected it was necessary to organize them in folders organized by subject or date, and filed in alphabetical, or chronological order. Alternatively items might be tagged with an artificial numbering scheme and filed numerically, which required that an external reference list be created cross referencing the numbers with information identifying the subject of each item in the file.
Then came computers, and although they allowed us to store and view huge numbers of digitized photographs and documents in a very small space, many people continued to use the same schemes they had used to organize paper files. That was due to two factors. One factor was that the old paper filing systems were familiar to most of us, and therefore we were comfortable with them. The other factor was that until 1995 file names on most PCs were limited to eight characters plus a three-character extension. That limitation made it practically impossible to create unique, descriptive file names.
In August of 1995 Windows 95 introduced long file names. That allowed creation of file names of up to 255 characters (with the full path, including the file name, limited to roughly 260 characters), so most digital images could have file names describing exactly what the image contained. I decided that if I named each image file so that the content was obvious from the file name alone, then I could minimize the number of folders to somewhere between one and four by using the folder structure shown in Figure 1. Such descriptive naming would also allow me generally to exchange image files with others without accompanying explanatory text, with the exception of files that contained many people or objects that required detailed identification. Simplification of the folder structure allows for ease of moving those folders if required, and ease of placement of image files where multiple computers and drives are employed.
Figure 1 - Simplified Image Folder Structure
Files stored in the Pictures folder would include only photos of individuals. If a photo contained more than one person it would be filed in the Groups folder. The Docs folder was reserved for documents, and the Places folder was for images of both places and things.
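As a rough illustration only, the filing rule just described could be sketched in Python. The function name `choose_folder` and the labels "photo", "document" and "place_or_thing" are hypothetical and are not part of the original system; only the four folder names come from the text above.

```python
def choose_folder(kind, people_count=0):
    """Pick one of the four folders for an image.

    kind is one of the hypothetical labels "photo", "document"
    or "place_or_thing"; people_count only matters for photos.
    """
    if kind == "document":
        return "Docs"
    if kind == "place_or_thing":
        return "Places"
    if kind == "photo":
        # A photo of more than one person is filed under Groups.
        return "Groups" if people_count > 1 else "Pictures"
    raise ValueError(f"unknown kind: {kind!r}")


print(choose_folder("photo", people_count=1))
print(choose_folder("photo", people_count=4))
```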
Though the use of long file names was emancipating, I realized that the key to making a system work was to develop, and follow, a naming standard. Because the CD would be based on the HyperText Markup Language (HTML) the file names I used could not include spaces, so I chose to connect components of those file names with the underscore (_) character. That character was one of those permitted to be used in HTML code and also provided for a visual separation of file name components. To help file names stand out from within many lines of HTML code I decided to enter file name components in all uppercase, which also prevented any confusion between letters and numbers. At the time my HTML editor did not use color to distinguish items entered between quotation marks, so the uppercase entries worked well in making those entries stand out (Figure 2).
Figure 2 - HTML Example Showing Image File Name Entries
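The two basic rules above (no spaces, components joined by underscores and entered in uppercase) can be sketched in a few lines of Python. This is a minimal sketch, not the tool actually used for the CD: the function name `make_file_name` and the handling of the extension are assumptions made for illustration.

```python
import re


def make_file_name(components, extension="jpg"):
    """Join name components with underscores, in uppercase, with no spaces.

    The extension is appended unchanged; how the original files
    treated extensions is not stated, so that part is an assumption.
    """
    cleaned = []
    for part in components:
        part = part.strip().upper()
        # Any run of spaces inside a component becomes one underscore.
        part = re.sub(r"\s+", "_", part)
        if part:
            cleaned.append(part)
    return "_".join(cleaned) + "." + extension


print(make_file_name(["Zimmerman", "John"]))
print(make_file_name(["Smith", "Mary Ann"], extension="tif"))
```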
Since individuals are always central to genealogical records my naming standard would be based on names. File names for photos of individuals would begin with the subject's surname, followed by their given name, allowing file lists to be sorted on those names. Images of ladies would be named with their maiden surnames leading, followed by their given names then their married names (if known) enclosed in dashes (-) as a way of distinguishing those married names. Ladies who married more than once would have their married surnames entered in the order of those marriages (Figure 3).
Figure 3 - Example Of File Names For Individuals
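For readers who want to generate such names automatically, here is a hedged Python sketch of the rule for individuals. The exact placement of the dashes is an assumption, since the figure is not reproduced in this text: the sketch wraps each married surname in its own pair of dashes. The function name `individual_name` is hypothetical.

```python
def individual_name(surname, given, married_surnames=()):
    """Build a name stem: SURNAME_GIVEN, then married surnames in dashes.

    married_surnames must be listed in the order of the marriages.
    Wrapping each one as -NAME- is an assumed reading of the standard.
    """
    parts = [surname, given]
    parts += [f"-{name.strip()}-" for name in married_surnames]
    return "_".join(p.strip().upper().replace(" ", "_") for p in parts)


print(individual_name("Zimmerman", "John"))
print(individual_name("Smith", "Mary Ann", ["Jones", "Brown"]))
```

Because the maiden surname leads, a sorted file list keeps a woman's photos next to those of her birth family.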
Images of documents would lead with the surnames and given names of those to whom they pertained, followed by the type of document and the date of the document or event. That would group all the like documents together for an individual, and entering dates as YYYY_MMM_DD would group like documents for an individual in chronological order by year (Figure 4).
Figure 4 - Example Of Census Record File Names
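The document rule and the YYYY_MMM_DD date format can be sketched as follows. This is an illustration under assumptions: the function names `date_component` and `document_name` are hypothetical, and the month table is written out by hand so the result does not depend on the computer's language settings.

```python
from datetime import date

MONTHS = ["JAN", "FEB", "MAR", "APR", "MAY", "JUN",
          "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"]


def date_component(d):
    """Format a date as YYYY_MMM_DD, for example 1900_JUN_01."""
    return f"{d.year:04d}_{MONTHS[d.month - 1]}_{d.day:02d}"


def document_name(surname, given, doc_type, d):
    """Build SURNAME_GIVEN_DOCTYPE_YYYY_MMM_DD in uppercase."""
    parts = [surname, given, doc_type]
    stem = "_".join(p.strip().upper().replace(" ", "_") for p in parts)
    return stem + "_" + date_component(d)


names = [
    document_name("Zimmerman", "John", "Census", date(1900, 6, 1)),
    document_name("Zimmerman", "John", "Census", date(1880, 6, 1)),
]
for name in sorted(names):
    print(name)
```

Sorting the list puts the 1880 record ahead of the 1900 record, which is the year ordering described above. Within a single year the three-letter months sort alphabetically (APR before JAN), so the ordering is chronological by year only.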
Marriage record file names would lead with the name of the groom, followed by an ampersand (&), followed by the name of the bride, then the type of record, date of the marriage then place of the marriage. Transcribed records would have the abbreviation "TRANS" added at the end of the file name (Figure 5).
Figure 5 - Example Of Marriage Record File Names
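A marriage record name following this order (groom, ampersand, bride, record type, date, place, optional TRANS) might be assembled as in the Python sketch below. The function name `marriage_record_name` and the sample people and place are invented for illustration, and the exact spacing around the ampersand is an assumption because the figure is not reproduced here.

```python
def marriage_record_name(groom, bride, record_type, date_text, place,
                         transcribed=False):
    """Build GROOM_&_BRIDE_TYPE_DATE_PLACE, adding _TRANS if transcribed.

    date_text is expected to be already formatted as YYYY_MMM_DD.
    """
    parts = [groom, "&", bride, record_type, date_text, place]
    if transcribed:
        parts.append("TRANS")
    return "_".join(p.strip().upper().replace(" ", "_") for p in parts)


print(marriage_record_name("Smith John", "Jones Mary", "Marriage Cert",
                           "1900_JUN_01", "Dayton OH", transcribed=True))
```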
PLACES & THINGS
File names for places and things were somewhat more problematic, as not every place or thing was associated with an individual, or even a family name. Therefore though surnames were included where appropriate (for example headstones), some file names of places simply began with the address they represented, or the structure they portrayed. Compromises were required occasionally. For example the image of a headstone for a lady seldom included her maiden name. Therefore those images were named using the surname on the marker and the lady's maiden name entered following her given names and enclosed in dashes (-) as used for married names in other circumstances (Figure 6).
Figure 6 - Example Of Place & Thing Names Showing Exception Marking For Headstones Of A Married Lady
Photos of groups of people have always posed a special challenge for genealogists. If there are only a few people in the photo, and if they all share a surname, then the file name can include all of them (Figure 7). Sometimes it is sufficient to name a group photo for the event it portrays, or simply for the family group shown, then add details of individual names using Summary Comments (for JPG or TIF files only) accessible under file Properties in Windows (Figure 8).
Figure 7 - Example Of A Simple Group Photo File Name
Figure 8 - Example Of The Use Of Windows Image File Property Summary Comments
The challenge soon gets out of hand when the number of people exceeds what is reasonable to include in a file name, or where surnames could be confused with given names, and vice versa. Clearly we need a solution that allows us to tag individuals within a photo so that their names pop up on a mouseover, or where such tags can be displayed or hidden with the click of a mouse. The tagging capability should travel with the image file and not depend on the end-user installing any special programs. Presently programs such as FotoTagger from Cogitum LC require that both the originator and receiver of a file have the program installed in order to display/hide tags.
Although we may use up to 255 characters in the naming of files there are some areas where abbreviations might be used to keep file names within a reasonable length. However abbreviations should only be used if they do not create questions about their meaning. In addition to the accepted three-letter abbreviations for months of the year, and the two-letter abbreviations for the 50 United States, here is the list of abbreviations that I currently allow myself to use. I should add that if I send a photo file abroad I change the two-letter state abbreviations to the full spelling of those state names.
- & in place of AND
- CEM for CEMETERY
- CERT for CERTIFICATE
- CO for COUNTY
- CP for COPY
- ENH for ENHANCED (This would apply to images that have been altered for clarity)
- PG for PAGE
- REC for RECORD
- REG for REGISTER
- TRANS for TRANSCRIPTION
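The abbreviation list above lends itself to a simple lookup table. The Python sketch below is one possible way to apply it to an underscore-separated name; the function name `abbreviate` is hypothetical, and only whole components are replaced so that a surname containing one of these words is left alone.

```python
ABBREVIATIONS = {
    "AND": "&",
    "CEMETERY": "CEM",
    "CERTIFICATE": "CERT",
    "COUNTY": "CO",
    "COPY": "CP",
    "ENHANCED": "ENH",
    "PAGE": "PG",
    "RECORD": "REC",
    "REGISTER": "REG",
    "TRANSCRIPTION": "TRANS",
}


def abbreviate(stem):
    """Replace whole underscore-separated components with their abbreviations."""
    words = stem.upper().split("_")
    return "_".join(ABBREVIATIONS.get(word, word) for word in words)


print(abbreviate("Smith_John_Death_Certificate_Page_2"))
```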
To help me remember how to apply the standards I have imposed on myself I have created a single-page quick reference guide (Figure 9) which I keep at my desk.
Figure 9 - Image File Naming Standard - Quick Reference
Details aside, it is clear that some organizational plan is necessary if a researcher is to keep track of the wealth of image files that rapidly collect as they search. Hopefully the plan described here will be of some use to you, and to other researchers.
I wish I had started something like this a long time ago but I guess better late than never. Thanks. Do you have your guidelines as a PDF or something we can download? I did copy and save the image. That will give me one more thing to name. :)
Posted by: Sheldon | February 10, 2012 at 11:12 AM
Sheldon - no, there is no PDF, just this article for now.
Posted by: Geoff Rasmussen | February 10, 2012 at 11:25 AM
Is there a way to list files/images that are connected/not connected in Legacy? I have lots of files that are not connected (linked to a person, source or location) and need a way to only look at the files that need work.
Posted by: Debbie Fiske | February 10, 2012 at 11:39 AM
Obviously a lot of work has gone into this system. I would question the use of Windows File Properties however. This is not a standard for embedding information in photographs; it's proprietary to Windows. The information doesn't back up to disc and would be lost with any re-installation of the operating system which is bound to happen at some point. His example is from XP and Windows7 is a different arrangement entirely. And that's just Windows.
Posted by: JL Beeken | February 10, 2012 at 12:19 PM
I would love to rename and move all my images and photos, but then it will all have to be "re-attached" in Legacy, and I do have a lot.
Is there a way to overcome this?
It would have been better if Legacy could have imported the images without you needing to keep the file exactly where it was at that time and named precisely the way it was when added to Legacy! (Hope this sentence makes sense!)
Posted by: Stellajo | February 10, 2012 at 06:05 PM
Agree with an organized system for names of files, whether photos, documents or... I am personally using more short cuts than the full names. Example: census files look like this: year, co., place, surname shortcut, given initial/s.jpg. [1850KYcamDeCOM1.jpg] I only use jpg files for census records. jpg is easier to use in a Word/WordPerfect file. Death Certificates are Death, surname short cut [PA for Pack], given, place. In the PhotoShop Elements software version 9, I use the file properties to tell info about the pictures.
Long file names do not work well for me. Just thought I would share another way of doing it. Good to have a standards sheet for all the ways you do your genealogy in a software program like Legacy, your files and your research process.
Posted by: Annette D Towler | February 12, 2012 at 03:22 PM
Your article is very timely for me as I have just begun to organize my 10 years of 'collecting' data, and my mother's 30 years of paper research. Filing and relocating downloaded documents and family photos has been a nightmare for me. This system will help not only in grouping but also keep me from re-downloading things I already have! Thank you for sharing.
Posted by: Relativespirits | February 12, 2012 at 03:57 PM
Re: Posted by: JL Beeken | February 10, 2012 at 12:19 PM
"I would question the use of Windows File Properties however. This is not a standard for embedding information in photographs; it's proprietary to Windows. The information doesn't back up to disc and would be lost with any re-installation of the operating system which is bound to happen at some point. His example is from XP and Windows7 is a different arrangement entirely. And that's just Windows."
Windows File Properties for a photograph are, in fact, taken from the "standard" data that can be stored as part of the file. There are two standards. IPTC Data for comments and descriptions of the contents and EXIF for recording details of the camera and camera settings. Windows XP uses the IPTC data from the file and allows you to edit and store that data. Windows 7 displays both sets and will also let you edit the IPTC Data portion just like XP. There are also other software products such as the Free Irfanview which will let you edit the IPTC data for the picture if you prefer not to use Windows properties screens to do so.
Posted by: Brian Kelly | February 14, 2012 at 09:05 AM
I have found the IPTC data feature invaluable for archiving image file source and content info - you do have to complete any image editing in uncompressed formats first. I use free Xnview, which has excellent metadata tools and the ability to save templates, among other good features. File naming is always a headache and who hasn't transitioned through several schemes. A big concern for me, besides ID hints, is naming for useful alphabetical sorting - yeah, I still use file managers and navigate directory and folder trees.
Posted by: Arthur Dirks | February 16, 2012 at 11:19 AM
These are good suggestions. I've been using similar schemes with my photos.
It would be nice to see some better image management features rolled into Legacy. The current method of starting with a person and connecting images can be cumbersome and it has kept me from linking in many of my existing images. Imagine if we could instead start with a group photo or census image and then identify the list of people who should all be linked to it. Combine this with some region tagging and it would make image management so much easier. With Legacy's already powerful reporting, there is a wealth of information that could then be discovered.
Posted by: Tom Nisbet | February 16, 2012 at 12:58 PM
Tom - Legacy has tools to do what you are suggesting. See the Picture Center. It's available from the Tools menu.
Posted by: Geoff Rasmussen | February 16, 2012 at 02:27 PM
Picture Center - how did I miss that? I'll give it a try as soon as I get the migration to the new computer sorted out.
Posted by: Tom Nisbet | February 16, 2012 at 02:47 PM
"Windows File Properties for a photograph are, in fact, taken from the "standard" data that can be stored as part of the file. There are two standards. IPTC Data for comments and descriptions of the contents and EXIF for recording details of the camera and camera settings. Windows XP uses the IPTC data from the file and allows you to edit and store that data. Windows 7 displays both sets and will also let you edit the IPTC Data portion just like XP. There are also other software products such as the Free Irfanview which will let you edit the IPTC data for the picture if you prefer not to use Windows properties screens to do so."
XP must have changed that in the past few years because Windows File Properties was NOT based on the IPTC standard. And 'standard' metadata in Windows, period, yep good luck with that.
Posted by: JL Beeken | February 16, 2012 at 04:26 PM
Another partial solution for this problem is to avoid loading the filename with so much detail and have a catalogue which contains all the detail.
This does not help when sending a few photographs to another researcher as the files are not self-documenting. (But you could also send an excerpt from the catalogue.)
I give my files fairly short names (say, 4-12 characters) that are meaningful to me but give very limited information. For example, using initials, awc1 is the first photo of my mother, jb1882 is my great grandfather in 1882. I avoid spaces and dashes because some websites do not handle them correctly.
With this method, it is perhaps more difficult to ensure that there are no duplicate names, but that is important.
I document the detail of the photograph in a catalogue, which contains the image and its full description. This includes People, Date, Place, Relationships, Picture Source, Image Name. This gives me space to be vague about dates and speculate about relationships and places.
Currently, my catalogue is in Word. I have a printed version for reference and a PDF version on my website.
I have only about 270 photos, so maybe my way of handling this would not work for those with several thousand?
Another option is to use a photo enhancing program to add a border to the photo and then add text in the border. I tried this a few years ago but was disappointed by the amount of text that could reasonably be added, and by the resulting file size.
This could well be a good option for documenting your photos (your catalogue) and for exchanging a few with others, but probably not for display on webpages.
Posted by: Harold Silander | February 19, 2012 at 12:24 PM
I had put off including my digital images in Legacy for several years, simply because of the amount of work involved. But having returned from NZ [to Australia] recently with over 700 new images, I have begun the hard work. I was previously keeping images in an images folder, but have now decided to keep all digital material (photos, documents) in a Surnames folder with a subfolder for each surname, and the file-naming convention "surname + firstname + brief description". The most time-consuming part is the number of versions of each photo that need to be made: from the original digital image, I make an enhanced version with a caption (for sharing with others, and to guarantee that the identities will travel with the image), and another without a caption for inclusion in Legacy (but with a text caption included in Legacy; this is the photo that I attach to all the relevant persons thru Picture Center). Then there are cropped versions where I want to attach a close-up to an individual, and sometimes an image of the back of the original. That makes for up to five or more versions of each image, with appropriate tags at the end of the filename: orig, caption, cropped, script (for the back of the photo).
Recording the origin of photos is also important: it's hard enough now to remember whether I got a particular photo from my own parents' archives or from other cousins, but being able to say where you got a photo from adds significantly to its authenticity. I have a handful of 1880s cartes de visite, some with names on the back, but can no longer remember exactly who gave them to me over 20 years ago. This is information that I will now include in the notes field for each photo as I add it to Legacy.
I considered the option of attaching photos to events, but given the lack of output options, I am generally simply attaching photos to individuals rather than to events. For cemeteries, for example, Sherry had suggested creating a Cemetery event to attach gravestone photos, but they were so inaccessible that I am simply attaching them to the individual as well. The cemetery information is already in the burial field, so creating another event for the same info was duplication.
Posted by: Patrick O'Neill | February 21, 2012 at 04:38 AM
I see you use a DOCS folder under your PICTURES folder. I'm guessing you started this before Legacy added its own DOCS folder at the same level as PICTURES, SOUNDS, and VIDEO.
My question - do you put ALL of your documents in the DOC folder? Or just ones in JPG/TIFF/etc format? And thus you put other docs (PDF, spreadsheet, word-processor, web, TXT, etc) in the "Legacy" DOC folder? Thoughts/suggestions appreciated.
Posted by: Robert Patton | February 27, 2012 at 09:45 PM