I developed Met[A]lbum, short for Metadata Album, which is software for creating electronic photo albums. These are Portable Document Format (PDF) files with one image per page, along with associated metadata such as headline, caption, and date. One feature is the ability to sort images by keyword. However, sorting by keyword is not straightforward. I shall describe the algorithm I used.
An image can have a combination of several keywords, like “Family,” “F00001,” “Portrait,” “Private,” and “Vacation.” For a collection of photos in an album, it is not just a matter of sorting keywords alphabetically. The alphabetical order of the keywords is usually not the order in which you want photos to appear.
Keywords are also cryptic. They are short words or phrases that often represent more detailed concepts. For example, in a genealogy application, the keyword “F00001” might mean the “John P. Smith Family.” It is much easier and less error-prone to assign short, consistent keywords like “F00001” or “Smith” to your images than the verbose “John P. Smith Family.”
The Algorithm
Sorting keywords begins by determining all keywords in all the images to be sorted. From this list, select just those keywords by which you want to sort your images. Then arrange these selected keywords in the order they should appear.
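As a minimal sketch of this first step, assume each image's keywords are available as a plain array. The `$images` data, the function name `collectKeywords`, and the chosen `$sortOrder` are hypothetical illustrations, not Met[A]lbum's actual code.

```php
<?php
// Hypothetical data: each image is represented only by its keyword list.
$images = [
    'beach.jpg'   => ['Vacation', 'Family', 'F00001'],
    'grandpa.jpg' => ['Portrait', 'F00001'],
    'office.jpg'  => ['Private'],
];

// Gather every distinct keyword used by any image, sorted alphabetically
// so it can be shown to the user as a list of available keywords.
function collectKeywords(array $images): array
{
    $all = [];
    foreach ($images as $keywords) {
        $all = array_merge($all, $keywords);
    }
    $available = array_values(array_unique($all));
    sort($available, SORT_STRING);
    return $available;
}

$available = collectKeywords($images);
print_r($available);

// The user then picks a subset and arranges it by hand.
// The position in this array, not the alphabet, defines the sort order.
$sortOrder = ['Vacation', 'F00001'];
```

Only the available list is sorted alphabetically, and only for display. The selected list keeps whatever order the user gives it.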
You can also display a more detailed phrase that corresponds to a given keyword as a section heading, or in the table of contents of your photo album. In that case, you should also define a text phrase, or “Label,” for each of your selected keywords. For instance, the label “John P. Smith Family” can display in the album in place of the keyword “F00001.”
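One way to picture the labels is as a lookup table from keyword to display text. The sketch below is an assumption about how such a table could work: the names `$labels` and `labelFor`, the sample label “Vacations,” and the fallback to the bare keyword when no label is defined are all hypothetical.

```php
<?php
// Hypothetical label table: keyword => text shown in the album.
$labels = [
    'F00001'   => 'John P. Smith Family',
    'Vacation' => 'Vacations',
];

// Return the label for a keyword. Assumption: when no label was
// defined, the keyword itself is displayed.
function labelFor(string $keyword, array $labels): string
{
    return $labels[$keyword] ?? $keyword;
}

echo labelFor('F00001', $labels), "\n";   // John P. Smith Family
echo labelFor('Portrait', $labels), "\n"; // Portrait
```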
Here is my user interface. First select a keyword from the “Available” list and click the arrow to move it to the “Sort Order” list. Move the keyword Up or Down in the “Sort Order” list as desired. Finally, add “Labels” that correspond to each keyword, if desired.
Sort Comparison Function
Sorting images is just a matter of sorting an array of image descriptors. Each descriptor holds information about the image, like its file name and metadata, including keywords. There are standard array sorting functions, and most take the name of a comparison function that they call to compare the elements of your array. This custom comparison function determines whether one image is “less than,” “greater than,” or “equal to” another image. Whereas a normal sort would simply compare the letters of a keyword alphabetically (e.g., “A” is less than “B”), a keyword sort tests for the existence of the given keyword.
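To illustrate the mechanism with PHP's standard `usort()`, here is a small sketch that sorts on a single keyword only. The descriptor layout (an array with `file` and `keywords` entries) and the function name `compareByKeyword` are assumptions made for the example.

```php
<?php
// Hypothetical image descriptors reduced to a file name and keyword list.
$images = [
    ['file' => 'a.jpg', 'keywords' => ['Portrait']],
    ['file' => 'b.jpg', 'keywords' => ['Family']],
    ['file' => 'c.jpg', 'keywords' => ['Family', 'Portrait']],
];

// Comparison callback for one keyword: images that have it sort first.
// usort() expects a negative, zero, or positive integer.
function compareByKeyword(array $a, array $b, string $keyword): int
{
    $aHas = in_array($keyword, $a['keywords'], true);
    $bHas = in_array($keyword, $b['keywords'], true);
    return (int)$bHas <=> (int)$aHas;   // "has" (1) sorts before "lacks" (0)
}

usort($images, fn($a, $b) => compareByKeyword($a, $b, 'Family'));

// The two "Family" images now precede a.jpg.
echo implode(', ', array_column($images, 'file')), "\n";
```

The callback never compares the spelling of keywords. It only asks whether each image has the keyword.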
In PHP, the comparison function might appear as follows. It is a member of an image descriptor class. Assignments to temporary variables are made for readability. Also note that keywords can be stored either as textual values or as numeric values, where a unique number like a database key represents a given keyword. $rgKeywordOrder holds the keywords of the “Sort Order” list, and $nKeywordCount is initialized to the number of keywords in that list. $nKeywordIndex is the index of the keyword in the “Sort Order” list currently being compared.
    static function CompareKeywords($a, $b)
    {
        $nResult = 0;
        if (self::$nKeywordCount) {
            $save = self::$nKeywordIndex;
            $keyword = self::$rgKeywordOrder[$save];
            $rgA = $a->GetKeywordArray();
            $rgB = $b->GetKeywordArray();
            $a1 = in_array($keyword, $rgA);
            $b1 = in_array($keyword, $rgB);
            if ( ($a1) && !($b1) )
                $nResult = -1;
            else if ( !($a1) && ($b1) )
                $nResult = +1;
            else {
                if (self::$nKeywordIndex < self::$nKeywordCount - 1) {
                    self::$nKeywordIndex++;
                    $nResult = self::CompareKeywords($a, $b);
                    self::$nKeywordIndex--;
                }
                else
                    $nResult = self::CompareNext($a, $b);
            }
        }
        return $nResult;
    }
When comparing two images, if the first image has the keyword but the second does not, the first appears before the second. If the second has it but the first does not, the second appears before the first. If both images have the keyword, or neither does, compare the images for the existence of the next selected keyword in the “Sort Order” list. Finally, if the “Sort Order” list is exhausted, go on to compare images by other sort criteria, if any. Here, CompareNext() uses different sort comparison functions to compare, for instance, headlines alphabetically or dates chronologically.
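The same rules can be written as a loop instead of recursion. The following self-contained sketch is not the Met[A]lbum code: it replaces the descriptor class with plain arrays, passes the sort order as a parameter, and assumes the fallback criterion is an alphabetical comparison of headlines. All names and data in it are hypothetical.

```php
<?php
// Hypothetical stand-ins for the image descriptors and the "Sort Order" list.
$sortOrder = ['Family', 'Vacation'];
$images = [
    ['headline' => 'Office party', 'keywords' => ['Private']],
    ['headline' => 'Beach day',    'keywords' => ['Vacation']],
    ['headline' => 'Reunion',      'keywords' => ['Family']],
    ['headline' => 'Road trip',    'keywords' => ['Family', 'Vacation']],
    ['headline' => 'Airport',      'keywords' => ['Vacation']],
];

// Walk the sort order; the first keyword that only one image has decides.
function compareImages(array $a, array $b, array $sortOrder): int
{
    foreach ($sortOrder as $keyword) {
        $aHas = in_array($keyword, $a['keywords'], true);
        $bHas = in_array($keyword, $b['keywords'], true);
        if ($aHas !== $bHas) {
            return $aHas ? -1 : 1;
        }
        // Both have it, or neither does: try the next keyword.
    }
    // Sort order exhausted: fall back to another criterion (here, the headline).
    return strcmp($a['headline'], $b['headline']);
}

usort($images, fn($a, $b) => compareImages($a, $b, $sortOrder));
echo implode(', ', array_column($images, 'headline')), "\n";
// Road trip, Reunion, Airport, Beach day, Office party
```

The two “Vacation”-only images tie on every selected keyword, so the headline comparison places “Airport” before “Beach day.”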
This algorithm works well when the keywords are mutually exclusive, that is, when a given image has one and only one of the keywords in the sort order list. It also works when images have several of the selected keywords. In those cases, keywords earlier in the sort order take precedence: an image having the first keyword appears before any image lacking it, whatever other keywords either image has, and ties are broken by the next keyword in the list. For instance, an image that has three of three selected keywords appears first, and images having none appear last. With two selected keywords, images having both appear first, then those having only the first keyword, then those having only the second, and finally those having neither. A given image appears only once in the album no matter how many matching keywords it has.
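A convenient way to check the resulting order is to give each image a “signature” with one character per selected keyword. This is an equivalent view of the comparison for illustration, not how Met[A]lbum implements it. The function name `signature` and the sample data are hypothetical.

```php
<?php
// Hypothetical sort order and images (keyed by file name).
$sortOrder = ['Family', 'Portrait', 'Vacation'];
$images = [
    'none.jpg'  => ['Private'],
    'two.jpg'   => ['Portrait', 'Vacation'],
    'one.jpg'   => ['Family'],
    'three.jpg' => ['Family', 'Portrait', 'Vacation'],
];

// Build a string such as "101": one character per selected keyword,
// "1" if the image has it, "0" if not.
function signature(array $keywords, array $sortOrder): string
{
    $bits = '';
    foreach ($sortOrder as $keyword) {
        $bits .= in_array($keyword, $keywords, true) ? '1' : '0';
    }
    return $bits;
}

// Sorting signatures in descending order reproduces the keyword ordering.
uasort($images, fn($a, $b) =>
    strcmp(signature($b, $sortOrder), signature($a, $sortOrder)));

foreach ($images as $file => $keywords) {
    echo signature($keywords, $sortOrder), "  ", $file, "\n";
}
// 111  three.jpg
// 100  one.jpg
// 011  two.jpg
// 000  none.jpg
```

Note that one.jpg, which has only the first keyword, precedes two.jpg, which has the other two: the position of a keyword in the sort order outweighs the number of matches.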
Keyword Filters
Images that have none of the selected keywords will therefore appear at the end of the album. This may or may not be desirable. If they are included and you are using keyword labels, there should be a label that applies to all such images, like “Unspecified.” If such images should not be included, two enhancements are required to limit which images are processed in the first place.
The first is an “inclusion” filter. The user can include only those images that contain certain keywords. This can both eliminate images having none of the selected keywords and selectively restrict other images as well. The filter can have multiple keywords, and it can be applied as either “All” or “Any.” That is, images to be included must have ALL of the specified keywords, or they must have ANY one of the specified keywords. This subtle distinction is useful depending on what you want to include in a photo album.
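The “All” versus “Any” distinction can be sketched with `array_intersect()`. The function name `passesInclusion`, the `'all'`/`'any'` mode strings, and the choice to let an empty filter include every image are assumptions made for this example.

```php
<?php
// Return true if the image passes the inclusion filter.
// $mode is 'all' (every filter keyword required) or 'any' (at least one).
function passesInclusion(array $imageKeywords, array $filter, string $mode): bool
{
    if ($filter === []) {
        return true;   // assumption: an empty filter includes everything
    }
    // Count how many of the filter's keywords the image actually has.
    $matches = count(array_intersect($filter, $imageKeywords));
    return $mode === 'all'
        ? $matches === count($filter)
        : $matches > 0;
}

$keywords = ['Family', 'Portrait'];
var_dump(passesInclusion($keywords, ['Family', 'Vacation'], 'any')); // bool(true)
var_dump(passesInclusion($keywords, ['Family', 'Vacation'], 'all')); // bool(false)
```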
The second enhancement is an “exclusion” filter. It is used to exclude images from processing if they have certain keywords. This can also be “All” or “Any,” although practically, if an image has ANY keyword in the exclusion list, it is probably intended to be excluded.
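An exclusion filter can be sketched the same way and applied with `array_filter()` before any sorting takes place. The function name `isExcluded`, the default mode of `'any'`, and the sample images are hypothetical.

```php
<?php
// Hypothetical images keyed by file name.
$images = [
    'beach.jpg' => ['Vacation', 'Family'],
    'diary.jpg' => ['Private'],
    'party.jpg' => ['Family', 'Private'],
];

// Return true if the image should be dropped. With 'any', a single
// matching keyword is enough; with 'all', every listed keyword must match.
function isExcluded(array $imageKeywords, array $exclude, string $mode = 'any'): bool
{
    if ($exclude === []) {
        return false;   // nothing to exclude
    }
    $matches = count(array_intersect($exclude, $imageKeywords));
    return $mode === 'all' ? $matches === count($exclude) : $matches > 0;
}

// Keep only the images that are not excluded by the keyword "Private".
$kept = array_filter($images, fn($keywords) => !isExcluded($keywords, ['Private']));
print_r(array_keys($kept));   // only beach.jpg remains
```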
Summary
Sorting by keyword is more complicated than a simple alphabetical sort of the keywords themselves. Instead, it tests for the existence of keywords in a user-defined order. Provision should be made to expand short, cryptic keywords to their full meaning in the context of a photo album. Finally, including or excluding images having certain keywords is also considered part of “sorting” images.