Cover Story (sidebar) / August 1994

Image Retrieval for Compound Documents

Tom R. Halfhill

Archived information is useless if it can't be retrieved, and the easiest kind of information to retrieve is textual: documents that either originated in machine-readable form or were converted to ASCII text by OCR scanning. A much more difficult challenge is to retrieve image files or compound documents in which the target of the search is a graphic, a video clip, or a sound bite. As the definition of what constitutes a "document" evolves to include files with multiple embedded data types, the ability to search for attributes unique to those data types becomes increasingly important.

Fortunately, image-recognition technology is advancing at a rapid pace, driven by industrial, military, and law-enforcement needs, as well as business applications. Manufacturers are relying more and more on machine vision and pattern recognition to automate their inspection and grading processes. For example, a Windows-based color vision program called Way-2C from Ronald A. Massa Associates (Cohasset, MA) is used to grade lumber, inspect soda crackers, and sort pills. The military uses similar technology for target acquisition and automated sentry posts. Police departments are using pattern recognition to identify fingerprints and match photos of suspects to digitized mug shots in computerized databases.

In the past, the only reliable way to index images was to tag them with keywords describing their content. This is still a worthwhile method, but the effort required to keypunch a caption for each image is justified only if you expect to retrieve the images often, as in the case of a stock-photo agency. Another drawback is that you can't always anticipate the parameters of a search; someday, you may want to locate an image that isn't described by any of its keywords.
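To make the trade-off concrete, here is a minimal sketch of keyword indexing in Python. It is an illustration only, not the method of any product mentioned in this article; the file names, captions, and function names are all hypothetical.

```python
from collections import defaultdict


def build_keyword_index(captions):
    """Map each keyword to the set of image names tagged with it."""
    index = defaultdict(set)
    for image_name, keywords in captions.items():
        for keyword in keywords:
            index[keyword.lower()].add(image_name)
    return index


def search(index, *keywords):
    """Return the images tagged with every requested keyword."""
    if not keywords:
        return set()
    # index.get() avoids creating empty entries for unknown keywords.
    matches = [index.get(keyword.lower(), set()) for keyword in keywords]
    return set.intersection(*matches)


# Hypothetical captions that someone had to type in by hand.
captions = {
    "img001.tif": ["flower", "yellow", "garden"],
    "img002.tif": ["flower", "red"],
    "img003.tif": ["building", "dome"],
}
index = build_keyword_index(captions)

print(search(index, "flower", "yellow"))  # finds the tagged image
print(search(index, "sunset"))            # nothing was tagged "sunset"
```

The second query shows the drawback: an image can be found only through words that someone thought to attach to it in advance.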

Pattern-recognition algorithms, some based on neural networks, are getting smart enough to search common image formats such as TIFF files for specific shapes, colors, or textures. A leading product in this field is the Excalibur EFS document imaging system from Excalibur Technologies (San Diego, CA), which runs on PC, Macintosh, and X-Terminal clients. Excalibur has developed a technique called adaptive pattern recognition that can analyze all types of digital data, including graphics and sound. The vision routines are so accurate that Weyerhaeuser uses them to sort different grades of hardboard siding by examining the wood grain.

IBM (White Plains, NY) recently announced a similar product called Visualizer Ultimedia Query, an OS/2-based DB2 client. Visualizer uses a technology known as QBIC (query by image content), developed by IBM's Almaden Research Center and the Santa Teresa Laboratory. A photo editor could use Visualizer to retrieve pictures of flowers containing a specific shade of yellow or a particular arrangement of blossoms.
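One simple way to picture a color-based query is to compare color histograms. The Python sketch below is an assumption made for illustration, not IBM's QBIC algorithm or any vendor's API: it reduces each image to a coarse RGB histogram and ranks the images in a database by their distance from a sample. The tiny pixel lists, file names, and function names are hypothetical.

```python
def color_histogram(pixels, bins_per_channel=4):
    """Return a normalized histogram of coarsely quantized RGB colors."""
    if not pixels:
        raise ValueError("image has no pixels")
    counts = [0] * (bins_per_channel ** 3)
    for r, g, b in pixels:
        # Scale each 0-255 channel value to a bin number 0..bins-1.
        rb = r * bins_per_channel // 256
        gb = g * bins_per_channel // 256
        bb = b * bins_per_channel // 256
        counts[(rb * bins_per_channel + gb) * bins_per_channel + bb] += 1
    total = len(pixels)
    return [count / total for count in counts]


def histogram_distance(h1, h2):
    """Sum of absolute differences: 0.0 means identical, 2.0 means disjoint."""
    return sum(abs(a - b) for a, b in zip(h1, h2))


def rank_by_color(sample_pixels, database):
    """Return image names ordered from closest to farthest from the sample."""
    sample = color_histogram(sample_pixels)
    scored = [
        (histogram_distance(sample, color_histogram(pixels)), name)
        for name, pixels in database.items()
    ]
    return [name for _, name in sorted(scored)]


# Hypothetical eight-pixel "images" standing in for real scans.
YELLOW, RED, GREEN, GRAY = (250, 220, 30), (220, 30, 40), (40, 140, 40), (128, 128, 128)
database = {
    "yellow_flower.tif": [YELLOW] * 6 + [GREEN] * 2,
    "red_flower.tif": [RED] * 5 + [YELLOW] * 1 + [GREEN] * 2,
    "gray_dome.tif": [GRAY] * 8,
}

sample = [(255, 230, 20)] * 8  # the shade of yellow being sought
ranking = rank_by_color(sample, database)
print(ranking)
```

A histogram ignores where the colors appear, so a sketch like this can match a shade of yellow but not an arrangement of blossoms; shape and texture queries need additional features.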

Highly specialized applications may require custom solutions, and tools are available for this purpose. For instance, Excalibur sells its recognition routines as a collection of C libraries called the XRS toolkit. A similar package of C routines, the Matrox Imaging Library, is available from Matrox (Dorval, Quebec, Canada). You can even buy PC-based tools for creating pattern-recognition programs with neural networks, such as DS2000 from Design Sciences (Vienna, VA), which supports several different network models.

As computers become more adept at handling multimedia data types, the ability to efficiently store and retrieve compound documents will become as crucial to business as the paper-filled file cabinets that make up the bulk of corporate archives today. Luckily, the technology isn't lagging too far behind the demand.

Illustration: IBM's Visualizer Ultimedia Query can locate images in a database by shape or color. In this example, the product has located all gray, circular architectural images. Results are displayed in closest-to-sample order.

Tom R. Halfhill is a BYTE senior news editor based in San Mateo, California. You can reach him on the Internet or BIX at thalfhill@bix.com.

Copyright 1994-1998 BYTE
