Friday, July 27, 2007

DjVu

DjVu (pronounced like "déjà vu") is a computer file format designed primarily to store scanned images, especially those containing text and line drawings. It uses technologies such as separating each image into layers for text and for background/images, progressive loading, arithmetic coding, and lossy compression for bitonal images. This allows high-quality, readable images to be stored in very little space, so that they can be made available on the web.

DjVu has been promoted as an alternative to PDF, and for most scanned documents it yields smaller files than PDF. The DjVu developers report that color magazine pages compress to 40–70 KB, black-and-white technical papers compress to 15–40 KB, and ancient manuscripts compress to around 100 KB; all of these are significantly smaller than the typical 500 KB required for a satisfactory JPEG image. Like PDF, DjVu can contain an OCRed text layer, making it easy to copy and paste text and to search it.
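
To put those numbers in perspective, here is a small sketch that works out how many times smaller each quoted DjVu size is than the 500 KB JPEG baseline. The sizes are the developers' reported figures from the paragraph above; the names `reduction_factor` and `summarize` are made up for this illustration, and treating "around 100KB" as exactly 100 is an assumption.

```python
# Sizes in KB, taken from the figures quoted above.
# The JPEG baseline is the "typical 500KB" figure, not a measurement.
JPEG_BASELINE_KB = 500

# (low, high) reported DjVu sizes per document type.
# "around 100KB" is treated as exactly 100 here, which is an assumption.
DJVU_SIZES_KB = {
    "color magazine page": (40, 70),
    "black-and-white technical paper": (15, 40),
    "ancient manuscript": (100, 100),
}


def reduction_factor(djvu_kb, baseline_kb=JPEG_BASELINE_KB):
    """Return how many times smaller the DjVu file is than the baseline."""
    return baseline_kb / djvu_kb


def summarize(sizes=DJVU_SIZES_KB):
    """Map each document type to its (smallest, largest) reduction factor."""
    return {
        name: (reduction_factor(high), reduction_factor(low))
        for name, (low, high) in sizes.items()
    }


if __name__ == "__main__":
    for name, (smallest, largest) in summarize().items():
        print(f"{name}: {smallest:.1f}x to {largest:.1f}x smaller than JPEG")
```

By this arithmetic, magazine pages come out roughly 7 to 12.5 times smaller, technical papers 12.5 to about 33 times smaller, and manuscripts about 5 times smaller. These ratios only restate the quoted figures; real results will depend on the scan and the encoder settings.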