
Gamera FAQ

last modified 2007-08-20

Some frequently asked questions and (hopefully) some answers.

General

  • What is Gamera?

    Gamera is a toolkit for building document image recognition systems. It consists of a programming library and a set of GUI tools for experimentation and training. Gamera aims to reduce the development time of document recognition applications by including a number of commonly used components, preventing "reinvention of wheels" whenever possible. Please see the Gamera overview for more information.

    The term "document" is used loosely, and can include many kinds of information presented in two-dimensional form. Gamera has been used to build recognizers for common music notation, medieval manuscripts, and other things.

  • What is Gamera not?

    Gamera is not a packaged document recognition system, such as OmniPage or MIDISCAN. It is a tool with which one can develop document recognition applications, but it is not one itself. Developing a recognizer with Gamera is designed to be as easy as possible, but still requires a considerable time commitment.

    Gamera's focus is somewhat biased towards document types that are not well supported by existing, off-the-shelf software. Certain document types, such as medieval manuscripts, are unlikely to provide the financial incentive to support the development of a commercial application.

  • Why the name "Gamera"?

    Gamera is the acronym for "Generalized Algorithms and Methods for Enhancement and Restoration of Archives". The software, which grew out of our research on a system called AOMR (Adaptive Optical Music Recognition), was christened as Gamera on 1 April 2001.

    Gamera is also the name of an overgrown turtle in a series of Japanese monster movies. There is some hope that the software, like the tortoise in "The Tortoise and the Hare", will eventually be triumphant.

  • What sorts of scripts will Gamera work with?

    This is "script" in the sense of "writing system", not "scripting language".

    • For scripts with small character sets and (mostly) well-segmented characters (e.g. Latin, Greek, Hebrew, Cyrillic), Gamera performs very well.
    • For cursive machine-printed scripts (e.g. Arabic) we have an active research project to implement segmentation-free recognition.
    • For large character sets (e.g. Kanji) some sort of syntactical or structural analysis of the character is necessary. This sort of thing is not implemented in Gamera at present, but there is nothing stopping an interested researcher from adding these features.
    • I'm sure there are other categories of which I'm completely ignorant.

    And don't forget that Gamera has also been used to develop systems for non-text structured documents such as common music notation and lute tablature.

  • Why can't I put my image in and get text out?

    See the question "What is Gamera not?". There is a rudimentary framework for text extraction in roman_text.py. However, expect that a lot of customization will be necessary for each document domain.

  • How should I get started?

    It is helpful to have a background in programming. A basic knowledge of Python is required, but most people who have experience in another mainstream language generally find Python easy to learn. The recommended reading for starters is:

  • How can I get help?

    The gamera-devel mailing list on Yahoo! Groups is the best way to contact the authors and other members of the community. If you are running into a bug, please be sure to include the following information:

    • The versions of Gamera, Python and wxPython you are using
    • Your platform
    • Any output or backtraces that are being produced

  • How should I cite Gamera (in an academic paper, etc.)?

    The canonical URL for the Gamera website is http://gamera.sourceforge.net/. That URL will always contain the most up-to-date information on Gamera, with links to the official documentation and published papers.

    If you are required to cite a published paper rather than a website, the most extensive and current information is in:

          Droettboom, M, MacMillan, K, and Fujinaga, I (2003). The Gamera
          framework for building custom recognition systems.
          Symposium on Document Image Understanding Technologies: 275-86.
    

    These proceedings are difficult to obtain, but the paper is available in PDF on the Gamera website.

Installation

  • I can't get Gamera to run.

    First check the following:

    • Make sure you have the correct version of Python installed: 2.2.2 or greater on Linux, 2.3.1 or greater on Windows, and 2.3.0 or greater on OS X. Verify that Python is installed correctly by running any of the demonstration scripts.
    • Make sure you have the correct version of wxPython installed. Recent versions in the 2.5.x series are unstable development releases and are not supported by Gamera. You will need to visit the complete list of wxPython releases to download a 2.4.x version.
    • If you are running Gamera from the command line, try running the gamera_gui script from a directory other than the Gamera source directory.

    If these things fail, please send a message to the mailing list. Include in your message the Python backtrace, the versions of Gamera, Python and wxPython, and the platform you are using.
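
    A small script can gather most of the details requested above. This is only a sketch: the helper name collect_versions is made up for illustration, and it assumes (without guarantee) that the wx and gamera packages expose a __version__ attribute, falling back to "unknown" or "not importable" when they do not.

```python
import platform


def collect_versions():
    """Gather the version details worth pasting into a bug report."""
    info = {
        "Python": platform.python_version(),
        "Platform": platform.platform(),
    }
    # Assumption: both packages expose __version__; fall back if they do not.
    for label, module_name in (("wxPython", "wx"), ("Gamera", "gamera")):
        try:
            module = __import__(module_name)
        except Exception:  # missing or broken installation
            info[label] = "not importable"
        else:
            info[label] = getattr(module, "__version__", "unknown")
    return info


report = collect_versions()
for label in sorted(report):
    print("%s: %s" % (label, report[label]))
```

    The Python backtrace still has to be copied by hand from the terminal where the failure occurred.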

  • I just upgraded to Gamera 3.x and now I get all these deprecation warnings that I never used to see before.

    There are some functions in Gamera 3.x that have been deprecated. They will continue to work until a future release, but you will receive warnings. See the migration guide for more information.

Writing code

  • How do I write a Gamera script?

    Gamera scripts are just Python scripts that import Gamera's modules. It is definitely a good idea to familiarise yourself with the basics of Python before diving in.

    There are a number of really basic scripts in the documentation to help get you started.

  • After classification, how do I get the results?

    The classifier stores its classifications in the id_name member variable of images. This id_name member is actually a list of possible classifications. See the id_name documentation for more information.

    When you pass a list into classifier.classify_list_automatic or classifier.group_list_automatic, the list itself is not modified. Instead, any glyphs that should be added or removed are returned in a tuple of lists (added, removed). Therefore, to get any glyphs that were newly created by either splitting or grouping, you have to do the following:

          added, removed = classifier.group_list_automatic(glyphs)
          glyphs += added
    

    There is also a convenience function `classifier.classify_and_update_list_automatic` which handles this for you.
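
    The bookkeeping described above can be tried out without Gamera installed. In the sketch below, FakeGlyph, fake_group_list_automatic and update_list are hypothetical stand-ins, not Gamera classes or functions. The sketch also assumes that each id_name entry is a (confidence, name) pair with the best guess first, and that glyphs returned in removed should be dropped from the list; check both assumptions against the id_name and classifier documentation.

```python
class FakeGlyph(object):
    """Hypothetical stand-in for a Gamera glyph (not a Gamera class)."""

    def __init__(self, id_name):
        # Assumed layout: list of (confidence, symbol_name), best first.
        self.id_name = id_name


def fake_group_list_automatic(glyphs):
    """Stand-in that mimics the (added, removed) return convention."""
    parts = [g for g in glyphs if g.id_name[0][1] in ("stem", "dot")]
    if len(parts) < 2:
        return [], []
    grouped = FakeGlyph([(0.9, "lower.i")])
    return [grouped], parts


def update_list(glyphs, added, removed):
    """Return a new list without the removed glyphs, plus the added ones."""
    removed_ids = set(id(g) for g in removed)
    kept = [g for g in glyphs if id(g) not in removed_ids]
    return kept + list(added)


glyphs = [
    FakeGlyph([(0.8, "stem")]),
    FakeGlyph([(0.7, "dot")]),
    FakeGlyph([(0.95, "lower.a"), (0.4, "lower.o")]),
]
added, removed = fake_group_list_automatic(glyphs)
glyphs = update_list(glyphs, added, removed)

# Read the top-ranked classification of every glyph.
best_names = [g.id_name[0][1] for g in glyphs]
print(best_names)
```

    The original list passed to the grouping function is left untouched, which mirrors the behaviour described above; only the rebuilt list reflects the new glyph.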

  • When should I use C++ and when should I use Python?

    There's no straight answer here. It is a tradeoff between run time (always let benchmarking on real-world data determine which is better) and development time. That said, you usually won't want to go through the trouble of implementing something twice, so here is a useful rule of thumb:

    • Algorithms that need access to individual pixels should be implemented in C++
    • Algorithms that drive other long-running, low-level processes should be implemented in Python
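
    Since the advice above is to let benchmarking decide, here is a minimal timing sketch using Python's standard timeit module. The threshold_pixels function and the synthetic image are invented for illustration; in practice you would time your own routine on real scanned data.

```python
import timeit


def threshold_pixels(image, level):
    """Per-pixel loop in pure Python, the kind of work often moved to C++."""
    return [[1 if pixel < level else 0 for pixel in row] for row in image]


def time_call(func, args, repeat=5):
    """Return the best wall-clock time in seconds over several single runs."""
    timer = timeit.Timer(lambda: func(*args))
    return min(timer.repeat(repeat=repeat, number=1))


# Synthetic 200x200 greyscale image as nested lists (illustrative data only).
image = [[(x * y) % 256 for x in range(200)] for y in range(200)]
seconds = time_call(threshold_pixels, (image, 128))
print("threshold_pixels: %.4f s per call" % seconds)
```

    Taking the minimum of several runs reduces noise from other processes. If the measured time is acceptable on representative pages, a Python implementation may well be good enough.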

Training

  • What's the deal with production and current databases?

    Obsolete question: As of October 2004, the terminology of production and current databases has changed.

    • production database is now classifier glyphs
    • current database is now page glyphs

    This change, along with some additions to the classifier GUI, should alleviate much of the confusion.

    The page glyphs are simply the set of connected components on the page you are currently training. The classifier glyphs are the connected components that the classifier uses to make its classifications (i.e. the training data). They are documented here.

    The classifier GUI provides some flexibility as to how these two databases are saved, loaded and merged.

  • How do I train the classifier to group connected components together (such as for lower case i's)?

    The classifier can be used to both repair broken characters and recognize "legitimately broken" characters. To train broken characters, select all parts of a single character and give the symbol name the prefix _group.

    For example, to train lower case i's, select both the stem and dot of a single lower case i and train it as _group.lower.i.

  • What's with id names?

    Training is basically the act of assigning symbol names to characters so that the classifier can learn what things are. Symbol names in Gamera may contain Unicode characters, and can be delimited into categories using periods. There is deliberately no standard naming convention in Gamera: that will depend entirely on the type of document being trained. However, if your document type fits neatly into the textual types of documents supported by Unicode, you may want to use standard Unicode character names, if only to avoid reinventing the wheel.
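
    Because symbol names are plain strings delimited by periods, they are easy to take apart in a script. The helpers below are a sketch only: parse_symbol_name and group_name are hypothetical names, not part of Gamera, and the only conventions relied on are the period delimiter and the _group prefix described in this FAQ.

```python
GROUP_PREFIX = "_group"


def parse_symbol_name(name):
    """Split a dotted symbol name into (is_group, list_of_categories)."""
    parts = name.split(".")
    is_group = parts[0] == GROUP_PREFIX
    if is_group:
        parts = parts[1:]
    return is_group, parts


def group_name(name):
    """Build the name used to train all parts of a broken character."""
    return GROUP_PREFIX + "." + name


print(parse_symbol_name("_group.lower.i"))
print(parse_symbol_name("lower.a"))
print(group_name("lower.i"))
```

    A hierarchy such as lower.i makes it simple to select, count or post-process whole categories of symbols after classification.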

  • How can I make classification faster?

    The first thing to look at is the set of features you're using. Gamera provides a large number of feature generation routines, some of which are rather computationally intensive. Try limiting the set of features to ones you think you'll really need.

    You can decrease the time spent loading the training data into the classifier dramatically by using classifier.serialize() to save it in a high-speed but non-portable binary format.