
Monday, May 11, 2009

Developing a Document Image Retrieval System Part 2

The DLL file that extracts the descriptor from a word image is ready. The descriptor, called the Texture and Shape Representation Descriptor (TSRD), is based on the following two publications (one conference paper and one journal article):

A demo/showcase application of the above descriptor is located at: http://orpheus.ee.duth.gr/irs2_5

In addition to word spotting, the descriptor can be used for the retrieval of signatures, gestures and other pattern recognition applications. All you need is a black/white 24-bit image, in which black represents the object and white the background.
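
If the input is not already black/white, it has to be binarized first. The following is only a minimal sketch of one way to do that with System.Drawing. The class name `WordImagePreparation`, the method `ToBlackWhite24bpp` and the fixed brightness threshold are assumptions made for illustration; they are not part of the TSRD DLL, and a proper binarization method may give better results on real documents.

```csharp
using System.Drawing;
using System.Drawing.Imaging;

// Hypothetical helper, not part of the TSRD DLL.
public static class WordImagePreparation
{
    // Returns a 24-bit copy of the source in which every pixel is either
    // pure black (object) or pure white (background).
    // The threshold (0..1) is an assumption; 0.5 is a reasonable starting point.
    public static Bitmap ToBlackWhite24bpp(Bitmap source, float threshold)
    {
        Bitmap result = new Bitmap(source.Width, source.Height, PixelFormat.Format24bppRgb);
        for (int y = 0; y < source.Height; y++)
        {
            for (int x = 0; x < source.Width; x++)
            {
                // Dark pixels are treated as the object, light pixels as the background.
                bool isObject = source.GetPixel(x, y).GetBrightness() < threshold;
                result.SetPixel(x, y, isObject ? Color.Black : Color.White);
            }
        }
        return result;
    }
}
```

GetPixel and SetPixel are slow on large images, but they keep the sketch short. The resulting Bitmap can then be handed to the descriptor extraction.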

The TSRD DLL file can be downloaded from here (mirror). The individual features that make up the descriptor can be enabled or disabled. For example, the Down and Upper Grid Features do not work well for signature recognition. The structure of the descriptor is depicted in the following image:

In addition to the TSRD DLL file, an example solution for Visual Studio 2008 is provided. This solution uses the TSRD in two scenarios.

The first scenario is to calculate the descriptor of a word image using the full feature set. The code is:

// Creating the TSRD object    
TSRD.TSRD myTSRD = new TSRD.TSRD();

// Load the image from the pictureBox_word object. Alternatively, you can load it from the hard disk.
// In the end you need a Bitmap object that represents a black/white 24-bit image.
Bitmap myImage = (Bitmap)pictureBox_word.Image.Clone();

// Get the Descriptor through the TSRD object
double[] myTSRDescriptor = myTSRD.GetTSRDescriptor(myImage);

The second scenario is to calculate the descriptor of a signature. In this scenario the Upper and Down Grid Features (UGF and DGF) and the trimming operation are disabled:

// Creating the TSRD object
TSRD.TSRD myTSRD = new TSRD.TSRD();

// Disable the Down Grid Features (DGF), the Upper Grid Features (UGF) and the trimming operation
myTSRD.UseDGF = false; // disable DGF
myTSRD.UseUGF = false; // disable UGF
myTSRD.TrimImage = false; // disable the trimming operation

// Load the image from the pictureBox_signature object. Alternatively, you can load it from the hard disk.
// In the end you need a Bitmap object that represents a black/white 24-bit image.
Bitmap myImage = (Bitmap)pictureBox_signature.Image.Clone();

// Get the Descriptor through the TSRD object
double[] myTSRDescriptor = myTSRD.GetTSRDescriptor(myImage);

This is a screenshot of the application in the example solution:

The solution can be downloaded from here (mirror).
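
Once descriptors have been extracted, retrieval comes down to comparing the query descriptor with the stored ones. The sketch below ranks candidates by plain Euclidean distance. This distance is an assumption chosen for illustration; this post does not state which similarity measure should be used with the TSRD. The class `DescriptorRanking` and its methods are hypothetical and not part of the DLL. The descriptors being compared should be produced with the same feature settings, so that they have the same length.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of the TSRD DLL.
public static class DescriptorRanking
{
    // Euclidean distance between two descriptors of equal length.
    public static double EuclideanDistance(double[] a, double[] b)
    {
        if (a.Length != b.Length)
            throw new ArgumentException("Descriptors must have the same length.");

        double sum = 0.0;
        for (int i = 0; i < a.Length; i++)
        {
            double diff = a[i] - b[i];
            sum += diff * diff;
        }
        return Math.Sqrt(sum);
    }

    // Returns the indices of the candidates, ordered from the smallest
    // distance to the query (most similar) to the largest.
    public static int[] Rank(double[] query, IList<double[]> candidates)
    {
        double[] distances = new double[candidates.Count];
        int[] order = new int[candidates.Count];
        for (int i = 0; i < candidates.Count; i++)
        {
            distances[i] = EuclideanDistance(query, candidates[i]);
            order[i] = i;
        }
        // Sorts the distances and reorders the indices in the same way.
        Array.Sort(distances, order);
        return order;
    }
}
```

With the arrays returned by GetTSRDescriptor for a query word and for each word image in a collection, Rank gives the order in which the collection would be presented to the user.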

The evolution of the TSRD is the Compact Shape Portrayal Descriptor. This descriptor is more in line with the compact descriptors of MPEG-7: it is fast to calculate, quantized and small (46 bins/elements, 3 bits per bin/element). A demo/showcase application of this descriptor is located at: http://orpheus.ee.duth.gr/cspd/. It uses the Windows Presentation Foundation (WPF) found in .NET 3.5 SP1 for the interaction with the user. It is still a work in progress, but I am in the final stages. The requirements to run the program are:


  • Firefox 2 (and above), Internet Explorer 6 (and above)
  • Microsoft .NET 3.5 SP1 (you can download it from here)
  • Windows XP/Vista

I will write about this descriptor and the accompanying relevance feedback technique in a future post.
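
To make the size of the Compact Shape Portrayal Descriptor concrete: 46 elements at 3 bits each come to 138 bits, which fit in 18 bytes when packed tightly. The sketch below shows one possible way to pack and unpack such 3-bit values. The class `ThreeBitPacking` and the bit layout (least significant bit first) are assumptions for illustration; the actual storage format of the descriptor is not described in this post.

```csharp
using System;

// Hypothetical helper; the real storage layout of the descriptor may differ.
public static class ThreeBitPacking
{
    public const int BitsPerElement = 3;

    // Packs values in the range 0..7 into a byte array, 3 bits per value,
    // least significant bit first.
    public static byte[] Pack(byte[] values)
    {
        int totalBits = values.Length * BitsPerElement;
        byte[] packed = new byte[(totalBits + 7) / 8]; // round up to whole bytes
        for (int i = 0; i < values.Length; i++)
        {
            if (values[i] > 7)
                throw new ArgumentOutOfRangeException("values", "Each element must fit in 3 bits (0..7).");

            for (int b = 0; b < BitsPerElement; b++)
            {
                if ((values[i] & (1 << b)) != 0)
                {
                    int bit = i * BitsPerElement + b;
                    packed[bit / 8] |= (byte)(1 << (bit % 8));
                }
            }
        }
        return packed;
    }

    // Reverses Pack: reads 'count' 3-bit values back out of the byte array.
    public static byte[] Unpack(byte[] packed, int count)
    {
        byte[] values = new byte[count];
        for (int i = 0; i < count; i++)
        {
            for (int b = 0; b < BitsPerElement; b++)
            {
                int bit = i * BitsPerElement + b;
                if ((packed[bit / 8] & (1 << (bit % 8))) != 0)
                    values[i] |= (byte)(1 << b);
            }
        }
        return values;
    }
}
```

Packing an array of 46 quantized values with Pack yields an 18-byte array, and Unpack with a count of 46 restores the original values.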

The above two descriptors are expected to be merged into img(Anaktisi). At present I am reorganizing the structure of the img(Anaktisi) interface, because it is a mess. Once that is done, the descriptors will be added.

For more information or questions email me at kzagoris@gmail.com.

Dr Konstantinos Zagoris (http://www.zagoris.gr) received the Diploma in Electrical and Computer Engineering in 2003 from Democritus University of Thrace, Greece, and his PhD from the same university in 2010. His research interests include document image retrieval, color image processing and analysis, document analysis, pattern recognition, databases and operating systems. He is a member of the Technical Chamber of Greece.
