Simple image recognition program written in Java using JOONE, a neural network library.
You can download the source and binaries here:
http://sourceforge.net/projects/imgrecognition/
Article from CodeProject.com
While coding an AI application I kept hearing the mellow strains of a childish songstress from the neighbours upstairs, who played the song repeatedly. The verses were sometimes barely audible, but I managed to distinguish a few characteristic phrases to look up with a well-known web search engine (I like it, since it puts some of my CodeProject articles on the first page or two of the search results). The only significant phrase from the song that I submitted to the engine was, to avoid undue advertisement, say "фиолетовая паста" (violet paste). I expected it to return scores of make-up advertisements, but on the contrary, a single link on the first page of results, amid the cosmetics-industry spam, pointed to a music web forum containing exactly that phrase from the lyrics. One more mouse click and a second search gave me the group, the song's lyrics and guitar tabs, and took me to YouTube, where I was soon listening to that marvellous music clip.
It is astounding that a person with permanent internet access can, within seconds of hearing a piece of music, be presented with its lyrics, information about the group, and a video clip to watch. The process is known as searching on media content. Current web search uses textual information to return results; imagine being able to submit an audio, video or image sample as a query the same way you submit text. The computer merely listened to some music and was able to present the same information.
Intel is actively pursuing a concept known as Connected Visual Computing (CVC). CVC concerns media-data processing: when some object, say an ant, appears in the field of view of your mobile phone camera, the phone can analyse its image and display an identification on screen, e.g. that it is Camponotus herculeanus; or when you see a street sign in an unknown language, you can view it through the camera and see the same sign, rendered in place, in your native language (augmented reality (AR), 2D/3D overlays); or the search by audio content presented in the example above. The market for such applications promises to be immense and should keep audiences consuming modern hardware and software for a very long time.
Here I'd like to present the general idea of how a computer can describe an image by analysing its pixel content, known as Automatic Linguistic Indexing of Pictures (ALIP). The approach is general: extract some descriptive features from the data and use some rules to attribute the content to a category.
If you're interested in immediate applications, you may contact System7, the firm supporting the content-based image recognition (CBIR) part of the project.
You will need a basic understanding of AI approaches, e.g. neural networks, support vector machines and nearest-neighbour classifiers; of image description and transform methods such as wavelets, edge extraction, image statistics and histograms; and C++/C# experience, as in this article you will find how to invoke C++ DLL methods from within a C# application.
In my ALIP experiment I decided to annotate simple natural image categories. There are 5 ANN classifiers in the project, corresponding to:
You need an unknown category alongside the categories you want to classify into. Otherwise the AI classifier could only ever answer e.g. animals, flowers, landscapes or sunsets for every image you give it. But in the real world there are many images that fall into none of those categories, so without an unknown class you would have to fiddle with classification thresholds, which is cumbersome and awkward. With an additional unknown-category classifier, the result of identification is either one of the known image categories or simply an unknown image type that the computer cannot identify with its limited knowledge.
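A minimal sketch of this point (hypothetical class and method names, not the actual JOONE classifier code): with an explicit unknown class, classification is a plain argmax over all five outputs, and no hand-tuned rejection threshold is needed — "unknown" simply wins when none of the known categories fits well.

```java
// Sketch: pick the category with the highest classifier score.
// The class names and scores here are illustrative only.
public class CategoryPicker {
    static final String[] CATEGORIES =
        {"animals", "flowers", "landscapes", "sunsets", "unknown"};

    // scores: one output per category, e.g. from the 5 ANN classifiers
    public static String pick(double[] scores) {
        int best = 0;
        for (int i = 1; i < scores.length; i++)
            if (scores[i] > scores[best]) best = i;
        return CATEGORIES[best];
    }

    public static void main(String[] args) {
        // a flower-like image: "flowers" scores highest
        System.out.println(pick(new double[]{0.1, 0.8, 0.2, 0.1, 0.3}));
        // nothing fits well: "unknown" wins without any threshold
        System.out.println(pick(new double[]{0.2, 0.1, 0.15, 0.1, 0.7}));
    }
}
```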
I adore image databases; they contain shots from all over the world that are really nice to browse. I have about 20000 images for designers, bought on DVD. I took samples from the animals, flowers, landscapes and sunsets image types, and gathered all the other image categories that do not belong to those 4 into the unknown image type.
Usage of the program is simple enough. Just run alip.exe and it will load all the necessary AI classifier files (in case of error you will get a message box and will not be able to use the program). Then click the [...] button and select a directory that contains some *.jpg files; you may use the ones supplied with this demo under the pics directory. All the files found will be added to the list box; click one to view it in the right panel and see the proposed category in the top left panel. In theory it should be able to annotate the image as presented below.
Due to competing interests between my former organizations and the one I currently work for, I cannot describe the methodology and feature extraction methods in minute detail. Instead I will present the general trend and the categories of features used to describe images; searching the internet for the corresponding feature computations will turn up all the necessary papers with the particular formulae.
Some demos are available online, e.g. ALIPr. They use hidden Markov models (HMMs) and wavelet features of the images. You may try the pictures from this article with their methods, or vice versa, my application with their pictures, and compare the annotation results.
As the AI approach is general and assumes some reduction of the original data dimensionality, using feature extraction, a PCA transform or both, all that is needed is to collect some data, extract the features and train the AI classifiers. If you understood my face detection articles, you will be able to repeat the experiment:
After you have converted your raw image data to features, just train some AI classifiers to discriminate the desired positive category from the negative ones.
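The train-then-classify loop can be sketched with the simplest classifier mentioned in the prerequisites, a 1-nearest-neighbour rule over feature vectors (the actual project uses ANN classifiers; feature extraction is abstracted away here and the labels are illustrative):

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: 1-NN classification of feature vectors.
public class NearestNeighbour {
    static class Sample {
        final double[] features; final String label;
        Sample(double[] f, String l) { features = f; label = l; }
    }
    private final List<Sample> training = new ArrayList<>();

    // "training" a 1-NN classifier is just storing labelled features
    public void train(double[] features, String label) {
        training.add(new Sample(features, label));
    }

    // return the label of the closest stored sample
    public String classify(double[] features) {
        String best = null;
        double bestDist = Double.MAX_VALUE;
        for (Sample s : training) {
            double d = 0;
            for (int i = 0; i < features.length; i++) {
                double diff = features[i] - s.features[i];
                d += diff * diff;            // squared Euclidean distance
            }
            if (d < bestDist) { bestDist = d; best = s.label; }
        }
        return best;
    }

    public static void main(String[] args) {
        NearestNeighbour nn = new NearestNeighbour();
        nn.train(new double[]{0.9, 0.1}, "sunsets");
        nn.train(new double[]{0.2, 0.8}, "landscapes");
        System.out.println(nn.classify(new double[]{0.85, 0.2})); // prints "sunsets"
    }
}
```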
Generally the features are divided into:
Color features are simply the original raw image data, histograms of the image channels, and image profiles. Texture features are the familiar edge extraction methods, wavelet transforms and image statistics (e.g. 1st order: mean, std, skew; 2nd order: contrast, correlation, entropy...). Shape features try to estimate the shapes of objects found in the images. Have a look at the Wikipedia article on CBIR.
Typically the original RGB color space of the image is transformed to alternative spaces such as YCbCr, HSV, HSI, CIEXYZ, etc., as alternative spaces may discriminate the data better; you need to experiment with them in any case.
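As an example of such a transform, here is a small sketch converting one RGB pixel to HSV (the standard formula, not code from the project; H in degrees [0, 360), S and V in [0, 1]):

```java
// Sketch: RGB -> HSV conversion for a single pixel.
public class RgbToHsv {
    public static double[] convert(int r, int g, int b) {
        double rf = r / 255.0, gf = g / 255.0, bf = b / 255.0;
        double max = Math.max(rf, Math.max(gf, bf));
        double min = Math.min(rf, Math.min(gf, bf));
        double delta = max - min;
        double h;
        if (delta == 0)      h = 0;                              // grey: hue undefined
        else if (max == rf)  h = 60 * (((gf - bf) / delta) % 6); // red is max
        else if (max == gf)  h = 60 * ((bf - rf) / delta + 2);   // green is max
        else                 h = 60 * ((rf - gf) / delta + 4);   // blue is max
        if (h < 0) h += 360;
        double s = (max == 0) ? 0 : delta / max;                 // saturation
        return new double[]{h, s, max};                          // value = max channel
    }

    public static void main(String[] args) {
        double[] hsv = convert(255, 0, 0);   // pure red -> H=0, S=1, V=1
        System.out.println(hsv[0] + " " + hsv[1] + " " + hsv[2]);
    }
}
```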
by Sebastian Anthony, May 24th 2010 at 9:30AM
Following on from a New Scientist article that was written a few days ago, I ended up on the website of Taeg Sang Cho -- a graduate student at MIT. He's been working on a bunch of advanced imaging algorithms -- with gifts and grants from big names like Microsoft, Adobe and Google.
His recent work -- three research papers -- is all about content-aware manipulation of photos. I'm struggling to pick one because they're all awesome, so I'll just give you the highlights:
New York invaded by 8-bit creatures!
PIXELS is Patrick Jean's latest short film, shot on location in New York.
Written, directed by : Patrick Jean
Director of Photography : Matias Boucard
SFX by Patrick Jean and guests
Produced by One More Production
www.onemoreprod.com
from Erico Guizzo at IEEE Spectrum:
Over the past year or so, Microsoft’s robotics group has been working quietly, very quietly. That’s because, among other things, they were busy planning a significant strategy shift.
Microsoft is upping the ante on its robotics ambitions by announcing today that its Robotics Developer Studio, or RDS, a big package of programming and simulation tools, is now available to anyone for free.
The Microsoft RDS supports a number of hardware platforms, including the Lego Mindstorms NXT, iRobot Create and Parallax Boe-Bot, and it provides a physics-based simulation environment to allow you to test your designs.
(please note: the download is almost 500MB)
http://www.adafruit.com/blog/2010/05/26/microsoft-releases-robotics-studio-free/
I have a philosophical question and await your answers.
How do you define the term "similar"?
When are two images considered “similar images”?
According to Google, as similar is defined:
Send me your opinion at savvash<at>gmail<dot>com and I will post it here.
This year, TPAMI is celebrating its 30th anniversary. To mark this milestone, the IEEE Computer Society’s Publishing Services Department asked journal volunteers to submit their All-Time Favorite Top 10 list and explain their reasons for choosing the papers. Free, limited-time access is available to all of the papers on the list.
Citation: F.L. Bookstein, "Principal Warps: Thin-Plate Splines and the Decomposition of Deformations," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 11, no. 6, pp. 567-585, June 1989, doi:10.1109/34.24792
I'd like to say I learned about Thin-Plate Splines straight from the papers by Duchon or Meinguet, but I didn't. In fact, I found out about them from this excellent paper by Fred Bookstein. I remember very well punching in the coefficients of the numerical example in that paper into Matlab and realizing how helpful this approach would be to my work on shape matching.
Citation: W.T. Freeman, E.H. Adelson, "The Design and Use of Steerable Filters," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 13, no. 9, pp. 891-906, Sept. 1991, doi:10.1109/34.93808
This is the first TPAMI paper I ever read, and it is also the reason I chose to make computer vision my career. I was hooked from their very first example of steered first derivatives of Gaussians. I subsequently devoted several years of my life to studying low level feature extraction, including a pilgrimage to the Mecca of image filtering in Linköping, Sweden.
Citation: Richard I. Hartley, "In Defense of the Eight-Point Algorithm," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 19, no. 6, pp. 580-593, June 1997, doi:10.1109/34.601246
"It's the normalization, stupid."
Citation: Jianbo Shi, Jitendra Malik, "Normalized Cuts and Image Segmentation," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, no. 8, pp. 888-905, Aug. 2000, doi:10.1109/34.868688
This was one of the first TPAMI papers whose formation I witnessed from start to finish, since Jianbo was my officemate. We all knew they had a hit on their hands with this one. We also knew that with the publication of this paper, our honeymoon phase with spectral clustering was over, and the nitty gritty phase was about to begin.
Citation: Harpreet S. Sawhney, Serge Ayer, "Compact Representations of Videos Through Dominant and Multiple Motion Estimation," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 18, no. 8, pp. 814-830, August, 1996, doi:10.1109/34.531801
Who could forget this paper's dynamic mosaics made from footage of Arnold riding a Harley in Terminator 2. The things they were doing with optical flow at Sarnoff Research Center in the mid-90s were indistinguishable from magic.
Citation: John Canny, "A Computational Approach to Edge Detection," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 8, no. 6, pp. 679-698, Nov. 1986, doi:10.1109/TPAMI.1986.4767851
Before the SVD mania of the 90s, and long before the boosting craze of the 00s, a handful of towering contributions in the areas of edge detection, optical flow and regularization theory were developed on the foundations of variational calculus. The Canny Edge Detector, developed in the early 80s, was one such contribution. 25 years later it is required learning in virtually every beginning course in computer vision. Not bad for a Master's Thesis!
Citation: Yuri Boykov, Olga Veksler, Ramin Zabih, "Fast Approximate Energy Minimization via Graph Cuts," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 23, no. 11, pp. 1222-1239, November, 2001, doi:10.1109/34.969114
For a few years there it seemed that every problem I was working on could be written down with a cost function that Yuri, Olga and Ramin's code could solve for me quickly and accurately.
Citation: Yali Amit, Donald Geman, Kenneth Wilder, "Joint Induction of Shape Features and Tree Classifiers," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 19, no. 11, pp. 1300-1305, Nov. 1997, doi:10.1109/34.632990
Quantized tags, approximate geometric arrangements and randomized trees. There were no SIFT or HoG features back then, and the binary handwritten digits were a far cry from the sheep and motorbikes of PASCAL and MSRC, but the essential constellation based recognition approach proposed by this paper was brilliant and ahead of its time.
Citation: Shivani Agarwal, Aatif Awan, Dan Roth, "Learning to Detect Objects in Images via a Sparse, Part-Based Representation," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 26, no. 11, pp. 1475-1490, Nov. 2004, doi:10.1109/TPAMI.2004.108
We still don't know what a "part" is, but that philosophical sticking point didn't stop this paper from making a big impact in object category detection.
Citation: S. Umeyama, "An Eigendecomposition Approach to Weighted Graph Matching Problems," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 10, no. 5, pp. 695-703, Sept. 1988, doi:10.1109/34.6778
Spectral graph matching is the lesser known sibling of spectral clustering, but it is nonetheless filled with interesting theoretical nuggets, many of which I encountered for the first time in this paper. I fondly remember this as the paper that prompted me to check out a copy of Papadimitriou and Steiglitz to find out about this so called "Hungarian Algorithm."
My collaborators (Vicky and Dim) are working on a new video summarization project based on multimodal data and fuzzy classifiers. The proposed technique automatically generates summaries from online videos (YouTube). Each frame may participate in one or more of the generated classes. The application, once more, will be open source.
Here is a screenshot. More details as well as the paper will be added soon.
Beijing, China
October 24-28, 2010
Important Deadline:
Submission of Papers: June 15, 2010
The International Conference on Signal Processing (ICSP), sponsored by the IEEE Beijing Section, is the premier forum for the presentation of technological advances and research results in the fields of theoretical, experimental, and applied signal processing. ICSP 2010 will bring together leading engineers and scientists in signal processing from around the world. Research frontiers in fields ranging from traditional signal processing applications to evolving multimedia and video technologies are regularly advanced by results first reported in ICSP technical sessions.
Topics include, but are not limited to:
A. Digital Signal Processing (DSP)
B. Spectrum Estimation & Modeling
C. TF Spectrum Analysis & Wavelet
D. Higher Order Spectral Analysis
E. Adaptive Filtering & SP
F. Array Signal Processing
G. Hardware Implementation for Signal Processing
H. Speech and Audio Coding
I. Speech Synthesis & Recognition
J. Image Processing & Understanding
K. PDE for Image Processing
L. Video Compression & Streaming
M. Computer Vision & VR
N. Multimedia & Human-computer Interaction
O. Statistic Learning & Pattern Recognition
P. AI & Neural Networks
Q. Communication Signal Processing
R. SP for Internet and Wireless Communications
S. Biometrics & Authentication
T. SP for Bio-medical & Cognitive Science
U. SP for Bio-informatics
V. Signal Processing for Security
W. Radar Signal Processing
X. Sonar Signal Processing and Localization
Y. SP for Sensor Networks
Z. Application & Others
*Attention*
With the support of numerous reviewers and authors, ICSP has been held for 20 years. At this session, as a celebration for ICSP, we will hold celebration events and present awards, including an Outstanding Paper Award, an Outstanding Student Paper Award, etc. For details, please visit http://icsp10.bjtu.edu.cn .
*Proceedings*
The proceedings, with IEEE and Library of Congress catalog numbers, will be published prior to the conference in both hardcopy and CD-ROM and distributed to all registered participants at the conference. The proceedings will be indexed by EI.
*Paper Submission*
Prospective authors are invited to submit full-length, four-page, double-column papers, including figures and references, to the ICSP Technical Committee by June 15, 2010 at http://icsp10.bjtu.edu.cn. For questions about paper submission, please contact the technical program secretaries, Ms. TANG Xiaofang and Dr. AN Gaoyun at bfxxstxf@bjtu.edu.cn and gyan@bjtu.edu.cn .
For more information, please visit the ICSP 2010 web site at:
A small application for unwrapping omnidirectional images using polar to Cartesian coordinate conversion. The size and aspect ratio of the produced images can be adjusted and the application performs bilinear or bicubic interpolation in order to improve the quality. The center of the omnidirectional image can be detected automatically using either a very simple and fast algorithm based on image thresholding or a slower but much more robust method based on edge detection and Hough transform. An example of using the second method is shown below.
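The core of such unwrapping is the polar-to-Cartesian mapping: each column of the output image corresponds to an angle, each row to a radius, and the pair is converted back to (x, y) in the omnidirectional image. A minimal sketch of that mapping (my own illustration, not the application's source; interpolation and centre detection are omitted):

```java
// Sketch: map a pixel of the unwrapped image back to the omni image.
public class Unwrapper {
    // (u, v): destination pixel in the unwrapped image of size outW x outH.
    // (cx, cy): centre of the omnidirectional image.
    // rMin, rMax: inner and outer radii of the ring being unwrapped.
    public static double[] sourceCoords(int u, int v, int outW, int outH,
                                        double cx, double cy,
                                        double rMin, double rMax) {
        double angle = 2 * Math.PI * u / outW;                  // column -> angle
        double radius = rMin + (rMax - rMin) * v / (outH - 1);  // row -> radius
        double x = cx + radius * Math.cos(angle);
        double y = cy + radius * Math.sin(angle);
        return new double[]{x, y};   // sample the omni image here
    }

    public static void main(String[] args) {
        // column 0 maps to angle 0: directly to the right of the centre
        double[] p = sourceCoords(0, 0, 360, 100, 200, 200, 50, 150);
        System.out.println(p[0] + ", " + p[1]);  // 250.0, 200.0
    }
}
```

The real tool then resamples at these (generally fractional) coordinates with bilinear or bicubic interpolation rather than rounding to the nearest pixel.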
Videos presenting sequences of unwrapped omnidirectional images taken from the COLD database can be downloaded here, here and here.
Download, installation and usage instructions for both Linux and Windows can be found below. If you have any questions, you experience problems with the software or you have spotted a bug, please contact Andrzej Pronobis.
Download and Installation
The application is known to compile in both Linux and Windows and depends on the OpenCV library. The source code can be downloaded either as a tar.gz file (for Linux users) or zip file (for Windows users):
• Tar/gzip file (443.46 kB)
• Zip file (446.30 kB)
Binaries for both operating systems are also available:
• Linux binary (449.10 kB)
• Windows binary (584.34 kB)
CMake is used as a build system for the sources. Windows users can install MinGW to get a C++ compiler. To build from the sources, use either the 'build.sh' or 'build.bat' script.
Examples:
Change log
-------------------------------------
2009-05-05 1.0.3776
img Retrieval
1.New Shape Descriptor (CSPD)
2.SpCD bug fixed
3.MPEG-7 Descriptors Fusion
-Using HIS*
-Download Empirical (Historical) Files From the WEB.
-Using Z-Score
-Using Borda Count
-Using IRP
-Using Linear Sum
4.MPEG-7 and CCD Descriptors Fusion
-Using HIS*
-Download Empirical (Historical) Files From the WEB.
-Using Z-Score
-Using Borda Count
-Using IRP
-Using Linear Sum
5.From now on we are using a Compact Version of the BTDH for indexing and retrieval
6.New descriptor (B-CEDD). During the search process, an image query is entered and the system returns images with similar content. Initially, the similarity/distance between the query and each image in the database is calculated with the B-CEDD descriptor, and only if the distance is smaller than a predefined threshold is the comparison of their CEDDs performed.
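The two-stage search in item 6 can be sketched generically: a cheap distance (B-CEDD) prunes candidates, and the expensive comparison (full CEDD) runs only for survivors. The interfaces and toy 1-D "descriptors" below are illustrative stand-ins, not the actual descriptor code:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;

// Sketch: threshold-pruned two-stage similarity search.
public class TwoStageSearch {
    public static <T> List<Integer> search(
            T query, List<T> database,
            BiFunction<T, T, Double> cheapDistance,   // stand-in for B-CEDD
            BiFunction<T, T, Double> fullDistance,    // stand-in for CEDD
            double pruneThreshold, double matchThreshold) {
        List<Integer> results = new ArrayList<>();
        for (int i = 0; i < database.size(); i++) {
            // stage 1: cheap pre-filter rejects most of the database
            if (cheapDistance.apply(query, database.get(i)) >= pruneThreshold)
                continue;
            // stage 2: full comparison only for the survivors
            if (fullDistance.apply(query, database.get(i)) < matchThreshold)
                results.add(i);
        }
        return results;
    }

    public static void main(String[] args) {
        // toy 1-D "descriptors" with absolute difference as distance
        List<Double> db = List.of(0.1, 0.5, 0.9);
        List<Integer> hits = search(0.15, db,
                (a, b) -> Math.abs(a - b),
                (a, b) -> Math.abs(a - b),
                0.3, 0.1);
        System.out.println(hits);  // [0]
    }
}
```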
7.Now you can save the retrieval results in trec_eval format
8.Indexing now works with *.bmp, *.jpg and *.png