Monday, April 13, 2009

9 Semantic Search Engines That Will Change the World of Search

Article from:

The ideal search engine would match each search query to its exact context and return results within that context. While Google, Yahoo and Live continue to hold sway in search, the engines below take a semantics-based (meaning-based) approach. The aim is more relevant results, grounded in the meaning of the query rather than in preset keyword groupings or inbound-link measurement algorithms, which make the more traditional search engines easier to game and so more prone to spam-oriented results.

Here is a wrap-up of some of the top semantic search engines we’ve covered previously, along with some updates on their research.

1. Hakia

Hakia, the brainchild of Dr. Riza C. Berkan, tries to anticipate the questions that could be asked about a document and uses them as gateways to the content.

The search queries are mapped to the results and ranked using an algorithm that scores them on sentence analysis and how closely they match the concept related to the query.

Hakia semantic search is essentially built around three evolving technologies:

  1. OntoSem (sense repository)
  2. QDEX (Query indexing technique)
  3. SemanticRank algorithm
  • OntoSem is Hakia’s repository of concept relations, in other words, a linguistic database where words are categorized into the various “senses” they convey.
  • QDEX is Hakia’s replacement for the inverted index that most engines use to store web content. QDEX extracts all possible queries relating to the content (leveraging OntoSem for meaning), and these become the gateways to the original document. This process greatly reduces the data set that the indexer has to deal with when querying data on the fly, an advantage when you consider the wide swath of data the engine would have to search if it used an inverted index.
  • Finally, the SemanticRank algorithm independently ranks content on the basis of further sentence analysis. The credibility and age of the content are also used to determine relevancy.
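
To make the query-indexing idea concrete, here is a toy sketch in Python. It is not Hakia's QDEX, whose internals are not described here. The `SENSES` table is a made-up stand-in for a sense repository such as OntoSem, and the rule "every pair of concepts in a sentence is a possible query" is an assumption chosen only to keep the example small.

```python
from collections import defaultdict

# Hypothetical sense table (stand-in for a sense repository): word -> concept.
SENSES = {
    "car": "automobile",
    "auto": "automobile",
    "automobile": "automobile",
    "price": "cost",
    "cost": "cost",
}


def concepts_in(text: str) -> list[str]:
    """Return the sorted, de-duplicated concepts found in a piece of text."""
    found = set()
    for raw in text.split():
        word = raw.lower().strip(".,;:!?")
        if len(word) > 2:  # crude filter for very short words
            found.add(SENSES.get(word, word))
    return sorted(found)


def candidate_queries(text: str) -> set[frozenset[str]]:
    """Assumed rule: every unordered pair of concepts is a possible query."""
    cs = concepts_in(text)
    return {frozenset((a, b)) for i, a in enumerate(cs) for b in cs[i + 1:]}


def build_query_index(docs: dict[str, str]) -> dict[frozenset[str], set[str]]:
    """Map each anticipated query to the documents whose sentences produce it."""
    index: dict[frozenset[str], set[str]] = defaultdict(set)
    for doc_id, text in docs.items():
        for sentence in text.split("."):
            for query in candidate_queries(sentence):
                index[query].add(doc_id)
    return index


def lookup(index: dict[frozenset[str], set[str]], query: str) -> set[str]:
    """Answer a user query by looking up its concept pairs in the index."""
    hits: set[str] = set()
    for pair in candidate_queries(query):
        hits |= index.get(pair, set())
    return hits


docs = {
    "d1": "The car price dropped last year. Dealers were surprised.",
    "d2": "Auto shows attract large crowds.",
}
index = build_query_index(docs)
print(lookup(index, "automobile cost"))
```

In this sketch the query "automobile cost" reaches the document that talks about a "car price" because both map to the same concepts before lookup. A production system would need a far more selective way of generating candidate queries than concept pairs.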

Hakia performs pure analysis of content, irrespective of links or clickthroughs among the documents (the company is opposed to statistical models for determining relevance).
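
As a rough illustration of ranking on content alone, the Python sketch below scores documents from three signals mentioned above: sentence-level match, credibility and age. It is not the SemanticRank algorithm. The `Doc` fields, the weights, the one-year half-life and the scoring functions are all assumptions made for the example, and no link or clickthrough data is used.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    doc_id: str
    text: str
    credibility: float  # assumed score in [0, 1], supplied from elsewhere
    age_days: int


def sentence_match(query: str, text: str) -> float:
    """Best fraction of query terms that appear together in one sentence."""
    terms = {w.lower() for w in query.split()}
    if not terms:
        return 0.0
    best = 0.0
    for sentence in text.split("."):
        words = {w.lower().strip(",;:!?") for w in sentence.split()}
        best = max(best, len(terms & words) / len(terms))
    return best


def freshness(age_days: int, half_life_days: float = 365.0) -> float:
    """Assumed decay: the score halves every half_life_days."""
    return 0.5 ** (age_days / half_life_days)


def score(query: str, doc: Doc,
          w_match: float = 0.6, w_cred: float = 0.25, w_fresh: float = 0.15) -> float:
    """Weighted sum of the three signals; the weights are illustrative only."""
    return (w_match * sentence_match(query, doc.text)
            + w_cred * doc.credibility
            + w_fresh * freshness(doc.age_days))


def rank(query: str, docs: list[Doc]) -> list[Doc]:
    """Order documents from highest to lowest score."""
    return sorted(docs, key=lambda d: score(query, d), reverse=True)


docs = [
    Doc("b", "Credit cards are popular. Default settings vary. Swaps happen.", 0.9, 30),
    Doc("a", "Credit default swaps transfer risk. Banks use them.", 0.9, 30),
]
print([d.doc_id for d in rank("credit default swaps", docs)])
```

Document "a" ranks first here because all three query terms occur in a single sentence, whereas "b" only scatters them across sentences.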

The engine has also started using the Yahoo BOSS service and presents results in a “gallery” with categories for different content matching the query. Users can also request to try out the incremental changes being tested at Hakia’s Lab.

2. Kosmix

The search company has taken its categorization concept further by providing users with a dashboard of content, aptly called “Your guide to the Web”. The company’s focus on informational search makes it suitable when you want information on a topic rather than a particular answer or URL. For example, a search for Credit Default Swap provided a great mix of links, videos and tweets to get me started. Kosmix received $20 million of funding from Time Warner in late 2008. Its content-aggregating technology will become more important as content on the web grows.
