Recently I worked with Charles Knight of the Alt Search Engines blog on a project to compare the search relevance of two major search engines - Google and Yahoo! - and to measure the improvement in relevance by adding a search engine technology called Relona into the mix.
Relona, from Relona Research Labs, is a technology aimed at enhancing the relevance of search results by suggesting changes to the search query - such as adding keywords, enclosing phrases within double quotes, or even reducing the number of search terms. In the future, this type of technology could be plugged into the search engine back-end to automatically refine the user query and produce better search results.
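To make those three kinds of suggestion concrete, here is a small Python sketch. It is purely illustrative and is not Relona's actual method: the function names, the sample query and the extra keyword are all made up for this example, and the rule for dropping a term (remove the last one) is just a stand-in.

```python
def quote_phrase(query):
    """Enclose the whole query in double quotes so it is matched as a phrase."""
    return f'"{query}"'


def add_keyword(query, keyword):
    """Append one extra keyword to narrow the query."""
    return f"{query} {keyword}"


def drop_last_term(query):
    """Reduce the number of search terms by dropping the last one (a stand-in rule)."""
    terms = query.split()
    return " ".join(terms[:-1]) if len(terms) > 1 else query


def candidate_rewrites(query, extra_keyword):
    """Return one candidate rewrite of each kind for a raw user query."""
    return [
        quote_phrase(query),
        add_keyword(query, extra_keyword),
        drop_last_term(query),
    ]


# Hypothetical query and keyword, chosen only for illustration.
rewrites = candidate_rewrites("new york pizza delivery", "menu")
for rewrite in rewrites:
    print(rewrite)
```

A real system would have to decide which rewrite, if any, actually improves the results; this sketch only shows what the candidate queries look like.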
We started with a large number of actual user queries from the AOL open data set, provided to us by Relona. Out of these, we picked a substantial set of queries at random as our test search strings.
For each selected search string, we ran two tests:
(a) comparing the search results from a raw Google search and a raw Yahoo! search to score the more relevant of the two, and
(b) comparing the search results from a raw Google search with the results of a Yahoo! search for Relona-optimized queries, again to find the more relevant of the two.
[Deciding which results were "more relevant" was done manually, by eyeballing the results.]
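For anyone who wants to repeat this kind of test, the bookkeeping can be sketched in a few lines of Python. This is a minimal sketch, assuming each query gets one hand-assigned verdict of "google", "yahoo" or "tie"; the labels, the function name and the sample verdicts are my own placeholders, not data from our test.

```python
from collections import Counter

VERDICTS = ("google", "yahoo", "tie")


def tally(judgments):
    """Return the share of queries that received each verdict."""
    if not judgments:
        raise ValueError("need at least one judgment")
    counts = Counter(judgments)
    total = len(judgments)
    return {verdict: counts[verdict] / total for verdict in VERDICTS}


# Made-up verdicts for six queries, for illustration only.
sample_judgments = ["google", "tie", "google", "yahoo", "tie", "google"]
shares = tally(sample_judgments)
for verdict, share in shares.items():
    print(f"{verdict}: {share:.0%}")
```

Running the same tally once for test (a) and once for test (b) gives two sets of shares that can be compared side by side.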
The idea was to evaluate whether the addition of Relona optimization for user queries made the Yahoo! results more relevant, relative to Google results.
Personally, my biggest surprise came from the comparison of raw Google and Yahoo! search results. I had always imagined these two leading search engines to be comparable in terms of search relevance. But that wasn't the case: while the results were comparable in about a third of the cases, Google's results were significantly better in about half the cases, and the Yahoo! results were better in the remaining cases. Often, 8 or 9 of the top 10 results from Google would be highly relevant, as compared to 4 or 5 from Yahoo!.
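Counting how many of the top 10 results are relevant is usually called "precision at 10". We judged relevance by eye rather than computing it, but the arithmetic can be sketched as below. The function name and the two lists of relevance flags are hypothetical; the flags simply mirror the "9 out of 10" and "5 out of 10" pattern described above and are not our raw data.

```python
def precision_at_k(relevant_flags, k=10):
    """Fraction of the top k results judged relevant.

    relevant_flags is a list of booleans in ranked order. If fewer than
    k results are returned, the missing slots count as not relevant.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    return sum(1 for flag in relevant_flags[:k] if flag) / k


# Illustrative flags only: 9 relevant results versus 5 relevant results.
google_flags = [True] * 9 + [False]
yahoo_flags = [True] * 5 + [False] * 5

print(precision_at_k(google_flags))
print(precision_at_k(yahoo_flags))
```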
Using Relona's beta solution to improve the search results - often by constraining the query or adding keywords - did appear to enhance the relevance of Yahoo!'s results and help bridge the gap. (For access to the raw data, please send me an email and we can share it with you.)
One limitation of the existing Relona search is that the engine comes up with several suggested improvements to the query, only some of which enhance relevance of results. But in order to be usable in the real world, the technology needs to come up with one "best" query suggestion. Kumar Ramanathan, co-founder of Relona, assures me that they are currently working on improving their solution along these lines.
Based on this limited test (more comprehensive testing is needed), Relona does appear to help improve the relevance of search results!
Also, based on this constrained data set, Google's dominance as the premier search destination seems well-justified; matching or beating its formidable relevance in Web search is going to be increasingly difficult in the future (especially since web sites may themselves be changing to suit Google's algorithms).
At the same time, this project convinced me that a more systematic approach to measuring search relevance is sorely needed; all the search engines I've talked to so far seem to measure their own relevance using large data sets, manual measurement ("eye-balling") and statistical results. This is certainly one way to measure relevance, but it ignores other dimensions of search relevance, such as breadth, depth and so on.