Did you mean: Google?

Published September 28th, 2007 in Google, Mashups, PHP

I’ve always wondered what the easiest way would be to get a broad spelling checker for short phrases, for when your site takes a user's search string, queries the database, and gets no results back. At first I thought about algorithms that compare the words against the contents of the database, so I did a quick Google search, and right there, at the top of Google, was my answer!

Google's “Did you mean” feature not only finds the nearest match to what the user typed, but also suggests the most common search phrase if the query was vague - just what you need, eh?

I guess this has been done before by somebody, but I couldn’t find it anywhere, so I quickly threw together a PHP script to test it out…

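<?php
// Returns Google's "Did you mean" suggestion for $search, or 'none' if there is none.
// This scrapes Google's result page, so it breaks whenever that markup changes.
function DidYouMean($search){
	// urlencode() handles spaces and any other character that is unsafe in a URL
	$content = file_get_contents('http://www.google.com/search?q='.urlencode($search));
	// Give up if the request failed or the suggestion block was not found
	if($content === false || !preg_match('#<div id=res>(.*?)<div>#', $content, $matches)){
		return 'none';
	}
	$match = str_replace('    Did you mean: ', '', str_replace('&nbsp;&nbsp;', '', strip_tags($matches[1])));
	if(trim($match) == ''){
		return 'none';
	}else{
		return $match;
	}
}

$query = isset($_GET['q']) ? $_GET['q'] : '';
$result = DidYouMean($query);

if($result == 'none'){
	echo 'You spelt that query correctly!';
}else{
	// Escape the user's query before echoing it back, to avoid HTML injection
	echo 'We didn\'t find any results for "'.htmlspecialchars($query).'", did you mean <strong>'.$result.'</strong>?';
}
?>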


If $_GET['q'] was set to “simpsuns”, it would output: We didn’t find any results for "simpsuns", did you mean simpsons?
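To see what the extraction step does without hitting Google at all, the same parsing can be tried on a hand-written string. This is only a rough sketch: extractSuggestion() is a hypothetical helper name, and the sample markup is made up to fit the regular expression in the script, not captured from a real Google page.

```php
<?php
// Hypothetical helper: the same extraction steps as DidYouMean(), applied to an HTML string.
function extractSuggestion($html){
	// Grab whatever sits between '<div id=res>' and the next '<div>'
	if(!preg_match('#<div id=res>(.*?)<div>#', $html, $matches)){
		return 'none';
	}
	$text = strip_tags($matches[1]);                  // drop the link and formatting tags
	$text = str_replace('&nbsp;&nbsp;', '', $text);   // drop the trailing spacer entities
	$text = trim(str_replace('Did you mean: ', '', $text));
	return $text === '' ? 'none' : $text;
}

// Made-up sample markup (an assumption, not real Google output)
$sample = '<div id=res><p>Did you mean: <a href="/search?q=simpsons"><b><i>simpsons</i></b></a>&nbsp;&nbsp;</p><div>';
echo extractSuggestion($sample);
```

Keeping the parsing separate from the HTTP request like this would also make it easier to adjust the pattern when Google changes its page.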

It's in no way perfect, but it's a good start, and it would be very easy to incorporate into a site. On my localhost it took 0.254153966904 seconds to query the Google servers, and on my Dreamhost hosting 0.25699687004089 seconds, so there is very little difference. That is a quick enough turnaround for it to be practical on any site, hosted on anything from Dreamhost $7.95 hosting to an array of servers with a 1Gbit link.
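For anyone wanting to repeat the timing on their own host, something along these lines should work. It is a minimal sketch: timeCall() is a hypothetical helper, and the example times strtoupper() as a stand-in so it runs offline; pass 'DidYouMean' instead to time the real lookup.

```php
<?php
// Hypothetical helper: run $callback with $args and measure how long it took.
// Returns array(result, elapsed seconds).
function timeCall($callback, array $args = array()){
	$start = microtime(true);                          // current time in seconds, as a float
	$result = call_user_func_array($callback, $args);
	$elapsed = microtime(true) - $start;
	return array($result, $elapsed);
}

// Stand-in call so the example runs without a network connection
list($result, $seconds) = timeCall('strtoupper', array('simpsuns'));
echo $result.' took '.$seconds.' seconds';
```

A single run can be thrown off by network hiccups, so averaging a handful of calls would give a fairer number.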

Know any sites using this technique for their searches? Or edited the script to make it better in any way? Let me know in the comments…



Comments

    There are some built-in PHP functions for this kind of issue; what I don’t really know is how accurate their results are compared to Google’s.

    levenshtein();
    soundex();
    similar_text();
    metaphone()

    I. Stan - 12:08 pm (16/01/2008)

    Hey,

    They are pretty neat functions, especially similar_text();, which I knew nothing about yet looks very useful, thanks!

    Though, I doubt anything will ever come close to Google results for overall search phrases. For dictionary words (apple, book, Internet) yes, but for words/phrases like “Simpsons”, “Macbook” and “Steve Jobs”, I think you would probably need the power OF Google to make yourself something which gives similar results to Google search spelling. (It's kinda a big part of what they do, I would imagine, from the search side of the company.)

    Thanks for your comment, you have a great site! (love the colorRunner example!)

    fLUx - 3:45 am (17/01/2008)
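As a rough idea of how the built-in functions from the comments could act as a local fallback, here is a small sketch using levenshtein(). The helper name closestWord(), the cut-off of 2 edits, and the candidate list are all made up for illustration; on a real site the candidates might come from titles or tags already in the database.

```php
<?php
// Hypothetical helper: return the candidate closest to $input by edit distance,
// or null when nothing is within $maxDistance edits.
function closestWord($input, array $candidates, $maxDistance = 2){
	$best = null;
	$bestDistance = $maxDistance + 1;
	foreach($candidates as $candidate){
		// levenshtein() counts the single-character edits needed to turn one string into the other
		$distance = levenshtein(strtolower($input), strtolower($candidate));
		if($distance < $bestDistance){
			$best = $candidate;
			$bestDistance = $distance;
		}
	}
	return $best;
}

// Made-up candidate list for the example
$known = array('simpsons', 'macbook', 'steve jobs');
echo closestWord('simpsuns', $known);
```

This only works for terms the site already knows about, which is the limitation raised in the reply above: it has no idea which phrases are popular, only which ones are spelled similarly.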
