Can't it just be like Google?

I often get frustrated by those who think Google is the greatest search engine that ever parsed. Don't get me wrong - I like Google, I use Google, I employ it as a verb. But if I hit the search button and get wonky results, I recognize that they are wonky and am not afraid to blame Google. (Full disclosure: I have a library science background which I'd like to think has made me into a pretty good searcher, so I will usually try a few different queries before I point the finger at the machine.)

On most of our consulting engagements, at least one person will say "I want our search to be more like Google." I have a few problems with this kind of statement. Partly it's that most folks aren't terribly critical when it comes to evaluating the relevance of Google results. It's what we know, it's what we're used to. We don't mind that Wikipedia is almost always the first result on any query - many might find that a "feature". We're generally happy to take whatever shows up in those top 10 results and roll with it regardless of what it is, mostly because we can't know everything that is out there so we trust Google to filter it for us. We satisfice (yes, that's a Wikipedia link, joke intended).

Take these same folks and plunk them in front of their enterprise search and ask whether the top ten results are usually "good enough"... Most will answer no. Why? Because we have a better idea of what information is out there and what would make for a good result. Our frame of reference is completely different - we have a much higher level of information literacy within our organization, so we are quicker to recognize when the results don't make sense. And so we blame the search engine. Which sometimes is the right response (especially if you have SharePoint... I mean, even if you type in the exact title of a document you can still get a bunch of gibberish), but sometimes the problem is... well, you know the old IT help desk joke (between the chair and the screen).

Read Adriaan Bloem's post (CMS Watch) for a great example of how using Google is kind of like trying to order a coffee at Starbucks:

"Daniel Tunkelang (Chief Scientist of Endeca)...told me search phrases are much like coffee orders: if you don't go to Starbucks often, you'll hesitantly ask for a large coffee, with milk, maybe skim milk, and you'll say please. If you go there every day, you just go in and say "venti skim milk latte," pay, and leave with your coffee.

When we use Google on the web, we're used to the same kind of formulas -- we know Google's query language because we've grown accustomed to it, not because it's particularly good or bad at understanding what we want."

It's all about that last bit - the "understanding what we want". I recently read an interesting Oracle whitepaper about folksonomy and social approaches to relevance (which I recommend) that talks about relevancy and search queries:

"A search result can only be relevant (or not) to what the searcher wants. Only secondarily is the search result relevant (or not) to a search term, query string, or criteria. This means that the possibility of relevancy is a function of how well the search term reflects the searcher's desire."

This feels like a tired librarian-y complaint ("people are bad searchers"): our ability to translate our need into a string of words that a search engine can use is really the limiting factor on how relevant the results can possibly be. If you are looking for a sample proposal for an SAP implementation in the utilities industry in North America and you type in "proposal", well, you get what you deserve. But also, the less we know about the body of knowledge the results are pulling from, the less capable we are of making that judgment of relevancy.

So what are people really saying when they say "can't it just be like Google"? I think it's more about the veneer of simplicity. We don't want to think about what we type in or what might be out there. As casual users, we generally just want to scan the top 10 results and pick something that's "good enough." This works fine on the public web but is hard to replicate in the enterprise, where there's a lot less to choose from and our information needs are much more specific. We understand the content, so we apply a more critical eye to the results. We often feel they are less relevant because we have more information against which to judge them.

What I'm trying to say is that while we all want search to be fast and easy, we have to accept that different search contexts (i.e. work vs. play) should carry with them different expectations around relevancy and search strategy. Although we may all have been trained on Google, we have to recognize when our information needs require more sophisticated search strategies and tools and be willing to learn them - just as we did when we were first presented with the big white box.

More like Google

When I say I want SharePoint search to be more like Google, I mean it in the technical sense. That is, I know that Google is powered by linking, PageRank, etc., and that corporate search, indexing a small set of docs not linked together in any obvious (to the indexing agent) way, may not perform in the same way. No, I get frustrated with deficiencies in performance, like poor stemming results, no wildcard (out of the box), poor presentation of results, and poor keyword searching (where finding 'this' in 'this text' is different than in 'thistext' or 'this-text' or 'this_text').
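
To make the keyword complaint concrete, here is a minimal Python sketch of how the choice of tokenizer alone can decide whether 'this' is found. It is not how SharePoint (or Google) actually indexes text; the function names `whitespace_tokens`, `word_tokens`, and `matches` are made up for illustration, and real engines layer stemming, wildcards, and ranking on top of this step.

```python
import re

def whitespace_tokens(text):
    """Hypothetical naive tokenizer: split only on whitespace."""
    return text.lower().split()

def word_tokens(text):
    """Hypothetical friendlier tokenizer: split on anything that is
    not a letter or digit, so hyphens and underscores separate words."""
    return [t for t in re.split(r"[^a-z0-9]+", text.lower()) if t]

def matches(term, text, tokenizer):
    """True if the term appears as a whole token in the text."""
    return term.lower() in tokenizer(text)

samples = ["this text", "thistext", "this-text", "this_text"]
for sample in samples:
    print(
        f"{sample!r}: "
        f"whitespace={matches('this', sample, whitespace_tokens)}, "
        f"word={matches('this', sample, word_tokens)}"
    )
```

With whitespace splitting, only 'this text' matches. Splitting on punctuation also catches 'this-text' and 'this_text'. Neither approach finds 'this' inside 'thistext'; that would take substring or wildcard matching, which is a separate feature.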