Search

April 17, 2012 - 4:58 GMT

I was very pleased to receive the following notes from a participant in one of my recent SharePoint IA courses.

“This workshop is powerful and more than meets my expectations! I wasn't sure if I should attend as I'm not a technical person, but am a member of the IT project team. It is immediately applicable to my job. I plan to share the information with the project team when I return to the office… I have already gained some excellent tools to work with each of the businesses as we migrate them to SharePoint 2010.”

This was wonderful feedback and it prodded me to try to distill what it was that attendees valued about the course.

This class pulls together a number of principles around information management strategy, process analysis, user research and taxonomy to guide development of key information architecture constructs in SharePoint. All fine and good, but similar to standard IA approaches and not terribly exciting. Describing the course curriculum doesn’t really communicate the value that students take away from the class. So, I spent some time trying to think through the situations in which I have seen light bulbs go on.

It’s in the structure of interactive exercises.   Students are given seemingly simple problems to solve, yet they yield deep insights.  Take a simple exercise around term derivation. (Used for populating the SharePoint term store.) 

October 19, 2011 - 11:45 GMT

I was really looking forward to attending the Microsoft SharePoint Conference 2011 in Anaheim, CA and the event didn’t disappoint.  Not only did I get to enjoy that southern California weather but I got the chance to get reacquainted with some old friends, meet some new people in the community and immerse myself in my favorite topic: enterprise search.

The number and range of session talks were staggering. A few titles hit me right off the bat as sessions I wanted to see:

  • Creating Beautiful and Engaging Web Sites with SharePoint 2010
  • Best Practices from the Field: Managing Corporate Metadata and Taxonomies with SharePoint 2010
  • The Convergence of ECM and Knowledge Management: Strategies for Success

There were lots more as well, so my days were pretty jam-packed. My most significant take-away from the conference was a general feeling of well-being as a result of learning that our approach to designing information architectures, taxonomies, and metadata schemas for SharePoint was exactly what Microsoft was advocating as best practice.

It was also very interesting (and validating) that the song that we’ve been singing here at Earley & Associates for the last several years – that of Search as an Application – has become mainstream.  There were numerous sessions just on this topic, like:

August 12, 2011 - 4:35 GMT

Earley & Associates recently announced a webinar series on Content in Context: Why Dynamic Content and Content Choreography is Critical to Information Management. Since you may be asking yourself, “what is content choreography?” we thought we’d share the history of the term and what we mean by it.

Back in March of 2011, a major global high tech company engaged Earley & Associates to work on the redesign of a major website, site search, metadata and all new web CMS and DAM infrastructure. It was an enormous undertaking, headed by Marketing and involving brand managers, the SEO team, content authors, creative agencies, a systems integrator, a user experience design agency, technical consultants, and the IT department. The existing sites were to migrate from traditional navigation, search and single page content to a totally new paradigm of dynamic content collections, where user context would be driven by the search experience more than by navigation or site depth. With personalization. And in multiple languages. Taxonomy and metadata would play an important role in each of these areas, but just how well the whole system was going to hang together (“If we do not hang together, we shall all hang separately...”) was a real concern, and the very reason we’d been called in as a sort of SWAT team.

February 15, 2011 - 11:43 GMT

I recently pulled out my yellowed copy of Michael Dertouzos’ 1995 What Will Be: How the New World of Information Will Change Our Lives.  What I found interesting is how some of those predictions were spot on and some oddly naïve about just how much humans can change.

In “What Will Be” the term used to describe how people get their jobs done by leveraging various tools for managing documents and information was “Groupwork”.    Today, we simply use content management applications to get our jobs done.    See my recent blog, “This internet thing? It's gonna be BIG!” for more discussion on what will be, what is, and what is to come.

As I looked back over the last 15 years, I thought about the progress made in content management platforms, and the hype that accompanied each one. “Now we will have an end to information chaos! We can control what goes where and enable easy access!” Sadly, each new offering led to its own flavor of information chaos.

So is SharePoint 2010 the platform that will solve the problem? Or, will we find that information chaos is migrated along with content?   It’s really up to you and your organization. The opportunity is there but don’t take it for granted.

As I talk to companies and other enterprises, I find that most fall into the same trap – they buy a tool, install it, roll it out and wait for their people to get more efficient and effective.  They wait… and wait… and…  Instead of things getting better, they actually can get worse. 

Why is this, I asked myself.   Here are the five things that came immediately to mind.

December 01, 2010 - 1:50 GMT

Note: This article has also been published in full on CMSWire.

Web search is something that most of us take for granted. It’s a split-second operation that offers up page after page of results for what seems like an infinite number of topics. For any particular query, if we don’t immediately see a result that catches our eye, reformulating and executing a new search takes but a moment. With Google, we assume that the items listed on the first page of results - or first few pages for that matter - are the most relevant for the search query we entered, and why shouldn’t we? With such a complex algorithm and thousands of intelligent people working to solve a single problem, this market leader has set the standard when it comes to connecting people with information.

On occasion, however, we find ourselves conducting search after search and end up spending more time than expected trying to find a document that best matches our original query. In such cases we often accept the results we’re given and conclude that an appropriate resource for what we’re looking for must not exist or cannot be found online. While the search results are typically good enough to get us what we need, there is always some room for improvement when it comes to filtering and refinement.

November 04, 2010 - 9:07 GMT

This week I have had the privilege of teaching the AIIM Information Organization and Access (IOA) course at a combined meeting of Joint Task Force North, the Department of Homeland Security, US Army North and US Northern Command.

From the JTF site (http://www.jtfn.northcom.mil/): “The Joint Task Force North is the Department of Defense organization tasked to support our nation’s federal law enforcement agencies in the identification and interdiction of suspected transnational threats within and along the approaches to the continental United States.”

“Transnational threats are those activities conducted by individuals or groups that involve international terrorism, narcotrafficking, alien smuggling, weapons of mass destruction, and includes the delivery systems for such weapons that threaten the national security of the United States.”

One of the primary goals of this mission is the capture and dissemination of knowledge throughout a network whose mission is the protection of the United States.  I was told by the head of the knowledge management organization, Dr Rick Morris, that my contribution would go directly to improving the security of the country.  I have to say that I am truly honored to be making such a contribution to our nation. 

Also from the JTF site: “JTF North’s homeland security support role is articulated in its mission statement:

November 02, 2010 - 10:15 GMT

In an earlier blog, I introduced the term eTaxonomy. eTaxonomy stands for “embedded taxonomies”. Many kinds of IT solutions rely on taxonomy as a core organizing principle (reference data, content object models, information architecture, metadata schemas, etc.) as opposed to simply being a navigational construct. In this blog, I discuss applications of eTaxonomy from our recent client work. Of note are:

  • Search
  • Document and Records Management
  • Content Management
  • Digital Asset Management
  • Ecommerce
  • Marketing Campaign Management

Search

Search is about metadata.  A search application “derives” metadata by creating an index of the content.  The index is information about the content, i.e. metadata. The search tool uses the index to locate documents and pages.  This “derived metadata” can be enriched by adding attributes or keywords with terms defined in a taxonomy.  Taxonomy provides a hierarchical structure of controlled vocabulary terms.   With this structure, search-enabled applications can present related concepts, broaden or narrow the search, and filter results based on “facets” or attributes.  The use of related terms (developed with a “thesaurus” – taxonomy on steroids) provides tremendous power in search applications.
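To make the idea concrete, here is a minimal Python sketch of "derived metadata" (an inverted index) enriched with taxonomy terms. It is only an illustration under stated assumptions: the sample documents, the term hierarchy, the tags and every function name are hypothetical, and it does not reflect how any particular search product works.

```python
from collections import defaultdict

# Hypothetical sample content; not taken from any real system.
DOCS = {
    1: "IBM announces new mainframe servers",
    2: "Choosing a laptop for field sales teams",
    3: "Mainframe maintenance checklist",
}

# Hypothetical taxonomy: narrower term -> broader (parent) term.
PARENT = {
    "mainframes": "computers",
    "laptops": "computers",
    "computers": "hardware",
}

# Hypothetical controlled-vocabulary tags applied to each document.
TAGS = {
    1: {"mainframes"},
    2: {"laptops"},
    3: {"mainframes"},
}


def build_index(docs):
    """Derive metadata: map each lowercased word to the ids of docs containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def broader(term):
    """Return the terms above `term` in the hierarchy, nearest first."""
    chain = []
    while term in PARENT:
        term = PARENT[term]
        chain.append(term)
    return chain


def narrower(term):
    """Return `term` plus every term below it in the hierarchy."""
    found = {term}
    changed = True
    while changed:
        changed = False
        for child, parent in PARENT.items():
            if parent in found and child not in found:
                found.add(child)
                changed = True
    return found


def docs_for_facet(term):
    """Facet filter: docs tagged with `term` or any of its narrower terms."""
    terms = narrower(term)
    return {doc_id for doc_id, tags in TAGS.items() if tags & terms}


def search(index, word, facet=None):
    """Keyword lookup in the index, optionally restricted to a taxonomy facet."""
    hits = set(index.get(word.lower(), set()))
    if facet is not None:
        hits &= docs_for_facet(facet)
    return sorted(hits)


INDEX = build_index(DOCS)
```

Calling search(INDEX, "mainframe") uses only the derived index, while passing facet="laptops" narrows the same query to documents tagged under that term. Broadening works the other way: broader("mainframes") walks up the hierarchy so an application could offer "computers" as a wider filter.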

Document and Records Management

September 24, 2010 - 10:14 GMT

Last week, Seth Earley blogged about the inefficacy of social tagging, but there's one scenario in which social tagging will breathe new life into an esoteric, 200-year-old industry: book indexing.

I've written hundreds of book indexes, presided over the American Society for Indexing, managed an international indexing partnership, taught courses, established standards, built tools, and consulted with a lot of influential folks, so trust me when I tell you that it pains me to see this happening. I believe with every fiber of my professional being that the human work of subject indexing is and will continue to be superior in quality to every alternative ever imagined. Oh well.

There is just too much information to index by hand, period. Books, periodicals, websites, blogs, messages, and documents are being produced or transformed too quickly for humans to keep pace, regardless of training and tools. Perhaps in response, the use of search algorithms becomes ever more popular, while overly optimistic expectations of retrieval quality grow increasingly preposterous. A more realistic response would be an increase in subject indexers' fees -- after all, demand is outpacing supply at an astounding rate -- but indexers haven't experienced a rate increase since the 1990s. The truth is that editorial indexing and all smart hands-on tagging are disappearing in favor of automatic approximations. And it is a reasonable argument that the substandard tagging of millions of pages and documents is better than leaving most of them without any subject metadata whatsoever.

August 09, 2010 - 11:37 GMT

In this article (originally published via CMSWire) we examine the desire to duplicate the Google experience in the enterprise by attempting to change our perspective on what we expect from enterprise search based on what we’re willing to do to make it work. 

November 20, 2009 - 1:29 GMT

The Faceted Fallacy

... If a tree falls in the forest and no one is there to hear, does it make a sound?

Yes, I know it’s a silly old question with no real definitive answers, but it makes our brains think creatively about ambiguous problems, which is fun. A recent thread in the Taxonomy Community of Practice really got me thinking this way in relation to taxonomy.

To summarize the thread, the question was raised, what are the most commonly used facets in an enterprise taxonomy? In response one member posted a “definitive” list of primary facets that could be used as an exhaustive skeleton for the “enterprise”.  From here the conversation split in multiple ways:

July 17, 2009 - 1:02 GMT

Recently on the Taxonomy Community of Practice, a member asked the following question on faceted taxonomy design:

"I'm researching about Faceted Navigation and Information Retrieval. I've been looking over the Internet for some articles/books/white papers about which is the best number of facets to use on a classification."

Interesting question, especially given the popularity of faceted search and taxonomy. The community discussed the topic, and a few answers were provided by members.

May 26, 2009 - 8:00 GMT

I often get frustrated by those who think Google is the greatest search engine that ever parsed. Don't get me wrong - I like Google, I use Google, I employ it as a verb. But if I hit the search button and get wonky results, I recognize that they are wonky and am not afraid to blame Google. (Full disclosure: I have a library science background which I'd like to think has made me into a pretty good searcher, so I will usually try a few different queries before I point the finger at the machine.)

On most of our consulting engagements, at least one person will say "I want our search to be more like Google." I have a few problems with this kind of statement. Partly it's that most folks aren't terribly critical when it comes to evaluating the relevance of Google results. It's what we know, it's what we're used to. We don't mind that Wikipedia is almost always the first result on any query - many might find that a "feature".  We're generally happy to take whatever shows up in those top 10 results and roll with it regardless of what it is, mostly because we can't know everything that is out there so we trust Google to filter it for us. We satisfice (yes, it's Wikipedia, joke intended.)

February 14, 2009 - 9:35 GMT

My last few blog posts on keyword research tips have generated interest from our readers regarding the relationship between the SEO task of keyword research and taxonomy. The purpose of today’s post is to examine the intersection between the two and offer a little advice for reconciling the internal perspective of taxonomy with external internet search. We can harmonize these perspectives using a data-driven approach to understand the "mental model" of the external searcher.

December 22, 2008 - 3:21 GMT

In my last post I discussed a process for putting together a broad list of keywords intended to act as the starting point for our keyword research. The purpose of this step was to give us the ability to cast as wide a net as possible in an effort to uncover as much as we can of the language being used by our potential customers when searching for our content, products and/or services online. Doing so not only gives us the opportunity to wisely target the correct keywords, but also lets us craft our content in such a way as to tap into as much of the long tail as possible. To illustrate, I’ll use the following Top Content report from Google Analytics. As you can see, this particular page, although targeted toward a specific set of keywords, generated traffic from an amazing 5,766 unique keyword combinations! This alone demonstrates the power of the long tail in driving significant amounts of traffic to your website.

Tapping into the Long Tail of Search

Keep in mind that you don’t want to generate traffic just for traffic’s sake; you want these visitors to do something while on the site, whether it’s to buy your product, fill out a form or contact your company. Web analytics aside, now that we’ve done all the groundwork and assembled our master list of terms, we’re ready to tackle the research part of our keyword research.

November 16, 2008 - 8:39 GMT

I recently met with a client who said "no one uses facets for searching..."  I expressed surprise at this comment and probed a bit as to why they thought so.  We opened their home page and I soon surmised why.  The facets they had did not seem to be very useful.  No one had tested them and I had not yet spent any time in analysis, but at first glance, they did not provide context for their content.  I recall one facet was "content type" and contained terms like "pdf", "doc" and "jpg".  There were also ambiguous terms like "article", "white paper" and "research".  I am not necessarily saying these were not useful, but I did not understand the difference between "white paper" and "research".  (Perhaps a frequent user would).

The point here is that faceted search is incredibly powerful but only if the facets make sense to users and the terms are clear, concise and meaningful.  Terms have to help users locate what they want and not frustrate them in the process.

In support of that goal, I wanted to point out some examples of bad facets - the facets that don't help anyone and that sully the good name of faceted search. 

Here is an example from the Verizon Wireless site:  

Verizon Wireless Taxonomy Facets

October 06, 2008 - 1:03 GMT

"Forget taxonomy, ontology, and psychology. Who knows why people do what they do? The point is they do it, and we can track and measure it with unprecedented fidelity. With enough data, the numbers speak for themselves." (1)

Recently Chris Anderson wrote an article for Wired magazine called The End of Theory. The thesis of the article in a nutshell is that the impending petabyte era of data storage signals the end of the traditional scientific method of discovery. No longer are we bound to the outdated model of observation, hypothesis and measurement. Computers (developed by Google & IBM) "can throw the numbers into the biggest computing clusters the world has ever seen and let statistical algorithms find patterns where science cannot."

August 21, 2008 - 5:30 GMT

Last week Stephanie wrote a post about the importance of considering specific facets of search engine optimization in helping taxonomists guide clients in choosing the right keywords. To further that discussion, I thought I’d put together a series of posts to speak in more detail about using keyword research as a tool for determining (or at least being consciously aware of) the language being used by those searching for your content, products and/or services online.

Preparation - Creating Your Master List

The first step in the process is the groundwork. I always allocate a certain amount of time up front to plan and prepare the list of initial keywords to be used as a basis for conducting keyword research. You need to have an inventory of words or phrases to get started, so why not put some thought and effort into generating a solid list to work from? From my perspective, the better the plan, the better the results. So let’s get to it.

August 14, 2008 - 4:59 GMT

In my last post, I mentioned the difficulty that some clients/stakeholders have in letting go of certain terminology when they undertake a taxonomy project:

Search engine optimization (SEO) has become one of the most important tools in helping us taxonomists get hard data that is meaningful and fight against the inclusion of terms that are too cute, ambiguous or otherwise detract from the findability of content.

August 11, 2008 - 7:37 GMT

There are three different types of relationships in taxonomies: 

  • Equivalent (Synonyms: "International Business Machines = IBM")
  • Hierarchical (Parent/Child: "Computer Manufacturers => IBM")
  • Associative (Concept/Concept: "Software Group - Software")

Heather Hedden's presentation on taxonomy powered discovery for a recent Boston KM Forum contained an interesting set of examples for how to organize the last type of conceptually related term sets.

  • Process and agent: Programming - Programmers
  • Process and instrument: Skiing - Skis
  • Process and counter-agent: Infections - Antibiotics
  • Action and property: Environmental cleanup - Pollution
  • Action and target: Auto repair - Automobiles
  • Cause and effect: Hurricanes - Flooding
  • Object and property: Plastics - Elasticity
  • Raw material and product: Timber - Wood products
  • Discipline and practitioner: Physics - Physicists
  • Discipline and object: Literature - Books
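The three relationship types can be sketched as a small data structure. This is only an illustrative Python sketch: the Term class, its field names and the helper functions are hypothetical, and treating one label as the "preferred" name for a concept is an assumption made here rather than something stated above. The sample terms reuse the examples from the lists.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class Term:
    """One concept in a hypothetical thesaurus-style taxonomy."""
    name: str
    synonyms: Set[str] = field(default_factory=set)        # equivalent
    parent: Optional[str] = None                           # hierarchical
    related: Dict[str, str] = field(default_factory=dict)  # associative: term -> relationship label


TERMS = {
    "Computer Manufacturers": Term("Computer Manufacturers"),
    "IBM": Term(
        "IBM",
        synonyms={"International Business Machines"},
        parent="Computer Manufacturers",
    ),
    "Programming": Term("Programming", related={"Programmers": "process and agent"}),
    "Hurricanes": Term("Hurricanes", related={"Flooding": "cause and effect"}),
}


def preferred(label):
    """Resolve a label or synonym to its preferred term name, or None if unknown."""
    for term in TERMS.values():
        if label == term.name or label in term.synonyms:
            return term.name
    return None


def see_also(label):
    """Return (related term, relationship label) pairs for a label or synonym."""
    name = preferred(label)
    if name is None:
        return []
    return sorted(TERMS[name].related.items())
```

With this shape, a synonym lookup such as preferred("International Business Machines") resolves to the same concept as "IBM", the parent field supports drilling up or down, and see_also("Hurricanes") surfaces the associative link to "Flooding" along with the kind of relationship it represents.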

March 26, 2008 - 3:23 GMT

Enterprise Search Survey

Sue Feldman, research VP, content technologies at IDC, and Michelle Manafy, editor of EContent magazine and the Enterprise Search Sourcebook and conference programmer for the Enterprise Search Summit, invite you to participate in a short (2-4 minute) survey about enterprise search tool selection. They will present the results at the Summit and on the Enterprise Search Center.

February 15, 2008 - 3:24 GMT

From time to time we organize a free educational conference call series on search, taxonomy or content management. Next month, we'll be running our Search Series. Register at Search Solutions Jumpstart.

January 15, 2007 - 10:35 GMT

Challenges of Search

I just returned from a conference in Rome where I presented a session on search. The basic premise is this: Search is not a utility. Search is an application. Search needs to be thought through and integrated into the process that it is meant to support. This does not mean that there is no place for basic search - the plug and play utility model that tools like Google Search Appliance leverage. In that case, search provides a valuable function in helping people access large stores of unorganized content. As much Google bashing as I do, I am a frequent user of Google Desktop. Hypocritical? I don't think so. Google Desktop is appropriate for what I use it for - searching through email messages and my hard drive for certain types of information. Sometimes I find what I am looking for and sometimes I don't. But this is because of the relative effort I place on organizing my content versus the time it takes to do so. It's easier for me to search as I do and risk not finding something than it is for me to organize all of my email. On the other hand, I have a more structured method for the information that I place higher value on - proposals, SOWs, client project documents and conference presentations.

November 07, 2006 - 1:14 GMT

I was looking over my keyword analysis on the earley.com web site and I came across a strange search string. We had several hits for "Silver mining tools". I thought, how odd... I entered the search in Google and there we were: Earley & Associates had the number 1 spot on Google for Silver Mining Tools. Right next to companies that provided equipment for miners along with paid ads for sites touting silver stocks. I then noticed why Google thought this a good match. We have an article titled "Text Mining: Search's Silver Lining".

August 08, 2006 - 9:34 GMT

I wanted to post a preliminary note about our new Jump Start calls planned for September and October. Each series will consist of four 90-minute calls, each with two to three presenters who are experts and practitioners in their fields. Here is the tentative schedule:

Content Management Jumpstart: Each Monday 1:30 - 3:00 EST September 25th through October 16th - Topics will include:

  • The basics of content management
  • Building a content management business case
  • Content management frameworks & governance
  • Integrating content management with business processes
  • Tagging, metadata and taxonomies
  • Content publishing models & reuse
  • Personalization & targeted content
  • CMS selection & deployment
  • Web content management & syndicating content
  • Global content management

Search Solutions Jumpstart: Each Friday 1:30 - 3:00 EST, October 13th through November 3rd - Topics will include:

July 23, 2006 - 11:24 GMT

I was just listening to a web cast of IDC's called "Enterprise Search 360: A Company-Wide Case Study with National Instruments and IDC". This was a commercial for search vendor FAST Search and Transfer, but one of the best kinds of commercials - one by an enthusiastic customer. The web cast should be available here. Julie Schlembach, Search Program Manager, National Instruments, did a terrific job of explaining what they have accomplished with search and content management. I really liked her style and the detail of her slides.

That said, I disagree with the basic premise that "Unified Information Systems" are the "Next Big Thing". Or that search tools are the silver bullet that will allow for "a single point of access to multiple points of information". "Unified information platforms could disrupt the business intelligence market by extending user access to far more information than previously achieved by search", claims Sue Feldman, analyst for IDC.