Indexing

February 16, 2012 - 10:35 GMT

“I’m writing a book. I’ve got the page numbers done.”  --Steven Wright

It has been both exciting and distressing to see the changes the print publishing world has been facing over the past several years. The word “exciting” is a double-edged sword, of course. The need for change is massive, even in the near term, and yet there are so many paths the industry could (and should) follow that it’s hard to know what’s really going to happen next.

September 24, 2010 - 10:14 GMT

Last week, Seth Earley blogged about the inefficacy of social tagging, but there's one scenario in which social tagging will breathe new life into an esoteric, 200-year-old industry: book indexing.

I've written hundreds of book indexes, presided over the American Society for Indexing, managed an international indexing partnership, taught courses, established standards, built tools, and consulted with a lot of influential folks, so trust me when I tell you that it pains me to see this happening. I believe with every fiber of my professional being that the human work of subject indexing is and will continue to be superior in quality to every alternative ever imagined. Oh well.

There is just too much information to index by hand, period. Books, periodicals, websites, blogs, messages, and documents are being produced or transformed too quickly for humans to keep pace, regardless of training and tools. Perhaps in response, the use of search algorithms becomes ever more popular, while overly optimistic expectations of retrieval quality grow increasingly preposterous. A more realistic response would be an increase in subject indexers' fees -- after all, demand is outpacing supply at an astounding rate -- but indexers haven't experienced a rate increase since the 1990s. The truth is that editorial indexing and all smart hands-on tagging are disappearing in favor of automatic approximations. And it is a reasonable argument that the substandard tagging of millions of pages and documents is better than leaving most of them without any subject metadata whatsoever.

August 14, 2006 - 7:28 GMT

An interesting problem was posed to a mailing list I am a part of...

Imagine that you have been using a single hierarchy to structure and organize your information for years, and it has been very successful up until now...

But now it is time to move to a different content management system, and not only that - business has changed (of course), and not every way of organizing and understanding the information could possibly have been anticipated. (Or perhaps you did anticipate some, but for practical matters limited the amount of metadata you might apply to content.) So users now want to search and navigate in ways you never considered at the start. What do you do?

June 08, 2006 - 12:11 GMT

Indexing and taxonomy creation are closely related processes. In the first case, we start with a body of content, pull from it the key ideas, concepts, and pieces of knowledge that we think users would like to access, and then create pointers to the content. In the second, we look at a body of information and determine the categories that can be used to describe the content (usually without regard to the pointers to instances of terms).

May 21, 2006 - 7:48 GMT

This is another response to a post about the "shared drive problem." Shiv Singh of Avenue A | Razorfish commented that "Every document in an organization is not necessarily important enough to tag. Some organizations address this problem by first determining what knowledge/information/data is worth capturing for retrieval and then putting KM mechanisms in place to capture, codify and distribute it."