I spent a couple of days recently at the Semantic Technology 2011 conference in San Francisco. A number of different themes and implementations struck a chord with me – and I could survey them all at length. But for now I will focus on two particular implementations that showcase both successes, for all of us to learn from, and themes/directions, for information management professionals to pay heed to. The two themes, in my view of this conference, are: Ontology is the New Taxonomy and Managing Vocabulary to Build Semantics-Based Knowledge Experiences.
In this post I want to begin a discussion on the first--using semantic technology to build ontology-based websites--by telling the story of the BBC (British Broadcasting Corporation) World Cup Website. Quite apart from it being a landmark achievement, there are also huge implications for those who build taxonomies – since taxonomies as we currently know them (“flat” hierarchies, faceted or not) are likely going to play a more minor role in the emerging semantic web world.
This presentation by John O’Donovan (now at the Press Association, but then chief architect of BBC News and Sport Interactive) of the work at the BBC was the first that strongly resonated with me.
Briefly stated, the BBC's World Cup web site was almost certainly the biggest (at the time) pure-play implementation of semantic web technologies on a commercial media site. Or … as one pundit put it … “if there was a World Cup for the Semantic Web, then the BBC may have lifted the trophy for its country.” There you go … Brit humor in action. If you can’t excel at the sport … then at least you can excel at something else. I, being British, love it. :-)
So, what was so paradigm-shiftingly different? And how was it done?
The BBC World Cup Site was “large” – over 700 pages. In fact it was precisely 776 pages – 32 team pages, 8 group pages and 736 individual player pages (32 squads of 23 players). Usually, a site of 700+ pages – and remember these pages were changing frequently over the course of the World Cup competition as teams and players were involved in “events” – requires a great deal of manual interaction and editorial curating to build. Not in this case.
And what exactly drove John O’Donovan to embrace a semantic implementation? In his own words: “but search technologies and previous methods for doing this have proven to be inaccurate and there is no point in having all these pages if the quality of them is perceived to be low. You don't want to get content mixed up between different players with the same surname, for example.” Clearly, certain bundles of semantic technologies, optimally architected, can deliver more “precision” and “recall” than tried and (un)true search-based methods (which also carry the overhead of applying metadata, by the way). On the BBC site you can read two articles that provide more detail. The first article is an overview of the project and the second provides more details about the BBC architecture and technologies.
The underlying ontology, which described how World Cup facts related to each other, might seem “simple” to those who work with large-scale enterprise taxonomies. It was built around teams, players and the groups they competed in. For example, "Thomas Mueller" was part of the "Germany Squad" and the "Germany Squad" competed in "Group D" of the "FIFA World Cup 2010". The ontology also included journalist-authored assets such as stories, blogs, profiles, images, video and statistics. Simple, but powerful.
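To make the "Thomas Mueller" example concrete, here is a minimal sketch of how those facts could be held as subject-predicate-object triples. It is only an illustration in plain Python: the predicate names (`memberOf`, `competesIn`, `partOf`) and the helper functions are my own assumptions, not the BBC's actual ontology terms or tooling.

```python
# Each fact is a (subject, predicate, object) triple.
# Predicate names are illustrative assumptions, not the BBC's vocabulary.
TRIPLES = [
    ("Thomas Mueller", "memberOf", "Germany Squad"),
    ("Germany Squad", "competesIn", "Group D"),
    ("Group D", "partOf", "FIFA World Cup 2010"),
]


def objects(subject, predicate, triples=TRIPLES):
    """Return every object linked to `subject` by `predicate`."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def reachable(start, triples=TRIPLES):
    """Follow outgoing links from `start` and list every concept reached."""
    seen = []
    frontier = [start]
    while frontier:
        node = frontier.pop()
        for s, _, o in triples:
            if s == node and o not in seen:
                seen.append(o)
                frontier.append(o)
    return seen


# A story tagged only with the player can still be placed on the
# squad, group and tournament pages by walking the links.
pages_for_player = reachable("Thomas Mueller")
```

The point of the sketch is the one a flat taxonomy struggles with: a single tag on the player is enough to infer the team, group and tournament pages, because the relationships themselves are part of the model.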
Journalist-authored content was automatically analyzed against the World Cup ontology. A natural-language and ontological determiner process automatically extracted the World Cup concepts embedded in the text of a journalist’s story. The concepts were then moderated and selectively applied before publication. Moderated, automated concept analysis improved the depth, breadth and quality of metadata publishing in this implementation.
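The suggest-then-moderate workflow can be sketched in a few lines. To be clear, this is a toy stand-in: it uses naive label matching where the BBC used a real natural-language extraction process, and the label table, function names and sample sentence are all hypothetical.

```python
import re

# Hypothetical lookup from surface forms in text to ontology concepts.
LABELS = {
    "thomas mueller": "Thomas Mueller",
    "mueller": "Thomas Mueller",
    "germany": "Germany Squad",
    "group d": "Group D",
}


def suggest_concepts(text, labels=LABELS):
    """Suggest ontology concepts whose labels appear as whole words in the text."""
    lowered = text.lower()
    found = []
    for label, concept in labels.items():
        pattern = r"\b" + re.escape(label) + r"\b"
        if re.search(pattern, lowered) and concept not in found:
            found.append(concept)
    return found


def moderate(suggestions, approved):
    """Keep only the suggestions a human editor has approved."""
    return [c for c in suggestions if c in approved]


story = "Mueller scored twice as Germany topped Group D."
suggested = suggest_concepts(story)
# The editor accepts the player and the squad but rejects the group tag.
tags = moderate(suggested, approved={"Thomas Mueller", "Germany Squad"})
```

Note that plain string matching like this would fall straight into the trap O’Donovan describes: two players with the same surname would get the same tag. Resolving that kind of ambiguity is exactly what the ontology-aware extraction and the moderation step are there for.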
You can read more about the BBC Sports Ontology – to be used big time in next year’s Olympics being held in … the U.K. Looking at the example visual diagram you can see how different it is from a classic faceted taxonomy. (And, yes, I acknowledge there are logical similarities between the two kinds of models – and these similarities will make it plain sailing for current taxonomy designers to transition their skillset to ontology building and maintenance.)
Oh … and by the way … John O’Donovan’s presentation was subtitled - A Compelling Case and New Architectural Pattern for Semantics in Every Enterprise. I couldn’t write it better myself. And just let me re-iterate – ontology is going to be the new taxonomy. (Personal opinion.) Think about it.