Aggregate Content: Why You Need to Have It

In today’s world of content and data coming from multiple sources and in multiple formats, aggregate content is a topic of much discussion. Why is it important? How is content ingested and indexed? How is it rendered? How is it made searchable? These questions matter, and they reflect the complex and dynamic nature of aggregate content across web environments.

The first premise is that content needs to be indexed in a way that makes it searchable and scalable for any web platform and search engine. Users expect that queries will find all relevant content regardless of structure, format, and location. An ecommerce site that offers not just products but also services, articles, buying guides, and installation guides must be able to present relevant results from all of these categories. Furthermore, the content in these categories is likely a mix of structured and unstructured data, and some of it may not be in text form at all. Images and video are becoming increasingly important, and bridging the gap between the in-store and online shopping experience means leveraging these content types as much as possible.
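To make the idea of one index over many content types concrete, here is a minimal sketch of an inverted index that covers products, articles, guides, and a video described only by tags. The sample documents, field names, and function names are all assumptions made for illustration; a production search engine does far more (ranking, stemming, synonyms).

```python
from collections import defaultdict
import re

# Hypothetical catalogue mixing structured and unstructured items.
DOCUMENTS = [
    {"id": "p1", "type": "product", "text": "Cordless drill 18V with two batteries"},
    {"id": "a1", "type": "article", "text": "How to choose a cordless drill for home projects"},
    {"id": "g1", "type": "installation_guide", "text": "Installing shelf brackets with a drill"},
    # Non-text content is made searchable through its metadata tags.
    {"id": "v1", "type": "video", "text": "", "tags": ["drill", "demo"]},
]


def tokens(doc):
    """Lower-case word tokens from the text plus any metadata tags."""
    words = re.findall(r"[a-z0-9]+", doc.get("text", "").lower())
    return set(words) | {t.lower() for t in doc.get("tags", [])}


def build_index(docs):
    """Map each token to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc in docs:
        for tok in tokens(doc):
            index[tok].add(doc["id"])
    return index


def search(index, docs, query):
    """Return ids matching every query term, grouped by content type."""
    terms = re.findall(r"[a-z0-9]+", query.lower())
    if not terms:
        return {}
    ids = set.intersection(*(index.get(t, set()) for t in terms))
    grouped = defaultdict(list)
    for doc in docs:
        if doc["id"] in ids:
            grouped[doc["type"]].append(doc["id"])
    return dict(grouped)


INDEX = build_index(DOCUMENTS)
```

With this sample data, a query for "drill" returns one hit in each of the four content types, while "cordless drill" narrows the results to the product and the article.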

So how is this possible? With deeper ontological tools like key-value pairs and subject/predicate/object relationships, modern search engines and web platforms are able to aggregate content and create their own indices without relying on more traditional structured records. By looking for common terms and grouping them together, engines like Endeca are able to semantically group content and essentially create their own temporary records that can be matched against user queries. This powerful ability allows aggregate content to be leveraged by ecommerce sites and nonprofit organizations alike.
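As a rough illustration of these building blocks, the sketch below folds subject/predicate/object facts into key-value records and groups subjects that share a common term. It is not how Endeca or any particular engine works internally; the triples, predicate names, and function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical subject/predicate/object facts drawn from different sources.
TRIPLES = [
    ("drill-18v", "is_a", "cordless drill"),
    ("drill-18v", "has_voltage", "18V"),
    ("guide-42", "is_a", "buying guide"),
    ("guide-42", "mentions", "cordless drill"),
    ("review-7", "mentions", "cordless drill"),
    ("review-7", "has_rating", "4"),
]


def to_records(triples):
    """Fold triples into one key-value record per subject."""
    records = defaultdict(lambda: defaultdict(list))
    for subject, predicate, obj in triples:
        records[subject][predicate].append(obj)
    return {s: dict(props) for s, props in records.items()}


def group_by_term(triples, term):
    """Collect the subjects that share a term as an object of any predicate."""
    return sorted({s for s, _, o in triples if o == term})


RECORDS = to_records(TRIPLES)
RELATED = group_by_term(TRIPLES, "cordless drill")
```

Here `RELATED` brings a product, a buying guide, and a review together under one shared term, which is the kind of temporary grouping that can then be matched against a user query.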

Modern internal search engines essentially behave like web crawlers, scanning documents and indexing them accordingly. Not only can internal content be included, but any specified external content can also be added to the index. Say an ecommerce site wants to include external as well as internal reviews of its products. Tweets, posts, blogs, and other social media content can be crawled, and the relevant content added to the index to give users a bigger picture of what they are looking for.
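The step of keeping only relevant external content can be sketched as a simple filter. The example assumes the external items have already been fetched by a crawler and uses naive name matching to decide relevance; the product names, sources, and function name are hypothetical, and a real system would use something more robust than substring checks.

```python
# Hypothetical product names already present in the internal index.
PRODUCT_NAMES = {"p1": "cordless drill", "p2": "shelf bracket"}

# Hypothetical external items, assumed to be already fetched by a crawler.
EXTERNAL_ITEMS = [
    {"source": "blog", "text": "This cordless drill survived a year on site."},
    {"source": "social", "text": "Great weather today!"},
    {"source": "forum", "text": "My shelf bracket bent; the Cordless Drill was fine."},
]


def link_external(items, product_names):
    """Attach each external item to every product it mentions by name."""
    linked = {pid: [] for pid in product_names}
    for item in items:
        text = item["text"].lower()
        for pid, name in product_names.items():
            # Naive relevance test: case-insensitive substring match.
            if name.lower() in text:
                linked[pid].append(item["source"])
    return linked


LINKED = link_external(EXTERNAL_ITEMS, PRODUCT_NAMES)
```

The unrelated social post is dropped, while the blog and forum items are linked to the products they mention and could then be added to the index alongside internal reviews.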

In today’s world of aggregate content and big data, it is more important than ever to ensure you are leveraging all available resources to give your users the most comprehensive experience possible. Users are savvier and more connected than ever before, and if you don’t provide them with the widest perspective available, they will find it on their own elsewhere. In addition to providing your users with a more robust experience, you will deepen your analytics capabilities and increase your business intelligence. You will be able to get a deeper and broader view of your customers and their behavior, allowing you to provide them with all the relevant content available before anyone else does.


Seth Earley

Seth Earley is the Founder & CEO of Earley Information Science and the author of the award-winning book The AI-Powered Enterprise: Harness the Power of Ontologies to Make Your Business Smarter, Faster, and More Profitable. He is an expert with 20+ years of experience in Knowledge Strategy, Data and Information Architecture, Search-based Applications, and Information Findability solutions. He has worked with a diverse roster of Fortune 1000 companies, helping them achieve higher levels of operating performance.