SharePoint Content Structure - Let a thousand content types bloom?

"How many content types should you have?"

This question came up in a conference call last week on SharePoint architecture. The organization had implemented its corporate portal on SharePoint 2007 and wanted to move forward with more portal sites, but had some concerns about the approach to information architecture it had taken.

I gave the answer I would give for any technology: "Only as many as you really need to implement the appropriate level of metadata, workflow and templates." Which is, of course, vague, as most good consultant-ese is. I followed up with some stats: when we work on web content management implementations, we typically end up with about 10-15 content types for a site of medium complexity. We always try to keep the structure simple and the number of content types small, for many good reasons ranging from ease of content structure management to the user experience of content publishers.

The folks on the phone were quiet for a minute... You see, the previous consultant they had worked with had a bit of a different (read: opposite) approach. The philosophy they described was that SharePoint content types should be created to the maximum degree of granularity (e.g. one content type per library) so as to reduce the need for content publishers to select a content type and tag metadata values. For example, if you had a site for human resources forms, you would have one library and content type for medical forms, one library and content type for dental forms, etc. Each content type would be extremely specific and require little tagging. "If you need 30,000 content types, then so be it" is the idea. (Insert eye twitch.)

The intent behind this, to reduce uncertainty and effort for content publishers, is noble and good, and in some specific cases it might be the right approach. But in general, overly granular content types seem to be a case of using a sledgehammer to kill a fly. To help explain why, I thought I'd enlist the help of a couple of friends and colleagues.

First, I emailed content management guru Bob Boiko, author of the Content Management Bible, to see if he agreed. His response?

"How many content types is the right number? The fewest possible to squeeze the most value out of the info you possess. If it were my system, I would create a generic type and put all the info that I could not find a business justification for into that bucket. It’s not worth naming if you can’t say clearly why you are managing it. Then I would start with the info we have decided is most valuable and put real energy into naming the type and fleshing out the metadata behind it. Then on to the next most valuable and so on till I ran out of resources. In that way, the effort of typing is spent on the stuff that is most likely to repay the effort."

Amen to that! But I also wanted to get a tool-specific view from my friend and colleague, SharePoint expert Shawn Shell. So I Skyped him...

Me: So, what do you think?

Shawn: Well, having a content type for every document library is certainly an interesting approach, though I think your SharePoint administrators, as well as your users, might go quite mad.

Me: So, I think the argument is that having this many content types is supposed to make it easier on the users by presetting all choices and removing the potential for error. If you never have to choose a content type because each library has a very specific default that matches the content you are creating, then there's no confusion, the idea seems to be... From a general content management perspective, this is flawed. But what about from a SharePoint-specific standpoint?

Shawn: I can understand why this might make sense on the surface. Unfortunately, I think you end up exchanging one kind of confusion for another. Further, there's a huge maintenance implication as well. For example, if you have a content type for each library, you are, for all practical purposes, requiring the user to decide where to physically store a document. This physical storage then implies your classification, regardless of whether a default content type is applied.

Me: So, you're basically recreating all the ills of a fileshare folder structure.

Shawn: In essence, yes. To make matters worse, more complex SharePoint environments will necessarily include multiple applications and multiple site collections. Because content types are bound to a site collection, administrators will have a lot more work to create and maintain them and to ensure consistency across the applications and site collections. This would be true in any case, but when you have such an overload of content types and libraries, the complexities of management are compounded.

Me: So, if you have 50 content types, and you need to use them in 2 or 3 site collections, you'd have to create 100 to 150 content types. Good argument to keep your use of content types judicious. Is there a hard limit to the number of content types one can manage in a site collection?

Shawn: The answer is "sort of." There's no specific hard limit to the number of content types in a site collection, but there are some general "soft limits" in the product around numbers of objects (generally 2000). This particular limit is an interface limit: users will see slower performance if you're trying to display more than 2000 items. The condition won't typically manifest itself for normal users, but it will for administrators. The other real limit is that the content type schema can't exceed 2 GB. While this seems like a pretty high limit, if you have a content type for each library, loads of libraries in a site collection and robust content types, there's certainly a chance of hitting it.

Me: What about search? I assume that a plethora of content types would have adverse effects on search.

Shawn: It absolutely does. Like everything we've discussed here, the impact is primarily twofold: 1) administration and 2) user experience. Content types, as well as columns, can be used as facets for search. If you have an overwhelming number of facets in results, the value facets bring is reduced. Plus, as I mentioned before, having large numbers of content types could also produce performance problems when trying to enumerate all of the types included in the search results.

From an administrative standpoint, we're back to managing all of these content types across site collections, ensuring that the columns in those content types are mapped to managed properties (a requirement for surfacing the metadata in search results) and, if you have multiple Shared Services Providers (SSPs), that this work is done across all SSPs.

Me: I expect there will also be a usability issue for those trying to create content outside of the SharePoint interface. Wouldn't users have to choose from the plethora of content types if they started in Word, for example?

Shawn: This is another excellent point. Often, when discussing solutions within SharePoint, we think only of the web interface. When developing any solution, however, you need to keep both the Office and Windows Explorer interfaces in mind as well. Interestingly, using multiple document libraries, with a content type for each library, makes a little more sense from the end user's perspective, since it's similar to physical file shares and folders.

However, the same challenges that many organizations are facing related to management of file shares can manifest themselves when using the multiple library and matching content type approach as well -- putting these organizations back in the same unmanageable place they started.

Me: Great, thanks Shawn for your insights! I'll be sure to spread the word to avoid a content type pandemic.

So there you have it, folks. As a general rule, less is more. Standardize, simplify and don't let your content types multiply needlessly. Your content contributors and SharePoint administrators will thank you.
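To put rough numbers on the maintenance argument from the conversation: because content types are bound to a site collection, each one has to be recreated in every collection that uses it. The following is only a back-of-the-envelope sketch in Python. The function names are made up for illustration, it assumes every content type is needed in every site collection, and it treats the 2000-object figure Shawn mentioned as a soft threshold, not an enforced limit.

```python
# Soft interface threshold mentioned in the conversation (a guideline, not a hard limit).
SOFT_OBJECT_LIMIT = 2000


def total_definitions(distinct_types: int, site_collections: int) -> int:
    """Content type definitions to create and keep in sync.

    Assumes every content type is needed in every site collection.
    """
    return distinct_types * site_collections


def exceeds_soft_limit(types_in_collection: int) -> bool:
    """True when one site collection holds more content types than the soft threshold."""
    return types_in_collection > SOFT_OBJECT_LIMIT


if __name__ == "__main__":
    for collections in (2, 3):
        count = total_definitions(50, collections)
        print(f"{collections} site collections: {count} definitions to maintain")
    # One content type per library, with a hypothetical 2,500 libraries in one collection.
    print("Over the soft limit:", exceeds_soft_limit(2500))
```

For 50 content types this works out to 100 definitions across two site collections and 150 across three, which is why a dozen or so well-chosen types is so much easier to live with than one type per library.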

As an FYI, I'd like to add the following regarding term store and managed metadata limits (repurposed from an article on Microsoft TechNet titled Capacity Management for SharePoint 2010):

| Limit / Restriction | Maximum Number | Comments |
|---|---|---|
| Maximum number of levels of nested terms in a term store | 7 | Terms in a term set can be represented hierarchically. A term set can have up to seven levels of terms (a parent term and six levels of nesting below it). |
| Maximum number of term sets in a term store | 1,000 | You can have up to 1,000 term sets in a term store. |
| Maximum number of terms in a term set | 30,000 | 30,000 is the maximum number of terms in a term set. Additional labels for the same term, such as synonyms and translations, do not count as separate terms. |
| Total number of items in a term store | 1,000,000 | An item is either a term or a term set. The sum of the number of terms and term sets cannot exceed 1,000,000. Additional labels for the same term, such as synonyms and translations, do not count as separate terms. You cannot have both the maximum number of term sets and the maximum number of terms simultaneously in a term store. |
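
If you are planning a taxonomy, the limits above are easy to sanity-check before you build anything. Here is a minimal Python sketch. The `TermSetPlan` class and `check_term_store` function are hypothetical planning helpers, not part of any SharePoint API, and the constants are simply copied from the limits listed above.

```python
from dataclasses import dataclass
from typing import List

# Limits as listed above for SharePoint 2010 term stores.
MAX_NESTING_LEVELS = 7
MAX_TERM_SETS = 1_000
MAX_TERMS_PER_SET = 30_000
MAX_ITEMS_IN_STORE = 1_000_000


@dataclass
class TermSetPlan:
    """A planned term set (hypothetical planning record)."""
    name: str
    term_count: int  # synonyms and translations are not counted
    depth: int       # levels of terms, counting the top-level parent term as 1


def check_term_store(plans: List[TermSetPlan]) -> List[str]:
    """Return a list of human-readable problems; an empty list means the plan fits."""
    problems = []
    if len(plans) > MAX_TERM_SETS:
        problems.append(f"{len(plans)} term sets exceeds {MAX_TERM_SETS}")
    for plan in plans:
        if plan.term_count > MAX_TERMS_PER_SET:
            problems.append(f"{plan.name}: {plan.term_count} terms exceeds {MAX_TERMS_PER_SET}")
        if plan.depth > MAX_NESTING_LEVELS:
            problems.append(f"{plan.name}: {plan.depth} levels exceeds {MAX_NESTING_LEVELS}")
    # An item is either a term or a term set, so both count toward the store total.
    total_items = len(plans) + sum(plan.term_count for plan in plans)
    if total_items > MAX_ITEMS_IN_STORE:
        problems.append(f"{total_items} items in the store exceeds {MAX_ITEMS_IN_STORE}")
    return problems


if __name__ == "__main__":
    # Maximum term sets, each with the maximum number of terms: fails on the store total.
    maxed_out = [TermSetPlan(f"set-{i}", 30_000, 3) for i in range(1_000)]
    for problem in check_term_store(maxed_out):
        print(problem)
```

The example at the bottom illustrates the last row of the table: 1,000 term sets of 30,000 terms each pass the per-set checks but add up to far more than 1,000,000 items, so the two maximums cannot be reached at the same time.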