The Product Data Syndication Problem

By Dan O'Connor, Director of Product Data, and Joe DiNardo, Co-Founder and COO of

Product data syndication has become a crucial element of any product data program. This has led to new syndication services within existing tools, as well as dedicated tools meant to bring a level of process to syndication. Yet product data syndication remains a huge challenge for most businesses, both because of the sophistication required to manage all the outputs and because few businesses evaluate the size of the syndication problem up front.

Let’s dig into an example of this problem by imagining we are a company that sells products on our own website. We have about 50 products, each with 10 variations based on color and size, and we only sell in the United States. We have an e-commerce platform supported by a Product Information Management (PIM) tool that automates getting product data to our website. The project was remarkably successful, so successful that our leaders want us to get our products on Amazon.

The First Channel Product Data Syndication – Amazon

At leadership’s request, we first determine how to connect to Amazon. Do we want to use spreadsheets or their API? We choose spreadsheets because we don’t yet have the skill set to connect with APIs, so we try to download the spreadsheet for our product group. At that point, we discover that we must complete 3 different spreadsheets, as our product lines cross groupings in Amazon. It’s not a big deal: it’s only 3 spreadsheets.

After analyzing the spreadsheets, we find that the attributes they request differ from what we already collect in our PIM. Some are new attributes, and some are existing attributes whose values need to be adjusted to match Amazon’s data controls. Values in dropdown lists need to be normalized, and the character limits on titles, descriptions, and feature bullets are different from ours. We’ll just change them as we fill out the spreadsheets. We also have to figure out new ways to manage the channel-specific IDs for these products, and pricing can be hard to keep up with. For now, however, we can manage all of that in the spreadsheets. Having a couple of people do this manually for a short while solves the problem today.
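To make the manual work described above concrete, here is a minimal sketch of what "adjusting values to match their data controls" looks like when written down as rules. Everything in it is an assumption for illustration: the `ChannelSpec` structure, the attribute names, and the limits are hypothetical and are not taken from any real Amazon template.

```python
from dataclasses import dataclass, field


@dataclass
class ChannelSpec:
    """Hypothetical description of one channel template's data controls."""
    max_lengths: dict[str, int] = field(default_factory=dict)
    allowed_values: dict[str, set[str]] = field(default_factory=dict)
    # Per attribute: our PIM value -> the channel's dropdown value.
    value_map: dict[str, dict[str, str]] = field(default_factory=dict)


def to_channel_record(
    pim_record: dict[str, str], spec: ChannelSpec
) -> tuple[dict[str, str], list[str]]:
    """Return a channel-ready copy of the record plus a list of problems found."""
    out: dict[str, str] = {}
    problems: list[str] = []
    for attr, value in pim_record.items():
        # Translate our value into the channel's vocabulary, if a mapping exists.
        value = spec.value_map.get(attr, {}).get(value, value)
        allowed = spec.allowed_values.get(attr)
        if allowed is not None and value not in allowed:
            problems.append(f"{attr}: '{value}' is not an allowed value")
        limit = spec.max_lengths.get(attr)
        if limit is not None and len(value) > limit:
            # Report the overrun instead of truncating, so a person can rewrite the copy.
            problems.append(f"{attr}: {len(value)} chars exceeds limit of {limit}")
        out[attr] = value
    return out, problems


# Illustrative values only.
spec = ChannelSpec(
    max_lengths={"title": 20},
    allowed_values={"color": {"Navy", "Black"}},
    value_map={"color": {"navy blue": "Navy"}},
)
record, problems = to_channel_record(
    {"title": "Classic Crew Neck T-Shirt", "color": "navy blue"}, spec
)
print(record["color"])  # Navy
print(problems)
```

The sketch reports over-length text rather than cutting it off, on the assumption that a truncated title is worse than one a person has rewritten to fit.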

After two weeks of filling out spreadsheets, all 500 SKUs are ready to be loaded into Amazon. It takes another week to get everything exactly right, and another week to get the SKUs approved, but we are selling on Amazon. The product data syndication project is a success.

The Next Product Data Syndication – Complexity Arises

In fact, leadership sees such a boost from selling on Amazon that they want to sell on Walmart, Home Depot, and Wayfair. Another project is created, and it is determined that there are spreadsheets for each new retailer that we can download. This shouldn’t be difficult, as we’ve done the same activity for Amazon, right?

Wrong. We find that there are more attributes that are not in our PIM and need to be sourced, and that the requirements for each spreadsheet are different. All our products go to 1 spreadsheet in Wayfair, 3 in Home Depot, and 4 in Walmart. Normalizing this data requires a single source of truth, as relentlessly asking engineers and marketing for data has led to some acrimony between teams. Therefore, it is decided to change the attributes in the PIM to normalize them against all the spreadsheets found thus far. This requires a project within a project to determine which attributes need to be added, changed, or set to mandatory so that all the data needed is in the PIM. This takes months, as there are 11 spreadsheets to normalize against: the 8 new sheets plus the 3 from Amazon. Leadership is getting impatient.
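The "project within a project" is essentially a gap analysis between the PIM and every template. A minimal sketch follows, assuming the attribute names in each template have already been mapped to common names (in practice that mapping is most of the work). The function name, template names, and attributes are hypothetical.

```python
def attribute_gaps(
    pim_attributes: dict[str, bool],
    templates: dict[str, dict[str, bool]],
) -> dict[str, list[str]]:
    """Compare PIM attributes with what the retailer templates request.

    pim_attributes maps attribute name -> whether it is mandatory in the PIM.
    templates maps template name -> {attribute name: mandatory in that template}.
    """
    requested: set[str] = set()
    mandatory: set[str] = set()
    for attrs in templates.values():
        for name, required in attrs.items():
            requested.add(name)
            if required:
                mandatory.add(name)
    return {
        # Requested by at least one template but missing from the PIM.
        "add_to_pim": sorted(requested - pim_attributes.keys()),
        # Present in the PIM as optional, but mandatory for at least one template.
        "make_mandatory": sorted(
            name for name in mandatory
            if name in pim_attributes and not pim_attributes[name]
        ),
    }


# Illustrative data only.
pim = {"title": True, "color": False, "material": False}
templates = {
    "retailer_a_apparel": {"title": True, "color": True, "care_instructions": False},
    "retailer_b_clothing": {"title": True, "country_of_origin": True},
}
print(attribute_gaps(pim, templates))
```

Running the same comparison across all 11 spreadsheets produces one consolidated list of PIM changes instead of 11 separate ones.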

Also, we don’t sell all the same products to all retailers. Now we need to figure out which products belong in which channels and make sure that products specific to one channel don’t end up in another channel where they don’t belong. We can manage this by hand for now, but we will soon need a better solution.
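One possible shape for that better solution is to store the channel assignment next to each SKU and check every outgoing file against it. The sketch below assumes a simple SKU-to-channels mapping; the SKU codes, channel names, and function names are hypothetical.

```python
def products_for_channel(assortment: dict[str, set[str]], channel: str) -> list[str]:
    """List the SKUs that are assigned to the given channel."""
    return sorted(sku for sku, channels in assortment.items() if channel in channels)


def misplaced_skus(
    feed_skus: list[str], assortment: dict[str, set[str]], channel: str
) -> list[str]:
    """List SKUs in an outgoing feed that are not assigned to that channel."""
    return sorted(
        sku for sku in feed_skus if channel not in assortment.get(sku, set())
    )


# Illustrative data only.
assortment = {
    "TSHIRT-NAVY-M": {"amazon", "walmart"},
    "TSHIRT-NAVY-L": {"amazon"},
    "HOODIE-EXCL-M": {"wayfair"},  # exclusive to one channel
}
print(products_for_channel(assortment, "amazon"))
print(misplaced_skus(["TSHIRT-NAVY-M", "HOODIE-EXCL-M"], assortment, "walmart"))
```

A SKU with no assignment at all is treated as misplaced, which errs on the side of not sending a product anywhere it has not been approved for.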

Lastly, the different load mechanisms into our chosen retailers create issues. Some use an SFTP location, some use portals, and some expect API feeds. As it takes time for these files to be processed and approved, a fair amount of our time is spent logging into portals and looking for updates to see if there are errors or delays.

Later in the project, we find out that pricing is a bigger issue now that we’re sending data to multiple retailers. We need to simultaneously juggle a retail price for our e-commerce site, a wholesale price for purchase orders on our B2B site, a forward price for next season, and an MFN discount for select buyers.
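One way to keep those prices from colliding is to store each one under an explicit price type and let each channel declare which type it receives. The sketch below is an illustration under assumptions: the two channel-to-type pairings come from the scenario above, while the identifiers, amounts, and the decision to fail loudly for unmapped channels are hypothetical.

```python
from decimal import Decimal

# Price types named in the scenario above.
PRICE_TYPES = {"retail", "wholesale", "forward", "mfn_discount"}

# Which price type each channel receives. Only these two pairings are given
# in the scenario; any other channel would need its own (assumed) entry.
CHANNEL_PRICE_TYPE = {"own_site": "retail", "b2b_site": "wholesale"}

# (sku, price_type) -> amount
PriceBook = dict[tuple[str, str], Decimal]


def price_for(book: PriceBook, sku: str, channel: str) -> Decimal:
    """Look up the price a channel should receive for a SKU."""
    price_type = CHANNEL_PRICE_TYPE.get(channel)
    if price_type is None:
        raise ValueError(f"no price type configured for channel '{channel}'")
    try:
        return book[(sku, price_type)]
    except KeyError:
        raise ValueError(f"no {price_type} price for SKU '{sku}'") from None


# Illustrative amounts only.
book: PriceBook = {
    ("TSHIRT-NAVY-M", "retail"): Decimal("24.99"),
    ("TSHIRT-NAVY-M", "wholesale"): Decimal("12.50"),
}
print(price_for(book, "TSHIRT-NAVY-M", "b2b_site"))  # 12.50
```

Raising an error for an unmapped channel is a deliberate choice in this sketch: sending no price is easier to recover from than sending a wholesale price to a retail listing.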

But the project team perseveres, and we get all the new attribution into the PIM. After a couple of weeks, the data on our products is backfilled by engineering and marketing, and we have enough data to complete the new retailer spreadsheets. After several weeks of filling out spreadsheets, loading them into portals, and walking them through the approval processes, we have successfully syndicated to the next 3 retailers. Leadership wonders what took so long but is happy with the results.

The Product Data Syndication Puzzle Grows

Building on that success, leadership wants 5 new retailers and 2 marketplaces chosen for our products. Since we launched our products on Amazon, Amazon’s data requirements have changed and our listings need to be updated to match, and Walmart wants everyone to start using their API ASAP. Leadership also heard about A+ Content and wants us to create the new content it requires. Lastly, leadership wants to start selling on Canadian and Mexican websites, so we must include translation services in the mix.

This might seem like an exaggeration, but this IS what happens. The first few syndications are easy, but it quickly becomes overwhelming to normalize the categorizations, attribute requirements, attribute values, and format differences, track data changes by the retailers, and keep asking the sources for data to keep our single source up to date. This is no longer a project: it is a program. But leadership sees this as a project.

Why is this so difficult? Why is our company struggling to keep 500 SKUs live on 5 websites? There are lots of reasons:

  • Every retailer has their own data format, whether it’s spreadsheets, APIs, or portals. This means that manufacturers must supply data in the retailer’s chosen data format, not 1 single format that works for all retailers.
  • Every retailer has their own hierarchy to classify products on their websites, and the attributes for each category differ in data controls (character limits, drop-down lists, number of feature bullets, etc.). This requires data to be transformed before it can be sent to the retailer, which is resource-intensive and error-prone.
  • Retailers change their categories, schemas, and even their data formats on a regular basis. Re-mapping that attribution is time-consuming and involves diligence in watching how your products appear on these websites. Unfortunately, retailers rarely have a service to proactively warn you about changes in their requirements, which means discovering changes generally happens when a file load fails.
  • In marketplaces, distributors and bad actors can take over your product listings by claiming to be the primary seller. Resolving first-party (1P) versus third-party (3P) seller issues is a fairly simple process, but it takes time to monitor listings, keep them under your control, and correct the seller status when it changes.
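Because retailers rarely announce requirement changes, one low-tech safeguard is to keep a saved copy of each template and compare it with a freshly downloaded one before the next load. The sketch below assumes each template has been reduced to a mapping of attribute name to character limit; that representation, the function name, and the sample values are hypothetical.

```python
def diff_template(old: dict[str, int], new: dict[str, int]) -> dict[str, list[str]]:
    """Compare two versions of a template given as {attribute: character limit}."""
    return {
        "added": sorted(new.keys() - old.keys()),
        "removed": sorted(old.keys() - new.keys()),
        "changed": sorted(
            name for name in old.keys() & new.keys() if old[name] != new[name]
        ),
    }


# Illustrative data only.
saved = {"title": 200, "bullet_1": 500, "color": 50}
downloaded = {"title": 150, "bullet_1": 500, "country_of_origin": 2}
print(diff_template(saved, downloaded))
```

An empty result in all three lists means the template has not changed since the saved copy, so a failed load would have some other cause.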

Simplifying Product Data Syndication

The retailers, marketplaces, and other places you send your product data (like Google Merchant Center, GDSN, etc.) have no desire to normalize each other's data formats and requirements. In fact, Amazon and Google tend to penalize listings whose content is not unique compared with other listings for the same product. Whether it is on your own website or someone else's, being unique counts.

To manage this complexity, product data syndication tools are becoming more prevalent, whether as part of a PIM offering or as stand-alone services. Although they can’t resolve the retailer differences, they can simplify the process. This requires:

  • Quality, normalized product data. A PIM as a single source of truth is the easiest way to manage the marketing data for your products, with a product data syndication tool as a way to manage pricing and other channel-specific data from the matrix of complementary systems within your ecosystem. You still need to push data to your websites, your apps, and your Business Intelligence tools. A well-designed data model, clean data, and good connectivity to downstream systems are crucial to start this process.
  • A Product Data Syndication Partner. Not all PIMs have syndication systems built in, and most syndication platforms are not PIMs. Connecting your single source of truth to a syndication partner is vital to this process.
  • Leadership buy-in that product data syndication is a program, not a project. There is maintenance involved with this process, and the projects to manage syndication will never end. Therefore, setting up teams, processes, and budgets to align with a program methodology is vital to achieve ever-changing syndication goals. A well-executed program can significantly increase top-line revenue while increasing market share across all your potential selling channels.

When leadership is ready to understand that a program is the proper option to manage your company's syndication needs, provides a tailor-made platform built specifically for centralizing, cleaning, validating, and distributing high-quality data to your entire e-commerce, helping lower time to market for product launches while increasing revenue and lowering costs.

For over 25 years, Earley Information Science has been helping companies with their product data needs. EIS develops taxonomies, schemas, processes, and governance to generate the highest quality of data to power your product data syndication program.

Meet the Author
Earley Information Science Team