Earley AI Podcast - Episode 19: Knowledge Graphs, Graph Databases, and Predictive UX with Steve Stesney

From SQL Rigidity to Graph Flexibility - Bridging Business and Data Teams to Unlock Knowledge Graph Value

Guest: Steve Stesney, Senior Product and Data Practice Lead at Predictive UX

Hosts: Seth Earley, CEO at Earley Information Science; Chris Featherstone, Sr. Director of AI/Data Product/Program Management at Salesforce

Published on: November 8, 2022

In this episode, Seth Earley and Chris Featherstone speak with Steve Stesney, Senior Product and Data Practice Lead at Predictive UX, whose career spans social media sentiment analysis, federal government data products, public affairs lobbying intelligence, and enterprise knowledge graph consulting. Steve traces his journey from early social media monitoring startups to discovering the hard limits of SQL when trying to answer real-world relationship questions about lobbyists, donors, and government officials - and the moment Neo4j graph databases opened everything up. He shares a practical framework for explaining taxonomy, ontology, and knowledge graphs to executives, explains why building the graph first and asking business questions second is always the wrong order, and describes what "predictive UX" means as a discipline for bridging the gap between data teams and the business stakeholders they serve.

 

Key Takeaways:

  • SQL databases become painfully slow and complex the moment you try to answer real-world relationship questions - graph databases eliminate that complexity by making relationships first-class citizens of the data model.
  • Disambiguation - determining whether "Bob Smith" in one dataset is the same person as "Bob Smith" in another - becomes dramatically easier in graph because every entity accumulates rich contextual metadata and relationship patterns that confirm or deny identity at scale.
  • The executive pitch for knowledge graphs is a three-layer stack: taxonomy provides the shared vocabulary, ontology defines how data concepts relate to each other as a business data model, and the knowledge graph is the AI-ready foundation on which applications are built.
  • Building a knowledge graph first and then asking what business questions it can answer is doing it backwards - user experience research and business pain points must be defined first, then the graph architecture is designed to answer those specific questions.
  • Once executives see what a knowledge graph can do, the floodgates open and they want everything answered immediately - managing that enthusiasm requires a prioritized roadmap anchored to measurable business outcomes and honest data readiness assessments.
  • Knowledge graphs are transformational products, not set-and-forget systems - clients must understand from day one that a graph solution will expand and grow, and that an eighteen-to-twenty-four-month journey involves changing data entry processes, governance, and organizational behavior.
  • The role of the knowledge graph consultant is to sit shoulder to shoulder with both business and data teams, iteratively translating questions from one side to the other, because neither side alone has the full picture of what is possible or what is actually in the data.

 

Insightful Quotes:

"SQL equals predefined equals no flexibility. The queries were getting so complex trying to answer real-life questions - how am I connected to this person, is this person connected to someone I know - that they became burdensome and slow. And that was just in a proof of concept." - Steve Stesney

"I was building it backwards. The user experience, the business challenges, the pain points all have to be exposed first. Then the knowledge graph and data warehouse can be built, and applications on top of that." - Steve Stesney

"Once they drink the Kool-Aid and see what this can do, the floodgates open and they want to answer everything now. It always comes back to some mechanism for prioritization - what are the preconditions, do you have a measurable outcome, do you want to fail fast but quietly?" - Seth Earley

Tune in to hear Steve Stesney explain how a graph database transformed a public affairs firm's ability to map lobbying relationships and donor networks, why knowledge graph projects go wrong when the graph is built before the business questions are asked, and how predictive UX brings the discipline of user experience research to the data world to close the gap between what business leaders need and what data teams actually build.



Podcast Transcript: Graph Databases, Disambiguation, and Why User Experience Must Come Before the Knowledge Graph

Transcript introduction

This transcript captures a conversation between Seth Earley, Chris Featherstone, and Steve Stesney about the practical realities of building knowledge graphs in enterprise environments. Steve draws on his career-long journey through social media sentiment analysis, federal government directory products, and public affairs lobbying intelligence to explain why SQL hits a wall when answering real relationship questions, how graph databases solve disambiguation at scale, and why the right sequence for any knowledge graph project is business questions first, data architecture second. He also unpacks predictive UX as a discipline for bridging business and data teams, and addresses what happens when executives discover what graph can do and want everything solved immediately.

Transcript

Seth Earley: Welcome to today's podcast. I'm Seth Earley.

Chris Featherstone: And I'm Chris Featherstone.

Seth Earley: Today's guest is an expert in data product and platform development, management, operations, and strategy. After working on projects for SAGE Publications and DCI Group, he is now the Senior Product and Data Practice Lead at Predictive UX. Please welcome Steve Stesney.

Steve Stesney: Thank you, Seth. Thank you for having me. This is great and I'm looking forward to it.

Seth Earley: Maybe you could give us a thumbnail of how you got into this space - the world of data, information, and predictive user experience. Tell us what that is and how you got there.

Steve Stesney: Sure. I'll start at the beginning. Many years ago I started at a startup that was doing what is now known as social media marketing - but at the time we were looking specifically at message board comments and how brands were being perceived. It was a fascinating startup. We collected the information, analyzed it, and presented it back to clients in a way that distilled tremendous amounts of data. It really got me understanding what sentiment is, how to analyze it, how to collect it, and all the different technologies for extraction.

After that company was acquired by a larger publisher, I got my MBA and came back to the data world as a product manager at CQ Press. CQ Press had been producing federal government directories - judicial directories, congressional staff directories - for years. They decided to pull all the older archives together, digitize them, and create what was almost a biographical resume of everyone who had ever worked in the federal government. From there they layered on lobbying data, so you could see that a person who had worked in a congressional office was now also a lobbyist for Company Z, along with all the relationships that came with it. Fascinating product. It also brought me face to face with the limitations of SQL - the platform underneath was SQL-based and the front end Java-based, which was very limiting. And I started to really grasp what disambiguation meant, because we would take quarterly lobbying reports built from user-entered data - misspellings, people changing their name from Marty to Martin, companies adding LLC - and the whole process was eye-opening.

After SAGE Publications acquired CQ Press, I moved on to a public affairs firm called DCI Group. They wanted to take all their internal campaign information and pull it together to create essentially a recreation of the CQ Press product: who is who, how are they connected, how do we contact them? And this time I decided: I am building this in graph.

Seth Earley: When did you come to that realization, and what were the problems of doing it without graph?

Steve Stesney: SQL equals predefined equals no flexibility. We started the proof of concept in SQL and quickly ran into its limits. The queries were getting impossibly complex because we wanted to answer real-life questions - how am I connected to this person, is this person connected to someone I know? Those real-world questions made the query complexity burdensome and slow, and that was just in a proof of concept with a couple of internal data sources. The limitations became very stark, very quickly.

When Neo4j was starting to gain attention, I took the opportunity to transfer the project into an MVP with a graph back end. That immediately opened up so many more opportunities to establish relationships. We pulled in taxonomy; in the federal lobbying world there are specific ways of looking at data that reflect the business knowledge of how donations and donors work. We layered taxonomies on top, created the ontology, and were able to scale up the amount of data we could handle. Disambiguation also became a much easier problem to solve at scale.
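Editorial illustration (not part of the conversation): the "how am I connected to this person" question Steve describes is, in graph terms, a path search. The minimal Python sketch below shows the idea with a breadth-first search over a small in-memory edge list. All names, relationship labels, and data are hypothetical, and this is not the DCI Group implementation; a graph database such as Neo4j exposes this kind of traversal as a built-in query (for example Cypher's `shortestPath`), rather than requiring hand-written code.

```python
from collections import deque

# Hypothetical relationship data: (entity, relationship, entity).
EDGES = [
    ("Me", "WORKED_WITH", "Alice"),
    ("Alice", "LOBBIES_FOR", "Company Z"),
    ("Bob Smith", "LOBBIES_FOR", "Company Z"),
    ("Bob Smith", "DONATED_TO", "Senator Q"),
]


def build_adjacency(edges):
    """Index edges by node, treating every relationship as traversable both ways."""
    adjacency = {}
    for source, relation, target in edges:
        adjacency.setdefault(source, []).append((relation, target))
        adjacency.setdefault(target, []).append((relation, source))
    return adjacency


def connection_path(edges, start, goal):
    """Return the shortest chain of hops from start to goal, or None if unconnected."""
    adjacency = build_adjacency(edges)
    queue = deque([(start, [])])
    visited = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for relation, neighbor in adjacency.get(node, []):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append((neighbor, path + [(node, relation, neighbor)]))
    return None


path = connection_path(EDGES, "Me", "Bob Smith")
for hop in path:
    print(" -[{1}]- ".format(*hop).join([hop[0], hop[2]]))
```

The same question in SQL typically needs one self-join per hop, or a recursive query, which is consistent with the query complexity Steve describes running into during the proof of concept.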

Seth Earley: How did you explain the move to graph to non-technical people at the public affairs firm?

Steve Stesney: The best example was updates. We would have information from a campaign from three years ago where Bob Smith worked at Organization A. Then we would get new information from a different campaign where Bob Smith now worked somewhere else. In SQL that became two separate rows and two separate entities and you had to go through and disambiguate them manually. In graph, you have all the data surrounding Bob Smith - all his relationships. In the lobbying world there is tremendous movement; people form organizations with the same colleagues and move around constantly. When you are trying to determine whether "Robert" at this organization is the same as "Bob Smith" at another, graph lets you build metadata and attributes around that entity and say: this Bob Smith worked here, but we also know this organization is related to these five other things that Bob Smith is also related to - so this is most likely the same person.

I like to describe it this way: you throw everybody into the graph database and everyone starts as a coat hanger. You slowly add information to fill them out. They become a profile. Then as new information comes in you ask: is this the same Bob Smith? Here is all the metadata around Bob Smith - make the determination. We also created exception lists for cases we were not sure about, then implemented machine learning on top of the graph to flag them, with a human in the loop to confirm yes or no. We were training models sitting on top of the graph to recognize patterns. Not only would we disambiguate in the graph, we would put machine learning on top to hone it further, so that when similar patterns arrived for other entities the model would catch them automatically.
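Editorial illustration (not part of the conversation): one simple way to picture the "coat hanger" approach is to compare the set of things a known profile is related to with the set attached to an incoming record, then route uncertain cases to a human. The sketch below uses Jaccard overlap as a stand-in scoring rule; that choice, the thresholds, and all names and data are assumptions made for illustration, not the method Steve's team used, and the machine learning layer he mentions is not shown.

```python
def neighbor_overlap(known_neighbors, candidate_neighbors):
    """Jaccard similarity between two sets of related entities (0.0 to 1.0)."""
    known, candidate = set(known_neighbors), set(candidate_neighbors)
    if not known or not candidate:
        return 0.0
    return len(known & candidate) / len(known | candidate)


def classify_match(known_neighbors, candidate_neighbors, same_at=0.6, review_at=0.3):
    """Label a candidate 'same', 'review' (exception list), or 'different'.

    The thresholds are arbitrary illustrative values, not tuned numbers.
    """
    score = neighbor_overlap(known_neighbors, candidate_neighbors)
    if score >= same_at:
        return "same", score
    if score >= review_at:
        return "review", score
    return "different", score


# Hypothetical profile accumulated for a known "Bob Smith".
bob_profile = {"Organization A", "Campaign 2019", "Firm X", "Colleague Jane", "Colleague Raj"}

# Hypothetical incoming records that mention a "Robert" or "Bob Smith".
moved_jobs = {"Organization B", "Campaign 2019", "Firm X", "Colleague Jane", "Colleague Raj"}
partial = {"Organization D", "Firm X", "Colleague Jane"}
unrelated = {"Organization C", "Colleague Jane"}

for label, record in [("moved_jobs", moved_jobs), ("partial", partial), ("unrelated", unrelated)]:
    decision, score = classify_match(bob_profile, record)
    print(f"{label}: {decision} ({score:.2f})")
```

Records that land in the "review" band correspond to the exception list Steve mentions, where a human in the loop confirms yes or no.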

Seth Earley: The same principles apply to something like a Wall Street Journal investigation into regulatory executives trading in stocks of the organizations they were regulating. That is a perfect graph application - understanding an incredibly complex web of relationships.

Steve Stesney: Exactly. There is a tremendous amount of publicly available lobbying data, corporate board data, and donation data. You can pull them all together and say: this person worked on this corporate board, gave money to this specific individual, and all those connections are out there. From a corporate standpoint you can pull your different internal databases together and build off those, and that is what I focus on at Predictive UX - how do you expose that data, and what is the value in transparency and trust? Being able to understand not only where the relationships are, but where they are not.

One of the most fascinating things on the projects I work on is when we talk back to the business and say "you have all this cool stuff" and they say "wait - I thought we had this data over here" and we have to say: you actually do not have that information. It is very eye-opening, specifically in the data modeling phase, because their understanding of the data they have versus what is actually interconnectable is usually vastly different.

Seth Earley: How do you explain taxonomy, ontology, and knowledge graph to an executive who does not trust their own data?

Steve Stesney: Start with knowledge graph - why is that the end goal? A knowledge graph is your foundation for AI and machine learning. Build it and you can build applications and integrations on top of it. Then how do you get there?

First, taxonomy - do you have a business vocabulary, a shared language that describes your data? It is crucial to develop that common understanding so different groups can communicate. Second, ontology - I like to position that as the data business model. How do all of these different data points relate to each other? Use the taxonomy to put the data into a common vernacular, then carry that understanding of the data model into the ontology. Does this represent how the business actually interacts with the data?

Once you have that, you can start filling the knowledge graph with data and test it. Can we derive the questions and the answers from a business perspective that are going to drive the business forward? You have to put those three pieces together in order to answer those business questions.
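Editorial illustration (not part of the conversation): the three pieces Steve lists can be pictured as three small data structures. In the sketch below, the taxonomy maps variant terms to a shared vocabulary, the ontology lists which relationships are allowed between which concepts, and the knowledge graph stores facts that pass both checks and can then be queried with a business question. Every term, relationship name, and entity here is hypothetical, and real systems would usually use dedicated tooling rather than plain Python sets.

```python
# Layer 1, taxonomy: preferred terms and the variants that map to them.
TAXONOMY = {
    "lobbyist": {"lobbyist", "registered lobbyist", "govt relations rep"},
    "organization": {"organization", "company", "firm", "client"},
}

# Layer 2, ontology: which relationships are allowed between which concepts.
ONTOLOGY = {
    ("lobbyist", "LOBBIES_FOR", "organization"),
    ("lobbyist", "WORKED_AT", "organization"),
}


def normalize(term):
    """Map a raw label onto the shared vocabulary, or fail loudly."""
    lowered = term.strip().lower()
    for preferred, variants in TAXONOMY.items():
        if lowered in variants:
            return preferred
    raise ValueError(f"term not in taxonomy: {term!r}")


class KnowledgeGraph:
    """Layer 3: facts that conform to the taxonomy and the ontology."""

    def __init__(self):
        self.entity_types = {}
        self.facts = set()

    def add_entity(self, name, raw_type):
        self.entity_types[name] = normalize(raw_type)

    def add_fact(self, subject, relation, obj):
        signature = (self.entity_types[subject], relation, self.entity_types[obj])
        if signature not in ONTOLOGY:
            raise ValueError(f"ontology does not allow {signature}")
        self.facts.add((subject, relation, obj))

    def ask(self, relation, obj):
        """Answer a question of the form 'who <relation> <obj>?'."""
        return sorted(s for s, r, o in self.facts if r == relation and o == obj)


graph = KnowledgeGraph()
graph.add_entity("Bob Smith", "Registered Lobbyist")  # variant term, normalized
graph.add_entity("Company Z", "firm")                 # variant term, normalized
graph.add_fact("Bob Smith", "LOBBIES_FOR", "Company Z")

print(graph.ask("LOBBIES_FOR", "Company Z"))
```

A fact that contradicts the ontology, such as an organization lobbying for a person, is rejected at load time, which is one way the "does this represent how the business actually interacts with the data" check can be made concrete.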

Seth Earley: Talk to us about how user experience and knowledge graphs connect, and what Predictive UX does at that intersection.

Steve Stesney: About three or four years ago when I was at the public affairs firm, it was all about the data. Here is the data, figure it out, build the solution. We took the content, took the data, built this really cool stuff, created this knowledge graph - and then we started to ask the business what their questions were. A massive light bulb went off. There is a gap between business and data teams about what the business is looking for versus what the data team is actually building. And I realized: I was doing it backwards.

The user experience, the business challenges, the pain points all have to be exposed first. Then the knowledge graph and the data warehouse can be built. Then applications on top. The CEO I worked for said "I need to answer one specific question: who is going to be in the meeting with me, and do I know this person?" That was the business question. That was what the whole graph should have been oriented around from day one.

At Predictive UX we focus on two main areas. First, organizations that already have graph databases and ask "now what do we do with it?" - we help them look at the business side and reorient around actual use cases. Second, organizations just getting started - we help them define the strategy, the roadmap, which applications to build. Insight and knowledge have to be driven by subject matter experts and the business, because those are the questions that will ultimately be asked of the graph. Once the graph is built, we layer on applications: search enhancements, machine learning, recommendation engines, similarity of content, dashboards, standalone apps. That is where business value starts to come in.

Seth Earley: And we see the same pattern constantly - organizations put the technology before the business value. You need a tool, let's go get it. But use cases and scenarios have to come first. Once they see the potential, though, the floodgates open and they want it all.

Steve Stesney: Yes - and you get questions that have been pressing from a business perspective for a very long time, and now they are starting to realize they might finally be able to get the answers. The challenge is to first define the ultimate goal. What are you ultimately trying to do? What is the key aspect of your business you would like to change? Is it time to value? Is it insights to knowledge? Work with the client to define that. Then set the stage early: this is a transformational product, not something you build and put on the shelf and have it spout out knowledge. Knowledge graphs can expand and grow, and setting those expectations from the beginning is essential.

You can easily get into a project where a client wants to pull in HR data and tax data and a third source - and you realize the tax data is just incomplete, you are collecting the wrong information. Now they have to think about how to fix that whole process. It becomes very overwhelming to answer even basic questions. The questions also just get thrown over to the data team and they come back and say "we got your answer" and the business says "that is not what I was looking for." It is really about facilitating those questions iteratively, working shoulder to shoulder, pulling both sides into the same discussions so they both understand exactly what is happening and why - and they can formulate responses that push the company forward.

Seth Earley: That is why we like to use current-state maturity assessments. You have these great plans but you have no metadata - it will not work. You have no taxonomy to drive your metadata. Then it is ongoing governance, ongoing metrics, feedback loops, continual improvement. "Day in the life" success criteria: we know we are successful when Joe can do this specific thing. A lot of times organizations skip steps and try to cut to the end. I was in a meeting earlier where someone said the use case for the ontology is to build the ontology. We need a bit more precision than that.

Steve Stesney: Exactly. And that is why you set expectations around an eighteen-to-twenty-four-month knowledge graph plan: you are essentially doing a full digital transformation. Where is the data coming from? How is it entered? How clean is it when it goes in? Do we need to change processes on the website? Do we need to change data entry processes across the organization? The answer to all of these is usually yes. Push that little snowball down the hill - find the first use case that delivers value, get it working, share the success - and then the other departments start saying "we have a need for something similar." That is how you catalyze organizational creativity and build a roadmap that actually has integrity.

Chris Featherstone: What is coming up for you in 2022 and 2023?

Steve Stesney: At Predictive UX we are continuing to build out our data practice, which we started about a year ago. We are hiring, looking for specific data resources to help with projects. We are working with clients to really define what a full solution looks like - how do you build the graph database, what models do you need, what does the roadmap look like? This year was our foundational year. Next year we are looking to expand and grow significantly. We see a lot of opportunity in being that bridge between the business and the data teams, whether that means coming in to help projects that are stuck or getting new ideas off the ground. Very excited about what is ahead.

Seth Earley: It has been a tremendous pleasure speaking with you today, Steve. Thanks so much for your time.

Steve Stesney: Thank you for having me. Really appreciate it. Great to chat - looking forward to building this out. Thanks a lot, guys.

Chris Featherstone: For a lot of folks it is either super daunting to know where to start, or they do not know they need it at all - so thanks for breaking it down. It is always super helpful, especially at the data layer, even to get to a place where you can start to apply machine learning techniques.

Steve Stesney: Yeah - it has been super insightful. Good luck to everyone. Bye!

Meet the Author
Earley Information Science Team

We're passionate about managing data, content, and organizational knowledge. For 25 years, we've supported business outcomes by making information findable, usable, and valuable.