How AI and Clean Data Power Smarter Search
Guest: Bharat Guruprakash, Chief Product Officer at Algolia
Host: Seth Earley, CEO at Earley Information Science
Published on: September 30, 2025
Bharat Guruprakash, Chief Product Officer at Algolia, joins host Seth Earley on this episode of the Earley AI Podcast. With years of experience helping organizations leverage AI to connect people with information, Bharat brings deep insight into the evolving world of AI-powered search, retrieval, and agentic technologies. At Algolia, a global leader in AI-driven search and retrieval, he helps shape what’s next in unifying data, building intelligent systems, and designing platforms that understand real-time context.
Key Takeaways:
- Misconceptions about Search and AI: Many organizations think large language models (LLMs) can handle all search needs, but effective AI solutions require robust retrieval systems beneath the surface.
- The Role of RAG and Memory: Retrieval Augmented Generation (RAG) remains important, but the future is moving toward agentic architectures that require "memory" and stateful interactions, not just stateless search.
- Clean, Structured Data Is Crucial: Clean, accessible, and normalized data stores are the backbone of any successful AI and search initiative.
- Experimentation and Innovation: Enterprises struggle to build a culture of experimentation and face challenges in safely running and scaling AI experiments. Autonomous experimentation, where AI tests and optimizes different approaches, is emerging as a solution.
- The Rise of Agentic Technologies: Generative AI focuses on content creation, while agentic AI focuses on task automation and execution; agents will soon drive more dynamic, event-driven workflows.
- Guardrails and Risk: Implementing proper protocols (such as MCP and Google's A2A) and guardrails is essential to ensure agents act safely and within business parameters.
- Privacy and the Future: As agents come to know more about users than users know about themselves, privacy, transparency, and identity become critical concerns. The pace of change is challenging, but small, iterative steps help organizations evolve responsibly.
Insightful Quote from the Show:
"It's very risky to say, let's just boldly go forth into the unknown, right? I think you have to have experiments, right? You have to control those experiments, but you have to be careful about that and have a mechanism for managing that and for controlling it and for monitoring the results." - Seth Earley
"It's okay to start small. Find the small places where you can improve...and keep multiplying them. Over time, when you look back after a year or two, you'll have a very different company from when you started." - Bharat Guruprakash
"The big misconception is that AI does search, and that LLMs do search. Everyone thinks that LLMs can do all of it. The big misconception is that you don't need retrieval under the hood for effective generative and agentic AI." - Bharat Guruprakash
Tune in for a thoughtful deep dive into the challenges, opportunities, and responsible strategies for embracing AI, search, and agentic technologies in your organization.
Links
LinkedIn: https://www.linkedin.com/in/bharatguruprakash/
Website: https://www.algolia.com
Ways to Tune In:
Earley AI Podcast: https://www.earley.com/earley-ai-podcast-home
Apple Podcast: https://podcasts.apple.com/podcast/id1586654770
Spotify: https://open.spotify.com/show/5nkcZvVYjHHj6wtBABqLbE?si=73cd5d5fc89f4781
iHeart Radio: https://www.iheart.com/podcast/269-earley-ai-podcast-87108370/
Stitcher: https://www.stitcher.com/show/earley-ai-podcast
Amazon Music: https://music.amazon.com/podcasts/18524b67-09cf-433f-82db-07b6213ad3ba/earley-ai-podcast
Buzzsprout: https://earleyai.buzzsprout.com/
Podcast Transcript: The Evolution of Search, Retrieval, and Agentic AI
Transcript introduction
This transcript captures a conversation between Seth Earley and Bharat Guruprakash on the evolution from traditional search to AI-powered retrieval systems and agentic technologies. Topics include the misconceptions about LLMs and search, the transition from RAG to memory-based systems, the differences between generative and agentic AI, and the importance of starting small with experimental approaches while maintaining proper guardrails.
Transcript
Seth Earley:
Well, welcome to today's Earley AI Podcast. I'm your host, Seth Earley. Each episode explores how artificial intelligence is reshaping the way organizations manage information, create value, and deliver better customer experiences. Today, we're going to be diving into the future of search, retrieval and intelligent agents, and how clean, structured data is the foundation for every AI success story. Joining me is Bharat Guruprakash. He's Chief Product Officer at Algolia, which is a global leader in AI-powered search and retrieval. He has spent many years helping organizations harness AI to connect people with the right information at the right time, as we like to say in Knowledge Management. He brings a unique perspective to where AI-driven search is heading and what it takes to build systems that respond to real-time context. Alright, welcome to the show.
Bharat Guruprakash:
Thank you very much, Seth. Very glad to be here, and excited to have a conversation about where the world is heading.
Seth Earley:
I know, this is such an interesting area, and what I wanted to do is start with, what are your… what's your take on the biggest misconceptions that organizations have about search, about AI and search, and if you want to kind of roll data and content into that, you can, but give me the big picture. What are people not understanding, or what are they misconceiving about this space?
Bharat Guruprakash:
Yeah, you know, I think the biggest—you know, we speak to so many customers, and about 2 or 3 years ago, it was, hey, search is dead. Today, it is, oh my gosh, we need retrieval or search. And I think the big misconception is that AI does search, and that LLMs do search, and that's partly because of the way it presents information to you. And so everyone thinks that, oh, you know, LLMs can do all of it. In the traditional sense, search is about presenting all the options to you, and you decide what the truth is. LLMs package the truth for you, so when you search for something, they sort of package it up and give it to you. I think the big misconception is that you don't need retrieval under the hood for effective generative and agentic AI. And today, we're seeing that happen, like, suddenly now retrieval is one of the hottest topics.
Seth Earley:
Yeah, so...
Bharat Guruprakash:
It's interesting, for me at least, being in a search company, to see this sort of evolution.
Seth Earley:
Yeah, it's been a pendulum, and it's really swung from one extreme to the other. And, it was something where people were saying, oh yeah, you know, LLMs are going to do it. And then, what do we find? We find that, number one, LLMs have a model of the world, but they don't necessarily understand your world, your products and services and solutions and so on, or your customers, your routes to market, your competitive advantage, your knowledge, right? All of that stuff is proprietary in how you differentiate in the marketplace, and an LLM isn't going to do that unless we actually connect it to our sources. So, that brings in this idea of retrieval augmented generation, and for those of you who are not familiar with RAG, retrieval augmented generation, it's saying to the LLM, don't just use your knowledge of the world to answer my question, use your knowledge of the world to make this conversational and understand the user's intent, but answer the question based on my data, right? Get it from my source of information, my source of truth. And guess what? That's retrieval, right? That's search at the end of the day. So talk more about that. To me, it seems like people say, oh yeah, RAG, just use RAG, right? Like, oh yeah, it's just RAG, and that solves the problem. It's like, check the box. Tell me about why that's not necessarily true, and why RAG is not necessarily the answer. It's part of the answer, but it depends on so many other things. So maybe you could break that apart a little bit.
Bharat Guruprakash:
Yeah, and you gave a really good explanation of what RAG is, and, you know, I will say that even RAG is becoming passé now. It's no longer the thing, and the reason for that is because RAG is good when you're doing sort of a stateless conversation. So, you know, you're just in that session, you've had a little bit of a conversation, and it pulls relevant information and augments the generation. But what we're moving into, the world of agents, you need statefulness. And what does that mean? It means that agents can remember what happened the last time, a week ago, a month ago, a year ago, what you did on this particular topic. And so, RAG is sort of not able to help with that.
And so I think, you know, that is where the limitations of RAG come in. The second thing is, you know, I just want to touch upon what you said, the limitation of an LLM. LLMs are very expensive to train, and so the history that they have of the internet is actually outdated. It's maybe 6 months old, it's maybe a year old, and to retrain it is very, very expensive, which is why LLMs have to depend on retrieval to get the latest information. So, you know, you put the fact that more information is being created all the time, and the fact that the LLMs don't have that information built into their world model, you need the ability to pull information, which is a search problem, which is a retrieval problem. And so, yeah, RAG is no longer in vogue, Seth, unfortunately. The new word is memory now, and everyone is talking about that, because it allows agents to become more powerful and actually do useful work.
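[Editor's note] The stateless RAG flow the speakers describe can be sketched in a few lines. This is a toy illustration with invented names (`retrieve`, `build_prompt`, a keyword-overlap ranker standing in for a real search engine); it is not Algolia's API or any production system:

```python
def retrieve(query: str, corpus: dict[str, str], k: int = 2) -> list[str]:
    """Naive keyword retrieval: rank documents by query-term overlap."""
    terms = set(query.lower().split())
    scored = sorted(
        corpus.items(),
        key=lambda item: len(terms & set(item[1].lower().split())),
        reverse=True,
    )
    return [doc_id for doc_id, _ in scored[:k]]

def build_prompt(query: str, corpus: dict[str, str]) -> str:
    """Augment the user's question with retrieved enterprise content."""
    context = "\n".join(corpus[doc_id] for doc_id in retrieve(query, corpus))
    return f"Answer from this context only:\n{context}\n\nQuestion: {query}"

docs = {
    "returns": "Our return policy allows returns within 30 days",
    "shipping": "Standard shipping takes 5 business days",
}
prompt = build_prompt("what is the return policy", docs)
```

Note that each call starts from scratch: nothing is carried over between queries, which is exactly the statelessness Bharat identifies as RAG's limitation.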
Seth Earley:
So that's interesting, and, you know, it's funny because the terminology and the technology and the latest and greatest changes so quickly, and organizations have challenges not only absorbing—there's currency of the approach, and then there's the ability to absorb change. Right? And, of course, technology, it changes faster than the business can absorb, and then, you know, it's changing more quickly, the business can only absorb so much, and so you're always going to have this lag, and so we still, you know, are talking to lots of people that say, hey, I just want to find my stuff, right? So, I still think there is, even though you may say that RAG is passé, I think that there's still a fundamental challenge around retrieval and finding information. How do you reconcile that with what organizations are trying to do? And when you talk about memory, you're talking about retaining context, right, over sessions, over longer periods of time. You know, I'm amazed when I am putting stuff into my LLM of choice, which can be Claude or ChatGPT, and I'm giving it very complex problems to solve, and I'm giving it lots of information, and it does amaze me at the ability for it to synthesize from what I'm giving it based on what it's already keeping in long-term memory, which I'm always telling it to update. I'm saying, make sure you put this in long-term memory. So tell me a little bit more about, you know, how these things are complementary, and then what does an organization need to do in order to get the best of both worlds. So tell me what happens if they ignore one or the other, and then what's the foundation for both?
Bharat Guruprakash:
Yeah. So, you know, when we talk about retrieval augmented generation, we're really talking about retrieving content that is maybe only with a particular enterprise. And then we are augmenting and creating a generation out of that. When we talk about memory, we're talking about context, as you said, but what is context? Context is the content that is relevant from the past, content that is relevant right now to the current conversation, or the current task that you have going on, it is related to the interactions that you had before, so that you can, you know, bring that forward into what is the latest, you know, up-to-date, you know, context that you have with the topic. So I'll give you an example. I lost my bike, I had to file an insurance claim.
And what happens is, you know, I had to, like, go through the claim, explain everything, like, the whole rigmarole, right? And they said, oh, we're going to process it. Two weeks later, they hadn't processed it. I called back, I had to explain the entire thing again. And the customer service agent was like, oh, we haven't done anything yet. Now, in the world of agents with memory, agents should have all of that context available to them. They should know that I had an issue with my bike, I filed a claim, here's what's happening with it, here's the status. And they should also know that I called back and got frustrated because nothing got done.
And so all of that should be available to the agent so that when I call back the third time, they don't ask me to explain everything again. They say, hey, we know what's happening, we're so sorry, here's where it is, here's what we're doing about it. And so that's what memory does. It allows for this continuity of experience that we as humans expect but that systems have not been able to provide until now.
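[Editor's note] The insurance-claim story maps onto a simple per-customer memory store. The sketch below is purely illustrative (the class name, methods, and event strings are invented), but it shows the statefulness being described: the third call starts with the full history instead of asking the customer to repeat it:

```python
from dataclasses import dataclass, field

@dataclass
class AgentMemory:
    """Toy per-customer memory store; illustrative only."""
    events: dict[str, list[str]] = field(default_factory=dict)

    def remember(self, customer: str, event: str) -> None:
        # Append one interaction to the customer's history.
        self.events.setdefault(customer, []).append(event)

    def context_for(self, customer: str) -> list[str]:
        """Everything the agent should know before the next interaction."""
        return self.events.get(customer, [])

memory = AgentMemory()
memory.remember("bharat", "Filed insurance claim for stolen bike")
memory.remember("bharat", "Called back after 2 weeks; claim not processed; frustrated")
# On the third call, the agent begins with full context:
history = memory.context_for("bharat")
```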
Seth Earley:
Right, and this is where the idea of, you know, customer journey and customer experience management comes in, because you're saying that the system should understand where I am in my journey, what I've done, what my history is, and be able to anticipate what I need next. And that's really powerful. So when we think about, you know, building these systems, what are the foundational elements that organizations need to have in place?
Bharat Guruprakash:
Yeah, so I think, you know, the first thing is data. You need to have your data in a place where it is accessible, where it is clean, where it is structured in a way that systems can understand it. And so that's the first thing. The second thing is you need to have the ability to connect to all the different data sources that you have. Because in most enterprises, data is siloed. You have your CRM data, you have your marketing data, you have your product data, you have your customer support data, all in different places. And so you need to be able to unify that and make it accessible.
The third thing is you need to have the right technology stack that can actually do the retrieval, do the ranking, do the understanding of context, all of those things. And then the fourth thing is you need to have a culture of experimentation. Because what we're finding is that the way AI works is very different from traditional systems. Traditional systems, you build it once, you deploy it, it works the same way forever. With AI, you need to constantly be testing, constantly be iterating, constantly be improving. And so you need to have a culture where that's okay, where it's okay to experiment, where it's okay to fail, where you learn from those failures and you keep improving.
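[Editor's note] The "unify siloed data" step above often amounts to mapping records from each system into one normalized shape before indexing. A minimal sketch, with entirely invented field names standing in for real CRM and ticketing schemas:

```python
def normalize_crm(record: dict) -> dict:
    # Map a hypothetical CRM record to the shared schema.
    return {"customer_id": record["AccountId"], "source": "crm",
            "text": record["Notes"]}

def normalize_support(ticket: dict) -> dict:
    # Map a hypothetical support ticket to the same schema.
    return {"customer_id": ticket["requester"], "source": "support",
            "text": ticket["body"]}

crm_rows = [{"AccountId": "c-1", "Notes": "Renewal due in March"}]
tickets = [{"requester": "c-1", "body": "Login page times out"}]

# One unified, index-ready collection across both silos:
unified = [normalize_crm(r) for r in crm_rows] + \
          [normalize_support(t) for t in tickets]
```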
Seth Earley:
And that's a big shift for a lot of organizations, because they're used to, you know, the waterfall approach, where you spec it out, you build it, you deploy it, and then you're done. And with AI, it's much more iterative, much more experimental. And I think that's one of the challenges that organizations face, is how do you create that culture of experimentation? How do you get people comfortable with the idea that things might not work the first time, and that's okay?
Bharat Guruprakash:
Yeah, and I think, you know, part of it is starting small. You don't have to boil the ocean on day one. Find one use case, one workflow, one area where you can apply this, and start there. And as you learn from that, as you see what works and what doesn't, then you can expand. And I think that's the key, is to start small, learn, iterate, and then scale.
And I think the other thing is, you know, making sure that you have the right guardrails in place. Because with agents, especially, there's this question of, you know, how much autonomy do you give them? How do you make sure they're acting within the parameters that you want them to act within? And so you need to have the right monitoring, the right controls, the right guardrails to make sure that as these systems are operating autonomously, they're doing so in a way that's safe, that's aligned with your business goals, that's compliant with regulations, all of those things.
Seth Earley:
Right, and I think that's where a lot of organizations get nervous, is the idea of giving systems autonomy. Because, you know, what if it does something wrong? What if it makes a mistake? What if it causes harm? And so how do you balance that need for autonomy with the need for control and oversight?
Bharat Guruprakash:
Yeah, I think it comes down to, you know, defining very clearly what the scope of the agent is. What can it do? What can't it do? Under what circumstances can it take action on its own? Under what circumstances does it need to escalate to a human? And I think that's where a lot of the work needs to happen, is defining those rules, defining those guardrails, and then constantly monitoring to make sure that the agent is operating within those boundaries.
And I think, you know, we're also seeing the emergence of new protocols and new standards around this. So, for example, there's MCP, which is the Model Context Protocol, which is trying to standardize how models interact with different data sources and different tools. There's Google's A2A, which is Agent to Agent, which is trying to standardize how agents communicate with each other. And I think these kinds of protocols and standards are going to be really important as we move into a world where we have multiple agents operating, potentially from different vendors, potentially with different levels of autonomy, and they all need to work together.
Seth Earley:
Right, and that's where things get really interesting, because you're talking about not just one agent, but multiple agents that are potentially interacting with each other, coordinating with each other, and that adds a whole new layer of complexity. So let's talk a little bit more about agentic technologies. Because I think there's a lot of confusion in the market about what an agent actually is. You know, some people think it's just a chatbot. Some people think it's something more sophisticated. So can you help clarify what we mean when we talk about agentic AI?
Bharat Guruprakash:
Yeah, so I think, you know, I'll start first by just stating that there's a difference between generative AI and agentic AI, and I think a lot of people—we're still in the world where there's a lot of confusion about those two things. Generative AI, as the name suggests, is generating new content that didn't exist before, like summaries and recommendations and so on and so forth. Agentic AI is about executing tasks. Now, it might generate some new content to help it to execute the task, but it is all about executing tasks.
And so, in the world of agents, it's about automation of tasks, it's about improving your productivity, your efficiency, because the agents can do the tasks that maybe humans were doing before, and they can do it in an autonomous way. And so, you know, when we think about a simple framework to think about this is, in the world of search, traditional search, humans were doing requests, and they would get a response. In the world of agents, you're going into event-driven, which means that, hey, if something happens, do this. And that's what an agent does, and it combines it with goal-driven. So I have a goal, if this happens, do that. And it's doing all of this autonomously.
And so, firstly, I think most people confuse the two, because what you see today put out as agentic solutions are like chatbots. Like, really, that's what you see, like, little chatbots on the right, bottom right, and they said, that's our agent. And the reality is that that's Clippy from the past, if you remember. It's just a really fancy Clippy, but it is just a thing. Now, so, first up, that's the difference between the two.
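[Editor's note] The request/response versus event-driven distinction can be made concrete with a tiny decision function: instead of waiting for a human request, the agent reacts to incoming events in service of a goal. Event types and actions here are invented for illustration:

```python
def agent_step(event: dict, goal: str) -> str:
    """Pick an action for an incoming event, in service of a goal."""
    if goal == "resolve claims quickly":
        if event["type"] == "claim_filed":
            return "start_processing"
        if event["type"] == "claim_stalled":
            return "escalate_to_human"
    return "ignore"

# Events arrive on their own schedule; no human request triggers these.
events = [{"type": "claim_filed"}, {"type": "claim_stalled"}]
actions = [agent_step(e, "resolve claims quickly") for e in events]
```

In a real system the hard-coded branches would be replaced by model-driven reasoning, which is what separates an agent from static RPA rules.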
Seth Earley:
Generating content versus achieving an action or executing an action.
Bharat Guruprakash:
Exactly. A simple example would be, let's say there's a customer support agent that you're talking to, and it can do the entire return. It can process the return by itself, it can also trigger the shipping of any new items, etc, without any humans in the loop. And so it sort of is taking care of all of that and increasing the efficiency and reducing the humans-in-the-loop process that might be there. So I think, you know, that's a simple agent that we're starting to see. So I think that's how enterprises are also thinking about it. They're looking at different workflows that are existing within their organization, and they're trying to see where can we insert autonomous AI to replace that. And so that's what I think—that's how most companies are starting. And so, you will see simple ones right now, because everyone is going to start with the simple workflows, and then, at some point, it'll get much more sophisticated.
Seth Earley:
Yeah. So, when you think about the stuff that we've been doing for a while in the industry, robotic process automation, that is doing things that are rules-based, so what is the line between RPA and autonomy? Because we're saying that RPA can operate semi-autonomously, right? It can go off and do some stuff, and replace a person keying in information, or go from one screen to another, or execute some predefined processes. So when do we get into what we're referring to as an autonomous agent versus robotic process automation?
Bharat Guruprakash:
I think it comes to event-driven. So, can you build rules or process automation, which are really static workflows? Can you build all the static workflows in the world? Probably not. So, can you have agents rewrite or create new workflows on the fly because certain events that you didn't think about have happened? And so that's where the difference comes in. RPA is very much, if this, then that. It's very deterministic. You know exactly what's going to happen. With agents, they can reason, they can make decisions based on context, they can adapt to new situations that weren't explicitly programmed in. And that's the big difference.
Seth Earley:
Right, so there's an element of intelligence and reasoning that goes beyond just following a predefined set of rules.
Bharat Guruprakash:
Exactly. And I think that's what makes it powerful, but that's also what makes it a little bit scary for enterprises, because with RPA, you know exactly what it's going to do. With agents, they have some degree of autonomy, some degree of reasoning capability, and so you need to make sure that they're reasoning in the right direction, that they're making decisions that are aligned with your business goals.
Seth Earley:
Right. And so that brings us back to the importance of guardrails, monitoring, all of those things. So talk a little bit more about what organizations need to think about in terms of implementing these agentic systems safely.
Bharat Guruprakash:
Yeah, so I think, you know, the first thing is defining very clearly what the agent's scope is. What is it allowed to do? What is it not allowed to do? The second thing is having monitoring in place so that you can see what the agent is doing, you can understand the decisions it's making, you can see if it's operating within the boundaries that you've set. The third thing is having a kill switch, basically, a way to stop the agent if it's doing something that you don't want it to do. And the fourth thing is starting small, as I mentioned before. Don't give the agent too much autonomy too quickly. Start with limited scope, limited autonomy, and as you build trust, as you see that it's working correctly, you can gradually expand the scope.
And I think the other thing that's really important is transparency. The agent should be able to explain why it's making the decisions it's making. It shouldn't be a black box. And I think that's one of the challenges with some of the current AI systems, is they make decisions but they can't necessarily explain why. And for enterprises, especially in regulated industries, that's a problem. You need to be able to explain why a decision was made, especially if it affects customers, especially if it affects compliance, all of those things.
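[Editor's note] The guardrail ideas just listed (explicit scope, monitoring, a kill switch, explainable decisions) can be sketched as a thin wrapper around an agent's actions. This is a toy, not a real framework; all names and actions are invented:

```python
class GuardedAgent:
    def __init__(self, allowed_actions: set[str]):
        self.allowed = allowed_actions   # scope: what the agent may do
        self.audit_log: list[str] = []   # monitoring / explainability
        self.killed = False              # kill switch

    def act(self, action: str, reason: str) -> bool:
        if self.killed:
            self.audit_log.append(f"BLOCKED (killed): {action}")
            return False
        if action not in self.allowed:
            self.audit_log.append(f"BLOCKED (out of scope): {action}")
            return False
        # Record the action together with the agent's stated reason.
        self.audit_log.append(f"OK: {action} because {reason}")
        return True

    def kill(self) -> None:
        self.killed = True

agent = GuardedAgent({"process_return", "ship_replacement"})
ok = agent.act("process_return", "item within 30-day window")
blocked = agent.act("issue_refund", "customer asked")  # outside scope
```

The audit log doubles as the transparency mechanism: every decision carries the reason it was (or was not) taken.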
Seth Earley:
Right, and I think that's one of the big challenges as we move into this agentic world, is how do we maintain that level of transparency, that level of explainability, while also getting the benefits of the automation and the efficiency that agents can provide?
Bharat Guruprakash:
Yeah, and I think, you know, that's where a lot of the research is happening right now. How do we make these systems more explainable? How do we make them more transparent? And I think we're seeing progress on that front, but it's still early days. And so I think for now, the best approach is to start small, start with use cases where the risk is relatively low, where if something goes wrong, the consequences aren't catastrophic. And as you build confidence, as the technology matures, then you can expand to more critical use cases.
Seth Earley:
Right. And I think that's good advice for organizations that are thinking about getting into this space. Start small, learn, iterate, build confidence, and then scale. So as we wrap up here, I wanted to get your thoughts on where you see all of this heading. What does the future look like for search, for retrieval, for agentic technologies?
Bharat Guruprakash:
Yeah, I think, you know, we're at a really interesting inflection point. I think in the next few years, we're going to see agents become much more prevalent, much more sophisticated. They're going to be handling more and more tasks that humans are doing today. And I think that's going to fundamentally change how we work, how we interact with technology. I think we're also going to see the emergence of multi-agent systems, where you have multiple agents working together, coordinating, collaborating to achieve complex goals.
And I think, you know, as I mentioned before, there are challenges around privacy, around security, around making sure these systems are safe. But I have confidence that we'll figure those out. We've had major technological shifts in the past—the Industrial Revolution, the internet—and we've adapted. And I think we'll adapt to this as well. But it is moving fast, and I think organizations need to start preparing now, because the pace of change is only going to accelerate.
Seth Earley:
Yeah, and I think, you know, for any decision maker listening, it comes back to the basics, right? It's what are you trying to accomplish? What value are you creating? What is your competitive advantage? How are you improving the employee experience or the customer experience? How are you adding efficiency to the process? How are you improving your competitive advantage? And those are the fundamentals, and those are the basics, and then we build on those and try to still anticipate the risks and the potential downside of these things.
Bharat Guruprakash:
100%, I would say just to add to that, Seth, I would just say, for all your viewers and enterprises, it's okay to start small. Find the small places where you can improve all of the things that you just said, but keep doing them more and more. Multiply them. And I think, you know, over time, when you look back after a year or two years of doing this, I think you'll have a very different company from when you started. And you might not realize it along the way, but if you keep doing them in small iterations, you will end up with a much better outcome.
Seth Earley:
Yeah, the evolution is happening, and it's funny, I was talking to another person who's going to be a future guest on the podcast about risk management and compliance, and they were saying there are still some companies out there that are like, no to AI. Absolutely not, no. No, we're not going to do it. And it's like, geez. You're not going to stay in business if you don't think about this.
Bharat Guruprakash:
There will always be people at both ends of the spectrum, and, you know, that's just the nature of the world and enterprises, so… but I don't think we can ignore what is happening as well.
Seth Earley:
Absolutely not. Well, Bharat, thank you so much for sharing your perspective on how AI is transforming search, retrieval, and agentic technologies. These are all such important topics. And for listeners, be sure to subscribe to the Earley AI Podcast for more insights. Bharat, where can people find you? We'll put it in the show notes, but, LinkedIn, your company…
Bharat Guruprakash:
LinkedIn, Bharat Guruprakash, you know, that's the—I check it a lot. I'm more of a LinkedIn person than a Twitter person, or X, and so LinkedIn is the best way to reach me if you want to talk more about this stuff, and it was wonderful to be on here, Seth.
Seth Earley:
Well, thank you again. This has been terrific. Really interesting conversation, so I really appreciate your time. And thank you to our audience, and thank you to Carolyn for doing all the production work behind the scenes, and we will see you on the next episode of the Earley AI Podcast. Thank you.
