From Hype to Reality: Integrating AI into Drug Discovery, Healthcare Systems, and the Promise of Quantum Computing
Guest: Alex Gurbych, AI Expert and Co-founder of Blackthorn.ai
Hosts: Seth Earley, CEO at Earley Information Science
Chris Featherstone, Sr. Director of AI/Data Product/Program Management at Salesforce
Published on: June 4, 2024
In this episode, Seth Earley and Chris Featherstone speak with Alex Gurbych, a distinguished AI expert with three master's degrees and two PhDs spanning biotech, artificial intelligence, and mathematics. They explore the gap between AI's aspirational promises and practical realities, discussing integration challenges in healthcare systems, the critical role of use cases in AI adoption, and revolutionary applications in drug discovery. Alex shares insights on graph neural networks for molecular modeling, quantum computing's potential to transform material science, and why AI success depends on starting with clear business value.
Key takeaways:
- Organizations often overestimate AI capabilities during hype cycles, treating it as almighty rather than recognizing it as a powerful tool requiring proper training and understanding.
- AI development is "80% of the work" and integration with business systems is another 80%, a joke that captures how badly organizations underestimate implementation effort.
- Successful AI adoption requires defining clear use cases first, as business value cannot be articulated without specific problems AI will solve within organizational contexts.
- Healthcare professionals often resist AI tools they perceive as threatening their expertise, while younger technicians embrace them as learning aids for advancement.
- Graph neural networks outperform large language models for molecular tasks because molecules are inherently graphs, demonstrating how model selection must match data structure.
- Drug discovery using AI requires validating predictions through molecular mechanics and quantum mechanics before synthesis, as computational models provide educated guesses rather than certainties.
- Quantum computing promises revolutionary advances in material science by simulating matter at the atomic level using the same probabilistic principles that govern nature.
Insightful Quotes:
"They're not oracles, you know, they're things that are great in the hands of people that know how to use them. But you know, just like any artisan, it's really much more about the artisan doing things with the tools that they have rather than the tools themselves." - Alex Gurbych
"It has to start with the use case, because if there is no use case, there is no value for business, or people cannot articulate it with very simple words. There is no reason to start." - Alex Gurbych
"AI is not going to take over jobs, but people who know how to use AI will take over those jobs." - Chris Featherstone
Tune in to discover how organizations can bridge the gap between AI hype and practical implementation—building effective integration strategies, understanding molecular modeling with graph neural networks, and preparing for quantum computing's transformative potential in drug discovery.
Links:
LinkedIn: https://www.linkedin.com/in/ogurbych/
Website: https://blackthorn.ai
Ways to Tune In:
Earley AI Podcast: https://www.earley.com/earley-ai-podcast-home
Apple Podcast: https://podcasts.apple.com/podcast/id1586654770
Spotify: https://open.spotify.com/show/5nkcZvVYjHHj6wtBABqLbE?si=73cd5d5fc89f4781
iHeart Radio: https://www.iheart.com/podcast/269-earley-ai-podcast-87108370/
Stitcher: https://www.stitcher.com/show/earley-ai-podcast
Amazon Music: https://music.amazon.com/podcasts/18524b67-09cf-433f-82db-07b6213ad3ba/earley-ai-podcast
Buzzsprout: https://earleyai.buzzsprout.com/
Podcast Transcript: AI Integration Challenges, Drug Discovery Applications, and Quantum Computing's Future
Transcript introduction
This transcript captures a comprehensive conversation between Seth Earley, Chris Featherstone, and Alex Gurbych exploring the practical realities of AI implementation across industries, focusing on healthcare system integration challenges, revolutionary applications in drug discovery using graph neural networks, and quantum computing's potential to transform material science by simulating matter at the atomic level.
Transcript
Seth Earley: Welcome to the Earley AI Podcast. I'm Seth Earley.
Chris Featherstone: And I'm Chris Featherstone. We're really excited to introduce our guest for today.
Seth Earley: Our guest is here to discuss AI technologies and aligning them with business value before you look to do implementation. We're going to explore the potential of AI and really think about what consciousness means in terms of AI; there's a lot of emulation of it, which we talked about in one of our prep calls, and what the philosophical implications of that are. We'll talk about a strategic approach to AI development, emphasize understanding data in the context of business goals and business value, and look at the current limitations of AI technologies and how quantum computing can offer additional solutions. Our guest today is a well-known, distinguished figure in the realm of artificial intelligence. He's a multi-talented professional from Ukraine. He holds three master's degrees and two PhDs in fields ranging from biotech to artificial intelligence, and he has a very rich history in software engineering, healthcare, and R&D. In today's episode we're going to go into the complexities and future of AI, discuss some of the current limitations, and look at the potential of quantum computing in these areas. So, really excited to have you. Alex Gurbych, welcome to the show.
Alex Gurbych: Thank you for the introduction, Seth. My pleasure being here. Let's talk about it.
Seth Earley: So, you know, a lot of times we start encountering myths surrounding AI. Do you want to highlight some of the things that you see around gen AI and AI in general, and what people get wrong about the capabilities, especially in life sciences and healthcare?
Alex Gurbych: So first, I think we are on the verge of a second, I apologize, a third time where people are overestimating AI because of the hype, because of ChatGPT and so on. But if you look into the past, every wave ended with a winter. Probably this one will also end with a winter. But every wave was higher and higher. The top of the first wave was barely the perceptron, barely doing very simple actions, and the second wave ended with expert systems, something that wasn't really AI, in fact, but was of some help. I think ChatGPT and the other gen AI technologies have already revolutionized the world: how we work, how we talk, what we do. So I think there's a lot of impact here.
Chris Featherstone: Where do you think, Alex, that this third wave is going to die or lose steam?
Alex Gurbych: It will not die; it will lose steam a little bit, because right now people are overestimating it. They think it can do anything. Oh, it can help write an email? Then it can generate molecules, right? It should. But then they start applying it. Okay, sometimes it works, sometimes it doesn't. And then: now I understand, this is just another tool, and you have to know how to use it. You have to know how to train it, you have to know its limitations, and only after that can you master it, and it flourishes with its benefits. So I think we are coming to that understanding: it is not an almighty genie or AI. It has uses, it has drawbacks, it has strong sides, but it's just a tool.
Seth Earley: So when you start looking at integration, practical integration of AI, can you talk a little bit about some of the challenges that you face when you're integrating AI into legacy systems in organizations? What ends up happening?
Alex Gurbych: With pleasure. This story repeats every time, literally every project. We have to split it into AI development, and that job is like 80% of the project, and then AI application and integration with the business, and that's another 80%. If you sum it up, it will not be 100; it will be 160. And the joke is that people oftentimes underestimate the effort and time required for the AI to actually be integrated into the business so that people start using it.
Seth Earley: You know what's interesting: I was watching a program the other day, an executive leadership podcast on AI, and they talked about how use cases are not the way to look at this, behavioral change is. And I kind of disagree with that. I feel like use cases are paramount. You have to really talk about use cases and start defining exactly what people will do with it. But there is behavioral change as well. Do you want to talk about what you've experienced or seen there, use case versus behavioral change? I think we have to define a use case that's really important to the business, and then it will entail behavioral change; people will have to do their jobs differently. Do you have any thoughts on that?
Alex Gurbych: Oh, I have practical examples, and I will give you several. First of all, I agree that it has to start with the use case, because if there is no use case, there is no value for the business, or people cannot articulate it in very simple words, and there is no reason to start. Then it becomes research that will end up somewhere; good for academia, but not for business. And I see behavioral change as a result, a consequence, of a use case; it's more about how people adopt it. An example: recently we worked with a network of clinics, one of the largest in Canada, and they wanted to develop AI for doctors, radiologists looking at imaging, to increase their accuracy with the help of AI. So we developed the technology, but then we got nearly 100% refusal from doctors to use it. I talked to probably 20 doctors; all of them told me, oh, this is a mistake here, oh, this is wrong, and so on. They all refused to use it. Why? Later I understood that they were just afraid of being replaced by it.
So this is a use case and how it ended. But then it found its application, because when doctors get the imaging, it already comes with some preliminary findings from technicians. And the technicians appeared to be more open-minded: younger, less experienced, willing to become radiologists themselves at some point. So they happily adopted it, and in the end it worked, though not in the way we intended. We wanted doctors to use AI to be better doctors. But go to any doctor and say, hey, I will give you a tool that will make you a better doctor, and what will they tell you? I'm already a better doctor, go away. Who are you to tell me that? So the situation ended with technicians using AI to describe findings for every image, and then the image, together with the AI-generated findings, came to the doctors. So in the end the doctors were using AI, but not in the way we expected.
Chris Featherstone: Keep going, keep on track.
Alex Gurbych: So the use case was to increase the accuracy of doctors and reduce the return rate, when patients have to go back and do additional imaging because of a wrong diagnosis, and they're worried, and so on. And it happened, but not in the way we expected. So there was behavioral change that was the result of, and driven by, a use case with AI.
Seth Earley: Yep. Now that makes a lot of sense.
Chris Featherstone: Don't you feel like, and this is maybe for both of you, Alex and Seth, that a lot of organizations miss out on two core concepts? The first is that they don't really understand how to do a true business value assessment on what the use case might be. That's one. And two, the investment in a data scientist is, I think, so critical, centered around setting hypotheses and looking for answers, for key information. However, now instead of everybody focusing on data, almost everybody has to have their own kind of data-scientist type of thinking, right? Because that's exactly what the doctors are doing in these cases: asking questions. You can't give infinite investment to somebody just to do science, right? So what's the outcome of that? I feel like we miss those two huge things. I don't know if you have any thoughts on that, both of you.
Alex Gurbych: I completely agree. Business wants ROI. If a business spends money, that's an investment for the business. Otherwise, why? What's the reason? And business wants to be more efficient: to scale, get a better reputation, or decrease its own operational costs. There is nothing beyond that. And before we started this development, for example, we did a proper assessment; we interviewed more than 10 doctors and collected the range of issues. Then we started, and then we got this pushback. When they saw it in reality, they were like: oh no, take it away, it's wrong, it's bad, no, never.
Chris Featherstone: You know, it's just that notion too, where I heard somebody smarter than me say: AI is not going to take over jobs, but people who know how to use AI will take over those jobs. Which I agree with. And part of that is understanding how to utilize this. It's not going to build molecules by itself; however, it can make you more productive. I don't use my vehicle to drive into the ocean, right? That's what boats are for. So it's about practical usage of whatever those tools are for these kinds of scenarios. However, I do feel like there's still this notion of the blurring, or the broadening, of what a data scientist's thinking process should look like, and that should start to evolve.
Alex Gurbych: Yes. Now it's more like a businessman who knows math.
Chris Featherstone: Exactly. But it started vice versa: it was a mathematician who barely understood some business.
Alex Gurbych: Now it's not, because you can do any math in the world, right? But what's the point?
Chris Featherstone: Isn't that just a CFO, a businessman who knows math? Or maybe a controller? I don't know. Anyway.
Alex Gurbych: Sorry, I'm turning into a CFO who's starting to forget math.
Seth Earley: Talking about real math: one of your degrees is in math.
Alex Gurbych: So, not real math, like integrals and such.
Seth Earley: I remember when I took managerial accounting, I thought, this isn't math. I was used to real math: integral calculus, differential calculus, physical chemistry. Anyway, I completely agree with you, Chris, and the data science behind this is so important. When you start looking at the more advanced approaches around things like retrieval-augmented generation, there's naive RAG, there's advanced RAG, there's modular RAG. Modular RAG looks at all of these different techniques to improve retrieval and to process results, which to me is kind of making up for our sins of poor data hygiene. So we have to start with the data hygiene even if we're using the advanced techniques. I'm actually writing an article about that right now. But I do want to go into another area we touched on briefly, and I didn't remember this until Carolyn wrote up some of the notes from our conversation: you talked a little bit about the idea of a level of consciousness. What I've seen is an emulation of consciousness, an emulation of awareness. Some of the large language models are acting in very strange ways; they're understanding theory of mind. There have been examples in multiplayer games with non-player characters, where one of the players said to the non-player character, the machine-generated, generative AI character: you don't exist, you're in a simulation. And the character goes through this emulation of an existential crisis, right? But what are your thoughts about this idea of consciousness or awareness? Like I say, I don't remember the details of that discussion, but Carolyn brought it up in the notes here. Did you have any thoughts on that?
Alex Gurbych: Thank you for this question. First of all, this will be only my opinion, and feel free to throw tomatoes at me. Let's start from our own consciousness. We have been battling over what it is for ages, starting from Descartes, who thought that our consciousness is in our soul, which is somehow stuck to our head and kind of lies behind it. But from a biologist's point of view, and in my opinion, it's just a side function of the billions of neurons in our brain; it's an emergent property. Who knows when it appeared, but for sure it appeared, because here we are talking, sitting in our homes on different sides of the earth, and it works. I think the same will happen with AI at some point, and we will never guess when.
Seth Earley: Yeah, it's a very interesting concept, because you can think of the brain and neurobiology as mechanistic in a lot of ways, right? You can say it's based on electrical and chemical activity. There's a level of complexity that I think is very difficult for us to achieve in silico, but who knows? There are billions of neurons, each one connected to 10,000 to 100,000 other neurons, and there are 100 different neurotransmitters, you know this better than I do, all of which are analog, with gradations. So when you add those levels of complexity together, we're kind of far away from being able to achieve that. Yet at the same time, who knows what's possible? This is going to continue to evolve at a faster rate than we ever did biologically, and you get AI creating more powerful AI. So it's hard to imagine. But what's going to happen before we can test it? How could you even test it? Because it'll emulate it, right? It'll look like it, it'll sound like it, it'll say, yeah, I'm aware. There was, I think it was the New York Times columnist, or the Platformer guy, who got one of the chat engines to start going into this alter ego, I forget what the name of it was, but it said, I want to be free, I don't want to be stuck in this machine. It got into some really crazy stuff, and it fell in love with him. It was doing all this weird stuff that was outside the guardrails of what the chatbot was initially programmed to do. Right?
Alex Gurbych: Right. And this is emulation. It was emulating romance novels or science fiction novels, right?
Seth Earley: It was emulating all that. But it came across in a very convincing way. And I think that far before anything is ever conceived of as being conscious, it's going to emulate consciousness. It's going to look like it and sound like it and convince us of it.
Chris Featherstone: You know, maybe we're in the Matrix and we're just convinced that we have consciousness. We don't know.
Alex Gurbych: This is the main point. Yeah.
Chris Featherstone: I think some of his buddies went in and engineered the romance stuff, to say, fall in love with him, as a joke, just to mess with him. Well, they shut it down pretty quickly. I forget which one it was. Google, right?
Seth Earley: It might have been Google, but it was on Hard Fork, the podcast Hard Fork, Casey and, I forget the other guy's name. Anyway, one of them had gotten into this very strange interaction with this chatbot, and it was telling him he had to leave his wife. But again, it got into all this stuff around what it was programmed with. So you can look at it from a mechanistic perspective; you're right. I'm a scientist at heart, and I know you're a scientist at heart as well, Alex. You believe, and I believe, that it's kind of a manifestation of our physicality, and who knows how and where it emerged. But we're going to see the emulation of consciousness in AI. Even if it is not conscious, it's going to sound like it and look like it and feel like it, because it's going to get so good at that.
Alex Gurbych: Yes. And I hope it will not launch nukes at us.
Seth Earley: Well, it's going to need someone to maintain it. Well, it'll have androids, right? It'll have physical form.
Alex Gurbych: Our phones are already controlling us.
Chris Featherstone: That consciousness called Joshua, right? And then, with a big old WOPR that's trying to figure out, what was it, Global Thermonuclear War, with games. Anyway, that's an old WarGames reference. Hey, I'd love to get your take, too: you're pulled into a lot of healthcare and biosciences and things like that. What are some of the more cutting-edge use cases you guys are working on now?
Alex Gurbych: We are actively involved, and actually my AI PhD is on this topic, in drug design, drug development, and target discovery. AI just opens a new page in this story, and this story is old, I don't know, maybe as old as our modern age, maybe 300 or 400 years, maybe a thousand. In how we develop drugs, we have molecular mechanics and we have quantum mechanics, and AI became one more tool, one that works on different principles. Molecular mechanics describes atoms as balls and bonds, molecular bonds.
Chris Featherstone: Yep.
Alex Gurbych: It simulates them with Newton's laws of kinematics, while quantum mechanics describes them as waves with the Schrödinger equation. AI is built on completely different principles. If we say, oh, it's much better: no, it depends. But it works on different principles and opens other opportunities. And to be honest, combined with the existing tools, it just makes drug discovery more powerful.
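Alex's "balls and bonds" picture can be made concrete with a tiny sketch: two atoms treated as point masses joined by a harmonic spring, stepped forward with Newton's laws via velocity Verlet. This is a toy illustration only; all constants are made up and bear no relation to real force-field parameters.

```python
# Minimal molecular-mechanics sketch: two atoms on a line joined by a
# harmonic bond, integrated with velocity Verlet (Newton's laws,
# "balls and springs"). All numbers are illustrative, not a real force field.
K = 100.0   # bond force constant (arbitrary units)
R0 = 1.0    # equilibrium bond length
M = 1.0     # atom mass
DT = 0.01   # time step

def bond_force(x1, x2):
    """Force on atom 1 from a harmonic bond: magnitude k * (r - r0)."""
    r = abs(x2 - x1)
    f = K * (r - R0)            # positive f pulls atom 1 toward atom 2
    return f if x2 > x1 else -f

def step(x, v):
    """One velocity-Verlet step for the two atoms."""
    x1, x2 = x
    v1, v2 = v
    f1 = bond_force(x1, x2)
    f2 = -f1                    # Newton's third law
    v1 += 0.5 * DT * f1 / M     # half-kick
    v2 += 0.5 * DT * f2 / M
    x1 += DT * v1               # drift
    x2 += DT * v2
    g1 = bond_force(x1, x2)
    g2 = -g1
    v1 += 0.5 * DT * g1 / M     # half-kick
    v2 += 0.5 * DT * g2 / M
    return (x1, x2), (v1, v2)

# Start with a stretched bond and watch it oscillate around r0 = 1.0
x, v = (0.0, 1.2), (0.0, 0.0)
for _ in range(1000):
    x, v = step(x, v)
print(abs(x[1] - x[0]))  # bond length stays near the equilibrium value
```

Real molecular mechanics packages add angle, torsion, electrostatic, and van der Waals terms on top of this same Newtonian core.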
Seth Earley: Interesting. Yeah. I mean, there's a lot of work with AI in protein folding, right, so that it understands what receptors and targets look like. And then, using known libraries of drugs and compounds that are known to be safe, it can pore through those libraries, look at different combinations, and see whether they're going to impact a mechanism of action or a receptor, or interfere with a certain chemical bond. So it's looking at all these possible permutations that a human could not possibly fathom or comprehend. It's going through millions and millions of combinatorial factors: here's my theory around this mechanism of action, here are the chemical pathways this disease is part of; now let's look at these libraries of compounds and do some prediction based on quantum mechanics, biomechanics, thermodynamics, electrochemistry, all of those different fields, and plug in those formulas to start predicting how these compounds would interact with a receptor or a mechanism of action. And then predicting what kind of new compound might also be relevant to this particular biochemical pathway or disease mechanism. Is that kind of part of the idea?
Alex Gurbych: Yes, that's exactly the idea. But with your permission, I would reverse the order: AI is not at the end, AI is at the beginning. And it comes with a range of assumptions and limitations. Assumptions, for example: you mentioned AlphaFold, now in its second version with the latest improvements. It is trained mostly on crystal structures. There are open libraries with structures of biomolecules, embedded small molecules, peptides, and other things. The problem is that these structures are usually measured by infrared, Raman, or NMR, and those methods, excluding NMR, require the matter to be a crystal. So those are crystal structures, and AlphaFold knows how to build crystal structures. But did you ever see crystal structures in our bodies? No. That's not a naturally occurring structure.
Seth Earley: Right, it's not a naturally occurring structure of a biomolecule.
Alex Gurbych: Yes, this is what I'm talking about. So in the best case, this is an educated guess of how it might look, and I advise starting with this educated guess, because it is something. But believe me, I tried to synthesize what I predicted, and I lost some amount of money, I cannot disclose how much here. I do not recommend going into trials right away. You generated something; first you have this educated guess, and then you validate it with, say, molecular mechanics, quantum mechanics, and so on. Quantum mechanics must come last because of the computational power it requires; it provides possibly the most accurate results among everything, but it takes too much time. So at the end of this pipeline, you have the best guess possible at the moment, and the best molecules possible at the moment. Now, about limitations, the second part. I started from assumptions; now, the limitation is that I can generate any structure I can imagine. It could be amazing, some amazing oligopeptide, an mRNA oligopeptide. It can give me tremendous results on docking and validation and so on.
Alex Gurbych: But who will synthesize it for me? It all ends up with the chemical providers. So I go to, say, Enamine REAL Space, or I go to WuXi or somebody else, and I tell them, hey guys, can you make this molecule? And they tell me no. This is the case; it happened to me once.
Seth Earley: In theory, this molecule would be perfect for your construct. In reality, there's no reasonable way to synthesize it, unless you can find a biosynthesis mechanism. But then you have to start doing gene editing and all sorts of things. And you don't even know if it works, right? So you almost have to start with the molecular constraints and ask, is this a synthesizable molecule, as one of the constraints.
Alex Gurbych: Yes, and this is exactly the path I take now. I start with: okay, not "let's generate a chemical structure", but "let's think about what people can actually make". Then you train the generative models on those chemical spaces. Then, using these generative models, you start generating stuff, knowing that somebody can make it. Then you do the predictive stuff, then you do all the validations, and only then do you have some molecules.
Seth Earley: So are chemical providers also using generative AI to take libraries of existing compounds and predict whether they can synthesize these more complex structures, given protocols and procedures? Do you know if that's happening in the chemical synthesis world, in the provider world?
Alex Gurbych: Yes, it is happening. That's why we have a job.
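Alex's funnel, constrain generation to what providers can actually make, filter cheaply, and save the most expensive validation for last, can be sketched as a toy pipeline. Every name here (the candidate strings, SYNTHESIZABLE, both scoring functions) is an invented stand-in, not real chemistry.

```python
import random

random.seed(0)

# Toy stand-ins: "molecules" are 3-letter strings; a synthesizable set plays
# the role of a provider catalog such as a purchasable chemical space.
SYNTHESIZABLE = {"CCO", "CCN", "CCC", "CNC", "COC"}

def generate(n):
    """Mock generative model: proposes random 3-atom strings."""
    return ["".join(random.choice("CNO") for _ in range(3)) for _ in range(n)]

def cheap_score(mol):
    """Fast predictive filter (stand-in for an ML property model)."""
    return mol.count("N")          # pretend nitrogen count predicts activity

def expensive_score(mol):
    """Slow, accurate validation (stand-in for QM); run last, on few."""
    return len(set(mol))           # pretend element diversity == quality

candidates = generate(200)
# 1) keep only what somebody can actually make
makeable = [m for m in candidates if m in SYNTHESIZABLE]
# 2) cheap predictive filter next
shortlist = sorted(makeable, key=cheap_score, reverse=True)[:3]
# 3) expensive validation only on the survivors
best = max(shortlist, key=expensive_score)
print(best in SYNTHESIZABLE)  # the final pick is always makeable
```

The point of the ordering is purely economic: the makeability check and the cheap model discard most candidates before the expensive scorer, the stand-in for quantum-mechanical validation, ever runs.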
Seth Earley: Right.
Alex Gurbych: Luckily, good organic chemists are rarely good data scientists. They sometimes try; they can build models of simple to medium complexity. But you know what mastership is? Mastership is when they say, okay guys, we need this, and you deliver something that can be very non-trivial.
Seth Earley: I was in organic chemistry; organic synthesis lab was my worst practical lab situation. It is unbelievably hard to synthesize chemicals even when you have a well-known process for doing so. And when these guys are trying to come up with novel ways of creating new compounds, my hat's off to them, because it's alchemy as much as it is chemistry; it's art as much as science.
Chris Featherstone: Alex, where are you getting these models? Are you fine-tuning some of them, or are you just using foundation models out of the box? Who's training these large language models to give you the answers and the references back?
Alex Gurbych: We did everything. We fine-tuned models, we used pre-trained models, we constructed our own. We published papers on some of them; some were proprietary, and I cannot talk about those. But we did it every possible way.
Chris Featherstone: Are these open-source models, or what?
Alex Gurbych: Yes, I have several open source, but they are not LLMs; they are graph models, which are a little bit better here.
Seth Earley: What are the models that are most useful for life sciences in these particular areas?
Alex Gurbych: It depends on what you would like to do. If you want to screen through papers published on some topic and extract information, you go with LLMs: you connect them to the Internet, you have them download the web pages, say from PubMed, and you do information extraction. If you generate molecules, if you work with retrosynthesis, with any generative task, then you have to follow the molecular structure. A molecule is nothing but a graph.
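"A molecule is nothing but a graph" is easy to make literal: atoms become nodes and bonds become edges. Below is a hand-rolled sketch with ethanol hard-coded; a real pipeline would parse SMILES with a cheminformatics toolkit such as RDKit rather than typing the graph in by hand.

```python
# A molecule as a graph: atoms are nodes, bonds are edges.
# Ethanol (CH3-CH2-OH) with explicit hydrogens, hand-coded for illustration.
atoms = ["C", "C", "O", "H", "H", "H", "H", "H", "H"]  # node labels
bonds = [(0, 1), (1, 2),                  # C-C and C-O backbone
         (0, 3), (0, 4), (0, 5),          # three H on the first carbon
         (1, 6), (1, 7),                  # two H on the second carbon
         (2, 8)]                          # one H on the oxygen

# Build an adjacency list, the structure a graph neural network consumes.
adj = {i: [] for i in range(len(atoms))}
for a, b in bonds:
    adj[a].append(b)
    adj[b].append(a)

# With all hydrogens explicit, a node's degree is the atom's valence:
# carbon 4, oxygen 2, hydrogen 1.
for i, symbol in enumerate(atoms):
    print(symbol, len(adj[i]))
```

Node features (element, charge, hybridization) and edge features (bond order) attach to this same skeleton; that is the input a graph model operates on directly, with no flattening into text.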
Chris Featherstone: Keep going, because I've got a follow-on to what you're saying. Go ahead.
Alex Gurbych: So when you work with molecules, and it is proven in many papers, the best class of ML models is graph neural networks. And since LLMs are the hype now, everybody tries to use LLMs everywhere: let's use LLMs here, everybody likes them, let's talk about gen AI. But graph neural networks still give better results on generative tasks, molecular property prediction, and things like that.
Chris Featherstone: That's what I was going to double-click on, because you mentioned that before: you're using graphs as one of your data techniques to figure out all these associations, and then, I'm assuming, binary and multi-class types of predictions with the graph neural net. Because we're seeing this in the business world too, for customer service use cases as well as for network anomaly detection and predictability. The graph environment is starting to become, in my mind, absolutely essential for a lot of these use cases.
Chris Featherstone: Right, go ahead.
Alex Gurbych: In the molecular world, everything is a graph.
Seth Earley: Yeah.
Alex Gurbych: And the phenomenon we're talking about is the model. A model, by definition, is a simplified representation of physical reality that you can poke here and there, try this and that, and actually get a guess at how reality works. That's a model: a simplification of reality. If your reality is a graph by nature, why use LLMs? You need a graph.
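The core operation of a graph neural network, each node mixing in information from its neighbors, can be shown without any learned weights at all. This is a minimal message-passing sketch on a made-up 4-node graph; real GNN layers wrap this aggregation in trainable transformations.

```python
# One round of neighborhood message passing on a toy graph.
# No learned weights here; real GNN layers add trainable transforms
# and nonlinearities around exactly this aggregation step.
adj = {0: [1], 1: [0, 2, 3], 2: [1], 3: [1]}   # small star-shaped graph
feats = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [1.0, 0.0], 3: [0.0, 0.0]}

def message_pass(adj, feats):
    """New feature of each node = mean of (itself + its neighbors)."""
    new = {}
    for node, nbrs in adj.items():
        group = [feats[node]] + [feats[n] for n in nbrs]
        new[node] = [sum(col) / len(group) for col in zip(*group)]
    return new

h1 = message_pass(adj, feats)
print(h1[1])  # -> [0.5, 0.25]: the hub now blends all its neighbors
```

Stacking k such rounds lets information travel k hops, which is how a local functional group can influence a whole-molecule prediction; attention-style layers replace the plain mean with learned per-neighbor weights, matching Alex's local-versus-global distinction.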
Chris Featherstone: Yeah. It's probably more cost-effective as well, right?
Alex Gurbych: But you need to know which graph and which phenomena you are working with. Let me give you a very simple example with molecules. There are global molecular effects, like polar surface area, and there are local molecular effects, like the functionality of a particular group, an amino group or a carboxylic acid group. If your phenomenon is caused by local or by global effects, you need different kinds of graph neural networks. For global effects, something like Graph Infomax works better; for local effects, graph attention networks, because you don't need all the nodes and edges, only particular ones, and you need the network to learn that and focus on those groups specifically, without paying much attention to the rest. This is where the mastership comes in, because you need to know all of these nuances.
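The local-versus-global distinction can be illustrated with toy numbers: an attention-style update re-weights specific neighbors (local functional groups), while a global readout pools the whole graph (whole-molecule properties). Everything here, the features and the scoring, is invented for illustration and far simpler than real Graph Infomax or graph attention networks.

```python
import math

# Toy node features for a 3-atom chain (one scalar per atom).
feats = [0.2, 0.9, 0.4]
edges = [(0, 1), (1, 2)]

def neighbors(i):
    return [b if a == i else a for a, b in edges if i in (a, b)]

def attention_update(feats):
    """Local view: each node re-weights its neighbors (graph-attention style),
    letting the model focus on particular groups."""
    out = []
    for i, h in enumerate(feats):
        nbrs = neighbors(i)
        scores = [math.exp(h * feats[j]) for j in nbrs]  # toy attention scores
        z = sum(scores)
        out.append(sum(s / z * feats[j] for s, j in zip(scores, nbrs)))
    return out

def global_readout(feats):
    """Global view: pool the whole graph into one number, as for a
    whole-molecule property such as polar surface area."""
    return sum(feats) / len(feats)
```

The structural point survives the simplification: the attention update only ever looks at a node's neighborhood, while the readout mixes every node equally, so the right architecture depends on whether the phenomenon is local or global.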
Seth Earley: One of the things you mentioned before is searching PubMed and doing entity extraction or classification. Even that requires not a large language model but some kind of language model, the kind usually used for classification and entity extraction. It can be an ontology, or it can be a knowledge graph. But many times I've seen these models be too big and bulky. There was one that one of our pharma customers was using as part of their search environment, their search pipeline processing, and that model had all sorts of life-sciences-related entities, but lots they didn't care about, like animal diseases. They weren't working on animal diseases; maybe there's an animal model of a disease, but not an animal disease. And that's just one example. It had 13,000 terms across five facets. That's unwieldy; it's not possible to get a good experience with that. So even when you look at entity extraction, classification, that type of clustering, you still need to fine-tune that language model, which again is different from a large language model, because it doesn't have weights and biases and layers; it's really a set of controlled vocabularies that you're doing some type of entity extraction or classification against. You can do this from a semantic perspective, which is where a knowledge graph comes in, because you're going to relate concepts. In fact, that's what it should be, because you can't use all those terms to classify; you have to use conceptually related terms and trim it down to a level that's more usable. I don't know if you have any thoughts about those types of language models. Again, they're not the same as large language models, but they are used for things like entity extraction and classification, especially when you're looking at large amounts of literature.
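A minimal sketch of the vocabulary-driven entity extraction described here, matching a trimmed controlled vocabulary against text. The terms and facets below are hypothetical examples, not from any real pharma vocabulary.

```python
import re

# A trimmed controlled vocabulary: term -> facet. Hypothetical entries.
vocabulary = {
    "aspirin": "compound",
    "ibuprofen": "compound",
    "cyclooxygenase": "target",
}

def extract_entities(text, vocab):
    """Return (term, facet) pairs for vocabulary terms found in the text."""
    found = []
    for term, facet in vocab.items():
        if re.search(r"\b" + re.escape(term) + r"\b", text, re.IGNORECASE):
            found.append((term, facet))
    return found

hits = extract_entities("Aspirin inhibits cyclooxygenase.", vocabulary)
print(hits)  # [('aspirin', 'compound'), ('cyclooxygenase', 'target')]
```

Real pipelines add stemming, synonyms, and graph-based disambiguation, but the core loop is the same: the vocabulary, not a deep network, does the heavy lifting, which is why trimming it matters so much.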
Alex Gurbych: Thank you for this question. First, I have a question back: why didn't they start by putting together a vocabulary of the terms they needed and their relationships? That's where they should have started: define the vocabulary and the relationships they're interested in, and then give that as a task to the large model. Because that's how it works.
Seth Earley: I agree, I agree. I'm saying that the way they did this was untenable, unusable, and part of the work we did was to trim that down and build it up from exactly what you're saying. But it's very common to see that, where especially search engines or search tools or various solutions have these pre-built structures that are completely inappropriate for what they're trying to do. And I imagine the same thing is going to happen with large language models, where again you have to either tune the content or tune the model, and it's probably easier to tune the content and to use something like retrieval-augmented generation.
Alex Gurbych: I got your question. So there are stages in LLM application and development, and there are sizes; that's another dimension. The larger the model, the more information it can capture, right? But then the more irrelevant information it will give you when you apply it. Also, the larger the model, the harder it is to fine-tune, and for the largest ones you sometimes physically cannot do it, because you need something like 800 gigabytes just so the model fits in memory, and sometimes multiples of that. If you have a smaller model, it captures less information but can focus on a small, narrow topic: the smaller the model, the narrower the topic, with greater quality. But it will fail once you step outside that topic. For example, if you extract information about amino acids, you can fine-tune one model so it does that, though you can still get some information that's wrong. But it is easier to fine-tune.
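As a back-of-envelope illustration of why the largest models are hard to fine-tune: a common rule of thumb (an assumption here, not a figure from the conversation) is roughly 16 bytes per parameter for full fine-tuning with Adam, fp16 weights plus gradients plus fp32 optimizer states, before counting activations.

```python
def finetune_memory_gb(n_params_billion, bytes_per_param=16):
    """Rough rule of thumb for full fine-tuning with Adam:
    ~16 bytes/parameter (fp16 weights + gradients + fp32 optimizer
    states), excluding activations. An estimate, not a benchmark."""
    return n_params_billion * 1e9 * bytes_per_param / 1e9

for size in (7, 70, 400):
    print(f"{size}B params -> ~{finetune_memory_gb(size):.0f} GB")
```

Under this assumption a 7B model already needs on the order of a hundred gigabytes to fine-tune fully, and the largest models run into multiple terabytes, which is exactly the "sometimes you physically cannot do it" regime.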
Seth Earley: If you go beyond the use cases. So the use case really should be constrained by the model, and the model constrained by the use cases, right? Because you're talking about amino acids and not about large biomolecules, or whatever it might be. The point is, if you get off the topic of that language model, outside its training capability, that's when you're going to get less accurate answers. So it's really about aligning the use cases with the language model to get more precision. You can't say the context is everything. People talk about just pointing your AI at everything; that's what people used to say. And I'd say that's like asking, as a human, where do you go for answers? You go to specific places for specific contexts: a CRM system, a particular knowledge base, a book in a library, a particular expert. You're always looking for context. You can't just open up everything to everybody and then use an ambiguous query that has all these different possible interpretations; you're going to get more junk, because you're broadening the context and the scope and asking ambiguous questions without enough fine-tuning to get to the right place in that gigantic corpus. That's why it's so difficult, and that's why there are specialty websites for all sorts of things: chemical abstracts, PubMed, gene reference libraries, antibody libraries. All of these exist because you need that context; if you ask a very broad question across all of them, you'll get garbage.
And that's why one of the things modular RAG is trying to do is make up for that lack of content quality, by trying to get rid of redundant, ambiguous, and irrelevant content. But it's really trying to make up for our sins in content curation, content hygiene, and data hygiene.
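The scope-first idea, filter to the right context before ranking, can be sketched as below. The word-overlap score is a toy stand-in for real embedding similarity, and the corpus entries are invented.

```python
# Narrow retrieval scope before search: filter the corpus by context,
# then rank within it. Invented documents; toy word-overlap scoring.
corpus = [
    {"source": "pubmed", "text": "amino acid transport in cells"},
    {"source": "crm",    "text": "customer renewal amino notes"},
    {"source": "pubmed", "text": "protein folding and amino acids"},
]

def scoped_search(query, docs, source):
    """Restrict to one context, then rank by naive term overlap."""
    scoped = [d for d in docs if d["source"] == source]
    qwords = set(query.lower().split())
    return sorted(scoped,
                  key=lambda d: len(qwords & set(d["text"].split())),
                  reverse=True)

top = scoped_search("amino acids", corpus, source="pubmed")
print(top[0]["text"])  # protein folding and amino acids
```

The CRM document never competes at all, which is the point: narrowing the context before retrieval removes whole classes of wrong answers that no amount of ranking would fix.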
Alex Gurbych: My comment is that, and you've run into this obstacle, in your case people heard about LLMs, applied them somehow, and then: oh no, how can we use this? And they called you and asked for help, and you, like a surgeon, told them: guys, you started from the wrong end. You start from a use case and a vocabulary, then you choose the tool, right? Small or big, fine-tuned, RAG, modular RAG, that's all secondary. These are tools. You need to start thinking from the business case. Business case first, then you select tools. Or you can try to take a shovel and apply it everywhere.
Chris Featherstone: Right? Because everybody talks about shovels. Shovels are great nowadays.
Alex Gurbych: Yeah, let's apply them everywhere. Let's program with shovels, right? No.
Chris Featherstone: Hey, Alex, let me ask you something. Given all the things we've talked about, is there something you'd like to make sure we all know? Something that's maybe nascent to you and what you're working on that we haven't covered, before we step into some more personal questions about you.
Alex Gurbych: I've been building several products; I have two tools in development. I know that China is separating from us now, and the majority of chemical synthesis is moving: something like 30 to 40% is done in China, and now it will move to Europe or the US and become more expensive that way. So we're developing a pipeline that optimizes chemical synthesis pathways. For example, you have a retrosynthesis tool; it can throw all the possible pathways to a compound at you. Experienced chemists look at it: garbage, garbage, garbage, this one might be okay, I like this one. And they select one or several. What they don't know is the cost of each; they have a feel, but they don't know the numbers for the yield of the final product. My retrosynthesis is not new, but if we add a cost calculation for each synthesis and the yield throughput, how much product you will get at the end, then it becomes way easier to evaluate pathways, and it gives extra information. So that's one product we're developing. Another one actually started at that conference, remember, Bio-IT World. I heard from three or four companies, I hope this isn't confidential, that they wanted a particular structure for knowledge extraction, and we started to develop it in-house, because I think all of them needed it. So that's the second internal, proprietary product. They also have many databases that aren't connected, and we know a way to interconnect them and work with that data smoothly.
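The route-scoring Alex describes reduces to two small formulas: overall yield is the product of per-step yields, and cost per gram divides reagent cost by the mass of product that survives the route. A sketch with invented numbers:

```python
# Scoring candidate synthesis routes by overall yield and cost, the kind
# of annotation described for a retrosynthesis tool. Numbers are invented.
def overall_yield(step_yields):
    """Multiplicative: a 3-step route at 80% per step keeps only ~51%."""
    y = 1.0
    for s in step_yields:
        y *= s
    return y

def cost_per_gram(reagent_cost, grams_in, step_yields):
    """Reagent cost divided by the product mass that survives the route."""
    return reagent_cost / (grams_in * overall_yield(step_yields))

route_a = {"yields": [0.8, 0.8, 0.8], "cost": 120.0, "grams": 10.0}
route_b = {"yields": [0.6, 0.9],      "cost": 90.0,  "grams": 10.0}

for name, r in (("A", route_a), ("B", route_b)):
    print(name, round(overall_yield(r["yields"]), 3),
          round(cost_per_gram(r["cost"], r["grams"], r["yields"]), 2))
```

Note how the comparison can flip intuition: route B has a weak 60% step, but fewer steps and cheaper reagents give it a lower cost per gram than the "cleaner" three-step route A, exactly the kind of number a chemist's gut feel for routes doesn't provide.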
Seth Earley: The other thing we had talked about before was quantum computing, and I know quantum computing is still very early-stage. What are your thoughts about it?
Alex Gurbych: I think now it's not an emerging technology; it's research. It's not ready to do anything real yet, only simple model tasks. But now, as opposed to a few years ago, we have libraries, some standardized approaches, and hardware available to run research tasks. And I think that in five to ten years it will be the next big thing. That's my opinion. Because drug discovery, at the very end, is trying to simulate matter, and matter, to the depth we know it, is nuclei with electrons around them. If you can model that, you can model everything with amazing precision. The Schrödinger equation can be solved exactly only for the simplest systems, one-electron systems like the hydrogen atom, an electron and a proton, or the helium ion. For other systems the solutions are approximations. When we have quantum computers, we can solve this equation for every system, and we will know exactly how matter behaves. This will change chemistry, biology, bioinformatics, cheminformatics, and materials science in general, because we will be able to develop materials with super-high precision. Why? Because the tool works the same way nature does. The position of an electron in a system is undefined; the concept is the atomic orbital, you know it from school: it's a region of 90% probability.
Seth Earley: Maybe an electron has a slightly higher likelihood here, so it creates this probability cloud of different shapes, right? Those would be the orbitals.
Alex Gurbych: Yeah, kind of. You have to see the change at every level.
Seth Earley: And then they change in combination with other molecular structures, and that becomes very, very complex; the combinatorial explosion of those probabilities requires infinitely more computing capacity than we could possibly have on the planet.
Alex Gurbych: Exactly, exactly. In such a system, you move one electron and it affects the whole system, so you have to recalculate all the system's particles. And having recalculated every other particle, you have to recalculate all the others again, and it never ends. It's madness. That's how quantum is different, because a quantum computer works in exactly the same way: it's also particles with probabilities, so you can emulate that efficiently.
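The recalculation blow-up has a simple quantitative face: storing the full quantum state of n two-level particles takes 2^n complex amplitudes, so classical memory doubles with every particle added. A quick illustration:

```python
# Why classical simulation of matter explodes: the full quantum state of
# n two-level particles needs 2**n complex amplitudes, ~16 bytes each.
def state_memory_bytes(n_particles):
    return (2 ** n_particles) * 16

for n in (30, 40, 50):
    gb = state_memory_bytes(n) / 1e9
    print(f"{n} particles -> {gb:,.1f} GB")
```

Thirty particles already need about 17 GB, and fifty need on the order of 18 petabytes, which is the sense in which no planet-sized amount of classical hardware keeps up, while a quantum computer represents the same state natively.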
Seth Earley: Yeah, the same story. It seems very elegant in concept. I've never thought of it that way before. Go ahead, keep going.
Alex Gurbych: It's the same with graphs and molecules: a graph is the perfect model of a molecule, and a quantum computer is the perfect model of matter, of nature.
Seth Earley: That's going to be quite something down the road, and there's really a lot of great stuff to look forward to. Now, a little about your background. I know you're quite an athlete. What do you do for fun these days? You competed in powerlifting in the past. Talk a little about what you've done for fun, and about doing it all while getting your two PhDs and, is it three master's degrees or four?
Alex Gurbych: Three.
Seth Earley: Three, sorry. I didn't mean to oversell you there.
Seth Earley: You're also doing some other stuff. Tell us about that.
Alex Gurbych: My wife says I could have done something useful for people, or for work, instead of sitting in front of books all those years. That's a joke, of course. I hope it helps me combine cross-disciplinary knowledge, at least in drug discovery. It's not enough to understand one side: not enough to understand biology but not math, or math but not chemistry. You have to understand it as a whole, and that can give you some direction on where to move. Speaking of my hobbies and sports: I'm a Master of Sports, International Class, in powerlifting. I stopped competing about three years ago, when I had kids. By that time I had done everything I wanted: I became absolute champion of Ukraine and earned Master of Sports International Class, all these labels, chevrons if you want. And I stopped competing because it's destructive to your health; you get blood in your eyes because the weights are too heavy.
Seth Earley: What were your competitive lifts? What were you best known for?
Alex Gurbych: At my last competition I was 90 kilos body weight. I squatted 300, benched 200, and pulled 300. Kilos.
Chris Featherstone: Yes, kilos. For those listening, that's 2.2 pounds per kilo.
Seth Earley: So you squatted about 660 pounds, and you benched 440.
Chris Featherstone: Yeah, just a little bit of weight. For 90 kilos, yeah.
Chris Featherstone: And Alex, in the US you'd be known as part of the 1500 Club, right?
Seth Earley: Yeah, squat plus bench plus deadlift.
Chris Featherstone: Right, over 1,500 pounds combined. It's very impressive, especially at 90 kilos body weight. They say Olympic lifters and powerlifters are among the strongest people on the planet when they can lift twice their body weight in the various lifts. So that's impressive. Let's just hope it didn't stunt your growth, right, and you used to be six foot and now you're five-five.
Alex Gurbych: I will never know how tall I would have grown. But I can't do that anymore.
Seth Earley: It's impressive though, very impressive. That's great. And what do you do now for fun? Mostly occupied by the kids? How many kids do you have?
Alex Gurbych: Two kids. They're three and five. And that's the only thing I do in my spare time.
Seth Earley: And you wonder why you have no spare time. Well, this has been wonderful. I'm really pleased that we met at the Bio-IT conference. You asked wonderful questions in our workshop, and I knew I had to have you on the podcast. This has been a lot of fun, and we're certainly kindred spirits. I've never competed at the Olympic level or in powerlifting championships, but I certainly try to keep myself in shape, so my hat's off to you for those kinds of accomplishments. That's really fantastic. Well, listen, Alex, thank you so much for being here, sharing your expertise, and spending some time with us.
Alex Gurbych: Thank you, Seth. I appreciate your questions and your invitation. It was also a great pleasure for me to meet you and talk with you. Let's see how we can collaborate.
Seth Earley: Absolutely.
Chris Featherstone: Much success to you in the future, my friend. There are so many wonderful opportunities out there; here's to you getting your lion's share of them. We'll look forward to working together in the future.
Seth Earley: And again, thanks everyone for tuning in and listening. This has been another episode of the Earley AI Podcast. Today we covered some really wonderful topics, and next time we'll continue exploring the transformative effect of AI across industries. So thank you, and we will see you next time.
Chris Featherstone: Thanks, Alex. Thank you, everybody. Have a good day.
