Ship Real AI Agents Now: AgentKit + ElevenLabs, Deloitte’s GPT Faceplant, and the 17× Dot-Com Bubble Claim
Hi, everyone. Welcome to the Monkey Patching Podcast, where we go bananas about all things agent kits, hallucinations, and more. My name is Murilo. I'm joined by my friend Bart. Hey, Bart.
Bart:Hey, Murilo. How are you doing?
Murilo:I'm doing great. A lot of cool things were released. I think OpenAI had their dev day, I think they called it.
Bart:Something like that. Yeah. I'm not sure the exact name. So we
Murilo:have some some interesting stuff to to talk about, and we have also some tidbits, some I don't know how to call it, like chitchat, bubble.
Bart:Let's see. Don't we always?
Murilo:Let's see. Don't we always? But let's get started, shall we? Do you wanna start, should I start? Kick it off.
Bart:ElevenLabs has released Agent Workflows, which let you build branching voice flows on a clean drag-and-drop map with human handoffs baked in. You can swap models or voices with subagent nodes and route success or failure with dispatch tools. It has no-code guardrails built in, so complex call trees don't fall apart when users talk over each other. Cool.
Murilo:So ElevenLabs is the company that did a lot of voice cloning and all these things. Right? Like the voice agents, let's say.
Bart:It's very well known for, let's say, text to voice. Right? And, like, train your own voices, voice cloning, etcetera.
Murilo:Indeed. So they've now also released the agent workflows. And we're gonna see this again later in the episode, but it looks something like this. So basically, you can have
Bart:A bit like a drag and drop canvas. Yeah. Right? Like, if you are familiar with Zapier or with n8n. Like, you build your workflow, build in the logic through a visual interface.
Murilo:Exactly. So you can drag stuff in, you can actually select the different voices for each thing, you can say if the intent is this, then go to this other agent that may or may not have the same voice, you can call tools and say if this tool succeeds then do this, if the tool does not succeed then do that. This is, I guess, for phone numbers actually, because one of the things you can also do is to transfer to a person as well. So you can transfer to a phone number or you can just hang up the call. So I think it's cool.
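The branching Murilo describes (route on intent, route on tool success or failure, end in a human transfer or a hang-up) can be sketched as a tiny graph walk. This is only an illustration of the idea, not ElevenLabs' API: `Node`, `run_tool`, `walk`, and the node and voice names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Node:
    """One step in a call flow. Hypothetical structure, not a vendor API."""
    name: str
    voice: str
    # Maps a detected caller intent to the name of the next node.
    routes: Dict[str, str] = field(default_factory=dict)


def run_tool(tool: Callable[[], bool], on_success: str, on_failure: str) -> str:
    """Pick the next node based on whether a tool call succeeded."""
    try:
        ok = tool()
    except Exception:
        # A crashing tool is treated the same as a failed one.
        ok = False
    return on_success if ok else on_failure


def walk(nodes: Dict[str, Node], start: str, intents: List[str]) -> List[str]:
    """Follow a sequence of intents through the graph and return the path taken."""
    path = [start]
    current = start
    for intent in intents:
        nxt = nodes[current].routes.get(intent)
        if nxt is None:
            # Unknown intent: stay on the current node and stop routing.
            break
        path.append(nxt)
        current = nxt
    return path


# Illustrative flow: each agent can have its own voice.
NODES = {
    "greeter": Node("greeter", voice="warm", routes={"billing": "billing", "bye": "hang_up"}),
    "billing": Node("billing", voice="formal", routes={"escalate": "human_handoff"}),
    "human_handoff": Node("human_handoff", voice="none"),
    "hang_up": Node("hang_up", voice="none"),
}
```

For example, `walk(NODES, "greeter", ["billing", "escalate"])` follows the caller from the greeter to the billing agent and on to a human, while `run_tool(lookup, "confirm", "human_handoff")` would send the call to a person whenever a hypothetical `lookup` tool fails.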
Bart:I mean, yeah. It's cool. And I think ElevenLabs was already used a lot for, let's say, quote, unquote, automated voice calls or automated customer support via voice. But you had to build your own logic around this and integrate it via their API or the tool set that they had available at the time. Now I think what they introduce is basically their own out-of-the-box call center as a service.
Murilo:Yeah. Indeed. Indeed. Which should make a lot of stuff much, much easier, right, to build these, POCs and all that.
Bart:Exactly. Exactly. And I think it will very much upset the startup ecosystem around those virtual call centers, because probably a lot of tools have already wrapped ElevenLabs just to do this. Right? And now you can do it with ElevenLabs just out of the box.
Bart:For sure.
Murilo:And I think it's also not just startups, but also phone providers, right, that integrate the software with phone numbers. Yeah. I know that some of them were also working on things like this to implement AI and all that. But now it comes from a big name like ElevenLabs, where the quality of the AI is good, right, for the voice and all these things. Then I guess, now that I'm saying this, the main developer task here is integrating ElevenLabs with the phone systems that they have, right?
Murilo:Which I'm sure
Bart:that it will. They'll keep working on this as well to make it more and more smooth. But this brings their core technology, this synthetic voice generation, much closer to actually getting direct value out of it. Right? Yeah.
Bart:Because this is something that you can like tomorrow implement.
Murilo:For sure. Yeah. I think financially, like, as a product, ElevenLabs now has way more value. Right? Like, it's different to say, I have a company that builds voice cloning, and to say, I have a company that creates these agents that can support businesses to manage customer relations.
Murilo:Right?
Bart:Exactly. Yeah.
Murilo:Very nice. And, actually, also, is something, like you said, I I've seen a few use cases on this. It's not as popular, and I think it's not as popular because it's not as easy to get going with these things. So I'm also curious how they do it. Right?
Murilo:For like, if you need to version stuff or how do you, you know, the AI layer with the text layer. What if you have multi language? Right? We live in Belgium. In Belgium, there are three official languages as well.
Murilo:So how or accents as well. So I think there are there's still a lot of questions to be answered, but I think it's a step in the right direction.
Bart:We should test it out.
Murilo:We should test it out. Maybe we can do a
Bart:For our episode for everybody calling in every time of each episode.
Murilo:Yeah. Exactly. Otherwise, it's too much, you know, it's unmanageable. Maybe you can do, like, a whole episode that is just, like, calling them, you know, and just having a conversation and just see how it all goes. Alright.
Murilo:OpenAI's AgentKit bundles a visual Agent Builder, embeddable ChatKit, and new evals so teams can design, ship, and measure agents faster. Think drag and drop nodes, previews, versioning, plus a connector registry for governance. Less glue code, more oversight, useful when real users and real data hit your bot.
Bart:Very related, right, to what we were just discussing around ElevenLabs agent workflows? Yes. This came out of
Murilo:OpenAI on their dev day.
Bart:Yep. And this is more or less like a bundle. AgentKit is a bundle. What I tested out is Agent Builder, which is part of it. It's like this visual canvas that you're showing now.
Bart:There's also ChatKit, which is like your embeddable chat UI, and there are specific evals that you can now use. I checked out Agent Builder yesterday. You did? It's very impressive, actually.
Murilo:Really?
Bart:And to me, what really stands out is that this allows you, with this drag and drop canvas, to really build the logic around your LLMs. And I think the type of workflows that you would build in that, this is what you would have built in the last two years in n8n or Zapier.
Murilo:Mhmm.
Bart:And it will be a huge, huge competitor for these type of platforms, I think.
Murilo:Do you think there is, have you used n8n or Zapier? I know you've used.
Bart:I'm using n8n as well. Yeah.
Murilo:And how do you compare them?
Bart:Well, I think the the difficult thing with these platforms is always because I've also tried Make, for example, make.com. It's also a similar platform. It always stands or falls with how easy does it integrate all the tools that you want to use. Like you're calling out to this service. I want to store something there.
Bart:And what OpenAI now did with their Agent Builder is they have an MCP block, and you can basically connect to an MCP server and use whatever tool is available. So just connect to the Google Drive MCP and, like, store some data there. Right? To me, it's also the first use for MCP where I see, like, this is a huge change, because before MCP was there, integrating thousands of tools would have taken them, I mean, five years or something. Yeah.
Bart:And now you can just out of the box use everything that has an MCP server exposed. True. Like that combination of like being very close to that model, really being able to control like the performance throughout that workflow. That OpenAI can do. Having these MCP integrations like it makes the functionality very strong and they also have things built in like like for example.
Bart:Again, something where there are a lot of startups around, like guardrails. Like, there's just a block for guardrails where you can say, trigger when there is PII data. I want to moderate this. I want to check for jailbreaks. I want to do hallucination verification versus this set of documents that I have.
Bart:Like, a lot of it is built in.
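To make the guardrail idea concrete, here is a minimal sketch of a "trigger when there is PII" check placed in front of a workflow step. It is not how OpenAI's guardrail block works internally; the patterns are deliberately naive, and `PII_PATTERNS`, `pii_guardrail`, and `guarded_step` are hypothetical names.

```python
import re
from typing import Callable, List

# Deliberately simple patterns for illustration only.
# Real PII detection needs far more care than two regexes.
PII_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "phone": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}


def pii_guardrail(text: str) -> List[str]:
    """Return the sorted names of the PII categories found in the text."""
    return sorted(name for name, pattern in PII_PATTERNS.items() if pattern.search(text))


def guarded_step(text: str, handler: Callable[[str], str]) -> str:
    """Run the handler only if the guardrail finds nothing; otherwise block."""
    hits = pii_guardrail(text)
    if hits:
        return "blocked: " + ", ".join(hits)
    return handler(text)
```

In a visual builder this would be one block on the canvas; in code it is a wrapper around whatever step calls the model, so the input is checked before it ever leaves your system.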
Murilo:Yeah. Indeed. It's a really speeds up the a lot of the things. I'm also wondering with these developments, I feel like this space is getting more and more mature. Do you think this is the future of building agent workflows?
Bart:That's actually something that I thought about myself as well. Right? I don't think it will solve all cases. I think that's maybe the logical answer there. Like, because you are asking, like, will we go from coding to these visual platforms?
Bart:Right? Is it what you're hinting towards? Yeah. But I do think that's the problem or the problem that the what we're basically doing today for a lot of this logic is just writing code that basically defines a workflow and throughout that workflow does API calls to whatever service that is providing your model. And sometimes there are very complex systems that are that are difficult to maintain from in a in a visual interface, but a lot of them are simple.
Bart:Simple workflows that I would argue are easier to maintain in a visual interface and maybe also don't need coding skills. And actually, how OpenAI now did this with their Agent Builder is that you can actually have the full expression, you see it also in the image that you're sharing at the top right: you can just copy paste the Python code to do the exact same thing.
Murilo:Very interesting. Can you do the opposite?
Bart:Start off like building everything here in this interface and then switch to a more code based environment because you have the right arguments to do that.
Murilo:And maybe I guess even versioning and also sharing, I guess that's also that is interesting. That is interesting. Have you looked into that? I'm curious how the code looks like. Does it look kinda like a JSON?
Murilo:Kinda like a
Bart:It's it's Python code.
Murilo:It is Python code. Okay. Cool. This looks like a really good trade-off. I also remember reading or listening to someone saying that in their company, the policy is: every time someone has an idea, don't pitch the idea, build something, and show me.
Murilo:And I think With OpenAI, you mean? I don't know if it was OpenAI, but I remember something like if someone anyone has any idea, marketing. Okay. Okay. Yeah.
Murilo:Yeah. Business development. You know, they say, I would like to see this. With more of these tools, they don't have to just say, this is how I envision. They can really just build something.
Murilo:Just don't pitch me the idea. Pitch me the the actual The prototype. Exactly. Like Yeah. Really early stage, but still.
Murilo:Right?
Bart:Yeah. Puts a lot of lot of power in people that are not familiar with coding.
Murilo:For sure. For sure. And I think it's actually a good thing. I know some people don't like it because they say that's yeah. It's maybe create, like, funky or brittle applications, but I think it's actually I I personally think it's it's a good thing, right, like, to empower people.
Murilo:Like you said, there will always be a space for developers to build the safeguards and, like, deploy and have really production-grade systems.
Bart:I think there's more stuff to it. But yeah, there is this trade-off. I think from the moment that you start building in a visual interface and, like, you have tons of different blocks that contain custom code that you put in, then it becomes hard to understand what went wrong if something goes wrong. Right?
Bart:Yeah. But if you don't have these custom code blocks and stuff, like, it's easy to find where where something went wrong.
Murilo:Yeah. Indeed. But I think it's cool. Very curious, actually. I would like to try.
Murilo:But I think maybe we can maybe we could have, an episode on these Canvas, drag and drop tools, you know, and then just kinda do a tour of them. That could be interesting.
Bart:Right. What else? David Heinemeier Hansson, one of the guys behind Basecamp, argues for a simple rule: do the work that energizes you before email and meetings. He sometimes skips check-ins to code and tinker, treating maker time as nonnegotiable.
Bart:It's a nudge to protect focus so the rest of the day runs on momentum. So it's a short article.
Murilo:Did you get a chance to to read this as well, Bart?
Bart:Yeah. Yeah. Scroll through it.
Murilo:Yeah. Yes. So I read it. I I had some thoughts on this. I also wanted to hear your thoughts on this.
Murilo:So he has a very high position. Right? Like, he's also CTO of 37signals. He also did a lot. He's very successful as a programmer as well, created Ruby on Rails and
Bart:all
Murilo:that. Basically, what he says is prioritize also he also has some, like, hot takes. Right? I think he he was the same guy that said that he's they're buying servers again, that they're moving away from cloud because they did some calculations, and that's just it's actually cheaper in the, I think, like, five years horizon or something.
Bart:But that's not really related to this article.
Murilo:Not related to this article, but just to say that he has some interesting thoughts. Right? And some of the stuff I agree with, some of the stuff I don't. The point that he makes here with "pay yourself first" is: do the things that give you satisfaction even though that's maybe not the thing that is knocking on your door. So for example, he says he's responsible for employees, customers, readers, other managers.
Murilo:There's projects. There's emails that he needs to answer. But sometimes he said he just he just does what he wants kinda. He programs because he wants to. He experiments.
Murilo:He does research even though that may be not the most pressing thing. The reason I was also interested in this is because in my current role, I also would like to stay technical. But if you just follow the priority list of the things that people are asking for you, the technical work is probably gonna be one of the last ones, but more realistically, something that never gets picked up. Right? No one is asking you to code, but it's something that you want to do.
Murilo:It's something that you you you know that is important for you to do to be able to to give advice on the other things. Right? To stay up to date on all these things. Like, in the previous article, talked about the the OpenAI agent kit. I think for you to say what is good, what is not good, you also need to try these things.
Murilo:So it is important for sure, but it's not as urgent. There's not as much people you know, those are gonna knock on doors and hey. Have you tried this yet already or not? Right? Mhmm.
Murilo:And I I struggle a bit with this. I think, like, finding the balance. I think that's also a bit the what gives you energy. Right? Maybe there are things that are more you have a meeting with a potential customer next week.
Murilo:So there's still time to prepare, but should you start preparing now, or should you just kinda, like, investigate something that is more fun that gives you more energy? And he kinda he touches a bit on this. Right? Like, do the things that pay yourself. Pay yourself first.
Murilo:I think there also needs to be a bit of confidence in a way. I don't know if confidence is the right word, but I also think you need to have some assurance in yourself. Right? Like, okay. I'll get everything else done, or this is the more important thing, and be able to defend this to other people.
Murilo:And maybe that's also what I struggle with personally. But I also wanted to hear from you because I also from the outside in, at least, it always looked like you had a good balance of these things. And I'm sure you've in your previous role, you also had a lot of different things that were competing for your attention, but you said, okay. No. But I wanna do this or no.
Murilo:I do think this is important or everything else would be fine. I'm just
Bart:gonna focus on this. It's a big question. How to respond to the article? Okay. Article, I don't agree with at all.
Bart:Like, at all. Okay. Like but it's also maybe how you read it. Like, I read this as let's do the fun stuff first and then do the stuff I need to do because the fun stuff I get energy from.
Murilo:Mhmm.
Bart:And I don't really believe that that is how life works if you have ambitious goals. Mhmm. Like, if you have ambitious goals, you want to reach something. It is gonna be a shit ton of hard work that is not fun, but that you need to do to get there. And if your view towards the world is, now I'm just gonna do fun stuff, and then let's see if I have some time left, and then I'm gonna work on my goals.
Bart:I don't think you will get there. I don't think that's real life. To me, this is an example of someone that has been in a privileged position for a long time and doesn't remember anymore what a shit ton of work it took for him to get there.
Murilo:Okay. Interesting. And as you were saying this, my thought was also, like, doing the fun stuff, you never finish. Right? It's never like, I did this, okay.
Murilo:Now I I I scratched that itch. Now I'm gonna do the actual important work. I feel like a lot of these things, like, I don't know. If you tell me you have I have one month to prepare a presentation, and that's the only thing I'm gonna focus on, I can spend all month on this. If you tell me I have one week, I'll also do this.
Murilo:And same thing goes for fun stuff. So I think also I don't think if that's how you read it, I also don't agree because I think even if you said do the fun stuff first, it's like it's not a it's not a checklist that stops. Right?
Bart:No. But I do agree with, like, you're gonna have a priority list. Right? And some of those high priority things need to be fun stuff. Otherwise, you're not gonna be able to do that for a very long time.
Murilo:For sure.
Bart:So you also need like you need to find a way to correctly moderate those priorities and say yes to some things and no to other things. Yeah. But I think it should start from the priorities and not how much do I like to do this.
Murilo:No, for sure. I also don't again, what you said is, like, do this first and do this second. I also don't think that's the way you should. I think for me, the way what I my the reflection for myself was more like, you need to have a balance. Right?
Murilo:And it's not always easy to like, sometimes you need to be conscious about balancing one way rather than the other. Right? So, like, normally, I need to protect time to do the technical work, and I don't need to protect time to do the the sending proposals or the client meetings or all these things. Right?
Bart:Well, I think when you're in a role where you have responsibilities, where it comes down to managing a team, basically, I think it's very easy to lose sight of actual priorities and just let your work shift towards, like, oh, there's another fire. Let's put out that fire.
Murilo:Yeah. For me also, it's like
Bart:And those are not always priorities, but they feel as priorities.
Murilo:To me, but it even doesn't even need to be a fire. For me, it can be just like, okay. There's a proposal, whatever for next month. Maybe I should just get started now because I don't wanna be in the grind later on. Right?
Murilo:But then there's always gonna be something, and the time is always gonna take. The time to do the task will always be the whole amount of time that you allocate for it. You know? So I also don't think that's the the right way to go. And like you said, there's also things that you wanna do because it gives you energy, you also need to be mindful of that.
Murilo:There is also a bit of very the side note or the it's the same spirit, but in a very different context. I was watching a documentary about a tennis player, and then he wanted to take some time off, and then everyone was saying like, no. You shouldn't do it, but he says, but but I need this. Right? I need this to to stay sane as a person, to be able to enjoy playing tennis, to be able to do these things.
Murilo:And I think the way he said it, like, I need it. I also thought it was I think you need it for you for you to be able to do things long term, right, to not burn out, to not do this, not do that, you also need to be mindful of these things. And I think
Bart:Yeah.
Murilo:It should be normal for us to say, hey. I wanna do this because I need to recharge my battery. I need to have that balance that that you know? And it also brings value, of course, not to say that just do something purely for fun, but I do think there's a lot of stuff. If you chose a role, there are a lot of stuff that probably gives you energy in the role.
Murilo:Right? And also be mindful of that and keeping that in the forefront.
Bart:Yeah. They're like there needs to be stuff in those priority things that are fun to do.
Murilo:For sure. And I think if there's nothing, then maybe you are in the wrong role. Right? So which which is also fine, but reflection for yourself. So okay.
Murilo:Cool. Did you ever struggle with this at all, Bart, throughout your career when you were managing a company?
Bart:Of course. I think everybody does. Definitely. And I think in my role as a founder, you have a role that changes every year with the growth of the company. So I think sometimes that makes it very hard to understand what a good priority list is.
Bart:So definitely, yeah, think it's natural that you that you struggle with it. Yeah.
Murilo:Okay. Cool. And up next, we have Swiss researchers exploring, quote unquote, wetware computers built from lab grown brain organoids as a niche complement to silicon. At FinalSpark, organoids mature in four months, connect to electrodes, and fire signals for up to four months. If they learn simple tasks, living processors could deliver ultra efficient task specific systems.
Murilo:Interesting.
Bart:What is this? Tell me more, Bart. Yeah. I came across this article. I don't know actually where.
Bart:I think it passed on Reddit somewhere, about these scientists trying to basically build sort of CPUs, like processing units, from organic neurons, I want to say, organic cells, at least. Not sure what type of cells exactly. They start by harvesting stem cells, and then they grow out these micro brains from them. It triggered me very much because I didn't know that this is something that we were already doing. It has a very sci-fi, futuristic aspect.
Murilo:Very sci fi.
Bart:And also the name they give it, wetware? Like, this could be something that came from a sci-fi novel. I looked into it a little bit. I think it's a bit thought provoking. Well, the performance is not really there.
Bart:Right? This is still very, very early stages. They get these cells, what they call organoids, to basically respond to stimuli. These organoids live around three to four months, which is already quite a long time. Right?
Bart:Yeah. But they can already do some stuff like interact with a pong game. Yeah, it's very interesting. And I was thinking to myself, like, how do you train something like that? Right?
Bart:And apparently, well, there are multiple ways to do this. I think in the article that we're looking at here, they do it by sending inputs to this cluster of neurons, to this organoid. If the output is correct, then it gets a structured stimulus, and if the output is not correct, it gets a very noisy stimulus. And apparently it wires itself over time, after enough rounds, to basically respond more positively to these structured stimuli. That's how they get it to function, but it's still, of course, very early days. There are also other ways, like with certain chemical products, to use that as a positive feedback response and stuff like that. But, yeah, I'm very curious to see what this will look like in, I don't know, thirty years, twenty years.
Murilo:Yeah. Right.
Bart:I feel like You will have your wetware laptop, and then during the day, you need to stop because you need to, like, open the valve, and then you pour in your, your protein milkshake Yeah. Exactly. To feed it. Right? You know?
Bart:It's like there's no power, but
Murilo:it's just like yeah. It's but even, like, things like how skin cells are turned into mini brains, like, that in itself is already, like, what the fuck? And then to convert it to, like, a, like, like, a computer interface thing. You know? Like, it's like, even the first part, like, human cells are turned to stem cells.
Murilo:I didn't even know that was possible, to be honest. Because I also heard that, like, stem cells are super valuable because, like, if you have damaged cells, like, they're the only types of cells that can that can transform into any tissue, basically, any organ. But then I was like, we do this? I didn't even
Bart:Apparently, do this.
Murilo:It's crazy. It's cool. Indeed. It's the it's from where are they from, actually? Swiss Swiss researchers.
Murilo:Okay.
Bart:So in thirty years, you will not just have your AI assistant, but you'll also have an actual brain.
Murilo:Yeah. It will be just AI assistance. Yeah. Just intelligence.
Bart:Yeah. Yeah.
Murilo:It won't be like the neural networks will be very it's gonna be in a new new terminology for all these things. Jesus. Alright. Maybe more on the AI, especially the artificial part. Right?
Murilo:We have ah, it's your turn, actually.
Bart:Deloitte Australia admits that an AI tool fabricated citations and quotes in a 440,000-dollar government review. A revised report discloses use of Azure OpenAI GPT-4o and removes a fake judicial quote on page 58. Expect stricter disclosure rules and clients asking what exactly was machine made. Interesting. So Deloitte was tasked by the Australian government to basically do a welfare system review for Australia.
Murilo:What's this what kind of system review you said?
Bart:Welfare system. Welfare system. The Australian welfare system, basically. Yeah. Well, probably with very specific research, tasks.
Bart:They were paid 440,000 Australian dollars for it. Is it Australian dollars?
Murilo:Australian dollars. I think yes.
Bart:But they used AI like, I guess, everybody that is writing text, unfortunately.
Murilo:Yeah.
Bart:And the problem there is that it's basically hallucinated a lot of facts and citations to those facts.
Murilo:That's a biggie. That's a big no-no.
Bart:That's a biggie. Yeah. And and and to me, the difficult thing here is the answer can be don't use AI. Right? But is that a realistic one?
Bart:Because should we not assume from the moment that you get a report that AI was used?
Murilo:Well, I think you assume that, but my mom wouldn't assume that. Right? Government officials maybe won't assume that. But I
Bart:I sorry. Go ahead.
Murilo:I think, maybe aside from that, if I get a report, I would assume that there is definitely Gen AI in this. Mhmm. But I would also assume that you reviewed it, that the key points, I mean, maybe, like, an intro, maybe some stuff is fluffier than others. That's fine. But, like, the facts, these are things that you can... Facts need to be correct.
Murilo:Exactly. Right? It's like, okay. And I hate to anthropomorphize AI, but it's like, if I delegate a task to someone, but I'm the owner of it, and I'm presenting to a client or something or to a government, I cannot just say, ah, sorry. That's not my fault because this guy actually wrote this.
Murilo:Right? You need to be you need to be able to back it up and to take ownership and everything.
Bart:I agree. I fully agree. Because otherwise, like, if you don't if the facts are not correct, you can also, for this example, just bypass Deloitte and just put these research questions in a deep research task of OpenAI.
Murilo:Exactly. There is a I mean, even for like, I remembered, poof, and I need to look remember this, but there was a university as well that some some students were asking for a refund because the teacher was using AI. But the teacher was using AI to generate content, but he was still like, he still prompted AI to say, about this, this, and this. This should be the curriculum. He will still review the content.
Murilo:And I feel like, maybe that's fine. Like, there was also a podcaster, I think from Techmeme Ride Home, who also used a tool, I think, because he was also a historian, I wanna say, or something like that.
Bart:Yeah. That's true. He also has a history podcast. Yeah.
Murilo:But Yeah. And then he he used one AI tool that would also do research and also output some stuff. And then he argues a bit like, if it's still valuable, my contributions are still valuable because I still needed to say this is accurate. This is not accurate. Let's focus on this.
Murilo:So, like, it's not like he didn't need to have the knowledge and just use the AI and you just were replaced. His knowledge was also very key in delivering a good output.
Bart:Mhmm.
Murilo:Right? But that doesn't mean that like, AI generated doesn't mean that there's no human expertise in it. Right? Yeah. So I think there's also a distinction there, and I think I mean, there was a big boo boo.
Murilo:I still like because he said the the issue, the hallucination was the fake quotes. Right?
Bart:Yeah. Yeah. Exactly.
Murilo:Fake quotes from from a judge or something. So, like
Bart:I guess, a a a fake citation.
Murilo:Fake citation. Fake citation. Yeah.
Bart:You're CTO of a big consultancy firm?
Murilo:Mhmm. Yes.
Bart:Do you have a, like, a guideline for these type of things? Like, if you make a report for a client or something?
Murilo:I know what I use, and I know what we we tell the other people as well. Again, what what I said is, like, we're still responsible. I think that's something that I expect everyone to know. There's also the whole security part, right, like, which I'm not gonna get too much into. Right?
Murilo:It's like, are the data being used to train your models and all these things? I expect people to I do expect people to use AI, to be very clear. Like, I do think it's it can help. Like, I think you were faster reviewing and iterating on things than writing things out. So I do think it's it's good, but I think, again, needs to be responsible in the the security part.
Murilo:And I think everything needs to be reviewed and vetted. Right? And everything okay. What it's saying here, is this actually what you want to say? The focus of this document is is it focusing on the key parts that you think are important?
Murilo:Right? If it's like a, I don't know, presentation or a document or statement of work, maybe 80% of this should be on the actual tasks and estimations. Right? Is this actually what we want? And also even cutting things out, because AI can be very eager to generate text.
Murilo:Right?
Bart:Yeah. That's true.
Murilo:To say, okay. There's a lot of fluff here. We we have a one page, but, actually, you could say this in one paragraph. Just let's rephrase that. And you can use AI for that as well, but you need to be able to to read it then and say, okay.
Murilo:I would be happy if I had written this and we're sending this. If there are any questions, we can back this up.
Bart:Yeah. Yeah. Exactly.
Murilo:I think the only, not even fear, but, like, one question I have is that sometimes AI makes things so quick, and I think we, as humans, need more time to really absorb the thing. So sometimes I think we need to slow down a bit. More efficiency is not always better, because if you generate a document that you need to present and then you're quizzed on it, you also need to understand it. Right? So that's why, when I was writing an article, I used AI, but I also noticed that sometimes I need to type things out, or I was usually also using speech to text.
Murilo:But sometimes for me to type this out is also a way for me to start thinking about it and getting into or, like, what what do you wanna you know, like, even though I'm typing these words, I'm thinking about what's the next part, what is the next
Bart:framework around your
Murilo:what you're working on. Right. So it's a bit like yeah. And maybe related to coding, I think I have some of that as well. Right?
Murilo:Like, as I'm doing something, I'm thinking a bit of how the other things work in the background. And with AI, I feel like I can move faster, but then things are not thought through. What do you think, Bart? How would you tackle this if you were in my situation? Throwing the question back.
Bart:I think these clear guidelines. Maybe also, so what I noticed, I've run into this exact same problem myself as well. So I'm working with a friend on a, let's call it a citizen data science paper for arXiv on, it's a bit technical, but on running economics. And in an article, you typically have a part where there's, like, a short literature review. And I tried: maybe I can let OpenAI do this literature review.
Bart:And it looks very believable. But when you actually check in detail, there are citations to things that simply don't exist. But there are also these things, and I could also see a value in that. Typically, when you have a piece of text and you ask the AI to validate if all of these citations are correct and exist, which has become easier because web usage tools have become way better over the last six months, these typically get picked out.
Bart:So aside from having very clear guidelines, I could see that you can maybe build it in the agent workflow thing from OpenAI: you have this agent where you drop in this document, and the agent basically verifies every link and piece of text to see, like, does it actually exist? If there is a fact, the citation to the fact, can we follow this link? Can you retrieve it? And then basically you yourself as a person are responsible, but you can use that as a tool to do an extra validation.
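The link-checking part of the agent Bart describes could start from something like the minimal Python sketch below. It is an illustration under assumptions, not anything shown in the episode: the function names (`extract_urls`, `url_resolves`, `check_citations`) are hypothetical, and it only tests whether a cited URL can be retrieved, not whether the source actually supports the claim. That second check stays with the person delivering the document.

```python
import re
import urllib.request

# Matches http(s) URLs up to whitespace or a closing bracket.
URL_PATTERN = re.compile(r"https?://[^\s)\]>]+")


def extract_urls(text):
    """Return the distinct URLs found in the text, in order of first appearance."""
    seen = []
    for match in URL_PATTERN.findall(text):
        url = match.rstrip(".,;:")  # drop trailing sentence punctuation
        if url not in seen:
            seen.append(url)
    return seen


def url_resolves(url, timeout=10):
    """Return True if a HEAD request to the URL succeeds with a status below 400."""
    request = urllib.request.Request(
        url, method="HEAD", headers={"User-Agent": "citation-check"}
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status < 400
    except (OSError, ValueError):
        # URLError, HTTPError and timeouts are OSError subclasses;
        # ValueError covers malformed URLs.
        return False


def check_citations(text, resolver=url_resolves):
    """Map every cited URL to whether the resolver could retrieve it."""
    return {url: resolver(url) for url in extract_urls(text)}
```

Passing a different `resolver` makes it possible to try the flow without network access, and anything that comes back `False` is a citation to look at by hand before the document goes out.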
Murilo:Yeah. Yeah, for sure. I think, yeah, I think you can always add more tools to increase the confidence.
Bart:Yeah. That's maybe the right wording. Increase confidence. Yeah.
Murilo:Yeah. I think and I think you should. I think also depends on what kinda like, there are big mistakes and there are small smaller mistakes, let's say, as well. But I I would still expect people to to I would still expect someone to take ownership still
Bart:even if you have degree around it. I fully agree
Murilo:with that.
Bart:So And responsibility of the person that is that is responsible for delivering it.
Murilo:Exactly. And I think the reason why I'm saying this now is because if someone is hearing this and they're like, oh, yeah. No. But AI checked the references, and it's fine. But that actually wasn't fine.
Murilo:If someone came to me and said this, I would still say, yeah. But it's still your job to
Bart:Yeah.
Murilo:Fully agree. Yeah. Okay. Alright. What's next?
Murilo:We have: MacroStrategy Partnership claims the AI boom is a bubble 17 times the dot-com bubble and 4 times subprime, using Wicksell's framework for excess credit. They point to soaring training costs, slowing adoption, and shaky returns for large models. Gut check or alarmism? What breaks first if cash tightens? Another warning that AI is a bubble. I feel like
Bart:Another warning that AI is a bubble. Yeah.
Murilo:Yeah. Another warning. I think it's
Bart:an interesting article to read. Maybe we shouldn't go too much into this, but what they try to do is basically have a metric of how much of GDP is incorrectly spent towards this bubble. And they say this incorrect spending of GDP happens when you have a very long period of, quote unquote, too-low interest rates, so you can borrow money very, very cheaply, which we've seen over the last years. And they say that what is happening now, this incorrect spending of GDP, is 17 times larger than the dot-com bubble, and four times larger than the subprime crisis that we had
Murilo:in the, I want to say, 2008. I think so. Yeah.
Bart:And, like, aside from this article, I think we we're hearing a lot. We also had Jeff Bezos recently speaking on this. And I think everybody's a bit nervous on what will this do. Right?
Murilo:What did Jeff Bezos say, by the way?
Bart:He's he's basically saying that it's a bubble, but, like, that is So he agrees
Murilo:with that.
Bart:Exactly. Yeah. But that it is still a very valuable technology. Like, there was a dot-com bubble, but the Internet has increased, arguably, everybody's wealth through a rising economy. And the challenge now, I think that's also what you see in the bubble.
Bart:If you're not in a bubble, investors invest in good ideas. But if you're in a bubble, you're very eager to put your money in this trend, because this is the next big thing, before you know it's a bubble. And the problem then becomes that your risk appetite becomes higher.
Bart:You don't really see whether these are good ideas or not anymore, because the technology is very out there. It's still very young. Often the people that are investing in this are not as close to what it can actually do. But it is yet another thing that is branded as an AI tool, so maybe we should invest in it.
Bart:And you also see, what we've heard and discussed a lot, high valuations, typically on very big companies. Right? But you also see very small startups raising millions of dollars without even having product-market fit. Right?
Bart:That is
Murilo:Yeah. So, yeah, it will
Bart:be interesting to see what this gives.
Murilo:Yeah. What you're saying is, like, it's a mixture between hype and FOMO that creates this bubble. Right? Like, people
Bart:Yeah.
Murilo:I feel like people I mean and you hear this a lot. Like, if you're not investing in AI, you're falling behind. I've said this, like, not in investing, but, like, in using, right, AI tooling. Well, yeah. I mean, sometimes you see even businesses.
Murilo:Right? I imagine that some conversations behind closed doors are like, we need to do something with GenAI. I don't care what it is. We just need to do something. Sometimes the market signals are good as well, but it's a bit unfounded.
Murilo:Right? And I think, as you're saying as well, the dot-com analogy is a really good one, because it's also something that we've said here: there's so much promise about GenAI, like the next AGI and whatever. Right? And it falls flat, or it falls short. But at the same time, if you really take a step back and look at it for what it is, it's still very revolutionary.
Murilo:Right? You're not gonna go back to doing things the way they were before, but it's just that you promise so much that it's unattainable. It cannot get there. Right?
Bart:Yeah. To me and there probably is a bubble. The question is whether it will pop
Murilo:or not.
Bart:But the thing here is, it will still be a technology that changes a lot of things in society, for good or bad. Right? Like how people work, especially the type of office worker, yeah, or coder. This will be changing going forward despite it being a bubble.
Bart:What is also concerning is that you have these large players like OpenAI and NVIDIA basically investing in each other. There is this discussion that OpenAI is setting up a lot of contracts that will cost them a lot of money going forward, to build, for example, data centers, but that they do not have the resources to honor that commitment in the coming years, so they're forced to raise even more. There are a lot of these signs, both at the very big companies and at the small startups that raise money without having serious customers, that make people feel nervous.
Murilo:Yeah. I'm also I also heard somewhere that OpenAI actually doesn't make a lot of money because they have a lot of, like, free tier users and all these things. And right now, they're just trying to increase the market share to then catch up with the profits later. That's what I heard.
Bart:Well, they're they're they're not cash flow positive. They make a lot of money, but they they burn more money. Yeah. They burn money.
Murilo:Right? Which I think it's also I don't know. It I wouldn't be comfortable if I was running a company like that, right, to say, like because a lot of stuff happens there and, like, with the Yep.
Bart:And also there, so they need outside capital, basically. And over the last years that has always been investors; in more recent times, they're also looking more and more towards debt, so lenders. These are also small signals that maybe the capital has run out because it's becoming slightly risky, so now we need to look at banks. So it's all small signals that together make people nervous. I think what you saw after the dot-com bubble is that the actual Internet companies still did very well.
Bart:Like, for example, Amazon, Jeff Bezos. Amazon still did very well. The companies that just adopted the Internet by having a website, they still did very well. Yeah. But the ones that were just another idea that people threw some money at, they went bankrupt because there was no actual value in there.
Bart:And that's that's I think what we will
Murilo:And I think what the article is saying, it's the amount of money that people are throwing in there. It's the it's a different scale.
Bart:Yeah. It's a different scale.
Murilo:Like, it's
Bart:it's a lot. And there you could argue, maybe it is more transformative, or it's the same amount of transformative as the coming of the Internet, but in a much shorter time. Yeah. Because it's going super rapidly. Right?
Murilo:Yeah. It feels like it feels extremely quick. Yeah. Like, everything.
Bart:Yeah. So, yeah, interesting to see how this continues. Exactly. What else? We have a study that tests eight table formats for large language models and finds that Markdown key-value blocks lead at 60.77% accuracy.
Bart:CSV lands at 44.3% but uses far fewer tokens, with a thousand questions asked over a thousand synthetic records. Cheap to parse isn't always correct. Format choice changes outcomes. So this is a study that tries to understand which table formats are easiest to understand for your LLM friends. Right, Murilo?
Murilo:Yes, exactly. We don't have to spend too much time on it. I think it's also kind of straightforward, but I thought it was interesting, because sometimes, like, if you have a little to-do app, or you have a little agent or something, you have structured data. Right? And what I've always done is to just copy-paste it in, sometimes as a CSV, or sometimes I have the Markdown tables.
Murilo:Right? But they actually spent some time to see what is the best format for these models to understand. And the first one was Markdown KV, which I wasn't 100% sure what it meant, because they also have Markdown table later. Markdown KV, and I'll share it for people following the video, is basically just Markdown text.
Murilo:So I have it here. They just say record one as, like, a subtitle, and then they have a block with key and a number, key and a number, and then they say record two and
Bart:then they're like. Does that render to a table? No, right?
Murilo:No, it doesn't. It's just like a text file, almost. Okay. Okay. Yeah.
Murilo:So that's just it's just this. There is a markdown table, which this renders to
Bart:a table. Why they call it Markdown KV, I've actually never heard of it, but it's something basically like JSON Lines. So, like, instead of every line, I don't know, here you have a row over multiple lines, but you have a block of text per row, basically.
Murilo:Exactly. Exactly. It was really like, you have a pound sign, right, and then there's the title, and then double pound, that's the subtitle, and then you have a block with literally, like, ID colon one, name colon Charlie A0, age colon, and then pound pound record two, which, again, I've never seen.
Murilo:But apparently, this is the best format, which maybe is not as surprising, in the sense that the models are probably optimized for Markdown, and this is the most Markdown-report kind of thing. Markdown table is the one that I was thinking of when I think of Markdown documents, which is basically something that kind of looks like a table. Right? And a lot of the time the Markdown parsers actually render a nice table from it. And I guess what they're saying here, they explain the analysis and the confidence intervals and all these things, but actually Markdown KV is the best.
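To make the format concrete, here is a small Python sketch that renders rows in the shape Murilo describes: a title line, a "Record N" subheading per row, then one `key: value` line per field. The function name `to_markdown_kv` and the sample fields are assumptions for illustration, and the exact layout used in the study may differ in details.

```python
def to_markdown_kv(records, title="Records"):
    """Render a list of dicts as Markdown key-value blocks, one block per record."""
    lines = [f"# {title}", ""]
    for index, record in enumerate(records, start=1):
        lines.append(f"## Record {index}")
        # One "key: value" line per field, in the dict's own order.
        lines.extend(f"{key}: {value}" for key, value in record.items())
        lines.append("")  # blank line between records
    return "\n".join(lines).rstrip() + "\n"


# Hypothetical sample rows, only to show the shape of the output.
sample = [
    {"id": 1, "name": "Charlie", "age": 56},
    {"id": 2, "name": "Diana", "age": 31},
]
print(to_markdown_kv(sample, title="Employees"))
```

Each row costs several lines and repeats every key, which is consistent with the point made in the episode that the compact formats like CSV use far fewer tokens.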
Murilo:So if you ask questions about your data, Markdown KV is apparently the best format for you, by like a 4%, 5% margin. And then everything else kind of starts at 56 and drops down from there: XML, INI, YAML, HTML, JSON, Markdown, natural language, JSONL, and CSV. I was also surprised that CSV is so low. I have done some things with CSV, like just dumping the CSV into ChatGPT and asking, what is the aggregation? Stuff that you could do with SQL and DuckDB on a CSV, but you're just asking in natural language.
Murilo:Right? But actually it was pretty bad. So maybe the highlights here, just to bring it home, let's say. Format seems important. So even though it was the same data, and the model they used was 4.1 nano, I think. So it wasn't a super powerful model, but still
Murilo:So it wasn't a super powerful model, but still For
Bart:all nano mini, I think, even.
Murilo:Mini, maybe? Yeah. Let me see. Here, it's a 4.1, 4.1, no.
Murilo:Model: GPT-4.1 nano. Oh, yeah. So having a bit of a think about what format you use would make a difference. And where is it? CSV performed poorly.
Murilo:Markdown KV, again, of which I have no idea. JSONL was also pretty bad. So just having a bit of I guess, for me, the takeaway was having a bit of a think. They're not all the same. Maybe think a bit of this, think of that.
Murilo:Maybe depending on your application, if you need a a high accuracy because even Markdown KV, which was the best, it was like 60%. Maybe you need to have something else like an actual tool that can translate natural language to SQL queries for all these things.
Bart:I'm I'm also a little bit skeptical on these things because tool use has become so good these days. Like, a decent LLM will probably just drop this in a Python code and execute that and then do a filter on that. Right? Like, in a data table, some kind of format.
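Bart's alternative, having the model write and run code over the data instead of reading the table out of the prompt, looks roughly like the sketch below. It uses only the Python standard library; the sample data, the column names, and the `average_age` helper are made up for illustration.

```python
import csv
import io

# Hypothetical sample data standing in for a file the user dropped into the chat.
CSV_TEXT = """id,name,age,city
1,Charlie,56,Ghent
2,Diana,31,Leuven
3,Eve,44,Ghent
"""


def average_age(csv_text, city):
    """Filter rows by city and average the age column; None if nothing matches."""
    rows = csv.DictReader(io.StringIO(csv_text))
    ages = [int(row["age"]) for row in rows if row["city"] == city]
    return sum(ages) / len(ages) if ages else None


print(average_age(CSV_TEXT, "Ghent"))
```

Here the answer is computed rather than read off by the model, so the input format matters much less: the model only has to get the filter and the aggregation right.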
Murilo:Yeah, indeed. After seeing this and the numbers, like 60% is not very high. Yeah. So I actually agree with you. But at first, if I hadn't seen this, I would have thought that dumping all the data there would probably give you very good results actually, but that's not the case.
Murilo:So, yeah. Okay.
Bart:Our last because we're a little bit pressed for time. Right?
Murilo:We'll do our last. We have: Snowflake introduces a managed Model Context Protocol server so agents can query governed data without teams running their own infra. It supports OAuth and role-based access and can expose Cortex Analyst and Cortex Search as tools. Net effect: simpler, standardized data access for enterprise agents. So what is this about, Bart?
Bart:Yeah. They released this on October 1, a few days ago. And what they now basically have is their own managed Model Context Protocol server that gives access to a lot of different tools, so you can basically fetch data from your warehouse. You can also, through those tools, use Cortex, which is their AI analyst slash search, and it basically exposes your Snowflake warehouse to whatever MCP client you use. Like, you could use it directly in, let's say, Claude, or you can now, in OpenAI's Agent Builder, fetch data from your warehouse, put stuff in your warehouse, etcetera.
Bart:Yeah, it's good to see that. I think this is yet another signal that MCP is, at least for the foreseeable future, here to stay.
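For a sense of what "exposing tools to an MCP client" means on the wire: MCP messages are JSON-RPC 2.0, and a client invokes a tool with the `tools/call` method. The sketch below only builds such a request in Python. The tool name `revenue_analyst` and its `question` argument are hypothetical; the real tool names and argument schemas come from whatever the server lists, and nothing here is taken from Snowflake's documentation.

```python
import json


def build_tool_call(request_id, tool_name, arguments):
    """Build a JSON-RPC 2.0 request for MCP's tools/call method."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Hypothetical tool name and argument, for illustration only.
request = build_tool_call(
    1, "revenue_analyst", {"question": "What was revenue by region last quarter?"}
)
payload = json.dumps(request)
print(payload)
```

In practice an MCP client library handles the transport, authentication, and tool discovery. The point of the sketch is that any client speaking this protocol can reach the same governed tools.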
Murilo:Yeah, I think so. I think so. And I think it's really like I would. I think it's a logical step as well. I do think we'll have more of these MCP as a service kind of thing.
Murilo:I think it makes sense. Like, it's another way to interact with your with your data. Right? It's Snowflake managed, so I guess the security and all these things are are there as well. And I think it opens up a lot of possibilities.
Murilo:Right? Yep. Interesting. I wonder how the performance goes though. Like, if you Yeah.
Murilo:No. True. Because I'm thinking, like, even for, like, model deployment for feature stores. Right? If you need to create inference real time.
Murilo:Sometimes it takes time just to query the data and to return. Right?
Bart:Yeah. That's like, that inference time that you need to to do tool usage, like, it can can also feel like it's a bit grating. Right? Like, it's too it's too slow to to respond on that. And especially when when it's not immediately the right answer.
Murilo:Exactly. Exactly.
Bart:It can take way longer than just having your, like, your trusted dashboard that you just click refresh on.
Murilo:Indeed. I know that Snowflake also had a bit of a feature store story as well, which I guess would be more like a cache layer on top of the data. So kind of like a Redis thing, you know, so you have really fast queries. But I don't know too much about it. But I also think that if you solve the inference time problem and you have these MCP servers, I think it's a very good story. Yeah.
Murilo:I feel like, again, like you said, with AgentKit, now business people can create their drag-and-drop agents and connect them with data from Snowflake, and it's like, poof. You have a lot of stuff there. So, of course, there are more questions, like on the access controls and all these things, but it feels like a step in the right direction. And I expect more providers to also start rolling these things out. Cool.
Murilo:I think those are all the topics for today. We have more news to share, but I think maybe we'll
Bart:You make nice like a big thing, but it's not a big thing. I actually switched from, for one of, my adventures Okay. Adventures Yes. From Gmail to Outlook. You you switched from Gmail to Outlook.
Murilo:So now your I
Bart:have a I a
Murilo:venture is Outlook based.
Bart:I have some insights that I did not think that I would have.
Murilo:K. Anything you wanna spoil?
Bart:I feel I've been always been too opinionated on this. But let's let's discuss this next time because you need to go to the
Murilo:I need to, treat myself. Let's just say that.
Bart:Okay. Oh, wow. Sounds more okay. Yeah. Yeah.
Bart:But maybe
Murilo:you got me a bit curious, but then, like, you think before you were too opinionated about email providers, and now you feel like that those opinions were too strong.
Bart:I think so. Email providers slash tools slash solution providers. Maybe my Google versus Microsoft stance on this, tooling for startups, has softened.
Bart:Softened, but there are, like, pros and cons a bit based on the context.
Murilo:Okay. And I
Bart:used to be much more strongly opinionated in one direction. And maybe why did you switch? We're gonna keep it
Murilo:for next time. We'll keep it for next time. Cool. Bart, thanks a lot. Thanks everyone for listening.
Murilo:Thank you. I'll see everyone next week and potentially with a new studio. Wow. Woah. No.
Murilo:Maybe. See you next week. Ciao.
Bart:Bye bye.