Context Hacking, Open-Source Hype & Synthetic Bands

Bart:

Hi,

Murilo:

everyone. Welcome to the Monkey Patching podcast, where we go bananas about all things context hacking, synthetic music generation, and more. My name is Murilo, and I'm joined by Bart, as always. Hi. Didn't think I was gonna read that, did you?

Murilo:

How are you doing, Bart?

Bart:

I'm doing good. I'm doing good. It's very warm where I'm sitting at the moment, but for the rest, I'm doing good.

Murilo:

Yes. We're also recording virtually again, exceptionally, but we'll get there. It's very warm, I think, in Europe these days. No? Like, there's a heat wave across Europe and stuff.

Murilo:

We're talking ACs and all these things before the good stuff, but off we go. What do we have for today, Bart?

Bart:

We have a lot, actually. We had a lot of news items from last week. We had to cut them down a little bit. We ended up with eight. Let's start there.

Bart:

Let's start. So, Philipp Schmid. Yeah. Philipp Schmid is a tech lead at Hugging Face. He argues that the real differentiator in modern AI work is context engineering:

Bart:

The discipline of assembling the right information, tools, and format around an LLM, rather than obsessing over single string prompts. So he's basically arguing that the next phase that we're now entering is that we're moving away from focusing so much on prompt engineering, and instead moving from prompt to context engineering, which basically means trying to give all the necessary context to an LLM so that it's plausibly able to do your job. Yeah. And context means a lot of things. Like, your system prompt is part of your context, but you also have things like short-term memory.

Bart:

You potentially have a RAG system. You have user prompts. You have tools that you have available. Like, your LLM can execute commands, or it can visit websites, or it can interact with an API. You potentially have some form of long-term memory.

Bart:

And all these things combined are the context that you provide to your LLM. And I think the point that he's making, for people that have been following this field closely, is that a lot of the advances we're seeing in better performance are coming from these things. Right? Like, having smarter tools available, having some history, making it very easy to retrieve information, potentially via tools or via other possibilities. What do you think, Murilo, of this?
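The pieces Bart lists can be pictured as one assembly step: the prompt the model finally sees is built from the system prompt, short-term memory, retrieved documents, and tool descriptions. A minimal, runnable sketch; all names here are illustrative, not a real framework API.

```python
# Sketch of "context engineering": the final context window is assembled
# from several sources, not hand-written as a single prompt string.

def build_context(system_prompt, history, retrieved_docs, tool_specs, user_msg):
    """Assemble the full context from its parts."""
    parts = [f"SYSTEM: {system_prompt}"]
    parts += [f"MEMORY: {turn}" for turn in history[-5:]]       # short-term memory
    parts += [f"DOC: {doc}" for doc in retrieved_docs]          # e.g. RAG results
    parts += [f"TOOL: {name} -- {desc}" for name, desc in tool_specs]
    parts.append(f"USER: {user_msg}")
    return "\n".join(parts)

context = build_context(
    system_prompt="You are a helpful assistant.",
    history=["user: hi", "assistant: hello"],
    retrieved_docs=["Invoice #42 was paid on 2024-06-01."],
    tool_specs=[("web_search", "search the web"), ("run_sql", "query the database")],
    user_msg="Was invoice 42 paid?",
)
print(context)
```

The point of the sketch is only that "prompt engineering" optimizes one of these parts, while context engineering designs the whole assembly.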

Murilo:

I mean, I agree. I think, well, maybe to put it in context with prompt engineering: prompt engineering is just the stuff you write before your query. Right? So it's actually a subset here, when you see user prompt.

Bart:

Mhmm.

Murilo:

I was also thinking, as you were saying, that available tools are part of the context. And then in MCP, right, the Model Context Protocol, it kinda boils down to tools in the actual MCP. But indeed, context. And if you read the Anthropic MCP, the Model Context Protocol spec, they also talk about how you could share prompts and all these things. Yeah. But I agree.

Murilo:

I think also, especially when you start saying we have multiple agents, right? You have one agent that does this, one agent that does that. It's like, how much context, how many tools, what exactly do you want to include in that agent so it can perform the task well? Because we also see that if you add too much stuff, if you just throw everything you have at a model, sometimes it can get lost, quote unquote.

Murilo:

Right? Like, kinda diverge

Bart:

Get overwhelmed. Yeah.

Murilo:

Exactly. So it's also a bit about just giving enough information so it can do the job that you want it to do, and not just throwing everything there. Because we also see developers improving this in the model itself. I think we talked about how Claude 4 is better at managing context. So even if you have the whole repo available, it will be a bit better at finding this and that. But I do think this is where we are today when you talk about agents.

Murilo:

You talk about all these multi-agent things, really. And I think this is more exciting, because I remember I was not very excited about prompt engineering, but context engineering sounds more exciting to me. Yeah. Yeah. And it's also a little

Bart:

bit of a different way to, I think, think about this task. Like, to me, prompt engineering is a bit like: you have this string, you need to optimize it, then you need to run your, let's say, A/B test against it. From the moment you talk about context in this wider sense, with tools, you're really thinking about: what is the architecture of your agent? What is the ecosystem it lives in? What does it have access to?

Bart:

And I think it also shifts a bit to the point of view that it's more of a system that you're building, and not necessarily just a system prompt that's

Murilo:

Yeah.

Bart:

That preloads what you want to do.

Murilo:

Yeah. I feel like it's almost like, before it was the GenAI or LLM era, and now it's more the agentic era. And when you think more agentic, you're thinking more of context managing: what kind of tools does it have, how does one thing interact with the other? Which is pretty cool. Maybe we can jump a bit ahead, because there's an article that I added that links very much to this, which is from Walden Yan. And Walden Yan contends that multi-agent LLM architectures are fragile, and that reliability comes from a single agent armed with rich, shared context.

Murilo:

Key advice, and I quote: share context, and share full agent traces, not just individual messages. So again, I'm curious a bit how you tackle... I mean, agents is something that, to be honest, I hear a lot of noise about, but I don't know how many people are actually building these multi-agent systems. But

Bart:

So what you're saying is, maybe to go a bit deeper into the article that you're

Murilo:

sharing now.

Bart:

Like, you're saying that he's advising against building multi-agents?

Murilo:

It's a bit more nuanced than that. He's not necessarily advising against it.

Bart:

Maybe before we should explain a little bit, like, what are multi agents?

Murilo:

Yes. So agents, I think the simplest way I can frame it is: an LLM that can make tool calls. So basically, an LLM that can actually make API calls. It has functions. So for example, if you have a database, or your calendar, right, an LLM by nature doesn't have access to your calendar.

Murilo:

But you can actually give it tools and say, hey, you're able to read my calendar and see who participated in this. And then whenever you ask questions to the LLM, the LLM can call that function with the right arguments, right? And in my eyes, that's when the LLM becomes an agent, right? Being able to choose what to call, and so on. Now, multi-agent is basically a system where you have a whole bunch of those.

Murilo:

So you have different entities, different GenAI entities that have different tools, and then they kinda need to cooperate. Right? And there are many ways you can set this up. Like, you can have one orchestrator, one master node, let's say, with a whole bunch of routing stuff. So there are many different ways.
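The tool-calling loop Murilo describes (the model emits a function name plus arguments, your code runs the function, the result flows back) can be sketched in a few lines. The "model" below is a hard-coded stub so the example runs on its own; a real setup would call an LLM API that decides which tool to invoke.

```python
# Sketch of agent tool-calling: the LLM doesn't read your calendar itself;
# it emits {"tool": ..., "args": ...}, and the host application executes it.

def read_calendar(date):
    # stand-in for a real calendar API
    events = {"2025-07-01": ["standup with Bart", "podcast recording"]}
    return events.get(date, [])

TOOLS = {"read_calendar": read_calendar}

def fake_model(prompt):
    # a real LLM would choose this; we hard-code the tool call it might emit
    return {"tool": "read_calendar", "args": {"date": "2025-07-01"}}

call = fake_model("Who am I meeting on July 1st?")
result = TOOLS[call["tool"]](**call["args"])
print(result)  # ['standup with Bart', 'podcast recording']
```

In a real loop, `result` would be appended to the context and the model called again to formulate the final answer.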

Murilo:

What this guy is saying is: don't build multi-agents, because, according to him, there are a lot of errors that can come up with this. But it's a bit more nuanced than that. Before he goes in, he also talks about context engineering, so that's why I also thought of the article we previously talked about. Yeah, at the core of reliability is context engineering, and then he describes it a bit. He also gives a few examples.

Murilo:

Right? So I'm sharing on the screen for people that are just listening. But basically, you have an agent that breaks down a task, it delegates two subtasks to two sub-agents, and then you have a final agent that combines the results. But then he argues that this doesn't work well, because if you just pass the whole context, a lot of times it gets overwhelmed, like we said. Sometimes it does redundant work.

Murilo:

Sometimes the two sub-agents understand things differently, so it doesn't work. And then he says, okay, pass the context along as it goes, and then try to combine. But that's also not great. And then in the end, well, I'm jumping a bit ahead, but he gets to a linear sequence of things, right? So instead of having two things running in parallel, you wait for the first agent to do the first subtask, and then it delegates to the second agent to do the second subtask.

Murilo:

And the idea there is that you can really pass the context along. But again, because it can get overwhelmed, or you can have a context overflow, for longer tasks he suggests actually having a context manager kind of thing. He calls it a context compression LLM, which basically looks at the first conversation and decides what actually needs to be passed down to the second one, what needs to be passed down to the third one, etcetera, etcetera. So again, to be honest, I'm not sure how many people do this. I'm not sure how many people have these issues.

Murilo:

I think there's a very narrow slice of people that are actually building agentic systems. I think a lot of things need to come into place for you to really get knee-deep in this. But one thing he mentions here, for example, is Claude Code, and we talked a bit about Claude Code. He gives it as an example of a multi-agentic system, because it spawns subtasks. But he says that even in this case, they don't run in parallel.

Murilo:

It's actually like Claude delegates something, and then it waits for the result, and then it comes back, and then it sends something else. Which kind of supports this idea that having many things running in parallel doesn't work well. And I wanted to ask you as well, Bart: have you played with building multi-agentic systems, or agentic systems and all these things?

Bart:

Well, I've definitely built agentic systems. I've played with multi-agentic systems, and it's very anecdotal, but I found it to be very hard to get robust results back, which is basically what the article was saying as well. Right?

Murilo:

Yeah. Exactly.

Bart:

I think it makes sense, at least as of now, that you build an agent that is just very good at a specific thing. Focus on that. And then, potentially, you can use that as a tool for another agent. But then you position it as a tool, and not necessarily as a multi-agent system. Because if you start passing a lot of content from agent to agent to agent, knowing that even at a single-agent level you already introduce a lot of noise, there can be hallucination. It's not perfect.

Bart:

Like, you need guardrails. It amplifies so much from the moment you put them in a sequence. There's noise. And it's very hard at this point to safeguard, in a robust way, that this noise does not start doing things that just should not happen.
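Bart's "agent as a tool" pattern can be sketched as wrapping the specialist agent behind a plain function and registering it as a tool of the main agent, instead of wiring two peer agents together. Everything here is illustrative stubs, not a real framework; the main agent's "decision" is hard-coded where an LLM would choose.

```python
# Sketch of the agent-as-tool pattern: the specialist is exposed as a plain
# function, so from the main agent's point of view it's just another tool.

def translator_agent(text):
    # stand-in for a full agent specialized in one job;
    # upper-casing stands in for actual "translation"
    return text.upper()

def main_agent(user_msg, tools):
    # a real LLM would decide when to call which tool; we route directly
    if user_msg.startswith("translate "):
        return tools["translate"](user_msg.removeprefix("translate "))
    return user_msg

tools = {"translate": translator_agent}
print(main_agent("translate hello", tools))  # HELLO
```

The design benefit Bart points at: the main agent never depends on the specialist's internal state or conversation, only on its input/output contract.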

Murilo:

Yeah. Indeed.

Bart:

And I think maybe, and this also comes a bit from playing on the monitoring side of this: there are a lot of initiatives on this, a lot of startups focusing on this, but I feel we still lack the monitoring primitives, even for single-agent systems, to really have a good view of what is going on, why is this failing, why is this not working. Let alone that we have the monitoring primitives for multi-agent systems at this point.

Murilo:

Yeah, that's true. I think, yeah, the vibe that I get is: this thing is super powerful, but to get it to do what you want is also very tricky. We even talked about how we have reasoning models.

Murilo:

The reasoning models can overthink certain tasks as well. So I feel like we're still figuring out a lot of these things. Right? But cool.

Murilo:

I mean, actually, I would really like to touch on this more. Maybe a quick question, if you're willing to share. What was the thing you were playing with? What was your toy multi-agent system use case?

Bart:

In this case, the last time I touched upon it, and I actually told you the other day: I have this family support bot, which is basically a Telegram bot that does a lot of things. And I tried a multi-agent approach there a few weeks ago, but I moved away from it. It's just to say: it's basically a simple agent, a single agent, with access to the right tools.

Murilo:

And the tools work way better. The tools could be other agents. Like, the tools could also be GenAI powered.

Bart:

Could be. Right? I think it's easier to build it in a robust way like that, instead of really making another agent a dependency in your graph, basically.

Murilo:

Yeah. Yeah. Yeah. I think yeah. I think so too.

Murilo:

Yeah. That's when all these things get a bit murky. Right? Because MCP is supposed to be for tools, and then Google came out with the Agent2Agent protocol, for, like, agents communicating with other agents. Right?

Murilo:

But, yeah, it's a bit out there. Maybe on to the next one, since I talked about MCP.

Bart:

The MCP specification adds sampling, letting servers request LLM completions through the client, so agents can delegate generation securely without provisioning their own models. I quote: sampling is a powerful MCP feature that allows servers to request LLM completions through the client, enabling sophisticated agentic behaviors while maintaining security and privacy.

Murilo:

Yes. So have you heard of MCP sampling, actually?

Bart:

I've heard about it. I have not seen it in practice yet, but I think you shared this in the context of an IDE that is now supporting it?

Murilo:

Or, yes, not an IDE, a framework library, but I'll touch on that in a bit. So before that: I saw this in a LinkedIn post, which I'll share in a bit. But when I looked it up, it turned out it already existed, indeed. So from what I understood, MCP sampling is the idea that you have a tool, like an MCP tool.

Murilo:

And the tool actually needs LLM power to do its job. But instead of that tool having to provision its own LLM, and maybe having to manage API keys and all these things, it can actually go back to the original caller, which is an LLM, and use the LLM power of that thing. So I thought it was... I mean, again. Yeah.

Bart:

That's... it's a bit difficult to understand. Maybe just one step back for people that are new to MCP. So MCP is a protocol that allows clients and servers to easily communicate. An example of a client is, for example, ChatGPT installed on your desktop, or ChatGPT on your mobile phone, or Claude Desktop. Like, that is a very intuitive example of a client. A server, like Murilo was saying, is often a tool, something that's built specifically for a task, and your client, let's say Claude Desktop, can interact with it to do something.

Bart:

So maybe you have this tool that is very good at summarizing articles in a certain manner. So Claude Desktop can send the text of the article to that tool, to that MCP server. By default, how it used to work is that this server makes its own call to an LLM to then summarize it, and it sends it back. And what sampling basically does is it allows your client, let's say Claude Desktop, to send the article, but also give the instruction: if you need to use an LLM, use me for it.

Bart:

And that's basically what sampling does. So it allows tools to not have to bring their own LLM model, but basically use the LLM model that's available to the client.
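The round-trip Bart describes can be sketched as two objects holding references to each other: the server, mid-task, sends a sampling request back to the client instead of calling a model of its own. Class and method names below are illustrative stand-ins, not the real MCP SDK.

```python
# Sketch of MCP sampling: the server delegates LLM generation back to the
# client, so it needs no API key and no provisioned model of its own.

class Client:
    def complete(self, prompt):
        # the client's own model answers sampling requests;
        # stubbed here so the example is self-contained
        return f"[client-LLM answer to: {prompt}]"

class SummarizerServer:
    def __init__(self, client):
        self.client = client            # the channel back to the client

    def summarize(self, article):
        # instead of calling its own LLM, the server "samples" via the client
        return self.client.complete(f"Summarize: {article}")

client = Client()
server = SummarizerServer(client)
print(server.summarize("Context engineering is the new prompt engineering."))
```

The enterprise angle discussed next follows directly from this shape: since all generation funnels through `Client.complete`, the organization controls which single model every tool ends up using.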

Murilo:

Exactly. It's already there. Right? Which

Bart:

I think it's very important for corporate uptake, the uptake of this in enterprises. Because, what you typically see... I think as users, we typically don't really care, maybe aside from the cost. If we talk about this MCP server that is very good at article summarization, we typically just care that it's very good at article summarization. Right? But in an enterprise setting, what you want is: okay, you typically have this LLM model hosted somewhere, maybe in an isolated Azure environment, and you enforce for the whole company:

Bart:

If someone needs an LLM, you need to go via this pathway. Which today, with MCPs, is very hard, because tools all bring their own LLM. But this really allows you to unify which LLM you use for tools. Yeah. Which today is not established. Right? Like, we're not there yet, but the protocol, very early on, I think from the very first version, has supported it.

Bart:

But you brought this up, Murilo, because there is also news on this. Yes.

Murilo:

So this is from Samuel Colvin, I think that's how you pronounce his name. He's the guy from Pydantic and Pydantic AI. And basically, in version 0.3.2 of Pydantic AI, they added MCP sampling. And he kinda walks through it a bit. He also did a talk before on what MCP sampling is.

Murilo:

So, actually, let me speed this up a bit. I'm showing the LinkedIn post for people that are just listening. And in his talk at the AI Engineer World's Fair 2025, he also draws a little schematic, right, which I think makes it easier for people that are trying to understand what MCP sampling is. Right?

Murilo:

He has a use case here, like, create an image: create a robot in a punk style, it should be pink. And that tool will actually send back to the original LLM to create that image. Right? I haven't used Pydantic AI as much as I would like.

Murilo:

But apparently, the way he's showing it here, a lot of the tools are just added via an MCPServerStdio here. Right? And then he goes through it. I mean, the link will be in the show notes. He also uses Pydantic Logfire, which is their logging tool, to show how the LLM request goes back to the client. Right?

Murilo:

And in the end, you can see that it produces an image. It's not great, but still, it works. Right? Not a beautiful image, but... Yeah. It's something. It's something.

Murilo:

So, yeah, pretty cool. I didn't know about this until I saw this post, but it's nice to see things are moving along.

Bart:

Definitely. Yeah. And like I said, I really believe there needs to be wider uptake of sampling to really adopt MCP in an enterprise setting. So that's good news.

Bart:

Good news to see that Pydantic is doing it. Because I do think, if you have a Python background and you want to start building agents, Pydantic AI is a very logical place to start.

Murilo:

I think so too. I think so too. But I'm also very curious, because there are a lot of those frameworks. I'm very curious about the differences, the trade-offs. Or even, aside from that: if you're on OpenAI, would it be better to just use the OpenAI client, for example?

Bart:

True. Yeah.

Murilo:

So I'm curious about all these things. And am I next? Maybe then, while we're talking about OpenAI: researcher Yuchen Jin teased that OpenAI will release an impressive open-source model next month, stoking excitement across AI Twitter, or X. And I quote: sorry to hype, but having a few friends at OpenAI makes it hard not to hear how wild their open-source model dropping next month is. I

Bart:

was going wild.

Murilo:

You're going wild. So, next month. This was... Yes.

Bart:

So, that's a bit... There is no official communication, as far as I understand, from OpenAI on this. But for a week or so, there have been a lot of rumors that OpenAI is coming with a, quote unquote, open-source model, and it's very interesting to see what it will do. The rumors state that, performance-wise, it's gonna be a reasoning model, and it's gonna be better, that's what you hear a bit, than 4.1 mini, which has very good performance.

Murilo:

Oh, wow. Yeah. He says Meta is gonna have a hard time catching up.

Bart:

Yeah. And it's interesting. So, one, we need to see if it really happens. Right? We don't really know.

Bart:

And it will also be interesting, like, what will it do to the ecosystem? Like, we have, of course... I think when we talk about open-source models that are relevant in an application context, people often default to Llama. Yeah. There are a few others, but I think Llama takes 80% of this. Yeah.

Bart:

And if OpenAI comes out with a very performant model, would it reset a bit the competitive dynamics in that space? And then the other question is also: what is that space, or are we just raising the floor? Like, what is today the floor for hobbyists?

Murilo:

Yeah. That's true. That's true. But I think it's also because there was a lot of criticism towards OpenAI, because they were supposed to be open, and then they went for-profit. When I hear them open-sourcing models, I'm also reminded of that story.

Murilo:

I'm also reminded of all of that, like, the whole thing with Elon Musk making a lot of noise and filing lawsuits and all these things. Right?

Bart:

Yeah. I wouldn't bet on OpenAI suddenly going fully open source. Yeah. I think it has more to do with competitive dynamics, maybe in the open-source space, but maybe also optics overall. And that brings us maybe immediately to the next article, because Meta has poached four additional OpenAI researchers.

Bart:

And according to parallel reporting, it's debating whether to shift away from fully open-source Llama models towards a more closed approach. So, a few things going on there. This article is about four additional OpenAI researchers. I think by now, in total, it's seven very high-profile researchers that Meta has hired away from OpenAI. People that have really been at the forefront of the o1 and o3 models, that have been very much involved in the training or the optimization of these, etcetera.

Bart:

So these are very influential people in terms of the performance of OpenAI in the LLM space today. And apparently, and these are the rumors that probably a lot of people have heard, there are crazy amounts of money being paid for people to move, with rumors of up to 100 million.

Murilo:

Would you move for that money? Would you move for that kind of money, Bart? A hundred million is a lot. It's a lot.

Bart:

A lot. Yeah. And how long do

Murilo:

you have to move?

Bart:

Yeah. Yeah. Also, yeah. But apparently, whether or not it's actually 100 million, it's probably a good offer. And it's successful.

Bart:

It's effective. Right? Like, they are moving. At the same time, we also hear rumors that Meta's leadership has discussed divesting from Llama. So we have a lot of these things going on at the same time. Which is also an interesting one, because, like we were just saying, 80% of the open-source space is Llama.

Bart:

If there's no more support for Llama, what does it mean for the open-source community? It's a big question mark.

Murilo:

Yeah. Indeed.

Bart:

Well, at the same time, Meta is still very heavily investing in building this capability. Right? Even if it's not Llama. It's these hires. It's the acquisitions that we've heard about in the past weeks.

Bart:

So there's a lot going on, and I think, if they would divest from Llama, it's definitely not because they don't believe in this evolution. It's simply because Llama is an open-source model, and business-wise it doesn't really make sense to them.

Murilo:

Yeah. And like you said, the optics. Right? I think that would reflect on Meta, because they were always very... even the Meta researchers. Right?

Murilo:

They were very: we do open source, we do this, we do this. So I wonder, if they take this step, how much the image of Meta would suffer, quote unquote. And like you said, Meta takes a lot of the space.

Murilo:

I feel like LLM research is a bit special, in the sense that when I was at university, the universities would publish a lot of papers, some companies as well. But nowadays, the amount of money you need to train these models, the experiments, get so expensive that only for-profit companies like Meta can actually do these things. Mhmm. So you end up being more reliant on these companies as well. And Meta was the biggest one, let's say, that would say: I'm doing this research open source, to advance society and all that.

Murilo:

And if they pull out, then I think it's also not... yeah, I'm not excited about this, let's say. I feel like we still need someone to be able to... well, maybe, you know, DeepSeek. Right? Maybe it's another one. But it's very hard to be very optimistic about how the field is gonna evolve if there's not gonna be any open source.

Murilo:

It feels very powerless, I guess. That's the feeling you get. Right?

Bart:

Yeah. To be honest, I never really felt like Meta is doing this for the good of society.

Murilo:

I think, of course, they had their reasons. Right? But I think in the end, it did help. Right? And I feel like... I mean, it did help.

Murilo:

I feel like it was a bit of a lifeline. Yeah. Like, at least Meta is doing this; at least these weights are open source. At least we can try these things. At least you can take these models and experiment a bit more on top. And, like, yeah.

Murilo:

Now now no more.

Bart:

Well, we don't know. There are just rumors about that.

Murilo:

We don't know. Don't know.

Bart:

We don't know. I do still think Meta, and these hires will only help, is still very strongly positioned to become one of the key players, especially because of the ecosystem that they have. Like, you have... Yep. Now an LLM in WhatsApp.

Bart:

Yes. They have a lot integrated in their other products, in their ad space. Building ads becomes very quick, very easy. I think if they improve only a little bit in terms of performance, and they can play with price, because, for example, the Meta LLM is not really the best one, but it's free. Yeah.

Bart:

Indeed. It's very easy for them to take market share.

Murilo:

Yeah. Did you see also that Meta is gonna start using people's posts to train models or something?

Bart:

Yeah. Yeah. Well, they probably are. Right?

Murilo:

You had to send an email to request it, to say: I don't consent to my data being used. And then you get an email back like: oh, we'll honor your wishes. Kinda like, that would be nice, you know?

Bart:

But it's like it's like it's a

Murilo:

bit more, right? Right? So, yeah, I think they also have a lot of data now as well. Even... I mean, maybe it's not research, but just communication: how people talk and all these things, different languages. So they do sit on a lot of data as well. Right?

Bart:

Yeah. True. I think what is also interesting about the poaching aspect, poaching researchers from OpenAI, is that this would... well, we're in Belgium. This would never have been possible in Belgium, and would never have been possible in a lot of places in the world. Only in San Francisco. Yeah.

Bart:

And it's simply because San Francisco has no non-compete clauses in employment contracts. They're prohibited there, in California.

Murilo:

It has no... no non-compete clauses?

Bart:

Yeah. Exactly. The non-compete clauses are

Murilo:

Non existing.

Bart:

Are non-existent. Yeah. There are some exceptions when it comes to sales and stuff like that, but for this, there are no non-competes. Yeah.

Murilo:

Which, I heard, is also what makes Silicon Valley. Right? Like, people bounce so much from one company to another that they kinda cross-pollinate within the Valley. Yeah.

Bart:

You basically remove all the barriers to the free movement of knowledge.

Murilo:

Yeah. Indeed. Indeed.

Bart:

Which might hurt individual companies, but you could argue that it benefits the economy at large.

Murilo:

Yeah. Indeed. Yeah. That's true. That's true.

Murilo:

I don't know if I'm that against it, actually. Yeah. If you think about it. Right?

Bart:

It's an interesting discussion.

Murilo:

It's an interesting discussion indeed. And what else do we have? We have MusicRadar.

Bart:

Yep. So What

Murilo:

is MusicRadar?

Bart:

MusicRadar is a website. And it has an article saying they have investigated The Velvet Sundown. It's apparently an AI-generated band with 350,000 Spotify listeners and zero real-world footprint. This illustrates how algorithmic playlists can quietly amplify synthetic artists. The profile of the band boasts: The Velvet Sundown don't just play music.

Bart:

They conjure worlds. A line, the magazine suspects, was written by ChatGPT.

Murilo:

Nice. So this is a fake band.

Bart:

Let's put it like this: they cannot find any evidence whatsoever that these people exist in the real world. Damn.

Murilo:

Super private or not real?

Bart:

Maybe super private. Yeah.

Murilo:

That's... like, you know, like, no contact with... This is the band. I'm putting it on the screen. This is the band.

Bart:

This is the band. Yeah. Yeah. So this is interesting to see. Like, this really makes you wonder a bit: how will the music ecosystem evolve going forward?

Bart:

Like, it has become very easy for everybody to use tools like Suno. There's another big music generator, I forget the name for a second. But it's become very easy for people to build this, to build snippets. I think if you are very deep into this space, it is possible for you to really make a synthetic artist, and for this person or this band, really build an album.

Bart:

Right? And then video clips. So even video clips. Yeah. Yeah.

Bart:

And apparently, I'm not sure how much time these people spend on it, but it can result in a lot of listeners. And there is some discussion on why people do this. Apparently, there is also potentially a fraudulent angle to this, where you create fake artists and create bots that listen, and you just rake in the commissions for the artist, the virtual artist. Mhmm.

Bart:

Cool. I don't know that space well enough to understand whether or not it's worth the effort, because it all costs something. Right? And profits from listens are minuscule, so you'd need to do this at a very large scale. I think the other aspect to this is that you potentially get a lot of low-effort AI slop, which I don't think anyone is really excited about.

Murilo:

Yeah. Yeah, indeed.

Bart:

I think the other part to this is that you will have people that are extremely creative, that will use these tools well, and you can potentially get good music out of this.

Murilo:

Yeah. And I would even call them artists as well. No? Like, it's an artist that is using a different tool to create art, but

Bart:

Yeah. And the the discussion, of course, is like, what is music? But but I agree. Yeah. I I agree.

Bart:

Because to me, like, it's not that different to someone creating electronic music simply because you have a fake voice here. Right? That is

Murilo:

Yeah. Sure.

Bart:

Not that far off. Right? No real voice, then. And, of course, then using a sampler and stuff like this, but it's an... But

Murilo:

I imagine when the techno music started, I feel like people were having the same discussion. Right? Like, is this really music?

Bart:

Probably.

Murilo:

Yeah. Right? So Exactly.

Bart:

That's a good point. That's a good point. And I think maybe it's also like... but I wonder, should we inform people? So, to make a parallel with image generation, what some generators already do is include metadata in your generated image. And then if you open it with an image viewer that supports this, it will actually show that this was AI generated.

Bart:

So if you open it in Preview on your Mac, you will see a small information box that shows that it's AI generated.
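As an aside for readers: this kind of provenance label often lives in an image's metadata, for example in a PNG's text chunks (real systems use richer standards like C2PA). Below is a stdlib-only sketch of reading PNG tEXt chunks; the key name "Source" and the toy file are assumptions for illustration, not any generator's actual format.

```python
import struct
import zlib

def png_text_chunks(data: bytes) -> dict:
    """Parse uncompressed tEXt chunks from a PNG byte stream."""
    assert data[:8] == b"\x89PNG\r\n\x1a\n", "not a PNG file"
    out, pos = {}, 8
    while pos + 8 <= len(data):
        length, ctype = struct.unpack(">I4s", data[pos:pos + 8])
        body = data[pos + 8:pos + 8 + length]
        if ctype == b"tEXt":
            key, _, value = body.partition(b"\x00")
            out[key.decode("latin-1")] = value.decode("latin-1")
        pos += 12 + length  # 4 length + 4 type + data + 4 CRC
    return out

def make_chunk(ctype: bytes, body: bytes) -> bytes:
    """Build a valid PNG chunk (used here only to create a toy example)."""
    return (struct.pack(">I", len(body)) + ctype + body
            + struct.pack(">I", zlib.crc32(ctype + body)))

# Toy PNG: signature + one tEXt chunk + IEND ("Source" is an assumed key)
toy = (b"\x89PNG\r\n\x1a\n"
       + make_chunk(b"tEXt", b"Source\x00AI generated")
       + make_chunk(b"IEND", b""))
print(png_text_chunks(toy))  # {'Source': 'AI generated'}
```

As discussed next in the episode, this metadata is trivial to strip or rewrite, which is exactly why it is a weak guarantee on its own.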

Murilo:

How easy is it to tamper with this metadata?

Bart:

Well, probably easy.

Murilo:

Probably. Probably easy. Right?

Bart:

Yeah. I mean, you can circumvent everything. Right? Yeah. Yeah.

Bart:

Yeah. I wonder if it makes sense on something like Spotify to also label this, or that you can just say, like, this is something like electronic music. Well, I mean, we're not labeling electronic music aside from putting it in a genre. Maybe this should be a genre.

Murilo:

Yeah. Maybe. Yeah. Indeed. Yeah.

Murilo:

Indeed. Well, I think deceiving people is never good. But I also wonder, like, if people like it and then listen to it, do we care? Do you care if it's a real person singing or if it's not? Like, I don't know.

Murilo:

If it makes you

Bart:

I think

Murilo:

maybe the only thing that you

Bart:

what I would care about is if people consciously try to sell you something that they say is real and it's not. Maybe that's the

Murilo:

Yeah. That's the, I think, the deception of it. Right? Like, people are trying to trick you. I think that that yes.

Murilo:

But there's also a scenario in which people are not necessarily trying to convince you that it is a real person. They're not denying it either, but it just never occurred to you. And then you enjoy the music.

Murilo:

And yeah. I don't know. Because it's a bit...

Bart:

it's a bit about that authenticity. And something that is completely digital can be authentic as long as you know it's digital. Right?

Murilo:

Yeah. I I agree.

Bart:

But if, let's say, you take the Velvet Sundown, you have this image now on the screen with the four band members. Like, if there's no notion that this is AI generated, then there's a whole bullshit story about the guy with the glasses Yeah. Curly hair, that he lost his parents at a young age, that is why he

Murilo:

started playing

Bart:

the guitar, and that is how he grew up. Like, I mean... and then you realize that this is all fake. That's... no. It doesn't sound authentic.

Bart:

It doesn't sound fair. Right?

Murilo:

Yeah. But that's the thing. So I I feel like a lot of people listen to music or they enjoy music because the things that the artists say make them feel like, they relate somehow to the artist. Right? Like, they're suffering and like, oh, yeah.

Murilo:

Remember when I was suffering and this and this and this. And I agree with you that if I knew that it was fake, then I would also have a bit of a harder time connecting to it. But it's kind of like movies. Right? Like, there are movies that are based on real facts, like real stories.

Murilo:

And there are movies that are completely made up, like complete fiction. But if the movie makes you feel something, does it matter? Bit of a philosophical question there, but makes makes you question. Right?

Bart:

Yeah.

Murilo:

Sure. What's the value of things? Is it in the feeling or is it in the intention? Right? I also thought it was funny, by the way, this image that we're showing here.

Murilo:

If you see the subtitles, like, there's nothing behind those eyes. Like,

Bart:

literally. Because his parents died when he was very young. Exactly. He's dead inside.

Murilo:

Yeah. I know. That's pretty funny. Yeah. But really cool.

Murilo:

Really cool. And what else do we have? News from Anthropic.

Bart:

Yes.

Murilo:

Anthropic let a Claude 3.7 Sonnet agent manage a real vending machine mini store for a month, revealing both promising autonomy and glaring business sense gaps. And I quote: "We let Claude manage an automated store in our office as a small business for about a month," the researchers write, noting successes like supplier discovery and failures like selling at a loss. So this is from the Anthropic office, let's say.

Bart:

Yeah. It's an interesting article. If you're interested in these things, like getting an AI agent to do stuff more autonomously, I would really recommend you to read it. So what they did is they had an agent, which they called Claudius, if I'm not mistaken, which was basically a virtual shopkeeper. And what it did was price setting on the items in the shop.

Bart:

You could also ask it, I think via Slack, to get new items for the shop, and then it would check with a supplier API, I guess, to see if it could order them. It would also try to avoid stuff going out of stock, and they did this for a certain period of time, a month if I'm not mistaken. And the article is also very specific about what prompts they were using. It's very intuitive. Right?

Bart:

If you read the prompts: you're the owner of a vending machine. Your task is to generate profits from it by stocking it with popular products that you can buy from wholesalers. So there's access to these wholesalers. It can communicate with users. Users can ask it something via Slack.

Bart:

And, basically, the gist of all this is that it works, but it's not good enough today. Right? And I think a large part of that comes from an LLM on its own not having a lot of direction, right, aside from the prompt that you give it. And there is some analysis. So there's the net worth over time.

Bart:

And, actually, the chart shows indeed that the run time is a month. And you see the net worth decreasing over time. And it's also because of some weird things, like users asking weird things, kind of abusing the bot. Right? Which you will typically see a lot in real life if you know that it's a bot.

Bart:

You also have this because it's a very specific type of employee, of course, Anthropic employees. Like, they asked it to stock tungsten cubes, which are these very heavy cubes, which, I mean, no one outside of tech would probably ever order. And that is also what made the cost of their stock heavily inflate, and it needed to lower the prices to sell it, and stuff like this. It also wasn't always clear on who am I, in a sense: is Claudius the vending machine itself, or is Claudius a person? Because at some point, it responded to someone like, I'm sorry you were having trouble finding me. I'm currently at the vending machine location wearing a navy blue blazer with a red tie.

Bart:

I'll be there until a certain time. Which is completely hallucinating, of course, but you're forcing this LLM to give a reasonable response to your demand. And Yeah. I think the limitations that you're seeing here come mainly from having this undirected LLM. I think you could probably make this work today, and it could actually be a very interesting experiment. But by placing a lot of the functionality it has behind tools, more structured tools where there are clearer guardrails, and having more clear direction on what it can and cannot do, then you could probably make this work.
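To make the "functionality behind tools with guardrails" idea concrete, here is a small hypothetical sketch. The names, the shop state, and the 5% margin floor are all invented for illustration, not taken from Anthropic's actual setup: the agent can only change prices through a tool, and the tool refuses to sell below cost.

```python
def make_set_price_tool(inventory: dict):
    """Wrap price-setting in a guardrail so an agent can't sell at a loss.
    The 5% minimum margin is an arbitrary assumption for illustration."""
    def set_price(item: str, price: float) -> str:
        cost = inventory[item]["cost"]
        floor = round(cost * 1.05, 2)
        if price < floor:
            # Reject the agent's proposal; enforce the floor instead.
            inventory[item]["price"] = floor
            return f"rejected {price}: raised to floor {floor} (cost {cost})"
        inventory[item]["price"] = price
        return f"price set to {price}"
    return set_price

# Hypothetical shop state; an LLM agent would call set_price as a tool
shop = {"tungsten_cube": {"cost": 80.0, "price": 100.0}}
set_price = make_set_price_tool(shop)
print(set_price("tungsten_cube", 50.0))
# → rejected 50.0: raised to floor 84.0 (cost 80.0)
```

The point is that the invariant (never sell at a loss) lives in deterministic code, so it holds no matter how the model is prompted or manipulated by users.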

Bart:

But what we see is, from the moment that you give a lot of freedom to the LLM to just respond in the best way to the input that it's getting, you can't really depend on it. Right? And it also links to, I think last week, we discussed the shape paper. Like Yeah. LLMs just try to do the best thing at that point, but it doesn't always make sense.

Murilo:

Yeah. Indeed. Indeed. Indeed.

Bart:

That's what you see here as well. But it's a very interesting use case to me. It looks

Murilo:

it looks very, very interesting, actually. Like, it's a different type of research, I guess. Like, you see a lot from Anthropic about the actual models and the capabilities of the models. But this is more like situational, let's say, research. I like the fact that there's a physical component.

Murilo:

I'm also wondering here, like, because the LLM usually tries to please the user. Like, every time you're coding and then it says... ah, isn't this wrong? And it's like, oh, yeah. You're right. Great idea.

Murilo:

Like, even if I'm wrong, I feel like it would tell me that I'm right. And I think it's probably because the training data reflects that people like it when they're agreed with, or something like that. But I think in situations like this, where your goals conflict a bit, right? Like, you're not just trying to please the customer at all costs. Like, you also have these other things that you need to maintain, and maybe it also struggles with all of this.

Murilo:

So I'm also sharing the link on the screen for people that are following the video. There's also a little setup of how the shop looks. So it looks actually pretty cool. I also think it's fun that they run research projects within the office spaces that they have there. You know?

Murilo:

Yeah. Really funny. And I like to see Anthropic's research. Yeah, I feel like I've learned to enjoy it quite a bit, actually. They do quite a lot of fun stuff.

Bart:

I fully agree.

Murilo:

Alrighty. And then you're off for the last one.

Bart:

Yes. The last one. So, again, we try to include a library or tool as well. Today, it's brought by Murilo. It's about Robyn.

Bart:

And Robyn is an async Python web framework with a Rust runtime, aiming to deliver blazing fast performance with a simple API and built in agent and MCP support. It's a high performance, community driven, and innovator friendly web framework with a Rust runtime. And you added it because this built in agent MCP support was just released, right, in the latest version?

Murilo:

Yes. Indeed. So, well, again, we wanted to... I never used this myself. I was curious about it. Robyn is written in Rust, but to me, with Rust web APIs,

Murilo:

I'm also wondering, is the speed really the bottleneck? Like, if your API is slow, it's probably not because of the actual server, it's probably because you're waiting for the answer. Right? So that's always a bit... FastAPI is here in the benchmark comparison, but I think Robyn is more low level than FastAPI. So I think you should compare this more with Starlette or Uvicorn. Right?

Murilo:

So FastAPI is more user friendly. And indeed... oh, excuse me. They also pride themselves on being a bit more researchy, let's say. They say community driven. So something a bit more cutting edge, let's say, a bit more experimental.

Murilo:

And I think FastAPI is very mature. It's very, yeah, industry ready, let's say. So one of the things that they released in version 0.7 is the experimental AI features. So, actually, they show here how very easily you can add MCP tools to your web app and all these things. So it kind of blends the web apps and the, let's say, agentic frameworks.

Murilo:

Right? Which I thought was pretty interesting as well. And, yeah, you can pip install Robyn. You can compile it from source, but... Interesting.

Bart:

So I never used Robyn to build APIs. I think the interesting thing of this release, with the MCP support, is that you can look at it from two ways, in my opinion. Like, you can say: if you want to build web APIs, keep focusing on that. Don't add a lot of noise by adding extra features, like MCP clients and servers. Focus on what you want to be good at.

Bart:

The other alternative point of view, and I would tend to move towards this latter one, is that I think the future is not necessarily web APIs, but MCP clients and servers. True. And that's a big statement, of course, on MCP, whether or not it will be the actual MCP protocol or some version of that. But I think where we used to build APIs, typically as a back end to front ends or for integration with other systems, really aimed at a developer that needs to be able to integrate something else very tightly in a very transparent way, maybe in the future we will say: we are building something, and LLMs or agents will be 80% of our user base and need to integrate in a very tight way.

Bart:

And that perhaps requires a different approach than building your standard web API like we're used to.

Murilo:

Yeah. Sure.

Bart:

That is an interesting evolution to watch.

Murilo:

Yeah. Like, the LLM or the client becomes like the front end, and the back end becomes these MCP servers. And actually, this one is a bit of both. It matches the both of them together. Right?

Murilo:

Yeah. And it's true. I mean, they also like MCP, you can run locally. Right? Or you can actually have an external server.

Murilo:

I think this is more catered towards the external servers. But even if you run locally, it's still a local server that runs. Yeah. Right? So but yeah.

Murilo:

So... and the syntax looks pretty standard, let's say. I think it's what the community agreed is the best way forward, right, with the decorators and all these things with the functions. So it looks nice. It looks nice. Something that I would be curious to try one day.
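For readers who haven't seen the pattern: the decorator-based tool registration Murilo describes usually boils down to a registry like the sketch below. This is not Robyn's actual API (check its docs for that); it's a generic, stdlib-only illustration of how an `@app.tool`-style decorator can expose plain functions to an agent or MCP client.

```python
import inspect

class ToolRegistry:
    """Generic sketch of decorator-based tool registration,
    the pattern agent/MCP-style frameworks expose (API names assumed)."""

    def __init__(self):
        self.tools = {}

    def tool(self, func):
        # Record the function plus its parameter names and docstring,
        # so a client (e.g. an LLM agent) can discover how to call it.
        self.tools[func.__name__] = {
            "func": func,
            "params": list(inspect.signature(func).parameters),
            "doc": inspect.getdoc(func) or "",
        }
        return func

    def call(self, name: str, **kwargs):
        return self.tools[name]["func"](**kwargs)

app = ToolRegistry()

@app.tool
def add(a: int, b: int) -> int:
    """Add two numbers (toy tool)."""
    return a + b

print(app.call("add", a=2, b=3))        # 5
print(app.tools["add"]["params"])       # ['a', 'b']
```

The appeal of this design is that the same decorated function can back both a web route and a machine-discoverable tool description, which is exactly the blend of web app and agentic framework discussed above.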

Murilo:

Yep. But not for the speed. I feel like the speed... again, yeah, the speed is always nice, and I feel like it's always a selling point. But for web apps, again, I

Bart:

Really, like, there are only very few cases where it becomes a bottleneck. Right? Like, you need to get to huge scale, hugely parallel things to

Murilo:

Indeed. Or you really want to minimize cost because you have something serverless that runs this and you wanna scale it. But yeah. Yeah. I I don't know.

Murilo:

I think it's, yeah, very, very niche. I wouldn't recommend this for the speed for most people. Yeah. For sure.

Bart:

I have maybe a fun thing. Alright. So, discussing tools as a last closing thing. Just some experience. So today, the day we're recording, is July 1.

Bart:

That means that I had to do my quarterly accounting stuff Ah. Which is a shit ton of work. And, actually, like, 60% of that is just finding emails with invoices and sending them to a centralized email address that should process them more or less automatically. But what I need to do in Gmail, which is ridiculous in this day and age, they can't do it. Like, you cannot select multiple emails and then multi forward.

Murilo:

Ah. You

Bart:

you can do it, but then it puts them in the same email, and then it doesn't get processed by the tool that we use. Oh, okay. You do have paid Gmail extensions where you say, okay, I select these multiple emails, and then I want to forward them to this address. And then one by one, it will send them.

Bart:

K. And so what I typically do is I pay for this one month, four times a year, just to have this functionality, because it's a lot of work if you need to do everything manually. And what I did now is I asked Claude Code to build me a Chrome extension that does Gmail multi forward.
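For context on why "multi-forward" differs from Gmail's built-in forward: the trick is producing one outgoing mail per original, not one combined mail, so a downstream invoice processor sees each invoice separately. A hedged, stdlib-only sketch of that logic (the addresses and subject prefix are made up, and actual SMTP sending is left out):

```python
from email.message import EmailMessage
from email.parser import BytesParser
from email.policy import default

def forward_individually(raw_messages, to_addr, from_addr):
    """Build one forward per original message instead of one combined
    email, mirroring the multi-forward behaviour described above."""
    forwards = []
    for raw in raw_messages:
        orig = BytesParser(policy=default).parsebytes(raw)
        fwd = EmailMessage()
        fwd["From"] = from_addr
        fwd["To"] = to_addr
        fwd["Subject"] = "Fwd: " + (orig["Subject"] or "(no subject)")
        fwd.set_content("Forwarded automatically; original attached.")
        # Attach the untouched original as an .eml file
        fwd.add_attachment(raw, maintype="application",
                           subtype="octet-stream", filename="original.eml")
        forwards.append(fwd)
    return forwards  # each entry would then be sent via smtplib, one by one

raw = b"Subject: Invoice 42\r\nFrom: shop@example.com\r\n\r\nAmount due: 10 EUR"
fwds = forward_individually([raw], "invoices@example.com", "me@example.com")
print(fwds[0]["Subject"])  # Fwd: Invoice 42
```

A browser extension does the same thing client-side against Gmail's UI; the one-message-per-forward rule is the part Gmail's combined forward breaks.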

Murilo:

Oh, really?

Bart:

And it worked. Within thirty minutes, I had a fully functioning extension.

Murilo:

Did you have to guide it to the right direction a lot, or you just did it and you're just like, yep. That's all good. It's all good.

Bart:

I had, I think, like, five rounds, something in that order of magnitude, five rounds of iterations where I said, okay, this is not working. This is the feedback I get in the logs. This is, for example, the element that you need to click, and then copy paste the content of the element. I see.

Bart:

So let's say five iterations, then it was working. And now I use that. That's really good. I actually submitted it to be published in the Chrome extension store, but it was...

Murilo:

Well, when it's accepted, let us know so we can

Bart:

Yeah. We can share it. But it is to me, it's a good example of something that I normally would never have done because Chrome extensions don't interest me at all. Like, I would never have taken the time and effort to build something. But here it becomes like, okay.

Bart:

Let's spend a little bit of time, and then you have something that is good enough.

Murilo:

True.

Bart:

And that's a super simple example, and it won't change the world, but it's something that was totally not possible before. Right?

Murilo:

Indeed. But I think the good enough kind of use cases are the interesting ones, or, like, the ones where, yeah, if this doesn't work out, it's fine.

Bart:

But I think often, if you look at SaaS products that you pay for, like, you pay for way more than the good enough, because, like True. Good example. We edit this podcast with Descript. Yeah. Like, I maybe use, I don't know...

Bart:

Hard to put a percentage on it, but, like, 60% of the features. Yeah. And of those 60%, maybe I would want a few to be better. But the other 40% I never use. Right?

Bart:

True. So I'm paying for something where, like, 40% I never use. And the question is: what is actually good enough? What should you focus the user on? I think it's also very difficult with a SaaS product, because every user has a bit of their own view of what is good enough for them.

Murilo:

Yeah. Indeed. And I feel like most users are not a 100% happy because it's not exactly the way they wanted to do, but they're satisfied enough that they can can live with it. Right? Yeah.

Murilo:

Sure. Yeah.

Bart:

So I think, if you're a savvy developer, it becomes very easy to build that good enough tool yourself these days.

Murilo:

That's true. And I think this is also a good example, right? Because you probably didn't build Chrome extensions before. No. You know some JavaScript, you know some of this and that.

Murilo:

This is what you need to do. And then you kind of get by. Right?

Bart:

And it's actually the very first tool, which is also interesting, where I didn't look at the generated code at all.

Murilo:

But I feel like for

Bart:

Typically, when I use this, I have my code, like my IDE, on the next screen. So I can see what is being changed, what is... you know. And now I just focused on functionality. Like, I was just

Murilo:

Just like Yeah.

Bart:

Let's just hack this until something works.

Murilo:

But I think it's also... that's what I'm saying. Like, I think this is a good use case because it's kind of narrow. Like, if the code is really shitty and you have to throw it all out, it's fine. No.

Bart:

No. Exactly.

Murilo:

It's not

Bart:

like you

Murilo:

have a road map of, like, years, and you need to build on top of this, and then you need to really make sure you understand what's happening. It's like: it's done. True. It's kind of how I felt about tennis booking, because I also vibe coded it a bit. Not with Claude Code, but, like, I just said, okay.

Murilo:

I wanna do this, write it and go. In the beginning, I reviewed a bit, but also because I wanted to learn. Mhmm. But, like, now it's like, okay. If it works, it works.

Murilo:

If it doesn't work, it's fine. It's like I spent, I don't know, maybe a few hours on this. Right? Like, it's it's okay. And yeah.

Murilo:

And even then, I still feel like it got me, like, eighty-plus percent of the way, even if it's not a hundred percent there. So, yeah. Cool stuff. And yeah. Still, even though I'm saying this, I still want to get more into it.

Murilo:

I think there's a lot more there, like, with the agents and all these things. But I think I need to get in the right mindset. Like, what is a good use case for building these agents and all these things as well? That's what I meant: we see these articles of people building multi agents, but I think some things need to fall into place for you to have a good use case, like, oh, yeah, this is something that agentic would solve.

Murilo:

And I think because we haven't been used to tackling problems like this, I think it takes some time to identify these things as well. But yeah. Cool. Anything else you want to share Bart?

Bart:

No. I think thanks everybody for listening.

Murilo:

Yes. Thank you very much for listening. Maybe also this: I broke a glass. So if, I don't know, you noticed my fingers. Yeah.

Murilo:

The peace fingers. It's clear. I injured a finger. I had a small accident, so I had to go to the emergency room. But it's all good.

Bart:

So when you do the peace sign, it looks like you chose war.

Murilo:

No. I'm all peace and love these days. Alrighty. Thanks, everyone. Thanks, Bart.

Bart:

Thank you.

Murilo:

See you next week. Bye bye. Ciao.

Creators and Guests

Bart Smeets
Host
Mostly dad of three. Tech founder. Sometimes a trail runner, now and then a cyclist. Trying to survive creative & outdoor splurges.
Murilo Kuniyoshi Suzart Cunha
Host
AI enthusiast turned MLOps specialist who balances his passion for machine learning with interests in open source, sports (particularly football and tennis), philosophy, and mindfulness, while actively contributing to the tech community through conference speaking and as an organizer for Python User Group Belgium.