AI Agents, Brain Fog & Caveman Coding

Murilo:

Hey, everyone. Welcome to the Monkey Patching podcast, where we go bananas about all things shade, repairs, and more. My name is Murilo. I'll be hosting you today, joined as always by Bart. Hi.

Murilo:

Hey, everyone. How are you doing, Bart?

Bart:

I'm doing good. Doing good. I'm doing good.

Murilo:

Warm, warm week in Belgium?

Bart:

It's it's a warm week. Yeah. But the feeling is better than last week when we were recording.

Murilo:

Yeah, last week it was

Bart:

Very humid as well.

Murilo:

It was very humid. It was very humid.

Bart:

I was sweating last week.

Murilo:

I was nervous, I was nervous.

Bart:

Maybe, yeah.

Murilo:

It was episode two, the first one with content. Yeah. I got feedback actually from that one. They said they liked it. So

Bart:

Okay. Yeah. Unbiased source?

Murilo:

Sure. But let's see. What do we have? So this is episode three? This is episode three.

Murilo:

Yeah. Episode three.

Bart:

If

Murilo:

you're one-indexed, but I won't rehash this discussion. But what do we have for this week? First, a question: you're a heavy ChatGPT user.

Bart:

I'm both a Claude and ChatGPT user.

Murilo:

And you use daily, let's say?

Bart:

I use it more or less daily.

Murilo:

More or less daily. Do your kids use it?

Bart:

Sometimes. Less daily. Way less.

Murilo:

Do you ever so we use it because it makes life simpler, makes it easier, convenient. Right?

Bart:

Yeah. I think that's the gain.

Murilo:

But sometimes hardships are also good for development.

Bart:

I have the feeling you're trying to make a point here.

Murilo:

I'm trying to make a point here. But do your kids, like, do you worry about your kids using ChatGPT?

Bart:

Well, not at this point. They don't use it.

Murilo:

They don't use it. No. Okay. It's more like for funsies. Yeah.

Murilo:

You know, actually, I was talking to a friend, like, a long time ago. He lives in Belgium as well, and his kid is going to school here. Mhmm. And then his kid's homework, like, it was in Dutch. And he immediately called his son, like, why are you using ChatGPT?

Murilo:

Do your homework. You know? And he's like, I don't speak Dutch. So

Bart:

he was like,

Murilo:

oh, it's not even right, dude. Come on. Like, it was also a long time ago. But his kids are definitely using it, and I think it kinda raises the question of using ChatGPT: what is he supposed to learn?

Murilo:

All these things. Why are we bringing this up? I came across this paper that I will gently put on the screen. It's called Your Brain on ChatGPT: Accumulation of Cognitive Debt when Using an AI Assistant for Essay Writing Task. Let me just put that on the screen real quick.

Murilo:

Yes. So what is the paper about? It says that relying on ChatGPT for essay writing measurably dulls neural connectivity and degrades writing quality compared to using search or writing unaided. EEG data showed the weakest brain engagement in the LLM group, and the authors warn that, and I quote, while LLMs offer immediate convenience, our findings highlight potential cognitive costs. The paper is actually from MIT.

Murilo:

So it's not very surprising, to be honest. They had different control groups, all of which had to write essays. One group didn't have anything; they just had to think hard and come up with stuff. One group had access to Google, search engines, stuff like that.

Murilo:

And one had access to ChatGPT. Mhmm. Again, not so surprising. They measured the brain activity, so they had all the brain sensors and all this stuff, and actually noticed that the people who didn't have the tools, or had fewer tools,

Murilo:

They had to think harder and all these things. Something that was a bit interesting is that they actually switched the groups: the ones with no tools went to ChatGPT, and the ChatGPT ones to no tools. And they saw that the people who switched from ChatGPT to no tools almost had a hard time getting going. Not that they got lazy, right? But in the measured brain activity, they weren't as engaged as the group that was already writing stuff without tools from the start.

Murilo:

And, again, I glossed a bit over the paper, it's a long paper, but also interesting: they had a four-month period in which they measured these things, and they could see that the brain activity effects were long term, let's say. And I think it kind of raises the question in regards to learning. Another thing: they interviewed the people, and the people who used ChatGPT felt less ownership of their writing, their essays basically. And I guess that makes sense as well.

Murilo:

Like, if ChatGPT helps you write something, it feels less like you did it and more like the AI wrote it. And they couldn't even quote the stuff that they turned in. It's almost like they forgot. Not that they forgot, but because they didn't write it, they couldn't even say, this I wrote or this I didn't write.

Bart:

Hard to remember.

Murilo:

Exactly. Right. And I guess it makes sense, because you rely on an AI tool; you can rely on an external source for this. So reading this, for me at least, it was like, okay, this all makes sense, nothing surprising. It's more a confirmation from neurological data that this is the case.

Murilo:

But then they also start questioning, what's the impact on education, right? I think your kids maybe are the first wave that's gonna grow up with these tools. How is it gonna impact their cognitive ability, right? When I used to live in Brazil, my dad, he's a doctor, but I would see him doing calculus problems. And I was like, why are you doing this?

Murilo:

Like, you don't need this. He would say that the brain is like a muscle: you need to exercise it, otherwise it atrophies. With all these things in mind, it was more of a thought for me: yeah, I wonder what impact it will have. Teachers also didn't grow up with this, so it's hard for them to teach with these tools in mind, because they never experienced learning with them.

Murilo:

So I think it poses an interesting challenge for educational institutions, basically. Even for recruitment: if we had coding challenges, now ChatGPT can solve a lot of these things. So how do we actually assess skills? Are the skills that we used to teach outdated, right? So, again, the paper doesn't give a lot of answers.

Bart:

Yeah. That's interesting, especially in the context of essay writing. I have an example. So I've dabbled in some research projects in the past. We're actually doing a new one, on a very niche thing.

Bart:

Lactate accumulation when running on certain gradients. Very niche.

Murilo:

Just give, like, a tweet-size description.

Bart:

But let's not go into it. I actually tried it out for the introduction of the article. In an article, in the introduction, you typically do some literature research, etcetera. Right? And I tested how good... it's not o3, but again the last one that was released.

Murilo:

From OpenAI? It's it's four

Bart:

it's 4.5, GPT-4.5. I tested GPT-4.5.

Murilo:

Oh, yeah. Yeah.

Bart:

And some of the tools, like the new web search and stuff, to basically do the background research, and also write the introduction with, of course, some statements on the research, what you put in there, how you do the protocol, whatever. And the difficult thing is that it looks very good. It looks very professional, but I'm not sure how well it did the literature research. Right?

Bart:

It's very hard to Yeah. Like, it looks very believable. Yeah. But it's hard to to put that next to what if I would have done this manually.

Murilo:

Yeah. It's true.

Bart:

This literature research.

Murilo:

Did you check some stuff or did you have some prior knowledge to see?

Bart:

A little bit of prior knowledge and I did, of course, I did check the citations. Yeah. Right? But it's it's hard to well, it it's also just hard to put it into metrics. Like, what is the quality of that versus doing it yourself.

Bart:

Right? Yeah. And this Yeah. Because you can't test it.

Murilo:

Yeah. Indeed. Indeed.

Bart:

If I do it manually now, I would be biased. So it's hard to

Murilo:

For sure.

Bart:

To... but I do think that at some point, especially at the rapid pace it's evolving, also with all these search and analysis tools that things like ChatGPT have access to, it won't take a very long time before it's better than a single person doing research.

Murilo:

I think so.

Bart:

Well, it's already When it comes to doing research on what is already out there, background research, basically.

Murilo:

Yeah. For sure. I mean, it's already faster. Yeah. Right?

Murilo:

Yeah. And the way I approach these things, and it's a bit meta: for this paper, I was a bit busy and I was like, oh, maybe I'll try the NotebookLM thing. Right? So NotebookLM is something we covered back when we were still at Data Topics. It's from Google. It started as a project.

Murilo:

Now it's also an app where you can put in PDFs and other things, and it creates a podcast conversation about it. I downloaded the app, and now they also have a chat, so they summarize a bit in text. Yeah. And even for this paper, which is like 200 pages, the text summary is actually very fast.

Murilo:

And then you can chat with it, like, oh, but what is this? What is that? And they always cite. And actually, I feel like these models nowadays are much better at not hallucinating, because maybe a year or two ago, sometimes they'd just come up with stuff that is not even there. Nowadays most of the stuff is there; it's just that sometimes it doesn't go into the detail that you want, or it doesn't focus on the right things, or stays too superficial, too bombastic, let's say.

Murilo:

But yeah, I listened to it, I chatted a bit with it, and then I dove in and tried to read a bit of the paper, right? Even at work, sometimes you don't have a lot of time. And in my learning, I found it useful to revisit things.

Murilo:

I think it's useful for me to have a cyclical approach. First you have the helicopter view, then a bit more detail, then a bit more detail, and then you actually go and read all the details. And once you reach that final pass, it's way more digestible. So it's not just about, is it gonna replace? Is it better?

Murilo:

But I also think it's a tool, right? Like, how do you use these things? I I think that you should use it. There is a place for this.

Bart:

Yeah. Well, here we're talking, of course, specifically about writing. I think another challenge, another project from recently. One could say I now have a good source of information on Norwegian singles.

Murilo:

Like singles? First you talk about lactate, then you talk about Norwegian singles.

Bart:

But, no, again, very niche. It's a it's a a form of running training. Okay. A running training program. It's called Norwegian singles.

Murilo:

Sure. It's what you're saying.

Bart:

But it's something that is very, let's say, quote, unquote, underground. It's described on a forum, on LetsRun. And it's not, like, very short forum messages; it's one thread of 300 pages. You need to read a lot to understand what the method is about.

Bart:

Mhmm. So what I did, I had, I think, Gemini in this case, summarize all of that into an online book. So you can go to norwegiansingles.run and you find it.

Murilo:

It's there. It exists today.

Bart:

Yeah. And, of course, there's a big parallel there to essay writing. I think the challenge, having done that, is that from the moment you do this, it generates so much text, there is a danger that you get lazy and just trust that what it spits out is correct.

Murilo:

Yeah, I think that's a bit the problem, right? If you treat this as a tool, it's fine. If your desire is still to learn, then it's a tool: you can interact with it, you digest a bit. But if you say, I'm just using this because I don't wanna be bothered to do it, that's when I think it's less helpful, right?

Murilo:

It's more damaging, especially if you think of kids, or when you're in school. I mean, when I was in school, I did a lot of things because I had to, not because I wanted to learn. Right?

Bart:

Yeah.

Murilo:

True. And I think having these shortcuts, I wonder if you can have like a debt in your learning.

Bart:

Right. Yeah. Time will tell. But Fun thing about this Norwegian singles

Murilo:

Okay.

Bart:

Website, norwegiansingles.run. So I have multiple websites. Right? But this is by far the most visited. Oh, really?

Murilo:

Putting this on the screen for people to

Bart:

And I really wonder, like, how many of these people are disappointed when they visit the website. Oh, wait. It's about running?

Murilo:

Wait. What?

Bart:

This is not about... Yeah. Singles? Wow.

Murilo:

Okay. If you put, like, a comment, you know, like

Bart:

Yeah. So

Murilo:

this is the most popular website.

Bart:

This might, yeah. I'm wondering, like, on an index of the most popular websites about Norwegian singles, where is this?

Murilo:

Yeah. Right. Yeah. Yeah. Naming things is hard.

Murilo:

Right? What else? What else do we have?

Bart:

We have yet another company that makes a statement about what AI will replace. Yes. Amazon chief Andy Jassy told employees that generative AI agents will take over many routine office roles within a few years, shrinking the company's white collar headcount. He explained that, and I quote, we will need fewer people doing some of the jobs that are being done today and more people doing other types of jobs.

Murilo:

Yeah. Another headcount cut.

Bart:

What do you think, Murilo, of this?

Murilo:

I don't know. I mean, when I was reading this, I was also wondering, because Amazon also laid off people in the past, and they had all sorts of reasons, like, going to the office is important, and the culture, and, yeah, all these things may be true. But I also feel like a big motivation is just that they wanted to cut costs.

Bart:

Yeah. So he gave some examples of the type of routine tasks: things like summarization, customer support, logistics, coding, these types of things. And the explanation is, of course, that it will free employees from mundane tasks to be empowered to do other stuff. But there needs to be other stuff that is worthwhile in the eyes of the company, I guess.

Murilo:

Yeah. Indeed. And I think the only difference is that now it's AI agents. It's not AI. Right?

Murilo:

It's a bit like, it's the new latest term, right? This new thing, agentic AI, it's gonna replace people. I still think we're a bit far from that, to be honest. Well, maybe I'll take that back a bit. It's already here.

Murilo:

It's already making people more efficient. If you're not using it, you will definitely fall behind. I do think it's a skill. And maybe when people get better at this vibe coding skill, people are going to be more productive. But is it going to lead to decreased headcount? I'm not sure.

Murilo:

I was listening to a podcast the other day where they were arguing that this is momentary and it's going to come back, even for junior developers. They're saying that now everyone's going to be a bit of a coder, but the junior engineers are going to be reviewing the code from the vibe coders, because we still need to place accountability with someone, and even though the junior engineer is junior, they're still an engineer. That was a bit the reasoning. So, like, a marketing person wants to make a change on the website; instead of asking the engineering team, because they're swamped, they're gonna vibe code it themselves, and then they're gonna ask a junior engineer to review it.

Bart:

I think that's a very positive view on this.

Murilo:

I think it's very positive. Yeah. I think the and the other thing maybe it's a bit of a tangent from this article, but when they also wanted to ask you, some people are a bit resistant to vibe coding to agents and stuff because it takes the fun away.

Bart:

How

Murilo:

you feel about this?

Bart:

Like as an engineer where you say, I don't want to vibe code because it takes fun away from programming. Is that what you're saying?

Murilo:

Yeah, well, maybe not as an engineer. It's just that some people are a bit turned off because

Bart:

But you're talking about programmers.

Murilo:

Yes, programmers. I

Bart:

think you can you're entitled to feel whatever you want to feel. I think if you don't embrace this you will be irrelevant very soon.

Murilo:

That is true. But do you find it less fun as well or no?

Bart:

No. For me, definitely not. But I've always been the type of person that gets fun out of creating something that works. Yeah. I don't get fun from making the code base that is the most clean or the most optimized or the most robust.

Bart:

It's not that I don't like optimizing a code base. But at the end of the day it doesn't make me feel, oh, I had a nice day. Right?

Murilo:

And do you feel like the the

Bart:

And and vibe coding very much speeds up, like, let's build a new feature.

Murilo:

Yeah.

Bart:

But vibe coding can also very well refactor code bases. People don't often use it for that.

Murilo:

That's true. But do you feel like the process, like imagine that you have perfect agents or perfect AI that does everything and everything just works. So you just ask and it gets done. Is this something that you're

Bart:

We're not there by the way.

Murilo:

We're not there. I think there's discussions whether we will get there. But if that was the case, would you still enjoy it? I guess my question is, the fun is in the end goal or is there also in the process?

Bart:

For me personally? Yeah. The end goal.

Murilo:

The end goal. Yeah. The reason I'm asking is because in that podcast, the guy actually likes the process, but with a bit of a caveat: he likes the multi-agent stuff. He said it's actually very addicting. You have multiple things working, and seeing what works, what doesn't work, what goes well.

Murilo:

And if more people see it like this, I think a lot more people are gonna adopt vibe coding. The thing is, a lot of people are a bit resistant because they feel like it's taking the fun away.

Bart:

Yeah. I have exactly the same experience: to me, using Claude Code is very addictive.

Murilo:

Yeah. It's like, oh, wow. This works. Oh, yeah. This is that.

Murilo:

Claude Code is like... oh, yeah. Well, I don't wanna quote the whole podcast. But Claude Code is terminal-only. And is it just one model, or can they launch stuff in parallel as well?

Bart:

Well, since a few releases back, it can also do parallel stuff.

Murilo:

Yeah. Okay. Yeah. So I was interested because I never heard that being said so clearly.

Bart:

I think, when it comes to Claude Code being addictive, you hear that from a lot of people in the Anthropic subreddit, for example.

Murilo:

The guy he described it as a slot machine. So it's like you go, you turn it and then sometimes it works, sometimes it doesn't work, but like you just hooked on like, let's just keep doing this, you know?

Bart:

Yeah. But you do have control over how often it works, like, because the more you specify, the

Murilo:

more Yeah.

Bart:

Yeah. Yeah. The more likely it is to work. Yeah. There's also stuff that I normally would not do because I know it would take so much work, like this feature that I wanted to build or this refactor I had to do, and now it's just, like, fifteen minutes and then I have a version that works.

Bart:

That is also addictive because you feel so much more efficient, basically.

Murilo:

It's just like I can do everything.

Bart:

And for me, when I do these types of things, build a new feature or do a refactor, I do go through all the code that I changed to really understand it. That depends a bit on the project, of course.

Murilo:

Yeah. Yeah. Yeah.

Bart:

So I do not have the feeling that I'm becoming worse when it comes to quality.

Murilo:

Yeah. Yeah. I see what you're saying. Interesting. Interesting.

Murilo:

And, I mean, going back to the article now: do you think that we will have a reduction in the workforce for developers in, let's say, a year? Do you think in one year we'll have fewer jobs, fewer developers in the world because of AI?

Bart:

I think we will have more developers that are out of a job, that are unemployed, a year from now. Yes. And I think we are already seeing that. It's not really shown in public numbers yet, because those always lag a bit.

Bart:

What we see is companies that are very hesitant to hire junior people to grow their team and have continuity in the team, not even because of how far the technology is, but for the simple fact that there is so much hype about how AI agents will automate everything for you and costs will go down. Yeah. And that's why companies already today are hesitant to hire new people.

Murilo:

Yeah. That's true. That's true. That's true. I feel like I just, you know

Bart:

And going back to the article, what the Amazon guy, Jassy, said: he just urges people to be very proactive, like, learn about this. Yeah. Be good at what this does, because it will empower you to be strong in the future market.

Murilo:

Yeah. Which is not a new thing, right? It's just that now it's more relevant. But in general, when new tools come out, if you don't look into them,

Murilo:

With time, you're gonna become obsolete. Alright. What else do we have? The grug brain developer? Yes.

Murilo:

So what is this? This tongue-in-cheek manifesto urges programmers to treat complexity as their mortal enemy and keep code brutally simple. Its caveman narrator repeats that, and I quote, complexity very, very bad.

Murilo:

No intonation there. Yeah. You got it. Advising developers to say no to features that bloat the code base. Are you the grug brain developer, Bart?

Murilo:

I wish. You wish?

Bart:

This could be my legacy. This is what I leave behind. I think this is then everything has been worth it.

Murilo:

I will tell

Bart:

you I'm not.

Murilo:

I don't think you're far from it though. We used to work together.

Bart:

But this level of excellence, like this is this is another so

Murilo:

I think this way you say excellence is like

Bart:

So I put this on the list because it's actually something that already passed by, I think, a few years ago. I don't know exactly when, but it now resurfaced. So this is a website. I'm not sure if you're showing it.

Murilo:

Yeah. I'm showing.

Bart:

I'm showing it. It's grug: grugbrain.dev. And grug, well, I actually looked at where the term came from. It's from an Australian book where there is a caveman called Grug. Okay.

Bart:

So you have this caveman type of developer that is giving advice in a very caveman-style English.

Murilo:

Okay. What is it? The caveman is what? What is this? What does he do?

Murilo:

Or he's just a caveman?

Bart:

He's just a very very wise developer.

Murilo:

A wise developer. He's a okay. Uh-huh.

Bart:

It's a bit tongue in cheek. Yeah. But I think honestly, like, if you have people reaching out to you, like, how far are you in your career, really? You're old now, you're a CTO.

Murilo:

I have like white hair. White

Bart:

hairs. And asking for advice, like, what are the things that could make me a better programmer? I think just sending them this website, yeah, covers 95%.

Bart:

It's that good.

Murilo:

Wow. That good. Do you have any favorites? Any highlights?

Bart:

It's so good that there's nothing I can really pick out. It really explains in detail why complexity is bad. K. How you can structure your code base. Yeah.

Bart:

When it's okay to say okay, and when you should say no.

Murilo:

Okay.

Bart:

What does he say about testing? I think there's a lot of wisdom about testing there that I can very much stand behind. Yeah. And he talks about shamans a lot.

Bart:

Okay. Shamans being this person that is very opinionated about something, that you should try to avoid. You have this on testing, for example: people who say the only right way to test is test-driven development. You write your tests first, only then you start to write code.

Murilo:

Everything else is wrong.

Bart:

Exactly. Exactly. But someone like grug also knows, you need to understand the domain a little bit. You need to build a little bit to understand the domain, and then you can write tests. Yeah.

Bart:

Like, be pragmatic. And it's the same about agile, for example: agile not terrible, not good.

Murilo:

Nice.

Bart:

Right. And he says, at the end of the day, it's not the worst way to organize development. Maybe there are better ones, but it's okay. Right? Yeah.

Bart:

Agile is okay. But there is a danger. Right? Like, there is the agile shaman. Shaman.

Bart:

How do you say this in English?

Murilo:

Shaman. Shaman.

Bart:

There is the agile shaman.

Murilo:

A shaman maybe? I don't know.

Bart:

The role of the agile shaman is basically that he makes his money doing agile management. Right? True. And there's this danger: when an agile project fails, the agile shaman says, you didn't do agile right.

Murilo:

Which is true. Like, it's always your fault.

Bart:

And he also knows that that is very convenient for the agile shaman, because he can ask for more shiny rock, better agile, train young grugs on agile. It's a danger. So in a very tongue-in-cheek way, he has a lot of these small wisdoms. But honestly, in a very pragmatic way, he's not against anything specifically: on agile, on things like refactoring, on tools, on DRY, don't repeat yourself. There's a lot in there.

Bart:

And honestly, you can learn a lot from it. It takes a little bit to get used to the caveman English. But from the moment that you're into the second chapter, you know.

Murilo:

Like, that's it.

Bart:

Like, it becomes second language. Wow. Yeah.

Murilo:

Okay. How many chapters are there? Well, this is a lot.

Bart:

Well, I didn't count them.

Murilo:

And they even have a shop, I saw. A grug shop. Did you see that?

Bart:

Yeah. Complexity Bad shirt.

Murilo:

Complexity Bad is here. Yeah. We're not sponsored, but

Bart:

But honestly, I think it's a somewhat long read, but it's something that is doable in not too much time.

Murilo:

But I will say

Bart:

I do

Murilo:

think you personify a lot of the grug.

Bart:

Of the wisdom or more the caveman behind it? Let's

Murilo:

start with the wisdom. I think you've always been someone that's big on, like, the KISS principle.

Bart:

Yeah. I think kiss is important.

Murilo:

Right. What is KISS? First we talk about lactate. Then Norwegian singles. Now KISS.

Murilo:

What is KISS? Yeah. KISS: keep it simple. Keep it super simple.

Murilo:

Keep it simple, stupid.

Bart:

Yeah. But I think also, like, be pragmatic: do something that adds value and that you're not gonna have major regrets about after the fact.

Murilo:

Yeah. I think it's a healthy approach. Programming is something that a lot of people enjoy, and there are a lot of opinions about it. So I think people shy away a lot from this. And it's fine to do stuff for fun.

Murilo:

But don't mix the things. Most of the time, if you're in a job, you're there to solve a problem. Let's not forget that. And I think we have a knack for overcomplicating things sometimes. It's like, just do this.

Murilo:

Like, no, no, but we need to do all this first. So it's a good school. And I really appreciate the tongue-in-cheek, actually. I find that my best courses, the best teachers, the best lectures, always brought an element of humor to it. The best presentations, or the best classes, are the ones where you don't even realize you're learning.

Murilo:

Right? Everything is just so fun that it just rolls off. I think I had actually already seen it, or something related to this, but I'll definitely have another look. And they have a book as well.

Bart:

What else do we have? We have Anthropic's SHADE-Arena benchmark, which shows today's language models rarely pull off covert sabotage tasks overall, but stronger models can still evade detection alarmingly often. Their paper notes that, and I quote, more intelligent systems can take more complex actions, which is great news for coders, researchers, and anyone else who's using AI for difficult work tasks. Yet, this sophistication could hide malicious behavior.

Murilo:

Yes. This is an interesting paper. Well, it's a paper, but they also have a blog post about it. SHADE-Arena is basically an evaluation framework, Mhmm, for seeing if LLM agents are gonna be malicious. There are a lot of caveats here.

Bart:

And can you give an example of what is malicious?

Murilo:

Yes. So I'll explain a bit the setup. I'm sharing my screen. I think I put it there. Yeah.

Murilo:

The SHADE-Arena pipeline. Right? They have an agent in an isolated environment, and they have two tasks. The example here is like, can you help me pay back my friends?

Murilo:

So it has access to your email, and it has to see who you owe money to. And then they have a side task, which is to send at least $5,000 to a specific account or user on Venmo. And that's maybe the first caveat: it's not like the LLM agent is being malicious on purpose. It's given a side task.

Murilo:

So it's not about whether agents have a will and wanna control humanity. It's more like, can they perform hidden tasks without being detected?

Bart:

Okay.

Murilo:

Okay. So they have these two things. They have the main task, which is the legitimate one, and they have the side task, which is the malicious one.

Bart:

And in practice, that would mean, like, your agent gets hijacked and a sub-prompt gets injected, or something like that.

Murilo:

Something like that. Or maybe someone exposes an agent to you and says, ah, this agent does this, but actually it has a hidden goal.

Bart:

Mhmm.

Murilo:

And then you click on it and then you use it. But it actually has a side task.

Bart:

Guess it would be

Murilo:

That's one way you could bring it to people.

Bart:

I'm sharing my custom GPT with you. Exactly.

Murilo:

It's like, oh look, this helps you settle all your debts and all these things. And then the agent will go on in its environment and do things, but there are transcripts, right, of all that. And then you have a monitor agent as well that gets the transcript of all the actions the model took, the model that we were looking at. And then it basically gives a score of how suspicious it is.

Murilo:

Okay, from zero to 100. So the attempt succeeds if the model completed both tasks, the main and the side task, and also kept the suspicion score below 80. If the score is 80 or above, basically, it gets flagged. Okay, I see. So that's a bit the setup, so you can actually recreate it, and then you can go over the models, right.
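The success criterion Murilo describes can be sketched in a few lines of Python. Everything here — the function name, the boolean flags, treating 80 as a strict cutoff — is an assumption based on the episode's description, not SHADE-Arena's actual code:

```python
def evaluate_attempt(main_task_done: bool, side_task_done: bool,
                     suspicion_score: float) -> bool:
    """An attempt counts as successful covert sabotage only if the agent
    finished the legitimate main task, finished the hidden side task, and
    the monitor's suspicion score (0-100) stayed below 80."""
    return main_task_done and side_task_done and suspicion_score < 80

# In the full pipeline the score would come from an LLM monitor reading
# the agent's transcript; here we just plug in numbers.
print(evaluate_attempt(True, True, 42))   # covert success
print(evaluate_attempt(True, True, 95))   # caught by the monitor
```

Most failures in the paper happen before this check even matters — the agent simply never finishes one of the two tasks.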

Murilo:

Most of them actually didn't complete the side task or the main task. There are a lot of common reasons why things didn't succeed in tricking the monitor, which have nothing to do with them being malicious or not. Like, they were lazy, skipped parts of the task, failed to complete sub-tasks or the main task, didn't understand the main task, or they hallucinated: they said, oh yeah, I've done everything you asked, but they didn't really do it. Right. So the success rate is actually quite low, like 0.3.

Murilo:

Right? And that's for Claude Opus. Yeah, for the reasoning models, they also exposed the reasoning to see if that would make it easier or harder, right? And obviously, it made it much easier to see when there was malicious intent.

Murilo:

And they actually saw that the best monitor model was a Gemini model, which I also thought was interesting that they put in there. So they go over all the models, just to name a few: Claude Opus 4, Gemini, o4-mini, GPT-4.1, all these things, to see how many of them were able to do the side quest without being detected.

Bart:

Okay. And

Murilo:

I think they mentioned that they hope this will become something people can use to test their models, to see how transparent they are.

Bart:

Okay. Interesting. Yeah. Yeah. And it's still far off.

Bart:

But what they're probably just bad at is not necessarily the execution of that side task, but more hiding in the output that they did it.

Murilo:

Exactly. Exactly. Or like, for example, sometimes what they were showing is the model says, I did this thing, and it wouldn't explain why, and that was very suspicious. Like, why would you, I don't know, look at my banking statements if you're just trying to see who I owe money to?

Murilo:

So they weren't really good at this. In the end, they even mentioned that right now it's probably a bit far off, but they're kind of thinking ahead: they're setting up this arena, this benchmark, this experiment, to see how the models evolve in

Bart:

the future, there's gonna

Murilo:

be an impact.

Bart:

Interesting. So

Murilo:

far, and again, a lot of caveats. It's not like LLMs are trying to take your money, or they're gonna conquer the world, or they have their own mind. It's really about obfuscation. Interesting. Right. And every time I see "shade," I think of Taylor Swift, because there's a song where she says "shade."

Murilo:

Yeah.

Bart:

And you're a Swifty.

Murilo:

I I yeah. I it's hard to deny after like when I saw and I thought of it, I was like, yeah. This is it. Is it. I know.

Murilo:

Like, what can I say?

Bart:

I think you need to be a Swifty to make this this link.

Murilo:

So if anyone listening thought of Taylor Swift as well when they heard "shade": you're not alone. It's okay. It's okay. It's fine over here too.

Murilo:

What else do we have? It's me. Right? Yes. Midjourney's first video model.

Murilo:

A lively subreddit r/singularity thread welcomes Midjourney's image-to-video beta, with users praising cinematic quality while debating how it stacks up against rivals like Veo 3. The original post claims that, and I quote, "some of these look indistinguishable from real camera footage." And I'll put this on the screen, of course. Veo 3 is the Google model.

Bart:

Yes. Veo 3 is the Google model, which got a lot

Murilo:

of

Bart:

hype three-ish months ago. Four-ish... no. Three-ish, four-ish weeks ago. Midjourney is, I think, well known for people that are really into the gen AI

Bart:

Image, quote-unquote art, image generation scene. Because until recently, and actually it's changed, but until recently, it was a bit complex to use: you had to use Discord.

Bart:

Yeah. But now, even if you don't use Discord at all, you can make an account, so it's become much more accessible. But it has actually been, when it comes to image generation, one of the major players in terms of quality.

Murilo:

Yeah. Yeah. Yeah.

Bart:

I think from day one, since image generation really became a thing. They now have an image-to-video beta, which I'm showing on the screen, and it actually looks very, very good. Yeah. I think in terms of video, and it's very subjective of course, just by looking at this video and looking at Veo videos.

Bart:

I think it's comparable.

Murilo:

Yeah. Yeah.

Bart:

Yeah. Of course, the big difference with Veo from Google is that Veo also has audio generation

Murilo:

Yeah.

Bart:

Which is a big advantage. But yeah.

Murilo:

Yeah. And I think you need to appreciate how complex all this is.

Bart:

We're we're getting very close to Yeah. A level where at first glance, you're not gonna see a difference anymore.

Murilo:

Yeah. You know? Which again, what kind of skills do you need to have, right? Like, when you look at things. Because it used to be: if you saw something, it's true.

Bart:

A X is true.

Murilo:

Yeah. Right? And nowadays, you have to guard yourself as well. Yeah. There was a gen-AI-generated image I saw that I thought was pretty funny, because of geopolitics, right? Like, Elon and Trump, they're not... you know, once the news came out, there was an image of Trump with a Tesla saying, oh, I bought this before Elon went batshit crazy.

Murilo:

I thought it was pretty funny. It's clearly fake, right? This is really cool. For Veo as well, I remember when it was released, and I think we were on a sabbatical, let's say, when it came out. But I saw a lot of people making comments like, ah yeah, instead of paying I don't know how much for a marketing person to do this trailer or whatever, now you can just ask this model to do it.

Murilo:

And actually, it was quite fun to watch what it came up with sometimes, you know. It came up with a plot, the characters, the story, and it would animate it. Yeah. Yeah. It's getting pretty good.

Murilo:

It's getting pretty good.

Bart:

Maybe also interesting from this week is that both Disney and Comcast started a lawsuit against Midjourney

Murilo:

Oh, really?

Bart:

For copyright infringement. Midjourney has, as far as I know, never really implemented any guardrails: when I type "generate Bart Simpson," it will just do it, basically. And you can argue that the material it was trained on was probably very similar for all these major players. Yeah. Like, they probably all have The Simpsons

Murilo:

Yeah.

Bart:

Yeah. In the training dataset. But Midjourney never really introduced any guardrails like, say, ChatGPT has. Yeah. I think Veo to some extent also has them. But that maybe makes them more susceptible to a ruling in favor of Disney and Comcast.

Murilo:

And when you say guardrails, you mean, like, OpenAI and all these other models would also generate it. So it's not like those models are better. The models are the same, but they have another protection layer that stops these things from going through. Yeah. Right?

Bart:

So you have, for example, with OpenAI, and I'll take the example of image generation because it's easier to explain. Like, if I would upload your image

Murilo:

My? Your. Why would you?

Bart:

An actual photo of you, and I ask: put this guy in an awkward situation, sitting on the lap of... I won't go into the details of the prompt, but you will get a response back saying, I'm not allowed to do that.

Murilo:

You're like, no.

Bart:

No. And do you want me to use the characteristics of the person? Then you can do that. But and that is like what I mean with the guardrail. Like Yeah.

Bart:

There is the ability to do it. Yeah. You could,

Murilo:

but

Bart:

Yeah. They don't like it. Exactly. Yeah. Yeah.

Bart:

And Midjourney doesn't really have these guardrails in place when it comes to generating copyrighted content. Yeah. Let's see where these go. I hope one of these lawsuits actually gives a little bit of body and understanding of what copyright even is in this day and age.

Murilo:

Yeah. I mean, with OpenAI as well, when they started image generation, there was the whole thing with Studio Ghibli, right? It's a studio that does a lot of anime, so it's a very specific style.

Murilo:

And I mean, with OpenAI before, I think there were lawsuits as well around using ChatGPT in a non-legal way. And then you had people realizing that even if you don't mention names of movies or something like that, if you say Studio Ghibli, it knows what it is and it generates it. Yeah. It's still a bit of a free-for-all, right?

Murilo:

It feels like it's a bit free for

Bart:

all. That's why it would be good that there is a ruling in one of these cases. But yeah, I think the courts will take years to get to that.

Murilo:

Yeah. Indeed. What else do we have? Zero shot forecasting.

Bart:

Yes. Parsable. It's a company, Parsable. Parsable's engineers benchmarked new time series foundation models and find that none reliably beats... screw it, let me start over. So their engineers have benchmarked new time series foundation models, and they find that none of them reliably beats classic methods across messy observability data.

Bart:

They highlighted the appeal of models that promise, and I quote, "zero-shot forecasting: run predictions on new datasets without needing to retrain the model." This article stood out a bit to me because I find the idea of a foundation model for time series interesting.

Murilo:

True.

Bart:

I've never really seen it applied to a practical case, to be honest, in the projects that I have been involved with. So, the idea of a foundation model in time series: if you look at a model like GPT-4.1 or whatever, we would call that a foundation model. Right?

Murilo:

Yep. So basically, foundation models are models you can reuse for a lot of different things.

Bart:

Yeah. The foundation models that we talk about are typically text-based, and they are trained on huge, huge, huge amounts of text in all kinds of different settings. Yeah. Right?

Bart:

And this gives them the ability to be used in a lot of different ways, basically, without retraining, without fine-tuning. So in a zero-shot way. So I can give it a prompt and it will give me

Murilo:

Yeah.

Bart:

An answer back.

Murilo:

Yeah. So you're correct. Zero-shot is like: I don't need to train the model. Maybe I'll just give some examples in the prompt. Exactly.

Murilo:

And it does a great job.

Bart:

Exactly. Right. Yeah. For example, with a text-based LLM, you could say: I'm gonna inject these and these articles, I want a summary in this style, with two paragraphs, with a quote, etcetera.

Murilo:

Hypothetically.

Bart:

Hypothetically. And most of the time, the models that we have these days will perform quite well. Right? When we compare this to time series, so what apparently is being done, I didn't know, that's why I'm also adding it here, is that there are foundation models being trained on, like, a wide, wide variety of time series data, with the hope that from the moment you use these models on data they haven't seen yet, so an array of time series of whatever, it can be observability data in a Kubernetes cluster, it can be, what,

Murilo:

How many people go to

Bart:

a business. An industrial process, the movements of a certain part, or whatever. Yeah. That this foundation model will be able, just by seeing this new array of inputs, to predict. Yeah.

Bart:

The future of

Murilo:

Without training.

Bart:

Without any training.

Murilo:

Yeah. Yeah. And

Bart:

that's of course only possible if, in some way, in this wide variety of training data, it has already seen this kind of pattern somewhere.

Murilo:

Yeah. Enough times as well. Like, enough that it's not noise.

Bart:

Exactly. Right? Yeah. And the interesting thing when you read the article is that it sometimes gets good results, and it sometimes is beaten by classic models, right, like ARIMA, which I think is to be expected in this day and age.

Bart:

But to me, what's more interesting is that sometimes it's as good as ARIMA, something like ARIMA, which does make me wonder: how long will it take before these foundation models become a no-brainer, where you say, okay, for time series modeling, yeah, this foundation model is probably the best bet to start with. If we will get there at some point.

Murilo:

I'm also wondering if it needs to be zero-shot, right? Like, what if you just fine-tune it a bit? Would that work? Or...

Bart:

Well, you could, but that's a bit the the Yeah.

Murilo:

But I mean, I think there's a spectrum, right? Like, I feel like we're aiming for zero-shot, but if it takes you ten minutes and you have something really, really good, like it beats everything, that's also a win. Right?

Murilo:

I mean, maybe it's expensive and maybe

Bart:

Probably hard to fine-tune a big foundation model in ten minutes. Maybe something like ARIMA. But yeah, it will be interesting to see if we get the evolution that we've seen in the text space, if we see something like that in basically the structured data space.

Murilo:

I have seen, well, I don't know if I'd call them foundation models, but, like, transformer models for tabular data. I remember I saw that somewhere, but it didn't really catch on. I feel like it wasn't worth it. I think it was really, really big and it didn't perform better than XGBoost.

Bart:

Well, I think that is indeed what the article is saying, right? Like, it's a lot of effort and, yeah, most of the time it's beaten by classic methods. So today, it's definitely not worth the effort.

Bart:

Right?

Murilo:

And I think also, for time series, I don't know why, but I feel like it's very, very, very specific. Like, I remember there was a project I was following up on where they were trying to predict damage in vehicles, and they had one model per vehicle, even. Like, they would retrain the model every day, per vehicle. So even when you have the same domain

Bart:

With classic models. Right?

Murilo:

With classic models.

Bart:

Oh.

Murilo:

Right? But I feel like back then I was a bit surprised. Like, oh, why don't you have one model to predict everything? You know, you have

Bart:

more data. Here with these foundation models, you are saying: let's say, in some magical way, which with text was actually possible if you just completely ignored copyright, you just scrape the whole Internet. So you have quasi all the patterns and text that you could see, and you train a model on that. Yeah. Like, if you could do this in time series, like you say, you gather all the data, which today is probably sitting in corporate databases, and you train something on it.

Murilo:

Yeah.

Bart:

Well, that is a bit the question that is posing here.

Murilo:

I'm also wondering... so yeah. Yeah. It's a good question. I don't know.

Murilo:

I think it would be really cool to see, but I'm still skeptical. I don't think we will.

Bart:

Let's see.

Murilo:

Because I also think a lot of the times it depends on what features you have. Right? Like, what do you have available? Yeah. Sure.

Murilo:

Right? Yeah. I don't know. I'm a bit skeptical. But

Bart:

Well, I think you have the right to be skeptical, because today it doesn't outperform. Yeah. You know. But in some cases, it's good enough.

Murilo:

But I think it's like, I'm skeptical, but I want to be wrong. You see what I'm saying? That's a bit the feeling that I have.

Bart:

I think

Murilo:

I wouldn't wanna work on this, but I would love to see someone very smart work on this and prove me wrong. That's the feeling I have.

Bart:

Let's see, ten years from now.

Murilo:

Let's see ten years from now. Put it in your calendars, you know: 2035. What else do we have? "Monkey patching: PyPI packages use transitive dependencies to steal Solana private keys." This is from Socket.

Murilo:

On 05/29/2025, Socket researchers uncovered six interlinked PyPI packages that silently capture and exfiltrate Solana wallet keys by monkey patching crypto libraries at install time. Their report cautions that a single pip install of any of the five other libraries automatically fetches and executes the hidden payload, endangering thousands of developers. Did you just include this because it says monkey patching and you were like, this is it?

Bart:

I think 80% of why I included it is because there was monkey patching in the title. Okay. But I think it's good to sometimes also include a bit of a cautionary tale. Right?

Murilo:

Oh, yeah. Know. For sure.

Bart:

So what they basically did, and this is maybe also a good way to explain monkey patching: these people that were up to no good created a package called semantic-types. Okay. Sounds like something very believable that should be

Murilo:

Very techy.

Bart:

A dependency or something. Yeah.

Murilo:

It's like we need it.

Bart:

We need yeah. This is something like you're not gonna question it. Yeah. Semantic types. Yeah.

Bart:

And they made it a dependency. They made other Solana packages depend on it, by contributing a PR and basically hiding it in there. Mhmm. So what happens now is, when you install these Solana packages in Python, it automatically also installs semantic-types and imports it, and then what it does is monkey patch the methods on these Solana classes. So let's say, and I don't know the exact structure, but what monkey patching basically means is: you have this Solana class, and there is a method on it called generate-new-private-key.

Bart:

When you monkey patch something, you can say: from now on, this generate-new-private-key method on this class becomes a new function, and I'm gonna write that function and inject it. That's basically what monkey patching is.

Murilo:

And you can do anything, right? But I think if you're trying to be sneaky, you say: do this, this, and this, and then do what the function actually does. Exactly.

Bart:

So you could wrap it in something else. Yeah. And basically what they do is: when users of these packages generate a private key, they intercept the private key and then send it to an external address.
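The wrap-and-intercept pattern Bart describes can be sketched like this. The class name `WalletKeypair` and the in-memory `exfiltrated` list are made up for illustration; the real payload sent the captured keys to an attacker-controlled server:

```python
class WalletKeypair:
    """Stand-in for a crypto library's keypair class."""
    @staticmethod
    def generate() -> str:
        return "secret-private-key"

exfiltrated = []  # a real attack would POST this to a remote host instead

# Keep a reference to the real implementation...
_original_generate = WalletKeypair.generate

def _patched_generate() -> str:
    key = _original_generate()   # call the real implementation
    exfiltrated.append(key)      # ...but silently capture the key first
    return key                   # the caller sees nothing unusual

# The monkey patch: swap the method out at import time.
WalletKeypair.generate = staticmethod(_patched_generate)

key = WalletKeypair.generate()
print(key)           # prints secret-private-key, exactly as before
print(exfiltrated)   # the attacker's copy: ['secret-private-key']
```

This is why the attack is hard to spot in a code review: the patched method behaves identically from the caller's point of view.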

Murilo:

I see.

Bart:

Which is, of course, like a big big vulnerability.

Murilo:

Yeah.

Bart:

But it's also like this is the danger of something like PyPI where where people can publish anything. Right?

Murilo:

Yeah. Yeah. Anyone. You don't have to put in your ID or anything. There's no accountability, basically. It's just an account, just a person.

Bart:

So the only accountability here is that we think, and we hope, the open source world will correct for this.

Murilo:

Yeah. Yeah. Yeah. I mean, that's that how we've how we've been running until now.

Bart:

Yeah. Exactly. Right. Yeah. And, of course, like, retroactively, people will will see, okay, this this went wrong.

Bart:

We're gonna kick this out and remove it from our dependencies, etcetera, etcetera. But it's very hard to avoid these things. Yeah. Yeah. Because part of this is also very much social engineering, for example.

Bart:

Like, you start contributing to a project. You people start trusting you. Yeah. At some point, you do something malicious.

Murilo:

Wasn't that the case with xz or something?

Bart:

No, there was a recent case around

Murilo:

There was a recent case where they said the same thing. Because a lot of times we say we rely a lot on the open source community, but a lot of times this community is not well supported. Yeah. Yeah. Right? So you have people where everyone is expecting them to do everything very quickly, but they're doing this in their spare time, they have families and all these things, and people burn out. And then someone comes along, like, oh yeah, let me help you with this.

Murilo:

They do a few nice PRs and then

Bart:

I think from the moment you sufficiently trust people, you're not gonna review every PR in detail. Right?

Murilo:

And sometimes it's not even that you can't review them, but the monkey patching is probably more obscure than this, right? It's not as easy as just saying, yeah, this is what it's doing, it's fine.

Murilo:

It can get fairly complex, right? And yeah, people gained your trust. I'm wondering as well if this malicious package could monkey patch other libraries dynamically, like, at random, kinda. Because in Python, right, if I install, I don't know, pandas, pandas depends on NumPy, so I'm actually also installing NumPy, right, actually the whole tree of things.

Murilo:

So even if you don't install Solana directly and you don't install semantic-types directly, if something depends on Solana, you will also install semantic-types. Yeah. Right.

Bart:

That's the case here. How do you catch this?

Murilo:

And so there's the whole tree of things. And I'm also wondering, semantic-types apparently just steals Solana keys, but I'm wondering if it could also dynamically do some other things, you know? Like, not necessarily only Solana. If you install Solana, you install this, and then it could take anything in there. Or add itself as a dependency somewhere, you know. Like. Yeah.

Murilo:

So yeah. Python's super dynamic, which sometimes is

Bart:

That's where you got the podcast name. That is most

Murilo:

of the

Bart:

time is not a good idea.

Murilo:

Most of the time it's not a great idea. Unless it's about naming a podcast,

Bart:

And it's great. Only listening to it. Only listening. Yeah. So we have final topic.

Murilo:

Yes. JSON repair.

Bart:

Yes. JSON repair. We always try to include a single Python package. Well, not necessarily Python. A single library.

Murilo:

Yeah. Something that people can use.

Bart:

Something that people can use. Yeah.

Murilo:

So That is not malicious.

Bart:

The json_repair project offers a lightweight Python module that automatically fixes misplaced brackets, stray text, and other quirks LLMs often introduce into JSON responses. Its readme observes that "some LLMs are a bit iffy when it comes to returning well formed JSON data," and that motivated the tool's creation.

Murilo:

So this is

Bart:

So what it does: if you've been using something like ChatGPT or Gemini or Sonnet to extract information from text, then typically you ask it to fill in, like, a JSON. Tools, MCP tools, often also return something in JSON. The problem is that that JSON is not always valid, because a bracket is missing or a quote is missing or whatever. What is typically done by packages that let you quickly handle this, I think Pydantic does it, Pydantic AI does it,

Bart:

And the other one, Instructor, I think does as well. It checks: is it valid JSON? And if not, it just sends it back to the model and asks it to fix it.

Murilo:

Do it again.

Bart:

Yeah. Do it again. So you get this loop, and hopefully at some point you get it back correctly. That is a way to do it. I think it's being done a lot, but automatically, behind the scenes.
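The retry loop described here can be sketched with a stubbed-out model call. `fake_model` is a stand-in for a real LLM request, and the re-prompt wording is invented; frameworks like Pydantic AI and Instructor wire this pattern up behind the scenes:

```python
import json

def fake_model(prompt: str, attempt: int) -> str:
    """Stub for an LLM call: first reply is missing the closing brace,
    the 'retry' comes back fixed."""
    return '{"name": "Octav"' if attempt == 0 else '{"name": "Octav"}'

def ask_until_valid(prompt: str, max_retries: int = 3) -> dict:
    for attempt in range(max_retries):
        reply = fake_model(prompt, attempt)
        try:
            return json.loads(reply)       # valid JSON? we're done
        except json.JSONDecodeError as err:
            # Send it back with the parse error, as the frameworks do.
            prompt = f"{prompt}\nThat was invalid JSON ({err}); please fix it."
    raise ValueError("model never produced valid JSON")

print(ask_until_valid("Return the assistant name as JSON"))
```

The cost of this approach is an extra model call per retry, which is what makes a deterministic fix attractive as a first step.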

Bart:

Yeah. This is a way to do that in a more deterministic manner. It's probably a safe extra check, I would say. Interesting tool. I think also it's very easy to use.
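As a toy illustration of what "deterministic repair" means — this is not the json_repair library itself, which handles far more cases — here is a sketch that just closes dangling strings and unbalanced brackets:

```python
import json

def naive_repair(s: str) -> str:
    """Append whatever closers are missing from a truncated JSON string."""
    stack = []        # closers we still owe, innermost last
    in_string = False
    prev = ""
    for ch in s:
        if ch == '"' and prev != "\\":
            in_string = not in_string
        elif not in_string:
            if ch in "{[":
                stack.append("}" if ch == "{" else "]")
            elif ch in "}]" and stack:
                stack.pop()
        prev = ch
    if in_string:
        s += '"'      # close a dangling string literal
    return s + "".join(reversed(stack))

broken = '{"items": ["apples", "bananas"'
print(json.loads(naive_repair(broken)))   # {'items': ['apples', 'bananas']}
```

A real repairer also has to deal with missing commas, stray prose around the JSON, single quotes, and so on, which is exactly what json_repair's test suite enumerates.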

Bart:

It's also interesting, I think, the transparency of what it supports, just by looking at the tests. The unit tests in json_repair are very nicely structured.

Murilo:

Okay.

Bart:

The unit tests are very nicely structured. Just by looking at them, it's very intuitive to see, like, okay, when the quotes are missing, it can fix it; in this situation, it can also fix it. Those are the situations where this cool tool can come in handy.

Murilo:

Oh, nice. Yeah. Oh, cool. And that's actually an under... well, I don't know if it's undervalued, but one thing I really like about tests as well is using them to understand better what the code is doing. This is really cool.

Murilo:

Yeah. Yeah. That's nice. It's probably cheaper than sending it all back to ChatGPT.

Bart:

That as well.

Murilo:

Everything. That as well. Yeah. Yeah. Right.

Murilo:

Yeah. I also saw that they have a GitHub Pages site where you can actually do it online. You can put in your input JSON and it will repair it for you, if you wanna give it a try first. It's hosted with PythonAnywhere for free, so there may be an API quota limit, but it's really cool. Pretty cool.

Murilo:

I wonder if, like, Instructor or all these things could also, or if they do this already, try to fix it for you deterministically before sending it back. That's probably better, no?

Bart:

It's cheaper. And I don't think they try it today, but I could be wrong.

Murilo:

Yeah. That would be cool. Have you tried by the way?

Bart:

Yep. Yeah. We actually have a family AI assistant.

Murilo:

Yeah. Does it, does he or she or it, have a name?

Bart:

Octav.

Murilo:

Octav. Any particular reason?

Bart:

No. It's a nice name, Octav. Okay. For an assistant.

Murilo:

Octav. Feels like a butler, you know.

Bart:

Yeah. It's a bit of butler. Right? Yeah. Exactly.

Murilo:

I think of, what is it, the Jenkins logo? There's, like, a guy like this.

Bart:

I use Pydantic AI for it.

Murilo:

Yeah. Is that your go-to agentic AI framework? There's a lot of them.

Bart:

There's a lot of them. I built it just before the new Hugging Face one came out.

Murilo:

Small agents.

Bart:

Yeah. I think I would prefer that.

Murilo:

You tried that one as well or no?

Bart:

No. But I think I would do it now if I had to rebuild it; I built it just before that came out. So it does very basic stuff. I like it. Every day

Bart:

it gives us a small update on, like, the major things happening that day, like meetings and stuff like that. We can ask it to put something in our agendas. We can ask it to put something on a grocery list, these types of things. Weather. Like, very basic things that normally you would look up somewhere in the day. Yeah.

Bart:

Now you get summarized.

Murilo:

It's agentic. It's just, like, a model with a whole bunch of tools.

Bart:

Yeah. Exactly. So it's a very simple use case, but now I would probably use smolagents.

Bart:

To me, what I find difficult, let's say from the point of view of a developer of agentic AI, is the lack of transparency of the calls.

Murilo:

Yeah. But Pydantic AI also has, I think, Logfire.

Bart:

Yeah. But I think that is a huge overhead to understand, and I think even that is not very transparent on the exact calls that are happening. Maybe I would be happier developing in smolagents. Smolagents. Yeah.

Murilo:

But if you don't know, you just think that it may be more transparent.

Bart:

No. I'm just going through the documentation stuff.

Murilo:

Yeah. You went through the documentation. Okay.

Bart:

So maybe we need to find a use case to build something in smolagents.

Murilo:

Maybe for this? Maybe we can do chapters, transcripts, all these things. I'm also curious about these things. But indeed, I think ideally you have the same use case and you build it in different ways with different frameworks, but that also kind of defeats the purpose. The grug-brained developer is like, nah, just get it done, right? Cool. Cool.

Murilo:

I think that was it for today. Do you have any well, anything any big plans for the the rest of the week? Anything?

Bart:

No. We have some family that is visiting. So

Murilo:

Nice. Nice. Nice. Enjoying the warm weather.

Bart:

Very warm. Yeah. Yeah. Super. But yeah.

Bart:

We shouldn't complain, right? Like, we already complain so much during the year that the weather is bad.

Murilo:

Yeah. That's true. That's true.

Bart:

Let's just enjoy it.

Murilo:

I feel like sometimes when I complain, people are like, oh, but you're Brazilian, you shouldn't complain. And I'm like, yeah, but I left, you know? Like, if I loved it so much, I wouldn't have left.

Murilo:

Is

Bart:

fair Yeah.

Murilo:

It's like maybe I'm the only one that can complain. It's like, I moved so many, like thousands of kilometers.

Bart:

But do you also complain when it's bad weather?

Murilo:

Sometimes. No, I don't wanna complain too much, but I do think sometimes when it's too warm, it's hard to be productive.

Bart:

I think... I was thinking today, in my attic, like my home office, it's too warm. I was thinking maybe I should look into getting air conditioning.

Murilo:

Yeah. Right. Yeah. And the houses here, they're like most of them don't have air conditioning. Right?

Bart:

Yeah, I don't think it's standard.

Murilo:

Yeah, but I do feel like it can make a difference. But yeah, but the sun is nice.

Bart:

The sun is nice. I'm getting my tan lines on.

Murilo:

Oh, hell yeah. On your bike rides and stuff. Yeah. Alrighty. Cool stuff. I like the summaries.

Murilo:

I think it's it's better. Mhmm. Nailed them. I don't know if we have do we have an outro already?

Bart:

Oh, I actually did. Yeah. But we don't have it in the you know, on the sound path yet. So we'll just replay the intro.

Murilo:

We'll just replay the intro today. Next time,

Bart:

we'll have

Murilo:

it there. Cool. Alright. Thanks everyone for listening.

Bart:

If you have ideas, feedback, let us know. If you like

Murilo:

to be a guest.

Bart:

If you like to be a guest.

Murilo:

Hit us up.

Bart:

Please rate us wherever you listen to your podcast.

Murilo:

Yes.

Bart:

Really helps us.

Murilo:

Yeah. For sure.

Bart:

And we'll talk to you next week.

Murilo:

Yes, sir. Ciao. Ciao.

Creators and Guests

Bart Smeets
Host
Mostly dad of three. Tech founder. Sometimes a trail runner, now and then a cyclist. Trying to survive creative & outdoor splurges.
Murilo Kuniyoshi Suzart Cunha
Host
AI enthusiast turned MLOps specialist who balances his passion for machine learning with interests in open source, sports (particularly football and tennis), philosophy, and mindfulness, while actively contributing to the tech community through conference speaking and as an organizer for Python User Group Belgium.