The Agent Era: Standards, Self-Improving Codex, and Disney’s Sora Bet
Murilo (00:07)
Hi everyone, welcome to the Monkey Patching Podcast, where we go bananas about all things Disney, Linux and more. My name is Murilo and I'm joined by my friend Bart. Hey Bart, how are you doing?
Bart (00:17)
Hey Murilo.
I'm doing great, I'm doing great. How about you?
Murilo (00:20)
Maybe just before.
I'm doing good. I updated my iPhone. So I got the liquid, what was it called? Liquid, liquid what? Liquid, liquid something. I don't know, let's check. I think it may be liquid.
Bart (00:26)
Yeah, liquid something, yeah. Liquid team, I don't know.
Yeah, I did it a while ago. Do you like it? Liquid glass.
Murilo (00:35)
Liquid glass, liquid glass. That's
what it is. So I got Liquid Glass. I don't know. I don't know if I don't like it because I don't like the UI, or if I just don't like it because I'm old, because there are a few things, like the camera, that changed a bit, or even Safari, right? And now I need to learn a new way to do things. And maybe I'm just old and grumpy; I don't want to learn a new way to take pictures, you know? It's like, I don't know. What did you... have you tried it? Have you updated your iPhone?
Bart (00:47)
Hmm.
I did, I want to say a Montego or something, but yeah, I quickly got used to it. I'm not really, like I'm quite neutral on this. But I did yesterday update my Mac Mini that I'm on now to macOS Tahoe, which is a bit like, I don't know, the partner release to Liquid Glass, right? There's some liquid stuff as well.
Murilo (01:05)
You're indifferent.
Mmm, yeah.
Bart (01:22)
And actually I'm using now a feature that is in there that I kind of like. macOS can add effects to your web camera. It can fade out the background or it can replace the background, but it now also has Edge Light, and that's what I'm using now. Edge Light basically creates this border on your monitor with very bright white pixels,
to give your face a bit more lighting in dark situations. And it's actually quite dark where I'm sitting, but you don't really see it on the recording, right?
Murilo (01:46)
Well...
No, you don't. That's why you look so beautiful.
That's cool.
That's cool. That's cool. All right. Maybe I just need to get used to it then. Maybe that's it. Ah, that, I don't know. I don't know if I'll ever get used to it. I was like, well, this is what I think of the liquid glass, but I'll show you. All right.
Bart (02:01)
to me being so beautiful.
Ha ha ha ha
Yeah, thing is
we don't have much influence on it, on Liquid Glass. We can sit on the lawn shouting at the clouds, but I don't think they're gonna change it for us. This is it now.
Murilo (02:18)
Yeah, it's true. They just say like, this is it now. And honestly, like
after a while when you get used to it, if they switch it back, you're going to be like, are we switching again?
Bart (02:26)
Yeah, but that's true. yeah. The thing is with these changes, like your muscle memory needs to adapt. And that creates a bit of friction.
Murilo (02:32)
Yeah, that's the thing. Like the beginning takes some... Yeah.
But we'll survive. We'll survive. So we have... We chose eight topics and then we have some... Not really tidbits, right? We have some extras that... Let's see if we can cover, but we're not...
Bart (02:45)
Yeah, if we
have time we'll discuss them and otherwise forward them to the newsletter.
Murilo (02:49)
Exactly, newsletter.monkeypatching.io. So, you want to kick us off, Bart?
Bart (02:53)
OpenAI, Anthropic, and Block are helping launch the Linux Foundation's Agentic AI Foundation to stop agent software from splintering into closed, incompatible stacks. Their donation of protocols like Agents.md, MCP, and Goose sets shared guardrails so tomorrow's task-running bots can interoperate safely instead of living in walled gardens.
Murilo (03:13)
So it's a new.
Bart (03:14)
Big news,
I guess, right? Or at least big, well-known protocols that we're talking about here.
Murilo (03:20)
Also, the Linux Foundation, I think it has a history of setting this neutral, quote unquote, ground, right? Like being a bit the Switzerland of the tech community, right? Like the neutral zone there. So the Agentic AI Foundation, I think it's a new one, right? I don't know how new it is, but it needs to be at least somewhat new. So Anthropic is donating MCP.
Bart (03:43)
I think
the Agentic AI Foundation is new within the Linux Foundation.
Murilo (03:47)
Yeah, yeah, yeah. Yeah,
that's what I understood as well. But I don't know if it's like, is it three months? Is it a year? I don't know. But I think agentic AI is getting more and more popular. I think it's also a signal that we don't have standards. I mean, we have patterns, but I don't know if we have standards, right? I feel like there's a lot of people doing these things, but
I mean, now I think we're starting to get to a point of: these are the standards. If you want to learn about agentic AI, these are the things you need to learn, right? And I think MCP is an easy one to say, right? I think it's very popular, but then there's Agents.md and all these other things.
Bart (04:17)
Yeah, and then...
So, Anthropic is basically donating the MCP protocol, OpenAI is bringing Agents.md, and Block is contributing Goose. Goose, honestly, I wasn't aware of.
Murilo (04:23)
projecting.
Yes.
Me neither. Well, maybe let's break it down for people that have never heard of these things, right? So MCP is from Anthropic, which they had already kind of positioned as a neutral thing. They wanted it to be a neutral thing, at least as far as how it looked, right? They had their own website, they didn't mention Anthropic that much. But they created a protocol for enriching the model's context, which most of the time means adding tools to the model, right? So basically, say you want to talk to your database.
You don't wanna reinvent the wheel, so you just have an MCP server, basically a server that runs next to your agent. And the agent can make requests like: okay, I know from your contract that you have this tool, run this query on my database for me, right, and give me back the results. So the agent can do that, and then it has access to your actual schema and all these things. Agents.md, what is Agents.md, Bart?
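Before moving on to Agents.md: to make that contract idea concrete, here is a minimal sketch of the message an agent sends to an MCP server. MCP messages are JSON-RPC 2.0 under the hood; the tool name `run_query` and its SQL argument are hypothetical, standing in for a database-backed MCP server like the one described above.

```python
import json

# A minimal sketch of an MCP tool invocation on the wire.
# MCP uses JSON-RPC 2.0; "tools/call" is the method an agent uses
# to invoke a tool the server advertised. The tool name and its
# arguments below are invented for illustration.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "run_query",
        "arguments": {"sql": "SELECT count(*) FROM users"},
    },
}

# The agent serializes this and sends it to the MCP server (over stdio
# or HTTP); the server replies with a JSON-RPC result containing the
# tool's output, which the agent folds back into the model's context.
wire_message = json.dumps(request)
print(wire_message)
```

The point of the shared protocol is exactly that this shape is the same whichever vendor's agent or server is on either end.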
Bart (05:18)
Agents.md is used in Codex CLI, so OpenAI's coding assistant, coding agent. And within Agents.md you basically define, and the CLI has some utility functions to define that, what agents are available to work on your code base. So, very simple example, you can maybe have a front-end developer, a back-end developer, and a database expert. These become the personas from which the agent tries to work on your code base.
Murilo (05:45)
and how is it different from Claude.md?
Bart (05:46)
Well, you could, I think, directly extend Claude.md to do this, but Claude.md is much broader. It also has definitions of what the solution is that you're building, things it needs to remember, for example: always run this command before committing. It's a much broader set of instructions.
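For listeners who haven't seen either file: both are plain markdown that a coding agent reads before touching your repo. A minimal, entirely hypothetical AGENTS.md might look like this (the commands and conventions are invented for illustration):

```markdown
# AGENTS.md

## Dev environment
- Install dependencies with `npm install` before doing anything else.

## Testing
- Run `npm test` and make sure it passes before committing.

## Conventions
- TypeScript strict mode; keep components small and functional.
```

The agent treats these instructions roughly the way a new team member would treat a README.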
Murilo (06:04)
So it's more focused. And also, even in the name, right? Like Claude seems very specific to Anthropic. And then, like you said, there's Goose, which also I didn't know. Block. I didn't know Block either, to be honest, but Block is the company behind... Where did I see it? It's like some...
Bart (06:09)
Yeah.
Yeah, they do a lot. I think they started a bit in the crypto scene. Actually, Jack Dorsey is behind it, the CEO. And I think Afterpay, the Clearpay payment system, is probably their biggest company, but within the group they also acquired Tidal, the music app.
Murilo (06:33)
Yeah, they have...
well.
Bart (06:39)
They have Bitkey, which I didn't know. It's a self-custody wallet. And also another product, Proto.
Murilo (06:45)
Yeah, and they're also the company behind Square and Cash App. So they're describing it here as a fintech company. And Goose, again, I wasn't familiar with it. It's also like a framework, an open-source framework for agentic coding, I guess. So, yeah, again, I wasn't...
Bart (06:49)
Yeah.
Murilo (07:04)
as familiar with them. So I think of all the names, this is the most surprising one, right? Which I think is good for them. I think even in the article, it works two ways: they make their name a bit, to say, hey, we also do agentic coding. And by making it open source, people can contribute to it. But in any case, it's nice to see these companies coming together. I'm also wondering about competing standards, right? So for the things that we saw,
there's not a lot of competition maybe, not for now. But I think, for example, the clearest competition is on agent commerce: the protocol from OpenAI, and AP2 or something from Google, which was basically about how to establish agent payments. And this is something I don't know, actually: can they embrace competing standards as well, or?
Bart (07:30)
I don't think they are competing standards now, because Goose actually uses MCP a lot.
Murilo (07:55)
Is it supposed to be like this is the standard and everyone else should follow this?
Bart (08:00)
You mean what is the strategy behind doing this?
Murilo (08:02)
Maybe not the strategy, but for example, would it make sense for the Linux Foundation to adopt two competing standards?
Bart (08:08)
question.
Murilo (08:08)
Because then you're
not providing, because I think the way I see this is kind of like, it's a statement of saying this is the way that everyone should do it, right? Now nothing is walled, everything should be pluggable. But the moment you adopt two competing standards, how do you go from there, right? Not the case now, not the case now.
Bart (08:21)
But that's not the case now today, right? Like these are complementary standards.
Well, I doubt that we'll see an announcement tomorrow that they also adopted a competitor. I do believe it will probably evolve from here. But whether we'll actually still have this definition of Agents.md and MCP and Goose in the future, or whether it will be fused into a single framework or something, that's to be seen.
I think long-term it's probably a good thing for them all, right? Because they have all the incentives to make sure that the whole ecosystem can interact with their offering without friction. If they all have their own interface, you need to start basically saying, yeah, please also adjust your solution to my interface, because we also want to be able to serve you. It's, it's...
This basically takes away friction.
Murilo (09:07)
Yeah, no, I agree. But then, for example, would it make sense for Claude Code to adopt Agents.md then?
Bart (09:13)
Who knows? I don't think today. But I think that's to be seen how Agents.md evolves towards the future, like within this in a more holistic total framework. I don't think it makes sense for them to do that today.
Murilo (09:25)
But I am happy to see these things as well. I do get the feeling that there's a lot of people talking about agentic AI and all these things, and there's different protocols that come up, and it's a bit hard to get your head around. And sometimes they don't plug in as nicely, and sometimes there is a bit of an overlap, but not really. And I think this will bring more clarity as well. And for people that are interested in this, I think this is also a good place to start learning.
Instead of trying to learn everything of everything and just picking stuff on the internet, maybe to look at here, what are the protocols? What are the building blocks? What are the shared layers between things? Or at least what we want to become the shared layers of everything. So yeah.
Bart (10:01)
And I'm kind of hopeful that we'll also see some work on security, which today is typically left up to the person implementing. Not that you need to enforce a certain way to make this secure, but at least providing defaults would be nice. So yeah, let's see how this evolves, right? I think it's a nice story, and it's also probably like...
They have incentives to do this so that everybody becomes interoperable. But it's also a very nice story, right? Like: we're here for the ecosystem, we're doing this for the ecosystem. So you see, we do this for the greater good and we're distributing the value, so there's not really a reason to regulate us, right? It's a very nice story to tell, both to the community as well as to regulators.
Murilo (10:40)
Yeah, I was just saying.
Yeah, that's true. That's true. And of all of them, I think the one that I'm really surprised about here is, I guess, Anthropic. I did get the feeling that Anthropic was...
more trying to say: I'm helping advance the research, look at all these things we did, and look at MCP, a protocol that we aim to keep neutral. I think it fits nicely, but it's also nice to get signals from other players like OpenAI and Block as well, that they also just want to come together at some point. I think we're getting to a point now where we know we need to have an agreement to have an agreement,
Bart (11:10)
Hmm.
Murilo (11:18)
otherwise we don't go forward. So nice. Let's see how it goes from there, but for me it's a strong signal, indeed.
Up next, we have DeepMind's SIMA 2, which now plugs Gemini brains into a gameplay agent that plans, chats, and self-improves inside unseen 3D worlds. The leap from simple command-following to goal reasoning edges it towards robotic-scale autonomy and offers a live sandbox for testing agent safety. So.
SIMA 2, I don't remember what it stands for. So there was a SIMA 1, I think that was a paper. So this is from Google DeepMind research. And basically they use these foundation models, I think Gemini 3 in this case, to have agents playing games. They call them virtual 3D worlds, but basically they're like video games. Maybe, I don't know if we can play this.
But basically, I think it goes from pixels on the screen to keyboard inputs from the model. So it's actually just like how, quote unquote, a human would learn these things. And then you have multi-modal things: you can draw stuff on the screen to say, go build the thing that I'm drawing here. I think the example is a spaceship. Yeah, so that's what it's showing in the video now.
You can also go ahead and ask questions like: where am I? What are you doing here? Or: go to the red house, or, go to the house colored like a ripe tomato. And then it knows how to go to the red house as well. And the thing is, it's the same model that was trained on different games, and it can learn from one game to the other, right? Putting things together.
Bart (12:55)
But it's still
like prompted by user. The user says, please do this for me.
Murilo (12:59)
Yes, but I think the level of instruction is very broad, right? One example that I saw is like: go find water. And then it knows how to look around and to jump and to do all these things. So again, they train on different games, right? But just to show that what you learn in one game can actually transfer to another. And what they're hinting at later, and I think this is the part I'm jumping ahead to, is that you could also transfer this to real life, right? So.
Bart (13:06)
Okay, yeah.
Murilo (13:23)
you can have the agent playing a lot of these 3D simulated worlds, but then at some point you can actually transpose this into actual robotics. I think the idea is: if you can transfer from one game to another, you can also transfer this to real life. Which I thought was very appealing, right? So I see here, you ask: where are you? What are you doing here? And it can give you a bit of an answer: I'm here because I'm looking for X, Y or Z, right? So.
Bart (13:26)
Hmm.
Nice.
Murilo (13:47)
Yeah, it's very nice, but maybe to bring it back to reality: it's still not something you're gonna see tomorrow, right? So for example, the success rate from SIMA 1 to SIMA 2 actually doubled, but it's still below human, right? So right now it's still very impressive, but you're not gonna see robots, yeah.
Bart (14:03)
But that's still impressive, right? Like what you're showing here
on the screen, the task completion success rate of the human is like 78% something, and SIMA 2 is at 65%. So it outcompetes a dumb human. Is that how you can say this?
Murilo (14:16)
Something like that, yeah.
65%, exactly. So it's not better than humans yet, but it's...
Yeah,
I think so. If it was me doing these things, it would do better than me. And the other one, I think this is the task completion success rate for previously unseen environments. And again, the success rate here is actually below 20%, below 15% actually. Yeah, that's low. But it's still, yeah.
Bart (14:42)
or wow that's low.
So
does this mean like unseen gaming environments? these environments were not present in the training data?
Murilo (14:54)
But I think the setup they have is a bit like a reinforcement learning thing. So I feel like they even show, and I know I'll jump back a bit.
Bart (14:57)
Hmm.
So also
not seen in post training basically.
Murilo (15:03)
Yeah, exactly. And here, for example, they show a bit, and I think this is from the reinforcement learning: if you say extinguish the campfire, you can actually see how one version of the model was able to and the other one wasn't. So just kind of showing the improvement there, right? So I guess, basically, for the graph here, the way I understand this is: if this was a robot, it's not ready yet to go to a place where it's never been before.
Bart (15:14)
Yeah, yeah. Okay.
Murilo (15:27)
You cannot just have a robot and say, okay, do this and do that, and then just take the robot in your car and drop it somewhere else, like, now you need to do this. The success rate is too low.
Bart (15:34)
So the
post-training labelers need to be in my house in order to get to a good robot. Otherwise the robot will throw the coffee pot through the window.
Murilo (15:43)
Yeah. Yeah.
For you. Yeah. Exactly. Exactly. I think in a video game it's easier, right? Because you can codify a bit what you want it to do and give rewards based on that. I think in real life you still need someone saying: no, this is bad; yes, this is good. But if you go from SIMA 1 to SIMA 2, it's still a big improvement. In, I don't know, MineDojo and the other one, the two environments, it goes from like, I don't know, 0.25 to
Bart (16:01)
That's true.
Murilo (16:07)
I don't know, 14%. And the other one goes from almost zero to 13%. So, it's still very impressive, right? But I think there's still a way to go for us to really fully benefit from this.
Bart (16:18)
Yeah, these gaming environments,
are perfect to train this and to get to something that might be safe in real life.
Murilo (16:25)
Exactly. I
think it's showing promise, right? It's showing a lot of potential here, but I think we still have some way to go. But it's cool in any case. And I also thought this feels a bit like an evolution of Genie 3, right? Which was just walking around 3D-generated worlds. So if you really think about it, it's actually been improving very fast. So yeah. What
Bart (16:45)
Hmm.
Murilo (16:49)
is next part.
Bart (16:50)
OpenAI has unveiled GPT-5.2, touting sharper reasoning, long context, memory, and tool calling tuned for professional work. Internal benchmarks show gains across 44 occupations and tough math and code tests, evidence for OpenAI that agents will soon handle day-long projects end to end.
Well, I guess this is the reaction to the code red at OpenAI. There were a lot of rumors that GPT-5.2 was gonna be released. In the end, it was one day later than rumored, but it's there. You're showing it on the screen.
It's said to be the best model for professional work. So think about: I want to do a research task, or I want to build an Excel sheet, these types of tasks in your professional day-to-day. And the best model for long-running agent tasks.
Murilo (17:39)
Yes, so I glanced a bit at the benchmarks as well, and it looks like it actually improved quite a bit. The numbers here, I don't know what a lot of them are, right? But knowledge work tasks, GDPval, it went from 38.8% to 70%. Yeah, yeah, they went up a lot, right? Like from around 40 to 70, it's a big increase, right?
Bart (17:55)
But the numbers go up.
Yeah, I'm a little bit... this is what we see every time a new frontier model gets released, right? A lot of benchmarks showing an upward trend. I think we should look at them, but also see them a bit for what they are. These are clearly the set of benchmarks that every new frontier model is measured against, so they're also
taking this into account in post-training, of course. I don't think this necessarily translates one-to-one to real-life performance, but it gives a direction on the trend, right?
Murilo (18:25)
Yeah, yeah.
Yeah, I think so. I think it's also a bit... for us, discussing this here, we look at a lot of benchmarks because they're the most objective thing to look at, right? But I agree with you, you need to really try it out and see how well it works. They also mention here that it's good at front-end software engineering. Actually, this graph I also thought was interesting; it kind of shows accuracy versus output tokens,
Bart (18:44)
Exactly.
Murilo (18:57)
and they're showing that GPT-5.2 Thinking is actually better than 5.1 Codex.
Bart (19:03)
but uses way more output tokens.
Murilo (19:05)
It will use way more. But I feel like, even across the board, right? It's always ahead, even with fewer output tokens, right?
Bart (19:11)
Yeah, I
see what you mean. For the same output tokens.
Murilo (19:16)
So
again, it looks interesting, right? But yeah, indeed.
Bart (19:19)
It's definitely a step up again. And I must say,
I have the feeling, when it comes to professional work but also to long-running agent tasks, maybe I'm a little biased here, but this is what I've been using ChatGPT for over the last year. And I very much had the feeling that they were also the best at this; this kind of confirms it. Well, professional work, and for long-running tasks, the way I mainly use it is deep research.
Murilo (19:35)
long running tasks.
Mm.
I
see. Yeah, I kind of see what you're saying. But at the same time, what's professional work? I mean, they do give some examples, like spreadsheets and all these things, right? But it gives a blurry picture to me: what exactly is it good at?
Bart (19:44)
tool.
But it is
a very blurry picture, I fully agree. But to me, it's like: as a support tool for market research, as a support tool to create content, as a support tool to create analysis. All these kinds of things that, well, for me, are not coding and not visual, right?
Murilo (20:19)
Yeah, I see what you're saying. But I guess for me, the thing that bothers me a bit, quote unquote, is that professional work is a blurry picture. I kind of get it, but at the same time I'm wondering if they just call it professional knowledge work because that's what's going to increase enterprise adoption or something. You know what I'm saying? I feel like if they were saying it's the best at Excel,
Bart (20:21)
everything that remains.
Yeah, I think that's the pessimistic view of this.
Murilo (20:43)
or Excel tasks or whatever, it's more concrete, right? It's less blurry, and I feel like I understand better what you're saying. And I'm sure there's a category of professional knowledge work I can think of that's not what they thought of. But to frame it like this is kind of saying: if you work, you should use ChatGPT. Right? I mean, maybe that is a bit of a pessimistic view, and I do think ChatGPT, in all transparency...
Bart (20:55)
Sure, yeah.
Mm-hmm.
But honestly, I fully agree it's a very vague definition. But if you take a broad definition of professional work, whatever it is, whether it be spreadsheets or presentations or research analysis or writing texts, there is not really another player today that says: we're the best at that broad definition. Because at Anthropic you go for coding, right? And Gemini, I mean, the only thing where Gemini was
Murilo (21:22)
Yeah.
Bart (21:26)
really state-of-the-art was coding, and now visual images. So I agree that it's a vague category, but I also kind of do believe that for now they're leading it.
Murilo (21:35)
No, I do. And that's what I was gonna say. In all transparency, I do think ChatGPT is the best for what I kind of call day-to-day things for me, like at work. Yeah, indeed. That's vague, I agree. So I do think they are ahead in a few places, but I feel like I'm trying to put my thoughts in black and white. Like, what is it that I'm trying to say here? So.
Bart (21:46)
It's even more vague.
Yeah, yeah, that's true. And maybe that's
also interesting, because this announcement of 5.2 is very clearly not an announcement like: we're the best frontier model across everything, which we've seen with a lot of models over the last year, right? Something new comes out and it's more or less the best at everything. But here they're clearly saying there are two areas where we're out-competing others, and for the rest we're...
Murilo (22:09)
True.
Yeah.
Bart (22:21)
quite good versus others, but there are only two clear domains where we're out-competing.
Murilo (22:25)
Yeah. I think we also talked about this maybe last week or so, that specialization of the different providers, right? It is a step in that direction. So, yeah, it's interesting indeed. And I feel like, maybe even for me, I would like to have a bit of, almost like a roadmap, you know? Like which model should I use for what?
Bart (22:32)
Mmm, yeah.
Yeah, agreed.
Yeah, the difficult
thing about that is that it changes every quarter, right?
Murilo (22:51)
Yeah, that's true. That's true. But for example, it changes, but Anthropic for coding has been like that for a while, right? So I'm wondering if maybe by the end of next year, we're recording as of December 2025, right? If in December 2026 we were to revisit this discussion, it's going to be like: okay, this is clearly this, and this is clearly that, and there are these categories that are not defined yet. Right.
Bart (22:57)
That's true, that's true.
Murilo (23:15)
To be seen, to be seen. In any case, I think it's interesting. Have you tried it maybe, 5.2?
Bart (23:19)
Well, yeah, because ChatGPT by default switches to 5.2 if you have a ChatGPT account. So for the things I do try it with, I didn't feel, from one day to another, a clear improvement or anything. But I did already have this with 5.1.
They made a big step in terms of less hallucination, which I clearly noticed. Apparently with 5.2, they made another step here. So that's a very good thing.
Murilo (23:45)
that's good.
Yeah, it's true.
One thing I was thinking as you were saying this, because I actually used it not that long ago to prepare the show notes: I realized that I was using 5.2, but I didn't even notice. Because right now, for you to select 5.1, you have to go to the model picker, then to legacy models, and then choose 5.1, right? Because now it just says Thinking and that's it. So, yeah, I guess.
Bart (23:58)
Yeah, exactly.
Yeah, yeah, yeah.
Murilo (24:12)
If I didn't notice, it's not a bad sign. I'm not sure if it's good, but it's definitely not bad. Exactly. It's the same or better, that's good. All right. Next we have: CNN and CNBC have inked deals with prediction market Kalshi, I'm not sure if that's how you pronounce it, inviting viewers to wager on tariffs, elections, and more in real time. The partnerships push gambling from sports
Bart (24:16)
at the very least neutral.
Yeah.
Murilo (24:37)
desks into hard news, raising fresh questions about journalistic incentives as odd tickers share the screen with headlines.
Bart (24:45)
Yeah, it's interesting to see news networks partner with companies like Kalshi, right? I don't think it's necessarily surprising. I think general journalism has been going down this road for a long time.
Are you familiar with Kalshi?
Murilo (25:00)
I'm not familiar with Kalshi. Are you familiar with Kalshi? Is that how you made your big bucks?
Bart (25:03)
No, no. I'm not a user, but I'm familiar with it. So, Kalshi is a bit... you have another big player here, which is Polymarket, which people might know. They are sort of prediction markets, and you can...
Murilo (25:05)
Hahaha
Bart (25:14)
more or less create a prediction around everything. So if you go there, you can see, for example: the Fed did a rate cut last week, I want to say, and there was already a long-running prediction on what it was going to be. And for every quarter there is a new bet, and you can basically see evolutions and trends there. But this goes around anything; any topic you can think of, you can have a prediction there. Which...
Like, if you would explain this: will Murilo injure himself during futsal? What are the odds, and how can I put money on this? And what is my payout if it happens or if it doesn't happen? If you would explain this to any person, they would say this is a gambling website, right? But apparently they found a way around it,
Murilo (25:51)
Yeah. Yeah, I mean it is.
Bart (25:58)
basically saying that they are working sort of like futures and stock options on the stock market. So there is basically a contract about future events and what will happen if this event occurs. That is a difficult way of saying it's not gambling.
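To make the futures analogy concrete, here's a small sketch of the arithmetic behind these event contracts. The assumption, typical of Kalshi-style markets, is that a "yes" contract pays out $1 if the event happens and $0 otherwise, so its trading price in cents reads directly as an implied probability; the numbers below are invented for illustration.

```python
# Sketch of the arithmetic behind a Kalshi-style event contract.
# Assumption: a "yes" contract pays $1 if the event happens, $0 otherwise,
# and trades somewhere between 1 and 99 cents before resolution.

def implied_probability(price_cents: int) -> float:
    # A contract trading at 65 cents implies roughly a 65% chance.
    return price_cents / 100

def payout_dollars(price_cents: int, contracts: int, event_happened: bool) -> float:
    # Profit or loss, in dollars, for buying `contracts` "yes" contracts.
    # Computed in integer cents to avoid floating-point drift.
    cost_cents = price_cents * contracts
    revenue_cents = 100 * contracts if event_happened else 0
    return (revenue_cents - cost_cents) / 100

# Buying 10 contracts at 65 cents: risk $6.50 to win $3.50.
print(implied_probability(65))                       # 0.65
print(payout_dollars(65, 10, event_happened=True))   # 3.5
print(payout_dollars(65, 10, event_happened=False))  # -6.5
```

This price-as-probability reading is exactly why the aggregated prices can start to function like a poll, as discussed below.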
Murilo (26:18)
Okay.
Bart (26:21)
And because of this, this is closer to what we see in financial trading than it is to actual gambling, and this is why you should allow us to operate. And Kalshi and Polymarket, even though it was a bit difficult for them to get street legal, they are street legal in the US. They are not in Europe. But what you see is, so the reason
Murilo (26:37)
Poof.
Bart (26:45)
news networks are focusing on this is that you have a lot of people putting bets on things like: will this person become the senator for this state in that election, for example. And it kind of starts working like polls, right? And polls are very expensive to run if you do them manually.
Murilo (27:05)
Yeah.
Bart (27:05)
But because there is so much volume behind these things, they do become informative.
And I understand why, in this case, CNN and CNBC have signed these deals with Kalshi, because Kalshi provides a lot of interesting information about what the consensus of a large part of the population is on a certain topic.
Murilo (27:29)
Yeah, I think the thing that gets a bit muddy for me is...
how much of that affects the journalistic integrity, right? How many of those signals are you actually taking in?
You also don't want to create a bit of a bubble where you just confirm things. You should try to be unbiased to a large extent, and I think these things tilt the scales a bit. So it's a bit hard. I mean, I'm not saying that it is gonna happen. I don't know, I'm not in that world, right? But that's a concern that comes to my mind when I see these things.
Bart (27:49)
Yeah.
I think from the moment that you start presenting these results as news, because that's probably what everybody's thinking will happen, right? The challenge here is that if you look at Kalshi or Polymarket, within the region where this information is relevant, let's say who will become the next senator for that state, you probably have a very biased audience, because...
I don't know, your 70-year-old grandpa's not gonna use this, right? They're not gonna be on Kalshi, I guess. And the other way around: the people placing quote-unquote bets, they're from across the world. So you also have this influence, and it's hard to filter out. It would be interesting to see how detailed the information is,
Murilo (28:23)
True. True.
Bart (28:39)
if there will be a difference in information that CNN or CNBC will get from the demographics of people placing bets.
Murilo (28:46)
Yeah. No, I see what you're saying. Maybe it's a very doomy scenario, but I remember watching House of Cards many years ago, and there was this guy, I don't remember which position he had, leaking information to a journalist. She was asking, but is this real? And he said, someone is going to be appointed to something. And she's like, is this real? And he says, well, if you report on this, it will become real.
And they kind of show how, because the public thinks that's gonna happen, people start to actually move towards it. That's a bit what I mean. If you have a lot of people saying, this guy is a good candidate for this position, and you see a lot of people starting to say this, then it also influences people to think that he's actually good.
Bart (29:11)
Well, exactly. Yeah.
That is actually good, but also, maybe the odds are stacked so much against you. Let's say I want to vote for a certain guy to become the next president, but we see on Kalshi or Polymarket that there's a 90% chance the other guy's going to win. So maybe I'm not even going to vote anymore; it's not worth it, right? And Kalshi and Polymarket are basically
Murilo (29:42)
Exactly. Exactly.
Bart (29:46)
providing a quote-unquote legal market to influence that, especially when this gets picked up by the news, because you can literally just put more money in to change the prediction.
Murilo (29:52)
Exactly.
Exactly, that's the thing. I mean, that would happen even if there was no tie between news networks and Kalshi. But when you have that tie, you're almost doubling down the weight of it.
Bart (30:08)
Yeah,
the news networks make this something that is seen by the public at large and not just by the people placing bets. Yeah, that's a very good point. I fully agree with what you're saying. Slippery slope. But I think the premise is already wrong: this should be regulated as a gambling website.
Murilo (30:13)
Exactly. Especially the people in the news networks as well.
So I think that's the, yeah.
Sure, I fully agree. But also, how does this actually work? I imagine there was a legal procedure, like a court ruling or something, I don't know.
Bart (30:27)
I think that is where it should start.
No,
I think probably a lot of golfing, or maybe a golden statue somewhere. I think those kinds of things happened.
Murilo (30:45)
I can't see.
Okay, okay. But then it could still be that it changes, I guess. It's not like... Yeah, okay. I hope so too.
Bart (30:53)
I hope it changes. Because if this gets allowed,
then it becomes a very slippery slope. Also, to add to our list of gambling websites posing as non-gambling ones... let's see.
Murilo (30:58)
Yeah, but yeah.
Agreed. Agreed. What is next?
Bart (31:06)
Disney will invest $1 billion in OpenAI and license over 200 Marvel, Pixar and Star Wars characters for Sora, the text-to-video engine. Beginning in 2026, fans and Disney Plus itself could spin up Mickey-to-Vader shorts on demand, testing Hollywood's balance between IP control and an exploding creator economy.
Murilo (31:26)
Yes. So when I came across this, the first thing that stood out to me is that Disney is investing one billion in OpenAI, where I would have expected the opposite. I would have expected OpenAI to pay Disney to use the characters. So at first I wasn't sure what to make of this. Like, why? I mean, it's a lot of money as well.
Bart (31:45)
It's a lot of money and we don't have the details of the contract.
Murilo (31:48)
Exactly.
Bart (31:49)
So there is an equity investment from Disney into OpenAI. Which, I mean, Disney is an old, classical company. Maybe it's good for shareholders, and for the market, to see that Disney becomes more tech-savvy by having a stake in OpenAI. I can imagine that there is some incentive there. But I can also imagine that from the moment Disney characters get used on
Sora, there is something about monetization in the contract. But we don't know, right? At this point.
Murilo (32:12)
True, I sure didn't think of that.
The other thing they mentioned here is that Disney says that, alongside the agreement, it will, quote-unquote, become a major customer of OpenAI and use its APIs to build new products and experiences, including for Disney Plus. So maybe there's also something there: by investing, they can actually bring this back into Disney Plus products. Maybe, I'm not sure.
Bart (32:35)
Yeah, and what it will probably give them is early access to new GenAI video stuff.
So yeah, I think another thing that plays here is that it's a highly contested market when it comes to IP. What we've seen in the last year is that all the big providers are just taking everything and running with it. No one cares about IP anymore. I think this is also a choice for Disney:
Murilo (32:50)
Yeah.
Bart (32:54)
Either you don't do a deal, or you do a deal and then you get some money out of it. That's what they're doing. But maybe it also sets up the market to become regulated on IP. Because what Disney did, I think one or two days after announcing this, is start a legal case against Google. Because with Google, also with Nano Banana and stuff, you can generate Disney characters.
Murilo (33:15)
Is this the... I see. They also sent a letter to CharacterAI. I don't know if that's the same. Yeah, for sure. Yeah, but...
Bart (33:20)
and also Midjourney here, I saw in an article. But I think the big news there is Google.
And this
collaboration between Disney, which is a very big player, and OpenAI, another very big player, might also become an example for the market of how IP should be handled in this world.
Murilo (33:40)
Yeah, I think, for me, they still missed the story a bit. Because again, OpenAI shouldn't take characters for free; if anything, they should be paying the producers. We still don't know the details, you know, but I think that...
Bart (33:54)
Well, we don't know the details of the contract, right? Like the one billion
equity investment is just an investment, right? I don't necessarily think we should say that in return for the equity investment, they are now allowed to use these characters. To me, these are multiple things running in parallel, right?
Murilo (34:14)
Yeah, yeah, yeah. true, true.
I think maybe that's why the article sounded very strange to me, because that's how they made it sound. It's like, Disney is investing, so now you can use my characters. That sounds like the worst deal ever. It's like, I spent all this time creating these things, and now you're using them for free, illegally, so I'm going to pay you to use them illegally. That's a bit... No, no, I agree. I agree. I think it's more the article, I feel like the way it was
Bart (34:21)
Yeah, that's true. That is a fair point,
But I don't think they're doing the equity investment just for that.
Murilo (34:40)
it was written, or the way I read it, that sounded a bit weird to me.
Bart (34:45)
But you can also
think about it from the point of view of the shareholder. If Disney becomes a shareholder in OpenAI, I want my investment, in this case OpenAI, to flourish in the future. I can help them by allowing them to use my IP, which is super unique in the market. And that will mean their share value will only go up versus the other players. So that's also a way to look at it, right?
Murilo (35:03)
Yeah, that's true. That's true. Yeah,
that's true. I also think that if Disney characters pop up in Sora, that's like marketing, right, pointing back to Disney. And if they bring Sora into Disney-produced videos in the future, there is more there, right, that could be said. But I landed on these conclusions after reading the article, not based on information the article gave. So that's why it also stood out to me.
Bart (35:20)
Hmm.
Murilo (35:29)
Do you think that there will be more studios moving like this? Because actually I...
Bart (35:35)
That's a good question. I don't know. I think the incentive they have today is that the market is unregulated, or at least everybody's saying it's unregulated. So you'd better sign contracts with these players to make sure that you at least get something out of it.
Murilo (35:48)
No.
true.
Bart (35:49)
There is an incentive there to go for these deals.
Murilo (35:51)
Yeah, I was actually looking here, because I also saw that Disney had made a similar investment in the past, but I need to double-check that.
Yeah, it says here that in February of 2024, Disney also invested in Epic Games. So they're also investing in other media outlets. I mean, it's not that comparable, right? But Disney is making investments in different channels that could also use their characters. So it's not the first time they do this, I guess. Maybe this time it stood out to me because it is OpenAI.
Bart (36:14)
Yeah, yeah, yeah.
Yeah, maybe
that's a way to look at it: Epic Games is a channel to distribute their characters, but Sora is as well, I guess, right?
Murilo (36:27)
Exactly. That's also what I saw when doing my research here: Disney and Epic Games to create an expansive and open games and entertainment universe connected to Fortnite. So this was February of 2024, and maybe it's something that paid off really well, and now they see the opportunity with Sora and have other plans there too. It's not just, I'm investing and we can use it; there are other things in the roadmap, and this is maybe even part of a bigger strategy for them. To be seen.
Bart (36:52)
What else do we have?
Murilo (36:53)
Yeah.
What else do we have? We have...
President Trump's new executive order tells federal agencies to fight state-level AI rules, and even to withhold grants, aiming for a single national framework. Tech lobbies cheer the preemption, while privacy advocates gear up for a battle over whether AI oversight should be centralized or spread across 50 state experiments. So basically, Trump is saying there are too many rules, it's too much all over the place, and we should...
centralize things a bit. I think the main point, at least how I read it, is about not slowing down AI. I even read somewhere, and I couldn't find it this morning, that he makes a statement towards China: if we keep slowing ourselves down, we're basically giving China the leeway to take the lead.
Bart (37:39)
We've seen a few things, maybe most notably California coming up with its own state regulation around AI, and some other examples. What Trump is now saying is that they're gonna fight state-level AI regulation and basically come up with a centralized one for the whole United States. If states don't
listen, they're gonna withhold grants, et cetera. And I think the whole thing about...
fighting state-level, or let's say smaller-region-level, legislation and coming up with a mature framework: it's easy for everybody to implement, it's easy for companies within the same country to understand what is expected of them, and it's easy for other countries to understand how to interact with the States because there's only one regulation that applies. These are all good arguments.
Murilo (38:25)
Mm.
Bart (38:28)
I think the whole problem here is that no one really trusts that this will happen well at the federal level. I think the only regulation that will come is that if someone goes for dinner and promises to spend, I don't know, 50 million on the new wing of the White House, then maybe they get their part into the legislation. No one trusts that this legislation will be a mature and fair one based on
experts. I think that is the big worry here. I think the general plan is a good one, but the execution will be as corrupt as everything else that happened over the last year.
Murilo (38:55)
Yeah.
Yeah, I agree. I think the arguments are there, right? But I also feel it's a bit early to say this is how we should regulate AI. It's moving very fast. And to say, okay, now we're gonna set the standards and this is how everyone should do it... I don't know.
Bart (39:13)
Yeah.
Murilo (39:19)
I don't know, I'm not in the legislative field, but I think it's still early to say this is how we should regulate things, this is how it should be for such a large group of people, right? And I think
And I think the real reason they're doing this is the AI race with China. When I saw this, I also thought of the episode we recorded with our... How can I describe it?
Bart (39:45)
Flanders' representative to the EU on everything digital, among which AI.
Murilo (39:50)
Yeah.
And one of the things we did talk about was the AI Act, right? Which is a different stance from the one in the US. I had that a bit in mind when I read this, and maybe that influenced me. I also saw this a bit as Trump saying, okay, I'm gonna deregulate a lot because we want to move very fast, we don't want to be behind China. Which maybe is true, maybe it's not. Again, there are, like you said, good arguments for doing these things.
But it also gives me bit of a...
I don't know. You move fast, but is it more dangerous? Do we know what the consequences are? This is still a new tech, still evolving quickly. Is it the right time to do all these things? I don't know what you think.
Bart (40:27)
It's also a bit of a broad definition, right? To me, AI regulation is very, very broad. I'm very much pro regulating AI applications. I don't think it makes sense to regulate AI models.
Murilo (40:40)
Hmm, yeah.
Bart (40:41)
To me, an AI
model is just something that helps come to a decision or an action, or to generate something. It's the application that you should regulate. If this is an application in healthcare, you should regulate it. If this is an application that undermines people's privacy, or an application that causes discrimination, that should be regulated, but not the underlying model.
Murilo (41:01)
Which is,
is it right to paraphrase it a bit like knowledge and application in science, right? Something was discovered. You're not regulating the discovery, but you're regulating the applications of it. So medication, devices, those you regulate, but the knowledge itself you shouldn't.
Bart (41:14)
Yeah, exactly.
Maybe that's a good way of phrasing it, but yeah, I think so.
Murilo (41:24)
Okay. So yeah, let's see. What I understood here is that for the state laws that are already passed, they also want to encourage the states to remove them, right? So they're also gonna go over a list of all the states and the regulations the states have already implemented, and
they'll try to... I don't think they can block them, but they can remove incentives so the states are encouraged to do so.
Let's see. I think, like everything else with Trump, it gives me a bit of anxiety as well. I don't know, it's a bit...
Bart (41:58)
Yeah, I mean, no
one is expecting that this is going to be an expert-led, fair process.
Murilo (42:04)
Yeah, it's the feeling I get, and maybe I'm not the best person to comment on these things, but there's not gonna be a dialogue where everyone sits down and discusses: okay, this is one point, this is the other. It's gonna be, no, you have to do it like this and that's just the way it is, and if you don't, you're gonna suffer the consequences, you know? And is that the best way to...
The feeling I get is that it doesn't welcome dialogue, right? These things are not gonna be carried out in a way of, let's get a panel of experts, let's really have a conversation and understand the pros and cons, because everything has pros and cons, and then come to a conclusion. That's not the vibe I get from these things.
Bart (42:31)
Well.
Well, there are some pros to that as well, right? At the EU level, with the AI Act, there's a lot of dialogue. So much dialogue that there's now a lot of dialogue about postponing the implementation of the high-risk category stuff. To get there takes a lot of dialogue and a lot of time. And that means that even though
Murilo (43:06)
Yeah.
Bart (43:09)
we probably get to a framework that has hopefully been well thought through and is more or less future-proof. That's what we hope. The process is at least clearer there: it will be expert-led and there will be good arguments. But it takes a long time, in a space where everything moves very, very quickly. And the only advantage of something like the US approach is that it moves very quickly, often for the wrong reasons, but it does move very quickly.
Murilo (43:34)
Yeah.
Yeah, that's true. It's a good counterpoint. Maybe that's the key difference, right? This does move very, very fast. Yeah.
What else do we have? What's next?
Bart (43:46)
An analysis tied to SK Hynix says consumer DRAM (dynamic random access memory) shortages could persist until 2028 as chipmakers prioritize AI servers and data centers. With inventories at historic lows and NAND flash facing similar pressures, PC and laptop buyers may see tight supplies and higher prices for longer than hoped, even if memory capacity expands.
This news about memory shortages has been coming up for roughly a month, I want to say. And what it comes down to is that these producers, of which SK Hynix and I think Samsung are together the biggest, are basically prioritizing server memory over consumer memory.
Server memory is now, in 2025, roughly 38% of output, and they expect it to move to 53% by 2030. So this is a significant change, and it will have a big impact on the cost of consumer memory, more specifically on DDR4 and DDR5.
Murilo (44:51)
How did you come across this? Did you feel it yourself? Were you building a home lab or something?
Bart (44:55)
No. Well, I actually did build one a year ago with my kids, and I'm happy I did it before the shortages. But there are already some discussions, some announcements from Dell as well, that prices for consumer laptops will go up. It will have a big, big impact on
Murilo (45:03)
Yeah.
Bart (45:16)
the whole consumer electronics ecosystem.
Murilo (45:18)
And is there a reason why there is a shortage now? Is it geopolitical? Is it the AI farms?
Bart (45:26)
No, it's just a huge
demand from AI data centers and a very high willingness to pay from them.
Murilo (45:32)
Yeah, it's another race, right? Every other week we see big numbers.
Bart (45:36)
It's another race, and
something that not everybody saw coming is that whatever advancements in AI are happening, we feel them in other areas as well, right? The prices of our electronics will go up because AI is evolving.
Murilo (45:49)
Yeah,
yeah, yeah. We're talking a lot about chips and making chips, and I feel like it's the first time I hear about this. Even on the website that reported it, it's not really in the headlines of the big news outlets. People are really focused on chip production, and Nvidia, and China being blocked, and so on. But
even if you remove one bottleneck, you can have another one, right? Something to think about. And if it wasn't memory, maybe it would be something else, right? I think...
Bart (46:13)
Yep.
Well, there is also,
that's maybe a bit in parallel, and maybe a bit lower priority because it's still very much in the rumor phase: there are some rumors that Samsung will stop its SATA SSD production in the coming year,
Murilo (46:30)
Mmm.
Bart (46:31)
because it's not that profitable for them anymore, but there are still a lot of consumers that put those into new builds. So on the longer-term storage side, we might also see costs rising.
Murilo (46:45)
Let's see. I wonder what the impact will be on the big data centers OpenAI is trying to build. Is the cost going to skyrocket? Did they account for all these things? There are a lot of moving pieces. It was already one trillion or whatever; is it going to be 1.5 trillion now? A lot of things happening. And last...
Bart (46:59)
True, fair point.
Murilo (47:09)
Thinking Machines has opened Tinker to all, pairing the trillion-parameter Kimi K2 model with OpenAI-compatible APIs and new image inputs via Qwen3-VL. Dropping the waitlist positions the platform as a nimble alternative for researchers seeking long-chain reasoning and multimodal fine-tuning outside OpenAI's ecosystem. So, Thinking Machines; maybe first about the company, Bart. What can you tell us about Thinking Machines?
Bart (47:33)
Thinking Machines is run by Mira Murati. Is that the correct way to pronounce it?
Murilo (47:38)
I think so.
Bart (47:39)
I'm gonna banana this, but I'll give you a second.
What's the name again? What did you say? Mira Murati. Okay, Thinking Machines is run by Mira Murati, who I think is the ex-CTO of OpenAI, if I'm not completely mistaken. Yeah. I want to say it...
Murilo (47:43)
Mira Murati, former Chief Technology Officer of OpenAI.
I think so, yeah.
Bart (48:03)
It was announced roughly a year ago as Thinking Machines Lab. They did a huge, huge seed funding round, the initial funding a company gets: 2 billion in seed funding, at a valuation of 12 billion after closing, which is the largest seed funding round ever.
You have this huge amount of funding, you have this very big name, and also very talented people joining Thinking Machines. So there was a lot of mystery behind it. And I want to say that two months ago they announced a beta, or an alpha, of Tinker, which is an API for researchers and developers to do aspects of training and fine-tuning of models,
more specifically open-weight models. That's what the generally available version now allows you to do. It lets you fine-tune, for example, the Kimi K2 model, but also a bunch of others, and it takes care of all the infrastructure for you. They have their CLI to interact with the process, and there's already a bit of a community around it; there's also a Tinker UI built by the community to put a visual interface on top of this.
So yeah, interesting to see whether we will see big results from people using this. I'm also wondering a bit: it's ambitious to build something that hopefully becomes the de facto fine-tuning and training toolset for model builders.
I think a lot of people expected more. If you do this type of seed funding round, everybody starts thinking they will probably come up with a frontier model themselves.
Murilo (49:37)
I see. I see.
Bart (49:38)
And I think that
maybe they have something in the pipeline that is bigger than this. But if Tinker has to become the product that justifies the initial investment, that would mean every big lab adopts Tinker as the de facto fine-tuning layer, right? And that seems unlikely, because these big labs are also very proprietary.
So yeah, it's interesting to see these kinds of tools getting more accessible to the public at large. But I'm also waiting to see what Thinking Machines will do next.
Murilo (49:59)
Yeah,
Yeah, so Thinking Machines only has this; Tinker is the only product they have. That we know of, okay. But the whole thing is based on fine-tuning open models, basically.
Bart (50:13)
that we know of.
Basically,
Murilo (50:20)
Basically. And I mean, is it open source or not? It's not open source. Yeah. So basically it's an API: you call something, and it takes care of all the infrastructure, everything, for you. And you get a quote-unquote personalized fine-tuned model. Okay.
Bart (50:24)
I don't think so. I don't think so, no.
And what they take care
of is all the infrastructure behind it, because that's not easy to do, right? This should make it very accessible for everybody.
Murilo (50:41)
Yeah, yeah, yeah.
Yeah, yeah. Because even if you had the code, it's also a scale problem, right? And I think so far the only other one we saw was Hugging Face, right? But Hugging Face was really open source; you really had to take care of everything yourself. I remember there were some cookbooks on how to train your own LLMs and all these things.
Bart (50:51)
Exactly.
Hmm.
Murilo (51:07)
I do think maybe it's not the sexiest thing, but it's a space where not as many people are competing.
Bart (51:13)
No, I agree. There's not a lot of competition.
Murilo (51:15)
Right? And is it unrealistic to say that in one or two years, companies will want to fine-tune their own models? Maybe not, right? Exactly.
Bart (51:23)
Well, especially companies, right? That have
their own data set, let's say contracts, these types of things. I can very much see a use case for that; I fully agree. It's hard for me to see how they can justify their valuation based on this, though.
Murilo (51:33)
Yeah, also thinking of other languages.
Yeah, indeed. I do think there will be a need in the market, right? Whether this alone justifies it, I don't know. But there are a lot of, almost like, cracks in the AI space that you can probably fill by fine-tuning these big models. I'm thinking of languages, I'm thinking of something very specific to your organization.
Right. I think this could fill those. So I think it's interesting. I understand why people are disappointed, because there's a lot of hype and sexiness in the AI space today. But if you're a business person and you think it through... it looks like they know what they're doing, right? I wouldn't count them out yet. So, interesting, interesting.
Bart (52:22)
Yeah, of course.
Murilo (52:25)
Are you going to try it? Have you looked into this or played with it a bit? Have you seen anyone on Reddit or in the community that tried it and has opinions? Okay. Interesting. And I guess we'll wait and see; I'm curious what's going to come out of this. Maybe we have some extras? That's all the articles we had for today. We have maybe some...
Bart (52:28)
No, not really.
No, also not, actually.
Let's wait and see.
Murilo (52:46)
some small extras; maybe you can go over them very quickly, and then people can click the links in the newsletter if they're interested. Do you wanna kick us off, Bart? Just really quickly.
Bart (52:56)
I think there is an interesting discussion on what is open source and what is not, a bit of a back-and-forth between Ruby on Rails creator DHH and WordPress co-founder Matt Mullenweg. I think it's an interesting read; I'll add a link to the show notes. There is also a new visual editor for the Cursor browser.
A bit of a drag-and-drop interface where you can basically drag and drop React components. I'm not sure I would use it much, because for me it typically works very well just via prompting, but maybe this is faster than prompting.
Murilo (53:31)
I think this is
also a bit as opposed to prompting: you can select an element and say, change this to bold, and then you reduce the context a bit.
Bart (53:39)
and then
it's maybe more efficient than just prompting because you already click on it, like you already have this very specific selector.
Murilo (53:43)
Yeah.
Exactly, or like,
center this. If you select the element, it's easier; it's almost like encoding how to select the piece of code, and then you say, change this to that. Maybe for you it's not that interesting because you're proficient in the front-end part, but for people that are less so, maybe like me, it can save time.
Bart (54:08)
Yeah, it's
a very intuitive interface, of course, if you have drag and drop.
Murilo (54:11)
Yeah, at least in the demos they're showing, right? So
it's a bit of a drag-and-drop UI way of doing things, but you can also look at the code, you can also prompt, and it's built on React. So it could be interesting. Yeah.
Bart (54:23)
And we have OpenAI, which is apparently now also adopting skills, like Anthropic is doing with Claude. ChatGPT has a hidden skills folder with very clear instructions on how to do specific tasks: for example, how to create a PDF, how to create a spreadsheet, these types of things.
Murilo (54:23)
What else?
Yeah, if that's the case, I'm
also wondering if skills are going to go into the Linux Foundation, if they're going to become a standard as well.
Bart (54:47)
Let's
see, let's see. But it's clear that it added a lot of value: Claude became very good at certain tasks, and they must have noticed and are now using more or less the same approach for ChatGPT.
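For listeners who haven't seen the format: a skill, in the convention Anthropic introduced for Claude, is a small folder whose SKILL.md file carries a short YAML header plus plain-language instructions the model loads when the task matches. A minimal sketch of the shape (the skill name and steps here are invented for illustration, not taken from either vendor's actual files):

```markdown
---
name: pdf-report
description: Create a PDF report from structured data the user provides
---

# Creating a PDF report

1. Validate that the input data has the expected columns.
2. Render the report to HTML, then convert it to PDF.
3. Save the PDF next to the input file and report its path.
```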
Murilo (55:00)
Yeah,
And last but not least.
Bart (55:04)
OpenAI is apparently using GPT-5 Codex to improve Codex itself. Which is not really a surprise, right? It would be weird if it were not the case. Yeah.
Murilo (55:11)
Yeah.
It's like, you should use my coding agent, but I'm not using it
myself? It's like, okay.
Bart (55:18)
Exactly.
And the only surprise to me was that not 100% of people were using it. I think the number was still very high, like over 90%. But Codex, for people who don't know, is basically a CLI. It's their alternative to Claude Code, or Gemini CLI, or Mistral's Vibe. And apparently the developers behind Codex are...
Murilo (55:33)
or Gemini CLI as well, right?
Yeah.
Bart (55:43)
using Codex to build Codex, which I guess makes sense, right?
Murilo (55:46)
Yeah, I think if you're a user of your own product, that's the best way to see what's going well and what's not going well. I also think Claude Code announced they were doing this a while ago, I think even Gemini. But yeah. I think for a lot of people, including myself, there are a lot of these tools, and it's not very clear if it matters so much which one you choose. Which one should you choose? Is one better for one thing and the other better for another? What are the pros and cons? Or is it just...
Bart (55:49)
I think so as well. ⁓
Murilo (56:11)
pick the best coding model, which is normally Claude, and just go with it. And there are also some open source ones where you can change the model underneath.
Bart (56:18)
Yeah, open source versions as well, where you can just, let's say, drop an OpenRouter key and switch to whatever model you want to use. But I think if you want to get started now, you can't really go wrong with Gemini, Claude, or Codex; all three are very good. And I think from the moment that you've been doing this for a few months, you start to build a bit of a preference for certain things.
Murilo (56:31)
Yeah.
Does it make sense to use more than one in parallel or no?
Bart (56:42)
I think it gives you a bit of insight into how they differ.
Murilo (56:45)
But then like, do you have a strategy quote unquote, if you use more than one or is it just.
Bart (56:50)
Well, for me personally, it's just good to see how they differ. I would not necessarily actively use them in parallel. Me personally.
Murilo (56:56)
Okay.
Yeah. I think the only time I would use them in parallel is if Claude says, you maxed out on your usage, and then I'm like, okay, I'll use another one now. But the only thing that annoys me a bit is that then you have a CLAUDE.md file, and then maybe another one from AGENTS.md, and so on. But I think that's the main use case for me to try a different one. And maybe that's not a bad thing, because you're trying different things and you can compare them.
Bart (57:03)
Yeah, yeah, yeah.
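One way to avoid maintaining duplicate instruction files, sketched here as a suggestion rather than anything discussed in the episode, is to keep a single AGENTS.md as the source of truth and symlink the tool-specific filenames to it (Claude Code reads CLAUDE.md, Gemini CLI reads GEMINI.md, Codex reads AGENTS.md directly):

```shell
# Keep one canonical instructions file for all coding agents.
echo "# Project conventions for coding agents" > AGENTS.md

# Point the tool-specific filenames at the canonical file.
ln -sf AGENTS.md CLAUDE.md   # read by Claude Code
ln -sf AGENTS.md GEMINI.md   # read by Gemini CLI
```

Edits to AGENTS.md are then picked up by every tool, so switching CLIs when one hits a usage cap does not mean rewriting the project instructions.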
Murilo (57:21)
For me at least, that's the only time I've touched more than one. All right, thanks for the quick rundown, Bart. If people are interested, where can they find these articles?
Bart (57:30)
We'll add them all to the newsletter and the links will also be in the show notes below the podcast or below the YouTube video.
Murilo (57:38)
Yes, yes, yes. All right, and I think that's it for today. Is there anything else you would like to say before we get going, Bart?
Bart (57:43)
We'll see you all next time.
Murilo (57:44)
Yes. See you next time. Thanks everyone for checking us out. Ciao.
Bart (57:48)
Ciao!