Tiny Brains, Big Gains: No-GIL Python, $300k OSS, and Europe’s Chip Power Play

monkey patching podcast where we go bananas about all things tiny models coaching and more

my name is Murillo I'm joined by my friend Bart hey Bart hey Murillo hey nice new place

new place place so for people going places in the world yeah that's not that's not true we are going

places we started in Murillo's closet yes we came out of the closet now we're kind of in an office

kind of in an office yeah yeah indeed so you may hear like uh I feel like we still need to do some

I feel like it's a bit echo you know yeah we probably need what do we need we actually have a

we have some fur which maybe need more fur maybe more fur cool but yeah now we're gonna it's a nice

place I'm excited about this change definitely it's gonna be good what do we have for this week

Bart we have a lot right we do have a lot we have a lot to get to I'll kick it off we have a tiny 7

million parameter network called the tiny recursion model claims big reasoning gains by looping on its

own answers it reports 45 on ARK AGI 1 and 8 on ARK AGI 2 challenging bigger is better reflex with

recursive updates if small models can reason cheaply startups get new room to build clever

agents without big tech budgets what is this about so this is a paper from I think Samsung I want to say

people from Samsung so we can take a look here yeah and basically they they propose a new architecture

so it's a they call it a tiny network and it's recurrent so in the sense that the output of the

previous iteration goes back in and it's also reasoning model right so I'm putting a lot of

things together here so recurrent means that the output comes back in that's not how LLMs work and

reasoning model is like when you have the the LLMs they have like a scratch pad kind of right and then

they put some things on paper but then they just use it as intermediate steps right so it's a bit

bold putting both things combined and they actually shown that like training a new model from scratch

with not that much data can have very good performance on some benchmarks so I think some

benchmarks specify here like the ARK AGI and I'll talk a bit more later but they perform on par or even

better I think I went on the the results later down the table and I think they perform even better than DeepSeq or

Cloud 3.7 you see so they're actually model is um was one of the best ones except except for uh Grok

actually but considering that it was very very small models very little data is actually quite

impressive of course for other benchmarks I would imagine they would do way worse and there was some

information there on these are these benchmarks are geometric puzzles yeah indeed so that's what I also

wanted to touch on but maybe just real quick as well they also mentioned hierarchical reasoning so

this is a paper that was inspired by that okay which has also has like uh it's a similar kind of

setup but it's a bit more complicated so this is actually a simplified version the hierarchical reasoning

model there was like two networks and it's almost like you had two different persons and one is on the board

doing the the more daily like the more uh hands-on tasks and then the other network would be just

kind of saying you're in the right direction keep going like this so it was also like the

different different way of looking at it and then there was like a feedback loop that was faster like a loop that was faster on the

on one of those and then the the higher uh hierarchy I guess like the overlooking model also had a slower one and

then the RKGI so that's what I also wanted to to to bring here in the discussion do you know what the RKGI is?

RKGI yeah the benchmark right yes you know what this is who looked into it so I looked as well so the RKGI so this is RK3 RKGI is basically RK stands for

what where is it abstract and reasoning corpus and RKGI is artificial general intelligence so it's actually from created from the same guy from the created Keras I want to say

and he was basically arguing this little video here we'll put also on the show notes

he basically argues that a lot of these actual benchmarks it doesn't really demonstrate intelligence

right a lot of times if you see a lot of data if you like a monkey can copy things you know like it's not what is really intelligence

so this benchmark was designed to to tackle more of the intelligence problem and what they do is that they have so you can

for people following the video I'm also going over the website and there's like a play humans and build AI

and basically the idea is that they have inputs and outputs but it doesn't tell you what the rules of the game are

and then the model needs to figure out by itself what does it need to do to actually accomplish the task

so I played a bit with this so this is an example right so you can play as a user and then you have the commands here

so just uh let me share this right screen so you have like a little a little interface here you have like the the arrows so that what you can do

and then you kind of have to like play around and then see what is actually the goal right so in the beginning you don't know

and the idea is that you need to that you need to let the model needs to basically figure out what it is

and that's how he defines that that's intelligence right interesting and uh apparently these for these types of things

well that's what I'm assuming from the the paper right the the tiny network the tiny reasoning

what's the t t t tiny networks I forgot how they called it it's like the tiny recurrent network dot t r m

they're actually good at the tiny recursion model yeah tiny recursive model yes exactly they uh

they're actually good at these things which again doesn't mean that it replaces a chat gpt right

because maybe actually a lot of the stuff that we asked the gpt is not an actual intelligence

according to the definition of the other cube but it is a different approach and I'm also wondering if

this actually I actually saw on the paper somewhere that they said that it doesn't scale as much so

actually having something smaller would actually give you better performance

so I thought it was also I also wanted to bring it here because

I don't hear as much of the the new architectures the novel things

and I also thought it tackled a different kind of problem I also learned about the rkji so

what do you think

very interesting the rkji stuff makes me like the example that you just showed makes me think about this uh

this game uh that has been running on twitch for a while uh pokemon basically

and I think it's a I think I think it's the game boy version of pokemon that's playing

not 100 sure and they have like uh all kinds of models trying to get to the end

and actually like it's super inefficient there are hundreds of different things

and the first time that it actually successfully finished the game was with

the release of chess gpt 5 really that's very recent but the type of game is also like

you don't like there's probably a manual somewhere out there right

to figure out the fitness function basically how to play this game and how to how to improve on it

but they didn't give them the the like the instructions to the models they just kind of say

this way these are the inputs that you can do and you can monitor the output

yeah exactly and I'll give you rewards based on the output but I won't tell you exactly

what led you to that reward okay

it's interesting to see these new uh architectures it's actually the first time that I'm

I'm reading on this um yeah so that that was a the beginning when I was trying the the game right

they said that the the the model was like easy for humans hard for ai and then I was supposed like

this is fucking hard I was I really like no kidding I was really like reflecting like am I

intelligent or am I like a well-trained monkey you know just repeat stuff like I had a bit of an

existential crisis but I figured it out eventually but I do wonder how they because it's apparently good

these type of quote-unquote puzzles yeah in games without instructions I do wonder how they train it

on this yeah it's true whether there is no risk of over training on this like are they also testing this

on yes I'm wondering how they how they do it's probably in the paper right probably the paper yeah

need to need to fine comb through it but because they also different version of the benchmark so maybe

they also trained on one before or not or I don't know if it's just actually it's a good question

because also training in the unless you train a different version of the bank benchmark but it's

also kind of cheating in a way I don't know yeah and I would I would also maybe because they're saying

that these type of models score way better on these benchmarks I would maybe also argue that

alums are typically not optimized for this type of contract yeah yeah indeed indeed and that's what

I was also there was also I was thinking like uh I understand what the relevance of this benchmark is

but at the same time it's like you're being very narrow about your definition of intelligence

at the other side like there's there's uh these are seven million parameter models which are super

efficient yeah you compare it to what is out there so that's that's definitely indeed indeed for going

further in this field for sure even the the other model that they mentioned like the hierarchical

something something hierarchical reasoning model the hrm is it was also like 25 million parameters

so we're also they show they work well but this is even a step yeah smaller right so yeah again I

think it's uh that was pretty interesting again not sure if they have a huge impact on our day-to-day

yet but uh who knows probably some I probably would have said the same thing about transformer

architecture when it just came out right and now we're here true all right

else python 3.14 is out python 3.14 is free threading no gill finally lets plain threads run cpu

work in parallel not just io lucas a friend of ours it's for model inference and async apps and shows how

to try it out fast throughput isn't everything if latency matters this could shift python's cost

and architecture math more than any other auto scaling button um python 3.14 is out did you know that

i did know that it was out but i haven't really looked into what exactly was released with it yes

so one of the things that was released well with an asterisk and i saw i hadn't seen this for myself i

just saw like people on my network sharing about this that actually the free threading the no gill

uh is actually out with an asterisk it's not the default yet you need to kind of set some flags and i don't know if you need to rebuild it but like maybe you don't need to build it but like you need to specify that i want the free threading version of the gill

of the gill what is the gill maybe for people that this is the first time that they hear

we're a bit on spot here i'm trying to say to get a eli 5 answer on this so the gill is the stands for the global interpreter lock

and it basically means that in python even though it sometimes feels different but it can only do

one thing at a time ever

whenever something gets done basically locks the cpu just to that

just to that job and then releases it and then another job

exactly

even if today or let's say before 3.14

you use something like threading

there was already a threading library

and it felt like threading but in essence

it was not really threading

yeah the way i i understand is like there were different threads but like you weren't doing stuff in parallel

it would just switch between threads all the time

so it was a bit useless in a way

and yeah i know python is popular with ai machine learning all these things but it's also because they

enter they have an interface with c and then when you're on c land then you don't have this problem anymore

that's why like numpy and all these things um got really popular right like the with python so but now 3.14 is out the

this is a article from lucas valatka

ex-colleague and a friend of ours

and he kind of he makes some some interesting points on why we should care about

about this development he also points out that now python is probably one of the fastest interpreted languages right

big statement is it true

i have a good question i haven't checked

but

on his actual arguments right so for example he says

maybe people are pretty skeptical right so you can use multi

i can use multi-processing which basically

python because he couldn't run stuff in parallel

you could run different python processes for each one of those

like that was a way to to bypass the the gill

right um and then he kind of says yes but there are of course

like costs from this right can because you have different processes that are separate

you cannot have really good communication between them

right the other says about the the c and rust extensions um

but then again you have to write c rust right think io so

even though if you have something that is io bound meaning that is like you have to read

read things in and write stuff out then

python can be faster for these things because basically you can tell the the single thread to do

other things while it waits for response and then it just comes back and check right but a lot of the

stuff is actually not io bound right and then he goes on and on for like a lot of different

articles on like okay you can scale horizontally yes but you're still very efficient on the single

on a single instance and again machine learning inference i think a lot of this stuff also seems to

see in rust extensions and existing io apps maybe one thing i also wanted to to to add on this

python just because it does have the the free threading right it doesn't really mean that

the code that you have today would just run faster no true so i do think there's a lot of code that

would need to be reworked to be able to leverage these things i'm also wondering how many how much

people are going to go for like how much people are going to really invest the time to to work on

the free threading for python you know because most of the stuff that there was a real need

people have found workarounds for this today right and also i feel like if you're going for speed

not sure if python would be the first choice either right i think it depends a bit on the on the use

case you're building of course i think from when you're building from scratch in the future and

threading odds value be it be it efficiency you will probably use use this native approach right

yeah instead of going for multi-processing i'm still curious to see i'll still be curious to see how

how much work is it to actually implement this right because i feel like then you seem to

manage different processes and like it's not easy right it's not like you just turn a switch and it

just works no no you need to develop specifically for it right yeah but i feel like that's also goes

a bit against why python is so popular that's what i'm thinking you know python is really simple you

don't have to worry about much stuff that's why a lot of people just do it and if you do need to

worry about all these things then you have the c extensions the rust extensions and all these things

yeah i think it really depends on this case i think uh back in the day a long long time ago

threading was already very popular and like even though it wasn't like need actual threading but like

for uh for when you were building a let's say a desktop application and you had to uh you had to manage

uh updates to the window paints and stuff like that like in the end you were using threading there

like it's been used since day one i think what they're trying to do is simplification i think

simplification is always a good thing if you don't need to think about anymore that reason even is a

gill yeah there is a global interpreter lock like threading is just actual threading i think that is a

good thing but it goes a bit i think why did it take so long is because probably for simplification

reasons at the very early days there was a gill and that made it very hard and not very simple to

move away because a lot of the things that were implemented like assumed that there was a global

interpreter lock so it was not simple to move away but in the end we get to a simplification of of

understanding the api that you're interacting with as a developer so i think that is a good thing

i think so too but i also think that because well as i understand

it's not like you need to learn like it's not a different programming language right like you're not

losing a lot of the stuff everything you have today will still work and now you have this yeah so for

sure i'm all i'm all for it but yeah again curious to see the first serious applications that we use

this um see how the experience is but yeah let's see and the other things also so three four point

fourteen i think that was the first the main thing or the biggest change but there are other things

that were changed as well did you have a look at that no no i didn't so one of the things that it

meant so yeah okay i'll just very skim through the table of contents here on this other link

the python repo now is a bit better so there's different colors and stuff and also if you mistype

a word and you try to run the program the interpreter will say ah did you mean this did you mean that

yeah right so kind of like in rust they also had this now you have t strings have you heard of t

strings template strings exactly uh i've heard of them i've never used them so now they're released so now

now you can use it but i think the idea here is to because people use a lot of f strings right so

yeah yeah and now i think especially with i think because a lot of lms to be honest right i think a

lot of the you're prompting a lot of stuff for lms so basically have this different type of string

here which is a t string you can like it doesn't evaluate to a string directly evaluates to a template

right and then the developer can actually choose what you want to do with this what you want to

interpolate this you can also do some uh like checks right like to to make sure there's no

injection or some some sorts right so now it's part of so in uh on the import lib right the standard library there

i think that's that's those are the things that really caught my attention to be honest but

and i do i actually have heard as well that it's also the fastest python again that's good huh even

without the the the the now it makes me actually wonder what look is the statement like how uh how

father interpret the language is like how would this uh how does this compare to julia for example or

sure sure i'm not sure not sure but yeah with the food or to um what's the name a bit of the the

superset of python mojo mojo yeah but mojo is compiled i think is mojo i think so okay my bad

i think so yeah i think so mojo is compiled okay yeah understanding mojo a python-like compiled language

but yeah curious how these things go indeed what's next a javascript dev made over 300 000

dollars from light gallery by dual licensing a free gpl agpl for open use paid commercial for closed

projects and because gpl compels any bun embedding his code to open source their whole site

most companies buy a license that is the core of the business model he pairs that with contributor

agreements a clean major version switch and exclusive features to nudge upgrades smart sustainability or

tollboot on open source what do you think mariella

i don't know i don't know actually so to to make sure i understand correctly he has two licenses what

he did uh he makes claims that he made a lot of money i'll leave that in the middle he has this uh

this uh javascript project called light gallery this is by the way good good advertisement for that

it's a bit of a it's it's basically like a an image and video lightbox gallery do you know like if you

click an image on a website and you get like a gallery of images and then you can scroll through

them yeah so that's light gallery what he does he has a he has a very much open source license

gpl or agpl and um he also has a paid commercial license i never really thought and the paid

commercial license is basically to make money off this which makes sense for him right like he's putting

a lot of time and effort in this we really have this sleek thing he was trying to find a way how can i

build a business model around this and then the gpl license is like it's open source but so a lot of

people use default to the mit license right okay the gpl or agpl license they they are a bit more

restrictive right they they force you as a user of that to if you make a certain change of you use it

to be very transparent on what you use or what you do to do with it so and he says actually this difference

between gpl v3 and agpl v3 so he says the gpl v3 is best for libraries and frameworks

because it's triggered by distribution so if someone includes your gpl javascript

on their public website they must must in theory open source their entire website which commercial

companies are never going to do and agpl v3 not really relevant for his project but he says the best for

for sas projects for the service where a code is used on a server but never distributed so the agpl is

basically trigger when a user interacts with that software over network and if that is the case if a

user interacts with your agpl project over that network like the company again is forced open source

they're all which you're not going to do yeah it's very easy because it's open source projects it's very

easy to for a web developer to use this library it's like oh look there's school looks very nice it looks

very sleek we need this we can't use it but license only cost like i don't i don't know honestly don't

know what this pricing is but like it's maybe it's five euro five euros a month or something right

like it's it's a no-brainer to use a commercial license yeah so i thought it was interesting an

interesting point of view so basically you have like uh both are more you have a paid license which

whatever it is and then you also have a very strict open source license and then like because the mit

is like it's a bit like you can use but you don't need to open source anything it's like right so you

need to acknowledge i think you just need to acknowledge yeah so then you have like a very

strict open source one and another good option and it's okay let's just go for the the small fee one

because we want to respect the license because we have another alternative

to the open source one and we we're not going to open source a whole website that's that's a bit the

idea okay and he also mentions a little bit like how he handles contributions

where you basically have like apparently there are hidden bots to do this so when you have

users contributing to your project that has these type of licenses where there's also commercial

part that you you can do like you can have like a contributor license agreements so contributors

basically let you use their code but they still own it i'm going to contribute to your project so

you can use everything but i still own it and you also have a copyright assignment agreement that's

basically one from the moment i commit i sign it off to you it's yours okay as a repository owner

and apparently you have fitted bots that take care of this for you that's handy that's handy yeah and

i would definitely imagine that like if i'm using light gallery and have this fine a small tweak that

i want to do and it's a fit but i'm not gonna care too much like i just want to get it fixed right

yeah yeah i think they're still there's still even though like this will probably scare off the

true open source aficionados yeah like there will still people be that are engaged enough and i've

actually had but it's in a little bit of a different i've had similar examples but slightly executed

differently so i used a while ago there is this library forgot the exact name uh to basically

template word documents from

javascript or typescript okay and you can easily generate a word document but then to it works and

it's the best function because you have a number of alternatives but it's the best one so you go for

that one and i say okay you can use that for free but to also let's say i think for example inserting

images in the word document you need to plug in then you have a commercial license so i had this

approach where it's very much open source but yeah for some parts you need to pay but i hadn't really

seen this explicitly like the same the same thing but it's just based on how to use it whether or not

you're under the commercial order yeah no but i think the plug because the plugins thing also feels a

bit like the freemium model yeah true yeah it's a good comparison i think yeah but it's true i've never

seen anything like this um i think it's an interesting venue to explore as a

library author that is very engaged and making sure that you have something and

it's a creative way to build a business model around exactly and i think

ways to support open source i think it's it's always good right like finding new ways that people

can can make a living out of these things and still like as i think it's it's a difficult discussion

because this is good for open source but it's a bit of a slippery slope right

why like this is still open source because it's under gpl right yeah that's what you're saying but it's

very clearly just chosen for that so that's

like as a company you can't really use it unless you go for the commercial license

it's just to nudge people that yeah i know what you're saying but the advantage of course like

it is still an actual open source license that you can use it under even though it's very it's a

restrictive one yeah restrictive if you unless you have an open source project as well right but i feel

like it's really like it is a very because that well if i understand what you're saying like a lot of

of times the open source it also like open source still moves the world and i feel like

by having this very it's almost like water and oil right like these two like the things that are

open source with they're going to use this license and they're going to everything's going to stay

open source and everything that's commercial there's no it won't feedback right like you won't

you're just you're just consuming it but then like i'm also wondering like how

would this also scale like for for if everyone had this this setup would open source be as big as

it is they would be probably not yeah that's probably not so yeah yeah it's interesting interesting

to think about but i do think that if this is a recipe for success i think it would be good for

more people that are trying to also make a living out of open source to to rely on these things right

so it's always good alternatives are always good i feel true

all right and up next we have the netherlands takes effective control of chinese-owned chipmaker

and xperia citing governance failings and risks to dutch and european economic security using rarely

invoked goods availability act ministers can reverse harmful decisions wing techs shanghai listed shares

promptly fall 10 it's a new line in europe's tech sovereignty sand does data custody now extend to

boardroom control what do you think bart um i think it's an interesting uh news article that popped up uh i want

to say yesterday yes october 13th yes uh that's actually today uh popped up today but um

this is a very aggressive move by the dutch government so a little bit of background here

wing what's the name

wing tech is the parent company the shanghai listed chinese company

nexperia is a subsidiary of wing tech and that is in that i think it's that quarter resides in netherlands

was bought by wing tech i want to say in 2016 around that so not that long ago it comes originally from

uh philips sold by philips and they uh are big in the semiconductor industry so they make things like

transistors and diodes and stuff like that so not super fancy stuff but like things that are used throughout

every type of electronics that you can imagine basically from household appliances to cars whatever

they're quite big i think they employ around a thousand people um and in the netherlands or

i think in the netherlands yeah i'm not sure on that but they're big in the netherlands and

what now happened is that basically they the netherlands by the use of this this act they have a name

for the act uh good availability act yes i'm a bit at loss all the details on the act but uh

they took away control of nexperia from wing tech so that's very big right like yeah

and take control means like so they basically uh took over the boardroom so they say that the current

board of directors is no longer capable of making decisions

what the next step is i'm not sure yeah to be honest and i was looking into the details on why

this decision was made and today the details or the public details and that are still very fake

like it's it has something to do that that the governance of nexperia

was something was not not an order and that had to do with a

security on technology and ownership on technology within the netherlands and the european region

so that's very fake right yeah yeah like cyber security stuff that's the only thing they're saying

cyber security or something like with the the knowledge on these products that they're making

that something was happening around that but it's still very fake what is happening well why why this is

being done i added this because it's like to me it's a very very i think it's a bit unprecedented in the

european region i haven't heard about this at least it's um it's a very it looks like a very big

step up in tech that looks to be linked at least to tech sovereignty yeah like i'm afraid you're gonna

take our tech away from us even though it's clearly owned by a chinese company so it feels very

weird like yeah it's um it's a weird situation yeah i feel like yeah i'm curious what's the next

it's gonna be uncovered right because there's probably gonna be more and this is very there

should be more there should be more i don't know if it's also my my bias right but like uh chinese

government don't and then there's always a bit like it's not government owner no sorry not chinese

government on but like um china and maybe also the the way that the title reads right dutch government

takes control of china owned chip maker yeah it feels a bit like uh i don't know but i do hope

that we get more details because this is to me this is not a good thing that is happening like if you're

conducting business in a company and you're trying to abide by everyone's laws and you're doing so

and already since well since a long time like i said like they acquired in 2016 and then suddenly

for whatever reason that is today still very vague just saying like you're doing something wrong with

and it's not clear which lawyer actually breaching like which is it is it compliance related like it's

you need to have some as a when you're running account company you need to have some legal certainty

right like the legal system that we're in now and now we're applying it like we need to be certain

that this is we're in this for the long run yeah it shouldn't change too much but do you think that

like they they are not also aware either what uh why they hope they are i hope they are but i hope

that that's what i'm saying and i think it's uh it's big news but it really requires the context to

form an opinion on this yeah no that's true that's true but i do think like at the same time if details

do come out and you see like what the fuck like what were they doing like then then it can be a good

thing as well right that the government still has the the the not like the power kind of but also

like that they're they're not passive right that there is someone that is actually making sure that

the true so it really depends on what it depends on yeah because that's what i'm like because you

said it's not good but at first i was like but if because also this is the first time i think we

heard this happening so i would imagine there was something very serious right so in my head i was

thinking like oh it's good that these things are being forced right so i kind of yeah the other way

let's assume it has a good reason right let's assume yeah indeed indeed so we'll definitely follow

up on this what else we have simon willison coaxes claude's code interpreter into zipping its public

follower then open sources the problems and scripts it finds they include word powerpoint excel and pdf

skills plus a python tool that autofills forms using py pdf great for power users awkward for opsec

how transparent should agent superpowers be to everyone poking around

this was also shared by a colleague actually so shout out to to load actually so this is simon

willison who know he does a lot of stuff on ai gen ai he actually has a lot of articles and

claude not cloud code i think claude ai even like claude desktop they they actually shared

like what did they what was the name they gave a very bad name actually but basically claude now

can edit word documents pdfs powerpoints and excels right that's something i think chpd could do

before but then basically they released this and not too long ago also in the last weeks yeah yeah so

i think so this article is from what is this article from the 10th of october so not that long ago as well

and the article said like last week so they they released it and actually the reason why it was a bit

unclear is because they didn't mention like i think code interpreter or something

code interpreter functionalities but basically means that now

claude ai can read documents they can edit documents they can save documents they can create pdfs and

do all these things so when they when they ask someone actually had a chat with um with claude

saying like how does this work right and then actually was it was very open like it just kind

of said okay we have this we have this directory here we have uh different skills these are the

directories inside this directory so our sub-directories right for doc docx pdf ppt and excel and

the even the prompts like saying claude thinks that you can do you know how to manage word documents but

it doesn't so don't do that always use the tools that are listed on skills so every skill every

subdirectory there's a big skill.md file as well that they show like prompts and all the code there's

also python code um and what i thought it was interesting well first i think it's a it's a nice

feature to have now claude can actually interact with these types of documents but it also shows how

you can interact with these types of documents so actually he he just asked claude to say hey create

a zip file of everything on this directory and he actually was super happy to do it and he just

gave it to him and then he actually put it on github so this is kind of what what you have here

the readme is just basically pointing to the blog but then if you go around you actually see the the

whole structure right and i thought so you see the skill.md and what i thought it was interesting is the

you kind of get a bit of a a peek on the anthropics developer right so what kind of problems they use

but also how how they structure the the tools right so you have scripts here and you can see like

the document and how they they prompt the lm to to use these things and even think i was even thinking to

myself they probably use cloud code but they probably still also do a lot of coding themselves

right like with documentation and all these things the fact that they use a lot of like html tags as

well for the um for the prompts you know just like they say like you know like they have the the html like

xml right like available skills and then four slash available skills and then this and this it also gave me

it also made me reflect on how i prompt things and how i set these things up for example one thing that

they they copy for every skills claude thinks that he knows how to manage pdf files but it doesn't

use the skills always use the skills all caps and all this and all that so i cannot open this here now

actually on the the chat so this is a chat that someone else shared uh but that there's he's talking

to claude like okay what does this mean what is that uh explain how this works can you share this

can you share that so they they they share everything here uh claude shared everything

here and i don't know if it was on purpose right but i think in any case i think it's good for for

people to see a bit how things work under the hood and for me also to to think a bit like what can

these tools do what are they good at what they're not good at how to prompt and etc etc i think it's

always especially for these which are are very big models that are being used a lot by the public yeah

it's always good to get a bit of a peek behind the curtain right yeah i feel like also these people

they're working constantly with these things they're constantly building tools but i think also having

some some like tips right from them like this is how this works how that works this is what worked

well this is what didn't work well i thought it was very interesting one thing that also on the

article simon willis says that they also have a lot of nice tools so like if you want to manipulate pdfs

with python you can also just take one of the tools that they have there and just just use it right so

there's a lot of interesting stuff there so cool i invite everyone to to have a look right

all right what and what's next london stock exchange group says customers can now

build copilot studio agents that use licensed data from workspace and financial analytics to an lseg

managed mcp server it is fewer glue scripts more govern access agents live inside microsoft 365 but

talk to lseg via model context protocol so i thought this uh for um the simple fact that we discussed

mcp service last week yeah with the snowflake snowflake we're using the managed mcp for server this week

it's the london stock exchange group we have a managed mcp server now through which you can uh

basically get information on financial markets that's how i understand it um via the model context

protocol is communicated and it is very neatly apparently in uh copilot 365 but i assume that you can also use it

with other uh with other uh mcp uh clients the mcp server like you have like it says here microsoft

365 but i guess you can have any client that you use the mcp server that's the idea i think in theory

yes and in practice i don't know they probably integrated very heavily with uh with microsoft

copilot for authentication stuff like that i would assume that's because they're their release have

focuses heavily on microsoft yeah that's what i was also like a bit curious because mcp should be

client agnostic but they are mentioning a lot of microsoft stuff but maybe it is for the authentication in

yeah cool so you know what kind of what kind of information you can pull from it or no no i don't

know it's financial market information but i've really gone in depth but i think it's yet another

sign like we see a lot of these things like official mcp servers coming out will we have a mcp server for

the monkey patching podcast what would the tools be that like you find find find articles find the

quotes or finding for or maybe just asking questions about like what's the give me a summary of the

last episode i think there's a good mcp server of just like no one really thought about like they

they just thought about can we do it they build it and no they didn't really realize how should we do

it it was like there's no mic but it's there but it's there manage and all that's secure manage and

all like it's just like the message ends up in your slack and you type back exactly just like

like automation zafir's thing murillo in the loop but no but i think i do mention last time that we see

like it's probably going to be something that will continue happening right like i think it makes

sense that like these providers they have their own yeah and because you also have these like we also

discussed last time like with uh open ai's agent kit yeah the canvas builder that they have now like

they have full support for using tools from ncp service so you know no longer only see it in

clients like cloud like gpt like these same things or or copilot but also in workflow automation tools

like yeah the there's really an uptake of the under of this technology now for sure for sure and i

think as soon as you have like i think about that like i said agent kit as soon as you have like

these low code tools or no code tools the market for like the need for these mcp service increases a lot

yeah definitely right and i think it's good as well i think the protocol will also mature and

other things as well so it's really really cool what else we have um a women's health editor

trades six weeks with runa stravas new coaching app and likes the tail of paces behind this entirely

interface amid tiktok critiques of aggressive mileage runner replies we don't use ai to generate

training plans seeing expert designs them while ai adjusts the progress fair deal for experienced

runners may be risky for beginners so how much coaching can an app do before your

knees file a complaint you're a runner no bart

i would normally say yes but maybe i could say now i used to be a runner because we've been

engine for a shitty long time but let's not go into that but you're using runner

that would be a good excuse yeah it would be easy if you can just blame like it like an app

yeah it's this fucking app yeah it's like yeah it would be easy yeah now why i put this here i think

it's interesting like it goes into bits like there's some discussion in that community that they think

that ai is being used to generate training plans and because they typically

these training bags go a bit too high in typical training load like a bit too intensive and that's

apparently also what you get when you use something like chpd directly so it's very apparently very

very but the issue is not that it's using ai is using ai without any

layer on top of it they're just using the plain ai because i guess you could also have no i think

their issue is that they're using ai without enough actual knowledge on yeah exactly on the training

a theory behind it yeah yeah because that's that that's also what i was fishing for because

you could also have you could also use ai but you can have like a curated sources yeah that you should

instruct the ai to always follow through but that's that's that's not what apparently they're using

well well not they're a bit vague on this i think that's a bit the the feedback from the community

is a bit that they're questioning that like how much is actually um a app is this runner which

was recently acquired acquired by strava not too long ago let's say four ish months ago okay so

was it there was a startup or something and then it was a startup um i think a uk startup

which grew very very quickly i also think they did a shit ton of marketing like a you couldn't

look at anything sports related without getting an ad oh really wow i think they grew also a lot through that

but we we see a lot of these apps coming up right like we also have something in belgium

which which i would say is a competitor to runa it's called renara but you have a lot

of these candidates right every other training tool is now building an ai coach yeah but i'm

also wondering if it's because it's quote-unquote it's easy because i feel like if run is really

just using chgpt then yeah like you can have a hundred of these competitors right

just marketing around and maybe the ui or something right that's true so yeah maybe

one thing before we continue this is on the woman's health mag is there anything woman related to this

because i think it's just a woman that's tried runner okay it's the only thing reason why it wasn't the

the women's health uh okay um what do you think of this uh coaching

i think it makes a lot of sense when you do it in a very curated manner

i think what is the the challenge if you just if i just asked just give me a play in training

plan because in three weeks i need to do that and help me prepare for that in three weeks like it's

not sure if that's the best support you can get then you could probably better go to

or whether stream or three weeks or three months doesn't really matter but like without specific

instructions it's probably better to go to coach to avoid through like a coach i mean like a person a

person there's some actual experience to avoid things like overtraining and injuries but i think from the

moment that you have this a theoretical knowledge on training physiology and you make sure you have a

very curious prompt or or whatever solution to to inject that i think then it starts to become

better i think when it makes it even better is when you have objective feedback from users or from

from users or from wearables from the user ah yeah i see what you're saying uh because you had a

little toy project around this no yeah it's already a while ago for uh to prepare for a race that i had

in like uh in in two months or something yeah and i think i remember if i remember correctly correctly

if i'm wrong so you had the ordering so which gives you like a sleep score and readiness score

yeah and you also prompted to say i have this i want to have like i think you said like i want

to have this much rest and this and this and this one's long dates short day and then you could give

feedback after every after every exercise that would basically add to the same conversation and then

you could also add like little reminders like okay i i will do the long run today and i forgot my energy

hell whatever then like yeah right um but then you also prompted to say this is what i want this is

kind of the same thing you didn't just say i want i have a race in three months get me ready no it was

according to a certain uh let's say a certain framework yeah and you knew already this that's

why i already put it in yeah so and but i think that's like i i use for example an aura ring but you

can also use let's say your your uh your apple watch or a whoops wrap but i think it's very important

to do this because i think as a person you're you're not very objective towards yourself

like if you're preparing for a race and you feel like i'm not so i'm a bit tired but like it's only

three days left to go i mean i'm just gonna do the hard worker today and and well for me it's the

opposite it's like i'm a bit tired i'm just gonna stay yeah like to each their own right i just left 12

hours i feel like it's like something like an o-ring or a whoop or an apple watch will probably give

much more objective information i think for sure you need this combination because otherwise an ai can

distinguish which personal coach would probably do like a good one yeah but i think i agree because i

think also you have more touch points like the ai has no the only interaction the ai has for you

is really just the type of text that you write and i think if you have a user there's tone of voice

there's body language there's there's a lot of more stuff and i feel like if you you just a person

has more context than just the ai has and i think by adding these other things you're also giving more

context for the model right uh yeah i agree yeah and of course a lot of these things are often baked

in when you use an app right like when you use runner the layer after training it probably asks you

like how it went and stuff like that it's still your input it's not as objective as yeah that's your

recovery matrix yeah and uh maybe but i do believe that because i'm saying like it's probably better to

go to a coach i'm saying in this situation it's probably better to go to experience coach for sure i think

your below average coach it's probably better to use ai right yeah yeah so i'm not i'm not a runner

right i did i did two sports the more tennis and football for example but i'm not a runner and i feel

like it's very different disciplines right but maybe to be the devil's advocate here like is it that

different like so you could also argue that the training data from the llm right some of it was also

diet was also running routines like you know like and i would imagine that it's not like every three

months there's a new running regimen it's like oh yeah this is the state of the art and everything

actually everything before was bad true so in terms of like knowledge is that just the training data of

the lm not enough compared to like a let's say an average coach would the average coach be able to

give you better information no with the same i think an average coach

better understands how you're feeling so you would just be able to ask more of the questions

and like to to get the right content so it's not about like if you have the same context for both

they do probably a similar job but the coach you're talking about an average coach now every

coach yeah but the even the average coach she would be able to better poke for the right context

like how you're feeling okay you're feeling a bit tired but like is it tired because you didn't sleep

well or is it because you have do you have pain but how is the pain and maybe an lm wouldn't do that

as much that's the and i think also especially when face-to-face you have a lot of non-verbal

communication yeah maybe you have a limp or maybe you're just a bit like you look maybe it's a bit

too much but dragging yourself and you're walking in a wheelchair fine you want a wheelchair you know

like hello um but i get what you're saying i get what you're saying also like maybe you can also

perform some assessments like okay does it hurt like try to like push here resist my push like

does this hurt does that hurt etc etc or also there are some things i also think it's very hard to

explain to ai right like i it it my knee hurts when i turn it like this you know like how do you like to

you know you have someone that's like there you're just showing and it's like yeah that's what you're

pointing like right here right here when i turn from here to here and i'm looking like towards the

sun it kind of like yeah um so i think that's also useful i think it's also very very useful

do you use any uh coaching uh running app or anything no i have an actual uh human coach did you ever

try the actual uh aside from your little toy project

no only the toy projects

do you think there is a future for this do you think like maybe

no there's definitely i think there's already like it's already there's a personal base yeah but

that's a bit of the the i think the that feels a bit like the challenge there is that the human

coach is way more expensive i mean you easily uh in the given range here like probably around

75 to you could go probably up to 200 euros a month yeah for a coach that follows you up yeah yeah

personally versus paying uh what is it maybe i don't know 15 20 euros yeah for something like

it yeah sure it's a big difference right like i think a lot of people that are not super serious

when it comes to sports or just starting out will very quickly up for let's try this out and i'm also

wondering like the people that are not really into running or people that are starting who do you

think they're the ones that actually should be the more careful you know what i'm saying like maybe

like you don't know like you're an expert runner or you were but uh like you have a lot of experience

so maybe for you to use these apps it will be not as dangerous as someone that is just starting out

but the person is starting out doesn't want to invest yeah right so it's a bit of a how do you

find the right market for the right things um and maybe one last thing before i move on what about diet would you

because when you're thinking of this i think of sports preparation and i think of like dieting and

all these things and to me i feel like there's a lot of parallels right like dieting as well i don't

think diet changes every month but i also feel like there's a bit of non-verbal also like trying

these things and and having someone talking to you and asking the right questions do you think they're

different like if the question is is llms or ai a good use for coaching running we discussed but like for

diets or giving advice for a dietitian things about the same things different i think the objective

measurements are maybe a bit easier like the result like you have the weight yeah but i feel like you

can also lose weight and healthy right i agree i agree but like i think that it's the same challenge

because whether you're you're coaching a performance or you're coaching diets you still need

something that coaches you and i think if we're actually talking about coaching i hope i believe

that you myself still better at coaching than in the ice today yeah yeah i think today for sure and i think

any i can support but then you need to like do part of that coaching on yourself basically yeah

but i think and i think we talked some weeks ago about the hallucination thing of open ai

i think the biggest biggest biggest the easiest reason that i can give to anyone why a human will always be

better or not always but our place better today is that llms they don't ask as much follow-up questions

they don't know when to ask questions and when to ask and i think for a human they always they will

have them more like what do you mean by this what do you mean by that you're saying this but maybe

what you mean is that like do you have pain or do you have soreness right like all these things

and uh i think lms will definitely fall short on these things all right moving on we have post hog

lays out six easy to make ai coding mistakes think big code code based blindness in context and quote

unquote led the agent to everything optimism their code base spends 8,984 files and 1,623,533 lines

and one engineer jokes and i quote cloud code riding rust is a while loop that accelerates climate change

so lock in guardrails cursor rules spec files linkedin and expect more code more reviews and yes

more bugs if you don't curious about this one bart you share this uh yeah this is from the post hoc team yes

they are uh they uh basically made uh an overview on um how to avoid common mistakes when

when using tools like cloud code or cursor i'm a big fan of postdoc i think it's cool it's cool that

they are publishing stuff like this so postdoc for people that don't know it is like this uh

quote-unquote a bit of an uh all-in platform if you're building a product or a web app because you can

get if you use postdoc it's very easy to plug in and then you get a lot of analytics on how your product

is being used uh like how often is this button clicked or a specific user segments that use specific

functionality more and more often um you can do easily do a b testing you can have feature flags so

let's say for this user segment we show this feature for that user segment not or so it's and it's super

super intuitive super easy the website is really cool the website they're a bit of uh tongue-in-cheek

yeah uh a bit very very much developer focus yeah makes me think a bit about like model duck

a little bit same type of uh tongue-in-cheek humor

recently yeah um i think i want to say a week ago released this blog post and they say

we'll quickly go over the remarks that they make you should not treat your big code base like a small

code base which i guess i guess good feeling makes sense but i think it's if you work on a small code

base like a few files but you start up cloud code or codex or whatever and you just ask it to make changes

and it typically works and you are i think typically it's it's a danger to just switch to the big

project to the same yeah i would i haven't done that yet but i i can imagine because i think also yeah

context and size and like not duplicating things that already exist yada yada right yeah and i think they're

like the then comes to the second thing like the you need to from the moment that you end up in

these big costs you need to provide right context like specific rules specific guardrails it depends a

bit on what type of uh that's all you're using how you set this up right like you use cloud on d or

yeah or cursor rules stuff like this third thing is trying to use the i have something you know is not

good at so they they say here that it's for example uh not good at rust um or specific specific

languages or specific niche dialects is there for example the hawk query language like don't use it for

stuff it typically hasn't seen too much um makes sense but then also like how do you know what what

it's good and what's not good just try it out i think it's a bit uh experimentation yeah it's also

something that you probably should reevaluate as time passes yeah yeah for sure

being content with your existing workflow i think we uh all very quickly fall into that gap like this

works but maybe we should try this other thing and then actually have like a degradation of the

performance right yeah um i have a

i have i have that problem for sure like oh maybe i should try this i'll just try that i think for

editing the podcast as well something that i definitely like i have like a a setup and it's

like ah but maybe i can do this i can use this feature i can do this and it's like no just just

just just just do it yeah exactly um another thing you mentioned here is not using ai it's not a good

idea okay um even if you dislike it personally you should still realize like your competitors are using

ai i see so the so not using i is the mistake is that you should use yeah yeah exactly okay um

and your users are almost certainly using it maybe the exception here or there true

which is a fair point but then like is the argument here saying like

because i makes you more productive and your competitors are user and your users are using or like

why is this i think your competitors are using it your users are using it like you need to know what

is going on in the world and to i see yes but it's very specific to post hog right because they are

developer focused like they don't have a product for marketing when it comes to users maybe but even

there like like if you have like this i don't know marketing product that makes uh but that's whatever

generates uh social posts for you like competitors are also using it right yeah your users are probably

expecting that they can type something and like something gets templated up by an ai

right like it's they think you can make this he's more than a lot of different industries so i think

what they're saying is like you should always be touching ai because you need to know what it can do

because people are going to expect you to be able to things that you can do so just being like

by doing it you're always going to be up to date it's a good it's a good point i hadn't thought of

that and another bad idea is not is uh is letting ai do everything for you it's the other the other

opposite right zero extreme yeah yeah which which i guess makes sense i think that it's uh

sometimes a bit the challenge that uh because you use it a lot you also quickly try to use it in

different situations even though you know it's not always the best but just to be aware of that

so i think it's an interesting one we'll post the link of the article and show notes anything that

surprises you on this not necessarily um i think you're good feeling wise i think i think the one

about not using ai is not a good idea i think that is becoming the reality i think you still have

people with a very strong opinion on not wanting anything to do with ai that i have the feeling that

it makes yourself irrelevant but i think even like i like to we had all mit study that 95 of stuff fails

i also saw again the other i think it was mit study as well that like using chat gpt will make your brain

whatever yeah but i think like the reality is that today ai is getting ingrained in whatever we do

yeah whether for good or for bad i don't know like you can argue about it look we had the example of

this running a running app right yeah like it used to be like it used to be just programmatic like and

now every running apple probably have an option to have an ai generated plan and i think often you

you don't even see anymore the ai gets used behind the that is simply the reality right it's getting

embedded in everything for better or for worse and i think probably when we look back in 10 years a

lot of these products have become a lot better than they are now for sure they will just keep getting

better i mean the people are using as well yeah and also the the features ai enables yeah yeah

through that definitely and i think this is a healthy stance right like don't do everything

don't try to do everything but do something watch yourself yeah the last thing is you said like

challenging yourself as well i also i heard it somewhere and i also think it's true broader than just

programming right or whatever that i think it's healthy to not be married to your beliefs right like

always be like maybe i'm wrong maybe let's try this maybe let's try that i think it's just healthy for

you as a person you know to have that attitude towards life you know like yeah i agree like i think if

you ever hear yourself saying i don't agree and i don't care what you say there's nothing you can

say to convince me that's not healthy right there should always be something that i can say to you

know yeah i fully agree yeah and it works the other way around as well and i think we shouldn't

every now and then have a look at like we're discussing a lot like where is this actually helping

about true like yeah and i think that's the other the other extreme quote-unquote right like uh

maybe i is not the way to go and i'm happy to to consider that possibility as well right so let's see well let's see

that is it for the articles we have today yeah um we have a little bit of news as well we have a

sign-up link we have a sign-up link or tell me more about the sign you can go to newsletter dot

monkeypatching.io and you can basically leave your email address what are you gonna do with this email

you're gonna send well that's a big question you're gonna sell this to people to yeah we're

gonna sell their data i'm gonna get rich off it yes um first thing we'll buy is a yacht yeah okay

we're gonna be recording on the yacht from now on yeah yeah uh no but in all seriousness so we

have a you can sign up newsletter dot monkeypatching.io there is no newsletter yet

but we are definitely playing with a bit with how can we make something that is um

quickly digestible trusting something that you can um like do a look a five-minute read or

maybe not even five minutes like once a week where that gives you a bit of an overview what happened

this week a bit like the the the quote-unquote paper version of what we're doing here i think

that might be interesting for some people also use the channel to explain a little bit more on

what we are gonna do with some ideas to repackage we are playing with some ideas around events

we're also playing with some ideas to bring startups young startups in contact with interested investors

like there are a lot of ideas that we're working on but that have not been fully made concrete but if

you want to stay up to date do sign up on a newsletter dot monkey patching dot io also we're gonna have

to think a bit like how to make this valuable right there's a lot of ai generated content and i think

we're both very much like okay if that's how it's gonna be then let's not do it right but i also do

think that having like a different format on this even if it's the same content sometimes like i hear

something but then i'm like okay i need to sit down and read this because just hearing it like it's harder

like sometimes you'd also like to really dive deep or to you have some you need some visuals you need

this so we'll definitely have a thing we don't know if it's going to be all the articles or just

going to be the highlights or just going to be this just going to be that but uh yeah feel free to

to subscribe if you like to stay up to date if you also have any any thoughts suggestions as well

feel free to to reach out there are also contact details of uh our general one mine and merilos on

monkeypatching.io also even for the the regular quote-unquote podcast right like if you have any

article that you've seen throughout the week that you love that you hate it that you would like to to

hear us discussing about it feel free to to let us know and uh we'll be super happy yeah we also did

a short write-up today i published it on my blog bartz.space i also cross post on linkedin on a bit

of uh like we're six months in a bit on the stats that we have so far a bit on these plans that we have

coming up um so if you're interested uh check it out indeed i think this is also a good um

like we're saying this now but i think it's also it's another one of those good like to digest it for

like we sat down you wrote it like you sat down you wrote about it so if you want to to read a bit

more in detail and not just the high level discussion i think it's also something something

something interesting indeed and again have any thoughts feedback let us know we'll be super happy

here and um always very free to leave a five star review just five nothing below five unless you're

unless you unless your scale goes to 10 then no but yeah leave a if you leave a review star we'll be

super super happy as well thanks everybody for listening thanks everyone we'll see you uh next week

yes thank you bart maybe we'll have more decor decoration decor maybe around we can maybe buy uh from all the

the data that we sell like a van gogh maybe issue like maybe be a monkey like yeah you know figure it out

thanks everyone ciao

Creators and Guests

Bart Smeets
Host
Bart Smeets
Mostly dad of three. Tech founder. Sometimes a trail runner, now and then a cyclist. Trying to survive creative & outdoor splurges.
Murilo Kuniyoshi Suzart Cunha
Host
Murilo Kuniyoshi Suzart Cunha
AI enthusiast turned MLOps specialist who balances his passion for machine learning with interests in open source, sports (particularly football and tennis), philosophy, and mindfulness, while actively contributing to the tech community through conference speaking and as an organizer for Python User Group Belgium.
Tiny Brains, Big Gains: No-GIL Python, $300k OSS, and Europe’s Chip Power Play
Broadcast by