Tiny Brains, Big Gains: No-GIL Python, $300k OSS, and Europe’s Chip Power Play
monkey patching podcast where we go bananas about all things tiny models coaching and more
my name is Murillo I'm joined by my friend Bart hey Bart hey Murillo hey nice new place
new place place so for people going places in the world yeah that's not that's not true we are going
places we started in Murillo's closet yes we came out of the closet now we're kind of in an office
kind of in an office yeah yeah indeed so you may hear like uh I feel like we still need to do some
I feel like it's a bit echo you know yeah we probably need what do we need we actually have a
we have some fur which maybe need more fur maybe more fur cool but yeah now we're gonna it's a nice
place I'm excited about this change definitely it's gonna be good what do we have for this week
Bart we have a lot right we do have a lot we have a lot to get to I'll kick it off we have a tiny 7
million parameter network called the tiny recursion model claims big reasoning gains by looping on its
own answers it reports 45 on ARK AGI 1 and 8 on ARK AGI 2 challenging bigger is better reflex with
recursive updates if small models can reason cheaply startups get new room to build clever
agents without big tech budgets what is this about so this is a paper from I think Samsung I want to say
people from Samsung so we can take a look here yeah and basically they they propose a new architecture
so it's a they call it a tiny network and it's recurrent so in the sense that the output of the
previous iteration goes back in and it's also reasoning model right so I'm putting a lot of
things together here so recurrent means that the output comes back in that's not how LLMs work and
reasoning model is like when you have the the LLMs they have like a scratch pad kind of right and then
they put some things on paper but then they just use it as intermediate steps right so it's a bit
bold putting both things combined and they actually shown that like training a new model from scratch
with not that much data can have very good performance on some benchmarks so I think some
benchmarks specify here like the ARK AGI and I'll talk a bit more later but they perform on par or even
better I think I went on the the results later down the table and I think they perform even better than DeepSeq or
Cloud 3.7 you see so they're actually model is um was one of the best ones except except for uh Grok
actually but considering that it was very very small models very little data is actually quite
impressive of course for other benchmarks I would imagine they would do way worse and there was some
information there on these are these benchmarks are geometric puzzles yeah indeed so that's what I also
wanted to touch on but maybe just real quick as well they also mentioned hierarchical reasoning so
this is a paper that was inspired by that okay which has also has like uh it's a similar kind of
setup but it's a bit more complicated so this is actually a simplified version the hierarchical reasoning
model there was like two networks and it's almost like you had two different persons and one is on the board
doing the the more daily like the more uh hands-on tasks and then the other network would be just
kind of saying you're in the right direction keep going like this so it was also like the
different different way of looking at it and then there was like a feedback loop that was faster like a loop that was faster on the
on one of those and then the the higher uh hierarchy I guess like the overlooking model also had a slower one and
then the RKGI so that's what I also wanted to to to bring here in the discussion do you know what the RKGI is?
RKGI yeah the benchmark right yes you know what this is who looked into it so I looked as well so the RKGI so this is RK3 RKGI is basically RK stands for
what where is it abstract and reasoning corpus and RKGI is artificial general intelligence so it's actually from created from the same guy from the created Keras I want to say
and he was basically arguing this little video here we'll put also on the show notes
he basically argues that a lot of these actual benchmarks it doesn't really demonstrate intelligence
right a lot of times if you see a lot of data if you like a monkey can copy things you know like it's not what is really intelligence
so this benchmark was designed to to tackle more of the intelligence problem and what they do is that they have so you can
for people following the video I'm also going over the website and there's like a play humans and build AI
and basically the idea is that they have inputs and outputs but it doesn't tell you what the rules of the game are
and then the model needs to figure out by itself what does it need to do to actually accomplish the task
so I played a bit with this so this is an example right so you can play as a user and then you have the commands here
so just uh let me share this right screen so you have like a little a little interface here you have like the the arrows so that what you can do
and then you kind of have to like play around and then see what is actually the goal right so in the beginning you don't know
and the idea is that you need to that you need to let the model needs to basically figure out what it is
and that's how he defines that that's intelligence right interesting and uh apparently these for these types of things
well that's what I'm assuming from the the paper right the the tiny network the tiny reasoning
what's the t t t tiny networks I forgot how they called it it's like the tiny recurrent network dot t r m
they're actually good at the tiny recursion model yeah tiny recursive model yes exactly they uh
they're actually good at these things which again doesn't mean that it replaces a chat gpt right
because maybe actually a lot of the stuff that we asked the gpt is not an actual intelligence
according to the definition of the other cube but it is a different approach and I'm also wondering if
this actually I actually saw on the paper somewhere that they said that it doesn't scale as much so
actually having something smaller would actually give you better performance
so I thought it was also I also wanted to bring it here because
I don't hear as much of the the new architectures the novel things
and I also thought it tackled a different kind of problem I also learned about the rkji so
what do you think
very interesting the rkji stuff makes me like the example that you just showed makes me think about this uh
this game uh that has been running on twitch for a while uh pokemon basically
and I think it's a I think I think it's the game boy version of pokemon that's playing
not 100 sure and they have like uh all kinds of models trying to get to the end
and actually like it's super inefficient there are hundreds of different things
and the first time that it actually successfully finished the game was with
the release of chess gpt 5 really that's very recent but the type of game is also like
you don't like there's probably a manual somewhere out there right
to figure out the fitness function basically how to play this game and how to how to improve on it
but they didn't give them the the like the instructions to the models they just kind of say
this way these are the inputs that you can do and you can monitor the output
yeah exactly and I'll give you rewards based on the output but I won't tell you exactly
what led you to that reward okay
it's interesting to see these new uh architectures it's actually the first time that I'm
I'm reading on this um yeah so that that was a the beginning when I was trying the the game right
they said that the the the model was like easy for humans hard for ai and then I was supposed like
this is fucking hard I was I really like no kidding I was really like reflecting like am I
intelligent or am I like a well-trained monkey you know just repeat stuff like I had a bit of an
existential crisis but I figured it out eventually but I do wonder how they because it's apparently good
these type of quote-unquote puzzles yeah in games without instructions I do wonder how they train it
on this yeah it's true whether there is no risk of over training on this like are they also testing this
on yes I'm wondering how they how they do it's probably in the paper right probably the paper yeah
need to need to fine comb through it but because they also different version of the benchmark so maybe
they also trained on one before or not or I don't know if it's just actually it's a good question
because also training in the unless you train a different version of the bank benchmark but it's
also kind of cheating in a way I don't know yeah and I would I would also maybe because they're saying
that these type of models score way better on these benchmarks I would maybe also argue that
alums are typically not optimized for this type of contract yeah yeah indeed indeed and that's what
I was also there was also I was thinking like uh I understand what the relevance of this benchmark is
but at the same time it's like you're being very narrow about your definition of intelligence
at the other side like there's there's uh these are seven million parameter models which are super
efficient yeah you compare it to what is out there so that's that's definitely indeed indeed for going
further in this field for sure even the the other model that they mentioned like the hierarchical
something something hierarchical reasoning model the hrm is it was also like 25 million parameters
so we're also they show they work well but this is even a step yeah smaller right so yeah again I
think it's uh that was pretty interesting again not sure if they have a huge impact on our day-to-day
yet but uh who knows probably some I probably would have said the same thing about transformer
architecture when it just came out right and now we're here true all right
else python 3.14 is out python 3.14 is free threading no gill finally lets plain threads run cpu
work in parallel not just io lucas a friend of ours it's for model inference and async apps and shows how
to try it out fast throughput isn't everything if latency matters this could shift python's cost
and architecture math more than any other auto scaling button um python 3.14 is out did you know that
i did know that it was out but i haven't really looked into what exactly was released with it yes
so one of the things that was released well with an asterisk and i saw i hadn't seen this for myself i
just saw like people on my network sharing about this that actually the free threading the no gill
uh is actually out with an asterisk it's not the default yet you need to kind of set some flags and i don't know if you need to rebuild it but like maybe you don't need to build it but like you need to specify that i want the free threading version of the gill
of the gill what is the gill maybe for people that this is the first time that they hear
we're a bit on spot here i'm trying to say to get a eli 5 answer on this so the gill is the stands for the global interpreter lock
and it basically means that in python even though it sometimes feels different but it can only do
one thing at a time ever
whenever something gets done basically locks the cpu just to that
just to that job and then releases it and then another job
exactly
even if today or let's say before 3.14
you use something like threading
there was already a threading library
and it felt like threading but in essence
it was not really threading
yeah the way i i understand is like there were different threads but like you weren't doing stuff in parallel
it would just switch between threads all the time
so it was a bit useless in a way
and yeah i know python is popular with ai machine learning all these things but it's also because they
enter they have an interface with c and then when you're on c land then you don't have this problem anymore
that's why like numpy and all these things um got really popular right like the with python so but now 3.14 is out the
this is a article from lucas valatka
ex-colleague and a friend of ours
and he kind of he makes some some interesting points on why we should care about
about this development he also points out that now python is probably one of the fastest interpreted languages right
big statement is it true
i have a good question i haven't checked
but
on his actual arguments right so for example he says
maybe people are pretty skeptical right so you can use multi
i can use multi-processing which basically
python because he couldn't run stuff in parallel
you could run different python processes for each one of those
like that was a way to to bypass the the gill
right um and then he kind of says yes but there are of course
like costs from this right can because you have different processes that are separate
you cannot have really good communication between them
right the other says about the the c and rust extensions um
but then again you have to write c rust right think io so
even though if you have something that is io bound meaning that is like you have to read
read things in and write stuff out then
python can be faster for these things because basically you can tell the the single thread to do
other things while it waits for response and then it just comes back and check right but a lot of the
stuff is actually not io bound right and then he goes on and on for like a lot of different
articles on like okay you can scale horizontally yes but you're still very efficient on the single
on a single instance and again machine learning inference i think a lot of this stuff also seems to
see in rust extensions and existing io apps maybe one thing i also wanted to to to add on this
python just because it does have the the free threading right it doesn't really mean that
the code that you have today would just run faster no true so i do think there's a lot of code that
would need to be reworked to be able to leverage these things i'm also wondering how many how much
people are going to go for like how much people are going to really invest the time to to work on
the free threading for python you know because most of the stuff that there was a real need
people have found workarounds for this today right and also i feel like if you're going for speed
not sure if python would be the first choice either right i think it depends a bit on the on the use
case you're building of course i think from when you're building from scratch in the future and
threading odds value be it be it efficiency you will probably use use this native approach right
yeah instead of going for multi-processing i'm still curious to see i'll still be curious to see how
how much work is it to actually implement this right because i feel like then you seem to
manage different processes and like it's not easy right it's not like you just turn a switch and it
just works no no you need to develop specifically for it right yeah but i feel like that's also goes
a bit against why python is so popular that's what i'm thinking you know python is really simple you
don't have to worry about much stuff that's why a lot of people just do it and if you do need to
worry about all these things then you have the c extensions the rust extensions and all these things
yeah i think it really depends on this case i think uh back in the day a long long time ago
threading was already very popular and like even though it wasn't like need actual threading but like
for uh for when you were building a let's say a desktop application and you had to uh you had to manage
uh updates to the window paints and stuff like that like in the end you were using threading there
like it's been used since day one i think what they're trying to do is simplification i think
simplification is always a good thing if you don't need to think about anymore that reason even is a
gill yeah there is a global interpreter lock like threading is just actual threading i think that is a
good thing but it goes a bit i think why did it take so long is because probably for simplification
reasons at the very early days there was a gill and that made it very hard and not very simple to
move away because a lot of the things that were implemented like assumed that there was a global
interpreter lock so it was not simple to move away but in the end we get to a simplification of of
understanding the api that you're interacting with as a developer so i think that is a good thing
i think so too but i also think that because well as i understand
it's not like you need to learn like it's not a different programming language right like you're not
losing a lot of the stuff everything you have today will still work and now you have this yeah so for
sure i'm all i'm all for it but yeah again curious to see the first serious applications that we use
this um see how the experience is but yeah let's see and the other things also so three four point
fourteen i think that was the first the main thing or the biggest change but there are other things
that were changed as well did you have a look at that no no i didn't so one of the things that it
meant so yeah okay i'll just very skim through the table of contents here on this other link
the python repo now is a bit better so there's different colors and stuff and also if you mistype
a word and you try to run the program the interpreter will say ah did you mean this did you mean that
yeah right so kind of like in rust they also had this now you have t strings have you heard of t
strings template strings exactly uh i've heard of them i've never used them so now they're released so now
now you can use it but i think the idea here is to because people use a lot of f strings right so
yeah yeah and now i think especially with i think because a lot of lms to be honest right i think a
lot of the you're prompting a lot of stuff for lms so basically have this different type of string
here which is a t string you can like it doesn't evaluate to a string directly evaluates to a template
right and then the developer can actually choose what you want to do with this what you want to
interpolate this you can also do some uh like checks right like to to make sure there's no
injection or some some sorts right so now it's part of so in uh on the import lib right the standard library there
i think that's that's those are the things that really caught my attention to be honest but
and i do i actually have heard as well that it's also the fastest python again that's good huh even
without the the the the now it makes me actually wonder what look is the statement like how uh how
father interpret the language is like how would this uh how does this compare to julia for example or
sure sure i'm not sure not sure but yeah with the food or to um what's the name a bit of the the
superset of python mojo mojo yeah but mojo is compiled i think is mojo i think so okay my bad
i think so yeah i think so mojo is compiled okay yeah understanding mojo a python-like compiled language
but yeah curious how these things go indeed what's next a javascript dev made over 300 000
dollars from light gallery by dual licensing a free gpl agpl for open use paid commercial for closed
projects and because gpl compels any bun embedding his code to open source their whole site
most companies buy a license that is the core of the business model he pairs that with contributor
agreements a clean major version switch and exclusive features to nudge upgrades smart sustainability or
tollboot on open source what do you think mariella
i don't know i don't know actually so to to make sure i understand correctly he has two licenses what
he did uh he makes claims that he made a lot of money i'll leave that in the middle he has this uh
this uh javascript project called light gallery this is by the way good good advertisement for that
it's a bit of a it's it's basically like a an image and video lightbox gallery do you know like if you
click an image on a website and you get like a gallery of images and then you can scroll through
them yeah so that's light gallery what he does he has a he has a very much open source license
gpl or agpl and um he also has a paid commercial license i never really thought and the paid
commercial license is basically to make money off this which makes sense for him right like he's putting
a lot of time and effort in this we really have this sleek thing he was trying to find a way how can i
build a business model around this and then the gpl license is like it's open source but so a lot of
people use default to the mit license right okay the gpl or agpl license they they are a bit more
restrictive right they they force you as a user of that to if you make a certain change of you use it
to be very transparent on what you use or what you do to do with it so and he says actually this difference
between gpl v3 and agpl v3 so he says the gpl v3 is best for libraries and frameworks
because it's triggered by distribution so if someone includes your gpl javascript
on their public website they must must in theory open source their entire website which commercial
companies are never going to do and agpl v3 not really relevant for his project but he says the best for
for sas projects for the service where a code is used on a server but never distributed so the agpl is
basically trigger when a user interacts with that software over network and if that is the case if a
user interacts with your agpl project over that network like the company again is forced open source
they're all which you're not going to do yeah it's very easy because it's open source projects it's very
easy to for a web developer to use this library it's like oh look there's school looks very nice it looks
very sleek we need this we can't use it but license only cost like i don't i don't know honestly don't
know what this pricing is but like it's maybe it's five euro five euros a month or something right
like it's it's a no-brainer to use a commercial license yeah so i thought it was interesting an
interesting point of view so basically you have like uh both are more you have a paid license which
whatever it is and then you also have a very strict open source license and then like because the mit
is like it's a bit like you can use but you don't need to open source anything it's like right so you
need to acknowledge i think you just need to acknowledge yeah so then you have like a very
strict open source one and another good option and it's okay let's just go for the the small fee one
because we want to respect the license because we have another alternative
to the open source one and we we're not going to open source a whole website that's that's a bit the
idea okay and he also mentions a little bit like how he handles contributions
where you basically have like apparently there are hidden bots to do this so when you have
users contributing to your project that has these type of licenses where there's also commercial
part that you you can do like you can have like a contributor license agreements so contributors
basically let you use their code but they still own it i'm going to contribute to your project so
you can use everything but i still own it and you also have a copyright assignment agreement that's
basically one from the moment i commit i sign it off to you it's yours okay as a repository owner
and apparently you have fitted bots that take care of this for you that's handy that's handy yeah and
i would definitely imagine that like if i'm using light gallery and have this fine a small tweak that
i want to do and it's a fit but i'm not gonna care too much like i just want to get it fixed right
yeah yeah i think they're still there's still even though like this will probably scare off the
true open source aficionados yeah like there will still people be that are engaged enough and i've
actually had but it's in a little bit of a different i've had similar examples but slightly executed
differently so i used a while ago there is this library forgot the exact name uh to basically
template word documents from
javascript or typescript okay and you can easily generate a word document but then to it works and
it's the best function because you have a number of alternatives but it's the best one so you go for
that one and i say okay you can use that for free but to also let's say i think for example inserting
images in the word document you need to plug in then you have a commercial license so i had this
approach where it's very much open source but yeah for some parts you need to pay but i hadn't really
seen this explicitly like the same the same thing but it's just based on how to use it whether or not
you're under the commercial order yeah no but i think the plug because the plugins thing also feels a
bit like the freemium model yeah true yeah it's a good comparison i think yeah but it's true i've never
seen anything like this um i think it's an interesting venue to explore as a
library author that is very engaged and making sure that you have something and
it's a creative way to build a business model around exactly and i think
ways to support open source i think it's it's always good right like finding new ways that people
can can make a living out of these things and still like as i think it's it's a difficult discussion
because this is good for open source but it's a bit of a slippery slope right
why like this is still open source because it's under gpl right yeah that's what you're saying but it's
very clearly just chosen for that so that's
like as a company you can't really use it unless you go for the commercial license
it's just to nudge people that yeah i know what you're saying but the advantage of course like
it is still an actual open source license that you can use it under even though it's very it's a
restrictive one yeah restrictive if you unless you have an open source project as well right but i feel
like it's really like it is a very because that well if i understand what you're saying like a lot of
of times the open source it also like open source still moves the world and i feel like
by having this very it's almost like water and oil right like these two like the things that are
open source with they're going to use this license and they're going to everything's going to stay
open source and everything that's commercial there's no it won't feedback right like you won't
you're just you're just consuming it but then like i'm also wondering like how
would this also scale like for for if everyone had this this setup would open source be as big as
it is they would be probably not yeah that's probably not so yeah yeah it's interesting interesting
to think about but i do think that if this is a recipe for success i think it would be good for
more people that are trying to also make a living out of open source to to rely on these things right
so it's always good alternatives are always good i feel true
all right and up next we have the netherlands takes effective control of chinese-owned chipmaker
and xperia citing governance failings and risks to dutch and european economic security using rarely
invoked goods availability act ministers can reverse harmful decisions wing techs shanghai listed shares
promptly fall 10 it's a new line in europe's tech sovereignty sand does data custody now extend to
boardroom control what do you think bart um i think it's an interesting uh news article that popped up uh i want
to say yesterday yes october 13th yes uh that's actually today uh popped up today but um
this is a very aggressive move by the dutch government so a little bit of background here
wing what's the name
wing tech is the parent company the shanghai listed chinese company
nexperia is a subsidiary of wing tech and that is in that i think it's that quarter resides in netherlands
was bought by wing tech i want to say in 2016 around that so not that long ago it comes originally from
uh philips sold by philips and they uh are big in the semiconductor industry so they make things like
transistors and diodes and stuff like that so not super fancy stuff but like things that are used throughout
every type of electronics that you can imagine basically from household appliances to cars whatever
they're quite big i think they employ around a thousand people um and in the netherlands or
i think in the netherlands yeah i'm not sure on that but they're big in the netherlands and
what now happened is that basically they the netherlands by the use of this this act they have a name
for the act uh good availability act yes i'm a bit at loss all the details on the act but uh
they took away control of nexperia from wing tech so that's very big right like yeah
and take control means like so they basically uh took over the boardroom so they say that the current
board of directors is no longer capable of making decisions
what the next step is i'm not sure yeah to be honest and i was looking into the details on why
this decision was made and today the details or the public details and that are still very fake
like it's it has something to do that that the governance of nexperia
was something was not not an order and that had to do with a
security on technology and ownership on technology within the netherlands and the european region
so that's very fake right yeah yeah like cyber security stuff that's the only thing they're saying
cyber security or something like with the the knowledge on these products that they're making
that something was happening around that but it's still very fake what is happening well why why this is
being done i added this because it's like to me it's a very very i think it's a bit unprecedented in the
european region i haven't heard about this at least it's um it's a very it looks like a very big
step up in tech that looks to be linked at least to tech sovereignty yeah like i'm afraid you're gonna
take our tech away from us even though it's clearly owned by a chinese company so it feels very
weird like yeah it's um it's a weird situation yeah i feel like yeah i'm curious what's the next
it's gonna be uncovered right because there's probably gonna be more and this is very there
should be more there should be more i don't know if it's also my my bias right but like uh chinese
government don't and then there's always a bit like it's not government owner no sorry not chinese
government on but like um china and maybe also the the way that the title reads right dutch government
takes control of china owned chip maker yeah it feels a bit like uh i don't know but i do hope
that we get more details because this is to me this is not a good thing that is happening like if you're
conducting business in a company and you're trying to abide by everyone's laws and you're doing so
and already since well since a long time like i said like they acquired in 2016 and then suddenly
for whatever reason that is today still very vague just saying like you're doing something wrong with
and it's not clear which lawyer actually breaching like which is it is it compliance related like it's
you need to have some as a when you're running account company you need to have some legal certainty
right like the legal system that we're in now and now we're applying it like we need to be certain
that this is we're in this for the long run yeah it shouldn't change too much but do you think that
like they they are not also aware either what uh why they hope they are i hope they are but i hope
that that's what i'm saying and i think it's uh it's big news but it really requires the context to
form an opinion on this yeah no that's true that's true but i do think like at the same time if details
do come out and you see like what the fuck like what were they doing like then then it can be a good
thing as well right that the government still has the the the not like the power kind of but also
like that they're they're not passive right that there is someone that is actually making sure that
the true so it really depends on what it depends on yeah because that's what i'm like because you
said it's not good but at first i was like but if because also this is the first time i think we
heard this happening so i would imagine there was something very serious right so in my head i was
thinking like oh it's good that these things are being forced right so i kind of yeah the other way
let's assume it has a good reason right let's assume yeah indeed indeed so we'll definitely follow
up on this what else we have simon willison coaxes claude's code interpreter into zipping its public
follower then open sources the problems and scripts it finds they include word powerpoint excel and pdf
skills plus a python tool that autofills forms using py pdf great for power users awkward for opsec
how transparent should agent superpowers be to everyone poking around
this was also shared by a colleague actually so shout out to to load actually so this is simon
willison who know he does a lot of stuff on ai gen ai he actually has a lot of articles and
claude not cloud code i think claude ai even like claude desktop they they actually shared
like what did they what was the name they gave a very bad name actually but basically claude now
can edit word documents pdfs powerpoints and excels right that's something i think chpd could do
before but then basically they released this and not too long ago also in the last weeks yeah yeah so
i think so this article is from what is this article from the 10th of october so not that long ago as well
and the article said like last week so they they released it and actually the reason why it was a bit
unclear is because they didn't mention like i think code interpreter or something
code interpreter functionalities but basically means that now
claude ai can read documents they can edit documents they can save documents they can create pdfs and
do all these things so when they when they ask someone actually had a chat with um with claude
saying like how does this work right and then actually was it was very open like it just kind
of said okay we have this we have this directory here we have uh different skills these are the
directories inside this directory so our sub-directories right for doc docx pdf ppt and excel and
the even the prompts like saying claude thinks that you can do you know how to manage word documents but
it doesn't so don't do that always use the tools that are listed on skills so every skill every
subdirectory there's a big skill.md file as well that they show like prompts and all the code there's
also python code um and what i thought it was interesting well first i think it's a it's a nice
feature to have now claude can actually interact with these types of documents but it also shows how
you can interact with these types of documents so actually he he just asked claude to say hey create
a zip file of everything on this directory and he actually was super happy to do it and he just
gave it to him and then he actually put it on github so this is kind of what what you have here
the readme is just basically pointing to the blog but then if you go around you actually see the the
whole structure right and i thought so you see the skill.md and what i thought it was interesting is the
you kind of get a bit of a a peek on the anthropics developer right so what kind of problems they use
but also how how they structure the the tools right so you have scripts here and you can see like
the document and how they they prompt the lm to to use these things and even think i was even thinking to
myself they probably use cloud code but they probably still also do a lot of coding themselves
right like with documentation and all these things the fact that they use a lot of like html tags as
well for the um for the prompts you know just like they say like you know like they have the the html like
xml right like available skills and then four slash available skills and then this and this it also gave me
it also made me reflect on how i prompt things and how i set these things up for example one thing that
they they copy for every skills claude thinks that he knows how to manage pdf files but it doesn't
use the skills always use the skills all caps and all this and all that so i cannot open this here now
actually on the the chat so this is a chat that someone else shared uh but that there's he's talking
to claude like okay what does this mean what is that uh explain how this works can you share this
can you share that so they they they share everything here uh claude shared everything
here and i don't know if it was on purpose right but i think in any case i think it's good for for
people to see a bit how things work under the hood and for me also to to think a bit like what can
these tools do what are they good at what they're not good at how to prompt and etc etc i think it's
always especially for these which are are very big models that are being used a lot by the public yeah
it's always good to get a bit of a peek behind the curtain right yeah i feel like also these people
they're working constantly with these things they're constantly building tools but i think also having
some some like tips right from them like this is how this works how that works this is what worked
well this is what didn't work well i thought it was very interesting one thing that also on the
article simon willis says that they also have a lot of nice tools so like if you want to manipulate pdfs
with python you can also just take one of the tools that they have there and just just use it right so
there's a lot of interesting stuff there so cool i invite everyone to to have a look right
all right what and what's next london stock exchange group says customers can now
build copilot studio agents that use licensed data from workspace and financial analytics to an lseg
managed mcp server it is fewer glue scripts more govern access agents live inside microsoft 365 but
talk to lseg via model context protocol so i thought this uh for um the simple fact that we discussed
mcp service last week yeah with the snowflake snowflake we're using the managed mcp for server this week
it's the london stock exchange group we have a managed mcp server now through which you can uh
basically get information on financial markets that's how i understand it um via the model context
protocol is communicated and it is very neatly apparently in uh copilot 365 but i assume that you can also use it
with other uh with other uh mcp uh clients the mcp server like you have like it says here microsoft
365 but i guess you can have any client that you use the mcp server that's the idea i think in theory
yes and in practice i don't know they probably integrated very heavily with uh with microsoft
copilot for authentication stuff like that i would assume that's because they're their release have
focuses heavily on microsoft yeah that's what i was also like a bit curious because mcp should be
client agnostic but they are mentioning a lot of microsoft stuff but maybe it is for the authentication in
yeah cool so you know what kind of what kind of information you can pull from it or no no i don't
know it's financial market information but i've really gone in depth but i think it's yet another
sign like we see a lot of these things like official mcp servers coming out will we have a mcp server for
the monkey patching podcast what would the tools be that like you find find find articles find the
quotes or finding for or maybe just asking questions about like what's the give me a summary of the
last episode i think there's a good mcp server of just like no one really thought about like they
they just thought about can we do it they build it and no they didn't really realize how should we do
it it was like there's no mic but it's there but it's there manage and all that's secure manage and
all like it's just like the message ends up in your slack and you type back exactly just like
like automation zafir's thing murillo in the loop but no but i think i do mention last time that we see
like it's probably going to be something that will continue happening right like i think it makes
sense that like these providers they have their own yeah and because you also have these like we also
discussed last time like with uh open ai's agent kit yeah the canvas builder that they have now like
they have full support for using tools from ncp service so you know no longer only see it in
clients like cloud like gpt like these same things or or copilot but also in workflow automation tools
like yeah the there's really an uptake of the under of this technology now for sure for sure and i
think as soon as you have like i think about that like i said agent kit as soon as you have like
these low code tools or no code tools the market for like the need for these mcp service increases a lot
yeah definitely right and i think it's good as well i think the protocol will also mature and
other things as well so it's really really cool what else we have um a women's health editor
trades six weeks with runa stravas new coaching app and likes the tail of paces behind this entirely
interface amid tiktok critiques of aggressive mileage runner replies we don't use ai to generate
training plans seeing expert designs them while ai adjusts the progress fair deal for experienced
runners may be risky for beginners so how much coaching can an app do before your
knees file a complaint you're a runner no bart
i would normally say yes but maybe i could say now i used to be a runner because we've been
engine for a shitty long time but let's not go into that but you're using runner
that would be a good excuse yeah it would be easy if you can just blame like it like an app
yeah it's this fucking app yeah it's like yeah it would be easy yeah now why i put this here i think
it's interesting like it goes into bits like there's some discussion in that community that they think
that ai is being used to generate training plans and because they typically
these training bags go a bit too high in typical training load like a bit too intensive and that's
apparently also what you get when you use something like chpd directly so it's very apparently very
very but the issue is not that it's using ai is using ai without any
layer on top of it they're just using the plain ai because i guess you could also have no i think
their issue is that they're using ai without enough actual knowledge on yeah exactly on the training
a theory behind it yeah yeah because that's that that's also what i was fishing for because
you could also have you could also use ai but you can have like a curated sources yeah that you should
instruct the ai to always follow through but that's that's that's not what apparently they're using
well well not they're a bit vague on this i think that's a bit the the feedback from the community
is a bit that they're questioning that like how much is actually um a app is this runner which
was recently acquired acquired by strava not too long ago let's say four ish months ago okay so
was it there was a startup or something and then it was a startup um i think a uk startup
which grew very very quickly i also think they did a shit ton of marketing like a you couldn't
look at anything sports related without getting an ad oh really wow i think they grew also a lot through that
but we we see a lot of these apps coming up right like we also have something in belgium
which which i would say is a competitor to runa it's called renara but you have a lot
of these candidates right every other training tool is now building an ai coach yeah but i'm
also wondering if it's because it's quote-unquote it's easy because i feel like if run is really
just using chgpt then yeah like you can have a hundred of these competitors right
just marketing around and maybe the ui or something right that's true so yeah maybe
one thing before we continue this is on the woman's health mag is there anything woman related to this
because i think it's just a woman that's tried runner okay it's the only thing reason why it wasn't the
the women's health uh okay um what do you think of this uh coaching
i think it makes a lot of sense when you do it in a very curated manner
i think what is the the challenge if you just if i just asked just give me a play in training
plan because in three weeks i need to do that and help me prepare for that in three weeks like it's
not sure if that's the best support you can get then you could probably better go to
or whether stream or three weeks or three months doesn't really matter but like without specific
instructions it's probably better to go to coach to avoid through like a coach i mean like a person a
person there's some actual experience to avoid things like overtraining and injuries but i think from the
moment that you have this a theoretical knowledge on training physiology and you make sure you have a
very curious prompt or or whatever solution to to inject that i think then it starts to become
better i think when it makes it even better is when you have objective feedback from users or from
from users or from wearables from the user ah yeah i see what you're saying uh because you had a
little toy project around this no yeah it's already a while ago for uh to prepare for a race that i had
in like uh in in two months or something yeah and i think i remember if i remember correctly correctly
if i'm wrong so you had the ordering so which gives you like a sleep score and readiness score
yeah and you also prompted to say i have this i want to have like i think you said like i want
to have this much rest and this and this and this one's long dates short day and then you could give
feedback after every after every exercise that would basically add to the same conversation and then
you could also add like little reminders like okay i i will do the long run today and i forgot my energy
hell whatever then like yeah right um but then you also prompted to say this is what i want this is
kind of the same thing you didn't just say i want i have a race in three months get me ready no it was
according to a certain uh let's say a certain framework yeah and you knew already this that's
why i already put it in yeah so and but i think that's like i i use for example an aura ring but you
can also use let's say your your uh your apple watch or a whoops wrap but i think it's very important
to do this because i think as a person you're you're not very objective towards yourself
like if you're preparing for a race and you feel like i'm not so i'm a bit tired but like it's only
three days left to go i mean i'm just gonna do the hard worker today and and well for me it's the
opposite it's like i'm a bit tired i'm just gonna stay yeah like to each their own right i just left 12
hours i feel like it's like something like an o-ring or a whoop or an apple watch will probably give
much more objective information i think for sure you need this combination because otherwise an ai can
distinguish which personal coach would probably do like a good one yeah but i think i agree because i
think also you have more touch points like the ai has no the only interaction the ai has for you
is really just the type of text that you write and i think if you have a user there's tone of voice
there's body language there's there's a lot of more stuff and i feel like if you you just a person
has more context than just the ai has and i think by adding these other things you're also giving more
context for the model right uh yeah i agree yeah and of course a lot of these things are often baked
in when you use an app right like when you use runner the layer after training it probably asks you
like how it went and stuff like that it's still your input it's not as objective as yeah that's your
recovery matrix yeah and uh maybe but i do believe that because i'm saying like it's probably better to
go to a coach i'm saying in this situation it's probably better to go to experience coach for sure i think
your below average coach it's probably better to use ai right yeah yeah so i'm not i'm not a runner
right i did i did two sports the more tennis and football for example but i'm not a runner and i feel
like it's very different disciplines right but maybe to be the devil's advocate here like is it that
different like so you could also argue that the training data from the llm right some of it was also
diet was also running routines like you know like and i would imagine that it's not like every three
months there's a new running regimen it's like oh yeah this is the state of the art and everything
actually everything before was bad true so in terms of like knowledge is that just the training data of
the lm not enough compared to like a let's say an average coach would the average coach be able to
give you better information no with the same i think an average coach
better understands how you're feeling so you would just be able to ask more of the questions
and like to to get the right content so it's not about like if you have the same context for both
they do probably a similar job but the coach you're talking about an average coach now every
coach yeah but the even the average coach she would be able to better poke for the right context
like how you're feeling okay you're feeling a bit tired but like is it tired because you didn't sleep
well or is it because you have do you have pain but how is the pain and maybe an lm wouldn't do that
as much that's the and i think also especially when face-to-face you have a lot of non-verbal
communication yeah maybe you have a limp or maybe you're just a bit like you look maybe it's a bit
too much but dragging yourself and you're walking in a wheelchair fine you want a wheelchair you know
like hello um but i get what you're saying i get what you're saying also like maybe you can also
perform some assessments like okay does it hurt like try to like push here resist my push like
does this hurt does that hurt etc etc or also there are some things i also think it's very hard to
explain to ai right like i it it my knee hurts when i turn it like this you know like how do you like to
you know you have someone that's like there you're just showing and it's like yeah that's what you're
pointing like right here right here when i turn from here to here and i'm looking like towards the
sun it kind of like yeah um so i think that's also useful i think it's also very very useful
do you use any uh coaching uh running app or anything no i have an actual uh human coach did you ever
try the actual uh aside from your little toy project
no only the toy projects
do you think there is a future for this do you think like maybe
no there's definitely i think there's already like it's already there's a personal base yeah but
that's a bit of the the i think the that feels a bit like the challenge there is that the human
coach is way more expensive i mean you easily uh in the given range here like probably around
75 to you could go probably up to 200 euros a month yeah for a coach that follows you up yeah yeah
personally versus paying uh what is it maybe i don't know 15 20 euros yeah for something like
it yeah sure it's a big difference right like i think a lot of people that are not super serious
when it comes to sports or just starting out will very quickly up for let's try this out and i'm also
wondering like the people that are not really into running or people that are starting who do you
think they're the ones that actually should be the more careful you know what i'm saying like maybe
like you don't know like you're an expert runner or you were but uh like you have a lot of experience
so maybe for you to use these apps it will be not as dangerous as someone that is just starting out
but the person is starting out doesn't want to invest yeah right so it's a bit of a how do you
find the right market for the right things um and maybe one last thing before i move on what about diet would you
because when you're thinking of this i think of sports preparation and i think of like dieting and
all these things and to me i feel like there's a lot of parallels right like dieting as well i don't
think diet changes every month but i also feel like there's a bit of non-verbal also like trying
these things and and having someone talking to you and asking the right questions do you think they're
different like if the question is is llms or ai a good use for coaching running we discussed but like for
diets or giving advice for a dietitian things about the same things different i think the objective
measurements are maybe a bit easier like the result like you have the weight yeah but i feel like you
can also lose weight and healthy right i agree i agree but like i think that it's the same challenge
because whether you're you're coaching a performance or you're coaching diets you still need
something that coaches you and i think if we're actually talking about coaching i hope i believe
that you myself still better at coaching than in the ice today yeah yeah i think today for sure and i think
any i can support but then you need to like do part of that coaching on yourself basically yeah
but i think and i think we talked some weeks ago about the hallucination thing of open ai
i think the biggest biggest biggest the easiest reason that i can give to anyone why a human will always be
better or not always but our place better today is that llms they don't ask as much follow-up questions
they don't know when to ask questions and when to ask and i think for a human they always they will
have them more like what do you mean by this what do you mean by that you're saying this but maybe
what you mean is that like do you have pain or do you have soreness right like all these things
and uh i think lms will definitely fall short on these things all right moving on we have post hog
lays out six easy to make ai coding mistakes think big code code based blindness in context and quote
unquote led the agent to everything optimism their code base spends 8,984 files and 1,623,533 lines
and one engineer jokes and i quote cloud code riding rust is a while loop that accelerates climate change
so lock in guardrails cursor rules spec files linkedin and expect more code more reviews and yes
more bugs if you don't curious about this one bart you share this uh yeah this is from the post hoc team yes
they are uh they uh basically made uh an overview on um how to avoid common mistakes when
when using tools like cloud code or cursor i'm a big fan of postdoc i think it's cool it's cool that
they are publishing stuff like this so postdoc for people that don't know it is like this uh
quote-unquote a bit of an uh all-in platform if you're building a product or a web app because you can
get if you use postdoc it's very easy to plug in and then you get a lot of analytics on how your product
is being used uh like how often is this button clicked or a specific user segments that use specific
functionality more and more often um you can do easily do a b testing you can have feature flags so
let's say for this user segment we show this feature for that user segment not or so it's and it's super
super intuitive super easy the website is really cool the website they're a bit of uh tongue-in-cheek
yeah uh a bit very very much developer focus yeah makes me think a bit about like model duck
a little bit same type of uh tongue-in-cheek humor
recently yeah um i think i want to say a week ago released this blog post and they say
we'll quickly go over the remarks that they make you should not treat your big code base like a small
code base which i guess i guess good feeling makes sense but i think it's if you work on a small code
base like a few files but you start up cloud code or codex or whatever and you just ask it to make changes
and it typically works and you are i think typically it's it's a danger to just switch to the big
project to the same yeah i would i haven't done that yet but i i can imagine because i think also yeah
context and size and like not duplicating things that already exist yada yada right yeah and i think they're
like the then comes to the second thing like the you need to from the moment that you end up in
these big costs you need to provide right context like specific rules specific guardrails it depends a
bit on what type of uh that's all you're using how you set this up right like you use cloud on d or
yeah or cursor rules stuff like this third thing is trying to use the i have something you know is not
good at so they they say here that it's for example uh not good at rust um or specific specific
languages or specific niche dialects is there for example the hawk query language like don't use it for
stuff it typically hasn't seen too much um makes sense but then also like how do you know what what
it's good and what's not good just try it out i think it's a bit uh experimentation yeah it's also
something that you probably should reevaluate as time passes yeah yeah for sure
being content with your existing workflow i think we uh all very quickly fall into that gap like this
works but maybe we should try this other thing and then actually have like a degradation of the
performance right yeah um i have a
i have i have that problem for sure like oh maybe i should try this i'll just try that i think for
editing the podcast as well something that i definitely like i have like a a setup and it's
like ah but maybe i can do this i can use this feature i can do this and it's like no just just
just just just do it yeah exactly um another thing you mentioned here is not using ai it's not a good
idea okay um even if you dislike it personally you should still realize like your competitors are using
ai i see so the so not using i is the mistake is that you should use yeah yeah exactly okay um
and your users are almost certainly using it maybe the exception here or there true
which is a fair point but then like is the argument here saying like
because i makes you more productive and your competitors are user and your users are using or like
why is this i think your competitors are using it your users are using it like you need to know what
is going on in the world and to i see yes but it's very specific to post hog right because they are
developer focused like they don't have a product for marketing when it comes to users maybe but even
there like like if you have like this i don't know marketing product that makes uh but that's whatever
generates uh social posts for you like competitors are also using it right yeah your users are probably
expecting that they can type something and like something gets templated up by an ai
right like it's they think you can make this he's more than a lot of different industries so i think
what they're saying is like you should always be touching ai because you need to know what it can do
because people are going to expect you to be able to things that you can do so just being like
by doing it you're always going to be up to date it's a good it's a good point i hadn't thought of
that and another bad idea is not is uh is letting ai do everything for you it's the other the other
opposite right zero extreme yeah yeah which which i guess makes sense i think that it's uh
sometimes a bit the challenge that uh because you use it a lot you also quickly try to use it in
different situations even though you know it's not always the best but just to be aware of that
so i think it's an interesting one we'll post the link of the article and show notes anything that
surprises you on this not necessarily um i think you're good feeling wise i think i think the one
about not using ai is not a good idea i think that is becoming the reality i think you still have
people with a very strong opinion on not wanting anything to do with ai that i have the feeling that
it makes yourself irrelevant but i think even like i like to we had all mit study that 95 of stuff fails
i also saw again the other i think it was mit study as well that like using chat gpt will make your brain
whatever yeah but i think like the reality is that today ai is getting ingrained in whatever we do
yeah whether for good or for bad i don't know like you can argue about it look we had the example of
this running a running app right yeah like it used to be like it used to be just programmatic like and
now every running apple probably have an option to have an ai generated plan and i think often you
you don't even see anymore the ai gets used behind the that is simply the reality right it's getting
embedded in everything for better or for worse and i think probably when we look back in 10 years a
lot of these products have become a lot better than they are now for sure they will just keep getting
better i mean the people are using as well yeah and also the the features ai enables yeah yeah
through that definitely and i think this is a healthy stance right like don't do everything
don't try to do everything but do something watch yourself yeah the last thing is you said like
challenging yourself as well i also i heard it somewhere and i also think it's true broader than just
programming right or whatever that i think it's healthy to not be married to your beliefs right like
always be like maybe i'm wrong maybe let's try this maybe let's try that i think it's just healthy for
you as a person you know to have that attitude towards life you know like yeah i agree like i think if
you ever hear yourself saying i don't agree and i don't care what you say there's nothing you can
say to convince me that's not healthy right there should always be something that i can say to you
know yeah i fully agree yeah and it works the other way around as well and i think we shouldn't
every now and then have a look at like we're discussing a lot like where is this actually helping
about true like yeah and i think that's the other the other extreme quote-unquote right like uh
maybe i is not the way to go and i'm happy to to consider that possibility as well right so let's see well let's see
that is it for the articles we have today yeah um we have a little bit of news as well we have a
sign-up link we have a sign-up link or tell me more about the sign you can go to newsletter dot
monkeypatching.io and you can basically leave your email address what are you gonna do with this email
you're gonna send well that's a big question you're gonna sell this to people to yeah we're
gonna sell their data i'm gonna get rich off it yes um first thing we'll buy is a yacht yeah okay
we're gonna be recording on the yacht from now on yeah yeah uh no but in all seriousness so we
have a you can sign up newsletter dot monkeypatching.io there is no newsletter yet
but we are definitely playing with a bit with how can we make something that is um
quickly digestible trusting something that you can um like do a look a five-minute read or
maybe not even five minutes like once a week where that gives you a bit of an overview what happened
this week a bit like the the the quote-unquote paper version of what we're doing here i think
that might be interesting for some people also use the channel to explain a little bit more on
what we are gonna do with some ideas to repackage we are playing with some ideas around events
we're also playing with some ideas to bring startups young startups in contact with interested investors
like there are a lot of ideas that we're working on but that have not been fully made concrete but if
you want to stay up to date do sign up on a newsletter dot monkey patching dot io also we're gonna have
to think a bit like how to make this valuable right there's a lot of ai generated content and i think
we're both very much like okay if that's how it's gonna be then let's not do it right but i also do
think that having like a different format on this even if it's the same content sometimes like i hear
something but then i'm like okay i need to sit down and read this because just hearing it like it's harder
like sometimes you'd also like to really dive deep or to you have some you need some visuals you need
this so we'll definitely have a thing we don't know if it's going to be all the articles or just
going to be the highlights or just going to be this just going to be that but uh yeah feel free to
to subscribe if you like to stay up to date if you also have any any thoughts suggestions as well
feel free to to reach out there are also contact details of uh our general one mine and merilos on
monkeypatching.io also even for the the regular quote-unquote podcast right like if you have any
article that you've seen throughout the week that you love that you hate it that you would like to to
hear us discussing about it feel free to to let us know and uh we'll be super happy yeah we also did
a short write-up today i published it on my blog bartz.space i also cross post on linkedin on a bit
of uh like we're six months in a bit on the stats that we have so far a bit on these plans that we have
coming up um so if you're interested uh check it out indeed i think this is also a good um
like we're saying this now but i think it's also it's another one of those good like to digest it for
like we sat down you wrote it like you sat down you wrote about it so if you want to to read a bit
more in detail and not just the high level discussion i think it's also something something
something interesting indeed and again have any thoughts feedback let us know we'll be super happy
here and um always very free to leave a five star review just five nothing below five unless you're
unless you unless your scale goes to 10 then no but yeah leave a if you leave a review star we'll be
super super happy as well thanks everybody for listening thanks everyone we'll see you uh next week
yes thank you bart maybe we'll have more decor decoration decor maybe around we can maybe buy uh from all the
the data that we sell like a van gogh maybe issue like maybe be a monkey like yeah you know figure it out
thanks everyone ciao
Creators and Guests


