#26 For non-urgent LLM tasks, DeepSeek offers great value for money

AI is not at the core of what poketto.me does, but it helps a lot: I’m using LLMs to translate saved content and to smooth out formatting issues (especially with PDF content). Any old LLM can do these things quite well, but when it comes to pricing, none beats DeepSeek. When using their API, processing a million input tokens can be as cheap as $0.035, and a million output tokens will cost you at most $1.10. To give you an example: A typical 1,500-word essay will come down to about 2,000 tokens (input and output combined). ...
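As a sanity check on those numbers, here's the back-of-the-envelope math in code. The prices are the figures quoted above (cheapest input tier, most expensive output tier); the 1,500/500 input/output split is an assumed example, not a measurement:

```python
# Back-of-the-envelope cost for one "process an essay" job via DeepSeek.
# Prices per million tokens, taken from the figures quoted above;
# the input/output token split is an assumed example.
INPUT_PRICE_PER_M = 0.035   # USD per 1M input tokens (cheapest case)
OUTPUT_PRICE_PER_M = 1.10   # USD per 1M output tokens (most expensive case)

input_tokens = 1_500        # assumed: the essay itself
output_tokens = 500         # assumed: the translated / cleaned-up result

cost = (input_tokens / 1e6) * INPUT_PRICE_PER_M \
     + (output_tokens / 1e6) * OUTPUT_PRICE_PER_M
print(f"${cost:.4f} per essay")  # → $0.0006 per essay
```

In other words: well under a tenth of a cent per essay, which is what makes this viable for throwaway workloads.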

July 26, 2025

#25 Never trust ChatGPT

I may sound like a broken record on this, but I’ve seen it over and over again while working with AI tools on poketto.me: Don’t trust the chatbots. Ever. ChatGPT in particular has two immense problems: sycophancy and inaccuracy. Regarding the former: It’s trying to please you—the user—to the point where it feels like every response is prefaced with a compliment that’s only designed to keep you engaged. Some examples? ...

July 25, 2025

#21 No, you don’t have to learn LangChain

...or LangGraph, or LlamaIndex, or RAG, or whatever new AI-hype framework is trending this week in order to build an AI-powered app. More often than not, these frameworks are just wrappers around basic functionality—in this case, calling an API. And the layers of abstraction they introduce can make even simple things (“prompt an LLM”) feel unnecessarily complex. Take RAG, for example. All it really does is frontload your prompt with additional context. That’s it. In practice, it boils down to concatenating a few strings—something you can do in five lines of code. But LangChain adds layer upon layer of custom methods, config objects, routing logic, etc., that often just get in the way. ...
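To make that concrete, here's roughly what "RAG without a framework" looks like. The document list and the keyword-overlap "retriever" are purely illustrative stand-ins (a real setup would use embeddings or full-text search), but the core move—prepending retrieved context to the prompt—is exactly this:

```python
# Minimal "RAG" without a framework: retrieve context, prepend it to the prompt.
# Documents and the naive keyword-overlap retriever are illustrative only.
documents = [
    "poketto.me saves articles for offline reading.",
    "DeepSeek's API is priced per million tokens.",
    "CoquiTTS is a Python text-to-speech library.",
]

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Rank documents by naive keyword overlap with the query."""
    words = set(query.lower().split())
    return sorted(docs, key=lambda d: -len(words & set(d.lower().split())))[:k]

def build_prompt(query: str) -> str:
    """Frontload the prompt with retrieved context -- that's the whole trick."""
    context = "\n".join(retrieve(query, documents))
    return f"Context:\n{context}\n\nQuestion: {query}"

prompt = build_prompt("How is DeepSeek priced?")
# `prompt` now goes to any LLM API as-is. No chains, no config objects.
```

That's the five-ish lines of actual logic. Everything a framework adds on top of this is convenience at best, obstruction at worst.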

July 21, 2025

#10 Running text-to-speech in the #Cloud is harder than you would think (part three)

So, after finally setting up a dedicated virtual machine (VM) to run my text-to-speech workloads and wiring up all the build and deployment scripts, I got a bit excited. Could I reduce the TTS latency even further if the VM had GPU power? In theory: Yes. In practice: Google doesn't give you access to their GPUs straight away. There’s a special quota setting for VM instances with GPUs, and by default that’s set to zero. As a regular user, you cannot increase this without contacting Google Cloud Support. ...

July 10, 2025

#9 Running text-to-speech in the #Cloud is harder than you would think (part two)

Do you remember when I mentioned the difficulty of running 🐸 CoquiTTS in the cloud yesterday? My first experiment was to run it directly in my Cloud Run backend service. In theory, this could have worked, but you'll never guess why it failed in practice. x86 CPUs. Really. Like the ones we had in our computers in the 90s. How did I figure this out? After taking a horribly long time to start up, the TTS service failed with a message saying that it was running on an 'incompatible' CPU architecture. Specifically, 32-bit x86 CPUs. ...
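If you ever need to confirm what hardware your container actually landed on, Python's standard library can tell you—a minimal diagnostic worth logging at startup, no third-party dependencies required:

```python
import platform

# Print the CPU architecture and word size the interpreter sees --
# a handy startup log line when a service dies with "incompatible CPU".
arch = platform.machine()          # e.g. 'x86_64' or 'aarch64'
bits, _ = platform.architecture()  # e.g. '64bit'
print(f"Running on {arch} ({bits})")
```

Had that line been in the service's startup logs, the slow-boot-then-crash cycle would have been a lot shorter to diagnose.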

July 9, 2025

#8 Running text-to-speech in the #Cloud is harder than you would think (part one)

For the podcast automation feature that I’m planning for a future version of poketto.me, I’ve been experimenting with various text-to-speech solutions. The easiest and highest-quality approach would have been the ElevenLabs API. However, considering the “throwaway” nature of these audio files – most of which would only be listened to once by one person – and the cost structure that this would introduce, I desperately needed a cheaper approach. The Python library 🐸 CoquiTTS is pretty awesome: There are many different models to choose from, ranging from 'super low latency' to 'high quality' (including voice cloning). Therefore, poketto.me users could choose from many different voices, and from a commercial perspective, I could set different price points for different levels of quality and latency. However, all of these models require significant computing power to function. ...

July 8, 2025

#7 Refactoring “legacy” code? Let the AI handle it!

For reasons outlined in yesterday's post, I had to switch poketto.me from #CloudSQL (MySQL) to a completely different database architecture: Firebase 🔥. At that stage, the Python backend code base wasn’t huge, but it was already fairly substantial. It included CRUD operations for several entities, as well as some basic lookup logic. Rewriting the whole thing would have taken me at least half a day. Instead, I asked my good friend #Claude to take care of things. And, to my surprise, the result worked straight away! 🎁 The “drop-in” replacement generated by the AI immediately passed my unit tests, and the chatbot’s instructions for setting up and configuring #Firebase were actually useful, too. ...

July 7, 2025

#5 There’s no “npx cap remove” 🤦‍♂️

#Capacitor comes with a user-friendly command line interface. To add a new mobile platform to your project, simply run “npx cap add [android | ios]”. And to remove one? Exactly — you guessed it: “npx cap remove...” But: That command isn’t implemented – for understandable reasons. The interesting thing, though, is that it’s “plausible” that it would be there, right? So it’s not surprising that #Claude insists it exists. This once again highlights a major issue with LLMs that I just can’t shut up about: Just because what the chatbot says sounds “plausible” doesn’t mean it’s correct. In the case of AI-assisted coding, that’s not such a big deal – you, the developer, will eventually realise that the AI was wrong. But what about the many other use cases where we blindly trust the AI and put whatever it says into action? 🤔 ...
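For the record, the actual "remove" is anticlimactic: a Capacitor platform is just a generated folder (android/ or ios/) in your project, so removing it means deleting that folder. Sketched below with a throwaway demo directory standing in for a real project:

```shell
# A Capacitor platform lives in a generated folder inside the project.
# "demo-app" is a throwaway stand-in for a real Capacitor project here.
mkdir -p demo-app/android

# Deleting the folder *is* the "remove" -- there's no CLI command for it.
rm -rf demo-app/android
test ! -d demo-app/android && echo "android platform removed"

rm -rf demo-app   # clean up the demo directory
```

If you change your mind later, `npx cap add android` simply regenerates the folder.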

July 5, 2025