#88 The memory consumption patterns of LangChain are… disturbing

As I said in “No, you don’t have to learn LangChain”, we shouldn’t get distracted by the artificial complexity introduced by our frameworks. LangChain is mostly a wrapper around the REST APIs of various LLM providers. Useful? Yes: switching between models becomes easy. But here’s a mystery I can’t explain. When I added Gemini as a fallback to DeepSeek (see yesterday’s post about DeepSeek refusing to touch Chinese politics), I thought it would be straightforward: ...
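For readers unfamiliar with the pattern, here is a minimal sketch of what a "fallback" between two LLM providers means, written in plain Python rather than LangChain. It is not the code from the post: `complete_with_fallback`, `AllProvidersFailed`, `fake_deepseek`, and `fake_gemini` are hypothetical names, and the two fake providers only stand in for real API clients.

```python
from typing import Callable, List, Sequence


class AllProvidersFailed(RuntimeError):
    """Raised when every provider in the chain raised an exception."""


def complete_with_fallback(prompt: str, providers: Sequence[Callable[[str], str]]) -> str:
    """Try each provider in order and return the first successful answer."""
    errors: List[Exception] = []
    for provider in providers:
        try:
            return provider(prompt)
        except Exception as exc:  # a real client would catch narrower error types
            errors.append(exc)
    raise AllProvidersFailed(f"{len(errors)} provider(s) failed: {errors!r}")


# Hypothetical stand-ins for real API clients (assumptions, not real SDK calls).
def fake_deepseek(prompt: str) -> str:
    raise RuntimeError("request refused")


def fake_gemini(prompt: str) -> str:
    return f"summary of: {prompt}"


if __name__ == "__main__":
    print(complete_with_fallback("some article text", [fake_deepseek, fake_gemini]))
```

The primary provider is always tried first, so the fallback only costs anything when the primary fails.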

September 26, 2025

#87 DeepSeek really won’t touch anything related to Chinese politics

For most use cases in poketto.me, I’m pretty happy with #DeepSeek: it’s cheap, reliable, and the output quality matches any other LLM I’ve tried. But there’s one big caveat: anything related to Chinese politics can trigger an immediate refusal. Example: Right after the launch of poketto.me, a user tried saving an article about the September 3rd Beijing meeting between Xi Jinping, Vladimir Putin, Kim Jong Un, et al. (https://orf.at/stories/3404330/) ...

September 25, 2025

#83 The Gemini API for Video Understanding is surprisingly good

As I mentioned in “Gemini’s URL Context feature is 90% hype, 10% value”, I was pretty disappointed with Gemini’s “URL Context” feature. But “Video Understanding”? That one actually works like a charm. How it works:

👉 Provide a YouTube video link
👉 Ask Gemini questions about the video
👉 Get a structured response back

For poketto.me, this unlocks a really neat feature: users can save any YouTube video in the app and either watch it later or read a textual description of the video. ...

September 21, 2025

#82 Gemini’s “URL Context” feature is 90% hype, 10% value

I’ll admit—I was pretty excited when Google announced that the Gemini API would support a new “URL Context” tool. The idea: you could “ask” Gemini about the content of a specific web page, with Google handling all the heavy lifting. The documentation even shows a neat example: send Gemini two recipe URLs and prompt it to compare ingredients and cooking times. If it worked, this would’ve been a game-changer for poketto.me: ...

September 20, 2025

#57 Multi-threaded TTS: A bad idea

Running text-to-speech in the cloud is fun—until it isn’t. Early on, I didn’t think much about thread safety. During my own testing, rarely would more than one TTS task be running in parallel, so there were no big issues. But once more users started using the feature, strange bugs popped up: Errors like “Assertion srcIndex < srcSelectDimSize failed” started showing up in the logs—and worse, once triggered, the entire Cloud Run instance would become unusable until a redeploy. ...
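One common way to keep a non-thread-safe engine from being entered by two requests at once is to serialize calls behind a lock. The sketch below illustrates that idea only; it is not necessarily the fix described in the post. `SerializedTTS` and `FakeEngine` are hypothetical names, and `FakeEngine` merely simulates a synthesis call while recording how many calls overlap.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


class SerializedTTS:
    """Wraps a synthesis callable so only one call runs at a time."""

    def __init__(self, synthesize: Callable[[str], bytes]) -> None:
        self._synthesize = synthesize
        self._lock = threading.Lock()

    def speak(self, text: str) -> bytes:
        # Concurrent callers queue up here instead of entering the engine together.
        with self._lock:
            return self._synthesize(text)


class FakeEngine:
    """Hypothetical stand-in for a TTS model; tracks overlapping calls."""

    def __init__(self) -> None:
        self.active = 0
        self.max_active = 0
        self._counter_lock = threading.Lock()

    def __call__(self, text: str) -> bytes:
        with self._counter_lock:
            self.active += 1
            self.max_active = max(self.max_active, self.active)
        time.sleep(0.01)  # pretend to synthesize audio
        with self._counter_lock:
            self.active -= 1
        return text.encode("utf-8")


if __name__ == "__main__":
    engine = FakeEngine()
    tts = SerializedTTS(engine)
    with ThreadPoolExecutor(max_workers=4) as pool:
        list(pool.map(tts.speak, ["one", "two", "three", "four"]))
    print("max overlapping engine calls:", engine.max_active)
```

The trade-off is throughput: requests on the same instance wait for each other, so parallelism has to come from running more instances rather than more threads.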

August 26, 2025

#54 AI is not a value proposition

I didn’t coin this phrase (sadly), but it keeps proving itself true—especially now that I’m working on GTM details for some of the more advanced features in poketto.me. Most users don’t care how your app works. They care what it does for them—and whether that’s worth paying for. Since LLMs became easy to embed, companies started slapping “powered by AI” stickers on everything as if that alone justified a price tag. But unless the user clearly feels the value, it doesn’t matter what's under the hood. Case in point: Garmin’s hilariously underwhelming $7/month “AI subscription”. The so-called “insights” offered nothing users couldn’t deduce themselves—or the app couldn’t have generated with much simpler logic. ...

August 23, 2025

#50 Prompt engineering: A task best left to the machines

Under the hood, poketto.me makes heavy use of LLMs. The podcast feature is a great example: Users can turn any web content into a podcast, but often that content isn’t well-suited for listening. LLMs are great at optimizing this—simplifying complex sentences, turning headlines into enumerations, describing images verbally, etc. But the challenge: How do you craft a single, generic prompt that works across all types of content and runs unsupervised via the API? ...

August 19, 2025

#35 LLM-Based Translations: The Good, the Bad, and the Ugly

Automatic content translation has been a key feature of poketto.me from day one. Why? Because I believe there’s immense value in making content accessible to non-native speakers. Personally, I’m deeply interested in developments in countries like India, Pakistan, and China, but the best publications from those regions often don’t publish in English. For example, being able to read and compare, in English, how both Dawn News (Pakistan) and the Hindustan Times (India) cover tensions between the two countries is fascinating. ...

August 4, 2025

#31 No, AI will not take McKinsey or BCG out of business anytime soon

Despite what the “God of Prompt” (sic!) or any other self-proclaimed “AI expert” is trying to tell you, none of the current AI models will replace a multi-hundred-thousand-dollar product strategy project. First of all, the people making these claims are, most likely, just trying to sell you their overpriced list of “magic” prompts — and hoping for endorsement from the big AI companies or a retweet from Elon Musk. But giving the AI tools the benefit of the doubt, I tried using Grok, ChatGPT, and Claude to iterate on a commercial strategy for poketto.me. The results were… disappointing. Here are the main issues: ...

July 31, 2025

#29 Good things come to those who wait ⏳

Remember when I was complaining about how hard it is to run even basic ML workloads on GCP? Turns out, Google has listened 😊 (well, probably not to me personally, but in general). You can now request GPUs for Cloud Run instances in the UI as well as on the command line. That means all the hassle I went through deploying my text-to-speech service into a Docker environment running inside a preemptible VM with GPUs—and then figuring out how to start, stop, and deploy the VM automatically—was… well, not exactly wasted, but at least: not necessary anymore. ...

July 29, 2025