Dev | Build in Public

#103 The state of the stack

When I started building poketto.me about a year ago, I didn’t know much about my tech stack. I knew I wanted an Angular frontend because it’s flexible and scalable—and I had some prior experience with it. I also knew I wanted to run everything on Google Cloud because of their generous Startup Program and because I was somewhat familiar with the environment from previous jobs. But other than that? I pieced together a stack through trial and error, loosely basing it on the idea that you don’t need to bring out the big guns right away… ...

#100 Use your database for what it's worth

Parliamentary inquiries have an interesting quality. Often, the same question is asked multiple times, either of different government ministers (“How much does your ministry spend on newspaper advertising?”, “How much yours?”, …) or with slight variations (“How much financial aid did NGO X receive?”, “How much financial aid did NGO Y receive?”, and so on). However, parliament treats each such inquiry as a separate entity with no link to its relatives. I thought it would be useful to group together “similar” inquiries, for example to show users how many unique questions were raised in a given timeframe or to later allow cross-inquiry analyses for similar inquiries. ...

#99 Choose the right tool for the job

When I talk about “Conscientious AI,” I keep circling back to proportionality: is an LLM really the right tool for the task, or can something simpler, cheaper, and more predictable get the job done just as well? This question arose once again while I was working on parlametrics, a data platform that I developed based on the Austrian Parliament’s Open Data offering. Parliamentary inquiries in particular are a valuable source of data: They enable parliamentarians to extract detailed information about how the government operates. In the process, “keywords” are eventually assigned to each inquiry so that they can be grouped and searched more easily, but this often takes days or even weeks. To speed up this process, parlametrics tries to predict the most likely keywords for new inquiries based on their title, enabling users to discover similar inquiries without having to wait for manual tagging. ...

#98 Supabase is my new favorite database

Since the early days of poketto.me, I’ve been dissatisfied with my persistence architecture. I started with Google Cloud SQL, but I quickly realized that it was too expensive for a small app. I switched to Firebase and had Claude do the replatforming, if you remember. However, when I introduced full-text search, I had to add BigQuery as a second database just for that because Firebase doesn’t have full-text indexing. The point is: It’s a lot of headaches for something that should be easy. ...

#96 Stopping the scrape: Why I switched to the Wikimedia API

I’ve noticed a welcome uptick in users saving Wikipedia articles to poketto.me recently. But until now, the app treated Wikipedia just like any other website: it scraped the raw HTML. Turns out, for Wikipedia, that is far from ideal:
🤯 Artifacts: The extracted content often included UI clutter like “Edit” buttons, navigation links, and “Citation missing” tags.
📋 Rendering issues: The standard HTML → Markdown → HTML conversion pipeline introduced plenty of ugly formatting glitches specific to wikis. ...

#95 Be careful when counting your whitespace

One of the great things about working on poketto.me is that I'm constantly learning about fascinating linguistic subtleties. For instance, while working on automatic content summaries and extracting key facts and figures, I came across an interesting issue with word counting in Chinese script. I had put a safeguard in place so that poketto.me would only attempt to summarize content longer than 100 words. This works well for German and English content. But when I tested the feature on an article published by Xinhua, a Chinese news agency, my code reported only about 12 words. That was obviously incorrect, and as a result no summary was produced. ...
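To illustrate the pitfall described above, here is a minimal sketch. It is not the actual poketto.me code; the function names and the character ranges are assumptions chosen for illustration. It compares a whitespace-based word count with a rough length measure that counts each CJK character as one unit.

```python
import re

# Rough approximation (assumption): common Han ideographs plus Hiragana/Katakana.
# This is not an exhaustive definition of "CJK".
CJK_CHAR = re.compile(r"[\u4e00-\u9fff\u3040-\u30ff]")


def whitespace_word_count(text: str) -> int:
    """Count words by splitting on whitespace (fine for English or German)."""
    return len(text.split())


def approx_length_units(text: str) -> int:
    """Count each CJK character as one unit, plus whitespace-separated words
    for everything else. Punctuation handling is deliberately left out."""
    cjk_chars = len(CJK_CHAR.findall(text))
    remainder = CJK_CHAR.sub(" ", text)
    return cjk_chars + len(remainder.split())


english = "The quick brown fox jumps"
chinese = "新华社北京电 中国经济持续恢复"

print(whitespace_word_count(english), approx_length_units(english))
print(whitespace_word_count(chinese), approx_length_units(chinese))
```

Chinese text is written without spaces between words, so the whitespace count collapses to the number of space-separated chunks. Counting characters is only a crude proxy for length; a proper word segmenter would be the more accurate (and heavier) option.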

#94 BigQuery’s SEARCH function only works with ASCII characters

Admittedly, it may not have been my brightest idea to use BigQuery as the search backend for poketto.me. But since Firebase doesn’t have built-in full-text search, I would have needed to add another tool to my stack anyway. I figured BigQuery would be easier than managing something external like Apache Lucene or Elasticsearch. Plus, BigQuery has a built-in SEARCH(...) function, so why not give it a try? As it turns out, SEARCH(...) is really more of a token-based text analyzer than a true search function:
1️⃣ It splits both the search term and the text into tokens (words).
2️⃣ It matches full tokens only.
3️⃣ It returns just TRUE or FALSE: no offsets, no match lengths.
🤯 And worst of all: It only works with ASCII. Yes. Like it’s 1979. ...
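The first three points can be pictured with a toy model in Python. This is only a sketch of token-based matching in general, not BigQuery's analyzer: the tokenizer, the lowercasing, and the function names are assumptions, and the sketch does not try to reproduce the ASCII limitation.

```python
import re

# Assumption: a "token" is a run of word characters, compared case-insensitively.
TOKEN = re.compile(r"\w+")


def tokenize(text: str) -> set[str]:
    """Split text into a set of lowercase tokens."""
    return {token.lower() for token in TOKEN.findall(text)}


def token_search(text: str, query: str) -> bool:
    """Return True only if every query token appears as a full token in text.

    The result is a plain boolean: no offsets, no match lengths.
    """
    query_tokens = tokenize(query)
    return bool(query_tokens) and query_tokens <= tokenize(text)


title = "Saving Wikipedia articles"
print(token_search(title, "wikipedia"))  # full token is present
print(token_search(title, "wiki"))       # only a prefix of a token
```

In this model a prefix such as "wiki" never matches "Wikipedia", which is the practical difference from the substring or fuzzy matching that users tend to expect from a search box.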

#92 Angular data binding with arrays: Yes, it can work!

Angular is awesome. And data binding, in particular, has been a game changer for developing modern web apps. Something changes somewhere and—magically—every part of your UI that needs to respond does so. However, I’ve always struggled with one corner case: What if you’re binding to an array, and the change that occurs is that something gets added or removed, and you need to respond to the change programmatically (not just in your HTML template)? ...

#91 How to fix the ominous Android Status Bar Issue

Remember “Capacitor + Android Status Bar = 🤯”?
🪲 This bug has haunted me for months, stalling the Android release of poketto.me and draining way too much mental energy. It’s one of those dreaded dev problems: no obvious solution, hard to debug, and endless rabbit holes. Eventually, I had to bite the bullet and dig in. Here’s what I found:
💣 Issue #1: The Capacitor status bar plugin only half-works. There’s a StatusBar.setOverlaysWebView(true/false) API, but on modern Android versions it doesn’t behave as advertised. Why? ...

#88 The memory consumption patterns of LangChain are… disturbing

As I said in “No, you don’t have to learn LangChain,” we shouldn’t get distracted by the artificial complexity introduced by our frameworks. LangChain is mostly a wrapper around the REST APIs of various LLM providers. Useful? Yes: switching between models becomes easy. But here’s a mystery I can’t explain. When I added Gemini as a fallback to DeepSeek (see yesterday’s post about DeepSeek refusing to touch Chinese politics), I thought it would be straightforward: ...