When Your App Lives Off Two APIs That Don't Always Answer

Two free public APIs power every food lookup in OpenNutriTracker. Both are flaky enough that retry, cache, and custom-meal fallback need to be a layered system.

You are standing in the supermarket on a Wednesday evening, basket in one hand, phone in the other, and you have already been in here longer than you meant to be. You scan a barcode and the app spins for a few seconds before quietly giving up on you. You scan it again, with the kind of small impatience that has been building all afternoon, and again it fails. The third attempt resolves. The product was always there in Open Food Facts; the API just did not answer the first two times. That is the failure mode most nutrition scanners in this category have, and it is the one that quietly makes people stop using them, because the patience for an app that does not work the first time is a thing most of us have run out of by the time we get to the supermarket.

The thing that bothers me about that failure mode, when I sit with it, is who it lands on hardest. The people who depend on calorie tracking most reliably are very often the ones with the least slack to spare for it: someone managing a chronic condition, someone in the slow careful work of recovery, someone caring for a partner who cannot easily do the lookups themselves. None of them have the patience for an app that fails three times before working, and none of them should need to develop that particular patience just to log a meal on a Wednesday evening when they are already tired.

OpenNutriTracker resolves every food lookup against two free public sources: Open Food Facts for branded products, and a Supabase-hosted mirror of the USDA Food Data Central database for raw and minimally processed foods. Both are free, both are excellent, and both fall over often enough that the app has no real choice but to plan around their unreliability carefully rather than hope for the best.

A handful of recent PRs were about turning that planning into a layered system that actually behaves the way you want when something goes wrong. Retry first, then cache, then a custom-meal fallback that survives even when the network is gone entirely. None of the pieces are especially clever in isolation, and I want to be honest about that, because the part that turned out to matter is not the cleverness; it is the order they sit in, and what each of them is quietly doing for the person on the receiving end.

A retry helper that learned not to be loud

The first piece is the simplest, and I love when a fix is this small. The barcode scan flow already had a private _withRetry helper buried in import_meal_scanner_screen.dart, used to retry the camera-driven scan once if it failed. That helper got pulled out into lib/core/utils/retry_util.dart and applied to the search and barcode lookups themselves: three attempts, exponential backoff at 1s, 2s, and 4s, with the 404 case skipped because a barcode the API definitively does not know is not going to start being known on attempt three.
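The shape of that helper, as a hedged sketch: the function and exception names follow the post's description, and the real `retry_util.dart` may differ in signature and exact delay schedule.

```dart
/// Thrown when the API definitively does not know a product (HTTP 404).
class ProductNotFoundException implements Exception {}

/// Retries [action] with exponential backoff, rethrowing immediately on
/// a definitive not-found answer: a barcode the API does not know will
/// not start being known on a later attempt.
Future<T> withRetry<T>(
  Future<T> Function() action, {
  int maxAttempts = 3,
}) async {
  for (var attempt = 1; ; attempt++) {
    try {
      return await action();
    } on ProductNotFoundException {
      rethrow; // 404 is an answer, not a transient failure
    } catch (_) {
      if (attempt >= maxAttempts) {
        // Only here, after all attempts are exhausted, is the failure
        // worth capturing upstream (e.g. in Sentry).
        rethrow;
      }
      // 1s, 2s, 4s, ... between attempts.
      await Future<void>.delayed(Duration(seconds: 1 << (attempt - 1)));
    }
  }
}
```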

The smaller change in the same PR matters more than it looks at first glance. Sentry capture used to fire on the first network exception, before any retry had a chance to succeed, which meant transient failures of the kind that resolve quietly on attempt two were generating noise in the error stream that nobody was ever going to act on. Capture moved to after all attempts are exhausted, and the error log is now a list of things that actually went wrong, not a list of things that nearly went wrong before sorting themselves out.

There was also a small pre-existing bug worth mentioning here, because it is the kind of thing that gets missed for years at a time until somebody reads the code carefully enough to notice it. fetchBarcodeResults was throwing ProductNotFoundException rather than ProductNotFoundException(), which is to say it was throwing the class itself rather than an instance of it. Dart let that compile because Future.error accepts anything you pass it, and the resulting exception looked correct enough in the stack trace that nobody had cause to look twice. It got fixed quietly in the same PR, which felt like the right place for that sort of small structural correction to live: not heralded, just put right.
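The bug is easy to reproduce in isolation. In Dart, a bare class name is a `Type` object, and since `Future.error` accepts any `Object`, both versions compile. A minimal illustration, not the project's actual code:

```dart
class ProductNotFoundException implements Exception {}

// Buggy: passes the Type object itself, not an exception instance.
Future<int> buggy() => Future.error(ProductNotFoundException);

// Fixed: constructs and throws an actual instance.
Future<int> fixed() => Future.error(ProductNotFoundException());

Future<void> main() async {
  try {
    await buggy();
  } on ProductNotFoundException {
    print('never reached'); // a Type object does not match this clause
  } catch (e) {
    print(e is ProductNotFoundException); // false: we caught a Type
  }
  try {
    await fixed();
  } on ProductNotFoundException {
    print('caught as intended');
  }
}
```

The practical consequence is the part that kept it hidden: any `on ProductNotFoundException` handler upstream silently stopped matching, while the generic error path still fired and looked plausible in the stack trace.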

The cache that fills itself

There was an open issue (#319) proposing a different shape for the offline problem altogether: bundle a subset of the Open Food Facts database, around 230 MB, directly into the app, and let people search it locally without ever needing a network call. The proposal is workable in principle, and I want to give it the credit it deserves before disagreeing with it. Operationally, though, it is a different conversation entirely from the one this project is currently set up for. Someone has to host the subset somewhere, someone has to keep it updated as the upstream database changes, the binary size doubles for everyone whether they need the offline coverage or not, and most of what ends up being bundled will never be looked at by any particular user, because each person’s logged foods cluster around a small set of things they actually eat in their own kitchen.

The shape that landed in PR #352 is the inverse of that approach, and it took me a while to be sure of it. Every successful network call writes its result to a new RemoteSearchCacheDataSource, keyed by the same identifier the API returned, and every subsequent search consults the cache before going remote. The cache fills naturally, with exactly the items the person using the app actually looks up, and nothing they do not.
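A hedged sketch of that read-through shape: the class name comes from the post, but the real data source is backed by a Hive box with its own field layout, so a plain `Map` stands in for storage here and `FoodLookup` is an illustrative wrapper.

```dart
/// Stand-in for the Hive-backed cache: every successful remote result
/// is written back, keyed by the identifier the API returned.
class RemoteSearchCacheDataSource {
  final _store = <String, Map<String, dynamic>>{};

  Map<String, dynamic>? get(String id) => _store[id];
  void put(String id, Map<String, dynamic> item) => _store[id] = item;
  int get count => _store.length;
}

class FoodLookup {
  final RemoteSearchCacheDataSource cache;
  final Future<Map<String, dynamic>> Function(String id) fetchRemote;
  FoodLookup(this.cache, this.fetchRemote);

  Future<Map<String, dynamic>> byId(String id) async {
    // Consult the cache before going remote.
    final cached = cache.get(id);
    if (cached != null) return cached;
    // On success, fill the cache with exactly what was looked up.
    final fresh = await fetchRemote(id);
    cache.put(id, fresh);
    return fresh;
  }
}
```

The second lookup of the same item never touches the network, which is the whole point: the cache only ever contains things this person has actually asked for.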

The storage cost in practice is the kind of number that ends a conversation, and I felt a quiet relief when I worked the figures out. Someone logging around ten items a day for six months sees about 500 to 1500 unique items in the cache, taking somewhere between 0.3 and 0.9 MB on disk. A heavier user who scans curiously rather than only logging meals might hit 3 MB after six months. That is against the 230 MB the bundled-subset design needed, before you factor in the cost of hosting and updating it.
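The arithmetic behind those figures, assuming an average of roughly 0.6 KB per cached item; that per-item size is my back-of-envelope assumption, since the real number depends on which fields get serialized.

```dart
void main() {
  const perItemKb = 0.6; // assumed average serialized size per item
  for (final items in [500, 1500, 5000]) {
    final mb = items * perItemKb / 1024;
    // 500 -> 0.3 MB, 1500 -> 0.9 MB, 5000 -> 2.9 MB
    print('$items items -> ${mb.toStringAsFixed(1)} MB');
  }
}
```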

There is a settings tile that surfaces the count and on-disk size in human units, with a clear-cache button that is disabled whenever there is nothing to clear. Custom meals are explicitly not touched by the clear, because they are personal templates the person built themselves and the cache is for remote lookups. Mixing the two would have made the button a much more dangerous thing to press, and I would rather move slowly and protect what people have already made than ship something brisker that puts their own work at risk.
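In Flutter terms, the disabled state falls out of the framework for free: a button whose `onPressed` is null renders disabled. A sketch with illustrative names, not the PR's actual widget:

```dart
import 'package:flutter/material.dart';

/// Human-units label for the cache size shown in the settings tile.
String formatCacheSize(int bytes) {
  if (bytes < 1024 * 1024) return '${(bytes / 1024).toStringAsFixed(1)} KB';
  return '${(bytes / (1024 * 1024)).toStringAsFixed(1)} MB';
}

/// Illustrative tile: shows item count and on-disk size, with the
/// clear button disabled (onPressed: null) when there is nothing to clear.
class CacheSettingsTile extends StatelessWidget {
  final int itemCount;
  final int sizeBytes;
  final VoidCallback onClear;

  const CacheSettingsTile({
    super.key,
    required this.itemCount,
    required this.sizeBytes,
    required this.onClear,
  });

  @override
  Widget build(BuildContext context) {
    return ListTile(
      title: const Text('Search cache'),
      subtitle: Text('$itemCount items, ${formatCacheSize(sizeBytes)}'),
      trailing: TextButton(
        // A null onPressed is how Flutter expresses a disabled button.
        onPressed: itemCount > 0 ? onClear : null,
        child: const Text('Clear'),
      ),
    );
  }
}
```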

The fallback that runs even when the network is dead

The third layer is the smallest in code and the easiest to take for granted, and it is also the one I felt most protective about getting right. When a search failed entirely, you used to see nothing at all. The remote source returned an error, the cache had nothing for the query, and the screen showed an empty state. The custom meals you had built yourself, sitting in a local Hive box that has nothing to do with the network, were nowhere on the screen. That is a particular kind of unkindness in app design: hiding work the person already did because the network cannot help them right now. If you have ever opened an app to find your own templates missing because the wifi is patchy, you will know exactly the small flat feeling I mean.

PR #350 changed the search logic so the custom-meal box is queried first, before the remote lookup, and its results are kept regardless of whether the remote source succeeds. If the network is dead, you still see your own templates. If the network is fine, you see your templates first, then the fresh remote results, then, deduplicated against the fresh results, the cached results from previous searches.

That ordering is deliberate, and worth being explicit about, because the choice of what to surface first carries more weight in a search experience than it tends to get credit for. Your own meals come first because they are yours, and the templates you took the time to build deserve to be the easiest thing in the world to find. Recent intake history goes second, because the meal you ate last Tuesday is statistically the meal you are most likely to be logging again today. Fresh remote results sit third, because they are the source of truth whenever the network is willing to give them to you. Cached remote results fall in last, with the dedup helper making fresh results win on collisions, so that you always see the most up-to-date version of an item when the network is cooperating, and the cached version is there as a quiet safety net for the moments when it is not.
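The ordering and the dedup can be sketched in a few lines. Names here are illustrative stand-ins; the real helper in the search bloc may key on something other than a plain id.

```dart
class FoodItem {
  final String id;
  final String name;
  FoodItem(this.id, this.name);
}

/// Merges result sources in priority order: custom meals, recent intake,
/// fresh remote results, then cached results. Because earlier lists claim
/// an id first, a fresh result suppresses the stale cached copy of the
/// same item on collision.
List<FoodItem> mergeResults({
  required List<FoodItem> customMeals,
  required List<FoodItem> recentIntake,
  required List<FoodItem> fresh,
  required List<FoodItem> cached,
}) {
  final seen = <String>{};
  final merged = <FoodItem>[];
  for (final list in [customMeals, recentIntake, fresh, cached]) {
    for (final item in list) {
      if (seen.add(item.id)) merged.add(item);
    }
  }
  return merged;
}
```

The nice property of doing dedup by insertion order is that the priority rule lives in exactly one place: reordering the list of lists is the entire policy change.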

What this combination feels like to use

The interesting thing about layered fallbacks is how invisible they are when they work, which is also the thing that makes them so easy to undervalue. Someone who has never had a search fail does not notice that there are three layers underneath the results she is seeing. Someone logging a meal in a basement carpark with no signal opens the app, taps a custom meal, and gets exactly the same experience she would have at home with full bars. Someone searching while the Supabase FDC mirror is having a bad afternoon sees her previous lookups instead of an empty screen. The app does not tell her anything has gone wrong, because from where she is sitting nothing visible has gone wrong, and that is exactly the point.

That is the goal, and I want to say it out loud rather than rush past it. Resilience is not a feature anyone notices. It is the absence of the failure modes that would otherwise have made someone stop using the app. The work that went into these three PRs is the work that means a barcode scan in a supermarket on a Wednesday evening does not make someone scan three times before giving up and switching to MyFitnessPal.

The thing I keep coming back to, when I am tired at the end of a long day and reading back over what landed, is how cheap the resilience actually is in absolute terms. The retry helper is fewer than a hundred lines once you count the tests. The cache is a Hive box and a small dedup helper. The custom-meal-first ordering is a couple of lines tucked into the search bloc. None of those pieces are difficult to write or expensive to maintain, and yet their cumulative effect is the difference between an app that is technically functional in the demo and an app that is reliable enough to keep using when the world around it is being uncooperative.

That ratio of effort to outcome is unusually generous, and there is something quietly beautiful about how small the changes were. The people on the receiving end of these three small layers, the ones quietly logging their meals in basement carparks and supermarkets and hospital cafeterias and at kitchen tables where the wifi is patchy, are the people I most want this app to keep working for. I am still thinking about how much of good engineering turns out to be this kind of work, and I do not have a clean answer about it, but I wanted to put it down somewhere while it was fresh. Reliability is not glamorous, and it does not make for a screenshot, and the people it helps will mostly never know it was a decision somebody made on their behalf. That is alright. That is the kind of work I want to default to building from now on, quietly, and I hope this is the version of the app that is there for you on the evening you most need it to be.

This post is licensed under CC BY 4.0 by the author.