What's Wrong With Beer Recommendation Systems (And What PINtPOINT Does Differently)
Every couple of years, a smart data-science student builds a beer recommender on top of Untappd or RateBeer data and posts the write-up. The models are often solid. The question they rarely answer is simpler: is this useful to someone standing at a bar tonight?
The best beer recommendation system is useless
if it ignores what's actually pouring tonight.
Two pieces are worth reading as a starting point:
- Ethan Haley — Untappd as a recommender (RPubs, 2021)
- NYC Data Science — NINKASI: Beer Recommender System (2020)
Haley's piece treats Untappd's rating data as fuel for a collaborative-filtering recommender. Ninkasi scrapes RateBeer, runs SVD++ and Restricted Boltzmann Machines, and wraps the output in a Flask app. Both are careful, honest work — the kind you wish more consumer-app companies did publicly.
They also both bump into the same three walls, and those walls are what PINtPOINT was designed around rather than over.
Wall 1: The cold-start problem
Collaborative filtering needs history. These models need dozens of ratings per user before predictions stabilise. What does a rating-based recommender serve to a brand-new user with zero check-ins?
In practice: popular beers. Pliny the Elder for anyone with "IPA" in their signature. Guinness for anyone who once rated a stout. That's not a recommendation. It's a bestseller list.
PINtPOINT uses a structured preference-elicitation step instead: a feature called Sip-or-Skip. Ten quick card-swipes ("would you order this? yes/no") produce a usable style profile in under a minute. It's the same principle Tinder uses for matchmaking: forced binary choices beat free-form ratings for fast signal.
Asking "rate this beer 1-5" is a harder cognitive task than "would you order this now?" — and it produces noisier data.
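One way to picture how a handful of binary swipes can become a style profile is a smoothed yes-rate per style. This is a minimal sketch under stated assumptions, not PINtDEXTER's actual method: the function name `style_profile`, the pseudo-count prior, and the style labels are all hypothetical.

```python
from collections import defaultdict


def style_profile(swipes, prior_yes=1.0, prior_no=1.0):
    """Turn (style, liked) swipes into a per-style preference in [0, 1].

    Assumption: a smoothed yes-rate with pseudo-counts, so a single swipe
    nudges a style away from neutral without jumping straight to 0 or 1.
    """
    yes = defaultdict(float)
    total = defaultdict(float)
    for style, liked in swipes:
        total[style] += 1.0
        if liked:
            yes[style] += 1.0
    return {
        style: (yes[style] + prior_yes) / (total[style] + prior_yes + prior_no)
        for style in total
    }


# Ten hypothetical swipes: (style shown on the card, swiped "yes"?)
swipes = [
    ("ipa", True), ("ipa", True), ("ipa", False),
    ("stout", False), ("stout", False),
    ("sour", True), ("sour", True),
    ("lager", True), ("lager", False),
    ("bitter", True),
]
profile = style_profile(swipes)
```

With these ten swipes, sours end up above IPAs, and stouts sit well below neutral, which is already enough to order a short tap list.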
Wall 2: Popularity bias
Any model trained on open rating data learns what's popular long before it learns what's personal. RateBeer and Untappd both have a heavy head: the top 1% of beers collect 50%+ of ratings. A recommender trained on that data is, in effect, a popularity predictor disguised as personalisation.
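The "heavy head" is easy to measure on any ratings dump. The sketch below computes the share of ratings that goes to the most-rated slice of beers. The function name `head_share` and the toy data are assumptions for illustration, not figures from RateBeer or Untappd.

```python
from collections import Counter


def head_share(rated_beer_ids, top_fraction=0.01):
    """Share of all ratings collected by the most-rated `top_fraction` of beers."""
    counts = Counter(rated_beer_ids)
    ranked = sorted(counts.values(), reverse=True)
    if not ranked:
        return 0.0
    # Always include at least one beer, even in a tiny catalogue.
    k = max(1, int(len(ranked) * top_fraction))
    return sum(ranked[:k]) / sum(ranked)


# Toy data: 100 beers, one of which collects 101 of the 200 ratings.
ratings = ["hype_beer"] * 101 + [f"beer_{i}" for i in range(99)]
share = head_share(ratings, top_fraction=0.01)
```

In this toy catalogue a single beer holds just over half of all ratings, so a model that minimises average rating error has a strong incentive to get that one beer right first.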
This shows up in practice as a kind of ceiling. Haley's model and Ninkasi can both predict ratings well, but "predict rating" and "predict the pint you'll actually enjoy ordering next" are different tasks. Rating prediction is a mature problem; predicting the pint someone will actually want next is far from solved.
PINtPOINT's approach is to use pair-choice rounds (Head-to-Head) where both options are plausible for the user — forcing the model to learn genuine preference gradients, not just which beer is on the hype cycle this month.
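A common way to learn from pair choices is a Bradley-Terry style logistic update, sketched below. This is a generic technique swapped in for illustration; the source does not say which model Head-to-Head uses, and `head_to_head_update`, the learning rate, and the style labels are hypothetical.

```python
import math


def head_to_head_update(scores, winner, loser, lr=0.5):
    """One logistic (Bradley-Terry style) step on per-style scores.

    The step is large when the outcome was surprising and small when the
    model already expected the winner, so repeated rounds converge.
    """
    s_w = scores.get(winner, 0.0)
    s_l = scores.get(loser, 0.0)
    p_win = 1.0 / (1.0 + math.exp(-(s_w - s_l)))  # predicted P(winner chosen)
    step = lr * (1.0 - p_win)
    updated = dict(scores)  # leave the caller's dict untouched
    updated[winner] = s_w + step
    updated[loser] = s_l - step
    return updated


# The user picks a saison over an IPA twice in a row.
round_1 = head_to_head_update({}, winner="saison", loser="ipa")
round_2 = head_to_head_update(round_1, winner="saison", loser="ipa")
```

The second round moves the scores less than the first, because the model has already learned part of the preference. Pairing options that are both plausible keeps `p_win` near 0.5, which is where each choice carries the most information.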
Wall 3: Geography-blindness
The biggest practical problem with nearly every public beer recommender is that it doesn't know, and doesn't try to know, what you can actually order right now.
Ninkasi's output is: "you'll probably like Westmalle Tripel." Great. The bar in front of you is a 30-tap craft beer garden in Shoreditch. Which of those 30 taps is actually the Westmalle-flavoured experience? The model can't say, because it only knows beers, not venues.
PINtPOINT scores your style profile against the live tap lists at pubs near you. The recommendation engine runs on what's pouring, not a global beer catalogue:
- You walk into range of a venue tracked by PINtPOINT
- The app pulls its current tap list (cask + keg, freshness-indicated)
- Your PINtDEXTER profile scores each line
- The top match gets flagged with a taste-match percentage
The unit of recommendation is a pint that exists, on a bar you can reach, tonight — not an abstract beer name you'll screenshot and forget.
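The steps above can be sketched as a small ranking function over a venue's tap list. Everything here is an assumption for illustration: the `TapLine` type, the `rank_taps` function, the neutral fallback for unseen styles, and the beer names are hypothetical, and a real taste-match percentage would likely use more than style alone.

```python
from dataclasses import dataclass


@dataclass
class TapLine:
    name: str
    style: str


def rank_taps(profile, taps, neutral=0.5):
    """Score each tap line against a style profile, best match first.

    Returns (match_percent, beer_name) pairs. Styles missing from the
    profile fall back to `neutral` rather than being excluded.
    """
    scored = [(round(100 * profile.get(t.style, neutral)), t.name) for t in taps]
    # sorted() is stable, so ties keep the venue's own tap order.
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


profile = {"ipa": 0.6, "stout": 0.25, "sour": 0.75}
tap_list = [
    TapLine("House Bitter", "bitter"),
    TapLine("Dry Stout", "stout"),
    TapLine("Session IPA", "ipa"),
    TapLine("Kettle Sour", "sour"),
]
ranked = rank_taps(profile, tap_list)
top_match_percent, top_match_name = ranked[0]
```

Because the candidate set is only what the venue is pouring, the ranking never surfaces a beer the user cannot order.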
What a good beer recommender is actually solving for
Haley and Ninkasi are both solving "predict this user's rating of beer X." That's a well-defined ML task. It trains, it evaluates, it publishes nicely.
The consumer-facing question is different:
"Of the pints I can order right now, which should I pick?"
That's a smaller problem — and a more tractable one. You don't need to predict every beer in the world. You need to rank the 8-20 beers on the bar in front of the user, with enough preference signal to be confidently better than "just pick the IPA". Once the problem is scoped that way, cold start, popularity bias, and geography-blindness all shrink.
How PINtDEXTER layers the signals
The engine inside PINtPOINT is called PINtDEXTER. It combines:
- Sip-or-Skip — fast binary swipes for cold-start signal
- Head-to-Head — pair rounds that sharpen an existing taste profile
- Preferred styles — an interpretable layer across 13 style families
- TUNeDEXTER sliders — manual overrides when the user disagrees
- Safe / Adventurous toggle — a single dial that tilts venue scoring toward focused taprooms or wide-range ones
- Venue-aware ranking — recommendations scored against live taps nearby
The layered structure is deliberate: the user can see why the app recommended what it did, and correct it if they want.
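One way to keep the layers inspectable is to store each style's value together with the layer it came from, so a manual slider visibly overrides a learned value. This is a sketch under stated assumptions: `layered_profile`, `explain`, the source labels, and the neutral default are hypothetical, and the Safe / Adventurous toggle is left out.

```python
def layered_profile(learned, slider_overrides):
    """Merge learned style values with manual slider overrides.

    Each style maps to (value, source) so the app can show where a
    number came from. Slider values are clamped to [0, 1].
    """
    layered = {style: (value, "learned") for style, value in learned.items()}
    for style, value in slider_overrides.items():
        layered[style] = (min(1.0, max(0.0, value)), "slider")
    return layered


def explain(layered, style, neutral=0.5):
    """Human-readable reason for a style's current weight."""
    value, source = layered.get(style, (neutral, "default"))
    return f"{style}: {round(100 * value)}% ({source})"


learned = {"ipa": 0.6, "stout": 0.25}
layered = layered_profile(learned, slider_overrides={"stout": 0.9})
stout_reason = explain(layered, "stout")
ipa_reason = explain(layered, "ipa")
```

Here the user has pushed the stout slider up, so the explanation reports the slider rather than the learned value, while the IPA weight is still attributed to the swipes.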
What we're not claiming
PINtDEXTER isn't a breakthrough in recommender research. It's not going to out-predict SVD++ on a RateBeer leaderboard. Those models and the people who build them are doing harder work than what's in a consumer app.
What we think it offers is a better answer to the practical question a drinker is actually asking: what should I get? Framed as a venue-bound, tonight-shaped problem, the recommender gets to use much smaller, cleaner inputs and deliver a sharper output.
Where the public recommender projects are still valuable
Honest respect to both linked pieces. A few things they get exactly right:
- Ratings data does have real structure. Style co-occurrences and per-user bias are picked up cleanly by matrix-factorisation approaches like SVD++.
- RateBeer and Untappd are generous corpora by industry standards — the beer world is unusually open about its ratings.
- Transparent methodology, working Flask/R web apps, and public notebooks make for a great cultural norm. The models linked above could be dropped into a real product with modest effort.
If you're interested in the ML side of beer recommendation specifically, both links in the intro are worth reading end-to-end. If you're interested in what to drink tonight, PINtPOINT will answer that question faster.
Most beer recommenders predict ratings.
PINtPOINT picks pints.
Frequently asked questions
Why is the cold start problem so hard for beer recommenders?
Collaborative-filtering models like SVD++ or RBMs need dozens of ratings per user before predictions stabilise. A brand-new user gets "popular beers" as a fallback, which isn't personalisation — it's a leaderboard. PINtPOINT solves this with Sip-or-Skip: ~10 binary card swipes yield a usable style profile on first use.
What's wrong with using Untappd ratings as the main input to a recommender?
Untappd ratings are a biased sample — heavy-hitting releases collect disproportionate ratings, so a model trained on them learns popularity more than preference. Recommenders built on purpose-collected pair-choice signals, rather than free-form reviews, produce sharper personalisation.
How does PINtPOINT's PINtDEXTER recommender work?
Three layers: (1) Sip-or-Skip + Head-to-Head pair rounds build a style profile from binary choices; (2) that profile derives 13 style-category sliders the user can fine-tune directly; (3) the live tap lists at nearby venues are scored against the profile, so recommendations are always tied to beer actually in front of you.
Why do most beer recommenders ignore location?
Because the training data doesn't include it, or the system is built as a catalogue recommender (what beers exist) rather than an availability recommender (what's pouring near you tonight). A recommendation for a beer you'll never order isn't really a recommendation.
Is a good beer recommender even possible given how noisy ratings are?
It depends what you're predicting. Predicting star ratings from other star ratings hits a noise floor fast. Predicting "of the 4 pints on the bar in front of you, which one will you enjoy most" is a narrower, more tractable problem.
How is this different from Next Glass, BeerMenus, or Untappd's own recommendations?
Next Glass (which acquired Untappd in 2016) and similar rating-driven systems optimise for catalogue-style historical preference. PINtPOINT optimises for the real-time decision at a specific venue. The two are complementary: Untappd is the diary, PINtPOINT is the decision engine.
- Ethan Haley, "Untappd as a recommender" (RPubs, 2021)
- NYC Data Science, "NINKASI: Beer Recommender System" (NYC Data Science blog, 2020)
- Ninkasi Beer recommender app (Internet Archive)
If you're a data scientist, go read the linked work — it's good. If you're a beer drinker, try PINtPOINT and let PINtDEXTER pick your next pint based on the tap list in front of you, not a global leaderboard.