May 26, 2026

The latest full-universe model results from the QA-vetted prediction dataset are now live on Project Three Star.

Every year since 1936, Michelin has updated its guidebooks, awarding or removing up to three stars for each restaurant it inspects. The inspection process is kept secret to avoid conflicts of interest, and stars are awarded sparingly. The stars have become one of the most recognized and respected awards a restaurant can achieve, with thousands of chefs from all corners of the globe devoting their lives to the pursuit of just a single star.

In fact, chefs take the stars so seriously that renowned figures such as Gordon Ramsay have wept over losing them, and some chefs have even taken their own lives.

“I started crying when I lost my stars. It's a very emotional thing for any chef. It's like losing a girlfriend. You want her back. I think every top chef in the world, from Alain Ducasse to Guy Savoy, when you lose a star it's like losing the Champions League. There's next year. So it's not done forever that you can't win those things back. I got asked the question literally two weeks ago on holiday. What would you do if you ever lost your third star? Honestly? I would win it back. It's nice to stay focused.”
— Gordon Ramsay, Celebrity Chef

And it’s no wonder. A single star can send customers flocking to a restaurant, bringing both fame and profit. The late, great Joël Robuchon, who himself held an astounding record of 32 Michelin stars, once said that “with one Michelin star, you get about 20 percent more business. Two stars, you do about 40 percent more business, and with three stars, you’ll do about 100 percent more business.”

Customers and investors alike are therefore keen to predict which restaurants will eventually earn a Michelin star, before the competition for tables and deals gets fierce.

In the latest training set, the imbalance is stark: 63,523 Michelin-covered examples are treated as No Star, compared with 2,809 one-star restaurants, 481 two-star restaurants, and just 144 three-star restaurants.

That imbalance makes star detection hard enough that a standard ML model can score lower on raw accuracy than a single line of code that always outputs No Star. The model has to find the rare signal without turning every expensive, highly rated restaurant into a false three-star prediction.
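To make the imbalance concrete, here is a minimal sketch of the always-No-Star baseline, using only the class counts quoted above. The variable and function names are ours, chosen for illustration. It shows why raw accuracy is a misleading target here and why a class-balanced score such as macro F1 is the more useful yardstick.

```python
# Class counts quoted in the text (Michelin-covered training rows).
counts = {"No Star": 63_523, "1 Star": 2_809, "2 Stars": 481, "3 Stars": 144}
total = sum(counts.values())

# Baseline: ignore every feature and always predict the majority class.
baseline_accuracy = counts["No Star"] / total


def f1_always_majority(label: str) -> float:
    """F1 for one class when every row is predicted as 'No Star'."""
    if label != "No Star":
        return 0.0  # the class is never predicted, so its F1 is zero
    precision = counts["No Star"] / total  # every prediction is 'No Star'
    recall = 1.0  # every actual 'No Star' row is recovered
    return 2 * precision * recall / (precision + recall)


# Macro F1 averages the per-class F1 scores with equal weight per class.
baseline_macro_f1 = sum(f1_always_majority(c) for c in counts) / len(counts)

print(f"baseline accuracy: {baseline_accuracy:.1%}")
print(f"baseline macro F1: {baseline_macro_f1:.1%}")
```

On these counts the baseline lands at roughly 94.9% accuracy but only about 24.3% macro F1, because it never finds a single starred restaurant.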

The live site now uses the Vertex full-universe exact star-tier model. It scores restaurants directly into No Star, 1 Star, 2 Stars, or 3 Stars, then filters the result through the frontier-LLM audience-trust and data-quality review before publication.

| Metric | Value | Description |
| --- | --- | --- |
| Restaurants scored | 77,356 | full universe scored by the Vertex star-tier model |
| Published rows | 74,081 | site-facing rows after audience-trust and data-quality QA |
| Star predictions | 6,126 | published rows predicted as 1, 2, or 3 stars |
| New-star watchlist | 5,239 | current non-starred restaurants predicted to earn stars |
| No Star | 67,955 | published restaurants predicted not to receive stars |
| 1 Star | 4,945 | published restaurants predicted at the one-star tier |
| 2 Stars | 950 | published restaurants predicted at the two-star tier |
| 3 Stars | 231 | published restaurants predicted at the three-star tier |

The published table contains 6,126 restaurants predicted to earn at least one star. Of those, 5,239 are current non-starred restaurants, making them the primary discovery watchlist.
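The headline counts can be cross-checked with a few lines of arithmetic. This is a small sketch using only the published figures; the variable names are ours.

```python
# Published per-tier prediction counts from the summary figures.
published_by_tier = {"No Star": 67_955, "1 Star": 4_945, "2 Stars": 950, "3 Stars": 231}

published_rows = sum(published_by_tier.values())
star_predictions = published_rows - published_by_tier["No Star"]

# Rows predicted to earn stars that do not currently hold one.
watchlist = 5_239
watchlist_share = watchlist / star_predictions
already_starred = star_predictions - watchlist

print(f"published rows:   {published_rows:,}")
print(f"star predictions: {star_predictions:,}")
print(f"watchlist share:  {watchlist_share:.1%}")
print(f"already starred:  {already_starred:,}")
```

The tiers sum to the 74,081 published rows and the three star tiers sum to 6,126. About 85.5% of star predictions are watchlist candidates, which leaves 887 predicted-star rows that are not current non-starred restaurants.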

We tried several model shapes before settling on the exact star-tier model. The binary model was useful for finding starred candidates, but it produced labels like “Starred candidate,” which is not the product we wanted to publish. The six-class Michelin-status model tried to distinguish Selected and Bib Gourmand at the same time as star tiers, which weakened the rare-star signal.

| Model tried | Output | Validation read | Decision |
| --- | --- | --- | --- |
| HistGradientBoosting exact star-tier model (published model) | No Star, 1 Star, 2 Stars, 3 Stars | 89.9% accuracy, 42.5% macro F1, 92.3% weighted F1 | Used for the live site because it predicts the exact public classes we need and had the strongest grouped validation among tiered models. |
| Calibrated binary XGBoost | Guide candidate / Starred candidate | 68.4% starred precision, 58.1% recall, 70.5% average precision | Kept as a ranking diagnostic, but not used for the public table because it collapsed the outcome into candidate labels instead of tiers. |
| Six-class city-universe XGBoost | Not in Guide, Selected, Bib, 1, 2, 3 Stars | 73.8% accuracy, 32.6% macro F1, 0.794 log loss | Rejected for the site because it mixed Selected and Bib Gourmand with star tiers and was weaker on grouped out-of-fold validation. |
| Baseline six-class XGBoost holdout | Not in Guide, Selected, Bib, 1, 2, 3 Stars | 80.7% accuracy, 34.1% macro F1 | Useful as an early baseline, but the split was easier than the city-grouped validation and did not solve the final star-tier publishing problem. |

The selected HistGradientBoosting model is not perfect on rare two-star and three-star classes, but it was the best fit for the live product: stronger grouped validation than the six-class tiered model, exact public labels, and no forbidden binary candidate labels.

Validation uses city-grouped out-of-fold splits so the model is tested on held-out geographies instead of memorizing a city’s restaurant mix. The selected model reaches 89.9% exact-tier accuracy and 42.5% macro F1 across the four published classes.
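As an illustration of what city-grouped out-of-fold splitting means, here is a minimal pure-Python sketch. It is not the project's actual splitter: the function names, the greedy balancing rule, and the toy city list are all assumptions made for the example. Libraries such as scikit-learn offer an equivalent in `GroupKFold`.

```python
from collections import defaultdict


def city_grouped_folds(cities, n_folds=5):
    """Assign each row to a fold so that all rows from one city share a fold.

    Greedy balancing (an assumption of this sketch): the largest cities are
    placed first, each into the fold that currently holds the fewest rows.
    """
    rows_by_city = defaultdict(list)
    for row_index, city in enumerate(cities):
        rows_by_city[city].append(row_index)

    fold_sizes = [0] * n_folds
    fold_of_row = [None] * len(cities)
    # Sort by descending size, then by name so the result is deterministic.
    ordered = sorted(rows_by_city.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    for _city, rows in ordered:
        target = fold_sizes.index(min(fold_sizes))
        for row_index in rows:
            fold_of_row[row_index] = target
        fold_sizes[target] += len(rows)
    return fold_of_row


def out_of_fold_splits(fold_of_row, n_folds):
    """Yield (train_indices, test_indices) with one fold held out each time."""
    for held_out in range(n_folds):
        train = [i for i, f in enumerate(fold_of_row) if f != held_out]
        test = [i for i, f in enumerate(fold_of_row) if f == held_out]
        yield train, test


# Hypothetical toy data: one city label per restaurant row.
cities = ["Paris", "Paris", "Tokyo", "Tokyo", "Tokyo", "Lyon", "Osaka", "Osaka", "Rome"]
folds = city_grouped_folds(cities, n_folds=3)

for train, test in out_of_fold_splits(folds, 3):
    train_cities = {cities[i] for i in train}
    test_cities = {cities[i] for i in test}
    # No city ever appears on both sides of a split.
    assert train_cities.isdisjoint(test_cities)
```

Because every prediction is made by a model that never saw the restaurant's city during training, the resulting scores are harder to earn than those from a random row-level split.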

| Metric | Value | Description |
| --- | --- | --- |
| Exact-tier accuracy | 89.9% | city-group out-of-fold validation on Michelin-covered training rows |
| Macro F1 | 42.5% | balanced score across No Star, 1 Star, 2 Stars, and 3 Stars |
| Weighted F1 | 92.3% | class-frequency weighted F1 across all four published classes |
| 1-star recall | 66.4% | known one-star restaurants recovered in grouped validation |

Confusion matrix (rows are actual classes, columns are predicted classes; diagonal cells are correct calls):

| Actual | Predicted No Star | Predicted 1 Star | Predicted 2 Stars | Predicted 3 Stars |
| --- | --- | --- | --- | --- |
| No Star | 58,094 | 4,406 | 731 | 292 |
| 1 Star | 281 | 1,864 | 517 | 147 |
| 2 Stars | 19 | 213 | 176 | 73 |
| 3 Stars | 4 | 38 | 42 | 60 |

The matrix also shows the central limitation: distinguishing 2-star and 3-star restaurants is still hard because there are so few examples. We publish those rows as ranked signals, not next-guide guarantees.
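The headline metrics can be recomputed directly from the confusion matrix above. The sketch below uses only those counts and the standard definitions of precision, recall, and F1; the variable names are ours.

```python
labels = ["No Star", "1 Star", "2 Stars", "3 Stars"]

# Rows are actual classes, columns are predicted classes.
matrix = [
    [58_094, 4_406, 731, 292],  # actual No Star
    [281, 1_864, 517, 147],     # actual 1 Star
    [19, 213, 176, 73],         # actual 2 Stars
    [4, 38, 42, 60],            # actual 3 Stars
]

total = sum(sum(row) for row in matrix)
accuracy = sum(matrix[i][i] for i in range(len(labels))) / total

per_class = {}
for i, label in enumerate(labels):
    actual = sum(matrix[i])                   # row sum: true members of the class
    predicted = sum(row[i] for row in matrix)  # column sum: rows called this class
    correct = matrix[i][i]
    per_class[label] = {
        "precision": correct / predicted,
        "recall": correct / actual,
        "f1": 2 * correct / (actual + predicted),
        "support": actual,
    }

# Macro F1 weights every class equally; weighted F1 weights by class size.
macro_f1 = sum(m["f1"] for m in per_class.values()) / len(labels)
weighted_f1 = sum(m["f1"] * m["support"] for m in per_class.values()) / total

print(f"accuracy    {accuracy:.1%}")
print(f"macro F1    {macro_f1:.1%}")
print(f"weighted F1 {weighted_f1:.1%}")
for label, m in per_class.items():
    print(f"{label:8s} precision {m['precision']:.1%}  recall {m['recall']:.1%}")
```

This reproduces the reported 89.9% accuracy, 42.5% macro F1, 92.3% weighted F1, and 66.4% one-star recall. It also surfaces the per-class precision that the summary cards omit: roughly 28.6% for 1 Star, 12.0% for 2 Stars, and 10.5% for 3 Stars, which is why star predictions are best read as ranked signals.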

Before choosing the exact star-tier model, we used the calibrated binary model to test whether the underlying guide and starred signals behaved like probabilities. This chart is from that diagnostic pass.

Figure: Out-of-fold calibration reliability line charts for in-guide and starred predictions. The dotted diagonal represents perfect calibration. The binary model was useful as a ranking and calibration diagnostic, but the public site needed exact star-tier outputs.

Why it matters

We used calibration diagnostics to check whether higher scores actually corresponded to higher hit rates, especially before moving from binary star detection into exact star tiers.
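A reliability check of this kind can be sketched in a few lines: bin the predictions by score, then compare each bin's mean score with its observed hit rate. The function name, the equal-width binning, and the toy scores below are assumptions for illustration, not the project's data or code.

```python
def reliability_bins(scores, outcomes, n_bins=5):
    """Compare mean predicted score with observed hit rate per score bin.

    Returns a list of (bin_index, count, mean_score, hit_rate) tuples.
    """
    bins = [[] for _ in range(n_bins)]
    for score, outcome in zip(scores, outcomes):
        # Equal-width bins on [0, 1]; a score of exactly 1.0 goes in the last bin.
        index = min(int(score * n_bins), n_bins - 1)
        bins[index].append((score, outcome))

    table = []
    for index, members in enumerate(bins):
        if not members:
            continue  # skip empty bins rather than report a made-up rate
        mean_score = sum(s for s, _ in members) / len(members)
        hit_rate = sum(o for _, o in members) / len(members)
        table.append((index, len(members), mean_score, hit_rate))
    return table


# Hypothetical out-of-fold scores and true labels (1 = starred).
scores = [0.05, 0.10, 0.15, 0.25, 0.30, 0.35, 0.45, 0.55, 0.62, 0.65, 0.75, 0.85, 0.90]
outcomes = [0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 1, 1, 1]

for index, count, mean_score, hit_rate in reliability_bins(scores, outcomes):
    print(f"bin {index}: n={count}  mean score={mean_score:.2f}  hit rate={hit_rate:.2f}")
```

A well-calibrated model produces bins whose hit rate tracks the mean score, which is the dotted diagonal on a reliability chart. Bins that sit below the diagonal indicate overconfident scores.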

What the graph shows

The earlier binary guide and starred signals were directionally reliable enough to rank restaurants, but the binary model could not answer the product question: No Star, 1 Star, 2 Stars, or 3 Stars.

How we use it

The final star-tier model keeps the ranked signal, while the QA pass removes rows whose place identity, city, cuisine, or audience-trust profile would make publication risky.

The model output is not published raw. We remove rows flagged by the LLM due-diligence pass for audience-trust risk, bad city values, country mismatches, non-restaurant place types, or cuisine fields that are not actually cuisines.

| Metric | Value | Description |
| --- | --- | --- |
| Published after QA | 74,081 | scored rows that survived audience-trust and data-quality filters |
| Audience-trust holds | 715 | held back by manual-review or suppress/fix recommendations |
| Data-quality removals | 2,560 | removed for bad city, country, cuisine, or place-type signals |

The current live table keeps 74,081 of 77,356 scored rows. It excludes 715 audience-trust holds and 2,560 data-quality removals before the dataset is bundled into the production site.
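The filtering step can be pictured as a three-way partition of scored rows. The sketch below is hypothetical throughout: the flag names, the `ScoredRow` type, the example rows, and the rule that an audience-trust flag takes precedence over a data-quality flag are assumptions, not the production pipeline. The last lines reconcile the counts reported above.

```python
from dataclasses import dataclass

# Hypothetical flag names, modelled on the removal reasons listed in the text.
AUDIENCE_TRUST_FLAGS = frozenset({"audience_trust_risk"})
DATA_QUALITY_FLAGS = frozenset(
    {"bad_city", "country_mismatch", "non_restaurant_place", "invalid_cuisine"}
)


@dataclass(frozen=True)
class ScoredRow:
    name: str
    predicted_tier: str
    flags: frozenset = frozenset()


def qa_partition(rows):
    """Split rows into published, audience-trust holds, and data-quality removals.

    Each row lands in exactly one bucket. Trust flags are checked first,
    which is an assumption of this sketch.
    """
    published, trust_holds, quality_removals = [], [], []
    for row in rows:
        if row.flags & AUDIENCE_TRUST_FLAGS:
            trust_holds.append(row)
        elif row.flags & DATA_QUALITY_FLAGS:
            quality_removals.append(row)
        else:
            published.append(row)
    return published, trust_holds, quality_removals


# Hypothetical example rows.
rows = [
    ScoredRow("Example A", "1 Star"),
    ScoredRow("Example B", "No Star", frozenset({"bad_city"})),
    ScoredRow("Example C", "2 Stars", frozenset({"audience_trust_risk", "invalid_cuisine"})),
    ScoredRow("Example D", "No Star", frozenset({"non_restaurant_place"})),
]
published, trust_holds, quality_removals = qa_partition(rows)
print(len(published), len(trust_holds), len(quality_removals))

# Reconcile the counts reported in the text.
scored_total = 77_356
reported_published = scored_total - 715 - 2_560
retention = reported_published / scored_total
print(f"published {reported_published:,} of {scored_total:,} ({retention:.1%})")
```

The reported figures reconcile exactly: 77,356 minus 715 minus 2,560 is 74,081, so about 95.8% of scored rows reach the site and each excluded row is counted in exactly one bucket.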

These are the highest-confidence current non-starred restaurants from the live QA-vetted watchlist. They are not presented as next-cycle guarantees; they are the restaurants the model most strongly believes deserve a closer look.

| Rank | Restaurant | City | Current status | Star score |
| --- | --- | --- | --- | --- |
| 1 | Vue | Singapore | Selected Restaurants | 91% |
| 2 | Prime Steak & Wine | Budapest | Not in Michelin Guide | 86% |
| 3 | Flagstaff House | Boulder | Not in Michelin Guide | 85% |
| 4 | Cafe Monarch | Scottsdale | No Star | 81% |
| 5 | Roberto's Dubai | Dubai | Not in Michelin Guide | 80% |
| 6 | California Grill | Lake Buena Vista | Not in Michelin Guide | 80% |
| 7 | Terasa U Zlaté studně | Prague | Not in Michelin Guide | 80% |
| 8 | Gallo 71 | San Pedro Garza García | Not in Michelin Guide | 79% |
| 9 | Trèsind Dubai | Dubai | Selected Restaurants | 78% |
| 10 | Cantinetta Antinori Vienna | Vienna | Not in Michelin Guide | 77% |
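A watchlist like the one above amounts to a filter and a sort. The sketch below assumes that ranking is by star score among restaurants that do not currently hold a star. The first three rows come from the table; the last row and all field names are hypothetical.

```python
STARRED_STATUSES = {"1 Star", "2 Stars", "3 Stars"}

# (restaurant, city, current_status, star_score)
rows = [
    ("Flagstaff House", "Boulder", "Not in Michelin Guide", 0.85),
    ("Vue", "Singapore", "Selected Restaurants", 0.91),
    ("Prime Steak & Wine", "Budapest", "Not in Michelin Guide", 0.86),
    # Hypothetical row: already starred, so it must not reach the watchlist.
    ("Hypothetical Starred Bistro", "Example City", "1 Star", 0.95),
]


def build_watchlist(rows, top_n=10):
    """Keep current non-starred restaurants and rank them by star score."""
    candidates = [row for row in rows if row[2] not in STARRED_STATUSES]
    ranked = sorted(candidates, key=lambda row: row[3], reverse=True)
    return ranked[:top_n]


watchlist = build_watchlist(rows)
for rank, (name, city, status, score) in enumerate(watchlist, start=1):
    print(f"{rank}. {name} ({city}, {status}): {score:.0%}")
```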

The full prediction table contains the broader ranked universe, including guide candidates, current Michelin restaurants, and lower-confidence rows.
