Every year since 1936, Michelin has updated its guidebooks, awarding or removing up to three stars for each restaurant it inspects. The inspection process is shrouded in secrecy to avoid conflicts of interest, and stars are awarded sparingly. The stars have since become one of the most recognized and respected awards a restaurant can achieve, with thousands of chefs from all corners of the globe devoting their lives to the pursuit of just a single star.
In fact, Michelin-starred cooks take the stars so seriously that renowned chefs, such as Gordon Ramsay, have wept over the loss of one or more stars, and some chefs have even taken their own lives.
“I started crying when I lost my stars. It's a very emotional thing for any chef. It's like losing a girlfriend. You want her back. I think every top chef in the world, from Alain Ducasse to Guy Savoy, when you lose a star it's like losing the Champions League. There's next year. So it's not done forever that you can't win those things back. I got asked the question literally two weeks ago on holiday. What would you do if you ever lost your third star? Honestly? I would win it back. It's nice to stay focused.”
And it’s no wonder. A single star can have customers flocking to that designated restaurant, bringing about both immense amounts of fame and profit. The late and great Joël Robuchon, who himself had an absolutely astounding record of 32 Michelin stars, once said that “with one Michelin star, you get about 20 percent more business. Two stars, you do about 40 percent more business, and with three stars, you’ll do about 100 percent more business.”
Customers and investors alike are therefore very keen to predict which restaurants will eventually earn a Michelin star before the competition gets fierce.
In the latest training set, the imbalance is stark: 63,523 Michelin-covered examples are treated as No Star, compared with 2,809 one-star restaurants, 481 two-star restaurants, and just 144 three-star restaurants.
That imbalance makes starred restaurants hard to identify: on raw accuracy, a standard ML model can easily lose to a single line of code that always outputs No Star, because that trivial rule is already right about 94.9% of the time on this training set. The model has to find rare signal without turning every expensive, highly rated restaurant into a fake three-star prediction.
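To make the imbalance concrete, here is a minimal sketch that scores an always-No-Star rule using only the class counts quoted above. The variable names are illustrative, and the rule itself is a hypothetical baseline, not one of the models tried for the site.

```python
# Class counts taken from the training-set figures quoted above.
class_counts = {
    "No Star": 63_523,
    "1 Star": 2_809,
    "2 Stars": 481,
    "3 Stars": 144,
}

total = sum(class_counts.values())
starred_total = total - class_counts["No Star"]

# A rule that always answers "No Star" is right on every No Star row
# and wrong on every starred row.
baseline_accuracy = class_counts["No Star"] / total

# Per-class F1 for that rule: only the No Star class has true positives.
# For No Star: TP = all No Star rows, FP = all starred rows, FN = 0.
tp = class_counts["No Star"]
fp = starred_total
no_star_f1 = 2 * tp / (2 * tp + fp)

# The three starred classes are never predicted, so their F1 is 0.
baseline_macro_f1 = (no_star_f1 + 0 + 0 + 0) / len(class_counts)

print(f"always-No-Star accuracy: {baseline_accuracy:.1%}")
print(f"always-No-Star macro F1: {baseline_macro_f1:.1%}")
```

The gap between the two numbers is why accuracy alone is a poor yardstick here: the baseline looks strong on accuracy while finding zero starred restaurants, and macro F1 exposes that.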
The live site now uses the Vertex full-universe exact star-tier model. It scores restaurants directly into No Star, 1 Star, 2 Stars, or 3 Stars, then filters the result through the frontier-LLM audience-trust and data-quality review before publication.
The published table contains 6,126 restaurants predicted to earn at least one star. Of those, 5,239 are current non-starred restaurants, making them the primary discovery watchlist.
We tried several model shapes before settling on the exact star-tier model. The binary model was useful for finding starred candidates, but it produced labels like “Starred candidate,” which is not the product we wanted to publish. The six-class Michelin-status model tried to distinguish Selected and Bib Gourmand at the same time as star tiers, which weakened the rare-star signal.
| Model tried | Output | Validation read | Decision |
|---|---|---|---|
| HistGradientBoosting exact star-tier model (published model) | No Star, 1 Star, 2 Stars, 3 Stars | 89.9% accuracy, 42.5% macro F1, 92.3% weighted F1 | Used for the live site because it predicts the exact public classes we need and had the strongest grouped validation among tiered models. |
| Calibrated binary XGBoost | Guide candidate / Starred candidate | 68.4% starred precision, 58.1% recall, 70.5% average precision | Kept as a ranking diagnostic, but not used for the public table because it collapsed the outcome into candidate labels instead of tiers. |
| Six-class city-universe XGBoost | Not in Guide, Selected, Bib, 1, 2, 3 Stars | 73.8% accuracy, 32.6% macro F1, 0.794 log loss | Rejected for the site because it mixed Selected and Bib Gourmand with star tiers and was weaker on grouped out-of-fold validation. |
| Baseline six-class XGBoost holdout | Not in Guide, Selected, Bib, 1, 2, 3 Stars | 80.7% accuracy, 34.1% macro F1 | Useful as an early baseline, but the split was easier than the city-grouped validation and did not solve the final star-tier publishing problem. |
The selected HistGradientBoosting model is not perfect on rare two-star and three-star classes, but it was the best fit for the live product: stronger grouped validation than the six-class tiered model, exact public labels, and no forbidden binary candidate labels.
Validation uses city-grouped out-of-fold splits so the model is tested on held-out geographies instead of memorizing a city’s restaurant mix. The selected model reaches 89.9% exact-tier accuracy and 42.5% macro F1 across the four published classes.
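A city-grouped split can be sketched in a few lines of plain Python. This is an illustration of the idea only: the function name, the greedy balancing rule, and the toy city list are assumptions, not the pipeline behind the numbers above.

```python
from collections import defaultdict


def city_grouped_folds(cities, n_folds=5):
    """Yield (train_idx, test_idx) pairs where no city appears on both sides."""
    rows_by_city = defaultdict(list)
    for idx, city in enumerate(cities):
        rows_by_city[city].append(idx)

    # Greedy balancing: place the largest cities first, each into the
    # fold that currently holds the fewest rows.
    folds = [[] for _ in range(n_folds)]
    ordered = sorted(rows_by_city, key=lambda c: (-len(rows_by_city[c]), c))
    for city in ordered:
        smallest = min(range(n_folds), key=lambda f: len(folds[f]))
        folds[smallest].extend(rows_by_city[city])

    all_rows = set(range(len(cities)))
    for test_idx in folds:
        train_idx = sorted(all_rows - set(test_idx))
        yield train_idx, sorted(test_idx)


# Toy example: one city label per restaurant row.
cities = ["Paris", "Paris", "Tokyo", "Tokyo", "Tokyo", "Lyon", "Dubai", "Dubai"]
splits = list(city_grouped_folds(cities, n_folds=3))

for train_idx, test_idx in splits:
    held_out = sorted({cities[i] for i in test_idx})
    print(f"held-out cities: {held_out}, train rows: {len(train_idx)}")
```

Because every row of a city lands in the same fold, each prediction is made by a model that never saw that city during training. scikit-learn's `GroupKFold` provides the same guarantee if that library is already in use.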
| | Predicted No Star | Predicted 1 Star | Predicted 2 Stars | Predicted 3 Stars |
|---|---|---|---|---|
| Actual No Star | 58,094 (correct no-star calls) | 4,406 (called 1 Star) | 731 (called 2 Stars) | 292 (called 3 Stars) |
| Actual 1 Star | 281 (missed as No Star) | 1,864 (correct 1-star calls) | 517 (called 2 Stars) | 147 (called 3 Stars) |
| Actual 2 Stars | 19 (missed as No Star) | 213 (called 1 Star) | 176 (correct 2-star calls) | 73 (called 3 Stars) |
| Actual 3 Stars | 4 (missed as No Star) | 38 (called 1 Star) | 42 (called 2 Stars) | 60 (correct 3-star calls) |
The matrix also shows the central limitation: distinguishing 2-star and 3-star restaurants is still hard because there are so few examples. We publish those rows as ranked signals, not next-guide guarantees.
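The headline metrics can be recomputed directly from the confusion matrix. The sketch below copies the counts from the table and uses the standard definitions of accuracy, per-class F1, macro F1, and support-weighted F1; the variable names are illustrative.

```python
labels = ["No Star", "1 Star", "2 Stars", "3 Stars"]

# Rows are actual classes, columns are predicted classes, both in the
# order given by `labels`. Counts are copied from the table above.
confusion = [
    [58_094, 4_406, 731, 292],
    [281, 1_864, 517, 147],
    [19, 213, 176, 73],
    [4, 38, 42, 60],
]

n = len(labels)
total = sum(sum(row) for row in confusion)
actual_totals = [sum(row) for row in confusion]
predicted_totals = [sum(confusion[r][c] for r in range(n)) for c in range(n)]

# Accuracy: share of rows on the diagonal.
accuracy = sum(confusion[i][i] for i in range(n)) / total

# F1 = 2*TP / (2*TP + FP + FN), which equals 2*TP / (actual + predicted).
f1_scores = [
    2 * confusion[i][i] / (actual_totals[i] + predicted_totals[i])
    for i in range(n)
]

macro_f1 = sum(f1_scores) / n  # every class counts equally
weighted_f1 = sum(f * a for f, a in zip(f1_scores, actual_totals)) / total

for label, f1 in zip(labels, f1_scores):
    print(f"{label:>8} F1: {f1:.1%}")
print(f"accuracy:    {accuracy:.1%}")     # 89.9%
print(f"macro F1:    {macro_f1:.1%}")     # 42.5%
print(f"weighted F1: {weighted_f1:.1%}")  # 92.3%
```

The per-class F1 values make the limitation visible: the No Star class scores far higher than the 2-star and 3-star classes, and macro F1 averages them without letting the large class dominate.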
Before choosing the exact star-tier model, we used the calibrated binary model to test whether the underlying guide and starred signals behaved like probabilities. This chart is from that diagnostic pass.

We used calibration diagnostics to check whether higher scores actually corresponded to higher hit rates, especially before moving from binary star detection into exact star tiers.
The earlier binary guide and starred signals were directionally reliable enough to rank restaurants, but the binary model could not answer the product question: No Star, 1 Star, 2 Stars, or 3 Stars.
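One common way to run that kind of check is a reliability table: bin the scores, then compare the average score in each bin with the observed hit rate. The sketch below is a generic version with made-up toy scores and outcomes; the function name and the five-bin choice are assumptions, not the diagnostic code behind the chart.

```python
def reliability_table(scores, outcomes, n_bins=5):
    """Bin predictions by score and compare mean score with observed hit rate.

    Returns one (bin_index, count, mean_score, hit_rate) tuple per non-empty bin.
    """
    bins = [[] for _ in range(n_bins)]
    for score, hit in zip(scores, outcomes):
        # Clamp so a score of exactly 1.0 falls into the last bin.
        index = min(int(score * n_bins), n_bins - 1)
        bins[index].append((score, hit))

    table = []
    for index, members in enumerate(bins):
        if not members:
            continue  # skip empty bins rather than report a fake rate
        mean_score = sum(s for s, _ in members) / len(members)
        hit_rate = sum(h for _, h in members) / len(members)
        table.append((index, len(members), mean_score, hit_rate))
    return table


# Made-up toy data: predicted starred scores and whether the row was starred.
scores = [0.05, 0.10, 0.15, 0.25, 0.30, 0.35, 0.45, 0.55, 0.85, 0.90, 0.95]
outcomes = [0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 1]

for index, count, mean_score, hit_rate in reliability_table(scores, outcomes):
    print(f"bin {index}: n={count}, mean score={mean_score:.2f}, hit rate={hit_rate:.2f}")
```

If the scores behave like probabilities, the hit rate should rise with the mean score and stay close to it. A score that only ranks well will show rising hit rates that sit consistently above or below the mean score.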
The final star-tier model keeps the ranked signal, while the QA pass removes rows whose place identity, city, cuisine, or audience-trust profile would make publication risky.
The model output is not published raw. We remove rows flagged by the LLM due-diligence pass for audience-trust risk, bad city values, country mismatches, non-restaurant place types, or cuisine fields that are not actually cuisines.
The current live table keeps 74,081 of 77,356 scored rows. It excludes 715 audience-trust holds and 2,560 data-quality removals before the dataset is bundled into the production site.
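The row counts reconcile exactly, and the filter itself can be pictured as a simple keep-or-drop rule. In the sketch below the counts come from the text above, while the field names, flag strings, and toy rows are hypothetical and only illustrate the shape of the check.

```python
# Counts quoted above.
scored_rows = 77_356
audience_trust_holds = 715
data_quality_removals = 2_560

published_rows = scored_rows - audience_trust_holds - data_quality_removals
print(published_rows)  # 74081


def passes_review(row):
    """Keep a row only if the review raised no hold and no data-quality flag."""
    return not row["audience_trust_hold"] and not row["data_quality_flags"]


# Hypothetical rows; field names and flag strings are assumptions.
toy_rows = [
    {"name": "row A", "audience_trust_hold": False, "data_quality_flags": []},
    {"name": "row B", "audience_trust_hold": True, "data_quality_flags": []},
    {"name": "row C", "audience_trust_hold": False,
     "data_quality_flags": ["country_mismatch"]},
    {"name": "row D", "audience_trust_hold": False,
     "data_quality_flags": ["non_restaurant_place_type", "invalid_cuisine"]},
]

publishable = [row["name"] for row in toy_rows if passes_review(row)]
print(publishable)  # ['row A']
```

Storing the flags as a list rather than a single boolean keeps the reason for each removal, which makes the exclusion counts auditable after the fact.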
These are the highest-confidence current non-starred restaurants from the live QA-vetted watchlist. They are not presented as next-cycle guarantees; they are the restaurants the model most strongly believes deserve a closer look.
| Rank | Restaurant | City | Current status | Star score |
|---|---|---|---|---|
| 1 | Vue | Singapore | Selected Restaurants | 91% |
| 2 | Prime Steak & Wine | Budapest | Not in Michelin Guide | 86% |
| 3 | Flagstaff House | Boulder | Not in Michelin Guide | 85% |
| 4 | Cafe Monarch | Scottsdale | No Star | 81% |
| 5 | Roberto's Dubai | Dubai | Not in Michelin Guide | 80% |
| 6 | California Grill | Lake Buena Vista | Not in Michelin Guide | 80% |
| 7 | Terasa U Zlaté studně | Prague | Not in Michelin Guide | 80% |
| 8 | Gallo 71 | San Pedro Garza García | Not in Michelin Guide | 79% |
| 9 | Trèsind Dubai | Dubai | Selected Restaurants | 78% |
| 10 | Cantinetta Antinori Vienna | Vienna | Not in Michelin Guide | 77% |
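A watchlist like this can be derived from the scored table by dropping currently starred rows and sorting by star score. The sketch below uses a few rows from the table plus one clearly hypothetical starred row to show the filter working; the field names and the set of starred statuses are assumptions.

```python
# Statuses treated as "already starred" (an assumption for this sketch).
STARRED = {"1 Star", "2 Stars", "3 Stars"}

rows = [
    {"restaurant": "Flagstaff House", "status": "Not in Michelin Guide", "star_score": 0.85},
    # Hypothetical row, not from the published table: it should be filtered out.
    {"restaurant": "(hypothetical starred row)", "status": "1 Star", "star_score": 0.97},
    {"restaurant": "Vue", "status": "Selected Restaurants", "star_score": 0.91},
    {"restaurant": "Cafe Monarch", "status": "No Star", "star_score": 0.81},
    {"restaurant": "Prime Steak & Wine", "status": "Not in Michelin Guide", "star_score": 0.86},
]


def build_watchlist(rows, top_n=10):
    """Return the top_n highest-scoring rows that do not currently hold a star."""
    candidates = [row for row in rows if row["status"] not in STARRED]
    # sorted() is stable, so rows with equal scores keep their input order.
    ranked = sorted(candidates, key=lambda row: row["star_score"], reverse=True)
    return ranked[:top_n]


watchlist = build_watchlist(rows, top_n=3)
for rank, row in enumerate(watchlist, start=1):
    print(rank, row["restaurant"], f"{row['star_score']:.0%}")
```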
The full prediction table contains the broader ranked universe, including guide candidates, current Michelin restaurants, and lower confidence rows.