Fun with Numbers: How Not to Project Playoff Probabilities
The top story on the Toronto Star's website today was the ominous-sounding "Leafs chances of getting to playoffs? 1.7%." This 1.7% figure comes from sportsclubstats.com, which runs Monte Carlo simulations of the NHL season to estimate how often the Leafs make the playoffs. And according to Dave Feschuk, who authored the column, that makes sense. After all:
"You don't need to think too much to know the Leafs are abysmal. They're 29th in the 30-team league standings. They're 125-1 Las Vegas long shots to win the Stanley Cup, the second-longest odds on the board behind the 150-1 Islanders."
Sounds about right so far? Let's look at the Leafs and the Islanders a little more closely:
| Team | Cup Odds | Playoff Odds | Odds of Winning the Cup if They Make the Playoffs |
| --- | --- | --- | --- |
| New York Islanders | 0.67% (1/150) | 43% | 1.6% |
| Toronto Maple Leafs | 0.8% (1/125) | 1.7% | 47.1% |
The odds of each team making the playoffs come from Sportsclubstats.com, while the odds of winning the Cup come from the Vegas lines that Feschuk mentions.
Think about that for a moment. Professional gamblers think the Leafs have the second-lowest odds to win the Cup. But if we assume the 1.7% figure from Sportsclubstats is correct, then if the Leafs were somehow able to squeak into the last playoff spot this season, they would be the favorite to win the Stanley Cup. By a wide margin. They'd have a 90% chance of beating Pittsburgh in a 7-game series. So they're both the worst team in the league...and by far the best.
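For anyone who wants to check the last column of the table, here is a minimal sketch of the arithmetic. It assumes the Vegas line can be read directly as a probability (125-1 treated as 1/125, as in the table) and that a team can only win the Cup by first making the playoffs, so the conditional odds are simply the Cup odds divided by the playoff odds. The function name is my own.

```python
def cup_odds_if_in_playoffs(p_cup: float, p_playoffs: float) -> float:
    """P(win Cup | make playoffs) = P(win Cup) / P(make playoffs).

    Valid because winning the Cup implies making the playoffs.
    """
    return p_cup / p_playoffs


# (Cup odds from the Vegas line, playoff odds from Sportsclubstats)
teams = {
    "New York Islanders": (1 / 150, 0.43),
    "Toronto Maple Leafs": (1 / 125, 0.017),
}

for name, (p_cup, p_playoffs) in teams.items():
    implied = cup_odds_if_in_playoffs(p_cup, p_playoffs)
    print(f"{name}: {implied:.1%}")
```

The two inputs come from different sources, which is the whole point: combined, they imply something neither source would claim on its own.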
What Sportsclubstats and Feschuk are both missing in their analysis is a little thing that I spend way too much time yammering on about called "regression to the mean." For example, when the Leafs had a seven-game winless streak to start the season, I found that past teams that started out similarly had made the playoffs 29% of the time. But Sportsclubstats had their playoff odds at just 8.2% that day. However bad these seven-game losers were, Sportsclubstats thinks they're way, way worse.
Sportsclubstats may use a sophisticated algorithm to look at strength-of-schedule, but when it determines a team's expected winning percentage, it does nothing more than look at goals for and against. But as Jonathan Willis pointed out quite succinctly, early season extremes never stay extreme. If you want to estimate a team's chance of making the playoffs this early in the season, you have to take its current record and regress very heavily to the mean. And, in the case of Toronto, who have outshot their opponents and have been without their presumptive #1 goaltender for much of the season, you should probably expect them to play over .500 hockey the rest of the way.
But it's nicer to think that the Leafs have just a 1-in-60 chance of making the playoffs, isn't it? Because then you "don't need to think too much" and you can just beat up on Brian Burke for being such a crappy GM.
Oh, and if anybody out there wants to give me the Leafs at 60-1 to make the playoffs, I'll take it.
Comments
Biggest flaw in the story
The biggest flaw in Feschuk’s story was that he took the 1.7% chance and said that the Bruins would now have a 98.3% chance of a lottery pick, which is ridiculous because the stat he is citing speaks only to the team’s chances of making the playoffs. It doesn’t attempt to predict where the Leafs would finish in the standings (between 16th and 30th). Toronto would have to be among the worst six teams in the NHL to have their pick be part of the lottery.
Where every team is our home team
by IllegalCurve on Nov 17, 2009 9:41 AM EST reply actions 0 recs
Very true. Feschuk’s work is a rich minefield of fallacies, appropriate for the #1 circulation paper in the country.
by Hawerchuk on Nov 17, 2009 9:56 AM EST up reply actions 0 recs
I still like the 47% chance of winning the cup.
by Hawerchuk on Nov 17, 2009 9:56 AM EST up reply actions 0 recs
I’m surprised the Whalers and Jets weren’t listed with a 45% chance of winning the Cup with those stats. I included this post in the Morning Papers beside the article to put it in proper context.
by IllegalCurve on Nov 17, 2009 10:01 AM EST reply actions 0 recs
Hello from the SportsClubStats dude
Hi Hawerchuk,
I’m Ken the SportsClubStats fellow.
I agree my site has plenty of shortcomings.
I do use a regression towards the mean. (I did not use to).
You can see the final number I use for team strength by hovering over the “pythag” number on the league page. (Too many of those pythag numbers are red, by the way, one of the shortcomings I need to get to the bottom of.)
But I’m not using statistics to look at past years and calculate a proper amount to regress at each point in the season. I don’t know how to do that yet, do you? Right now I just piece together 2 linear sliding scales that regress less and less as the season goes on. So, you’re right, I certainly may not be regressing enough today.
Thanks
by kendroberts on Nov 17, 2009 11:09 AM EST reply actions 0 recs
hi Ken,
Thanks for the comment. I think there are three ways to do this.
1) Determine how good you think a team is using player projections, and regress towards that (the PECOTA model.)
2) Look at past performance of teams with a given record through a given number of games and use the future winning percentage of that group.
3) Determine how much regression to the mean there is after a given number of games for all teams and use that in your calculation.
Number 3 is the simplest to figure out. I’ll put something together and post it and you can see if you like the info.
by Hawerchuk on Nov 17, 2009 11:59 AM EST up reply actions 0 recs
4. Go Bayesian and use priors. I would think a mix of last season and some of the power rankings by the pundits would work.
by Mogen_david on Nov 18, 2009 11:23 AM EST up reply actions 0 recs
Obviously a good approach. Like #1 and #2, I doubt it’s worth his while – his site covers so many sports, this strikes me as a lot of effort for little gain.
On the other hand, approach #3 says you merely regress the Leafs to a .408 WPCT even though they’ve outshot their opponents.
by Hawerchuk on Nov 18, 2009 2:34 PM EST up reply actions 0 recs
Something about this technique bothers me. I’d have to sit down and think about what it is really doing. It feels a tad cockeyed but I think that’s because I’m not 100% sure what you are doing.
It seems more natural to create a confidence interval around the point % and then create a range, but given that he is running a Monte Carlo simulation, that gets cumbersome quickly… I guess you could draw your win percentage from a distribution function at the start of each simulation and that would incorporate your uncertainty in the win% into the simulation. That feels more “natural” to me.
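Roughly what I have in mind, as a sketch only. Everything here is an assumption on my part: the Beta distribution centred on .500, the `prior_strength` pseudo-games, the hypothetical 3-wins-in-18-games record, the 41-win cutoff, and counting plain wins instead of NHL points. It is not how the site actually works.

```python
import random


def simulate_final_wins(wins, games_played, total_games=82,
                        prior_strength=20.0, n_sims=5_000, seed=0):
    """Simulate season win totals, drawing a fresh win% for each simulated season."""
    rng = random.Random(seed)
    # Beta distribution for the team's true win%: observed record plus
    # `prior_strength` pseudo-games of .500 hockey (hypothetical prior).
    alpha = wins + prior_strength / 2
    beta = (games_played - wins) + prior_strength / 2
    remaining = total_games - games_played

    totals = []
    for _ in range(n_sims):
        p = rng.betavariate(alpha, beta)      # uncertainty in true win%
        future = sum(rng.random() < p for _ in range(remaining))
        totals.append(wins + future)
    return totals


# Hypothetical inputs: 3 wins in 18 games, 41 wins assumed to be "enough".
totals = simulate_final_wins(wins=3, games_played=18)
share = sum(t >= 41 for t in totals) / len(totals)
print(f"Share of simulated seasons with 41+ wins: {share:.1%}")
```

Because the win% changes from one simulated season to the next, the spread of outcomes is wider than with a single fixed estimate, which is the uncertainty I was trying to describe.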
Adjusting by average regression to the mean feels contrived. It might not be, and I’m just not used to thinking about the problem in this way. I’m not sure if the average error of estimation tells you something. It also seems dependent on the win%, given the distribution we are working with, and I’m not sure that has been accounted for.
by Mogen_david on Nov 18, 2009 5:26 PM EST up reply actions 0 recs
Spitballing here.
This is me going off memory on the Appendix to “The Book.”
Presumably there are game logs of past seasons, right? Take as many of those as you can get, and split them up into even and odds (I find that day of game works well for this sort of thing). Then figure the standard error (root mean square error, really) like so:
sqrt(avg((w_pct1-w_pct2)^2))
Where w_pct1 is the “odd” win percentage and w_pct2 is the “even” win percentage.
Then figure out expected standard error due to random variation:
sqrt(.25/82)
Subtract that from the standard error you calculated earlier to get the “true” standard error. Call that x. Then, for each team you want to regress to the mean, call the number of games so far in the season g and win percentage w, and:
(w/(.25/g) + .5/(x^2))/(1/(.25/g) + 1/(x^2))
That should do the trick.
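In code, a rough sketch of the last two steps might look like the following. Two caveats: it subtracts variances (squared standard errors) and then takes the square root, which is my assumption about what the subtraction step should be, and the 0.10 observed spread and the 18-game, .250 record are made-up placeholders, not measured values.

```python
import math


def true_talent_sd(observed_sd: float, n_games: int = 82) -> float:
    """Spread in win% left over once random variation is removed."""
    random_var = 0.25 / n_games               # variance of a .500 team's win%
    true_var = observed_sd ** 2 - random_var  # assumption: subtract variances
    return math.sqrt(max(true_var, 0.0))      # clamp so it never goes negative


def regress_to_mean(w: float, g: int, x: float, mean: float = 0.5) -> float:
    """Weight the observed win% and the league mean by their precisions."""
    sample_weight = 1.0 / (0.25 / g)   # grows as more games are played
    talent_weight = 1.0 / (x ** 2)     # fixed by the spread in true talent
    return (w * sample_weight + mean * talent_weight) / (sample_weight + talent_weight)


x = true_talent_sd(observed_sd=0.10)          # hypothetical observed spread
print(regress_to_mean(w=0.250, g=18, x=x))    # hypothetical team record
```

Note that `x` has to be greater than zero for the second function to work, and that early in the season the league mean dominates because `sample_weight` is still small.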
by cwyers on Nov 17, 2009 12:05 PM EST reply actions 0 recs
In fairness to Sports Club stats
They pretty much understand that the weighted stats don’t make any sense this early in the season.
I always got the impression that they, correctly, use 50/50 in the early season and the weighted should only be used late in the season when a lot of the stats have normalized themselves.
At least that’s how I look at it.
The New Improved Avalanche. Now with Real Coaches!
Jibblescribbits: C'mon over and waste some time
by Jibblescribbits on Nov 17, 2009 1:52 PM EST reply actions 0 recs
Looking at the problem backwards.
You’ve already done this a bit with your historical analysis, but what if you turned the problem on its head? What is the minimum points per game/win% for post-lockout teams? Take that value and ask what the odds are that such a team would have a record as bad as the Leafs (or the ’Canes or the Isles).
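Something like this binomial tail calculation is what I mean. The numbers plugged in at the bottom (3 wins in 18 games, a .350 floor) are placeholders I made up for illustration, not the real post-lockout minimum or anyone's actual record, and it treats every game as an independent win-or-loss, ignoring overtime points.

```python
from math import comb


def prob_record_this_bad(wins: int, games: int, true_win_pct: float) -> float:
    """P(at most `wins` wins in `games` games) for a team with the given true win%."""
    return sum(
        comb(games, k) * true_win_pct ** k * (1 - true_win_pct) ** (games - k)
        for k in range(wins + 1)
    )


# Hypothetical: how often would a true .350 team win 3 or fewer of 18?
print(prob_record_this_bad(wins=3, games=18, true_win_pct=0.35))
```

If even the worst plausible team rarely starts this badly, that suggests a good chunk of the record is bad luck.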
by Mogen_david on Nov 18, 2009 11:29 AM EST reply actions 0 recs
No need to do the math for the Hurricanes, 0.00000000000000000000000000000000000000000001% chance of the Hurricanes making the playoffs.
by Caniacinthestand on Nov 29, 2009 9:34 PM EST reply actions 0 recs
