Ishdul
Registered User
- Jan 20, 2007
- 4,012
- 187
1) They enjoy math.
2) They want to ruin other people's enjoyment of hockey.
Neither is far fetched.
Actually the 2nd is really insanely unimaginably far-fetched.
Maggie the monkey actually has a better than 50% success rate in predicting playoff winners.
I do have to admit that I frequently "hate hockey" for several hours following Avalanche games this year.
I find that I frequently hate it for a couple hours *during* Leafs games.
And so does a simple comparison of corsi for percentages.
As was demonstrated earlier (and has been done many other times), corsi has a better record at predicting playoff wins than wins, points, seed and goal differential.
If Maggie the Monkey is better than all those things (including corsi) I'd love to see a link to her work, she seems like an exceptionally talented primate.
My whole point is that a primate spinning a wheel has as good a chance predicting the outcome of a playoff series.
As was demonstrated earlier (and has been done many other times), corsi has a better record at predicting playoff wins than wins, points, seed and goal differential.
And as has also been demonstrated earlier, these supposed predictions fall apart under close scrutiny, and the claim of a better record falls apart along with it.
I must have missed the post where you provided a statistical sample (of any non-trivial size) that contradicted the data Epsilon posted. Perhaps you can help me out.
A typical response from someone too focused on the math and not enough on the methodology. The claims fall apart because the predictions cannot be satisfactorily tied to the teams on the ice either beforehand or afterwards. As an example, your sample includes predictions based on rosters that were not the ones actually present in the playoffs.
This is akin to claiming credit for a team winning because your dog pooped in the front yard instead of the back.
Garbage in, garbage out. Start using acceptable data and we will start debating the quality of your conclusions.
Do you have criteria for "satisfactorily"?
How often does a team substantively change in personnel or system between the regular season and the playoffs?
However, for virtually all playoff teams, the roster is made up of substantially the same players as the team that skated in the regular season. Yes, in some cases there are significant acquisitions, or dramatic coaching changes, or large injury issues, but I'm failing to see why that wouldn't be included in the "other analysis" that, again, has been said to be necessary many times in this thread.
You've arbitrarily decided the data is garbage and thus refused to accept any conclusions it draws. Perhaps the best option is for you to rigorously define where your goalposts are and then we can determine if it's possible to score.
If that was your point, then your point has already been proven false in this thread.
Sure. Let's start with data that is actually tied to the roster that is actually in the playoffs. Is that too much to ask?
Quite often, between trades (eg, Kings) and injuries (eg, Blues).
Then we might as well just use that "other analysis", because "this analysis" is full of bad data.
No different than you. Of course, because you don't agree with it, somehow the standards change...
You hang your hat on conclusions that have been DEMONSTRATED to be based on bad data.
No, it hasn't. You claim corsi can predict accurately and yet it merely reflects what has taken place in the past.
The bottom line is good stats don't make a good team. Good teams make good stats.
Over the last 5 years and the 80 teams which have made the playoffs, how many of those do you believe have changed so much between the regular season and the playoffs that their regular season performance is completely divorced from their postseason performance? I'd be interested to see the actual list.
Again, your definition of bad data is extremely suspect, seeming to be based around the premise that outliers invalidate data.
I have demonstrated evidence against it. That's all that is required.
You've demonstrated anecdotal outliers. If you feel that is all that's required to invalidate a much broader statistical trend, then I suppose there's really not much for us to discuss further.
My whole point is that corsi is a tool to track past performance. Predicting is just that: predicting, an educated guess.
I'm not trying to be disrespectful, but if you're going to talk about statistics and how to use them, you should understand more about statistics, because you're skipping past what they teach you in stats 101 -- Reliability vs Validity.
Validity is a measure of the extent that something correlates with the real world.
Reliability is a measure of the consistency of that result.
Shooting percentages are an example of great validity; the correlation (R²) between "5v5 GF%" (goal differential represented by a ratio) and "PDO" (on-ice sv% + on-ice sh%) is 0.6802 -- meaning, this year, about 68% of the variation in a player's goal differential is explained by PDO. This is the same reason that +/- is atrocious and useless: it is highly affected by percentages.
However, validity is not in and of itself useful.
Let's say you weigh 180lbs. You weigh yourself on a scale 5 times and you get:
140
210
100
80
370
Those average to 180lbs. Your scale is valid! It's also not reliable -- looking at past measurements doesn't predict future measurements. You don't know whether the next reading on that scale will be 100lbs under or 200lbs over.
If you had measured yourself on the scale and gotten "150, 151, 150, 149, 148", well, that's a pretty reliable scale! It's also not valid. It's not measuring what it says it will measure.
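For anyone who wants to poke at the bathroom scale example, here's a rough Python sketch. It treats "valid" as "the average lands on the true weight" and "reliable" as "the readings don't bounce around", which is a simplification I'm assuming for illustration. The names (`describe`, `noisy_scale`, `steady_scale`) are made up; only the readings come from the example.

```python
from statistics import mean, pstdev

TRUE_WEIGHT = 180  # the actual weight in the example, in lbs

# Readings copied from the two scales in the example.
noisy_scale = [140, 210, 100, 80, 370]
steady_scale = [150, 151, 150, 149, 148]


def describe(readings, truth):
    """Return (bias, spread) for a set of readings.

    bias   = average reading minus the true value (a stand-in for validity)
    spread = population standard deviation (a stand-in for reliability)
    """
    bias = mean(readings) - truth
    spread = pstdev(readings)
    return bias, spread


noisy_bias, noisy_spread = describe(noisy_scale, TRUE_WEIGHT)
steady_bias, steady_spread = describe(steady_scale, TRUE_WEIGHT)

print(f"noisy scale:  bias {noisy_bias:+.1f} lbs, spread {noisy_spread:.1f} lbs")
print(f"steady scale: bias {steady_bias:+.1f} lbs, spread {steady_spread:.1f} lbs")
```

The first scale has no bias but a huge spread; the second has a tiny spread but sits about 30lbs low every time.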
Shooting percentages are the faulty bathroom scale -- they are not reliable.
Corsi isn't super accurate, but it's definitely pretty reliable, and it's KINDA valid. It's not going to spit out 180 180 180 180 180 at you, but it gets a lot closer than any other scale we have.
Here are the shooting percentages and Corsi for each team in the NHL, splitting the partial NHL season into 2 halves (my cutoff date was December 6th; most teams have played 26-30 games before and after then).
Team|Oct-Dec6 Corsi|Dec7-Feb13 Corsi|Oct-Dec6 Sh%|Dec7-Feb13 Sh%|Change in Corsi (as a %)|Change in Sh% (as a %)
ANA|50.6%|51.2%|8.2%|8.2%|101.1%|100.2%
ARI|48.3%|48.8%|7.0%|5.6%|101.0%|79.8%
BOS|52.4%|51.6%|7.9%|7.0%|98.5%|88.9%
BUF|37.1%|37.0%|7.4%|7.1%|99.7%|96.6%
CAR|51.6%|50.4%|6.5%|6.1%|97.7%|93.2%
CBJ|46.1%|46.1%|6.7%|8.1%|99.9%|120.3%
CGY|44.0%|45.1%|9.5%|8.6%|102.4%|89.9%
CHI|55.1%|53.0%|7.5%|7.3%|96.3%|97.7%
COL|44.8%|42.8%|8.6%|7.5%|95.5%|86.8%
DAL|49.4%|52.1%|8.8%|8.8%|105.6%|101.0%
DET|52.8%|55.4%|8.4%|6.2%|105.0%|74.1%
EDM|50.5%|47.6%|6.7%|7.3%|94.3%|108.7%
FLA|51.5%|51.6%|6.6%|8.1%|100.1%|123.6%
L.A|51.9%|57.9%|7.6%|7.5%|111.4%|98.2%
MIN|54.2%|49.9%|8.3%|7.2%|92.2%|87.7%
MTL|50.3%|48.3%|8.0%|8.8%|95.9%|110.7%
N.J|51.1%|43.8%|7.4%|8.4%|85.8%|112.8%
NSH|53.5%|51.3%|8.3%|8.5%|95.8%|102.4%
NYI|52.5%|54.8%|8.1%|8.1%|104.2%|100.9%
NYR|49.7%|50.3%|9.3%|8.3%|101.2%|89.8%
OTT|47.5%|51.5%|7.0%|8.8%|108.5%|124.5%
PHI|47.2%|49.7%|7.3%|8.1%|105.4%|110.4%
PIT|50.7%|53.3%|8.8%|7.7%|105.1%|88.2%
S.J|52.2%|50.5%|7.3%|8.0%|96.8%|110.6%
STL|50.9%|52.1%|7.2%|9.8%|102.4%|135.1%
T.B|53.9%|54.3%|9.7%|9.1%|100.7%|93.7%
TOR|47.0%|45.5%|9.7%|6.8%|97.0%|70.4%
VAN|51.2%|48.5%|7.8%|7.2%|94.8%|92.3%
WPG|51.5%|53.5%|6.5%|8.6%|104.0%|132.9%
WSH|51.2%|51.7%|7.5%|9.4%|100.9%|124.6%
That's a lot of numbers... What's key to note is:
- Teams that had a low Corsi to start the year, tended to keep a low Corsi in the next few months. There weren't many teams with wild changes.
- Teams that shot very well sometimes continued to shoot well (DAL), and sometimes tanked (TOR). Teams that shot poorly sometimes continued to shoot poorly (CAR) and sometimes they started scoring in bunches (FLA, WPG).
- 21 of 30 teams saw their Corsi stay stable -- between 95% and 105% of their previous result.
- Only 7 of 30 teams saw their Sh% stay stable in the same way.
- Only 2 teams saw their Corsi change by more than 10% of their previous result -- NJ fell (85.8%) and LA rose (111.4%). Every other team was between 90% and 110% of their previous performance.
- 19 teams saw their sh% change by more than 10% of their previous result.
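If you want to check those counts yourself, here's a quick Python sketch that recounts them from the two "Change" columns of the table. The numbers are typed in from the table as posted (already rounded), the 95%-105% band is treated as inclusive, and the helper names (`is_stable`, `moved_a_lot`) are just mine for illustration.

```python
# team: (change in Corsi as a %, change in Sh% as a %), copied from the table
CHANGES = {
    "ANA": (101.1, 100.2), "ARI": (101.0, 79.8),  "BOS": (98.5, 88.9),
    "BUF": (99.7, 96.6),   "CAR": (97.7, 93.2),   "CBJ": (99.9, 120.3),
    "CGY": (102.4, 89.9),  "CHI": (96.3, 97.7),   "COL": (95.5, 86.8),
    "DAL": (105.6, 101.0), "DET": (105.0, 74.1),  "EDM": (94.3, 108.7),
    "FLA": (100.1, 123.6), "L.A": (111.4, 98.2),  "MIN": (92.2, 87.7),
    "MTL": (95.9, 110.7),  "N.J": (85.8, 112.8),  "NSH": (95.8, 102.4),
    "NYI": (104.2, 100.9), "NYR": (101.2, 89.8),  "OTT": (108.5, 124.5),
    "PHI": (105.4, 110.4), "PIT": (105.1, 88.2),  "S.J": (96.8, 110.6),
    "STL": (102.4, 135.1), "T.B": (100.7, 93.7),  "TOR": (97.0, 70.4),
    "VAN": (94.8, 92.3),   "WPG": (104.0, 132.9), "WSH": (100.9, 124.6),
}


def is_stable(ratio, band=5.0):
    """True if the second half is within +/- band % of the first half."""
    return 100.0 - band <= ratio <= 100.0 + band


def moved_a_lot(ratio, threshold=10.0):
    """True if the second half differs from the first by more than threshold %."""
    return abs(ratio - 100.0) > threshold


corsi_stable = sum(is_stable(c) for c, _ in CHANGES.values())
sh_stable = sum(is_stable(s) for _, s in CHANGES.values())
corsi_big = sorted(t for t, (c, _) in CHANGES.items() if moved_a_lot(c))
sh_big = sum(moved_a_lot(s) for _, s in CHANGES.values())

print(f"Corsi stable: {corsi_stable} of {len(CHANGES)}")
print(f"Sh% stable:   {sh_stable} of {len(CHANGES)}")
print(f"Corsi moved more than 10%: {corsi_big}")
print(f"Sh% moved more than 10%:   {sh_big} teams")
```

Changing `band` or `threshold` is an easy way to see that the gap between the two stats doesn't depend on where exactly the cutoffs are drawn.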
2 months into the year, Arizona was shooting 7.0%, and St. Louis was shooting 7.2%.
Over the next 2 months, Arizona's sh% dropped to 5.6%, and St. Louis's rocketed up to 9.8%.
This isn't an actual study, and there are far better methods I could have used, but this is just to demonstrate the difference between reliability and validity, predictive vs descriptive.
Shooting percentages are valid and descriptive. They correlate strongly with goal differential and winning. They are not reliable; they do not give you strong predictive value. A team might shoot 4% on Mondays and 12% on Thursdays, or 6% in October and 12% in November. Past knowledge does not help with future results.
Corsi is reliable and predictable. It repeats. Teams that are good at The Corsi on Wednesdays are also good on Fridays. Teams that are good at Corsi'ing in February are good at doing Corsi puzzles in March. It doesn't correlate as strongly with winning, but it has FAR more predictive value.
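One common way to put a number on "it repeats" is to correlate each team's first-half value with its second-half value. Here's a rough Python sketch of that using the table above. This is my own add-on rather than part of the original comparison, `pearson` is a hand-rolled helper, and with only 30 teams and rounded inputs it's a back-of-the-envelope check at best.

```python
import math

# team: (Corsi Oct-Dec6, Corsi Dec7-Feb13, Sh% Oct-Dec6, Sh% Dec7-Feb13)
HALVES = {
    "ANA": (50.6, 51.2, 8.2, 8.2), "ARI": (48.3, 48.8, 7.0, 5.6),
    "BOS": (52.4, 51.6, 7.9, 7.0), "BUF": (37.1, 37.0, 7.4, 7.1),
    "CAR": (51.6, 50.4, 6.5, 6.1), "CBJ": (46.1, 46.1, 6.7, 8.1),
    "CGY": (44.0, 45.1, 9.5, 8.6), "CHI": (55.1, 53.0, 7.5, 7.3),
    "COL": (44.8, 42.8, 8.6, 7.5), "DAL": (49.4, 52.1, 8.8, 8.8),
    "DET": (52.8, 55.4, 8.4, 6.2), "EDM": (50.5, 47.6, 6.7, 7.3),
    "FLA": (51.5, 51.6, 6.6, 8.1), "L.A": (51.9, 57.9, 7.6, 7.5),
    "MIN": (54.2, 49.9, 8.3, 7.2), "MTL": (50.3, 48.3, 8.0, 8.8),
    "N.J": (51.1, 43.8, 7.4, 8.4), "NSH": (53.5, 51.3, 8.3, 8.5),
    "NYI": (52.5, 54.8, 8.1, 8.1), "NYR": (49.7, 50.3, 9.3, 8.3),
    "OTT": (47.5, 51.5, 7.0, 8.8), "PHI": (47.2, 49.7, 7.3, 8.1),
    "PIT": (50.7, 53.3, 8.8, 7.7), "S.J": (52.2, 50.5, 7.3, 8.0),
    "STL": (50.9, 52.1, 7.2, 9.8), "T.B": (53.9, 54.3, 9.7, 9.1),
    "TOR": (47.0, 45.5, 9.7, 6.8), "VAN": (51.2, 48.5, 7.8, 7.2),
    "WPG": (51.5, 53.5, 6.5, 8.6), "WSH": (51.2, 51.7, 7.5, 9.4),
}


def pearson(xs, ys):
    """Plain Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


rows = list(HALVES.values())
r_corsi = pearson([r[0] for r in rows], [r[1] for r in rows])
r_sh = pearson([r[2] for r in rows], [r[3] for r in rows])

print(f"first half vs second half, Corsi: r = {r_corsi:.2f}")
print(f"first half vs second half, Sh%:   r = {r_sh:.2f}")
```

On these numbers the Corsi correlation comes out far higher than the Sh% one, which is the same story the stability counts tell.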
The point isn't that "Corsi predicts winning", the point is that "Corsi is the best/most reliable predictor of winning we have" (it's not, there's a whole host of other predicting stats we have, like score-adjusted fenwick, but I'm working the concept, not specifics here). It guarantees nothing but it is repeatable and has demonstrated a correlation to winning hockey games.
This is longer than I anticipated it to be, but maybe some people will read the whole thing and have a bit more of an understanding of why these things are useful. After all, we're on page 25 and a lot of people still don't understand.
Good post, would read again.
Tbh, I'm not sure what a lot of people find so hard to understand about this. These are pretty easy, straightforward numbers staring you right in the face. Where does all the apprehension and animosity come from? As a Ph.D. math student, I'm always curious to find out why people refuse to accept known facts as known facts.
Well, kind of importantly, the number 1 "known fact" trotted out ("Corsi is the best predictor of wins we have") has been proven false in this very thread. Luddites: 1, Fancy Stats: 0. Maybe it was a revolution for someone to see correlation numbers suggesting that more rubber directed in the other direction will help a team win on some level, but everyone should already know that intuitively. Thing is, we also know intuitively that just firing a higher volume of shots at goaltenders/defenses today doesn't make the difference you might expect if you stop attempting to improve the quality of chances you're getting in the process.
Doesn't help that they also ignore special teams completely (in order to focus on 5-on-5 play in "close" games), when powerplays contribute over 20% (~33 out of 151 goals on average) of scoring around the league.
First of all, special teams are not ignored, they're just in another category. 80% of a hockey game happens at 5v5, and typically even more than that in the playoffs, which is why it's usually discussed the most.
Imbalance in on-ice personnel obviously has a huge impact on the quality AND quantity of chances. Calgary is +52 in PP opportunities, while Boston is -45, so a spread like that has to have an impact on both number and quality of chances, and likely the outcome of more than just a couple of games. (And in fact, we see Calgary currently in a playoff position with a team Corsi <<50%, and Boston outside of a spot with Corsi >50%).
So what you're saying is that some outliers can also be partially explained by outside factors? You don't say.
If you say so.
I've never met one, and I've met a lot of hockey analysts. But maybe we run in different circles.
I'd love to see someone whose sole motivation for hockey analytics is #2 above. Do they get a castle in the woods and a monocle?
ESPN only focuses on hockey when something negative happens. You've certainly seen them do it.
They get to do their job and **** on something they don't like. Why do they need to become a caricature for that to make sense to you?