Is it safe to say Corsi is a garbage stat now? (Warning Post 455)

Bear of Bad News

"The Worst Guy on the Site" - user feedback
Sep 27, 2005
14,227
29,386
I do have to admit that I frequently "hate hockey" for several hours following Avalanche games this year. :rant:
 

hatterson

Registered User
Apr 12, 2010
36,570
14,095
North Tonawanda, NY
Maggie the monkey actually has a better than 50% success rate in predicting playoff winners.

And so does a simple comparison of corsi for percentages.

As was demonstrated earlier (and has been done many other times) corsi has a better record at predicting playoffs wins than wins, points, seed and goal differential.

If Maggie the Monkey is better than all those things (including corsi) I'd love to see a link to her work, she seems like an exceptionally talented primate.

I do have to admit that I frequently "hate hockey" for several hours following Avalanche games this year. :rant:

I find that I frequently hate it for a couple hours *during* Leafs games.
 

Bear of Bad News

"The Worst Guy on the Site" - user feedback
Sep 27, 2005
14,227
29,386
I find that I frequently hate it for a couple hours *during* Leafs games.

That explains it - we're the two mods on the By The Numbers forum, and we frequently hate hockey!

 

onlyalad

The bounce
Jan 13, 2008
7,204
1,061
Corsi is a tool. Like a wrench. It can be useful but if you want to build a go cart you may want to bring your whole toolbox.
 

CrashBartley

Registered User
Nov 19, 2014
602
86
And so does a simple comparison of corsi for percentages.

As was demonstrated earlier (and has been done many other times) corsi has a better record at predicting playoffs wins than wins, points, seed and goal differential.

If Maggie the Monkey is better than all those things (including corsi) I'd love to see a link to her work, she seems like an exceptionally talented primate.

If you google Maggie the monkey, you'll find a lot of information. But here are her picks and method.
My whole point is that a primate spinning a wheel has as good a chance predicting the outcome of a playoff series.
Method

Maggie's method of selecting teams is entirely random. She is presented with a wheel (similar to a carnival big six wheel), divided into sections. Half the sections display one team in a particular playoff matchup, the other half display the opposing team. Prompted by her handler, Maggie then spins it — whichever team is picked is her selection to win the best-of-seven series. James Duthie of TSN claimed in 2003 that one of the primary motives behind having Maggie on the network was because of the unpredictability of hockey.[2]
Record
Playoff year|Correct|Incorrect
2003|8|7
2004|7|8
2006|9|6
2007|8|7
2008|8|7
2009|5|10
Total|45|45
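
For anyone who wants to check the totals rather than take the table on faith, here is a minimal sketch that adds up the record above. The dictionary layout and variable names are just illustrative; the numbers are copied straight from the table.

```python
# Maggie's series picks per playoff year, copied from the record above:
# year -> (correct, incorrect)
record = {
    2003: (8, 7),
    2004: (7, 8),
    2006: (9, 6),
    2007: (8, 7),
    2008: (8, 7),
    2009: (5, 10),
}

# Sum each column, then work out the overall hit rate.
correct = sum(c for c, _ in record.values())
incorrect = sum(i for _, i in record.values())
total = correct + incorrect
hit_rate = correct / total

print(f"{correct} correct, {incorrect} incorrect, hit rate {hit_rate:.1%}")
# -> 45 correct, 45 incorrect, hit rate 50.0%
```

So over these six playoff years the wheel lands at exactly a coin flip, which is about what you would expect from a random method.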


My whole point is that corsi is a tool to track past performance. Predicting is just that predicting, an educated guess.
 

Crazy_Ike

Cookin' with fire.
Mar 29, 2005
9,081
0
As was demonstrated earlier (and has been done many other times) corsi has a better record at predicting playoffs wins than wins, points, seed and goal differential.

And as has also been demonstrated earlier, these supposed predictions fall apart under close scrutiny, and the claim of a better record falls apart along with it.
 

hatterson

Registered User
Apr 12, 2010
36,570
14,095
North Tonawanda, NY
And as has also been demonstrated earlier, these supposed predictions fall apart under close scrutiny, and the claim of a better record falls apart along with it.

I must have missed the post where you provided a statistical sample (of any non-trivial size) that contradicted the data Epsilon posted. Perhaps you can help me out.

And before we go too far down the rabbit hole I might as well just say that I have met no one (at least in this thread, and really I haven't run across anyone at all) that says corsi is the cause of winning, merely that it's correlated with winning. Specifically in the last few pages of this thread, there has been the claim (provided with some data and calculations) that corsi is correlated with winning playoff series. There was also evidence presented that corsi has a better record at predicting playoff series wins than goal differential and team record.
 

Crazy_Ike

Cookin' with fire.
Mar 29, 2005
9,081
0
I must have missed the post where you provided a statistical sample (of any non-trivial size) that contradicted the data Epsilon posted. Perhaps you can help me out.

A typical response from someone too focused on the math and not enough on the methodology. The claims fall apart because the predictions cannot be satisfactorily tied to the teams on the ice either beforehand or afterwards. As an example, your sample includes predictions based on team compositions that were not the ones actually present in the playoffs. This is akin to claiming credit for a team winning because your dog pooped in the front yard instead of the back. Garbage in, garbage out. Start using acceptable data and we will start debating the quality of your conclusions.

Frankly this claim you and others have made is about as related to science as the correlation of pirates to global warming. Not good enough. Not convincing enough. Better is required.

PiratesVsTemp.png
 

hatterson

Registered User
Apr 12, 2010
36,570
14,095
North Tonawanda, NY
A typical response from someone too focused on the math and not enough on the methodology. The claims fall apart because the predictions cannot be satisfactorily tied to the teams on the ice either beforehand or afterwards. As an example, your sample includes predictions based on team compositions that were not the ones actually present in the playoffs.

Do you have criteria for satisfactorily? How often does a team substantively change in personnel or system between the regular season and the playoffs? Should not those changes be included in the "other analysis" that many of us have said is necessary above and beyond a trivial comparison of corsi?

This is akin to claiming credit for a team winning because your dog pooped in the front yard instead of the back.

My dog hasn't suited up for an NHL team as I don't actually own one (neither a dog nor an NHL team). However what has happened is that for virtually all playoffs teams, the roster is made up of significantly the same players as the team that skated in the regular season. Yes, in some cases there are significant acquisitions, or dramatic coaching changes, or large injury issues but I'm failing to see why that wouldn't be included in the "other analysis" that, again, has been said is necessary many times in this thread.

Garbage in, garbage out. Start using acceptable data and we will start debating the quality of your conclusions.

You've arbitrarily decided the data is garbage and thus refused to accept any conclusions it draws. Perhaps the best option is for you to rigorously define where your goalposts are and then we can determine if it's possible to score.
 

Crazy_Ike

Cookin' with fire.
Mar 29, 2005
9,081
0
Do you have criteria for satisfactorily?

Sure. Let's start with data that is actually tied to the roster that is actually in the playoffs. Is that too much to ask?

How often does a team substantively change in personnel or system between the regular season and the playoffs?

Quite often, between trades (eg, Kings) and injuries (eg, Blues).

However what has happened is that for virtually all playoffs teams, the roster is made up of significantly the same players as the team that skated in the regular season. Yes, in some cases there are significant acquisitions, or dramatic coaching changes, or large injury issues but I'm failing to see why that wouldn't be included in the "other analysis" that, again, has been said is necessary many times in this thread.

Then we might as well just use that "other analysis", because "this analysis" is full of bad data.

You've arbitrarily decided the data is garbage and thus refused to accept any conclusions it draws. Perhaps the best option is for you to rigorously define where your goalposts are and then we can determine if it's possible to score.

No different than you. Of course, because you don't agree with it, somehow the standards change...

You hang your hat on conclusions that have been DEMONSTRATED to be based on bad data.
 

CrashBartley

Registered User
Nov 19, 2014
602
86
If that was your point, then your point has already been proven false in this thread.

No, it hasn't. You claim corsi can predict accurately and yet it merely reflects what has taken place in the past.
The bottom line is good stats don't make a good team. Good teams make good stats.
 

hatterson

Registered User
Apr 12, 2010
36,570
14,095
North Tonawanda, NY
Sure. Let's start with data that is actually tied to the roster that is actually in the playoffs. Is that too much to ask?

Quite often, between trades (eg, Kings) and injuries (eg, Blues).

Over the last 5 years and the 80 teams which have made the playoffs, how many of those do you believe have changed so much between the regular season and the playoffs that their regular season performance is completely divorced from their postseason performance? I'd be interested to see the actual list.

Then we might as well just use that "other analysis", because "this analysis" is full of bad data.

Again, your definition of bad data is extremely suspect, seeming to be based around the premise that outliers invalidate data.

No different than you. Of course, because you don't agree with it, somehow the standards change...

You hang your hat on conclusions that have been DEMONSTRATED to be based on bad data.

I haven't arbitrarily decided that any data is good or bad, I've done it based on weighing the evidence. Some of it linked in this thread, a lot more linked over on the By The Numbers forum, even more in various other blogs around the internet. The data has been shown to have non-trivial predictive power across a very large sample size.

And I'm not hanging my hat on any conclusions. I'm saying the data supports the presented conclusions, that's it. I've already noted that there's a non-zero chance those conclusions are wrong and the data is just lucky. I'm not somehow personally vested in the outcome, I don't make decisions in an NHL front office, I don't gamble on games. If the conclusions are true, cool I learned something. If the conclusions aren't, cool I learned something else.


No, it hasn't. You claim corsi can predict accurately and yet it merely reflects what has taken place in the past.

And, as demonstrated, is also a good predictor of what is likely to happen in the future.

The bottom line is good stats don't make a good team. Good teams make good stats.

I'm curious where you've seen anyone arguing otherwise. It certainly wasn't in this thread.
 

Crazy_Ike

Cookin' with fire.
Mar 29, 2005
9,081
0
Over the last 5 years and the 80 teams which have made the playoffs, how many of those do you believe have changed so much between the regular season and the playoffs that their regular season performance is completely divorced from their postseason performance? I'd be interested to see the actual list.

It's not necessary for me to go that far, which in any event would require a great deal more time than I care to devote to convincing people on the internet that they are wrong. It is enough to show that, given the margin of "better than other stats over the last three years" claimed earlier in the thread, three of the four wins were of such small margin the effect of having Gaborik (who wasn't present for most of the data that put the Kings at #1) can easily be claimed to be the reason why they got through any of them. Lacking that, it is enough BY ITSELF to knock Corsi out of its hallowed place you and other adherents put it. That is just ONE example, there are others. The cumulative effect is to reduce the entire Corsi claim to outright noise.

The connection of prediction is gone.

Now you like to say you are only referring to correlation, not causation, but the underlying current of your posts belies this. I have demonstrated your attempt to tie Corsi to playoff round wins dubious at best. Without that, I reject the claims that it is better on the same grounds that I don't take the monkey's predictions very seriously either. There IS correlation, absolutely, because they ultimately derive from the same place. But better? Not really.

Again, your definition of bad data is extremely suspect, seeming to be based around the premise that outliers invalidate data.

I do not require your approval of my definitions of bad data. It is immaterial to me whether or not you accept it. I have demonstrated evidence against it. That's all that is required.
 

Crazy_Ike

Cookin' with fire.
Mar 29, 2005
9,081
0
You've demonstrated anecdotal outliers. If you feel that is all that's required to invalidate a much broader statistical trend, then I suppose there's really not much for us to discuss further.

Even if I were to accept this at face value, one "outlier" was enough to reduce the margin of superiority claimed by Corsi to nothing by itself, with others already mentioned.

But no real need to go into that, because that's not what an outlier is. An outlier is a data point that doesn't seem to be correlating with the others. The examples shown here are outright removal of data points as not representative of what was claimed to be measured.

If you feel bad data is perfectly acceptable to count in correlations (as long as it supports your preferred conclusion, of course), then I suppose there is not really much for us to discuss further.
 

eklunds source

Registered User
Jul 23, 2008
8,323
0
Ed Snider's basement
My whole point is that corsi is a tool to track past performance. Predicting is just that predicting, an educated guess.

I'm not trying to be disrespectful, but if you're going to talk about statistics and how to use them, you should understand more about statistics, because you're skipping past what they teach you in stats 101 -- Reliability vs Validity.

Validity is a measure of the extent that something correlates with the real world.

Reliability is a measure of the consistency of that result.

Shooting percentages are an example of great validity; the coefficient of determination (R²) between "5v5 GF%" (goal differential represented by a ratio) and "PDO" (on-ice sv% + on-ice sh%) is 0.6802 -- meaning, this year, about 68% of the variation in a player's goal differential is explained by PDO. This is the same reason that +/- is atrocious and useless. It is highly affected by percentages.

However, validity is not in and of itself useful.

Let's say you weigh 180lbs. You weigh yourself on a scale 5 times and you get:

140
210
100
80
370

Those average to 180lbs. Your scale is valid! It's also not reliable -- looking at past measurements doesn't predict future measurements. You don't know whether the next reading will be 100lbs under or 200lbs over.

If you had measured yourself on the scale and gotten "150, 151, 150, 149, 148", well, that's a pretty reliable scale! It's also not valid. It's not measuring what it says it will measure.
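
To put numbers on the two scales, here is a rough sketch using Python's standard `statistics` module. "Bias" (average reading minus the true weight) stands in for validity and "spread" (standard deviation of the readings) stands in for reliability; those two labels, and the names `scale_a`, `scale_b` and `describe`, are my own shorthand for illustration, not formal definitions.

```python
from statistics import mean, pstdev

TRUE_WEIGHT = 180  # lbs, from the example above

scale_a = [140, 210, 100, 80, 370]   # averages out right, but all over the place
scale_b = [150, 151, 150, 149, 148]  # tightly grouped, but about 30 lbs off

def describe(readings):
    """Return (bias, spread): average error vs. the true weight, and std dev of the readings."""
    bias = mean(readings) - TRUE_WEIGHT
    spread = pstdev(readings)
    return bias, spread

for name, readings in (("scale A", scale_a), ("scale B", scale_b)):
    bias, spread = describe(readings)
    print(f"{name}: bias {bias:+.1f} lbs, spread {spread:.1f} lbs")
# -> scale A: bias +0.0 lbs, spread 104.9 lbs
# -> scale B: bias -30.4 lbs, spread 1.0 lbs
```

Scale A is right on average but useless for guessing the next reading; scale B is wrong but you know almost exactly what it will say next time.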

Shooting percentages are the faulty bathroom scale -- they are not reliable.
Corsi isn't super accurate, but it's definitely pretty reliable, and it's KINDA valid. It's not going to spit out 180 180 180 180 180 at you, but it gets a lot closer than any other scale we have.

Here's the shooting percentages and Corsi for each team in the NHL, splitting the partial NHL season into 2 halves (my cutoff date was December 6th; most teams have played 26-30 games before and after then)

Team|Oct-Dec6 Corsi|Dec7-Feb13 Corsi|Oct-Dec6 Sh%|Dec7-Feb13 Sh%|Change in Corsi (as a %)|Change in Sh% (as a %)
ANA|50.6%|51.2%|8.2%|8.2%|101.1%|100.2%
ARI|48.3%|48.8%|7.0%|5.6%|101.0%|79.8%
BOS|52.4%|51.6%|7.9%|7.0%|98.5%|88.9%
BUF|37.1%|37.0%|7.4%|7.1%|99.7%|96.6%
CAR|51.6%|50.4%|6.5%|6.1%|97.7%|93.2%
CBJ|46.1%|46.1%|6.7%|8.1%|99.9%|120.3%
CGY|44.0%|45.1%|9.5%|8.6%|102.4%|89.9%
CHI|55.1%|53.0%|7.5%|7.3%|96.3%|97.7%
COL|44.8%|42.8%|8.6%|7.5%|95.5%|86.8%
DAL|49.4%|52.1%|8.8%|8.8%|105.6%|101.0%
DET|52.8%|55.4%|8.4%|6.2%|105.0%|74.1%
EDM|50.5%|47.6%|6.7%|7.3%|94.3%|108.7%
FLA|51.5%|51.6%|6.6%|8.1%|100.1%|123.6%
L.A|51.9%|57.9%|7.6%|7.5%|111.4%|98.2%
MIN|54.2%|49.9%|8.3%|7.2%|92.2%|87.7%
MTL|50.3%|48.3%|8.0%|8.8%|95.9%|110.7%
N.J|51.1%|43.8%|7.4%|8.4%|85.8%|112.8%
NSH|53.5%|51.3%|8.3%|8.5%|95.8%|102.4%
NYI|52.5%|54.8%|8.1%|8.1%|104.2%|100.9%
NYR|49.7%|50.3%|9.3%|8.3%|101.2%|89.8%
OTT|47.5%|51.5%|7.0%|8.8%|108.5%|124.5%
PHI|47.2%|49.7%|7.3%|8.1%|105.4%|110.4%
PIT|50.7%|53.3%|8.8%|7.7%|105.1%|88.2%
S.J|52.2%|50.5%|7.3%|8.0%|96.8%|110.6%
STL|50.9%|52.1%|7.2%|9.8%|102.4%|135.1%
T.B|53.9%|54.3%|9.7%|9.1%|100.7%|93.7%
TOR|47.0%|45.5%|9.7%|6.8%|97.0%|70.4%
VAN|51.2%|48.5%|7.8%|7.2%|94.8%|92.3%
WPG|51.5%|53.5%|6.5%|8.6%|104.0%|132.9%
WSH|51.2%|51.7%|7.5%|9.4%|100.9%|124.6%

That's a lot of numbers... What's key to note is:

- Teams that had a low Corsi to start the year, tended to keep a low Corsi in the next few months. There weren't many teams with wild changes.
- Teams that shot very well sometimes continued to shoot well (DAL), and sometimes tanked (TOR). Teams that shot poorly sometimes continued to shoot poorly (CAR) and sometimes they started scoring in bunches (FLA, WPG).

- 21 of 30 teams saw their Corsi stay stable -- between 95% and 105% of their previous result.
- Only 7 of 30 teams saw their Sh% stay stable in the same way.

- Only 2 teams saw their Corsi change by more than 10% of their previous result -- NJ fell (85.8%) and LA rose (111.4%). Every other team was between 90% and 110% of their previous performance.
- 19 teams saw their sh% change by more than 10% of their previous result.
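
If you want to reproduce those counts, here is a small sketch that tallies them from the last two columns of the table. The `changes` list is copied from the table; the helper `count_within` and the inclusive bounds are my own choices for illustration.

```python
# (team, change in Corsi, change in Sh%): second-half value as a % of first-half value,
# copied from the last two columns of the table above.
changes = [
    ("ANA", 101.1, 100.2), ("ARI", 101.0, 79.8), ("BOS", 98.5, 88.9),
    ("BUF", 99.7, 96.6), ("CAR", 97.7, 93.2), ("CBJ", 99.9, 120.3),
    ("CGY", 102.4, 89.9), ("CHI", 96.3, 97.7), ("COL", 95.5, 86.8),
    ("DAL", 105.6, 101.0), ("DET", 105.0, 74.1), ("EDM", 94.3, 108.7),
    ("FLA", 100.1, 123.6), ("L.A", 111.4, 98.2), ("MIN", 92.2, 87.7),
    ("MTL", 95.9, 110.7), ("N.J", 85.8, 112.8), ("NSH", 95.8, 102.4),
    ("NYI", 104.2, 100.9), ("NYR", 101.2, 89.8), ("OTT", 108.5, 124.5),
    ("PHI", 105.4, 110.4), ("PIT", 105.1, 88.2), ("S.J", 96.8, 110.6),
    ("STL", 102.4, 135.1), ("T.B", 100.7, 93.7), ("TOR", 97.0, 70.4),
    ("VAN", 94.8, 92.3), ("WPG", 104.0, 132.9), ("WSH", 100.9, 124.6),
]

def count_within(values, low, high):
    """Count values inside the inclusive range [low, high]."""
    return sum(1 for v in values if low <= v <= high)

corsi = [c for _, c, _ in changes]
sh = [s for _, _, s in changes]

# "Stable" = within 95%-105% of the first-half number.
corsi_stable = count_within(corsi, 95.0, 105.0)
sh_stable = count_within(sh, 95.0, 105.0)

# "Big change" = moved by more than 10% in either direction.
corsi_big = len(corsi) - count_within(corsi, 90.0, 110.0)
sh_big = len(sh) - count_within(sh, 90.0, 110.0)

print(f"Corsi: {corsi_stable} stable, {corsi_big} changed by more than 10%")
print(f"Sh%:   {sh_stable} stable, {sh_big} changed by more than 10%")
# -> Corsi: 21 stable, 2 changed by more than 10%
# -> Sh%:   7 stable, 19 changed by more than 10%
```

Note that Detroit sits exactly at 105.0% on Corsi, so the 21 depends on treating the bounds as inclusive.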

2 months into the year, Arizona was shooting 7.0%, and St. Louis was shooting 7.2%.
Over the next 2 months, Arizona's sh% dropped to 5.6%, and St. Louis's rocketed up to 9.8%.

This isn't an actual study, and there's far better methods I could have taken, but this is just to demonstrate the difference between reliability and validity, predictive vs descriptive.
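
One of those "better methods" might be a plain split-half correlation: take each team's first-half number and see how well it lines up with its second-half number. The sketch below does that with a hand-rolled Pearson r on the table above. It is only an illustration on 30 rounded data points, not a proper reliability study, and the names `rows` and `pearson_r` are mine.

```python
from math import sqrt

# (team, Corsi Oct-Dec6, Corsi Dec7-Feb13, Sh% Oct-Dec6, Sh% Dec7-Feb13),
# copied from the table above (all values in percent).
rows = [
    ("ANA", 50.6, 51.2, 8.2, 8.2), ("ARI", 48.3, 48.8, 7.0, 5.6),
    ("BOS", 52.4, 51.6, 7.9, 7.0), ("BUF", 37.1, 37.0, 7.4, 7.1),
    ("CAR", 51.6, 50.4, 6.5, 6.1), ("CBJ", 46.1, 46.1, 6.7, 8.1),
    ("CGY", 44.0, 45.1, 9.5, 8.6), ("CHI", 55.1, 53.0, 7.5, 7.3),
    ("COL", 44.8, 42.8, 8.6, 7.5), ("DAL", 49.4, 52.1, 8.8, 8.8),
    ("DET", 52.8, 55.4, 8.4, 6.2), ("EDM", 50.5, 47.6, 6.7, 7.3),
    ("FLA", 51.5, 51.6, 6.6, 8.1), ("L.A", 51.9, 57.9, 7.6, 7.5),
    ("MIN", 54.2, 49.9, 8.3, 7.2), ("MTL", 50.3, 48.3, 8.0, 8.8),
    ("N.J", 51.1, 43.8, 7.4, 8.4), ("NSH", 53.5, 51.3, 8.3, 8.5),
    ("NYI", 52.5, 54.8, 8.1, 8.1), ("NYR", 49.7, 50.3, 9.3, 8.3),
    ("OTT", 47.5, 51.5, 7.0, 8.8), ("PHI", 47.2, 49.7, 7.3, 8.1),
    ("PIT", 50.7, 53.3, 8.8, 7.7), ("S.J", 52.2, 50.5, 7.3, 8.0),
    ("STL", 50.9, 52.1, 7.2, 9.8), ("T.B", 53.9, 54.3, 9.7, 9.1),
    ("TOR", 47.0, 45.5, 9.7, 6.8), ("VAN", 51.2, 48.5, 7.8, 7.2),
    ("WPG", 51.5, 53.5, 6.5, 8.6), ("WSH", 51.2, 51.7, 7.5, 9.4),
]

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# How well does the first half "predict" the second half, for each stat?
corsi_r = pearson_r([r[1] for r in rows], [r[2] for r in rows])
sh_r = pearson_r([r[3] for r in rows], [r[4] for r in rows])

print(f"Corsi first half vs second half: r = {corsi_r:.2f}")
print(f"Sh%   first half vs second half: r = {sh_r:.2f}")
```

On this table the Corsi correlation comes out at roughly 0.8 and the shooting percentage correlation at roughly 0.1. Buffalo's very low Corsi in both halves pulls the first number up somewhat, so treat it as a ballpark.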

Shooting percentages are valid and descriptive. They correlate strongly with goal differential and winning. They are not reliable; they do not give you strong predictive value. A team might shoot 4% on Mondays and 12% on Thursdays, or 6% in October and 12% in November. Past knowledge does not help with future results.

Corsi is reliable and predictable. It repeats. Teams that are good at The Corsi on Wednesdays are also good on Fridays. Teams that are good at Corsi'ing in February are good at doing Corsi puzzles in March. It doesn't correlate as strongly with winning, but it has FAR more predictive value.

The point isn't that "Corsi predicts winning", the point is that "Corsi is the best/most reliable predictor of winning we have" (it's not, there's a whole host of other predicting stats we have, like score-adjusted fenwick, but I'm working the concept, not specifics here). It guarantees nothing but it is repeatable and has demonstrated a correlation to winning hockey games.

This is longer than I anticipated it to be, but maybe some people will read the whole thing and have a bit more of an understanding of why these things are useful. After all, we're on page 25 and a lot of people still don't understand..
 
Jul 29, 2003
31,823
5,598
Saskatoon
Corsi is reliable and predictable. It repeats. Teams that are good at The Corsi on Wednesdays are also good on Fridays. Teams that are good at Corsi'ing in February are good at doing Corsi puzzles in March. It doesn't correlate as strongly with winning, but it has FAR more predictive value.

The point isn't that "Corsi predicts winning", the point is that "Corsi is the best/most reliable predictor of winning we have" (it's not, there's a whole host of other predicting stats we have, like score-adjusted fenwick, but I'm working the concept, not specifics here). It guarantees nothing but it is repeatable and has demonstrated a correlation to winning hockey games.

This is longer than I anticipated it to be, but maybe some people will read the whole thing and have a bit more of an understanding of why these things are useful. After all, we're on page 25 and a lot of people still don't understand..

I thought your entire post was excellent, but this in particular really should be emphasized. It's far from perfect and will always have exceptions, but overall is pretty good. The biggest key with trying to use it as a predictive tool is to understand why there's such a disconnect between a team's record and their possession numbers. I don't think enough people do this.
 

Carolinas Identity*

I'm a bad troll...
Jun 18, 2011
31,250
1,299
Calgary, AB
I'm not trying to be disrespectful, but if you're going to talk about statistics and how to use them, you should understand more about statistics, because you're skipping past what they teach you in stats 101 -- Reliability vs Validity.

Validity is a measure of the extent that something correlates with the real world.

Reliability is a measure of the consistency of that result.

Shooting percentages are an example of great validity; the coefficient of determination (R²) between "5v5 GF%" (goal differential represented by a ratio) and "PDO" (on-ice sv% + on-ice sh%) is 0.6802 -- meaning, this year, about 68% of the variation in a player's goal differential is explained by PDO. This is the same reason that +/- is atrocious and useless. It is highly affected by percentages.

However, validity is not in and of itself useful.

Let's say you weigh 180lbs. You weigh yourself on a scale 5 times and you get:

140
210
100
80
370

Those average to 180lbs. Your scale is valid! It's also not reliable -- looking at past measurements doesn't predict future measurements. You don't know whether the next reading will be 100lbs under or 200lbs over.

If you had measured yourself on the scale and gotten "150, 151, 150, 149, 148", well, that's a pretty reliable scale! It's also not valid. It's not measuring what it says it will measure.

Shooting percentages are the faulty bathroom scale -- they are not reliable.
Corsi isn't super accurate, but it's definitely pretty reliable, and it's KINDA valid. It's not going to spit out 180 180 180 180 180 at you, but it gets a lot closer than any other scale we have.

Here's the shooting percentages and Corsi for each team in the NHL, splitting the partial NHL season into 2 halves (my cutoff date was December 6th; most teams have played 26-30 games before and after then)

Team|Oct-Dec6 Corsi|Dec7-Feb13 Corsi|Oct-Dec6 Sh%|Dec7-Feb13 Sh%|Change in Corsi (as a %)|Change in Sh% (as a %)
ANA|50.6%|51.2%|8.2%|8.2%|101.1%|100.2%
ARI|48.3%|48.8%|7.0%|5.6%|101.0%|79.8%
BOS|52.4%|51.6%|7.9%|7.0%|98.5%|88.9%
BUF|37.1%|37.0%|7.4%|7.1%|99.7%|96.6%
CAR|51.6%|50.4%|6.5%|6.1%|97.7%|93.2%
CBJ|46.1%|46.1%|6.7%|8.1%|99.9%|120.3%
CGY|44.0%|45.1%|9.5%|8.6%|102.4%|89.9%
CHI|55.1%|53.0%|7.5%|7.3%|96.3%|97.7%
COL|44.8%|42.8%|8.6%|7.5%|95.5%|86.8%
DAL|49.4%|52.1%|8.8%|8.8%|105.6%|101.0%
DET|52.8%|55.4%|8.4%|6.2%|105.0%|74.1%
EDM|50.5%|47.6%|6.7%|7.3%|94.3%|108.7%
FLA|51.5%|51.6%|6.6%|8.1%|100.1%|123.6%
L.A|51.9%|57.9%|7.6%|7.5%|111.4%|98.2%
MIN|54.2%|49.9%|8.3%|7.2%|92.2%|87.7%
MTL|50.3%|48.3%|8.0%|8.8%|95.9%|110.7%
N.J|51.1%|43.8%|7.4%|8.4%|85.8%|112.8%
NSH|53.5%|51.3%|8.3%|8.5%|95.8%|102.4%
NYI|52.5%|54.8%|8.1%|8.1%|104.2%|100.9%
NYR|49.7%|50.3%|9.3%|8.3%|101.2%|89.8%
OTT|47.5%|51.5%|7.0%|8.8%|108.5%|124.5%
PHI|47.2%|49.7%|7.3%|8.1%|105.4%|110.4%
PIT|50.7%|53.3%|8.8%|7.7%|105.1%|88.2%
S.J|52.2%|50.5%|7.3%|8.0%|96.8%|110.6%
STL|50.9%|52.1%|7.2%|9.8%|102.4%|135.1%
T.B|53.9%|54.3%|9.7%|9.1%|100.7%|93.7%
TOR|47.0%|45.5%|9.7%|6.8%|97.0%|70.4%
VAN|51.2%|48.5%|7.8%|7.2%|94.8%|92.3%
WPG|51.5%|53.5%|6.5%|8.6%|104.0%|132.9%
WSH|51.2%|51.7%|7.5%|9.4%|100.9%|124.6%

That's a lot of numbers... What's key to note is:

- Teams that had a low Corsi to start the year, tended to keep a low Corsi in the next few months. There weren't many teams with wild changes.
- Teams that shot very well sometimes continued to shoot well (DAL), and sometimes tanked (TOR). Teams that shot poorly sometimes continued to shoot poorly (CAR) and sometimes they started scoring in bunches (FLA, WPG).

- 21 of 30 teams saw their Corsi stay stable -- between 95% and 105% of their previous result.
- Only 7 of 30 teams saw their Sh% stay stable in the same way.

- Only 2 teams saw their Corsi change by more than 10% of their previous result -- NJ fell (85.8%) and LA rose (111.4%). Every other team was between 90% and 110% of their previous performance.
- 19 teams saw their sh% change by more than 10% of their previous result.

2 months into the year, Arizona was shooting 7.0%, and St. Louis was shooting 7.2%.
Over the next 2 months, Arizona's sh% dropped to 5.6%, and St. Louis's rocketed up to 9.8%.

This isn't an actual study, and there's far better methods I could have taken, but this is just to demonstrate the difference between reliability and validity, predictive vs descriptive.

Shooting percentages are valid and descriptive. They correlate strongly with goal differential and winning. They are not reliable; they do not give you strong predictive value. A team might shoot 4% on Mondays and 12% on Thursdays, or 6% in October and 12% in November. Past knowledge does not help with future results.

Corsi is reliable and predictable. It repeats. Teams that are good at The Corsi on Wednesdays are also good on Fridays. Teams that are good at Corsi'ing in February are good at doing Corsi puzzles in March. It doesn't correlate as strongly with winning, but it has FAR more predictive value.

The point isn't that "Corsi predicts winning", the point is that "Corsi is the best/most reliable predictor of winning we have" (it's not, there's a whole host of other predicting stats we have, like score-adjusted fenwick, but I'm working the concept, not specifics here). It guarantees nothing but it is repeatable and has demonstrated a correlation to winning hockey games.

This is longer than I anticipated it to be, but maybe some people will read the whole thing and have a bit more of an understanding of why these things are useful. After all, we're on page 25 and a lot of people still don't understand..

Good post, would read again.

Tbh, I'm not sure what a lot of people find so hard to understand about this. These are pretty easy, straightforward numbers staring you right in the face. Where does all the apprehension and animosity come from? As a Ph.D. math student, I'm always curious to find out why people refuse to accept known facts as known facts??
 

aasiaat

Registered User
Sep 26, 2014
8
0
Vancouver
Good post, would read again.

Tbh, I'm not sure what a lot of people find so hard to understand about this. These are pretty easy, straightforward numbers staring you right in the face. Where does all the apprehension and animosity come from? As a Ph.D. math student, I'm always curious to find out why people refuse to accept known facts as known facts??

I think it's really easy for people to think of examples of players with good Corsi numbers who they disliked, typically because they were "quantity over quality" type shooters or defencemen perceived to be prone to grievous errors. When I first heard about Corsi the first thing that popped into my head was "...I bet Jason Blake was a Corsi superstar."

It's easy for some people to think of a few "Corsi inflaters" off the top of their head and as a result anchor themselves to the belief that it's a flawed way to measure any part of a player's value. It happens at the team level, too - Edmonton and Toronto still struggled to win even after dramatic changes in CF% this season, and many take this to mean that Corsi is not predictive. If you're not interested (or mathematically inclined) all of the analysis on Corsi's predictive power, autocorrelation, etc, is meaningless and you can easily live in a world of confirmation bias where all outliers help reaffirm your beliefs despite inadequate sample size or lack of context.
 

Ohashi_Jouzu*

Registered User
Apr 2, 2007
30,332
11
Halifax
Good post, would read again.

Tbh, I'm not sure what a lot of people find so hard to understand about this. These are pretty easy, straightforward numbers staring you right in the face. Where does all the apprehension and animosity come from? As a Ph.D. math student, I'm always curious to find out why people refuse to accept known facts as known facts??

Well, kind of importantly, the number 1 "known fact" trotted out ("Corsi is the best predictor of wins we have") has been proven false in this very thread. Luddites: 1, Fancy Stats: 0. Maybe it was a revolution for someone to see correlation numbers suggesting that more rubber directed in the other direction will help a team win on some level, but everyone should already know that intuitively. Thing is, we also know intuitively that just firing a higher volume of shots at goaltenders/defenses today doesn't make the difference you might expect if you stop attempting to improve the quality of chances you're getting in the process.

If Corsi was used to evaluate something like coaching (style/system, execution, etc) over time instead of compare players (or even players collectively as teams) I don't think there'd be nearly the resistance, and something interesting AND useful might reveal itself (how did this change in personnel, or this change in systems/strategy, affect the ability to generate/reduce shots for/against, etc). If possession doesn't result in execution, then Corsi numbers are just... numbers; numbers comprised of elements that actually have significantly reduced correlation with winning over the long term (missed/blocked shots).

Doesn't help that they also ignore special teams completely (in order to focus on 5-on-5 play in "close" games), when powerplays contribute over 20% (~33 out of 151 goals on average) of scoring around the league. Imbalance in on-ice personnel obviously has a huge impact on the quality AND quantity of chances. Calgary is +52 in PP opportunities, while Boston is -45, so a spread like that has to have an impact on both number and quality of chances, and likely the outcome of more than just a couple of games. (And in fact, we see Calgary currently in a playoff position with a team Corsi <<50%, and Boston outside of a spot with Corsi >50%).

Basically, not many people have an issue with whether or not Corsi describes what it does as much as the limitations that many of its proponents seem to be content with when Corsi is brought out for comparisons.
 

eklunds source

Registered User
Jul 23, 2008
8,323
0
Ed Snider's basement
Well, kind of importantly, the number 1 "known fact" trotted out ("Corsi is the best predictor of wins we have") has been proven false in this very thread. Luddites: 1, Fancy Stats: 0. Maybe it was a revolution for someone to see correlation numbers suggesting that more rubber directed in the other direction will help a team win on some level, but everyone should already know that intuitively. Thing is, we also know intuitively that just firing a higher volume of shots at goaltenders/defenses today doesn't make the difference you might expect if you stop attempting to improve the quality of chances you're getting in the process.

Again, missing the forest for the trees. You're looking for reasons it fails instead of trying to understand what it doesn't fail at. You can find reasons that EVERY statistical model will fail by altering the input (in this case, changing your play to manipulate the data, rather than trying to exploit what is causing the data).

The point is not: "Throwing a mad amount of pucks on net, even low-percentage shots, will result in winning hockey games". Nobody is saying "100% of the reason Corsi works is because shots on net mean long term success"

The point is: "Teams that do the things that result in a lot of shots on net (without many against) tend to also do the same things that result in winning." You can't score if you don't have the puck, and teams with a great corsi tend to have control of the puck more. They get more opportunities to score, and allow fewer opportunities. Do you remember that ****** bounce or rebound in your team's last game that resulted in a goal against? Those happen less when the puck is in your zone less.

If teams were just madly throwing low-percentage pucks on net to inflate their Corsi, we would see evidence of that in shooting percentages - namely, we would see notably lower and persistent shooting percentages. We don't. At least, nothing outside the realm of what we would expect to see due to nothing but normal variance.

Doesn't help that they also ignore special teams completely (in order to focus on 5-on-5 play in "close" games), when powerplays contribute over 20% (~33 out of 151 goals on average) of scoring around the league.
First of all, special teams are not ignored, they're just in another category. 80% of a hockey game happens at 5v5, and typically even more than that in the playoffs, which is why it's usually discussed the most.

Second, for every minute of 5v5 hockey a team plays, they get about 7 seconds of 5v4 powerplay. With smaller sample sizes comes more variance in shooting percentage, but again, teams that are consistently good at generating shot attempts are the teams that trend toward the better long-term success rate.

Imbalance in on-ice personnel obviously has a huge impact on the quality AND quantity of chances. Calgary is +52 in PP opportunities, while Boston is -45, so a spread like that has to have an impact on both number and quality of chances, and likely the outcome of more than just a couple of games. (And in fact, we see Calgary currently in a playoff position with a team Corsi <<50%, and Boston outside of a spot with Corsi >50%).
So what you're saying is that some outliers can also be partially explained by outside factors? You don't say.
 

DyerMaker66*

Guest
If you say so.

I've never met one, and I've met a lot of hockey analysts. But maybe we run in different circles.

I'd love to see someone whose sole motivation for hockey analytics is #2 above. Do they get a castle in the woods and a monocle?

ESPN only focuses on hockey when something negative happens. You've certainly seen them do it.

They get to do their job and **** on something they don't like. Why do they need to become a caricature for that to make sense to you?
 

Bear of Bad News

"The Worst Guy on the Site" - user feedback
Sep 27, 2005
14,227
29,386
ESPN only focuses on hockey when something negative happens. You've certainly seen them do it.

They get to do their job and **** on something they don't like. Why do they need to become a caricature for that to make sense to you?

You talk like ESPN is some monolithic, walking, talking entity that does things (and in particular, hates hockey).

There are plenty of people at ESPN who *love* hockey.
 
