The Numbers Game: Predicting Divisional Outcomes

PDO Bear knows when you mistake luck for skill

Getting a bead on which teams in the Atlantic Flortheast are pretenders and which are contenders with the help of PDO.

It's a bit early in the season to rake the statistical leaves and dive into the number-pile, but we're inching close enough to the quarter pole to begin to see which among the hot and cold starts are going to last, and which are banking points on lady luck. Given the new playoff format, let's tell the rest of the league to screw off and take a close look at just the Atlantic division's starts out of the gate.

PARDON a DRY ORIENTATION:

If you're fresh to those new-fangled stats, we're going to use the statistic PDO as a sub-in for luck to assess teams' early-season fortune. PDO is simply the sum of team SV% and team SH%. League-wide, every shot on net is either a save or a goal, so if you add it all up it comes out to 100% of all shots. If you had an infinite stretch of games, everybody's PDO would gradually move toward 100. Year-to-year, teams are unable to exert much control over the two component stats, and both have been established to be subject to very high amounts of chance. So it basically tells us that "what goes up, must come down."
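For the hands-on crowd, here's a minimal sketch of the arithmetic, assuming you have raw goal and shot counts to work from. The function name and the counts are made up for illustration; they aren't real team data.

```python
def pdo(goals_for, shots_for, goals_against, shots_against):
    """Return (SH%, SV%, PDO), all on the 0-100 scale."""
    sh_pct = 100.0 * goals_for / shots_for
    # A save is any shot against that did not go in.
    sv_pct = 100.0 * (shots_against - goals_against) / shots_against
    return sh_pct, sv_pct, sh_pct + sv_pct


# Hypothetical counts, not real team data.
sh, sv, total = pdo(goals_for=20, shots_for=250, goals_against=15, shots_against=240)
print(f"SH% {sh:.1f} + SV% {sv:.1f} = PDO {total:.1f}")
```

Since one team's goal is another team's goal against, summing the counts across the whole league and feeding them in gives exactly 100.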

The Bruins, for instance, careened from 1024 to 1005 (that's 102.4 to 100.5 in the percentage format used in the table below) over the past couple seasons with essentially an identical roster. Because of the lack of persistence from season to season, PDO serves as a pretty handy proxy for luck and can tell the astute observer which teams are winning on the merits of reproducible talent and which are getting all the bounces.

If you're curious about the stat and would like further reading and a more detailed description, please take a gander at our PDO roundup from last season or better yet this awesomely thorough explanation from Fear the Fin.

If you want to get to the meat and p(d)otatoes and just see how the teams are faring, feel free to scroll on down to the table below and dig right into it. But let's take a moment to further examine what we're witnessing in these numbers, how they're arrived at and how to project future performance with greater accuracy.

PROBABLY DONE OVERACHIEVING: Regression

Regression comes up in all kinds of fields, from investment to sociology to sports, but what the heck are we talking about? You might get intuitively that we're looking at expected rises and falls from a given outcome, but just why are we expecting these fluctuations to return to the mean?

Basically what we're doing when we talk about regression is using past information about events to predict their future outcome by understanding how the info is interrelated. That relationship tells us how much one piece of data influences another. A common application, and what we'll be looking at right now, is to see how multiple outcomes of a test - like a coin toss, stock prices on a given day, or, I dunno, shots in a hockey game - fare against each other to gauge how reproducible or unsustainable the results are, giving us a glimpse at how much sheer chance is involved in the outcome we're observing. For further reading, BroadStreetHockey has a nice illustration in their article on regressing shooting percentage.

To cite an example of an unsustainable NHL performance that isn't the high-profile collapse of the 11/12 Minnesota Wild, we can look no further than our own squad in the same season. As the narrative goes, after shredding the league for a couple months in the late fall, the White House debacle divided the locker room and created such discord that it bled over into on-ice results. As pointed out by Cam Charron, the 2012 post-All-Star break decline had everything to do with an obscenely high early-season PDO and not actually a lot to do with a goalie's political musings. "Quota," as we know our six-goal norm through November and December 2011, was in fact the product of abnormal shooting percentage highs coupled with a strong territorial advantage in shots (but it sure was fun). In essence, we were fooled by our good fortune in conversion rate into believing that steam-rolling uber dominance was our true level, not the freezing cold start that kicked off the season. The truth was between those two outliers, and the chickens came home to roost.

We might infer from November that our rising, elevated shooting percentage was going to continue its ascent forever just as we might infer a horrid season ahead from October. To have a more accurate prediction for subsequent months, we'd need to account for the amount of luck involved in shooting percentages league-wide and long-term and pull our outcome back toward that league mean by that amount. We don't want to disregard the skill involved altogether by yanking it all the way back - we want to adjust just for the part that isn't clearly the skill of the team. We shouldn't assume that a completely average performance is destined.

If you're tossing coins, you're dealing with 100% chance and will wind up with 50/50 results over a long enough span. If something were 100% skill, you'd get the same outcome every time. Hockey, chaotic system that it is, is a bit of both. It has a good deal of repeatable skill alongside heaping helpings of luck, and we have to separate the two out to understand how replicable performances might be. Dunno if you've noticed, but that skill doesn't always win you the game (grumble grumble Devils grumble grumble Game 6 grumble).
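To see the luck half of that in isolation, here's a rough simulation sketch. It assumes 30 imaginary teams with identical shooting talent, so any gap between them is pure chance. The 8% true rate, the shot counts and the seed are illustrative assumptions, not measured values.

```python
import random


def simulated_sh_pct(true_rate, shots, rng):
    """Shooting percentage over `shots` attempts that each score with probability true_rate."""
    goals = sum(1 for _ in range(shots) if rng.random() < true_rate)
    return 100.0 * goals / shots


def spread(true_rate, shots, teams, rng):
    """Gap between the best and worst simulated team shooting percentage."""
    results = [simulated_sh_pct(true_rate, shots, rng) for _ in range(teams)]
    return max(results) - min(results)


rng = random.Random(2013)  # fixed seed so the run is repeatable
early = spread(true_rate=0.08, shots=250, teams=30, rng=rng)
long_run = spread(true_rate=0.08, shots=25000, teams=30, rng=rng)
print(f"spread after 250 shots each:   {early:.1f} points")
print(f"spread after 25000 shots each: {long_run:.1f} points")
```

With equal talent across the board, the early-season gap between the luckiest and unluckiest team should come out far wider than the long-run gap, which is the whole case for not trusting a dozen games of shooting percentage.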

I'm going to get into a whole math-y spiel on just how we assess what teams are due for regression and how much we can expect a correction, but the great thing is, you don't really need to know any of the following to get the gist, which again is basically: "what goes up, must come down." If you want to view past seasons to get a sense of the norm, that's enough to have a pretty decent guess of the upper and lower bounds of PDO and an understanding of performances that aren't going to hold out in the long run. But if you wanna get technical and predict future results with some accuracy, well, let's roll up our sleeves:

The first thing we need to do is calculate that meddlesome luck. We do this with regression analysis, which to reiterate is just a statistical term for the comparison of variables to establish their relationship.

Let's plot a couple variables; we'll call one Season 1 and the other Season 2. In Season 1, Team A posted a 995 PDO. In Season 2, they got a 1002 due to slightly higher SV%. We do the same for Teams B through Z and wind up with a bunch of dots strewn all over a graph, plotted to their Season 1 and 2 coordinates. What we're trying to do is see how much variance there was between the two seasons throughout the league to see if teams are repeating their performance or getting radically different results every time. (Obviously we want to increase the samples here to get more reliable data, but c'mon, it's an example people!)

If we only had two points of data, we could draw a straight line between them, dust our hands and call it an extremely correlated day - though it won't be a meaningful result with that little info. But if we've got thirty data points (yeah, I know, the alphabet's not long enough) representing each team in the National Hockey League, we have to figure out how to run that line through the maze of dots so it keeps the closest relationship to each of the 30 measurements, the least distance from them. The closer those dots are to your line, the stronger the correlation and the more predictive they are of future performance. If they're not very correlated, they're going to look like a Rorschach and will tell you that outcomes are going to fluctuate wildly year to year whether a team likes it or not.

Let's skip the underlying equations since we're not actually going to hand-calculate that stuff, because it's not the goddamn stone age - there's computers to do it for you. While you're going to have Excel crunch it for you, what the computer's doing is calculating how far all those dots are above and below the line, squaring those distances to get rid of the negative figures, and adding them up to find the lowest total. The line that gives the "least squares," or lowest sum of squared distances, is the "line of best fit." Alongside the slope of that line, the fit gives us the coefficient of determination (r^2), which tells us how well that sucker fits - how close the dots are overall. If it's 1, it's going through every single dot on the plot. If it's closer to 0, then the fit sucks and they're all very far from the line. For our purposes, a good fit means teams are reproducing their results and we're seeing high skill; a bad fit is heavy luck.

To get the correlation, just take the square root of the r^2 result (r^2 is literally the correlation coefficient r, squared), and voila, you have a percentage for skill. The remainder is chance.
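If you'd rather see what Excel is doing under the hood, here's a minimal sketch of a least-squares fit in plain Python. The five Season 1 / Season 2 pairs are made up for illustration (only the 995 and 1002 come from the Team A example above), and the function name is my own, not anything from a stats package.

```python
from math import sqrt


def least_squares(xs, ys):
    """Fit y = slope * x + intercept and return (slope, intercept, r_squared)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    syy = sum((y - mean_y) ** 2 for y in ys)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Sum of squared vertical distances from each dot to the fitted line.
    residual = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    # Share of the Season 2 variation that the line accounts for.
    r_squared = max(0.0, 1.0 - residual / syy)
    return slope, intercept, r_squared


# Made-up Season 1 / Season 2 PDO pairs for five hypothetical teams.
season_1 = [995, 1012, 988, 1005, 1000]
season_2 = [1002, 1001, 997, 996, 1004]

slope, intercept, r_squared = least_squares(season_1, season_2)
# r carries the sign of the slope; r^2 alone can't tell you the direction.
r = sqrt(r_squared) if slope >= 0 else -sqrt(r_squared)
print(f"slope={slope:.3f}  r^2={r_squared:.3f}  r={r:.3f}")
```

Note that the slope and r^2 are separate outputs of the same fit: the slope says how steep the line is, r^2 says how tightly the dots hug it.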

As to PDO, we'll be looking at the past four full seasons, comparing each year to the subsequent season for every team. Data comes from HockeyAnalysis.com. What we're trying to do is see how closely the four batches correlate, which will tell us how persistent the skill involved is and how much luck is at play, and thereby how much we should be regressing current performance toward the mean for future projection.

[Image: scatter plot of each team's PDO in one season against its PDO the following season, with line of best fit]

Now we've got our trend line. The scribblings of equations underneath the title of the above image are, first, the slope of that line and below that the coefficient of determination (r^2), which tells us how well that line fits the data points. That low number tells us "not well at all," which we can pretty easily tell by how far those dots are orbiting the line of best fit. If we root that r-squared, we get a weak-ass correlation coefficient (r) of about .23. Only 23% of PDO, according to this measurement, consists of persistent talent, so when evaluating PDO we need to regress 77% of the way toward the mean to account for luck's lion's share.

No matter how you compare PDO or its component parts season-to-season (2yr buckets for each team gives similar results), you invariably wind up with low correlation because of the prevalence of luck and lack of persistence of skill to overcome it. If you do the same with, say, Fenwick Close, you'll find your correlation coefficient a lot higher, indicating that teams can exert control year-to-year.

Any time the correlation is low, regressing to the mean is going to yield more predictive projections.
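In code, the regression step is a one-liner. This is a minimal sketch assuming a league mean of 100 and the r of about .23 found above; the function name is my own, and 105.4 is Boston's current PDO from the table further down.

```python
def regress_to_mean(observed, r, league_mean=100.0):
    """Keep the share r of the gap from league mean; hand the rest back to luck."""
    return league_mean + r * (observed - league_mean)


# r = 0.23 is the season-to-season correlation quoted above.
projection = regress_to_mean(observed=105.4, r=0.23)
print(round(projection, 1))  # 101.2
```

A team sitting exactly at 100 doesn't move at all, and the further a team strays from the mean, the bigger the expected pull back.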

Below, in addition to current PDO info, I'll provide regressed projections for each team using the above data.

(And keep in mind, you can use this methodology to find interdependence between non-like variables as well as persistence of like-variables. Regression analysis is critical to our understanding of statistics, establishing relationships between stats: such as connections between Corsi and Scoring Chances or Wins.)

Alright, screw the math, on to the hockey:

PARSING DATA OBSERVATIONS

For ease of reading, we'll be expressing save percentage as a two-digit percentage rather than the .000 format you're used to seeing on the stat sheet and PDO in the same format. Everything is at 5v5, save for Fenwick, which I've included for some extra context on skill. That's looking at 5v5 Close, for reasons that will become clear later. All data is from ExtraSkater.com.

Team       GP   FF%   SH%   SV%    PDO
Boston     10  53.9   9.1  96.4  105.4
Toronto    12  41.7   9.2  93.4  102.6
Montreal   12  51.3   6.6  95.7  102.3
Ottawa     11  49.6   8.2  93.2  101.4
Tampa      10  48.1   9.0  90.8   99.5
Detroit    12  50.7   5.6  93.3   98.9
Buffalo    13  38.6   6.1  92.5   98.6
Florida    11  48.4   7.0  90.2   97.2

BOSTON - 105.4 PDO

One of these things is not like the others. One of these things just doesn't belong. Well, it looks like Boston's a primo case study in PDO through the early going, with a figure so inflated that the pending correction is entirely intuitive. What we have here is an even-strength save percentage that sits as far above the single-season all-time record for team ESSV (thanks Timmy) as that record sits above league average. Tuukka's great, but it might be just a bit optimistic to think that he's historically great by an order of magnitude. (And that Colorado and Minnesota are in comparable record-shattering territory all at the same time...)

To the Bruins' credit, their possession game is still very strong in spite of a recent drop from the upper echelon of league rankings. The territorial advantage will help soften the blow of Rask's inevitable fall back into the stratosphere. To further cushion the landing, the current 20th place team in CorsiFor events per game should look to get more rubber toward the net. We're presently winning possession the way New Jersey does: entirely by prevention, which is not following past patterns under Julien.

Meanwhile, though the team shooting percentage looks fine at first glance, the fact that it's being propped up by a single forward line running 5-6% higher than mean is cause for concern. To date, the KIL/LIK/ILK/Whatever line has provided a disproportionate 37% of the offense and is clicking in the vicinity of 13% on-ice. At the other end, Marchand and Chara are particular outlying lows, running around 5% on-ice sh%. Bergeron too is running low for his norm at sub-7%. Chara, it should be noted, has yet to record a single point at even strength.

I bring up the 11/12 season above entirely because we are now carrying the same PDO as during that high-shooting run, but from goaltending this time. You know what to do: brace for re-entry.

Regressed projection: 101.2 PDO (of note: this is precisely the Bruins' six-season average - albeit including last year's shortened season - in case you're wondering about the impact of persistent goalie talent from Rask and Thomas)


TORONTO - 102.6 PDO

The Leafs are getting so comfortable playing with fire, they should really consider joining the circus. Saved from being the worst possession team by a country mile only by the historically abysmal Sabres (more on this in a moment), the Leafs to date have been woefully outshot, aren't getting extraordinarily outlying shooting luck in the opposing end, and are still winning.

Their vaunted shooting percentage from last year hasn't held up, seeing them fall back by a percent and a half into plausibly hot rather than outlandishly so. Even so, expect narratives about coaching strategy enhancing shot quality to continue for a bit longer, as seven players are posting 10% or better through current games played (psst, same thing in Boston... and Phoenix... and St Louis... and Anaheim...). In spite of this, they remain only a collective tick above normality. Instead, this season's elevated PDO sees the team getting top-tier goaltending. As discussed last year, this divisional rival now has at least one legitimate tender and we can expect that half of the equation to remain somewhat elevated for the duration of the year.

On a more granular level, to take in some other context, the Leafs' league-leading 94.0 PK sv% is due for correction. The six-year mean falls around 87.5 when down a man, with only one instance in that span of a team maintaining above 93.0 for a full season.

Regressed projection: 100.6 PDO

MONTREAL - 102.3 PDO

The Habs are being pulled in two directions, with an outlying low in sh% and high in sv%. Anticipate Price correcting hard toward his still-healthy career average. The drop, however, will be offset by an expected rise in shooting and any blow will be cushioned by their strong possession advantage. Further, the Canadiens are leading the division by a healthy margin in CorsiFor/60, which should bolster their offensive numbers over time.

Further to shooting luck, they presently have not one player seeing an on-ice sh% above 8.4, with a couple unlikely suspects like Bourque and Desharnais getting below 3.5% on-ice. Annoyingly the entire bunch is due to pick up - no team between lockouts shot below where the Habs are now, and only New Jersey, Florida and Ottawa managed it in last year's abbreviated campaign. (stupid Ottawa and their stupid goaltending...)

Regressed projection: 100.5 PDO


OTTAWA - 101.4 PDO

After finally seeing Craig Anderson drop off during the playoffs, the Sens have obnoxiously picked right back up where they left off last regular season with 93.2 ESSV goaltending. In good news for rival fans, this is already falling from just a few games ago. Anderson has proven he's an above league average goalie, so don't expect them to collapse back to the middle of the pack, but he's had no long stint at any point in his career more than .4 above league average.

Outside of Cory Conacher, there's really absolutely nothing amiss with on-ice sh%, which would occupy the middle of the pack under normal season circumstances. Condra, Greening and Karlsson are a tick low, but not really enough to move the needle altogether. Collectively, they sit pretty much right on the mean.

Riding a high PDO and still posting a positively Hartford-esque record has to be aggravating for Senators fans, but this team appears better than the standings show. They'll correct by around 1% but haven't far to fall in the rankings. So long as they can start inching their CorsiAgainst out of the bottom ten, they'll start to turn things around. Lame.

Regressed projection: 100.3 PDO

TAMPA - 99.5 PDO

Control case for league-average, Tampa sits just a hair under 100 with slightly above league average shooting and below league average goaltending. Sounds about right given their makeup. Though both Stamkos and St. Louis have exhibited some ability to maintain high on-ice shooting together by a couple percent season-to-season, I've no problem red-flagging their current 15.7% for inevitable decline. And I'd say it's probably safe to assume that Radko Gudas and Matt Carle aren't going to keep over 10% from the blue line either, even if they remain glued to Stamkos.

Beyond the usual suspects, Panik, Sustr and Labrie are making up for the big-minute boys in the opposite direction and BJ Crombeen somehow hasn't seen anything go in while on the ice at all, even incidentally, in his 85 minutes of 5v5 TOI - so... that's gonna move.

Their current division-leading standing (!?) will soon be jeopardized by their poor possession showing, inferior to even Florida, but luck-wise there's very little reason to correct for Tampa at all.

Regressed projection: 99.8 PDO

DETROIT - 98.9 PDO

Perhaps it's their pedigree, perhaps it's the fanfare with which they entered the division, but I was somewhat surprised to see where Detroit landed on this list. At 6-4-2, they are presently converting on their shots less than Buffalo and have only been better than the uber-regression-candidate Rangers in this department in the East. They're due for some bounces for half the roster, particularly vets like Alfredsson, Franzen and Weiss.

Contributing to their meh record as they maintain a low but not entirely grave PDO, they're only third in the division in possession. Completely middle-of-the-pack league-wide. Granted it's the post-Lidstrom era, but still! This isn't usually what people intend when talking about Detroit being a "mean possession team." Boston: better by a wide margin? Feels good, man.

On the other side of the coin, goaltending has been well above-average to date, exceeding Ottawa. Pretty much what I wrote about the Senators and Anderson, just copy and paste it here for Howard and you've got the spirit of the thing.

Regressed projection: 99.7 PDO

BUFFALO - 98.6 PDO

Well, the good news for Sabres fans is technically it should get better... except it won't. A few big caveats should be considered when reviewing and projecting the Sabres. First, their goaltending has been stellar and past ES showings from Miller haven't been too far from this existing mark. Over the past six seasons, the Sabres have held PDO above 100 on the back of their netminder. Unfortunately for them, he's not long for this team, having already been rumored in deals in the first quarter of the season. As with Detroit, shooting shouldn't remain this terrible, but any correction would be statistically meaningless if they don't increase their league-worst CorsiFor.

Speaking of... the Sabres are on target to be the worst possession team in the BehindTheNet era. The reason I've shown 5v5 Close above for FF% is to provide a like comparison to prior teams and the comparisons are not exactly flattering. The dreadful 07/08 Thrashers, current worst team recorded, only managed to maintain 43% FF. Buffalo is presently chilling at a positively arctic 38.2. It's early going yet, but given that possession figures show strong persistence of skill, there's only small expected mean regression for the stat. History Will Be Made?

Regressed projection: 99.6 PDO (except lol nope, because Miller)

FLORIDA - 97.2 PDO

Was entering the season with an unproven youngster and a semi-retired goalie a wise decision? Probably not, and receiving some of the worst goaltending in the league can partially be attributed to the individuals in net. Thomas' percentage was already beginning to pull toward the mean before he was re-injured, but as with Buffalo, the uncertainty in net leaves question as to whether we can expect any positive regression for this team. They otherwise would be the only real candidate for upward regression within the division.

PK save percentage is also a key issue for this Franken-team, landing over 1 SD from mean and second worst in the East, worst in division. Further indication, perhaps, that the goaltending side of the equation is really being dragged down by its true skill level.

Like us, the Panthers' shooting percentage is buoyed by a small group of high-converting forwards (among them old friend Brad Boyes) covering for the less fortunate shooters. But these cats have more guys presently scraping the bottom of the barrel, including an impressive on-ice goose-egg from Tomas Kopecky, beating out Crombeen for the Count Rugen memorial "Worst Thing I've Ever Heard" award with nada from anyone through 150 minutes. Plenty of room for an upward pull on the shooting side, with just as much to question about the goaltending.

Regressed projection: 99.3 PDO
