In search of similarity: Finding comparable NHL players

By Owen Kewell

The following is a detailed explanation of the work done to produce my public player comparison data visualization tool. If you wish to see the visualization in action it can be found at the following link, but I wholeheartedly encourage you to continue reading to understand exactly what you’re looking at:

https://public.tableau.com/profile/owen.kewell#!/vizhome/PlayerSimilarityTool/PlayerSimilarityTool

NHL players are in direct competition with hundreds of their peers. The game-after-game grind of professional hockey tests these individuals on their ability to both generate and suppress offense. As a player, it’s almost guaranteed that some of your competitors will be better than you on one or both sides of the puck. Similarly, you’re likely to be better than plenty of others. It’s also likely that there are a handful of players league-wide whose talent levels are right around your own.

The NHL is a big league. In the 2017-18 season, 759 different skaters suited up for at least 10 games, including 492 forwards and 267 defensemen. In such a deep league, each player should be statistically similar to at least a handful of their peers. But how to find these league-wide comparables?

Enter a bit of helpful data science. Thanks to something called Euclidean distance, we can systematically identify a player’s closest comparables around the league. Let’s start with a look at Anze Kopitar.

Anze Kopitar's closest offensive and defensive comparables around the league

The above graphic is a screenshot of my visualization tool.

With the single input of a player’s name, the tool displays the NHL players who represent the five closest offensive and defensive comparables. It also shows an estimate of the strength of this relationship in the form of a similarity percentage.

The visualization is intuitive to read. Kopitar’s closest offensive comparable is Voracek, followed by Backstrom, Kane, Granlund and Bailey. His closest defensive comparables are Couturier, Frolik, Backlund, Wheeler, and Jordan Staal. All relevant similarity percentages are included as well.

The skeptics among you might be asking where these results come from. Great question.

A Brief Word on Distance

The idea of distance, specifically Euclidean distance, is crucial to the analysis that I’ve done. Euclidean distance is a fancy name for the length of the straight line that connects two different points of data. You may not have known it, but it’s possible that you used Euclidean distance during high school math to find the distance between two points in (X,Y) Cartesian space.

Now think of any two points existing in three-dimensional space. If we know the details of these points then we’re able to calculate the length of the theoretical line that would connect them, or their Euclidean distance. Essentially, we can measure how close the data points are to each other.

Thanks to the power of mathematics, we’re not constrained to using data points with three or fewer dimensions. Despite being unable to picture the higher dimensions, we've developed techniques for measuring distance even as we increase the complexity of the input data. 
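As a rough illustration of the idea, here is a minimal Python sketch of Euclidean distance that works for any number of dimensions. The function name and the sample points are made up for this example; they are not taken from the tool itself.

```python
import math

def euclidean_distance(p, q):
    """Straight-line distance between two points with the same number of dimensions."""
    if len(p) != len(q):
        raise ValueError("points must have the same number of dimensions")
    # Sum the squared differences along every dimension, then take the square root.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

# Two dimensions: the familiar 3-4-5 right triangle from high school math.
print(euclidean_distance((0, 0), (3, 4)))  # 5.0

# Four dimensions: impossible to picture, but the formula is unchanged.
print(euclidean_distance((1, 2, 3, 4), (2, 3, 4, 5)))  # 2.0
```

The same function handles 2 inputs or 27; only the length of the tuples changes.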

Applying Distance to Hockey

Hockey is excellent at producing complex data points. Each NHL game produces an abundance of data for all players involved. This data can, in turn, be used to construct a robust statistical profile for each player.

As you might have guessed, we can calculate the distance between any two of these players. A relatively short distance between a pair would tell us that the players are similar, while a relatively long distance would indicate that they are not similar at all. We can use these distance measures to identify meaningful player comparables, thereby answering our original question.

I set out to do this for the NHL in its current state.

Data

First, I had to determine which player statistics to include in my analysis. Fortunately, the excellent Rob Vollman publishes a data set on his website that features hundreds of statistics combed from multiple sources, including Corsica Hockey (http://corsica.hockey/), Natural Stat Trick (https://naturalstattrick.com) and NHL.com. The downloadable data set can be found here: http://www.hockeyabstract.com/testimonials. From this set, I identified the statistics that I considered to be most important in measuring a player’s offensive and defensive impacts. Let’s talk about offense first.

List of offensive similarity input statistics

I decided to base offensive similarity on the above 27 statistics. I’ve grouped them into five categories for illustrative purposes. The profile includes 15 even-strength stats, 7 power-play stats, and 3 short-handed stats, plus 2 qualifiers. This 15-7-3 distribution across game states reflects my view of the relative importance of each state in assessing offensive competence. Thanks to the scope of these statistical measures, we can construct a sophisticated profile for each player detailing exactly how they produce offense. I consider this offensive sophistication to be a strength of the model.

While most of the above statistics should be self-explanatory, some clarification is needed for others. ‘Pass’ is an estimate of a player’s passes that lead to a teammate’s shot attempt. ‘IPP%’ is short for ‘Individual Points Percentage’, which refers to the proportion of a team’s goals scored with a player on the ice where that player registers a point. Most stats are expressed as /60 rates to provide more meaningful comparisons.
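To make the two definitions above concrete, here is a small Python sketch of a /60 rate and of IPP%. The function names and the sample numbers are hypothetical, chosen only to show the arithmetic.

```python
def per_60(count, minutes_played):
    """Convert a raw count into a rate per 60 minutes of ice time."""
    if minutes_played <= 0:
        raise ValueError("minutes_played must be positive")
    return 60 * count / minutes_played

def ipp(player_points, on_ice_team_goals):
    """Share of on-ice team goals on which the player registered a point."""
    if on_ice_team_goals <= 0:
        raise ValueError("on_ice_team_goals must be positive")
    return player_points / on_ice_team_goals

# Hypothetical player: 30 points in 1,200 minutes, on the ice for 40 team goals.
print(per_60(30, 1200))  # 1.5
print(ipp(30, 40))       # 0.75
```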

You might have noticed that I double-counted production at even-strength by including both raw scoring counts and their /60 equivalent. This was done intentionally to give more weight to offensive production, as I believe these metrics to be more important than most, if not all, of the other statistics that I included. I wanted my model to reflect this belief. Double-counting provides a practical way to accomplish this without skewing the model’s results too heavily, as production statistics still represent less than 40% of the model’s input data.

Now, let's look at defense.

List of defensive similarity input statistics

Defensive statistical profiles were built using the above 19 statistics. This includes 15 even-strength stats, 2 short-handed stats, and the same 2 qualifiers. Once again, even-strength defensive results are given greater weight than their special teams equivalents.

Sadly, hockey remains limited in its ability to produce statistical measurements of individual defensive talent. It’s hard to quantify events that don’t happen, and even harder to properly identify the individuals responsible for the lack of these events. Despite this, we still have access to a number of useful statistics. We can measure the rates at which opposing players record offensive events, such as shot attempts and scoring chances. We can also examine expected goals against, which gives us a sense of a player’s ability to suppress quality scoring chances. Additionally, we can measure the rates at which a player records defense-focused micro-events like shot blocks and giveaways. The defensive profile built by combining these stats is less sophisticated than its offensive counterpart due to the limited scope of its components, but the profile remains at least somewhat useful for comparison purposes.

Methodology

For every NHLer to play 10 or more games in 2017-18, I took a weighted average of their statistics across the past two seasons. I decided to weight the 2017-18 season at 60% and the 2016-17 season at 40%. If the player did not play in 2016-17, then their 2017-18 statistics were given a weight of 100%. These weights represent a subjective choice made to increase the relative importance of the data set’s more recent season.
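The weighting described above can be sketched in a few lines of Python. This is an illustration under assumptions rather than the original code: the function name, the dictionary layout, and the sample stat values are all hypothetical, and both seasons are assumed to carry the same stat names.

```python
def blend_seasons(current, previous=None, current_weight=0.6):
    """Blend two seasons of stats, weighting the more recent season more heavily.

    `current` and `previous` map stat names to values. If the player has no
    previous season, the current season is used at a weight of 100%.
    """
    if previous is None:
        return dict(current)
    previous_weight = 1.0 - current_weight
    return {
        stat: current_weight * value + previous_weight * previous[stat]
        for stat, value in current.items()
    }

# Hypothetical two-season player: 60% of 2017-18 plus 40% of 2016-17.
blended = blend_seasons({"G/60": 1.0, "A/60": 2.0}, {"G/60": 0.5, "A/60": 1.0})
print({stat: round(value, 2) for stat, value in blended.items()})

# Hypothetical rookie: no 2016-17 data, so 2017-18 counts in full.
print(blend_seasons({"G/60": 1.0, "A/60": 2.0}))
```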

Having taken this weighted average, I constructed two data sets: one for offense and the other for defense. I imported these spreadsheets into Pandas, which is a Python package designed to perform data science tasks. I then faced a dilemma. Distance is a raw quantitative measure and is therefore sensitive to its data’s magnitude. For example, the number of ‘Games Played’ ranges from 10 to 82, but Individual Points Percentage (IPP%) maxes out at 1. This magnitude issue would skew distance calculations unless properly accounted for.

To solve this problem, I proportionally scaled all data to range from 0 to 1. For each stat, a value of 0 was given to the player with the lowest figure league-wide, and 1 to the player with the highest. A player whose stat was exactly halfway between the two extremes would be given 0.5, and so on. This exercise in standardization resulted in the model giving equal consideration to each of its input statistics, which was the desired outcome.
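This kind of 0-to-1 scaling is commonly called min-max scaling. A minimal sketch in plain Python follows; the function name and the sample values are assumptions for illustration, and the handling of a stat where every player has the same value is a choice made here, not something the article specifies.

```python
def min_max_scale(values):
    """Rescale a list of numbers so the minimum maps to 0 and the maximum to 1."""
    lowest, highest = min(values), max(values)
    if highest == lowest:
        # Every player has the same value, so the stat carries no information.
        return [0.0 for _ in values]
    return [(v - lowest) / (highest - lowest) for v in values]

# Hypothetical 'Games Played' column: 46 sits exactly halfway between 10 and 82.
print(min_max_scale([10, 46, 82]))  # [0.0, 0.5, 1.0]
```

After scaling, a one-unit gap in Games Played no longer dwarfs a 0.01 gap in IPP%, because every stat spans the same range.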

I then wrote and executed code that calculated the distance between a given player and all others around the league who share their position. This distance list was then sorted to identify the other players who were closest, and therefore most comparable, to the original input player. This was done for both offensive and defensive similarity, and then repeated for all NHL players.
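The distance-and-sort step might look something like the following sketch. It is not the original code: the player names, positions, and scaled stat values are invented, the data layout is an assumption, and `math.dist` requires Python 3.8 or later.

```python
import math

def closest_comparables(target, players, k=5):
    """Return the k same-position players nearest to `target`, closest first.

    `players` maps a name to {"pos": ..., "stats": [...]}, where the stats are
    assumed to be already scaled to the 0-1 range.
    """
    me = players[target]
    distances = []
    for name, other in players.items():
        # Skip the input player and anyone at a different position.
        if name == target or other["pos"] != me["pos"]:
            continue
        distances.append((math.dist(me["stats"], other["stats"]), name))
    distances.sort()  # smallest distance first
    return [(name, dist) for dist, name in distances[:k]]

# Hypothetical league of four players with two scaled stats each.
players = {
    "Player A": {"pos": "F", "stats": [0.9, 0.8]},
    "Player B": {"pos": "F", "stats": [0.8, 0.8]},
    "Player C": {"pos": "F", "stats": [0.1, 0.2]},
    "Player D": {"pos": "D", "stats": [0.9, 0.8]},
}
for name, dist in closest_comparables("Player A", players, k=2):
    print(name, round(dist, 3))
```

Player D has an identical stat line to Player A but is excluded because the comparison is restricted to players who share a position.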

This process generated a list of offensive and defensive comparables for every player in the league. I consider these lists to be the true value, and certainly the main attraction, of my visualization tool.

Not satisfied with simply displaying the list of comparable players, I wanted to contextualize the distance calculations by transforming them into a measure that was more intuitively meaningful and easier to communicate. To do this, I created a similarity percent measure with a simple formula.

Similarity(A, B) = 1 - Distance(A, B) / Distance(A, C)

In the above formula, A is the input player, B is their comparable that we’re examining, and C is the player least similar to A league-wide. For example, if A->B were to have a distance of 1 and A->C a distance of 5, then the A->B similarity would be 1 - (1/5), or 80%. Similarity percentages in the final visualization were calculated using this methodology and provide an estimate of the degree to which two players are comparable.
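The worked example above translates directly into code. In this sketch the function and parameter names are hypothetical; only the arithmetic comes from the description.

```python
def similarity(dist_to_comparable, dist_to_least_similar):
    """Similarity of A to B, scaled by A's distance to its least similar player C."""
    if dist_to_least_similar <= 0:
        raise ValueError("distance to the least similar player must be positive")
    return 1.0 - dist_to_comparable / dist_to_least_similar

# A->B distance of 1 and A->C distance of 5, as in the example above.
print(f"{similarity(1, 5):.0%}")  # 80%
```

By construction, a player at zero distance scores 100% and the least similar player league-wide scores 0%.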

Limitations

While I wholeheartedly believe that this tool is useful, it is far from perfect. Due to a lack of statistics that measure individual defensive events, the accuracy of defensive comparisons remains the largest limitation. I hope that the arrival of tracking data facilitates our ability to measure pass interceptions, gap control, lane coverage, forced errors, and other individual defensive micro-events. Until we have this data, however, we must rely on rates that track on-ice suppression of the opposing team’s offense. On-ice statistics tend to be similar for players who play together often, which causes the model to overstate defensive similarity between common linemates. For example, Josh Bailey rates as John Tavares’ closest defensive comparable, which doesn’t really pass the sniff test. For this reason, I believe that the offensive comparisons are more relevant and meaningful than their defensive counterparts.

Use Scenarios

This tool’s primary use is to provide a league-wide talent barometer. Personally, I enjoy using the visualization tool to assess relative value of players involved in trades and contract signings around the league. Lists of comparable players give us a common frame through which we can inform our understanding of an individual's hockey abilities. Plus, they’re fun. Everyone loves comparables.

The results are not meant to advise, but rather to entertain. The visualization represents little more than a point-in-time snapshot of a player’s standing around the league. As soon as the 2018-19 season begins, the tool will lose relevance until I re-run the model with data from the new season. Additionally, I should explicitly mention that the tool does not have any known predictive properties.

If you have any questions or comments about this or any of my other work, please feel free to reach out to me. Twitter (@owenkewell) will be my primary platform for releasing all future analytics and visualization work, and so I encourage you to stay up to date with me through this medium.

Cover photo credited to Jae C. Hong — Associated Press

Analysis: How five elite scorers get their goals

By Owen Kewell

There’s something beautiful about scoring a goal.

Goals are the building blocks that make up hockey success, both on the individual and team level. They are a single moment in time, a culmination of a series of plays that ends with one team’s attack successfully defeating the other’s defense.

Teams are forever searching to add goals to their lineup, and for good reason. Goals win games, playoff series and, eventually, championships.

Goal-scoring is a repeatable talent, and while certain NHLers are far better at it than others, each player does it their own way. Each scorer exhibits unique tendencies of shot type selection and shot location.

Alex Ovechkin, Evgeni Malkin, Connor McDavid, Nikita Kucherov, and Patrik Laine are five of the best scorers in the game. Of the 10 goal leaders for the 2017-18 season, these five players possess the highest career goals per game rates. They are the elite of the elite when it comes to putting the puck into NHL nets.

I wanted to explore how they each do it differently.

Elite Scorers 1.jpg

The above visualization separates by shot type to show how each player scored their goals in the 2017-18 season. Overall, the most popular shot type was wrist shot, followed by snap shot, slap shot, and finally backhand.

It should be noted that the ‘AVG (10+ G Forwards)’ represents a weighted average of the relevant shot rate among all forwards who scored 10 or more goals, weighted by the number of goals that they scored. It’s a way to quantify ‘normal’ rates for the league’s goal scoring forwards.
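For readers who want to see how a goal-weighted average differs from a simple one, here is a short Python sketch. The function name, field names, and the three sample forwards are hypothetical and are not drawn from the actual data.

```python
def goal_weighted_rate(players, rate_key, min_goals=10):
    """Average a shot-type rate across players, weighted by each player's goal total."""
    eligible = [p for p in players if p["goals"] >= min_goals]
    total_goals = sum(p["goals"] for p in eligible)
    if total_goals == 0:
        raise ValueError("no eligible players")
    return sum(p["goals"] * p[rate_key] for p in eligible) / total_goals

# Hypothetical forwards; the 5-goal scorer falls below the 10-goal cutoff.
forwards = [
    {"goals": 40, "wrist_rate": 0.50},
    {"goals": 10, "wrist_rate": 0.20},
    {"goals": 5, "wrist_rate": 1.00},
]
print(round(goal_weighted_rate(forwards, "wrist_rate"), 2))  # 0.44
```

A simple average of the two eligible players would give 0.35; weighting by goals pulls the result toward the 40-goal scorer.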

Let’s take a more detailed look at each of these five players.

Alex Ovechkin

Elite Scorers 3.jpg

It’s no secret that Alex Ovechkin is really good at scoring goals. Since breaking into the league, he’s won the goal-scoring title 7 times, and no one else has won it more than twice. Sitting at 607 career goals, Ovi continues to propel himself further up the list of all-time greats. His 0.605 goals per game ranks first league-wide, beating out all other forwards by at least 0.08 G/GP.

Ovechkin loves slap shots, which should come as no surprise to anyone who’s watched Washington’s power play operate. His 17 slap shot goals were an uncontested 1st league-wide, with Steven Stamkos being the only other forward to score more than 7. Ovechkin’s slap shot is so powerful that it beats goalies clean even when they know it’s coming, meaning that it can be unleashed without needing to be disguised.

Equally noteworthy, Ovechkin scored just 31% of his goals by wrist shot, which represents the lowest rate among all 32 players who scored 30+ goals.

Heat Map courtesy of Micah Blake McCurdy's website HockeyViz (https://hockeyviz.com)

The red areas in the above heat map show where Ovechkin shoots more frequently than the rest of the league. Ovechkin makes an absolute killing at the top of the left faceoff circle, often referred to as the ‘Ovi Spot’. This area lines up with Ovechkin’s average shot distance of 32.3 feet, which ranked in the 80th percentile among the league’s forwards.

Although it’s not reflected in the heat map, much of Ovechkin’s damage is done with the man advantage playing the left point. Of his 49 goals, 17 were scored on the power play, which ranked 2nd only behind a player further down this list. His remaining 32 were scored at even-strength, which again ranked 2nd in the league. Elite scoring across both special teams and even-strength situations throughout his career has propelled Ovechkin to the status of the league’s premier goal scorer.

Evgeni Malkin

Elite Scorers 5.jpg
Elite Scorers 6.jpg

Despite being the second-best player on his team, Malkin has put together the resume of an elite goal scorer. He’s scored 75 goals in 140 games over the past two seasons, which converts to 44 goals over an 82-game season. His career goals per game of 0.472 ranks 6th among active forwards, placing him in elite company.

What makes Malkin dangerous is his offensive versatility; he can score from anywhere on the ice. Equal parts power and precision, Malkin possesses a variety of weapons. His snap shot goal rate clocks in at roughly double the league average (his 11 snap shot goals ranked 4th), but his middle-of-the-pack rates for wrist shots, slap shots and backhands speak to his balanced toolkit. Malkin does not rely on a single shot type to score goals, meaning that defenders must respect all shot types that Malkin credibly threatens. 

Heat Map courtesy of Micah Blake McCurdy's website HockeyViz (https://hockeyviz.com)

Did I mention that Malkin can score from anywhere? The sea of red is the beauty of Evgeni Malkin. He’s one of the most complete offensive players in the league. In addition to his heavy shot, his slick puck-handling ability and power forward frame allow him to generate shots and scoring chances at elite rates in the low slot area. His shot distance ranked just inside the upper third league-wide, influenced both by his crease-area chances and his shot activity in the high slot.

Malkin and Ovechkin are the only two players in the league to finish top-10 in both even-strength goals and power play goals. Malkin scored 28 times at evens, ranking 7th, and 14 times with the man advantage, ranking 6th. He is one of the game’s most dangerous players in the offensive zone, and his goal scoring abilities rank among the NHL’s elite.

Connor McDavid

Elite Scorers 8.jpg
Elite Scorers 9.jpg

At this point, not much more needs to be said about Connor McDavid’s offensive game. His 108 points were enough for a second consecutive Art Ross (but not Hart) Trophy. He’s been the league’s best forward for the last two years, and he’s only 21 years old.

But is he a goal scorer? While it’s true that McDavid has been viewed more as a set-up man than a finisher thus far in his young career, in 2017-18 we saw a transformation in McDavid’s offensive role. Compared to the year prior, McDavid scored 11 more goals and took 23 more shots. He became more of a trigger man, electing to attempt shots more often instead of looking to pass. This development calls to mind a young Sidney Crosby, who recorded seasons of 70 and 84 assists before breaking out for 51 goals in 2009-10.

McDavid prefers to score goals with his wrist shot. His 25 wrist shot goals ranked 3rd league-wide behind only Nathan MacKinnon and Eric Staal, while his rate of 61% ranked 9th among the 32 players who scored 30+ goals. He hardly ever takes slap shots, registering just 7 of these shots during the entire season, of which just 1 beat the goalie. Rather than rely on strength to generate power, McDavid creates offense thanks to generational skating and elite-level hands. He’s able to create and navigate space better than anyone else on the planet and puts himself into positions where a quick and accurate wrist shot is more than enough to beat the goalie.

Heat Map courtesy of Micah Blake McCurdy's website HockeyViz (https://hockeyviz.com)

McDavid has figured out hockey’s (not-so) secret formula: if you get close to the net, you’re more likely to score. He's extremely effective at using his speed, hands, and vision to attack the most dangerous area of the ice. McDavid’s sub-20’ average shot distance is a testament to his elite ability to generate scoring chances from the crease and low slot area.

McDavid’s special teams split is intriguing. His 35 even-strength goals ranked first in the entire NHL, but his 5 power play goals tied him for 96th among forwards. The latter can be explained both by Edmonton’s league-worst power play and by McDavid’s primary role as a puck distributor on the top unit. If Edmonton’s power play improves, which is likely given regression to the mean, McDavid’s special teams goal-scoring could very well take a step forward and supplement his elite even-strength scoring totals. He is already the game’s best forward and he poses a legitimate threat to become the game’s best scorer sooner rather than later.

Nikita Kucherov

Elite Scorers 11.jpg
Elite Scorers 12.jpg

A late 2nd round pick, Nikita Kucherov has emerged from relative anonymity to become one of the league’s most dangerous forwards. His 79 goals over the past two seasons place 3rd league-wide, and he was one of just three players to break 100 points in 2017-18.

While Kucherov’s absurdly accurate wrist shot remains his primary weapon (4th in wrist shot goals with 24), he is equally dangerous on the backhand. He scored 8 times (21% of all goals) on his backhand, ranking 2nd among 30+ goal scorers to Brad Marchand in both raw total and rate. Kucherov’s ability to score using wrist shots and backhands is all the more impressive considering that he shoots from further away than 93% of other forwards. He can be successful from this range without relying on the power of slap and snap shots due to his innate ability to find and exploit tiny gaps that goaltenders leave open. His shots are precise and accurate, and he excels at finding any available daylight.

Heat Map courtesy of Micah Blake McCurdy's website HockeyViz (https://hockeyviz.com)

An incredibly versatile player, Nikita Kucherov generates shots at elite rates all over the mid and high-slot. Rather than favour a specific shooting location, he elects to test the goalie from all areas of the offensive zone. This makes Kucherov unpredictable, which helps explain why his quick-release wrist shot and backhand are so devastating. He doesn’t shoot much from the crease area, but driving the net really isn’t part of how he creates offense.

Kucherov was more of a goal-scorer at even-strength than on the power play in 2017-18. He recorded 31 ES goals, one of just four players to crack 30, compared with 8 on the man advantage. He played more of a set-up role on Tampa Bay’s 3rd-ranked power play, registering 28 assists as he regularly sent cross-ice passes to Steven Stamkos (15 PP goals). Kucherov’s outstanding season cemented his status as one of the most dangerous goal scorers in the NHL, and at the prime age of 25 he’s as good a bet as any to repeat his offensive dominance next season.

Patrik Laine

Elite Scorers 14.jpg

At just 20 years old, Patrik Laine is already among the game’s premier snipers. His 44 goals ranked 2nd league-wide in 2017-18, fueling the Jets to their franchise-best season. Laine’s biggest asset is his shot, which may very well be the best in the league. Among current NHLers with 50+ career goals, Patrik Laine’s career shooting percentage of 18.0% ranks 2nd behind only Paul Byron. Byron, meanwhile, had an average shot distance of 19.3 feet in 2017-18, the shortest of all eligible forwards, while Laine’s average shot came from 36.1 feet, ranking in the 97th percentile. The kid can shoot the puck.

Laine’s weapon of choice is his snap shot, which he routinely uses to one-time pucks into the back of the net. His quick release and accurate shot placement resulted in 14 snap shot goals in 2017-18, which tied for the league lead with Phil Kessel. He also is a fan of the slap shot, with his 6 slap shot goals placing him in a tie for 4th among all forwards.

Heat Map courtesy of Micah Blake McCurdy's website HockeyViz (https://hockeyviz.com)

Here we see Laine’s favourite shooting locations. A right-handed shot, Laine loves to one-time pucks from the high slot. The fact that he’s able to beat the goalie so consistently from so far away speaks to his talent as a shooter. Like Ovechkin, Laine’s shooting locations lack variety, but he’s so good from his spots that goalies have difficulty stopping the shot even if they can anticipate that it’s coming.

The triggerman for the Jets’ 5th-ranked power play, Laine led all NHLers with 20 power play goals in 2017-18. He would routinely patrol the space between the left half-wall and left point, making himself open to cross-seam passes and one-timing his quick snapshot on net. His 24 even-strength goals tied for 20th in the league, so he’s no slouch at 5-on-5 scoring either.

Since breaking into the league, Laine has used his generational shot to pick apart opposing goalies. He is the odds-on favourite to inherit Ovechkin’s throne as the best goal-scorer in the league, and the sky’s the limit for a kid who potted 44 goals in just his second season.

 

Conclusion

So there we have it: the modus operandi of five of the game’s elite. While Ovechkin, Malkin, McDavid, Kucherov, and Laine possess a shared gift for putting the puck in the net, they achieve it with vastly different sets of techniques, skills, and strategies. There is no uniform way to score a goal across the league, but all that matters is that it goes in.

With goals representing the currency of the NHL, goal-scorers are among the most valuable assets out there. Scoring goals wins you games, playoff series, and, as 32-year-old Alex Ovechkin and 31-year-old Evgeni Malkin know, Stanley Cup championships. Kucherov (25), McDavid (21), and Laine (20) have not yet won hockey’s ultimate prize, but given their relative youth and their otherworldly ability to put the puck in the net, they might not be far away.

 

Data courtesy of Hockey Abstract (http://hockeyabstract.com/testimonials), Natural Stat Trick (https://naturalstattrick.com), and NHL.com (https://nhl.com)

Shot heat maps courtesy of Micah Blake McCurdy’s wonderful visualization website HockeyViz (https://hockeyviz.com)

Cover photo credited to NHL.com