Showing posts with label Tennis. Show all posts

Sunday, February 7, 2016

CourtHive

Fan Participation Platform

CourtHive is a web application for charting tennis matches. It was inspired by the Match Charting Project and is intended to support the growth of an open-source repository of matches which will aid statisticians in advancing the state of the art with respect to tennis.

Tuesday, November 17, 2015

Rally Tree: Point Distribution and Win Percentage

Tennis is an "intermittent" sport.  The level of intensity can vary greatly with the rally length of points and the time taken between points (among other factors including surface, ball type, sex and level of play).  When rallies are visualized they are typically depicted temporally from the first point to the last, which gives a jagged chart where it is difficult to discern any pattern at all.  "Rally Tree" is an attempt to bring a different perspective to the analysis of rallies.

 "Rally Tree" depicts the distribution of points across various rally lengths, beginning at the top with rally lengths of Zero, which indicate either Aces, Serve Winners, or Double Faults. Color coding differentiates errors where balls were "netted" vs. hit long.

 
There are several available views. The default view displays all points for a single match or selection of matches. You can filter by player to display only points served by either player (or composite of opponents).  Notice the number of winners among the points won for servers vs. those receiving.
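The distribution behind these views can be sketched in a few lines. This is only an illustration: the record layout (server, rally length, outcome) and the outcome labels are assumptions made for the example, not the format Rally Tree actually uses.

```python
from collections import Counter

# Hypothetical point records: (server, rally_length, outcome).
POINTS = [
    ("Djokovic", 0, "ace"),
    ("Djokovic", 3, "winner"),
    ("Opponent", 0, "double fault"),
    ("Opponent", 3, "netted"),
    ("Djokovic", 3, "long"),
    ("Opponent", 5, "netted"),
]

def rally_distribution(points, server=None):
    """Count points per rally length, broken down by outcome.

    If `server` is given, only points served by that player are counted,
    which mirrors the "served points" filter described above.
    """
    distribution = {}
    for point_server, rally_length, outcome in points:
        if server is not None and point_server != server:
            continue
        distribution.setdefault(rally_length, Counter())[outcome] += 1
    return distribution

# One row of the tree per rally length, shortest rallies first.
for rally_length, outcomes in sorted(rally_distribution(POINTS).items()):
    print(rally_length, dict(outcomes))
```

Each rally length becomes one row of the tree, and the per-outcome counts are what the color coding would distinguish.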

Additionally there is an overlay depicting the percentage chance that a point was won for any given rally length. The offset vertical lines represent 50% either side of center (0%).
For the "served points" views, this gives a graphic representation of the Persistence of Server Advantage, which varies greatly among players. Please note that this is not the same as percentage of points won for a given rally length.
The "Rally Tree" graphics in this post are of Novak Djokovic's matches at Wimbledon in 2014 and 2015. The last two images depict the persistence of Djokovic's server advantage on the left and a composite of his opponents' server advantage on the right. Djokovic's dominance is obvious.  Apart from rallies of seven, he had a greater than 50% chance of winning all points with rallies up to sixteen.  His opponents' composite server advantage only extended to rallies of five.

You can play around with a live version of Rally Tree and explore your favorite players at TennisVisuals.com

In the near future "Rally Tree" will be integrated with "Game Tree" and other TAVA components so that selections in one component can drive views in another.  For instance, a "Point Progression" from 0-0 to 0-15 can be selected in "Game Tree" to view the distribution of points in the "Rally Tree" or "Points-to-Set".  From this point it will be possible to explore whether there are certain points in a match when rally lengths increase...

To read more about "Persistence of Server Advantage" please follow the link to Jeff Sackmann's blog post on the topic.

Thursday, August 13, 2015

Radial Horizon Graphs: An Original?!?

The logical conclusion to my exploration of Horizon and Corona Graphs.  I've searched and can find no other examples on the "internets" of a radial Horizon Graph, so perhaps this is a first!

For your viewing pleasure, here are ten matches from Grand Slam Finals.  I hope the Radial Horizon Graphs have captured something of the dynamics of the matches that you wouldn't otherwise see when glancing at the score. I've provided a brief (unprofessional) "reading" of each graph, making the (perhaps unwarranted) assumption that there are more winners and forced errors being made than unforced errors.  In the next version of this graphic, or in the interactive version for TAVA, changes in momentum due to winners and losers should be easily discerned.


LEFT: Apart from the first few games of the first set, Kvitova dominated Bouchard.
RIGHT: Hingis was dominating until near the end of the 2nd set; Capriati had a strong finish.


LEFT: Navratilova and Evert were neck-and-neck in the first two sets; Martina dominated in the beginning of the 3rd, but Evert clawed her way back and took the lead at the very end, then lost.
RIGHT: Muguruza took an early lead but lost momentum; she began to recover near the end of the 2nd set, but it was too late.


LEFT: Safina was stronger at the beginning of both sets.
RIGHT: Sharapova was strong at the beginning of the 1st set; she lost ground steadily in the 2nd.


LEFT: Na Li almost gave up her early lead in the 1st set; she never looked back in the 2nd.
RIGHT: Notice that all three sets ended at 6-3; there are differences, but Cilic steadily advanced.


LEFT: Federer began and ended in control; Roddick led throughout the 2nd set; the 3rd set was a bit of a seesaw, but Federer held; the 4th was definitive.
RIGHT: Wozniacki had a chance to break in the 1st game of the 1st set; after that point she couldn't win points fast enough to keep up with Serena.

Wednesday, August 12, 2015

Points-to-Set: Horizon Corona



I've been searching for a representation of a tennis match that captures the dynamics of play yet remains simple enough and compact enough to use as either an icon or a control structure suitable for selecting a range of points within a match.  I also wanted a graphic that could be used to quickly compare a series of matches, with enough detail to easily differentiate a 6-0, 6-0 win that was a "cakewalk" from a 6-0, 6-0 win where every game went to deuce and beyond.

The Corona/Horizon Graphs above are the result of my early attempts to use Points-to-Set data in a new way, charting the difference between the two players' Points-to-Set numbers rather than the absolute values.

Corona Graphs

Corona graphs are actually formally known as radial area graphs; there are also examples of radial histograms which I would describe as "Corona Graphs". These graphs share a lot in common with Polar Coordinate Graphs (such as TAVA's Radar Chart), but they look like the Corona that surrounds our Sun.  I haven't seen the name used in the Visualization community as yet, but it is fitting, especially considering the formal definition of a Coronagraph: "A coronagraph is a telescope that can see things very close to the Sun. It uses a disk to block the Sun's bright surface, revealing the faint solar corona, stars, planets and sungrazing comets. In other words, a coronagraph produces an artificial solar eclipse".  So, with Corona Graphs I hope to highlight important aspects of a match which normally are obscured by the quantity of data available within the match.

[Update: The term "Corona Charts" (here and here) is used in the financial community.  But it is not a radial structure and doesn't resemble the graphs above.]

Horizon Graphs / Charts

Horizon graphs are a type of Time-Series graph which were developed relatively recently by Panopticon Software (now known as DataWatch).  Here is a paper describing the development of the graph, and here is an in-depth analysis of the Horizon Graph by Stephen Few of Perceptual Edge, a "Visual Business Intelligence" company.

Horizon Graphs excel at displaying a large number of time series at one time.  They are described as a tool for rapidly scanning huge amounts of data to quickly identify "points of concern"; they "preserve data density while preserving resolution."  A Tennis Match can certainly be thought of as a time series, a progression of points through time.  Horizon Graphs seem ideally suited for comparing matches, but it turns out they are also useful for comparing Sets within matches, and for identifying critical moments during play.

When I began this project I was overwhelmed by the variety of chart examples available.  I wanted to try them all, but it wasn't immediately obvious how each type of chart could be meaningfully applied. It wasn't until I generated my first Corona graphs with Points-to-Set data that I realized how I could use Horizon Graphs, and how useful they could be.

Here is the progression from my first Match Corona visualization to my first Match Horizon:



In the first Corona graph, on the left, the difference in Points-to-Set values varies from positive to negative.  For the second Corona graph I simply flipped the negative values and changed the color to represent the second player.  Below you can see the same data values in a standard horizon graph.
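The difference-and-flip step is simple enough to sketch. The function names and the sign convention (positive when the first player is closer to winning the Set) are assumptions for this example; the sample values follow a first game that goes 15-0, 15-15, 15-30, 15-40.

```python
def points_to_set_difference(p1_needs, p2_needs):
    """Per-point difference; positive when player 1 needs fewer points."""
    return [p2 - p1 for p1, p2 in zip(p1_needs, p2_needs)]

def split_by_player(diff):
    """Flip the negative values so each player gets a non-negative series."""
    player1 = [max(d, 0) for d in diff]
    player2 = [max(-d, 0) for d in diff]
    return player1, player2

# Points-to-Set for each player at the start and after each of four points.
p1_needs = [24, 23, 23, 23, 24]
p2_needs = [24, 24, 23, 22, 21]

diff = points_to_set_difference(p1_needs, p2_needs)
player1, player2 = split_by_player(diff)
print(diff)     # [0, 1, 0, -1, -3]
print(player1)  # [0, 1, 0, 0, 0]
print(player2)  # [0, 0, 0, 1, 3]
```

The two non-negative series can then be drawn in different colors, one per player, which is all the second Corona graph does.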


The horizon graph is then cut into bands and layered.  The peaks are still visible and no space "under the curves" is wasted.  Color gradations indicate distance from the baseline so that the greater values become darker.
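As a rough sketch of the banding, assuming a non-negative series and bands of equal height (the function name and parameters are made up for this example):

```python
def horizon_bands(values, band_height, n_bands):
    """Slice a non-negative series into stacked bands of equal height.

    Band k holds the part of each value that lies between k * band_height
    and (k + 1) * band_height, clipped to the range [0, band_height].
    """
    bands = []
    for k in range(n_bands):
        lower = k * band_height
        bands.append([min(max(v - lower, 0), band_height) for v in values])
    return bands

series = [0, 1, 3, 5, 2]
for band in horizon_bands(series, band_height=2, n_bands=3):
    print(band)
# [0, 1, 2, 2, 2]
# [0, 0, 1, 2, 0]
# [0, 0, 0, 1, 0]
```

Drawing every band from the same baseline, with a darker color for each higher band, gives the layered look: the peak at 5 reaches into the third and darkest band, yet the chart is only one band tall.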


With this realization it became possible to compare sets and matches with a very compact visual.

When you see a Horizon Graph for the first time you might find it to be somewhat confusing.  But with a bit of study and experience I think you'll find them very valuable.  Read the links above or this in-depth overview by a team at Berkeley: "Sizing the Horizon: The Effects of Chart Size and Layering on the Graphical Perception of Time Series Visualizations".

Set Comparison

Here are the sets from the 2001 R16 match at Wimbledon between Pete Sampras and Roger Federer. Federer won the match 7-6, 5-7, 6-4, 6-7, 7-5.  Federer is in blue; Sampras is in Green.  


You can see the winner of each set by the final color of each graph.  The depth of color at any given moment indicates the distance between the two Points-to-Set numbers: darker colors indicate a greater point difference. Turning the graphic into a control structure will enable point and game selection as well as "brushing" to select a range of points in a game. For the next version of TAVA I will add ticks and marks to optionally indicate breakpoints, aces, winners, errors, etc.  I'll save the use of Horizon and Corona graphs as control structures for a future post.

To illustrate the ability of the Horizon Graph to enable rapid differentiation of sets which have the same score in games but which vary widely in the intensity of play and the distribution of points, here are Horizon Graphs for three sets which each finished at 6-0:


In the first example one player dominated completely, winning all points.  In the second example, which is taken from the 2012 Olympics final between Serena Williams and Maria Sharapova, Serena gave up 12 points to Sharapova and needed 28 points to close out the set.  In the third example every game of the set went to deuce and most games were at deuce more than once. Seventy-one points were played in the final example versus only twenty-four in the first example and forty in the second.

Match Comparisons

The screen real-estate provided by Blogger makes these a bit too compact, but I hope this gives some idea of the expressiveness of Horizon Graphs.  You can click on each graph to see the full size image:

 

And finally, here is a link to a video about Interactive Horizon Graphs.  This is a bit orthogonal to my intent to use Horizon Graphs as control structures, but it is interesting nevertheless and may provide some inspiration for a way to compare very large numbers of matches in the future.  I'm discovering that there are many attributes of matches other than Points-to-Set which may be usefully visualized with Horizon Graphs.

Acknowledgements

I want to recognize again the work of Francis X. Diebold, Glenn Rudebusch and Professor Diebold's students at the University of Pennsylvania.  As far as I and they can tell, their work on the concept of Points-To-Set is completely original.

Thursday, August 6, 2015

Visualizing Momentum

Momentum has been described as an "invisible" or "hidden force" in tennis.  (See "The Hidden Force" and the NYT article "The Importance of Momentum in Tennis").  Whether momentum actually exists at all in Tennis or any other sport has long been debated, but it is a certainty that momentum is something that many players and even the crowd "feel" when watching a match.

In "Analyzing Wimbledon", Professors Klaassen and Magnus conclude via statistical analysis that some limited momentum exists for weaker players, but not for top players.

While it is impossible to fully capture the emotional and physical dynamics that contribute to changes in momentum, whether it exists or not, it is possible to create a representation of the progression of points throughout a match which includes details relevant to the outcome of each point.

The following graphic captures the outcome of first and second serves, the return of serve, the Key Shot which determines a point winner, as well as the length of the rally, if any, while a point is being played.  The centerline which runs down the middle of the graphic represents an even point score and the line moves left or right depending on which player has won the most points; a standard score-matrix is overlaid to give an understanding of the outcome of each game.
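The centerline is essentially a running point difference. A minimal sketch, assuming the match is given as a list of point winners (0 or 1) and that a point for player 0 moves the line in the positive direction:

```python
from itertools import accumulate

def momentum_line(point_winners):
    """Running point difference after each point.

    +1 for every point won by player 0, -1 for every point won by player 1,
    so 0 corresponds to the centerline (an even point score).
    """
    steps = (1 if winner == 0 else -1 for winner in point_winners)
    return list(accumulate(steps))

winners = [0, 0, 1, 0, 1, 1, 1]
print(momentum_line(winners))  # [1, 2, 1, 2, 1, 0, -1]
```

Which direction counts as positive is arbitrary here; in the graphic it simply decides whether the line moves left or right.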

For a full explanation of how to read this graphic, please see my post on the GameFish, which was derived from the Momentum Chart.


Winning the most points does not, however, ensure that a player will win a match.  Psychological factors aside, in certain cases when a point is won is more important than the fact that a point was won, at least with respect to the match outcome.  I will discuss this in a future post and hopefully have some visuals which can facilitate better understanding this point.  At the moment I'm working on a graphic that merges the basis of the Momentum Chart (difference in total points) with the idea of the "Points-to-Set" graph (number of points required to win, at any given moment) and I'm hoping it will provide some insight.

The Momentum Chart in TAVA was inspired by the excellent Momentum Chart in the ProTracker Tennis App (for iPhones/iPads).  ProTracker Tennis has a few features which I didn't incorporate in my Proof-of-Concept version.  The score-matrix overlay is original to my implementation.

Version 2 of TAVA will increase the use of the Momentum Chart as a control structure and seek to overlay visualizations which provide additional analysis into factors which may be seen to have an influence on changes in momentum.  At the moment the Momentum Chart drives the Court View (post forthcoming) which displays shots for matches captured with ProTracker Tennis.

There is an excellent discussion of Momentum for Players and Coaches on the Turbo Tennis blog at The Tennis Server.  Please see the articles "Momentum... Swing it in your favor",  "Momentum Revisited" and "The Big MO!".

Of course, Momentum can also be interpreted in the context of a series of matches.  The cross-match visualizations I'm doing for the next version of TAVA will look at this aspect of Momentum in depth.

Monday, August 3, 2015

Points-To-Set

The "Points-to-Set" graph was inspired by the work of Francis X. Diebold, Glenn Rudebusch and Professor Diebold's students at the University of Pennsylvania.  In December, 2014, Professor Diebold published "A Tennis Match Graphic" on his blog No Hesitations, and in February when I was just discovering D3 I decided to attempt to recreate his work for the data I had just learned to parse from ProTracker Tennis.  Here is the result of that effort, taken from the 2015 Wimbledon semifinal match between Roger Federer and Andy Murray, where you can view these charts "live":


And here is the latest version:


We can think of the "Points-to-Set" number as the minimum distance from the current number of points won until the end of the Set; it always assumes your opponent wins no additional points.  In TAVA this number is expressed graphically for each player to indicate at any given moment in a Set which player is closer to winning.


To win a standard Set in a tennis match a player must, at a minimum, win six games and be ahead by two games.  Giving no more than two points away, there is a minimum of four points which must be won in each game.  That means that at the beginning of a Set each player needs twenty-four points to win the Set.  The Y-axis of the graph below ranges from 24 up to 0, which is where the Set concludes. The X-axis shows the total number of points within the Set.  In the match depicted in these "Points-to-Set" visualizations you can see the varying number of points which had to be played for Roger Federer to close out each Set.

Every point won brings a player closer to the end of the Set, obviously.  Some games, when they are lost, increase the "Points-to-Set" number.  For instance, at the beginning of a Game when the score is 5-4 in the Set, the first player needs only four points to win, while the second requires twelve.  If the first player loses the game and the score becomes tied at 5-5, each player is then eight points from winning the Set. In fact this scenario occurred twice in this match, in both the first and second Sets which were won by Federer 7-5.  You may also notice that in the first game of both the second and third Sets there was a moment when Andy Murray needed 25 points to win the Set.  This actually occurs quite frequently when the first few games are won by one player.  When a player leads 5-0, the opponent actually needs 28 points to win the set.
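The arithmetic above can be checked with a short sketch. It assumes a set decided by a two-game margin (tiebreaks are ignored) and counts points within a game as 0, 1, 2, 3, ... rather than 0, 15, 30, 40; the function names are made up for this example and are not taken from TAVA.

```python
def points_to_win_game(own, opp):
    """Fewest points needed to win the current game if the opponent wins none.

    A game needs at least four points and a two-point margin.
    """
    return max(4, opp + 2) - own

def games_to_win_set(own_games, opp_games):
    """Fewest games needed to win the set if the opponent wins none.

    A set needs at least six games and a two-game margin (no tiebreak).
    """
    return max(6, opp_games + 2) - own_games

def points_to_set(own_games, opp_games, own_points=0, opp_points=0):
    """Points-to-Set: finish the current game, then win the remaining games."""
    games_needed = games_to_win_set(own_games, opp_games)
    return points_to_win_game(own_points, opp_points) + 4 * (games_needed - 1)

print(points_to_set(0, 0))        # 24: start of the set
print(points_to_set(5, 4))        # 4:  leading 5-4
print(points_to_set(4, 5))        # 12: trailing 4-5
print(points_to_set(5, 5))        # 8:  level at 5-5
print(points_to_set(0, 0, 0, 3))  # 25: down 0-40 in the first game
print(points_to_set(0, 5))        # 28: trailing 0-5
```

The 25 appears because a player at 0-40 needs five straight points to win the game (three to reach deuce, then two more) instead of the usual four.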

In the game of the second Set which Federer lost there were seven deuces; you can see this in the "Points-to-Set" graphic below where the lines for both players become jagged. You can also see that Federer failed to convert on six breakpoints before winning the set by finally converting a breakpoint.


As I work on the re-write of TAVA I'm developing a gallery of re-usable visualization components and adding configurable features.  In addition to the "orientation highlighting" demonstrated above, I'm adding "game highlighting", which you can see in the chart for the third set below:


When using the "Points-to-Set" component in TAVA, the corresponding moments of the match are highlighted on the Sunburst and you can see the longest game of the match occurred in the second set and was won by Andy Murray (purple) when Federer failed to convert two breakpoints.


In a recent post Professor Diebold has updated his Tennis Graphic to include elements which indicate where breakpoints occurred and highlight when tiebreaks take place.  Here is the site where his team has collected the visualizations they've created.  I've taken some of these ideas on board and in the re-write of TAVA I'm going to try to push the features and usefulness of the "Points-to-Set" graphic further.  I am intrigued by the idea of producing some variation of a Points-to-Match graphic as a slider/filter for generating dynamic statistics for a range of points within a match...

Sunday, August 2, 2015

GameFish: Point Progression, Key Shots, Rallies

The GameFish visualization provides a single-glance overview of one game from a tennis match.  It is an enhancement of the standard score-matrix for tennis matches.  


The GameFish above is an example from the 2015 Finals at Roland Garros between Stan Wawrinka and Novak Djokovic (the preceding link takes you to the TAVA visualization of the match).

The boxes on the left of the graphic indicate that the server was Stan Wawrinka, as well as the outcomes for all first and second serves.  Light Green dots represent Service Winners; Yellow dots represent Serves that were "In"; Red dots represent faults.  In this game there were no Aces (darker Green dots).  On the right of the graphic the dots represent Novak Djokovic's Return of Serve.

The Game Grid in the center of the graphic indicates the winner of the point by cell color (blue for Wawrinka; purple for Djokovic) as well as the final "Key Shot" which determined the point winner.  In this game there were only three points won with Winners; the majority of points were won due to opponent error.

Rally lengths are depicted with bluish-grey bars which appear "behind" the GameFish.  These rally-bars turn yellow when the mouse hovers over the point, and the number of shots and point-score appear at the top of the graphic.

The GameFish also triggers orientation highlighting in the Sunburst visualization which is the primary control structure for TAVA and which is used to "drive" the GameFish visualizations.


I have seen a fish-like view used in a number of tennis betting applications (example), and it bears some resemblance to the very cool "Game Tree" which appeared on GameSetMap.com in February of 2014.  (An interactive version can be found here).

I also recently discovered "The Tennis Notebook" blog on Medium.com.  Nikita Taparia created an attractive score-matrix to visualize Point Outcomes for entire matches (point distributions) in Tennis Note # - Rafa in Paris: The Numbers.

There is an IEEE research paper from 2014 which references a "TennisVis" application (apparently a research project) which uses a *very* similar "Fish Grid" to depict point progression.

[update 2015-08-08: found this link on the IEEE contest website to a graphic from "TennisVis"].

The GameFish was derived from and is the basis for the Momentum Chart which strings together all games from a match showing the relative point-score for each player; it is also a control structure within TAVA that provides additional drill-down capabilities for matches which include detailed information about points within games.  

A future post will cover the Court view and the visualization of shots within points.

Thursday, July 30, 2015

Sunburst: Match at a Glance


The goal of the Sunburst Visualization is to provide an information-rich view of a match in a single graphic. The layout for the Sunburst is like a clock.  The match begins at 12:00 and proceeds clockwise.  


The Sets, Games and Points won by each player are colored: Djokovic is blue, Nishikori is purple. The circle at the center of the visualization indicates the outcome of the match.  In this case it is blue, indicating that Djokovic won the match. 

The first ring around the center represents Sets. Djokovic won the first Set, Nishikori the second Set, and Djokovic the third Set. The length of each ring segment indicates the length of each Set relative to every other Set. The second ring from the center represents Games. The darker the color, the more intense the game (in terms of length of rallies). The third ring from the center represents Points. The darker the color, the longer the rally for the point. 
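The clock layout of the Sets ring can be sketched as follows. It assumes that the "length" of a Set is its number of points (one plausible reading, not confirmed here) and measures angles in radians clockwise from 12:00, which is also the convention D3's arc generator uses.

```python
import math

def set_arcs(points_per_set):
    """Return a (start, end) angle pair for each set, clockwise from 12:00.

    Each set's share of the full circle is proportional to its number of
    points, so longer sets get longer ring segments.
    """
    total = sum(points_per_set)
    arcs = []
    start = 0.0
    for n_points in points_per_set:
        end = start + 2 * math.pi * n_points / total
        arcs.append((start, end))
        start = end
    return arcs

# Hypothetical three-set match with 50, 100 and 50 points.
for start, end in set_arcs([50, 100, 50]):
    print(round(math.degrees(start)), round(math.degrees(end)))
# 0 90
# 90 270
# 270 360
```

The Games and Points rings can be laid out the same way, subdividing each parent segment by the points it contains.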

The final and outermost ring represents significant Shots: First and Second Serves, Return of Serve, and the final, Key Shot.  Shots colored Green are Winners.  Light green represents a Shot which forced the opponent to make an error ("forcing shots").  Red represents Errors.  Shots that are "In" are colored to indicate who made the Shot.

Comparing Sunburst visualizations of different matches you can quickly get a gut feel for whether one or both players achieved a significant Aggressive Margin (see also here).  

At the moment I spend most of my time tracking U12 and U10 matches, so the outer ring is mostly Red and the Aggressive Margins are usually negative. Compare the Djokovic/Nishikori match on the left to the Semifinals match between Roger Federer and Andy Murray at Wimbledon (2015):


In the match between Djokovic and Nishikori the relative Aggressive Margins were 5% and -9% whereas in the Federer/Murray match the relative Aggressive Margins were 29% and 23%.
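For readers who want to compute the number themselves, here is a sketch that assumes the commonly cited definition of Aggressive Margin: winners plus errors forced from the opponent, minus unforced errors, divided by the total points played. The figures in the example are invented and are not taken from either match.

```python
def aggressive_margin(winners, forced_errors_induced, unforced_errors, total_points):
    """Aggressive Margin as a fraction of all points played.

    Assumed definition: (winners + errors forced from the opponent
    - own unforced errors) / total points in the match.
    """
    return (winners + forced_errors_induced - unforced_errors) / total_points

# Invented figures for illustration only.
print(aggressive_margin(30, 20, 25, 100))  # 0.25
print(aggressive_margin(5, 4, 18, 100))    # -0.09
```

A negative value means a player made more unforced errors than winners and forcing shots combined, which matches a mostly Red outer ring.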

TAVA

[UPDATE: The role of the Sunburst Visualization has been diminished in the latest update to TAVA.  It is still used for orientation highlighting and some navigation.  Sunburst will make a more forceful return at a later date when new functionality is added.]

The Tennis AiP visualization application (TAVA) uses the Sunburst as a control structure to aid in the exploration of various aspects of a match. Clicking on a Set segment reveals a Points-to-Set visualization; clicking a Game segment reveals a GameFish visualization; clicking a Point reveals the GameFish where that point occurred.

The Sunburst Visualization is zoomable.  Double clicking a Set segment transitions to a view of a single Set, while double clicking a Game segment transitions to a view of a single Game:


In TAVA, clicking the center of the Sunburst initiates a zoom transition to the enclosing Set or to the initial Match view.  From the initial Match view, clicking the center of the Sunburst transitions the display to a Momentum Chart.

The Sunburst also provides "Orientation Highlighting" while using other components of TAVA to indicate where Shots and Points occurred:

Finally, TAVA uses the Sunburst to indicate where Breakpoints occurred during a match:

In the visualization above Djokovic (Blue) had a breakpoint opportunity in the second game; he successfully converted (Green).  In the sixth game Djokovic had three breakpoints, but Nishikori saved the breakpoints and Djokovic failed to convert (Red).  In the final Game of the first Set, Nishikori (purple) had two breakpoints which he failed to convert (Red).