31 Dec 2018

Data analysis for finishing times of race Cross de invierno Ciudad de los Poetas 2018



#63 Data analysis for finishing times of race "Cross de invierno Ciudad de los Poetas 2018"

Last week (16/12/2018) I participated in the running race "XXXVI Cross de Invierno Ciudad de los Poetas - Memorial Javier Martínez Morales", a 6 km race (3 rounds o 2 km with some hills, for map, see: https://es.wikiloc.com/rutas-carrera/cross-invierno-dehesa-de-la-villa-8295067) in my neighourhood in Madrid' (Ciudad de los Poetas (Saconia)), in the beautiful park Dehesa de la Villa in Madrid, organised by the local running club Agrupación Deportiva Ciudad de los Poetas (A.D. Ciudad de los Poetas). The nice thing of this race, apart from its environment, is that there are races for all ages, from young to old (competing in 10 categories), so that the whole family can participate, which we also did. And besides, the race is also free, and perfectly organised, so I can really recommend it.
A.D. Ciudad de los Poetas also posted many nice photos of the race on their Flickr-space:

My favourites:
And for a nice video (of an earlier edition of the race):

https://www.youtube.com/watch?v=ejJhOhsDHTU&feature=youtu.be

In my race participated 108 runners, some were member of a running club, or of a school,besides 'independent' runners as me (total: #42, see worksheet "Pivot1" of the attached Excel). 

To analyse the finishing times, I downloaded the PDF-file from: 

and converted this to Excel with this (free) tool: 

https://www.pdftoexcel.com/

I added a column "TimeSecTotal" so that all finish times are converted to seconds (using simple string-functions, e.g. MID and RIGHT).
And also column "RankInRunnersClub", that indicates a ranking in a sub-competition (e.g. my (overall-)ranking was 75 (of 109) (69th percentile), but in my sub-competition ('Independent runners"), my ranking was 24 (of 42) (57th percentile)). For this sub-ranking ranking in a group), I used the function SUMPRODUCT, see e.g.:

https://www.extendoffice.com/documents/excel/4319-excel-rank-by-group.html

In my previous blog-posts about races in which I participated, you can see how you can make a boxplot and histogram for the finishing times in Excel, see e.g.: 

but this time I wanted  to see if there were some free tools with which this can be done, and there are.
To create the histogram (see fig.1), I used:

NB: For other races about which I blogged I could check my histogram with that at the site of Runedia, see e.g.: 

https://runedia.mundodeportivo.com/en/race/carrera-de-las-empresas-10k-actualidad-economica-2014/201419683/

but for this race, they didn't publish it, but maybe later, here:

https://runedia.mundodeportivo.com/en/race/cross-de-invierno-ciudad-de-los-poetas-2018/20183350/.

And to create the boxplot (see fig.2), I used: 

Note that the boxplot also plotted an outlier (so this means that the 'upper-whisker' is not the slowest finishing time (2542 sec.), but the penultimate slowest time (2429 sec.)).
NB: I also found this stats tool:

https://plot.ly 

with which you can create all kinds of charts, but the box-plot didn't show the outliers.

To conclude, an interesting read (article with histogram and analysis):







fig.1: Histogram finishing times category Men Vet.B





fig.2: Boxplot v1 (with outlier) of finishing times category Men Vet.B




fig.3: Boxplot v2 of finishing times category Men Vet.B




fig.3: Boxplot v2 (detail) of finishing times category Men Vet.B



Downloads:

#Mirror 1: Google Drive



25 Feb 2018

Speed skating at Winter Olympics: Power BI Dashboard for all medals


#62 Speed skating at Winter Olympics: Power BI Dashboard for all medals


For the 2018 Winter Olympic Games in PyeongChang (South Korea), in Korean:
평창 동계 올림픽 

I made a Power BI dashboard for all medals won in the long track speed skating competitions, see fig.1 for the result



fig.1: Power BI Dashboard with medals/rankings for all speed skating competitions

NB: translations Dutch - English
- Massastart =  Mass start
- Ploegenachtervolging = Team pursuit


The datasource I used is:

https://nl.wikipedia.org/wiki/Schaatsen_op_de_Olympische_Winterspelen_2018

NB: This is the Dutch version of this English page:
https://en.wikipedia.org/wiki/Speed_skating_at_the_2018_Winter_Olympics#Medal_summary

but I choose to use the Dutch page as the tables on this page (with medals-table and competition-results) where in a better format to import them in Power BI.

This datasource had a table with per competition a row with the athletes with rank 1, 2 and 3 (see the tables at the bottom of the dashboard), which with Power Query's M can be easily be unpivoted, so in 1 row per rank (needed to do the aggregations, leading to the tables and charts in the middle of the report).
For more info about (un)pivoting tables with rankings, see e.g.:

https://blogs.oracle.com/sql/how-to-convert-rows-to-columns-and-back-again-with-sql-aka-pivot-and-unpivot

As you can see in the report, the Netherlands did very well again this Winter Games, and was #1 in medals-table for long track speed skating with16 medals in total. Besides that we won another 4  medals in short track speed skating, so we won 20 in total, see for all medal-winners:

https://en.wikipedia.org/wiki/Netherlands_at_the_2018_Winter_Olympics

Only in the previous Winter Games in Sochi the Netherlands did better.
For my post about these medals ( so in Sochi), see:

http://worktimesheet2014.blogspot.com.es/2014/02/rankings-dutch-skaters-in-wintergames.html

And for a nice chart with Dutch medal-scores at all the Winter Games, see:

https://www.gelderlander.nl/olympische-spelen/de-nederlandse-medailles-tijdens-de-winterspelen~a1e54d7c/117331823/

As I told in my previous post about the 2018 Winter Games, see:

http://worktimesheet2014.blogspot.com.es/2018/02/power-bi-chart-with-map-for-winter.html

skating is a very popular sport in the Netherlands and we did send the max. allowed number of skaters to the Winter Games (20), more then any other country. So you could say that it is logically that we win the most medals in skating.  I read here:

https://www.washingtonpost.com/news/fancy-stats/wp/2018/02/20/2018-winter-olympics-power-rankings-norway-netherlands-dominating/

and interesting article which ranked the Netherlands #2 (after Norway) using a special 'power rating', which not only looks at the medals won, but also to the ranking of the other (not medal-winners) athletes, the # of participating athletes etc.).

Well done #TeamNL !!


Downloads

Mirror #1 (Google Drive):

https://goo.gl/cHb2Q4



18 Feb 2018

Power BI chart with map for 2018 Winter Olympics

#61 Power BI chart with map for 2018 Winter Olympics


For the 2018 Winter Olympic Games in PyeongChang (South Korea), in Korean:
평창 동계 올림픽 

Google did some research of what the favourite winter sport of every country is, and published this on Google Trends, see: 


and:


Not surprisingly, for the Netherlands the favourite sport is (ice) skating (speed skating and shorttrack). I was interested to see how popular skating is world-wide, so I looked at above Google Trends site for 'ice skating', filtering for the period of the Olympics 2018 (9 Feb. until today, 17 Feb.), here:

https://trends.google.com/trends/story/US_cu_vSct5mABAADwQM_en

I downloaded the CSV file from Google Trends and with this I made the Power BI chart in fig.1.


fig.1: Power BI 'filled map' chart with Google Trends Index for ice skating

The only red country on the map, left to the 'E' of Europe, is the Netherlands, with the highest Google Trends index (of 100) (and in green are the lowest scoring countries, and in pink the countries with a medium score).

To make this chart, this site was very helpful for me:


References:


For some information about Korean, see my blog-post: 

-For some of my blog-posts with skate-statistics of the 2014 Winter Olympics, see: