31 Dec 2018

Data analysis for finishing times of race Cross de invierno Ciudad de los Poetas 2018



#63 Data analysis for finishing times of race "Cross de invierno Ciudad de los Poetas 2018"

Last week (16/12/2018) I participated in the running race "XXXVI Cross de Invierno Ciudad de los Poetas - Memorial Javier Martínez Morales", a 6 km race (3 laps of 2 km with some hills; for a map, see: https://es.wikiloc.com/rutas-carrera/cross-invierno-dehesa-de-la-villa-8295067) in my neighbourhood in Madrid (Ciudad de los Poetas (Saconia)), in the beautiful park Dehesa de la Villa, organised by the local running club Agrupación Deportiva Ciudad de los Poetas (A.D. Ciudad de los Poetas). The nice thing about this race, apart from its setting, is that there are races for all ages, from young to old (competing in 10 categories), so the whole family can participate, which we did. Besides, the race is free and perfectly organised, so I can really recommend it.
A.D. Ciudad de los Poetas also posted many nice photos of the race on their Flickr page:

My favourites:
And for a nice video (of an earlier edition of the race):

https://www.youtube.com/watch?v=ejJhOhsDHTU&feature=youtu.be

108 runners participated in my race. Some were members of a running club or a school, and the rest were 'independent' runners like me (42 in total, see worksheet "Pivot1" of the attached Excel).

To analyse the finishing times, I downloaded the PDF-file from: 

and converted this to Excel with this (free) tool: 

https://www.pdftoexcel.com/

I added a column "TimeSecTotal" in which all finishing times are converted to seconds (using simple string functions, e.g. MID and RIGHT).
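For readers who prefer a script over spreadsheet string functions, the same conversion can be sketched in Python. This is only an illustration: the function name `time_to_seconds` is my own, and I am assuming the times are written as "mm:ss" or "h:mm:ss", which may not match the exact layout of the converted PDF.

```python
def time_to_seconds(text: str) -> int:
    """Convert a finishing time such as 'mm:ss' or 'h:mm:ss' to total seconds.

    Assumption: fields are separated by ':' (hypothetical input format).
    """
    parts = [int(p) for p in text.strip().split(":")]
    if not 1 <= len(parts) <= 3:
        raise ValueError(f"unexpected time format: {text!r}")
    total = 0
    for part in parts:
        # Each step shifts the running total one unit up (sec -> min -> hour).
        total = total * 60 + part
    return total


print(time_to_seconds("42:22"))    # 2542
print(time_to_seconds("0:40:29"))  # 2429
```

Splitting on the separator avoids the fixed character positions that MID and RIGHT depend on, so times with or without an hours field go through the same code.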
I also added a column "RankInRunnersClub", which gives the ranking within a sub-competition. For example, my overall ranking was 75 (of 109) (69th percentile), but in my sub-competition ('Independent runners') my ranking was 24 (of 42) (57th percentile). For this sub-ranking (ranking within a group), I used the function SUMPRODUCT, see e.g.:

https://www.extendoffice.com/documents/excel/4319-excel-rank-by-group.html
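The idea behind ranking within a group is to count, for each runner, how many runners in the same group were faster, and add 1. A rough Python sketch of that idea follows; the names `rank_in_group` and `percentile` and the sample rows are hypothetical, and it is not a transcription of the spreadsheet formula.

```python
def rank_in_group(rows):
    """rows: list of (group, seconds). Returns each row's rank within its group."""
    ranks = []
    for group, seconds in rows:
        # Count runners of the same group with a strictly faster time.
        faster = sum(1 for g, s in rows if g == group and s < seconds)
        ranks.append(faster + 1)  # ties share the same rank
    return ranks


def percentile(rank, total):
    """Rank expressed as a rounded percentage of the field size."""
    return round(100 * rank / total)


# Made-up sample data, not the real results.
rows = [
    ("Independent", 1500),
    ("Club A", 1520),
    ("Independent", 1610),
    ("Club A", 1490),
    ("Independent", 1610),
]
print(rank_in_group(rows))                      # [1, 2, 2, 1, 2]
print(percentile(75, 109), percentile(24, 42))  # 69 57
```

The last line reproduces the two percentiles mentioned above from the rank and the field size.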

My previous blog posts about races in which I participated show how to make a boxplot and histogram for the finishing times in Excel, see e.g.:

but this time I wanted to see whether there are free tools that can do this, and there are.
To create the histogram (see fig.1), I used:

NB: For other races I blogged about, I could check my histogram against the one on the Runedia site, see e.g.:

https://runedia.mundodeportivo.com/en/race/carrera-de-las-empresas-10k-actualidad-economica-2014/201419683/

but for this race they haven't published one yet; it may appear later here:

https://runedia.mundodeportivo.com/en/race/cross-de-invierno-ciudad-de-los-poetas-2018/20183350/
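As an aside, the binning behind a histogram like fig.1 is easy to sketch in Python. The bin width of 120 seconds, the function name `histogram` and the sample times are my own assumptions for illustration (only 2429 and 2542 come from the real results), so the output will not match fig.1.

```python
from collections import Counter


def histogram(times_sec, bin_width=120):
    """Count finishing times per bin; keys are the lower edge of each bin."""
    counts = Counter((t // bin_width) * bin_width for t in times_sec)
    return dict(sorted(counts.items()))


# Mostly made-up times in seconds; empty bins are simply not listed.
times = [1450, 1510, 1580, 1625, 1700, 1710, 1890, 2429, 2542]
bin_width = 120
bins = histogram(times, bin_width)
for start, count in bins.items():
    print(f"{start:>4}-{start + bin_width - 1:>4} s | {'#' * count}")
```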

And to create the boxplot (see fig.2), I used: 

Note that the boxplot also shows an outlier: the upper whisker therefore ends not at the slowest finishing time (2542 sec.) but at the second-slowest time (2429 sec.).
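Why the whisker stops short of the slowest time can be illustrated with the common 1.5 x IQR rule. I am assuming the tool uses this rule, which I did not verify. In the sketch below the function name `boxplot_stats` is hypothetical, and all sample times except 2429 and 2542 are made up, so the quartiles are not those of the real race. Quartile conventions also differ between tools; here I use Python's "inclusive" method.

```python
from statistics import quantiles


def boxplot_stats(times_sec):
    """Quartiles, whiskers and outliers using the 1.5 * IQR rule."""
    data = sorted(times_sec)
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    # Whiskers end at the most extreme values still inside the fences.
    inside = [t for t in data if low_fence <= t <= high_fence]
    outliers = [t for t in data if t < low_fence or t > high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "lower_whisker": inside[0],
        "upper_whisker": inside[-1],
        "outliers": outliers,
    }


# Illustrative data: only the two slowest times come from the race results.
times = [1650, 1720, 1780, 1820, 1880, 1950, 2010, 2060, 2100, 2429, 2542]
stats = boxplot_stats(times)
print(stats["upper_whisker"], stats["outliers"])  # 2429 [2542]
```

With these numbers the upper fence is 2080 + 1.5 x 280 = 2500 sec., so 2542 falls outside it and is drawn as a separate point, while 2429 becomes the end of the whisker.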
NB: I also found this stats tool:

https://plot.ly 

with which you can create all kinds of charts, but the box-plot didn't show the outliers.

To conclude, an interesting read (article with histogram and analysis):







fig.1: Histogram finishing times category Men Vet.B





fig.2: Boxplot v1 (with outlier) of finishing times category Men Vet.B




fig.3: Boxplot v2 of finishing times category Men Vet.B




fig.4: Boxplot v2 (detail) of finishing times category Men Vet.B



Downloads:

#Mirror 1: Google Drive