#63 Data analysis for finishing times of race
"Cross de invierno Ciudad de los Poetas 2018"
Last week (16/12/2018) I participated in the running
race "XXXVI Cross de Invierno Ciudad de los Poetas - Memorial Javier
Martínez Morales", a 6 km race (3 laps of 2 km with some hills; for a map,
see: https://es.wikiloc.com/rutas-carrera/cross-invierno-dehesa-de-la-villa-8295067) in my neighbourhood in Madrid (Ciudad de los Poetas
(Saconia)), in the beautiful park Dehesa de la Villa, organised by the local running
club Agrupación Deportiva Ciudad de los Poetas (A.D. Ciudad de los Poetas). The nice thing about this
race, apart from its setting, is that there are races for all ages, from
young to old (competing in 10 categories), so that the whole family can
participate, which we also did. Besides, the race is free and
perfectly organised, so I can really recommend it.
A.D. Ciudad de los Poetas also posted many nice photos
of the race on their Flickr-space:
My favourites:
And for a nice video (of an earlier edition of the
race):
https://www.youtube.com/watch?v=ejJhOhsDHTU&feature=youtu.be
108 runners participated in my race; some were members
of a running club or of a school, besides 'independent' runners like me (42 in
total, see worksheet "Pivot1" of the attached Excel).
To analyse the finishing times, I downloaded the
PDF-file from:
I added a column "TimeSecTotal" so that all
finish times are converted to seconds (using simple string-functions, e.g. MID and RIGHT).
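For readers who prefer a script to spreadsheet formulas, the same conversion could be sketched in Python. This is only an illustration: the function name `time_to_seconds` is made up, and it assumes the finish times are written as "MM:SS" or "H:MM:SS", which may not match the layout of the PDF exactly.

```python
def time_to_seconds(text: str) -> int:
    """Convert a finish time such as 'MM:SS' or 'H:MM:SS' to total seconds.

    The accepted formats are an assumption, not taken from the results PDF.
    """
    parts = [int(p) for p in text.strip().split(":")]
    if not 2 <= len(parts) <= 3:
        raise ValueError(f"unexpected time format: {text!r}")
    seconds = 0
    for part in parts:
        # Each colon-separated field is worth 60x the field to its right.
        seconds = seconds * 60 + part
    return seconds


print(time_to_seconds("42:22"))    # 2542
print(time_to_seconds("0:40:29"))  # 2429
```

The loop handles both formats in the same way, so no separate branch for hours is needed.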
I also added a column "RankInRunnersClub", which
gives the ranking within a sub-competition (e.g. my overall ranking was 75 (of 109)
(69th percentile), but in my sub-competition ('Independent runners'), my
ranking was 24 (of 42) (57th percentile)). For this sub-ranking (ranking within a group), I used the function SUMPRODUCT, see e.g.:
https://www.extendoffice.com/documents/excel/4319-excel-rank-by-group.html
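The idea behind the rank-by-group formula (count how many runners in the same group were faster, then add 1) can also be sketched in Python. The function names and the sample rows below are hypothetical and are not taken from the race results; only the two percentile checks at the end use the rankings mentioned above.

```python
def rank_in_group(rows):
    """Rank each runner within their own group (1 = fastest).

    rows: list of (group, seconds) tuples. Ties share the same rank,
    as they would with a 'count the faster ones, plus one' formula.
    """
    ranks = []
    for group, secs in rows:
        faster = sum(1 for g, s in rows if g == group and s < secs)
        ranks.append(faster + 1)
    return ranks


def percentile_of_rank(rank, total):
    """Rank expressed as a rounded percentage of the field size."""
    return round(100 * rank / total)


# Made-up example rows, for illustration only.
rows = [
    ("Independent", 1500),
    ("Club A", 1400),
    ("Independent", 1450),
    ("Club A", 1600),
    ("Independent", 1500),
]
print(rank_in_group(rows))         # [2, 1, 1, 2, 2]
print(percentile_of_rank(75, 109))  # 69
print(percentile_of_rank(24, 42))   # 57
```

This compares every row with every other row, which is fine for a field of about a hundred runners.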
In my previous blog-posts about races in which I
participated, you can see how you can make a boxplot and histogram for the
finishing times in Excel, see e.g.:
but this time I wanted to see if there were some
free tools with which this can be done, and there are.
To create the histogram (see fig.1), I used:
NB: For other races about which I blogged I could
check my histogram with that at the site of Runedia, see e.g.:
https://runedia.mundodeportivo.com/en/race/carrera-de-las-empresas-10k-actualidad-economica-2014/201419683/
but for this race they haven't published one; it may appear later here:
https://runedia.mundodeportivo.com/en/race/cross-de-invierno-ciudad-de-los-poetas-2018/20183350/.
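As a quick cross-check of a histogram made with an online tool, the bin counts can be computed with a few lines of Python. This is a minimal sketch: the bin width of 120 seconds and the sample times are arbitrary choices for illustration, not the race data or the settings of the tool used for fig.1.

```python
from collections import Counter


def histogram(times_sec, bin_width=120):
    """Count finishing times per bin; keys are the lower edge of each bin."""
    counts = Counter((t // bin_width) * bin_width for t in times_sec)
    return dict(sorted(counts.items()))


# Made-up finishing times in seconds, for illustration only.
sample = [1500, 1550, 1630, 1700, 1710, 2429, 2542]

for lower_edge, count in histogram(sample).items():
    # One '#' per runner gives a simple text version of the chart.
    print(f"{lower_edge:>5}-{lower_edge + 119:<5} {'#' * count}")
```

Empty bins are simply skipped in this version, so gaps in the field do not show up as empty rows.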
And to create the boxplot (see fig.2), I used:
Note that the boxplot also plots an outlier (this
means that the upper whisker is not the slowest finishing time (2542 sec.),
but the second-slowest time (2429 sec.)).
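Why a whisker can stop short of the slowest time is easier to see in code. The sketch below uses the common convention that points more than 1.5 times the interquartile range beyond the quartiles are drawn as outliers. Whether the online tool uses exactly this rule and this quartile method is an assumption, and the sample times are made up.

```python
import statistics


def box_stats(values, k=1.5):
    """Quartiles, whisker ends and outliers using the k * IQR fence rule."""
    data = sorted(values)
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - k * iqr
    upper_fence = q3 + k * iqr
    # Whiskers end at the most extreme points that are still inside the fences.
    inside = [v for v in data if lower_fence <= v <= upper_fence]
    outliers = [v for v in data if v < lower_fence or v > upper_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "lower_whisker": inside[0],
        "upper_whisker": inside[-1],
        "outliers": outliers,
    }


# Made-up finishing times in seconds, for illustration only.
stats = box_stats([1500, 1600, 1650, 1700, 1750, 1800, 1850, 1900, 2600])
print(stats["upper_whisker"])  # 1900
print(stats["outliers"])       # [2600]
```

In this example the slowest time lies above the upper fence (1850 + 1.5 * 200 = 2150), so it is reported as an outlier and the whisker ends at the next slowest time.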
NB: I also found this stats tool:
https://plot.ly
with which you can create all kinds of charts, but the box-plot didn't show the outliers.
To conclude, an interesting read (article with histogram and analysis):
fig.1: Histogram finishing times category Men Vet.B
fig.2: Boxplot v1 (with outlier) of finishing times category Men Vet.B
fig.3: Boxplot v2 of finishing times category Men Vet.B
fig.4: Boxplot v2 (detail) of finishing times category Men Vet.B
Downloads:
#Mirror 1: Google Drive