Can we really predict injuries in team sports?

Dae-Jin Lee
Applied Statistics Research Line. BCAM-Basque Center for Applied Mathematics
ORCid: 0000-0002-8995-8535
Lore Zumeta-Olaskoaga
Applied Statistics Research Line. BCAM. Departamento de Matemáticas Universidad del País Vasco (UPV/EHU)
ORCid: 0000-0001-6141-1469


In the last decade, several works have proposed statistical and machine learning methods for the prediction of sports injuries. Sports medicine and sports science now include multidisciplinary profiles with expertise in data analysis, injury epidemiology and artificial intelligence. However, injuries are complex, multifactorial phenomena, and understanding the mechanisms that produce them requires expert knowledge. This paper illustrates, from a statistical perspective, the challenges that need to be addressed, from data collection and the analysis of athlete performance to scientific reflection on the questions of interest for knowledge-based decision making in sport.

Keywords: sports injury data, athlete’s performance, statistical modelling, survival analysis.

MSC Subject classifications: 62J02, 62J07, 62N01, 62M10.


Injuries are common in professional sports and can have significant physical, psychological and financial consequences for a team's performance, as well as a considerable impact on athletes' careers. Understanding injury risk factors and their interplay is therefore a key component of preventing future injuries in sport (Bahr and Krosshaug 2005; Finch 2006). During the last decade, thanks to the professionalisation of the specialists involved in sports teams and the use of new technologies (e.g. computer vision, thermal cameras, Global Positioning Systems, etc.), interest in the modelling and prediction of injuries in professional sports through machine learning and artificial intelligence algorithms has grown dramatically (see Fiscutean 2021; Jauhiainen et al. 2021; Ley et al. 2022 for a detailed review). Hence, sports medicine and sports science have become an interesting field of research for data scientists, statisticians and computer scientists, to the point that a new role, the sports biostatistician, with knowledge of statistics, epidemiology and sports medicine together with communication skills, is increasingly required in professional team sports (Casals and Finch 2017).

Here, we address some of the most important challenges facing sports science and medicine research, from our (probably biased) perspective and based on our recent experience in collaborations with a professional football team.

Some modelling challenges in sports injury

In this section, we focus on sports injury data modelling from the perspective of:

  1. Sports injury data. Descriptive analysis, injury incidence and burden, graphical representations and exploratory data analysis are all crucial for posing the right questions about sports injury epidemiology from a team sports perspective (e.g. is the team I support more affected by injuries than the others? Which types of injury were most frequent, and which most burdensome? How do injuries affect the performance of the team in terms of the final classification?).
  2. The analysis of training (internal and external) loads. Internal load represents an individual athlete’s response to training, and can be quantified by the intensity and duration of the physiological stress imposed on the athlete. The internal load is best interpreted together with the external load, which consists of what can be measured by GPS devices and accelerometers (e.g. distance in different speed zones, total distance covered, etc.).
  3. Self-report wellness. Self-report wellness questionnaires are a relatively simple and inexpensive means of determining an athlete’s training load and their subsequent responses to that training. In fact, this is the most common method for monitoring athlete fatigue and recovery. A substantial body of research confirms that wellness questionnaires can indicate changes in training load/stress in elite team sport athletes.
  4. Modelling injury risk. The modelling approach may differ depending on the question of interest (either epidemiological or concerning an individual athlete’s performance or conditioning). We consider a time-to-event analysis approach, which is a useful statistical tool for analyzing the influence of changing exposures on injury risk. Time-to-event modelling allows changes in training load to be included as a time-varying exposure for sports injury, and allows recurrent events to be modelled.

Many other aspects of sports injury that are of great interest in the field are not covered by this classification.

Sports injury data analysis: the R package injurytools

The R package injurytools (Zumeta-Olaskoaga and Lee 2022) facilitates the data analysis workflow by providing convenience functions and handy tools for sports injury data¹. To illustrate some capabilities of the package, it includes injury data from top European teams in four leagues: La Liga (Spain), Bundesliga (Germany), Premier League (England) and Serie A (Italy). The package includes several functions that can be classified into (sports injury) data preparation, descriptive analyses and data visualisation routines.

The aim of the package is: 1) to provide a consistent way and general routines to analyse sports injury data, in R, including functions to perform informative visualisations and functions to facilitate the estimation of injury summary statistics, following the standards established in the consensus statement on injuries; 2) to help automate the descriptive reports that are routinely performed for sports injury surveillance. The statistical modelling of sports injuries is for the moment beyond the scope of injurytools, but the data structures are suitable for further analyses with other R packages and methods.

To illustrate some examples, we consider data scraped from the German webpage Transfermarkt. Figure 1 shows a descriptive visualization of the injuries of the Liverpool FC men's team during the 2017-2018 and 2018-2019 seasons. The horizontal axis represents the time line and the vertical axis the Liverpool FC players. For each player, the black line represents the time the player was enrolled in the team, with the symbols \(\times\) and \({\circ}\) denoting, respectively, the date of the injury and the date of recovery, when the player became available to train and play matches again.

Figure 1: Representation of Liverpool FC injuries in seasons 2017/18 and 2018/19.

The extent of the sports injury problem is often described by injury incidence and by indicators of the severity of sports injuries. Sports injury incidence should preferably be expressed as the number of sports injuries per unit of exposure time (e.g. per 1000 hours of sports participation, i.e. training sessions and matches) in order to facilitate the comparability of research results (Van Mechelen, Hlobil, and Kemper 2012).

Thus, when describing the distribution of injuries, it is necessary to relate it to the population at risk over a specified time period. This is why the fundamental unit of measurement is a rate. A rate consists of a numerator (the number of events) and a denominator (the exposure accumulated over a period of time). Denominator data can take different forms (e.g. number of minutes trained or played, number of matches played). As such, a rate reflects the speed at which new “injury-related” events occur. There are two important definitions to consider:

Injury incidence rate is the number of new injury cases (\(I\)) per unit of player-exposure time, i.e. \[I_r = \frac{I}{\Delta T}\qquad(1)\]

Injury burden rate is the number of days lost (\(n_d\)) per unit of player-exposure time, i.e. \[I_{br} = \frac{n_d}{\Delta T}\qquad(2)\] where \(\Delta T\) is the total time under risk of the study population.

Note that neither injury incidence (\(I_r\)) nor injury burden (\(I_{br}\)) is a proportion, and neither is interpreted as a probability; they are rates, and their unit is (person-time)\(^{-1}\) (e.g. per 1000 h of player-exposure, per player-season, etc.).

In Table 2, the exposure time unit is match minutes; hence injury incidence and injury burden are calculated per 100 player-matches of exposure (90 minutes times 100). Strictly, the total exposure time should also include training minutes. However, the Transfermarkt webpage does not collect training minutes per team or per player.
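As a minimal sketch of equations (1) and (2) with this choice of exposure unit, the following Python snippet computes both rates per 100 player-matches. The function name `injury_rates` and the squad totals are hypothetical and are not taken from injurytools or from the Transfermarkt data.

```python
def injury_rates(n_injuries, days_lost, exposure_minutes,
                 per_matches=100, match_minutes=90):
    """Return (incidence, burden) per `per_matches` player-matches of exposure."""
    if exposure_minutes <= 0:
        raise ValueError("exposure_minutes must be positive")
    # Express the total exposure in units of `per_matches` player-matches,
    # i.e. 90 minutes times 100 with the default arguments.
    exposure_units = exposure_minutes / (match_minutes * per_matches)
    incidence = n_injuries / exposure_units   # equation (1)
    burden = days_lost / exposure_units       # equation (2)
    return incidence, burden


# Hypothetical squad totals for one season (illustrative values only).
incidence, burden = injury_rates(n_injuries=30, days_lost=900,
                                 exposure_minutes=45000)
print(f"{incidence:.1f} injuries and {burden:.1f} days lost "
      "per 100 player-matches")
```

Both outputs are rates, so they can exceed 1 and should not be read as probabilities.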


Figure 2 shows the evolution of injury incidence and burden for four European teams from season 2008/09 to 2018/19. This plot is merely descriptive, and forecasting future seasons does not make sense for such a short time series. The incidence of all types of injury shows an increasing trend for Borussia Dortmund; for the rest of the teams the trend is not clear. Injury burden, however, shows no clear trend in any of the teams analyzed. Overall, the team most affected by injuries was Borussia Dortmund, and Liverpool was the team with the lowest injury incidence. In terms of the type of injury (classified in Transfermarkt as muscular, ligament, concussion, bone and unknown), muscle injuries were the most frequent in all seasons and teams. Ligament injuries were by far the most burdensome for Liverpool in 2015/16, Roma in 2016/17 and Borussia Dortmund in 2017/18 (results not shown).

Figure 2: Comparison of linear trends among four European teams (Barcelona, Borussia Dortmund, Liverpool and Roma). Incidence: number of injuries per unit of player-exposure time (frequency). Burden: number of days lost per unit of player-exposure time (severity and frequency).

Another way to visualize sports injury data is the so-called risk matrix of injuries, shown in Figure 3. For season 2017/18 it shows the relationship between the severity (consequence) and incidence (likelihood) of the most common injuries (Bahr, Clarsen, and Ekstrand 2018; Fuller 2018). The main advantages of risk matrices, and the reasons for their attractiveness, are the minimal inputs required, the ease of understanding the visual presentation, the transparent nature of the assessment standards and the simplicity with which the conclusions can be communicated to stakeholders. Injury burden, which reflects the days of training and matches lost, is most often used for risk evaluation, for ranking the importance of injury risk factors and for prioritising injury prevention plans.

Figure 3: Risk matrices for Barcelona, Borussia Dortmund, Liverpool and Roma for season 2016-2017.
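The link between the two axes of a risk matrix and injury burden follows directly from equations (1) and (2). Writing \(\bar{s} = n_d / I\) for the mean number of days lost per injury (a symbol introduced here only for illustration), we have \[I_{br} = \frac{n_d}{\Delta T} = \frac{I}{\Delta T}\cdot\frac{n_d}{I} = I_r\,\bar{s},\] so that injury types with the same burden lie on a curve \(\bar{s} = I_{br}/I_r\) in the incidence-severity plane. For example, under these definitions an injury type with 2 injuries per 100 player-matches and a mean severity of 30 days has the same burden (60 days lost per 100 player-matches) as one with 6 injuries per 100 player-matches and a mean severity of 10 days.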

Athletes’ performance: strength, conditioning and wellness

Strength and conditioning professionals aim to maximize athletic performance and reduce the associated injury risk. Therefore, understanding the relationships between different physical capacities and performance metrics, as well as the acute and long-term effects of distinct training interventions on athletic populations, is crucial for coaches and practitioners. We first define the internal and the external load.

The internal load

The Borg scale, also known as the rating of perceived exertion (RPE), is an instrument created to measure effort in training. As its name suggests, it measures the perception of effort, intensity and volume of physical activity, so it is a good option for assessing the level of demand of each workout. The session rating of perceived exertion (sRPE) proposed by Foster (1998) considers the overall effort of the training session, i.e. the product of the RPE and the total duration of the training or match session, which is also generally referred to as training load (TL). Two RPE scales are used in sports: (i) the CR-10 scale, where values range from 0 (no exertion at all) to 10 (maximal exertion), and (ii) the 6–20 scale, where values range from 6 (no exertion at all) to 20 (maximal exertion). The TL is widely used in sports as a simple index of the athlete's internal workload. Another important feature describing the internal workload is heart rate (HR). Although HR is an important objective index of internal load, heart rate monitoring is not a standardized procedure in team sports because the chest strap is uncomfortable in contact sports.
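The sRPE calculation can be sketched in a few lines of Python. The snippet assumes the CR-10 scale and durations in minutes; the function name `session_load` and the week of sessions are hypothetical, chosen only to show the arithmetic.

```python
def session_load(rpe, duration_min):
    """Session RPE training load: CR-10 rating times duration in minutes."""
    if not 0 <= rpe <= 10:
        raise ValueError("CR-10 RPE must lie between 0 and 10")
    if duration_min < 0:
        raise ValueError("duration must be non-negative")
    return rpe * duration_min


# Hypothetical week: (RPE, minutes) for each training or match session.
week = [(6, 75), (4, 60), (8, 95), (3, 45)]
loads = [session_load(rpe, minutes) for rpe, minutes in week]
weekly_load = sum(loads)
print(loads, weekly_load)
```

The resulting loads are expressed in arbitrary units, so they are mainly useful for comparisons within the same athlete over time.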

The external load

External workloads are defined as the training features that describe the effort performed during training or match sessions. Global Positioning System (GPS) devices commonly record such features. The use of GPS ‘wearable technology’ in high-performance sport is becoming increasingly popular (Cummins et al. 2013; Colby et al. 2014; Chambers et al. 2015). The types of variables collected from the devices are:

  • “Kinematic variables”. These measure the athlete’s overall movement during a training session, e.g. total distance and high-speed running distance (distance in meters covered above 5.5 \(m/s\));
  • “Metabolic variables”. These measure the energy expenditure of the athlete’s overall movement during a training session, e.g. high metabolic load distance (distance in meters covered by a player when the metabolic power is above 25.5 \(W/kg\));
  • “Mechanical variables”. These describe the athlete’s overall musculoskeletal load during a training session, e.g. explosive distance (distance in meters covered above 25.5 \(W/kg\) and below 19.8 \(km/h\)) and the number of accelerations and decelerations above 2 and 3 \(m/s^2\).

These features are the most widely used to evaluate external workloads and to predict the risk of injury (Rossi et al. 2018).
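To make the kinematic variables concrete, the sketch below derives total distance and high-speed running distance from a speed trace. It assumes speeds in m/s sampled at a fixed rate (1 Hz by default) and approximates distance as speed times the sampling interval; the function name `distance_summary` and the sample values are hypothetical, and commercial GPS software may compute these quantities differently.

```python
def distance_summary(speeds_ms, sample_hz=1.0, hsr_threshold=5.5):
    """Return (total distance, high-speed running distance) in meters.

    speeds_ms: speeds in m/s recorded at `sample_hz` samples per second.
    hsr_threshold: speed in m/s above which a sample counts as high-speed.
    """
    dt = 1.0 / sample_hz  # seconds represented by each sample
    total = sum(v * dt for v in speeds_ms)
    high_speed = sum(v * dt for v in speeds_ms if v > hsr_threshold)
    return total, high_speed


# Hypothetical six-second speed trace.
speeds = [2.0, 3.5, 6.0, 7.0, 5.5, 1.0]
total, hsr = distance_summary(speeds)
print(total, hsr)
```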

Self-reported wellness

Perceived wellness has been linked with both internal and external stressors, as well as with muscle damage biomarkers. Several questionnaires are used in sports to evaluate players’ well-being; the most general one consists of five items (fatigue, sleep quality, soreness, stress and mood), each rated on a 5-point Likert scale, where 1 and 5 indicate the highest and lowest values of wellness for each item; see Table 5 (McLean et al. 2010).

Wellness data are not standardized between individuals, and equivalent scores may not indicate equivalent levels of fatigue and/or wellness (Thornton et al. 2016). The data must be considered within the individual context of each player and, thus, it is necessary to use the relative change within each player when interpreting longitudinal trends among groups.
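One simple way to express change relative to each player is a within-player z-score, sketched below. This is only one possible choice of standardization and is not prescribed by the cited references; the function name `within_player_z` and the daily total scores are hypothetical.

```python
from statistics import mean, stdev


def within_player_z(scores):
    """Standardise one player's scores against that player's own mean and SD."""
    m = mean(scores)
    s = stdev(scores)  # sample SD; needs at least two observations
    if s == 0:
        return [0.0 for _ in scores]
    return [(x - m) / s for x in scores]


# Hypothetical daily total questionnaire scores for two players.
wellness = {
    "player_a": [18, 19, 17, 18, 13],
    "player_b": [12, 13, 11, 12, 13],
}
z = {player: within_player_z(scores) for player, scores in wellness.items()}
for player, values in z.items():
    print(player, [round(v, 2) for v in values])
```

In this illustration both players report the same raw score (13) on the last day, but it lies well below player A's usual level and slightly above player B's, which is the kind of difference a between-player comparison of raw scores would hide.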


These forms generally consist of 5 to 12 items rated on 1-to-5 or 1-to-10 point Likert scales, or are modifications of existing questionnaires that place greater emphasis on ratings of muscle soreness, physical fatigue and general wellness.

In the past decade, significant efforts have been made to understand injury risk in sport using subjective (e.g. rating of perceived exertion) and objective (e.g. accelerometers, gyroscopes and magnetometers) player monitoring strategies.

Modelling sports injury risks

Modelling sports injury data must account for the complex time-varying and recurrent nature of injuries: an athlete’s injury susceptibility may change over time and, moreover, an athlete can sustain more than one injury, with subsequent injuries often influenced by previous ones (Hägglund, Waldén, and Ekstrand 2006). Models for recurrent events are therefore appealing for sports injury prevention (Ullah, Gabbett, and Finch 2014; Rasmus Oestergaard Nielsen et al. 2019; R. O. Nielsen et al. 2019).

A non-exhaustive list of methods and algorithms in the literature is:

  • Generalized linear/additive models, regression trees and random forests.
  • Survival analysis and time-to-event data analysis.
  • Mixed-effects models (longitudinal modelling).
  • Multivariate time series for classification (injury/non-injury).
  • Variable selection and dimension reduction.

Figure 4 presents the Kaplan-Meier curves for the four European teams analyzed above, for the time to the first injury of the season (in minutes of match play until the first injury of season 2017/18). The Kaplan-Meier estimate measures the fraction of football players still available for training and matches after a given amount of exposure time. For recurrent events, a gap time approach can be considered (Ullah, Gabbett, and Finch 2014). Zumeta-Olaskoaga et al. (2021) use the gap time approach to predict sports injuries with regularized Cox regression models with frailty, including covariates from functional screening tests and anthropometric measurements of female players during one regular season. A major challenge in sports injury data is usually the small sample size and the small number of injuries.

Figure 4: Comparison of Kaplan-Meier curves for four European teams (Barcelona, Borussia Dortmund, Liverpool and Roma).
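For readers unfamiliar with the Kaplan-Meier estimator, the following Python sketch computes the product-limit estimate of the probability of remaining injury-free. It is a bare-bones illustration without confidence intervals, not the code used to produce the figure; the function name `kaplan_meier` and the exposure data are hypothetical.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the injury-free probability.

    times: exposure (e.g. match minutes) until first injury or censoring.
    events: 1 if the injury was observed, 0 if the player was censored.
    Returns a list of (time, survival) pairs at each distinct injury time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        injuries = 0  # injuries observed exactly at time t
        leaving = 0   # players leaving the risk set at time t
        while i < len(data) and data[i][0] == t:
            injuries += data[i][1]
            leaving += 1
            i += 1
        if injuries > 0:
            survival *= 1 - injuries / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Hypothetical match minutes until first injury (1) or end of follow-up (0).
minutes = [90, 180, 180, 270, 450, 600]
injured = [1, 1, 0, 1, 0, 0]
curve = kaplan_meier(minutes, injured)
for t, s in curve:
    print(t, round(s, 3))
```

Players censored at the same time as an injury are kept in the risk set for that time point, which is the usual convention.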

When internal and external loads are considered in the analysis, the most commonly used measure is the acute:chronic workload ratio (ACWR), which combines an athlete’s ‘fatigue’ (acute workload) and ‘fitness’ (chronic workload) and can be calculated using very basic time series methods such as the rolling average (RA) model or the exponentially weighted moving average (EWMA) model. The value of the ACWR can assist fitness coaches in understanding the readiness of an athlete and the relative injury risk from day to day and, with carefully planned intervention, can therefore help to prevent injury. This ratio is usually used as a flag for injury risk.

The acute workload is typically the workload performed by an athlete in 1 week (7 days). This value contains both training- and match-load information over this 7-day period. The acute workload represents the ‘fatigue’ aspect of the ACWR.

The chronic workload is typically the 4-week (28 day) average acute workload. This value is important as it provides a clear indication of what an athlete has done leading up to the present training or match day. Therefore, it is commonly viewed as an indication of an athlete’s ‘fitness’.

Several studies have suggested that large increases in acute workload relative to the chronic workload (i.e. the average training workload of the previous month) are associated with an increased injury risk (Hulin et al. 2014). In particular, they showed that players with a high ratio between acute and chronic workload are more likely to become injured than those with a lower ratio. Traditional calculations of the ACWR are ‘mathematically coupled’, as the most recent week is included in the estimates of both the acute and chronic workloads. The uncoupled version instead uses a chronic load that does not include the acute load.

The R package ACWR (Fernandez-Santos 2022) allows the ACWR to be computed using three different methods: exponentially weighted moving average (EWMA), rolling average coupled (RAC) and rolling average uncoupled (RAU); see Williams et al. (2017) and Windt and Gabbett (2019).
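The three calculations can be sketched in plain Python as follows. This is an illustration under stated assumptions rather than the implementation of the ACWR package: the uncoupled chronic window is taken to be the 28 days immediately preceding the acute week, the EWMA uses the decay \(\lambda = 2/(N+1)\) and is initialised at the first observed load, and all function names and the load series are hypothetical.

```python
def rolling_mean(x, end, window):
    """Mean of the `window` values of x ending at index `end` (inclusive)."""
    return sum(x[end - window + 1:end + 1]) / window


def acwr_rolling(daily_load, acute=7, chronic=28, coupled=True):
    """Rolling-average ACWR for the last day of `daily_load`."""
    needed = chronic if coupled else chronic + acute
    if len(daily_load) < needed:
        raise ValueError("not enough days of data")
    last = len(daily_load) - 1
    acute_avg = rolling_mean(daily_load, last, acute)
    # Coupled: the chronic window contains the acute week.
    # Uncoupled (assumption): the chronic window ends just before it.
    chronic_end = last if coupled else last - acute
    chronic_avg = rolling_mean(daily_load, chronic_end, chronic)
    if chronic_avg == 0:
        raise ValueError("chronic workload is zero")
    return acute_avg / chronic_avg


def ewma(daily_load, span):
    """Exponentially weighted moving average with decay 2 / (span + 1)."""
    lam = 2 / (span + 1)
    value = daily_load[0]  # assumption: start at the first observed load
    for load in daily_load[1:]:
        value = lam * load + (1 - lam) * value
    return value


def acwr_ewma(daily_load, acute=7, chronic=28):
    """EWMA-based ACWR for the last day of `daily_load`."""
    return ewma(daily_load, acute) / ewma(daily_load, chronic)


# Hypothetical sRPE loads: four steady weeks followed by a harder week.
loads = [400] * 28 + [600] * 7
coupled = acwr_rolling(loads, coupled=True)
uncoupled = acwr_rolling(loads, coupled=False)
weighted = acwr_ewma(loads)
print(round(coupled, 3), round(uncoupled, 3), round(weighted, 3))
```

With this series the coupled ratio is smaller than the uncoupled one, because the hard week also inflates the coupled chronic load.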

Figure 5 illustrates the daily training load (sRPE) of an athlete through a regular season. The vertical lines represent the sRPE per type of session (match or training) and the grey shaded areas are the time periods during which the athlete was injured. Similar plots can be obtained for other external and internal loads (e.g. kinematic, metabolic and mechanical variables, training loads and wellness tests), and a multivariate approach can be used to forecast injuries in soccer and to evaluate and interpret the complex relations between injury risk and training performance (Rossi et al. 2018, 2022).

Figure 5: Simulated daily training loads (sRPE) of an athlete over a season. There are training and match sessions, and the workload measure shown is the ACWR (coupled version). Additionally, the acute and chronic workloads are shown. Grey shaded areas show the days the athlete was injured.

So can we really predict injuries in team sports?

In the last decade, the number of studies applying machine learning algorithms to sports, e.g. for injury forecasting and athlete performance prediction, has rapidly increased. Still, a world in which every sports injury is prevented before it happens is not attainable: sports injuries occur and will continue to occur. It is, however, possible to assess an athlete's level of injury risk in relation to physical activity. From lifestyle to biological constitution or genetic characteristics, many factors influence an athlete’s level of injury risk. In this paper, we have presented some challenges in team sports injury risk modelling, from the type of data collected to the concepts of performance, internal and external training loads and self-report wellness questionnaires. The leading approaches in machine learning are notoriously data-hungry, and unfortunately large injury data sets are not available in team sports because acquiring the data is expensive or time-consuming.

However, the most important aspect of sports injury data modelling comes from the perspective of sports science and medical staff. It is important to use evidence-based knowledge effectively to develop decision-making processes that reduce injury risk and optimize athlete performance (Drew, Raysmith, and Charlton 2017; Meyer 2017; Nassis 2017). From our perspective as statisticians, statistical modelling plays an important role in bridging the gap in understanding and quantifying the risk of team sports injuries, where awareness of relevant concepts such as causality, association and complexity is crucial, rather than the prediction of an individual athlete's injury itself (Meeuwisse 1994; Ruddy et al. 2019; Fonseca et al. 2020). An evidence-based injury risk assessment can help prevent future injuries and increase an athlete's potential for better performance.

Unfortunately, we can never predict injuries with complete certainty, because we cannot predict the future. However, injury risk can be estimated, which can in part help to anticipate or even prevent sports injuries.

Based on scientific research in biomechanics, kinesiology and ergonomics, the sports and medical communities have identified certain risk factors that can lead to sports injuries. Of course, having any or all of these risk factors does not necessarily mean that an athlete will get injured. However, knowing that an athlete is at risk will help prevent many types of sports injuries in the future.


This research was funded by project PID2020-115882RB-I00 (acronym “S3M1P4R”) of the Agencia Estatal de Investigación, by the Basque Government (BERC 2022-2025 program) and by the Spanish Ministry of Science, Innovation, and Universities (BCAM Severo Ochoa accreditation SEV-2017-0718). This project has also been funded by the Provincial Council of Bizkaia within the Technology Transfer Programme 2022 and is co-financed by the European Regional Development Fund (ERDF) through the project “MATH4SPORTS – Modelización matemática para la industria deportiva: salud y rendimiento.” Provincial Council of Bizkaia 6/12/TT/2022/00006 (BFA/DFB).


Bahr, R., B. Clarsen, and J. Ekstrand. 2018. “Why We Should Focus on the Burden of Injuries and Illnesses, Not Just Their Incidence.” Br J Sports Med 52 (August): 1018–21.
Bahr, R., and T. Krosshaug. 2005. “Understanding Injury Mechanisms: A Key Component of Preventing Injuries in Sport.” Br J Sports Med 39 (June): 324–29.
Casals, M., and C. F. Finch. 2017. “Sports Biostatistician: A Critical Member of All Sports Science and Medicine Teams for Injury Prevention.” Injury Prevention 23 (December): 423–27.
Chambers, R., T. J. Gabbett, M. H. Cole, and A. Beard. 2015. “The Use of Wearable Microsensors to Quantify Sport-Specific Movements.” Sports Medicine 45 (July): 1065–81.
Colby, M. J., B. Dawson, J. Heasman, B. Rogalski, and T. J. Gabbett. 2014. “Accelerometer and GPS-Derived Running Loads and Injury Risk in Elite Australian Footballers.” Journal of Strength and Conditioning Research 28: 2244–52.
Cummins, C., R. Orr, H. O’Connor, and C. West. 2013. “Global Positioning Systems (GPS) and Microtechnology Sensors in Team Sports: A Systematic Review.” Sports Medicine 43 (October): 1025–42.
Drew, M. K., B. P. Raysmith, and P. C. Charlton. 2017. “Injuries Impair the Chance of Successful Performance by Sportspeople: A Systematic Review.” British Journal of Sports Medicine 51 (August): 1209–14.
Fernandez-Santos, J. R. 2022. ACWR: Acute Chronic Workload Ratio Calculation.
Finch, C. F. 2006. “A New Framework for Research Leading to Sports Injury Prevention.” Journal of Science and Medicine in Sport 9: 3–9.
Fiscutean, A. 2021. “Data Scientists Are Predicting Sports Injuries with an Algorithm.” Nature 592 (April).
Fonseca, S. T., T. R. Souza, E. Verhagen, R. van Emmerik, N. F. N. Bittencourt, L. D. M. Mendonça, A. G. P. Andrade, R. A. Resende, and J. M. Ocarino. 2020. “Sports Injury Forecasting and Complexity: A Synergetic Approach.” Sports Medicine.
Foster, C. 1998. “Monitoring Training in Athletes with Reference to Overtraining Syndrome.” Med Sci Sports Exerc. 30 (July): 1164–68.
Fuller, C. W. 2018. “Injury Risk (Burden), Risk Matrices and Risk Contours in Team Sports: A Review of Principles, Practices and Problems.” Sports Medicine 48 (July): 1597–1606.
Hägglund, M., M. Waldén, and J. Ekstrand. 2006. “Previous Injury as a Risk Factor for Injury in Elite Football: A Prospective Study over Two Consecutive Seasons.” British Journal of Sports Medicine 40 (September): 767–72.
Hulin, B. T., T. J. Gabbett, P. Blanch, P. Chapman, D. Bailey, and J. W. Orchard. 2014. “Spikes in Acute Workload Are Associated with Increased Injury Risk in Elite Cricket Fast Bowlers.” British Journal of Sports Medicine 48: 708–12.
Jauhiainen, S., J. P. Kauppi, M. Leppänen, K. Pasanen, J. Parkkari, T. Vasankari, P. Kannus, and S. Ayramo. 2021. “New Machine Learning Approach for Detection of Injury Risk Factors in Young Team Sport Athletes.” International Journal of Sports Medicine 42 (February): 175–82.
Ley, C., R. K. Martin, A. Pareek, A. Groll, R. Seil, and T. Tischer. 2022. “Machine Learning and Conventional Statistics: Making Sense of the Differences.” Knee Surgery, Sports Traumatology, Arthroscopy 30: 753–57.
McLean, B. D., A. J. Coutts, V. Kelly, M. R. McGuigan, and S. J. Cormack. 2010. “Neuromuscular, Endocrine, and Perceptual Fatigue Responses During Different Length Between-Match Microcycles in Professional Rugby League Players.” International Journal of Sports Physiology and Performance 5 (September): 367–83.
Meeuwisse, W. H. 1994. “Assessing Causation in Sport Injury: A Multifactorial Model.” Clinical Journal of Sport Medicine 4: 166–70.
Meyer, T. 2017. “How Much Scientific Diagnostics for High-Performance Football?” Science and Medicine in Football 1 (May): 95.
Nassis, G. P. 2017. “Leadership in Science and Medicine: Can You See the Gap?” Science and Medicine in Football 1 (September): 195–96.
Nielsen, R. O., M. L. Bertelsen, D. Ramskov, M. Møller, A. Hulme, D. Theisen, C. F. Finch, L. V. Fortington, M. A. Mansournia, and E. T. Parner. 2019. “Time-to-Event Analysis for Sports Injury Research Part 2: Time-Varying Outcomes.” British Journal of Sports Medicine 53 (January): 70–78.
Nielsen, Rasmus Oestergaard, Michael Lejbach Bertelsen, Daniel Ramskov, Merete Møller, Adam Hulme, Daniel Theisen, Caroline F. Finch, Lauren Victoria Fortington, Mohammad Ali Mansournia, and Erik Thorlund Parner. 2019. “Time-to-Event Analysis for Sports Injury Research Part 1: Time-Varying Exposures.” British Journal of Sports Medicine 53 (January): 61–68.
Rossi, A., L. Pappalardo, P. Cintia, F. M. Iaia, J. Fernàndez, and D. Medina. 2018. “Effective Injury Forecasting in Soccer with GPS Training Data and Machine Learning.” Edited by Jaime Sampaio. PLOS ONE 13 (July): e0201264.
Rossi, A., E. Perri, L. Pappalardo, P. Cintia, G. Alberti, D. Norman, and F. M. Iaia. 2022. “Wellness Forecasting by External and Internal Workloads in Elite Soccer Players: A Machine Learning Approach.” Frontiers in Physiology 13 (June).
Ruddy, J. D., S. J. Cormack, R. Whiteley, M. D. Williams, R. G. Timmins, and D. A. Opar. 2019. “Modeling the Risk of Team Sport Injuries: A Narrative Review of Different Statistical Approaches.” Frontiers in Physiology 10 (July): 829.
Thornton, H. R., J. A. Delaney, G. M. Duthie, B. R. Scott, W. J. Chivers, C. E. Sanctuary, and B. J. Dascombe. 2016. “Predicting Self-Reported Illness for Professional Team-Sport Athletes.” International Journal of Sports Physiology and Performance 11 (May): 543–50.
Ullah, S., T. J. Gabbett, and C. F. Finch. 2014. “Statistical Modelling for Recurrent Events: An Application to Sports Injuries.” British Journal of Sports Medicine 48: 1287–93.
Williams, Sean, Stephen West, Matthew J Cross, and Keith A Stokes. 2017. “Better Way to Determine the Acute:chronic Workload Ratio?” British Journal of Sports Medicine 51: 209–10.
Windt, J., and T. J. Gabbett. 2019. “Is It All for Naught? What Does Mathematical Coupling Mean for Acute:chronic Workload Ratios?” British Journal of Sports Medicine 53 (August): 988–90.
Van Mechelen, W., H. Hlobil, and H. C. G. Kemper. 2012. “Incidence, Severity, Aetiology and Prevention of Sports Injuries.” Sports Medicine 14 (2): 82–99. First published 1992.
Zumeta-Olaskoaga, L., and D.-J. Lee. 2022. Injurytools: A Toolkit for Sports Injury Data Analysis.
Zumeta-Olaskoaga, Lore, Maximilian Weigert, Jon Larruskain, Eder Bikandi, Igor Setuain, Josean Lekue, Helmut Küchenhoff, and Dae Jin Lee. 2021. “Prediction of Sports Injuries in Football: A Recurrent Time-to-Event Approach Using Regularized Cox Models.” AStA Advances in Statistical Analysis, November, 1–26.

  1. The injurytools package is under construction and can be accessed at

