Injuries, which are multifactorial and the most complex issue reported by practitioners, have a major negative impact on the performance and finances of professional football teams.
As a starting point for decision making to reduce injury risk, teams should find an effective way to store and share important information between staff. For the analysis of these data to succeed, critical actions must be carried out both before it (finding the right question, extracting the right data) and after it (turning digestible insights and reports into specific actions), ultimately providing a better understanding of each player's injury risk.
There are three levels of analytical approach: (1) Univariate approach. When a single variable (e.g. PlayerLoad) is related to a single outcome (injury), the findings can be questionable because additional information is needed to interpret them. (2) Interaction between two variables. Merging information from two different sources (e.g. high-speed running and perceived recovery) yields further insights. (3) Multivariate approach. The most advanced level of modelling for injury risk assessment combines more data (GPS, strength, nutrition, etc.) and uses machine learning approaches to evaluate players in a more comprehensive and multifactorial way.
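The three levels described above can be sketched in a few lines of Python. This is only an illustrative sketch: the player records, thresholds, field names, and weights are all invented for the example, and the fixed logistic score merely stands in for a model that would in practice be fitted to a team's own data.

```python
import math

# Hypothetical daily records for two players; every value is invented.
players = [
    {"name": "A", "player_load": 620.0, "hsr_m": 950.0,
     "recovery": 4.0, "strength_deficit": 0.15},
    {"name": "B", "player_load": 640.0, "hsr_m": 400.0,
     "recovery": 8.0, "strength_deficit": 0.02},
]


def univariate_flag(p, load_threshold=600.0):
    """Level 1: one variable against one cut-off (threshold is an assumption)."""
    return p["player_load"] > load_threshold


def interaction_flag(p, hsr_threshold=800.0, recovery_threshold=5.0):
    """Level 2: flag only when high-speed running is high AND recovery is low."""
    return p["hsr_m"] > hsr_threshold and p["recovery"] < recovery_threshold


# Level 3: placeholder weights; a real model would learn these from data.
WEIGHTS = {"player_load": 0.002, "hsr_m": 0.001,
           "recovery": -0.3, "strength_deficit": 5.0}
INTERCEPT = -1.5


def multivariate_score(p):
    """Combine several sources into one score between 0 and 1 (logistic form)."""
    z = INTERCEPT + sum(w * p[k] for k, w in WEIGHTS.items())
    return 1.0 / (1.0 + math.exp(-z))


for p in players:
    print(p["name"], univariate_flag(p), interaction_flag(p),
          round(multivariate_score(p), 2))
```

With these invented numbers, the single-variable rule flags both players, while the two-variable rule and the combined score separate them, which is the kind of additional information the higher levels are meant to supply.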
Multivariate approaches identify players at high risk of injury better than univariate models, yielding more true positives and fewer false positives. In this way, practitioners can obtain valuable information on each player's risk on a daily basis. To conclude, it is not only about creating the right model (which tells us about the association of factors with injury, never about prediction), but also about reasoning through, cleaning, and organizing the data beforehand, which ultimately provides better information for understanding which changes would be most impactful in reducing the risk of injury.
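One way to make the comparison of true and false positives concrete is to count them for each set of risk flags against the recorded injuries. The sketch below assumes boolean flags and outcomes per player; the toy lists are invented and deliberately constructed to show the counting, so they are not evidence for either approach.

```python
def confusion_counts(flags, injured):
    """Return (true positives, false positives, false negatives, true negatives)."""
    pairs = list(zip(flags, injured))
    tp = sum(1 for f, y in pairs if f and y)
    fp = sum(1 for f, y in pairs if f and not y)
    fn = sum(1 for f, y in pairs if not f and y)
    tn = sum(1 for f, y in pairs if not f and not y)
    return tp, fp, fn, tn


def precision(tp, fp):
    """Share of flagged players who were actually injured."""
    return tp / (tp + fp) if (tp + fp) else 0.0


# Invented outcomes and flags for eight hypothetical players.
injured = [True, True, False, False, False, False, True, False]
univariate_flags = [True, False, True, True, False, False, False, True]
multivariate_flags = [True, True, False, True, False, False, True, False]

for label, flags in [("univariate", univariate_flags),
                     ("multivariate", multivariate_flags)]:
    tp, fp, fn, tn = confusion_counts(flags, injured)
    print(label, "TP:", tp, "FP:", fp, "FN:", fn, "TN:", tn,
          "precision:", precision(tp, fp))
```

Applied to a team's real flags and injury records, the same counts give a like-for-like basis for comparing a univariate rule with a multivariate model.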
Key words: injury risk, machine learning, multivariate approach, analytics.