We present a benchmark dataset for evaluating physical human activity recognition methods from wrist-worn sensors, for the specific setting of basketball training, drills, and games. Basketball activities lend themselves well to measurement by wrist-worn inertial sensors, and systems that are able to detect such sport-relevant activities could be used in applications such as game analysis, guided training, and personal physical activity tracking. The dataset was recorded for two teams from separate countries (USA and Germany) with a total of 24 players who wore an inertial sensor on their wrist, during both repetitive basketball training sessions and full games. Particular features of this dataset include an inherent variance through cultural differences in game rules and styles, as the data was recorded in two countries, as well as different sport skill levels, since the participants were heterogeneous in terms of prior basketball experience. We illustrate the dataset's features in several time-series analyses and report on a baseline classification performance study with two state-of-the-art deep learning architectures.
The study protocol is divided into two parts. The first part is designed to collect controlled data by having participants complete a sequence of predefined activities for a defined period of time. Although this part is controlled, it also simulates real-world basketball drills in practice sessions, where players repeatedly practice a certain activity (e.g., layups, shooting, dribbling, running). The second part is a basketball game between two teams, each with five players on the court and extra players rotated into the game. Video cameras were set up along the sidelines of the court to record each participant’s activities for the labeling process.
Our study design used 24 subjects with 13 subjects living in Germany and 11 subjects living in the United States of America. In each study, the players simultaneously performed the drills and game while the entire basketball court was monitored using two wide-angle cameras. After the study, the camera footage was used for detailed annotation of all activity-relevant data.
Hang-Time HAR study protocol as executed at both locations. The German recording is ~110 min and the American recording ~76 min long.
Meta information as given through the study questionnaire by all participants, 13 from Germany, Europe (eu) and 11 from USA, North America (na). A total of 3 participants were female and 21 were male. The players were between 18 and 39 years old. Through self-assessment, in which participants were asked to evaluate their experience in basketball, 8 players responded with novice and 16 with expert. Two people were left-handed. Additional details about the anthropometry of our participants are excluded due to restrictions imposed by the Ethical Council of our university.
Preprocessing: We decided to keep the preprocessing of the raw smartwatch data to a minimum, as the samples were already provided with a timestamp and in units of g. The timestamps of the smartwatch’s accelerometer samples contained slight (<2%) deviations, so we resampled the time series to ensure that all data have exact, equidistant 50 Hz timestamps. Other common methods of preprocessing inertial data for activity recognition, such as rescaling or normalization, were not applied.
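The resampling step can be illustrated with a short sketch. The text does not state which resampling method was used, so the linear interpolation below is an assumption, as are the function name resample_to_50hz and the input layout (timestamps in seconds, one row per sample with x, y, z acceleration in g).

```python
import numpy as np

def resample_to_50hz(t, xyz, rate_hz=50.0):
    """Interpolate irregularly timed samples onto an equidistant grid.

    t   : 1-D array of strictly increasing timestamps in seconds (assumed layout)
    xyz : array of shape (n, 3) with acceleration in g
    Linear interpolation is an assumption, not the documented method.
    """
    t = np.asarray(t, dtype=float)
    xyz = np.asarray(xyz, dtype=float)
    step = 1.0 / rate_hz
    # Number of grid points that fit between the first and last timestamp
    n = int(np.floor((t[-1] - t[0]) / step)) + 1
    grid = t[0] + step * np.arange(n)
    # Interpolate each axis independently onto the new grid
    out = np.column_stack(
        [np.interp(grid, t, xyz[:, k]) for k in range(xyz.shape[1])]
    )
    return grid, out
```

Because the data is already distributed with equidistant timestamps, this step is only needed when applying the same procedure to newly recorded raw data.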
The dataset is saved in CSV format, with each player having an individual file. It can be easily loaded using the read_csv function from the pandas library, which is commonly used for data manipulation and analysis in Python.
Once the dataset is loaded, the labels are stored in four different columns: coarse, locomotion,
basketball, and in/out.
The coarse column separates the samples into different sessions, including warm-up, drills
(sitting, standing, walking, running, dribbling, penalty shots, two-point shots, and three-point
shots), game, and in/out.
The "game" label indicates when a game was played. The German study comprises two game sessions,
each lasting approximately 10 minutes, while the study conducted in the USA consists of one
session lasting approximately 22 minutes.
The basketball and locomotion tiers contain labels corresponding to the classes listed in the table below, as well as the "not_labeled" label. The "not_labeled" label is assigned when the specific activity of a player could not be observed in the ground-truth video, or when the samples fall between sessions. The in/out tier is only relevant during the game session and indicates whether a player is on the court or not.
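As a minimal sketch of working with the four label columns, the example below loads a small in-memory stand-in for one player's file and filters it by label. The sensor column names (timestamp, acc_x, acc_y, acc_z), the exact label spellings, and the sample values are illustrative assumptions and may differ from the published files.

```python
import io
import pandas as pd

# Stand-in for one player's CSV file. Column names for the sensor data,
# label spellings, and values are assumptions made for illustration only.
csv_text = """timestamp,acc_x,acc_y,acc_z,coarse,locomotion,basketball,in/out
0.00,0.01,-0.98,0.10,game,running,dribbling,in
0.02,0.02,-0.97,0.11,game,running,not_labeled,in
0.04,0.00,-1.00,0.09,game,sitting,not_labeled,out
0.06,0.03,-0.99,0.12,not_labeled,not_labeled,not_labeled,not_labeled
"""

# With a real file, pass its path to pd.read_csv instead of the StringIO object
df = pd.read_csv(io.StringIO(csv_text))

label_cols = ["coarse", "locomotion", "basketball", "in/out"]

# Drop samples whose locomotion activity could not be observed
labeled = df[df["locomotion"] != "not_labeled"]

# Keep only game samples recorded while the player was on the court
on_court = df[(df["coarse"] == "game") & (df["in/out"] == "in")]
```

Filtering per tier, as done here, keeps a sample that is unlabeled in one tier but labeled in another.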
Combining Classes: The layers provided in our dataset make it possible to extend it with additional and more challenging classes. For example, shots can be split into penalty_shots, two_point_shots, and three_point_shots by taking the coarse layer into account. The locomotion layer indicates whether dribbling was performed while the player was standing, walking, or running. Therefore, the class definitions in the following table only contain the basic classes and can be extended individually, depending on the requirements of one’s project.
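One possible way to derive such combined classes is sketched below on a toy data frame. The label spellings in the coarse, locomotion, and basketball columns, as well as the naming scheme of the new classes (for example dribbling_running), are assumptions chosen for illustration.

```python
import pandas as pd

# Toy frame with assumed label spellings; real files may spell them differently
df = pd.DataFrame({
    "coarse": ["penalty_shots", "two_point_shots", "game", "dribbling", "dribbling"],
    "locomotion": ["standing", "standing", "running", "walking", "running"],
    "basketball": ["shot", "shot", "shot", "dribbling", "dribbling"],
})

shot_drills = ["penalty_shots", "two_point_shots", "three_point_shots"]
gaits = ["standing", "walking", "running"]

# Start from the basic basketball class
combined = df["basketball"].copy()

# Shots recorded during a shooting drill inherit the drill name from the coarse layer
is_drill_shot = (df["basketball"] == "shot") & df["coarse"].isin(shot_drills)
combined = combined.mask(is_drill_shot, df["coarse"])

# Dribbling is refined by the simultaneous locomotion label
is_dribble = (df["basketball"] == "dribbling") & df["locomotion"].isin(gaits)
combined = combined.mask(is_dribble, "dribbling_" + df["locomotion"])

df["combined"] = combined
```

Shots taken during the game keep the basic shot label in this sketch, since the coarse layer does not distinguish shot types there.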
Detailed class description for every class included in the dataset. The dataset is multi-tier labeled with 4 different layers (I) coarse, (II) locomotion, (III) basketball, and (IV) in/out. The coarse layer is not listed, since it is meant to indicate to which session an activity belongs. Relevant classes are classes 2–13. However, the classes in and out were not used in our validation.
Exemplar time-series data for the included activities. The examples shown for the periodic activities sitting, standing, walking, running, and dribbling contain 1200 samples (approx. 24 s). In order to better represent the complex activities shot and layup as well as the micro-activities pass and rebound. Jumps are marked in classes where the activity occurs. Such short periods were summarized in the activity jumping.
Class distribution of the Hang-Time HAR dataset. Total number of samples per class are: sitting: 383,622 (~2.1 h), standing: 368,189 (~2.0 h), walking: 1,885,644 (~10.5 h), running: 1,100,942 (~6.1 h), jumping: 96,857 (~0.53 h), dribbling: 878,514 (~4.8 h), shot: 149,040 (~0.82 h), layup: 62,393 (~0.34 h), pass: 86,291 (~0.47 h), and rebound: 18,886 (~0.10 h). In total: 5,030,378 labeled samples or ~27.7 h of data.
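The durations in the caption follow from the sample counts and the 50 Hz sampling rate. The short sketch below reproduces that conversion; the variable names are illustrative.

```python
SAMPLING_RATE_HZ = 50  # sampling rate after resampling, as stated in the preprocessing note

# Sample counts per class, copied from the class distribution caption
counts = {
    "sitting": 383_622,
    "standing": 368_189,
    "walking": 1_885_644,
    "running": 1_100_942,
    "jumping": 96_857,
    "dribbling": 878_514,
    "shot": 149_040,
    "layup": 62_393,
    "pass": 86_291,
    "rebound": 18_886,
}

# samples / (samples per second) / (seconds per hour) = hours
hours = {name: n / SAMPLING_RATE_HZ / 3600 for name, n in counts.items()}
total_samples = sum(counts.values())

for name, h in hours.items():
    print(f"{name:>10}: {counts[name]:>9,} samples, {h:.2f} h")
print(f"total labeled samples: {total_samples:,}")
```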
Void Class: We originally included a void class for miscellaneous movements outside of the primary labeled ones, such as drinking from a water bottle or tying shoes. These were mostly performed during rest breaks. The samples annotated as void formed a negligibly small class that our classifier could not recognize, because these movements are most often performed in conjunction with one of the locomotion classes. We ultimately decided against including this void class, since it was very rare that players were not performing one of the 10 classes of locomotion or basketball activities. However, data that are not annotated as one of the aforementioned classes are categorized as not_labeled. This class can be seen as a very noisy but realistic void class that can be used by researchers who focus on deeper insights into the NULL-class problem.
We would like to thank the basketball players from the teams TuS Fellinghausen from Kreuztal, Germany, and the University of Colorado Boulder students for participating in our study.
@article{hoelzemannHangtimeHARBenchmark2023,
title = {Hang-time HAR: A benchmark dataset for basketball activity recognition using wrist-worn inertial sensors},
volume = {23},
url = {https://doi.org/10.3390/s23135879},
number = {13},
journal = {Sensors},
author = {Hoelzemann, Alexander and Romero, Julia L. and Bock, Marius and Van Laerhoven, Kristof and Lv, Qin},
year = {2023},
}