TY - JOUR
T1 - A Comparative Analysis of Deep Learning Models and Gradient Computation for Rally Detection in Badminton Videos
AU - See, Shin Yue
AU - Paramesran, Raveendran
AU - Hassan, Mohd Fikree
AU - Krishnasamy, Ganesh
N1 - Publisher Copyright: © The Author(s), under exclusive licence to Springer Nature Singapore Pte Ltd. 2025.
PY - 2025/5/5
Y1 - 2025/5/5
AB - As in many sports, badminton videos are commonly used by coaches and players to analyze performance and gain insight into each point played in a match. Typically, these videos comprise a collection of points won and lost by the players, and manually annotating the start of service and the end of each rally is time-consuming. To detect the start of service, we used two different models, Bidirectional Long Short-Term Memory (BiLSTM) and a Convolutional Neural Network (CNN), while gradient computation determined the end of the rally. A comparative analysis of the two models’ classification performance was conducted to identify the frame interval separation that gave the best classification accuracy. A total of 12 frames, with 26 features from each frame comprising the distance, angle, gradient and movements typical of serving and receiving actions, were arranged in a 2D format and used as inputs to the models. Two datasets are used in this study. The first dataset consists of 14,688 video frames, evenly divided between serving and non-serving poses, from 30 players, with 80% used for training and 20% for testing. The second dataset consists of 4,500 video frames from 10 players that were not used in training the models; it is used to evaluate the robustness of the models. The experimental results showed the highest classification accuracy of approximately 95% for the BiLSTM model at a frame interval of 4, while the CNN achieved similar accuracy at a frame interval of 5. For the detection of the end of a rally, at least 80% of consecutive frames must exhibit no motion of the shuttlecock to signify the end of the rally. We achieved an average accuracy of 95% across a total of 187 rallies in the second dataset.
KW - Badminton single matches
KW - End of rally detection
KW - Service detection
KW - Video analysis
UR - https://www.scopus.com/pages/publications/105004173774
U2 - 10.1007/s42979-025-03935-0
DO - 10.1007/s42979-025-03935-0
M3 - Article
AN - SCOPUS:105004173774
SN - 2661-8907
VL - 6
JO - SN Computer Science
JF - SN Computer Science
IS - 5
M1 - 447
ER -