Badminton hitting type classification method based on BSTCNet

By constructing the BSTCNet network model and using badminton shuttlecock coordinate sequences for multi-scale feature extraction and weight generation, the problems of low recognition rate and decreased generalization ability caused by occlusion and motion deformation in badminton hit type classification are solved, and high-accuracy hit type recognition is achieved.

CN121600451B · Active · Publication Date: 2026-06-16 · NANCHANG HANGKONG UNIVERSITY

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
NANCHANG HANGKONG UNIVERSITY
Filing Date
2026-01-30
Publication Date
2026-06-16

AI Technical Summary

Technical Problem

Existing technologies suffer from low recognition rates and reduced model generalization ability in badminton hit type classification due to athlete body obstruction and motion distortion.

Method used

We construct the BSTCNet network model, which takes the two-dimensional or three-dimensional coordinate sequence of a located badminton shuttlecock, applies multi-scale feature extraction and a weight generator, and is trained with an improved focal loss function to classify badminton hit types quickly and accurately.

Benefits of technology

It effectively avoids the problem of missed or false detection of skeletal points caused by athlete occlusion, improves classification accuracy and model generalization ability, and can accurately identify the type of shot in sequences of different lengths.

✦ Generated by Eureka AI based on patent content.

Abstract

The application relates to a badminton hitting type classification method based on a BSTCNet, which comprises the following steps: constructing a badminton hitting type data set, and dividing the data set into a training set and a test set; constructing a BSTCNet network model, and outputting classification results corresponding to different scale feature information and confidence after operation; training the BSTCNet network model, using an improved focal loss function as a loss function of the model, so that the BSTCNet network model pays more attention to more difficult-to-classify samples; using the selected and trained BSTCNet network model to classify badminton hitting types, and taking the classification result with the highest confidence as the output result of the BSTCNet network model. The application quickly and accurately classifies various different hitting types through badminton coordinate sequences, and effectively overcomes the defects of low recognition rate and low model generalization ability caused by body shielding and action deformation of athletes.

Description

Technical Field

[0001] This application relates to the field of time series classification, specifically to a badminton hit type classification method based on BSTCNet.

Background Technology

[0002] Currently, most methods for classifying badminton hit types rely on deep learning to recognize human movements. First, a deep learning network performs target detection to identify the athletes on the court in the video. Then, pose extraction is performed on the detected athletes to obtain the coordinates of their key skeletal points. Finally, a purpose-built action recognition network classifies the sequence of key skeletal point coordinates. However, this approach has the following drawbacks. First, athletes' bodies can self-occlude, for example during a backhand shot where the body obscures the racket hand, which causes key points to be lost and reduces the recognition rate. Second, different athletes have different movement habits, and the fast pace of a match can distort movements, which reduces the model's generalization ability.

Summary of the Invention

[0003] The purpose of this invention is to provide a badminton hit type classification method based on BSTCNet, which can quickly and accurately classify various different hit types through badminton coordinate sequences, effectively overcoming the shortcomings of low recognition rate and low model generalization ability caused by athlete body obstruction and motion deformation.

[0004] The technical solution adopted in this invention is: a badminton hit type classification method based on BSTCNet, comprising the following steps:

[0005] S1: Decompose the badminton match video into continuous video frames, locate the position of the badminton shuttlecock in the video frame, label the badminton shuttlecock coordinates, and label different states; label the hit type according to the badminton shuttlecock flight trajectory corresponding to the badminton shuttlecock coordinate sequence, construct a badminton hit type dataset, and divide it into training set and test set;

[0006] S2: Construct the BSTCNet network model, which includes a backbone network, a classifier head, and a weight generator. The backbone network includes two CBS modules and N scale feature extraction modules to extract feature information at N different scales. The classifier head is a multi-scale classifier head, including N branches. Each branch includes a CBSE module, a fully connected layer, and a sigmoid activation function layer. The N different scale feature information output from the backbone network is input into each branch of the classifier head. After processing, the multi-scale classifier head outputs the classification results corresponding to the N different scale feature information. The weight generator includes a fully connected layer and a Softmax activation function layer to receive the mask data split from the input data and generate the confidence scores of the classification results corresponding to the N scale feature information.

[0007] S3: Train the BSTCNet network model using an improved focal loss function as the model's loss function; the improved focal loss function is used to overcome the imbalance in sample sizes between categories, enabling the BSTCNet network model to focus on samples that are more difficult to classify.

[0008] S4: Use the trained BSTCNet network model to classify badminton hit types, and use the classification result with the highest confidence as the output of the BSTCNet network model.

[0009] Furthermore, the specific process of step S1 is as follows:

[0010] S101: Starting from the moment the player serves the shuttlecock and ending at the moment the shuttlecock lands or touches the net, the badminton match video is cut into video segments and broken down into continuous video frames.

[0011] S102: View the video frames in sequence, locate the position of the shuttlecock in the video frame, mark the coordinate label of the shuttlecock, and mark the different states of the shuttlecock; if the shuttlecock is hit, mark it as the hitting point; if the shuttlecock lands or touches the net, mark it as the landing point; if the shuttlecock is in the air, mark it as the flight point.

[0012] S103: Read the video frames and annotation information corresponding to the badminton shuttlecock's hit point and subsequent flight point in sequence. Mark the badminton shuttlecock with dots in the video frames. Then play the video frames at a speed of P frames / second. Based on the badminton shuttlecock's flight trajectory in the played video frames, determine which type of badminton hit the shuttlecock's coordinate sequence belongs to and save the badminton hit type number into the annotation information.

[0013] S104: Based on the hit type labels in the annotation information, construct a badminton hit type dataset and divide it into a training set and a test set.

[0014] Furthermore, the CBS module includes a convolutional layer, a BN (batch normalization) layer, and a SiLU activation function layer. Each scale feature extraction module includes a ResUnit module and a given number of CBS modules. The ResUnit module includes a main path and a branch path. The main path includes a CBS module, a first convolutional layer, and a BN layer; the branch path includes a second convolutional layer. The data input to the ResUnit module is split into two paths, which are processed by the main path and the branch path respectively. The outputs of the two paths are added together and fed into the SiLU activation function layer, whose output is the output of the ResUnit module. The first CBS module in each scale feature extraction module outputs the feature information of the corresponding scale.

[0015] Furthermore, the CBSE module adds an SE attention mechanism module on the basis of the CBS module. The SE attention mechanism module includes a global average pooling layer, a fully connected layer, a ReLU activation function layer, a fully connected layer, and a Sigmoid activation function layer. The data input to the SE attention mechanism module is processed sequentially through the global average pooling layer, the fully connected layer, the ReLU activation function layer, the fully connected layer, and the Sigmoid activation function layer. The output result is multiplied by the other input data and used as the output result of the SE attention mechanism module.

[0016] Furthermore, the weight generator includes a fully connected layer and a Softmax activation function layer. The weight generator splits mask data from the input data and calculates the weights of N classification results, using the weights as the confidence level of the classification results.

[0017] Furthermore, before training the BSTCNet network model, the badminton hit type dataset is preprocessed. Specifically, the shuttlecock coordinate data, the labels corresponding to the coordinate points, and the hit type labels for each video frame are read sequentially. The coordinate data is normalized by dividing the x-coordinate by the image width and the y-coordinate by the image height. The normalized coordinates between hit points, and between a hit point and the landing point, are extracted, and each extracted sequence is padded evenly at the front and back with coordinate data so that the input length of the BSTCNet network model is unified to L. A mask sequence of length L is generated, in which the mask value at each real coordinate position is 1.0 and the mask value at each padded position is 0.0. The normalized x-coordinate data is placed in the first row, the normalized y-coordinate data in the second row, and the mask data in the third row, forming a two-dimensional matrix with 3 rows and L columns. This matrix is converted into tensor form and used as the input of the BSTCNet network model, and the hit type label is placed in a one-dimensional matrix and converted into tensor form as the label.

[0018] Furthermore, the expression of the improved focal loss function is:

[0019] ;

[0020] ;

[0021] In these expressions, one term is the probability that the output of the BSTCNet network model assigns to a given category; another is the focusing factor; a third is the weight of that category. The remaining quantities are the sample size of that category, the sample size of a separately indexed category, and the total number of categories.

[0022] The beneficial effects of this invention are as follows. By constructing a BSTCNet network model, this invention uses the two-dimensional or three-dimensional coordinate sequence of a badminton shuttlecock to classify badminton hit types. Because the shuttlecock's flight is governed by relatively well-understood physical factors such as air resistance and initial velocity, the trajectory characteristics of different hit types are distinct. Furthermore, even if the shuttlecock is obscured by the athlete, it can still be tracked using methods such as Kalman filtering, so the approach is robust to occlusion. Therefore, compared with hit type classification based on the athlete's skeletal point coordinate sequence, this invention avoids classification errors caused by missed or false detections of skeletal points under occlusion. Through multi-scale feature extraction and a weight generator, convolution kernels of different sizes are used to obtain feature information at different time scales, which effectively solves the classification problem for sequences of different lengths.

Description of the Drawings

[0023] To more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings used in the embodiments will be briefly introduced below. Obviously, the drawings described below are only some embodiments of this application. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.

[0024] Figure 1 is a flowchart of a method according to an embodiment of the present invention;

[0025] Figure 2 is a schematic diagram of the BSTCNet network model in an embodiment of the present invention;

[0026] Figure 3 is a schematic diagram of the CBS module structure in an embodiment of the present invention;

[0027] Figure 4 is a schematic diagram of the SE attention mechanism module in an embodiment of the present invention;

[0028] Figure 5 is a schematic diagram of the CBSE module in an embodiment of the present invention;

[0029] Figure 6 is a schematic diagram of the ResUnit module in an embodiment of the present invention.

Detailed Implementation

[0030] To better understand the above-described objects, features, and advantages of the present invention, the invention will be further described in detail below with reference to the accompanying drawings and specific embodiments. Many specific details are set forth in the following description to provide a thorough understanding of the invention; however, the invention may be practiced in other ways different from those described herein, and therefore, the invention is not limited to the specific embodiments disclosed below.

[0031] As shown in Figure 1, a badminton hit type classification method based on BSTCNet includes the following steps:

[0032] S1: Obtain badminton match videos from live footage or from various video websites and public datasets. Decompose the badminton match videos into continuous video frames, locate the position of the shuttlecock in each frame, label the shuttlecock coordinates, and label different states. Based on the shuttlecock's flight trajectory, label the corresponding shuttlecock coordinate sequence with the hit type, construct a badminton hit type dataset, and divide it into training and test sets. The specific steps are as follows:

[0033] S101: Starting from the moment the player serves the shuttlecock and ending at the moment the shuttlecock lands or touches the net, the badminton match video is cut into video segments and broken down into continuous video frames.

[0034] S102: View the video frames sequentially, locate the position of the badminton shuttlecock in the video frame, label the coordinates of the shuttlecock, and mark the different states of the shuttlecock; if the shuttlecock is hit, mark it as the hitting point; if the shuttlecock lands or touches the net, mark it as the landing point; otherwise, mark the shuttlecock as the flight point. In this embodiment of the invention, the Labelme annotation tool is used for annotation, and the annotation information is stored in JSON format. The corresponding annotation for the hitting point is "hitting", the corresponding annotation for the landing point is "landing", and the corresponding annotation for the flight point is "points".

[0035] S103: Read the video frames and annotation information corresponding to the badminton shuttlecock's impact point and subsequent flight points in sequence. Highlight the shuttlecock with dots in the video frames, then play the video frames at a rate of P frames per second. Based on the shuttlecock's flight trajectory in the played video frames, determine which type of badminton hit the shuttlecock's coordinate sequence belongs to, and save the shuttlecock hit type number into the annotation information. In this embodiment, P is set to 30, and the shuttlecock hit type number table is shown in Table 1.

[0036] Table 1. Hit types and their corresponding numbers

[0037]

[0038] S104: Based on the hit type labels in the annotation information, construct a badminton hit type dataset and divide it into a training set and a test set in an 8:2 ratio.
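The 8:2 division in step S104 can be illustrated with a minimal sketch. The text does not say whether the split is random or stratified by hit type, so the shuffled split below, the fixed seed, and the function name `split_dataset` are assumptions made only for illustration.

```python
import random


def split_dataset(samples, train_ratio=0.8, seed=0):
    """Shuffle the samples and split them into training and test sets.

    Assumption: a plain random split; the patent only states the 8:2 ratio.
    """
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)   # seeded for reproducibility
    cut = int(len(shuffled) * train_ratio)  # index where the test set begins
    return shuffled[:cut], shuffled[cut:]


train_set, test_set = split_dataset(range(100))
print(len(train_set), len(test_set))
```

If the hit types are strongly imbalanced, a split performed separately per hit type would keep the class proportions equal in both sets; that variant is not described in the text.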

[0039] S2: Construct the BSTCNet network model. As shown in Figure 2, the BSTCNet network model includes a backbone network, a classifier head, and a weight generator. The backbone network includes two CBS modules and N scale feature extraction modules to extract feature information at N different scales. The classifier head is a multi-scale classifier head with N branches. Each branch includes a CBSE module, a fully connected layer, and a sigmoid activation function layer. The feature information at N different scales output from the backbone network is input into the corresponding branches of the classifier head, and after processing the multi-scale classifier head outputs the classification results corresponding to the N different scales. The weight generator includes a fully connected layer and a Softmax activation function layer; it receives the mask data split from the input data and generates the confidence scores of the classification results corresponding to the N scales. In Figure 2, k represents the kernel size, s represents the stride, and p represents the padding size.

[0040] As shown in Figure 3, the CBS module includes a convolutional layer, a BN (batch normalization) layer, and a SiLU activation function layer. Each scale feature extraction module includes a ResUnit module and a given number of CBS modules; in this embodiment of the invention, that number is 1. As shown in Figure 6, the ResUnit module includes a main path and a branch path. The main path includes a CBS module, a first convolutional layer, and a BN layer; the branch path includes a second convolutional layer. The data input to the ResUnit module is split into two paths, which are processed by the main path and the branch path respectively. The outputs of the two paths are added together and fed into the SiLU activation function layer, whose output is the output of the ResUnit module. In this embodiment of the invention, the first convolutional layer is a 3×3 convolutional layer, the second convolutional layer is a 1×1 convolutional layer, and N is 3. The first CBS module in each scale feature extraction module outputs the feature information of the corresponding scale.

[0041] As shown in Figure 5, the CBSE module adds an SE attention mechanism module to the CBS module. As shown in Figure 4, the SE attention mechanism module includes a global average pooling layer, a fully connected layer, a ReLU activation function layer, a second fully connected layer, and a Sigmoid activation function layer. The data input to the SE attention mechanism module is processed sequentially through these layers, and the result is multiplied by the original input data to form the output of the SE attention mechanism module.
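The SE (squeeze-and-excitation) data flow described above can be sketched in plain Python. This is an illustrative sketch only: the feature layout (a list of channels, each a list of values along the time axis), the bias-free fully connected layers, and the names `se_attention`, `w1`, and `w2` are assumptions, not details given in the text.

```python
import math


def se_attention(features, w1, w2):
    """Apply SE channel attention to a feature map.

    features: list of C channels, each a list of T values.
    w1: weights of the first FC layer, shape (C_reduced x C).
    w2: weights of the second FC layer, shape (C x C_reduced).
    Biases are omitted for brevity (an assumption).
    """
    # Squeeze: global average pooling gives one value per channel.
    pooled = [sum(channel) / len(channel) for channel in features]
    # Excitation: FC -> ReLU -> FC -> Sigmoid.
    hidden = [max(0.0, sum(w * p for w, p in zip(row, pooled))) for row in w1]
    gates = [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
             for row in w2]
    # Scale: multiply every value of a channel by that channel's gate.
    return [[gate * value for value in channel]
            for gate, channel in zip(gates, features)]
```

With all-zero weights every gate equals 0.5, so each channel is simply halved; trained weights would instead emphasize the channels that matter most for the classification.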

[0042] The weight generator includes a fully connected layer and a Softmax activation function layer. The weight generator extracts mask data from the input data and calculates the weights of N classification results, using the weights as the confidence level of the classification results.
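A minimal sketch of such a weight generator follows, assuming the fully connected layer maps the length-L mask directly to N logits. The weight layout and the names `weight_generator`, `fc_weights`, and `fc_bias` are hypothetical; the text specifies only the two layer types.

```python
import math


def weight_generator(mask, fc_weights, fc_bias):
    """Turn a padding mask into N confidence scores that sum to 1.

    mask: list of L values (1.0 for real coordinates, 0.0 for padding).
    fc_weights: N rows of L weights; fc_bias: N values (hypothetical layout).
    """
    # Fully connected layer: one logit per classifier branch.
    logits = [sum(w * m for w, m in zip(row, mask)) + b
              for row, b in zip(fc_weights, fc_bias)]
    # Softmax, with the maximum subtracted for numerical stability.
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]
```

Because the mask encodes how many positions hold real coordinates, a generator of this kind can learn to favor the branch whose kernel size suits the actual sequence length.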

[0043] In this embodiment of the invention, the kernel size of the CBS module in the first scale feature extraction module and the kernel size of the CBSE module in the first branch of the classification head are both 3×3; the kernel size of the CBS module in the second scale feature extraction module and the kernel size of the CBSE module in the second branch are both 5×5; and the kernel size of the CBS module in the third scale feature extraction module and the kernel size of the CBSE module in the third branch are both 7×7.
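To see how kernel sizes of 3, 5, and 7 relate to the sequence length, the standard convolution output-length formula can be evaluated along the time axis. The stride of 1 and the "same" padding used below are assumptions for illustration; the actual k, s, and p values appear only in Figure 2 and are not reproduced in the text.

```python
def conv_output_length(length, k, s=1, p=0):
    """Standard output length of a convolution along one axis."""
    return (length + 2 * p - k) // s + 1


def same_padding(k):
    """Padding that preserves the length for an odd kernel size and stride 1."""
    return (k - 1) // 2


for k in (3, 5, 7):
    p = same_padding(k)
    # Each kernel size covers k consecutive time steps per output value.
    print(k, p, conv_output_length(96, k, s=1, p=p))
```

Under these assumptions each branch keeps the length at 96 while looking at 3, 5, or 7 consecutive positions, which is one way to read "feature information at different time scales".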

[0044] S3: Train the BSTCNet network model using an improved focal loss function as the model's loss function; the improved focal loss function is used to overcome the imbalance in sample sizes between categories, enabling the BSTCNet network model to focus on samples that are more difficult to classify.

[0045] Before training the BSTCNet network model, the badminton hit type dataset is preprocessed. Specifically, the shuttlecock coordinate data, the labels corresponding to the coordinate points, and the hit type labels for each video frame are read sequentially. The coordinate data is normalized by dividing the x-coordinate by the image width and the y-coordinate by the image height. The normalized coordinates between hit points, and between a hit point and the landing point, are extracted, and each extracted sequence is padded evenly at the front and back with the coordinate data (0.0, 0.0) so that the input length of the BSTCNet network model is unified to L. A mask sequence of length L is generated, in which the mask value at each real coordinate position is 1.0 and the mask value at each padded position is 0.0. The normalized x-coordinate data is placed in the first row, the normalized y-coordinate data in the second row, and the mask data in the third row, forming a two-dimensional matrix with 3 rows and L columns. This matrix is converted into tensor form and used as the input to the BSTCNet network model, while the hit type label is placed in a one-dimensional matrix and converted into tensor form as the label. In this embodiment of the invention, the value of L is 96.
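The preprocessing steps can be sketched as follows. The function name `preprocess`, the choice to put the extra padding element at the back when the padding count is odd, and the error raised for sequences longer than L are assumptions; the text does not cover those cases.

```python
def preprocess(coords, width, height, target_len=96):
    """Build the 3 x L input matrix for one hit.

    coords: list of (x, y) pixel coordinates of the shuttlecock.
    Returns [normalized x row, normalized y row, mask row].
    """
    if len(coords) > target_len:
        # Assumption: the text does not say how over-long sequences are handled.
        raise ValueError("sequence is longer than the model input length")
    xs = [x / width for x, _ in coords]    # normalize x by image width
    ys = [y / height for _, y in coords]   # normalize y by image height
    pad_total = target_len - len(coords)
    pad_front = pad_total // 2             # assumption: odd remainder goes to the back
    pad_back = pad_total - pad_front
    row_x = [0.0] * pad_front + xs + [0.0] * pad_back
    row_y = [0.0] * pad_front + ys + [0.0] * pad_back
    mask = [0.0] * pad_front + [1.0] * len(coords) + [0.0] * pad_back
    return [row_x, row_y, mask]


matrix = preprocess([(640, 360), (1280, 720)], width=1280, height=720, target_len=6)
print(matrix)
```

The mask row is what lets the network distinguish a padded (0.0, 0.0) entry from a shuttlecock that is genuinely at the image origin.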

[0046] The expression of the improved focal loss function is:

[0047] ;

[0048] ;

[0049] In these expressions, one term is the probability that the output of the BSTCNet network model assigns to a given category. Another is the focusing factor: increasing its value reduces the loss contribution of easily classified samples, so that the model focuses more on difficult-to-classify samples; in this embodiment of the invention, its value is 2. A third is the weight of the category, which addresses the imbalance in sample sizes: the larger the number of samples in a category, the smaller its weight, so that the model pays more attention to categories with fewer samples. In this embodiment of the invention, a weight is calculated for each category (the calculated values are missing from this text). The remaining quantities are the sample size of that category, the sample size of a separately indexed category, and the total number of categories.
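Since the formulas themselves are not available here, the sketch below uses the widely known class-weighted focal loss, which has the same three ingredients (a predicted probability, a focusing factor, and a class weight). It should not be read as the patent's improved formula. In particular, the inverse-frequency weighting in `class_weights` is a hypothetical choice that merely satisfies the stated property that larger categories receive smaller weights.

```python
import math


def class_weights(counts):
    """Hypothetical weighting: inverse sample frequency, normalized to sum to 1."""
    inverse = [1.0 / n for n in counts]
    total = sum(inverse)
    return [v / total for v in inverse]


def focal_loss(p_true, alpha, gamma=2.0):
    """Standard class-weighted focal loss for a single sample.

    p_true: predicted probability of the sample's true category.
    alpha: weight of that category; gamma: focusing factor (2 in the embodiment).
    """
    return -alpha * (1.0 - p_true) ** gamma * math.log(p_true)


weights = class_weights([100, 50])
print(weights)
print(focal_loss(0.9, weights[0]), focal_loss(0.5, weights[0]))
```

With a focusing factor of 2, a sample predicted at 0.9 contributes far less loss than one predicted at 0.5, which is the "focus on hard samples" behavior the paragraph describes.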

[0050] The initial training hyperparameters were set as follows: batch size 8, learning rate 0.001, and number of training epochs 200. The Adam optimizer was used to iteratively optimize the BSTCNet network model. The learning rate during training was set using an arithmetic progression strategy, and the formula for calculating the learning rate is as follows:

[0051] ;

[0052] In this formula, the result is the learning rate at the current iteration (epoch); the initial learning rate is 0.001; a further term is a very small constant, bounded above by a value that is not preserved in this text; the total number of iterations equals the number of training epochs, which is 200 in this embodiment of the invention; and the remaining term is the current iteration number.
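An arithmetic-progression schedule changes the learning rate by the same amount every epoch. The sketch below assumes a linear decrease from the initial rate 0.001 to a small floor over 200 epochs; the floor value `lr_min`, the 0-based epoch index, and the function name `linear_lr` are assumptions, since the formula is not available in this text.

```python
def linear_lr(epoch, total_epochs=200, lr0=0.001, lr_min=1e-6):
    """Learning rate at a 0-based epoch under a constant per-epoch decrement.

    lr_min stands in for the "very small constant"; its value is hypothetical.
    """
    step = (lr0 - lr_min) / (total_epochs - 1)  # common difference of the progression
    return lr0 - step * epoch


print(linear_lr(0), linear_lr(100), linear_lr(199))
```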

[0053] S4: Use the trained BSTCNet network model to classify badminton hit types, and use the classification result with the highest confidence as the output of the BSTCNet network model.
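The selection in step S4 can be sketched as two successive maximum searches: first over the confidences produced by the weight generator, then over the class scores of the chosen branch. The data layout and the name `select_prediction` are assumptions for illustration.

```python
def select_prediction(branch_scores, confidences):
    """Pick the hit type from the most confident classifier branch.

    branch_scores: N lists of per-class scores, one list per branch.
    confidences: N confidence values from the weight generator.
    Returns (index of the chosen branch, index of the predicted hit type).
    """
    best_branch = max(range(len(confidences)), key=confidences.__getitem__)
    scores = branch_scores[best_branch]
    best_class = max(range(len(scores)), key=scores.__getitem__)
    return best_branch, best_class


print(select_prediction([[0.1, 0.9], [0.8, 0.2], [0.3, 0.7]], [0.2, 0.5, 0.3]))
```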

[0054] In this embodiment of the invention, the weight file of the trained BSTCNet network model is converted into a serialized file in .engine format required by the TensorRT inference engine to accelerate inference. Then, a corresponding algorithm class is created in the software system to implement the relevant functions for data preprocessing, TensorRT accelerated inference, and data postprocessing.

[0055] The software system is started, and a complete video file or real-time footage captured by the camera is loaded. The image data is passed to the instance object of the STNet algorithm through Qt signals. The instance object of the STNet algorithm detects the image frame, obtains the badminton shuttlecock coordinate data, and stores the badminton shuttlecock coordinate data in the cache.

[0056] When the length of the shuttlecock coordinate sequence in the cache is greater than or equal to 96, the first 96 shuttlecock coordinates are passed via a Qt signal to the instance object of the hitting point and landing point detection algorithm, which detects the hitting points and landing points among these 96 coordinates. Then, the coordinate data between hitting points, and between a hitting point and a landing point, is passed via a Qt signal to the instance object of the BSTCNet algorithm. At the same time, the shuttlecock coordinate data before the last hitting point or landing point is deleted from the cache.
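The caching logic above can be sketched without the Qt plumbing. Here `detect_events` is a hypothetical stand-in for the hitting point and landing point detection algorithm, assumed to return sorted `(index, kind)` pairs; including both end points in each segment is also an assumption.

```python
WINDOW = 96  # matches the model input length L in this embodiment


def process_cache(cache, detect_events):
    """Extract per-hit segments from the cache and trim the consumed data.

    cache: list of (x, y) coordinates, oldest first.
    detect_events: hypothetical callable returning sorted (index, kind) pairs,
        with kind "hitting" or "landing", for a window of coordinates.
    Returns (segments, remaining_cache).
    """
    if len(cache) < WINDOW:
        return [], cache                    # not enough data yet
    window = cache[:WINDOW]
    events = detect_events(window)
    # A segment runs from a hitting point to the next hitting or landing point.
    segments = [window[i:j + 1]
                for (i, kind), (j, _) in zip(events, events[1:])
                if kind == "hitting"]
    if events:
        cache = cache[events[-1][0]:]       # drop data before the last event
    # Assumption: if no event is found, the cache is left unchanged here;
    # the text does not describe that case.
    return segments, cache
```

Keeping the last event in the cache means the next window starts at a hitting or landing point, so no hit is cut in half between windows.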

[0057] After receiving the badminton shuttlecock coordinate data, the instance of the BSTCNet algorithm preprocesses the data as described in step S3, then uploads it to the GPU. The TensorRT inference engine is used to accelerate inference, and finally, the inference output is obtained. The hit type is determined based on the location of the maximum output value. This process is repeated until the entire video file is played or the camera stops recording, while simultaneously counting the occurrences of each badminton hit type. After the count is complete, a new video file or live footage captured by the camera can be selected.

[0058] Under a hardware environment with a processor model of "13th Gen Intel(R) Core(TM) i5-13400F", a graphics card model of "NVIDIA GeForce RTX 4060 Ti", 32GB of memory and 16GB of video memory, the effectiveness and advancement of the method described in the embodiments of the present invention were tested, and the results are as follows:

[0059] (1) Recognition accuracy test

[0060] The test set samples of each hit type were classified, and the corresponding confusion matrix is shown in Table 2. In the table, the rows represent the actual badminton hit type and the columns represent the inferred badminton hit type. The results show that the accuracy rate for the drop shot was 87.5%, the drop shot accuracy rate was 77.5%, the lift shot accuracy rate was 92.5%, the smash accuracy rate was 75%, the high clear accuracy rate was 85%, and the average accuracy rate was 83.5%.
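The reported average can be checked as the unweighted mean of the five per-type accuracy rates:

```python
# Per-type accuracy rates in percent, in the order listed in the text.
per_class_accuracy = [87.5, 77.5, 92.5, 75.0, 85.0]
average = sum(per_class_accuracy) / len(per_class_accuracy)
print(average)  # 83.5
```

An unweighted mean of this kind equals the overall accuracy only when every hit type has the same number of test samples; the text does not state the per-type counts.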

[0061] Table 2 Confusion Matrix for Shot Type Classification

[0062]

[0063] (2) Deployment and operation testing

[0064] The BSTCNet network model was converted into a .engine format serialized file, loaded into the software system, and inference was accelerated using the TensorRT inference engine. Inference tests were conducted in a Release environment, and the deployment and test results are shown in Table 3. The results show that the average inference speed of this embodiment can reach over 300 FPS, which meets the real-time requirements of the software system.

[0065] Table 3. BSTCNet Network Model Deployment and Operation Test Results

[0066]

[0067] The above description is merely a preferred embodiment of the present invention and is not intended to limit the invention. Various modifications and variations can be made to the present invention by those skilled in the art. Any modifications, equivalent substitutions, improvements, etc., made within the spirit and principles of the present invention should be included within the scope of protection of the present invention.

Claims

1. A badminton hit type classification method based on BSTCNet, characterized in that, Includes the following steps: S1: Decompose the badminton match video into consecutive video frames, locate the position of the shuttlecock in each frame, label the shuttlecock coordinates, and label different states; based on the shuttlecock's flight trajectory, label the corresponding shuttlecock coordinate sequence with the hit type, construct a badminton hit type dataset, and divide it into training and test sets; the specific process is as follows: S101: Starting from the moment the player serves the shuttlecock and ending at the moment the shuttlecock lands or touches the net, the badminton match video is cut into video segments and broken down into continuous video frames. S102: View the video frames in sequence, locate the position of the shuttlecock in the video frame, mark the coordinate label of the shuttlecock, and mark the different states of the shuttlecock; if the shuttlecock is hit, mark it as the hitting point; if the shuttlecock lands or touches the net, mark it as the landing point; if the shuttlecock is in the air, mark it as the flight point. S103: Read the video frames and annotation information corresponding to the badminton shuttlecock's hit point and subsequent flight point in sequence. Mark the badminton shuttlecock with dots in the video frames. Then play the video frames at a speed of P frames / second. Based on the badminton shuttlecock's flight trajectory in the played video frames, determine which type of badminton hit the shuttlecock's coordinate sequence belongs to and save the badminton hit type number into the annotation information. S104: Based on the hit type labels in the annotation information, construct a badminton hit type dataset and divide it into a training set and a test set; S2: Construct the BSTCNet network model, which includes a backbone network, a classifier head, and a weight generator. 
The backbone network includes two CBS modules and N scale feature extraction modules to extract feature information at N different scales. The classifier head is a multi-scale classifier head, including N branches. Each branch includes a CBSE module, a fully connected layer, and a sigmoid activation function layer. The N different scale feature information output from the backbone network is input into each branch of the classifier head. After processing, the multi-scale classifier head outputs the classification results corresponding to the N different scale feature information. The weight generator includes a fully connected layer and a Softmax activation function layer to receive the mask data split from the input data and generate the confidence scores of the classification results corresponding to the N scale feature information. The CBS module includes a convolutional layer, a BN normalization layer, and a SiLU activation function layer; each scale feature extraction module includes a ResUnit module and a given number of CBS modules; the ResUnit module includes a main path and a branch path. The main path includes a CBS module, a first convolutional layer, and a BN normalization layer. The branch path includes a second convolutional layer. The data input to the ResUnit module is divided into two paths, which are processed through the main path and the branch path respectively. The outputs of the main path and the branch path are added together and then input into the SiLU activation function layer. The output of the SiLU activation function layer is used as the output of the ResUnit module. The first CBS module in each scale feature extraction module is used to output the feature information of the corresponding scale; S3: Train the BSTCNet network model using an improved focal loss function as the model's loss function; the improved focal loss function is used to overcome the imbalance in sample sizes between categories, enabling the BSTCNet network model to focus on samples that are more difficult to classify. S4: Use the trained BSTCNet network model to classify badminton hit types, and use the classification result with the highest confidence as the output of the BSTCNet network model.

2. The badminton hit type classification method based on BSTCNet according to claim 1, characterized in that, The CBSE module adds an SE attention mechanism module to the CBS module. The SE attention mechanism module includes a global average pooling layer, a fully connected layer, a ReLU activation function layer, a fully connected layer, and a Sigmoid activation function layer. The data input to the SE attention mechanism module is processed sequentially through the global average pooling layer, the fully connected layer, the ReLU activation function layer, the fully connected layer, and the Sigmoid activation function layer. The output result is multiplied by the other input data and used as the output result of the SE attention mechanism module.

3. The badminton hit type classification method based on BSTCNet according to claim 2, characterized in that, The weight generator includes a fully connected layer and a Softmax activation function layer. The weight generator extracts mask data from the input data and calculates the weights of N classification results, using the weights as the confidence level of the classification results.

4. The badminton hit type classification method based on BSTCNet according to claim 3, characterized in that, Before training the BSTCNet network model, the badminton hit type dataset is preprocessed. Specifically, the badminton coordinate point data, coordinate point corresponding labels, and hit type labels corresponding to the video frames are read in sequence. The coordinate data is normalized by dividing the horizontal coordinate X and vertical coordinate Y by the width and height of the image. The normalized coordinates between the hitting point and between the hitting point and the landing point are extracted, and the coordinate data is evenly filled before and after the extracted normalized coordinate sequence to unify the input length L of the BSTCNet network model. Generate a mask sequence of length L, where the mask value corresponding to the actual coordinate data position is 1.0 and the mask value corresponding to the filled coordinate data position is 0.0; place the normalized x-coordinate data in the first row, the normalized y-coordinate data in the second row, and the mask data in the third row, forming a two-dimensional matrix with 3 rows and L columns; convert the two-dimensional matrix into tensor form and use it as input to the BSTCNet network model, while placing the hit type label in a one-dimensional matrix and converting it into tensor form as the label.

5. The badminton hit type classification method based on BSTCNet according to claim 4, characterized in that the improved focal loss function is given by two expressions (not preserved in this text), in which one term is the probability that the output of the BSTCNet network model assigns to a given category; another is the focusing factor; a third is the weight of that category; and the remaining quantities are the sample size of that category, the sample size of a separately indexed category, and the total number of categories.