Learning style recognition method and system based on intention weighting and double regularization

By employing an intent-weighted and dual-regularized learning style recognition method, and utilizing text semantic intent vectors to guide the adaptive modulation of numerical features, this approach addresses the issues of existing learning style recognition methods failing to effectively utilize text interaction information and class imbalance, thereby achieving more efficient learning style recognition and improved robustness.

CN122196642A · Pending · Publication Date: 2026-06-12 · TAISHAN UNIV

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Applications (China)
Current Assignee / Owner: TAISHAN UNIV
Filing Date: 2026-05-14
Publication Date: 2026-06-12

AI Technical Summary

Technical Problem

Existing learning style recognition methods cannot effectively use text interaction information, which limits feature expressiveness: they cannot distinguish learners with similar behavioral values but fundamentally different learning styles. Furthermore, educational data suffers from class imbalance, leading to weak model generalization and classification that is biased toward the majority classes.

Method used

By employing an intent-weighted and dual-regularized learning style recognition method, this approach utilizes textual semantic intent vectors to guide the adaptive modulation of numerical features. Furthermore, through channel orthogonal regularization and cross-modal semantic consistency constraints, it achieves fine-grained cross-modal interaction between textual intent and behavioral features, eliminates channel redundancy, and enhances the recognition capability of rare learning styles.

Benefits of technology

It significantly improves the accuracy and robustness of learning style recognition, and can evenly enhance the recognition capabilities of each category under imbalanced data, meeting the real-time recognition needs of online education platforms.

✦ Generated by Eureka AI based on patent content.


Abstract

The application provides a learning style recognition method and system based on intention weighting and double regularization, and belongs to the field of online learning behavior analysis. The method comprises the following steps: obtaining numerical behavior characteristics and text type comment information of learners, and performing preprocessing to obtain standardized numerical characteristics and text semantic intention vectors; dividing the data into a training set and a test set and performing class balancing processing on the training set; constructing a learning style recognition model, generating channel gating weights based on the text semantic intention vectors, element-by-element modulating the standardized numerical characteristics to obtain weighted numerical characteristics, and classifying after fusing the intention vectors to form fused features; introducing channel orthogonal regularization constraints in the training process to make the columns of the weight generation matrix orthogonal, and introducing cross-modal semantic consistency constraints to make the weighted features and the intention vectors semantically aligned; and inputting to-be-recognized learner data into the trained model to obtain learning style recognition results. The application improves the recognition ability of rare learning styles.

Description

Technical Field

[0001] This invention belongs to the field of online learning behavior analysis technology, and in particular relates to a learning style recognition method and system based on intent weighting and dual regularization.

Background Technology

[0002] The statements in this section are merely background information related to the present invention and do not necessarily constitute prior art.

[0003] With the rapid popularization of online education platforms, learners generate a large amount of multi-dimensional behavioral data and text interaction information during the learning process. Accurately identifying learning styles from heterogeneous data has become a core technical requirement for personalized education recommendations and adaptive teaching. Current learning style identification methods mainly rely on numerical behavioral feature modeling, primarily using traditional machine learning algorithms such as XGBoost, Random Forest, and Support Vector Machines. Classification is achieved through indicators such as learning duration, resource access frequency, and practice accuracy. While these methods are computationally efficient, they completely ignore the subjective learning intentions and preferences contained in text such as course comments and discussion posts. They cannot distinguish learners with similar behavioral values but fundamentally different learning styles, resulting in significant limitations in feature representation.

[0004] To compensate for the limitations of single-modality modeling, some studies have attempted to directly concatenate textual and numerical features before feeding them into a classifier. This approach merely stacks features without establishing explicit intermodal interaction and enhancement mechanisms. Textual semantics struggle to effectively guide the utilization of behavioral features, and the increased dimensionality easily introduces noise, resulting in limited actual performance improvement. In recent years, attention mechanisms and gating networks have been used for multimodal fusion, achieving feature modulation by generating channel weights for one modality from the features of another. However, significant drawbacks remain. First, the columns of the weight generation matrix are prone to high correlation, leading to channel redundancy and insufficiently discriminative weights, so that different feature dimensions do not receive differentiated modulation. Second, the lack of cross-modal semantic consistency constraints makes it difficult to align textual intent with weighted features in the semantic space, so there is no guarantee that information is transmitted effectively.

[0005] Meanwhile, educational data generally suffers from severe class imbalance, with a high proportion of visual and balanced learners and a very small number of rare styles such as auditory and kinesthetic learners. Existing models tend to favor the majority class on imbalanced data, and channel redundancy and semantic alignment issues further amplify the identification error of rare classes, resulting in weak model generalization ability and uneven classification performance across classes.

Summary of the Invention

[0006] To overcome the shortcomings of the prior art, this invention provides a learning style recognition method and system based on intent weighting and dual regularization, which is used to realize fine-grained cross-modal interaction between text intent and behavioral features, eliminate channel redundancy, ensure semantic alignment, and significantly improve the ability to recognize rare learning styles and the overall classification accuracy.

[0007] To achieve the above objectives, one or more embodiments of the present invention provide the following technical solutions. The first aspect of this invention provides a learning style recognition method based on intent weighting and dual regularization, which includes:
Acquiring learners' numerical behavioral features and textual comment information, obtaining standardized numerical behavioral features and textual semantic intent vectors through preprocessing, and constructing a multimodal dataset;
Dividing the multimodal dataset into a training set and a test set, and performing class balancing on the training set so that the number of samples in each class is balanced;
Constructing a learning style recognition model and training it on the class-balanced training set to obtain a trained learning style recognition model. The intent-aware adaptive channel weighting module of the model takes the text semantic intent vector as a condition, introduces dual regularization constraints that explicitly constrain the weight generation matrix, and uses the generated channel gating weights to modulate the standardized numerical features element by element to obtain weighted numerical features, which are then concatenated with the text intent vector to form fused features. The fused features are input into the model's classification module, which outputs the learning style classification result;
Inputting the multimodal data of the learner to be identified into the trained learning style recognition model to obtain the learning style recognition result.

[0008] As a further technical solution, the numerical behavioral features and textual comment information are preprocessed to obtain standardized numerical behavioral features and textual semantic intent vectors, including: Z-score standardization is applied to each dimension of the numerical behavioral characteristics to give each dimension a distribution with zero mean and unit variance. Textual comment information is vectorized using TF-IDF to obtain high-dimensional sparse features. Then, the high-dimensional sparse features are reduced in dimensionality by truncated singular value decomposition to obtain a low-dimensional dense text semantic intent vector.

[0009] As a further technical solution, the multimodal dataset is divided into a training set and a test set, and the training set is subjected to class balancing processing, including: A stratified random sampling strategy is used to divide the multimodal dataset into a training set and a test set; The SMOTE oversampling method is used to synthesize additional minority class samples in the training set so that the number of samples in each class is balanced. The synthesized samples are generated as follows: in the feature space of the minority class, select the k nearest neighbors of a minority class sample, and perform linear interpolation along the line connecting that sample and a randomly selected nearest neighbor to generate a new synthesized sample.

[0010] As a further technical solution, the intent-aware adaptive channel weighting module uses the text semantic intent vector as a condition, introduces double regularization constraints to impose explicit constraints on the weight generation matrix, and uses the generated channel-gated weights to modulate the standardized numerical features element-wise to obtain weighted numerical features, which are then concatenated with the text intent vector to form fused features, including: The text semantic intent vector is mapped to channel weight coefficients corresponding to the standardized numerical feature dimensions by using a weight generation matrix and a bias vector:

$$z = W t + b$$

[0011] where $z$ is the raw activation vector after linear mapping; $t$ is the text semantic intent vector; $W$ is the weight generation matrix; and $b$ is the bias vector. The final channel gating weight vector is generated by applying the Sigmoid activation function to the activation $z$:

$$w = \sigma(z)$$

[0012] where $\sigma(\cdot)$ is the Sigmoid function and $w$ is the generated channel weight vector. Based on the obtained channel gating weights, the weights are applied to the standardized numerical features through element-wise multiplication to complete the feature modulation guided by intent:

$$\tilde{x} = w \odot x$$

[0013] where $\odot$ denotes element-wise multiplication, $x$ is the standardized numerical feature vector, and $\tilde{x}$ is the weighted numerical feature vector. The weighted numerical features are concatenated with the original text semantic intent vector to obtain the fused features, represented as follows:

$$f = [\tilde{x} \,\|\, t]$$

[0014] where $[\cdot \,\|\, \cdot]$ denotes the vector concatenation operation; $f \in \mathbb{R}^{d_n + d_t}$ is the fused feature vector; $d_n$ is the number of dimensions of the standardized numerical behavioral features; and $d_t$ is the dimension of the text semantic intent vector.
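The gating and fusion steps in paragraphs [0010] to [0014] can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the function name `intent_gated_fusion`, the shape of `W` (here `(d_n, d_t)`), the random initialization, and the example sizes of 17 numerical and 32 intent dimensions (taken from the first embodiment) are choices made for the sketch, not details fixed by this passage.

```python
import numpy as np

def sigmoid(z):
    """Logistic function, maps each activation into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def intent_gated_fusion(x, t, W, b):
    """Illustrative sketch of intent-guided channel gating.

    x: standardized numerical features, shape (d_n,)
    t: text semantic intent vector, shape (d_t,)
    W: weight generation matrix, assumed shape (d_n, d_t)
    b: bias vector, shape (d_n,)
    """
    z = W @ t + b                            # linear mapping of the intent vector
    w = sigmoid(z)                           # channel gating weights in (0, 1)
    x_weighted = w * x                       # element-wise modulation
    fused = np.concatenate([x_weighted, t])  # [x_weighted || t]
    return w, x_weighted, fused

# Hypothetical inputs; in training, W and b would be learned parameters.
rng = np.random.default_rng(0)
d_n, d_t = 17, 32
x = rng.normal(size=d_n)
t = rng.normal(size=d_t)
W = rng.normal(scale=0.1, size=(d_n, d_t))
b = np.zeros(d_n)

w, x_weighted, fused = intent_gated_fusion(x, t, W, b)
print(fused.shape)  # (49,)
```

Because the Sigmoid output lies strictly between 0 and 1, each numerical channel is attenuated by a different factor depending on the intent vector, rather than being rescaled uniformly.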

[0015] As a further technical solution, the dual regularization constraint includes channel orthogonal regularization constraint and cross-modal semantic consistency constraint; The channel orthogonal regularization constraint is used to make the column vectors of the matrix used to generate weights in the intention-aware adaptive channel weighting module tend to be orthogonal. The cross-modal semantic consistency constraint is used to ensure that the weighted numerical features are aligned with the text semantic intent vector in the semantic direction.
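As a rough illustration of what a channel orthogonality constraint can look like, the sketch below penalizes the squared off-diagonal entries of the Gram matrix of the unit-normalized columns of a matrix. This is a commonly used form and is an assumption here: the exact loss used by the invention, and the orientation of the weight generation matrix, are not given in this passage. The function name `column_orthogonality_penalty` is hypothetical.

```python
import numpy as np

def column_orthogonality_penalty(W, eps=1e-12):
    """Sum of squared cosine similarities between distinct columns of W.

    The value is 0 when all columns are mutually orthogonal and grows as
    columns become correlated. Assumed form, for illustration only.
    """
    norms = np.linalg.norm(W, axis=0, keepdims=True)
    W_unit = W / np.maximum(norms, eps)        # unit-length columns
    gram = W_unit.T @ W_unit                   # pairwise cosine similarities
    off_diag = gram - np.diag(np.diag(gram))   # drop self-similarity terms
    return float(np.sum(off_diag ** 2))

orthogonal = np.eye(3)
redundant = np.array([[1.0, 1.0],
                      [0.0, 0.0]])             # two identical columns

print(column_orthogonality_penalty(orthogonal))  # 0.0
print(column_orthogonality_penalty(redundant))   # 2.0
```

Minimizing such a term alongside the classification loss pushes the columns apart, which is the behavior the constraint is described as encouraging.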

[0016] As a further technical solution, the method also includes optimizing the learning style recognition model using a comprehensive loss function; the comprehensive loss function is a weighted sum of classification loss, channel orthogonality regularization loss, and cross-modal semantic consistency loss, as shown below:

$$\mathcal{L} = \mathcal{L}_{\mathrm{cls}} + \lambda_1 \mathcal{L}_{\mathrm{orth}} + \lambda_2 \mathcal{L}_{\mathrm{sem}}$$

[0017] where $\mathcal{L}$ is the overall loss function; $\mathcal{L}_{\mathrm{cls}}$ is the classification loss; $\mathcal{L}_{\mathrm{orth}}$ is the channel orthogonal regularization loss; $\mathcal{L}_{\mathrm{sem}}$ is the cross-modal semantic consistency loss; and $\lambda_1$, $\lambda_2$ are the weight coefficients for the channel orthogonal regularization and cross-modal semantic consistency constraints, respectively, used to control the influence of the two regularization losses relative to the classification loss.
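The weighted sum above is straightforward to express in code. The sketch below assumes softmax cross-entropy as the classification loss and treats the two regularization losses as already computed scalars; the coefficient defaults `lambda_orth=0.01` and `lambda_sem=0.1` are placeholder values chosen for illustration, not values stated in the source.

```python
import numpy as np

def cross_entropy(logits, labels):
    """Mean softmax cross-entropy for integer class labels (assumed L_cls)."""
    shifted = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(-log_probs[np.arange(len(labels)), labels].mean())

def total_loss(l_cls, l_orth, l_sem, lambda_orth=0.01, lambda_sem=0.1):
    """Comprehensive loss: classification loss plus two weighted regularizers."""
    return l_cls + lambda_orth * l_orth + lambda_sem * l_sem

# Hypothetical values: 2 samples, 4 learning style classes, uniform logits.
logits = np.zeros((2, 4))
labels = np.array([0, 3])
l_cls = cross_entropy(logits, labels)          # equals log(4) for uniform logits
loss = total_loss(l_cls, l_orth=2.0, l_sem=0.5)
```

Setting either coefficient to zero recovers training without that regularizer, which is a convenient way to run ablations.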

[0018] A second aspect of the present invention provides a learning style recognition system based on intent weighting and dual regularization.

[0019] A learning style recognition system based on intent weighting and dual regularization includes: The data acquisition and preprocessing module is configured to: acquire learners' numerical behavioral features and textual comment information, obtain standardized numerical behavioral features and textual semantic intent vectors through preprocessing, and construct a multimodal dataset; The dataset partitioning and balancing module is configured to: partition the multimodal dataset into a training set and a test set, and perform class balancing on the training set to ensure that the number of samples in each class is balanced; The model building and training module is configured to: build a learning style recognition model and train it using the training set after class balancing to obtain a trained learning style recognition model. The intent-aware adaptive channel weighting module of the learning style recognition model uses the text semantic intent vector as a condition, introduces double regularization constraints to impose explicit constraints on the weight generation matrix, and uses the generated channel gating weights to modulate the standardized numerical features element by element to obtain weighted numerical features, which are then concatenated with the text intent vector to form fusion features. The fusion features are then input into the model's classification module to output the classification result of the learning style. The recognition result prediction module is configured to input the multimodal data of the learner to be identified into the trained learning style recognition model to obtain the recognition result of the learning style.

[0020] A third aspect of the present invention provides a computer-readable storage medium having a program stored thereon, which, when executed by a processor, implements the steps of a learning style recognition method based on intent weighting and dual regularization as described in the first aspect of the present invention.

[0021] A fourth aspect of the present invention provides an electronic device, including a memory, a processor, and a program stored in the memory and executable on the processor, wherein the processor executes the program to implement the steps of a learning style recognition method based on intent weighting and dual regularization as described in the first aspect of the present invention.

[0022] The fifth aspect of the present invention provides a computer program product, including a computer program / instruction that, when executed by a processor, implements the steps of the learning style recognition method based on intent weighting and dual regularization described in the first aspect of the present invention.

[0023] The above one or more technical solutions have the following beneficial effects: (1) This invention achieves deep cross-modal interaction between text semantic intent and numerical behavioral features through an intent-aware adaptive channel weighting mechanism, which can accurately transform learner subjective preferences into directional modulation signals of behavioral features. Compared with traditional methods such as simple splicing and single-modal modeling, this mechanism allows text intent to actively guide the activation and suppression of numerical features, enabling the same behavioral data to present differentiated expressions under different learning intents, significantly improving feature discrimination ability and model expression depth, and effectively solving the industry pain point that multimodal information cannot be synergistically enhanced.

[0024] (2) This invention introduces a channel orthogonal regularization constraint that drives the columns of the weight generation matrix toward low correlation and orthogonality, structurally reducing channel redundancy. The constraint prevents the weighting mechanism from degenerating into globally uniform scaling, so that each numerical feature channel obtains an independent, differentiated modulation weight. This enables the model to focus on the key behavioral dimensions of different learning styles, improves feature discrimination and model stability, reduces parameter redundancy and overfitting risk, and enhances the model's generalization ability on complex educational data.

[0025] (3) This invention employs cross-modal semantic consistency constraints to achieve explicit alignment between the text intent space and the weighted behavioral feature space, ensuring the semantic accuracy of information transmission. This constraint prevents the model from ignoring text intent and relying on classification shortcuts during training, ensuring that the weighted results always match the learner's true subjective inclinations, improving the reliability and rationality of cross-modal fusion, and enabling the model to maintain stable output when faced with unseen intent-behavior combinations, greatly enhancing its robustness and credibility in practical applications.

[0026] (4) This invention combines dual regularization with a class balancing strategy, mitigating the classification imbalance that arises under the long-tail distribution of educational data. With channel redundancy removed and semantic alignment enforced, the model can capture the subtle feature differences of rare learning styles more efficiently, avoids having its gradient direction dominated by majority class samples, and improves the recognition ability of each class in a balanced manner without sacrificing overall classification accuracy. At the same time, the model structure is lightweight and computationally efficient, which can meet the real-time recognition and large-scale deployment requirements of online education platforms and has strong engineering practicality.

[0027] Advantages of additional aspects of the invention will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention.

Attached Figure Description

[0028] The accompanying drawings, which form part of this invention, are used to provide a further understanding of the invention. The illustrative embodiments of the invention and their descriptions are used to explain the invention and do not constitute an improper limitation of the invention.

[0029] Figure 1 is a flowchart of the method in the first embodiment.

[0030] Figure 2 is a schematic diagram of the adaptive channel weight distribution under different learning styles in the first embodiment.

[0031] Figure 3 is a comparison diagram of the experimental results of the method of the present invention and the benchmark method in the first embodiment.

[0032] Figure 4 is a schematic diagram of the training loss convergence curve for the first embodiment.

[0033] Figure 5 is a system structure diagram of the second embodiment.

Detailed Implementation

[0034] It should be noted that the following detailed descriptions are exemplary and intended to provide further illustration of the invention. Unless otherwise specified, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention pertains.

[0035] It should be noted that the terminology used herein is for the purpose of describing particular implementations only and is not intended to limit the exemplary implementations of the present invention.

[0036] Where there is no conflict, the embodiments and features in the embodiments of the present invention can be combined with each other.

Example 1

[0037] This embodiment discloses a learning style recognition method based on intent weighting and dual regularization. It uses the semantic intent information contained in learner text comments as a conditional signal, and transforms the intent vector into channel-wise gated weights through a learnable weight generation matrix. This adaptively modulates numerical behavioral features, thereby achieving cross-modal interactive fusion where text intent guides behavioral feature selection. Furthermore, channel orthogonal regularization constraints eliminate channel redundancy in the weight generation matrix, and cross-modal semantic consistency constraints ensure the alignment of the intent vector and weighted features in the semantic space. These two regularization methods are synergistically optimized with the classification loss, forming a complete dual regularization training framework. This method, combined with the SMOTE class balancing strategy, significantly improves the recognition ability of rare learning styles while maintaining overall classification accuracy, providing accurate and reliable technical support for personalized education recommendations.

[0038] Specifically, as shown in Figure 1, the learning style recognition method based on intent weighting and double regularization includes: Step S1: Obtain learners' numerical behavioral features and textual comment information, obtain standardized numerical behavioral features and textual semantic intent vectors through preprocessing, and construct a multimodal dataset.

[0039] In the learning style recognition system, the learner's raw data contains two different modalities of information: one is numerical behavioral features, which record the learner's objective behavioral indicators on the platform; the other is textual comment information, which reflects the learner's subjective learning feelings and preferences.

[0040] In this embodiment, the original dataset is assumed to contain $N$ learner samples, each of which contains both numerical features and textual comments.

[0041] For numerical behavioral features, let the original numerical feature matrix be $X^{\mathrm{raw}} \in \mathbb{R}^{N \times d_n}$, where $d_n$ is the number of dimensions of the numerical features. Each learner's numerical feature vector covers a variety of learning behavior indicators, including time-consuming features such as image content learning time, total video viewing time, text reading time, and audio material learning time; learning participation features such as the number of messages posted and the number of times participating in group discussions; and learning effectiveness features such as the number of correct answers to standard questions and practice accuracy. Different types of behavioral features differ significantly in numerical scale: time-based features are typically large, measured in seconds or minutes, while proportional features like accuracy are distributed between zero and one. Directly using the raw numerical values would therefore lead to large-scale features dominating the model optimization direction in subsequent calculations. To eliminate the influence of scale differences, Z-score standardization is performed on each dimension $j$ of the original numerical feature matrix. First, calculate the mean $\mu_j$ and standard deviation $\sigma_j$ of all samples along this dimension:

[0042]

$$\mu_j = \frac{1}{N}\sum_{i=1}^{N} x^{\mathrm{raw}}_{ij}, \qquad \sigma_j = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(x^{\mathrm{raw}}_{ij} - \mu_j\right)^2}$$

[0043] where $x^{\mathrm{raw}}_{ij}$ denotes the original value of the $i$-th learner in the $j$-th feature dimension. The standardization transformation formula is:

$$x_{ij} = \frac{x^{\mathrm{raw}}_{ij} - \mu_j}{\sigma_j}$$

[0044] After standardization, the numerical feature matrix $X \in \mathbb{R}^{N \times d_n}$ is obtained. Each of its dimensions has zero mean and unit variance, which ensures the numerical compatibility of different types of features in subsequent model calculations.
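A minimal NumPy sketch of the per-dimension Z-score standardization follows. The helper names `zscore_fit` and `zscore_apply`, the toy data, the use of the population standard deviation, and the guard for constant columns are assumptions made for the example; the source does not say how zero-variance dimensions are handled.

```python
import numpy as np

def zscore_fit(X_raw):
    """Per-column mean and population standard deviation (ddof=0)."""
    mu = X_raw.mean(axis=0)
    sigma = X_raw.std(axis=0)
    # Assumed guard: leave constant columns unscaled to avoid division by zero.
    sigma = np.where(sigma == 0, 1.0, sigma)
    return mu, sigma

def zscore_apply(X_raw, mu, sigma):
    """Standardize each column with previously fitted statistics."""
    return (X_raw - mu) / sigma

# Toy data: column 0 is a duration in seconds, column 1 is an accuracy in [0, 1].
X_raw = np.array([[1200.0, 0.50],
                  [1800.0, 0.75],
                  [ 600.0, 0.90],
                  [2400.0, 0.65]])

mu, sigma = zscore_fit(X_raw)
X = zscore_apply(X_raw, mu, sigma)
```

Keeping the fit and apply steps separate makes it possible to compute the statistics once and reuse them on new learners at recognition time.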

[0045] For text-based comment information, the raw data consists of a segment of natural language text corresponding to each learner, denoted as $c_i$ for the $i$-th learner. The text contains learners' subjective evaluations of course content, teaching methods, and learning experiences. This information implies learning intentions and preferences that are difficult to reflect directly with numerical features. To transform unstructured text data into numerical semantic representations that can be used for model computation, this embodiment employs a two-stage feature extraction process combining TF-IDF vectorization and truncated singular value decomposition.

[0046] The first stage is TF-IDF vectorization. The TF-IDF method measures the importance of each term to a document by multiplying the term frequency by the inverse document frequency. For the $i$-th text $c_i$ in the text set and a term $v$, the TF-IDF value is calculated as follows. First, calculate the term frequency $\mathrm{tf}(v, c_i)$, which indicates how frequently the term $v$ occurs in the text $c_i$:

$$\mathrm{tf}(v, c_i) = \frac{n_{v,i}}{\sum_{u} n_{u,i}}$$

[0047] where $n_{v,i}$ is the number of occurrences of the term $v$ in the text $c_i$, and the denominator is the total number of occurrences of all terms in the text $c_i$.

[0048] Then, calculate the inverse document frequency $\mathrm{idf}(v)$, which measures the rarity of the term $v$ within the entire text collection:

$$\mathrm{idf}(v) = \log\frac{N}{1 + \mathrm{df}(v)}$$

[0049] where $N$ is the total number of texts in the text collection and $\mathrm{df}(v)$ is the number of texts that contain the term $v$; one is added to the denominator to avoid division by zero.

[0050] Multiplying the two together yields the TF-IDF weight of the term $v$ for the text $c_i$:

$$\mathrm{tfidf}(v, c_i) = \mathrm{tf}(v, c_i) \cdot \mathrm{idf}(v)$$

[0051] In practice, a bag-of-words model incorporating both unigrams and bigrams is employed, considering both individual words and combinations of adjacent words as feature terms to capture richer phrase-level semantic information. The maximum number of feature terms is set to 300, and English stop words are filtered out. Finally, a 300-dimensional TF-IDF sparse vector is generated for each text. The TF-IDF vectors of all samples are combined to obtain the text feature matrix $A \in \mathbb{R}^{N \times 300}$.
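The TF-IDF computation described above can be sketched with the Python standard library alone. The example uses whitespace tokenization and unigram plus bigram terms; the function names and sample comments are invented for illustration, and stop-word filtering and the 300-term vocabulary cap are left out to keep the sketch short.

```python
import math
from collections import Counter

def ngrams(text):
    """Lower-cased unigrams plus bigrams of adjacent words."""
    tokens = text.lower().split()
    bigrams = [" ".join(pair) for pair in zip(tokens, tokens[1:])]
    return tokens + bigrams

def tfidf(texts):
    """Per-text dict of term -> tf * idf, with idf = log(N / (1 + df))."""
    docs = [Counter(ngrams(text)) for text in texts]
    n_docs = len(docs)
    # Document frequency: in how many texts each term appears at least once.
    df = Counter(term for doc in docs for term in doc)
    weights = []
    for doc in docs:
        total = sum(doc.values())  # total term occurrences in this text
        weights.append({
            term: (count / total) * math.log(n_docs / (1 + df[term]))
            for term, count in doc.items()
        })
    return weights

# Hypothetical learner comments.
texts = [
    "videos help me learn",
    "audio lectures help me",
    "i like diagrams",
    "hands on practice",
]
weights = tfidf(texts)
```

With the denominator written as one plus the document frequency, a term that appears in every text receives a slightly negative weight, and a term that appears in all but one text receives a weight of zero; library implementations often use a different smoothing for this reason.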

[0052] The second stage involves truncated singular value decomposition (SVD) for dimensionality reduction. Since the 300-dimensional TF-IDF vector is highly sparse and high-dimensional, directly using it for subsequent model calculations is not only inefficient but also prone to introducing noise and overfitting. Truncated singular value decomposition is therefore used to reduce the dimensionality of the TF-IDF matrix, compressing it into a low-dimensional, dense semantic intent vector. Truncated singular value decomposition approximately decomposes the matrix $A$ into the product of three matrices:

$$A \approx U_k \Sigma_k V_k^{\top}$$

[0053] where $U_k \in \mathbb{R}^{N \times k}$ is the left singular vector matrix; $\Sigma_k \in \mathbb{R}^{k \times k}$ is the diagonal matrix consisting of the $k$ largest singular values; $V_k \in \mathbb{R}^{300 \times k}$ is the right singular vector matrix; and $k$ is the number of singular values to be retained. In this embodiment, $k$ is set to 32, meaning the first 32 principal singular value components are retained. The matrix $T = U_k \Sigma_k \in \mathbb{R}^{N \times 32}$ holds the text semantic intent vector of each learner. Its computation is equivalent to projecting the original TF-IDF vector onto a low-dimensional semantic subspace spanned by the first 32 right singular vectors:

$$t_i = V_k^{\top} a_i$$

[0054] where $a_i$ is the TF-IDF vector of the $i$-th learner. Truncated singular value decomposition thus yields the text semantic intent matrix $T$, each row $t_i$ of which is the corresponding learner's 32-dimensional semantic intent vector. This vector encodes the learning preferences and intent tendencies expressed in the learner's text comments in a compact and dense form, removing the sparse and redundant information in the original TF-IDF representation and retaining the most discriminative semantic components.
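The equivalence between the truncated factors and the projection onto the leading right singular vectors can be checked numerically. The sketch below uses `numpy.linalg.svd` on a random sparse-like matrix standing in for the TF-IDF matrix; the function name, the matrix contents, and the sample count of 50 are assumptions for illustration.

```python
import numpy as np

def truncated_svd_project(A, k):
    """Project rows of A onto the top-k right singular vectors.

    Returns the reduced matrix T = A @ V_k together with V_k, U_k and the
    top-k singular values, so the identity T = U_k * s_k can be verified.
    """
    U, s, Vt = np.linalg.svd(A, full_matrices=False)  # singular values sorted descending
    V_k = Vt[:k].T            # (n_features, k)
    T = A @ V_k               # (n_samples, k) dense intent vectors
    return T, V_k, U[:, :k], s[:k]

# Stand-in for a TF-IDF matrix: 50 texts, 300 terms, about 5% non-zero entries.
rng = np.random.default_rng(0)
A = rng.random((50, 300)) * (rng.random((50, 300)) < 0.05)

T, V_k, U_k, s_k = truncated_svd_project(A, k=32)
print(T.shape)  # (50, 32)
```

For large sparse matrices a full SVD is wasteful, and an iterative or randomized truncated solver would normally be preferred; the full decomposition is used here only because it makes the identity easy to verify.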

[0055] After the above preprocessing steps, the original multimodal learner data is transformed into two sets of normalized feature representations: 17-dimensional standardized numerical behavioral features $x_i \in \mathbb{R}^{17}$ and 32-dimensional text semantic intent vectors $t_i \in \mathbb{R}^{32}$. Together, these characterize learners from both objective behavior and subjective intention, providing high-quality multimodal feature inputs for subsequent intention-guided adaptive weighting and dual regularization optimization.

[0056] Step S2: Divide the multimodal dataset into a training set and a test set, and perform class balancing on the training set to ensure that the number of samples in each class is balanced.

[0057] Before training the model, the preprocessed dataset needs to be properly divided, and special measures need to be taken to address the class imbalance problem that is common in learning style data, so as to ensure that the model can obtain sufficient and balanced training signals in each class.

[0058] Step S2.1: First, the constructed multimodal dataset is divided into training and test sets. Each sample consists of three parts: the standardized numerical features $x_i$, the text semantic intent vector $t_i$, and the learning style category label $y_i \in \{1, \dots, C\}$, where $C$ is the total number of learning style categories. A stratified random sampling strategy is used to divide the dataset into training and test sets at a ratio of 75% and 25%, respectively. Stratified sampling ensures that the proportion of samples of each category in the training and test sets is consistent with the original dataset, avoiding the complete absence of some rare categories in the test set due to random partitioning.

[0059] Suppose that the training set after partitioning contains $N_{\mathrm{tr}}$ samples and the test set contains $N_{\mathrm{te}}$ samples, satisfying $N_{\mathrm{tr}} + N_{\mathrm{te}} = N$, where $N$ is the total number of samples. The numerical features, text features, and labels of the training set are denoted as $X_{\mathrm{tr}}$, $T_{\mathrm{tr}}$, and $y_{\mathrm{tr}}$. The corresponding data for the test set is denoted as $X_{\mathrm{te}}$, $T_{\mathrm{te}}$, and $y_{\mathrm{te}}$.
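A stratified 75/25 split can be sketched by shuffling the indices of each class separately and cutting each class at the same ratio. The function name `stratified_split`, the seed, and the class counts in the example are hypothetical.

```python
import numpy as np

def stratified_split(y, test_ratio=0.25, seed=0):
    """Return (train_idx, test_idx) that preserve per-class proportions."""
    rng = np.random.default_rng(seed)
    train_idx, test_idx = [], []
    for c in np.unique(y):
        idx = rng.permutation(np.flatnonzero(y == c))  # shuffle within the class
        n_test = int(round(test_ratio * len(idx)))
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    return np.array(train_idx), np.array(test_idx)

# Hypothetical imbalanced labels for four learning styles.
y = np.array([0] * 40 + [1] * 40 + [2] * 12 + [3] * 8)
train_idx, test_idx = stratified_split(y)

print(np.bincount(y[test_idx]))   # [10 10  3  2]
print(np.bincount(y[train_idx]))  # [30 30  9  6]
```

Every class keeps the same 3:1 train-to-test ratio, so even the smallest class is represented in both subsets.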

[0060] Step S2.2: Since learning style data in online education scenarios typically exhibits a severe class imbalance distribution, taking four typical learning styles as an example, balanced or visual learners may constitute the majority of the total sample, while auditory, kinesthetic, and other style learners have very few samples, with some categories possibly accounting for less than 5% of the total. If the model is trained directly on such imbalanced data, the optimization process will be dominated by the gradient signals of the majority class. The model tends to predict most samples as the majority class to obtain a higher overall accuracy, but its ability to identify rare classes is extremely weak. To address this issue, this embodiment uses the SMOTE oversampling method on the training set to synthesize and augment the minority class samples.

[0061] The core idea of ​​the SMOTE method is to generate new synthetic samples in the feature space of minority class samples through linear interpolation, rather than simply copying existing samples. For minority class samples in the training set... SMOTE first finds its [specific characteristics] in the sample set of this category. Let there be nearest neighbor samples. The The nearest neighbor samples are ,in The formula for generating the synthetic sample is:

[0062] where $\lambda$ is an interpolation coefficient randomly sampled from the uniform distribution $U(0, 1)$, and $x_i^{(j)} - x_i$ is the direction vector from the current sample $x_i$ to its nearest neighbor $x_i^{(j)}$. By drawing different values of $\lambda$, the synthetic samples are distributed along the line segment connecting the two real samples. This maintains the class characteristics of the samples while introducing a moderate degree of diversity, thus avoiding the overfitting risk caused by simple replication.
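A minimal sketch of this interpolation step is shown below, assuming Euclidean distance for the neighbor search. The function name `smote_samples` and the toy points are hypothetical; a production system would more likely rely on an existing SMOTE implementation.

```python
import numpy as np

def smote_samples(X_min, n_new, k, seed=0):
    """Generate n_new synthetic rows from the minority-class rows X_min."""
    rng = np.random.default_rng(seed)
    X_min = np.asarray(X_min, dtype=float)
    # Pairwise Euclidean distances inside the minority class.
    dist = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)                 # a sample is not its own neighbor
    neighbors = np.argsort(dist, axis=1)[:, :k]    # indices of the k nearest neighbors
    base = rng.integers(0, len(X_min), size=n_new)            # current samples
    pick = neighbors[base, rng.integers(0, k, size=n_new)]    # one neighbor each
    lam = rng.uniform(0.0, 1.0, size=(n_new, 1))              # interpolation coefficient
    return X_min[base] + lam * (X_min[pick] - X_min[base])

X_min = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
X_new = smote_samples(X_min, n_new=10, k=2)
```

Because each synthetic row is a convex combination of two real rows, every generated point in this toy example stays inside the unit square spanned by the originals.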

[0063] Furthermore, when implementing SMOTE, some rare categories may contain extremely few samples. If $k$ is set larger than the number of other samples actually available in such a category, the algorithm fails. Therefore, an adaptive nearest neighbor parameter selection strategy is adopted: $k$ is set to the smaller of the preset value and the minimum per-class sample count in the training set minus one:

$$k = \min\left( k_0,\ \min_{c \in \{1, \dots, C\}} n_c - 1 \right)$$

[0064] where $k_0$ is the default number of nearest neighbors, $n_c$ is the number of samples of category $c$ in the training set, and $C$ is the total number of categories.
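The adaptive choice of the neighbor count can be sketched in a few lines. The name `adaptive_k` and the default value 5 used in the calls are assumptions for illustration; the source does not state the preset value.

```python
import numpy as np

def adaptive_k(y_train, k_default):
    """Smaller of the preset k and (smallest class size - 1)."""
    counts = np.bincount(np.asarray(y_train))
    counts = counts[counts > 0]            # ignore label values that never occur
    return int(min(k_default, counts.min() - 1))

k_rare = adaptive_k([0] * 50 + [1] * 3, k_default=5)     # rarest class has 3 samples
k_plain = adaptive_k([0] * 50 + [1] * 20, k_default=5)   # preset value is safe
```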

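[0065] During oversampling, the numerical and text features of the training set are first concatenated along the feature dimension to form a unified feature matrix, whose 49 columns are the sum of the 17-dimensional numerical features and the 32-dimensional text features:

$$Z_{\text{train}} = \left[ X_{\text{train}},\ T_{\text{train}} \right] \in \mathbb{R}^{N_{\text{train}} \times 49}$$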

[0066] SMOTE oversampling is performed on the concatenated feature matrix and its corresponding labels so that all categories end up with the same number of samples. After oversampling, the balanced feature matrix is split back into numerical and text features. Let the balanced training set contain $N_{\text{bal}}$ samples; the balanced numerical features are $X_{\text{bal}} \in \mathbb{R}^{N_{\text{bal}} \times 17}$, the text features are $T_{\text{bal}} \in \mathbb{R}^{N_{\text{bal}} \times 32}$, and the category labels are $y_{\text{bal}}$, with every category in $y_{\text{bal}}$ having the same number of samples.

[0067] It should be noted that SMOTE oversampling is performed only on the training set, while the test set always keeps the original class distribution. Balancing the training set allows the model to fully learn the feature patterns of each learning style and avoids bias toward the majority class during training, while leaving the test set untouched ensures that the evaluation results reflect the model's recognition ability on real imbalanced data.

[0068] Step S3: Construct a learning style recognition model and train it using the training set after class balancing. Input the balanced training data after dataset partitioning and class balancing into the model's intent-aware adaptive channel weighting module.

[0069] Step S3.1: Let the standardized numerical feature vector of the $i$-th learner be $x_i \in \mathbb{R}^{d_x}$ and the text semantic intent vector be $t_i \in \mathbb{R}^{d_t}$, where $d_x$ is the numerical feature dimension and $d_t$ is the text intent vector dimension. The first step in intent-guided weighting is to map the intent vector to channel weight coefficients that correspond one-to-one with the numerical feature dimensions. This mapping is achieved through a weight generation matrix $W_g \in \mathbb{R}^{d_t \times d_x}$ and a bias vector $b_g \in \mathbb{R}^{d_x}$, and is calculated as follows:

$$a_i = t_i W_g + b_g$$

[0070] where $a_i \in \mathbb{R}^{d_x}$ is the raw activation vector after the linear mapping, and $t_i W_g$ is the matrix product of the intent vector (treated as a row vector) and the weight generation matrix. The weight generation matrix $W_g$ is the core bridge connecting the text semantic space and the numerical feature space: its $j$-th column vector $w_j$ determines how the weight of the $j$-th numerical feature channel is derived from a weighted sum of the semantic components of the intent vector. The matrix $W_g$ and the bias $b_g$ are both learnable parameters and are initialized during the model initialization phase, using a normal distribution with zero mean and a standard deviation of 0.05:

$$W_g \sim \mathcal{N}(0, \sigma^2), \quad \sigma = 0.05$$

[0071] where $\mathcal{N}(0, \sigma^2)$ denotes a normal distribution with zero mean and variance $\sigma^2$. A small initial standard deviation keeps the weight coefficients generated early in training within a reasonable range, so that extreme weight values do not disturb the stability of the model in the early training stages.

[0072] Step S3.2: The activation values $a_i$ obtained from the linear mapping then undergo a nonlinear transformation through the Sigmoid activation function, which compresses them into the interval between 0 and 1 and generates the final channel gating weight vector:

$$g_i = \sigma(a_i)$$

[0073] where $\sigma(z) = 1 / (1 + e^{-z})$ is the Sigmoid function and $g_i \in (0, 1)^{d_x}$ is the generated channel weight vector, whose $j$-th component $g_{i,j}$ is the gating coefficient of the $j$-th numerical feature channel. The choice of the Sigmoid function has a clear physical meaning: a weight value close to 1 indicates that the feature channel is "activated" or "preserved" under the current intent condition, a weight value close to 0 indicates that the channel is "suppressed" or "ignored", and a value in between indicates partial preservation. This gating mechanism ensures that the activation level of each numerical feature channel is directly determined by the learner's textual intent, so learners with different intent tendencies obtain different gating coefficients for the same numerical feature.

[0074] In practical calculations, to avoid numerical instability caused by exponential overflow, the input to the Sigmoid function is truncated to the range $[z_{\min}, z_{\max}]$:

$$g_i = \sigma\left( \mathrm{clip}(a_i,\ z_{\min},\ z_{\max}) \right)$$

[0075] where the $\mathrm{clip}$ function restricts the input value between a lower bound $z_{\min}$ and an upper bound $z_{\max}$; values outside this range are truncated to the boundary values.
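The truncated Sigmoid can be sketched as below. The symmetric bound of 50 is a hypothetical value chosen only for the illustration, since the source does not preserve the actual limits.

```python
import numpy as np

def gated_sigmoid(a, bound=50.0):
    """Sigmoid with its input clipped to [-bound, bound] to avoid exp overflow."""
    a = np.clip(np.asarray(a, dtype=float), -bound, bound)
    return 1.0 / (1.0 + np.exp(-a))

# Extreme activations stay finite after clipping.
g = gated_sigmoid(np.array([-1e6, 0.0, 1e6]))
```

Without the clip, `np.exp(1e6)` would overflow to infinity and emit a runtime warning; with it, the outputs simply saturate near 0 and 1.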

[0076] Step S3.3: After obtaining the channel gating weights, the weights are applied to the standardized numerical features through element-wise multiplication (Hadamard product) to complete the intent-guided feature modulation:

$$\tilde{x}_i = g_i \odot x_i$$

[0077] where $\odot$ denotes element-wise multiplication and $\tilde{x}_i$ is the weighted, modulated numerical feature vector. Expanded over each feature dimension, the above operation is equivalent to:

$$\tilde{x}_{i,j} = g_{i,j}\, x_{i,j}, \quad j = 1, \dots, d_x$$

[0078] That is, each numerical feature component is multiplied by its corresponding gating coefficient. When the gating coefficient of a feature dimension is large, the original feature value of that dimension is preserved almost completely; when the gating coefficient is small, the feature value of that dimension is significantly attenuated. The effect of this channel-wise modulation is that, after intent weighting, the relative importance of each of the 17 numerical feature dimensions of the same learner is redistributed according to the textual intent. For example, a learner who expresses an intent to "prefer video learning" in a comment may receive higher gating coefficients for the video-related numerical feature channels, while the channels related to text reading or practice are moderately suppressed, so that the weighted features reflect the learner's visual learning preference more prominently.

[0079] Step S3.4: The weighted numerical features are then concatenated with the original text semantic intent vector to form the final fused feature representation:

$$f_i = \left[ \tilde{x}_i \,\Vert\, t_i \right] \in \mathbb{R}^{d_x + d_t}$$

[0080] where $[\cdot \,\Vert\, \cdot]$ denotes vector concatenation. The fused feature vector has 49 dimensions: 17 dimensions of intent-weighted numerical features and 32 dimensions of the original text intent vector. The original intent vector is retained so that the fused features contain not only the behavioral information modulated by intent but also the complete semantic expression of the intent itself, providing a dual source of information for the subsequent classifier.
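Steps S3.1 to S3.4 can be traced on a toy batch as in the sketch below. The variable names (`W_g`, `b_g`, `x_tilde`, `f`) are hypothetical, the inputs are random placeholders, and the zero initial bias is an assumption made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(3, 17))                 # standardized numerical features
t = rng.normal(size=(3, 32))                 # text semantic intent vectors
W_g = rng.normal(0.0, 0.05, size=(32, 17))   # weight generation matrix
b_g = np.zeros(17)                           # assumption: bias starts at zero

g = 1.0 / (1.0 + np.exp(-(t @ W_g + b_g)))   # channel gating weights in (0, 1)
x_tilde = g * x                              # Hadamard product, channel by channel
f = np.concatenate([x_tilde, t], axis=1)     # fused features: 17 + 32 = 49 columns
```

Since every gate lies strictly between 0 and 1, modulation can only shrink a channel, and the last 32 columns of `f` carry the intent vector through unchanged.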

[0081] Step S3.5: The fused features are mapped to predicted scores for the learning style categories through a classification layer. The classification layer consists of a weight matrix $W_c \in \mathbb{R}^{(d_x + d_t) \times C}$ and a bias vector $b_c \in \mathbb{R}^{C}$, where $C$ is the number of learning style categories. After the linear mapping, the class probability distribution is generated using the Softmax function:

[0082] $$z_i = f_i W_c + b_c, \qquad p_{i,c} = \frac{\exp(z_{i,c})}{\sum_{c'=1}^{C} \exp(z_{i,c'})}$$

[0083] where $z_i$ is the unnormalized category score vector (logits) and $p_{i,c}$ is the predicted probability that the $i$-th sample belongs to the $c$-th learning style category; the probabilities of all categories sum to 1. In the Softmax calculation, to prevent exponential overflow, a numerical stabilization technique is used that subtracts the maximum value:

$$p_{i,c} = \frac{\exp\left( z_{i,c} - \max_{k} z_{i,k} \right)}{\sum_{c'=1}^{C} \exp\left( z_{i,c'} - \max_{k} z_{i,k} \right)}$$

[0084] This shift does not change the probability distribution, but it effectively avoids the overflow caused by exponential operations when the logit values are large. The final predicted learning style category is the category with the highest probability:

$$\hat{y}_i = \arg\max_{c} \; p_{i,c}$$
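The max-subtraction trick and the final category selection can be illustrated as follows. The function name `stable_softmax` and the deliberately large logits are assumptions for the example.

```python
import numpy as np

def stable_softmax(z):
    """Row-wise Softmax that subtracts the row maximum before exponentiating."""
    z = np.asarray(z, dtype=float)
    shifted = z - z.max(axis=-1, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=-1, keepdims=True)

# Logits this large would overflow a naive exp(); the shifted version is fine.
logits = np.array([[1000.0, 1001.0, 999.0, 998.0]])
p = stable_softmax(logits)
y_hat = p.argmax(axis=1)      # predicted category index per sample
```

Shifting every logit in a row by the same constant leaves the probabilities unchanged, which is why the stabilized form is safe to use everywhere.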

[0085] where $\hat{y}_i$ is the predicted learning style category of the $i$-th learner. The weight matrix $W_c$ and bias $b_c$ of the classification layer are also learnable parameters. Their initialization is consistent with that of $W_g$ and $b_g$, using a normal distribution with a standard deviation of 0.05 and a zero vector.

[0086] It is important to note that the learnable parameters of the entire intent-aware adaptive channel-weighted network are the weight generation matrix $W_g$ and its bias $b_g$, and the classification layer weights $W_c$ and its bias $b_c$. The network structure is simple and efficient, with a small total number of parameters. This avoids the overfitting risk that deep and complex networks carry on small-scale educational data, while the intent-guided gating mechanism provides cross-modal feature interaction that goes beyond simple concatenation. The module outputs three key sets of information: the category probability distribution $p_i$, the channel gating weights $g_i$, and the weighted numerical features $\tilde{x}_i$. These are used for the subsequent classification loss calculation, interpretability analysis, and dual regularization constraints, respectively.

[0087] Furthermore, based on the intent-aware adaptive channel weighting mechanism, a dual regularization constraint is introduced. This imposes explicit constraints on the weighting process at both the structural level of the weight generation matrix and the cross-modal semantic alignment level, ensuring the diversity of channel weights and the semantic accuracy of intent information transmission. The dual regularization consists of two parts: channel orthogonal regularization constraints and cross-modal semantic consistency constraints, providing solutions to the channel redundancy problem and semantic deviation problem identified in step S3.1, respectively.

[0088] (I) The first constraint is channel orthogonal regularization. The $j$-th column vector $w_j$ of the weight generation matrix $W_g$ determines how the weight of the $j$-th numerical feature channel is derived from the components of the intent vector. When the column vectors of $W_g$ are pairwise orthogonal, the weight generation directions of the channels are independent, and the weighting mechanism can provide substantially different modulation signals for different feature dimensions. Conversely, when the column vectors are highly correlated, the weights generated for different channels tend to coincide, and the weighting mechanism degenerates into an approximately uniform scaling operation.

[0089] To quantify the degree to which the column vectors deviate from the orthogonal state, this embodiment calculates the product of the transpose of $W_g$ and $W_g$ itself, $W_g^{\top} W_g$, and examines the difference between this product and the identity matrix $I$. The $(j, k)$-th element of the matrix $W_g^{\top} W_g$ is:

$$\left( W_g^{\top} W_g \right)_{jk} = \sum_{m=1}^{d_t} W_{mj} W_{mk} = w_j^{\top} w_k$$

[0090] where $W_{mj}$ is the element in row $m$ and column $j$ of $W_g$. When $j = k$, the value is the squared magnitude of the $j$-th column vector; when $j \neq k$, the value is the dot product of the $j$-th and $k$-th column vectors, which reflects the correlation between the directions of the two channels. If all column vectors are orthonormal, then $W_g^{\top} W_g$ is exactly equal to the identity matrix $I$. Based on this observation, the channel orthogonal regularization loss $\mathcal{L}_{\text{orth}}$ is defined as the squared Frobenius norm of the difference between $W_g^{\top} W_g$ and the identity matrix $I$:

$$\mathcal{L}_{\text{orth}} = \left\| W_g^{\top} W_g - I \right\|_F^2$$

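[0091] The Frobenius norm is defined as the square root of the sum of the squares of all elements of a matrix, i.e. $\| A \|_F = \sqrt{\sum_{j} \sum_{k} A_{jk}^2}$. Therefore, the above loss can be decomposed as follows:

$$\mathcal{L}_{\text{orth}} = \sum_{j=1}^{d_x} \left( \| w_j \|_2^2 - 1 \right)^2 + \sum_{j \neq k} \left( w_j^{\top} w_k \right)^2$$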

[0092] When $\mathcal{L}_{\text{orth}} = 0$, the column vectors of $W_g$ are mutually orthogonal and each has a magnitude of 1; the larger $\mathcal{L}_{\text{orth}}$ is, the higher the correlation between the column vectors or the further their magnitudes deviate from unit length. During training, the gradient of this loss term drives the column vectors of $W_g$ gradually toward orthogonal directions, thereby eliminating redundancy between channels.

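[0093] The gradient of $\mathcal{L}_{\text{orth}}$ with respect to the weight generation matrix $W_g$ has an analytical form, derived as follows. Let $E = W_g^{\top} W_g - I$; then $\mathcal{L}_{\text{orth}} = \mathrm{tr}(E^{\top} E)$, where $\mathrm{tr}(\cdot)$ denotes the trace of a matrix. Applying the chain rule of matrix differentiation, and using the symmetry of $E$:

$$\frac{\partial \mathcal{L}_{\text{orth}}}{\partial W_g} = 4\, W_g \left( W_g^{\top} W_g - I \right)$$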

[0094] This gradient expression indicates that when $W_g^{\top} W_g$ is close to the identity matrix, the gradient approaches zero and the loss reaches its minimum; when the deviation is large, the gradient magnitude increases accordingly, providing a stronger correction signal.
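The orthogonality loss and its closed-form gradient can be checked numerically. The sketch below assumes the loss is the squared Frobenius norm of the Gram matrix minus the identity, in which case the gradient is four times the matrix multiplied by that difference; the function names and the test entry (3, 5) are arbitrary choices for the illustration.

```python
import numpy as np

def orth_loss(W):
    """Squared Frobenius norm of (W^T W - I)."""
    E = W.T @ W - np.eye(W.shape[1])
    return float(np.sum(E ** 2))

def orth_grad(W):
    """Closed-form gradient 4 W (W^T W - I)."""
    E = W.T @ W - np.eye(W.shape[1])
    return 4.0 * W @ E

rng = np.random.default_rng(0)
W = rng.normal(0.0, 0.05, size=(32, 17))

# Central finite difference on a single entry as a sanity check.
h = 1e-5
Wp, Wm = W.copy(), W.copy()
Wp[3, 5] += h
Wm[3, 5] -= h
numeric = (orth_loss(Wp) - orth_loss(Wm)) / (2 * h)
analytic = orth_grad(W)[3, 5]

# A matrix with orthonormal columns should give (numerically) zero loss.
Q, _ = np.linalg.qr(rng.normal(size=(32, 17)))
loss_orthonormal = orth_loss(Q)
```

With 17 columns living in a 32-dimensional space, an exactly orthonormal set exists, so the loss can in principle be driven to zero.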

[0095] (II) The second constraint is the cross-modal semantic consistency constraint. The original intention of the intention-guided weighting mechanism is to make the weighted numerical features consistent with the input intention vector in the semantic direction. That is, the learning preference direction expressed by the text intention should be faithfully reflected in the weighted features.

[0096] To impose explicit constraints on this semantic alignment relationship, a cross-modal semantic consistency loss based on cosine similarity is introduced. Cosine similarity measures the degree of consistency between two vectors in direction, is not affected by the absolute length of the vectors, and is suitable for comparing vectors in text semantic spaces and numerical feature spaces at different numerical scales.

[0097] For the $i$-th sample, the text intent vector is $t_i \in \mathbb{R}^{d_t}$ and the weighted numerical features are $\tilde{x}_i \in \mathbb{R}^{d_x}$. Because the two vectors have different dimensions ($d_t = 32$ and $d_x = 17$), cosine similarity cannot be calculated directly over the full dimensions. Instead, the two vectors are aligned on their common dimension $d = \min(d_t, d_x) = 17$: the first $d$ dimensions of the intent vector, $t_i^{(1:d)}$, and all $d$ dimensions of the weighted features $\tilde{x}_i$ are used for the cosine similarity calculation.

[0098] It is important to note that although the truncated intent vector and the numerical features originally belong to different spaces in a physical sense, under the joint training framework, minimizing the consistency loss forces the first $d$ components of the text semantic subspace and the representation of the numerical features to align in a common implicit semantic subspace. The cosine similarity effectively serves as the directional anchor for this cross-modal latent space alignment. First, L2 normalization is performed on both vectors:

[0099] $$\hat{t}_i = \frac{t_i^{(1:d)}}{\left\| t_i^{(1:d)} \right\|_2 + \epsilon}, \qquad \hat{x}_i = \frac{\tilde{x}_i}{\left\| \tilde{x}_i \right\|_2 + \epsilon}$$

[0100] where $\| \cdot \|_2$ denotes the L2 norm, i.e. $\| v \|_2 = \sqrt{\sum_{j} v_j^2}$, and $\epsilon$ is a very small constant that prevents division by zero. L2 normalization maps the two vectors onto the unit hypersphere, eliminating the interference of length differences in the direction comparison. The cosine similarity between the two normalized vectors is calculated by element-wise multiplication and summation:

$$s_i = \sum_{j=1}^{d} \hat{t}_{i,j}\, \hat{x}_{i,j}$$

[0101] The cosine similarity takes values in $[-1, 1]$: a value of 1 indicates that the two vectors point in the same direction, a value of -1 indicates that the directions are completely opposite, and a value of 0 indicates that they are orthogonal and unrelated. The cross-modal semantic consistency loss $\mathcal{L}_{\text{cons}}$ is defined as the average degree of inconsistency of the cosine similarity over the training samples:

$$\mathcal{L}_{\text{cons}} = \frac{1}{B} \sum_{i=1}^{B} \left( 1 - s_i \right)$$

[0102] where $B$ is the number of samples in the current training batch. When the intent vector and the weighted features are perfectly aligned in direction, $s_i = 1$ and the loss is zero; the greater the directional deviation, the greater the loss. Minimizing $\mathcal{L}_{\text{cons}}$ forces the weighted numerical features to converge toward the intent vector in the semantic direction, ensuring the fidelity of the feature weighting process to the textual intent.
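A minimal sketch of the cosine-based consistency loss follows, assuming the intent vector is truncated to the length of the weighted features and that the loss averages one minus the cosine similarity. The function name and the value of `eps` are hypothetical.

```python
import numpy as np

def consistency_loss(t, x_tilde, eps=1e-8):
    """Mean of (1 - cosine similarity) over the batch, on the common dimensions."""
    d = min(t.shape[1], x_tilde.shape[1])
    t_d, x_d = t[:, :d], x_tilde[:, :d]
    t_hat = t_d / (np.linalg.norm(t_d, axis=1, keepdims=True) + eps)
    x_hat = x_d / (np.linalg.norm(x_d, axis=1, keepdims=True) + eps)
    cos = np.sum(t_hat * x_hat, axis=1)
    return float(np.mean(1.0 - cos)), cos

rng = np.random.default_rng(0)
t = rng.normal(size=(5, 32))
loss_same, cos_same = consistency_loss(t, 3.0 * t[:, :17])   # same direction
loss_opp, cos_opp = consistency_loss(t, -t[:, :17])          # opposite direction
```

The two toy inputs bracket the range of the loss: aligned vectors give a value near 0 regardless of their length, and opposed vectors give a value near 2.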

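[0103] The gradient of $\mathcal{L}_{\text{cons}}$ with respect to the weight generation matrix $W_g$ must be propagated layer by layer using the chain rule. The weighted features are $\tilde{x}_i = g_i \odot x_i$, where the gating weights are $g_i = \sigma(t_i W_g + b_g)$; therefore $\mathcal{L}_{\text{cons}}$ depends on $W_g$ indirectly through $g_i$. The backpropagation path of the gradient is as follows. First, the gradient of $\mathcal{L}_{\text{cons}}$ with respect to the weighted features $\tilde{x}_i$ is calculated; it is approximated by the negative direction of the normalized intent vector:

$$\frac{\partial \mathcal{L}_{\text{cons}}}{\partial \tilde{x}_i} \approx -\frac{1}{B}\, \hat{t}_i$$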

[0104] Here, an approximate gradient direction is adopted by ignoring the complex derivative terms of the norm denominator. This not only significantly reduces the gradient computation complexity in the cross-modal alignment process, but also accurately preserves the core driving force that pushes numerical features toward the semantic intent of the text. It also effectively avoids the numerical instability and gradient explosion problems that are easily caused by the derivative of the normalization term at the minimum value, and significantly improves the overall robustness of training.

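[0105] Then, the result is passed to the gating weights $g_i$ via the chain rule of element-wise multiplication:

$$\frac{\partial \mathcal{L}_{\text{cons}}}{\partial g_i} = \frac{\partial \mathcal{L}_{\text{cons}}}{\partial \tilde{x}_i} \odot x_i$$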

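[0106] The gradient is then passed through the derivative of the Sigmoid function to the linear activation value $a_i$. The derivative of the Sigmoid function is $\sigma'(a_i) = g_i \odot (\mathbf{1} - g_i)$, where $\mathbf{1}$ is the all-ones vector with the same dimension as $g_i$. Therefore:

$$\frac{\partial \mathcal{L}_{\text{cons}}}{\partial a_i} = \frac{\partial \mathcal{L}_{\text{cons}}}{\partial g_i} \odot g_i \odot \left( \mathbf{1} - g_i \right)$$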

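[0107] Finally, the gradient with respect to $W_g$ is obtained by summing the contributions of all samples, where $t_i$ is treated as a row vector so that each term is a $d_t \times d_x$ outer product:

$$\frac{\partial \mathcal{L}_{\text{cons}}}{\partial W_g} = \sum_{i=1}^{B} t_i^{\top}\, \frac{\partial \mathcal{L}_{\text{cons}}}{\partial a_i}$$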
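The approximate backpropagation chain for the consistency loss can be sketched as below. This assumes the approximation described in the text (the gradient with respect to the weighted features is taken as the negated normalized intent vector) plus a division by the batch size, which is an assumption made here to match a batch-mean loss. All names are hypothetical, and the intent dimension is assumed to be at least as large as the numerical dimension.

```python
import numpy as np

def consistency_grad_W(x, t, W_g, b_g, eps=1e-8):
    """Approximate gradient of the consistency loss w.r.t. W_g, plus the
    per-sample gradients w.r.t. the linear activations."""
    B, d = x.shape
    g = 1.0 / (1.0 + np.exp(-(t @ W_g + b_g)))      # gating weights
    t_d = t[:, :d]                                  # first d intent components
    t_hat = t_d / (np.linalg.norm(t_d, axis=1, keepdims=True) + eps)
    d_xt = -t_hat / B                 # approximation: norm-denominator terms ignored
    d_g = d_xt * x                    # back through the element-wise product
    d_a = d_g * g * (1.0 - g)         # back through the Sigmoid
    return t.T @ d_a, d_a             # sum over samples of outer products

rng = np.random.default_rng(0)
x, t = rng.normal(size=(6, 17)), rng.normal(size=(6, 32))
W_g, b_g = rng.normal(0.0, 0.05, size=(32, 17)), np.zeros(17)
grad_W, d_a = consistency_grad_W(x, t, W_g, b_g)
```

The single matrix product `t.T @ d_a` is the vectorized form of summing one outer product per sample, which keeps the cost of this term small.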

[0108] The gradient calculation process described above is clear and complete. All operations are standard matrix and vector operations, which can be calculated together with the gradients of classification loss and orthogonality loss during training without introducing significant additional computational overhead.

[0109] Channel orthogonal regularization and the cross-modal semantic consistency constraint constrain the intent-guided weighting process from two complementary perspectives. The orthogonal constraint acts on the internal structure of the weight generation matrix $W_g$ and ensures that the weight generation directions of the channels are independent of each other. This is a prior structural constraint on the model parameters, independent of specific input data. The consistency constraint, on the other hand, acts on the semantic relationship between the input and output of the weighting process, ensuring that the semantic direction of the intent vector is maintained in the weighted features. This is a posterior semantic constraint on the model behavior, directly related to the quality of cross-modal information transmission for each specific sample. Figure 2 shows the channel-wise gating weight distributions generated by the model under the dual regularization constraints for four learning styles: visual, auditory, kinesthetic, and balanced. In Figure 2, the horizontal axis labels I to XVII represent image learning time, video viewing time, text reading time, audio learning time, number of messages posted, number of group discussions, number of correct answers to standard questions, accuracy rate of practice, frequency of resource access, number of button clicks, number of video pauses, number of resource skips, number of forum posts, number of assignment submissions, total online time, number of logins, and course completion progress, respectively. The channel weights for different learning styles show a clearly differentiated distribution: visual learners have markedly higher weights in the video viewing time and image learning time channels; kinesthetic learners have higher weights in the practical operation and practice completion channels; auditory learners have prominent weights in the audio learning channels; and balanced learners have relatively even weight distributions across all channels. This distribution illustrates that intent-guided weighting can focus on the key feature channels for each learning style, which makes the model interpretable and provides a basis for teaching decisions.

[0110] After determining the specific forms of the intent-aware adaptive channel weighting mechanism and the dual regularization constraint, it is necessary to integrate the classification loss and the two regularization losses into a unified comprehensive loss function, and design corresponding training optimization strategies so that the three optimization objectives can work together during the training process.

[0111] The classification loss uses the multi-class cross-entropy loss function, which measures the difference between the model's predicted class probability distribution and the true label. For the $i$-th training sample with true label $y_i$, the label is first converted into a one-hot encoded representation $\mathbf{y}_i \in \{0, 1\}^{C}$, in which the $y_i$-th component is 1 and the rest are 0. The formula for calculating the cross-entropy loss is:

$$\ell_i = -\sum_{c=1}^{C} y_{i,c} \log\left( p_{i,c} + \epsilon \right)$$

[0112] where $p_{i,c}$ is the predicted probability, output by the Softmax function in step S3.5, that the $i$-th sample belongs to the $c$-th category, and $\epsilon$ is a very small constant that prevents a zero value from occurring in the logarithm. Since only the component corresponding to the true class is 1 in the one-hot encoding, the above equation simplifies to:

$$\ell_i = -\log\left( p_{i,y_i} + \epsilon \right)$$

[0113] That is, the loss value equals the negative logarithm of the model's predicted probability for the true class. When the model predicts correctly with high confidence, $p_{i,y_i}$ is close to 1 and the loss approaches zero; when the predicted probability is low, the loss increases sharply, providing a strong correction signal for the model. The cross-entropy loss of a training batch is obtained by averaging the losses of all samples within that batch:

$$\mathcal{L}_{\text{CE}} = \frac{1}{B} \sum_{i=1}^{B} \ell_i$$

[0114] where $B$ is the number of samples in the current batch.
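The equivalence between the one-hot form and the simplified form of the cross-entropy loss can be seen in a short sketch. The probabilities, labels, and the value of `eps` are made-up values for illustration.

```python
import numpy as np

def cross_entropy(p, y, eps=1e-12):
    """Per-sample and batch-mean cross-entropy from probabilities and integer labels."""
    p = np.asarray(p, dtype=float)
    y = np.asarray(y)
    per_sample = -np.log(p[np.arange(len(y)), y] + eps)   # simplified form
    return per_sample, float(per_sample.mean())

p = np.array([[0.70, 0.10, 0.10, 0.10],
              [0.25, 0.25, 0.25, 0.25]])
y = np.array([0, 2])
per_sample, batch_loss = cross_entropy(p, y)

# One-hot form: only the true-class term survives the sum.
onehot = np.eye(4)[y]
per_sample_onehot = -(onehot * np.log(p + 1e-12)).sum(axis=1)
```

The confident correct prediction in the first row contributes a much smaller loss than the uniform prediction in the second row.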

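[0115] On top of the classification loss, the channel orthogonal regularization loss $\mathcal{L}_{\text{orth}}$ and the cross-modal semantic consistency loss $\mathcal{L}_{\text{cons}}$ defined above are combined by weighted summation to form the comprehensive loss function $\mathcal{L}$:

$$\mathcal{L} = \mathcal{L}_{\text{CE}} + \lambda_1 \mathcal{L}_{\text{orth}} + \lambda_2 \mathcal{L}_{\text{cons}}$$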

[0116] where $\lambda_1$ and $\lambda_2$ are the weight coefficients of the channel orthogonal regularization and the cross-modal semantic consistency constraint, respectively, and control the influence of the two regularization losses relative to the classification loss.

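[0117] The gradient of the comprehensive loss function with respect to the learnable parameters of the model is a weighted sum of the gradients of the three loss terms. For the weight generation matrix $W_g$, the gradient has three sources:

$$\frac{\partial \mathcal{L}}{\partial W_g} = \frac{\partial \mathcal{L}_{\text{CE}}}{\partial W_g} + \lambda_1 \frac{\partial \mathcal{L}_{\text{orth}}}{\partial W_g} + \lambda_2 \frac{\partial \mathcal{L}_{\text{cons}}}{\partial W_g}$$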

[0118] Among these, the gradient of the cross-entropy loss with respect to $W_g$ is obtained by backpropagating layer by layer through the classification layer and the gated weighting layer. Specifically, the gradient of the per-sample cross-entropy loss with respect to the logits (the input to the Softmax) is the difference between the predicted probability and the one-hot label:

$$\frac{\partial \ell_i}{\partial z_i} = p_i - \mathbf{y}_i$$

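[0119] where $p_i$ is the predicted probability vector and $z_i$ is the logits vector. The gradient continues to propagate through the classification layer weights $W_c$ to the fused features $f_i$:

$$\frac{\partial \ell_i}{\partial f_i} = \left( p_i - \mathbf{y}_i \right) W_c^{\top}$$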

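[0120] In the fused features, the first $d_x$ dimensions correspond to the weighted numerical features $\tilde{x}_i$. Therefore, the gradient with respect to $\tilde{x}_i$ is obtained by extracting the first $d_x$ components:

$$\frac{\partial \ell_i}{\partial \tilde{x}_i} = \left[ \frac{\partial \ell_i}{\partial f_i} \right]_{1:d_x}$$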

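[0121] Then, through the chain rule of element-wise multiplication and of the Sigmoid function, the gradient is propagated step by step to the gating weights $g_i$ and the linear activation value $a_i$:

[0122] $$\frac{\partial \ell_i}{\partial g_i} = \frac{\partial \ell_i}{\partial \tilde{x}_i} \odot x_i, \qquad \frac{\partial \ell_i}{\partial a_i} = \frac{\partial \ell_i}{\partial g_i} \odot g_i \odot \left( \mathbf{1} - g_i \right)$$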

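[0123] Finally, the contributions of all samples are accumulated, with the batch average matching the definition of the batch loss, to obtain the gradient of the cross-entropy loss with respect to $W_g$:

$$\frac{\partial \mathcal{L}_{\text{CE}}}{\partial W_g} = \frac{1}{B} \sum_{i=1}^{B} t_i^{\top}\, \frac{\partial \ell_i}{\partial a_i}$$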
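The backpropagation path from the cross-entropy loss to the weight generation matrix can be verified against a finite difference. The sketch below assumes a batch-mean loss and the row-vector convention; every name, the random toy data, and the checked entry (2, 7) are hypothetical.

```python
import numpy as np

def forward(x, t, W_g, b_g, W_c, b_c):
    g = 1.0 / (1.0 + np.exp(-(t @ W_g + b_g)))        # gating weights
    f = np.concatenate([g * x, t], axis=1)            # fused features
    z = f @ W_c + b_c
    z = z - z.max(axis=1, keepdims=True)              # stabilized Softmax
    p = np.exp(z) / np.exp(z).sum(axis=1, keepdims=True)
    return g, p

def ce_loss(p, y):
    return float(-np.mean(np.log(p[np.arange(len(y)), y])))

def ce_grad_W_g(x, t, y, W_g, b_g, W_c, b_c):
    B, d_x = x.shape
    g, p = forward(x, t, W_g, b_g, W_c, b_c)
    d_z = p.copy()
    d_z[np.arange(B), y] -= 1.0          # predicted probability minus one-hot label
    d_f = d_z @ W_c.T                    # back through the classification layer
    d_xt = d_f[:, :d_x]                  # first d_x components: weighted features
    d_a = d_xt * x * g * (1.0 - g)       # element-wise product, then Sigmoid derivative
    return t.T @ d_a / B                 # batch-averaged gradient

rng = np.random.default_rng(0)
x, t = rng.normal(size=(8, 17)), rng.normal(size=(8, 32))
y = rng.integers(0, 4, size=8)
W_g, b_g = rng.normal(0.0, 0.05, size=(32, 17)), np.zeros(17)
W_c, b_c = rng.normal(0.0, 0.05, size=(49, 4)), np.zeros(4)

h = 1e-5
Wp, Wm = W_g.copy(), W_g.copy()
Wp[2, 7] += h
Wm[2, 7] -= h
numeric = (ce_loss(forward(x, t, Wp, b_g, W_c, b_c)[1], y)
           - ce_loss(forward(x, t, Wm, b_g, W_c, b_c)[1], y)) / (2 * h)
analytic = ce_grad_W_g(x, t, y, W_g, b_g, W_c, b_c)[2, 7]
```

Agreement between `numeric` and `analytic` indicates that slicing the first 17 components of the fused-feature gradient is the right way to route the signal back to the gate.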

[0124] The cross-entropy gradient is combined with the orthogonal loss gradient $\partial \mathcal{L}_{\text{orth}} / \partial W_g$ and the consistency loss gradient $\partial \mathcal{L}_{\text{cons}} / \partial W_g$ derived above; the three gradients are weighted and summed before being used to update $W_g$. For the classification layer weights $W_c$ and the bias vectors, only the cross-entropy loss generates the gradient; the orthogonality constraint and the consistency constraint do not directly involve these parameters.

[0125] After gradient calculation, to prevent training instability caused by gradient explosion, all gradients are clipped element-wise. For the gradients of the weight generation matrix $W_g$ and the classification layer weights $W_c$, the value of each element is restricted to the range $[-\tau, \tau]$:

$$\hat{G} = \mathrm{clip}\left( G,\ -\tau,\ \tau \right), \qquad G = \frac{\partial \mathcal{L}}{\partial W}$$

[0126] where $G$ is the gradient of the comprehensive loss function $\mathcal{L}$ with respect to the weight matrix, and the $\mathrm{clip}$ function truncates gradient values that exceed the interval $[-\tau, \tau]$ to the boundary values. Gradient clipping ensures that the magnitude of each parameter update is not too large, avoiding destructive updates to the model parameters caused by extreme gradient values generated by individual outliers.

[0127] Parameter updates employ a gradient descent optimization strategy with a fixed learning rate. For every learnable parameter $\theta \in \{ W_g, b_g, W_c, b_c \}$, the update rule at each step is:

$$\theta \leftarrow \theta - \eta\, \frac{\partial \mathcal{L}}{\partial \theta}$$

[0128] where $\eta$ is the learning rate and $\partial \mathcal{L} / \partial \theta$ is the gradient of the comprehensive loss with respect to the parameter $\theta$. The value of the learning rate was determined experimentally, balancing convergence speed against the training oscillations caused by excessively large step sizes.
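Element-wise clipping followed by a fixed-rate descent step can be sketched as below. The clipping threshold `tau`, the learning rate `lr`, and the toy gradient are hypothetical values, since the source does not preserve the actual settings.

```python
import numpy as np

def clip_and_step(param, grad, lr=0.01, tau=1.0):
    """Clip each gradient element to [-tau, tau], then take one descent step."""
    clipped = np.clip(grad, -tau, tau)
    return param - lr * clipped

W = np.zeros((2, 3))
grad = np.array([[0.5, -3.0, 10.0],
                 [0.0,  1.0, -1.0]])      # two entries exceed the threshold
W_new = clip_and_step(W, grad, lr=0.1, tau=1.0)
```

After clipping, no single entry can move by more than `lr * tau` in one step, which bounds the damage an outlier sample can do.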

[0129] The training process employs a mini-batch stochastic gradient descent strategy. At the beginning of each training epoch, the $N_{\text{bal}}$ samples of the balanced training set are randomly shuffled and then divided into batches of size $B$. For each batch, forward propagation is performed to calculate the three loss terms, followed by backpropagation to calculate the gradients and update the parameters. Let the total number of training epochs be $E$. Over the complete training process, the model parameters undergo a total of $E \cdot \lceil N_{\text{bal}} / B \rceil$ updates, where $\lceil \cdot \rceil$ denotes rounding up.
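The per-epoch shuffling and batching, and the resulting number of parameter updates, can be sketched as follows. The sample count, batch size, and epoch count are hypothetical numbers chosen so that the last batch of each epoch is partial.

```python
import math
import numpy as np

def iterate_minibatches(n_samples, batch_size, rng):
    """Yield index arrays for one epoch over a freshly shuffled sample order."""
    order = rng.permutation(n_samples)
    for start in range(0, n_samples, batch_size):
        yield order[start:start + batch_size]

n_bal, batch_size, epochs = 1000, 64, 5      # hypothetical settings
rng = np.random.default_rng(0)
n_updates = 0
for _ in range(epochs):
    for batch in iterate_minibatches(n_bal, batch_size, rng):
        n_updates += 1    # one forward/backward pass and parameter update per batch

expected_updates = epochs * math.ceil(n_bal / batch_size)
```

Rounding up matters here because the final 40 samples of each epoch still form a batch and trigger an update.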

[0130] During training, the average values of the comprehensive loss $\mathcal{L}$, the orthogonal loss $\mathcal{L}_{\text{orth}}$, and the consistency loss $\mathcal{L}_{\text{cons}}$ are recorded for each epoch to monitor the training progress and the optimization status of each constraint. The convergence trend of the orthogonal loss reflects the column vectors of the weight generation matrix becoming more orthogonal, while the decreasing curve of the consistency loss reflects the gradual improvement of cross-modal semantic alignment. When all three losses stabilize and the training set accuracy no longer changes significantly, the model is considered to have converged and the training process ends.

[0131] Step S4: Input the multimodal data of the learner to be identified into the trained learning style recognition model to obtain the recognition result of the learning style.

[0132] After training and optimization, all learnable parameters of the intent-aware adaptive channel weighting module, namely the weight generation matrix $W_g$, the bias $b_g$, the classification layer weights $W_c$, and the bias $b_c$, have converged under the dual regularization constraints. Once trained, the model can classify and predict the learning style of new learner samples.

[0133] It is important to note that the prediction phase differs from the training phase in that it only performs forward propagation, without calculating the loss function and gradients or updating parameters, and is therefore computationally cheaper. For a single learner sample, the computation from feature input to prediction output involves only two matrix-vector multiplications, one Sigmoid operation, one element-wise multiplication, and one Softmax operation, with an overall computational complexity of $O(d_t d_x + (d_x + d_t)\, C)$, dominated by the two matrix-vector products. Under the parameter scale of the present invention ($d_x = 17$, $d_t = 32$, $C = 4$), the computational load is extremely small, which can meet the response speed requirements of online education platforms for real-time learning style recognition.
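A rough count of the work per prediction can be made from the layer shapes. The sketch below assumes 17 numerical dimensions, 32 intent dimensions, and four categories (taken from the four-style example in the text), and counts only the multiply-accumulate operations of the two matrix-vector products.

```python
D_X, D_T, C = 17, 32, 4      # C = 4 is an assumption based on the four-style example

macs_gate = D_T * D_X                  # intent vector times weight generation matrix
macs_classifier = (D_X + D_T) * C      # fused features times classification weights
total_macs = macs_gate + macs_classifier

# Learnable parameters: two weight matrices plus their bias vectors.
n_params = macs_gate + D_X + macs_classifier + C
```

Under these assumptions a prediction needs well under a thousand multiply-accumulates, and the cost grows with the product of the layer dimensions rather than with the number of training samples.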

[0134] In addition, the channel gating weights $g_i$ generated during prediction are intuitively interpretable. By analyzing how the gating weights of different learning style categories are distributed across the feature dimensions, educators can understand the key behavioral characteristics that the model emphasizes when identifying different learning styles. For example, when the model identifies a learner as visual, the gating weights are higher on the dimensions corresponding to image learning time and video viewing time, and lower on the dimension corresponding to audio learning time. This weight distribution pattern closely matches the description of the behavioral characteristics of visual learners in educational theory, providing an understandable basis for the model's predictions. This interpretability is a significant advantage of this invention compared with traditional black-box deep learning methods, helping educators build trust in the model's identification results and develop targeted teaching strategies accordingly.

[0135] Finally, to verify the effectiveness of the method of the present invention, the learning style recognition method of the present invention is compared with a baseline that uses only numerical features (Method 1) and a simple concatenation baseline (Method 2). The comparison results are shown in Figure 3. The method of this invention outperforms both the numerical-only method and the simple multimodal concatenation method on all indicators. Specifically, its overall accuracy reaches 0.884, compared with 0.864 for Method 1 and 0.844 for Method 2, indicating that this method can correctly identify the styles of the vast majority of learners. The macro-average F1 is the arithmetic mean of the F1 values of all categories, so every category counts equally regardless of its sample size. The macro-average F1 of this method reaches 0.706, while the macro-average F1 values of Method 1 and Method 2 are 0.508 and 0.637, respectively, indicating that the method of this invention performs relatively evenly across all categories, without sacrificing the recognition of rare classes to the large number of majority-class samples. The weighted F1 of this method is 0.890, compared with 0.853 for Method 1 and 0.842 for Method 2, indicating that the overall performance of this method is more stable under the actual data distribution. The F1 of rare classes improves from 0.133 for Method 1 and 0.408 for Method 2 to 0.508, supporting the effectiveness of the dual regularization framework for cross-modal fusion quality and rare class recognition.
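The difference between macro-average and weighted F1 can be made concrete with a toy example. The labels below are invented for illustration and are unrelated to the reported results; the function name `per_class_f1` is hypothetical.

```python
import numpy as np

def per_class_f1(y_true, y_pred, n_classes):
    """F1 per class, computed as 2*TP / (2*TP + FP + FN)."""
    f1 = np.zeros(n_classes)
    for c in range(n_classes):
        tp = np.sum((y_pred == c) & (y_true == c))
        fp = np.sum((y_pred == c) & (y_true != c))
        fn = np.sum((y_pred != c) & (y_true == c))
        denom = 2 * tp + fp + fn
        f1[c] = 2 * tp / denom if denom > 0 else 0.0
    return f1

# Toy data: 8 majority samples, 2 minority samples, one minority sample missed.
y_true = np.array([0] * 8 + [1] * 2)
y_pred = np.array([0] * 8 + [0, 1])

f1 = per_class_f1(y_true, y_pred, n_classes=2)
macro_f1 = f1.mean()                                   # every class counts equally
support = np.bincount(y_true, minlength=2)
weighted_f1 = np.sum(f1 * support) / support.sum()     # classes weighted by size
```

In this toy case the missed minority sample pulls the macro average down more than the weighted average, which is why macro F1 is the more sensitive indicator of rare-class performance.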

[0136] Figure 4 presents the convergence curves of the training loss. Analysis of the loss curves during training shows that the comprehensive loss exhibits a stable decreasing trend and eventually converges, indicating that the collaborative optimization of the three losses is stable and effective, without conflict or oscillation. The orthogonal loss decreases rapidly in the early stages of training, reflecting that the column vectors of the weight generation matrix are quickly dispersed in different directions under the orthogonal constraint, so that channel redundancy is rapidly eliminated. The consistency loss also shows a continuous decreasing trend, indicating that the degree of cross-modal semantic alignment continuously improves as training progresses, and the semantic consistency between the intent vector and the weighted features gradually strengthens. The convergence behavior of the two regularization losses is consistent with the decreasing trend of the comprehensive loss, supporting the compatibility and synergy among the constraint terms in the dual regularization framework.

[0137] Embodiment 2: This embodiment discloses a learning style recognition system based on intent weighting and dual regularization. As shown in Figure 5, the system includes the following modules. The data acquisition and preprocessing module is configured to acquire learners' numerical behavioral features and textual comment information, obtain standardized numerical behavioral features and text semantic intent vectors through preprocessing, and construct a multimodal dataset. The dataset partitioning and balancing module is configured to partition the multimodal dataset into a training set and a test set, and to perform class balancing on the training set so that the number of samples in each class is balanced. The model building and training module is configured to build a learning style recognition model and train it using the class-balanced training set to obtain a trained learning style recognition model. The intent-aware adaptive channel weighting module of the learning style recognition model uses the text semantic intent vector as a condition, introduces dual regularization constraints to impose explicit constraints on the weight generation matrix, and uses the generated channel gating weights to modulate the standardized numerical features element by element to obtain weighted numerical features, which are then concatenated with the text intent vector to form fused features. The fused features are then input into the model's classification module, which outputs the learning style classification result. The recognition result prediction module is configured to input the multimodal data of the learner to be identified into the trained learning style recognition model to obtain the learning style recognition result.

[0138] Embodiment 3: The purpose of this embodiment is to provide a computer-readable storage medium.

[0139] A computer-readable storage medium having a computer program stored thereon, which, when executed by a processor, implements the steps of the learning style recognition method based on intent weighting and dual regularization as described in Embodiment 1.

[0140] Embodiment 4: The purpose of this embodiment is to provide an electronic device.

[0141] An electronic device includes a memory, a processor, and a program stored in the memory and executable on the processor. When the processor executes the program, it implements the steps of the learning style recognition method based on intent weighting and dual regularization as described in Embodiment 1.

[0142] Embodiment 5: This embodiment provides a computer program product including a computer program/instructions which, when executed by a processor, implement the steps of the learning style recognition method based on intent weighting and dual regularization as described in Embodiment 1.

[0143] The steps and methods involved in the apparatuses of Embodiments 2, 3, 4, and 5 above correspond to those in Embodiment 1. For specific implementation details, please refer to the relevant description section of Embodiment 1. The term "computer-readable storage medium" should be understood as a single medium or multiple media including one or more instruction sets; it should also be understood as including any medium capable of storing, encoding, or carrying an instruction set for execution by a processor and enabling the processor to perform any of the methods in this invention.

[0144] Those skilled in the art will understand that the modules or steps of the present invention described above can be implemented using general-purpose computer devices. Optionally, they can be implemented using computer-executable program code, thereby allowing them to be stored in a storage device for execution by a computer device, or they can be fabricated as separate integrated circuit modules, or multiple modules or steps can be fabricated as a single integrated circuit module. The present invention is not limited to any particular combination of hardware and software.

[0145] While the specific embodiments of the present invention have been described above in conjunction with the accompanying drawings, this is not intended to limit the scope of protection of the present invention. Those skilled in the art should understand that various modifications or variations that can be made by those skilled in the art without creative effort based on the technical solutions of the present invention are still within the scope of protection of the present invention.

Claims

1. A learning style recognition method based on intent weighting and dual regularization, characterized in that the method includes:
acquiring learners' numerical behavioral features and textual comment information, obtaining standardized numerical behavioral features and text semantic intent vectors through preprocessing, and constructing a multimodal dataset;
dividing the multimodal dataset into a training set and a test set, and performing class balancing on the training set so that the number of samples in each class is balanced;
constructing a learning style recognition model and training it on the class-balanced training set to obtain a trained learning style recognition model, wherein the intent-aware adaptive channel weighting module of the learning style recognition model takes the text semantic intent vector as a condition, introduces dual regularization constraints to explicitly constrain the weight generation matrix, and uses the generated channel gating weights to modulate the standardized numerical features element by element to obtain weighted numerical features, which are concatenated with the text semantic intent vector to form fused features, and the fused features are input into the classification module of the model to output the learning style classification result; and
inputting the multimodal data of the learner to be identified into the trained learning style recognition model to obtain the learning style recognition result.

2. The learning style recognition method based on intent weighting and dual regularization as described in claim 1, characterized in that obtaining the standardized numerical behavioral features and the text semantic intent vectors through preprocessing includes:
applying Z-score standardization to each dimension of the numerical behavioral features so that each dimension has zero mean and unit variance; and
vectorizing the textual comment information using TF-IDF to obtain high-dimensional sparse features, and then reducing the dimensionality of the high-dimensional sparse features by truncated singular value decomposition to obtain a low-dimensional dense text semantic intent vector.
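By way of non-limiting illustration, the first two preprocessing operations recited above (Z-score standardization and TF-IDF vectorization) might be sketched as follows. The function names `zscore_columns` and `tfidf` are hypothetical, whitespace tokenization is an assumption, and the TF-IDF weighting shown (relative term frequency times `log(N / df)`) is only one of several common variants. The truncated singular value decomposition step is deliberately omitted; in practice library routines such as scikit-learn's `TfidfVectorizer` and `TruncatedSVD` could be used for both steps.

```python
import math
from collections import Counter


def zscore_columns(rows):
    """Standardize each column to zero mean and unit variance (population std)."""
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / n) for c, m in zip(cols, means)]
    # Assumption: constant columns are left at zero instead of dividing by zero.
    stds = [s if s > 0 else 1.0 for s in stds]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def tfidf(docs):
    """Return (vocabulary, matrix) with one TF-IDF row per document."""
    tokenized = [d.lower().split() for d in docs]  # assumption: whitespace tokens
    vocab = sorted({w for doc in tokenized for w in doc})
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(w for doc in tokenized for w in set(doc))
    idf = {w: math.log(n_docs / df[w]) for w in vocab}
    matrix = []
    for doc in tokenized:
        counts = Counter(doc)
        total = len(doc) or 1
        matrix.append([counts[w] / total * idf[w] for w in vocab])
    return vocab, matrix
```

With this weighting, a term that appears in every comment receives a weight of zero, so only terms that distinguish learners contribute to the sparse representation that is later reduced in dimension.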

3. The learning style recognition method based on intent weighting and dual regularization as described in claim 1, characterized in that dividing the multimodal dataset into a training set and a test set and performing class balancing on the training set includes:
dividing the multimodal dataset into a training set and a test set using a stratified random sampling strategy; and
synthesizing additional minority class samples in the training set using the SMOTE oversampling method so that the number of samples in each class is balanced, wherein a synthetic sample is generated by selecting, in the feature space of the minority class samples, the k nearest neighbors of a minority class sample, and performing linear interpolation on the line segment connecting the minority class sample and one randomly selected nearest neighbor.
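By way of non-limiting illustration, the interpolation step of SMOTE described above might be sketched as follows. The function names `smote_sample` and `oversample`, the use of Euclidean distance, and the default `k=5` are assumptions made for this sketch, which also assumes the minority class contains at least two samples. The stratified split is not shown.

```python
import math
import random


def smote_sample(minority, k, rng):
    """Create one synthetic sample between a minority sample and one of its k nearest neighbors."""
    base = rng.choice(minority)
    # Rank the remaining minority samples by Euclidean distance to the base sample.
    others = [p for p in minority if p is not base]
    neighbors = sorted(others, key=lambda p: math.dist(base, p))[:k]
    neighbor = rng.choice(neighbors)
    gap = rng.random()  # interpolation factor in [0, 1)
    # Linear interpolation on the segment from base to the chosen neighbor.
    return [b + gap * (n - b) for b, n in zip(base, neighbor)]


def oversample(minority, target_count, k=5, seed=0):
    """Append synthetic samples until the minority class holds target_count samples."""
    rng = random.Random(seed)
    needed = max(0, target_count - len(minority))
    synthetic = [smote_sample(minority, k, rng) for _ in range(needed)]
    return minority + synthetic
```

Because every synthetic point lies on a segment between two real minority samples, the expanded class stays inside the region already occupied by that class. Applying the procedure to the training set only keeps the test set free of synthetic samples.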

4. The learning style recognition method based on intent weighting and dual regularization as described in claim 1, characterized in that the intent-aware adaptive channel weighting module taking the text semantic intent vector as a condition, introducing dual regularization constraints to explicitly constrain the weight generation matrix, and using the generated channel gating weights to modulate the standardized numerical features element by element to obtain weighted numerical features, which are concatenated with the text semantic intent vector to form fused features, includes:
mapping the text semantic intent vector, by means of a weight generation matrix and a bias vector, to channel weight coefficients corresponding to the dimensions of the standardized numerical features:

$$\mathbf{z} = \mathbf{W}\mathbf{t} + \mathbf{b}$$

where $\mathbf{z}$ is the raw activation vector after the linear mapping; $\mathbf{t}$ is the text semantic intent vector; $\mathbf{W}$ is the weight generation matrix; and $\mathbf{b}$ is the bias vector;
applying a nonlinear transformation to the linearly mapped activation $\mathbf{z}$ using the Sigmoid activation function to generate the final channel gating weight vector:

$$\mathbf{g} = \sigma(\mathbf{z})$$

where $\sigma(\cdot)$ is the Sigmoid function and $\mathbf{g}$ is the generated channel weight vector;
applying the obtained channel gating weights to the standardized numerical features through element-wise multiplication to complete the intent-guided feature modulation:

$$\tilde{\mathbf{x}} = \mathbf{g} \odot \mathbf{x}$$

where $\odot$ denotes element-wise multiplication, $\mathbf{x}$ is the standardized numerical feature vector, and $\tilde{\mathbf{x}}$ is the weighted numerical feature vector; and
concatenating the weighted numerical features with the original text semantic intent vector to obtain the fused features:

$$\mathbf{f} = [\tilde{\mathbf{x}} \,\Vert\, \mathbf{t}] \in \mathbb{R}^{d_n + d_t}$$

where $\Vert$ denotes the vector concatenation operation; $\mathbf{f}$ is the fused feature vector; $d_n$ is the number of dimensions of the standardized numerical behavioral features; and $d_t$ is the dimension of the text semantic intent vector.
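By way of non-limiting illustration, the four operations of this claim (linear mapping, Sigmoid gating, element-wise modulation, and concatenation) might be sketched as follows. The function name `intent_gated_fusion` is hypothetical, and the layout of the weight generation matrix (one row per numerical feature, one column per intent dimension) is an assumption made so that the mapping yields one gate per numerical feature.

```python
import math


def sigmoid(v):
    """Logistic function mapping a real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-v))


def intent_gated_fusion(x, t, W, b):
    """Gate numerical features x by weights generated from intent vector t.

    x: standardized numerical features, length d_n
    t: text semantic intent vector, length d_t
    W: weight generation matrix, d_n rows by d_t columns (assumed layout)
    b: bias vector, length d_n
    """
    # Linear mapping: one raw activation per numerical feature channel.
    z = [sum(w * tj for w, tj in zip(row, t)) + bi for row, bi in zip(W, b)]
    # Sigmoid squashes each activation into a gate between 0 and 1.
    g = [sigmoid(zi) for zi in z]
    # Element-wise modulation of the numerical features.
    x_weighted = [gi * xi for gi, xi in zip(g, x)]
    # Concatenate the weighted features with the unchanged intent vector.
    return x_weighted + list(t)
```

For example, with an all-zero weight matrix and bias every gate equals 0.5, so `intent_gated_fusion([2.0, -1.0], [0.0, 0.0, 0.0], [[0.0] * 3, [0.0] * 3], [0.0, 0.0])` halves each numerical feature and returns a fused vector of length 2 + 3 = 5.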

5. The learning style recognition method based on intent weighting and dual regularization as described in claim 1, characterized in that the dual regularization constraints include a channel orthogonal regularization constraint and a cross-modal semantic consistency constraint; the channel orthogonal regularization constraint is used to make the column vectors of the weight generation matrix in the intent-aware adaptive channel weighting module tend to be orthogonal; and the cross-modal semantic consistency constraint is used to ensure that the weighted numerical features are aligned with the text semantic intent vector in semantic direction.
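By way of non-limiting illustration, the two constraints might be realized as penalty terms along the following lines. Both forms are assumptions made for this sketch and are not taken from the claim: the orthogonality penalty sums the squared dot products between distinct columns of the weight generation matrix, and the consistency term is one minus the cosine similarity of two vectors. The claim does not state how the weighted numerical features are brought into the same space as the intent vector, so `consistency_loss` assumes its two arguments already have equal length. The function names are hypothetical.

```python
import math


def orthogonality_penalty(W):
    """Sum of squared dot products between distinct columns of W (zero when columns are orthogonal)."""
    cols = list(zip(*W))
    total = 0.0
    for i in range(len(cols)):
        for j in range(i + 1, len(cols)):
            dot = sum(a * b for a, b in zip(cols[i], cols[j]))
            total += dot * dot
    return total


def consistency_loss(u, v, eps=1e-12):
    """One minus cosine similarity: near 0 when u and v point the same way, near 2 when opposite."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    # eps guards against division by zero for all-zero vectors.
    return 1.0 - dot / (norm + eps)
```

Under these assumed forms, a matrix with mutually orthogonal columns incurs no orthogonality penalty, while two identical columns are penalized by the square of their shared squared norm; the consistency term depends only on direction, not on vector magnitude.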

6. The learning style recognition method based on intent weighting and dual regularization as described in claim 1, characterized in that the method further includes optimizing the training of the learning style recognition model using a comprehensive loss function; the comprehensive loss function is a weighted sum of the classification loss, the channel orthogonal regularization loss, and the cross-modal semantic consistency loss, as shown below:

$$\mathcal{L}_{total} = \mathcal{L}_{cls} + \lambda_{1}\,\mathcal{L}_{orth} + \lambda_{2}\,\mathcal{L}_{cons}$$

where $\mathcal{L}_{total}$ is the comprehensive loss function; $\mathcal{L}_{cls}$ is the classification loss; $\mathcal{L}_{orth}$ is the channel orthogonal regularization loss; $\mathcal{L}_{cons}$ is the cross-modal semantic consistency loss; and $\lambda_{1}$ and $\lambda_{2}$ are the weight coefficients of the channel orthogonal regularization and the cross-modal semantic consistency constraint, respectively, used to control the influence of the two regularization losses relative to the classification loss.
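By way of non-limiting illustration, the weighted sum might be computed as follows. The function names are hypothetical, the use of cross-entropy as the classification loss is an assumption (the claim does not name a specific classification loss), and the coefficient values in the usage note are illustrative only.

```python
import math


def cross_entropy(probs, label):
    """Negative log-probability assigned to the true class (assumed classification loss)."""
    return -math.log(probs[label])


def comprehensive_loss(l_cls, l_orth, l_cons, lambda_orth, lambda_cons):
    """Classification loss plus the two weighted regularization losses."""
    return l_cls + lambda_orth * l_orth + lambda_cons * l_cons
```

For instance, a classification loss of 0.7, an orthogonality loss of 2.0, and a consistency loss of 0.5, combined with illustrative coefficients 0.01 and 0.1, give 0.7 + 0.02 + 0.05 = 0.77. Small coefficients of this kind keep the regularization terms from dominating the classification objective.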

7. A learning style recognition system based on intent weighting and dual regularization, characterized in that the system includes:
a data acquisition and preprocessing module, configured to acquire learners' numerical behavioral features and textual comment information, obtain standardized numerical behavioral features and text semantic intent vectors through preprocessing, and construct a multimodal dataset;
a dataset partitioning and balancing module, configured to partition the multimodal dataset into a training set and a test set, and perform class balancing on the training set so that the number of samples in each class is balanced;
a model building and training module, configured to build a learning style recognition model and train it on the class-balanced training set to obtain a trained learning style recognition model, wherein the intent-aware adaptive channel weighting module of the learning style recognition model takes the text semantic intent vector as a condition, introduces dual regularization constraints to explicitly constrain the weight generation matrix, and uses the generated channel gating weights to modulate the standardized numerical features element by element to obtain weighted numerical features, which are concatenated with the text semantic intent vector to form fused features, and the fused features are input into the classification module of the model to output the learning style classification result; and
a recognition result prediction module, configured to input the multimodal data of the learner to be identified into the trained learning style recognition model to obtain the learning style recognition result.

8. A computer-readable storage medium having a program stored thereon, characterized in that, when the program is executed by a processor, it implements the steps of the learning style recognition method based on intent weighting and dual regularization as described in any one of claims 1-6.

9. An electronic device comprising a memory, a processor, and a program stored in the memory and executable on the processor, characterized in that, when the processor executes the program, it implements the steps of the learning style recognition method based on intent weighting and dual regularization as described in any one of claims 1-6.

10. A computer program product comprising a computer program/instructions, characterized in that, when the computer program/instructions are executed by a processor, they implement the steps of the learning style recognition method based on intent weighting and dual regularization as described in any one of claims 1-6.