Method for evaluating the difficulty of Chinese as a foreign language using random-subspace-based decision tree classification

A technique that combines random subspace feature selection with decision tree classification, applied to evaluating the difficulty of Chinese-as-a-foreign-language texts. It broadens the basis on which texts are classified and is reported to give a marked improvement in results.

Active Publication Date: 2020-04-28
HUAZHONG NORMAL UNIV

AI Technical Summary

Problems solved by technology

[0008] In view of at least one of the above deficiencies or improvement needs of the prior art, and in particular because classifying texts for Chinese learners is a complex problem, when facing the different needs of Chinese learners, t...



Detailed Description of the Embodiments

[0025] To make the object, technical solution, and advantages of the present invention clearer, the invention is described in further detail below with reference to the accompanying drawings and specific embodiments. The specific embodiments described here serve only to explain the present invention, not to limit it. In addition, the technical features of the various embodiments described below can be combined with one another as long as they do not conflict.

[0026] As shown in Figure 1, the present invention provides a method for evaluating the difficulty of Chinese as a foreign language by decision tree classification, based on random subspace feature selection with SVM and BERT models, including the following steps:

[0027] S1. Preproc...


Abstract

The invention discloses a method for evaluating the difficulty of Chinese as a foreign language by decision tree classification, based on random subspace feature selection with SVM and BERT models. The method comprises: generating 86 statistical features from the length and readability of an article and classifying them with an SVM to obtain confidence 1; and classifying the encoding features with an SVM to obtain confidence 2. The two confidences are fused into new features, which are then classified with a decision tree. The encoding features are taken from the output of the -1 layer of the encoder extracted by the BERT model and then passed through average -> max pooling, giving 768-dimensional features in total, without normalization. The method avoids the low efficiency and under-fitting of traditional algorithms and makes fuller use of the available information, which broadens the basis for classification. The reported accuracy in evaluating the difficulty of Chinese as a foreign language is 85.6%.
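The two-stage flow in the abstract (two SVMs, each producing a confidence, followed by a decision tree on the fused confidences) can be sketched in Python with scikit-learn. This is only an illustrative sketch under stated assumptions, not the patented implementation: the data are synthetic random vectors standing in for the 86 statistical features and the 768-dimensional BERT features, the number of difficulty levels (3), the RBF kernel, the tree depth, and the use of out-of-fold probabilities for training the tree are all choices made here for illustration. The random subspace step and the BERT pooling are not shown, because the available text does not describe them in enough detail.

```python
import numpy as np
from sklearn.model_selection import cross_val_predict, train_test_split
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
n_samples, n_levels = 300, 3  # assumed sizes, not from the source

# Synthetic stand-ins: labels are difficulty levels; features are noise
# plus a small class-dependent shift so the classifiers have some signal.
y = rng.integers(0, n_levels, size=n_samples)
stat_feats = rng.normal(size=(n_samples, 86)) + 0.5 * y[:, None]
bert_feats = rng.normal(size=(n_samples, 768)) + 0.2 * y[:, None]

idx_train, idx_test = train_test_split(
    np.arange(n_samples), test_size=0.3, random_state=0, stratify=y
)
y_train, y_test = y[idx_train], y[idx_test]


def make_svm():
    # probability=True lets the SVM output per-class confidences.
    return SVC(kernel="rbf", probability=True, random_state=0)


def stage_one(features):
    """Return (out-of-fold train confidences, test confidences) for one view."""
    x_train, x_test = features[idx_train], features[idx_test]
    # Out-of-fold confidences avoid training the tree on overfit outputs.
    conf_train = cross_val_predict(
        make_svm(), x_train, y_train, cv=3, method="predict_proba"
    )
    conf_test = make_svm().fit(x_train, y_train).predict_proba(x_test)
    return conf_train, conf_test


conf1_train, conf1_test = stage_one(stat_feats)  # "confidence 1"
conf2_train, conf2_test = stage_one(bert_feats)  # "confidence 2"

# Fuse the two confidences into one feature vector per article.
fused_train = np.hstack([conf1_train, conf2_train])
fused_test = np.hstack([conf1_test, conf2_test])

# Second stage: a decision tree classifies the fused confidences.
tree = DecisionTreeClassifier(max_depth=3, random_state=0)
tree.fit(fused_train, y_train)
pred = tree.predict(fused_test)
accuracy = float((pred == y_test).mean())
print(f"accuracy on synthetic data: {accuracy:.3f}")
```

The printed accuracy reflects only the synthetic data and says nothing about the 85.6% figure reported in the abstract. With real inputs, `stat_feats` and `bert_feats` would be replaced by the article statistics and the pooled BERT encodings.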

Description

Technical field

[0001] The invention belongs to the field of educational informatization, and in particular relates to a method for evaluating the difficulty of Chinese as a foreign language based on decision tree classification with random subspace feature selection using SVM and BERT models.

Background

[0002] Reading should progress step by step, from easy to difficult. Texts that are too difficult can easily undermine students' self-confidence and cause them to lose interest in reading. Texts that are too simple, repeated at a low level, do not help reading ability keep improving, and cannot meet the academic requirement of reading complex texts and conducting related research after entering university. In short, the best text is the one whose difficulty suits the reader. As China develops, it plays an increasingly important role on the international stage, and more people want to learn Chinese. Learning Chinese texts is one of the mo...


Application Information

IPC(8): G06F16/35, G06F16/33, G06F40/211, G06K9/62
CPC: G06F16/35, G06F16/3344, G06F18/2411, G06F18/24323, Y02D10/00
Inventor: 曾致中, 陈治平, 余新国, 方淙, 王静静, 袁航, 熊佳洁
Owner HUAZHONG NORMAL UNIV