Intelligent correction method and system for college English writing based on natural language processing
By using natural language processing technology, English essays are analyzed and scored at multiple levels, which solves the problems of large workload and long feedback cycle in college English writing teaching. It achieves accurate correction and personalized learning resource recommendation, forming a closed loop of correction-learning-consolidation.
Patent Information
- Authority / Receiving Office: CN · China
- Patent Type: Applications (China)
- Current Assignee / Owner: HEILONGJIANG NURSING COLLEGE
- Filing Date: 2026-03-11
- Publication Date: 2026-06-12
Figure CN122197862A_ABST
Description
Technical Field
[0001] This invention relates to the fields of natural language processing and educational technology, specifically to a method and system for intelligent grading of college English writing based on natural language processing.
Background Technology
[0002] With the continuous expansion of higher education, college English writing instruction faces increasingly severe challenges. In the traditional teaching model, teachers need to grade each student's submitted English essay, a process that is not only time-consuming and energy-intensive but also makes it difficult to guarantee the timeliness and consistency of grading. According to relevant surveys, a college English teacher on average needs to grade hundreds of student essays per semester, with each essay typically taking more than ten minutes to grade. This makes it difficult for teachers to provide students with detailed feedback in a short period of time.
[0003] Chinese invention patent CN109614623A discloses a method and system for essay processing based on syntactic analysis. This method identifies grammatical errors in essays by determining whether there are errors in the logical relationships between the target sentence and adjacent sentences, and by comparing the syntactic structure of the target sentence with preset standard sentences in a standard regular expression database. This method, to a certain extent, achieves automatic detection of grammatical errors in essays, enabling the marking and correction of abnormal sentences and improving the efficiency of essay grading.
[0004] However, the technical solution of the Chinese invention patent has the following shortcomings. First, the solution mainly targets the syntactic analysis of Chinese essays, lacking a targeted mechanism for handling complex structures unique to English writing, such as subject-verb agreement, tense agreement, and nested clauses. Second, the solution only focuses on error detection at the syntactic level, failing to address higher-level writing ability assessments such as spelling and collocation errors at the vocabulary level, and discourse coherence and logical reasoning at the discourse level. Third, the error classification in the solution is relatively coarse, only distinguishing between logical relationship errors and syntactic structure errors, failing to provide fine-grained error type labeling, making it difficult for students to accurately locate and understand their own writing problems. Fourth, the solution lacks content relevance assessment functions, failing to detect whether students' essays deviate from the topic requirements, which is of great significance for college English exams that mainly focus on essay writing. Fifth, the scoring mechanism of the solution is relatively simple, calculating scores only based on the proportion of abnormal sentences, failing to establish a multi-dimensional scoring model aligned with authoritative scoring standards. Sixth, although the solution provides a model essay recommendation function, it fails to recommend corresponding learning resources based on students' specific error types, making it difficult to form a personalized correction-learning-reinforcement loop.
[0005] In conclusion, existing technical solutions are insufficient to meet the demands of college English writing instruction for an efficient, accurate, and comprehensive intelligent correction system. There is an urgent need for an intelligent English writing correction method that can analyze multiple levels, including vocabulary, syntax, and discourse, and provide fine-grained error classification and personalized learning resource recommendations.
Summary of the Invention
[0006] The purpose of this invention is to provide a method and system for intelligent grading of college English writing based on natural language processing, so as to solve the technical problems of large workload, long feedback cycle, insufficient error type labeling and lack of personalized learning resource recommendations in the existing English writing grading.
[0007] To achieve the above objectives, this invention provides an intelligent grading method for college English writing based on natural language processing, comprising the following steps:
[0008] The text preprocessing step involves obtaining the English essay text to be graded, performing word segmentation and syntactic parsing on the English essay text, and generating a word segmentation sequence and a syntactic parse tree.
[0009] The multi-level language analysis process receives word segmentation sequences and syntactic parse trees. At the lexical level, it uses context-sensitive spell checking and part-of-speech tagging to identify word usage errors. At the syntactic level, it uses dependency parsing and constituent parsing to detect subject-verb agreement errors, tense misuse errors, and clause structure incompleteness errors. At the discourse level, it uses discourse structure analysis to evaluate paragraph organization, coherence, and argumentation logic. The output includes a lexical error list, a syntactic error list, and discourse-level evaluation results.
[0010] The error classification step, based on the preset error type mapping rules, categorizes errors from the lexical and syntactic error lists into corresponding subcategories within the categories of lexical collocation, grammatical structure, punctuation usage, and expression idiomaticity. It then adds error nature descriptions and modification suggestions to each type of error and outputs a set of categorized and labeled errors.
[0011] The content relevance assessment step involves obtaining the question requirement text corresponding to the English essay text, using a pre-trained language model to extract the semantic vectors of the English essay text and the question requirement text, and calculating the similarity score between the two semantic vectors as the content relevance score.
[0012] The multi-dimensional scoring process calculates content dimension scores, structure dimension scores, language dimension scores, and normative dimension scores based on the number and severity of errors in each category of the error set, the discourse-level assessment results, and the content relevance score. The scores of the four dimensions are weighted and summed based on preset dimension weight coefficients to obtain a comprehensive score, and the comprehensive level is determined based on the comprehensive score.
[0013] The process of generating the correction report involves presenting the location of various errors and suggested corrections in the English essay text using inline annotations based on a categorized error set. Then, based on the distribution of error types in the categorized error set, it retrieves explanatory micro-lesson links and reinforcement exercises that match the error types. Finally, it integrates the scores across four dimensions, the overall grade, inline annotations, explanatory micro-lesson links, and reinforcement exercises to generate the correction report.
[0014] This invention also provides an intelligent grading system for college English writing based on natural language processing, including a text preprocessing module, a multi-level language analysis module, an error classification module, a content relevance assessment module, a multi-dimensional scoring module, and a grading report generation module, with each module corresponding to the steps of the above method.
[0015] The present invention has the following advantages over the prior art:
[0016] First, this invention comprehensively analyzes English compositions from three dimensions—lexical, syntactic, and discourse—through a multi-level language analysis module. This overcomes the limitations of existing technologies that only focus on syntactic errors, and can detect a variety of writing problems, from spelling errors to logical flaws in argumentation, resulting in a more comprehensive correction coverage.
[0017] Second, the error classification module of this invention categorizes detected errors into more than twenty subcategories, with each category accompanied by an explanation of the error's nature and modification suggestions. Compared with the coarse-grained classification of existing technologies, this helps students understand the causes of errors and directions for improvement more accurately.
[0018] Third, the present invention uses a pre-trained language model to calculate the semantic similarity between the essay and the requirements of the topic through a content relevance assessment module, which can effectively detect problems such as going off-topic and insufficient content coverage, and make up for the lack of content relevance assessment in the existing technology.
[0019] Fourth, the multi-dimensional scoring module of this invention establishes a scoring model aligned with the scoring standards for College English Test Band 4 and Band 6 writing, and scores each item from four dimensions: content, structure, language, and standardization, resulting in more scientific and objective scoring results.
[0020] Fifth, the correction report generation module of this invention can push targeted explanation micro-lessons and consolidation exercises according to the distribution of students' error types, forming a personalized learning loop of correction-learning-consolidation, and enhancing the educational value of correction feedback.
Attached Figure Description
[0021] Figure 1 is a flowchart of the intelligent grading method for college English writing based on natural language processing of this invention.
[0022] Figure 2 is an architecture diagram of the intelligent grading system for college English writing based on natural language processing of this invention.
Detailed Implementation
[0023] The technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only a part of the embodiments of the present invention, and not all of them. All other embodiments obtained by those skilled in the art based on the embodiments of the present invention without creative effort are within the scope of protection of the present invention.
[0024] Figure 1 is a flowchart of the intelligent grading method for college English writing based on natural language processing described in this invention. In one embodiment of this invention, the method includes the following steps:
[0025] Step S1: Text Preprocessing Step
[0026] The text preprocessing step is used to obtain the English essay text to be graded, perform word segmentation and syntactic parsing on the English essay text, and generate word segmentation sequences and syntactic parse trees.
[0027] In one embodiment of the invention, the English essay text to be graded is submitted by the student through an online writing platform, and the text format is either plain text or rich text. The system first performs format cleaning on the original text, removing redundant whitespace characters and formatting marks, while retaining the paragraph structure information of the text.
[0028] The word segmentation process employs a sub-word-based segmenter to segment the English essay text. Preferably, this invention uses a byte-pair encoding algorithm for word segmentation. Byte-pair encoding is a data-driven word segmentation method whose core idea is to iteratively merge the most frequent character pairs in the corpus into new sub-word units. In the embodiments of this invention, the word segmenter's vocabulary size is set to 30,000 sub-word units, a size that effectively balances vocabulary coverage and model efficiency.
[0029] For out-of-vocabulary words that may appear in student essays, such as proper nouns, neologisms, or words with serious spelling errors, the byte-pair encoding word segmenter can break them down into combinations of known sub-word units, ensuring that each word unit receives a valid vector representation. For example, when a student misspells "environment" as "enviroment," the word segmenter will break it down into two sub-word units: "enviro" and "ment." Subsequent spell checking modules can then perform error correction based on these sub-word units.
[0030] The syntactic parsing process generates two types of syntactic structure representations: dependency parse trees and constituent parse trees. Dependency parsing employs a transition-based dependency parser, which takes the word segmentation sequence as input and constructs a dependency graph through a series of shift-reduce operations. In the dependency graph, a directed arc is established between each lexical unit and its governing word. The label on the arc indicates the type of dependency relationship between the two lexical units, such as subject (nsubj), object (dobj), and modifier (amod).
[0031] The constituent syntactic parsing employs a neural network-based constituent syntactic analyzer, which outputs a syntactic tree organized according to phrase structure rules. In this constituent syntactic tree, leaf nodes are lexical units, and non-leaf nodes are phrase type labels, such as noun phrases (NP), verb phrases (VP), and clauses (SBAR). The hierarchical structure of the constituent syntactic tree clearly reflects the nesting relationships within sentences, which is crucial for detecting complex grammatical errors such as incomplete clause structures.
[0032] In a preferred embodiment of the present invention, the syntactic parser is initialized using model parameters pre-trained on a large-scale English corpus, including the Penn Treebank annotated corpus and the OntoNotes corpus. Preferably, the annotation accuracy of the dependency parser is not less than 95%, and the bracket matching F1 score of the constituent parser is not less than 92%.
[0033] Step S2: Multi-level Language Analysis Step
[0034] The multi-level language analysis step receives word segmentation sequences and syntactic parse trees, and performs in-depth analysis of English essays from three dimensions: lexical, syntactic, and discourse levels, outputting a list of errors at the lexical level, a list of errors at the syntactic level, and evaluation results at the discourse level.
[0035] In lexical analysis, this invention employs a context-sensitive spell checking algorithm to identify spelling and word usage errors. Traditional spell checking methods rely solely on dictionary-based isolated word checks, making it difficult to handle context-related errors, such as misusing "their" as "there." The context-sensitive spell checking algorithm proposed in this invention combines edit distance calculation and language model probability estimation, effectively identifying these types of errors.
[0036] Specifically, for each token $w_i$ in the word segmentation sequence, the system first calculates the edit distance between that token and every candidate word in the standard dictionary. The edit distance is the Levenshtein distance, which measures the minimum number of single-character edits required to transform one string into another. The context-sensitive spell check score is defined as follows:
[0037] $$S(w_i, c_j) = \alpha \left(1 - \frac{d(w_i, c_j)}{\max(|w_i|, |c_j|)}\right) + (1 - \alpha)\, P_{LM}(c_j \mid w_{i-k}, \ldots, w_{i-1})$$
[0038] where $S(w_i, c_j)$ is the overall score for replacing token $w_i$ with candidate word $c_j$; $d(w_i, c_j)$ is the edit distance between token $w_i$ and candidate word $c_j$; $|w_i|$ and $|c_j|$ are the lengths of the two strings; $P_{LM}(c_j \mid w_{i-k}, \ldots, w_{i-1})$ is the probability that the language model predicts candidate word $c_j$ given the preceding $k$ tokens; $\alpha$ is the edit distance weighting coefficient, preferably in the range 0.3 to 0.5; and $k$ is the size of the context window, preferably 5.
[0039] When the normalized edit distance $d(w_i, c_j) / \max(|w_i|, |c_j|)$ is less than the preset spelling similarity threshold $\theta_{sp}$, the system adds candidate word $c_j$ to the candidate set. Preferably, $\theta_{sp}$ ranges from 0.2 to 0.4. From the candidate set, the system selects the candidate word with the highest overall score $S(w_i, c_j)$ as the final correction suggestion.
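The candidate filtering and scoring described in paragraphs [0036] to [0039] can be sketched in a few lines. This is an illustrative sketch only, not the patented implementation: it assumes the overall score is a weighted sum of edit-distance similarity and language-model probability, and the names `best_correction`, `lm_prob`, `alpha`, `theta`, and the toy dictionary are hypothetical.

```python
from typing import Callable, List, Optional


def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions, or substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def normalized_distance(word: str, cand: str) -> float:
    """Edit distance divided by the length of the longer string, in [0, 1]."""
    longest = max(len(word), len(cand))
    return levenshtein(word, cand) / longest if longest else 0.0


def best_correction(
    word: str,
    context: List[str],
    dictionary: List[str],
    lm_prob: Callable[[List[str], str], float],  # stand-in for a real language model
    alpha: float = 0.4,   # edit distance weight (text suggests 0.3 to 0.5)
    theta: float = 0.3,   # spelling similarity threshold (text suggests 0.2 to 0.4)
    k: int = 5,           # context window size (text suggests 5)
) -> Optional[str]:
    """Return the highest-scoring candidate, or None if no candidate passes the filter."""
    window = context[-k:] if k > 0 else []
    best, best_score = None, float("-inf")
    for cand in dictionary:
        nd = normalized_distance(word, cand)
        if nd >= theta:
            continue  # too dissimilar to be a plausible correction
        # Assumed combination rule: weighted sum of similarity and LM probability.
        score = alpha * (1.0 - nd) + (1.0 - alpha) * lm_prob(window, cand)
        if score > best_score:
            best, best_score = cand, score
    return best
```

In practice `lm_prob` would wrap a trained language model; a constant function is enough to exercise the filtering logic.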
[0040] Lexical layer analysis also includes part-of-speech tagging and collocation checking. This invention employs a part-of-speech tagger based on a bidirectional long short-term memory network to tag the segmented word sequences. The tagging results are used for subsequent subject-verb agreement checks and collocation error detection. Collocation checking is based on a large-scale English collocation corpus containing over five million high-frequency collocation patterns. For collocation combinations in student essays, the system retrieves their frequency of occurrence in the collocation corpus. When the frequency is lower than a preset collocation frequency threshold, it is marked as a potential collocation error, and correct collocation suggestions are retrieved.
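The frequency-threshold collocation check in paragraph [0040] might look like the following sketch. The frequency table, the threshold value, and the rule for picking a suggestion (the most frequent attested collocation sharing the same dependent word) are assumptions made for illustration; the text does not specify how suggestions are ranked.

```python
from typing import Dict, Optional, Tuple

Collocation = Tuple[str, str]  # (head word, dependent word), e.g. (verb, noun)


def check_collocation(
    pair: Collocation,
    freq_table: Dict[Collocation, int],
    min_freq: int = 10,  # hypothetical collocation frequency threshold
) -> Tuple[bool, Optional[Collocation]]:
    """Return (is_potential_error, suggested_collocation)."""
    if freq_table.get(pair, 0) >= min_freq:
        return False, None  # frequent enough to be accepted
    _, dependent = pair
    # Assumed suggestion rule: attested collocations with the same dependent word.
    alternatives = [
        (count, cand)
        for cand, count in freq_table.items()
        if cand[1] == dependent and count >= min_freq
    ]
    if not alternatives:
        return True, None  # flagged, but nothing to suggest
    _, best = max(alternatives)
    return True, best
```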
[0041] In syntactic layer analysis, this invention detects multiple types of grammatical errors based on dependency parsing and constituent parsing.
[0042] Subject-verb agreement error detection extracts the subject-verb dependency arc in a sentence using a dependency parser, obtaining subject and verb lexical units. The system analyzes the number features of the subject lexical units, determining whether the subject is a singular, plural, or uncountable noun, and checks whether the morphology of the verb matches the number features of the subject. The calculation formula for the subject-verb agreement score is defined as follows:
[0043] $$A(s, v) = \begin{cases} 1, & \mathrm{num}(s) = \mathrm{num}(v) \\ 0, & \mathrm{num}(s) \neq \mathrm{num}(v) \end{cases}$$
[0044] where $A(s, v)$ is the agreement score between subject $s$ and predicate $v$; $\mathrm{num}(s)$ is the number feature of the subject, taking the value SINGULAR or PLURAL; and $\mathrm{num}(v)$ is the number form of the predicate verb, likewise SINGULAR or PLURAL. When the agreement score is 0, the system marks the sentence as having a subject-verb agreement error.
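The agreement check reduces to comparing two number features per subject-verb arc. A minimal sketch, assuming the dependency parser and morphological analysis have already produced those features (the `SubjectVerbArc` record is a hypothetical container, not part of the described system):

```python
from typing import List, NamedTuple


class SubjectVerbArc(NamedTuple):
    """One nsubj arc with number features already extracted (hypothetical structure)."""
    subject: str
    subject_number: str  # "SINGULAR" or "PLURAL"
    verb: str
    verb_number: str     # "SINGULAR" or "PLURAL"


def agreement_score(subject_number: str, verb_number: str) -> int:
    """1 when subject and verb agree in number, otherwise 0."""
    return 1 if subject_number == verb_number else 0


def find_agreement_errors(arcs: List[SubjectVerbArc]) -> List[SubjectVerbArc]:
    """Return the arcs whose agreement score is 0."""
    return [a for a in arcs if agreement_score(a.subject_number, a.verb_number) == 0]
```

Uncountable nouns, which the text mentions as a third case, would need to be mapped to SINGULAR before this comparison; that mapping is left out here.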
[0045] The tense mixing error detection first extracts all predicate verbs and their tense features from the essay, constructing a tense sequence. The system analyzes whether the tenses of adjacent sentences or clauses within the same complex sentence are consistent or conform to reasonable tense transition rules. This invention establishes a tense transition rule base, which includes rules for judging the reasonableness of common tense combinations such as simple present tense and simple past tense, and present perfect tense and simple past tense. When a tense transition that does not conform to the rule base is detected, the system marks it as a tense mixing error.
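A rule-base lookup over the tense sequence can be sketched as below. The tense labels and the contents of `ALLOWED_TRANSITIONS` are hypothetical examples; the actual rule base is not enumerated in the text.

```python
from typing import List, Set, Tuple

# Hypothetical rule base: ordered pairs of tenses that may legitimately follow each other.
ALLOWED_TRANSITIONS: Set[Tuple[str, str]] = {
    ("present_perfect", "simple_past"),
    ("simple_past", "past_perfect"),
    ("simple_present", "present_perfect"),
}


def find_tense_errors(
    tenses: List[str],
    allowed: Set[Tuple[str, str]] = ALLOWED_TRANSITIONS,
) -> List[int]:
    """Return indices of predicates whose tense shift from the previous one is not allowed."""
    errors = []
    for i in range(1, len(tenses)):
        prev, curr = tenses[i - 1], tenses[i]
        # Staying in the same tense is always acceptable.
        if prev != curr and (prev, curr) not in allowed:
            errors.append(i)
    return errors
```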
[0046] Clause incompleteness error detection is based on the constituent syntactic parsing tree. The system traverses all clause nodes in the syntactic tree, i.e., the subtrees labeled SBAR, and extracts the constituent structure within the clauses. For each clause tree, the system checks whether it contains a subject (NP) and a predicate (VP). The formula for calculating the syntactic integrity score is defined in this invention as follows:
[0047] $$I(c) = \frac{|E(c) \cap R(c)|}{|R(c)|}$$
[0048] where $I(c)$ is the syntactic integrity score of clause $c$; $E(c)$ is the set of syntactic components actually present in the clause; $R(c)$ is the set of syntactic components required for the clause, which typically includes the subject (NP) and the predicate (VP); and $|\cdot|$ denotes the number of elements in a set. When the syntactic integrity score is below the preset syntactic integrity threshold $\theta_{syn}$, the system determines that the clause contains a structural incompleteness error. Preferably, $\theta_{syn}$ ranges from 0.6 to 0.8.
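As a rough illustration of the integrity check, the score can be computed as the fraction of required constituents found among a clause's child labels. This sketch assumes the labels of each SBAR subtree's constituents have already been collected from the parse tree; the function names and the default threshold of 0.7 are illustrative choices within the stated 0.6 to 0.8 range.

```python
from typing import FrozenSet, Iterable

REQUIRED_COMPONENTS: FrozenSet[str] = frozenset({"NP", "VP"})


def integrity_score(
    present: Iterable[str],
    required: FrozenSet[str] = REQUIRED_COMPONENTS,
) -> float:
    """Fraction of required constituent labels that actually occur in the clause."""
    return len(set(present) & required) / len(required)


def is_incomplete(present: Iterable[str], threshold: float = 0.7) -> bool:
    """Flag the clause when its integrity score falls below the threshold."""
    return integrity_score(present) < threshold
```

With only NP and VP required, the score can take just the values 0, 0.5, and 1, so any threshold in the stated range flags a clause that lacks either one.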
[0049] In discourse-level analysis, this invention assesses the macro-level writing ability of an essay from three aspects: paragraph organization, coherence, and argumentation logic.
[0050] Paragraph organization assessment first identifies paragraph boundaries in the essay, then extracts the topic sentence for each paragraph. Preferably, the topic sentence is usually the first or last sentence of the paragraph. The system uses a pre-trained language model to calculate the semantic similarity between the topic sentence of each paragraph and the central argument of the entire text, and also calculates the semantic relevance between topic sentences of adjacent paragraphs. The formula for calculating the paragraph organization score is defined as follows:
[0051] [Formula not reproduced in the source text.]
[0052] The terms of this formula are: the paragraph organization score; the total number of paragraphs; the topic sentence vector of each paragraph, indexed by paragraph position; the vector representing the central argument of the entire text; the cosine similarity function; and a coherence scoring function for adjacent topic sentences.
[0053] The coherence assessment identifies the cohesive devices between adjacent sentences in an essay, including explicit cohesive words and implicit referential relationships. Explicit cohesive words include transition words such as "but" and "however," causal words such as "because" and "therefore," and progressive words such as "moreover" and "furthermore." The system statistically analyzes the frequency and appropriateness of cohesive word usage. Implicit referential relationship detection identifies the referential chains between pronouns and their antecedents using a coreference resolution algorithm. The formula for calculating the coherence score in this invention is defined as follows:
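[0054] $$C_{coh} = \lambda \cdot \frac{N_{conn}}{N_{sent}} + (1 - \lambda) \cdot \frac{N_{res}}{N_{pron}}$$
[0055] where $C_{coh}$ is the coherence score; $N_{conn}$ is the number of correctly used cohesive words; $N_{sent}$ is the total number of sentences; $N_{res}$ is the number of pronouns successfully resolved; $N_{pron}$ is the total number of pronouns; and $\lambda$ is the weight coefficient for cohesive words, preferably in the range 0.4 to 0.6.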
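A coherence score built from these four counts could be computed as follows. The weighted-sum form, the capping of the cohesive-word ratio at 1, and the handling of zero denominators are assumptions added so the sketch is well defined; they are not stated in the text.

```python
def coherence_score(
    correct_connectives: int,
    total_sentences: int,
    resolved_pronouns: int,
    total_pronouns: int,
    weight: float = 0.5,  # cohesive-word weight (text suggests 0.4 to 0.6)
) -> float:
    """Weighted combination of cohesive-word usage and pronoun resolution rates."""
    if total_sentences <= 0:
        return 0.0  # assumption: an empty essay has no coherence to score
    # Assumption: cap the ratio so heavy connective use cannot exceed full credit.
    connective_rate = min(1.0, correct_connectives / total_sentences)
    # Assumption: an essay without pronouns has no unresolved references.
    pronoun_rate = resolved_pronouns / total_pronouns if total_pronouns > 0 else 1.0
    return weight * connective_rate + (1.0 - weight) * pronoun_rate
```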
[0056] The argument logic assessment uses an argument relationship identification model to detect the supporting relationship between arguments and thesis statements in an essay. The system categorizes declarative sentences in the essay into argument sentences and supporting sentences, and analyzes the strength of support between supporting sentences and argument sentences. The calculation formula for the argument logic score is defined as follows:
[0057] [Formula not reproduced in the source text.]
[0058] The terms of this formula are: the argumentation logic score; the number of argument sentences; for each argument sentence, the set of supporting sentences that support it; and a scoring function for the strength of support that a supporting sentence gives to its argument sentence, with a value ranging from 0 to 1.
[0059] Step S3: Error Classification Step
[0060] The error classification step is used to categorize errors detected in the lexical and syntactic error lists into subcategories, and to add error nature descriptions and modification suggestions to each category of errors.
[0061] This invention establishes an error type system containing more than twenty subcategories. The system is divided into four major categories according to the nature of the error: vocabulary collocation, grammatical structure, punctuation usage, and expression idiomaticity.
[0062] The vocabulary collocation category includes seven subcategories: spelling errors, word form errors, part-of-speech misuse, verb collocation errors, noun collocation errors, adjective collocation errors, and preposition collocation errors. The grammatical structure category includes seven subcategories: subject-verb agreement errors, tense misuse errors, voice usage errors, incomplete clauses, missing sentence components, dangling modifiers, and parallel structure errors. The punctuation usage category includes four subcategories: comma misuse, missing period, incorrect quotation marks, and incorrect apostrophe. The expression idiomaticity category includes four subcategories: Chinglish expressions, redundant expressions, inappropriate register, and vocabulary repetition.
[0063] For each detected error, the system maps it to a corresponding subcategory based on its specific characteristics. The mapping rules are based on feature information extracted during the error detection phase, such as error type labels, error location, and part-of-speech tags of related terms. The system retrieves the error cause explanation text and standard usage example text corresponding to the subcategory from a pre-set error knowledge base.
[0064] The error knowledge base of this invention includes standardized explanation templates and example libraries for each subcategory. Taking subject-verb agreement errors as an example, the explanation text for the error cause states that in English, the singular or plural form of the subject determines the form of the verb; a singular subject is paired with a singular verb, and a plural subject is paired with a plural verb. The example text for standard usage includes a comparison between the correct usage "The students study hard" and the incorrect usage "The student study hard."
[0065] The modification suggestions are generated based on the context information of the error. The system extracts five words before and after the erroneous word as a context window, combines it with the example text of standard usage, and generates targeted modification suggestions through template filling. Preferably, the modification suggestions include three parts: error identifier, correct form, and reason for correction.
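Template filling over a context window could be sketched like this. The template wording, the dictionary keys, and the helper names are hypothetical; only the five-word window and the three-part suggestion (error identifier, correct form, reason) come from the text.

```python
from typing import Dict, List


def context_window(tokens: List[str], index: int, size: int = 5) -> List[str]:
    """Tokens from `size` positions before the error to `size` positions after it."""
    start = max(0, index - size)
    return tokens[start:index + size + 1]


# Hypothetical template; a real system would keep one per error subcategory.
TEMPLATE = 'In "{context}", replace "{wrong}" with "{right}". Reason: {reason}'


def build_suggestion(
    tokens: List[str],
    index: int,
    correct_form: str,
    reason: str,
    size: int = 5,
) -> Dict[str, str]:
    """Fill the template and return the three parts plus the rendered message."""
    window = " ".join(context_window(tokens, index, size))
    return {
        "error": tokens[index],
        "correct_form": correct_form,
        "reason": reason,
        "message": TEMPLATE.format(
            context=window, wrong=tokens[index], right=correct_form, reason=reason
        ),
    }
```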
[0066] Step S4: Content Relevance Assessment Step
[0067] The content relevance assessment step is used to detect whether student essays deviate from the requirements of the topic and to assess the extent to which the essay content covers the requirements of the topic.
[0068] This invention employs a pre-trained language model to extract semantic vector representations of the essay and the requirements of the prompt. Preferably, the pre-trained language model is a bidirectional encoder model based on the Transformer architecture, which is pre-trained on a large-scale English corpus and can capture rich semantic information.
[0069] For the English essay text to be graded, the system inputs it into the encoder of a pre-trained language model. The encoder performs multi-layer self-attention calculations on the input text and outputs the hidden state vector at each word position. This invention uses the output vector at the position corresponding to the special marker [CLS] as the semantic vector representation of the entire essay, denoted as $\mathbf{v}_e$. Similarly, the system inputs the question requirement text into the encoder of the same pre-trained language model to obtain the semantic vector representation of the question, denoted as $\mathbf{v}_q$.
[0070] The system calculates the cosine similarity between the semantic vector of the essay and the semantic vector of the topic as a content relevance score. The formula for calculating the content relevance score is defined as follows:
[0071] $$R = \frac{\mathbf{v}_e \cdot \mathbf{v}_q}{\|\mathbf{v}_e\| \, \|\mathbf{v}_q\|}$$
[0072] where $R$ is the content relevance score, with a value ranging from -1 to 1; $\mathbf{v}_e$ is the semantic vector of the essay; $\mathbf{v}_q$ is the semantic vector of the question; $\cdot$ denotes the vector dot product; and $\|\cdot\|$ denotes the L2 norm of a vector.
[0073] When the content relevance score is lower than the preset semantic similarity threshold $\theta_{sem}$, the system determines that the essay content deviates from the topic requirements. Preferably, $\theta_{sem}$ ranges from 0.5 to 0.7. The system then further analyzes the coverage of key concepts between the essay and the question requirements, identifies key points in the question requirements that are not covered in the essay, and outputs specific prompts for insufficient content coverage.
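The relevance check itself is a cosine similarity followed by a threshold comparison. The sketch below works on plain lists of floats so it stays self-contained; in the described system the two vectors would be the [CLS] outputs of the encoder. The function names and the default threshold of 0.6 (within the stated 0.5 to 0.7 range) are illustrative.

```python
import math
from typing import Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Dot product of u and v divided by the product of their L2 norms."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimension")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def is_off_topic(
    essay_vec: Sequence[float],
    prompt_vec: Sequence[float],
    threshold: float = 0.6,
) -> bool:
    """True when the relevance score falls below the semantic similarity threshold."""
    return cosine_similarity(essay_vec, prompt_vec) < threshold
```

The same two functions can be applied paragraph by paragraph to obtain the finer-grained relevance feedback that a multi-scale strategy calls for.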
[0074] In a preferred embodiment of the present invention, the system employs a multi-scale semantic matching strategy. In addition to calculating semantic similarity at the full-text level, the system also segments the essay by paragraphs and calculates the semantic similarity of each paragraph to the requirements of the topic, thereby identifying which paragraphs are highly relevant to the topic and which deviate from the theme. This multi-scale analysis provides students with more granular feedback on content relevance.
[0075] Step S5: Multi-dimensional Scoring Step
[0076] The multidimensional scoring step is used to calculate scores for four dimensions and a comprehensive score based on error analysis results and discourse evaluation results.
[0077] This invention establishes a multi-dimensional scoring model aligned with the writing scoring standards of College English Test Band 4 and College English Test Band 6. The model scores essays from four dimensions: content, structure, language, and standardization, with each dimension having a maximum score of 100 points.
[0078] The content dimension score is calculated based on the content relevance score and the argumentation logic score. The calculation formula for the content dimension score is defined in this invention as follows:
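[0079] $$S_{content} = 100 \times (\beta_1 R + \beta_2 L_{arg})$$
[0080] where $S_{content}$ is the content dimension score; $R$ is the content relevance score; $L_{arg}$ is the argumentation logic score; and $\beta_1$ and $\beta_2$ are weighting coefficients satisfying $\beta_1 + \beta_2 = 1$ (their preferred values are missing from the source text).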
[0081] The structural dimension score is calculated based on paragraph organization score and coherence score. The formula for calculating the structural dimension score is defined in this invention as follows:
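[0082] $$S_{struct} = 100 \times (\gamma_1 P_{org} + \gamma_2 C_{coh})$$
[0083] where $S_{struct}$ is the structure dimension score; $P_{org}$ is the paragraph organization score; $C_{coh}$ is the coherence score; and $\gamma_1$ and $\gamma_2$ are weighting coefficients satisfying $\gamma_1 + \gamma_2 = 1$ (their preferred values are missing from the source text).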
[0084] The language dimension score is calculated based on the number of grammatical errors and lexical richness. The system counts the total number of grammatical structure errors and lexical collocation errors in the categorized error set, and calculates the lexical richness indicators of the essay, including the type-token ratio and the usage rate of advanced vocabulary. The calculation formula for the language dimension score is defined as follows:
[0085] [Formula not reproduced in the source text.]
[0086] The terms of this formula are: the language dimension score; the total number of grammatical structure errors and lexical collocation errors; the type-token ratio, that is, the ratio of the number of distinct words to the total number of words; the deduction value for each error, preferably between 2 and 5 points; and the weighting coefficient for lexical richness, preferably between 0.2 and 0.4.
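One way the listed quantities could combine into a language dimension score is sketched below. The combination rule is an assumption made purely for illustration: an error-based score, floored at zero, blended with the type-token ratio scaled to 100 points. The text lists the inputs but this sketch should not be read as the patented formula, and the usage rate of advanced vocabulary is omitted.

```python
from typing import List


def type_token_ratio(tokens: List[str]) -> float:
    """Number of distinct words (case-insensitive) divided by total words."""
    if not tokens:
        return 0.0
    return len({t.lower() for t in tokens}) / len(tokens)


def language_score(
    error_count: int,
    ttr: float,
    deduction: float = 3.0,        # points per error (text suggests 2 to 5)
    richness_weight: float = 0.3,  # lexical richness weight (text suggests 0.2 to 0.4)
) -> float:
    """Assumed blend of an error-deduction score and lexical richness, on a 100-point scale."""
    accuracy = max(0.0, 100.0 - error_count * deduction)  # assumption: floor at zero
    return (1.0 - richness_weight) * accuracy + richness_weight * 100.0 * ttr
```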
[0087] The normativity dimension score is calculated based on the number of errors in punctuation usage and naturalness of expression. The formula for calculating the normativity dimension score is defined in this invention as follows:
[0088] $$S_{norm} = 100 - N_{norm} \cdot p_{norm}$$
[0089] where $S_{norm}$ is the normativity dimension score; $N_{norm}$ is the total number of punctuation usage errors and expression idiomaticity errors; and $p_{norm}$ is the deduction value for each error, preferably between 1 and 3 points.
[0090] The system calculates a comprehensive score by weighting and summing the scores of the four dimensions according to preset dimension weighting coefficients. The formula for calculating the comprehensive score is defined as follows:
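[0091] $$S_{total} = w_c S_{content} + w_s S_{struct} + w_l S_{lang} + w_n S_{norm}$$
[0092] where $S_{total}$ is the comprehensive score; $S_{content}$, $S_{struct}$, $S_{lang}$, and $S_{norm}$ are the scores of the content, structure, language, and normativity dimensions; and $w_c$, $w_s$, $w_l$, and $w_n$ are the corresponding weighting coefficients, satisfying $w_c + w_s + w_l + w_n = 1$. The preferred coefficient values are set according to the scoring criteria for College English Test Band 4 and College English Test Band 6 writing (the specific values are missing from the source text).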
[0093] The system determines the overall grade based on the comprehensive score. This invention divides the overall grade into five levels: Excellent (85 to 100 points), Good (70 to 84 points), Average (60 to 69 points), Pass (50 to 59 points), and Fail (49 points or below).
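The weighted summation and the five-level grade mapping can be sketched in Python as follows. The weight values in the usage example are placeholders chosen only so that they sum to 1; they are not the preferred values of the invention. The function names are hypothetical, and treating each band's lower bound as a threshold for fractional scores is an assumption.

```python
def comprehensive_score(scores: dict, weights: dict) -> float:
    """Weighted sum of the four dimension scores; weights must sum to 1."""
    if set(scores) != set(weights):
        raise ValueError("scores and weights must cover the same dimensions")
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("dimension weights must sum to 1")
    return sum(weights[dim] * scores[dim] for dim in weights)


def overall_grade(score: float) -> str:
    """Map a 0-100 comprehensive score to one of the five grade levels."""
    if score >= 85:
        return "Excellent"
    if score >= 70:
        return "Good"
    if score >= 60:
        return "Average"
    if score >= 50:
        return "Pass"
    return "Fail"


if __name__ == "__main__":
    # Placeholder weights (ASSUMPTION), not the patent's preferred values.
    weights = {"content": 0.3, "structure": 0.2,
               "language": 0.3, "normativity": 0.2}
    scores = {"content": 80, "structure": 70,
              "language": 90, "normativity": 60}
    total = comprehensive_score(scores, weights)
    print(round(total, 2), overall_grade(total))
```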
[0094] Step S6: Report Generation Step
[0095] The report generation step integrates the analysis results from the preceding steps to generate a structured report and push personalized learning resources.
[0096] The grading report includes the following modules: an error annotation module, a scoring results module, and a learning resource module.
[0097] The error annotation module marks the location of each error in the original text using inline annotations. The system uses a distinct marker for each error type, for example underlining for lexical collocation errors, wavy lines for grammatical structure errors, circles for punctuation usage errors, and boxes for expression idiomaticity errors. Each error marker is accompanied by a floating tooltip displaying the error type name, a description of the error's nature, a suggested correction, and a correct example.
[0098] The scoring results module displays the scores for the four dimensions and the overall grade. The system visualizes the score distribution across the content, structure, language, and normativity dimensions using a radar chart, allowing students to see their performance level in each dimension at a glance. The system also provides comparative data against the class average and historical best scores, helping students understand their relative position and progress.
[0099] The learning resource module pushes targeted learning resources based on the distribution of students' error types. The system statistically analyzes the frequency of errors in each sub-category of the categorized error set and selects the top three error categories by frequency as key learning categories. The formula for calculating the learning resource matching score is defined as follows:
[0100] ,
[0101] where the terms denote, respectively: the matching score between a given learning resource and a given error type; the keyword set of that learning resource; and the keyword set of that error type.
[0102] The system retrieves links to micro-lessons from a pre-set teaching resource library that have the highest matching score with the key learning categories. The teaching resource library contains over two hundred micro-lessons explaining English writing knowledge points, each lasting five to ten minutes and covering grammar rules, common error analysis, and correct usage demonstrations.
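The selection of key learning categories and the keyword-based matching described above can be sketched in Python as follows. Selecting the top three categories by frequency follows the description; the overlap measure, however, is an assumption: Jaccard similarity between the two keyword sets is used here as a stand-in for the matching formula of the invention. All function names and the sample data are hypothetical.

```python
from collections import Counter


def key_learning_categories(error_categories: list, top_n: int = 3) -> list:
    """Return the top_n most frequent error sub-categories."""
    return [cat for cat, _ in Counter(error_categories).most_common(top_n)]


def match_score(resource_keywords: set, error_keywords: set) -> float:
    """ASSUMPTION: Jaccard overlap of the two keyword sets."""
    union = resource_keywords | error_keywords
    if not union:
        return 0.0
    return len(resource_keywords & error_keywords) / len(union)


def best_resource(resources: dict, error_keywords: set) -> str:
    """Name of the resource whose keywords best match the error type."""
    return max(resources,
               key=lambda name: match_score(resources[name], error_keywords))


if __name__ == "__main__":
    errors = (["comma"] * 4 + ["sv_agreement"] * 3
              + ["collocation"] * 2 + ["article"])
    print(key_learning_categories(errors))

    # Hypothetical micro-lesson keyword sets.
    lessons = {
        "lesson_agreement": {"subject", "verb", "agreement"},
        "lesson_commas": {"comma", "punctuation", "clause"},
    }
    print(best_resource(lessons, {"verb", "agreement", "tense"}))
```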
[0103] The system also extracts reinforcement practice questions corresponding to the key learning categories from the question bank. The question bank contains over 5,000 English writing practice questions, categorized and labeled according to error type and difficulty level. The system selects practice questions of appropriate difficulty based on the severity of the student's errors, pushing three to five practice questions for each key learning category. Error severity is divided into three levels according to how strongly the error affects semantic expression: minor, moderate, and serious. Minor errors do not affect semantic comprehension, such as spelling mistakes, and incur a deduction of 2 points; moderate errors partially affect semantic comprehension, such as inappropriate collocation, and incur a deduction of 3 points; serious errors severely affect semantic comprehension, such as subject-verb disagreement, and incur a deduction of 5 points.
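A minimal Python sketch of the severity levels and the question selection follows. The deduction values (2, 3, and 5 points) and the cap of five questions come from the description. The mapping from error severity to question difficulty is an assumption, since the description does not state which difficulty suits which severity. The `Question` type and the function names are hypothetical.

```python
from dataclasses import dataclass

# Deduction per error by severity level, as stated in the description.
SEVERITY_DEDUCTION = {"minor": 2, "moderate": 3, "serious": 5}

# ASSUMPTION: the direction of this mapping is illustrative only.
SEVERITY_TO_DIFFICULTY = {"minor": "hard", "moderate": "medium",
                          "serious": "easy"}


@dataclass
class Question:
    qid: str
    category: str    # error type label
    difficulty: str  # "easy", "medium", or "hard"


def total_deduction(severities: list) -> int:
    """Sum the deductions for a list of error severity labels."""
    return sum(SEVERITY_DEDUCTION[s] for s in severities)


def select_questions(bank: list, category: str, severity: str,
                     limit: int = 5) -> list:
    """Pick up to `limit` questions matching the category and difficulty."""
    difficulty = SEVERITY_TO_DIFFICULTY[severity]
    matches = [q for q in bank
               if q.category == category and q.difficulty == difficulty]
    return matches[:limit]


if __name__ == "__main__":
    bank = [Question(f"q{i}", "collocation", "medium") for i in range(7)]
    bank.append(Question("q_comma", "comma", "easy"))
    print([q.qid for q in select_questions(bank, "collocation", "moderate")])
    print(total_deduction(["minor", "moderate", "serious"]))
```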
[0104] In a preferred embodiment of the invention, the system also supports error trend analysis. When a student uses the grading system multiple times, the system records the distribution of error types each time, generates an error trend chart, and helps students track their improvement in various error types. For error types that have not been improved over a long period, the system will increase the priority of pushing learning resources of that type to enhance learning effectiveness.
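The trend analysis can be sketched in Python as follows. The description does not define when an error type counts as "not improved over a long period", so the rule used here is an assumption: a type is flagged when, across the most recent `window` submissions, its latest count is non-zero and not lower than its count at the start of the window. The function names, the window size, and the boost value are hypothetical.

```python
def stagnant_error_types(history: list, window: int = 3) -> set:
    """Error types that have not improved over the last `window` submissions.

    history: one dict per submission, oldest first, mapping
    error type -> number of occurrences.
    """
    if len(history) < window:
        return set()
    recent = history[-window:]
    seen_types = set().union(*recent)  # union of the dict keys
    return {
        t for t in seen_types
        if recent[-1].get(t, 0) > 0
        and recent[-1].get(t, 0) >= recent[0].get(t, 0)
    }


def boosted_priorities(base: dict, stagnant: set, boost: float = 1.0) -> dict:
    """Raise the resource-push priority of stagnant error types."""
    return {t: p + (boost if t in stagnant else 0.0) for t, p in base.items()}


if __name__ == "__main__":
    history = [
        {"article": 4, "comma": 5},
        {"article": 4, "comma": 3},
        {"article": 5, "comma": 1},
    ]
    flagged = stagnant_error_types(history)
    print(flagged)
    print(boosted_priorities({"article": 1.0, "comma": 1.0}, flagged))
```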
[0105] In summary, the intelligent grading method for college English writing based on natural language processing provided by this invention achieves comprehensive analysis and accurate grading of English essays through six steps: text preprocessing, multi-level language analysis, error classification, content relevance assessment, multi-dimensional scoring, and grading report generation. This method conducts in-depth analysis at three levels: vocabulary, syntax, and discourse, categorizing errors into more than twenty subcategories, establishing a multi-dimensional scoring model aligned with authoritative grading standards, and pushing targeted learning resources to form a personalized learning loop of grading-learning-reinforcement.
[0106] Please refer to Figure 2, which is an architecture diagram of the intelligent grading system for college English writing based on natural language processing according to the present invention. The system includes a text preprocessing module 1, a multi-level language analysis module 2, an error classification module 3, a content relevance assessment module 4, a multi-dimensional scoring module 5, and a grading report generation module 6.
[0107] The text preprocessing module 1 is used to acquire the English essay text to be graded, perform word segmentation and syntactic parsing on the English essay text, and generate a word segmentation sequence and a syntactic parse tree. In one embodiment of the present invention, the text preprocessing module 1 includes a text acquisition unit, a word segmentation unit, and a syntactic parsing unit. The text acquisition unit acquires the English essay text submitted by students from an online writing platform through an application programming interface. The word segmentation unit uses a byte-pair encoding word segmenter to segment the essay text into a word sequence, as described in step S1 of the method embodiment. The syntactic parsing unit uses a dependency parser and a constituent parser to generate a dependency parse tree and a constituent parse tree, respectively.
[0108] The multi-level language analysis module 2 receives the word segmentation sequence and syntactic parse tree output by the text preprocessing module 1, performs in-depth analysis at the lexical, syntactic, and discourse levels, and outputs a lexical error list, a syntactic error list, and a discourse evaluation result. In one embodiment of the invention, the multi-level language analysis module 2 includes a lexical analysis unit, a syntactic analysis unit, and a discourse analysis unit. The lexical analysis unit performs context-sensitive spell checking and collocation checking, calculates context-sensitive spell checking scores, and generates a lexical error list. The syntactic analysis unit detects subject-verb agreement errors and verb-object collocation errors based on dependency parsing trees, and detects clause structure incompleteness errors based on constituent parsing trees, generating a syntactic error list. The discourse analysis unit evaluates paragraph organization, coherence, and argumentation logic, calculates paragraph organization scores, coherence scores, and argumentation logic scores, and outputs discourse evaluation results.
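The context-sensitive spell checking performed by the lexical analysis unit can be illustrated with a small Python sketch. It generates candidates by edit distance against a dictionary and then ranks them by context. Two simplifications are assumptions: a table of bigram counts stands in for the language model, and only the preceding word is used as context. The function names, the threshold value, and the sample data are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a rolling one-row table."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def suggest(word: str, prev_word: str, dictionary: set,
            bigram_counts: dict, threshold: int = 3):
    """Return the best correction for `word`, or None if none is needed.

    Candidates are dictionary words whose edit distance is below
    `threshold`; the winner is the candidate seen most often after
    `prev_word` (ASSUMPTION: bigram counts replace the language model).
    """
    if word in dictionary:
        return None
    candidates = [w for w in dictionary if edit_distance(word, w) < threshold]
    if not candidates:
        return None
    return max(candidates, key=lambda w: bigram_counts.get((prev_word, w), 0))


if __name__ == "__main__":
    dictionary = {"their", "there", "the"}
    bigrams = {("lost", "their"): 5, ("lost", "there"): 1}
    print(suggest("thier", "lost", dictionary, bigrams))
```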
[0109] Error classification module 3 is used to categorize errors in the lexical and syntactic error lists output by multi-level language analysis module 2 into subcategories, and to add error nature descriptions and modification suggestions, outputting a set of categorized and labeled errors. In one embodiment of the present invention, error classification module 3 includes an error mapping unit, a knowledge base retrieval unit, and a suggestion generation unit. The error mapping unit categorizes detected errors into corresponding subcategories such as lexical collocation, grammatical structure, punctuation usage, and expression idiomaticity according to preset error type mapping rules. The knowledge base retrieval unit retrieves error cause explanation texts and standard usage example texts corresponding to the subcategories from the error knowledge base. The suggestion generation unit generates modification suggestions by combining the error context and standard usage example texts.
[0110] The content relevance assessment module 4 is used to calculate the semantic similarity between the essay and the question requirements using a pre-trained language model, and outputs a content relevance score. In one embodiment of the present invention, the content relevance assessment module 4 includes a semantic vector extraction unit and a similarity calculation unit. The semantic vector extraction unit inputs the essay text and the question requirement text into the pre-trained language model encoder respectively, and extracts the corresponding semantic vector representations. The similarity calculation unit calculates the cosine similarity between the two semantic vectors as the content relevance score. When the content relevance score is lower than a preset semantic similarity threshold, it is marked as content deviating from the question requirements.
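The similarity calculation unit's cosine comparison can be sketched in plain Python as follows. The vectors would come from the pre-trained language model encoder, which is outside the scope of this sketch, so short hand-written vectors are used instead. The threshold value of 0.5 is a placeholder, not a value from the description, and the function names are hypothetical.

```python
import math


def cosine_similarity(u: list, v: list) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimension")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat a zero vector as unrelated
    return dot / (norm_u * norm_v)


def assess_relevance(essay_vec: list, prompt_vec: list,
                     threshold: float = 0.5):
    """Return (relevance score, off_topic flag).

    ASSUMPTION: 0.5 is a placeholder for the preset semantic
    similarity threshold.
    """
    score = cosine_similarity(essay_vec, prompt_vec)
    return score, score < threshold


if __name__ == "__main__":
    print(assess_relevance([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
    print(assess_relevance([1.0, 0.0], [0.0, 1.0]))
```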
[0111] The multi-dimensional scoring module 5 is used to calculate content dimension scores, structure dimension scores, language dimension scores, and normative dimension scores based on error analysis results and discourse evaluation results, and to calculate a comprehensive score and determine a comprehensive level based on preset dimension weight coefficients. In one embodiment of the present invention, the multi-dimensional scoring module 5 includes a sub-item scoring unit and a comprehensive scoring unit. The sub-item scoring unit calculates scores for each of the four dimensions, wherein the content dimension score is calculated based on content relevance score and argumentation logic score, the structure dimension score is calculated based on paragraph organization score and coherence score, the language dimension score is calculated based on the number of grammatical errors and lexical richness, and the normative dimension score is calculated based on the number of punctuation usage errors and expression idiomaticity errors. The comprehensive scoring unit weights and sums the scores of the four dimensions according to preset dimension weight coefficients to obtain a comprehensive score, and determines a comprehensive level based on the comprehensive score.
[0112] The grading report generation module 6 integrates the analysis results from the aforementioned modules to generate a grading report and push personalized learning resources. In one embodiment of the invention, the grading report generation module 6 includes an error annotation unit, a scoring display unit, and a resource push unit. The error annotation unit presents the location of each error and the suggested corrections in the original essay text using inline annotations. The scoring display unit visualizes the score distribution across the four dimensions using a radar chart and displays the overall grade. The resource push unit counts the distribution of error types, selects the three most frequent error categories as the key learning categories, retrieves matching micro-lesson links from the teaching resource database, and extracts corresponding reinforcement practice questions from the question bank.
[0113] In a preferred embodiment of the present invention, the above modules are deployed on a cloud server and communicate with each other through a microservice architecture. The text preprocessing module 1 and the multi-level language analysis module 2 are deployed on high-performance computing nodes equipped with graphics processors to accelerate the inference computation of the deep learning model. The error classification module 3 and the content relevance assessment module 4 are deployed on general-purpose computing nodes. The multi-dimensional scoring module 5 and the grading report generation module 6 are deployed on an application server and communicate asynchronously with the front-end interface through a message queue.
[0114] In summary, the intelligent grading system for college English writing based on natural language processing provided by this invention achieves automated grading of English essays through the collaborative work of six functional modules. This system can analyze language problems in essays at multiple levels, provide fine-grained error classification and revision suggestions, establish a multi-dimensional scoring model aligned with authoritative scoring standards, and push targeted learning resources. It effectively solves the problems of large workload, long feedback cycles, and insufficiently detailed error type labeling in college English writing teaching.
Claims
1. A method for intelligent grading of college English writing based on natural language processing, characterized in that it comprises:
a text preprocessing step of obtaining the English essay text to be graded, performing word segmentation and syntactic parsing on the English essay text, and generating a word segmentation sequence and a syntactic parse tree;
a multi-level language analysis step of receiving the word segmentation sequence and the syntactic parse tree; at the lexical level, using context-sensitive spell checking and part-of-speech tagging to identify word usage errors; at the syntactic level, using dependency parsing and constituent parsing to detect subject-verb agreement errors, tense misuse errors, and clause structure incompleteness errors; at the discourse level, using discourse structure analysis to evaluate paragraph organization, coherence, and argumentation logic; and outputting a lexical-level error list, a syntactic-level error list, and a discourse-level evaluation result;
an error classification step of classifying, according to preset error type mapping rules, the errors in the lexical-level error list and the syntactic-level error list into the corresponding subcategories of lexical collocation, grammatical structure, punctuation usage, and expression idiomaticity, adding an error nature description and modification suggestions to each type of error, and outputting a classified and labeled error set;
a content relevance assessment step of obtaining the question requirement text corresponding to the English essay text, using a pre-trained language model to extract the semantic vector of the English essay text and the semantic vector of the question requirement text, and calculating the similarity score between the two semantic vectors as the content relevance score; and
a multi-dimensional scoring step of calculating a content dimension score, a structure dimension score, a language dimension score, and a normative dimension score based on the number and severity of errors in each category of the classified and labeled error set, the discourse-level evaluation result, and the content relevance score, weighting and summing the four dimension scores according to preset dimension weight coefficients to obtain a comprehensive score, and determining the comprehensive level based on the comprehensive score.
2. The method according to claim 1, characterized in that identifying word usage errors at the lexical level using context-sensitive spell checking comprises: for each lexical unit in the word segmentation sequence, calculating the edit distance between the lexical unit and candidate words in a standard dictionary; when the edit distance is less than a preset spelling similarity threshold, taking the candidate word as a spelling correction suggestion; and, based on the lexical units before and after the lexical unit, using a language model to calculate the probability of each candidate word appearing in the current context, and selecting the candidate word with the highest probability as the final correction suggestion.
3. The method according to claim 1, characterized in that detecting grammatical errors at the syntactic level using dependency parsing and constituent parsing comprises: generating a dependency graph using a dependency parser, extracting subject-verb dependency arcs and verb-object dependency arcs, and detecting subject-verb agreement errors and verb-object collocation errors; and generating a phrase structure tree using a constituent parser, identifying clause boundary markers and the internal components of each clause, and determining a clause structure incompleteness error when a clause lacks a subject or verb component and its syntactic integrity score is lower than a preset syntactic integrity threshold.
4. The method according to claim 1, characterized in that calculating the similarity score between the two semantic vectors in the content relevance assessment step comprises: inputting the English essay text into the encoder of a pre-trained language model, and taking the sentence-level representation vector output by the encoder as the essay semantic vector; inputting the question requirement text into the encoder of the same pre-trained language model, and taking the sentence-level representation vector output by the encoder as the question semantic vector; and calculating the cosine similarity between the essay semantic vector and the question semantic vector, and marking the content as deviating from the question requirements when the cosine similarity is lower than a preset semantic similarity threshold.
5. The method according to claim 1, characterized in that weighting and summing the four dimension scores according to preset dimension weight coefficients to obtain the comprehensive score in the multi-dimensional scoring step comprises: determining the weights of the content dimension score, the structure dimension score, the language dimension score, and the normative dimension score according to the College English Test Band 4 and College English Test Band 6 writing scoring standards, wherein the sum of the four weights is 1; and multiplying the four dimension scores by their corresponding weights and summing the results to obtain the comprehensive score.
6. The method according to claim 1, characterized in that evaluating paragraph organization, coherence, and argumentation logic at the discourse level using discourse structure analysis comprises: identifying paragraph boundaries and topic sentences in the English essay text, and evaluating the semantic relevance of each paragraph's topic sentence to the central argument of the entire text as a paragraph organization score; identifying conjunctions and referential relationships between adjacent sentences, and computing the frequency of conjunction usage and the completeness of the referential chain as a coherence score; and detecting the strength of the supporting relationship between the arguments and the thesis using an argumentation relationship identification model as an argumentation logic score.
7. The method according to claim 1, characterized in that adding an error nature description and modification suggestions to each type of error in the error classification step comprises: searching a preset error knowledge base according to the subcategory to which the error belongs, and obtaining the error cause explanation text and the standard usage example text corresponding to the subcategory; and generating targeted modification suggestion text based on the context of the error in the English essay text and the standard usage example text.
8. The method according to claim 1, characterized in that it further comprises a grading report generation step of: presenting the location of and suggested corrections for each error in the English essay text using inline annotations based on the classified and labeled error set; retrieving explanatory micro-lesson links and reinforcement exercises matching the error types according to the distribution of error types in the classified and labeled error set; and integrating the four dimension scores, the comprehensive level, the inline annotations, the explanatory micro-lesson links, and the reinforcement exercises to generate a grading report; wherein retrieving the matching explanatory micro-lesson links and reinforcement exercises according to the distribution of error types comprises: counting the frequency of errors in each subcategory of the classified and labeled error set and selecting the top three error categories by frequency as key learning categories; retrieving the explanatory micro-lesson links corresponding to the key learning categories from a preset teaching resource database; and extracting targeted reinforcement exercise questions from a question bank according to the key learning categories.
9. The method according to claim 8, characterized in that performing word segmentation on the English essay text in the text preprocessing step comprises: using a subword-based segmenter to divide the English essay text into a sequence of lexical units, wherein the lexical unit sequence retains morphological inflection information; and, for out-of-vocabulary words, using byte-pair encoding to split each out-of-vocabulary word into a combination of known subword units.
10. An intelligent grading system for college English writing based on natural language processing, used to implement the method of claim 9, characterized in that it comprises:
a text preprocessing module, used to obtain the English essay text to be graded, perform word segmentation and syntactic parsing on the English essay text, and generate a word segmentation sequence and a syntactic parse tree;
a multi-level language analysis module, used to receive the word segmentation sequence and the syntactic parse tree; at the lexical level, to use context-sensitive spell checking and part-of-speech tagging to identify word usage errors; at the syntactic level, to use dependency parsing and constituent parsing to detect subject-verb agreement errors, tense misuse errors, and clause structure incompleteness errors; at the discourse level, to use discourse structure analysis to evaluate paragraph organization, coherence, and argumentation logic; and to output a lexical-level error list, a syntactic-level error list, and a discourse-level evaluation result;
an error classification module, used to classify, according to preset error type mapping rules, the errors in the lexical-level error list and the syntactic-level error list into the corresponding subcategories of lexical collocation, grammatical structure, punctuation usage, and expression idiomaticity, add an error nature description and modification suggestions to each type of error, and output a classified and labeled error set;
a content relevance assessment module, used to obtain the question requirement text corresponding to the English essay text, extract the semantic vector of the English essay text and the semantic vector of the question requirement text using a pre-trained language model, and calculate the similarity score between the two semantic vectors as the content relevance score;
a multi-dimensional scoring module, used to calculate a content dimension score, a structure dimension score, a language dimension score, and a normative dimension score based on the number and severity of errors in each category of the classified and labeled error set, the discourse-level evaluation result, and the content relevance score, to weight and sum the four dimension scores according to preset dimension weight coefficients to obtain a comprehensive score, and to determine the comprehensive level based on the comprehensive score; and
a grading report generation module, used to present the location of and modification suggestions for each error in the English essay text using inline annotations based on the classified and labeled error set, to retrieve explanatory micro-lesson links and reinforcement exercises that match the error types according to the distribution of error types in the classified and labeled error set, and to integrate the four dimension scores, the comprehensive level, the inline annotations, the explanatory micro-lesson links, and the reinforcement exercises to generate a grading report.