Method for grading Chinese electronic document reading on the Internet

The method belongs to the fields of electronic documents and Internet technology and is applied in electronic digital data processing, special data processing applications, instruments, etc. It addresses the lack of a reading-level grading method for Chinese documents.

Inactive Publication Date: 2011-10-12
NANJING UNIV
Cites: 2; cited by: 18

AI Technical Summary

Problems solved by technology

Using this type of information is more effective for ideographic languages, but it has not yet been applied to grading the reading level of Chinese documents.

Examples

Embodiment Construction

[0037] Figure 1 shows the technical framework of the method for grading the reading level of Chinese electronic documents. The inputs are the target document to be graded and a library of documents whose reading levels were determined in advance. The output is the reading level to which the target document belongs. The framework has five modules: (1) determine the frequency distributions of Chinese characters, phrases, and sentence structure indicators at each reading level; (2) select the Chinese characters and phrases to be used for grading; (3) analyze the word composition of the target document; (4) compute the sentence structure indicators of the document; and (5) calculate the reading level of the target document.
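Modules (3) and (4) can be illustrated with a minimal Python sketch. It is not the patent's implementation: it assumes single Chinese characters stand in for words and phrases (no word segmenter is used), that sentences end at the punctuation marks listed in `SENTENCE_END`, and that lengths are measured in Chinese characters. The three indicators are the ones the abstract names; all function and key names are hypothetical.

```python
import re
from collections import Counter

# Assumption: sentences end at these Chinese or ASCII terminators.
SENTENCE_END = re.compile(r"[。！？!?]+")

def is_cjk(ch):
    """True for characters in the main CJK Unified Ideographs block."""
    return "\u4e00" <= ch <= "\u9fff"

def word_composition(text):
    """Return (character, occurrence count) pairs as a Counter."""
    return Counter(ch for ch in text if is_cjk(ch))

def sentence_structure(text):
    """Compute three sentence-structure indicators, in Chinese characters."""
    paragraphs = [p for p in text.split("\n") if p.strip()]
    sentences = [s for p in paragraphs
                 for s in SENTENCE_END.split(p) if s.strip()]
    para_lens = [sum(is_cjk(c) for c in p) for p in paragraphs]
    sent_lens = [sum(is_cjk(c) for c in s) for s in sentences]
    if not sent_lens:
        return {"avg_paragraph_len": 0.0,
                "avg_sentence_len": 0.0,
                "sentence_len_range": 0}
    return {
        "avg_paragraph_len": sum(para_lens) / len(para_lens),
        "avg_sentence_len": sum(sent_lens) / len(sent_lens),
        # Difference between the longest and the shortest sentence.
        "sentence_len_range": max(sent_lens) - min(sent_lens),
    }

# Made-up two-paragraph document, for illustration only.
doc = "我爱读书。读书很好！\n你呢？"
composition = word_composition(doc)
indicators = sentence_structure(doc)
```

A real system would likely replace the per-character counting with a Chinese word segmenter so that phrases are counted as units.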

[0038] First, determine the frequency distribution of Chinese characters, phrases and sentence structure indicators at each reading level. Let the number of Chinese document reading levels be m. The value of m can ...
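As a rough illustration of this first step, the sketch below estimates a per-level frequency distribution from a library of already graded documents. It assumes the library is a dictionary from level number (1 to m) to a list of document strings, counts single Chinese characters only, and uses plain relative frequencies; the function name and the toy data are hypothetical and not taken from the patent.

```python
from collections import Counter

def level_frequency_distributions(library):
    """Map each reading level to {character: relative frequency}.

    `library` maps a level (1..m) to the list of document strings
    already graded at that level.
    """
    distributions = {}
    for level, documents in library.items():
        counts = Counter(
            ch for doc in documents for ch in doc
            if "\u4e00" <= ch <= "\u9fff"  # keep Chinese characters only
        )
        total = sum(counts.values())
        distributions[level] = (
            {ch: n / total for ch, n in counts.items()} if total else {}
        )
    return distributions

# Toy library with m = 2 levels (made-up text, for illustration only).
library = {1: ["我爱书", "我读书"], 2: ["阅读分级"]}
dist = level_frequency_distributions(library)
```

The same counting could be repeated for phrases and for sentence structure indicators once those are extracted from each library document.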

Abstract

The invention discloses a method for grading the reading level of Chinese electronic documents on the Internet. The method first determines the frequency distributions of Chinese characters, phrases, and sentence structure indicators in documents of each reading level. It then selects the Chinese characters and phrases to be used for grading, which avoids interference from very common and very rare words. Next, it analyzes the word composition of the target document, representing the document as a vector of two-tuples (word, number of occurrences), and computes the document's sentence structure indicators, including the average paragraph length, the average sentence length, and the difference in length between the longest and shortest sentences. Finally, it applies the Naive Bayes method to determine the reading level of the document from its word composition and sentence structure information. In this way, the reading level of a Chinese electronic document is determined efficiently from the frequency distributions of its words and structure indicators across the reading levels.
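The selection and Naive Bayes steps described in the abstract can be sketched as follows. This is an illustrative sketch under stated assumptions, not the patented procedure: single characters stand in for words, the thresholds `min_total` and `max_share` used to drop rare and very common characters are invented for the example, add-one smoothing is an added assumption, and the sentence structure indicators are left out because the text shown here does not say how they enter the model. All function names are hypothetical.

```python
import math
from collections import Counter

def train(library):
    """Count characters per level. `library` maps level -> list of documents."""
    counts = {lvl: Counter(ch for d in docs for ch in d)
              for lvl, docs in library.items()}
    doc_totals = {lvl: len(docs) for lvl, docs in library.items()}
    return counts, doc_totals

def select_features(counts, min_total=2, max_share=0.25):
    """Drop rarely used characters and ones that dominate the library."""
    overall = Counter()
    for c in counts.values():
        overall.update(c)
    grand_total = sum(overall.values())
    return {ch for ch, n in overall.items()
            if n >= min_total and n / grand_total <= max_share}

def classify(text, counts, doc_totals, features):
    """Return the level with the highest Naive Bayes log-score."""
    composition = Counter(ch for ch in text if ch in features)
    n_docs = sum(doc_totals.values())
    vocab = len(features)
    best_level, best_score = None, -math.inf
    for level in sorted(counts):
        level_total = sum(counts[level][ch] for ch in features)
        # Prior: share of library documents at this level.
        score = math.log(doc_totals[level] / n_docs)
        for ch, k in composition.items():
            # Add-one smoothing (an assumption) keeps unseen characters finite.
            p = (counts[level][ch] + 1) / (level_total + vocab)
            score += k * math.log(p)
        if score > best_score:
            best_level, best_score = level, score
    return best_level

# Toy graded library (made-up text, for illustration only).
library = {1: ["我的书的", "我的书猫"], 2: ["阅读的分级的", "分级阅读的"]}
counts, doc_totals = train(library)
features = select_features(counts)
easy = classify("我的书", counts, doc_totals, features)
hard = classify("阅读分级", counts, doc_totals, features)
```

In the toy library, 的 is filtered out as too common and 猫 as too rare, so neither influences the score; the remaining characters decide the level.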

Description

Technical field

[0001] The invention relates to a method for grading the reading level of Chinese electronic documents. It addresses the growing popularity of electronic documents in the Internet age, where documents need to be assigned reading levels so that they suit readers of different ages or levels of Chinese proficiency.

Background

[0002] With the rapid development of the Internet and the increasing popularity of smartphones, tablet computers, and other portable electronic devices, electronic documents have become the main object of people's daily reading. Teenagers are now one of the mainstream groups reading electronic documents. In addition, learning Chinese has become popular abroad, and many foreign learners also study Chinese through electronic documents. All of this calls for a reasonable definition of the reading level of electronic documents, so that readers can choose appropriate Chinese electronic doc...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30, G06F17/27
Inventors: 顾庆, 李敏, 骆斌, 汤九斌, 陈道蓄
Owner: NANJING UNIV