Word counting method and device
A word counting method and device, applied in the computer field, that solve the problem of being unable to count words in text containing multiple languages, achieving accurate word counts while saving time.
Examples
Embodiment 1
[0012] Embodiment 1: as shown in Figure 1, a word counting method comprises the following steps. Step 1: read the text content into memory in batches of a certain length; the length can be a fixed number of bytes, a sentence, a passage, or a whole article, and can be set as needed. Step 2: after a batch of text has been read into memory, scan it, identify and count the punctuation marks in the text, and then remove them to form a new string that contains no punctuation. Step 3: read the words or characters in the punctuation-free string, identifying the language of each and counting it, one word or character at a time. Step 4: add up the punctuation counts and the word or character counts for each language.
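The four steps can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: one line of input is treated as one batch, punctuation is detected through Unicode general categories, punctuation marks are replaced by spaces rather than deleted (so that "a,b" still reads as two words), and language identification is reduced to "Chinese character" versus "any other word". All function names are hypothetical.

```python
import io
import unicodedata
from collections import Counter


def split_punctuation(text):
    """Step 2: count punctuation marks and return the text without them.

    Assumption: a mark is any character whose Unicode category starts
    with "P"; each mark is replaced by a space to keep word boundaries.
    """
    count = 0
    kept = []
    for ch in text:
        if unicodedata.category(ch).startswith("P"):
            count += 1
            kept.append(" ")
        else:
            kept.append(ch)
    return count, "".join(kept)


def count_units(text):
    """Step 3 (simplified): count Chinese characters one by one and
    every other run of non-space characters as one word."""
    counts = Counter()
    in_word = False
    for ch in text:
        if "\u4e00" <= ch <= "\u9fff":  # CJK Unified Ideographs block
            counts["chinese"] += 1
            in_word = False
        elif ch.isspace():
            in_word = False
        elif not in_word:
            counts["word"] += 1
            in_word = True
    return counts


def count_text(stream):
    """Steps 1 and 4: read batch by batch and accumulate the counts."""
    totals = Counter()
    for batch in stream:  # Step 1: here, one line is one batch
        n_punct, stripped = split_punctuation(batch)
        totals["punctuation"] += n_punct
        totals.update(count_units(stripped))
    return totals


if __name__ == "__main__":
    sample = io.StringIO("Hello, world!\n\u4f60\u597d\uff0c\u4e16\u754c\u3002\n")
    totals = count_text(sample)
    print(dict(totals), "total:", sum(totals.values()))
```

Batching by line avoids splitting a word across two batches; a fixed-byte batch, which the text also allows, would need extra handling at batch boundaries.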
[0013] The present invention provides a method for counting words, which counts the number of words in a section of text or...
Embodiment 2
[0014] Embodiment 2: as shown in Figure 2, building on Embodiment 1, step 3 preferably identifies the language and counts as follows: first test whether the word or character is Chinese and, if so, count it; if not, test whether it is English and, if so, count it; if not, test whether it is French and, if so, count it; if not, continue testing against the other languages until the language of each word or character has been identified.
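The sequential test described above amounts to an ordered list of detectors in which the first match wins. The following Python sketch illustrates that control flow only. The three predicates are deliberately crude assumptions (character-set checks, not real language detection), so an unaccented French word such as "bonjour" would be labelled English here; all names are hypothetical.

```python
from collections import Counter

# Assumed set of accented letters used to recognise French-looking tokens.
FRENCH_EXTRA = set("àâæçéèêëîïôœùûüÿ")


def is_chinese(token):
    """True if every character lies in the CJK Unified Ideographs block."""
    return bool(token) and all("\u4e00" <= ch <= "\u9fff" for ch in token)


def is_english(token):
    """Crude assumption: a token of ASCII letters only is English."""
    return token.isascii() and token.isalpha()


def is_french(token):
    """Crude assumption: letters that are ASCII or French accented letters."""
    lowered = token.lower()
    return lowered.isalpha() and all(
        ch.isascii() or ch in FRENCH_EXTRA for ch in lowered
    )


# Order matters: Chinese is tested first, then English, then French.
DETECTORS = [("chinese", is_chinese), ("english", is_english), ("french", is_french)]


def identify(token):
    """Return the first language whose detector accepts the token."""
    for name, accepts in DETECTORS:
        if accepts(token):
            return name
    return "other"


def count_by_language(tokens):
    """Identify and count each token in turn."""
    return Counter(identify(token) for token in tokens)
```

Further languages can be supported by appending entries to `DETECTORS`, which mirrors the "until the language is identified" chain in the text.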
[0015] More preferably, an encoding library and a language model are set up for each language. The encoding library is traversed to make an initial identification of the language category of a word or character, and the identification is then completed according to the language model and the specific rules governing the words or characters of each language.
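One way to read this two-stage scheme is sketched below, purely as an assumption about what an "encoding library" and a "language model" might look like. The encoding library is modelled as a table of Unicode code point ranges per category, and the language model is reduced to a tiny hypothetical word list per Latin-script language. Neither structure is specified in the source.

```python
# Stage 1 data: assumed "encoding library" of code point ranges.
ENCODING_LIBRARY = {
    "chinese": [(0x4E00, 0x9FFF), (0x3400, 0x4DBF)],
    "japanese_kana": [(0x3040, 0x309F), (0x30A0, 0x30FF)],
    "korean": [(0xAC00, 0xD7A3)],
    "latin": [
        (0x0041, 0x005A), (0x0061, 0x007A),  # A-Z, a-z
        (0x00C0, 0x00D6), (0x00D8, 0x00F6), (0x00F8, 0x00FF),  # Latin-1 letters
    ],
    "cyrillic": [(0x0400, 0x04FF)],
}

# Stage 2 data: hypothetical stand-in for per-language models.
WORD_LISTS = {
    "english": {"the", "and", "word"},
    "french": {"le", "et", "mot"},
}


def initial_category(ch):
    """Stage 1: traverse the encoding library to find the character's category."""
    code_point = ord(ch)
    for name, ranges in ENCODING_LIBRARY.items():
        if any(low <= code_point <= high for low, high in ranges):
            return name
    return "unknown"


def identify_word(word):
    """Stage 2: refine a Latin-script word using the per-language word lists.

    Non-Latin categories are returned as found by stage 1; a Latin word
    that appears in no list stays labelled "latin".
    """
    category = initial_category(word[0])
    if category != "latin":
        return category
    lowered = word.lower()
    for language, vocabulary in WORD_LISTS.items():
        if lowered in vocabulary:
            return language
    return "latin"
```

The range lookup is cheap and resolves scripts that map to a single language, so the more expensive model is consulted only for scripts shared by several languages.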
[0016] More preferably, in step 3 the language is identified and counted as follows: the number of words is calculated according to the actua...
Embodiment 3
[0037] Embodiment 3: as shown in Figure 3, a word counting device comprises: a reading module, used to read the text content into memory in batches of a certain length; a punctuation recognition module, used to scan each batch after it has been read into memory, identify and count the punctuation marks in the text, and then remove them to form a new string that contains no punctuation; a language recognition module, used to read the words or characters in the punctuation-free string and to identify the language of each and count it, one word or character at a time; and a sub-item statistics module, used to add up the successively counted punctuation marks and the word or character counts for each language.
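The module structure of the device can be illustrated as four small Python classes wired together. This is a sketch under assumptions, not the device itself: class and method names are hypothetical, a batch is one line, the punctuation set is a hand-picked list, and the language module only separates Chinese characters from other words.

```python
import string
from collections import Counter


class ReadingModule:
    """Reads text in batches; here one batch is one line (an assumption)."""

    def __init__(self, text):
        self.text = text

    def batches(self):
        return self.text.splitlines()


class PunctuationModule:
    """Counts punctuation marks and returns the batch without them."""

    # ASCII punctuation plus a few common fullwidth/CJK marks.
    MARKS = set(string.punctuation) | set("\uff0c\u3002\uff01\uff1f\uff1b\uff1a\u3001")

    def process(self, batch):
        count = sum(ch in self.MARKS for ch in batch)
        # Marks become spaces so neighbouring words stay separated.
        stripped = "".join(" " if ch in self.MARKS else ch for ch in batch)
        return count, stripped


class LanguageModule:
    """Counts Chinese characters individually and other tokens as words.

    A token that mixes scripts adds at most one non-Chinese word.
    """

    def process(self, stripped):
        counts = Counter()
        for token in stripped.split():
            cjk = sum("\u4e00" <= ch <= "\u9fff" for ch in token)
            counts["chinese"] += cjk
            if cjk < len(token):
                counts["non_chinese"] += 1
        return counts


class StatisticsModule:
    """Accumulates the per-batch counts into running totals."""

    def __init__(self):
        self.totals = Counter()

    def add(self, punctuation_count, language_counts):
        self.totals["punctuation"] += punctuation_count
        self.totals.update(language_counts)

    def total(self):
        return sum(self.totals.values())


def run_device(text):
    """Wire the four modules together and return the statistics module."""
    reader = ReadingModule(text)
    punctuation, language, stats = PunctuationModule(), LanguageModule(), StatisticsModule()
    for batch in reader.batches():
        n_punct, stripped = punctuation.process(batch)
        stats.add(n_punct, language.process(stripped))
    return stats
```

Keeping each module behind a single `process`-style method means a more capable language recognition module could be swapped in without touching the reading or statistics modules.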
[0038] More preferably, the language recognition module identifies the corresponding language and performs the counting as fo...