
Word-by-word comparison method for realizing high hit rate

A word-by-word comparison technique with a high hit rate, applied in the fields of instruments, computing, and electrical digital data processing. It addresses problems such as marks placed in the wrong position, words that cannot be marked, and corresponding data that cannot be found.

Active Publication Date: 2014-09-17
HAIMEN THE YELLOW SEA ENTREPRENEURSHIP PARK SERVICE CO LTD
3 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

With current Chinese-to-English markup, misses occur because the Chinese word segmentation is too coarse. For example, segmentation yields the Chinese word for "I think", but the Chinese-to-English relation data contains no entry for "I think". A segmentation result therefore exists but cannot be marked.
Common Chinese-English example sentences often contain verbs, articles, or prepositions that have no meaningful translation in the corresponding sentence of the other language. Word-by-word comparison requires corresponding Chinese and English tags, and such words cannot be added to the tag sequence because they carry no meaning of their own.
During word-by-word comparison between Chinese and English, one Chinese word may correspond to several English words. The usual practice is to mark them in order, which puts the marks in the wrong position when the English example sentence is inverted.
Word-by-word comparison must mark the Chinese and the corresponding English at the same time. Most of the collected data comes from dictionaries and therefore consists largely of standard definitions, whereas actual Chinese-English example sentences use words flexibly. As a result, the corresponding data often cannot be found from the standard definitions.



Examples


Detailed Description of the Embodiments

[0042] The present invention provides a word-by-word comparison method that achieves a high hit rate. It improves the hit rate of Chinese-English word-by-word comparison through the following four measures.

[0043] 1. The present invention applies a second scan to unmarked data. The scan runs in the reverse direction, starting from the English: using an English-to-Chinese relational dictionary, it takes the Chinese variant strings listed in that dictionary as the basis, searches the Chinese string that was missed in the first pass, and marks any result found, which increases the hit rate.

[0044] Referring to Figure 1, the specific method is defined as follows:

[0045] 1) Multiple English string units form an English string (engdata), and multiple Chinese string units form a Chinese string (chndata);

[0046] 2) Segment the English string to obtain the English word segmentation result set engphr (n=0,...phrlen-1) whose length is the word segmentatio...


Abstract

A method for word-for-word comparison between Chinese and English. The method comprises the following steps:
1) forming an English data string from a plurality of English data string units, and forming a Chinese data string from a plurality of Chinese data string units;
2) segmenting the English data string to obtain a result set of segmented English phrases of phrase length;
3) determining whether or not an item in the result set of the segmented English phrases exists in a list of keyword characters; if the determination is negative, stopping; if the determination is positive, proceeding to step 4);
4) searching an English-Chinese dictionary to obtain a corresponding Chinese ID sequence;
5) determining whether or not the same exists in the list of keyword characters; if the determination is positive, proceeding to step 6);
6) searching the Chinese ID sequence for a related Chinese word sequence;
7) determining whether or not the related Chinese word sequence exists in the list of keyword characters; if the determination is positive, proceeding to step 8);
8) performing data string matching on the related Chinese word sequence in the Chinese data string, and then proceeding to step 9) if there is a match;
9) adding a mark to the English and the Chinese to indicate a hit, and deleting the related Chinese word sequence from the Chinese data string.
The present invention achieves a word-for-word comparison with a high hit rate by solving a critical domain problem in comparison.
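The nine steps in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function name `reverse_scan`, the two lookup tables `eng_to_chn_ids` and `chn_words`, and the sample dictionary entries are all hypothetical. The "list of keyword characters" checks are modeled here as plain dictionary membership tests, and a simple regular expression stands in for the real English segmenter.

```python
import re


def reverse_scan(engdata, chndata, eng_to_chn_ids, chn_words):
    """Mark English/Chinese pairs by looking up from the English side.

    eng_to_chn_ids: hypothetical table, English phrase -> list of Chinese IDs.
    chn_words:      hypothetical table, Chinese ID -> list of Chinese strings
                    (standard form plus variants).
    Returns (marks, remaining), where marks holds (index, English, Chinese)
    triples and remaining is the Chinese text left unmarked.
    """
    marks = []
    remaining = chndata
    # Step 2: a crude regex tokenizer stands in for the English segmenter.
    engphr = re.findall(r"[a-z']+", engdata.lower())
    for n, phrase in enumerate(engphr):
        # Steps 3-5: skip phrases that have no entry or no Chinese IDs.
        ids = eng_to_chn_ids.get(phrase)
        if not ids:
            continue
        hit = None
        for chn_id in ids:
            # Steps 6-7: fetch the Chinese strings recorded for this ID.
            for candidate in chn_words.get(chn_id, []):
                # Step 8: substring match against the still-unmarked Chinese.
                if candidate and candidate in remaining:
                    hit = candidate
                    break
            if hit:
                break
        if hit:
            # Step 9: record the hit, then delete the matched Chinese string
            # so that it cannot be matched a second time.
            marks.append((n, phrase, hit))
            remaining = remaining.replace(hit, "", 1)
    return marks, remaining


# Hypothetical sample data, for illustration only.
eng_to_chn_ids = {"think": [101], "he": [102], "right": [103]}
chn_words = {101: ["认为", "想"], 102: ["他"], 103: ["正确", "对"]}

marks, remaining = reverse_scan(
    "I think he is right.", "我认为他是对的", eng_to_chn_ids, chn_words
)
print(marks)      # [(1, 'think', '认为'), (2, 'he', '他'), (4, 'right', '对')]
print(remaining)  # 我是的
```

Deleting each matched string follows step 9 literally. One consequence of this sketch is that the characters on either side of a deletion become adjacent, so a production version might track matched spans by position instead of editing the string.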

Description

Technical field

[0001] The invention relates to a Chinese-English word-by-word comparison method that achieves a high hit rate by solving the key domain problems in the comparison.

Background technique

[0002] In daily English learning, we often need to compare Chinese with English, and a simple word-by-word comparison method would be very helpful. With current Chinese-to-English markup, misses occur because the Chinese word segmentation is too coarse. For example, segmentation yields the Chinese word for "I think", but the Chinese-to-English relation data contains no entry for "I think". A segmentation result therefore exists but cannot be marked. Common Chinese-English example sentences often contain verbs, articles, or prepositions that have no meaningful translation in the corresponding sentence of the other language. When comparing word by word, it is necessary to correspond to Chinese a...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F17/28
CPC: G06F17/2809, G06F40/42
Inventor: 陈淮琰, 巨雷, 郑建锋, 唐海波
Owner: HAIMEN THE YELLOW SEA ENTREPRENEURSHIP PARK SERVICE CO LTD