Alignment algorithm for bilingual paragraphs
A technology based on segment alignment and smoothing parameters, applied in the field of English-Chinese bilingual understanding, which can mitigate the problem of sparse probability data.
Inactive Publication Date: 2009-09-02
刘建
AI Technical Summary
Problems solved by technology
The probability data in the model (such as the probability of any English word being translated into any Chinese word) suffer from serious data sparsity.
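To make the sparsity problem concrete, the following is a minimal sketch of how a smoothing parameter keeps unseen word pairs from receiving zero probability. The source does not specify its smoothing method; additive (add-k) smoothing is swapped in here purely for illustration, and the function name `translation_probs`, the parameter `k`, and the toy data are all assumptions.

```python
from collections import Counter

def translation_probs(pairs, zh_vocab, k=0.5):
    """Estimate P(zh | en) from aligned word pairs with add-k smoothing.

    Add-k smoothing is an illustrative assumption, not the patent's method.
    """
    pair_counts = Counter(pairs)
    en_counts = Counter(en for en, _ in pairs)
    vocab_size = len(zh_vocab)

    def prob(en, zh):
        # Every (en, zh) pair receives at least k pseudo-counts,
        # so pairs never observed in the corpus stay non-zero.
        return (pair_counts[(en, zh)] + k) / (en_counts[en] + k * vocab_size)

    return prob

# Toy aligned word pairs from a small (hypothetical) corpus.
pairs = [("book", "书"), ("book", "书"), ("book", "预订"), ("read", "读")]
zh_vocab = {"书", "预订", "读"}
p = translation_probs(pairs, zh_vocab, k=0.5)

print(p("book", "书"))   # observed pair
print(p("book", "读"))   # unseen pair, still non-zero
```

With `k = 0` the unseen pair would get probability zero, which is the sparsity failure the passage describes; larger `k` moves the estimate toward a uniform distribution.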
Embodiment Construction
[0057] The alignment scheme can be implemented on a computer to form the final system.
Abstract
The invention provides a method for syntagm (segment) alignment used in building a bilingual corpus, which is the basis of example-based machine translation (EBMT). It provides an English-Chinese bilingual syntagm alignment module based on anchor word pairs, together with a corresponding alignment algorithm, and addresses the data sparsity problem of small and medium-sized corpora. The resolution of ambiguity in syntagm segmentation is deferred until syntagm alignment, which improves the correctness of the segmentation.
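The idea of scoring segment pairs by anchor word pairs, and of choosing among ambiguous segmentations only after alignment, can be sketched as follows. This is a toy illustration under stated assumptions, not the patented algorithm: the lexicon `ANCHORS`, the greedy linking in `align`, and the selection criterion in `best_segmentation` (total anchor matches) are all hypothetical.

```python
# Hypothetical anchor lexicon: English word -> set of Chinese translations.
ANCHORS = {"I": {"我"}, "read": {"读"}, "book": {"书"}, "library": {"图书馆"}}

def anchor_score(en_seg, zh_seg):
    """Count anchor word pairs linking an English segment to a Chinese one."""
    return sum(1 for w in en_seg for z in zh_seg if z in ANCHORS.get(w, ()))

def align(en_segs, zh_segs):
    """Greedily link each English segment to its best-scoring Chinese segment."""
    links, total = [], 0
    for i, en_seg in enumerate(en_segs):
        scores = [anchor_score(en_seg, zh_seg) for zh_seg in zh_segs]
        best = max(range(len(zh_segs)), key=scores.__getitem__)
        if scores[best] > 0:  # leave segments without anchors unlinked
            links.append((i, best))
            total += scores[best]
    return links, total

def best_segmentation(en_candidates, zh_segs):
    """Keep every candidate segmentation and decide only after alignment."""
    return max(en_candidates, key=lambda segs: align(segs, zh_segs)[1])

zh_segs = [["我"], ["在", "图书馆"], ["读"], ["书"]]
candidate_a = [["I"], ["read", "a", "book"], ["in", "the", "library"]]
candidate_b = [["I"], ["read"], ["a", "book"], ["in", "the", "library"]]

chosen = best_segmentation([candidate_a, candidate_b], zh_segs)
print(chosen)
print(align(chosen, zh_segs)[0])
```

The design point is that `best_segmentation` does not commit to one segmentation up front; both candidates are aligned and the alignment score decides between them.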
Description
Technical field
[0001] The invention relates to English-Chinese bilingual understanding technology within natural language understanding, and in particular to segment alignment technology.
Background
[0002] In recent years, with the development of corpus linguistics, the example-based machine translation (EBMT) method has become one of the new approaches to machine translation. An EBMT system stores in advance a large number of bilingual sentence pairs aligned at the segment level, that is, a bilingual corpus. When translating, the system performs only a shallow analysis of the sentence to be translated, divides it into segments, finds the best translation for each segment in the bilingual corpus according to the context, arranges the translations in a certain order, and finally generates the translated sentence. This method avoids many difficulties of traditional translation methods (such as syntactic analysis, word meaning recognition, etc.), and has certain practicability, e...
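The EBMT flow described in [0002] (split into segments, look up each segment in the corpus using context, then assemble) can be sketched as below. This is a minimal illustration only: the toy `CORPUS`, the context-cue scheme, and the function names are assumptions, greedy longest-match stands in for the shallow analysis, and the output simply keeps source order rather than reordering segments.

```python
# Hypothetical segment-aligned corpus: English segment -> list of
# (Chinese segment, context cue word or None) candidates.
CORPUS = {
    ("he", "went", "to"): [("他去了", None)],
    ("the", "bank"): [("银行", "money"), ("河岸", "river")],
}

def segment(words, corpus):
    """Shallow-analysis stand-in: greedy longest match against known segments."""
    segs, i = [], 0
    while i < len(words):
        for j in range(len(words), i, -1):
            if tuple(words[i:j]) in corpus:
                segs.append(tuple(words[i:j]))
                i = j
                break
        else:
            # No known segment starts here; emit the single word.
            segs.append((words[i],))
            i += 1
    return segs

def pick(seg, context, corpus):
    """Choose a translation for one segment, preferring a matching context cue."""
    candidates = corpus.get(seg)
    if not candidates:
        return " ".join(seg)  # unknown segments are left untranslated
    for zh, cue in candidates:
        if cue in context:
            return zh
    return candidates[0][0]  # fall back to the first stored candidate

def translate(sentence, corpus=CORPUS):
    words = sentence.lower().split()
    context = set(words)
    return "".join(pick(seg, context, corpus) for seg in segment(words, corpus))

print(translate("He went to the bank"))
```

Calling `pick(("the", "bank"), {"river"}, CORPUS)` selects the candidate whose cue appears in the context, which is the role "according to the context" plays in the description.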
Application Information
IPC(8): G06F17/28
Inventor 刘建
Owner 刘建