Alignment method and apparatus for parallel spoken language materials

A technique for aligning parallel spoken-language corpora, applied in the field of information processing, that addresses the unsatisfactory results obtained when alignment methods designed for written language are applied to spoken language.

Inactive Publication Date: 2009-06-24
KK TOSHIBA

AI Technical Summary

Problems solved by technology

[00011] Spoken language differs from written language, which has a complete structure. As a result, in speech machine translation, even an alignment method that aligns well-structured written language accurately cannot achieve satisfactory results when it is applied to spoken language.

Abstract

The invention provides a method and an apparatus for aligning parallel spoken language material, and a speech machine translation method and system that respectively adopt that alignment method and apparatus. The alignment method comprises the following steps: obtaining a word alignment set from the parallel spoken language material based on a statistical method and a dictionary; performing phrase alignment on the parallel spoken language material using that word alignment set, so as to obtain a phrase alignment set; and performing word alignment within the aligned phrases of the parallel spoken language material, so as to obtain a word alignment set based on the phrase alignment. The invention uses the high-accuracy word alignment set, obtained from the parallel spoken language material in a corpus based on the statistical method and the dictionary, to perform phrase alignment and then further word alignment on that material. The resulting phrase alignment set and word alignment set are applied in speech machine translation, and exploiting word completeness in this way reduces the ambiguity of spoken language alignment.
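The three steps named in the abstract can be sketched roughly as follows. This is only an illustrative toy, not the patented procedure: the function names, the rule of keeping only dictionary-confirmed statistical links, the vote-based phrase matching, the one-leftover-word heuristic, and the pre-computed phrase spans are all assumptions introduced here, since the abstract does not specify them.

```python
from collections import Counter


def anchor_links(src, tgt, statistical_links, dictionary):
    """Step 1 (assumed rule): keep statistical links (i, j) that the dictionary confirms."""
    return {(i, j) for (i, j) in statistical_links
            if tgt[j] in dictionary.get(src[i], set())}


def phrase_of(index, spans):
    """Return the position of the (start, end) span containing index; end is exclusive."""
    for k, (start, end) in enumerate(spans):
        if start <= index < end:
            return k
    raise ValueError(f"index {index} is not covered by any span")


def align_phrases(anchors, src_spans, tgt_spans):
    """Step 2 (assumed rule): pair phrases greedily by the number of anchor links they share."""
    votes = Counter((phrase_of(i, src_spans), phrase_of(j, tgt_spans))
                    for i, j in sorted(anchors))
    aligned = {}
    for (p, q), _count in votes.most_common():
        if p not in aligned and q not in aligned.values():
            aligned[p] = q
    return aligned


def align_words_in_phrases(anchors, phrase_pairs, src_spans, tgt_spans):
    """Step 3 (assumed rule): keep anchors inside aligned phrases, then link a lone leftover pair."""
    links = set()
    for p, q in phrase_pairs.items():
        s0, s1 = src_spans[p]
        t0, t1 = tgt_spans[q]
        inside = {(i, j) for i, j in anchors if s0 <= i < s1 and t0 <= j < t1}
        links |= inside
        free_src = [i for i in range(s0, s1) if all(i != a for a, _ in inside)]
        free_tgt = [j for j in range(t0, t1) if all(j != b for _, b in inside)]
        if len(free_src) == 1 and len(free_tgt) == 1:
            links.add((free_src[0], free_tgt[0]))
    return links


# Hypothetical toy data: one sentence pair, already split into two phrases per side.
src = ["i", "want", "two", "tickets"]
tgt = ["je", "veux", "deux", "billets"]
dictionary = {"i": {"je"}, "want": {"veux"}, "tickets": {"billets"}}
statistical_links = {(0, 0), (1, 1), (3, 3), (2, 1)}  # (2, 1) is a spurious link
src_spans = [(0, 2), (2, 4)]
tgt_spans = [(0, 2), (2, 4)]

anchors = anchor_links(src, tgt, statistical_links, dictionary)
phrase_pairs = align_phrases(anchors, src_spans, tgt_spans)
result = align_words_in_phrases(anchors, phrase_pairs, src_spans, tgt_spans)
print(sorted(result))  # -> [(0, 0), (1, 1), (2, 2), (3, 3)]
```

In this toy run the dictionary filter drops the spurious link between "two" and "veux", and the phrase boundary then lets "two" be linked to "deux" even though the dictionary has no entry for it. That illustrates, in a simplified way, how phrase-level constraints can narrow down word-level choices.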

Description

Technical field

[0002] The present invention relates to information processing technology and, in particular, to phrase alignment and word alignment of parallel spoken language corpora.

Background technology

[0004] Machine translation technology is mainly divided into rule-based translation and corpus-based translation.

[0005] In corpus-based machine translation, the main translation resources come from the corpus; that is, the parallel bilingual text in the corpus serves as the training basis for machine translation. The process first performs word alignment and syntactic analysis on the parallel bilingual text to form aligned sentence pairs that have undergone syntactic analysis; the translation engine then treats each such sentence pair as a frame structure. When the user enters the sentence to be translated, the translation engine match...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/28
CPC: G06F17/2827, G06F40/45
Inventor: 任登君, 吴华, 王海峰
Owner: KK TOSHIBA