Word segmentation phonetic transcription and ligature writing method and device based on SC grammar

A word segmentation and continuous writing technology, applied in the field of machine translation, that addresses low translation accuracy. It improves accuracy and is easy to extend and maintain.

Inactive Publication Date: 2016-06-01
HUAJIAN YUTONG TECH BEIJING CO LTD +1

AI Technical Summary

Problems solved by technology

Chinese-to-Braille translation in China is still largely manual. The heavy workload of producing more and better educational materials for blind readers has reduced accuracy, so there is an urgent need for a high-accuracy method of word segmentation, phonetic transcription and continuous writing that lays a solid foundation for Chinese-to-Braille translation.

Examples

Embodiment Construction

[0065] The present invention will be described in detail below in conjunction with the accompanying drawings and embodiments.

[0066] A method for word segmentation, phonetic transcription and continuous writing based on SC grammar. The process is shown in Figure 1 and includes the following steps:

[0067] (1) Accept the Chinese character string to be segmented and phonetically transcribed, together with the article genre type;

[0068] The implementation of the method is described below using an example in which the accepted article genre is modern Chinese and the content of the Chinese character string is "In 2008, Xiao Li was promoted to be the chief engineer of this project".

[0069] (2) Segment the Chinese character string based on the dictionary database, and perform part-of-speech tagging and phonetic marking on the resulting word blocks. As shown in Figure 2, this is achieved through the following process:
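As a rough illustration of step (2), the sketch below segments a string by forward maximum matching against a dictionary and then attaches a part-of-speech tag and pinyin to each word block. The dictionary contents, the tag letters, the sample sentence and the function name `segment_and_tag` are all hypothetical and chosen only for this example; the patent's actual dictionary format and process are not shown in this excerpt.

```python
# Toy dictionary (hypothetical): word -> (part-of-speech tag, pinyin).
DICTIONARY = {
    "我们": ("r", "wǒmen"),
    "学习": ("v", "xuéxí"),
    "中文": ("n", "zhōngwén"),
    "中": ("f", "zhōng"),
    "文": ("n", "wén"),
}
MAX_WORD_LEN = max(len(word) for word in DICTIONARY)


def segment_and_tag(text):
    """Segment by forward maximum matching, then look up POS and pinyin."""
    blocks = []
    i = 0
    while i < len(text):
        # Try the longest candidate first and shrink until the dictionary hits.
        for size in range(min(MAX_WORD_LEN, len(text) - i), 0, -1):
            word = text[i:i + size]
            if word in DICTIONARY:
                pos, pinyin = DICTIONARY[word]
                blocks.append((word, pos, pinyin))
                break
        else:
            # Unknown character: keep it as a block with missing tags.
            word = text[i]
            blocks.append((word, None, None))
        i += len(word)
    return blocks


if __name__ == "__main__":
    for block in segment_and_tag("我们学习中文"):
        print(block)
```

Taking the tags straight from the dictionary is the simplest possible choice. A real system would also need context to pick among several tags or readings for the same word, which this sketch does not attempt.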

[0070...

Abstract

The invention relates to a word segmentation, phonetic transcription and ligature writing method and device based on SC grammar, and belongs to the technical field of machine translation in computer science. First, based on the word segmentation ambiguity rules of the SC grammar, an ambiguity segmentation rule library is built from adjacency constraints in natural language, and illegal segmentations are eliminated to improve word segmentation precision. Second, the method relies on a ligature writing rule library of the SC grammar and a ligature writing corpus statistics library; the statistics library handles ligature writing knowledge that cannot be expressed as rules. Finally, based on a dictionary library of the SC grammar, word segmentation is performed by maximum matching against the dictionary, the word segmentation ambiguity rules are invoked for fields where ambiguity occurs so that a correct segmentation result is obtained, and the context of each word is analyzed to obtain correct part-of-speech tagging and phonetic transcription. Compared with the prior art, word segmentation accuracy is improved, and the word segmentation ambiguity rule library, the combined ambiguity word library, the ligature writing rule library, the dictionary library and the ligature writing corpus statistics library are easy to extend and maintain.
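The idea of eliminating illegal segmentations through adjacency (abutment) constraints can be sketched as follows. This is only an assumption about how such a rule library might be applied: the word list, the `FORBIDDEN_ADJACENT` rule set and the function names are hypothetical, and the rule format used by the invention is not given in this excerpt. The sketch enumerates every dictionary-consistent split of an ambiguous field and discards any split that contains a forbidden pair of neighbouring words.

```python
# Hypothetical word list covering one ambiguous field.
WORDS = {"研究", "研究生", "生命", "命", "起源"}

# Hypothetical rule library: pairs of words that may not be adjacent.
FORBIDDEN_ADJACENT = {("研究生", "命")}


def candidate_segmentations(text):
    """Enumerate every way to split text into words from WORDS."""
    if not text:
        return [[]]
    results = []
    for end in range(1, len(text) + 1):
        head = text[:end]
        if head in WORDS:
            for tail in candidate_segmentations(text[end:]):
                results.append([head] + tail)
    return results


def is_legal(words):
    """A segmentation is legal if no neighbouring pair is forbidden."""
    return all(pair not in FORBIDDEN_ADJACENT for pair in zip(words, words[1:]))


def resolve(text):
    """Keep only the candidate segmentations that pass the adjacency rules."""
    return [seg for seg in candidate_segmentations(text) if is_legal(seg)]


if __name__ == "__main__":
    print(candidate_segmentations("研究生命起源"))
    print(resolve("研究生命起源"))
```

Exhaustive enumeration is exponential in the worst case, so it is only reasonable for short ambiguous fields, which is consistent with invoking the rules only where maximum matching runs into ambiguity.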

Description

Technical field

[0001] The invention relates to a method and device for word segmentation, phonetic transcription and continuous writing, in particular to such a method and device based on SC grammar in a Chinese-to-Braille translation system, and belongs to the technical field of machine translation in computer science.

Background technique

[0002] Machine translation refers to the process of using a computer to convert an expression in one natural language into another natural language. A Chinese-to-Braille translation system automatically translates Chinese text into Braille characters, which is of great help to the education and daily life of blind people. Braille is a special form of phonetic writing. To translate Chinese characters into Braille, the Chinese text must first be segmented into words, with continuous writing applied, and converted into Pinyin, and the Pinyin is then converted into Braille. Therefore, the accuracy of Chinese word segmentation and phonetic transc...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/27, G06F17/28
CPC: G06F40/284, G06F40/47
Inventor: 黄河燕, 黄静
Owner: HUAJIAN YUTONG TECH BEIJING CO LTD