Multi-language based word segmentation method and apparatus

A word segmentation method and multilingual technology, applied in special data processing applications, instruments, electrical digital data processing, etc.

Status: Inactive. Publication date: 2016-01-13
ORA

AI Technical Summary

Problems solved by technology

A given word segmentation method can only perform word segmentation for one language.




Embodiment Construction

[0025] To make the purpose, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. The described embodiments are some, but not all, of the embodiments of the present invention. All other embodiments obtained by persons of ordinary skill in the art on the basis of these embodiments, without creative effort, fall within the protection scope of the present invention.

[0026] Figure 1 is a flow chart of Embodiment 1 of the multi-language based word segmentation method of the present invention. As shown in Figure 1, the execution subject of this embodiment may be a computer, a notebook computer, a server, or the like, and the method can be implemented in software. The multilingual word segmenta...



Abstract

The invention provides a multi-language based word segmentation method and apparatus. The method comprises: receiving a to-be-segmented text transmitted by a user, wherein the to-be-segmented text carries a statement separator; according to the statement separator, identifying the language type of each statement in the to-be-segmented text; according to the language type, searching for a corresponding word segmentation method in a pre-stored corresponding relationship between the language type and the word segmentation method; by adopting the word segmentation method corresponding to the language type, performing word segmentation on a statement of the corresponding language type; and outputting a word segmentation result of the to-be-segmented text to the user. According to the multi-language based word segmentation method, uniform word segmentation for applications or texts involving multiple languages can be performed, thereby improving the word segmentation efficiency.
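The steps in the abstract can be sketched as a small dispatch pipeline. This is only an illustrative sketch under stated assumptions, not the patented implementation: the newline separator, the script-based `identify_language` heuristic, the language codes, and the placeholder segmenters are all hypothetical choices that the source does not specify.

```python
from typing import Callable, Dict, List, Tuple

# Assumption: the source only says the text "carries a statement separator";
# a newline is used here purely for illustration.
SEPARATOR = "\n"


def identify_language(sentence: str) -> str:
    """Guess the language from Unicode script ranges (hypothetical heuristic)."""
    # Kana is checked first because Japanese text also contains CJK ideographs.
    if any("\u3040" <= ch <= "\u30ff" for ch in sentence):
        return "ja"
    if any("\u4e00" <= ch <= "\u9fff" for ch in sentence):
        return "zh"
    return "en"


def segment_by_whitespace(sentence: str) -> List[str]:
    """Placeholder segmenter for space-delimited languages."""
    return sentence.split()


def segment_by_character(sentence: str) -> List[str]:
    """Placeholder segmenter: one token per non-space character."""
    return [ch for ch in sentence if not ch.isspace()]


# Pre-stored correspondence between language type and segmentation method.
SEGMENTERS: Dict[str, Callable[[str], List[str]]] = {
    "en": segment_by_whitespace,
    "zh": segment_by_character,
    "ja": segment_by_character,
}


def segment_text(text: str, separator: str = SEPARATOR) -> List[Tuple[str, List[str]]]:
    """Split text into statements, detect each language, and dispatch."""
    results = []
    for sentence in text.split(separator):
        sentence = sentence.strip()
        if not sentence:
            continue  # skip empty statements between separators
        lang = identify_language(sentence)
        segmenter = SEGMENTERS[lang]  # look up the method for this language
        results.append((lang, segmenter(sentence)))
    return results


if __name__ == "__main__":
    for lang, tokens in segment_text("hello world\n我爱你"):
        print(lang, tokens)
```

Keeping the language-to-method mapping in a plain dictionary means a real segmenter for any language could be swapped in without touching the dispatch loop.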

Description

Technical field

[0001] The embodiments of the present invention relate to the technical field of word segmentation, and in particular to a multi-language based word segmentation method and apparatus.

Background

[0002] In artificial-intelligence-related work such as search engines, text analysis, and data mining, languages that do not use spaces or other explicit symbols to separate words require word segmentation as a necessary preliminary step when computers perform natural language analysis, so that words can be accessed easily and further processing can be carried out after segmentation.

[0003] The word segmentation methods in the prior art are each designed independently for a particular language. That is, a given word segmentation method can only perform word segmentation for one language. Word segmentation methods for a particular language include dictionary-based methods, methods based on grammatical rules, and word segmentation methods based on s...
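As an illustration of the dictionary-based family mentioned in the background, the following is a minimal sketch of forward maximum matching, one common dictionary-based technique. The source does not say which dictionary-based algorithm it has in mind, so the algorithm choice, the function name `forward_max_match`, the `max_len` parameter, and the toy dictionary are all assumptions.

```python
from typing import List, Set


def forward_max_match(sentence: str, dictionary: Set[str], max_len: int = 4) -> List[str]:
    """Greedy left-to-right segmentation preferring the longest dictionary word."""
    tokens = []
    i = 0
    while i < len(sentence):
        # Try the longest candidate first, then shrink one character at a time.
        for size in range(min(max_len, len(sentence) - i), 0, -1):
            candidate = sentence[i:i + size]
            # A single character is always accepted so the loop makes progress.
            if size == 1 or candidate in dictionary:
                tokens.append(candidate)
                i += size
                break
    return tokens


if __name__ == "__main__":
    toy_dictionary = {"北京", "天安门"}  # hypothetical toy dictionary
    print(forward_max_match("我爱北京天安门", toy_dictionary))
```

Because the dictionary is specific to one language, a segmenter of this kind illustrates the limitation the background describes: it cannot be applied unchanged to text in another language.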


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/27
Inventor: 马志芳, 孟茜, 严巍
Owner: ORA