Language recognition method and device for text and electronic equipment

Language recognition technology, applied in the fields of electrical digital data processing, special data processing applications, instruments, etc., that addresses the problems of a low correct recognition rate and low robustness.

Active Publication Date: 2017-04-26
阿里巴巴(中国)网络技术有限公司

AI Technical Summary

Problems solved by technology

[0019] The present application provides a language recognition method, device, and electronic equipment for text, to address the low correct recognition rate and low robustness of the prior art.




Embodiment Construction

[0145] In the following description, numerous specific details are set forth to provide a thorough understanding of the present application. However, the present application can be implemented in many ways other than those described here, and those skilled in the art can make similar generalizations without departing from its spirit. Therefore, the present application is not limited to the specific implementations disclosed below.

[0146] In the present application, a language recognition method, device and electronic equipment for text are provided. Each will be described in detail in the following examples.

[0147] The core idea of the language recognition method for text provided in the embodiments of the present application is to identify the language of the text to be recognized by designing tens of millions of language features and applying a machine learning model. Since the method provided by the present application performs lang...
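The idea of scoring a text against many sparse language features with a learned model can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the visible text does not name the model, so a multiclass perceptron stands in for it, character trigrams stand in for the much larger feature set, and the labelled queries are invented toy data.

```python
from collections import defaultdict


def char_trigrams(text):
    # Pad with spaces so word boundaries also produce features.
    padded = " " + text.lower() + " "
    return [padded[i:i + 3] for i in range(len(padded) - 2)]


class PerceptronLanguageClassifier:
    """Hypothetical stand-in model: one sparse weight vector per language."""

    def __init__(self):
        self.weights = defaultdict(dict)  # label -> {feature: weight}
        self.labels = []

    def score(self, feats, label):
        w = self.weights[label]
        return sum(w.get(f, 0.0) for f in feats)

    def predict(self, text):
        feats = char_trigrams(text)
        return max(self.labels, key=lambda lab: self.score(feats, lab))

    def train(self, samples, max_epochs=1000):
        self.labels = sorted({lab for _, lab in samples})
        for _ in range(max_epochs):
            mistakes = 0
            for text, gold in samples:
                guess = self.predict(text)
                if guess != gold:
                    mistakes += 1
                    # Reward the correct language, penalise the wrong guess.
                    for f in char_trigrams(text):
                        self.weights[gold][f] = self.weights[gold].get(f, 0.0) + 1.0
                        self.weights[guess][f] = self.weights[guess].get(f, 0.0) - 1.0
            if mistakes == 0:  # every training query is classified correctly
                break


# Invented toy queries labelled with their language (illustration only).
samples = [
    ("red running shoes", "en"),
    ("cheap phone case", "en"),
    ("zapatos rojos para correr", "es"),
    ("funda de telefono barata", "es"),
    ("chaussures de course rouges", "fr"),
    ("coque de telephone pas cher", "fr"),
]
clf = PerceptronLanguageClassifier()
clf.train(samples)
```

After training, `clf.predict("some query")` returns one of the labels seen in training. A sparse dictionary per language keeps memory proportional to the features actually observed, which matters when the feature space is very large.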


Abstract

The invention discloses a language recognition method and device for text, and electronic equipment. The language recognition method for text comprises the steps of: extracting language features from a text to be recognized; taking the extracted language features as the input of a pre-generated text language classifier; and computing, by the text language classifier, the language to which the text to be recognized belongs. The language features comprise at least one of N-gram word features, N-gram character features, and affix features. The method can improve the correct recognition rate and the robustness of language recognition. In addition, the training corpus only needs to be a set of historical queries labelled with the correct language, with no further annotation required, which makes the method highly practical.
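The three feature types named in the abstract (runs of N consecutive words, runs of N consecutive characters, and affixes) can be sketched as follows. This is a minimal illustration, not the patented extraction procedure: the function names, the feature-name prefixes (`w2=`, `c1=`, `pre=`, `suf=`), whitespace tokenisation, lowercasing, and the affix length limit are all assumptions.

```python
from collections import Counter


def word_ngrams(tokens, n):
    # Runs of n consecutive words.
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def char_ngrams(text, n):
    # Runs of n consecutive characters, spaces included.
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def affixes(token, max_len=3):
    # Prefixes and suffixes shorter than the token itself (max_len is an assumption).
    feats = []
    for k in range(1, min(max_len, len(token) - 1) + 1):
        feats.append("pre=" + token[:k])
        feats.append("suf=" + token[-k:])
    return feats


def extract_features(text, max_n=2):
    # Returns a sparse bag of features: feature name -> count.
    text = text.lower().strip()
    tokens = text.split()
    feats = Counter()
    for n in range(1, max_n + 1):
        feats.update("w%d=%s" % (n, g) for g in word_ngrams(tokens, n))
        feats.update("c%d=%s" % (n, g) for g in char_ngrams(text, n))
    for tok in tokens:
        feats.update(affixes(tok))
    return feats


features = extract_features("Red shoes")
```

The resulting `Counter` is the kind of sparse input a text language classifier could consume; prefixing each feature with its type keeps a word unigram such as `w1=red` distinct from an identical character sequence.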

Description

Technical field

[0001] The present application relates to the technical field of language recognition, in particular to a language recognition method, device and electronic equipment for text.

Background

[0002] International e-commerce websites generally include an English main site and multilingual sub-sites, all of which are open to users worldwide. When users log in to any site to search for products, the text they enter can be in any language. To understand the user's intent accurately, the first problem to solve is automatically identifying the language of the query text entered by the user, that is, text language recognition. Correct subsequent processing, such as translation or search, is only possible when the language of the text to be processed is known exactly.

[0003] Currently, commonly used text language recognition methods include the following:

[0004] 1) In 2000, Xerox Corporation obtained a US patent entitled "AU...


Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G06F17/27
Inventor: 蒋宏飞, 骆卫华, 林锋
Owner: 阿里巴巴(中国)网络技术有限公司