Method and device for identifying entity words

A method and device for identifying entity words, classified under instruments, computing, and electrical digital data processing. It addresses the high cost and low efficiency of entity word mining, with the stated effects of rapid recognition, reduced mining cost, and guaranteed accuracy.

Active Publication Date: 2014-03-26
ALIBABA GRP HLDG LTD

AI Technical Summary

Problems solved by technology

[0004] This application provides a method and device for entity word recognition that address the low efficiency and high cost of entity word mining.

Examples

Embodiment 1

[0047] Referring to Figure 1, Embodiment 1 of the method for identifying entity words of the present application includes the following steps:

[0048] Step 101: receive the data to be identified, and segment it according to a first predetermined rule to obtain grouped data.

[0049] The data to be identified can be in Chinese, English, or another language, and can be a complete sentence or a phrase.

[0050] The first predetermined rule is defined in advance and can be set according to actual conditions. In this application, following the human habit of reading from left to right, the data to be identified is segmented by combining the first word from the left with the other words. That is, each set of grouped data is an in-order combination of the first word from the left and other words. A "word" here is an independent character or word: for example, it can be a word in English, or it can be understood as a character in Chinese, ...
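The segmentation rule in Step 101 can be sketched as follows. This is only one possible reading of the truncated paragraph above: it assumes that each group is a contiguous, in-order run of words that starts at the leftmost word (that is, a prefix of the input). The function name `segment_prefixes` and the `chinese` flag are hypothetical and do not come from the patent text.

```python
def segment_prefixes(text, chinese=False):
    """Return groups that each start at the leftmost word and extend rightward.

    Assumption: a "word" is a whitespace-separated token in English, or a
    single character in Chinese, as the paragraph above suggests.
    """
    if chinese:
        # Treat every non-space character as one unit.
        words = [ch for ch in text if not ch.isspace()]
    else:
        words = text.split()
    # One group per prefix length: first word, first two words, and so on.
    return [words[:end] for end in range(1, len(words) + 1)]


if __name__ == "__main__":
    for group in segment_prefixes("red nokia phone"):
        print(group)
```

Under this reading, an input of n words yields n groups. If the full patent text defines a different combination scheme, only the list comprehension in the last line of the function would need to change.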

Abstract

The invention provides a method for identifying entity words. The method comprises the following steps: receiving data to be identified; segmenting the data according to a first predetermined rule to obtain grouped data; extracting the features of each group of grouped data according to a second predetermined rule; calculating, on the basis of the weight of each feature and predetermined word categories, the category combination to which each group of grouped data belongs and the probability of that combination; selecting the entity words contained in those category combinations; calculating the recognition probability of each entity word; and sorting the entity words by probability. The invention further provides a device for identifying entity words, by which the method can be carried out. The method and device can improve the efficiency of entity word mining and reduce its cost.
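The scoring and sorting steps of the abstract can be illustrated with a small sketch. The abstract does not say how feature weights are turned into probabilities, so the code below assumes a log-linear (softmax) model and, as a simplification, scores each candidate word on its own rather than scoring a category combination for a whole group. All names (`category_probabilities`, `rank_entity_words`, the feature and category labels, and the weights) are hypothetical.

```python
import math


def category_probabilities(features, weights, categories):
    """Softmax over per-category sums of feature weights (assumed model)."""
    scores = {
        c: sum(weights.get((f, c), 0.0) for f in features) for c in categories
    }
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {c: math.exp(s - top) for c, s in scores.items()}
    total = sum(exps.values())
    return {c: e / total for c, e in exps.items()}


def rank_entity_words(candidates, weights, categories, entity_categories):
    """Keep candidates whose most likely category is an entity category,
    then sort them by that probability, highest first."""
    ranked = []
    for word, features in candidates.items():
        probs = category_probabilities(features, weights, categories)
        best = max(probs, key=probs.get)
        if best in entity_categories:
            ranked.append((word, best, probs[best]))
    ranked.sort(key=lambda item: item[2], reverse=True)
    return ranked


if __name__ == "__main__":
    # Illustrative weights only; a real system would learn them from data.
    weights = {
        ("capitalized", "brand"): 2.0,
        ("company_suffix", "brand"): 1.0,
        ("stopword", "other"): 3.0,
    }
    candidates = {
        "Nokia": ["capitalized"],
        "Acme Ltd": ["capitalized", "company_suffix"],
        "the": ["stopword"],
    }
    for row in rank_entity_words(candidates, weights, ["brand", "other"], {"brand"}):
        print(row)
```

The softmax is a stand-in chosen for brevity; the method described in the patent may compute probabilities differently, and only the final sort by probability is taken directly from the abstract.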

Description

Technical field

[0001] The present application relates to the technical field of computer data processing, and in particular to a method and device for identifying entity words.

Background

[0002] With the rapid development of science, technology, and the Internet, computer and network technology has penetrated every aspect of people's work and life. People increasingly use computers to obtain the information they need, for example through information retrieval, computer-aided translation, and automatic question answering. Some entity words are stored in the database of a computer server, such as product names, models, company names, and brand names. If a sentence entered by the user through the client contains entity words that are in the database, the corresponding results, such as translation results, question-and-answer results, or search results, can be looked up directly in the server's database and then fed back ...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
CPC: G06F40/216, G06F40/279
Inventor: 廖剑, 吴克文, 张永刚, 林锋
Owner: ALIBABA GRP HLDG LTD