Named entity recognition method for conditional random field based on word vector representation

A named entity recognition technique based on conditional random fields, applicable to special data processing applications, instruments, electrical digital data processing, etc., that addresses the problems of low generalization ability and high cost.

Inactive Publication Date: 2017-07-25
DALIAN UNIV OF TECH
Cites: 2; Cited by: 29

AI Technical Summary

Problems solved by technology

[0005] The invention provides a named entity recognition method based on a conditional random field represented by a word vector, which firstly solves the problems of high cost and low generalization ability caused by manual feature extraction, and secondly solves the prob



Examples


Embodiment Construction

[0071] Specific embodiments of the present invention are described below in conjunction with the technical solution.

[0072] The system of the present invention performs high-quality gene name recognition on a given biomedical text. It avoids the high cost and low generalization ability caused by manual feature extraction, improves the quality of biomedical text recognition, and is simple to operate. The system adopts a B/S (browser/server) architecture, implemented mainly with JSP, HTML, JS and related technologies, and is divided into three parts: a view layer, a logic layer and a data layer.

[0073] 1. The user enters the text to be parsed

[0074] As shown in Table 1, text can be entered from the keyboard or uploaded as a local file. The view layer accepts the text entered by the user, submits it to the logic layer, and stores it in the data layer. Assuming that the text to be ana...
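The input flow through the three layers can be illustrated with a minimal sketch. The source states that the system is built with JSP, HTML and JS, so the Python below is only a stand-in for the data flow, and every class and method name (`DataLayer`, `LogicLayer`, `ViewLayer`, `from_keyboard`, `from_upload`) is hypothetical rather than taken from the patent.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class DataLayer:
    """Hypothetical in-memory store standing in for the data layer."""
    texts: Dict[int, str] = field(default_factory=dict)

    def save(self, text: str) -> int:
        # Assign sequential ids starting at 1 and keep the text.
        text_id = len(self.texts) + 1
        self.texts[text_id] = text
        return text_id


class LogicLayer:
    """Hypothetical logic layer: validates input, then persists it."""

    def __init__(self, store: DataLayer) -> None:
        self.store = store

    def submit(self, text: str) -> int:
        cleaned = text.strip()
        if not cleaned:
            raise ValueError("empty input text")
        return self.store.save(cleaned)


class ViewLayer:
    """Hypothetical view layer offering the two input paths."""

    def __init__(self, logic: LogicLayer) -> None:
        self.logic = logic

    def from_keyboard(self, typed: str) -> int:
        # Text typed directly by the user.
        return self.logic.submit(typed)

    def from_upload(self, raw: bytes, encoding: str = "utf-8") -> int:
        # Content of an uploaded local file, decoded before submission.
        return self.logic.submit(raw.decode(encoding))


store = DataLayer()
view = ViewLayer(LogicLayer(store))
first_id = view.from_keyboard("  BRCA1 is a gene. ")
second_id = view.from_upload(b"TP53 regulates apoptosis.")
```

Keeping validation in the logic layer means both input paths share the same checks before anything reaches storage.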


Abstract

The invention provides a named entity recognition method based on conditional random fields with word vector representation, and belongs to the field of natural language processing. The method comprises a conditional random field algorithm based on word vector representation, a conditional random field algorithm that uses fused word vector representations, and an online named entity recognition system with a B/S (browser/server) design and a graphical interactive interface. With this method, biomedical named entities in a biomedical text submitted by the user can be recognized. The recognition process exploits the semantic representation properties of word vectors with little dependence on manually engineered features, overcomes the limitation that conditional random fields are effective only with discrete feature representations, and retains the advantages of the conditional random field as a discriminative undirected graphical model. The method also provides the user with a retrieval service for named entity interaction relationship data and a function for correcting the automatic analysis results.
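The abstract does not say how continuous word vectors are made usable by a conditional random field that expects discrete features. One common approach, shown here purely as an assumption and not as the patent's method, is to binarize each vector dimension against a threshold and emit the sign as a discrete feature. The function names, the threshold value and the toy embedding are all hypothetical.

```python
from typing import Dict, List, Optional, Sequence


def discretize(vec: Sequence[float], threshold: float = 0.1) -> Dict[str, str]:
    """Turn a dense vector into sparse sign features.

    Dimensions whose absolute value does not exceed the threshold
    are omitted, so they contribute no feature at all.
    """
    feats: Dict[str, str] = {}
    for i, value in enumerate(vec):
        if value > threshold:
            feats[f"emb{i}"] = "+"
        elif value < -threshold:
            feats[f"emb{i}"] = "-"
    return feats


def token_features(
    tokens: List[str],
    embeddings: Dict[str, Sequence[float]],
    i: int,
    threshold: float = 0.1,
) -> Dict[str, str]:
    """Build a discrete feature dictionary for the token at position i."""
    word = tokens[i]
    feats = {"bias": "1", "lower": word.lower()}
    vec: Optional[Sequence[float]] = embeddings.get(word.lower())
    if vec is not None:
        # Out-of-vocabulary words simply get no embedding features.
        feats.update(discretize(vec, threshold))
    return feats


# Toy embedding table; real vectors would come from a trained model.
toy_embeddings = {"brca1": [0.5, -0.3, 0.02]}
sentence = ["BRCA1", "mutations"]
features = [token_features(sentence, toy_embeddings, i) for i in range(len(sentence))]
```

Each token ends up with a dictionary of string-valued features, which is the kind of discrete input a linear-chain CRF is trained on.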

Description

Technical field

[0001] The invention belongs to the field of natural language processing and relates to a method for high-quality biological named entity recognition on biomedical texts, in particular to a biological named entity recognition method based on fusing a conditional random field (CRF) model with a word representation method.

Background

[0002] The task of named entity recognition is to identify words or phrases with specific meanings, such as names of people, places, and institutions, that appear in text. Named entity recognition in the field of biomedicine is called Biomedical Named Entity Recognition (Bio-NER). It uses biomedical text mining technology to automatically identify and classify specific types of entity names that appear in biomedical literature, such as proteins, genes, diseases, and cells. Biological named entity recognition is a key step in biomedical text mining and a prerequisite for deep text m...
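To make the recognition task concrete, the sketch below shows how per-token labels can be turned back into entity mentions. It assumes the widely used BIO tagging convention (`B-` begins an entity, `I-` continues it, `O` is outside), which the source text does not specify; the function name, the example sentence and the tag names are illustrative only.

```python
from typing import List, Optional, Tuple


def bio_to_spans(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Collect (entity text, entity type) pairs from BIO-tagged tokens."""
    spans: List[Tuple[str, str]] = []
    current: List[str] = []
    current_type: Optional[str] = None

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A new entity starts; close any entity still open.
            if current:
                spans.append((" ".join(current), current_type))
            current = [token]
            current_type = tag[2:]
        elif tag.startswith("I-") and current and tag[2:] == current_type:
            # Continuation of the open entity.
            current.append(token)
        else:
            # "O", or an "I-" tag that does not continue the open
            # entity: close the open entity and treat this token as outside.
            if current:
                spans.append((" ".join(current), current_type))
            current, current_type = [], None

    if current:
        spans.append((" ".join(current), current_type))
    return spans


example_tokens = ["The", "tumor", "necrosis", "factor", "gene", "and", "p53"]
example_tags = ["O", "B-GENE", "I-GENE", "I-GENE", "O", "O", "B-PROTEIN"]
entities = bio_to_spans(example_tokens, example_tags)
```

In this framing a sequence model such as a CRF predicts the tag list, and a decoder like this one recovers the multi-word entity names and their classes.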


Application Information

IPC(8): G06F17/27
CPC: G06F40/295
Inventors: 李丽双, 姜宇新, 陈曦, 冯轶然
Owner: DALIAN UNIV OF TECH