Conditional random field based named entity recognition method for the telecom field

A named entity recognition technology based on conditional random fields, applied in electrical digital data processing, special data processing applications, instruments, etc. It addresses the difficulty of recognizing named entities in unstructured telecom text and the excessive dependence on rule templates, avoiding time and manpower consumption while improving recognition ability and efficiency.

Status: Inactive; Publication Date: 2018-03-23
NANJING UNIV OF POSTS & TELECOMM

AI Technical Summary

Problems solved by technology

[0004] The technical problem to be solved by the present invention is to overcome the deficiencies of the prior art by providing a named entity recognition method for the telecommunications field.



Detailed Description of the Embodiments

[0023] Embodiments of the present invention will be described below in conjunction with the accompanying drawings.

[0024] As shown in Figure 1, the present invention provides a named entity recognition method for the telecommunications field based on conditional random fields. The method comprises the following steps:

[0025] Step 1. Convert the corpus into the input format of the conditional random field (CRF) model and label it using a word-based tagging model.

[0026] First, preprocess the corpus, including word segmentation and part-of-speech tagging. This process uses the IKAnalyzer Chinese word segmenter for word segmentation and stanford-postagger-3.5.2 for part-of-speech tagging.

[0027] Then, convert the corpus text that has completed word segmentation and part-of-speech tagging into the input format specified by the conditional random field CRF model. The standard format is as follows:

[0028] Definition 1: The data content of each line of t...
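The conversion in Step 1 can be illustrated with a minimal Python sketch that writes segmented, part-of-speech-tagged tokens one per line, with a blank line between sentences. The column order (token, part of speech, label), the tab separator, the B/I/O label scheme, and the names `bio_labels` and `to_crf_lines` are assumptions made for illustration only; Definition 1 is truncated in this excerpt and may specify a different layout.

```python
from typing import List, Tuple

Token = Tuple[str, str]   # (word, part-of-speech tag)
Span = Tuple[int, int]    # entity span as (start, end) token indices, end exclusive


def bio_labels(n_tokens: int, spans: List[Span]) -> List[str]:
    """Label the first token of each entity 'B', the rest 'I', everything else 'O'."""
    labels = ["O"] * n_tokens
    for start, end in spans:
        labels[start] = "B"
        for i in range(start + 1, end):
            labels[i] = "I"
    return labels


def to_crf_lines(sentences: List[List[Token]],
                 spans_per_sentence: List[List[Span]]) -> str:
    """Render sentences as tab-separated columns, one token per line.

    Sentences are separated by a blank line (assumed layout).
    """
    blocks = []
    for tokens, spans in zip(sentences, spans_per_sentence):
        labels = bio_labels(len(tokens), spans)
        rows = [f"{word}\t{pos}\t{label}"
                for (word, pos), label in zip(tokens, labels)]
        blocks.append("\n".join(rows))
    return "\n\n".join(blocks) + "\n"


# Hypothetical example: "4G package" is marked as one telecom entity.
example = [[("apply", "VV"), ("4G", "NN"), ("package", "NN")]]
print(to_crf_lines(example, [[(1, 3)]]))
```

Keeping the label in the last column lets the same writer serve both training data (labels filled in) and text to be recognized (labels left to the model), which is one common convention rather than a requirement of the method.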


Abstract

The invention discloses a named entity recognition method for the telecom field based on conditional random fields. The method includes the steps of: converting the corpus into the input format of a conditional random field (CRF) model and labeling the corpus using a word-based tagging model; selecting the size of a context window and selecting features from a candidate feature set to construct a feature template; defining the feature template of the CRF model, inputting the acquired corpus and the acquired feature template into the CRF model, and obtaining a telecom field named entity recognition CRF model; using that model to perform named entity recognition on the telecom text to be recognized and obtaining an output result; and restoring the recognized telecom field named entities from the output result. The method extracts telecom field named entities automatically, can improve the efficiency of telecom field named entity recognition to a certain extent, and can ensure good precision and recall for the recognition result.
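Two of the steps in the abstract, building features from a context window and restoring entities from the labeled output, can be sketched in Python as follows. This is an illustration under stated assumptions, not the patent's feature template: the feature names (`w[-1]`, `pos[0]`, ...), the `<PAD>` boundary symbol, the B/I/O label set, and the functions `window_features` and `restore_entities` are all hypothetical.

```python
from typing import Dict, List


def window_features(tokens: List[str], pos_tags: List[str],
                    i: int, window: int = 2) -> Dict[str, str]:
    """Collect word and POS features for positions i-window .. i+window.

    Positions outside the sentence get the placeholder '<PAD>' (assumed).
    """
    feats: Dict[str, str] = {}
    for off in range(-window, window + 1):
        j = i + off
        inside = 0 <= j < len(tokens)
        feats[f"w[{off}]"] = tokens[j] if inside else "<PAD>"
        feats[f"pos[{off}]"] = pos_tags[j] if inside else "<PAD>"
    return feats


def restore_entities(tokens: List[str], labels: List[str],
                     sep: str = "") -> List[str]:
    """Join consecutive B/I-labeled tokens back into entity strings."""
    entities: List[str] = []
    current: List[str] = []
    for tok, lab in zip(tokens, labels):
        if lab == "B":                 # a new entity starts here
            if current:
                entities.append(sep.join(current))
            current = [tok]
        elif lab == "I" and current:   # continue the open entity
            current.append(tok)
        else:                          # 'O' or a stray 'I' closes any open entity
            if current:
                entities.append(sep.join(current))
            current = []
    if current:
        entities.append(sep.join(current))
    return entities


# Hypothetical tokens and model output.
toks = ["apply", "4G", "package", "and", "broadband"]
print(window_features(toks, ["VV", "NN", "NN", "CC", "NN"], 0, window=1))
print(restore_entities(toks, ["O", "B", "I", "O", "B"], sep=" "))
```

The default `sep=""` suits Chinese text, where segmented words are concatenated without spaces; the example passes `sep=" "` only because it uses English placeholder tokens.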

Description

Technical field

[0001] The invention relates to a named entity recognition method for the telecommunications field based on conditional random fields, and belongs to the technical field of computers.

Background

[0002] With the rapid development of the telecommunications industry, the traditional manual service model can no longer meet actual needs. Attention has therefore turned to related technologies such as building knowledge bases and question-answering systems for the telecommunications field, in the hope of using automated systems instead of manual labor to meet growing business needs. Most telecommunications domain knowledge comes from telecommunications-related documents. Faced with massive data, relying entirely on manual means to extract valuable information is clearly unrealistic, so people began to hope to extract information through automated ...


Application Information

IPC(8): G06F17/27
CPC: G06F40/295
Inventors: 章韵, 张歌
Owner: NANJING UNIV OF POSTS & TELECOMM