Multi-strategy integration standard terminology processing method for oil and gas pipeline field

A multi-strategy term processing method for the oil and gas pipeline field, applied in language analysis and pipeline systems. It addresses the heavy workload and inconsistent standards of manual term extraction and improves the precision of term extraction.

Active Publication Date: 2014-09-24
PIPECHINA SOUTH CHINA CO

AI Technical Summary

Problems solved by technology

Rule-based processing of technical terminology has the advantages of high accuracy and low computational cost. Its disadvantage is that it is difficult to formulate a complete set of rules that covers all linguistic phenomena, and in different professional fields, the construction rules ...


Embodiment Construction

[0116] Embodiment. This example illustrates a specific embodiment of the present invention and describes the invention further.

[0117] This example is an experimental method, as shown in Figures 1 to 4. Page 6 of "GB50253-2006 Code for Design of Oil Pipeline Engineering.doc" is used to illustrate how the example is implemented.

[0118] The overall process is:

[0119] 1) Oil and gas pipeline corpus preprocessing and corpus word segmentation result optimization

[0120] Convert the text format of the oil and gas pipeline corpus, segment it with ICTCLAS, optimize the segmentation results, and filter out noise to obtain the final word segmentation results;

[0121] 2) Construction method of terms in the field of oil and gas pipelines

[0122] Terms can be constructed with the TF-IDF algorithm, the C-MI algorithm, the RD algorithm, or a combination of these algorithms;

[0123] 3) Optimization of terminology construction in the field ...
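
The three steps above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: it assumes the corpus has already been segmented into tokens (the ICTCLAS call is not shown), it uses only a textbook TF-IDF score for step 2 (the C-MI and RD algorithms are not reproduced), and because step 3 is truncated in this text, its filter follows the abstract's description of rejecting junk and conventional terms. The names `filter_noise`, `tf_idf`, `reject_conventional`, the noise pattern, and the rule that drops single-character terms are all hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical noise rule: a token made up only of punctuation, digits,
# or underscores carries no term information.
NOISE = re.compile(r"^[\W\d_]+$")


def filter_noise(tokens):
    """Step 1 (sketch): drop empty tokens and punctuation/digit-only tokens."""
    return [t for t in tokens if t and not NOISE.match(t)]


def tf_idf(docs):
    """Step 2 (sketch): textbook TF-IDF.

    docs is a list of token lists, one per document.
    Returns one {term: score} dict per document.
    """
    n_docs = len(docs)
    doc_freq = Counter()
    for tokens in docs:
        doc_freq.update(set(tokens))  # count each term once per document

    scores = []
    for tokens in docs:
        counts = Counter(tokens)
        total = len(tokens)
        scores.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores


def reject_conventional(scored, conventional):
    """Step 3 (sketch): reject conventional words and single-character terms."""
    return {
        term: score
        for term, score in scored.items()
        if term not in conventional and len(term) > 1
    }


# Toy, already-segmented input; real input would come from the segmenter.
raw_docs = [
    ["pressure_test", ",", "pipeline", "12", "the"],
    ["pipeline", "the", "valve", "。"],
]
docs = [filter_noise(tokens) for tokens in raw_docs]
scores = tf_idf(docs)
candidates = reject_conventional(scores[0], conventional={"the"})
```

In this toy input, a token that appears in every document gets a TF-IDF score of zero, so ranking `candidates` by score pushes corpus-wide words to the bottom and leaves document-specific candidates such as `pressure_test` at the top.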


Abstract

The invention discloses a multi-strategy integration standard terminology processing method for the oil and gas pipeline field and relates to the technical fields of linguistic analysis and pipeline systems. The method mainly comprises three modules: 1) the oil and gas pipeline corpus is preprocessed and the text segmentation results are optimized; 2) terms are constructed either with a single algorithm or with a combination of multiple algorithms; 3) the obtained terms are filtered according to summarized rules, junk terms and conventional terms are rejected, and the term processing results are optimized. The overall process is as follows: 1) corpus preprocessing in the oil and gas pipeline field and text segmentation result optimization; 2) term construction in the oil and gas pipeline field; 3) term construction optimization in the oil and gas pipeline field. With this method, segmentation accuracy is improved, and both the precision of term extraction and the domain relevance of the final terms are improved.

Description

Technical field

[0001] The invention is a multi-strategy fusion standard term processing method for the field of oil and gas pipelines, and relates to the technical fields of language analysis and pipeline systems.

Background art

[0002] As the unified name for a specific thing or concept in a specific field, a technical term has a certain degree of recognizability, domain specificity, and stability. For example, "pressure test" and "horizontal compressor" are professional terms in the field of oil and gas pipelines. At present, the field of oil and gas pipelines lacks national or industry terminology standards, and term extraction in this field is done entirely by hand. However, on the one hand, manually summarizing terms involves a heavy workload and consumes a great deal of manpower; on the other hand, it is difficult to unify the standards, and ambiguities arise. Therefore, how to use computers to objectively identify and efficiently constr...


Application Information

IPC(8): G06F17/30
CPC G06F16/3329, G06F16/335
Inventor 刘冰, 潘腾, 黄维和, 税碧垣, 刘艳双, 李云杰, 张妮, 吴凯旋, 王禹钦
Owner PIPECHINA SOUTH CHINA CO