A method and device for sorting geological document feature terms based on hierarchical terms

A method and device for sorting geological document feature terms, applicable to fields such as instruments, computing, and electrical digital data processing. It addresses the inability of existing approaches to reflect differences in the importance of different terms to a document's subject, and aims for reliable sorting and effective calculation.

Active Publication Date: 2021-07-20
CENT SOUTH UNIV

AI Technical Summary

Problems solved by technology

However, a general term-document matrix uses only the number of occurrences of a term to represent how well the term characterizes the document's topic, and the TextRank algorithm ranks feature words using only the relationships between local words (a co-occurrence window). Neither reflects the differences in the importance of different terms to the topic.



Examples


Embodiment 1

[0040] Referring to Figure 1, the method of this embodiment specifically includes:

[0041] A1. Acquire range type parameter information.

[0042] A2. Determine whether the range type parameter is the same as a preset first parameter, second parameter, or third parameter.

[0043] If so, acquire range parameter information.

[0044] The range parameter information includes a first range parameter or a second range parameter.

[0045] A3. Based on the range type parameter information and the range parameter information, obtain the corresponding preset first document set, second document set, or third document set.

[0046] The first document set includes multiple first rule documents, and the first rule document in this embodiment is any document that may be extracted.

[0047] The second document set includes a plurality of second rule documents, and in this embodiment, the second rule document is any document belon...
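Steps A1 to A3 can be sketched as a small lookup. This is only an illustration under assumptions: the source does not say what the three preset type parameters or the two range parameters denote, so the `RangeType` enum, the `preset_sets` mapping, the `select_document_set` function, and the sample values are all hypothetical.

```python
from enum import Enum


class RangeType(Enum):
    """Hypothetical stand-ins for the preset first, second, and third parameters."""
    FIRST = 1
    SECOND = 2
    THIRD = 3


def select_document_set(type_param, range_param, preset_sets):
    """Return the preset document set for (type_param, range_param), or None.

    preset_sets maps (RangeType, range parameter) pairs to lists of document ids.
    """
    # A2: proceed only if the type parameter matches one of the preset parameters.
    try:
        range_type = RangeType(type_param)
    except ValueError:
        return None
    # A3: look up the document set preset for this combination.
    return preset_sets.get((range_type, range_param))


# Made-up preset document sets, keyed by (range type, range parameter).
preset_sets = {
    (RangeType.FIRST, "range_a"): ["doc1", "doc2", "doc3"],
    (RangeType.SECOND, "range_a"): ["doc2"],
    (RangeType.THIRD, "range_b"): ["doc3"],
}

selected = select_document_set(1, "range_a", preset_sets)
```

Returning `None` for an unrecognized type parameter or an unconfigured combination is a design choice of this sketch; the source does not describe that case.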

Embodiment 2

[0058] (1) Input description

[0059] The input includes the term classification table words_list, the term level weight table levels_weights, the document term table files_words, and the feature value sorting parameter list orders.

[0060] (1-1) words_list: a database table containing basic information on all terms extracted from documents in a specific range. See Table 1 for the field definitions.

[0061] Table 1 Definition of Term Grading Table

[0062]



Abstract

The present invention relates to a method for sorting geological document feature terms based on hierarchical terms, including: acquiring range type parameter information; determining whether the range type parameter is the same as a preset first parameter, second parameter, or third parameter; if so, acquiring range parameter information; based on the range type parameter information and the range parameter information, obtaining the corresponding preset first document set, second document set, or third document set; obtaining the term frequency of each feature term in that document set; based on the term frequency of the feature term in the document set and the preset level and level weight corresponding to the feature term, obtaining the feature value of the feature term in the document set; and, based on the feature values of the feature terms, obtaining the feature terms corresponding to the top N feature values.
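The scoring and ranking steps in the abstract can be illustrated with a short sketch. The abstract does not state how term frequency and level weight are combined, so the product used here is an assumption, as are the function name `rank_feature_terms`, the in-memory dictionary shapes, and the sample levels and weights.

```python
from collections import Counter


def rank_feature_terms(documents, term_levels, level_weights, top_n):
    """Rank terms by an assumed feature value: term frequency * level weight.

    documents:     list of token lists (the selected document set)
    term_levels:   term -> level (hypothetical stand-in for words_list)
    level_weights: level -> weight (hypothetical stand-in for levels_weights)
    """
    # Term frequency across the whole document set.
    frequency = Counter(token for doc in documents for token in doc)

    # Assumed combination rule; terms without a preset level are skipped.
    scores = {
        term: count * level_weights[term_levels[term]]
        for term, count in frequency.items()
        if term in term_levels
    }

    # Sort by descending feature value, breaking ties alphabetically.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_n]


# Made-up sample data: two tokenized documents, four graded terms.
documents = [
    ["XX fault", "stratum", "control", "stratum"],
    ["XX fault", "normal fault", "stratum", "control"],
]
term_levels = {"XX fault": 1, "normal fault": 2, "stratum": 3, "control": 4}
level_weights = {1: 4.0, 2: 3.0, 3: 1.5, 4: 0.5}

top_terms = rank_feature_terms(documents, term_levels, level_weights, top_n=3)
# top_terms == [("XX fault", 8.0), ("stratum", 4.5), ("normal fault", 3.0)]
```

With these made-up weights, "stratum" occurs most often yet ranks below "XX fault", which is the kind of importance difference that raw occurrence counts alone cannot express.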

Description

Technical field

[0001] The invention relates to the field of language processing, in particular to a method and device for sorting geological document feature terms based on hierarchical terms.

Background

[0002] The subject (or feature) of a geological document is determined by all the terms in the document together with their grammar, context dependencies, and so on; among these, the terms play an important role.

[0003] The terms in geological documents include geological named entities such as "XX fault", "XX mine", and "XX rock"; geological property terms such as "normal fault" and "rhyolitic structure"; common named entities such as "October 10, 2019" and "Hunan Academy of Geological Sciences"; basic geological terms such as "stratum", "structure", and "rock mass"; and common words such as "control", "basis", "area", and "feature". Different terms characterize a geological document to different degrees.

[0004] At present, most Chinese text classi...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F40/284, G06F40/295
Inventors: 邓吉秋, 路馥毓, 刘文毅, 李晨菡, 何美香
Owner: CENT SOUTH UNIV