Geographic information service metadata text multi-level multi-label classification method

A multi-level, multi-label classification method for geographic information service metadata text. Existing methods do not incorporate the characteristics of the field, do not consider the semantics of professional terms, and cannot effectively fit the text characteristics of geographic information service metadata. The method addresses these problems and achieves better overall classification performance.

Active Publication Date: 2020-01-17
WUHAN UNIV

AI Technical Summary

Problems solved by technology

Traditional classification methods usually do not incorporate the characteristics of the field and do not consider the semantics of professional terms in the text, so they cannot effectively fit the text characteristics of geographic information service metadata.

Examples

Detailed Description of Embodiments

[0051] To make the object, technical solution and advantages of the present invention clearer, the invention is described in further detail below with reference to the examples. The specific embodiments described here serve only to explain the present invention, not to limit it.

[0052] There are currently 46,000 Web Map Service (WMS) text records, of which 400 are labeled with SBAs topics, with the topics evenly distributed. The text content comes from the URL, Abstract, Keywords and Title fields of the Service tag in the WMS GetCapabilities capabilities document. Because the text content is mixed, text lengths vary, a single record corresponds to multiple topic categories, and few samples carry topic labels, traditional multi-label classification algorithms struggle to classify accurately and comprehensively, and it is also imposs...
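As an illustration of where such text comes from, the following is a minimal Python sketch that pulls the Title, Abstract, keyword and URL text out of the Service element of a capabilities document. It is not the patent's implementation. The function name `service_text`, the sample document, and the choice to read the URL from the `OnlineResource` element's `xlink:href` attribute are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

XLINK_HREF = "{http://www.w3.org/1999/xlink}href"

def local(tag):
    """Drop any '{namespace}' prefix so namespaced and plain tags compare alike."""
    return tag.rsplit("}", 1)[-1]

def service_text(capabilities_xml):
    """Join Title, Abstract, keywords and URL of the Service element into one string."""
    root = ET.fromstring(capabilities_xml)
    service = next((el for el in root if local(el.tag) == "Service"), None)
    if service is None:
        return ""
    parts = []
    for child in service:
        name = local(child.tag)
        if name in ("Title", "Abstract") and child.text:
            parts.append(child.text.strip())
        elif name == "KeywordList":
            parts.extend(kw.text.strip() for kw in child if kw.text)
        elif name == "OnlineResource":
            # Assumption: the "URL" field is taken from the xlink:href attribute.
            href = child.get(XLINK_HREF)
            if href:
                parts.append(href)
    return " ".join(parts)

# Hypothetical sample document, invented for illustration only.
SAMPLE = """<WMS_Capabilities xmlns="http://www.opengis.net/wms"
    xmlns:xlink="http://www.w3.org/1999/xlink" version="1.3.0">
  <Service>
    <Name>WMS</Name>
    <Title>Global Land Cover</Title>
    <Abstract>Land cover maps derived from satellite imagery.</Abstract>
    <KeywordList><Keyword>land cover</Keyword><Keyword>vegetation</Keyword></KeywordList>
    <OnlineResource xlink:href="http://example.org/wms"/>
  </Service>
</WMS_Capabilities>"""

print(service_text(SAMPLE))
```

The fields are emitted in document order, so the mixed content and varying length described above carry straight through into the text that a classifier would see.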

Abstract

The invention discloses a multi-level, multi-label classification method for geographic information service metadata text, comprising the following steps: 1) obtaining a set of geographic information service metadata texts, preprocessing the text, and splitting each data sample into a combination of text feature words; 2) setting a first-level classification directory and generating a typical word list semantically associated with the classification categories; 3) screening the text feature words against the typical word list; 4) selecting ML-KNN as one base model for co-training; 5) building a topic prediction model, ML-CSW, as the other base model for co-training; 6) designing a collaboration mechanism that matches multi-label topics to the metadata text and takes them as the first-level, coarse-grained topic classification result. The method takes the domain features and text semantics of geographic information service metadata into account, depends on only a small number of labeled samples, and yields better overall classification performance than traditional multi-label classification methods.
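To make step 4 more concrete, the following is a minimal pure-Python sketch of ML-KNN as it is commonly described: for each label, a smoothed prior is combined with the likelihood of seeing a given number of neighbours that carry that label. It does not reproduce the patent's feature representation, parameters, or co-training with ML-CSW. The Euclidean distance, the values of `k` and `s`, and the toy data are all assumptions made for this example.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest(X, query, k, skip=None):
    """Indices of the k rows of X closest to query, optionally skipping one row."""
    order = sorted((i for i in range(len(X)) if i != skip),
                   key=lambda i: euclidean(X[i], query))
    return order[:k]

class MLKNN:
    def __init__(self, k=3, s=1.0):
        self.k, self.s = k, s  # neighbour count and smoothing (assumed values)

    def fit(self, X, Y):
        n, q, k, s = len(X), len(Y[0]), self.k, self.s
        self.X, self.Y = X, Y
        # Smoothed prior probability that each label is present.
        self.prior = [(s + sum(row[l] for row in Y)) / (2 * s + n) for l in range(q)]
        # present[l][j]: samples WITH label l that have exactly j neighbours carrying l.
        # absent[l][j]: the same count for samples WITHOUT label l.
        present = [[0] * (k + 1) for _ in range(q)]
        absent = [[0] * (k + 1) for _ in range(q)]
        for i in range(n):
            nbrs = nearest(X, X[i], k, skip=i)
            for l in range(q):
                j = sum(Y[t][l] for t in nbrs)
                if Y[i][l]:
                    present[l][j] += 1
                else:
                    absent[l][j] += 1
        # Smoothed likelihoods of each neighbour count given label present / absent.
        self.like1 = [[(s + c) / (s * (k + 1) + sum(row)) for c in row] for row in present]
        self.like0 = [[(s + c) / (s * (k + 1) + sum(row)) for c in row] for row in absent]
        return self

    def predict(self, x):
        nbrs = nearest(self.X, x, self.k)
        labels = []
        for l, p1 in enumerate(self.prior):
            j = sum(self.Y[t][l] for t in nbrs)
            # Maximum a posteriori decision for label l.
            labels.append(int(p1 * self.like1[l][j] > (1 - p1) * self.like0[l][j]))
        return labels

# Toy data: two clusters; every sample has label 0, only the second cluster has label 1.
X = [[0, 0], [0, 1], [1, 0], [1, 1], [5, 5], [5, 6], [6, 5], [6, 6]]
Y = [[1, 0], [1, 0], [1, 0], [1, 0], [1, 1], [1, 1], [1, 1], [1, 1]]
model = MLKNN(k=3).fit(X, Y)
print(model.predict([0.5, 0.5]))  # [1, 0]
print(model.predict([5.5, 5.5]))  # [1, 1]
```

Because each label is decided independently, a single sample can receive several labels at once, which is the property the abstract relies on when matching multi-label topics to a metadata text.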

Description

Technical Field

[0001] The invention relates to natural language processing technology, and in particular to a multi-level, multi-label classification method for geographic information service metadata text.

Background

[0002] As an important means of data analysis, accurate text classification is the key to improving the quality of geographic information resource retrieval and has a wide range of application scenarios. Most traditional classification methods suit binary or single-label classification scenarios and rely heavily on large numbers of labeled samples to train classification models, which limits the accuracy and comprehensiveness of text classification and the scenarios in which the model applies. For geographic information service metadata in particular, sample datasets labeled with topics are usually lacking, the text content is mixed, and the feature vocabulary is complicated by the mixture of geoscience terms an...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/35, G06F16/387, G06F16/34, G06F40/284, G06F40/30, G06K9/62
CPC: G06F16/35, G06F16/387, G06F16/345, G06F18/23213, G06F18/24155
Inventors: 桂志鹏, 张敏, 彭德华, 吴华意
Owner: WUHAN UNIV