Method and system for extracting Chinese key phrases in scientific and technological innovation field by utilizing semantic features

A key phrase extraction technology based on semantic features, applied in semantic tool creation, semantic analysis, natural language data processing, etc. It addresses problems of manual extraction such as high demands on professional expertise, low efficiency, and incomplete key phrases, with the effect of reducing the heavy workload and making the process simple and efficient.

Pending Publication Date: 2021-08-06
ZHEJIANG UNIV +1

AI Technical Summary

Problems solved by technology

[0002] Traditional key phrase extraction in the field of scientific and technological innovation relies on manual work and requires staff with substantial domain knowledge. If the field of the key phrases being extracted does not match the staff's area of expertise, phrases are often misjudged and extracted incorrectly.


Example Embodiment

[0049] The present invention will be further elaborated and explained below in conjunction with the accompanying drawings and specific embodiments.

[0050] As shown in Figure 1, a preferred embodiment of the present invention provides a method for extracting Chinese key phrases in the scientific and technological innovation field using semantic features. The method can be implemented by a corresponding key phrase extraction system that utilizes semantic features. The system consists of four parts: the technological innovation domain key phrase library, the candidate phrase generator, the key phrase generator, and the technological innovation domain document library.

[0051] The specific process is as follows:

[0052] S1: Obtain a variety of Chinese documents in the technological innovation field and convert them to a unified Chinese character encoding format to form a technological innovation domain document library.
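The encoding unification in step S1 could be sketched as follows. This is a minimal illustration, not the patented implementation: the choice of UTF-8 as the unified format, the candidate source encodings, and the names `to_unified_text` and `build_document_library` are all assumptions made for the example.

```python
from typing import Dict, Iterable, Optional

# Assumed candidate source encodings, tried in order. GB18030 accepts most
# byte sequences, so it is placed after the stricter UTF-8.
CANDIDATE_ENCODINGS = ("utf-8", "gb18030")


def to_unified_text(raw: bytes,
                    encodings: Iterable[str] = CANDIDATE_ENCODINGS) -> Optional[str]:
    """Decode raw bytes with the first candidate encoding that succeeds."""
    for enc in encodings:
        try:
            return raw.decode(enc)
        except UnicodeDecodeError:
            continue
    return None  # no candidate encoding could decode the document


def build_document_library(raw_docs: Dict[str, bytes]) -> Dict[str, bytes]:
    """Re-encode every decodable document as UTF-8 (the assumed unified format)."""
    library = {}
    for doc_id, raw in raw_docs.items():
        text = to_unified_text(raw)
        if text is not None:
            library[doc_id] = text.encode("utf-8")
    return library


if __name__ == "__main__":
    docs = {
        "gbk_doc": "科技创新".encode("gb18030"),
        "utf8_doc": "语义特征".encode("utf-8"),
    }
    for doc_id, data in build_document_library(docs).items():
        print(doc_id, data.decode("utf-8"))
```

Trying encodings in a fixed order is a simple heuristic; a production pipeline would more likely rely on declared encodings or a dedicated detection step.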

[0053] The Chinese documents in the technological innovation domain document library can be a collection of any Chinese technology clas...



Abstract

The invention discloses a method and a system for extracting Chinese key phrases in the scientific and technological innovation field by utilizing semantic features. The method mines corpus features of Chinese scientific and technological innovation documents to construct Chinese stop word and stop pattern libraries, enabling high-performance filtering of invalid information. In addition, various key phrase extraction algorithms are quantitatively evaluated and analyzed against domain expert annotations, so that an algorithm model better suited to domain cognition is selected, and multiple statistical rules are applied as filters to improve phrase extraction performance. The structural characteristics of each document are further used to embed the document's topic semantics in a vector space, and the semantic similarity between each extracted phrase and the document topic is combined with the semantic importance of the phrase to compute a score and ranking that completes the further screening of key phrases. The method can support various downstream tasks and applications, including knowledge graph construction, semantic retrieval of documents, and accurate entity search in the scientific and technological innovation field.
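The filtering and ranking stages summarized in the abstract could look roughly like the sketch below. It is illustrative only: the stop words, the stop patterns, the vectors, and the weighted-sum combination controlled by `alpha` are all assumptions, since the abstract states only that topic similarity and phrase importance are used together, not how they are combined.

```python
import math
import re
from typing import Dict, List, Sequence, Tuple

# Hypothetical stop-word and stop-pattern libraries (illustrative values only).
STOP_WORDS = {"本发明", "一种", "所述"}
STOP_PATTERNS = [re.compile(r"^\d+$")]  # e.g. drop purely numeric candidates


def is_valid(phrase: str) -> bool:
    """Reject a candidate that is a stop word or matches any stop pattern."""
    if phrase in STOP_WORDS:
        return False
    return not any(p.search(phrase) for p in STOP_PATTERNS)


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is zero."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_phrases(candidates: Dict[str, Tuple[Sequence[float], float]],
                 topic_vec: Sequence[float],
                 alpha: float = 0.5,
                 top_k: int = 10) -> List[Tuple[str, float]]:
    """Score each valid candidate by an assumed weighted sum of topic
    similarity and importance, then return the top_k highest scores."""
    scored = []
    for phrase, (vec, importance) in candidates.items():
        if not is_valid(phrase):
            continue
        score = alpha * cosine(vec, topic_vec) + (1.0 - alpha) * importance
        scored.append((phrase, score))
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


if __name__ == "__main__":
    # Toy 2-dimensional embeddings and importance scores, made up for the demo.
    candidates = {
        "知识图谱": ([1.0, 0.0], 0.8),
        "语义检索": ([0.0, 1.0], 0.9),
        "一种": ([1.0, 0.0], 1.0),
        "123": ([1.0, 0.0], 1.0),
    }
    for phrase, score in rank_phrases(candidates, topic_vec=[1.0, 0.0]):
        print(phrase, round(score, 3))
```

In this toy run the stop word and the numeric candidate are dropped before scoring, and the phrase aligned with the topic vector ranks first even though its importance score is lower.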

Description

Technical field

[0001] The invention relates to the fields of computer systems, big data, artificial intelligence, knowledge graph construction, natural language processing, etc., and specifically to a method for extracting key phrases in the field of scientific and technological innovation using semantic features.

Background

[0002] Traditional key phrase extraction in the field of scientific and technological innovation relies on manual work and requires staff with rich domain knowledge. If the field of the key phrases being extracted does not match the staff's area of expertise, phrases are often misjudged and extracted incorrectly. Manually extracted key phrases are prone to problems such as incompleteness, lack of detail, untimeliness, and inconsistency with the direction of objective needs. Therefore, the traditional manual key phrase extraction method has defects such as heavy workload, low efficiency, high error ra...


Application Information

IPC(8): G06F40/289, G06F40/30, G06F40/211, G06F40/216, G06K9/62, G06F16/335, G06F16/36
CPC: G06F40/289, G06F40/30, G06F40/211, G06F40/216, G06F16/335, G06F16/367, G06F18/22
Inventor: 庄越挺, 宗畅, 陈泽群, 鲁伟明, 邵健
Owner: ZHEJIANG UNIV