Semi-automatic construction method of Chinese patent corpus based on TRIZ

The invention is a corpus construction method in the field of patent language technology: a semi-automatic, TRIZ-based method for building a Chinese patent corpus. It addresses the heavy manpower and material cost of manual construction, the difficulty annotators face, and the difficulty of guaranteeing reliability. It improves corpus quality while saving manpower and material resources.

Status: Inactive. Publication Date: 2021-03-12
SOUTH CHINA AGRI UNIV
Cites: 0; Cited by: 1
AI Technical Summary

Problems solved by technology

[0006] The academic paper by Shi Cui, "Analysis and Solution of Patent Document Corpus Retrieval Problems" (Journal of Liaoning Administration Institute), proposes a solution covering three aspects: patent document word segmentation, part-of-speech tagging, and dependency syntax analysis. However, its patent representation is built by feature engineering and does not consider automatically learning effective features, which limits it.
[0007] Corpora used in traditional TRIZ-based patent text research rely on manual construction. This consumes a lot of manpower and material resources, and reliability is hard to guarantee because annotators differ in how they understand the knowledge contained in patents.
In addition, the original TRIZ has 40 invention principles. This is a large number, each principle is relatively abstract, and the textual descriptions of some principles overlap, all of which makes annotation harder.


Embodiment Construction

[0063] The present invention will be further described below in conjunction with the accompanying drawings and specific embodiments.

[0064] 1. Reorganization of TRIZ invention principles

[0065] TRIZ can be translated as the "Theory of Inventive Problem Solving". It is a systematic, practical theory for solving invention problems that a group of scholars led by G.S. Altshuller established by sorting and summarizing 2.5 million patent documents. TRIZ treats innovation as a way of resolving contradictions. From patents it derived 39 conflict-causing parameters, such as the weight of a moving object, together with a contradiction matrix, and it summarized 40 invention principles, such as segmentation, extraction / separation, and local quality. An invention principle can be used to find a possible solution to a problem or a route to an innovation, and it reveals a specific law or pattern behind the invention. To a certain extent, innovation is invention and crea...
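The reorganization idea named in this section's heading can be pictured as a many-to-one mapping from fine-grained principles to coarser categories. The following is only a minimal sketch under stated assumptions: the principle numbers 1 to 3 are the standard TRIZ numbering of the three principles named above, but the category labels and the assignment in REGROUPED are hypothetical placeholders, since the actual regrouping is not given in the text shown here.

```python
from collections import defaultdict

# Standard TRIZ numbering for the three principles named in the text.
PRINCIPLE_NAMES = {
    1: "segmentation",
    2: "extraction / separation",
    3: "local quality",
}

# HYPOTHETICAL regrouping: principle number -> coarse category label.
# These labels and assignments are illustrative, not the patent's grouping.
REGROUPED = {
    1: "structure_change",
    2: "structure_change",
    3: "property_change",
}


def coarse_labels(principle_ids):
    """Map fine-grained principle ids on one patent to sorted coarse categories."""
    return sorted({REGROUPED[p] for p in principle_ids if p in REGROUPED})


def members_by_category(mapping):
    """Invert the mapping so each coarse category lists its member principles."""
    groups = defaultdict(list)
    for pid, category in sorted(mapping.items()):
        groups[category].append(pid)
    return dict(groups)


if __name__ == "__main__":
    print(coarse_labels([1, 2]))
    print(members_by_category(REGROUPED))
```

With a mapping of this shape, two annotators who disagree between principles 1 and 2 would still agree at the coarse level, which is the kind of ambiguity reduction a regrouping is meant to provide.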

Abstract

The invention provides a semi-automatic, TRIZ-based method for constructing a Chinese patent corpus. The 40 TRIZ inventive principles are regrouped so that the resulting categories differ clearly from one another while ambiguity within each category stays relatively small, which further improves corpus quality. Patent text contains many low-frequency domain terms, and general-purpose Chinese word segmentation breaks complete terms apart; the method recovers complete term keywords, which gives patent semantic analysis a sound foundation. Because the semantic information carried by keywords alone is limited, the method also performs dependency syntactic analysis on sentences. This yields richer semantic information, makes machine recognition more accurate, and helps non-specialists understand the sentences, so that the small amount of annotation work still required can be done better. Finally, the method extracts sentence and dependency features of patent texts using representation learning, capturing deeper and more abstract semantic representations and the most discriminative features, which facilitates clustering of the texts.
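The point about general word segmentation destroying complete terms can be illustrated with a small example. This is a sketch only: it uses forward maximum matching against a lexicon, which is one common dictionary-based segmentation technique and is an assumption here, not necessarily the technique the invention uses. The function name segment, the two toy lexicons, and the max_len parameter are all hypothetical.

```python
def segment(text, lexicon, max_len=8):
    """Forward maximum matching: at each position take the longest lexicon
    entry that starts there; fall back to a single character otherwise."""
    tokens = []
    i = 0
    while i < len(text):
        match = text[i]  # single-character fallback
        # Try candidate substrings from longest to shortest (length >= 2).
        for j in range(min(len(text), i + max_len), i + 1, -1):
            if text[i:j] in lexicon:
                match = text[i:j]
                break
        tokens.append(match)
        i += len(match)
    return tokens


# Toy lexicons (assumptions for illustration only).
GENERAL_LEXICON = {"依存", "句法", "分析"}   # general-purpose words
DOMAIN_TERMS = {"依存句法分析"}              # "dependency syntactic analysis"

if __name__ == "__main__":
    term = "依存句法分析"
    # A general lexicon splits the term into three pieces.
    print(segment(term, GENERAL_LEXICON))
    # Adding the domain term keeps it whole as a single keyword.
    print(segment(term, GENERAL_LEXICON | DOMAIN_TERMS))
```

Under these assumptions, the same sentence yields a complete term keyword once the domain term is known to the segmenter, which is the property the abstract relies on for later semantic analysis.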

Description

Technical field

[0001] The present invention relates to the technical field of patent text classification, and more specifically to a semi-automatic construction method of a Chinese patent corpus based on TRIZ.

Background technique

[0002] In recent years the number of patents in China has grown explosively, yet the value of Chinese patents still lags well behind that of developed countries: patent quantity and quality are out of balance. An effective means is therefore urgently needed to organize and manage massive patent data and to identify key technologies that embody truly innovative invention principles, providing an important basis for enterprise transformation and upgrading and for the formulation of national industrial policy.

[0003] However, the traditional automatic classification of patents based on the International Patent Classification (IPC) cannot meet people's needs for fine-grain...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/35, G06F40/194, G06F40/211, G06F40/289, G06F40/30, G06K9/62
CPC: G06F16/353, G06F40/194, G06F40/211, G06F40/289, G06F40/30, G06F18/23213, G06F18/214
Inventors: 韦婷婷, 张建桃, 江涛
Owner: SOUTH CHINA AGRI UNIV