Method for automatic extraction of scientific and technical terminology

A method for scientific and technical terminology, applied in the field of automatic extraction of technical terms, that addresses problems such as the difficulty of improving accuracy beyond a certain level.

Inactive Publication Date: 2010-02-24
北京中献电子技术开发有限公司

AI Technical Summary

Problems solved by technology

However, the existing method does not incorporate the word-formation rules of Chinese itself. It is still a frequency-based probabilistic method. Once the accuracy rate reaches a certain level, further improvement is difficult, and the bottleneck is obvious.



Examples


Embodiment 1

[0065] This embodiment describes the automatic extraction of commonly used noun phrases from Chinese patents. It covers 24 fields: safety, geology, electric power, real estate, textiles, aviation, nuclear science, chemical industry, machinery, computers, construction, transportation, military, science, tourism, energy, agriculture, biology, biobank, communication, physics, metallurgy, medicine, and quality inspection.

[0066] As shown in Figure 1, in this embodiment the extraction of technical terms includes the following steps:

[0067] Field sorting

[0068] Each patent has an IPC classification. The main IPC reflects the field to which the patent applies, and separate patent literature databases are established for patents according to their IPC. Scientific and technological terms are generally field-specific, and the main purpose of establishing a patent literature database is to discover commonly used scientific and technological term...
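The field-sorting step can be illustrated with a minimal sketch that groups patent texts by a field derived from each patent's main IPC. The mapping table, the record keys `main_ipc` and `text`, and the function names are assumptions made for illustration; the source does not specify how IPC codes map to the 24 fields.

```python
from collections import defaultdict

# Hypothetical mapping from the first three characters of the main IPC to a
# field name. The patent text does not give this table; entries are examples.
IPC_PREFIX_TO_FIELD = {
    "G06": "computers",
    "D04": "textiles",
    "H04": "communication",
}


def field_of(main_ipc):
    """Return the field for a main IPC code, or 'other' if it is unmapped."""
    return IPC_PREFIX_TO_FIELD.get(main_ipc[:3].upper(), "other")


def build_field_databases(patents):
    """Group patent texts into one list per field, keyed by field name."""
    databases = defaultdict(list)
    for patent in patents:
        databases[field_of(patent["main_ipc"])].append(patent["text"])
    return dict(databases)


# Illustrative records (assumed structure, not from the source).
sample_patents = [
    {"main_ipc": "G06F17/30", "text": "document A"},
    {"main_ipc": "D04B1/00", "text": "document B"},
    {"main_ipc": "G06F17/27", "text": "document C"},
]
field_databases = build_field_databases(sample_patents)
```

Each resulting list can then serve as the field-specific corpus from which terms are extracted.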


Abstract

A method for automatically recognizing and, with manual assistance, extracting scientific and technical terminology from Chinese patent documents by computer. Starting from basic part-of-speech tagging information, the method applies rules of Chinese phrase word-formation to automatically recognize and extract strings that may form terms, and the authenticity of each term is then judged with manual assistance. The main steps are: building separate patent document databases by field; extracting repeated strings using a field-specific patent document database as the training corpus; segmenting and part-of-speech tagging the repeated strings using basic terminology; and repeatedly checking the boundaries of the repeated strings against Chinese word-formation rules until the strings become acceptable candidate terms. The candidate terms can then be further verified and confirmed with manual assistance.
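The automatic part of the steps in the abstract can be sketched roughly as follows. This is a toy illustration, not the patent's implementation: the tiny lexicon, the part-of-speech tags, the forward-maximum-matching segmenter, and the two boundary rules (a candidate starts with a noun or verb and ends with a noun) are all assumptions chosen to keep the example small. The manual verification step is not modeled.

```python
from collections import Counter

# Toy stand-in for the "basic terminology" with part-of-speech tags
# (n = noun, v = verb, u = particle, p = preposition). Illustrative only.
LEXICON = {"数据": "n", "数据库": "n", "检索": "v", "系统": "n", "的": "u", "在": "p"}
MAX_WORD_LEN = max(len(word) for word in LEXICON)


def repeated_strings(docs, min_len=2, max_len=8, min_count=2):
    """Return substrings of length min_len..max_len seen at least min_count times."""
    counts = Counter()
    for doc in docs:
        for n in range(min_len, max_len + 1):
            for i in range(len(doc) - n + 1):
                counts[doc[i:i + n]] += 1
    return {s for s, c in counts.items() if c >= min_count}


def segment_and_tag(s):
    """Forward maximum matching against LEXICON; unknown characters get tag 'x'."""
    tokens, i = [], 0
    while i < len(s):
        for n in range(min(MAX_WORD_LEN, len(s) - i), 0, -1):
            word = s[i:i + n]
            if word in LEXICON:
                tokens.append((word, LEXICON[word]))
                i += n
                break
        else:  # no lexicon word starts here
            tokens.append((s[i], "x"))
            i += 1
    return tokens


def trim_boundaries(tokens):
    """Drop edge tokens until the assumed boundary rules hold (or nothing is left)."""
    while tokens and tokens[0][1] not in ("n", "v"):
        tokens = tokens[1:]
    while tokens and tokens[-1][1] != "n":
        tokens = tokens[:-1]
    return tokens


def candidate_terms(docs):
    """Return multi-word candidates that survive boundary checking."""
    candidates = set()
    for s in repeated_strings(docs):
        tokens = trim_boundaries(segment_and_tag(s))
        if len(tokens) >= 2:
            candidates.add("".join(word for word, _ in tokens))
    return candidates


corpus = ["在数据库检索系统的", "数据库检索系统在"]
terms = candidate_terms(corpus)
```

On this two-sentence corpus, fragments with broken boundaries (for example a string that ends in a verb or begins with a stray character) are trimmed or discarded, and only strings satisfying the assumed rules remain as candidates for manual review.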

Description

Technical field

[0001] The invention relates to a method for automatically identifying and extracting scientific and technological terms by computer, and in particular to a computer-based method for automatically identifying and, with manual assistance, extracting scientific and technological terms from Chinese patent documents.

Background art

[0002] With the development of information technology, the volume of scientific literature keeps growing and manual processing has become impractical, so the introduction of automation technology is an inevitable trend. However, technical terminology is a major obstacle to the automatic summarization, automatic indexing, automatic classification and even machine translation of this information. Automatically identifying and extracting scientific and technological terms from the literature is therefore an urgent and meaningful task.

[0003] Chinese patent application 03148989.3 discloses a method for automatically extracting multi...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
Inventor: 王进张素兰贾学杰任丽王永生张迁王婷婷
Owner: 北京中献电子技术开发有限公司