
Man-machine cooperation system and method for dynamically maintaining high-quality science and technology concept library

A human-machine cooperation system for maintaining a high-quality science and technology concept library, applied in the knowledge-acquisition field of knowledge engineering. It addresses problems such as long processing time, wasted manpower, and limited accuracy, and achieves reduced manual intervention, lower maintenance costs, and improved sensitivity.

Pending Publication Date: 2022-05-10
集智学园(北京)科技有限公司
Cites: 3 | Cited by: 0

AI Technical Summary

Problems solved by technology

[0005] However, the classic TERMATE still struggles to directly meet the stricter terminology requirements of the science and technology field. Specifically, there are three problems. First, knowledge in science and technology is updated and iterated quickly; to capture changes in a field promptly and support technological innovation, term extraction needs to be iterated in real time on new corpus. Second, the corpora involved in a field or project are often large; although TERMATE produces excellent results, it requires more computation and takes longer than many other algorithms, which can lead to long running times and heavy use of computing resources. Third, the science and technology field places higher demands on the accuracy and completeness of terminology; an algorithm cannot be fully accurate for some cutting-edge words with little available information, and term extraction alone cannot meet the needs of a considerable part of scientific and technological work, which requires additional information.
[0006] Although some newer technologies attempt to solve similar problems, they all have obvious deficiencies. Patent application CN105260482A uses crowdsourcing to discover new words on the Internet, but the machine processing steps before crowdsourcing do not exploit the machine's strength in large-scale statistics and so waste valuable manpower, which makes the approach unsuitable for maintaining a terminology database in science and technology. Patent application CN112632969A describes a Chinese-specific term extraction method that involves basic incremental updating, but it focuses on the incremental computation of several features and does not address incremental processing of the entire ATE process, which is the time-consuming step in that method. In addition, the method is not universal across languages and cannot be applied to science and technology fields where English predominates, and coarse techniques such as stop-word lists also limit its accuracy.



Examples


Embodiment Construction

[0128] In addition to the characteristics of the original TERMATE, the present invention uses incremental calculation to obtain continuously improving extraction results at relatively small computational cost. The extraction algorithm is not synchronized with the crowdsourcing platform: the former runs on a set cycle, while the latter updates its results as users use it. The two are therefore shown separately here. The former selects 1,000 documents as the seed library for extraction and then adds 100 documents for demonstration; the latter shows some typical details and explains its relationship with the aforementioned scheme.
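The incremental idea in this paragraph can be illustrated with a minimal Python sketch. It is not TERMATE and not the patented algorithm: the class `IncrementalTermStats`, its n-gram candidate generation, and the TF-IDF-style score are all hypothetical stand-ins, chosen only to show how a new batch can be folded into running statistics without reprocessing the seed library.

```python
import math
import re
from collections import Counter


class IncrementalTermStats:
    """Running candidate statistics, updated batch by batch (illustrative only)."""

    def __init__(self, max_n=2):
        self.max_n = max_n
        self.term_freq = Counter()  # total occurrences of each candidate
        self.doc_freq = Counter()   # number of documents containing each candidate
        self.num_docs = 0

    def _candidates(self, text):
        # Hypothetical candidate generator: all word n-grams up to max_n.
        tokens = re.findall(r"[a-z]+", text.lower())
        for n in range(1, self.max_n + 1):
            for i in range(len(tokens) - n + 1):
                yield " ".join(tokens[i:i + n])

    def add_documents(self, docs):
        """Fold a new batch into the running counts; old documents are not re-read."""
        for doc in docs:
            counts = Counter(self._candidates(doc))
            self.term_freq.update(counts)
            self.doc_freq.update(counts.keys())
            self.num_docs += 1

    def top_terms(self, k=5):
        """Rank candidates with a placeholder TF-IDF-style score (an assumption)."""
        scored = {
            t: tf * math.log(1 + self.num_docs / self.doc_freq[t])
            for t, tf in self.term_freq.items()
        }
        return sorted(scored, key=lambda t: (-scored[t], t))[:k]


stats = IncrementalTermStats()
seed = ["complex network analysis", "network science and complex systems"]
stats.add_documents(seed)                      # stands in for the 1,000-document seed library
stats.add_documents(["graph neural network"])  # stands in for the later 100-document increment
```

Because the counts are additive, processing the seed and the increment separately yields the same statistics as processing everything in one pass, which is what keeps each update cheap relative to a full re-extraction.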

[0129] The corpus processed by this method is a collection of chapters rather than a single document. As the data grows, the accuracy of the method increases accordingly. Note that only a small corpus is selected here for convenience of illustration, and the results can re...



Abstract

The invention discloses a human-machine cooperation system for dynamically maintaining a high-quality science and technology concept library. Building on TERMATE, a high-performing automatic term extraction (ATE) algorithm, and addressing the tension between fast-changing science and technology concepts and the long running time of a high-quality algorithm, it designs a complete concept library maintenance framework based on incremental operation and human-machine cooperation, and on that basis implements an online closed-loop system for using corpora. In the system, a customized extraction algorithm performs integrated ATE on incoming corpora to enrich the term candidates and report logs. A crowdsourcing platform gathers contributors who confirm and refine the candidates to obtain terms and concepts, and settles rewards adaptively. Confirmed candidates in turn improve ATE performance. The whole system is named ConceptEST, or "the most concept". It seamlessly integrates machine intelligence and collective intelligence, enriches high-quality science and technology knowledge, and supports the management of domain terms and concepts, so that changes in a field of interest can be followed at low cost and high efficiency, to the benefit of practitioners in that field.
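The closed loop described in the abstract (extraction proposes candidates, the crowd confirms them, confirmed terms feed back into extraction) might be sketched as follows. Everything here is an assumption for illustration: the class `ConceptLoop`, the vote threshold, the acceptance ratio, and the score boost are hypothetical, and the adaptive reward settlement is omitted because this excerpt does not specify it.

```python
from dataclasses import dataclass, field


@dataclass
class Candidate:
    term: str
    score: float                       # score assigned by the extraction step
    votes: list = field(default_factory=list)


class ConceptLoop:
    """Toy human-in-the-loop cycle; all parameters are hypothetical."""

    def __init__(self, min_votes=3, accept_ratio=0.6, boost=1.5):
        self.min_votes = min_votes
        self.accept_ratio = accept_ratio
        self.boost = boost
        self.pending = {}       # candidates waiting for crowd confirmation
        self.confirmed = set()  # terms accepted into the concept library

    def submit(self, term, score):
        """Extraction side: queue a candidate, or boost an already confirmed term."""
        if term in self.confirmed:
            return score * self.boost  # feedback from the crowd into extraction
        self.pending.setdefault(term, Candidate(term, score)).score = score
        return score

    def vote(self, term, is_term):
        """Crowd side: record one vote and settle once enough votes are in."""
        cand = self.pending[term]
        cand.votes.append(bool(is_term))
        if len(cand.votes) < self.min_votes:
            return "pending"
        del self.pending[term]
        if sum(cand.votes) / len(cand.votes) >= self.accept_ratio:
            self.confirmed.add(term)
            return "confirmed"
        return "rejected"


loop = ConceptLoop()
loop.submit("graph neural network", 2.0)
statuses = [loop.vote("graph neural network", v) for v in (True, True, False)]
boosted = loop.submit("graph neural network", 2.0)  # next extraction cycle
```

The two sides only share the `pending` and `confirmed` collections, which mirrors the point that the extraction cycle and the crowdsourcing platform run on independent schedules.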

Description

Technical field

[0001] The invention belongs to the knowledge-acquisition field of knowledge engineering. Its main purpose is the integrated extraction of terminology from continuously acquired corpus streams; in particular, it relates to ConceptEST, a human-machine cooperation system for dynamically maintaining a high-quality science and technology concept library.

Background

[0002] Traditional media and Internet media contain a large number of unstructured texts, and extracting useful domain concepts from this text data is an extremely important task in natural language processing.

[0003] Automatic Terminology Extraction (ATE) is the task of extracting words that describe concepts with rich connotations from a corpus of a given scale and language, so as to form a core vocabulary that describes the skeleton of the domain. ATE is a direction with high practical value in the field of Natural Langu...


Application Information

IPC(8): G06N5/02, G06F40/216, G06F40/247, G06F16/35
CPC: G06N5/022, G06N5/025, G06F40/216, G06F40/247, G06F16/353
Inventor: 徐恩峤胡乔林嘉琦
Owner: 集智学园(北京)科技有限公司