
Named entity recognition model applied to science and technology documents in manufacturing industry

A named entity recognition technology for technical documents, applied in fields such as biological neural network models, knowledge representation, and neural learning methods. It addresses problems such as unrecognized domain-specific terms and concepts and the inability to transfer existing named entity recognition models to the manufacturing industry, and thereby promotes digital transformation.

Pending Publication Date: 2022-05-17
中云开源数据技术(上海)有限公司

AI Technical Summary

Problems solved by technology

[0004] 1) Existing standard named entity recognition models cannot recognize domain-specific terms and concepts.
[0005] 2) Because technical documents differ in character from field to field, a named entity recognition model developed for one specific field cannot be transferred to the manufacturing field.
[0006] 3) The prior art lacks research on automatically classifying manufacturing scientific and technical documents using a named entity recognition model built for the manufacturing field.



Embodiment Construction

[0047] To make the technical means, inventive features, objectives, and effects of the present invention easy to understand, the invention is further described below with reference to the drawings and specific embodiments.

[0048] As shown in Figure 1, the named entity recognition model for manufacturing scientific and technical documents proposed by the present invention is built through the following steps:

[0049] I. Data Collection

[0050] Select 10 English-language journals related to manufacturing science from the Web of Science, covering 2010 to 2021. Take 10,000 paper abstracts from each journal, for a total of 100,000 abstracts, to form the original corpus for training the named entity recognition model.

[0051] II. Data Preprocessing

[0052] 1) Remove punctuation and stop words from the data set, perform lemmatization, and build a dictionary for the corpus.

[0053] 2) Manually define 12 kinds of manufacturing text catego...
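Preprocessing step 1) can be sketched in a few lines of Python. This is a minimal illustration and not the patent's implementation: the stop-word list, the lemma lookup table (standing in for a real lemmatizer), and the names `preprocess` and `build_dictionary` are all assumptions made for the example.

```python
import string
from collections import Counter

# Assumption: a tiny illustrative stop-word list; a real pipeline would use a fuller one.
STOP_WORDS = {"a", "an", "the", "of", "and", "in", "on", "for", "to",
              "is", "are", "was", "were", "by", "with"}

# Assumption: a toy lemma lookup table standing in for a real lemmatizer.
LEMMAS = {"machines": "machine", "tools": "tool", "alloys": "alloy",
          "welded": "weld", "welding": "weld"}


def preprocess(text):
    """Lowercase, replace punctuation with spaces, drop stop words, lemmatize by lookup."""
    table = str.maketrans(string.punctuation, " " * len(string.punctuation))
    tokens = text.lower().translate(table).split()
    return [LEMMAS.get(tok, tok) for tok in tokens if tok not in STOP_WORDS]


def build_dictionary(docs):
    """Map each distinct token to an integer id, most frequent first (ties alphabetical)."""
    counts = Counter(tok for doc in docs for tok in doc)
    ordered = sorted(counts, key=lambda tok: (-counts[tok], tok))
    return {tok: idx for idx, tok in enumerate(ordered)}


# Made-up sentences used only to exercise the functions.
abstracts = [
    "The alloys were welded with laser tools.",
    "Welding of machines and tools is studied.",
]
corpus = [preprocess(a) for a in abstracts]
vocab = build_dictionary(corpus)
print(corpus[0])  # ['alloy', 'weld', 'laser', 'tool']
print(vocab)
```

Ordering the dictionary by frequency is one common convention; the source does not say how its dictionary is indexed.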


Abstract

The invention discloses a named entity recognition model applied to scientific and technical documents in the manufacturing industry. The network structure of the model comprises four layers: a SciBERT word embedding layer, which converts each input word into a fixed-length vector; a BiLSTM layer, which encodes the text and mines hidden features using the context of the text sequence; an attention layer, which reduces the weights of irrelevant modifiers within entities, delimits entity boundaries, and avoids missing important entities during extraction; and a CRF layer, which serves as the output layer of the network and prevents entities in the text sequence from being mislabeled. The model can extract information and generate knowledge from text, and can analyze various documents comparable to manufacturing scientific and technical documents, such as product design text data, engineering test text data, supplier data, maintenance record data, and product usage data in the manufacturing field. It can provide a technical basis for enterprises to interconnect their various data assets, which is key to promoting enterprise digital transformation.
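To illustrate how a CRF output layer can keep entities from being mislabeled, the sketch below runs Viterbi decoding over per-token emission scores with BIO transition constraints. It rests on several assumptions: the tag set (`O`, `B-MAT`, `I-MAT`) and all scores are invented for the example, and the transitions are hard-coded constraints, whereas a trained CRF would learn its transition scores. The patent's actual labels and parameters are not shown in this listing.

```python
# Hypothetical tag set for illustration; the patent's entity categories are not shown here.
TAGS = ["O", "B-MAT", "I-MAT"]
NEG = -1e9  # large negative score standing in for a forbidden transition


def allowed(prev, curr):
    """BIO constraint: an I-X tag may only follow B-X or I-X."""
    if curr.startswith("I-"):
        return prev in ("B-" + curr[2:], "I-" + curr[2:])
    return True


def viterbi(emissions, transitions):
    """Return the highest-scoring tag sequence for the given emission and transition scores."""
    n = len(TAGS)
    # The sequence start is treated like following "O", so a sequence cannot open with I-X.
    score = [emissions[0][j] + (0.0 if allowed("O", TAGS[j]) else NEG) for j in range(n)]
    back = []
    for emit in emissions[1:]:
        new_score, pointers = [], []
        for j in range(n):
            best_i = max(range(n), key=lambda i: score[i] + transitions[i][j])
            new_score.append(score[best_i] + transitions[best_i][j] + emit[j])
            pointers.append(best_i)
        score = new_score
        back.append(pointers)
    # Follow back-pointers from the best final tag to recover the full path.
    path = [max(range(n), key=lambda j: score[j])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return [TAGS[i] for i in reversed(path)]


# Rows are the previous tag, columns the current tag, both in TAGS order.
transitions = [[0.0 if allowed(p, c) else NEG for c in TAGS] for p in TAGS]

# Made-up emission scores for the tokens "uses titanium alloy".
emissions = [
    [2.0, 0.1, 0.3],  # "uses"
    [0.2, 1.0, 1.5],  # "titanium": I-MAT scores highest when taken in isolation
    [0.1, 0.4, 2.0],  # "alloy"
]

greedy = [TAGS[max(range(len(TAGS)), key=lambda j: row[j])] for row in emissions]
decoded = viterbi(emissions, transitions)
print(greedy)   # ['O', 'I-MAT', 'I-MAT']  (invalid: I-MAT directly after O)
print(decoded)  # ['O', 'B-MAT', 'I-MAT']
```

Picking the best tag per token yields an invalid sequence, while decoding over the whole sequence repairs the entity boundary. This is the kind of labeling error the abstract attributes to the CRF layer.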

Description

Technical field

[0001] The invention relates to the technical field of natural language processing, and in particular to a named entity recognition model applied to manufacturing scientific and technical documents.

Background

[0002] With the exponential growth of manufacturing-related scientific literature and of the digital resources available on the Internet, it is challenging to search for and extract valuable information from manufacturing scientific and technical documents. Existing named entity recognition models have been researched and applied in specific fields such as materials science, biomedicine, chemical science, network security, maintenance practice, and forensic science. However, for manufacturing scientific and technical documents, research on using named entity recognition models to extract valuable information is still at an embryonic stage of technical language processing research and application.

[0003] The shortcomings of exist...


Application Information

IPC(8): G06F40/295, G06F40/242, G06F40/247, G06F40/126, G06N3/04, G06N3/08, G06N5/02
CPC: G06F40/295, G06F40/242, G06F40/247, G06F40/126, G06N3/08, G06N5/02, G06N3/044
Inventor: 王明浩
Owner: 中云开源数据技术(上海)有限公司