
Industrial big data search optimization method, system, equipment, medium and terminal

An optimization method using big data technology, applied in network data retrieval, other database retrieval, digital data information retrieval, and related fields. It addresses low retrieval efficiency and poor retrieval accuracy, and achieves improved search efficiency, improved word segmentation, and better classification performance and robustness.

Pending Publication Date: 2021-10-26
XIDIAN UNIV

AI Technical Summary

Problems solved by technology

[0024] To address the problems in the prior art, the present invention provides an industrial big data search optimization method, system, equipment, medium, and terminal, and in particular one based on an industrial word segmenter. It aims to solve the low retrieval efficiency and poor retrieval accuracy of current industrial data retrieval methods.



Examples


Embodiment

[0080] Embodiment: industrial big data search optimization method based on industrial tokenizer

[0081] Many Chinese tokenizers are in common use, and they segment the same document differently. Commonly used tokenizers include the word tokenizer, ik-analyzer, mmseg4j, jieba, and the Stanford tokenizer. Each tokenizer segments a document according to its own rules, so different rules produce different segmentation results. This invention proposes an industrial tokenizer to meet the needs of industrial big data search.
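To illustrate how segmentation rules and vocabulary change the result, here is a minimal Python sketch. It uses a toy forward maximum matching segmenter, which is a stand-in chosen for illustration and not the algorithm used by IK, jieba, or the other tokenizers named above. The two vocabularies and the Chinese string (an assumed rendering of the title "Industrial Internet Analysis Platform") are hypothetical.

```python
def forward_max_match(text, vocabulary):
    """Greedy forward maximum matching: at each position take the longest
    vocabulary entry that matches, falling back to a single character."""
    max_len = max((len(word) for word in vocabulary), default=1)
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shrink one character at a time.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in vocabulary:
                tokens.append(piece)
                i += size
                break
    return tokens


# Hypothetical vocabularies: a general one, and the same one extended with
# a single industrial term ("Industrial Internet").
GENERAL_VOCAB = {"工业", "互联网", "分析", "平台"}
INDUSTRIAL_VOCAB = GENERAL_VOCAB | {"工业互联网"}

text = "工业互联网分析平台"  # assumed original of "Industrial Internet Analysis Platform"
general_tokens = forward_max_match(text, GENERAL_VOCAB)
industrial_tokens = forward_max_match(text, INDUSTRIAL_VOCAB)

print(general_tokens)     # ['工业', '互联网', '分析', '平台']
print(industrial_tokens)  # ['工业互联网', '分析', '平台']
```

With the extended vocabulary the domain term survives as a single token, which is the effect an industrial extension dictionary is meant to have on search.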

[0082] The seven documents are Document 1: "Industrial Internet Analysis Platform", Document 2: "Industrial Internet Analysis Platform Architecture", Document 3: "Industrial Internet Analysis Platform Architecture", Document 4: "Industrial Internet Analysis Platform Technical Architecture", ...



Abstract

The invention belongs to the technical field of industrial data processing and discloses an industrial big data search optimization method, system, device, medium, and terminal. The method first collects specialized vocabulary from the industrial field into a set, stores it in a new document, and places that document in the IK tokenizer's configuration folder. It then configures the industrial extension dictionary in the XML configuration document to form an industrial tokenizer and restarts the Elasticsearch search engine, at which point construction of the industrial tokenizer is complete. The invention provides a tokenizer designed specifically for industry, built by analyzing the technical principles of tokenizers. Its segmentation results on industrial text are compared with those of the mainstream general-purpose Chinese tokenizers jieba and Ansj. The results show that the industrial tokenizer achieves better classification performance and robustness, and that extending the dictionary with specialized industrial vocabulary effectively improves segmentation quality and search efficiency.
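The construction steps in the abstract can be sketched in Python as follows. This is a minimal sketch, assuming the IK plugin's usual `IKAnalyzer.cfg.xml` layout with an `ext_dict` entry. The dictionary file name `industrial.dic`, the sample terms, and the function name `build_industrial_dictionary` are hypothetical, and the temporary directory stands in for the real IK configuration folder. Restarting Elasticsearch is left as a manual step.

```python
import tempfile
from pathlib import Path
from xml.sax.saxutils import escape


def build_industrial_dictionary(config_dir, terms, dict_name="industrial.dic"):
    """Write the term list and an IK-style XML config into config_dir.

    Returns the paths of the dictionary file and the XML config file.
    """
    config_dir = Path(config_dir)
    config_dir.mkdir(parents=True, exist_ok=True)

    # One term per line, UTF-8, with blanks and duplicates removed.
    unique_terms = sorted({t.strip() for t in terms if t.strip()})
    dict_path = config_dir / dict_name
    dict_path.write_text("\n".join(unique_terms) + "\n", encoding="utf-8")

    # Assumed layout of IKAnalyzer.cfg.xml: a Java properties XML file whose
    # "ext_dict" entry names the extension dictionary.
    cfg = (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">\n'
        "<properties>\n"
        "  <comment>IK Analyzer extension configuration</comment>\n"
        f'  <entry key="ext_dict">{escape(dict_name)}</entry>\n'
        "</properties>\n"
    )
    cfg_path = config_dir / "IKAnalyzer.cfg.xml"
    cfg_path.write_text(cfg, encoding="utf-8")
    return dict_path, cfg_path


if __name__ == "__main__":
    # Hypothetical industrial terms; a temporary folder replaces the real one.
    with tempfile.TemporaryDirectory() as tmp:
        dict_path, cfg_path = build_industrial_dictionary(
            tmp, ["工业互联网", "数控机床", "工业互联网"]
        )
        print(dict_path.read_text(encoding="utf-8"))
        print(cfg_path.read_text(encoding="utf-8"))
```

In a real deployment the target folder would be the IK plugin's configuration directory, and the search engine would need a restart before the new dictionary takes effect, as the abstract describes.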

Description

Technical field

[0001] The invention belongs to the technical field of industrial data processing, and in particular relates to an industrial big data search optimization method, system, equipment, medium, and terminal.

Background technique

[0002] At present, in the industrial information service platform, data is the cornerstone of the entire platform, and retrieval and acquisition of data is the core part of the platform. Professional tokenizer design is the key technology to build data search. In the industrial field, due to the large amount of data and many data sources, the efficiency of data retrieval is not high, so it is necessary to study the word segmenter to improve the efficiency of data search.

[0003] Analyzer can segment the words in the data text according to specific rules. There is an abstract Analyzer class in each tokenizer. Different Analyzer subclasses determine different word segmentation rules. Therefore, different tokenizers should be used for C...


Application Information

IPC(8): G06F16/332, G06F16/33, G06F16/955, G06F40/284
CPC: G06F16/3329, G06F16/3344, G06F16/955, G06F40/284
Inventor 殷磊, 孔宪光, 杨天澍, 王宇惊
Owner XIDIAN UNIV