
BOM text word segmentation method and device, equipment and storage medium

A word segmentation method for BOM text, applied in the fields of instruments, digital data processing, and computing. It addresses the problem of inaccurate word segmentation in BOM files and achieves fast word segmentation.

Active Publication Date: 2022-01-28
ALLCHIPS LTD

AI Technical Summary

Problems solved by technology

[0010] The main purpose of the present invention is to solve the technical problem of inaccurate word segmentation of BOM files in existing approaches.



Examples


Detailed Description of Embodiments

[0058] Embodiments of the present invention provide a word segmentation method, device, equipment and storage medium for BOM text.

[0059] The terms "first", "second", "third", "fourth", etc. (if any) in the description and claims of the present invention and in the above drawings are used to distinguish similar objects, and are not necessarily used to describe a specific order or sequence. It is to be understood that the terms so used are interchangeable under appropriate circumstances, such that the embodiments described herein can be practiced in sequences other than those illustrated or described herein. Furthermore, the terms "comprising" and "having", and any variations thereof, are intended to cover a non-exclusive inclusion: for example, a process, method, system, product or device comprising a sequence of steps or elements is not necessarily limited to those explicitly listed, but may instead include other steps or elements not explicitly listed or inherent to the process, metho...


Abstract

The invention relates to the field of text word segmentation and discloses a BOM text word segmentation method, device, equipment and storage medium. The method comprises: obtaining the BOM text data to be segmented, and splitting it into Chinese and English parts to obtain a set of segmented texts; reading a segmented text from the set; and judging whether the segmented text is Chinese text. If it is Chinese text, word segmentation is performed on it with a preset jieba function to obtain a segmented word set, which is taken as the segmented word data. If it is not Chinese text, the segmented text is screened and split according to a preset English and digit check and screening algorithm to obtain segmented word data for the English letters and digits. Finally, all the segmented word data are combined into a segmented word data set, which is taken as the word segmentation result of the BOM text data.
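The flow described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function names, the use of the CJK Unified Ideographs range to detect Chinese text, and the regular expression standing in for the "English and digit check and screening algorithm" are all hypothetical. The Chinese segmenter is passed in as a parameter so the sketch runs without third-party packages; the default simply splits into single characters.

```python
import re
from typing import Callable, List

# Assumption: a run of CJK Unified Ideographs counts as a "Chinese" segment.
_CHINESE_RUN = re.compile(r"[\u4e00-\u9fff]+")
# Hypothetical stand-in for the English/digit screening step: keep runs of
# ASCII letters and digits plus a few characters common in part numbers.
_ALNUM_TOKEN = re.compile(r"[A-Za-z0-9][A-Za-z0-9.%/+\-]*")


def split_chinese_english(text: str) -> List[str]:
    """Split text into alternating Chinese and non-Chinese segments."""
    segments = []
    pos = 0
    for match in _CHINESE_RUN.finditer(text):
        if match.start() > pos:
            segments.append(text[pos:match.start()])
        segments.append(match.group())
        pos = match.end()
    if pos < len(text):
        segments.append(text[pos:])
    # Drop segments that contain only whitespace.
    return [s for s in segments if s.strip()]


def is_chinese(segment: str) -> bool:
    """Return True if the whole segment consists of Chinese characters."""
    return _CHINESE_RUN.fullmatch(segment) is not None


def segment_bom_text(
    text: str,
    segment_chinese: Callable[[str], List[str]] = list,
) -> List[str]:
    """Segment one line of BOM text into a flat list of tokens.

    segment_chinese is the Chinese word segmenter; the default (list)
    is only a placeholder that splits into single characters.
    """
    tokens: List[str] = []
    for segment in split_chinese_english(text):
        if is_chinese(segment):
            tokens.extend(segment_chinese(segment))
        else:
            tokens.extend(_ALNUM_TOKEN.findall(segment))
    return tokens
```

If the jieba package is installed, its `jieba.lcut` function can be passed as `segment_chinese`, for example `segment_bom_text(line, segment_chinese=jieba.lcut)`. Keeping the segmenter as a parameter also makes it easy to substitute a domain-specific dictionary for component names.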

Description

Technical field

[0001] The present invention relates to the field of text word segmentation, and in particular to a method, device, equipment and storage medium for word segmentation of BOM text.

Background

[0002] A BOM file is a semi-structured text file in which the user writes the parameter information of the hardware to be purchased, including model, brand, precision, etc.

[0003] Natural Language Processing (NLP) is an important direction in the field of artificial intelligence. It mainly studies theories and methods for effective communication between humans and computers using natural language. The underlying tasks of natural language processing can be roughly divided, from easy to difficult, into lexical analysis, syntactic analysis and semantic analysis. Word segmentation is the most basic task in lexical analysis (which also includes part-of-speech tagging and named entity recognition), and it is also an essentia...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F40/289, G06F40/284, G06F40/242
CPC: G06F40/289, G06F40/284, G06F40/242
Inventor: 杜飞高宇鹏刘武刘松山王园园王安李六七
Owner: ALLCHIPS LTD