
New word discovery method based on model

A model-based new word discovery technique applied in the auditing field. It addresses the difficulty of having a computer discover extraction rules automatically and of finding regularities in new words, which improves lexicon construction efficiency and reduces manual workload.

Pending Publication Date: 2021-10-22
FUJIAN YIRONG INFORMATION TECH +2

AI Technical Summary

Problems solved by technology

[0003] Audit informatization requires extracting domain words. One approach is rule-based extraction, which establishes rules from the internal structure of words and their external context and then extracts domain words by pattern matching. Most of these rules are written by hand, and it is difficult for a computer to discover them automatically. This is especially true of today's Internet buzzwords, whose irregular forms make any regularity even harder to find. Rule-based extraction also performs poorly on isolated and low-frequency domain words, so a high-quality model-based new word discovery method is urgently needed.
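To make the limitation concrete, the following is a minimal sketch of rule-based extraction by pattern matching. The rule (one modifier word followed by a head word), the head-word list, and the function name `extract_by_rule` are all hypothetical and chosen only for illustration; they are not taken from the patent.

```python
import re

# Hypothetical hand-written rule: one modifier word followed by a head word
# that is typical of audit terminology. The head-word list is an assumption.
HEAD_WORDS = ("audit", "voucher", "ledger")
RULE = re.compile(r"\b([a-z]+ (?:" + "|".join(HEAD_WORDS) + r"))\b")


def extract_by_rule(text):
    """Return the sorted, de-duplicated phrases that match the manual rule."""
    return sorted({m.group(1) for m in RULE.finditer(text.lower())})


sample = (
    "The internal audit reviewed each payment voucher. "
    "Shadow invoicing was not flagged."
)
terms = extract_by_rule(sample)
print(terms)
```

A term such as "shadow invoicing" fits no hand-written pattern here and is therefore missed, which is the kind of gap the paragraph above describes.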


Examples


Embodiment Construction

[0036] The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by persons of ordinary skill in the art from these embodiments without creative effort fall within the protection scope of the present invention.

[0037] The present invention provides the following technical solution. Referring to Figure 1, a model-based new word discovery method includes the following steps:

[0038] (1) Obtain documents related to audit business.

[0039] (2) Convert the format of the audit business-related documents, and use the converted document data as input for the subsequent labeling tool.

[0040] (3) Complete the labeling of audit...
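Steps (1) and (2) can be sketched as follows. The source does not name the labeling tool or its input format, so this sketch assumes a JSON Lines layout with hypothetical `id` and `text` fields and a simple sentence split; the function names are likewise illustrative.

```python
import json
import re


def to_labeling_records(documents):
    """Convert raw documents (name -> text) into sentence-level records.

    The {"id", "text"} record layout is an assumption for illustration,
    not a format specified by the patent.
    """
    records = []
    for name, raw in sorted(documents.items()):
        # Collapse whitespace left over from the original file format.
        text = re.sub(r"\s+", " ", raw).strip()
        # Split after Western or Chinese sentence-ending punctuation.
        parts = re.split(r"(?<=[.!?。！？])\s*", text)
        sentences = [p.strip() for p in parts if p.strip()]
        for i, sentence in enumerate(sentences):
            records.append({"id": f"{name}#{i}", "text": sentence})
    return records


def to_jsonl(records):
    """Serialize records as JSON Lines, one record per line."""
    return "\n".join(json.dumps(r, ensure_ascii=False) for r in records)


docs = {"report.txt": "Audit  scope was\nreviewed. Findings follow."}
records = to_labeling_records(docs)
print(to_jsonl(records))
```

Splitting at sentence level keeps each labeling unit short; a real pipeline would also need readers for whatever file types the audit documents actually use.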



Abstract

The invention discloses a model-based new word discovery method, which belongs to the technical field of auditing and comprises the following steps: S1, acquiring audit business-related documents; and S2, performing format conversion on the audit business-related documents and using the converted document data as input data for a subsequent labeling tool. Working from audit data, the method uses new word discovery to continuously find audit-domain words. These are combined with a curated existing industry lexicon to form a preliminary professional lexicon for the audit field, which professionals then review to produce the final lexicon, providing effective support for subsequent audit data analysis. Because the text first undergoes preliminary "new word" discovery and the discovered candidates are then checked manually to extract the genuine professional vocabulary, the workload of extracting audit vocabulary from documents purely by hand is greatly reduced and the construction efficiency of the audit professional lexicon is improved.
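The lexicon-building workflow in the abstract (candidate words, comparison with an existing industry lexicon, manual review, final lexicon) might look like the sketch below. The discovery model itself is not shown, since the source does not describe it here; the function names and the case-insensitive comparison are assumptions made for illustration.

```python
def build_review_queue(candidates, existing_lexicon):
    """Keep candidates that are not already in the existing lexicon.

    Comparison is case-insensitive and duplicates are dropped; both
    choices are assumptions, not details from the patent.
    """
    known = {w.strip().lower() for w in existing_lexicon}
    seen = set()
    queue = []
    for word in candidates:
        key = word.strip().lower()
        if key and key not in known and key not in seen:
            seen.add(key)
            queue.append(word.strip())
    return queue


def finalize_lexicon(existing_lexicon, review_queue, approved):
    """Merge the existing lexicon with candidates a reviewer approved."""
    accepted = {w for w in review_queue if w in approved}
    return sorted(set(existing_lexicon) | accepted)


existing = ["internal audit"]
candidates = [
    "internal audit",
    "Shadow invoicing",
    "shadow invoicing",
    "payment voucher",
]
queue = build_review_queue(candidates, existing)
final = finalize_lexicon(existing, queue, approved={"payment voucher"})
print(queue)
print(final)
```

Only the words in `queue` need a professional's attention, which is where the reduction in manual workload would come from.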

Description

Technical field

[0001] The invention relates to the technical field of auditing, in particular to a model-based new word discovery method.

Background

[0002] In recent years, the continuous application and development of information technologies such as big data, artificial intelligence, cloud computing, the Internet of Things, and mobile applications has gradually changed people's lives and work, bringing both opportunities and challenges to audit supervision. Internal audit work is facing a profound reform driven by audit informatization.

[0003] Audit informatization requires extracting domain words. One approach is rule-based extraction, which establishes rules from the internal structure of words and their external context and then extracts domain words by pattern matching. Most of these rules are written by hand, and it is difficult for a computer to discover them automatically...


Application Information

IPC(8): G06F40/284, G06F16/903, G06F40/151, G06F16/35, G06K9/62
CPC: G06F40/284, G06F16/90344, G06F40/151, G06F16/353, G06F18/217
Inventor 卢伟龙, 王小龙, 王燕蓉, 鲍琳子
Owner FUJIAN YIRONG INFORMATION TECH