Domain dictionary construction method and device

A method for constructing domain words, applied in the field of domain dictionary construction methods and devices. It addresses problems that reduce the accuracy and efficiency of domain dictionary construction, in particular the difficulty of manually labeling domain words. Its stated effects are ensuring the independence of sentences, guaranteeing reliability, and saving cost.

Pending Publication Date: 2020-11-13
INDUSTRIAL AND COMMERCIAL BANK OF CHINA +1
0 Cites, 2 Cited by

AI Technical Summary

Problems solved by technology

The above method is difficult to apply in the following situations, which affects the accuracy and efficiency of building domain dictionaries:
[0005] (1) There is no suitable document-type corpus: the corpus is stored in structured form in a transactional database, so document-type corpus is lacking. (2) There is no long corpus: each corpus entry is uniquely determined by a primary key composed of "handling matter" + "handling angle", and under the limited set of "handling angles" the corresponding answers are short, non-sentential, and similar to one another. (3) It is difficult to manually label domain words: the "matters" covered by the corpus span many fields such as medical care, insurance, finance, and trade, so manually labeling domain words is difficult.

Examples

Embodiment Construction

[0040] To enable those skilled in the art to better understand the technical solutions in this specification, the technical solutions in the embodiments of the application are described clearly and completely below with reference to the drawings in the embodiments. The described embodiments are only some of the embodiments of the present application, not all of them. All other embodiments obtained by persons of ordinary skill in the art from the embodiments in this application, without creative effort, fall within the scope of protection of this application.

[0041] The corpus is short and covers a wide range of business, which does not match the requirements of current domain word construction methods. To solve this problem, the existing domain dictionary construction method is changed and combined with "new word discovery" technology to propose a domain dictionary construction method and d...

Abstract

The embodiment of the invention provides a domain dictionary construction method and device, and relates to the technical field of artificial intelligence. The method comprises the steps of: obtaining an original transaction corpus; performing character processing on the original transaction corpus to obtain a transaction corpus to be segmented; performing n-gram word segmentation on the transaction corpus to be segmented to obtain a plurality of word segments; obtaining a statistical index value for each word segment, and taking each word segment whose statistical index value is greater than a combination threshold as a filtered word segment; and segmenting each filtered word segment, judging whether each first segmentation word obtained from that segmentation is a complete word, and if not, taking the filtered word segment as a first domain word to construct a target transaction domain dictionary. With this method, domain words can be obtained from corpora that are short, stored in structured form, and unannotated. The process is efficient and accurate, which in turn supports the reliability of the transaction domain dictionary.
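The steps in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the abstract does not name the statistical index, so the sketch assumes a pointwise-mutual-information style cohesion score, and `min_count`, `general_vocab`, the regular expression used for character processing, and the forward maximum matching segmenter are all hypothetical choices made for the example.

```python
import math
import re
from collections import Counter


def clean(text):
    """Character processing (assumed): split on anything that is not a
    digit, a Latin letter, or a CJK character, and drop empty pieces."""
    return [s for s in re.split(r"[^0-9A-Za-z\u4e00-\u9fff]+", text) if s]


def count_ngrams(fragments, n_max):
    """Count every character n-gram of length 1..n_max in each fragment."""
    counts = Counter()
    for frag in fragments:
        for n in range(1, n_max + 1):
            for i in range(len(frag) - n + 1):
                counts[frag[i:i + n]] += 1
    return counts


def cohesion(gram, counts, total):
    """Assumed statistical index: the minimum PMI over all two-way splits."""
    p = counts[gram] / total
    return min(
        math.log(p / ((counts[gram[:k]] / total) * (counts[gram[k:]] / total)))
        for k in range(1, len(gram))
    )


def segment(word, vocab):
    """Forward maximum matching against a general vocabulary (assumed segmenter).
    Characters that match no vocabulary entry become single-character pieces."""
    max_len = max((len(v) for v in vocab), default=1)
    pieces, i = [], 0
    while i < len(word):
        for size in range(min(max_len, len(word) - i), 0, -1):
            if size == 1 or word[i:i + size] in vocab:
                pieces.append(word[i:i + size])
                i += size
                break
    return pieces


def build_domain_words(corpus, general_vocab, n_max=3, threshold=1.0, min_count=2):
    """Return candidates that pass the combination threshold and do not
    decompose entirely into complete general-vocabulary words."""
    fragments = [frag for text in corpus for frag in clean(text)]
    counts = count_ngrams(fragments, n_max)
    total = sum(len(frag) for frag in fragments)
    domain_words = set()
    for gram, c in counts.items():
        if len(gram) < 2 or c < min_count:
            continue
        if cohesion(gram, counts, total) <= threshold:
            continue
        pieces = segment(gram, general_vocab)
        if not all(piece in general_vocab for piece in pieces):
            domain_words.add(gram)
    return domain_words


corpus = ["公积金提取", "公积金贷款", "提取公积金", "办理公积金贷款"]
words = build_domain_words(corpus, {"提取", "贷款", "办理"})
```

In this toy run, "公积金" is kept because none of its pieces is a known general word, while "提取" and "贷款" are dropped because they are already complete words. The simplified filter also lets through sub-fragments such as "积金", so a real system would need additional filtering, which the text visible here does not describe.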

Description

Technical field

[0001] The present application relates to the technical field of artificial intelligence, in particular to a method and device for constructing a domain dictionary.

Background technique

[0002] A domain dictionary refers to a combination of terms or expressions unique to a specific domain. Traditional domain dictionary construction methods are basically based on rules and statistics. The general method is to combine the grammatical rules with the characteristics of sentence patterns and parts of speech, and then cooperate with TF-IDF (term frequency–inverse document frequency) statistical values to conduct screening, and then conduct manual re-inspection on the basis of screening words. The disadvantage of this method is that the characteristics of sentence patterns and parts of speech in different fields are different, and the reusability is not good. In addition, there are certain requirements for the length of the corpus. Generally speaking, it is more ...
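For reference, the TF-IDF statistic mentioned in the background can be computed as in the sketch below. It uses one common textbook variant (relative term frequency multiplied by the natural log of the inverse document frequency); the source does not say which variant the traditional methods use, and the function name `tf_idf` and the sample documents are hypothetical.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: score} dict per document.

    docs is a list of token lists. The score is
    (count of term in doc / doc length) * ln(number of docs / docs containing term).
    """
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        tf = Counter(doc)
        scores.append(
            {t: (c / len(doc)) * math.log(n_docs / df[t]) for t, c in tf.items()}
        )
    return scores


docs = [["loan", "rate", "loan"], ["loan", "fee"], ["loan", "insurance"]]
scores = tf_idf(docs)
```

A term that appears in every document, such as "loan" here, scores zero, while a term confined to one document scores above zero. With very short entries each document holds only a few tokens, so the scores rest on very little evidence, which is consistent with the background's remark that the approach places requirements on corpus length.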

Application Information

Patent Type & Authority: Applications (China)
IPC (8): G06F40/242, G06F40/289, G06F40/216
CPC: G06F40/242, G06F40/289, G06F40/216
Inventor: 张文慧, 范晓东, 李羊, 唐伟佳
Owner: INDUSTRIAL AND COMMERCIAL BANK OF CHINA