Construction method of domain dictionary

A domain dictionary construction method applied in the field of natural language processing. It addresses the lack of domain pertinence in existing dictionaries, the time and labor cost of manual construction, and the resulting poor analysis results, and it offers high construction efficiency, high accuracy, and strong practicability.

Status: Inactive; Publication Date: 2017-05-10
成都数联铭品科技有限公司
2 Cites, 9 Cited by

AI Technical Summary

Problems solved by technology

[0004] Sentiment analysis based on a sentiment dictionary is a targeted form of analysis and mining, and the dictionaries used differ greatly from one field to another. Existing domain dictionaries are not well suited to specific problems and are poorly targeted.
When analyzing a specific field or topic, existing large, broad domain dictionaries do not give good results. Targeted domain dictionaries need to be built, but constructing them manually is very time-consuming and labor-intensive and cannot meet the needs of massive text analysis.

Examples

Embodiment Construction

[0037] The present invention is described in further detail below with reference to test examples and specific embodiments. This should not be understood as limiting the scope of the above subject matter to the following embodiments; all techniques realized on the basis of the content of the present invention fall within its scope.

[0038] A domain dictionary construction method is provided. After text keywords are obtained automatically, the texts to be processed are clustered into different topic text sets, and a number of seed words are selected by manual inspection from the domain text set for which the dictionary is to be constructed. The relationship between each clustered topic text set and the selected domain seed words is then analyzed, and only the closely related topic text sets are retained for domain dictionary expansion. On this basis, the algorithm is combined to automatically expand the domain dict...
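The first stages described in this paragraph (keyword extraction, clustering into topic text sets, and keeping only the sets related to the seed words) can be illustrated with a minimal sketch. The source does not name the keyword extraction or clustering algorithms, so everything here is an assumption made for illustration: term frequency stands in for keyword extraction, a greedy single-pass grouping by Jaccard similarity stands in for clustering, and the names `keywords`, `cluster`, `select_clusters`, `threshold`, and `min_overlap` are hypothetical.

```python
from collections import Counter

# Assumption: a tiny English stopword list; the source does not specify one.
STOPWORDS = {"the", "a", "is", "of", "and", "to", "in"}


def keywords(text, top_n=5):
    """Return up to top_n frequent non-stopword tokens (stand-in for keyword extraction)."""
    tokens = [t for t in text.lower().split() if t not in STOPWORDS]
    return {word for word, _ in Counter(tokens).most_common(top_n)}


def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster(texts, threshold=0.2):
    """Greedy single-pass clustering: a text joins the first topic set whose
    accumulated keywords are similar enough, otherwise it starts a new set."""
    clusters = []  # each entry: {"keywords": set, "texts": list}
    for text in texts:
        kws = keywords(text)
        for c in clusters:
            if jaccard(kws, c["keywords"]) >= threshold:
                c["texts"].append(text)
                c["keywords"] |= kws
                break
        else:
            clusters.append({"keywords": set(kws), "texts": [text]})
    return clusters


def select_clusters(clusters, seeds, min_overlap=1):
    """Keep only topic sets whose keywords share at least min_overlap seed words."""
    return [c for c in clusters if len(c["keywords"] & seeds) >= min_overlap]


if __name__ == "__main__":
    docs = [
        "loan default risk rises",
        "bank loan risk report",
        "football match score tonight",
    ]
    kept = select_clusters(cluster(docs), {"loan", "default"})
    for c in kept:
        print(c["texts"])
```

In practice the overlap test in `select_clusters` would likely be replaced by whatever relationship measure the method actually uses; the sketch only shows where that filtering step sits in the pipeline.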

Abstract

The invention relates to the field of natural language processing, and in particular to a method for constructing a domain dictionary. The method comprises the following steps: after text keywords are obtained automatically, clustering the texts to be processed into different topic text sets; selecting a number of seed words by manual inspection from the domain text set for which the dictionary is to be constructed; on this basis, analyzing how closely each clustered topic text set is related to the selected domain seed words, and retaining only the closely related topic text sets for expanding the domain dictionary; and, within the related domain, automatically expanding the domain dictionary with an algorithm suited to that domain to obtain the corresponding dictionary. With this method, once the domains of the text topics have been distinguished automatically, the domain dictionary can be expanded automatically from a small number of seed words. Dictionary construction is comparatively efficient, accuracy is high, and the result is strongly targeted to the domain. The method has broad application prospects in text analysis and natural language processing.
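The final expansion step can be sketched as follows. The abstract does not say which algorithm performs the expansion, so this example assumes a simple co-occurrence rule: a word is added when it appears alongside a seed word in at least `min_count` of the retained texts. The function name `expand_dictionary` and its parameters are hypothetical.

```python
from collections import Counter


def expand_dictionary(texts, seeds, min_count=2, stopwords=frozenset()):
    """Return the seed set plus words that co-occur with a seed word
    in at least min_count texts (an assumed expansion rule)."""
    cooccurrence = Counter()
    for text in texts:
        tokens = set(text.lower().split())
        if tokens & seeds:
            # Count each candidate once per text that also contains a seed word.
            cooccurrence.update(tokens - seeds - stopwords)
    new_words = {word for word, n in cooccurrence.items() if n >= min_count}
    return seeds | new_words


if __name__ == "__main__":
    retained = [
        "loan default risk rises",
        "bank loan risk report",
        "default risk warning",
    ]
    print(sorted(expand_dictionary(retained, {"loan", "default"})))
```

Raising `min_count` makes the expansion more conservative, which matters because every accepted word can pull in further words if the step is repeated.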

Description

Technical field

[0001] The invention relates to the field of natural language processing, and in particular to a method for constructing a domain dictionary.

Background

[0002] With the rapid development of the Internet, a large amount of public web data has been generated, which has spurred various emerging industries based on big data technology, such as Internet medical care, Internet education, and corporate or personal credit investigation. The rise and prosperity of these Internet industries are inseparable from the analysis of large amounts of data. Natural language processing plays an important role in big data analysis. Faced with massive online text resources, natural language processing methods can be used to judge automatically and intelligently the emotional tendency of a text or of its publisher. This has vital practical significance in both public opinion analysis and business inv...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30, G06F17/27
CPC: G06F16/35, G06F16/374, G06F40/247
Inventor: 张晓霞, 刘世林
Owner: 成都数联铭品科技有限公司