Domain dictionary generation system for natural language processing

A domain dictionary generation system for natural language processing, applied in the fields of electronic digital data processing, special data processing applications, instruments, etc. It addresses the problems that existing domain dictionaries cannot support the analysis of massive texts, lack pertinence, and lack applicability, and it provides a reliable, targeted, and highly accurate tool for automatic dictionary generation.

Status: Inactive. Publication Date: 2017-06-06
成都数联铭品科技有限公司
5 Cites, 1 Cited by
AI Technical Summary

Problems solved by technology

[0004] However, existing domain dictionaries lack applicability to specific problems and are not well targeted.
When analyzing a specific field or topic, large, broad domain dictionaries cannot achieve ideal analysis results, so targeted domain dictionaries need to be built. However, manual construction of dictionaries is very time-consuming...

Examples

Embodiment Construction

[0036] The present invention is described in further detail below with reference to test examples and specific embodiments. This should not be understood as limiting the scope of the above subject matter of the present invention to the following embodiments; all technologies realized based on the content of the present invention fall within its scope.

[0037] A domain dictionary generation system for natural language processing is provided. On the basis of automatically distinguishing text topic domains, the system automatically constructs the corresponding domain dictionaries from seed words. As shown in Figure 1, it includes a text preprocessing system and a dictionary construction system. The text preprocessing system processes the text to be processed, including word segmentation, removal of high-frequency words, and removal of stop words; the dictionary is automatically expanded to construct a correspon...
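The preprocessing steps named in [0037] (word segmentation, removal of high-frequency words, removal of stop words) can be illustrated with a minimal Python sketch. Several things here are assumptions and not taken from the patent: the function names `tokenize` and `preprocess`, the tiny `STOP_WORDS` list, whitespace splitting as a stand-in for a real word segmenter (Chinese text would need a dedicated one), and the definition of "high-frequency" as a word appearing in more than `max_doc_ratio` of the texts.

```python
from collections import Counter

# Hypothetical stop-word list; a real system would load a curated one.
STOP_WORDS = {"the", "a", "of", "and", "is", "in", "to"}

def tokenize(text):
    """Whitespace segmentation stand-in (assumption); strips basic punctuation."""
    tokens = (tok.strip(".,;:!?").lower() for tok in text.split())
    return [tok for tok in tokens if tok]

def preprocess(texts, max_doc_ratio=0.8):
    """Segment each text, then drop stop words and high-frequency words.

    A word counts as high-frequency (assumed definition) when it appears
    in more than max_doc_ratio of the texts.
    """
    segmented = [tokenize(t) for t in texts]
    doc_freq = Counter()
    for tokens in segmented:
        doc_freq.update(set(tokens))  # count each word once per text
    n = len(segmented)
    high_freq = {w for w, df in doc_freq.items() if df / n > max_doc_ratio}
    return [
        [w for w in tokens if w not in STOP_WORDS and w not in high_freq]
        for tokens in segmented
    ]

texts = [
    "The bank raised the loan rate.",
    "The bank cut the deposit rate.",
    "A new bank opened in the city.",
]
cleaned = preprocess(texts)
```

In this toy input, "bank" occurs in every text and is therefore dropped as high-frequency, while "rate" (two of three texts) stays under the assumed threshold.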

Abstract

The invention relates to the field of natural language processing, in particular to a domain dictionary generation system for natural language processing. On the basis of automatically distinguishing text topic domains, the system automatically constructs the dictionary of the corresponding domain from seed words. A user inputs the texts to be processed and the domain seed words into the system. On the basis of automatically acquired text keywords, the texts to be processed are clustered, and the degree of relationship between each clustered topical text set and the selected domain seed words is analyzed and judged. The dictionary is then automatically expanded, in combination with algorithms, within the topical text sets that are closely related to the seed words. In this way, the system achieves automatic expansion of domain dictionaries from a small number of seed words, with high construction efficiency, high accuracy, and strong domain pertinence. It provides a powerful tool for text analysis and natural language processing.
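The flow described in the abstract (keywords, clustering, relatedness to seed words, expansion within closely related text sets) can be sketched in Python as follows. The abstract does not name the algorithms, so every choice here is an illustrative assumption: frequency-based keywords, greedy single-pass clustering on shared keywords, relatedness measured as the share of texts containing a seed word, and expansion by co-occurrence counts. All function names and thresholds are hypothetical.

```python
from collections import Counter

def keywords(tokens, k=3):
    """Top-k tokens by frequency, ties broken alphabetically (assumed scoring)."""
    counts = Counter(tokens)
    return set(sorted(counts, key=lambda w: (-counts[w], w))[:k])

def cluster(docs, k=3, min_overlap=1):
    """Greedy single-pass clustering (assumption): a text joins the first
    cluster sharing at least min_overlap keywords, else starts a new one."""
    clusters = []  # each cluster: {"keywords": set, "docs": list of token lists}
    for tokens in docs:
        kws = keywords(tokens, k)
        for c in clusters:
            if len(c["keywords"] & kws) >= min_overlap:
                c["docs"].append(tokens)
                c["keywords"] |= kws
                break
        else:
            clusters.append({"keywords": set(kws), "docs": [tokens]})
    return clusters

def relatedness(c, seeds):
    """Fraction of the cluster's texts containing at least one seed word."""
    hits = sum(1 for tokens in c["docs"] if seeds & set(tokens))
    return hits / len(c["docs"])

def expand(docs, seeds, min_related=0.5, min_cooccur=2):
    """Return the seeds plus words that co-occur with a seed in at least
    min_cooccur texts of clusters whose relatedness is >= min_related."""
    seeds = set(seeds)
    cooccur = Counter()
    for c in cluster(docs):
        if relatedness(c, seeds) < min_related:
            continue  # skip topical text sets not closely related to the seeds
        for tokens in c["docs"]:
            if seeds & set(tokens):
                cooccur.update(set(tokens) - seeds)
    return seeds | {w for w, n in cooccur.items() if n >= min_cooccur}

# Pre-tokenized toy texts: two about finance, two about sports.
docs = [
    ["loan", "rate", "credit"],
    ["loan", "credit", "deposit"],
    ["match", "goal", "team"],
    ["team", "goal", "coach"],
]
dictionary = expand(docs, {"loan"})
```

With the seed "loan", only the finance cluster passes the relatedness check, and "credit" is added because it co-occurs with the seed in both of that cluster's texts. Restricting expansion to closely related clusters is what keeps sports vocabulary out of the resulting dictionary.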

Description

Technical field

[0001] The invention relates to the field of natural language processing, in particular to a domain dictionary generation system for natural language processing.

Background technique

[0002] The advent of the era of big data has created new opportunities for the world. The value of big data is realized through its analysis and utilization, and natural language processing occupies an important position in big data analysis. Facing massive network text resources, natural language processing methods can automatically and intelligently extract useful information, or judge an emotional tendency contained in the text or held by its publisher, which has important practical significance in both public opinion analysis and business investigation. Using the analysis results, we can make correct predictions about the development and evolution of things or about user preferences, and then take corresponding measures in advance to achieve greater positiv...

Application Information

IPC(8): G06F17/30
CPC: G06F16/35, G06F16/374
Inventor: 张晓霞, 刘世林
Owner: 成都数联铭品科技有限公司