Method and device for identifying text orientation

A text orientation technology, applied in the field of text orientation determination methods and devices. It addresses problems such as differing expression methods, inaccurate orientation determination results, and sentiment word extraction being limited by the accuracy and completeness of sentiment dictionaries, with the effect of improving accuracy.

Active Publication Date: 2015-04-29
RUN TECH CO LTD BEIJING

AI Technical Summary

Problems solved by technology

[0004] The above semantic-based text orientation analysis method has the following defects: the extraction of sentiment words is limited by the accuracy and completeness of the sentiment dictionary.
[0006] When the above-mentioned method of analyzing text orientation based on machine learning mode



Examples


Embodiment 1

[0025] Figure 1 is a flow chart of a method for determining text orientation provided by Embodiment 1 of the present invention. The method can be executed by a device for determining text orientation implemented in hardware and/or software; the device is typically configured in a server capable of providing orientation determination services.

[0026] The method includes Steps 110 to 130.

[0027] Step 110: based on the pre-established industry characteristic word dictionary, search the text to be analyzed, sentence by sentence, for sentences containing at least one industry characteristic word.

[0028] The industry can be any existing industry, such as automotive, sports, finance, or entertainment. Because an industry's development trends differ across time periods, the industry characteristic words reflecting those trends change dynamically over time, and th...

Embodiment 2

[0044] Building on the above embodiments, this embodiment provides a preferred implementation of the operation of searching the text to be analyzed, sentence by sentence, for sentences containing at least one industry characteristic word based on the pre-established industry characteristic word dictionary. Specifically, it includes:

[0045] Perform sentence segmentation and word segmentation on the text to be analyzed;

[0046] For each clause, match the segmented words it contains against the pre-established industry characteristic word dictionary to find the sentences in the text to be analyzed that contain at least one industry characteristic word.

[0047] Sentence segmentation of the text to be analyzed yields each clause it contains, and word segmentation of each clause yields the segmented words that clause contains; T...
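The segmentation and matching operations of paragraphs [0045] and [0046] can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the dictionary contents, the function names, and the punctuation-based sentence splitter are all assumptions, and the regex tokenizer is only a stand-in for real word segmentation (Chinese text would need a dedicated word segmenter).

```python
import re

# Hypothetical industry characteristic word dictionary (automotive example).
INDUSTRY_WORDS = {"engine", "sedan", "horsepower", "mileage"}

def split_sentences(text):
    """Split text into clauses on common sentence-ending punctuation."""
    parts = re.split(r"[.!?;。！？；]+", text)
    return [p.strip() for p in parts if p.strip()]

def tokenize(sentence):
    """Stand-in word segmentation: lowercase alphanumeric tokens.
    Chinese text would need a real word segmenter here."""
    return re.findall(r"[a-z0-9]+", sentence.lower())

def find_industry_sentences(text, dictionary):
    """Return (sentence, matched_words) for each sentence that
    contains at least one word from the dictionary."""
    hits = []
    for sentence in split_sentences(text):
        matched = sorted(set(tokenize(sentence)) & dictionary)
        if matched:
            hits.append((sentence, matched))
    return hits
```

Holding the dictionary as a set keeps each lookup cheap, so the cost per clause is dominated by segmentation rather than matching.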

Embodiment 3

[0064] Figure 2 is a flow chart of a method for determining text orientation provided by Embodiment 3 of the present invention. It specifically includes Steps 210 to 260.

[0065] Step 210: based on the pre-established industry characteristic word dictionary, search the text to be analyzed, sentence by sentence, for sentences containing at least one industry characteristic word.

[0066] The sentence segmentation, word segmentation, and dictionary matching operations described in Embodiment 2 also apply to this step and are not described again.

[0067] The procedure for establishing the industry characteristic word dictionary described in Embodiment 2 also applies to this step and is not repeated here.

[0068] Step 220: According to the pre-trai...


Abstract

The embodiment of the invention provides a method and a device for identifying text orientation. The method comprises the following steps: based on a pre-built industry characteristic word dictionary, searching the text to be analyzed, sentence by sentence, for sentences that include at least one industry characteristic word; determining the orientation of each such sentence according to a pre-trained text classification model; and, based on a preset text orientation determination strategy, determining the orientation of the text to be analyzed from the orientations of those sentences. With the method and device, the industry characteristic words in the dictionary are used to screen for the text that describes the evaluated objects and/or expresses evaluative sentiment about them. Because interference from text that describes, or expresses sentiment about, objects unrelated to the evaluated objects is eliminated, the accuracy of orientation analysis for the objects evaluated in the text to be analyzed is improved.
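The three steps in the abstract (filter, classify per sentence, aggregate) can be sketched as below. Everything specific here is an assumption made for illustration: the source does not state what the preset determination strategy is, so a majority vote is swapped in, and `toy_classifier` is a keyword stand-in rather than a trained text classification model.

```python
from collections import Counter

def determine_text_orientation(sentences, dictionary, classify, strategy):
    """Keep sentences with a dictionary word, classify each, then aggregate."""
    relevant = [s for s in sentences
                if any(w in dictionary for w in s.lower().split())]
    labels = [classify(s) for s in relevant]
    return strategy(labels)

def majority_vote(labels):
    """Assumed strategy: most frequent label wins;
    ties and empty input give 'neutral'."""
    if not labels:
        return "neutral"
    counts = Counter(labels).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return "neutral"
    return counts[0][0]

def toy_classifier(sentence):
    """Stand-in for the pre-trained classification model (not a real model)."""
    words = set(sentence.lower().split())
    if words & {"great", "smooth", "reliable"}:
        return "positive"
    if words & {"poor", "noisy", "unreliable"}:
        return "negative"
    return "neutral"
```

In a text such as "the engine is smooth / the sedan is reliable / the food was poor / mileage is poor" with an automotive dictionary, the sentence about food is dropped before classification, so it cannot pull the overall result away from the evaluated object.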

Description

Technical field

[0001] The embodiments of the present invention relate to the technical field of data analysis, and in particular to a method and device for determining text orientation.

Background

[0002] There are currently two main methods of text orientation analysis: one based on semantics, and the other based on machine learning models.

[0003] The semantic-based method generally pre-establishes an orientation semantic pattern library or sentiment dictionary, extracts from the text to be analyzed the adjectives or phrases that reflect subjective coloring (that is, the sentiment words), judges the extracted sentiment words one by one and assigns each an orientation value, and finally sums all the orientation values to obtain the orientation of the text to be analyzed.

[0004] The above semantic-based text orientation analy...
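The semantic-based approach described in paragraph [0003] can be illustrated with a small sketch. The dictionary entries, their orientation values, and the function name are hypothetical, chosen only to show the sum-of-values idea.

```python
# Hypothetical sentiment dictionary: word -> orientation value.
SENTIMENT_VALUES = {"excellent": 2, "good": 1, "bad": -1, "terrible": -2}

def lexicon_orientation(text, lexicon):
    """Sum the orientation values of the sentiment words found in the text."""
    tokens = text.lower().split()
    # Strip trailing punctuation so "good," still matches "good".
    score = sum(lexicon.get(tok.strip(".,!?;:"), 0) for tok in tokens)
    if score > 0:
        return score, "positive"
    if score < 0:
        return score, "negative"
    return score, "neutral"
```

A word missing from the dictionary contributes nothing to the sum, which is one way the dictionary's completeness limits this approach.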


Application Information

IPC(8): G06F17/27
Inventor: 鲁平
Owner: RUN TECH CO LTD BEIJING