
A language processing method and device

A corpus processing method and device, applied in the fields of electronic digital data processing, special data processing applications, and unstructured text data retrieval. It addresses the problems of unreliable corpus entries, reduced classifier coverage, and wasted labeling work, and it aims to improve user experience and to increase corpus utilization and accuracy.

Active Publication Date: 2019-09-24
LENOVO (BEIJING) LTD

AI Technical Summary

Problems solved by technology

This leads to two problems. First, many ways of expressing an evaluation cannot be covered. Manually labeling a training corpus usually takes a great deal of time and effort, and it is generally difficult to cover all possible cases. Each labeling result is therefore valuable: if corpus entries that do not reach the threshold are simply removed, the labeling work is wasted, the coverage of the final classifier is reduced, and the final classification quality cannot be guaranteed.
Second, even when certain ways of expressing an evaluation do appear in the training corpus, the number of corpus entries corresponding to them is small. Because manual annotation is subject to chance and prone to error, the accuracy of these entries is not sufficiently reliable. If entries with low label reliability are fed into the classifier, they may degrade the classification results.



Examples


Embodiment Construction

[0019] Hereinafter, preferred embodiments of the present invention will be described in detail with reference to the accompanying drawings. Note that in this specification and the drawings, substantially the same steps and elements are denoted by the same reference numerals, and repeated explanation of these steps and elements will be omitted.

[0020] Reference throughout this specification to "one embodiment" or "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one of the described embodiments. Thus, appearances of the phrase "in one embodiment" or "in an embodiment" in the specification are not necessarily all referring to the same embodiment. Furthermore, the particular features, structures or characteristics may be combined in any suitable manner in one or more embodiments.

[0021] Figure 1 is a flow chart of a corpus processing method 100 according to an embodiment of the pres...



Abstract

A corpus processing method and device are provided. The method includes: obtaining a first corpus set to be classified; determining a second corpus set from the first corpus set, where the evaluation object of every second corpus entry in the second corpus set is the first evaluation object and the evaluation content of every second corpus entry about the first evaluation object is labeled as a positive evaluation; determining a third corpus set from the first corpus set, where the evaluation object of every third corpus entry in the third corpus set is the first evaluation object and the evaluation content of every third corpus entry about the first evaluation object is labeled as a negative evaluation; determining whether any second corpus entry in the second corpus set is synonymous with any third corpus entry in the third corpus set with respect to the evaluation content about the first evaluation object; and processing the corpus set. The corpus processing method provided by the present invention can improve the utilization, accuracy, and coverage of the classification corpus.
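The steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the `Entry` fields, the `SYNONYM_GROUPS` lookup table, and the helper names are all hypothetical, and the abstract does not say how synonymy is decided or how the corpus is processed afterwards. Here, synonymy is assumed to be a simple lookup, and "processing" is assumed to mean collecting pairs whose evaluation content matches but whose labels disagree.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Entry:
    """One labeled corpus entry (field names are illustrative assumptions)."""
    text: str     # raw review sentence
    target: str   # evaluation object, e.g. "battery"
    content: str  # evaluation content, e.g. "lasts long"
    label: str    # manual label: "positive" or "negative"


# Hypothetical synonym groups. A real system might use a thesaurus or
# embedding similarity instead; the source does not specify the mechanism.
SYNONYM_GROUPS = [
    {"lasts long", "durable", "long-lasting"},
    {"drains fast", "runs out quickly"},
]


def is_synonymous(a: str, b: str) -> bool:
    """Return True if the phrases are equal or share a synonym group."""
    if a == b:
        return True
    return any(a in group and b in group for group in SYNONYM_GROUPS)


def split_by_label(corpus, target):
    """Build the positive (second) and negative (third) sets for one target."""
    positive = [e for e in corpus if e.target == target and e.label == "positive"]
    negative = [e for e in corpus if e.target == target and e.label == "negative"]
    return positive, negative


def find_conflicts(corpus, target):
    """Return (positive, negative) pairs whose evaluation content is synonymous.

    Flagging such pairs is an assumed form of "processing the corpus set".
    """
    positive, negative = split_by_label(corpus, target)
    return [
        (p, n)
        for p, n in product(positive, negative)
        if is_synonymous(p.content, n.content)
    ]


# The first corpus set to be classified (made-up sample data).
corpus = [
    Entry("The battery lasts long.", "battery", "lasts long", "positive"),
    Entry("Battery is durable.", "battery", "durable", "negative"),
    Entry("Battery drains fast.", "battery", "drains fast", "negative"),
    Entry("Screen is durable.", "screen", "durable", "positive"),
]

conflicts = find_conflicts(corpus, "battery")
for pos, neg in conflicts:
    print(f"Label conflict: {pos.text!r} vs {neg.text!r}")
```

In this sketch, a flagged pair could be sent back for relabeling rather than discarded, which would be one way to keep scarce labeled entries in use; that choice is an assumption, not something the abstract states.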

Description

Technical field

[0001] The present invention relates to a corpus processing method and device and, more specifically, to a corpus processing method and device for sentiment classification.

Background

[0002] At present, sentiment analysis of product reviews mainly relies on classification methods to build sentiment analysis models. Most of the objects to be classified are user comments on e-commerce websites. These comments generally describe the users' own shopping experience in colloquial language, follow no specific evaluation scope or evaluation rules, and may involve any aspect of the product. Even when describing the same aspect of a product, different users phrase it differently. This makes it difficult to construct a training corpus for classification. Only when the training corpus achieves a certain coverage, representativeness, and accuracy will the trained classification model have a better classific...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F16/35, G06F17/27
Inventor: 卓雷赵凯葛安生
Owner: LENOVO (BEIJING) LTD