Method and device for processing text information
A text information processing technology, applied in the computer field, that can solve problems such as the difficulty of finding relevant information when content varies widely
Examples
Embodiment 1
[0049] Figure 1 is the flow chart of the method of this embodiment. As shown in Figure 1, the method includes:
[0050] S1: Obtain text information, and filter the text information.
[0051] Obtaining text information includes obtaining it from user-generated content, preferably from news channels, Weibo (microblog) channels and forum channels, using the text content in these channels as the text information. The news channels include Sina, Netease, Sohu, Tencent and "Today's Headlines"; the Weibo channels include Sina Weibo; the forum channels include Tianya, Baidu Tieba and Zhihu. For a news channel, the title text of the news item is used as the text information; for a forum channel, the text content of the post is used; for a Weibo channel, the text content of the Weibo post is used. New text information can be obtained very well through the text information obtained by...
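The per-channel rule in [0051] can be sketched in Python as follows. This is a minimal illustration, not the patent's implementation: the `RawItem` type, its field names, and the channel labels `"news"`, `"weibo"` and `"forum"` are all hypothetical, and the filtering step of S1 is reduced here to dropping empty text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RawItem:
    """Hypothetical container for one crawled item."""
    channel: str      # assumed labels: "news", "weibo" or "forum"
    title: str = ""
    body: str = ""


def extract_text(item: RawItem) -> Optional[str]:
    """Pick the field that [0051] names as the text information."""
    if item.channel == "news":
        text = item.title          # news: title text only
    elif item.channel in ("weibo", "forum"):
        text = item.body           # Weibo and forum posts: post content
    else:
        return None                # unknown channel: skip the item
    text = text.strip()
    return text or None            # treat empty text as "nothing to keep"


if __name__ == "__main__":
    print(extract_text(RawItem("news", title="Quake hits city", body="Full article")))
```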
Embodiment 2
[0096] In Embodiment 1, the similarity between pieces of text information is calculated as the similarity between the text vectors I corresponding to them. However, because the text vector I contains both text features and node features, it often has hundreds of thousands of dimensions, so each similarity calculation requires hundreds of thousands of multiplications and additions, and the computational load is very large. In addition, storing such high-dimensional text vectors I takes up a considerable amount of space.
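To make the cost concrete, here is a small Python sketch of a vector similarity. It assumes cosine similarity and a sparse `{dimension index: value}` representation; neither choice is confirmed by the text shown here, which only says that the calculation consists of multiplications and additions over the vector's dimensions.

```python
import math
from typing import Dict

SparseVector = Dict[int, float]  # assumed layout: dimension index -> value


def cosine_similarity(a: SparseVector, b: SparseVector) -> float:
    """Cosine similarity of two sparse vectors; 0.0 if either is all zero."""
    if len(a) > len(b):
        a, b = b, a  # iterate over the shorter vector
    dot = sum(value * b.get(index, 0.0) for index, value in a.items())
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    u = {0: 1.0, 250_000: 2.0}
    v = {0: 1.0, 400_000: 3.0}
    print(cosine_similarity(u, v))
```

With dense storage, the same calculation touches every one of the hundreds of thousands of dimensions for every pair of texts, which is the load that [0096] describes.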
[0097] In this embodiment, the similarity between pieces of text information is calculated as follows:
[0098] Calculate the Count-Min Sketch of the text vector I corresponding to each piece of text information;
[0099] Calculate the similarity between the text vector I corresponding to the different t...
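The idea of replacing a high-dimensional vector with a Count-Min Sketch can be illustrated in Python. This is a sketch under stated assumptions rather than the patent's procedure: the table size (`DEPTH`, `WIDTH`), the BLAKE2-based hash, and the sparse `{index: value}` input are all choices made for the example, and the inner-product estimate shown is the standard Count-Min one (the minimum over rows of the row-wise products), since the text above is cut off before it says how the sketches are compared.

```python
import hashlib
from typing import Dict, List

DEPTH = 4     # number of hash rows (assumed)
WIDTH = 256   # counters per row (assumed)

Sketch = List[List[float]]


def _bucket(row: int, index: int, width: int) -> int:
    """Map a dimension index to a counter in the given row."""
    digest = hashlib.blake2b(f"{row}:{index}".encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big") % width


def count_min_sketch(vec: Dict[int, float], depth: int = DEPTH,
                     width: int = WIDTH) -> Sketch:
    """Fold a sparse vector into a depth x width table of counters."""
    table = [[0.0] * width for _ in range(depth)]
    for index, value in vec.items():
        for row in range(depth):
            table[row][_bucket(row, index, width)] += value
    return table


def estimate_dot(sa: Sketch, sb: Sketch) -> float:
    """Estimate the inner product of the original vectors.

    For non-negative vectors every row over-counts (collisions only add),
    so the minimum over rows is the tightest available estimate.
    """
    return min(sum(x * y for x, y in zip(row_a, row_b))
               for row_a, row_b in zip(sa, sb))


if __name__ == "__main__":
    a = {3: 1.0, 70_000: 2.0}
    b = {3: 4.0, 123_456: 1.0}
    print(estimate_dot(count_min_sketch(a), count_min_sketch(b)))
```

Each sketch holds `DEPTH * WIDTH` counters regardless of the original dimension, so both the storage and the number of multiplications per comparison are fixed by the table size instead of by the length of the text vector I.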
Embodiment 3
[0112] The embodiment of the present invention also proposes a text information processing device, which includes:
[0113] a text information filtering device, which acquires text information and filters the text information;
[0114] a text information classification device, which calculates the similarity of the filtered text information and classifies the filtered text information into different events according to the similarity;
[0115] a marking device, which calculates a metric for each event and flags the event when the metric exceeds a threshold.
[0116] In this embodiment, the text information filtering device acquires and filters the text information, so that text information from web pages can be obtained; the text information classification device calculates the similarity and classifies the text information into different events accordingly; and the marking device calculates a metric for each event and flags an event when the metric exceeds a threshold. T...
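The classification and marking stages can be pictured with a short Python sketch. Everything specific in it is an assumption made for illustration: the greedy grouping rule, the word-overlap (Jaccard) similarity, and the use of event size as the metric are not taken from the patent, which leaves these choices unspecified in the text shown here.

```python
from typing import Callable, List


def group_into_events(texts: List[str],
                      similarity: Callable[[str, str], float],
                      min_similarity: float) -> List[List[str]]:
    """Greedy grouping: join the first event whose first text is similar enough."""
    events: List[List[str]] = []
    for text in texts:
        for event in events:
            if similarity(text, event[0]) >= min_similarity:
                event.append(text)
                break
        else:
            events.append([text])  # no similar event found: start a new one
    return events


def flag_events(events: List[List[str]],
                metric: Callable[[List[str]], float],
                threshold: float) -> List[List[str]]:
    """Return the events whose metric exceeds the threshold."""
    return [event for event in events if metric(event) > threshold]


def jaccard(a: str, b: str) -> float:
    """Word-overlap similarity, used here only as a stand-in."""
    words_a, words_b = set(a.split()), set(b.split())
    if not words_a or not words_b:
        return 0.0
    return len(words_a & words_b) / len(words_a | words_b)


if __name__ == "__main__":
    texts = ["quake hits city", "quake hits town", "new phone released"]
    events = group_into_events(texts, jaccard, 0.5)
    print(flag_events(events, len, 1))  # events with more than one text
```

Any similarity function with the same signature, including one built on the sketches of Embodiment 2, could be passed in place of `jaccard`.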