
Big data text mining processing system and method

A text mining and processing technology applied in the fields of electronic digital data processing, special data processing applications, natural language data processing, and the like. Stated effects: improved text readability, optimized text content, and reasonable prediction results.

Active Publication Date: 2018-08-24
寇毅

AI Technical Summary

Problems solved by technology

Existing text big data mining technology cannot focus on specific users or effectively extract and represent the core content that those users care about.
[0005] The defects of the existing technology are manifested in the following aspects. First, it cannot adapt to mining and analyzing the diversified text forms associated with a specific user. As network platforms and services diversify, the text big data related to a specific user takes increasingly varied forms: discrete texts such as keywords and tags; long texts in the form of entire articles, such as papers, blogs, news reports, and website posts; and short texts of a few sentences, such as Weibo comments and messages in Moments. In other words, the text big data related to a specific user is a collection of text data in diverse forms, and existing mining and analysis methods struggle to perform unified and effective semantic feature mining on such mixed collections. In particular, existing text mining analysis methods are mainly suited to long texts and have difficulty mining the short texts and keyword texts that users generate.
Second, existing text mining analysis methods extract the distribution characteristics of representative words in a text, which often do not match or describe what the user actually pays attention to in that text. A user may engage with an article, yet the user's concerns and interests are not necessarily the representative content identified as the article's characteristic features; they may instead be non-representative local details of the article. Text mining performed in isolation therefore often deviates from the user's real interests.




Embodiment Construction

[0037] The following will clearly and completely describe the technical solutions in the embodiments of the present invention with reference to the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are only some, not all, embodiments of the present invention. Based on the embodiments of the present invention, all other embodiments obtained by persons of ordinary skill in the art without making creative efforts belong to the protection scope of the present invention.

[0038] Figure 1 is a schematic diagram of the overall structure of the big data text mining processing system of the present invention. The overall architecture of the system includes: a text big data acquisition module 101, a text preprocessing module 102, a text chain aggregation module 103, a weight evaluation module 104, a text chain feature vector extraction module 105, and a text feature analysis module 106.
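As an illustration only, the module list in [0038] can be sketched as a sequence of stages. The module names and reference numerals come from the paragraph above; the strictly linear data flow, the placeholder stage functions, and the names `MODULES`, `make_placeholder`, and `run_pipeline` are assumptions made for this sketch, not details confirmed by the patent text shown here.

```python
from typing import Callable, List, Tuple

# Reference numerals and module names as listed in [0038].
MODULES: List[Tuple[int, str]] = [
    (101, "text big data acquisition"),
    (102, "text preprocessing"),
    (103, "text chain aggregation"),
    (104, "weight evaluation"),
    (105, "text chain feature vector extraction"),
    (106, "text feature analysis"),
]


def make_placeholder(name: str) -> Callable[[List[str]], List[str]]:
    """Return a hypothetical stand-in stage that only records that it ran."""
    def stage(trace: List[str]) -> List[str]:
        return trace + [name]
    return stage


def run_pipeline(modules: List[Tuple[int, str]] = MODULES) -> List[str]:
    """Run the stages in numeral order (assumed: each module feeds the next)."""
    trace: List[str] = []
    for _numeral, name in sorted(modules):
        trace = make_placeholder(name)(trace)
    return trace


if __name__ == "__main__":
    for step in run_pipeline():
        print(step)
```

In a real system each placeholder would be replaced by the corresponding module's logic; the sketch only fixes the order in which the six modules are named.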

[0039] Wherein, the text big data acquis...


Abstract

The invention provides a big data text mining processing system and method that can be applied to Internet service platforms, including search engines, social networks, instant messaging, news sites, e-commerce, and leisure and entertainment applications. First, massive text big data related to user behaviors, including browsing, communication, sharing, searching, and downloading, is obtained, and preprocessing, including data cleaning, word segmentation, and stop word removal, is performed on it. Then, on the basis of a user behavior mechanism, text data in various forms, including keywords, long texts, and short texts, is aggregated into a text chain; feature extraction based on the dynamic distribution of weights is performed on the text chain; and mining analysis is carried out according to the extracted text chain features.
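The steps named in the abstract (cleaning, word segmentation, stop word removal, aggregation of mixed text forms into a per-user chain, and weighted feature extraction) can be sketched roughly as follows. This is a minimal sketch under several assumptions that are not in the source: whitespace tokenization stands in for word segmentation, the stop word list is a toy set, chains are ordered by a hypothetical `timestamp` field, and each record carries a caller-supplied `weight` because the patent's actual weight evaluation is not visible in this excerpt.

```python
import re
from collections import Counter, defaultdict
from typing import Dict, List, NamedTuple

# Toy stop word list (assumption; a real system would use a full list).
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "is"}


class Record(NamedTuple):
    user: str
    timestamp: int  # hypothetical ordering key for the chain
    form: str       # "keyword", "short", or "long"
    text: str
    weight: float   # hypothetical weight supplied by the caller


def preprocess(text: str) -> List[str]:
    """Clean, split on whitespace, and drop stop words."""
    cleaned = re.sub(r"[^a-z0-9\s]", " ", text.lower())
    return [tok for tok in cleaned.split() if tok not in STOP_WORDS]


def build_chains(records: List[Record]) -> Dict[str, List[Record]]:
    """Group records of every form into one time-ordered chain per user."""
    chains: Dict[str, List[Record]] = defaultdict(list)
    for rec in sorted(records, key=lambda r: (r.user, r.timestamp)):
        chains[rec.user].append(rec)
    return dict(chains)


def chain_features(chain: List[Record]) -> Dict[str, float]:
    """Weighted term frequencies over the chain, normalized to sum to 1."""
    feats: Counter = Counter()
    for rec in chain:
        for tok in preprocess(rec.text):
            feats[tok] += rec.weight
    total = sum(feats.values())
    return {tok: w / total for tok, w in feats.items()} if total else {}


if __name__ == "__main__":
    records = [
        Record("u1", 2, "short", "Great phone camera!", 1.0),
        Record("u1", 1, "keyword", "phone", 3.0),
    ]
    print(chain_features(build_chains(records)["u1"]))
```

Keeping the weight on each record lets keyword, short, and long texts contribute to one feature vector without assuming how the weights are derived; a separate weight evaluation step could fill that field before `chain_features` runs.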

Description

Technical field

[0001] The invention relates to the field of big data information processing and analysis, in particular to a big data text mining processing system and method thereof.

Background technique

[0002] In recent years, with the development and progress of network communication and computer technology, the storage, transmission, and computing capabilities of information systems have grown by leaps and bounds, making the "big data era" a reality. Text big data is a very important part of big data information. It is data in the form of text that exists in large quantities on various information platforms such as search engines, social networks, instant messaging, news sites, e-commerce, and leisure and entertainment applications. Text big data mining extracts valuable patterns from this scattered text information.

[0003] The so-called text big data mining is to take text big data as the object and use approp...


Application Information

IPC(8): G06F17/30, G06F17/27
CPC: G06F16/35, G06F40/289, G06F40/30
Inventor: 寇毅
Owner: 寇毅