A clustering method based on big data

The clustering method applies big data technology in the field of cluster analysis. It addresses the problem that poor text clustering lowers search speed and hampers retrieval of the information users are looking for, with the aim of improving accuracy and effectiveness.

Active Publication Date: 2021-08-10

成都东方盛行电子有限责任公司

4 Cites, 0 Cited by


Problems solved by technology

The quality of text clustering strongly affects how efficiently users can retrieve the information they are looking for. For example, compared with organizing documents sequentially, clustering documents at random does not improve search efficiency and instead slows retrieval.


Embodiment Construction

[0028] To give a clearer understanding of the technical features, purposes, and effects of the present invention, specific implementations are described below with reference to the accompanying drawings.

[0029] As shown in Figure 1, a clustering method based on big data includes the following steps:

[0030] S1. Perform word segmentation on news D to obtain news S;

[0031] S2. Determine whether news S is the first news item; if so, execute S5; if not, execute S3;

[0032] S3. Build a VSM (vector space model) vector for news S, and calculate the similarity between news S and the cluster center of every category;

[0033] S4. Find the category C with the greatest similarity to news S; if the similarity between news S and category C is greater than a preset threshold, classify news S into category C; if it is less than the preset threshold, execute S5;

[0034] S5. Create a new category based on news S;

[0035] ...
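
Steps S1 to S5 can be sketched as a single-pass clustering loop. This is a minimal illustration under stated assumptions, not the patent's implementation: the source does not specify the segmentation tool, the VSM term weighting, the similarity measure, or the threshold value, so the sketch assumes whitespace splitting, raw term frequencies, cosine similarity, and a threshold of 0.5. The function names `segment`, `to_vector`, `cosine`, and `single_pass` are hypothetical. The source also leaves the case of similarity exactly equal to the threshold undefined; the sketch treats it as a new category.

```python
import math
from collections import Counter


def segment(text):
    """S1 stand-in: whitespace splitting (assumption; the source names no segmenter)."""
    return text.lower().split()


def to_vector(tokens):
    """Assumed VSM representation: raw term-frequency vector."""
    return Counter(tokens)


def cosine(a, b):
    """Cosine similarity between two sparse term vectors (assumed measure)."""
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def single_pass(segmented_news, threshold=0.5):
    """Assign each news item to the most similar category or open a new one."""
    clusters = []  # each cluster: {"center": Counter, "members": [item indices]}
    for i, tokens in enumerate(segmented_news):
        vec = to_vector(tokens)
        if not clusters:
            # S2 -> S5: the first news item starts the first category.
            clusters.append({"center": vec, "members": [i]})
            continue
        # S3: similarity to the cluster center of every category.
        sims = [cosine(vec, c["center"]) for c in clusters]
        # S4: pick the most similar category C and compare with the threshold.
        best = max(range(len(clusters)), key=sims.__getitem__)
        if sims[best] > threshold:
            clusters[best]["members"].append(i)
        else:
            # S5: create a new category based on this news item.
            clusters.append({"center": vec, "members": [i]})
    return clusters


news = [
    "stock market rises today",
    "stock market falls today",
    "football team wins cup",
]
clusters = single_pass([segment(n) for n in news], threshold=0.5)
for c in clusters:
    print(c["members"])
```

In this sketch the cluster center is simply the vector of the item that opened the category; the center-update rule is outside the steps shown here.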


Abstract

The invention discloses a clustering method based on big data, comprising the following steps: performing word segmentation on news D to obtain news S; judging whether news S is the first news item and, if so, creating a new category based on news S; if not, building a VSM vector model for news S and calculating the similarity between news S and the cluster center of every category; finding the category C with the greatest similarity to news S and, if the similarity between news S and category C is greater than the preset threshold, classifying news S into category C, or, if it is less than the preset threshold, creating a new category based on news S; calculating the average similarity M1 between news S and the other news in category C, and the average similarity M2 between the cluster center of category C and the other news in that category; if M1 is greater than M2, making news S the new cluster center, otherwise leaving the cluster center unchanged; and judging whether all current news has been processed and, if so, calculating the popularity of the news with a preset algorithm and extracting the hot news, otherwise continuing with the next article.
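
The cluster-center update described in the abstract can be sketched as follows. This is an illustrative reading, not the patent's code: it assumes M2 is the average similarity between the current cluster center and the other news in category C, and it assumes cosine similarity over term-frequency vectors, which the source does not specify. The names `average_similarity` and `maybe_update_center` are hypothetical.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse term vectors (assumed measure)."""
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def average_similarity(vec, others):
    """Mean similarity of vec to each vector in others (0.0 if others is empty)."""
    if not others:
        return 0.0
    return sum(cosine(vec, o) for o in others) / len(others)


def maybe_update_center(center, new_item, members):
    """Return the vector that should serve as the cluster center.

    members holds every vector in category C, including center and new_item.
    """
    # "Other news": everything in C except the two candidates being compared.
    others = [m for m in members if m is not center and m is not new_item]
    m1 = average_similarity(new_item, others)  # M1: news S vs. other news
    m2 = average_similarity(center, others)    # M2: current center vs. other news
    return new_item if m1 > m2 else center


center = Counter("a b".split())
other_1 = Counter("a c".split())
other_2 = Counter("c d".split())
candidate = Counter("a c d".split())
members = [center, other_1, other_2, candidate]

new_center = maybe_update_center(center, candidate, members)
print(new_center is candidate)
```

Here the candidate is closer on average to the rest of the category than the current center is, so it takes over as the center; a candidate that shares no terms with the other members would leave the center unchanged.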

Description

Technical field

[0001] The invention relates to the technical field of cluster analysis, and in particular to a clustering method based on big data.

Background

[0002] With the rapid global development of the Internet and of information technology, the data people use is growing at an explosive rate. Large amounts of data are stored in databases and can be applied to government offices, business intelligence, scientific research, project development, and so on, but making use of these data is not easy. Understanding the massive data in a database is beyond human ability. Without automatic analysis methods, the large amount of data stored in the database becomes a "data grave": a data archive that is difficult to access again. Because decision makers cannot manually extract useful knowledge from massive data, the important decisions they make are not based on the data in the data...


Application Information


Patent Type & Authority: Patents (China)

IPC(8): G06F16/35, G06F40/289, G06K9/62

CPC: G06F16/35, G06F16/355

Inventor: 马萧萧温大川吴春才冯良怀文斌杨树海姚晴麟

Owner: 成都东方盛行电子有限责任公司
