
A Character Similarity Characterization Method Based on Heterogeneous Data

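A character (user) similarity technology, applied in text database querying, electronic digital data processing, digital data information retrieval, and related fields, that can solve problems such as the difficulty of finding like-minded users and valuable user groups in massive amounts of information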

Active Publication Date: 2021-04-30
NAT COMP NETWORK & INFORMATION SECURITY MANAGEMENT CENT
13 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

During this process, people express their interests and characteristics in different forms. However, faced with a huge amount of information, people have to pick out the content they are interested in and find like-minded friends from the mass of information, while businesses, governments, and institutions find it increasingly difficult to identify more valuable user groups and to conduct further research or make recommendations based on user information.



Examples


Embodiment Construction

[0054] The specific implementation method of the present invention will be described in detail below in conjunction with the accompanying drawings.

[0055] The method of the present invention for characterizing character similarity based on heterogeneous data on a microblog platform first collects users' microblog texts and obtains the follow relationships between users and each user's basic information. It then selects a processing method suited to the characteristics of each type of data. For the microblog texts, it uses the Doc2vec model, which takes context into account, to calculate text similarity. Finally, it fuses the matrices obtained from the different dimensions to describe the final similarity between users.
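As a rough illustration of the per-dimension step, the sketch below builds one user-by-user similarity matrix from text vectors and another from follow relationships. The source does not state which similarity functions are used, so cosine similarity for the text vectors and Jaccard similarity for the sets of followed accounts are assumptions, as are the helper names and the toy data. The text vectors stand in for the output of a trained Doc2vec model, which is not shown here.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def jaccard_similarity(a, b):
    """Shared share of two sets of followed accounts; 0.0 if both sets are empty."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)

def pairwise_matrix(items, sim):
    """Build a symmetric user-by-user similarity matrix as a list of lists."""
    n = len(items)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            s = sim(items[i], items[j])
            matrix[i][j] = s
            matrix[j][i] = s
    return matrix

# Hypothetical toy data for three users (not taken from the patent).
# text_vectors are placeholders for Doc2vec document vectors.
text_vectors = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
followees = [{"a", "b"}, {"b", "c"}, {"d"}]

text_sim = pairwise_matrix(text_vectors, cosine_similarity)
follow_sim = pairwise_matrix(followees, jaccard_similarity)
```

Each matrix has one row and one column per user, so matrices from different data types share the same shape and can later be combined entry by entry.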

[0056] As shown in Figure 1, the specific implementation steps are as follows:

[0057] Step 1. Collect microblog data streams about a certain field or with high activity from the network, perform preprocessing and store th...


Abstract

The invention discloses a character similarity characterization method based on heterogeneous data, which belongs to the field of data mining. The invention first collects users' microblog texts and obtains the follow relationships between users and each user's basic information. It selects a processing method suited to the characteristics of each type of data: for microblog texts it adopts the Doc2vec model, which uses context information to represent each text as a vector, and then measures similarity with a defined similarity function. Finally, it fuses the matrices obtained from the different dimensions to describe the final similarity between users. The invention draws on several kinds of social network information, including social relationship data, user attribute data, and user text data; by considering these different types of information together, it yields a more comprehensive character similarity characterization method. The invention also provides processing and calculation schemes for the various kinds of data, and uses the complete data with a weighted fusion method so that similarity can be calculated in a personalized way for people with different preferences.
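The weighted fusion mentioned above can be sketched as a normalized weighted average of the per-dimension similarity matrices. The source does not give the fusion formula, so the averaging rule, the function name `fuse_similarity`, the dimension names, and the example weights are all assumptions made for illustration.

```python
def fuse_similarity(matrices, weights):
    """Weighted average of same-shaped similarity matrices, keyed by dimension name.

    Assumed rule (not from the source): weights are normalized to sum to 1,
    then each fused entry is the weighted sum of the per-dimension entries.
    """
    if set(matrices) != set(weights):
        raise ValueError("matrices and weights must cover the same dimensions")
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    names = list(matrices)
    n = len(matrices[names[0]])
    fused = [[0.0] * n for _ in range(n)]
    for name in names:
        w = weights[name] / total
        for i in range(n):
            for j in range(n):
                fused[i][j] += w * matrices[name][i][j]
    return fused

# Hypothetical similarity matrices for two users, one per data type.
matrices = {
    "text": [[1.0, 0.8], [0.8, 1.0]],
    "follow": [[1.0, 0.2], [0.2, 1.0]],
    "attributes": [[1.0, 0.5], [0.5, 1.0]],
}
# Hypothetical preference: text counts twice as much as the other dimensions.
weights = {"text": 2.0, "follow": 1.0, "attributes": 1.0}

fused = fuse_similarity(matrices, weights)
```

Changing the weights is one way to express different preferences: a caller who cares mostly about social ties could raise the weight on the follow matrix without recomputing any per-dimension similarity.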

Description

Technical field

[0001] The invention belongs to the field of data mining and relates to similarity calculation technology, in particular to a character similarity characterization method based on heterogeneous data.

Background

[0002] With the development of the Internet, people's lives have become increasingly inseparable from it. They rely on it more and more to work, socialize, and express their opinions, and the boundary between online and offline is gradually blurring. During this process, people express their interests and characteristics in different forms. However, faced with a huge amount of information, people have to pick out the content they are interested in and find friends with the same interests from the mass of information, while businesses, governments, and institutions find it increasingly difficult to identify the user groups that are more valuable to them and to conduct further research or recom...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F16/335, G06F16/33, G06F16/9535
Inventor: 王卿, 刘春阳, 包秀国, 张旭, 王萌, 李雄, 吴俊杰, 蒋丽娜
Owner: NAT COMP NETWORK & INFORMATION SECURITY MANAGEMENT CENT