Clustering method, device and system

A clustering technology, applied in the network field, that addresses the difficulty of searching for resources, avoids erroneous clustering, improves user experience, and processes files in an objective and accurate way.

Active Publication Date: 2011-05-18
BEIJING SOGOU TECHNOLOGY DEVELOPMENT CO LTD

AI Technical Summary

Problems solved by technology

[0007] That is, the prior art has at least the following problem: it cannot search for resources whose body content has the same length based on the body-content length of an audio file.

Examples

Embodiment 1

[0029] Referring to Figure 1, the first embodiment of the method of the present invention includes the following steps:

[0030] Step 101: Obtain part of the body content of the media file;

[0031] Step 102: Calculate clustering information of the media file according to part of the body content of the media file.
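
Steps 101 and 102 can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure claimed by the patent: it assumes the media file is already available as bytes, that the "part of the body content" is a fixed-size slice at a hypothetical offset, and that an MD5 digest of that slice serves as the clustering information (the example later in this document computes an MD5 signature). The names `body_slice`, `cluster_key` and `group_by_key` and the default sizes are illustrative only.

```python
import hashlib
from collections import defaultdict
from typing import Dict, List


def body_slice(data: bytes, offset: int = 0, length: int = 4096) -> bytes:
    """Step 101 (sketch): take part of the body content.

    The offset and length are hypothetical defaults, not values from the source.
    """
    return data[offset:offset + length]


def cluster_key(data: bytes, offset: int = 0, length: int = 4096) -> str:
    """Step 102 (sketch): derive clustering information from that part.

    MD5 is used here only as a content fingerprint.
    """
    return hashlib.md5(body_slice(data, offset, length)).hexdigest()


def group_by_key(files: Dict[str, bytes]) -> Dict[str, List[str]]:
    """Group file names whose sampled body content yields the same key."""
    clusters = defaultdict(list)
    for name, data in files.items():
        clusters[cluster_key(data)].append(name)
    return dict(clusters)
```

Because the key depends only on the sampled body bytes, two copies of the same recording with different titles or trailing tags fall into the same group, which is the property paragraph [0034] relies on.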

[0032] The embodiments of the present invention have the following advantages:

[0033] First, by obtaining part of the body content of the media file, the embodiment of the present invention can calculate the clustering information of the media file from that partial body content;

[0034] Secondly, because the embodiment of the present invention obtains the clustering information of the media file by calculation on part of its body content, it does not rely on the description information of the media file and avoids the erroneous clustering caused by manual modification of the description information. The processing method is objective and accura...

Embodiment 2

[0073] Referring to Figure 2, the second embodiment of the method of the present invention is described using an audio file as an example, and includes the following steps:

[0074] Step 201: Obtain the content of the head and tail of the audio file;

[0075] For MP3 and WMA files, a large amount of meta information (metadata) is stored in the header of the file to identify various attributes of the file itself. An MP3 (Moving Picture Experts Group Audio Layer III, an audio compression and coding technology) file in ID3v1 format (the first-generation tag; see the introduction at http://www.id3.org/ID3v1 for details) has 128 bytes of meta information at the end of the file. Usually no more than 50 KB of content is obtained from the head, and 5 KB from the tail.
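
Step 201 can be sketched for a local file as follows. This is an illustrative sketch, not the patent's implementation: the limits mirror the "50k" and "5k" figures above, and the function names `read_head_tail` and `has_id3v1` are hypothetical. The ID3v1 check relies on the tag occupying the last 128 bytes of the file and starting with the ASCII marker `TAG`.

```python
import io
from typing import BinaryIO, Tuple

HEAD_LIMIT = 50 * 1024  # at most 50 KB from the head, per paragraph [0075]
TAIL_LIMIT = 5 * 1024   # 5 KB from the tail, per paragraph [0075]
ID3V1_SIZE = 128        # an ID3v1 tag is the last 128 bytes of the file


def read_head_tail(f: BinaryIO) -> Tuple[bytes, bytes]:
    """Return (head, tail) of a seekable binary stream.

    For files shorter than a limit, the whole file is returned for that part.
    """
    f.seek(0, io.SEEK_END)
    size = f.tell()
    f.seek(0)
    head = f.read(min(HEAD_LIMIT, size))
    tail_len = min(TAIL_LIMIT, size)
    f.seek(size - tail_len)
    tail = f.read(tail_len)
    return head, tail


def has_id3v1(tail: bytes) -> bool:
    """True if the final 128 bytes of the tail begin with b'TAG'."""
    if len(tail) < ID3V1_SIZE:
        return False
    return tail[-ID3V1_SIZE:-ID3V1_SIZE + 3] == b"TAG"
```

A caller would open the file with `open(path, "rb")` and pass the handle to `read_head_tail`; when `has_id3v1` is true, the last 128 bytes can be excluded before any fingerprint of the audio body is computed.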

[0076] Step 202: Analyze the content of the head and tail of the audio file;

[0077] For MP3 and WMA header and tail files, refer to the MP3 file specification. There are ID3v1 and ID3v2 (the secon...

Example 1

[0100] Example 1: Suppose the link of an audio file in WMA format is:

[0101] http://oursim.whu.edu.cn/houtai/edit/UploadFile/2006112073350103.wma

For this audio file, the process of calculating its MD5 signature includes:

[0102] 1. Obtain the head and tail content of the WMA file in the link. The head and tail content of the file are usually expressed in the form of a list of music URL links;

[0103] Head: 2006112073350103_head 50k

[0104] Tail: 2006112073350103_tail 5k
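
The source does not say how the head and tail are downloaded from the link. One possible approach, shown here purely as an assumption, is to use HTTP Range requests so that only the first 50 KB and the last 5 KB are transferred. The helper names below are hypothetical.

```python
from typing import Dict

HEAD_BYTES = 50 * 1024  # size of the head piece ("50k" above)
TAIL_BYTES = 5 * 1024   # size of the tail piece ("5k" above)


def head_range_header(n: int = HEAD_BYTES) -> Dict[str, str]:
    """Range header asking for the first n bytes (byte positions are inclusive)."""
    return {"Range": f"bytes=0-{n - 1}"}


def tail_range_header(n: int = TAIL_BYTES) -> Dict[str, str]:
    """Range header asking for the last n bytes (a suffix byte range)."""
    return {"Range": f"bytes=-{n}"}
```

These dictionaries could be passed as `headers=` to `urllib.request.Request`. A server that does not support ranges may ignore the header and return the full file with status 200, so a caller should still truncate the response to the intended size.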

[0105] 2. Analyze the obtained header file and tail file of the WMA file in the link:

[0106] a) First analyze the contents of the header file.

[0107] The first 16 bytes of the header are 0x30 0x26 0xB2 0x75 0x8E 0x66 0xCF 0x11 0xA6 0xD9 0x00 0xAA 0x00 0x62 0xCE 0x6C, so it can be determined that the file to which the header belongs is a WMA format file.
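
The format check in paragraph [0107] amounts to comparing the first 16 bytes of the head against a fixed signature. A minimal sketch, with the hypothetical names `ASF_HEADER_GUID` and `looks_like_wma`:

```python
# The 16 signature bytes listed in paragraph [0107].
ASF_HEADER_GUID = bytes([
    0x30, 0x26, 0xB2, 0x75, 0x8E, 0x66, 0xCF, 0x11,
    0xA6, 0xD9, 0x00, 0xAA, 0x00, 0x62, 0xCE, 0x6C,
])


def looks_like_wma(head: bytes) -> bool:
    """True if the head starts with the 16-byte signature from [0107]."""
    return head[:16] == ASF_HEADER_GUID
```

This signature identifies the ASF container that WMA files use. Other ASF-based files, such as WMV video, begin with the same bytes, so a stricter check would also need to inspect the stream properties inside the header.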

[0108] The analysis program looks for the start identifier of the audio content, 0x36 0x26 0xB2 0x75 0x8E 0x66 0xCF 0x11 0xA6 0xD9 0x00 0xAA 0x00 0x62 0xC...

Abstract

Disclosed is a method for obtaining clustering information, comprising obtaining part of the body content of a media file and calculating the clustering information of the media file from that content. A device and a system for obtaining clustering information are also disclosed. With the invention, resources that have no description information can be found by keyword search.

Description

Technical field

[0001] The present invention relates to the field of network technology, in particular to a clustering method, device and system.

Background art

[0002] The amount of resources stored on the Internet is huge and is constantly being updated and expanded. Especially with the expansion of network bandwidth, media files, including audio and video files, have grown rapidly because of the enjoyment they bring to people. However, as the number of media files grows, it has become increasingly necessary to meet users' needs by providing them with accurate information about similar media files.

[0003] To search for the resources that users care about, it is necessary to find links to the related resources. The prior art provides a music search engine solution whose main process is:

[0004] The user enters a query word;

[0005] After receiving the query word, the search engine performs a correspondi...

Application Information

Patent Type & Authority: Patents (China)
IPC: IPC(8): G06F17/30
Inventor: 王志刚, 贾玉龙
Owner: BEIJING SOGOU TECHNOLOGY DEVELOPMENT CO LTD