Visual discrimination difficulty combined text string weight calculation method and device

A text string weight calculation technique, applied in the field of search engines, that addresses the failure of existing weighting methods to consider how smoothly users can visually recognize text strings.

Status: Inactive
Publication Date: 2014-04-23
1VERGE INTERNET TECH BEIJING
5 Cites, 2 Cited by

AI Technical Summary

Problems solved by technology

[0003] However, the above-mentioned methods of various adaptive modifications of TF·IDF do not take into account the smoothness of visual recognition of text strings by users who are observers of search results.

Examples

Embodiment

[0114] Take a document title "Latest News of Earthquake in Ya'an, Sichuan" as an example:

[0115] "Latest News of Earthquake in Ya'an, Sichuan" is word-segmented to obtain a sequence of five text strings: "Sichuan", "Ya'an", "earthquake", "latest", and "news". Their IDF, MD, and YB values are calculated respectively (for simplicity, all of the preceding weighting coefficients are taken as 1) to get:

[0116]

[0117] It can be seen that in this document the words, ranked from most to least important, are "news", "Ya'an", "earthquake", "latest", and "Sichuan". Given what users look for in news, this ranking of word weights is reasonable.

[0118] The present invention introduces a recognizability factor and a visual density factor when calculating the weight of a text string (that is, a word) with respect to a document, so that words that are easier for the user to recognize and understand as a whole receive a greater weight, and the text included in the search results is easier to recognize and browse...

Abstract

Disclosed are a method and a device for calculating text string weights in combination with visual discrimination difficulty. The method comprises: constructing a document collection and collecting statistics on each text string's frequency in the document collection and in a single document, the text strings in which each character occurs, and the number of strokes of each character; performing word segmentation on the document whose text string weights are to be calculated to obtain a text string sequence, and calculating the visual density, the legibility, and the TF-IDF (Term Frequency-Inverse Document Frequency) value of each text string; and weighting and summing the visual density, the legibility, and the TF-IDF value of each text string to obtain its weight with respect to the document, and from that its normalized weight with respect to the document. With this method and device, words that carry more information and are easier for users to identify receive larger weights, so that video results that users can easily identify, read, and understand appear more often in the search results, and users can quickly find the results that interest them.
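
The final weighting-and-normalization step described in the abstract can be illustrated with a minimal Python sketch. It takes the three factor values as given inputs, since this page does not show how visual density and legibility are computed. The names `Factors`, `combined_weights`, and the coefficients `a`, `b`, `c` are hypothetical, the factor values are made up, and normalizing by the sum of the raw weights is an assumption rather than the patent's stated formula.

```python
from dataclasses import dataclass


@dataclass
class Factors:
    """Per-string factor values; how each is computed is outside this sketch."""
    tf_idf: float
    visual_density: float
    legibility: float


def combined_weights(factors, a=1.0, b=1.0, c=1.0):
    """Weighted sum of the three factors, then normalization within the document.

    a, b, c are hypothetical coefficient names. Dividing by the sum of the raw
    weights is an assumption; the source does not state the normalization.
    """
    raw = {
        s: a * f.tf_idf + b * f.visual_density + c * f.legibility
        for s, f in factors.items()
    }
    total = sum(raw.values())
    if total == 0:
        return {s: 0.0 for s in raw}
    return {s: w / total for s, w in raw.items()}


# Made-up factor values, for illustration only.
doc = {
    "earthquake": Factors(tf_idf=2.0, visual_density=0.5, legibility=0.5),
    "latest": Factors(tf_idf=0.5, visual_density=0.25, legibility=0.25),
}
weights = combined_weights(doc)
```

With all coefficients left at 1, as in the embodiment, each string's raw weight is simply the sum of its three factor values.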

Description

Technical field

[0001] The present application relates to the field of search engines, and in particular to a method and device for calculating the weight of text strings in combination with the difficulty of visual discrimination.

Background

[0002] When a search engine builds an inverted index, it needs to calculate the weight of each word in each document with respect to that document. In the prior art, this weight is mostly calculated from the word's frequency in the document and the number of documents in the collection in which the word appears (i.e., TF·IDF). The TF·IDF algorithm is a classic algorithm in the field of search engines. When implementing it in a system, practitioners generally make adaptive modifications to suit the data distribution in their field.

[0003] However, the various adaptively modified TF·IDF methods mentioned above do not take into account the smoothness of visual recogniti...
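
For reference, the classic TF·IDF weighting described in the background can be sketched in Python as follows. This is a minimal illustration, not the patent's implementation: the function name `tf_idf` and the toy corpus are hypothetical, and the smoothed IDF form `log((1 + N) / (1 + df)) + 1` is one common variant chosen here as an assumption.

```python
import math
from collections import Counter


def tf_idf(doc_tokens, corpus):
    """Return {term: tf * idf} for one tokenized document.

    corpus is a list of tokenized documents. The smoothed IDF form
    log((1 + N) / (1 + df)) + 1 is an assumed variant; the source does
    not specify which variant it uses.
    """
    n_docs = len(corpus)
    df = Counter()
    for tokens in corpus:
        df.update(set(tokens))  # count each term once per document
    tf = Counter(doc_tokens)
    return {
        term: (count / len(doc_tokens))
        * (math.log((1 + n_docs) / (1 + df[term])) + 1)
        for term, count in tf.items()
    }


# Toy corpus, made up for illustration.
corpus = [
    ["sichuan", "yaan", "earthquake", "latest", "news"],
    ["latest", "news", "today"],
    ["sichuan", "travel", "guide"],
]
scores = tf_idf(corpus[0], corpus)
```

Terms that appear in fewer documents of the collection (here "yaan" and "earthquake") score higher than terms that appear in more of them (here "sichuan"), which is the behavior the adaptive modifications mentioned above build on.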

Application Information

IPC(8): G06F17/30
CPC: G06F16/7867, G06F16/951
Inventors: 刘伟, 姚键, 潘柏宇, 卢述奇
Owner: 1VERGE INTERNET TECH BEIJING