
Repetitive file detecting and displaying function

A technology for detecting duplicate documents and content, applied in unstructured text data retrieval, network data retrieval, other database retrieval, etc., that addresses problems such as users having to identify duplicates manually.

Inactive Publication Date: 2007-11-21
THOMSON REUTERS ENTERPRISE CENT GMBH
Cites: 1 | Cited by: 22

AI Technical Summary

Problems solved by technology

Unfortunately, duplicate stories are generally intermixed with other, different stories according to relevance, leaving the user to grapple manually with the complex problem of identifying and/or filtering them.



Embodiment Construction

[0019] This specification describes one or more specific embodiments of the invention with reference to the aforementioned drawings. These embodiments, which are provided as non-limiting examples, are illustrated and described in sufficient detail to enable any person skilled in the art to make or practice the invention. Therefore, where appropriate, certain information known to those skilled in the art is omitted from this description to avoid obscuring the invention.

[0020] Exemplary Definitions

[0021] This specification includes many words that have meanings derived from their technical application or from the context of the specification. However, to further aid reading, the following exemplary definitions are provided.

[0022] "Document" means any addressable arrangement of machine-readable data, such as text data.

[0023] A "database" includes any logical arrangement of documents. In so...


Abstract

Many companies provide online search facilities that enable users to conduct computerized searches for documents. Unfortunately, these searches frequently return results that include duplicate documents, that is, documents that are completely or substantially identical to each other. This problem is particularly vexing when searching news stories, for example. Moreover, the duplicate documents are intermixed in the search results, leaving users to manually manage the complexities of identifying and/or filtering them. Accordingly, the present inventors devised systems, methods, and software that facilitate the identification and/or grouping of duplicate documents in search results. One exemplary system includes a signature generation module, which generates document signatures based on length, temporal, and/or content components; a real-time duplicate detection module, which uses the document signatures to identify "exact" or "fuzzy" duplicate documents; and a user-interface or presentation module, which controls how duplicate documents are presented or suppressed in search results.
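The signature-based approach summarized above can be illustrated with a minimal Python sketch. It is not the patented implementation: the `Signature` fields, the choice of the longest distinct tokens as the content component, and the thresholds `len_tol`, `max_days`, and `min_overlap` are all hypothetical choices made for illustration only.

```python
import hashlib
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Signature:
    length: int         # length component: number of word tokens
    published: date     # temporal component: publication date
    content: frozenset  # content component: hashes of selected tokens


def _token_hash(token: str) -> str:
    # Short, stable hash of a single token.
    return hashlib.md5(token.encode("utf-8")).hexdigest()[:8]


def make_signature(text: str, published: date, k: int = 10) -> Signature:
    tokens = text.lower().split()
    # Assumption: use the k longest distinct tokens as the content component.
    selected = sorted(set(tokens), key=lambda t: (-len(t), t))[:k]
    return Signature(len(tokens), published,
                     frozenset(_token_hash(t) for t in selected))


def classify(a: Signature, b: Signature, len_tol: float = 0.1,
             max_days: int = 7, min_overlap: float = 0.8) -> str:
    """Return "exact", "fuzzy", or "distinct" for a pair of signatures."""
    if a == b:
        return "exact"  # all three components match
    longer = max(a.length, b.length, 1)
    if abs(a.length - b.length) / longer > len_tol:
        return "distinct"  # lengths differ too much
    if abs((a.published - b.published).days) > max_days:
        return "distinct"  # published too far apart
    union = a.content | b.content
    overlap = len(a.content & b.content) / len(union) if union else 1.0
    return "fuzzy" if overlap >= min_overlap else "distinct"
```

In this sketch, "exact" means only that the signatures match, not that the texts are byte-identical. A presentation layer could call `classify` on pairs of search results and show one representative per group of "exact" or "fuzzy" matches, suppressing or collapsing the rest.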

Description

[0001] Related Application

[0002] This application claims priority to US Provisional Application 60/603762 (Attorney Docket 6962.030PRV), filed August 23, 2004, and US Provisional Application 60/623975 (Attorney Docket 4962.030PV2), filed November 1, 2004. Both applications are incorporated herein by reference.

[0003] Copyright Notice and Permission

[0004] Portions of this patent document contain copyrighted material. The copyright owner has no objection to facsimile reproduction of the patent document or patent disclosure as it appears in the Patent and Trademark Office patent files or records, but otherwise reserves all rights. The following notice applies to this document: Copyright 2004, West Services, Inc.

Technical Field

[0005] Various embodiments of the invention relate to information retrieval systems, such as those that provide news documents or other related content.

Background

[0006] Companies such as Thomson Legal & R...


Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G06F17/30
CPC: G06F17/30864, G06F17/3061, G06F16/30, G06F16/951
Inventor: J. G. Conrad, J. R. S. Claussen, J. Lin
Owner: THOMSON REUTERS ENTERPRISE CENT GMBH