Method of indexing and retrieval of electronically-stored documents

Document and electronic storage technology, applied in the field of document storage and retrieval systems, can solve problems such as queries producing unsatisfactory retrieval results, and achieves the effect of facilitating rapid searching.

Inactive Publication Date: 2000-06-06
KAGENECK KARL ERBO G +1
Cites: 9 | Cited by: 82

AI Technical Summary

Benefits of technology

The present invention facilitates the rapid searching of a document data base for documents that are of interest to the user. By using the suggested SWAPS terms, the user can modify the query so as to retrieve those documents of interest, if they exist in the data base. Since the SWAPS terms that are presented occur in many of the documents that contain the original query terms, adding them to the query is guaranteed to retrieve those documents and others containing the SWAPS terms. By using the SWAPS feature repeatedly, the user can in effect roam around the data base without actually retrieving and reading documents. Only after the query has been modified to include all the interesting SWAPS terms does the user need to actually retrieve the documents. The user can start with a poor query and modify it using SWAPS so that it becomes a good query. The user need not waste time formulating a good query that will not retrieve any relevant documents because there happen to be no such documents in the data base. The SWAPS terms that are suggested will always retrieve documents that contain them, i.e., documents that are likely to be relevant.
The ranking of the documents also facilitates rapid searching because the user can be confident that the highest ranked documents will be the

Problems solved by technology

However, if the query consists of general words that are not terms of art, the query may produce unsatisfactory retrieval results by either producing few documents that are of interest to the user or producing many documents that are not interesting to the user or both.

Embodiment Construction

This invention will now be described as embodied in a computer system of the type shown in FIG. 1. This embodiment utilizes the following computer hardware and software:

(1) IBM compatible personal computer with at least 4 MB of RAM, a large capacity hard drive, a display screen, and a keyboard.

(2) MS-DOS compatible operating system and LIM 3.2 compatible expanded memory manager.

(3) A vocabulary file of terms (words and phrases).

(4) A series of programs that index the documents by constructing various files that hold information about which terms are in which documents, which documents contain which terms, the weights of the terms, which terms are relatives of other terms by virtue of occurring in the same documents, and how strongly they are related.

(5) A user program that accepts a query, suggests modifications to the query, and ranks the documents based on the modified query using the weights and relative strengths of the terms of the query.
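The index files listed in item (4) can be pictured with a minimal sketch. Everything below is illustrative rather than taken from the patent: the function name `build_index`, the IDF-style weight, and the use of a raw co-occurrence count as the "relative strength" are assumptions, and the vocabulary is restricted to single words for brevity even though the source vocabulary also contains phrases.

```python
import math
from collections import defaultdict
from itertools import combinations


def build_index(docs, vocabulary):
    """Build hypothetical index structures.

    docs: dict mapping doc_id -> text
    vocabulary: set of lowercase single-word terms (phrases omitted here)
    """
    term_docs = defaultdict(set)  # term -> documents that contain it
    doc_terms = {}                # document -> vocabulary terms it contains
    for doc_id, text in docs.items():
        terms = {w for w in text.lower().split() if w in vocabulary}
        doc_terms[doc_id] = terms
        for term in terms:
            term_docs[term].add(doc_id)

    n_docs = len(docs)
    # Assumed weight: rarer terms weigh more (IDF-style), not the patent's formula.
    weights = {t: math.log(1 + n_docs / len(d)) for t, d in term_docs.items()}

    # Assumed relative strength: number of documents in which both terms occur.
    relatives = defaultdict(dict)
    for terms in doc_terms.values():
        for a, b in combinations(sorted(terms), 2):
            relatives[a][b] = relatives[a].get(b, 0) + 1
            relatives[b][a] = relatives[b].get(a, 0) + 1
    return term_docs, doc_terms, weights, relatives
```

Keeping both directions (term to documents and document to terms) mirrors the wording of item (4); the sketch holds them as in-memory dictionaries, whereas the embodiment stores them in files.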

The Vocabulary file is structured ...

Abstract

A document indexing and retrieval system and method which assigns weights to the key words and assigns a relative value to pairs of key words (i.e. defines a relative relation on KxK) based on their frequency of occurrence and co-occurrence in the document data base. In response to a query both the weights and this relative relation are used to suggest additional and/or alternative key words which are very likely to find relevant documents. Documents are then ranked by number of hits adjusted for the weights of hit words and their relative values.
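As a rough illustration of the two steps the abstract names, suggesting additional key words and then ranking documents, the following sketch may help. It is not the patented scoring: `suggest_terms`, `rank_documents`, the overlap-based suggestion score, and the additive pair bonus are all assumptions chosen to keep the example small. The `relation` argument is a hypothetical dictionary keyed by alphabetically sorted term pairs.

```python
def suggest_terms(query, term_docs, weights, top_n=3):
    """Suggest non-query terms that occur in documents hit by the query."""
    hit_docs = set().union(*(term_docs.get(t, set()) for t in query))
    scores = {}
    for term, docs in term_docs.items():
        if term in query:
            continue
        shared = len(docs & hit_docs)
        if shared:
            # Assumed score: term weight times the fraction of its documents hit.
            scores[term] = weights.get(term, 1.0) * shared / len(docs)
    return sorted(scores, key=lambda t: (-scores[t], t))[:top_n]


def rank_documents(query, term_docs, weights, relation):
    """Rank documents by summed hit weights plus a bonus per related hit pair."""
    scores = {}
    for term in query:
        for doc in term_docs.get(term, set()):
            scores[doc] = scores.get(doc, 0.0) + weights.get(term, 1.0)
    terms = sorted(query)
    for i, a in enumerate(terms):
        for b in terms[i + 1:]:
            both = term_docs.get(a, set()) & term_docs.get(b, set())
            for doc in both:
                # Assumed adjustment: add the pair's relative value once.
                scores[doc] += relation.get((a, b), 0.0)
    return sorted(scores, key=lambda d: (-scores[d], d))
```

A user loop would call `suggest_terms`, let the user add some of the suggestions to the query, repeat as desired, and only then call `rank_documents`.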

Description

BACKGROUND OF THE INVENTION

1. Field of the Invention

This invention relates generally to document storage and retrieval systems and more particularly to a method of indexing documents so that they can be retrieved in response to a query in order of their relevance to the query. It also permits a general query to be easily modified based on the content of the documents so that the new query will retrieve documents that are relevant to the original query.

2. Description of the Prior Art

Document retrieval based on indexing of the documents in a document data base is well known. Typically the documents are indexed by creating an index file which records the documents that each word is in. Then when the user inputs a query, the documents that contain one or more words of the query can be quickly identified. However, if the query consists of general words that are not terms of art, the query may produce unsatisfactory retrieval results by either producing few documents that are of interest t...

Application Information

Patent Type & Authority: Patents (United States)
IPC (8): G06F17/30
CPC: G06F17/3061, G06F17/30616, G06F17/30722, Y10S707/99945, Y10S707/99935, G06F16/38, G06F16/313, G06F16/30
Inventors: KAGENECK, KARL-ERBO G.; YOUNG, TED
Owner: KAGENECK KARL ERBO G