Apparatus and Method for Conducting Searches with a Search Engine for Unstructured Data to Retrieve Records Enriched with Structured Data and Generate Reports Based Thereon

Search engine and structured data technology, applied to methods for searching an index of structured and unstructured incoming data. It addresses the inefficiency of the search refinement process and the lack of any guarantee that replacing old results with new results will be more useful.

Inactive Publication Date: 2008-05-01
INFORMATION BUILDERS INC
14 Cites, 333 Cited by

AI Technical Summary

Benefits of technology

[0021]The present invention overcomes the aforementioned problems of prior art search engines in providing a method for zeroing in on the hits returned by a search that are most relevant to the user. The method of the invention winnows down the number of hits returned by a search request thereby enabling the user to find only the one or more relevant items from a potentially much larger list of search results obtained from an inquiry to a search engine.
[0028]When the user clicks on a value within a node, the search results, which can number 1000 or more snippets corresponding to 1000 or more respective records, are filtered by passing a further search request as a meta query, i.e., a query to search for the selected tag-value pairs in the metadata with which the records have been enriched, and only the URLs containing that value remain displayed. Hence, the user is able to narrow the search results based on data contained in the search results, without having to further refine the original search request. Using the metadata to construct a search results navigation tree allows the end user to immediately perceive the underlying structure of the search results and to leverage the knowledge of the data to refine the initial search.
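The narrowing step described above can be sketched in Python. This is a minimal illustration, not the patented implementation: the record layout, the tag names (`DEPARTMENT`, `REGION`), the `example.com` URLs, and both function names are assumptions, and the in-memory filter stands in for the meta query that the text says is passed back to the search engine.

```python
from collections import Counter, defaultdict


def build_navigation_tree(records):
    """Group metadata values by tag and count the records carrying each value."""
    tree = defaultdict(Counter)
    for record in records:
        for tag, value in record["metadata"].items():
            tree[tag][value] += 1
    return {tag: dict(counts) for tag, counts in tree.items()}


def apply_meta_query(records, selected):
    """Keep only the records whose metadata matches every selected tag-value pair."""
    return [
        record
        for record in records
        if all(record["metadata"].get(tag) == value for tag, value in selected.items())
    ]


# Hypothetical search hits, each already enriched with tag-value metadata.
records = [
    {"url": "http://example.com/emp/1", "metadata": {"DEPARTMENT": "SALES", "REGION": "CENTRAL"}},
    {"url": "http://example.com/emp/2", "metadata": {"DEPARTMENT": "SALES", "REGION": "EAST"}},
    {"url": "http://example.com/emp/3", "metadata": {"DEPARTMENT": "MIS", "REGION": "CENTRAL"}},
]

# One node per tag, one child per value: the structure a navigation tree would display.
tree = build_navigation_tree(records)

# Clicking the value CENTRAL under the REGION node narrows the displayed hits.
narrowed = apply_meta_query(records, {"REGION": "CENTRAL"})
```

Counting the records under each value lets the tree show how many hits remain before the user clicks, which is one way to expose the structure of the result set without changing the original search request.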
[0036]Providing reports of the search results in dynamic tables allows a user to analyze the search results while not connected to the Internet or other network, and also allows a user to email the dynamic table to other users who can perform further analysis on the search results. In this way, the usefulness of the search results is extended beyond mere observation of the results to analysis of the results and retrieval of further relevant information.

Problems solved by technology

This limitation is pragmatic since the expectation is that if a user does not find the results within the top most relevant hits, it will be more efficient to refine the query than to page through all one million hits.
This process is inefficient, because:
Such snippets can also be misleading.
There is no guarantee that replacing the old results with new results will be more useful, given that the user refines the search without much knowledge about the structure and content of all 1000 previous results.
Even though the number of records having information of interest to a searcher might be very small, the number of hits could occupy many pages, most containing irrelevant information, making it very difficult for the searcher to find what was wanted.
If they are offline, they lose even the ability to sort by relevance or date; hence storing search results has little usefulness.
These limitations severely constrain the ability of users to efficiently analyze and manipulate search results to make faster and more informed decisions.
While this limitation may not be as obvious when searching completely unstructured data, such as word processing documents, it becomes quickly apparent when users search structured data sources.
A mere sequential listing of these records is not very useful.
Prior art search systems fail to make analysis, manipulation and storing of search results meaningful.

Examples

Example 1

[0088]An example of the preparation of data for indexing and searching in accordance with the invention follows in the context of a sporting goods business that wants to make its merchandise searchable.

[0089]First the following information is gathered.

[0090]1. An example of the XML file generated by the listener on the database from which the data will be selected.

[0091]2. The name of the database, in this case, retaildb.

[0092]3. The type of report to be called from the search result link, in this case, prddet.fex focexec, which resides in the retail application using the retaildb database.

[0093]4. Any links which are desired to appear below the main results link for calling the reports. Here there will be two links. The first will read “Product Sheet”, a link which displays a PDF version of the report. For this, prddet2.fex focexec, which resides in the retail application using the retaildb database, is used. The second link will read “Summary Report”, which displays a parameterize...

Example 2

[0145]Another example of the preparation of data for indexing and searching in accordance with the invention follows in the context of an enterprise that wants to make information on its employees searchable.

[0146]Metadata containing name-value pairs is added to each record as follows.

    that will be used to construct the navigation tree. “Name” stands for FIELD NAME and “Content” is the value retrieved from the field for each indexed record. Each Field / Value pair is encoded in the URL for the search results -->
    Employee Data for HENRY CHISOLM
    name=“description” / >
    Google ® appliance for search. This text is retrieved from any number of fields from the database. -->
    360 20041228 978-465-6080 MA NEWBURYPT 13:48:280.5 0.02 0 188 CHISOLM HENRY CE SALES 257PRB SALESSPECIALIST 43000.0000 1990-02-07 00:00:00.0 CENTRAL 1

[0147]During the pre-indexing preparation of the records in the population to be searched, a retrieval URL is generated for each indexed record with the me...
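The listing for this example states that each field/value pair is encoded in the URL for the search results. A minimal Python sketch of that idea follows, using the standard library's `urllib.parse`. The base URL and the field names (`LAST_NAME`, `FIRST_NAME`, `DEPARTMENT`, `REGION`) are assumptions chosen for illustration; only the values come from the sample employee record above, and the query-string layout is not necessarily the one the invention uses.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit


def build_retrieval_url(base_url, metadata):
    """Append the record's name-value pairs to the base URL as a query string."""
    return f"{base_url}?{urlencode(metadata)}"


def read_metadata(url):
    """Recover the name-value pairs from a retrieval URL."""
    return dict(parse_qsl(urlsplit(url).query))


# Field names are hypothetical; the values appear in the sample employee record.
metadata = {
    "LAST_NAME": "CHISOLM",
    "FIRST_NAME": "HENRY",
    "DEPARTMENT": "SALES",
    "REGION": "CENTRAL",
}

# The base URL is a placeholder for whatever report endpoint the link should call.
url = build_retrieval_url("http://example.com/report", metadata)
```

Because the pairs travel in the URL itself, a results page could rebuild the metadata for each hit with `read_metadata(url)` without a second lookup against the source database.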

Abstract

Records in databases or unstructured files are enriched with metadata and are indexed for retrieval by a search engine. In response to a search request, a graphical user interface (GUI) control based on the metadata associated with the search hits is constructed and displayed with the search results in a standard view. Selection of a metadata value via the GUI control filters the previously matched records down to those matching the selected value. The metadata in the search results is arranged in a tabular view which is embedded in the display of search results and rendered invisible until selected by the user. Reports can be constructed from an identifier in each returned record set for presenting, analyzing and modifying the data, and for generating further reports.
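As a rough illustration of arranging hit metadata in a tabular view, the Python sketch below collects the tags found across a set of hits into columns and writes one row per hit. The hit layout, the tag names, the second record, and the CSV export are all assumptions made for the example; the source does not specify a table format.

```python
import csv
import io


def metadata_table(hits):
    """Return (columns, rows): one column per tag seen, one row per hit."""
    columns = []
    for hit in hits:
        for tag in hit["metadata"]:
            if tag not in columns:
                columns.append(tag)
    rows = [[hit["metadata"].get(tag, "") for tag in columns] for hit in hits]
    return columns, rows


def to_csv(columns, rows):
    """Serialize the table so it can be saved or emailed for offline analysis."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(columns)
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical hits; the second record is invented purely for illustration.
hits = [
    {"title": "Employee Data for HENRY CHISOLM",
     "metadata": {"LAST_NAME": "CHISOLM", "DEPARTMENT": "SALES"}},
    {"title": "Employee Data (hypothetical second hit)",
     "metadata": {"LAST_NAME": "SMITH", "REGION": "EAST"}},
]

columns, rows = metadata_table(hits)
report = to_csv(columns, rows)
```

Hits that lack a given tag get an empty cell, so records from differently structured sources can share one table.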

Description

BACKGROUND OF THE INVENTION

[0001]The present invention provides a method for searching an index of structured and unstructured incoming data received from remote locations on a wide area network or global network, e.g. the Internet or an enterprise intranet. More specifically, the invention provides for capturing and enriching data records with metadata or appended data, and accessing the data through the use of a search engine designed for searching unstructured or free-form data.

[0002]Such search engines are in common use. Examples presented herein have been specifically tested for use with the familiar Google® search appliance and Internet search engine. However, the teachings herein are adaptable for use with other search appliances and engines useful in searching records on the Internet and on private intranets, often configured by business enterprises to enable access to data from diverse locations, e.g., the open source Lucene search engine licensed by the Apache Software Fou...

Application Information

Patent Type & Authority: Applications (United States)
IPC(8): G06F17/30, G06F3/048
CPC: G06F17/30864, G06F17/30861, G06F17/30554, G06F16/951, G06F16/248, G06F16/95
Inventors: COHEN, GERALD D.; KOTOROV, RADOSLAV P.; LAM, VINCENT; LENAHAN, PETER
Owner: INFORMATION BUILDERS INC