Although the computing processes required to store and retrieve electronic documents are well known, the sheer volume of documents and data stored in some databases can still make it difficult as a matter of practice to properly index and find the desired content in a timely fashion.
This is particularly true when considering that many databases contain documents with similar or identical content, thereby increasing the difficulty and
processing required to distinguish between the various documents.
For example, it can be difficult to achieve a desired speed for even a reasonable number of
queries per second when the index being searched corresponds to millions or billions of records.
One problem with the foregoing setup, which is provided by existing search services, is that the speed of the search is constrained by the speed of the slowest
server.
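By way of illustration only, the following minimal sketch assumes that a query is fanned out to several index servers in parallel and that results are merged only after every server has responded. The function names and the simulated delays are hypothetical and do not describe any particular search service.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def query_shard(delay_seconds):
    """Simulate one index server that takes delay_seconds to answer."""
    time.sleep(delay_seconds)
    return delay_seconds


def fan_out(delays):
    """Send the query to every simulated server at once and wait for all replies."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=len(delays)) as pool:
        results = list(pool.map(query_shard, delays))
    elapsed = time.perf_counter() - start
    return results, elapsed


if __name__ == "__main__":
    # Two fast servers and one slow server (hypothetical timings).
    results, elapsed = fan_out([0.01, 0.02, 0.2])
    # The overall time tracks the slowest server, not the average.
    print(f"slowest server: {max(results):.2f}s, total: {elapsed:.2f}s")
```

Under these assumptions, the total response time is bounded below by the slowest server, no matter how quickly the other servers reply.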
Another problem is the sheer number of servers and clusters that are required in the first place.
In particular, although computing equipment is becoming more affordable, it can still be quite expensive to provide the farm of servers that is necessary to adequately meet the demands of the public when it comes to searching databases of any significant magnitude, such as
the Internet, particularly when considering that the documents are indexed to an atomic level (each individual term).
One problem that can be encountered with a typical search is an erroneous spelling of a search term, which may result in a failed attempt at locating a desired document.
Although provisions have been made to remedy such errors, current search engines do not provide suggestions to a user regarding which terms will improve the efficiency or effectiveness of a search, or at least not until after the search has already been performed, if at all, thereby expending valuable
processing resources and
time processing search terms that might be irrelevant or relatively insignificant to the search.
Moreover, the spelling help that is provided does not relate to the success of a search result.
Accordingly, the spelling help, if any, merely requires misspelled words to be ‘atomically’ indexed along with the other indexed terms, thereby increasing the
processing and computing requirements for the larger index, and without providing any measurable efficiencies.
Another problem with existing search engines is that they indiscriminately expend resources searching for terms of relatively different significance.
In particular, even when more common
search terms are entered as part of a search, they are searched for just as the more unique terms are, thereby resulting in a significant volume of documents being identified that contain only the common terms.
This is particularly wasteful when the common
search terms do little to focus or narrow the search.
For example, when a search for a document includes a common term found in many documents, the processing required to analyze each of the documents containing the common term can be onerous and is certainly undesirable, particularly when considering that the processing required by the
search engine to provide any meaningful results necessarily requires the application of many different priority rules to each of the identified documents.
The application of the various priority rules must also be normalized into some sort of
score, which can be computationally expensive.
Keeping track of the relevance of the various search terms during the search process also slows down the servers.
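The cost of a common term can be illustrated with a minimal sketch of a term-level (atomic) inverted index. The sample documents, the whitespace tokenization, and the function names are assumptions made only for this example.

```python
from collections import defaultdict


def build_index(docs):
    """Map each individual term to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def candidates(index, query_terms):
    """Return every document containing any query term; each one must then be scored."""
    found = set()
    for term in query_terms:
        found |= index.get(term, set())
    return found


# Hypothetical four-document collection.
docs = {
    1: "the quick brown fox",
    2: "the lazy dog",
    3: "the cat sat on the mat",
    4: "a zebra crossed the road",
}
index = build_index(docs)

if __name__ == "__main__":
    print(len(candidates(index, ["zebra"])))         # 1 document to score
    print(len(candidates(index, ["the", "zebra"])))  # 4 documents to score
```

In this sketch, adding the common term "the" to the query causes every document in the collection to become a candidate for scoring, even though only one document contains the distinctive term.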
Searching for
electronic content can also be difficult even when the user knows exactly which document to search and is relatively familiar with the document or documents being searched.
Unless the user is able to craft the search request with the exact words, and in the exact order, in which they are found in the designated passage, the user is often inundated with irrelevant or erroneous results.
In other words, the user will be presented with many false leads.
Furthermore, even if the user is able to recite, as part of the search, all of the words in the desired passage and in the correct order, existing search processes still examine and process
electronic content and references that are irrelevant to the desired search, simply because they contain the recited terms, which are also found in various extraneous documents with which the user is unconcerned.
Existing search engines also fail to appreciate or effectively use synonyms as part of a search.
In particular, if a search term includes the word “made”, existing search engines fail to consider whether a synonymous word such as “create” would enable a better or more applicable result from the search.
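As a hedged illustration of what considering synonyms could look like, the following sketch expands each query term into a group of acceptable alternatives before matching. The synonym table, the function names, and the sample text are hypothetical and are not drawn from any existing search engine.

```python
# Hypothetical synonym table; a real system would need a far larger resource.
SYNONYMS = {"made": {"create", "created", "built"}}


def expand_query(terms, synonyms=SYNONYMS):
    """Turn each query term into a set holding the term and its synonyms."""
    return [{term} | synonyms.get(term, set()) for term in terms]


def matches(document_text, expanded):
    """A document matches when it contains at least one word from every group."""
    words = set(document_text.lower().split())
    return all(words & group for group in expanded)


if __name__ == "__main__":
    text = "who create the first telescope"
    # Exact-term matching misses the document because "made" is absent.
    print(matches(text, [{"made"}, {"telescope"}]))
    # Synonym expansion finds it through "create".
    print(matches(text, expand_query(["made", "telescope"])))
```

The design choice here is to expand the query rather than the index, so that the index itself does not grow with each added synonym.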
Yet another problem with existing search engines is that they fail to adequately account for the customized behavior of different users and the contexts in which the search is performed.
Current search engines do not adequately account for the contextual relevance of a search.
In this regard, many processing resources are again wasted.
The inability of existing search engines to adequately consider context is again based on the relatively myopic approach of atomic searching based on individual terms.
Yet another failing with existing search engines is the inability to adequately and meaningfully interpret
user input in real-time, while the search terms are being entered.
Instead, existing search engines wait until after the search terms have been completely entered before the search process even begins, thereby
wasting valuable time that could have been spent searching.
Search engines also fail to provide any adequate means for interpreting
user input and search terms that are only partially entered.
In particular, some of the foregoing inadequacies of existing search engines are easily overlooked when a user has the convenience of a full-sized keyboard and has ample time to provide completed input before processing of the input is to even begin.
However, in today's busy world, ample time is often unavailable, and the input interfaces of many compact computing devices can be difficult to manipulate, particularly those of portable computing devices such as telephones and PDAs. Such devices would benefit from search engines that are able to effectively utilize abbreviations, gross misspellings and shorthand text as part of a search, particularly in a real-time and helpful manner.