Information processing device, control method, program

The information processing device addresses the inefficiency of document search systems by classifying and aggregating documents, enhancing the efficiency of document retrieval by grouping and displaying similar documents effectively.

JP2026105107A · Pending · Publication Date: 2026-06-25 · CANON MARKETING JAPAN INC +1

Patent Information

Authority / Receiving Office
JP · JP
Patent Type
Applications
Current Assignee / Owner
CANON MARKETING JAPAN INC
Filing Date
2026-04-23
Publication Date
2026-06-25


Abstract

[Problem] To provide a mechanism that allows information related to documents to be checked efficiently. [Solution] An information processing device comprises a classification means for classifying documents into document groups, and a display control means for controlling the aggregation and display of information relating to the classified documents for each document group, wherein the display control means displays information indicating whether the document group contains multiple documents.

Description

Technical Field

[0001] The present invention relates to a technology for controlling the display of information related to documents.

Background Art

[0002] As the number of electronic documents within a company increases, the importance of a document search system for efficiently searching for documents necessary for business operations has been increasing. Here, a document search system is a system that presents a set of documents related to search conditions input by a user to the user. In this document search system, as a list of search results related to a search query, a large number of documents with similar content (similar documents) may be displayed. This is likely to occur when past versions are saved without being deleted or derivative documents are created and saved when creating and updating materials. The amount of information that can be quickly grasped by the user as search results is limited, and a situation where only similar documents are ranked at the top of the search results is a factor increasing the time required for the user to find the document they are seeking.

[0003] To solve the above problems, it is conceivable to establish operational rules regarding the data storage method, such as deleting old version documents, but it is not always the case that past versions or source documents are unnecessary, and it is difficult to thoroughly implement such operations in daily business activities.

[0004] Patent Document 1 describes a method of outputting various types of content in a limited screen area by aggregating and displaying search results for each genre for a set of content having hierarchical genres (categories). By increasing the number of displays for genres more highly related to the search query, it becomes possible to display a large number of contents of genres important to the user while also displaying contents of other genres.

[0005] Non-Patent Document 1 mentions a function of aggregating and displaying similar documents together when displaying a list of search results. By increasing the number of documents with different contents that can be confirmed on one screen, the efficiency of the document search operation is improved.

Prior Art Documents

Patent Documents

[0006] [Patent Document 1] Japanese Patent Publication No. 2013-106610

Non-Patent Documents

[0007] [Non-Patent Document 1] Sumitomo Electric Information Systems Co., Ltd., "More Advanced Extended Search | All-in-One Search & Information Utilization Solution QuickSolution", [online], March 1, 2021 (product release date), [searched October 1, 2021], Internet, <URL: https://www.sei-info.co.jp/quicksolution/functions/extension.html>

Disclosure of the Invention

Problems to Be Solved by the Invention

[0008] The technology described in Patent Document 1 is applicable when searchable content is pre-assigned to genres (categories) that can be aggregated, such as on product search websites. Setting the appropriate genre when registering internal company documents is burdensome for users. Furthermore, since documents displayed under the same genre are often filled with similar documents, it is unsuitable as a solution to the problems in internal company document search systems.

[0009] Regarding the technology described in Non-Patent Document 1, it states that "high similarity to the file body" is used as a criterion for determining similar documents, but the method for evaluating the similarity of the body is not obvious. Depending on the means of determining similarity, the search execution time may increase significantly.

[0010] The present invention aims to provide a mechanism that allows information related to documents to be checked efficiently.

Means for Solving the Problem

[0011] The information processing device of the present invention comprises a classification means for classifying documents into document groups, and a display control means for controlling the aggregation and display of information relating to the classified documents for each document group, wherein the display control means displays information indicating whether the document group contains multiple documents.

Effects of the Invention

[0012] According to the present invention, information related to documents can be checked efficiently.

Brief Description of the Drawings

[0013] [Figure 1] This figure shows an example of the system configuration of a document search system in an embodiment of the present invention.
[Figure 2] This block diagram shows an example of the hardware configuration of a document search system in an embodiment of the present invention.
[Figure 3] This figure shows an example of a document database in an embodiment of the present invention.
[Figure 4] This is an example of a search screen before similar document aggregation in an embodiment of the present invention.
[Figure 5] This is an example of a search screen after similar document aggregation in an embodiment of the present invention.
[Figure 6] This is an example of a search screen after similar document expansion in an embodiment of the present invention.
[Figure 7] This flowchart shows an example of the search result aggregation process in an embodiment of the present invention.
[Figure 8] This flowchart shows an example of a similar document search process in an embodiment of the present invention.

Modes for Carrying Out the Invention

[0014] Embodiments of the present invention will be described in detail below with reference to the drawings.

[0015] FIG. 1 is a diagram showing an example of the system configuration of a document search system according to an embodiment of the present invention.

[0016] The document search system 100 includes a document registration device 110, a document DB 120, a document search device 130, and a characteristic word update device 140.

[0017] The document registration device 110 is a device for registering documents to be searched by a user, and includes a document reception unit 111, a keyword extraction unit 112, and a document registration processing unit 113.

[0018] The document reception unit 111 is a device for receiving documents to be registered. A user can send any document to the document reception unit 111 through a web browser or the like. Alternatively, a configuration may be adopted in which a crawler mechanically collects and transmits documents.

[0019] The keyword extraction unit 112 is a device for extracting keywords that are candidates for characteristic words in the document and their frequencies of occurrence from the document received by the document reception unit 111. Details of the characteristic words will be described later. The keyword extraction process in the keyword extraction unit 112 uses a known morphological analysis technique. Here, the morphemes to be extracted may be limited to specific word types such as proper nouns according to the application of the search system. Alternatively, strings that match a predetermined pattern may be extracted as keywords without using morphological analysis.
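As an illustration of the pattern-based alternative mentioned above, the following is a minimal Python sketch that extracts keywords and their frequencies of occurrence without morphological analysis. The regular expression, the lower-casing, and the function name `extract_keyword_frequencies` are assumptions made for this example and are not taken from the embodiment.

```python
import re
from collections import Counter

# Hypothetical pattern: runs of letters or digits, optionally joined by hyphens.
TOKEN_PATTERN = re.compile(r"[A-Za-z0-9]+(?:-[A-Za-z0-9]+)*")


def extract_keyword_frequencies(body: str) -> Counter:
    """Return a mapping of lower-cased keyword -> frequency of occurrence."""
    return Counter(token.lower() for token in TOKEN_PATTERN.findall(body))


freqs = extract_keyword_frequencies("Cross-search is fast. Cross-search scales.")
print(freqs.most_common(1))
```

A morphological analyzer restricted to specific word types, as described above, would replace `TOKEN_PATTERN` while leaving the frequency counting unchanged.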

[0020] The document registration processing unit 113 is a device that associates the document received by the document reception unit 111 with the keywords extracted by the keyword extraction unit 112 and stores them in the document DB 120.

[0021] Figure 3 shows an example of the document database 120. The document database 120 includes a document ID 121 for uniquely identifying documents, a document name 122, a document body 123, a keyword frequency 124 for storing the values extracted by the keyword extraction unit 112, and an area for storing characteristic words 125. The method for creating characteristic words 125 will be described later. Although these five items are used as an example configuration to explain the invention, additional items used by a search system, such as a URL indicating the location of the document, the document size, and the document creator, may be included.
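The five items of a record in the document database 120 could be modeled, for example, as the following Python data class. This is only a sketch: the field names and types are assumptions, and the sample values are hypothetical placeholders rather than the contents of Figure 3.

```python
from dataclasses import dataclass, field


@dataclass
class DocumentRecord:
    document_id: int                                                  # document ID 121
    name: str                                                         # document name 122
    body: str                                                         # document body 123
    keyword_frequency: dict[str, int] = field(default_factory=dict)  # keyword frequency 124
    characteristic_words: list[str] = field(default_factory=list)    # characteristic words 125


# Hypothetical sample record; characteristic words are filled in later.
record = DocumentRecord(
    document_id=1,
    name="Sample planning document",
    body="(document body text)",
    keyword_frequency={"product X": 12, "planning": 7},
)
```

Optional items such as a URL, the document size, or the creator would be added as further fields.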

[0022] Returning to Figure 1, the document search device 130 consists of a search processing unit 131, a search result aggregation processing unit 132, and a search result output processing unit 133.

[0023] The search processing unit 131 is a device that receives search requests from users and searches the document database 120 for documents corresponding to the request, and has the function of retrieving documents related to the search request in order of score. In order to achieve efficient search processing, the document registration processing unit 113 can create an inverted index, which is a known technique, and use it during the search.
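A minimal sketch of an inverted index and a score-ordered lookup is shown below. The scoring rule (raw keyword frequency), the tie-break by document ID, and the function names are assumptions for illustration; the embodiment does not specify how the score is computed.

```python
from collections import defaultdict


def build_inverted_index(docs: dict[int, dict[str, int]]) -> dict[str, dict[int, int]]:
    """Map each keyword to a posting list of {document ID: frequency}."""
    index: dict[str, dict[int, int]] = defaultdict(dict)
    for doc_id, freqs in docs.items():
        for keyword, count in freqs.items():
            index[keyword][doc_id] = count
    return index


def search(index: dict[str, dict[int, int]], keyword: str) -> list[int]:
    """Return IDs of documents containing the keyword, highest score first.

    Assumption: the score is simply the keyword frequency in the document.
    """
    postings = index.get(keyword, {})
    return sorted(postings, key=lambda doc_id: (-postings[doc_id], doc_id))


# Hypothetical keyword frequencies per document ID.
docs = {
    1: {"product x": 3, "planning": 2},
    2: {"product x": 1, "screen": 4},
    3: {"sales": 5},
}
index = build_inverted_index(docs)
results = search(index, "product x")
```

Because the index is built at registration time, a query only touches the posting list of the requested keyword instead of scanning every document body.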

[0024] The search result aggregation processing unit 132 is a device that determines the similarity between documents for each document in the search results obtained by the search processing unit 131, and groups documents that are determined to be similar. The search result aggregation process in the search result aggregation processing unit 132 will be explained in detail later using an example.

[0025] The search result output processing unit 133 is a device that returns the search results obtained by the search processing unit 131 and the group information of similar documents obtained by the search result aggregation processing unit 132 to the client that sent the search request. Users of the document search system 100 can check the search results through a web browser or the like.

[0026] The characteristic word update device 140 is a device that extracts characteristic keywords as characteristic words for each document stored in the document DB 120 and updates the corresponding record. Characteristic words can be selected using tf-idf, one of the indices that represent how characteristic a word is of a document. The characteristic word update device 140 obtains the frequency of occurrence of each word from the keyword frequency 124 item in the document DB 120 and extracts up to N keywords as characteristic words in descending order of tf-idf value. Here, the maximum number N of characteristic words extracted affects the accuracy of the similar document determination described later, and N is desirably a large value such as 20 or more; for the sake of simplicity, however, the explanation below uses N=5. For example, in the document DB 120, the characteristic words for document 1 are the five items "product X", "cross-search", "high-speed", "planning", and "similar". The tf-idf values of keywords change as the set of documents included in the document DB 120 changes. However, updating the characteristic words of all documents every time a new document is registered would place a heavy load on the document search system 100 and could lead to an increase in search execution time. Therefore, the characteristic word update device 140 updates the characteristic words of documents at the following two timings.

[0027] (Update Method 1) When a new document is registered by the document registration processing unit 113, the characteristic words of the document are updated.

[0028] (Update Method 2) The document DB 120 is checked for updates according to a preset schedule, and if updates are found, the characteristic words of all documents are updated. For newly registered documents, characteristic words are extracted immediately by Update Method 1. For documents whose characteristic words have already been extracted, Update Method 2 allows the characteristic words to be refreshed periodically during off-peak periods, such as nightly batch processing or holidays, while suppressing increases in search execution time.
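The selection of up to N characteristic words in descending order of tf-idf value, as described in [0026], can be sketched as follows. The embodiment does not fix a particular tf-idf formula; this example assumes tf = (keyword count) / (total keyword count in the document) and idf = ln(number of documents / number of documents containing the keyword). The function name and the sample frequencies are hypothetical.

```python
import math


def select_characteristic_words(
    keyword_frequencies: dict[int, dict[str, int]], n: int = 5
) -> dict[int, list[str]]:
    """Return up to n characteristic words per document, highest tf-idf first."""
    doc_count = len(keyword_frequencies)

    # Document frequency: in how many documents each keyword appears.
    df: dict[str, int] = {}
    for freqs in keyword_frequencies.values():
        for keyword in freqs:
            df[keyword] = df.get(keyword, 0) + 1

    selected: dict[int, list[str]] = {}
    for doc_id, freqs in keyword_frequencies.items():
        total = sum(freqs.values())
        scores = {
            keyword: (count / total) * math.log(doc_count / df[keyword])
            for keyword, count in freqs.items()
        }
        # Ties are broken alphabetically so the result is deterministic.
        ranked = sorted(scores, key=lambda keyword: (-scores[keyword], keyword))
        selected[doc_id] = ranked[:n]
    return selected


# Hypothetical keyword frequencies; "the" appears everywhere, so its idf is 0.
frequencies = {
    1: {"product": 2, "planning": 3, "the": 5},
    2: {"screen": 4, "the": 5, "product": 1},
    3: {"sales": 6, "the": 5},
}
words = select_characteristic_words(frequencies, n=2)
```

Since the idf term depends on the whole document set, registering a document can change the characteristic words of other documents, which is why Update Method 2 recomputes them for all documents on a schedule.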

[0029] Figure 2 is a block diagram showing an example of a hardware configuration applicable to each device and database constituting the document search system 100 in an embodiment of the present invention.

[0030] As shown in Figure 2, in the information processing device, a CPU (Central Processing Unit) 201, ROM (Read Only Memory) 202, RAM (Random Access Memory) 203, input controller 205, video controller 206, memory controller 207, and communication I/F controller 208 are connected to one another via a system bus 204.

[0031] The CPU 201 provides comprehensive control over all devices and controllers connected to the system bus 204.

[0032] The ROM 202 or the external memory 211 holds the BIOS (Basic Input/Output System) and OS (Operating System), which are control programs executed by the CPU 201, as well as computer-readable and executable programs and various necessary data (including data tables) for realizing this information processing method.

[0033] The RAM 203 functions as the main memory, work area, etc., of the CPU 201. The CPU 201 loads the necessary programs, etc., from the ROM 202 or the external memory 211 into the RAM 203, and then executes the loaded programs to perform various operations.

[0034] The input controller 205 controls input from input devices such as a keyboard 209 or a pointing device such as a mouse (not shown). If the input device is a touch panel, the user can give various instructions by pressing (touching with a finger, etc.) icons, cursors, or buttons displayed on the touch panel.

[0035] Furthermore, the touch panel may be a multi-touch screen or other touch panel capable of detecting the positions of multiple fingers touching it.

[0036] The video controller 206 controls the display on an external output device such as the display 210. The display may be, for example, the display of a notebook computer integrated with the main unit. The external output device is not limited to a display; for example, it may be a projector. Furthermore, in the case of the aforementioned touch panel, the display also serves as an input device.

[0037] The video controller 206 can control the video memory (VRAM) used for display control. It can utilize a portion of the RAM 203 as the video memory area, or it can provide a separate, dedicated video memory.

[0038] The memory controller 207 controls access to the external memory 211. The external memory 211 stores boot programs, various applications, font data, user files, editing files, and other data, and may be an external storage device (hard disk), a flexible disk (FD), or a CompactFlash® memory connected to a PCMCIA card slot via an adapter.

[0039] The communication I/F controller 208 connects to and communicates with external devices via a network and performs communication control processing over the network. For example, it can communicate using TCP/IP, telephone lines such as ISDN, and 3G mobile phone lines.

[0040] Furthermore, the CPU 201 enables display on the display 210 by, for example, performing the process of expanding (rasterizing) outline fonts into the display information area in RAM 203. The CPU 201 also enables user input via a mouse cursor (not shown) on the display 210.

[0041] Next, the process of aggregating search results in an embodiment of the present invention will be explained using Figures 4 to 8.

[0042] Figure 4 shows an example of a search screen when similar documents are not aggregated. The search screen 400 consists of a search criteria input form 410, a search result summary 420, and a search result list 430. In the search result list 430, the document name 431, featured snippet 432, and characteristic words 433 are displayed as information for each document. Here, the featured snippet 432 is the text surrounding the position where the keyword entered as a search criterion appears; it can be obtained efficiently by storing each keyword's occurrence positions in the inverted index.
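How a snippet might be cut out of the document body from a stored occurrence position is sketched below. The window size, the "..." markers, and the function name `make_snippet` are assumptions for this example; in practice the position would come from the inverted index rather than from `str.index`.

```python
def make_snippet(body: str, position: int, keyword_length: int, window: int = 20) -> str:
    """Return the text around a keyword occurrence, padded by `window` characters."""
    start = max(0, position - window)
    end = min(len(body), position + keyword_length + window)
    prefix = "..." if start > 0 else ""
    suffix = "..." if end < len(body) else ""
    return prefix + body[start:end] + suffix


body = "This plan describes Product X, a cross-search tool for enterprises."
keyword = "Product X"
# Stand-in for a position looked up in the inverted index.
position = body.index(keyword)
snippet = make_snippet(body, position, len(keyword), window=10)
print(snippet)  # ...describes Product X, a cross-...
```

Storing positions at registration time avoids rescanning the full body of every hit when the result list is rendered.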

[0043] The search results list 430 is an example of the search results obtained by the search processing unit 131. Figure 4 shows the search results for documents containing the keyword "Product X". Similar documents such as "Product X - Product Planning Document - 20200106", "Product X - Product Planning Document - 20200105", and "Product X - Product Planning Document - 20200105-2" are listed at the top of the search results.

[0044] Figure 5 shows an example of a search screen that aggregates similar documents in an embodiment of the present invention. The search screen 500 aggregates and displays documents that have been determined to be similar through the search result aggregation process described later. For example, in the search screen 500, "Product X - Product Planning Document - 20200106", "Product X - Product Planning Document - 20200105", and "Product X - Product Planning Document - 20200105-2" from the previous example are aggregated, and only the document information for "Product X - Product Planning Document - 20200106" is displayed in the search result list. Here, the document presented as the information for a set of similar documents is called the representative document. Above the representative document, an expand button 501 for expanding and displaying the similar documents and an aggregated document count 502 indicating the number of aggregated documents are displayed together. Compared with the search screen 400, which does not aggregate similar documents, a greater variety of documents is displayed in the search result list, so the user can search for the desired document efficiently. Within a set of similar documents, the representative document may be the one with the highest score (the highest search ranking), or the one with the most recent creation date or update date. The former makes the representative document one that is highly relevant to the search criteria, while the latter makes it the one with the most up-to-date information. Where the latest document matters, as with design documents or regulations, using the most recently updated document as the representative can reduce the time needed to find the desired document. Since the appropriate method for selecting a representative document differs depending on the use case, the search screen 500 may include a user interface for switching the selection criterion.
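The switchable selection criterion for the representative document can be illustrated with the following sketch. The `Hit` class, the criterion keys, and the sample scores and dates are all hypothetical; the embodiment only states that the highest score or the most recent date may be used.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Hit:
    name: str
    score: float   # search score (higher ranks first)
    updated: date  # update date


# Hypothetical criterion table that a user interface could switch between.
CRITERIA = {
    "score": lambda hit: hit.score,
    "updated": lambda hit: hit.updated,
}


def choose_representative(group: list[Hit], criterion: str = "score") -> Hit:
    """Pick the representative of a set of similar documents."""
    return max(group, key=CRITERIA[criterion])


group = [
    Hit("Spec v1", 0.92, date(2020, 1, 5)),
    Hit("Spec v2", 0.88, date(2020, 1, 6)),
]
by_score = choose_representative(group, "score")
by_date = choose_representative(group, "updated")
```

With this sample data the two criteria pick different representatives, which is the situation in which a switch on the search screen is useful.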

[0045] Figure 6 shows an example of the search screen after the expand button 501 on the search screen 500 is pressed. In the search result list, the similar documents associated with the expand button 501 are displayed in descending order of the criterion used to select the representative document, and the differences between each similar document and the representative document are highlighted. For example, "Product X - Product Planning Document - 20200105" and "Product X - Product Planning Document - 20200105-2" are displayed as similar documents to "Product X - Product Planning Document - 20200106," and the snippets and characteristic words of each similar document are highlighted in bold or in a different color where they differ from those of the representative document "Product X - Product Planning Document - 20200106." Presenting the differences from the representative document makes it easier for the user to find a more appropriate document even when the desired document is inside an aggregated set. For the sake of simplicity, only a basic example of difference display has been shown. However, features that make differences between documents easier to distinguish may be implemented, such as a function that retrieves the full text of the documents, rather than only the snippets, to view the differences.
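A basic form of the difference display for characteristic words can be sketched as below. The `**` marker stands in for bold or colored rendering and, like the function name, is an assumption of this example; the word lists are those given in this description for the two planning documents.

```python
def mark_differences(words: list[str], representative_words: list[str], marker: str = "**") -> list[str]:
    """Wrap words that do not appear in the representative document with a marker."""
    representative = set(representative_words)
    return [w if w in representative else f"{marker}{w}{marker}" for w in words]


representative_words = ["product X", "cross-search", "high-speed", "planning", "similar"]
similar_words = ["product X", "cross-search", "high-speed", "planning", "related"]
marked = mark_differences(similar_words, representative_words)
print(marked)  # only "related" is marked
```

The same idea extends to snippets or full text by replacing the set comparison with a text diff.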

[0046] Figure 7 is a flowchart of the search result aggregation process in the search result aggregation processing unit 132, showing the process of determining the set of representative documents to be displayed in the list of search results, and the set of similar documents for each representative document.

[0047] First, in step S701, the representative document set G is initialized to an empty state, and in steps S702 to S706, each document α included in the search results is classified as either a representative document or a document similar to a representative document.

[0048] In step S703, the representative document set G is searched for a document β that is similar to document α. The process for searching for similar documents will be described later.

[0049] Subsequently, in step S704, the presence or absence of a similar document β is determined. If a similar document β exists, in step S705, document α is added to the set of similar documents to which document β belongs. Here, if a method other than the highest score (the highest search ranking) is adopted for selecting the representative document, document α and similar document β are compared using the predetermined method, and the more appropriate document is adopted as the representative document. For example, if the representative document is the one with the most recent update date, the update dates of document α and document β are compared, and the document with the more recent date is treated as the representative document. If document α is deemed more appropriate as the representative document, document β and the set of similar documents to which document β belongs become the set of similar documents with document α as the representative document, and document α is registered in place of document β in the representative document set G.

[0050] If no similar document β exists in step S704, then in step S706, document α is added to the representative document set G.

[0051] After performing the above processing for each document included in the search results, the process terminates in step S707 by returning the representative document set G as the aggregated result of similar documents.
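The flow of Figure 7 can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the similarity search of step S703 is passed in as a function, `prefer` is a hypothetical callback that decides whether document α should replace the current representative, and the sample documents and the prefix-based similarity rule are invented for the example.

```python
from typing import Callable, Optional


def aggregate(
    results: list[str],
    find_similar: Callable[[str, list[str]], Optional[str]],
    prefer: Callable[[str, str], bool] = lambda alpha, beta: False,
) -> list[tuple[str, list[str]]]:
    """Group search results into (representative, similar documents) pairs."""
    groups: list[tuple[str, list[str]]] = []        # S701: G starts empty
    for alpha in results:                           # S702: results in ranking order
        representatives = [rep for rep, _ in groups]
        beta = find_similar(alpha, representatives) # S703: search G for a similar document
        if beta is None:                            # S704: no similar document
            groups.append((alpha, []))              # S706: alpha becomes a representative
            continue
        i = representatives.index(beta)
        rep, similar = groups[i]
        if prefer(alpha, rep):                      # S705: alpha replaces the representative
            groups[i] = (alpha, [rep] + similar)
        else:                                       # S705: alpha joins beta's similar set
            similar.append(alpha)
    return groups                                   # S707: return the aggregated result


def same_prefix(alpha: str, representatives: list[str]) -> Optional[str]:
    """Hypothetical similarity rule: names sharing the text before '-' are similar."""
    key = alpha.split("-")[0]
    return next((rep for rep in representatives if rep.split("-")[0] == key), None)


by_rank = aggregate(["plan-v3", "plan-v2", "design", "plan-v1"], same_prefix)
by_newest = aggregate(["plan-v2", "plan-v3", "design"], same_prefix,
                      prefer=lambda alpha, beta: alpha > beta)
```

With the default `prefer`, the first document found in ranking order stays the representative, which corresponds to selecting by highest score.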

[0052] Figure 8 is a flowchart of the similar document search process in step S703. In steps S801 to S804, it is determined whether a document β similar to document α exists in the representative document set G. If it does, the document β is returned in step S805. If no similar documents exist in the representative document set G, the similar document search process terminates, assuming that no similar documents exist for document α.

[0053] Step S802 is a process that compares the characteristic words in document α and document β, and determines that they are potentially similar documents if M% or more of the characteristic words match. M is a predetermined threshold, and in the following explanation, M=80, but it can also be dynamically determined from information such as document size. In step S802, it becomes possible to exclude documents with significantly different characteristic words from the list of similar document candidates, allowing for more efficient determination of similar documents.

[0054] Step S803 is a process that determines whether document α has a sufficient number of characteristic words. For documents that do not have enough characteristic words, such as documents with a small amount of text, it may be inappropriate to determine similar documents based solely on characteristic words. If there are L or more characteristic words, document β is detected as a similar document in step S805. In the following explanation, L=4, but L may be determined dynamically according to the maximum number N of characteristic words extracted by the characteristic word update device 140.

[0055] If there are fewer than L characteristic words in step S803, a process is performed in step S804 to determine similar documents based on the similarity of document names. The similarity of document names is evaluated by a known technique such as the Levenshtein distance (edit distance), and it is checked whether the distance is less than or equal to a predetermined value R. If the edit distance between the document names is less than or equal to R, document β is detected as a similar document in step S805. Hereinafter, R=2 will be used for the explanation, but R may be determined dynamically depending on the length of the document name. In addition, before the edit distance is calculated, strings indicating the date or version number may be removed from each document name. Furthermore, in addition to document names, more detailed processing may be added to determine the similarity between documents, such as similarity of extensions, similarity of snippets, and similarity of file sizes.
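The similar document search of Figure 8 can be sketched as follows, using M=80, L=4, and R=2. Two points are assumptions of this sketch rather than statements of the embodiment: the match percentage is computed against the number of characteristic words of document α, and a document with no characteristic words falls through to the name comparison. The class and parameter names are hypothetical; the sample documents use names and characteristic words given in this description.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Doc:
    name: str
    words: frozenset  # characteristic words


def levenshtein(a: str, b: str) -> int:
    """Edit distance by dynamic programming over two rows."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]


def find_similar(alpha: Doc, representatives: list, match_percent: int = 80,
                 min_words: int = 4, max_distance: int = 2) -> Optional[Doc]:
    for beta in representatives:                                    # S801: loop over G
        matched = len(alpha.words & beta.words)
        if matched * 100 < match_percent * len(alpha.words):        # S802: below M%
            continue
        if len(alpha.words) >= min_words:                           # S803: enough words
            return beta                                             # S805
        if levenshtein(alpha.name, beta.name) <= max_distance:      # S804: name check
            return beta                                             # S805
    return None                                                     # no similar document


plan_0106 = Doc("Product X - Product Planning Document - 20200106",
                frozenset({"product X", "cross-search", "high-speed", "planning", "similar"}))
plan_0105 = Doc("Product X - Product Planning Document - 20200105",
                frozenset({"product X", "cross-search", "high-speed", "planning", "related"}))
design = Doc("Search Screen Design Document", frozenset({"screen", "product X"}))
design_beta = Doc("Search Screen Design Document_β", frozenset({"screen", "product X"}))

assert find_similar(plan_0105, [plan_0106]) is plan_0106            # 4 of 5 words match
assert find_similar(design, [plan_0106]) is None                    # 1 of 2 words match
assert find_similar(design_beta, [plan_0106, design]) is design     # name distance is 2
```

The characteristic word comparison in S802 is a cheap set operation, so the more expensive edit distance is computed only for the few candidates that pass it and have too few characteristic words.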

[0056] The above steps will now be explained in detail using an example that aggregates the seven search results on the search screen 400 ("Product X - Product Planning Document - 20200106" to "Sales Schedule for March 2021").

[0057] First, for the top-ranking search result, "Product X - Product Planning Document - 20200106", the representative document set G is empty, so no similar document exists in the representative document set G. Therefore, in step S706, "Product X - Product Planning Document - 20200106" is added to the representative document set G.

[0058] For the next search result, "Product X - Product Planning Document - 20200105," it is determined whether it is similar to "Product X - Product Planning Document - 20200106," which is included in the representative document set G. First, in step S802, it is determined whether M (=80)% or more of the characteristic words match. The characteristic words for "Product X - Product Planning Document - 20200105" from the document DB 120 are "Product X," "Cross-search," "High-speed," "Planning," and "Related," and four of these five characteristic words match the characteristic words of "Product X - Product Planning Document - 20200106." Therefore, 80% or more of the characteristic words match, and the process proceeds to step S803.

[0059] In step S803, since "Product X - Product Planning Document - 20200105" has L (=4) or more characteristic words, in step S805 the representative document "Product X - Product Planning Document - 20200106" is returned as a similar document to "Product X - Product Planning Document - 20200105". In this way, for documents with a sufficient number of characteristic words, similar documents can be determined efficiently.

[0060] Returning to step S704, it is determined that a similar document exists, and in step S705, "Product X - Product Planning Document - 20200105" is added to the set of similar documents of "Product X - Product Planning Document - 20200106". If the representative document were selected by the most recent update date, the update dates of "Product X - Product Planning Document - 20200106" and "Product X - Product Planning Document - 20200105" would be compared and the newer one would be adopted as the representative document. However, in this embodiment, the representative document is selected by the higher score (higher search ranking), so the representative document remains "Product X - Product Planning Document - 20200106", which was found first. The same applies to subsequent steps.

[0061] The next search result, "Product X - Product Planning Document - 20200105-2," is added to the set of similar documents of "Product X - Product Planning Document - 20200106" through the same process.

[0062] Regarding the "Search Screen Design Document," there are two characteristic words: "screen" and "product X." Only one of them matches the characteristic words of the representative document "Product X - Product Planning Document - 20200106." In step S802, the proportion of matching characteristic words is less than M (=80)%, so the representative document is not adopted as a similar document. Since there are no other representative documents, no similar document exists for this document.

[0063] In step S704, since no similar document exists, in step S706, "Search Screen Design Document" is added to the representative document set G. At this point, the representative document set G contains two documents: "Product X - Product Planning Document - 20200106" and "Search Screen Design Document," and "Product X - Product Planning Document - 20200105" and "Product X - Product Planning Document - 20200105-2" are linked as similar documents to "Product X - Product Planning Document - 20200106."

[0064] Regarding "Search Screen Design Document_β," the characteristic words are "screen" and "product X," which are 100% identical to the characteristic words of the representative document, "Search Screen Design Document." However, in step S803, since there are not L (=4) or more characteristic words, the similarity between documents is further checked.

[0065] In step S804, the edit distance (Levenshtein distance) between the document names "Search Screen Design Document_β" and "Search Screen Design Document" is 2. Since this is less than or equal to the predetermined value R (=2), "Search Screen Design Document" is detected as a similar document to "Search Screen Design Document_β".

[0066] Returning to step S704, it is determined that a similar document exists, and in step S705, "Search Screen Design Document_β" is added to the set of similar documents for "Search Screen Design Document".

[0067] For the "Administrator Screen Design Document," there are two characteristic words: "screen" and "product X." As with "Search Screen Design Document_β," the edit distance from "Search Screen Design Document" is evaluated in step S804. The edit distance between "Administrator Screen Design Document" and "Search Screen Design Document" is 3, which is not less than or equal to R (=2). Therefore, it is determined that no similar document exists for this document.

[0068] Returning to step S704, since no similar document exists, in step S706, "Administrator Screen Design Document" is added to the representative document set G.

[0069] Finally, regarding "Sales Schedule for March 2021," the characteristic words are "target," "sales," "plan," "sales," and "region." Since there are no representative documents that match the characteristic words by M(=80)% or more, there are no similar documents to this document. In step S706, "Sales Schedule for March 2021" is added to the representative document set G, and in S707, the representative document set G is returned, ending the aggregation process of the search results.

[0070] As described above, in the embodiment of the present invention, similar documents in the search results can be aggregated efficiently using the characteristic words obtained for each document by the characteristic word update device 140. This makes it possible not only to simply list the search results, but also to aggregate and display similar documents as shown in Figure 5, or to expand the similar documents and compare them with the representative document as shown in Figure 6.

[0071] Furthermore, the display of search results is just one example, and the present invention is applicable to mechanisms for displaying documents in a list.

[0072] Furthermore, although selecting the document with the highest score (the highest search ranking) and selecting the document with the most recent creation or update date have been shown as examples of methods for selecting the representative document, the methods are not limited to these, and a method suited to the purpose and other factors may be adopted.

[0073] It should be noted that the structure and content of the various data described above are not limited to those mentioned, and it goes without saying that they can be composed of various structures and contents depending on the use and purpose.

[0074] Furthermore, the program in this invention is a program that allows a computer to execute the processing methods of each flowchart. The program in this invention may also be a separate program for each processing method of each device in each flowchart.

[0075] As described above, it goes without saying that the object of the present invention can also be achieved by supplying a recording medium containing a program that realizes the functions of the embodiments described above to a system or device, and by having the computer (or CPU or MPU) of that system or device read and execute the program stored on the recording medium.

[0076] In this case, the program read from the recording medium itself realizes the novel function of the present invention, and the recording medium on which that program is recorded constitutes the present invention. For recording media used to supply programs, examples include flexible disks, hard disks, optical disks, magneto-optical disks, CD-ROMs, CD-Rs, DVD-ROMs, magnetic tapes, non-volatile memory cards, ROMs, EPROMs, silicon disks, and the like.

[0077] Furthermore, it goes without saying that the functions of the aforementioned embodiments are realized not only when the computer executes the program it has read, but also when the operating system (OS) running on the computer performs some or all of the actual processing based on the instructions of that program.

[0078] Furthermore, it goes without saying that this also includes cases where, after a program read from a recording medium is written to the memory of a function expansion board inserted into a computer or a function expansion unit connected to a computer, the CPU or other components of the function expansion board or function expansion unit perform some or all of the actual processing based on the instructions of the program code, and the functions of the aforementioned embodiments are realized through that processing.

[0079] Furthermore, the present invention may be applied to a system consisting of multiple devices or to an apparatus consisting of a single device. It goes without saying that the present invention can also be applied when it is achieved by supplying a program to a system or device. In this case, by loading a recording medium containing a program for achieving the present invention into the system or device, the system or device can enjoy the effects of the present invention.

[0080] Furthermore, by downloading and reading the program for achieving the present invention from a server, database, etc. on a network using a communication program, the system or device can enjoy the effects of the present invention. It should be noted that configurations combining the above-described embodiments and their variations are all included in the present invention.

Explanation of Symbols

[0081]
100 Document search system
110 Document registration device
120 Document database
130 Document retrieval device
140 Feature word update device

Claims

1. An information processing apparatus comprising: a classification means for classifying documents into document groups; and a display control means for controlling information relating to the classified documents so that it is aggregated and displayed for each document group, wherein the display control means displays information indicating whether the document group contains multiple documents.

2. The information processing apparatus according to claim 1, characterized in that the display control means displays information relating to a document representing the document group and information indicating whether other documents are included in the document group.

3. The information processing apparatus according to claim 2, further comprising an identifying means for identifying a document that represents the document group according to predetermined criteria.

4. The information processing apparatus according to claim 3, wherein the identifying means identifies a document that represents the document group based on information relating to the date and time of the documents included in the document group.

5. The information processing apparatus according to claim 4, characterized in that the display control means displays information relating to a document in order based on information relating to the date and time of the document included in the document group.

6. The information processing apparatus according to claim 3, wherein the documents are documents retrieved according to specified conditions, and the identifying means identifies a document that represents the document group based on information relating to the ranking, in the search according to the conditions, of the documents included in the document group.

7. The information processing apparatus according to claim 6, characterized in that the display control means displays information relating to a document in order based on information relating to the ranking of documents included in the document group that were searched according to the conditions.

8. The information processing apparatus according to claim 1 or 2, characterized in that the display control means displays information relating to the number of documents included in the document group.

9. The information processing apparatus according to claim 2, characterized in that the display control means displays a reception unit that receives instructions to expand and display information relating to documents included in the document group.

10. The information processing apparatus according to claim 1 or 2, characterized in that the display control means compares and displays information relating to a plurality of documents included in the document group.

11. The information processing apparatus according to claim 1 or 2, characterized in that the classification means classifies based on the similarity between information relating to documents.

12. The information processing apparatus according to claim 11, characterized in that the classification means classifies based on the degree of agreement between characteristic words obtained from a document.

13. The information processing apparatus according to claim 11, characterized in that the classification means classifies based on the relationships between the names of the documents.

14. The information processing apparatus according to claim 1 or 2, characterized in that the documents are documents retrieved according to specified conditions.

15. A control method for an information processing apparatus, comprising: a classification step in which a classification means classifies documents into document groups; and a display control step in which a display control means controls information relating to the classified documents so that it is aggregated and displayed for each document group, wherein the display control step displays information indicating whether the document group contains multiple documents.

16. A program executable in an information processing apparatus, the program causing the information processing apparatus to function as: a classification means for classifying documents into document groups; and a display control means for controlling information relating to the classified documents so that it is aggregated and displayed for each document group, wherein the display control means displays information indicating whether the document group contains multiple documents.