Device for drawing document correlation diagram where documents are arranged in time series
Correlation-diagram and time-series technology, applied in fields such as computer components, instruments, and calculation, addresses problems such as accumulated deviations, unclear branch meanings, and the inability to properly represent the temporal development of a field, and has the effect of reducing misclassification.
Examples
Embodiment 2
[0287]
[0288] The Codimensional Reduction Method, like Embodiment 1 (Balanced Cutting Method; BC Method), uses association rules to determine the cutting position of the dendrogram. Embodiment 1 uses parameters that can be obtained from the geometric shape of the dendrogram, taking the combination height between elements as the cutting position. Embodiment 2 instead uses the index-term dimensions that represent differences between the document element vectors to decide where to cut.
[0289] The basics of association rule analysis were described in Embodiment 1 and are omitted here. First, the parameters used for association rule analysis in Embodiment 2 are described, focusing on how they differ from those of Embodiment 1.
[0290]
[0291] When a certain node c is given in the dendrogram, its combination level is represented by an integer i(c...
Embodiment 3
[0327]
[0328] In the Cell Division Method, the dendrogram is first cut at a cutting height α determined by a certain method, and the parent clusters are extracted. Then, to divide each parent cluster into sub-clusters, a dendrogram of that part is created again using only the document elements belonging to that parent cluster. When this partial dendrogram is created, any index-term dimension whose deviation among the document element vector components within the parent cluster is smaller than a value determined by a predetermined method is removed before the analysis.
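The two-stage procedure in this paragraph can be illustrated with a minimal Python sketch. It is not the implementation of the embodiment. Several choices are assumptions made only for illustration: average linkage with Euclidean distance, the population standard deviation as the "deviation", and the names `cut_clusters`, `cell_division`, `min_deviation`, and `sub_alpha`. The source does not specify how α or the deviation threshold is determined, so both are passed in as plain parameters.

```python
import math
from statistics import pstdev


def distance(u, v):
    """Euclidean distance between two equal-length vectors (assumed metric)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def cut_clusters(vectors, alpha):
    """Average-linkage agglomerative clustering, stopped at height alpha.

    Merging stops once the closest pair of clusters is farther apart than
    alpha, which for a monotone linkage such as average linkage gives the
    same clusters as cutting the full dendrogram at alpha.
    Returns clusters as sorted lists of indices into `vectors`.
    """
    clusters = [[i] for i in range(len(vectors))]

    def linkage(a, b):
        total = sum(distance(vectors[i], vectors[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > 1:
        height, p, q = min(
            (linkage(clusters[p], clusters[q]), p, q)
            for p in range(len(clusters))
            for q in range(p + 1, len(clusters))
        )
        if height > alpha:
            break
        merged = sorted(clusters[p] + clusters[q])
        clusters = [c for k, c in enumerate(clusters) if k not in (p, q)]
        clusters.append(merged)
    return sorted(clusters)


def cell_division(vectors, alpha, min_deviation, sub_alpha):
    """Return a list of (parent_cluster, sub_clusters) pairs.

    min_deviation and sub_alpha are hypothetical parameters standing in for
    the values "determined by a predetermined method" in the text.
    """
    result = []
    for parent in cut_clusters(vectors, alpha):
        members = [vectors[i] for i in parent]
        # Keep only dimensions that vary enough inside this parent cluster.
        keep = [
            d for d in range(len(vectors[0]))
            if pstdev([v[d] for v in members]) >= min_deviation
        ]
        if keep:
            reduced = [[v[d] for d in keep] for v in members]
            local = cut_clusters(reduced, sub_alpha)
        else:
            # Assumption: with no dimension left, the parent is not split.
            local = [list(range(len(parent)))]
        subs = [[parent[k] for k in sub] for sub in local]
        result.append((parent, subs))
    return result
```

Removing the low-deviation dimensions before re-clustering means the partial dendrogram is driven only by the index terms that actually distinguish documents within the parent cluster, which is the point of the second stage.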
[0329]
[0330] Fig. 11 is a flowchart illustrating the cluster extraction procedure in Embodiment 3 (cell division method; CD method). It shows the procedure of the third embodiment in more detail than Fig. 3. Steps that are the same as in Fig. 3 are given the Fig. 3 step number plus 300, so that the last two digits match the Fig. 3 step number, and descriptions that overlap with Fig. 3 are sometimes omitted...
Embodiment 8
[0475]
[0476] Time Slice Analysis is a method of classifying the plurality of document elements to be analyzed based on time data and then performing cluster analysis within each time classification. It differs from the sixth and seventh embodiments described above in that the analysis based on time data is performed before clusters are extracted based on content data. After the classification based on time data and the cluster analysis within each time classification are completed, lines are drawn between elements belonging to clusters in earlier and later time classifications, and the document correlation diagram is completed.
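The order of operations described here (classify by time, cluster within each time classification, then connect elements across time) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the method of the embodiment. The `Doc` type, the `slice_of` function, the use of cosine similarity, the single-linkage threshold clustering, and the rule that links each element to its most similar element in the preceding time classification are all hypothetical choices; the source does not specify them at this point.

```python
import math
from collections import defaultdict
from typing import NamedTuple


class Doc(NamedTuple):
    doc_id: str
    year: int
    vector: list


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def cluster_slice(docs, threshold):
    """Group docs into connected components of similarity >= threshold."""
    clusters = []
    for doc in docs:
        near, far = [], []
        for c in clusters:
            hit = any(cosine(doc.vector, m.vector) >= threshold for m in c)
            (near if hit else far).append(c)
        # Merge the new doc with every cluster it touches.
        clusters = far + [[doc] + [m for c in near for m in c]]
    return [sorted(m.doc_id for m in c) for c in clusters]


def time_slice_analysis(docs, slice_of, threshold):
    # Step 1: classify by time data before looking at content.
    slices = defaultdict(list)
    for doc in docs:
        slices[slice_of(doc.year)].append(doc)
    ordered = sorted(slices)

    # Step 2: cluster analysis within each time classification.
    clusters = {key: cluster_slice(slices[key], threshold) for key in ordered}

    # Step 3 (assumed rule): link each element to its most similar element
    # in the preceding non-empty time classification.
    edges = []
    for earlier, later in zip(ordered, ordered[1:]):
        for doc in slices[later]:
            best = max(slices[earlier], key=lambda d: cosine(d.vector, doc.vector))
            if cosine(best.vector, doc.vector) >= threshold:
                edges.append((best.doc_id, doc.doc_id))
    return clusters, edges
```

For example, calling `time_slice_analysis(docs, lambda year: year // 10, 0.9)` would use decades as the time classifications. The returned `edges` correspond to the lines drawn between time classifications, and `clusters` gives the grouping inside each one.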
[0477]
[0478] Fig. 34 is a diagram explaining, in more detail than Fig. 2, the configuration and functions of the document correlation diagram creation device in Embodiment 8 (time-sectional analysis; TSA). Parts that are the same as in Fig. 2 are denoted by the same symbols, and their explanations are omitted.
[0479] The file correlation diagram making device ...