Method and device for creating full-text index data
A technique for creating full-text index data, applied in fields such as digital data processing and special data processing applications. It addresses problems such as declining efficiency when creating index data, and achieves the effects of reducing word segmentation time and using computing resources effectively.
Examples
Embodiment 1
[0043] Figure 1 is a flow chart of the method for creating full-text index data provided by Embodiment 1 of the present invention. This embodiment is applicable to the creation of full-text index data. The method can be executed by a device for creating full-text index data; the device can be implemented in software and/or hardware and can be integrated in a database system.
[0044] Referring to Figure 1, the method for creating full-text index data includes:
[0045] S110. Perform word segmentation on the documents in parallel, and record word positions and word marks.
[0046] Figure 2 is a schematic diagram of the process in the method for creating full-text index data provided by Embodiment 1 of the present invention. Referring to Figure 1 and Figure 2, for example, multiple parallel processing threads can be used for word segmentation, and the specific number of threads can be adjusted according to the user's actual hardware resources...
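Step S110 can be illustrated with a minimal Python sketch. It is not the patented implementation. The regular-expression tokenizer, the use of the lowercased token as the word mark, the token ordinal as the word position, and the names segment and segment_parallel are all assumptions made for illustration; the source does not specify them.

```python
import re
from concurrent.futures import ThreadPoolExecutor

# Assumption: a simple regex tokenizer stands in for the real word segmenter.
TOKEN = re.compile(r"\w+")

def segment(item):
    """Segment one (doc_id, text) item into (word_mark, doc_id, position) records.

    Assumption: the word mark is the lowercased token and the position is
    the token's ordinal within the document.
    """
    doc_id, text = item
    return [(m.group().lower(), doc_id, pos)
            for pos, m in enumerate(TOKEN.finditer(text))]

def segment_parallel(items, workers=4):
    """Segment documents on several threads; workers is tunable to the hardware."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in input order, so records stay in document order.
        per_doc = list(pool.map(segment, items))
    return [record for doc in per_doc for record in doc]

records = segment_parallel([(1, "full text index"), (2, "text index data")])
```

The workers parameter plays the role of the adjustable thread count. In standard CPython builds, threads do not speed up CPU-bound tokenization, so ProcessPoolExecutor from the same module may be a better fit there.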
Embodiment 2
[0055] Figure 3 is a schematic flowchart of the method for creating full-text index data provided by Embodiment 2 of the present invention. This embodiment builds on the above embodiment. Further, before word segmentation is performed on the documents in parallel, the following steps are added: number the documents to be indexed and generate a data item from each number and the corresponding document content; then fill the data items into memory blocks of a preset size.
[0056] Referring to Figure 3, the method for creating full-text index data includes:
[0057] S210. Number the documents to be indexed, and generate a data item from each number and the corresponding document content.
[0058] Data items are encapsulated in order of Doc ID (document number), and each data block contains the Doc IDs in order together with the corresponding document content.
[0059] S220. Fill the data item into a memory block with a preset size.
[0060] Exemplarily, the main thread can pre-read the documen...
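Steps S210 and S220 can be sketched as follows. This is a minimal illustration under stated assumptions, not the method as claimed: the names make_items and fill_blocks, the measurement of block size in UTF-8 bytes of document content, and the choice to give an oversized document a block of its own are all hypothetical.

```python
def make_items(documents, start_id=1):
    """Number the documents and pair each Doc ID with its content (S210)."""
    return [(doc_id, text) for doc_id, text in enumerate(documents, start=start_id)]

def fill_blocks(items, block_size):
    """Pack data items, in Doc ID order, into blocks of a preset size (S220).

    Assumption: size is counted in UTF-8 bytes of content, and a document
    larger than block_size is placed alone in its own block.
    """
    blocks, current, used = [], [], 0
    for doc_id, text in items:
        size = len(text.encode("utf-8"))
        # Start a new block when the next item would overflow the current one.
        if current and used + size > block_size:
            blocks.append(current)
            current, used = [], 0
        current.append((doc_id, text))
        used += size
    if current:
        blocks.append(current)
    return blocks

items = make_items(["aaaa", "bbbb", "cc", "dddddd"])
blocks = fill_blocks(items, block_size=8)
```

Because items are appended in numbering order, every block holds a contiguous, ordered run of Doc IDs, which matches the ordering described in paragraph [0058].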
Embodiment 3
[0067] Figure 4 is a schematic flowchart of the method for creating full-text index data provided by Embodiment 3 of the present invention. This embodiment builds on the above embodiments. Further, after identical words are classified, the following step is added: encapsulate the classified words to generate an encapsulated data packet. The encapsulated data packet includes a word mark and the position information data corresponding to that word mark, and the position information data is stored in a differential manner.
[0068] Referring to Figure 4, the method for creating full-text index data includes:
[0069] S310. Number the documents to be indexed, and generate a data item from each number and the corresponding document content.
[0070] Figure 5 is a schematic diagram of instance creation in the method for creating full-text index data provided by Embodiment 3 of the present invention. See Figure 4 and Figure 5. ...
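The differential storage of position information described for this embodiment can be illustrated with a short sketch. It assumes that positions are plain integers, that a packet is a mapping from word mark to a list of deltas, and that the first position is stored as-is; the names encapsulate and decode are hypothetical, and the source does not give the actual packet layout.

```python
from collections import defaultdict
from itertools import accumulate

def encapsulate(records):
    """Group (word_mark, position) records by word mark and delta-encode positions.

    Assumption: each packet stores the first position unchanged, followed by
    the gap between each position and the one before it.
    """
    grouped = defaultdict(list)
    for word_mark, position in records:
        grouped[word_mark].append(position)
    packets = {}
    for word_mark, positions in grouped.items():
        positions.sort()
        deltas = [positions[0]] + [b - a for a, b in zip(positions, positions[1:])]
        packets[word_mark] = deltas
    return packets

def decode(deltas):
    """Recover absolute positions from a delta-encoded list."""
    return list(accumulate(deltas))

packets = encapsulate([("index", 3), ("text", 1), ("index", 10), ("index", 14)])
```

Storing gaps instead of absolute positions keeps the stored numbers small when positions are sorted, which is what makes variable-length integer encodings effective for data of this kind.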