
A Method of Term Matching Based on Cedar Double Array Trie Algorithm

A technique based on the double-array dictionary tree (trie) algorithm, applied in the field of computer communication. It addresses slow term indexing and slow term queries, and thereby improves the efficiency of term matching.

Active Publication Date: 2019-07-23
TRANSN IOL TECH CO LTD
5 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

[0005] The technical problem to be solved by the present invention is that the current database-backed term matching engine searches for terms relatively slowly. The remedy is to build a fast index over the terms in the database. Introducing a double-array dictionary tree (trie) addresses both slow index building and slow querying when the number of terms is large.



Examples


Embodiment Construction

[0051] The technical scheme of the present invention will be further described in detail below with reference to the drawings and specific embodiments.

[0052] To solve the above technical problems, as shown in Figure 1, the present invention provides a term matching method based on the cedar double-array dictionary tree (trie) algorithm, characterized in that it includes a step of building an index and a step of using the index for term query matching;

[0053] wherein:

[0054] 1. The indexing step traverses the database to obtain the term set, then calls the cedar double-array dictionary tree to insert each term, forming the index of the term set;
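The two steps above (build an index from the term set, then query it) can be illustrated with a minimal sketch. It uses a plain nested-dictionary trie rather than the cedar double-array structure, so it shows only the control flow, not the patented layout. The record schema `(term_id, text)` and the names `build_index`, `match_terms`, and `TERM_ID` are assumptions made for illustration; no database is involved.

```python
# Hypothetical sketch: a dict-based trie stands in for the cedar double-array trie.
TERM_ID = ""  # sentinel key; never collides with a single-character key


def build_index(records):
    """Insert every (term_id, text) record into a nested-dict trie."""
    root = {}
    for term_id, text in records:
        node = root
        for ch in text:
            node = node.setdefault(ch, {})
        node[TERM_ID] = term_id  # mark the end of a complete term
    return root


def match_terms(index, sentence):
    """Return (start, end, term_id) for the longest term starting at each position."""
    hits = []
    for start in range(len(sentence)):
        node = index
        best = None
        for end in range(start, len(sentence)):
            node = node.get(sentence[end])
            if node is None:
                break
            if TERM_ID in node:
                best = (start, end + 1, node[TERM_ID])
        if best is not None:
            hits.append(best)
    return hits


# Stand-in for rows read while traversing the database.
records = [(1, "double array"), (2, "double array trie"), (3, "trie")]
index = build_index(records)
hits = match_terms(index, "a double array trie index")
```

Here the index is built once and reused for every query, which is the point of the indexing step: the database is traversed only at build time, and matching touches the in-memory structure alone.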

[0055] The cedar double-array dictionary tree includes a structure array array[n] (as shown in Figure 3), a circular queue queue[n] with the same capacity as the structure array, and a binary tree array that stores the parent-child and sibling relationships between characters, namely the sibling array ninfo[n...
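As a rough illustration of a structure array whose elements hold a reference (base) value and a check value, the sketch below builds a small static double-array trie. It assumes the classic additive transition `t = base[s] + c` with `check[t] == s`, which may differ from cedar's actual layout, and it does not model the circular queue or the ninfo array. The names `Node`, `build`, `step`, `contains`, and the `END` marker are hypothetical.

```python
from dataclasses import dataclass

END = 0  # hypothetical end-of-term code


def code(ch):
    """Map a character to a transition code; 0 is reserved for END."""
    return ord(ch) + 1


@dataclass
class Node:
    base: int = 0    # offset of this node's children; 0 means "no children"
    check: int = -1  # slot of the parent; -1 marks a free slot


def build(terms):
    # Build an ordinary nested-dict trie first, then pack it into the array.
    root = {}
    for term in terms:
        cur = root
        for ch in term:
            cur = cur.setdefault(code(ch), {})
        cur[END] = {}
    array = [Node(check=0)]  # slot 0 is the root
    pending = [(0, root)]
    while pending:
        s, children = pending.pop()
        if not children:
            continue
        b = 1
        # Find the smallest base for which every child slot is still free.
        while any(b + c < len(array) and array[b + c].check != -1 for c in children):
            b += 1
        while len(array) < b + max(children) + 1:
            array.append(Node())
        array[s].base = b
        for c, sub in children.items():
            array[b + c].check = s  # claim the slot for parent s
            pending.append((b + c, sub))
    return array


def step(array, s, c):
    """Follow transition c from state s; return -1 if it does not exist."""
    t = array[s].base + c
    if array[s].base and t < len(array) and array[t].check == s:
        return t
    return -1


def contains(array, term):
    s = 0
    for ch in term:
        s = step(array, s, code(ch))
        if s < 0:
            return False
    return step(array, s, END) >= 0


da = build(["tree", "trie", "term"])
```

A lookup costs one addition and one comparison per character, which is why the double-array form is attractive for query speed. The linear search for a free base in `build` is the slow part; this is the kind of work a free-slot queue is meant to shortcut.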


Abstract

The invention discloses a term matching method based on the cedar double-array trie algorithm. The method comprises a step of establishing an index and a step of performing term search matching through the index. The index establishment step comprises traversing a database to obtain a term set and calling the cedar double-array trie to insert the terms, thereby forming the index of the term set. The cedar double-array trie comprises a structure array whose members are a reference value and a check value, and a circular queue with the same capacity as the structure array. Applying the cedar double-array algorithm to index establishment in a term matching engine, and searching terms through that index, greatly improves the efficiency of the engine. The algorithm also avoids a deficiency of the classic double-array implementation libdatrie, which is very slow when building an index for a large number of terms and therefore hinders rapid data reconstruction. A binary tree is used as an auxiliary structure, so the whole double-array trie can be restored rapidly.
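One plausible reading of the auxiliary binary tree is a first-child / next-sibling record per node, which lets the children of a node be listed without probing every possible character. The sketch below illustrates that idea only. The `NInfo` fields, the `goto` dictionary (a stand-in for the base/check transition), and the `LinkedTrie` class are assumptions for illustration, not the patent's or the library's actual definitions. Terms are assumed to contain no NUL character.

```python
from dataclasses import dataclass

NONE = 0  # hypothetical "no label" marker; real labels are bytes 1..255


@dataclass
class NInfo:
    child: int = NONE    # label of this node's first (smallest) child
    sibling: int = NONE  # label of this node's next sibling


class LinkedTrie:
    def __init__(self):
        self.ninfo = [NInfo()]  # slot 0 is the root
        self.goto = {}          # (parent, label) -> child slot

    def add_child(self, parent, label):
        key = (parent, label)
        if key in self.goto:
            return self.goto[key]
        node = len(self.ninfo)
        self.ninfo.append(NInfo())
        self.goto[key] = node
        first = self.ninfo[parent].child
        if first == NONE or label < first:
            # New node becomes the head of the sibling chain.
            self.ninfo[node].sibling = first
            self.ninfo[parent].child = label
        else:
            # Walk the chain to keep sibling labels in ascending order.
            prev = self.goto[(parent, first)]
            while NONE != self.ninfo[prev].sibling < label:
                prev = self.goto[(parent, self.ninfo[prev].sibling)]
            self.ninfo[node].sibling = self.ninfo[prev].sibling
            self.ninfo[prev].sibling = label
        return node

    def insert(self, term):
        node = 0
        for byte in term.encode("utf-8"):
            node = self.add_child(node, byte)

    def child_labels(self, parent):
        """List the labels of a node's children by following sibling links."""
        labels = []
        label = self.ninfo[parent].child
        while label != NONE:
            labels.append(label)
            label = self.ninfo[self.goto[(parent, label)]].sibling
        return labels


trie = LinkedTrie()
for term in ["tree", "trie", "term"]:
    trie.insert(term)
t_node = trie.goto[(0, ord("t"))]
tr_node = trie.goto[(t_node, ord("r"))]
```

With these links, moving a node's children to a new base only requires visiting the children that exist, rather than testing all 255 candidate labels.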

Description

Technical field

[0001] The invention belongs to the field of computer communication, and particularly relates to a method for term matching based on a cedar double-array dictionary tree algorithm.

Background technique

[0002] At present, the translation industry continues to expand, and corpora and terminology are growing quickly and are already large in volume. Terms are the cornerstone of translation, and effective information technology must be used to manage them. Currently, the source text, translation, and other details of the company's terms are stored in a MongoDB database. Querying the database directly to obtain the source text or the translation is very slow, and the source text or translation may be too long to serve conveniently as an index field. An existing term matching engine has been implemented that uses a double-array algorithm to build an external index for the terms, and then uses ...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F16/31, G06F17/28
CPC: G06F16/316, G06F40/58
Inventor: 冯泽康
Owner: TRANSN IOL TECH CO LTD