Content-based retrieval method for Chinese documents in image format
A content-based retrieval technology for Chinese documents in image format, applied in the field of information processing. It addresses the inability to handle image-format documents with degraded characters effectively, and provides a retrieval method that is simple, fast, and low in cost.
Examples
Specific Embodiment 1
[0024] Specific Embodiment 1: this embodiment is described with reference to Figures 1 and 2 of the accompanying drawings. The content-based retrieval method for Chinese documents in image format of this embodiment comprises the following steps:
[0025] Step 1: Obtain the image-format Chinese documents to be retrieved and perform character segmentation on each of them, thereby obtaining the single-character images in each image-format Chinese document;
[0026] Step 2: For each single-character image obtained, extract the character image feature vector of that character image;
[0027] Step 3: Based on the principle of locality-sensitive hashing, construct a hash function h, transform the character image feature vector of each character image into a corresponding pseudocode, and build a character index database from the pseudocodes. The pseudocode consists of L 16-bit integers...
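Step 3 can be illustrated with a minimal sketch. It is not the patent's hash function: it swaps in random-hyperplane (sign) hashing, a common locality-sensitive scheme, in which each group of 16 hyperplanes yields one 16-bit integer. The class name `CharIndex`, the feature dimension, the default `L=8`, and the `(doc_id, position)` posting format are all assumptions made for the example.

```python
from collections import defaultdict
import numpy as np


class CharIndex:
    """Toy character index: one hash table per 16-bit integer of the pseudocode."""

    def __init__(self, dim, L=8, seed=0):
        rng = np.random.default_rng(seed)
        # L groups of 16 random hyperplanes (assumed scheme, not the patent's h).
        self.planes = rng.standard_normal((L, 16, dim))
        self.weights = 1 << np.arange(16)  # bit weights 1, 2, 4, ..., 32768
        self.tables = [defaultdict(list) for _ in range(L)]

    def pseudocode(self, feature):
        """Map a feature vector to a tuple of L integers in [0, 65535]."""
        x = np.asarray(feature, dtype=float)
        bits = (self.planes @ x) >= 0              # shape (L, 16)
        codes = bits.astype(np.int64) @ self.weights
        return tuple(int(c) for c in codes)

    def add(self, feature, doc_id, position):
        """Record that a character with this feature occurs at (doc_id, position)."""
        for table, code in zip(self.tables, self.pseudocode(feature)):
            table[code].append((doc_id, position))

    def lookup(self, feature):
        """Return postings that share at least one 16-bit integer with the query."""
        hits = set()
        for table, code in zip(self.tables, self.pseudocode(feature)):
            hits.update(table.get(code, ()))
        return hits
```

A query character is hashed the same way as the indexed characters, so similar feature vectors tend to collide in at least one of the L tables. A larger L raises the chance of such a collision at the cost of 2 bytes of additional pseudocode per character for each extra integer.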
Specific Embodiment 2
[0040] Specific Embodiment 2: this embodiment further describes Specific Embodiment 1. In step 3 of Specific Embodiment 1, the hash function h is constructed as follows: first define the set of vertices of a regular polyhedron in m-dimensional space and a rotation matrix A; then establish the hash function h, which takes a unit vector; the values to which the hash function maps form its result set.
Specific Embodiment 3
[0041] Specific Embodiment 3: this embodiment further describes Specific Embodiment 1 or 2. In step 3 of Specific Embodiment 1 or 2, the number L of 16-bit integers in the pseudocode ranges from 1 to 50.
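The formulas for this construction are not legible in the text, so the following is only a sketch under stated assumptions. It reads the regular polyhedron as a cross-polytope whose 2m vertices are the signed standard basis vectors, and takes the hash of a unit vector x to be the index of the vertex closest to Ax. The function names, the vertex numbering, and the way A is generated are hypothetical choices for illustration.

```python
import numpy as np


def random_rotation(m, rng):
    """Draw a random m x m orthogonal matrix via QR of a Gaussian matrix."""
    q, r = np.linalg.qr(rng.standard_normal((m, m)))
    # Fix column signs so the result does not depend on the QR sign convention.
    return q * np.sign(np.diag(r))


def polytope_hash(x, A):
    """Index of the cross-polytope vertex nearest to A @ x (assumed form of h).

    Vertex +e_i has index i and vertex -e_i has index m + i,
    so the result set is {0, 1, ..., 2m - 1}.
    """
    x = np.asarray(x, dtype=float)
    x = x / np.linalg.norm(x)            # the hash is defined on unit vectors
    y = A @ x
    i = int(np.argmax(np.abs(y)))        # axis with the largest magnitude
    return i if y[i] >= 0 else len(y) + i
```

With the identity matrix as A, the vector (0, -3, 1) is closest to the vertex -e_1 (zero-based axis 1), which has index 3 + 1 = 4 under this numbering. Nearby unit vectors usually fall on the same vertex, which is what makes the mapping locality sensitive.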