Image convolution method and system based on a Hilbert fractal curve

By using an image convolution method based on Hilbert fractal curves, the problems of high computational complexity and insufficient interaction of local feature information in convolutional neural networks are solved, achieving efficient image feature extraction and classification and improving the performance of CNN models.

CN115909014B · Active · Publication Date: 2026-06-26 · TONGJI ARTIFICIAL INTELLIGENCE RES INST SUZHOU CO LTD


Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents(China)
Current Assignee / Owner: TONGJI ARTIFICIAL INTELLIGENCE RES INST SUZHOU CO LTD
Filing Date: 2023-01-20
Publication Date: 2026-06-26

IPC: G06V10/82; G06V10/764; G06V10/774

CPC: Y02D10/00

AI Tagging

Technology Topics

Algorithm Convolution

Technical Efficacy Phrases

Improved deformation stability; Expand naturally and efficiently

AI Technical Summary

Technical Problem

Existing convolutional neural networks suffer from high computational complexity and insufficient interaction between local features when processing images. Adding more convolutional layers or using larger convolution kernels to compensate further increases the computational complexity.

Method used

An image convolution method based on Hilbert fractal curves is adopted. Local feature aggregation is achieved through padding operations, block indexing and rearrangement, cross-correlation operations, and depthwise separable one-dimensional convolution. The image blocks are rearranged and convolutional operations are performed in combination with Hilbert fractal indexing.

Benefits of technology

Without increasing computational complexity, it improves the global learning capability of convolution operations and the stability of multi-scale representations, enhances the information interaction capability between adjacent image patches, and improves the task performance of CNN models.

✦ Generated by Eureka AI based on patent content.

Abstract

The application relates to a Hilbert fractal curve-based image convolution method and system, comprising the following steps: 1) padding: performing a padding operation on an input image according to the convolution output size mode, for example, in the mode where the output size equals the input size, the input edges need to be padded; 2) block indexing and rearrangement: reading and rearranging the image blocks of the padded input image according to the index sequence numbers of the Hilbert fractal; 3) cross-correlation operation: performing a convolution cross-correlation operation on the rearranged image blocks with a stride equal to the convolution kernel size; 4) local feature aggregation: performing a convolution cross-correlation operation with a stride of 1 on the output of step 3 using a depthwise separable one-dimensional convolution; 5) restoring the two-dimensional structure of the input: performing a reverse rearrangement on the output of step 4 according to the index of the Hilbert fractal, and outputting the result. Compared with existing image convolution methods, the application is simple, broadly applicable, and offers a degree of theoretical advancement.

Description

Technical Field

[0001] This application relates to the field of computer vision image convolution technology, and in particular to an image convolution method and system based on Hilbert fractal curves for local feature enhancement.

Background Technology

[0002] The German mathematician David Hilbert discovered a curve constructed as follows. A square is divided into four equal smaller squares. Starting from the center of the lower-left square, the curve moves up to the center of the upper-left square, then across to the center of the upper-right square, and finally down to the center of the lower-right square. This is one iteration. By repeating this subdivision on each smaller square, a curve that fills the entire square is eventually obtained. This is the Hilbert curve, and its general construction is shown in Figure 1.

[0003] A Hilbert fractal is a fascinating curve that traverses all points within a unit square, resulting in a curve that fills space. The Hilbert curve is continuous and non-differentiable. When transforming two-dimensional to one-dimensional data, the positions of feature points do not change with resolution. Simply put, the Hilbert fractal operation can be defined as a dimensionality reduction mapping of a multi-scale image. The invariance of this mapping is manifested in the invariance of the relative positions of the reduced feature vectors, i.e., symmetry. Therefore, a Hilbert fractal can be symbolically represented as a symmetry group.
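As an illustration of the construction described above, the following minimal sketch maps a one-dimensional Hilbert index to grid coordinates using the widely known iterative index-to-coordinate algorithm. The function names `hilbert_d2xy` and `hilbert_path` are hypothetical and do not come from this application, and the grid side `n` is assumed to be a power of two.

```python
def hilbert_d2xy(n, d):
    """Map a 1-D Hilbert index d to (x, y) on an n x n grid (n a power of two)."""
    x = y = 0
    t = d
    s = 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:
            # Rotate / flip the current quadrant so sub-curves join end to end.
            if rx == 1:
                x, y = s - 1 - x, s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def hilbert_path(n):
    """All n*n grid cells listed in Hilbert order."""
    return [hilbert_d2xy(n, d) for d in range(n * n)]


# First-order curve: lower left, upper left, upper right, lower right (y pointing up).
print(hilbert_path(2))  # [(0, 0), (0, 1), (1, 1), (1, 0)]
```

Every cell is visited exactly once, and consecutive cells are always direct neighbors, which is the locality property the method relies on.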

[0004] A convolutional network, or convolutional neural network (CNN), is an elegantly designed structure for processing data with a grid structure arranged in Euclidean space. CNNs have achieved good results in many visual tasks due to their versatility and the ability to handle variable-scale inputs. Furthermore, cross-correlation is a special type of linear operation; if a network uses convolution (cross-correlation), it is generally considered a CNN.

[0005] Convolution operators primarily incorporate three key ideas to improve machine learning systems: sparse interaction, parameter sharing, and equivariant representation. It is precisely these sparse representation and parameter sharing characteristics that allow convolution to provide a method for handling inputs of variable size. However, the balance it strikes between parameters and sparse representation means that the kernel size is limited by the number of parameters and computational complexity. This kernel size limitation can cause convolution operations to focus excessively on local features, resulting in a lack of ability to interact with information from adjacent image patches.
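To make the parameter and complexity constraint concrete, the sketch below counts the weights and multiply-accumulate operations of one standard convolutional layer. The helper `conv2d_cost` and the layer sizes are illustrative assumptions, with stride 1, same-size output, and bias terms ignored.

```python
def conv2d_cost(h, w, c_in, c_out, k):
    """Weights and multiply-accumulates of a k x k convolution layer.

    Assumes stride 1 and an output of the same spatial size as the input.
    """
    params = k * k * c_in * c_out
    macs = h * w * params
    return params, macs


p3, m3 = conv2d_cost(56, 56, 64, 64, 3)
p7, m7 = conv2d_cost(56, 56, 64, 64, 7)
print(p3, p7)   # 36864 200704
print(m7 / m3)  # ratio is 49/9, about 5.44
```

Both quantities grow with the square of the kernel size, which is why simply enlarging the kernel is an expensive way to widen the receptive field.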

[0006] To overcome this deficiency, current methods mainly include:

[0007] 1) Use more convolutional and fully connected layers to enhance the model's deep global perception capabilities;

[0008] 2) Use large convolution kernels for convolution operations;

[0009] Although the two methods above overcome the shortcoming, namely the lack of a global receptive field, they also increase the computational complexity.

Summary of the Invention

[0010] In view of this, the purpose of this application is to propose an image convolution method and system based on Hilbert fractal curves, which can specifically solve the existing problems.

[0011] To achieve the above objectives, this application proposes an image convolution method based on Hilbert fractal curves, comprising:

[0012] 1) Padding: perform a padding operation on the input image according to the convolution output size mode;

[0013] 2) Block indexing and rearrangement: read and rearrange the image blocks of the padded input image according to the index numbers of the Hilbert fractal;

[0014] 3) Cross-correlation operation: perform a convolution cross-correlation operation on the rearranged image patches with a stride equal to the kernel size;

[0015] 4) Local feature aggregation: perform a stride-1 convolutional cross-correlation operation on the output of step 3 using a depthwise separable one-dimensional convolution;

[0016] 5) Restore the two-dimensional structure of the input: perform a reverse rearrangement on the output of step 4 according to the Hilbert fractal index and output the result.

[0017] Further, the padding operation includes padding the input edges in the mode where the output size equals the input size.

[0018] Further, in step 2), the image blocks are rearranged according to the Hilbert fractal index, specifically as follows:

[0019] $X = \{X_1, X_2, X_3, X_4, \ldots, X_l\}$

[0020] $X' = \{X_1, X_2, X_{m+2}, X_{m+1}, \ldots, X_a\}$

[0021]

[0022] where $X$ is the original input image, $X_l$ is a block of the input image, $X'$ is the sequence of image blocks rearranged according to the Hilbert index, $l$ is the number of image blocks, and $a$ is the number of rows and columns of the input image.
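The block reading and rearrangement of step 2) can be sketched as follows. This is a minimal illustration, not the applicant's implementation: it assumes a single-channel square image whose side is a·k with a a power of two, and the helper names are hypothetical. The Hilbert coordinates are used as (block row, block column), which gives a mirrored but equally valid Hilbert traversal.

```python
import numpy as np


def hilbert_d2xy(n, d):
    """Map a 1-D Hilbert index d to (x, y) on an n x n grid (n a power of two)."""
    x = y = 0
    t, s = d, 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:
            if rx == 1:
                x, y = s - 1 - x, s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def rearrange_blocks(image, k):
    """Split an (a*k, a*k) image into k x k blocks and list them in Hilbert order."""
    a = image.shape[0] // k
    # (a, k, a, k) -> (block_row, block_col, k, k)
    blocks = image.reshape(a, k, a, k).transpose(0, 2, 1, 3)
    order = np.array([hilbert_d2xy(a, d) for d in range(a * a)])
    return blocks[order[:, 0], order[:, 1]]  # shape (a*a, k, k)


image = np.arange(16).reshape(4, 4)
seq = rearrange_blocks(image, k=2)
print(seq[:, 0, 0])  # top-left pixel of each block in Hilbert order: [ 0  2 10  8]
```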

[0023] Furthermore, in step 3), a regular convolution operation with a stride s equal to the kernel size k is performed on the rearranged X′:

[0024]

[0025] Where K is a two-dimensional convolution kernel, m and n are the number of times the convolution kernel can slide in rows and columns on the input two-dimensional plane, respectively, and S(i,j) is the output of the convolution kernel K in the current image block X(i,j) after cross-correlation operation.

[0026] Furthermore, in step 4), the output of step 3) is subjected to a depthwise separable one-dimensional convolution operation, specifically including the following steps:

[0027] 1) The default size of the channel-separable convolution kernel is 3, the default stride is 1, the default padding size is 1, and the default padding value is zero.

[0028] 2) Use this channel-separable convolution kernel to perform a stride-1 convolutional cross-correlation operation on the output of step 3).

[0029] Furthermore, in step 5), the specific steps are as follows:

[0030] 1) Generate the index matrix of the Hilbert fractal;

[0031] 2) Use a matrix operation broadcasting mechanism to rearrange image patches based on previous index records;

[0032] 3) Output the image after the convolution operation on the two-dimensional structure.

[0033] To achieve the above objectives, this application also proposes a method for constructing an image classification network, comprising:

[0034] S1. Construct a Hilbert convolutional layer according to the image convolution method, and then connect multiple ordinary convolutional layers in sequence to form a convolutional neural network.

[0035] S2. Input a batch of images into the constructed convolutional neural network;

[0036] S3. Output the feature map after the convolutional network operation;

[0037] S4. Input the feature map from step S3 into the classification head, which includes a fully connected layer and an activation function layer, and outputs a classification score.

[0038] S5. Input supervision information, i.e., image labels;

[0039] S6. Using the supervision information, calculate the cross-entropy loss for the output of the classification head and the label One-Hot vector, respectively.

[0040] S7. Train the network model for 100 iterations using the gradient backpropagation algorithm.

[0041] S8. After the loss converges, the image classification network parameters are obtained, and the final network model is obtained.

[0042] To achieve the above objectives, this application also proposes an image convolution system based on Hilbert fractal curves, comprising:

[0043] The padding module performs a padding operation on the input image according to the convolution output size mode;

[0044] The block indexing and rearrangement module reads and rearranges the image blocks of the padded input image according to the Hilbert fractal index numbers;

[0045] The cross-correlation module performs convolution cross-correlation operations on the rearranged image patches with a stride equal to the kernel size.

[0046] The local feature aggregation module uses depthwise separable one-dimensional convolution to perform convolutional cross-correlation operation with a stride of 1 on the output of the cross-correlation operation module;

[0047] The restoration module reverses the output of the local feature aggregation module according to the Hilbert fractal index and then outputs it.
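The five modules can be read as one pipeline. The sketch below chains them in NumPy under simplifying assumptions that are not taken from this application: a single channel (so the depthwise aggregation reduces to one 3-tap filter), zero padding of width `p`, and a padded image of side a·k with a a power of two. All function and parameter names are hypothetical.

```python
import numpy as np


def hilbert_order(a):
    """(row, col) block coordinates of an a x a grid in Hilbert order (a = 2**p)."""
    coords = []
    for d in range(a * a):
        x = y = 0
        t, s = d, 1
        while s < a:
            rx = 1 & (t // 2)
            ry = 1 & (t ^ rx)
            if ry == 0:
                if rx == 1:
                    x, y = s - 1 - x, s - 1 - y
                x, y = y, x
            x += s * rx
            y += s * ry
            t //= 4
            s *= 2
        coords.append((x, y))
    return np.array(coords)


def hilbert_conv(image, kernel, agg_kernel, p=0):
    k = kernel.shape[0]
    x = np.pad(image, p)                                    # 1) padding module (zeros)
    a = x.shape[0] // k
    assert x.shape == (a * k, a * k) and (a & (a - 1)) == 0
    order = hilbert_order(a)
    blocks = x.reshape(a, k, a, k).transpose(0, 2, 1, 3)
    seq = blocks[order[:, 0], order[:, 1]]                  # 2) Hilbert rearrangement
    s = np.einsum("lij,ij->l", seq, kernel)                 # 3) stride-k cross-correlation
    s_pad = np.pad(s, 1)
    agg = np.array([s_pad[i:i + 3] @ agg_kernel             # 4) 3-tap, stride 1, padding 1
                    for i in range(len(s))])
    out = np.empty((a, a), dtype=agg.dtype)
    out[order[:, 0], order[:, 1]] = agg                     # 5) restore the 2-D layout
    return out


img = np.arange(16, dtype=float).reshape(4, 4)
print(hilbert_conv(img, np.ones((2, 2)), np.array([1.0, 1.0, 1.0])))
# [[ 28.  78.]
#  [ 92. 110.]]
```

With the aggregation kernel set to `[0, 1, 0]`, the function reduces to a plain non-overlapping block convolution, which makes the effect of step 4) easy to isolate.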

[0048] In summary, the advantages of this application and the user experience it brings are as follows:

[0049] First, compared with the Flatten / serpentine unfolding used in the default convolution operation, only the Hilbert fractal satisfies all the properties in the standard definition of a group, which improves the deformation stability of the multi-scale representation (measured by the Gromov-Hausdorff distance);

[0050] Second, by utilizing the stability of Hilbert fractal multi-scale expansion and the consistency of its dimensional transformation, the expansion of image patches becomes more natural and efficient, maximizing the similarity of adjacent image patches without increasing computational complexity.

[0051] Third, by using channel-separable one-dimensional convolution operations to aggregate information for each adjacent two-dimensional image block, the global learning capability of the convolution operation is enhanced.

[0052] Fourth, this application has certain theoretical innovations and relatively universal applicability. It can be applied to common CNN models to improve task performance.

Description of the Drawings

[0053] In the accompanying drawings, unless otherwise specified, the same reference numerals throughout the various drawings denote the same or similar parts or elements. These drawings are not necessarily drawn to scale. It should be understood that these drawings depict only some embodiments disclosed in this application and should not be construed as limiting the scope of this application.

[0054] Figure 1 is a schematic diagram of Hilbert curves of different orders.

[0055] Figure 2 is a flowchart of the method described in this application.

[0056] Figure 3 compares the Hilbert unfolding convolution process of this application with the traditional convolution process.

[0057] Figure 4 is a schematic diagram comparing the effect of Hilbert image block unfolding in this application with traditional zigzag unfolding.

[0058] Figure 5 illustrates the process of classifying and recognizing images containing cats using the convolution method described in this application.

[0059] Figure 6 is a schematic diagram of an image convolution system based on Hilbert fractal curves according to an embodiment of this application.

[0060] Figure 7 is a schematic diagram of the structure of an electronic device provided in one embodiment of this application.

[0061] Figure 8 is a schematic diagram of a storage medium provided in one embodiment of this application.

Detailed Implementation

[0062] The present application will now be described in further detail with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are for illustrative purposes only and are not intended to limit the invention. Furthermore, it should be noted that, for ease of description, only the parts relevant to the invention are shown in the accompanying drawings.

[0063] It should be noted that, unless otherwise specified, the embodiments and features described in this application can be combined with each other. This application will now be described in detail with reference to the accompanying drawings and embodiments.

[0064] Example 1

[0065] An image convolution method based on Hilbert fractal curves for local feature enhancement, as shown in Figure 2, includes the following steps:

[0066] 1) Padding: perform a padding operation on the input image according to the convolution output size mode. For example, in the mode where the output size equals the input size, the input edges need to be padded.

[0067] 2) Block indexing and rearrangement: read and rearrange the image blocks of the padded input image according to the index numbers of the Hilbert fractal;

[0068] 3) Conventional cross-correlation operation: Perform conventional convolution cross-correlation operation on the rearranged image patches with a stride equal to the kernel size;

[0069] 4) Local feature aggregation: Perform a stride-1 convolutional cross-correlation operation on the output of step 3 using a depthwise separable one-dimensional convolution;

[0070] 5) Restore the two-dimensional structure of the input: Reverse the output of step 4 according to the Hilbert fractal index and output it;

[0071] Step 1) specifically includes the following steps:

[0072] 1) In convolutional neural networks, padding refers to the pixels added around an image before the convolution kernel is applied. For example, if the padding value is set to zero, the pixels added to the four sides of the input image are all zero.

[0073] 2) The padding value can follow one of three modes: zero padding, as in the example above, mirror (reflect) padding, and copy-extended (replicate) padding;

[0074] 3) The padding operation is also based on the three modes of convolution. It is generally divided into two padding methods: valid padding, which fills the input image, and invalid padding, which does not fill the input image. The padding size is represented by p.
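The three padding-value modes can be illustrated with `numpy.pad`. Mapping "mirror" to NumPy's `reflect` mode and "copy-extended" to its `edge` mode is an assumption made for this sketch; the padding size p = 1 is likewise only an example.

```python
import numpy as np

x = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])
p = 1  # padding size

zero = np.pad(x, p, mode="constant", constant_values=0)  # zero padding
mirror = np.pad(x, p, mode="reflect")                     # mirror padding
copy_ext = np.pad(x, p, mode="edge")                      # copy-extended padding

print(zero[0])      # [0 0 0 0 0]
print(mirror[0])    # [5 4 5 6 5]
print(copy_ext[0])  # [1 1 2 3 3]
```

For a stride-1 convolution with an odd kernel size k, choosing p = (k - 1) / 2 keeps the output the same size as the input.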

[0075] In step 2), the image blocks are rearranged according to the Hilbert fractal index, specifically as follows:

[0076] $X = \{X_1, X_2, X_3, X_4, \ldots, X_l\}$

[0077] $X' = \{X_1, X_2, X_{m+2}, X_{m+1}, \ldots, X_a\}$

[0078]

[0079] where $X$ is the original input image, $X_l$ is a block of the input image, $X'$ is the sequence of image blocks rearranged using Hilbert indexing, and $l$ is the number of image blocks. $a$ is the number of rows and columns of the input image. For ease of description, we assume that $a^2 = l$. In addition, the size of an image block is equal to the size of the convolution kernel.

[0080] In step 3), a regular convolution operation with a stride s equal to the kernel size k is performed on the rearranged X′:

[0081]

[0082] Here, K is a two-dimensional convolution kernel, and m and n are the number of times the convolution kernel can slide along rows and columns in the input two-dimensional plane, respectively. k is much smaller than m in practice. S(i,j) is the output of the convolution kernel K after cross-correlation operation on the current image block X(i,j).
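The formula of paragraph [0081] did not survive in this text. A stride-k cross-correlation consistent with the definitions above would conventionally be written as follows. The summation indices u and v, the 0-based indexing, and the exact layout are assumptions, not a reconstruction of the original formula.

```latex
S(i,j) = \sum_{u=0}^{k-1} \sum_{v=0}^{k-1} X'(i\,s + u,\; j\,s + v)\, K(u,v),
\qquad s = k,\quad 0 \le i < m,\quad 0 \le j < n
```

Because the stride s equals the kernel size k, the windows do not overlap, so each output value S(i,j) depends on exactly one image block X(i,j).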

[0083] In step 4), an unconventional depthwise separable one-dimensional convolution operation is performed on the output S of step 3), specifically including the following steps:

[0084] 1) The default size of the channel-separable convolution kernel is 3, the default stride is 1, the default padding size is 1, and the default padding value is zero.

[0085] 2) Use this channel-separable convolution kernel to perform a convolutional cross-correlation operation with a stride of 1 on the output of step 3). The purpose is to aggregate the information between adjacent image blocks.
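A minimal sketch of the channel-wise one-dimensional aggregation with the stated defaults (kernel size 3, stride 1, zero padding 1) is given below. It covers only the depthwise part, in which each channel is filtered by its own kernel. The function name and the (channels, length) array layout are assumptions.

```python
import numpy as np


def depthwise_conv1d(s, w, padding=1):
    """Per-channel 1-D cross-correlation with stride 1 and zero padding.

    s: (C, L) sequence of block responses, w: (C, K) one kernel per channel.
    """
    c, length = s.shape
    k = w.shape[1]
    s_pad = np.pad(s, ((0, 0), (padding, padding)))
    out_len = length + 2 * padding - k + 1
    out = np.empty((c, out_len), dtype=np.result_type(s, w))
    for i in range(out_len):
        # Each channel only sees its own window and its own kernel.
        out[:, i] = np.sum(s_pad[:, i:i + k] * w, axis=1)
    return out


s = np.array([[1, 2, 3, 4],
              [10, 20, 30, 40]])
w = np.array([[1, 1, 1],    # channel 0: sum of each element and its two neighbors
              [0, 1, 0]])   # channel 1: identity
print(depthwise_conv1d(s, w))
# [[ 3  6  9  7]
#  [10 20 30 40]]
```

With kernel size 3 and padding 1 the sequence length is preserved, so the result can be mapped back to the two-dimensional layout without resizing.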

[0086] In step 5), to facilitate subsequent processing and operations based on the two-dimensional input image, after the convolution kernel and feature aggregation operations, it is necessary to restore the one-dimensional image patch to its original two-dimensional structure. The specific steps are as follows:

[0087] 1) Generate the index matrix of the Hilbert fractal;

[0088] 2) Use a matrix operation broadcasting mechanism to rearrange image patches based on previous index records;

[0089] 3) Finally, output the image after the convolution operation on the two-dimensional structure.
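The restoration steps above amount to inverting a permutation. The sketch below assumes the Hilbert index is stored as a vector of flat row-major positions and uses NumPy index assignment to undo the rearrangement. The helper names are hypothetical.

```python
import numpy as np


def hilbert_index(a):
    """Flat row-major positions of an a x a grid, listed in Hilbert order."""
    idx = []
    for d in range(a * a):
        x = y = 0
        t, s = d, 1
        while s < a:
            rx = 1 & (t // 2)
            ry = 1 & (t ^ rx)
            if ry == 0:
                if rx == 1:
                    x, y = s - 1 - x, s - 1 - y
                x, y = y, x
            x += s * rx
            y += s * ry
            t //= 4
            s *= 2
        idx.append(x * a + y)
    return np.array(idx)


def to_sequence(grid, idx):
    """Forward rearrangement: read the grid in Hilbert order."""
    return grid.reshape(-1)[idx]


def to_grid(seq, idx, a):
    """Reverse rearrangement: write each element back to its recorded position."""
    flat = np.empty(a * a, dtype=seq.dtype)
    flat[idx] = seq
    return flat.reshape(a, a)


a = 4
idx = hilbert_index(a)
grid = np.arange(a * a).reshape(a, a)
restored = to_grid(to_sequence(grid, idx), idx, a)
print(np.array_equal(restored, grid))  # True
```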

[0090] Figure 3 compares the Hilbert unfolding convolution process proposed in this application with the traditional zigzag unfolding convolution process. This application has the following advantages:

[0091] First, compared with the Flatten / serpentine unfolding used in the default convolution operation, only the Hilbert fractal satisfies all the properties in the standard definition of a group, which improves the deformation stability of the multi-scale representation (measured by the Gromov-Hausdorff distance);

[0092] Second, as shown in Figure 4, by utilizing the stability of Hilbert fractal multi-scale expansion and the consistency of its dimensional transformation, the expansion of image patches becomes more natural and efficient, maximizing the similarity of adjacent image patches without increasing computational complexity.

[0093] Third, by using channel-separable one-dimensional convolution operations to aggregate information for each adjacent two-dimensional image block, the global learning capability of the convolution operation is enhanced.

[0094] Fourth, this application has certain theoretical innovations and relatively universal applicability. It can be applied to common CNN models to improve task performance.
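The locality argument can be checked numerically. The sketch below compares Hilbert ordering with plain row-major flattening by measuring how far apart, on the two-dimensional grid, blocks are that sit close together in the one-dimensional sequence. The Manhattan-distance metric and the helper names are assumptions chosen for illustration. A serpentine scan also has unit steps between consecutive blocks, so the consecutive-step measure separates Hilbert ordering only from plain flattening.

```python
def hilbert_path(n):
    """All cells of an n x n grid (n a power of two) in Hilbert order."""
    path = []
    for d in range(n * n):
        x = y = 0
        t, s = d, 1
        while s < n:
            rx = 1 & (t // 2)
            ry = 1 & (t ^ rx)
            if ry == 0:
                if rx == 1:
                    x, y = s - 1 - x, s - 1 - y
                x, y = y, x
            x += s * rx
            y += s * ry
            t //= 4
            s *= 2
        path.append((x, y))
    return path


def raster_path(n):
    """Row-major (Flatten) order."""
    return [(r, c) for r in range(n) for c in range(n)]


def step_lengths(path, gap=1):
    """Manhattan distance between cells that are `gap` apart in the sequence."""
    return [abs(x1 - x0) + abs(y1 - y0)
            for (x0, y0), (x1, y1) in zip(path, path[gap:])]


n = 8
print(max(step_lengths(hilbert_path(n))))  # 1: neighbors in 1-D are neighbors in 2-D
print(max(step_lengths(raster_path(n))))   # 8: jump from the end of a row to the next row

h3 = step_lengths(hilbert_path(n), gap=3)
r3 = step_lengths(raster_path(n), gap=3)
print(sum(h3) / len(h3) < sum(r3) / len(r3))  # True
```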

[0095] Example 2

[0096] This embodiment describes a scheme for constructing and training an image classification network using the image convolution method of Embodiment 1. The specific implementation method is as follows:

[0097] S1. Following the structure shown in Figure 5, construct a Hilbert convolutional layer using the method shown in Figure 2 and combine it with traditional convolutional layers to rebuild the traditional ResNet-18 image classification network;

[0098] S2. Input a batch of images into the constructed convolutional neural network;

[0099] S3. Output the feature map after the convolutional network operation;

[0100] S4. Input the feature map from step S3 into the designed classification head, which generally contains a fully connected layer and an activation function layer, and outputs a classification score.

[0101] S5. Input supervision information, i.e., image labels;

[0102] S6. Using the above supervision information, calculate the cross-entropy loss for the output of the classification head and the label One-Hot vector, respectively.

[0103] S7. Train the network model for 100 iterations using the gradient backpropagation algorithm.

[0104] S8. After the loss converges, the image classification network parameters are obtained, and the final network model is obtained.

[0105] S9. Test on the test set and compare the performance of the convolution operator in this application with that of the original convolution operator.
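The loss of step S6 can be sketched as follows. This is a generic softmax cross-entropy against one-hot labels written in NumPy. The function name and array shapes are assumptions, and the application does not state which framework or exact formulation was used.

```python
import numpy as np


def cross_entropy(scores, labels, num_classes):
    """Mean cross-entropy between softmax(scores) and one-hot labels.

    scores: (batch, num_classes) classification scores, labels: (batch,) class ids.
    """
    one_hot = np.eye(num_classes)[labels]
    # Subtract the row maximum for numerical stability before exponentiating.
    shifted = scores - scores.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(-(one_hot * log_probs).sum(axis=1).mean())


# Uniform scores over 4 classes give a loss of ln(4).
loss = cross_entropy(np.zeros((2, 4)), np.array([0, 3]), 4)
print(abs(loss - np.log(4)) < 1e-12)  # True
```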

[0106] Table 1 compares the performance of the algorithm in this application with the baseline algorithm, using ResNet18 on the Tiny-ImageNet-200 dataset.

[0107] Table 1

[0108]
Model / Method          | Top-1 accuracy | Top-5 accuracy
ResNet18                | 58.25%         | 80.37%
Ours + Zero Padding     | 58.34%         | 80.70%
Ours + Reflect Padding  | 58.92%         | 80.59%
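For reference, the Top-1 and Top-5 metrics reported in Table 1 are conventionally computed as sketched below: a prediction counts as correct when the true label is among the k highest-scoring classes. The function name and the toy scores are illustrative assumptions and are unrelated to the reported results.

```python
import numpy as np


def top_k_accuracy(scores, labels, k):
    """Fraction of samples whose true label is among the k highest scores."""
    top_k = np.argsort(scores, axis=1)[:, -k:]
    hits = (top_k == labels[:, None]).any(axis=1)
    return float(hits.mean())


scores = np.array([[0.1, 0.7, 0.2],
                   [0.5, 0.3, 0.2],
                   [0.2, 0.3, 0.5]])
labels = np.array([1, 1, 0])
print(top_k_accuracy(scores, labels, 1))  # one of three samples correct
print(top_k_accuracy(scores, labels, 2))  # two of three samples correct
```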

[0109] The application provides an image convolution system based on Hilbert fractal curves, which is used to perform the image convolution method based on Hilbert fractal curves described in the above embodiments. As shown in Figure 6, the system includes:

[0110] The padding module 601 performs a padding operation on the input image according to the convolution output size mode;

[0111] The block indexing and rearrangement module 602 reads and rearranges the image blocks of the padded input image according to the index numbers of the Hilbert fractal;

[0112] The cross-correlation operation module 603 performs a convolution cross-correlation operation on the rearranged image blocks with a stride equal to the kernel size.

[0113] The local feature aggregation module 604 uses depth-separable one-dimensional convolution to perform convolutional cross-correlation operation with a stride of 1 on the output of the cross-correlation operation module;

[0114] The restoration module 605 reverses the output of the local feature aggregation module according to the Hilbert fractal index and then outputs it.

[0115] The image convolution system based on Hilbert fractal curves provided in the above embodiments of this application and the image convolution method based on Hilbert fractal curves provided in the embodiments of this application are based on the same inventive concept and have the same beneficial effects as the methods used, run or implemented by the applications stored therein.

[0116] This application also provides an electronic device corresponding to the Hilbert fractal curve-based image convolution method provided in the foregoing embodiments, for executing the Hilbert fractal curve-based image convolution method. This application does not limit the scope of the embodiments.

[0117] Figure 7 shows a schematic diagram of an electronic device provided by some embodiments of this application. As shown in Figure 7, the electronic device 20 includes: a processor 200, a memory 201, a bus 202, and a communication interface 203. The processor 200, the communication interface 203, and the memory 201 are connected via the bus 202. The memory 201 stores a computer program that can run on the processor 200. When the processor 200 runs the computer program, it executes the image convolution method based on Hilbert fractal curves provided in any of the foregoing embodiments of this application.

[0118] The memory 201 may include high-speed random access memory (RAM) or non-volatile memory, such as at least one disk storage device. Communication between this system network element and at least one other network element is achieved through at least one communication interface 203 (which can be wired or wireless), such as the Internet, wide area network, local area network, or metropolitan area network.

[0119] Bus 202 can be an ISA bus, PCI bus, or EISA bus, etc. The bus can be divided into an address bus, a data bus, a control bus, etc. The memory 201 is used to store programs. After receiving an execution instruction, the processor 200 executes the program. The image convolution method based on Hilbert fractal curves disclosed in any of the foregoing embodiments of this application can be applied to the processor 200, or implemented by the processor 200.

[0120] The processor 200 may be an integrated circuit chip with signal processing capabilities. In implementation, each step of the above method can be completed by the integrated logic circuitry in the hardware of the processor 200 or by instructions in software form. The processor 200 may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), etc.; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), or other programmable logic devices, discrete gate or transistor logic devices, or discrete hardware components. It can implement or execute the methods, steps, and logic block diagrams disclosed in the embodiments of this application. The general-purpose processor may be a microprocessor or any conventional processor. The steps of the methods disclosed in the embodiments of this application can be directly embodied in the execution of a hardware decoding processor, or executed by a combination of hardware and software modules in the decoding processor. The software modules may reside in random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, registers, or other mature storage media in the art. The storage medium is located in memory 201. The processor 200 reads the information in memory 201 and, in conjunction with its hardware, completes the steps of the above method.

[0121] The electronic device provided in this application embodiment and the image convolution method based on Hilbert fractal curves provided in this application embodiment are based on the same inventive concept and have the same beneficial effects as the methods they employ, operate, or implement.

[0122] This application also provides a computer-readable storage medium corresponding to the image convolution method based on Hilbert fractal curves provided in the foregoing embodiments. As shown in Figure 8, the computer-readable storage medium is an optical disc 30, on which a computer program (i.e., a program product) is stored. When the computer program is run by a processor, it executes the image convolution method based on Hilbert fractal curves provided in any of the foregoing embodiments.

[0123] It should be noted that examples of the computer-readable storage medium may also include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other optical and magnetic storage media, which will not be elaborated here.

[0124] The computer-readable storage medium provided in the above embodiments of this application and the image convolution method based on Hilbert fractal curves provided in the embodiments of this application are based on the same inventive concept and have the same beneficial effects as the methods used, run or implemented by the applications stored therein.

[0125] It should be noted that:

[0126] The algorithms and displays provided herein are not inherently related to any particular computer, virtual system, or other device. Various general-purpose systems can also be used in conjunction with the teachings herein. The required structure for constructing such systems is apparent from the above description. Furthermore, this application is not directed to any particular programming language. It should be understood that the content of this application described herein can be implemented using various programming languages, and the above description of specific languages is for the purpose of disclosing the best mode of implementation of this application.

[0127] Numerous specific details are set forth in the specification provided herein. However, it will be understood that embodiments of this application may be practiced without these specific details. In some instances, well-known methods, structures, and techniques have not been shown in detail so as not to obscure the understanding of this specification.

[0128] Similarly, it should be understood that, in order to simplify this application and aid in understanding one or more of the various inventive aspects, in the above description of exemplary embodiments of this application, various features of this application are sometimes grouped together into a single embodiment, figure, or description thereof. However, this method of disclosure should not be construed as reflecting an intention that the claimed application requires more features than are expressly recited in each claim. Rather, as reflected in the following claims, inventive aspects lie in fewer than all features of a single foregoing disclosed embodiment. Therefore, the claims following the detailed description are hereby expressly incorporated into that detailed description, wherein each claim itself is a separate embodiment of this application.

[0129] Those skilled in the art will understand that modules in the device of the embodiments can be adaptively changed and placed in one or more devices different from that embodiment. Modules, units, or components in the embodiments can be combined into a single module, unit, or component, and further, they can be divided into multiple sub-modules, sub-units, or sub-components. Except where at least some of such features and / or processes or units are mutually exclusive, any combination can be used to combine all features disclosed in this specification (including the accompanying claims, abstract, and drawings) and all processes or units of any method or device so disclosed. Unless expressly stated otherwise, each feature disclosed in this specification (including the accompanying claims, abstract, and drawings) may be replaced by an alternative feature that serves the same, equivalent, or similar purpose.

[0130] Furthermore, those skilled in the art will understand that although some embodiments described herein include certain features but not others included in other embodiments, combinations of features from different embodiments are intended to be within the scope of this application and form different embodiments. For example, in the following claims, any of the claimed embodiments can be used in any combination.

[0131] The various component embodiments of this application can be implemented in hardware, or as software modules running on one or more processors, or a combination thereof. Those skilled in the art will understand that microprocessors or digital signal processors (DSPs) can be used in practice to implement some or all of the functions of some or all of the components in the image convolution system according to the embodiments of this application. This application can also be implemented as a device or system program (e.g., a computer program and computer program product) for performing part or all of the methods described herein. Such an implementation of this application can be stored on a computer-readable medium, or can be in the form of one or more signals. Such signals can be downloaded from an Internet website, provided on a carrier signal, or provided in any other form.

[0132] It should be noted that the above embodiments are illustrative of this application and not restrictive, and that those skilled in the art can devise alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses should not be construed as limiting the claims. The word "comprising" does not exclude the presence of elements or steps not listed in the claims. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. This application can be implemented by means of hardware comprising several different elements and by means of a suitably programmed computer. In the unit claims enumerating several systems, several of these systems may be embodied by the same item of hardware. The use of the words first, second, and third, etc., does not indicate any order. These words can be interpreted as names.

[0133] The above description is merely a specific embodiment of this application, but the scope of protection of this application is not limited thereto. Any person skilled in the art can easily conceive of various variations or substitutions within the technical scope disclosed in this application, and these should all be included within the scope of protection of this application. Therefore, the scope of protection of this application should be determined by the scope of the claims.

Claims

1. An image convolution method based on Hilbert fractal curves, characterized in that it comprises:
1) Padding: select the padding operation for the input image based on the convolution output size mode;
2) Block indexing and rearrangement: the padded input image is read and its image blocks are rearranged according to the Hilbert fractal index numbers; in step 2), the rearrangement of image blocks according to the Hilbert fractal index is defined by a formula [formula not reproduced] whose symbols denote, in order: the original input image; a block of the input image; and the input image rearranged by Hilbert index; 'a' denotes the number of image blocks, where 'a' is the number of rows and columns of the input image;
3) Cross-correlation operation: perform a convolution cross-correlation operation on the rearranged image patches with a stride equal to the kernel size;
4) Local feature aggregation: a depthwise separable one-dimensional convolution is used to perform a convolutional cross-correlation operation with a stride of 1 on the output of step 3); in step 4), the depthwise separable one-dimensional convolution operation on the output of step 3) specifically includes the following: the default size of the channel-separable convolution kernel is 3, the default stride is 1, the default padding size is 1, and the default padding value is zero; the channel-separable convolution kernel is used to perform a convolutional cross-correlation operation with a stride of 1 on the output of step 3); wherein the channel-separable one-dimensional convolution operation is used to aggregate information across each pair of adjacent two-dimensional image patches;
5) Restoring the two-dimensional structure of the input: reverse the rearrangement of the output of step 4) according to the Hilbert fractal index and output it; after the convolution kernel and feature aggregation operations, the one-dimensional image patches are restored to the original two-dimensional structure. The specific steps are as follows: generate the Hilbert fractal index matrix; use the broadcasting mechanism of matrix operations to rearrange the image patches according to the previously recorded index; output the image after the two-dimensional convolution operation.
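As an illustration only, steps 2) to 5) of claim 1 might be sketched in NumPy as below. This is a minimal sketch under stated assumptions, not the patented implementation: the function names `hilbert_d2xy`, `hilbert_order`, and `hilbert_conv` are hypothetical, the input is assumed to be already padded to a square whose side is the kernel size times a power of two, and the Hilbert index is generated with the widely used iterative distance-to-coordinate mapping, which the claim does not prescribe.

```python
import numpy as np

def hilbert_d2xy(n, d):
    """Map curve position d to (x, y) on an n x n grid (n a power of two)."""
    x = y = 0
    t, s = d, 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:  # rotate/flip the current quadrant
            if rx == 1:
                x, y = s - 1 - x, s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y

def hilbert_order(n):
    """Row-major block indices listed in Hilbert-curve order."""
    pts = [hilbert_d2xy(n, d) for d in range(n * n)]
    return np.array([y * n + x for x, y in pts])

def hilbert_conv(image, weight, dw_kernel):
    """image: (C, H, W); weight: (C_out, C, k, k); dw_kernel: (C_out, 3).

    Assumes H == W == n * k with n a power of two (padding already applied).
    """
    c, h, _ = image.shape
    c_out, _, k, _ = weight.shape
    n = h // k  # blocks per side
    order = hilbert_order(n)
    # Split into k x k blocks, row-major block order: (n*n, C, k, k).
    blocks = (image.reshape(c, n, k, n, k)
                   .transpose(1, 3, 0, 2, 4)
                   .reshape(n * n, c, k, k))
    seq = blocks[order]  # step 2: rearrange blocks by Hilbert index
    # Step 3: stride equals kernel size, so each block yields one output vector.
    feat = np.einsum('bcij,ocij->ob', seq, weight)  # (C_out, n*n)
    # Step 4: depthwise 1-D cross-correlation, kernel 3, stride 1, zero padding 1.
    padded = np.pad(feat, ((0, 0), (1, 1)))
    agg = sum(dw_kernel[:, j:j + 1] * padded[:, j:j + n * n] for j in range(3))
    # Step 5: undo the rearrangement and restore the two-dimensional layout.
    out = np.empty_like(agg)
    out[:, order] = agg
    return out.reshape(c_out, n, n)
```

Because consecutive positions on the Hilbert curve are always spatially adjacent blocks, the size-3 one-dimensional kernel in step 4 mixes each block's feature with those of two of its two-dimensional neighbours. With an identity aggregation kernel `[0, 1, 0]` the sketch reduces to an ordinary convolution whose stride equals its kernel size.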

2. The image convolution method based on Hilbert fractal curves according to claim 1, characterized in that the padding operation includes padding the edges of the input when the input size equals the output size.

3. A method for constructing an image classification network, characterized in that it comprises:
S1. Construct a Hilbert convolutional layer according to the method of claim 1 or 2, and then connect multiple ordinary convolutional layers in sequence to form a convolutional neural network;
S2. Input a batch of images into the constructed convolutional neural network;
S3. Output the feature map after the convolutional network operation;
S4. Input the feature map from step S3 into the classification head, which includes a fully connected layer and an activation function layer, and outputs a classification score;
S5. Input supervision information, i.e., image labels;
S6. Using the supervision information, calculate the cross-entropy loss between the output of the classification head and the one-hot label vector;
S7. Train the network model using the gradient backpropagation algorithm for 100 iterations;
S8. After the loss converges, the image classification network parameters are obtained, yielding the final network model.
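Steps S4 to S7 of claim 3 can be illustrated with a small NumPy sketch, offered only as an example under assumptions. The function `train_head` is hypothetical, softmax is assumed as the activation function (the claim does not name one), the learning rate is an arbitrary choice, and the `features` argument stands in for the flattened feature map produced by the convolutional layers, which are not trained here.

```python
import numpy as np

def softmax(z):
    """Row-wise softmax with the usual max subtraction for stability."""
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def cross_entropy(probs, one_hot):
    """Mean cross-entropy between predicted probabilities and one-hot labels."""
    return float(-(one_hot * np.log(probs + 1e-12)).sum(axis=1).mean())

def train_head(features, labels, num_classes, iterations=100, lr=0.1, seed=0):
    """Train a fully connected classification head by gradient descent.

    features: (B, D) array standing in for the flattened feature map (S3).
    labels:   (B,) integer class labels, the supervision information (S5).
    """
    rng = np.random.default_rng(seed)
    b, d = features.shape
    w = 0.01 * rng.standard_normal((d, num_classes))
    bias = np.zeros(num_classes)
    one_hot = np.eye(num_classes)[labels]          # one-hot label vectors (S6)
    losses = []
    for _ in range(iterations):                    # 100 iterations by default (S7)
        probs = softmax(features @ w + bias)       # classification scores (S4)
        losses.append(cross_entropy(probs, one_hot))
        grad = (probs - one_hot) / b               # gradient of the mean loss w.r.t. logits
        w -= lr * features.T @ grad                # backpropagate into the weights
        bias -= lr * grad.sum(axis=0)
    return w, bias, losses
```

In a full network the same gradient would continue to propagate back through the ordinary convolutional layers and the Hilbert convolutional layer; a deep learning framework with automatic differentiation would normally handle that part.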

4. An image convolution system based on Hilbert fractal curves, using the method described in any one of claims 1-3, characterized in that it comprises: a padding module, which selects the padding operation for the input image based on the convolution output size mode; a block indexing and rearrangement module, which reads and rearranges the image blocks of the padded input image according to the Hilbert fractal index numbers; a cross-correlation module, which performs a convolution cross-correlation operation on the rearranged image blocks with a stride equal to the kernel size; a local feature aggregation module, which uses a depthwise separable one-dimensional convolution to perform a convolutional cross-correlation operation with a stride of 1 on the output of the cross-correlation module; and a restoration module, which reverses the rearrangement of the output of the local feature aggregation module according to the Hilbert fractal index and then outputs it.

5. An electronic device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that the processor executes the computer program to implement the method as described in any one of claims 1-3.

6. A computer-readable storage medium having a computer program stored thereon, characterized in that the program, when executed by a processor, implements the method as described in any one of claims 1-3.

Citation Information

Patent Citations

CN109886273A
CN113039541A
CN113128614A
CN113507355A
