Pedestrian fall detection method based on mixed precision quantization and storage medium
By using a mixed-precision quantization method based on an improved genetic algorithm to rationally allocate the weight and activation bit widths of the pedestrian neural network, the method addresses the problem of balancing the size, accuracy and speed of a pedestrian fall detection model on embedded devices, thereby improving detection speed without excessive loss of accuracy.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- SHENZHEN ICOMM SEMICON CO LTD
- Filing Date
- 2023-02-20
- Publication Date
- 2026-06-26
AI Technical Summary
Existing pedestrian fall detection models struggle to achieve an optimal balance between model size, accuracy, and processing speed, posing a particular challenge when applied to embedded devices.
A mixed-precision quantization method based on an improved genetic algorithm is adopted. By applying encoding, initialization, selection, crossover and game-theoretic mutation to the pedestrian neural network, the optimal mixed-precision quantized model is generated, and the bit widths of the weights and activation values of each layer of the model are reasonably allocated.
It improves the model's processing speed, achieving an optimal balance between model size, accuracy, and processing speed, thus increasing the speed of pedestrian fall detection without causing excessive loss of accuracy.
Figure CN116071826B_ABST
Description
Technical Field
[0001] This invention relates to the field of computer vision technology, and in particular to a pedestrian fall detection method and storage medium based on mixed-precision quantization.
Background Technology
[0002] Pedestrian fall detection is an important component of security systems. It is used to detect whether pedestrians in a monitored area have fallen, so that fallen pedestrians can be identified promptly and appropriate measures taken. Currently, neural network models are being applied to pedestrian fall detection.
[0003] Existing neural network models share a common characteristic: they are large and complex, suitable for server-side inference, but not for use in mobile embedded devices such as smartphones and cameras. However, industrial applications often require deploying these complex models in low-cost embedded devices. To address this issue, model quantization has emerged. It can compress models with a slight loss of accuracy, making it possible to apply these complex models to embedded terminals such as smartphones and robots. Model quantization primarily achieves model compression and runtime acceleration by using fixed-point representations of the model's weights and runtime activations.
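As a rough illustration of the fixed-point representation mentioned above, the following sketch applies symmetric uniform quantization to a short list of weights at 8 bits and at 4 bits. The quantization scheme, the function names `quantize` and `dequantize`, and the sample weights are assumptions made for illustration; the source does not specify which quantizer is used.

```python
def quantize(weights, bits):
    """Symmetric uniform quantization of floats to signed integers (illustrative)."""
    qmax = 2 ** (bits - 1) - 1                 # 127 for 8 bits, 7 for 4 bits
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    q = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Map the integer codes back to approximate floating-point values."""
    return [v * scale for v in q]


weights = [0.50, -0.25, 0.10, -1.00]           # hypothetical layer weights
q8, s8 = quantize(weights, 8)
q4, s4 = quantize(weights, 4)

# Worst-case reconstruction error for each bit width.
err8 = max(abs(a - b) for a, b in zip(weights, dequantize(q8, s8)))
err4 = max(abs(a - b) for a, b in zip(weights, dequantize(q4, s4)))
```

In this toy case the 4-bit codes need half the storage of the 8-bit codes but reconstruct the weights less precisely, which is the trade-off the mixed-precision search tries to balance layer by layer.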
[0004] Currently, mixed-precision quantization is commonly used to balance model accuracy and model size. Mixed-precision quantization quantizes the weights of different layers in a neural network to different precisions. For example, it converts the floating-point calculations in some layers to low-precision fixed-point calculations while other layers still use floating-point calculations, resulting in a smaller model. Although mixed-precision quantization can effectively reduce a model's computational intensity, parameter size, and memory consumption, it often leads to a significant loss of accuracy.
[0005] Generally, the computing power and storage space of the model processing chip inside a smart camera are quite limited. If a floating-point (32-bit) neural network model is used directly, the model size will be too large, and the processing speed will be severely affected. If a low-bit-width (4-bit) mixed-precision quantization model is used, although the model size is reduced and the processing speed is improved, the model's computational accuracy will be severely affected.
[0006] In the process of realizing this invention, the inventors discovered at least the following problems in the prior art:
[0007] Existing mixed-precision quantization models used for pedestrian fall detection struggle to achieve an optimal balance between model size, accuracy, and processing speed.
Summary of the Invention
[0008] The purpose of this invention is to provide a pedestrian fall detection method and storage medium based on mixed-precision quantization, to address the problem that existing pedestrian fall detection methods using mixed-precision quantization models struggle to achieve an optimal balance between model size, accuracy, and processing speed. The technical effects of the preferred solutions among the technical solutions provided by this invention are detailed below.
[0009] To achieve the above objectives, the present invention provides the following technical solution:
[0010] This invention provides a pedestrian fall detection method based on mixed-precision quantization, comprising:
[0011] S1. Train on the acquired pedestrian image set to obtain a floating-point pedestrian neural network, then encode the pedestrian neural network and quantize it into a basic neural network;
[0012] S2. Initialize the basic neural network to obtain K different individuals. Based on the fitness of the individuals, perform selection and crossover operations on the chromosomes of the K individuals to generate the first population;
[0013] S3. Perform a game-theoretic mutation operation on the individuals in the first population to generate the second population;
[0014] S4. Repeat steps S2-S3 until the number of iterations reaches the preset maximum number of iterations or the iteration termination condition is met, and obtain the pedestrian neural network model with optimal mixed-precision quantization from the second population.
[0015] Preferably, step S3 includes:
[0016] S31. Using the pedestrian image set as a validation dataset, randomly select a batch of data from the validation dataset for inference, and calculate the inference accuracy of the individual;
[0017] S32. Randomly select two gene loci on the chromosome of the individual and perform a mutation operation on each of them separately, and calculate the inference accuracy of the two mutated individuals on the validation dataset;
[0018] S33. Calculate the inference difference value of each of the two mutated individuals from the inference accuracy before and after the mutation;
[0019] S34. The gene locus with the better inference difference value gets the chance to mutate, while the other gene locus does not mutate.
[0020] Preferably, the iteration termination condition in step S4 is:
[0021] There exists an individual in the population whose bit resources are ≤ a*mp and whose inference precision meets the preset inference precision; where 0.5≤a≤1, and mp is the total number of parameters of the individual.
[0022] Preferably, the fitness function in step S2 is:
[0023] Fitness = λ * (Individual inference accuracy - Basic inference accuracy) / (Individual bit count / Basic bit count); where λ is a user-defined parameter greater than zero; the basic inference accuracy is the inference accuracy of the basic neural network, and the basic bit count is the bit count of the basic neural network.
[0024] Preferably, the initialization of the basic neural network in step S2 includes:
[0025] S21. Using the basic neural network as the initial individual, randomly select a gene locus for mutation to obtain a new individual;
[0026] S22. Randomly select a gene locus of the initial individual that has not yet been selected and mutate it to obtain another new individual;
[0027] S23. Repeat step S22 until K different individuals are obtained.
[0028] Preferably, step S1 further includes: using each convolution to be quantized in the pedestrian neural network as a gene locus of the genetic algorithm, the bit width of the convolution as a gene, and the entire pedestrian neural network as an individual.
[0029] Preferably, the basic neural network is a minimum bit-width neural network or a maximum bit-width neural network; all gene loci in the minimum bit-width neural network have the lowest bit width, and all gene loci in the maximum bit-width neural network have the highest bit width.
[0030] Preferably, when the basic neural network is the minimum bit-width neural network, the mutation operation is defined as a single-step mutation from a lower bit width to the next higher one; if the convolution at the current gene locus already has the highest bit width, that locus no longer participates in the mutation operation. When the basic neural network is the maximum bit-width neural network, the mutation operation is defined as a single-step mutation from a higher bit width to the next lower one; if the convolution at the current gene locus already has the lowest bit width, that locus no longer participates in the mutation operation.
[0031] Preferably, the method further includes: detecting pedestrian status using a pedestrian neural network model with optimal mixed precision quantization, and outputting the detection results.
[0032] A computer-readable storage medium storing a computer program that, when executed, implements the mixed-precision quantization method based on an improved genetic algorithm as described above.
[0033] Implementing one of the above-described technical solutions of the present invention has the following advantages or beneficial effects:
[0034] (1) This invention improves the efficiency of random search by introducing a genetic algorithm, thereby improving the processing speed of the model and thus improving the pedestrian fall detection speed;
[0035] (2) Taking into account both quantization sensitivity and resource utilization, the bit width of the weights and activation values of each layer of the model is reasonably allocated during the mixed precision quantization process to obtain the best neural network model.
[0036] (3) By introducing a genetic algorithm improved by game theory, the search-based method and the optimization-based method in mixed-precision quantization are combined, so that the model after mixed-precision quantization can achieve the best balance of model size, accuracy and processing speed, and the speed of pedestrian fall detection is improved without causing too much loss of accuracy.
Attached Figure Description
[0037] To more clearly illustrate the technical solutions of the embodiments of the present invention, the accompanying drawings used in the description of the embodiments will be briefly introduced below. Obviously, the accompanying drawings described below are only some embodiments of the present invention. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort. In the drawings:
[0038] Figure 1 is a flowchart of a mixed-precision quantization method based on an improved genetic algorithm according to an embodiment of the present invention;
[0039] Figure 2 is a flowchart of step S3 of the mixed-precision quantization method based on an improved genetic algorithm according to an embodiment of the present invention;
[0040] Figure 3 is a flowchart of the initialization steps of the basic neural network in an embodiment of the present invention;
[0041] Figure 4 shows the encoding of a neural network mixed-precision quantization model according to an embodiment of the present invention;
[0042] Figure 5 shows two chromosomes to be crossed in an embodiment of the present invention;
[0043] Figure 6 shows the two chromosomes after crossover according to an embodiment of the present invention;
[0044] Figure 7 shows the chromosome to be mutated in an embodiment of the present invention;
[0045] Figure 8 shows the chromosome after mutation at locus A in an embodiment of the present invention;
[0046] Figure 9 shows the chromosome after mutation at locus B in an embodiment of the present invention.
[0047] In the figures: the dashed box in Figure 5 marks the segments to be crossed; the dashed box in Figure 6 marks the segments after crossover.
Detailed Implementation
[0048] To make the objectives, technical solutions, and advantages of the present invention clearer, various exemplary embodiments described below will be referenced to the accompanying drawings, which form part of the exemplary embodiments, illustrating various exemplary embodiments that may be used to implement the present invention. Unless otherwise indicated, the same numbers in different drawings represent the same or similar elements. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with this disclosure. It should be understood that they are merely examples of processes, methods, and apparatuses consistent with some aspects of the present invention disclosed as detailed in the appended claims, and other embodiments may be used, or structural and functional modifications may be made to the embodiments listed herein without departing from the scope and spirit of the present invention.
[0049] In the description of this invention, it should be understood that the terms "center," "longitudinal," "lateral," etc., indicate the orientation or positional relationship based on the accompanying drawings, and are only for the convenience of describing the invention and simplifying the description, and do not indicate or imply that the referred element must have a specific orientation, or be constructed and operated in a specific orientation. The terms "first," "second," etc., are used for descriptive purposes only and should not be construed as indicating or implying relative importance or implicitly specifying the number of indicated technical features. The term "multiple" means two or more. The terms "connected" and "linked" should be interpreted broadly, for example, they can be fixed connections, detachable connections, integral connections, mechanical connections, electrical connections, communication connections, direct connections, indirect connections through an intermediate medium, and can be the internal connection of two elements or the interaction relationship between two elements. The term "and / or" includes any and all combinations of one or more of the related listed items. Those skilled in the art can understand the specific meaning of the above terms in this invention according to the specific circumstances.
[0050] To illustrate the technical solution described in this invention, specific embodiments are described below, showing only the parts related to the embodiments of this invention.
[0051] Example 1:
[0052] As shown in Figure 1, this invention provides a mixed-precision quantization method based on an improved genetic algorithm, comprising:
[0053] S1. Train on the acquired pedestrian image set to obtain a floating-point pedestrian neural network, then encode the pedestrian neural network and quantize it into a basic neural network; here, the pedestrian video captured by the smart camera is split into frames, and the frames form the pedestrian image set.
[0054] S2. Initialize the basic neural network to obtain K different individuals. Based on the fitness of the individuals, perform selection and crossover operations on the chromosomes of the K individuals to generate the first population, where K>2;
[0055] S3. Perform game-theoretic mutation operations on the individuals in the first population to generate the second population;
[0056] S4. Repeat steps S2-S3 until the preset maximum number of iterations is reached or the iteration termination condition is met. Obtain the optimal mixed-precision quantized pedestrian neural network model from the second population. The individual with the highest fitness in the second population is the optimal mixed-precision quantized pedestrian neural network model. The maximum number of iterations can be flexibly set according to the actual situation.
[0057] This embodiment uses a genetic algorithm improved by game theory to perform mixed precision quantization, which improves the efficiency of random search and enables the model after mixed precision quantization to achieve the best balance between model size, accuracy and processing speed.
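The overall loop of steps S1-S4 can be sketched as follows. This is a minimal illustration rather than the patented implementation: the per-layer parameter counts in `PARAMS`, the `SENSITIVITY` values, and the surrogate `accuracy` function are hypothetical stand-ins for running real inference on the validation set, the tournament size of 2 is an assumed setting, and the sketch assumes that only selection, crossover, and mutation are repeated on later iterations.

```python
import random

BIT_WIDTHS = [4, 6, 8]                               # candidate widths from the example
PARAMS = [100, 200, 400, 800, 400, 100]              # hypothetical parameters per convolution
SENSITIVITY = [0.05, 0.04, 0.03, 0.02, 0.01, 0.01]   # hypothetical loss per width step


def accuracy(ch):
    """Toy surrogate for validation accuracy; a real system would run inference."""
    steps_below_max = [(BIT_WIDTHS[-1] - b) // 2 for b in ch]
    return 0.95 - sum(s * d for s, d in zip(SENSITIVITY, steps_below_max))


def bit_count(ch):
    return sum(p * b for p, b in zip(PARAMS, ch))


def fitness(ch, base, lam=1.0):
    return lam * (accuracy(ch) - accuracy(base)) / (bit_count(ch) / bit_count(base))


def step_up(ch, i):
    """Single-step mutation to the next higher width at locus i (no-op at the maximum)."""
    k = BIT_WIDTHS.index(ch[i])
    out = list(ch)
    out[i] = BIT_WIDTHS[min(k + 1, len(BIT_WIDTHS) - 1)]
    return out


def search(k=4, max_iters=20, a=0.75, target_acc=0.85, seed=0):
    rng = random.Random(seed)
    base = [BIT_WIDTHS[0]] * len(PARAMS)             # S1: minimum bit-width base network
    loci = rng.sample(range(len(PARAMS)), k)         # S2: K distinct loci, K individuals
    pop = [step_up(base, i) for i in loci]
    budget_bits = a * 8 * sum(PARAMS)                # a*mp bytes, expressed in bits
    for _ in range(max_iters):
        # S2: tournament selection (size 2), then single-point crossover on pairs.
        pop = [max(rng.sample(pop, 2), key=lambda c: fitness(c, base)) for _ in pop]
        for j in range(0, len(pop) - 1, 2):
            p = rng.randrange(1, len(base))
            pop[j], pop[j + 1] = (pop[j][:p] + pop[j + 1][p:],
                                  pop[j + 1][:p] + pop[j][p:])
        # S3: two loci compete; the trial mutation with the higher accuracy is kept.
        nxt = []
        for ch in pop:
            i, j = rng.sample(range(len(ch)), 2)
            cand_i, cand_j = step_up(ch, i), step_up(ch, j)
            nxt.append(cand_i if accuracy(cand_i) >= accuracy(cand_j) else cand_j)
        pop = nxt
        # S4: stop once some individual meets both the bit budget and the accuracy target.
        if any(bit_count(c) <= budget_bits and accuracy(c) >= target_acc for c in pop):
            break
    return max(pop, key=lambda c: fitness(c, base))


best = search()
```

Here `best` is the chromosome with the highest fitness in the final population, mirroring the statement that the fittest individual of the second population is taken as the result.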
[0058] As an optional implementation, step S1 further includes: using each convolution to be quantized (such as 2D convolution, 3D convolution, depthwise separable convolution, etc.) in the pedestrian neural network as a gene location for the genetic algorithm, the bit width of the convolution as a gene, and the entire neural network as an individual. The correspondence between the genetic algorithm and mixed-precision quantization is shown in Table 1:
[0059]
Genetic algorithm | Mixed-precision quantization
Chromosome | Encoding of a deep learning neural network with mixed bit widths
Gene | Bit width of each convolution in the neural network
Individual | A deep learning neural network with mixed bit widths
Population | A selected set of deep learning neural networks with mixed bit widths
Fitness | Fitness function (inference accuracy, number of bits)
Selection | Selecting the next generation of the neural network population
Crossover | Exchanging the corresponding convolutions of different neural networks
Mutation | Changing the bit width of a convolution in the neural network
[0060] Table 1. Correspondence between genetic algorithms and mixed-precision quantization
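The correspondence in Table 1 can be expressed as a small data structure. The sketch below is one possible encoding, not one taken from the source: the `ConvLayer` class, the layer names, and the parameter counts are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConvLayer:
    """One quantizable convolution, i.e. one gene locus (hypothetical structure)."""
    name: str
    num_params: int


# Hypothetical network: each convolution to be quantized is one gene locus.
LAYERS = [
    ConvLayer("conv1", 864),
    ConvLayer("dw_conv2", 288),
    ConvLayer("pw_conv2", 2048),
]
BIT_WIDTHS = (4, 6, 8)  # allowed genes (bit widths) at every locus


def encode(bit_width_by_layer):
    """Chromosome: the list of bit widths, ordered like LAYERS."""
    return [bit_width_by_layer[layer.name] for layer in LAYERS]


def decode(chromosome):
    """Map a chromosome back to a per-layer bit-width assignment."""
    return {layer.name: bits for layer, bits in zip(LAYERS, chromosome)}


# An individual is one mixed-width network; a population is a set of them.
individual = encode({"conv1": 8, "dw_conv2": 6, "pw_conv2": 4})
population = [individual, [4, 4, 4], [8, 8, 8]]
```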
[0061] The basic neural network can be either a minimum bit-width neural network or a maximum bit-width neural network. In a minimum bit-width neural network, all gene loci have the lowest bit width; in a maximum bit-width neural network, all gene loci have the highest bit width. The minimum bit-width neural network has the lowest accuracy, so during the genetic algorithm its accuracy needs to be increased until an optimal balance with the hardware specifications is achieved. The hardware specifications concern whether the model size and processing speed meet the requirements of the hardware device. Conversely, the maximum bit-width neural network already has the highest accuracy, so during the genetic algorithm its accuracy is appropriately reduced until an optimal balance with the hardware specifications is achieved. When the basic neural network is a minimum bit-width neural network, the mutation operation is defined as a single-step mutation from a lower bit width to the next higher one; if the convolution at the current gene locus already has the maximum bit width, that locus does not participate in the mutation operation. When the basic neural network is a maximum bit-width neural network, the mutation operation is defined as a single-step mutation from a higher bit width to the next lower one; if the convolution at the current gene locus already has the minimum bit width, that locus does not participate in the mutation operation. Suppose a mixed-precision quantization process involves x quantization bit widths, and the basic neural network is quantized with 4-bit, 6-bit, and 8-bit widths. Then x = 3, and each gene locus has 3 possible values, as shown in Figure 4. If the basic neural network is the minimum bit-width neural network, all gene loci in the initial individual are 4 bits wide, and a locus must first mutate from 4 bits to 6 bits before it can mutate from 6 bits to 8 bits. If the basic neural network is the maximum bit-width neural network, all gene loci in the initial individual are 8 bits wide, and a locus must first mutate from 8 bits to 6 bits before it can mutate from 6 bits to 4 bits.
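The single-step mutation rule can be sketched as a small function over the 4/6/8-bit ladder from the example. The function name `mutate_one_step` and the convention of returning `None` for a saturated locus are assumptions made for illustration.

```python
BIT_WIDTHS = (4, 6, 8)  # the x = 3 candidate widths from the example


def mutate_one_step(chromosome, locus, direction):
    """Move one locus a single step along the bit-width ladder.

    direction=+1 models a minimum bit-width base network (low to high);
    direction=-1 models a maximum bit-width base network (high to low).
    Returns None when the locus is already at the end of the ladder and
    therefore does not participate in mutation.
    """
    idx = BIT_WIDTHS.index(chromosome[locus]) + direction
    if idx < 0 or idx >= len(BIT_WIDTHS):
        return None
    mutated = list(chromosome)
    mutated[locus] = BIT_WIDTHS[idx]
    return mutated


widened = mutate_one_step([4, 4, 4], 1, +1)    # 4 bits to 6 bits at locus 1
saturated = mutate_one_step([4, 8, 4], 1, +1)  # locus already at 8 bits
narrowed = mutate_one_step([8, 8, 8], 0, -1)   # 8 bits to 6 bits at locus 0
```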
[0062] As shown in Figure 3, the initialization of the basic neural network in step S2 includes:
[0063] S21. Using the basic neural network as the initial individual, randomly select a gene locus for mutation to obtain a new individual;
[0064] S22. Randomly select a gene locus of the initial individual that has not yet been selected and mutate it to obtain another new individual;
[0065] S23. Repeat step S22 until K different individuals are obtained.
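Steps S21-S23 can be sketched as below. The sketch assumes that every new individual is derived from the initial individual by mutating one locus that has not been used before, which makes the K individuals distinct; the function name `init_population` and the `next_width` parameter are hypothetical.

```python
import random


def init_population(base, k, next_width, rng):
    """Create k distinct individuals, each a one-locus mutation of the base.

    base:       chromosome of the basic neural network (list of bit widths)
    next_width: bit width one step away from the base width (assumed input)
    """
    if k > len(base):
        raise ValueError("k cannot exceed the number of gene loci")
    population = []
    # Sampling without replacement picks k distinct, previously unselected loci.
    for locus in rng.sample(range(len(base)), k):
        individual = list(base)
        individual[locus] = next_width
        population.append(individual)
    return population


pop = init_population([4] * 5, k=3, next_width=6, rng=random.Random(0))
```

Each individual in `pop` differs from the all-4-bit base at exactly one locus, and no two individuals share that locus.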
[0066] The selection and crossover operations in step S2 proceed as follows. The K different individuals form an initial population, and a selection operation based on fitness is performed to update it. Selection chooses the individuals to be reproduced according to fitness: individuals with high fitness are more likely to be selected, individuals with low fitness may be eliminated, and only superior individuals have a greater chance of surviving into the next generation. This embodiment uses tournament selection: each time, several individuals are randomly drawn and the one with the best fitness is retained; this is repeated M times to generate a new population, so the worst individual is always eliminated. A crossover operation is then performed on the selected population to generate the first population. Crossover combines the genetic information of the parents to produce offspring. This embodiment uses single-point crossover: two individuals are randomly selected from the population and exchange chromosome segments at a single point. For example, as shown in Figures 5-6, chromosomes 1 and 2 cross over at point P to produce two offspring chromosomes. These two offspring replace chromosomes 1 and 2 in the first population, and chromosomes that do not undergo crossover are copied directly into the first population. Crossover can improve convergence speed, and an efficient crossover method can combine the superior genes of two parents to construct better offspring.
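Tournament selection and single-point crossover as described above can be sketched as follows. The tournament size, the example population, and the use of `sum` as a stand-in fitness function are illustrative assumptions only.

```python
import random


def tournament_select(population, fitness, m, size, rng):
    """Run m tournaments; each keeps the fittest of `size` randomly drawn individuals."""
    return [max(rng.sample(population, size), key=fitness) for _ in range(m)]


def single_point_crossover(parent1, parent2, point):
    """Swap the tails of two chromosomes at the crossover point P."""
    child1 = parent1[:point] + parent2[point:]
    child2 = parent2[:point] + parent1[point:]
    return child1, child2


population = [[4, 4], [4, 6], [6, 6], [8, 8]]
# `sum` is a placeholder fitness: here a higher total bit width counts as fitter.
selected = tournament_select(population, fitness=sum, m=5, size=2, rng=random.Random(1))
child1, child2 = single_point_crossover([4, 4, 4, 4], [8, 8, 8, 8], point=1)
```

Because each tournament draws at least two distinct individuals, the individual with the lowest fitness can never win, which matches the remark that the worst individual is always eliminated.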
[0067] As shown in Figure 2, step S3 includes:
[0068] S31. Using the pedestrian image set as the validation dataset, randomly select a batch of data from the validation dataset for inference, and calculate the inference accuracy of the individual;
[0069] S32. Randomly select two gene loci on the chromosome of the individual and perform a mutation operation on each of them separately, and calculate the inference accuracy of the two mutated individuals on the validation dataset;
[0070] S33. Calculate the inference difference value of each of the two mutated individuals from the inference accuracy before and after the mutation;
[0071] S34. The gene locus with the better inference difference value gets the chance to mutate, while the other gene locus does not mutate.
[0072] As shown in Figures 7-9, two randomly selected loci A and B on a chromosome each undergo a trial mutation, giving one chromosome mutated at locus A and one mutated at locus B, and the inference difference value of each mutated chromosome is calculated. If the minimum bit-width neural network is the basic neural network, a larger inference difference value means a greater improvement in accuracy, so that mutation is more advantageous. Assuming the chromosome mutated at locus A has the larger inference difference value, locus A gets the opportunity to mutate, while locus B reverts to its original bit width and does not mutate. If the maximum bit-width neural network is the basic neural network, a smaller inference difference value means a smaller decrease in accuracy, so that mutation is more advantageous. In that case, if the chromosome mutated at locus A again has the larger inference difference value, it suffers the greater decrease in accuracy and is not the advantageous mutation; locus B therefore gets the opportunity to mutate, while locus A reverts to its original bit width and does not mutate.
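The game-theoretic mutation of steps S31-S34 can be sketched as below. Treating the inference difference value as the absolute change in accuracy is an assumption inferred from the description, and `toy_accuracy` with its `SENSITIVITY` weights is a hypothetical stand-in for inference on a validation batch.

```python
import random

BIT_WIDTHS = (4, 6, 8)
SENSITIVITY = (0.05, 0.01)  # hypothetical accuracy gain per width step, per locus


def toy_accuracy(chromosome):
    """Stand-in for S31: a real system would run inference on a validation batch."""
    return 0.6 + sum(s * (b - 4) / 2 for s, b in zip(SENSITIVITY, chromosome))


def game_mutation(chromosome, accuracy_fn, direction, rng):
    """Two random loci compete; only the more advantageous trial mutation is kept."""
    base_acc = accuracy_fn(chromosome)                           # S31
    locus_a, locus_b = rng.sample(range(len(chromosome)), 2)     # S32
    candidates = []
    for locus in (locus_a, locus_b):
        idx = BIT_WIDTHS.index(chromosome[locus]) + direction
        if 0 <= idx < len(BIT_WIDTHS):                           # skip saturated loci
            mutated = list(chromosome)
            mutated[locus] = BIT_WIDTHS[idx]
            diff = abs(accuracy_fn(mutated) - base_acc)          # S33
            candidates.append((diff, mutated))
    if not candidates:
        return list(chromosome)
    # S34: when widening, the largest gain wins; when narrowing, the smallest loss wins.
    pick = max if direction > 0 else min
    return pick(candidates, key=lambda c: c[0])[1]


rng = random.Random(0)
widened = game_mutation([4, 4], toy_accuracy, +1, rng)   # minimum bit-width base
narrowed = game_mutation([8, 8], toy_accuracy, -1, rng)  # maximum bit-width base
```

With only two loci both are always drawn, so the outcome does not depend on the random seed: widening favors the sensitive locus 0, and narrowing favors the insensitive locus 1.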
[0073] The iteration termination condition for step S4 is: there exists an individual in the population whose bit resources are ≤ a*mp and whose inference accuracy meets the preset inference accuracy; where 0.5 ≤ a ≤ 1, and mp is the total number of parameters of the individual. When a = 1, all convolutions in the neural network are 8 bits wide; when a = 0.5, all convolutions in the neural network are 4 bits wide. To obtain a neural network with mixed bit widths, a can be set according to the actual situation, for example a = 0.75.
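The termination check can be sketched as follows. Measuring bit resources in bytes (total bits divided by 8) is an assumption chosen so that an all-8-bit network corresponds to a = 1 and an all-4-bit network to a = 0.5, as stated above; the function names and the layer sizes are hypothetical.

```python
def bit_resources(chromosome, params_per_layer):
    """Storage of the quantized weights in bytes (assumed unit)."""
    return sum(p * b for p, b in zip(params_per_layer, chromosome)) / 8


def should_terminate(population, params_per_layer, accuracy_fn, a, target_accuracy):
    """True if some individual fits within a*mp and reaches the preset accuracy."""
    if not 0.5 <= a <= 1:
        raise ValueError("a must satisfy 0.5 <= a <= 1")
    mp = sum(params_per_layer)  # total number of parameters
    return any(
        bit_resources(ch, params_per_layer) <= a * mp
        and accuracy_fn(ch) >= target_accuracy
        for ch in population
    )


params = [100, 300]                       # hypothetical parameters per convolution
all_8bit = bit_resources([8, 8], params)  # equals mp, i.e. a = 1
all_4bit = bit_resources([4, 4], params)  # equals 0.5 * mp, i.e. a = 0.5
```

For example, with a = 0.75 the budget is 300 bytes: the chromosome `[8, 4]` needs 250 bytes and fits, while `[4, 8]` needs 350 bytes and does not.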
[0074] The fitness function in step S2 is:
[0075] Fitness = λ * (Individual inference accuracy - Basic inference accuracy) / (Individual bit count / Basic bit count); where λ is a user-defined parameter greater than zero, which can adaptively adjust the fitness value so that the fitness of each individual has a large degree of discriminability, making it easy to distinguish during selection and crossover operations; basic inference accuracy is the inference accuracy of the basic neural network, and basic bit count is the bit count of the basic neural network.
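A direct transcription of this fitness function, with a small worked example, might look like the following. The numeric values are invented purely for illustration.

```python
def fitness(ind_accuracy, ind_bits, base_accuracy, base_bits, lam=1.0):
    """Fitness = lam * (accuracy gain over the base) / (bit count relative to the base)."""
    if lam <= 0:
        raise ValueError("lam must be greater than zero")
    return lam * (ind_accuracy - base_accuracy) / (ind_bits / base_bits)


# Hypothetical numbers: the individual gains 0.10 accuracy for 1.25x the bits.
score = fitness(ind_accuracy=0.70, ind_bits=1250, base_accuracy=0.60, base_bits=1000)
# Doubling lam doubles every score, which spreads the fitness values apart.
scaled = fitness(0.70, 1250, 0.60, 1000, lam=2.0)
```

In this example the score is about 0.08: an accuracy gain of 0.10 is discounted by the 1.25 times larger bit count.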
[0076] The method also includes: detecting pedestrian states using a pedestrian neural network model with optimal mixed precision quantization and outputting the detection results.
[0077] This embodiment not only improves the efficiency of random search and the speed of pedestrian fall detection by introducing a genetic algorithm, but also comprehensively considers quantization sensitivity and resource utilization, and achieves reasonable allocation of the bit width of the weights and activation values of each layer of the model during mixed-precision quantization. Most importantly, by introducing a genetic algorithm improved by game theory, the search-based method and the optimization-based method in mixed-precision quantization are combined, so that the model after mixed-precision quantization can achieve the best balance in terms of model size, accuracy and processing speed. Thus, pedestrian fall detection can be improved in speed without causing excessive loss of accuracy.
[0078] The embodiment is merely a specific example and does not indicate that this is the only way to implement the present invention.
[0079] Example 2:
[0080] A computer-readable storage medium storing a computer program which, when executed, implements the mixed-precision quantization method based on an improved genetic algorithm as described above.
[0081] Those skilled in the art will understand that all or part of the features / steps of the above-described method embodiments can be implemented by methods, data processing systems, or computer programs. These features may be implemented entirely in hardware, entirely in software, or in a combination of hardware and software. The aforementioned computer program may be stored in one or more computer-readable storage media. When the computer program is executed (e.g., by a processor), it performs the steps of the above-described embodiment of a mixed-precision quantization method based on an improved genetic algorithm.
[0082] The aforementioned storage media capable of storing program code include: static disks, solid-state drives, random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), optical storage devices, magnetic storage devices, flash memory, magnetic disks or optical disks, and / or combinations of the above devices, that is, they can be implemented by any type of volatile or non-volatile storage devices or combinations thereof.
[0083] The above description is merely a preferred embodiment of the present invention. Those skilled in the art will understand that various changes or equivalent substitutions can be made to these features and embodiments without departing from the spirit and scope of the present invention. Furthermore, under the teachings of the present invention, these features and embodiments can be modified to adapt to specific situations and materials without departing from the spirit and scope of the present invention. Therefore, the present invention is not limited to the specific embodiments disclosed herein, and all embodiments falling within the scope of the claims of this application are within the protection scope of the present invention.
Claims
1. A pedestrian fall detection method based on mixed-precision quantization, characterized in that it comprises:
S1. Train on the acquired pedestrian image set to obtain a floating-point pedestrian neural network, then encode the pedestrian neural network and quantize it into a basic neural network;
S2. Initialize the basic neural network to obtain K different individuals. Based on the fitness of the individuals, perform selection and crossover operations on the chromosomes of the K individuals to generate the first population;
S3. Perform a game-theoretic mutation operation on the individuals in the first population to generate the second population;
S4. Repeat steps S2-S3 until the number of iterations reaches the preset maximum number of iterations or the iteration termination condition is met, and obtain the optimal mixed-precision quantized pedestrian neural network model from the second population;
wherein step S3 includes:
S31. Using the pedestrian image set as a validation dataset, randomly select a batch of data from the validation dataset for inference, and calculate the inference accuracy of the individual;
S32. Randomly select two gene loci on the chromosome of the individual and perform a mutation operation on each of them separately, and calculate the inference accuracy of the two mutated individuals on the validation dataset;
S33. Calculate the inference difference value of each of the two mutated individuals from the inference accuracy before and after the mutation;
S34. The gene locus with the better inference difference value gets the chance to mutate, while the other gene locus does not mutate.
2. The pedestrian fall detection method based on mixed-precision quantization according to claim 1, characterized in that the iteration termination condition in step S4 is: there exists an individual in the population whose bit resources are ≤ a*mp and whose inference accuracy meets the preset inference accuracy; where 0.5≤a≤1, and mp is the total number of parameters of the individual.
3. The pedestrian fall detection method based on mixed-precision quantization according to claim 1, characterized in that the fitness function in step S2 is: Fitness = λ * (Individual inference accuracy - Basic inference accuracy) / (Individual bit count / Basic bit count); where λ is a user-defined parameter greater than zero; the basic inference accuracy is the inference accuracy of the basic neural network, and the basic bit count is the bit count of the basic neural network.
4. The pedestrian fall detection method based on mixed-precision quantization according to claim 1, characterized in that the initialization of the basic neural network in step S2 includes:
S21. Using the basic neural network as the initial individual, randomly select a gene locus for mutation to obtain a new individual;
S22. Randomly select a gene locus of the initial individual that has not yet been selected and mutate it to obtain another new individual;
S23. Repeat step S22 until K different individuals are obtained.
5. The pedestrian fall detection method based on mixed-precision quantization according to claim 1, characterized in that step S1 further includes: taking each convolution to be quantized in the pedestrian neural network as a gene locus of the genetic algorithm, taking the bit width of the convolution as a gene, and taking the entire pedestrian neural network as an individual.
6. The pedestrian fall detection method based on mixed-precision quantization according to claim 1, characterized in that the basic neural network is either a minimum bit-width neural network or a maximum bit-width neural network; all gene loci in the minimum bit-width neural network have the lowest bit width, and all gene loci in the maximum bit-width neural network have the highest bit width.
7. The pedestrian fall detection method based on mixed-precision quantization according to claim 6, characterized in that, when the basic neural network is the minimum bit-width neural network, the mutation operation is defined as a single-step mutation from a lower bit width to the next higher one, and if the convolution at the current gene locus already has the highest bit width, that locus no longer participates in the mutation operation; when the basic neural network is the maximum bit-width neural network, the mutation operation is defined as a single-step mutation from a higher bit width to the next lower one, and if the convolution at the current gene locus already has the lowest bit width, that locus no longer participates in the mutation operation.
8. The pedestrian fall detection method based on mixed-precision quantization according to claim 1, characterized in that the method further includes: detecting pedestrian status using the pedestrian neural network model with optimal mixed-precision quantization, and outputting the detection results.
9. A computer-readable storage medium, characterized in that the storage medium stores a computer program which, when executed, implements the pedestrian fall detection method based on mixed-precision quantization as described in any one of claims 1-8.