Coding and decoding structure-based crowd counting and positioning method

A crowd counting and positioning method applied in the field of computer vision. It addresses the problems that directly fusing shallow features can degrade performance, that density maps have weak positioning performance, and that counting with an FIDT map is not as simple as with a density map. It achieves simple counting, strong positioning performance, and improved robustness.

Active Publication Date: 2022-03-01
SOUTHWEST JIAOTONG UNIV

AI Technical Summary

Problems solved by technology

Some existing methods directly fuse shallow features with high-level features, but shallow layers often contain a large amount of redundant features, and introducing them directly may even degrade performance.
[0005] Existing label maps cannot serve both the counting and positioning tasks well. The density map used by current mainstream algorithms is convenient for counting, but its positioning performance is weak: responses overlap even in slightly dense areas, so the peak points of human heads cannot be accurately highlighted. The FIDT map has strong positioning performance, but its counting method is not as simple as that of the density map, and its counting accuracy is closely tied to its positioning accuracy, which places high demands on the regression quality of the label map.
[0006] The difficulty in solving the above problems is as follows. In an encoder-decoder network, feature fusion is unavoidable if the extracted features are to be fully used. Because shallow features are redundant, an attention-based feature fusion module must be designed, and to capture multi-scale features a multi-scale feature fusion module must also be added to the network. As for labels, to perform counting and positioning flexibly, the label map needs both a simple counting method and good positioning performance. Existing label maps do not have this property, so the label map generation method must be redesigned.


Examples


Embodiment 1

[0058] A label map generation method, said method comprising the following steps:

[0059] Step S1, build the data set: first collect image data of crowds in different environments in real scenes, then annotate the data;

[0060] Step S2, generate the label map: generate a label map from the annotated data as follows:

[0061]

[0062]

[0063]

[0064] Here, B is the coordinate set of the annotated points; (x', y') is the pixel coordinate of an annotated point in the label map, where x' is its abscissa and y' is its ordinate; (x, y) is the pixel coordinate of any point in the image, where x is its abscissa and y is its ordinate; P(x, y) is the distance from the point (x, y) to the nearest annotated point; I (x, y) is the corresponding...
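The definition of P(x, y) above can be illustrated with a short sketch. It covers only the distance term P(x, y); the label-map formula itself is not reproduced in this text, so no attempt is made to compute I(x, y). The function name `nearest_point_distance_map`, the use of Euclidean distance, and the sample point set are assumptions made for illustration, and plain Python is used only to keep the sketch self-contained.

```python
import math

def nearest_point_distance_map(height, width, points):
    """Return P as a list of rows, where P[y][x] is the Euclidean distance
    from pixel (x, y) to the nearest annotated point in `points`."""
    if not points:
        raise ValueError("at least one annotated point is required")
    return [
        [min(math.hypot(x - px, y - py) for (px, py) in points)
         for x in range(width)]
        for y in range(height)
    ]

# Hypothetical annotation set B with two head positions, given as (x', y').
B = [(1, 1), (4, 2)]
P = nearest_point_distance_map(height=4, width=6, points=B)
for row in P:
    print(" ".join(f"{v:4.2f}" for v in row))
```

The brute-force minimum costs O(H·W·|B|), which is acceptable for a toy grid; a real pipeline would more likely use a dedicated Euclidean distance transform.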

Embodiment 2

[0068] This embodiment performs crowd counting and positioning: an algorithm outputs the number of people in an image and their positions.

[0069] The counting experiments use the public datasets SHHA, SHHB and UCF_CC_50. SHHA contains 300 training images and 182 test images; SHHB contains 400 training images and 316 test images; UCF_CC_50 contains 50 images.

[0070] First, the label generation method proposed by the present invention is used to convert the annotations of the above datasets into label maps for training and testing.

[0071] Second, the network model is built. The overall structure of the algorithm is shown in Figure 1. The encoding part includes a 7×7 convolution, a max pooling layer, Res-1, Res-2, and Res-3. Except that the 7×7 convolution has a stride of 1, the structure is the same as that of ResNet50. Taking a 3×256×256 input picture as an example, after 7×7 convolutio...
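The effect of the stride-1 stem on feature-map size can be traced with a little arithmetic. The sketch below is not taken from the patent: it assumes the standard ResNet50 settings for everything the text does not state (padding 3 on the 7×7 convolution, a 3×3 max pool with stride 2 and padding 1, stage widths of 256, 512 and 1024 channels, and downsampling by 2 in Res-2 and Res-3), and the function names are hypothetical.

```python
def conv_out(size, kernel, stride, padding):
    """Spatial output size of a convolution or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1

def encoder_shapes(height, width):
    """Trace (channels, height, width) through the encoder stages."""
    shapes = [("input", (3, height, width))]
    # 7x7 convolution with stride 1 (stated in the text); padding 3 and
    # 64 output channels are assumed from standard ResNet50.
    h, w = conv_out(height, 7, 1, 3), conv_out(width, 7, 1, 3)
    shapes.append(("conv7x7", (64, h, w)))
    # 3x3 max pooling, stride 2, padding 1 (assumed ResNet50 default).
    h, w = conv_out(h, 3, 2, 1), conv_out(w, 3, 2, 1)
    shapes.append(("maxpool", (64, h, w)))
    # Res-1 keeps the resolution; Res-2 and Res-3 halve it (assumed).
    for name, channels, stride in [("Res-1", 256, 1),
                                   ("Res-2", 512, 2),
                                   ("Res-3", 1024, 2)]:
        h, w = conv_out(h, 3, stride, 1), conv_out(w, 3, stride, 1)
        shapes.append((name, (channels, h, w)))
    return shapes

for name, shape in encoder_shapes(256, 256):
    print(f"{name:8s} {shape}")
```

Under these assumptions the encoder output is 1/8 of the input resolution rather than the 1/16 that a stride-2 stem would give, which keeps more spatial detail available for positioning.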


Abstract

The invention discloses a crowd counting and positioning method based on an encoder-decoder structure, relates to the field of computer vision, and solves the problems in the prior art that features are not fully utilized and that the label map cannot serve both the counting and positioning tasks well. A multi-scale feature fusion module is introduced into the deep layers of the network, and a space-channel attention up-sampling module is introduced into the decoding part. The multi-scale feature fusion module captures features at multiple scales using dilated convolutions with different dilation rates and fuses them, which improves the robustness of the network to scale changes. The space-channel attention up-sampling module uses high-level semantics to guide efficient fusion of shallow features, which reduces interference from redundant features and the picture background. In addition, a new label map is provided that combines the simple counting of a density map with the positioning performance of an FIDT map.
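To see why dilated convolutions with different dilation rates capture different scales, a one-dimensional toy version is enough. The sketch below is an illustration only and does not reproduce the patent's module: the dilation rates 1, 2 and 3, the all-ones kernel, and the function names are assumptions, and the fusion step is left out because the abstract does not specify it.

```python
def effective_kernel(kernel, dilation):
    """Span covered by a dilated kernel: k + (k - 1) * (d - 1)."""
    return kernel + (kernel - 1) * (dilation - 1)

def dilated_conv1d(signal, weights, dilation):
    """'Valid' 1-D dilated convolution (cross-correlation form)."""
    span = effective_kernel(len(weights), dilation)
    return [
        sum(w * signal[start + i * dilation] for i, w in enumerate(weights))
        for start in range(len(signal) - span + 1)
    ]

signal = [0, 0, 0, 0, 1, 0, 0, 0, 0]   # a single impulse
weights = [1, 1, 1]                    # same 3-tap kernel in every branch
for rate in (1, 2, 3):                 # hypothetical dilation rates
    print(rate, effective_kernel(len(weights), rate),
          dilated_conv1d(signal, weights, rate))
```

Each branch uses the same three weights, yet the span it covers grows from 3 to 5 to 7 samples, so the branches respond to structure at different scales without adding parameters.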

Description

Technical field

[0001] The invention relates to the field of computer vision, in particular to a crowd counting and positioning method based on a codec structure.

Background technique

[0002] Crowd counting and positioning predicts the number and locations of people in a crowd by means of algorithms. The technology is widely used in urban management, intelligent security and other fields, and is especially valuable in places where crowds gather, where it helps prevent accidents and strengthen regional management. At present, the widely used counting method is to regress a density map with a convolutional neural network and then sum (integrate) the density map to obtain the number of people. However, the density map overlaps in slightly denser areas (as in Figure 8(b)), which is not conducive to positioning. To expand the application scenarios of the network, one approach is to use FIDT gr...
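The two properties of the density map described in the background, counting by summation and overlap in dense areas, can be reproduced on a toy grid. The sketch below is an illustration under stated assumptions, not the patent's procedure: each head is represented by a Gaussian normalized over the image, and the grid size, head positions, sigma values, and function names are all hypothetical.

```python
import math

def density_map(height, width, points, sigma):
    """Add one image-normalized Gaussian per head, so the map sums to the head count."""
    density = [[0.0] * width for _ in range(height)]
    for (px, py) in points:
        kernel = [[math.exp(-((x - px) ** 2 + (y - py) ** 2) / (2 * sigma ** 2))
                   for x in range(width)] for y in range(height)]
        total = sum(map(sum, kernel))
        for y in range(height):
            for x in range(width):
                density[y][x] += kernel[y][x] / total
    return density

def count_peaks(density):
    """Count pixels strictly greater than all of their 8-connected neighbours."""
    height, width = len(density), len(density[0])
    peaks = 0
    for y in range(height):
        for x in range(width):
            neighbours = [density[ny][nx]
                          for ny in range(max(0, y - 1), min(height, y + 2))
                          for nx in range(max(0, x - 1), min(width, x + 2))
                          if (ny, nx) != (y, x)]
            if all(density[y][x] > v for v in neighbours):
                peaks += 1
    return peaks

heads = [(7, 8), (9, 8)]                 # two hypothetical heads, 2 pixels apart
for sigma in (0.5, 2.0):
    d = density_map(17, 17, heads, sigma)
    print(sigma, round(sum(map(sum, d)), 6), count_peaks(d))
```

In both cases the map sums to 2, so counting stays simple. With the narrow kernel the two heads remain separate peaks, whereas with the wider kernel they merge into a single peak between the heads, which is the overlap that hurts positioning.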


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06V40/10, G06V20/52, G06V10/774, G06V10/80, G06V10/82, G06K9/62, G06T7/73, G06N3/04
CPC: G06T7/73, G06T2207/30242, G06N3/045, G06F18/253, G06F18/214
Inventor: 黄进, 杨涛, 王晴, 杨旭, 李剑波, 方铮, 冯义从
Owner: SOUTHWEST JIAOTONG UNIV