The invention discloses a lightweight fine-grained image recognition method based on cross-layer feature interaction for weakly supervised scenarios. The method comprises the following steps: first, constructing a novel residual module by replacing conventional convolution with multi-layer aggregated grouped convolution, the module being directly embeddable into a deep residual network framework, thereby making the backbone network lightweight; then, modeling the interactions between features by computing an efficient low-rank approximation of polynomial kernel pooling, which compresses the dimension of the feature description vector and reduces the storage footprint and computational cost of the fully connected classification layer, while giving the linear classifier discriminative power equivalent to that of a high-order polynomial kernel classifier and markedly improving recognition accuracy; and finally, using a cross-layer feature interaction network framework to combine diverse features, which enhances feature learning and representation capability and reduces the risk of overfitting. Taken together across recognition accuracy, computational complexity, and technical feasibility, the overall performance of the method is at the current leading level.
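The low-rank pooling step can be illustrated with a minimal sketch, given here only for the second-order (degree-2) case and under stated assumptions: the abstract does not give the pooling formula, so the function names, the projection matrices `U` and `V`, and the sum-pooling over spatial locations are all hypothetical, and random weights stand in for learned ones. The sketch projects the features of two layers to a small dimension `d`, multiplies them element-wise, and sums over locations; each pooled entry then equals a rank-1 bilinear form between the two layers, which is what allows the descriptor to stay short.

```python
import random


def project(weights, feature):
    """Multiply a (d x c) matrix by a length-c feature vector."""
    return [sum(w * f for w, f in zip(row, feature)) for row in weights]


def low_rank_cross_layer_pool(feats_a, feats_b, U, V):
    """Sum-pool the element-wise product of two projected feature maps.

    feats_a, feats_b: per-location feature vectors from two layers
    (assumed to share the same spatial locations).
    U, V: hypothetical projection matrices of shape (d x c_a) and (d x c_b).
    Returns a descriptor of length d.
    """
    d = len(U)
    pooled = [0.0] * d
    for x, y in zip(feats_a, feats_b):
        px, py = project(U, x), project(V, y)
        for k in range(d):
            pooled[k] += px[k] * py[k]
    return pooled


def full_bilinear_entry(feats_a, feats_b, u_k, v_k):
    """Reference value: sum over locations of x^T (u_k v_k^T) y,
    computed through the explicit rank-1 outer product."""
    total = 0.0
    for x, y in zip(feats_a, feats_b):
        for i, xi in enumerate(x):
            for j, yj in enumerate(y):
                total += xi * u_k[i] * v_k[j] * yj
    return total


def make_demo(seed=0, locations=4, c_a=6, c_b=5, d=3):
    """Build small random features and projections (illustrative only)."""
    rng = random.Random(seed)

    def rand_mat(rows, cols):
        return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]

    feats_a = rand_mat(locations, c_a)
    feats_b = rand_mat(locations, c_b)
    return feats_a, feats_b, rand_mat(d, c_a), rand_mat(d, c_b)


if __name__ == "__main__":
    feats_a, feats_b, U, V = make_demo()
    descriptor = low_rank_cross_layer_pool(feats_a, feats_b, U, V)
    print(len(descriptor), "pooled values instead of", 6 * 5)
```

With these demo sizes the descriptor has `d = 3` entries, whereas an explicit outer-product (full bilinear) descriptor would have `c_a * c_b = 30`; the input width of the fully connected classification layer shrinks in the same proportion.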