The invention discloses a method for solving visual question answering problems based on complex network analysis, comprising semantic concept network construction, non-random deep walk, image and text feature fusion, and a classifier. Semantic concept network construction mines the co-occurrence patterns of concepts to enhance semantic expression. Non-random deep walk maps complex network relations to low-dimensional features: once the image semantic concept network has been constructed, the deep walk algorithm is applied to learn the latent relationships between nodes in the network, and each node in the complex network is mapped to a low-dimensional feature vector. Multinomial logistic regression then fuses the image and text features to solve the visual question answering problem. The invention mines the concept co-occurrence patterns and the hierarchical structure of concept clusters in depth, effectively integrates the visual and semantic features of the image with the natural language features, and provides a feasible way to solve visual question answering problems.
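The first two stages can be illustrated with a minimal Python sketch, offered under stated assumptions rather than as the claimed implementation. It assumes that each image comes with a list of concept labels, that two concepts co-occur when they appear in the same image, and that "non-random" means the walk picks its next node with probability proportional to the co-occurrence count. The function names (`build_cooccurrence_network`, `weighted_walk`, `generate_walks`) and the toy annotations are hypothetical. The embedding step (for example a skip-gram model trained on the walks, as in DeepWalk-style methods), the feature fusion, and the classifier are not shown.

```python
import random
from collections import defaultdict
from itertools import combinations


def build_cooccurrence_network(image_concepts):
    """Return {concept: {neighbor: count}} from per-image concept lists."""
    graph = defaultdict(lambda: defaultdict(int))
    for concepts in image_concepts:
        # Each unordered pair of distinct concepts in one image co-occurs once.
        for a, b in combinations(sorted(set(concepts)), 2):
            graph[a][b] += 1
            graph[b][a] += 1
    return {node: dict(nbrs) for node, nbrs in graph.items()}


def weighted_walk(graph, start, length, rng):
    """Walk from `start`, choosing each next node with probability
    proportional to its co-occurrence count with the current node
    (an assumed reading of "non-random")."""
    walk = [start]
    while len(walk) < length:
        nbrs = graph.get(walk[-1])
        if not nbrs:
            break  # dead end: no co-occurring concepts
        nodes = list(nbrs)
        weights = [nbrs[n] for n in nodes]
        walk.append(rng.choices(nodes, weights=weights, k=1)[0])
    return walk


def generate_walks(graph, walks_per_node, length, seed=0):
    """Start `walks_per_node` walks from every node, in shuffled order."""
    rng = random.Random(seed)
    walks = []
    for _ in range(walks_per_node):
        nodes = list(graph)
        rng.shuffle(nodes)
        for node in nodes:
            walks.append(weighted_walk(graph, node, length, rng))
    return walks


# Hypothetical toy annotations: one list of concept labels per image.
image_concepts = [
    ["dog", "grass", "ball"],
    ["dog", "grass", "person"],
    ["person", "car", "road"],
    ["car", "road"],
]
graph = build_cooccurrence_network(image_concepts)
walks = generate_walks(graph, walks_per_node=2, length=5)
```

Each walk is a sequence of concept labels that can be treated like a sentence. Feeding these sequences to a word-embedding model would yield one low-dimensional vector per concept, which could then be concatenated with image and question features before classification.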