The invention discloses a multimodal device for predicting the pregnancy outcome of an embryo, belonging to the field of medical artificial intelligence. First, images of embryos that have developed to the blastocyst stage after in vitro fertilization are acquired together with the corresponding pregnancy outcomes; for each embryo, three images are obtained, of the blastocyst, the inner cell mass, and the trophoblast cells, and these are labeled with the pregnancy outcome to form the annotated raw data. Each image is then smoothed with a Gaussian kernel function to remove part of the noise and is subsequently normalized. Data augmentation is then applied to the images to produce the input data. The three images are fused using a multimodal approach so that the input image contains features for all three evaluation aspects. The fused image is passed into ResNet-50 for training; the network is optimized against the target label, iterating until training is complete. Once the model has been trained, the three images of an embryo can be captured before embryo transfer and fed into the model to predict the pregnancy outcome. Embryos with a high predicted probability of success can then be selected according to the output, which can improve the final pregnancy success rate.
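The preprocessing and fusion steps described above can be illustrated with a minimal sketch. It is not the disclosed implementation: the abstract does not specify the normalization, the augmentations, or the fusion method, so the sketch assumes min-max normalization to [0, 1], a random flip and quarter-turn rotation as augmentation, and fusion by stacking the three grayscale images as the three channels of one input. All function names (`gaussian_smooth`, `normalize`, `augment`, `preprocess_and_fuse`) and the default `sigma` are hypothetical.

```python
import numpy as np


def gaussian_kernel_1d(sigma: float, radius: int) -> np.ndarray:
    """Return a 1-D Gaussian kernel whose weights sum to 1."""
    x = np.arange(-radius, radius + 1, dtype=np.float64)
    kernel = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return kernel / kernel.sum()


def gaussian_smooth(image: np.ndarray, sigma: float = 1.0) -> np.ndarray:
    """Smooth a 2-D grayscale image with a separable Gaussian kernel."""
    radius = max(1, int(3 * sigma))
    kernel = gaussian_kernel_1d(sigma, radius)
    # Reflect-pad so the output keeps the input height and width.
    padded = np.pad(image.astype(np.float64), radius, mode="reflect")
    # Separable filtering: convolve along rows, then along columns.
    rows = np.apply_along_axis(
        lambda r: np.convolve(r, kernel, mode="valid"), 1, padded
    )
    return np.apply_along_axis(
        lambda c: np.convolve(c, kernel, mode="valid"), 0, rows
    )


def normalize(image: np.ndarray) -> np.ndarray:
    """Min-max normalize to [0, 1] (assumed; the abstract only says 'normalize')."""
    lo, hi = image.min(), image.max()
    if hi == lo:
        return np.zeros_like(image, dtype=np.float64)
    return (image - lo) / (hi - lo)


def augment(image: np.ndarray, flip: bool, quarter_turns: int) -> np.ndarray:
    """Apply an optional horizontal flip and a rotation by multiples of 90 degrees."""
    out = np.fliplr(image) if flip else image
    return np.rot90(out, k=quarter_turns)


def preprocess_and_fuse(
    blastocyst: np.ndarray,
    inner_cell_mass: np.ndarray,
    trophoblast: np.ndarray,
    rng: np.random.Generator,
    sigma: float = 1.0,
) -> np.ndarray:
    """Smooth, normalize, augment, and stack three images into a (3, H, W) array."""
    # Draw the augmentation once so all three images stay spatially aligned.
    flip = bool(rng.integers(0, 2))
    quarter_turns = int(rng.integers(0, 4))
    channels = [
        augment(normalize(gaussian_smooth(img, sigma)), flip, quarter_turns)
        for img in (blastocyst, inner_cell_mass, trophoblast)
    ]
    # Assumed fusion: one image per channel, channels-first layout.
    return np.stack(channels, axis=0).astype(np.float32)


rng = np.random.default_rng(0)
# Synthetic stand-ins for the three grayscale images of one embryo.
images = [rng.random((32, 32)) * 255 for _ in range(3)]
fused = preprocess_and_fuse(*images, rng=rng)
print(fused.shape, fused.dtype)
```

Under these assumptions the fused array has the three-channel layout that a standard ResNet-50 expects, so it could be batched and passed to an existing implementation (for example `torchvision.models.resnet50` with its final layer replaced by a two-class output) and trained against the pregnancy-outcome label.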