The invention discloses a pre-training method, apparatus, device, and storage medium, and belongs to the field of computer technology. The method comprises: obtaining an initial text sentence that has undergone character masking; obtaining a target text sentence based on the masked initial text sentence and a pre-sentence additional character; determining a mask matrix corresponding to the target text sentence, wherein the mask matrix comprises a plurality of elements, each element indicates the degree of operational association, during feature extraction by a feature extraction model to be trained, between the two characters of the target text sentence to which the element corresponds, and the elements corresponding to the pre-sentence additional character are non-zero; and training the feature extraction model to be trained based on the initial text sentence, the target text sentence, and the mask matrix. With this method, both the feature vector corresponding to each character in the target text sentence and the feature vector corresponding to the target text sentence as a whole can be obtained without any additional training, which reduces the computing resources and computation time required.
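
The data-preparation steps described above (character masking, prepending the additional character, and determining the mask matrix) can be illustrated with a minimal Python sketch. It is not the claimed implementation. The token strings `[MASK]` and `[CLS]`, the function names, the masking probability, and the use of 1 as the non-zero value are all assumptions made for illustration. The abstract does not state how the elements for ordinary characters are chosen, so the sketch takes that part of the matrix as caller-supplied input and only adds the non-zero row and column for the additional character.

```python
import random

MASK_TOKEN = "[MASK]"  # hypothetical placeholder for a masked character
CLS_TOKEN = "[CLS]"    # hypothetical pre-sentence additional character


def mask_characters(chars, mask_prob, rng):
    """Replace each character with MASK_TOKEN with probability mask_prob."""
    return [MASK_TOKEN if rng.random() < mask_prob else c for c in chars]


def build_target_sentence(masked_chars):
    """Prepend the additional character to the masked sentence."""
    return [CLS_TOKEN] + list(masked_chars)


def extend_mask_matrix(base_matrix):
    """Add a leading row and column of ones for the additional character.

    base_matrix is an n x n list of lists covering the masked sentence alone.
    How its values are chosen is not given in the abstract, so it is supplied
    by the caller. The result is (n + 1) x (n + 1), and every element that
    involves the additional character (row 0 and column 0) is non-zero.
    """
    n = len(base_matrix)
    first_row = [1] * (n + 1)
    return [first_row] + [[1] + list(row) for row in base_matrix]


# Illustrative inputs only: a four-character sentence and an arbitrary base matrix.
initial = list("abcd")
masked = mask_characters(initial, mask_prob=0.5, rng=random.Random(0))
target = build_target_sentence(masked)
base = [
    [1, 0, 0, 0],
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [1, 1, 1, 1],
]
mask_matrix = extend_mask_matrix(base)
```

In a training step, `initial` would supply the characters to be recovered at the masked positions, while `target` and `mask_matrix` would be the inputs to the feature extraction model; the model and its loss are outside the scope of this sketch.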