The invention discloses a Chinese text key
information extraction method based on a pre-trained
language model, which comprises the following steps: (1) classifying the key information to be extracted, and, for information categories whose formation rules are easy to summarize, extracting them by using a regular expression matching method; (2) extracting the named entities by using a
sequence labeling model; (3) constructing the
sequence labeling model by fine-tuning a pre-trained
language model, wherein a large-scale unlabeled
text corpus is first used to learn the pre-trained
language model, and word boundary features are introduced in the pre-training stage; (4) replacing the
data content matched by the rules with the corresponding rule template
label so as to complete the fusion of
rule matching and the deep network; and (5) fine-tuning the pre-trained language model on the labeled training data, thereby transferring the pre-trained language model to the sequence labeling task of
named entity recognition. With this method, the contextual semantic features of the text can be effectively extracted, and each
information type can be effectively identified in scenarios involving
complex information types.
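
As an illustrative, non-limiting sketch of steps (1) and (4), the following Python fragment shows how rule-matchable fields might be extracted with regular expressions and then replaced by rule template labels before the text is passed to the sequence labeling model. The rule set `RULES`, the label names `PHONE` and `DATE`, the bracketed label format, and the function name `apply_rules` are all assumptions made for illustration; the source does not specify them.

```python
import re

# Hypothetical rule set (category name -> regular expression).
# These patterns are illustrative assumptions, not taken from the source.
RULES = {
    "PHONE": re.compile(r"1[3-9]\d{9}"),              # mainland mobile number
    "DATE": re.compile(r"\d{4}年\d{1,2}月\d{1,2}日"),  # e.g. 2020年5月1日
}


def apply_rules(text):
    """Extract rule-matchable fields and replace each with a template label.

    Returns the templated text and a list of (label, value) pairs.
    """
    matches = []
    for label, pattern in RULES.items():
        for m in pattern.finditer(text):
            matches.append((m.start(), m.end(), label, m.group()))

    # Process matches left to right; skip any match overlapping an earlier one.
    matches.sort()
    pieces, extracted, pos = [], [], 0
    for start, end, label, value in matches:
        if start < pos:
            continue
        pieces.append(text[pos:start])
        pieces.append(f"[{label}]")  # rule template label replaces the content
        extracted.append((label, value))
        pos = end
    pieces.append(text[pos:])
    return "".join(pieces), extracted


text = "张伟于2020年5月1日致电13812345678。"
templated, fields = apply_rules(text)
print(templated)
print(fields)
```

In this sketch, the templated text, rather than the raw text, would be the input to the fine-tuned sequence labeling model, so that the model sees one stable token per rule-matched field while it labels the remaining named entities.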