The invention discloses an iterative construction method and device for a military
scenario text event extraction corpus. The method comprises the following steps: 1, preprocessing to obtain an original
data set represented by a word sequence; 2, constructing a seed
data set, defining an event template, constructing an
event trigger word dictionary, forming the seed
data set through manual annotation, and dividing the seed data set into a seed
training set and a
test set; 3, training a model, training a
machine learning model by using the seed
training set, testing the model by using the
test set, and optimizing the
model parameters according to a test result to obtain a first learning model; 4, selecting an unlabeled training corpus, and inputting the unlabeled training corpus into the first learning model to obtain a prediction
result set; 5, correcting the prediction
result set to form a new
annotation corpus; and 6, through continuous iteration, generating training sets in sequence to form the event extraction corpus. According to the iterative construction method for the military
scenario text event extraction corpus, corpus construction efficiency is improved,
manual annotation cost is reduced, and relatively high corpus
annotation accuracy is obtained.
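
The iterative loop of steps 3 to 6 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the "model" is a stand-in trigger-word lookup rather than a real machine learning model, the test-set evaluation and parameter tuning of step 3 are omitted, and the names `train`, `predict`, `build_corpus`, and the `correct` callback (standing in for manual correction in step 5) are all hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Sentence = List[str]                 # a word sequence, as produced by step 1
Annotation = List[Tuple[int, str]]   # (token index, event type) pairs
Example = Tuple[Sentence, Annotation]


def train(examples: List[Example]) -> Dict[str, str]:
    """Stand-in for step 3: learn a trigger-word -> event-type lookup.

    Assumption: a real system would train a machine learning model here.
    """
    triggers: Dict[str, str] = {}
    for tokens, annotation in examples:
        for index, event_type in annotation:
            triggers[tokens[index]] = event_type
    return triggers


def predict(model: Dict[str, str], tokens: Sentence) -> Annotation:
    """Step 4: label every token that the current model knows as a trigger."""
    return [(i, model[t]) for i, t in enumerate(tokens) if t in model]


def build_corpus(
    seed_train: List[Example],
    unlabeled: List[Sentence],
    correct: Callable[[Sentence, Annotation], Example],
    batch_size: int = 2,
) -> List[Example]:
    """Steps 3 to 6: train, predict, correct, fold back in, repeat."""
    corpus = list(seed_train)
    model = train(corpus)
    for start in range(0, len(unlabeled), batch_size):
        batch = unlabeled[start:start + batch_size]
        # Step 4: predictions from the current model.
        predicted = [(tokens, predict(model, tokens)) for tokens in batch]
        # Step 5: corrections (hypothetical callback for the human annotator).
        corrected = [correct(tokens, annotation) for tokens, annotation in predicted]
        # Step 6: the corrected batch joins the training data and the model is retrained.
        corpus.extend(corrected)
        model = train(corpus)
    return corpus
```

In this sketch each round only asks the annotator to fix the model's predictions rather than label from scratch, which is where the reduction in manual annotation cost would come from.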