The invention relates to a method for establishing a Vietnamese
dependency treebank based on an improved Nivre
algorithm, and belongs to the technical field of
natural language processing. The method comprises the following steps: firstly, establishing an initial training corpus, an expansion corpus, and a test corpus; secondly, training two dependency
parsing weak learners, S1 and S2, on the established initial training corpus using the improved Nivre
algorithm, the two learners serving as two fully redundant views; thirdly, performing dependency
parsing on the expansion corpus with the two trained weak learners S1 and S2 and building a Vietnamese
dependency treebank model; and finally, testing dependency
parsing on the test corpus and establishing the Vietnamese
dependency treebank. The method can provide strong support for upper-layer Vietnamese language applications such as syntactic analysis,
machine translation, and information acquisition; it can effectively avoid manual annotation of the
dependency relations of Vietnamese sentences, thereby saving time, manpower, and
material resources; and it can make effective use of a large amount of unannotated Vietnamese
sentence-level corpora to improve the accuracy of dependency parsing.
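
The steps above can be illustrated with a minimal sketch of an agreement-based expansion loop. Everything in it is an assumption made for illustration: the abstract does not state how parses of the expansion corpus are selected, so the sketch simply keeps a sentence when both learners produce the same tree. The names co_train, chain_trainer, and star_trainer are hypothetical, and the two trainers are trivial stand-ins for S1 and S2, not the improved Nivre algorithm.

```python
from typing import Callable, List, Tuple

Sentence = Tuple[str, ...]
Heads = Tuple[int, ...]  # heads[i] is the 1-based head of token i+1; 0 marks the root
Example = Tuple[Sentence, Heads]
Parser = Callable[[Sentence], Heads]
Trainer = Callable[[List[Example]], Parser]


def chain_trainer(examples: List[Example]) -> Parser:
    """Hypothetical stand-in for S1: ignores the data and attaches
    each token to the token before it (the first token is the root)."""
    def parse(sentence: Sentence) -> Heads:
        return tuple(range(len(sentence)))
    return parse


def star_trainer(examples: List[Example]) -> Parser:
    """Hypothetical stand-in for S2: ignores the data and attaches
    every token to the first token (the first token is the root)."""
    def parse(sentence: Sentence) -> Heads:
        return tuple(0 if i == 0 else 1 for i in range(len(sentence)))
    return parse


def co_train(
    train_s1: Trainer,
    train_s2: Trainer,
    seed: List[Example],
    expansion: List[Sentence],
    rounds: int = 1,
) -> Tuple[List[Example], List[Sentence]]:
    """Grow the labeled set with expansion sentences on which both learners agree.

    Agreement as the selection rule is an assumption of this sketch,
    not something the abstract specifies.
    """
    labeled = list(seed)
    pool = list(expansion)
    for _ in range(rounds):
        s1 = train_s1(labeled)  # retrain both learners on the current labeled set
        s2 = train_s2(labeled)
        remaining: List[Sentence] = []
        added = 0
        for sentence in pool:
            heads_1, heads_2 = s1(sentence), s2(sentence)
            if heads_1 == heads_2:
                labeled.append((sentence, heads_1))
                added += 1
            else:
                remaining.append(sentence)
        pool = remaining
        if added == 0:  # nothing new was learned, so stop early
            break
    return labeled, pool


if __name__ == "__main__":
    seed = [(("Tôi", "ngủ"), (2, 0))]
    expansion = [("Tôi", "ăn"), ("Tôi", "ăn", "cơm"), ("Xin", "chào")]
    labeled, pool = co_train(chain_trainer, star_trainer, seed, expansion, rounds=2)
    print(len(labeled), "labeled sentences;", len(pool), "left in the pool")
```

In a real system the two trainers would be replaced by actual transition-based parsers, and the selection rule might use parser confidence rather than exact agreement; the loop structure is the only part this sketch is meant to convey.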