The invention relates to a multi-granularity word segmentation method and system based on sequence labeling modeling, and provides a method and system for acquiring multi-granularity label sequences by means of machine learning. The method comprises the following steps: sentences in at least one single-granularity labeled data set are each converted into word segmentation sequences complying with the other n-1 word segmentation specifications; the n word segmentation sequences that correspond to each sentence and comply with the different specifications are converted into a multi-granularity word segmentation hierarchical structure; a multi-granularity label for each word in each sentence is obtained according to a predetermined coding method and the multi-granularity word segmentation hierarchical structure, thereby yielding a multi-granularity label sequence for each sentence; and a sequence labeling model is trained on the data set comprising the sentences and their corresponding multi-granularity label sequences to obtain a multi-granularity sequence labeling model. The multi-granularity word segmentation method and system based on sequence labeling modeling put forward the concept of multi-granularity word segmentation for the first time, and allow the multi-granularity word segmentation hierarchical structures to be obtained quickly and automatically.
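
The label-construction step can be illustrated with a minimal sketch. The abstract does not specify the predetermined coding method, so the encoding below is an assumption: each character is treated as the labeling unit, and its label concatenates one B/I/E/S position tag per distinct word span that covers it, ordered from the finest span to the coarsest. The function names `word_spans`, `position_tag`, and `multi_granularity_labels` are hypothetical and chosen only for illustration.

```python
from typing import List, Set, Tuple

Span = Tuple[int, int]  # half-open character interval [start, end)


def word_spans(words: List[str]) -> List[Span]:
    """Convert a segmented sentence into character-offset spans."""
    spans, start = [], 0
    for word in words:
        spans.append((start, start + len(word)))
        start += len(word)
    return spans


def position_tag(i: int, start: int, end: int) -> str:
    """B/I/E/S tag of character i inside the span [start, end)."""
    if end - start == 1:
        return "S"
    if i == start:
        return "B"
    if i == end - 1:
        return "E"
    return "I"


def multi_granularity_labels(segmentations: List[List[str]]) -> List[str]:
    """Merge n segmentations of one sentence into per-character labels.

    Assumed encoding (not taken from the source): concatenate the
    position tags of all distinct covering spans, finest span first.
    """
    sentence = "".join(segmentations[0])
    for words in segmentations:
        if "".join(words) != sentence:
            raise ValueError("segmentations must cover the same sentence")

    # The union of all spans plays the role of the hierarchical structure.
    spans: Set[Span] = set()
    for words in segmentations:
        spans.update(word_spans(words))

    # A hierarchy requires nested spans; reject partially overlapping ones.
    for a, b in spans:
        for c, d in spans:
            if a < c < b < d:
                raise ValueError(f"crossing spans {(a, b)} and {(c, d)}")

    labels = []
    for i in range(len(sentence)):
        covering = sorted(
            (s for s in spans if s[0] <= i < s[1]),
            key=lambda s: s[1] - s[0],
        )
        labels.append("".join(position_tag(i, a, b) for a, b in covering))
    return labels


if __name__ == "__main__":
    fine = ["全国", "各地"]    # segmentation under one specification
    coarse = ["全国各地"]      # segmentation under another specification
    print(multi_granularity_labels([fine, coarse]))
    # prints ['BB', 'EI', 'BI', 'EE']
```

Under this assumed encoding, spans shared by several specifications are counted once, so the number of tags in a label reflects the depth of the hierarchy at that character. The resulting label sequences, paired with their sentences, would form the training data for any standard sequence labeling model.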