The invention discloses a mixed corpus word segmentation method based on a Bi-LSTM-CNN. According to the method, training mixed corpus data is converted into corpus data at a character level;
statistical analysis is performed on characters of the corpus data to obtain a character set, and each character is numbered to obtain a character number set;
statistical analysis is performed on character labels to obtain a label set, and the labels are numbered to obtain a label number set; a corpus is divided according to sentence length, the obtained sentences are grouped according to sentence length, and a data set is obtained; a
sentence group is selected from the data set at random without replacement, multiple sentences are extracted from the sentence group, the characters of each sentence form a piece of data w, and the corresponding label set is y; the data w is converted into the corresponding character numbers, these numbers and the labels y are input into the Bi-LSTM-CNN model, and parameters of the
deep learning model are trained; and to-be-predicted data is converted into data matched with the deep learning model and input into the trained deep learning model to obtain a word segmentation result.
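
The data preparation steps described above (numbering characters and labels, grouping sentences by length, and drawing sentence groups at random without replacement) can be illustrated with a minimal Python sketch. The toy corpus, the B/M/E/S label scheme, the batch size, and the names `build_index`, `group_by_length`, and `iter_batches` are assumptions made for illustration only and are not taken from the disclosure; the Bi-LSTM-CNN model itself is not shown.

```python
import random
from collections import defaultdict

# Hypothetical toy mixed corpus at the character level: each sentence is a
# list of (character, label) pairs. The B/M/E/S labels are an assumption.
corpus = [
    [("我", "S"), ("爱", "S"), ("N", "B"), ("L", "M"), ("P", "E")],
    [("深", "B"), ("度", "E"), ("学", "B"), ("习", "E")],
    [("分", "B"), ("词", "E"), ("A", "B"), ("I", "E")],
    [("你", "B"), ("好", "E")],
]

def build_index(items):
    """Number each distinct item starting from 1 (0 is left free, e.g. for
    padding or unknown characters; this reservation is an assumption)."""
    return {item: i for i, item in enumerate(sorted(set(items)), start=1)}

# Character set -> character number set, label set -> label number set.
char2id = build_index(ch for sent in corpus for ch, _ in sent)
label2id = build_index(lab for sent in corpus for _, lab in sent)

def group_by_length(sentences):
    """Group sentences so that every group holds sentences of one length."""
    groups = defaultdict(list)
    for sent in sentences:
        groups[len(sent)].append(sent)
    return list(groups.values())

def iter_batches(groups, batch_size, rng):
    """Visit the sentence groups in a shuffled order (each group is drawn
    once, i.e. without replacement) and yield numbered data w and labels y."""
    order = list(range(len(groups)))
    rng.shuffle(order)
    for gi in order:
        group = groups[gi]
        batch = rng.sample(group, min(batch_size, len(group)))
        w = [[char2id[ch] for ch, _ in sent] for sent in batch]
        y = [[label2id[lab] for _, lab in sent] for sent in batch]
        yield w, y

groups = group_by_length(corpus)
batches = list(iter_batches(groups, batch_size=2, rng=random.Random(0)))
```

Because every batch comes from a single length group, the sentences in `w` all have the same length and can be stacked into a rectangular array without padding before being passed to a model.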