The invention discloses an emotional Chinese text human voice synthesis method, which mainly comprises the steps of: (1) constructing an emotional corpus; and (2) performing emotional speech synthesis based on waveform splicing. The emotional corpus is constructed by the steps of: (11) segmenting the text into terms and acquiring the parts of speech of the terms; (12) performing speech segmentation, and acquiring the audio data corresponding to the segmented terms based on speech data features and the text corpora; and (13) performing emotion analysis, and acquiring emotional feature values of terms, clauses and whole sentences based on the text term segmentation and the audio features. The emotional speech synthesis based on waveform splicing is implemented by the steps of: (21) segmenting terms and performing emotion analysis on a text to be synthesized, and acquiring the parts of speech of words, the sentence patterns and the emotional features in the text to be synthesized; (22) selecting the optimal corpus, by matching on text feature values to obtain the optimal corpus set; and (23) performing speech synthesis and waveform splicing, by extracting a word audio sequence set from the corpus set and synthesizing the audio to output the final speech. The method is used for synthesizing and outputting real human voice speech with emotional features.
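
Steps (22) and (23) can be illustrated with a minimal Python sketch. It is not the claimed implementation: the abstract does not specify the matching criterion or the splicing procedure, so everything below is an assumption made for illustration. The names `CorpusEntry`, `select_entry`, `splice` and `synthesize` are hypothetical, the emotional feature is reduced to a single scalar, matching prefers an identical part of speech and then the nearest emotion value, and splicing is plain concatenation with an optional linear cross-fade. The input is assumed to be already segmented and tagged as in step (21).

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class CorpusEntry:
    """One recorded term in the emotional corpus (hypothetical layout)."""
    term: str
    pos: str              # part-of-speech tag from step (11)
    emotion: float        # assumed scalar emotional feature value from step (13)
    samples: List[float]  # audio samples cut out in step (12)


def select_entry(term: str, pos: str, emotion: float,
                 corpus: Sequence[CorpusEntry]) -> CorpusEntry:
    """Step (22), sketched: pick the best-matching recording of a term."""
    candidates = [e for e in corpus if e.term == term]
    if not candidates:
        raise KeyError(f"no recording for term {term!r}")
    # Assumed ranking: same part of speech first, then closest emotion value.
    return min(candidates,
               key=lambda e: (e.pos != pos, abs(e.emotion - emotion)))


def splice(segments: Sequence[Sequence[float]], overlap: int = 0) -> List[float]:
    """Step (23), sketched: join waveforms, cross-fading `overlap` samples."""
    out: List[float] = []
    for seg in segments:
        seg = list(seg)
        n = min(overlap, len(out), len(seg))
        start = len(out) - n
        for i in range(n):
            w = (i + 1) / (n + 1)  # linear fade-in weight of the new segment
            out[start + i] = (1 - w) * out[start + i] + w * seg[i]
        out.extend(seg[n:])
    return out


def synthesize(tagged_terms: Sequence[Tuple[str, str, float]],
               corpus: Sequence[CorpusEntry], overlap: int = 0) -> List[float]:
    """Select one recording per (term, pos, emotion) triple, then splice."""
    chosen = [select_entry(t, p, e, corpus) for t, p, e in tagged_terms]
    return splice([c.samples for c in chosen], overlap)


# Toy corpus: two recordings of the same term with different emotion values.
corpus = [
    CorpusEntry("你好", "v", 0.2, [0.1, 0.1]),
    CorpusEntry("你好", "v", 0.9, [0.8, 0.8]),
    CorpusEntry("世界", "n", 0.5, [0.3, 0.3]),
]
speech = synthesize([("你好", "v", 0.8), ("世界", "n", 0.5)], corpus)
```

In this toy run the second recording of "你好" is chosen because its emotion value is closest to the requested one. A practical system would match on richer features (sentence pattern, clause-level and sentence-level emotion) and would smooth joins on real audio rather than on short sample lists.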