The invention provides a speech annotation method for a Chinese emotional speech database that makes use of electroglottography (EGG). The annotation comprises eight layers of information, all of which are annotated for each utterance. The first layer is a text transcription layer, which records what the speaker said together with the accompanying paralinguistic information; the second layer is a syllable layer, in which the standard spelling and the tone of each syllable are annotated; the third layer is an initial/final layer, in which the initial and the final of each syllable in the syllable layer are annotated separately, together with tone information; the fourth layer is an unvoiced/voiced/silence layer, in which the unvoiced, voiced and silent segments of the utterance are delimited with the aid of the electroglottographic signal; the fifth layer is a paralinguistic information layer, in which the paralinguistic information contained in each utterance is annotated; the sixth layer is an emotion layer, in which each utterance is annotated, according to the emotional state expressed by the speaker, with one of seven kinds of emotion and the degree to which that emotion is expressed; the seventh layer is a stress index layer, in which the pronunciation intensity of each utterance is annotated; and the eighth layer is a statement function layer, in which the statement type of each statement is annotated.
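As a rough illustration of how the eight layers could be held together for one utterance, the following minimal sketch stores each layer as a list of time-aligned labels. It is not part of the described method: the tier identifiers, the `Interval` and `UtteranceAnnotation` classes, the use of seconds as the time unit, and the label strings in the usage note are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical identifiers for the eight layers, in the order given in the text.
TIER_NAMES = (
    "text",                # layer 1: transcription
    "syllable",            # layer 2: spelling and tone per syllable
    "initial_final",       # layer 3: initial and final, with tone
    "voicing",             # layer 4: unvoiced / voiced / silence (EGG-assisted)
    "paralanguage",        # layer 5: paralinguistic information
    "emotion",             # layer 6: emotion kind and degree
    "stress",              # layer 7: stress index
    "statement_function",  # layer 8: statement type
)


@dataclass
class Interval:
    """One labelled time span; start and end are assumed to be in seconds."""
    start: float
    end: float
    label: str


def _empty_tiers() -> Dict[str, List[Interval]]:
    return {name: [] for name in TIER_NAMES}


@dataclass
class UtteranceAnnotation:
    """Container for the eight annotation layers of a single utterance."""
    utterance_id: str
    tiers: Dict[str, List[Interval]] = field(default_factory=_empty_tiers)

    def add(self, tier: str, start: float, end: float, label: str) -> None:
        """Append a labelled interval to one of the eight tiers."""
        if tier not in self.tiers:
            raise KeyError(f"unknown tier: {tier}")
        if end <= start:
            raise ValueError("interval end must be greater than start")
        self.tiers[tier].append(Interval(start, end, label))

    def missing_tiers(self) -> List[str]:
        """Return the tiers that have no intervals yet, in layer order."""
        return [name for name in TIER_NAMES if not self.tiers[name]]
```

A usage sketch, with placeholder labels: `ann = UtteranceAnnotation("utt001")`, then `ann.add("syllable", 0.0, 0.25, "ni3")` and `ann.add("voicing", 0.0, 0.25, "V")`; calling `ann.missing_tiers()` afterwards lists the six layers that still need annotation. Keeping every layer in one object makes it easy to check that no layer has been skipped for an utterance.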