The invention discloses a method and a system for forming a voice abstract of a conference, and relates to the field of voice recognition. According to the invention, the speaking position, identity information, personnel data, and other information of each speaker in a conference are analyzed to determine a weight coefficient for that speaker. Candidate key speaking segments corresponding to different speakers are then obtained by applying different preset strategies according to the weight coefficients of the speakers. Furthermore, a set of candidate key speaking segments is intercepted according to characteristics of the speaking content, for example, the high-probability interval in which the important content of an utterance falls on the speaking time axis, and the key transitional words and key connecting words in the important content of a statement. The intercepted set of candidate key speaking segments is processed to obtain a set of audio/video segments for forming the voice abstract. As a result, more content is extracted from important statements and less content is extracted from non-important statements, so that the content of the finally formed abstract is more reasonable and provides more effective help to users.
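The flow described above (a weight coefficient per speaker, a weight-dependent preset strategy, and interception guided by time-axis position and cue words) can be illustrated with a minimal Python sketch. It is not the claimed implementation. Every name and number in it is an assumption made for illustration: the `Sentence` type, the `speaker_weight` formula, the `keep_ratio` thresholds, the `CUE_WORDS` list, and the prior that the opening and closing fifths of a speech are more likely to carry important content.

```python
import math
from dataclasses import dataclass
from typing import List

# Hypothetical cue words standing in for "key transitional/connecting words".
CUE_WORDS = ("therefore", "however", "in summary", "to conclude")


@dataclass
class Sentence:
    start: float  # seconds from the start of the recording
    end: float
    text: str


def speaker_weight(is_chair: bool, rank: int) -> float:
    """Toy weight coefficient in [0, 1] from identity and personnel data (assumed formula)."""
    return min(1.0, (0.5 if is_chair else 0.2) + 0.1 * rank)


def keep_ratio(weight: float) -> float:
    """Assumed preset strategy: a higher weight keeps a larger share of the speech."""
    return 0.5 if weight >= 0.5 else 0.25


def score(sentence: Sentence, index: int, total: int) -> float:
    """Score one sentence by its position on the time axis and by cue words."""
    position = index / max(total - 1, 1)  # 0.0 = first sentence, 1.0 = last
    # Assumed prior: the first and last fifth of a speech are more likely important.
    prior = 1.0 if position <= 0.2 or position >= 0.8 else 0.0
    cue = 1.0 if any(word in sentence.text.lower() for word in CUE_WORDS) else 0.0
    return prior + cue


def candidate_segments(sentences: List[Sentence], weight: float) -> List[Sentence]:
    """Intercept the top-scoring sentences of one speech, returned in time order."""
    total = len(sentences)
    keep = math.ceil(total * keep_ratio(weight))
    # sorted() is stable, so ties keep their original (chronological) order.
    ranked = sorted(range(total), key=lambda i: score(sentences[i], i, total), reverse=True)
    return [sentences[i] for i in sorted(ranked[:keep])]
```

Under these assumptions, the same four-sentence speech yields two candidate segments for a high-weight speaker and one for a low-weight speaker, which mirrors the stated effect of extracting more content from important statements and less from non-important ones.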