The invention relates to a method for extracting characters from
digital video based on character segmentation and color clustering, which comprises the following steps: (1) character segmentation: using the differing characteristics of character areas and
inter-character gap areas to perform
vertical projection and segment the image of the character area, namely, segmenting each row image containing a plurality of characters into a plurality of sub-area images each containing only a
single character, so as to reduce the difficulty of subsequent processing and improve the recognition accuracy of OCR; and (2) character extraction: first, clustering colors according to the color characteristics of the characters in the image, selecting the
image layer containing the most character information as the target
image layer, and deleting the background area; and then, using the connectivity of the characters to perform connected-component analysis on the target
image layer and removing non-character areas, to obtain three results, namely
single-character images, a whole image of the character area, and a whole image stitched together from the
single-character images, wherein all three results are input to an OCR
system for recognition, and the latter two results make use of the semantic
processing function of the OCR system, which can accurately distinguish characters of similar shape according to context, thereby improving the recognition result.
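
The character segmentation of step (1) can be illustrated with a minimal sketch. It is not the claimed implementation; it assumes the row image has already been binarized into a list of pixel rows in which 1 marks a character pixel and 0 marks background. The function names and the parameters `threshold` and `min_gap` are hypothetical and do not appear in the invention.

```python
from typing import List, Tuple


def vertical_projection(binary_row: List[List[int]]) -> List[int]:
    """Count the character pixels (value 1) in each column of a row image."""
    if not binary_row:
        return []
    width = len(binary_row[0])
    return [sum(row[x] for row in binary_row) for x in range(width)]


def split_characters(
    binary_row: List[List[int]],
    threshold: int = 0,   # assumption: columns with count <= threshold are gaps
    min_gap: int = 1,     # assumption: narrower gaps do not split a character
) -> List[Tuple[int, int]]:
    """Return (start, end) column ranges, end exclusive, one per sub-area."""
    profile = vertical_projection(binary_row)
    spans: List[Tuple[int, int]] = []
    start = None  # first column of the sub-area currently being scanned
    gap = 0       # number of consecutive gap columns seen inside it
    for x, count in enumerate(profile):
        if count > threshold:
            if start is None:
                start = x
            gap = 0
        elif start is not None:
            gap += 1
            if gap >= min_gap:
                # The last character column was x - gap, so the exclusive end
                # of the sub-area is x - gap + 1.
                spans.append((start, x - gap + 1))
                start = None
                gap = 0
    if start is not None:
        # The row ended inside a sub-area; drop any trailing gap columns.
        spans.append((start, len(profile) - gap))
    return spans


# Toy row image, 2 pixels high and 8 pixels wide, holding three ink runs.
row_image = [
    [1, 1, 0, 0, 1, 0, 1, 1],
    [1, 0, 0, 0, 1, 0, 0, 1],
]
print(split_characters(row_image))             # [(0, 2), (4, 5), (6, 8)]
print(split_characters(row_image, min_gap=2))  # [(0, 2), (4, 8)]
```

Raising `min_gap` keeps characters that contain an internal one-column gap from being split in two, at the cost of merging characters that sit very close together; a suitable value would depend on font size and video resolution.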