A method and apparatus for automatically generating tags for digital content are provided. The method is adapted to be run on a computer, which is an example of the type of apparatus which may generate the tags. The generated tags describe the digital content, and may be used as topics for the content to organize, retrieve, and process the content. The tag generation begins by accessing content from a content collection unit and a tags candidate tag database unit, which are then processed using techniques from computational linguistics in a multi-pass process that generates sets of tags, then refines and normalizes them. Finally, scores are generated and stored along with the tags.