Systems and methods for generating hyperlinks and anchor text from data such as reference text in HTML and non-HTML documents are disclosed. The method generally includes locating a text reference in a source document, searching with a search engine for a target document relating to the text reference, computing anchor text from the text reference, generating a hyperlink to the target document, and associating the hyperlink with the computed anchor text. The locating and/or computing may be based on a respective statistical model of text formatting and/or lexical cues. The text reference may be parsed into pieces such that the searching, computing, generating, and associating are performed for each piece of text. The source document may be an HTML or non-HTML document. The text reference may be a reference to, for example, a paper, article, company, institution, product, search engine, image, object, or geographical location.
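
The sequence of steps described above (locate, search, compute anchor text, generate a hyperlink, associate) can be illustrated with a minimal Python sketch. It is not the disclosed implementation. The names `REFERENCE_PATTERN`, `link_references`, and the `search` callback are hypothetical; a simple regular expression that treats double-quoted titles as text references stands in for the statistical model of formatting and lexical cues, and the search engine is assumed to be any function that returns a URL or `None`.

```python
import html
import re
from typing import Callable, Optional

# Assumption: a regex over double-quoted titles stands in for the
# statistical model of text formatting and lexical cues.
REFERENCE_PATTERN = re.compile(r'"([^"]+)"')


def link_references(text: str, search: Callable[[str], Optional[str]]) -> str:
    """Return HTML in which each located text reference is hyperlinked.

    `search` is a hypothetical search-engine wrapper: it takes a query
    string and returns the URL of a target document, or None.
    """
    parts = []
    last = 0
    for match in REFERENCE_PATTERN.finditer(text):
        reference = match.group(1)  # locate the text reference
        url = search(reference)     # search for a target document
        # Copy the text between the previous reference and this one.
        parts.append(html.escape(text[last:match.start()], quote=False))
        if url is None:
            # No target document found: leave the reference unlinked.
            parts.append(html.escape(match.group(0), quote=False))
        else:
            # Compute anchor text (here simply the quoted title), generate
            # the hyperlink, and associate the two in an <a> element.
            anchor = html.escape(reference, quote=False)
            href = html.escape(url, quote=True)
            parts.append(f'"<a href="{href}">{anchor}</a>"')
        last = match.end()
    parts.append(html.escape(text[last:], quote=False))
    return "".join(parts)
```

In this sketch, each match of the pattern plays the role of one parsed piece of the text reference, so the search, anchor-text, and hyperlink steps run once per piece. A reference for which the search returns nothing is left as plain text.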