A Chinese word segmentation method based on navigation information retrieval is characterized in that a word segmentation system is obtained through the following steps: a dictionary is loaded and the text encoding is converted; the source character string is split into a plurality of simpler short sentences; atomic word segmentation is carried out to obtain the smallest morpheme units in each short sentence that cannot be segmented further; full-match word formation is performed with a word-by-word traversal matching method; the matching results are screened to generate a plurality of best results; personal names, place names and proper nouns are processed; the dictionary is corrected, mainly by adding unlisted new words and refining the attributes of existing words; and finally the processing results of all the short sentences are combined and output. The
Chinese word segmentation method has the following advantages: content entered by a user can be formed into words through Chinese word segmentation, the speed can be optimized, wrongly written characters can be corrected on the basis of the words, and a more suitable result can be provided. With Chinese word segmentation, an information retrieval engine can better understand semantics, and the result set it provides can be fully adjusted.
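
The core steps of the method (splitting into short sentences, atomic segmentation, full-match traversal, screening, and combining) might be sketched as follows. This is only an illustrative sketch under stated assumptions, not the claimed implementation: the toy dictionary and its frequencies, all function names, the `max_len` limit, and the screening rule (fewest words, ties broken by higher total frequency) are hypothetical, since the abstract does not specify them. For brevity the sketch keeps a single best result per short sentence rather than a plurality, and it omits dictionary loading, encoding conversion, name handling and dictionary correction.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical toy dictionary (word -> frequency); not taken from the source.
DICTIONARY: Dict[str, int] = {
    "导航": 50, "信息": 80, "检索": 40, "信息检索": 30, "中文": 60, "分词": 45,
}

# Punctuation and whitespace that end a short sentence (assumed set).
CLAUSE_BREAK = re.compile(r"[，。！？；、,.!?;\s]+")
# An atom is a run of Latin letters, a run of digits, or any single character.
ATOM = re.compile(r"[A-Za-z]+|[0-9]+|.")


def split_clauses(text: str) -> List[str]:
    """Split the source string into short sentences at punctuation."""
    return [c for c in CLAUSE_BREAK.split(text) if c]


def atomic_segment(clause: str) -> List[str]:
    """Return the smallest units of a short sentence."""
    return ATOM.findall(clause)


def full_match(atoms: List[str], dictionary: Dict[str, int],
               max_len: int = 4) -> List[Tuple[int, int]]:
    """Traverse every start position and record all spans found in the dictionary."""
    edges = []
    for i in range(len(atoms)):
        edges.append((i, i + 1))  # a single atom is always a fallback candidate
        for j in range(i + 2, min(len(atoms), i + max_len) + 1):
            if "".join(atoms[i:j]) in dictionary:
                edges.append((i, j))
    return edges


def best_path(atoms: List[str], edges: List[Tuple[int, int]],
              dictionary: Dict[str, int]) -> List[str]:
    """Screen candidates: fewest words first, then highest total frequency."""
    n = len(atoms)
    ends_by_start: Dict[int, List[int]] = {}
    for i, j in edges:
        ends_by_start.setdefault(i, []).append(j)
    # best[i] = (word count, negated frequency sum, words) for atoms[i:]
    best: List[tuple] = [()] * (n + 1)
    best[n] = (0, 0, [])
    for i in range(n - 1, -1, -1):
        candidates = []
        for j in ends_by_start[i]:
            word = "".join(atoms[i:j])
            count, neg_freq, words = best[j]
            candidates.append(
                (count + 1, neg_freq - dictionary.get(word, 0), [word] + words)
            )
        best[i] = min(candidates)
    return best[0][2]


def segment(text: str, dictionary: Dict[str, int] = DICTIONARY) -> List[str]:
    """Segment each short sentence and combine the results in order."""
    words: List[str] = []
    for clause in split_clauses(text):
        atoms = atomic_segment(clause)
        words.extend(best_path(atoms, full_match(atoms, dictionary), dictionary))
    return words


if __name__ == "__main__":
    print(segment("中文分词，导航信息检索2024"))
    # ['中文', '分词', '导航', '信息检索', '2024']
```

Processing each short sentence independently keeps the candidate graph small, which is one plausible reason the method splits the source string before matching.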