The invention discloses an automatic new word discovery system and an automatic
new word discovery method based on a query log. The automatic new word discovery
system mainly comprises a
query log preprocessing module, a new word discovery module and a new word generation module. The
query log preprocessing module periodically acquires query strings, query frequencies and the like from the query log, at the interval set for scheduled new word discovery. The new word discovery module counts the frequency of identical n-gram strings according to the word segmentation result of the query strings, computes the co-occurrence rate of the n-gram strings, and merges primary and secondary strings of similar frequency in a candidate new word set. The new word generation module applies filtering and pruning strategies to the candidate new word set and removes garbage strings from it, so that a final new word set is obtained. The system and the method solve problems such as the difficulty of acquiring a corpus in existing statistical methods and the difficulty of extending rule-based methods. New words can be discovered automatically and simply from the query log by means of the co-occurrence rate of word strings and the auxiliary filtering strategy, and the final new word set is imported into the word segmentation dictionary, so that incremental new word discovery is realized.
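
The three stages described above can be illustrated with a minimal Python sketch. It is not the claimed implementation. The abstract does not define the co-occurrence rate, the merging rule or the filtering strategy, so the sketch assumes the following: the co-occurrence rate is the n-gram frequency divided by the frequency of its most frequent constituent token; a shorter candidate is merged into a longer candidate that contains it when their frequencies are similar; and garbage strings are those containing a token from a stop list. All function names, thresholds and sample tokens are hypothetical.

```python
from collections import Counter


def count_ngrams(segmented_queries, max_n=3):
    """Count n-grams over (tokens, query_frequency) pairs, weighted by query frequency."""
    counts = Counter()
    for tokens, freq in segmented_queries:
        for n in range(1, max_n + 1):
            for i in range(len(tokens) - n + 1):
                counts[tuple(tokens[i:i + n])] += freq
    return counts


def cooccurrence_rate(gram, counts):
    """Assumed definition: n-gram frequency over its most frequent token's frequency."""
    return counts[gram] / max(counts[(tok,)] for tok in gram)


def is_contiguous_sub(short, long):
    """True if `short` appears as a contiguous run of tokens inside `long`."""
    k = len(short)
    return any(long[i:i + k] == short for i in range(len(long) - k + 1))


def discover_new_words(segmented_queries, min_freq=10, min_rate=0.5,
                       similarity=0.9, stop_tokens=frozenset()):
    counts = count_ngrams(segmented_queries)

    # Discovery: keep multi-token n-grams that are frequent and cohesive.
    candidates = {
        gram: c for gram, c in counts.items()
        if len(gram) >= 2 and c >= min_freq
        and cooccurrence_rate(gram, counts) >= min_rate
    }

    # Merging (assumed rule): drop a shorter string when a longer candidate
    # contains it and the two frequencies are similar.
    merged = {}
    for short, f_short in candidates.items():
        absorbed = any(
            len(long) > len(short)
            and is_contiguous_sub(short, long)
            and min(f_short, f_long) / max(f_short, f_long) >= similarity
            for long, f_long in candidates.items()
        )
        if not absorbed:
            merged[short] = f_short

    # Filtering (assumed rule): remove strings that contain a stop token.
    return {g: c for g, c in merged.items() if not set(stop_tokens) & set(g)}


# Hypothetical preprocessed log: (word segmentation result, query frequency).
sample_log = [
    (["su", "per", "app", "download"], 20),
    (["su", "per", "app"], 30),
    (["app", "store"], 4),
    (["download", "music"], 25),
]
new_words = discover_new_words(sample_log, stop_tokens={"download"})
print(new_words)  # {('su', 'per', 'app'): 50}
```

In this sample, the strings `su per` and `per app` have the same frequency as `su per app` and are merged into it, while `download music` passes the frequency and co-occurrence thresholds but is removed by the stop-token filter. Adding the surviving strings to the word segmentation dictionary before the next scheduled run would correspond to the incremental discovery described above.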