An n-gram statistical analysis is used to find frequently appearing character strings of n characters or more, and each such character string is replaced by a 1-byte character translation code. The correspondence between the original character strings and their character translation codes is registered in a character translation code table. Assume, for example, that the three-character (3-byte) string "sta" is registered as the 1-byte code "e5" and that the four-character (4-byte) string "tion" is registered as the 1-byte code "f1." Then the word "station," a seven-character (7-byte) string, is represented by the 2-byte code "e5 f1," which saves five bytes.
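The substitution described above can be illustrated with a minimal Python sketch. It is not taken from the text: the names `TRANSLATION_TABLE`, `encode`, and `decode` are hypothetical, the table holds only the two entries of the example, and the sketch assumes that the input is 7-bit ASCII (so that byte values such as e5 and f1 cannot collide with ordinary characters) and that the longest registered string is preferred when several match at the same position.

```python
# Hypothetical character translation code table; the two entries are the
# ones used in the "station" example.
TRANSLATION_TABLE = {b"sta": 0xE5, b"tion": 0xF1}


def encode(text: bytes, table=TRANSLATION_TABLE) -> bytes:
    """Replace registered character strings by their 1-byte codes."""
    # Assumption: input is 7-bit ASCII, so codes >= 0x80 are unambiguous.
    if any(b >= 0x80 for b in text):
        raise ValueError("this sketch only handles 7-bit ASCII input")
    # Assumption: try longer strings first so the longest match wins.
    keys = sorted(table, key=len, reverse=True)
    out = bytearray()
    i = 0
    while i < len(text):
        for key in keys:
            if text.startswith(key, i):
                out.append(table[key])  # emit the 1-byte code
                i += len(key)
                break
        else:
            out.append(text[i])  # no registered string here; copy the byte
            i += 1
    return bytes(out)


def decode(data: bytes, table=TRANSLATION_TABLE) -> bytes:
    """Expand 1-byte codes back into the original character strings."""
    reverse = {code: key for key, code in table.items()}
    out = bytearray()
    for b in data:
        out += reverse.get(b, bytes([b]))
    return bytes(out)


encoded = encode(b"station")
print(encoded.hex(" "))                # e5 f1
print(len(b"station") - len(encoded))  # 5
```

Here `encode(b"station")` yields the two bytes e5 f1, and `decode` restores the original seven bytes, matching the five-byte saving in the example.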