The invention discloses a method for computing text similarity based on the Gini index. The method comprises the following steps: segmenting the text into words by use of a word segmentation technique; matching the resulting vocabulary against a stop word list to eliminate stop words; obtaining a series of vocabulary positions and word characteristic weight values according to statistical analysis; selecting characteristic words and reducing the dimensionality of the text vocabulary by use of a target weight function as shown in the description; merging vocabulary items with high similarity according to their semantic similarity, thereby selecting and reducing the dimensionality of the characteristic words a second time; and computing the inter-textual similarity from the similarity between the resulting vectors. Compared with traditional methods for extracting text characteristic vocabulary, the method disclosed by the invention is higher in accuracy, better in application value, and good in data processing effect. The defects of the information gain method are overcome, the results agree more closely with empirical values, the high-dimensional sparsity problem of the text characteristic vocabulary and the problem of synonyms and polysemes are solved, the contribution degrees of different vocabulary items to the main idea of the text are computed, and a good theoretical basis is provided for subsequent text similarity computation and text clustering.
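
The first and last steps of the method (stop word elimination and vector similarity) can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the disclosed implementation: whitespace splitting stands in for the word segmentation technique, `STOP_WORDS` is a hypothetical stop word list, raw term frequency stands in for the target weight function (which is given only in the description and is not reproduced here), and cosine similarity is assumed as the similarity between vectors. The Gini-index weighting and the merging of semantically similar vocabulary are omitted. All function names are hypothetical.

```python
import math
from collections import Counter

# Hypothetical stop word list; a real system would load a full list.
STOP_WORDS = {"the", "a", "an", "of", "is", "on", "and"}


def tokenize(text):
    """Lowercase, split on whitespace, and drop stop words.

    Whitespace splitting is a stand-in for a real word segmenter.
    """
    return [w for w in text.lower().split() if w not in STOP_WORDS]


def weight_vector(tokens):
    """Map each word to a weight.

    Raw term frequency is used here as a placeholder (an assumption)
    for the target weight function, which is not reproduced.
    """
    return Counter(tokens)


def cosine_similarity(u, v):
    """Cosine of the angle between two sparse word-weight vectors."""
    dot = sum(u[t] * v[t] for t in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # an empty vector is treated as similar to nothing
    return dot / (norm_u * norm_v)


def text_similarity(text_a, text_b):
    """Similarity of two texts after stop word elimination."""
    return cosine_similarity(
        weight_vector(tokenize(text_a)),
        weight_vector(tokenize(text_b)),
    )
```

For example, `text_similarity("the cat sat", "the dog sat")` compares the vectors for `cat, sat` and `dog, sat`, which share one of two words each. Replacing `weight_vector` with a Gini-index-based weight and merging synonyms before vectorization would bring the sketch closer to the steps listed above.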