The invention discloses a capital market public opinion monitoring method based on a distributed
web crawler and natural language processing (NLP). The method comprises a cloud-server-based
distributed crawler module and a financial text NLP analysis system. The cloud-server-based
distributed crawler performs multi-process, periodically updated capture of public information,
and the system scale can be adjusted quickly according to usage requirements. A financial word
segmentation lexicon and a positive/negative sentiment lexicon are constructed; during corpus
construction, the cost of manual annotation is reduced through mixed sample inspection, fuzzy
clustering and other algorithms. The probability of a text's positive or negative sentiment
tendency is calculated with a supervised learning algorithm, and sentiment indexes are synthesized
with an autoencoder algorithm, which improves the accuracy of sentiment judgment. The distributed
architecture avoids a system crash caused by a single-node fault; the reconstructed financial
sentiment lexicon and the text analysis algorithm improve the validity of the sentiment index, so
that market sentiment is reflected dynamically and real-time capital market data is provided to
the user.
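
The step that computes a positive/negative tendency probability for a text from a lexicon and a supervised learner can be illustrated with a minimal sketch. The abstract does not name the supervised algorithm, so this sketch assumes a multinomial Naive Bayes classifier with Laplace smoothing. The lexicon entries, the training texts, and the names `NaiveBayesSentiment` and `sentiment_index` are hypothetical. The final index is a plain average of the probabilities, used here only as a simple stand-in for the autoencoder-based synthesis, which is not reproduced.

```python
import math
from collections import Counter


def tokenize(text, lexicon):
    """Keep only whitespace-separated tokens that appear in the lexicon."""
    return [tok for tok in text.lower().split() if tok in lexicon]


class NaiveBayesSentiment:
    """Multinomial Naive Bayes over lexicon words (assumed algorithm)."""

    def __init__(self, lexicon):
        self.lexicon = set(lexicon)
        self.word_counts = {"pos": Counter(), "neg": Counter()}
        self.doc_counts = Counter()

    def fit(self, texts, labels):
        # Count documents per class and lexicon-word occurrences per class.
        for text, label in zip(texts, labels):
            self.doc_counts[label] += 1
            self.word_counts[label].update(tokenize(text, self.lexicon))
        return self

    def _log_score(self, tokens, label):
        # log prior + sum of Laplace-smoothed log likelihoods.
        total_docs = sum(self.doc_counts.values())
        score = math.log(self.doc_counts[label] / total_docs)
        total_words = sum(self.word_counts[label].values())
        vocab = len(self.lexicon)
        for tok in tokens:
            count = self.word_counts[label][tok]
            score += math.log((count + 1) / (total_words + vocab))
        return score

    def prob_positive(self, text):
        """Posterior probability that the text has a positive tendency."""
        tokens = tokenize(text, self.lexicon)
        pos = self._log_score(tokens, "pos")
        neg = self._log_score(tokens, "neg")
        # Normalize the two log scores in a numerically stable way.
        top = max(pos, neg)
        exp_pos, exp_neg = math.exp(pos - top), math.exp(neg - top)
        return exp_pos / (exp_pos + exp_neg)


def sentiment_index(model, texts):
    """Map the mean positive probability to [-1, 1] (stand-in aggregation)."""
    probs = [model.prob_positive(t) for t in texts]
    return 2 * sum(probs) / len(probs) - 1


# Hypothetical lexicon and labeled corpus, for illustration only.
LEXICON = {"surge", "profit", "upgrade", "loss", "default", "downgrade"}
TRAIN_TEXTS = [
    "shares surge on record profit",
    "analysts upgrade after profit beat",
    "firm reports heavy loss",
    "bond default triggers downgrade",
]
TRAIN_LABELS = ["pos", "pos", "neg", "neg"]

model = NaiveBayesSentiment(LEXICON).fit(TRAIN_TEXTS, TRAIN_LABELS)
```

Calling `model.prob_positive("profit surge")` yields a probability above 0.5 on this toy corpus, while a text containing no lexicon words falls back to the class prior. `sentiment_index(model, texts)` then condenses a batch of captured texts into one value between -1 and 1.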