A subject-class-based cross-lingual biomedical research paper information recommendation method

A biomedical information recommendation technology, applied to subject-based cross-language recommendation of biomedical academic papers. It addresses recommendation results that deviate from users' real needs and the poor performance of translation models, handling polysemy and synonymy while reducing dependence on translation methods.

Pending Publication Date: 2019-01-22
SUN YAT SEN UNIV

AI Technical Summary

Problems solved by technology

These phenomena (such as polysemy and synonymy) often occur when processing document information. They directly affect the recommendation results and make them deviate from the real needs of users.
In cross-language retrieval, many scholars have tried various methods to improve machine translation, but these methods still depend on translation dictionaries or bilingual parallel corpora. The specialized and distinctive nature of academic literature makes translation models harder to build and lowers their performance.



Examples

Embodiment

[0042] The present invention is a cross-language biomedical academic paper information recommendation method based on topic clustering. The method is mainly divided into two parts: offline text information processing and online document recommendation.

[0043] The first part, offline text information processing, extracts information from academic literature to obtain vectors that reflect the theme of each document. This part of the work is divided into the following four steps, as shown in Figure 1.

[0044] S1: First, perform data preprocessing on the text data.

[0045] S2: Using the word frequency information obtained from data preprocessing, apply the PLSA (probabilistic latent semantic analysis) model to cluster the text and obtain the subject grouping of each academic document.

[0046] S3: Calculate the word vector information of each topic group to obtain a vector representation of each group.

[0047] S4: Use the translation rel...
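
Steps S1 and S2 can be illustrated with a minimal sketch, assuming the model named in S2 is probabilistic latent semantic analysis (PLSA) fitted by expectation-maximization. The whitespace tokenizer, the function names `tokenize`, `plsa`, and `topic_groups`, the iteration count, and the toy documents are all hypothetical choices made for illustration. The patent's actual preprocessing, parameters, and its steps S3 and S4 are not reproduced here.

```python
import random
from collections import Counter


def tokenize(text):
    """S1 (assumed preprocessing): lowercase and split on whitespace."""
    return text.lower().split()


def plsa(docs, n_topics, n_iter=50, seed=0):
    """Fit PLSA with EM on raw word counts; return (vocab, P(z|d), P(w|z))."""
    rng = random.Random(seed)
    counts = [Counter(tokenize(d)) for d in docs]
    vocab = sorted({w for c in counts for w in c})

    def random_dist(n):
        vals = [rng.random() + 1e-3 for _ in range(n)]
        total = sum(vals)
        return [v / total for v in vals]

    # Random initial distributions: P(z|d) per document, P(w|z) per topic.
    p_z_d = [random_dist(n_topics) for _ in docs]
    p_w_z = [dict(zip(vocab, random_dist(len(vocab)))) for _ in range(n_topics)]

    for _ in range(n_iter):
        new_wz = [dict.fromkeys(vocab, 0.0) for _ in range(n_topics)]
        new_zd = [[0.0] * n_topics for _ in docs]
        for d, c in enumerate(counts):
            for w, n in c.items():
                # E-step: P(z|d,w) is proportional to P(z|d) * P(w|z).
                post = [p_z_d[d][z] * p_w_z[z][w] for z in range(n_topics)]
                total = sum(post) or 1.0
                for z in range(n_topics):
                    r = n * post[z] / total
                    new_wz[z][w] += r
                    new_zd[d][z] += r
        # M-step: renormalize the expected counts into probabilities.
        for z in range(n_topics):
            s = sum(new_wz[z].values()) or 1.0
            p_w_z[z] = {w: v / s for w, v in new_wz[z].items()}
        for d in range(len(docs)):
            s = sum(new_zd[d]) or 1.0
            p_z_d[d] = [v / s for v in new_zd[d]]
    return vocab, p_z_d, p_w_z


def topic_groups(p_z_d):
    """S2 (assumed grouping rule): assign each document to its most probable topic."""
    return [max(range(len(row)), key=row.__getitem__) for row in p_z_d]


# Toy corpus, invented for illustration only.
docs = [
    "gene protein gene expression",
    "protein gene sequence",
    "heart surgery patient",
    "patient heart valve surgery",
]
vocab, p_z_d, p_w_z = plsa(docs, n_topics=2)
print(topic_groups(p_z_d))
```

Assigning each document to its single most probable topic is one simple way to form topic groups. A soft assignment that keeps the full P(z|d) row would be an equally plausible reading of S2.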


Abstract

The present invention relates to the technical field of information retrieval and recommendation systems, and more particularly to a subject-class-based cross-lingual biomedical research paper information recommendation method. The method mainly comprises the following steps: preprocessing the text data; applying the PLSA model to cluster the text; calculating the word vector information of each subject group; obtaining, for each subject, the number of its most relevant cross-language subject; reading the search phrase entered by the user; judging the user's search phrase; and obtaining recommendation results for Chinese articles and for English literature. The invention reduces the dimensionality of text analysis from the word-frequency space to the subject space. This dimensionality reduction effectively reduces the model's dependence on translation methods, which benefits cross-language analysis of literature features. At the same time, the topic model effectively mines the semantic information in documents, discovers latent associations between documents, and addresses polysemy (one word with several meanings) and synonymy (several words with one meaning).
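
The online part of the method (reading the search phrase, judging it, then recommending Chinese and English literature) might look roughly like the sketch below. It assumes that judging the phrase means detecting its language, and that the offline stage has already produced per-language topic-word probabilities, a mapping from each topic to its most relevant cross-language topic, and documents grouped by topic. The `INDEX` layout, the function names, the scoring rule, and the document identifiers are all hypothetical.

```python
def is_chinese(text):
    """Assumed language check: any CJK Unified Ideograph marks the query as Chinese."""
    return any("\u4e00" <= ch <= "\u9fff" for ch in text)


def best_topic(query_terms, topic_words):
    """Pick the topic whose word distribution gives the query terms the most mass."""
    scores = {
        topic: sum(probs.get(w, 0.0) for w in query_terms)
        for topic, probs in topic_words.items()
    }
    return max(scores, key=scores.get)


def recommend(query_terms, index, top_k=3):
    """Return same-language and cross-language recommendations for a query."""
    lang = "zh" if is_chinese("".join(query_terms)) else "en"
    other = "en" if lang == "zh" else "zh"
    topic = best_topic(query_terms, index[lang]["topic_words"])
    linked = index[lang]["cross_topic"][topic]  # most relevant topic in the other language
    return {
        lang: index[lang]["docs_by_topic"][topic][:top_k],
        other: index[other]["docs_by_topic"][linked][:top_k],
    }


# Hypothetical precomputed index; every value here is invented for illustration.
INDEX = {
    "en": {
        "topic_words": {0: {"gene": 0.5, "protein": 0.5}, 1: {"heart": 0.6, "surgery": 0.4}},
        "cross_topic": {0: 1, 1: 0},
        "docs_by_topic": {0: ["en-gene-1", "en-gene-2"], 1: ["en-heart-1"]},
    },
    "zh": {
        "topic_words": {0: {"心脏": 0.6, "手术": 0.4}, 1: {"基因": 0.5, "蛋白": 0.5}},
        "cross_topic": {0: 1, 1: 0},
        "docs_by_topic": {0: ["zh-heart-1"], 1: ["zh-gene-1"]},
    },
}

print(recommend(["gene", "protein"], INDEX))
```

Because the cross-language link is a lookup between topic numbers, no query translation happens at request time, which is consistent with the stated goal of reducing dependence on translation methods.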

Description

Technical field

[0001] The present invention relates to the technical fields of information retrieval and recommendation systems, and more specifically to a topic-based method for recommending information on cross-language biomedical academic papers.

Background technique

[0002] In text recommendation systems, the most commonly used method is Term Frequency-Inverse Document Frequency (TF-IDF), which converts each document into a vector whose dimensions are term frequencies. The similarity between documents is then calculated from the distance between these vectors, so as to make content-based recommendations. As a statistical method, TF-IDF only considers the frequency of the words appearing in documents; it cannot mine semantic information and statistical information within and between documents. This information is often the best feature that reflects the content of the document. For document processing in different languages, TF-IDF canno...
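
The TF-IDF baseline described in the background can be sketched as follows, to show why word frequency alone misses semantic relatedness. This is an illustration under assumptions: the whitespace tokenizer, the particular TF and IDF weighting, the function names, and the three toy documents are all hypothetical, and other TF-IDF variants exist.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Build sparse TF-IDF vectors (dicts) using tf = count/len and idf = log(N/df)."""
    tokenized = [d.lower().split() for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each word.
    df = Counter(w for toks in tokenized for w in set(toks))
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vectors.append(
            {w: (c / len(toks)) * math.log(n_docs / df[w]) for w, c in tf.items()}
        )
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(v * b.get(w, 0.0) for w, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Toy documents: the first two describe the same idea with different words.
docs = ["tumor therapy", "neoplasm treatment", "tumor therapy trial"]
vecs = tfidf_vectors(docs)
print(cosine(vecs[0], vecs[1]))  # no shared terms, so the similarity is 0.0
print(cosine(vecs[0], vecs[2]))  # shared terms give a positive similarity
```

The first two toy documents share no surface terms, so their similarity is zero even though they are near-synonymous, which is the kind of semantic gap the background attributes to TF-IDF.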


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/27, G06F16/35
CPC: G06F40/289
Inventors: 陆遥, 霍焯亮
Owner: SUN YAT SEN UNIV