Method for automatically obtaining short text of knowledge domain from community question-and-answer website

A technique for automatically obtaining knowledge-domain short texts from community question-and-answer websites. It addresses the problem that crawled resources are incomplete and do not fully cover a domain, which hinders learners, and so makes the resources easier to learn from and use.

Active Publication Date: 2016-07-13

XI AN JIAOTONG UNIV

6 Cites, 12 Cited by


Problems solved by technology

Therefore, the resources that the above patents crawl by URL may be incomplete and may not cover all resources in a given field, which hinders learners' use and study.


Embodiment Construction

[0056] The present invention is described in further detail below with reference to specific embodiments, which explain rather than limit the present invention.

[0057] The present invention is a method for automatically acquiring knowledge-domain short texts from a community question-and-answer website. It automatically collects and organizes the knowledge-domain short texts of the website and includes the following steps.

[0058] (1) Crawling the knowledge-domain web pages of the community question-and-answer website: crawl the site's dynamic web pages while ensuring that the data is complete. Taking the Quora website as an example, the web pages containing domain knowledge include topic pages, question pages, and author pages, which are crawled with a depth-first traversal algorithm. First, crawl the topic page according to the Quora topic page address, obtain the hyperlinks t...
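The depth-first crawl order described in step (1) can be illustrated with a minimal sketch. This is a generic depth-first traversal, not the patent's actual crawler: the function names `crawl_depth_first` and `toy_get_links`, the `max_pages` limit, and the toy link graph with its page identifiers are all hypothetical, and link extraction is passed in as a function so that no real website is contacted.

```python
from typing import Callable, Dict, Iterable, List, Set


def crawl_depth_first(start_url: str,
                      get_links: Callable[[str], Iterable[str]],
                      max_pages: int = 1000) -> List[str]:
    """Visit pages depth-first from start_url and return them in visit order."""
    visited: Set[str] = set()
    order: List[str] = []
    stack: List[str] = [start_url]
    while stack and len(order) < max_pages:
        url = stack.pop()
        if url in visited:
            continue
        visited.add(url)
        order.append(url)
        # Push links in reverse so the first link on a page is explored first.
        for link in reversed(list(get_links(url))):
            if link not in visited:
                stack.append(link)
    return order


# Toy in-memory link graph standing in for real pages (hypothetical identifiers).
TOY_SITE: Dict[str, List[str]] = {
    "topic/Data-Structures": ["question/q1", "question/q2"],
    "question/q1": ["author/alice"],
    "question/q2": ["author/alice", "author/bob"],
    "author/alice": [],
    "author/bob": [],
}


def toy_get_links(url: str) -> List[str]:
    """Return the outgoing links of a page in the toy graph."""
    return TOY_SITE.get(url, [])


visit_order = crawl_depth_first("topic/Data-Structures", toy_get_links)
print(visit_order)
```

In this sketch the `visited` set keeps a page that is reachable from several questions (here one author page) from being fetched twice, and a real crawler would replace `toy_get_links` with a function that downloads a page and extracts its hyperlinks.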


Abstract

The invention provides a method for automatically obtaining knowledge-domain short texts from a community question-and-answer website. The question-and-answer web pages and author web pages of each topic in the domain corresponding to the knowledge domain can be crawled from the website, yielding a system with comprehensive data that is convenient for users to learn from and use. The method comprises the following steps: 1, crawling the knowledge-domain web pages of the community question-and-answer website; 2, extracting the knowledge-domain short texts from the web page data set; 3, constructing a domain subject tree; 4, storing the domain subject tree. With this method, knowledge-domain short texts can be extracted automatically from the semi-structured data of the community question-and-answer website: the question-and-answer web pages and author web pages of each topic in the domain are crawled, a web page data set of the knowledge domain is constructed, the short texts are extracted automatically from that data set, and parent-child relationships are found, so that the domain subject tree is constructed and stored.
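Steps 3 and 4 of the abstract (constructing the domain subject tree from parent-child relationships, then storing it) can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not specify the tree representation or the storage format, so the nested-dictionary layout, the use of JSON, the function name `build_topic_tree`, and the sample topics are all hypothetical.

```python
import json
from collections import defaultdict
from typing import Dict, FrozenSet, Iterable, List, Tuple


def build_topic_tree(root: str, edges: Iterable[Tuple[str, str]]) -> Dict:
    """Build a nested tree from (parent, child) topic pairs, starting at root."""
    children: Dict[str, List[str]] = defaultdict(list)
    for parent, child in edges:
        children[parent].append(child)

    def subtree(topic: str, ancestors: FrozenSet[str]) -> Dict:
        path = ancestors | {topic}
        # Skip any child already on the current path so that a cyclic
        # relationship in the input cannot cause infinite recursion.
        kids = [c for c in children.get(topic, []) if c not in path]
        return {"topic": topic, "children": [subtree(c, path) for c in kids]}

    return subtree(root, frozenset())


# Hypothetical parent-child relationships for one domain.
sample_edges = [
    ("Data Structures", "Tree"),
    ("Data Structures", "Graph"),
    ("Tree", "Binary Tree"),
]

tree = build_topic_tree("Data Structures", sample_edges)

# Storage is assumed here to be JSON text; the source does not name a format.
serialized = json.dumps(tree, indent=2)
restored = json.loads(serialized)
print(serialized)
```

A nested dictionary is used because it serializes to JSON without extra work; a database table of (parent, child) rows would serve equally well as a storage layer.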

Description

Technical field

[0001] The invention relates to a method for acquiring website information, in particular to a method for automatically acquiring knowledge-domain short texts from a community question-and-answer website.

Background technique

[0002] Open knowledge sources, represented by community question-and-answer websites, have become an important source of knowledge. These sources have an open, collaborative knowledge-sharing mechanism that effectively promotes the dissemination and application of knowledge, but that also worsens the fragmentation of knowledge. The accumulated fragmented knowledge is scattered across different places and exists in the form of short texts, with repetition. Take the community question-and-answer website Quora as an example. Quora is an English-language community question-and-answer website whose body of knowledge-domain short texts is growing rapidly. The questions on the Quora website are mainly organized in the form of top...


Application Information


IPC(8): G06F17/30

CPC: G06F16/951, G06F16/955

Inventor: 魏笔凡, 郑元浩, 刘均, 郑庆华, 吴蓓, 闫彩霞, 郭朝彤, 张玲玲

Owner XI AN JIAOTONG UNIV
