Scientific and technical information acquisition and pushing method based on text classification and image deep mining

A technique combining text classification and deep image mining, applied to acquiring and pushing scientific and technical information such as patents, papers, and news. It addresses shortcomings of existing methods such as inflexible keyword selection.

Active Publication Date: 2014-09-10
苏州弘图智能科技有限公司

AI Technical Summary

Problems solved by technology

However, existing methods often address only one aspect of the information push problem. Many determine a user's keywords of interest by analyzing user behavior attributes, which makes keyword selection inflexible and cannot satisfy users' need to customize the information they follow. Some methods capture the required information from the Internet but do not further classify and organize it in a structured way, which limits how quickly users can query it and prevents them from obtaining the information they need efficiently. Most methods capture and push only text, ignoring information conveyed in intuitive, visual form by images, and so cannot meet users' need for fast and efficient access to useful information.

Embodiment Construction

[0060] The technical scheme of the present invention is described in detail below with reference to the accompanying drawing:

[0061] As shown in Figure 1, the embodiment of the present invention proceeds according to the following steps:

[0062] Step 1: The enterprise customizes its research direction information;

[0063] Step 2: The web crawler reads the research direction information customized by the enterprise in step 1;

[0064] Step 3: Based on the information read in step 2, the web crawler accesses the Internet over HTTP and obtains web page information using a breadth-first search strategy;

[0065] Step 4: Read the text of the web page information obtained in step 3 and convert it into an ARFF-format file, text.arff;

[0066] Step 5: Determine whether the trained support vector machine classifier model SMO.model exists; if it exists, execute step 13; otherwise, execute step 6;

[0067] Step 6: Read the training set and convert it into the f...
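Steps 4 and 5 above can be illustrated with a minimal Python sketch. It is not the patent's implementation: the function names (`escape_arff`, `pages_to_arff`, `next_step`) are hypothetical, and the ARFF layout (a single string attribute named `text`) is an assumption, since the source does not state which attributes text.arff contains. Only the file names text.arff and SMO.model and the branch to step 13 or step 6 come from the text.

```python
import os


def escape_arff(text):
    """Quote one page's text as a single ARFF string value."""
    cleaned = " ".join(text.split())  # collapse newlines and repeated whitespace
    return "'" + cleaned.replace("\\", "\\\\").replace("'", "\\'") + "'"


def pages_to_arff(pages, relation="text"):
    """Render page texts as ARFF content with one string attribute (assumed layout)."""
    lines = [
        f"@relation {relation}",
        "",
        "@attribute text string",
        "",
        "@data",
    ]
    lines.extend(escape_arff(page) for page in pages)
    return "\n".join(lines) + "\n"


def next_step(model_path="SMO.model"):
    """Step 5: continue at step 13 if a trained model file exists, else at step 6."""
    return 13 if os.path.exists(model_path) else 6
```

Writing the result to disk is then a matter of `open("text.arff", "w", encoding="utf-8").write(pages_to_arff(pages))`; the training path taken when `next_step()` returns 6 is not sketched here because the source text is truncated at that point.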

Abstract

The invention discloses a method for acquiring and pushing scientific and technical information based on text classification and deep image mining. Keywords in the scientific and technical fields that a user follows are obtained from the user's customization. Web crawlers written in Python acquire papers, news, and patents related to those keywords from web pages over the HTTP protocol. The acquired content is classified on the Weka platform with a support vector machine classification algorithm, image information in the content documents is extracted and stored by a dividing-line algorithm, and finally the acquired data is pushed through a WeChat public account.
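The crawling stage described in the abstract, combined with the breadth-first strategy named in the embodiment, might look roughly like the following Python sketch. It uses only the standard library; the names `LinkParser`, `http_fetch`, and `crawl_bfs`, the `max_pages` limit, and the simple substring test for keywords are all assumptions made for illustration, not details taken from the patent.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collect the href targets of <a> tags in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def http_fetch(url):
    """Fetch a page over HTTP and return its body as text."""
    with urlopen(url, timeout=10) as resp:
        return resp.read().decode("utf-8", errors="replace")


def crawl_bfs(seed_urls, keywords, fetch=http_fetch, max_pages=50):
    """Visit pages breadth-first; return {url: html} for pages mentioning a keyword."""
    queue = deque(seed_urls)
    seen = set(seed_urls)
    matches = {}
    visited = 0
    while queue and visited < max_pages:
        url = queue.popleft()  # FIFO order gives breadth-first traversal
        visited += 1
        try:
            html = fetch(url)
        except Exception:
            continue  # skip pages that fail to load
        if any(k.lower() in html.lower() for k in keywords):
            matches[url] = html
        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:
            link = urljoin(url, href)  # resolve relative links against the page URL
            if link.startswith("http") and link not in seen:
                seen.add(link)
                queue.append(link)
    return matches
```

Passing `fetch` as a parameter keeps the traversal logic separate from network access, so the function can be exercised against an in-memory set of pages before being pointed at real sites.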

Description

Technical field

[0001] The invention belongs to the field of scientific and technological information acquisition and processing, and in particular relates to a method for acquiring and pushing scientific and technological information based on text classification and deep image mining, which can be applied to the acquisition and pushing of news, papers, and patent information.

Background

[0002] With the rapid development of the Internet, massive amounts of data are published and shared online every day. This mass of information gives Internet users more to draw on, but it also makes it harder for them to find information of real value. The value of a given piece of information varies widely between users with different needs: much of it is of no value to a particular user, and often only a small portion is what that user actually cares about. ...

Application Information

IPC(8): G06F17/30
CPC: G06F16/313, G06F16/353, G06F16/951, G06F16/958, G06V30/40
Inventor: 朱全银, 严云洋, 李翔, 张永军, 陈孚, 尹永华, 孙佩佩, 黄丽民, 费飞, 周泓
Owner 苏州弘图智能科技有限公司