39 results about "Web data mining" patented technology

Enterprise web mining system and method

An enterprise-wide web data mining system, computer program product, and method of operation thereof, that uses Internet based data sources, and which operates in an automated and cost effective manner. The enterprise web mining system comprises: a database coupled to a plurality of data sources, the database operable to store data collected from the data sources; a data mining engine coupled to the web server and the database, the data mining engine operable to generate a plurality of data mining models using the collected data; a server coupled to a network, the server operable to: receive a request for a prediction or recommendation over the network, generate a prediction or recommendation using the data mining models, and transmit the generated prediction or recommendation.
Owner:ORACLE INT CORP

Intelligent sampling for neural network data mining models

A method, system, and computer program product provide automated determination of the size of the sample to be used in training a neural network data mining model, so that the sample is large enough to properly train the model yet no larger than necessary. A method of performing training of a neural network data mining model comprises the steps of: a) providing a training dataset for training an untrained neural network data mining model, the training dataset comprising a plurality of rows of data, b) selecting a row of data from the training dataset for performing training processing on the neural network data mining model, c) computing an estimate of a gradient or cost function of the neural network data mining model, d) determining whether the gradient or cost function of the neural network data mining model has converged, based on the computed estimate of the gradient or cost function of the neural network data mining model, e) repeating steps b)-d), if the gradient or cost function of the neural network data mining model has not converged, and f) updating weights of the neural network data mining model, if the gradient or cost function of the neural network data mining model has converged.
Owner:ORACLE INT CORP
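
Steps a) to f) of the abstract above can be sketched roughly as follows. This is a minimal illustration, not the patented method: a one-layer linear model with squared error stands in for the neural network, "converged" is assumed to mean that the running mean of the per-row gradient stops changing, and every name (`row_gradient`, `sample_until_converged`, `train_step`, `tol`, `min_rows`) is hypothetical.

```python
import random

def row_gradient(weights, row):
    """Squared-error gradient for one row; the last element of the row is the target.
    A linear model is an assumed stand-in for the neural network."""
    *features, target = row
    error = sum(w * x for w, x in zip(weights, features)) - target
    return [2.0 * error * x for x in features]

def sample_until_converged(rows, weights, tol=1e-3, min_rows=10, seed=0):
    """Steps b)-e): draw rows one at a time until the gradient estimate settles."""
    order = list(range(len(rows)))
    random.Random(seed).shuffle(order)
    mean_grad = [0.0] * len(weights)
    used = 0
    for used, idx in enumerate(order, start=1):
        grad = row_gradient(weights, rows[idx])
        # Running mean of the gradient over the rows sampled so far.
        new_mean = [m + (g - m) / used for m, g in zip(mean_grad, grad)]
        change = max(abs(n - m) for n, m in zip(new_mean, mean_grad))
        mean_grad = new_mean
        # Assumed convergence test: the estimate barely moved on this row.
        if used >= min_rows and change < tol:
            break
    return mean_grad, used

def train_step(rows, weights, learning_rate=0.1, **kwargs):
    """Step f): update the weights once the gradient estimate has converged."""
    grad, used = sample_until_converged(rows, weights, **kwargs)
    new_weights = [w - learning_rate * g for w, g in zip(weights, grad)]
    return new_weights, used
```

Calling `train_step(rows, [0.0])` returns the updated weights together with the number of rows that were actually sampled, which is the quantity the abstract aims to keep small.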

Hadoop-based network data mining and analysis platform and method thereof

The invention discloses a Hadoop-based network data mining and analysis platform and a method thereof. The platform comprises a data collection layer, a data storage layer, a business application layer and a user layer, wherein the data collection layer adopts a distributed directional collection system architecture, collects original network data by taking terminal sites in different networks as basic task units of network data collection, and transmits the data to the data storage layer in a gathering manner, wherein each basic task unit adopts independent collection rule and policy; the data storage layer is used for finishing gathering, storage and original processing of the original network data and providing different types of function call services; the data storage layer is realized by adopting a Hadoop framework; and the business application layer is used for calling and analyzing the network data processed by the data storage layer to realize separation of a public component and a personalized business application component, and transmitting a result after network data analysis to the user layer for performing real-time display.
Owner:南方电网互联网服务有限公司

Network video classification method based on historical access records

The invention relates to a network video classification method based on historical access records and belongs to the technical field of computer network data mining. The method comprises: first, automatically analyzing historical access record datasets of videos, extracting meaningful characteristics, generating standby data files for the historical access record datasets of the videos, converting the historical access records into structured documents suitable for training through the data files, and then performing machine learning on the structured documents through logistic regression to obtain prediction models; and then using the prediction models to select, according to the integrity of the historical access record information of the videos to be predicted, corresponding methods for classifying those videos. Compared with the prior art, the method can reduce labor costs, simplify the parameters involved in computation, predict more accurately, and consume less time. Meanwhile, because the videos to be predicted can be either clustered or not depending on the integrity of their historical access record information, the models have a wide application range.
Owner:BEIJING INSTITUTE OF TECHNOLOGY

Method for identifying and positioning unknown mobile station in special position area

The invention discloses a method for identifying and positioning an unknown mobile station in a special position area. The method identifies the mobile station in the special position area through data mining and combines the identification result with a mobile positioning network. It overcomes the limitation that the conventional mobile network positioning method, as a one-dimensional active positioning method, must first determine the mobile station's IMSI number, and it solves the problem of positioning an idle mobile station in a network data mining mode.
Owner:刘涛

Vertical search based network data excavation method

The present invention discloses a network data mining method based on vertical search. First, vertical search is used to retrieve data from the network; the retrieved information is preprocessed, and the structured data obtained after data cleaning is saved in a database. The data in the database is then analyzed to find rules and construct a model, and the feature vector of a collection is matched against the feature vector of a target sample to obtain the degree of association of the relevant collected information. Unknown data is predicted, the prediction is evaluated by comparison with the actual result, the original model parameters are revised accordingly, and authoritative information is supplied to the user. Because the method uses vertical search to gather relevant information, relevant specialized information can be obtained effectively, with little repeated or spam information, and users' specialized queries can be met.
Owner:NANJING UNIV OF FINANCE & ECONOMICS

Community detection method based on dynamic synchronous model

The invention belongs to the field of network data mining, and specifically relates to a community detection method based on a dynamic synchronous model. The method comprises the steps of: first reading social network data and performing network vectorization according to the social network graph to obtain a vectorized one-dimensional coordinate sequence; setting synchronization parameters and calculating a synchronization range; performing synchronization clustering, wherein each node is synchronized within the synchronization range according to the extensional synchronous model until a local synchronization state is reached; dividing communities according to the coordinate position of each node; calculating the modularity of the division; continually increasing the synchronization parameters; and executing a new round of the synchronization clustering process until the synchronization range covers all the nodes. Nodes in the network are clustered through a Kuramoto model, so that link density can be accurately described, differences in network link density are effectively reflected, automatic detection of the social network community structure is realized, and the community detection results are selected and optimized.
Owner:XIDIAN UNIV
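
One round of the synchronization clustering described above might look like the sketch below. It assumes a plain Kuramoto-style update on the one-dimensional coordinates, in which each node moves by the mean of sin(x_j - x_i) over the nodes within the synchronization range `eps`; the patent's "extensional synchronous model" may well differ. The names `synchronize`, `communities`, `eps`, and `merge_tol` are hypothetical, and the outer loop that enlarges the range and compares modularity is left out.

```python
import math

def synchronize(coords, eps, max_iter=100, tol=1e-6):
    """Move every node toward the nodes within distance eps until nothing moves."""
    x = list(coords)
    if not x:
        return x
    for _ in range(max_iter):
        new_x = []
        for xi in x:
            # The range always contains the node itself, so it is never empty.
            in_range = [xj for xj in x if abs(xj - xi) <= eps]
            new_x.append(xi + sum(math.sin(xj - xi) for xj in in_range) / len(in_range))
        moved = max(abs(a - b) for a, b in zip(new_x, x))
        x = new_x
        if moved < tol:  # local synchronization reached
            break
    return x

def communities(x, merge_tol=1e-3):
    """Group node indices whose synchronized coordinates coincide."""
    groups = []
    for i in sorted(range(len(x)), key=lambda k: x[k]):
        if groups and abs(x[i] - x[groups[-1][-1]]) <= merge_tol:
            groups[-1].append(i)
        else:
            groups.append([i])
    return groups
```

For example, `communities(synchronize([0.0, 0.1, 0.2, 2.0, 2.1, 2.2], eps=0.5))` separates the two well-spaced groups of coordinates into two communities.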

Detection method for overlapping community based on multi-label propagation

The invention belongs to the technical field of network data mining and particularly relates to a detection method for an overlapping community based on multi-label propagation. The detection method comprises the following steps: A, constructing a social network diagram; B, analyzing a network rough core; C, initializing a label set; D, executing label propagation; E, decomposing a discontinuous community. According to the detection method disclosed by the invention, the link density between every two nodes is fully considered; the detection method has higher accuracy and effectiveness; in addition, manual input of data in the label propagation process is avoided.
Owner:XIDIAN UNIV

Post-loan risk early warning system based on semantic sentiment analysis

The invention discloses a post-loan risk early warning system based on semantic sentiment analysis. The post-loan risk early warning system is characterized by comprising a network data mining module, a semantic sentiment analysis module, a total analysis module and a user interaction module. The network data mining module is used for collecting relevant information of customer enterprises from the network, wherein the relevant information comprises one or more of news, reviews, Microblogs, reports and complaints relevant to the client enterprises. The semantic sentiment analysis module is used for receiving the relevant information, analyzing the sentiment components of the relevant information and generating sentiment polarity K and sentiment intensity M. The total analysis module is used for obtaining the sentiment polarity K and the sentiment intensity M, generating the value of the sentiment polarity K and the value of the sentiment intensity M according to the source of the relevant information, and then obtaining a reliable coefficient P and an overall reliable coefficient W through calculation in sequence according to a predetermined formula. The user interaction module is used for giving a warning when the overall reliable coefficient W is smaller than a warning value. The post-loan risk early warning system based on semantic sentiment analysis can give an early warning for great changes of the client enterprises in time, help a bank to manage the client enterprises better, and effectively reduce post-loan risks.
Owner:SUZHOU UNIV

Web data mining system on basis of Hadoop platform

The invention discloses a Web data mining system based on a Hadoop platform and relates to a data mining system. The system comprises a user interaction layer, a service application layer, a Web data mining platform layer and a distributed storage calculation layer; the user interaction layer is used for interaction between a user and the system and comprises a user management module, a service module and a display module; the service application layer comprises a service response module and a workflow module; the Web data mining platform layer comprises a data loading module, a result storage module, a mode evaluation module, a parallel ETL (Extract, Transform and Load) module and a parallel data mining algorithm module; and the distributed storage calculation layer uses Hadoop to implement distributed file storage and parallel calculation functions and comprises an HDFS (Hadoop Distributed File System) module, a MapReduce module and a distributed management module. According to the invention, the calculation and storage requirements of each module that needs huge calculation capacity are spread across the nodes of a Hadoop cluster, and the related data mining work is carried out by utilizing the parallel calculation and storage capacity of the cluster.
Owner:句容智恒安全设备有限公司

Electric power big data visual neural network data mining technology-based electric power failure prediction method

The invention discloses an electric power big data visual neural network data mining technology-based electric power failure prediction method. The method comprises an electric power big database, a data mining preprocessing and visual processing module, a visual BP neural network data mining module and a result output module. According to the method, failure prediction is realized via a graphical neural network data mining technology, so that the electric power big data using difficulty is reduced and the using efficiency is improved.
Owner:STATE GRID ZHEJIANG ELECTRIC POWER +2

Mass web data mining method based on Hadoop

The invention discloses a mass web data mining method based on Hadoop, and belongs to the field of computer data processing. A genetic algorithm is fused with the MapReduce of Hadoop, and mass Web data in a Hadoop-based distributed file storage system (HDFS) is mined to further verify the high efficiency of a platform, and a preferred access route of a user in a Web log is mined with a fused algorithm on the platform. As proved by an experiment result, the efficiency of Web data mining can be remarkably increased by processing of a large amount of Web data with a distributed algorithm in Hadoop.
Owner:INSPUR GROUP CO LTD

Microblog-user-quality-based information influence evaluation method

The invention relates to the field of social network data mining, and particularly to a microblog-user-quality-based information influence evaluation method. The method comprises the steps of data acquisition, data processing, user quality calculation, and dynamic message influence calculation. By considering the quality of the users involved in spreading microblog information, the method addresses the traditional influence maximization problem and achieves a good influence effect. With the method, microblog message influence can be evaluated effectively, and the fake microblog influence caused by robot fans can be screened out.
Owner:HARBIN ENG UNIV

Network data excavation method

The invention provides a network data excavation method, which is used for performing text classification and text clustering on acquired webpage information so as to extract topics. The network data excavation method specifically comprises the following steps of S1, catching the webpage information by a preset network probe according to an industrial body; S2, performing text extraction on the acquired network information; S3, performing text classification on extracted texts by a preset classifier to generate a plurality of text type systems; S4, clustering texts under each text type system to generate a plurality of text sub types, wherein each text sub type corresponds to each topic; S5, storing webpage links, and constructing an index according to the text type systems and the text sub types. The network data excavation method provided by the invention can combine repeated information.
Owner:BEIJING ZHITOUJIA INTPROP OPERATION CO LTD

Viterbi algorithm based web page sorting dynamic crawling method

The invention belongs to the technical field of network data mining and relates to a Viterbi algorithm based web page sorting dynamic crawling method. The method includes: providing a seed URL (uniform resource locator), taking the seed URL as a parent link to crawl downwards to acquire outbound sublinks; calculating inbound link quantity of the sublinks on the basis of a link structure; acquiring sublink web page content, and calculating similarity of the web page content to a theme; calculating web page comprehensive assessment values, eliminating web pages low in assessment value, and taking the rest of web pages as a parent link to crawl downwards to obtain new links; repeating the process until no new web page joins in during crawling, and stopping crawling. The method has the advantages that under the condition of a given theme, a user can efficiently and accurately acquire important websites under the specific theme through Viterbi algorithm based dynamic web crawling.
Owner:KUNMING UNIV OF SCI & TECH
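
The crawl loop in this abstract (score each sublink, drop low scorers, continue from the survivors until nothing new appears) can be illustrated on an in-memory link graph. The sketch makes several assumptions that are not in the abstract: pages are plain dictionaries rather than fetched documents, topic similarity is word-set Jaccard overlap, and the comprehensive assessment value is a linear mix controlled by a hypothetical `alpha`. It does not attempt to reproduce the Viterbi step, which the abstract does not detail.

```python
def topic_similarity(text, topic_words):
    """Jaccard overlap between a page's words and the topic words (assumed measure)."""
    words = set(text.lower().split())
    if not words or not topic_words:
        return 0.0
    return len(words & topic_words) / len(words | topic_words)

def focused_crawl(seed, links, pages, topic_words, threshold=0.4, alpha=0.5):
    """links: url -> list of outbound urls; pages: url -> page text (both hypothetical)."""
    # Inbound link counts derived from the link structure.
    inbound = {}
    for targets in links.values():
        for target in targets:
            inbound[target] = inbound.get(target, 0) + 1
    max_in = max(inbound.values(), default=1)

    kept, frontier, seen = [], [seed], {seed}
    while frontier:  # stops when a round adds no new pages
        next_frontier = []
        for parent in frontier:
            for child in links.get(parent, []):
                if child in seen:
                    continue
                seen.add(child)
                score = (alpha * inbound.get(child, 0) / max_in
                         + (1 - alpha) * topic_similarity(pages.get(child, ""), topic_words))
                if score >= threshold:  # low-scoring pages are eliminated
                    kept.append(child)
                    next_frontier.append(child)
        frontier = next_frontier
    return kept
```

Pages that fall below `threshold` are neither returned nor expanded, so their sublinks are only reached if some surviving page also links to them.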

Web community dividing method based on importance degrees and separation degrees of nodes

The invention discloses a Web community dividing method based on importance degrees and separation degrees of nodes, and belongs to the technical field of Web data mining. The method includes the following steps: first, representing a Web network in the form of a graph, representing Web pages through the nodes in the graph, and representing the links between the Web pages through edges between the nodes; second, calculating the degree of each node in the graph and the similarity among the nodes; third, calculating the separation degree of each node through the importance degree of the node and the similarity among the nodes; fourth, calculating the representation degree of each node through the importance degree and the separation degree of the node; fifth, sorting all the nodes in the network according to the importance degrees, and selecting a central node of the network community according to the representation degrees of the nodes; sixth, determining a community label of each network node based on the importance degree and the similarity of the nodes; seventh, putting the Web pages represented by the nodes with the same community labels in the same community to complete community division.
Owner:山西朔铭科技有限公司
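
The first two steps of this abstract (building the page graph, then computing node degree and pairwise similarity) are easy to show in code. The sketch below assumes an undirected graph and uses Jaccard overlap of closed neighbourhoods as the similarity; the abstract does not say which similarity measure the patent uses, and the separation and representation degrees are omitted because their formulas are not given. All function names are hypothetical.

```python
def build_graph(edges):
    """Undirected adjacency sets: nodes are Web pages, edges are links between them."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def degree(adj):
    """Number of neighbours of each node."""
    return {node: len(neighbours) for node, neighbours in adj.items()}

def jaccard_similarity(adj, u, v):
    """Overlap of the closed neighbourhoods of u and v (an assumed similarity measure)."""
    a = adj[u] | {u}
    b = adj[v] | {v}
    return len(a & b) / len(a | b)
```

With links `p1-p2`, `p1-p3`, `p2-p3`, and `p3-p4`, page `p3` has the highest degree, `p1` and `p2` share an identical closed neighbourhood, and `p1` and `p4` overlap only through `p3`.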

Name country identification method based on WEB and GBBoosting algorithms

The invention discloses a name country identification method based on WEB and GBBoosting algorithms, and belongs to the technical field of WEB data mining. The method comprises the steps of I. extracting names of scholars in universities through a WEB data extraction technology; II. constructing a GBBoosting algorithm: constructing weak classifiers, wherein each weak classifier outputs a weak classification hypothesis for an input sample, and a strong classifier is formed through weighted fusion of all weak classifiers; and III. identifying the countries of the names through the GBBoosting algorithm. The method effectively solves the problem of classifying names from two countries that are spelled similarly; meanwhile, compared with other existing classification methods, it is easier to implement and can be better applied to engineering practices such as name-country or city-country semantic annotation.
Owner:CHONGQING UNIV OF POSTS & TELECOMM

Network node importance evaluation method based on community influence

The invention belongs to the technical field of network data mining, and particularly relates to a network node importance evaluation method based on community influence, which comprises the following steps: performing community division on a social network to obtain a community structure in the network; calculating the information propagation influence of each community, and calculating the influence degree of each node in the network on each connected community; and integrating the influence degree of the nodes on the connected communities and the influence of the nodes on the corresponding communities to evaluate the capability of the nodes to indirectly propagate information through the communities in the network. According to the method, the influence degree of the nodes on the connected communities and the information propagation capacity of the communities are comprehensively considered; the ability of nodes in the network to indirectly propagate information through communities is measured; important nodes are evaluated, the important nodes indirectly influencing network information propagation through the community can be accurately found, the importance of social network nodes indirectly influencing network information propagation can be reasonably and effectively evaluated, effective guidance and control of the social network are achieved, and the method has important guiding significance for social network public opinion monitoring.
Owner:PLA STRATEGIC SUPPORT FORCE INFORMATION ENG UNIV PLA SSF IEU

Data mining system, data mining method and data retrieval system

Accurate concept information and relationships between concepts are extracted from a figure even in a case where sufficient character string recognition accuracy cannot be obtained by image processing, or a case where lexical ambiguity remains because there are a plurality of meanings with the same spelling. After concepts and relationships between concepts appearing in a document referring to a figure and a document related or similar to the document are prepared, candidates for concepts and relationships between concepts are limited to those likely to appear in the figure by checking against the prepared concepts and relationships between concepts. Thus, a false recognition rate is lowered.
Owner:HITACHI LTD

Community structure discovery method and system of e-mail network

The invention discloses a community structure discovery method and system for an e-mail network, belongs to the field of Web network data mining, and the method and the system are used for solving the problem of community discovery in the e-mail network. The method comprises the following steps: carrying out e-mail network topology modeling based on an e-mail data set; randomly initializing the community label of each user in the e-mail network for multiple times, and generating a plurality of independent community discovery results of the e-mail network by utilizing a label propagation method; calculating the modularity of each independent community discovery result; calculating an integration weight of each independent community discovery result; and performing weighted integration on the plurality of independent community discovery results of the e-mail network to obtain an integrated community discovery result of the e-mail network. The method and the system have the advantages of being simple in algorithm structure, easy to implement and high in execution efficiency, and an e-mail network community discovery result high in stability and reliability can be obtained by conducting consistency integration on the multiple independent community discovery results.
Owner:SHANXI UNIV
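
The modularity and integration-weight steps named in this abstract can be sketched as below. Modularity is computed with the standard Newman formula for an undirected graph without self-loops. The weighting rule (each result weighted in proportion to its non-negative modularity) is only an assumption, since the abstract does not state the patent's formula, and the function names are hypothetical.

```python
def modularity(edges, labels):
    """Newman modularity of a partition; labels maps each node to a community label."""
    m = len(edges)
    if m == 0:
        return 0.0
    internal = {}    # edges with both endpoints in the community
    degree_sum = {}  # total degree of the community's nodes
    for u, v in edges:
        degree_sum[labels[u]] = degree_sum.get(labels[u], 0) + 1
        degree_sum[labels[v]] = degree_sum.get(labels[v], 0) + 1
        if labels[u] == labels[v]:
            internal[labels[u]] = internal.get(labels[u], 0) + 1
    return sum(internal.get(c, 0) / m - (d / (2 * m)) ** 2
               for c, d in degree_sum.items())

def integration_weights(edges, partitions):
    """Assumed rule: weight each independent result by its share of total modularity."""
    scores = [max(modularity(edges, p), 0.0) for p in partitions]
    total = sum(scores)
    if total == 0:
        return [1.0 / len(partitions)] * len(partitions)
    return [s / total for s in scores]
```

On two triangles joined by a single edge, the partition that separates the triangles scores 6/7 - 1/2, while lumping all six nodes together scores 0, so the first result would receive all of the weight under this rule.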

Hadoop-based mass Web data mining genetic method

The invention provides a Hadoop-based mass Web data mining genetic method and belongs to the field of data mining and analysis. According to the Hadoop-based mass Web data mining genetic method, a genetic algorithm is fused with MapReduce and is used for performing Web data analysis in a Hadoop cluster environment. Experimental results show that the platform can obtain implicit information with a practical value and is high in execution efficiency, and can not only improve the mining efficiency but also overcome the disadvantage of a network environment.
Owner:山东爱城市网信息技术有限公司

Web data mining system based on XML

The invention discloses a Web data mining system based on XML. The system comprises a user interface module, a preprocessing module, a data mining module and a result accessing module. Through XML, structured data from different sources are effectively combined and searching across diverse, hard-to-reconcile databases becomes possible, so the technical problem of Web data mining is solved effectively. In addition, owing to the powerful extensibility and flexibility of XML, XML can reasonably describe the data of various application software, and the gathered Web data records are convenient to describe; favorable conditions are therefore provided for software developers and for Web terminal and site users.
Owner:合肥红珊瑚软件服务有限公司

A Dynamic Crawling Method Based on Viterbi Algorithm for Web Page Classification and Sorting

The invention relates to a dynamic crawling method for classifying and sorting web pages based on the Viterbi algorithm, and belongs to the technical field of network data mining. The method first takes a given seed URL as the parent link and crawls downward to obtain its sub-links; calculates the number of inbound links of each sub-link based on the link structure; then obtains the content of each sub-link web page and calculates the similarity between the page content and the theme; calculates a comprehensive evaluation value for each web page, eliminates web pages with low evaluation values, and uses the remaining web pages as parent links to crawl downward for new links. The process is repeated until no new web pages are added during crawling, at which point crawling stops. Given a theme, the method enables the user to efficiently and accurately obtain important websites under that theme through a dynamic web crawler based on the Viterbi algorithm.
Owner:KUNMING UNIV OF SCI & TECH

A data mining method and data mining system

The invention discloses a data mining method and a data mining system. The method includes the following steps of (A) data separation, (B) data sieving, (C) data iterative processing, (D) data normalization and (E) result judgment. The method and the system can overcome defects in the prior art, and processing speed of data mining with a large data quantity is remarkably increased by optimizing the data processing procedure.
Owner:CHANGCHUN UNIV OF TECH
