575 results about "Data dredging" patented technology

Data dredging (also data fishing, data snooping, data butchery, and p-hacking) is the misuse of data analysis to find patterns in data that can be presented as statistically significant when in fact there is no real underlying effect. This is done by performing many statistical tests on the data and only paying attention to those that come back with significant results, instead of stating a single hypothesis about an underlying effect before the analysis and then conducting a single test for it.
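The effect described above can be illustrated with a small simulation. This is a minimal sketch, not a reference procedure: it runs many two-sample tests on pure noise, where no real effect exists, and counts how many look "significant". The helper names, the z-approximation to the two-sample test, the sample sizes and the seed are all assumptions chosen for illustration.

```python
import random
from statistics import NormalDist, mean, stdev


def two_sample_p_value(a, b):
    """Two-sided p-value from a z-approximation to the two-sample test."""
    se = (stdev(a) ** 2 / len(a) + stdev(b) ** 2 / len(b)) ** 0.5
    z = (mean(a) - mean(b)) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def count_false_positives(n_tests=200, n_samples=50, alpha=0.05, seed=0):
    """Run n_tests comparisons of two noise samples drawn from the same distribution."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_tests):
        a = [rng.gauss(0, 1) for _ in range(n_samples)]
        b = [rng.gauss(0, 1) for _ in range(n_samples)]
        if two_sample_p_value(a, b) < alpha:
            hits += 1  # "significant" although both samples are pure noise
    return hits


hits = count_false_positives()
print(f"{hits} of 200 tests on pure noise fall below alpha=0.05")
```

Because every comparison draws both samples from the same distribution, any test that falls below the threshold is a false positive. Reporting only those tests is the selective attention the definition describes.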

Method for automatic evaluation based on generalized fluent spoken language fluency

Active | CN101740024A | Troubleshoot automated assessment issues | Fast scoring | Speech recognition | Data dredging | Spoken language
The invention relates to a method for automatically evaluating spoken-language fluency based on a generalized notion of fluency. The method comprises the following steps: acquiring speech data from speakers of different ages and spoken-language levels using a speech input device; adopting an evaluation model based on generalized fluency features and trained by machine learning; configuring a speech recognition system with parameters matched to the scripts of the different subjects and the genders of the speakers in the speech data; quantifying speech speed coherence, content understanding, advanced skills and reconstruction standard characteristics in the speech data, so that fluency features are extracted comprehensively from the perspective of expert assessment; and applying regression fitting analysis and a data-mining decision tree method to detect abnormal fluency and to grade and diagnose fluency. The machine fluency score approaches the level of expert graders, with a correlation index exceeding that of 2 to 3 out of a typical panel of 5 experts. The method is also fast and can be embedded as a module in an automatic spoken-language evaluation system to evaluate the fluency component of pronunciation quality.
Owner:IFLYTEK CO LTD

Intelligent ammeter fault real time prediction method based on decision-making tree

Active | CN106054104A | Reflect real-time fault conditions | Electrical measurements | Data dredging | Smart meter
Provided is a decision-tree-based method for real-time prediction of intelligent ammeter faults, comprising the steps of: 1, pre-processing the intelligent ammeter data of an electricity information acquisition system; 2, screening the fault data of intelligent ammeters in the electricity information acquisition system according to an intelligent ammeter fault determination model, and sending the fault data to an intelligent ammeter fault database; 3, dividing the historical data in the intelligent ammeter fault database into a training set and a test set, applying a decision tree algorithm to mine the training set, and forming an intelligent ammeter fault decision tree and a preliminary classification rule; 4, assessing the accuracy of the preliminary classification rule on the test set, confirming the rule if the accuracy meets requirements, or otherwise returning to the training set and training again; 5, generating an intelligent ammeter real-time fault prediction model from the finally confirmed classification rule; and 6, linking an intelligent ammeter real-time fault database to the prediction model to obtain real-time fault prediction results.
Owner:国网新疆电力有限公司营销服务中心 +1
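Steps 3 and 4 of the abstract above (split, train, assess, accept or retrain) can be sketched as follows. This is a simplified illustration, not the patented method: a one-split decision stump stands in for a full decision tree, and the meter readings, the single "voltage deviation" feature, the 70/30 split and the REQUIRED_ACCURACY threshold are all hypothetical.

```python
def fit_stump(rows):
    """Fit a one-split decision tree (stump) on (feature, label) pairs."""
    best_threshold, best_correct = None, -1
    for threshold in sorted({x for x, _ in rows}):
        # Rule under test: predict a fault (1) when the feature exceeds the threshold
        correct = sum((x > threshold) == bool(y) for x, y in rows)
        if correct > best_correct:
            best_threshold, best_correct = threshold, correct
    return best_threshold


def accuracy(threshold, rows):
    """Share of rows that the rule 'feature > threshold means fault' classifies correctly."""
    return sum((x > threshold) == bool(y) for x, y in rows) / len(rows)


# Hypothetical fault database: (voltage deviation, fault label)
data = []
for i in range(100):
    data.append((i * 0.02, 0))        # normal meters
    data.append((3.0 + i * 0.02, 1))  # faulty meters

# Step 3: deterministic 70/30 split into training and test sets
train = [row for k, row in enumerate(data) if k % 10 < 7]
test = [row for k, row in enumerate(data) if k % 10 >= 7]
threshold = fit_stump(train)

# Step 4: accept the rule only if test accuracy meets the requirement
REQUIRED_ACCURACY = 0.9  # assumed value; the abstract does not state one
test_accuracy = accuracy(threshold, test)
accepted = test_accuracy >= REQUIRED_ACCURACY
print(f"threshold={threshold:.2f} accuracy={test_accuracy:.3f} accepted={accepted}")
```

If `accepted` were false, the workflow in the abstract would return to the training set and train again, for example with different features or settings, before generating the prediction model.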

Distributed knowledge data mining device and mining method used for complex network

The invention discloses a distributed knowledge data mining device and method for complex networks. The device adopts a distributed computing platform composed of a control unit, a computing unit and a man-machine interaction unit. The key innovation is to distribute the computation required by the various clustering algorithms used in data mining across different servers, which improves mining efficiency. For different kinds of knowledge data, the degrees of relation and the weights can also be computed under different standards, so that a more credible result is obtained. The mining process uses two-level clustering: the first-level result is relatively rough but has very low computational complexity, while the second level is more expensive but more precise. By combining the two levels efficiently, the device greatly improves time complexity and clustering precision compared with traditional single-level clustering. By presenting the network structure and its dynamic evolution visually and directly, the invention also provides a reference for predicting disciplinary development and research hotspots.
Owner:BEIJING UNIV OF POSTS & TELECOMM
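The two-level idea in the abstract above, a cheap rough pass followed by a more precise pass inside each rough group, can be sketched as follows. The abstract does not name the clustering algorithms, so this sketch assumes fixed-width bucketing for the first level and a plain 1-D k-means for the second; all function names and data are hypothetical, and the distribution of work across servers is not modelled.

```python
from collections import defaultdict
from statistics import mean


def coarse_cluster(points, cell=10.0):
    """Level 1: cheap single-pass bucketing of 1-D points into fixed-width cells."""
    buckets = defaultdict(list)
    for p in points:
        buckets[int(p // cell)].append(p)
    return list(buckets.values())


def kmeans_1d(points, k=2, iters=20):
    """Level 2: plain k-means on 1-D points, seeded with evenly spaced centers."""
    lo, hi = min(points), max(points)
    centers = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    groups = [points]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: abs(p - centers[j]))
            groups[nearest].append(p)
        # Keep the old center when a group ends up empty
        centers = [mean(g) if g else centers[j] for j, g in enumerate(groups)]
    return [g for g in groups if g]


def two_level_cluster(points):
    """Refine each coarse bucket independently; buckets could run on separate workers."""
    result = []
    for bucket in coarse_cluster(points):
        result.extend(kmeans_1d(bucket) if len(bucket) > 2 else [bucket])
    return result


points = [1.0, 1.2, 1.1, 8.0, 8.3, 8.1, 21.0, 21.4, 27.5, 27.9]
clusters = two_level_cluster(points)
print(clusters)
```

Because each coarse bucket is refined independently, the expensive second level only ever compares points that the first level already placed near each other.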

Mobile data traffic package recommendation algorithm based on user historical data

The invention provides a mobile data traffic package recommendation algorithm based on user historical data, using data mining analysis technology. The algorithm comprises the following steps: 1) a target user discovery phase, comprising a, acquiring a processed data set that includes a training set and a prediction set, b, executing a random forest classification algorithm to find potential data traffic package upgrade users as target users, and c, ending; 2) a data traffic package recommendation phase, comprising a, acquiring the processed prediction set, b, executing a K-means clustering algorithm to obtain clusters of similar users, c, obtaining the target users found in step 1)-b, d, executing a TopN recommendation algorithm for each target user within the same cluster according to a user similarity function, and e, ending. The algorithm uses data mining to find users with a latent need for more data traffic and recommends a plan to them. Compared with traditional methods, it offers higher accuracy, higher efficiency, simple implementation and low cost.
Owner:NANJING UNIV
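Step 2)-d of the abstract above, TopN recommendation within a cluster by user similarity, can be sketched as follows. The abstract does not define the similarity function or the ranking rule, so this sketch assumes cosine similarity over usage vectors and a vote among the most similar peers. The target user and the cluster are taken as given (in the abstract they come from the random forest and K-means steps), and all user IDs, usage values and package names are hypothetical.

```python
from collections import Counter
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two usage vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def top_n_packages(target, cluster, usage, packages, n=2, neighbours=3):
    """Recommend the packages most common among the target's most similar cluster peers."""
    peers = sorted(
        (u for u in cluster if u != target),
        key=lambda u: cosine(usage[target], usage[u]),
        reverse=True,
    )[:neighbours]
    votes = Counter(packages[u] for u in peers)
    return [pkg for pkg, _ in votes.most_common(n)]


# Hypothetical usage vectors and current packages for one cluster
usage = {
    "u1": [5.0, 1.0],
    "u2": [5.2, 1.1],
    "u3": [4.8, 0.9],
    "u4": [1.0, 5.0],
    "u5": [5.1, 1.0],
}
packages = {"u2": "10GB", "u3": "10GB", "u4": "2GB", "u5": "20GB"}
cluster = ["u1", "u2", "u3", "u4", "u5"]

recommended = top_n_packages("u1", cluster, usage, packages)
print(recommended)  # ['10GB', '20GB']
```

Restricting the peer search to one cluster keeps the similarity computation small, which is one plausible reason for clustering before recommending.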

Non-invasive load identification algorithm based on hybrid neural network and ensemble learning

The invention belongs to the field of data mining and machine learning and relates to a non-invasive load identification algorithm based on a hybrid neural network and ensemble learning. The experimental data are first processed so that their format conforms to the input formats of the models. A hybrid neural network model is then established, the data are input into it, and the model is trained and tested to obtain identification results. Finally, the results of three different models are combined by voting, following the idea of ensemble learning, to obtain the final identification result. The feature extraction and load identification performance of the hybrid neural network is better than that of a traditional neural network. In the ensemble learning method, several feature subsets are selected from the total feature set to train several base classifiers, and the base classifiers are then combined. This reduces variance, improves the final identification result, and addresses the adverse influence that introducing harmonic features has on identification.
Owner:NORTH CHINA ELECTRIC POWER UNIV (BAODING)
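The voting step in the abstract above can be sketched as a simple majority vote over the labels predicted by three models. This is an illustration only: the abstract does not say whether the vote is weighted or how ties are broken, and the appliance labels and model outputs below are hypothetical.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-model label lists into one label per sample by majority vote."""
    combined = []
    for labels in zip(*predictions):
        # On a tie, Counter keeps the label of the earliest model in the list
        combined.append(Counter(labels).most_common(1)[0][0])
    return combined


# Hypothetical outputs of three base models for four load samples
model_a = ["fridge", "kettle", "fridge", "tv"]
model_b = ["fridge", "kettle", "kettle", "tv"]
model_c = ["kettle", "kettle", "fridge", "fridge"]

final = majority_vote([model_a, model_b, model_c])
print(final)  # ['fridge', 'kettle', 'fridge', 'tv']
```

Each model errs on a different sample here, so the vote recovers a label that no single model's mistake can overturn, which is the variance reduction the abstract refers to.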