34 results about "Boost regression tree" patented technology

Short-term and medium- and long-term electric power load prediction method based on machine learning model

The invention discloses a short-term and medium- and long-term electric power load prediction method based on machine learning models. First, the data are preprocessed, including smoothing abnormal data and filling in missing data. The factors affecting load changes are then analyzed, including historical data, time periodicity, and weather variables. All input variables are normalized to accelerate learning and raise prediction precision. The invention compares the performance of linear regression, support vector regression, and gradient boosting regression in short-term and medium- and long-term electric power load prediction; as the prediction horizon lengthens, the gradient boosting regression model performs better than the other two models. An AdaBoost algorithm that uses the gradient boosting tree as its base learner is proposed for load prediction, which can effectively raise the precision of electric power load prediction.
Owner:FOSHAN SHUNDE SUN YAT SEN UNIV RES INST +2

Subway short-time passenger flow prediction method based on machine learning

Active | CN107291668A | Realize short-term passenger flow forecasting | Improve forecast accuracy | Forecasting | Machine learning | Real-time data | Data source
The invention discloses a subway short-time passenger flow prediction method based on machine learning. On the basis of subway card-swipe data, all passengers are assumed to travel by the shortest route, and the flow in every interval and every station is counted per unit time window; station passenger flow in the unit time window serves as the nodes and interval passenger flow in the unit time window serves as the edge weights of a subway passenger flow network; the features with the greatest influence on a single target interval are selected for the subsequent regression prediction model. A recursive feature elimination algorithm performs the feature selection, picking out the important features of the target interval in a target time window. The regression prediction model is built with the gradient boosted regression tree method to achieve subway short-time passenger flow prediction. The method can reach high prediction precision even when only a single data source is available. The regression prediction model is built from historical data and combined with real-time data to predict short-time subway passenger flow, which supports the design and optimization of urban rail transit operation and train marshalling.
Owner:CENT SOUTH UNIV

Method and system for forecasting spatial distribution of heavy metals in soil

The embodiment of the invention provides a method and a system for forecasting the spatial distribution of heavy metals in soil, and relates to the technical field of environmental monitoring. The method comprises the following steps: measuring the heavy metal content of a soil sample at each sampling point; constructing a training data set; constructing a GBRT (Gradient Boosting Regression Tree) model with auxiliary features and the soil heavy metal content as variables, and training the GBRT model on the training data set to obtain a trained GBRT model; constructing a data set to be measured; inputting the data set to be measured into the trained GBRT model and outputting the soil heavy metal content at each point to be measured. Because the GBRT algorithm iteratively combines a number of weak regressors, and the auxiliary features are applied rationally, the method generalizes well and is suitable for wide application.
Owner:BEIJING RES CENT FOR AGRI STANDARDS & TESTING
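
The idea of iteratively combining weak regressors, which the abstract above attributes to GBRT, can be sketched in a few lines. This is a minimal illustration and not the patented method: it assumes squared-error loss, a single input feature with at least two distinct values, and depth-1 trees (stumps); the names fit_stump, fit_gbrt and predict are hypothetical.

```python
# Minimal gradient boosting for regression with depth-1 trees (stumps).
# Illustrative sketch: squared-error loss, one feature, hypothetical names.

def fit_stump(xs, residuals):
    """Return (threshold, left_value, right_value) with the lowest squared error."""
    best = None
    for t in sorted(set(xs))[:-1]:  # every split that leaves both sides non-empty
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lv, rv = sum(left) / len(left), sum(right) / len(right)
        sse = sum((r - lv) ** 2 for r in left) + sum((r - rv) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, t, lv, rv)
    return best[1:]

def fit_gbrt(xs, ys, n_trees=50, learning_rate=0.1):
    """Fit an additive model: mean of ys plus a shrunken sum of stumps."""
    base = sum(ys) / len(ys)
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_trees):
        # For squared error, the negative gradient is the plain residual.
        residuals = [y - p for y, p in zip(ys, preds)]
        t, lv, rv = fit_stump(xs, residuals)
        stumps.append((t, lv, rv))
        preds = [p + learning_rate * (lv if x <= t else rv)
                 for x, p in zip(xs, preds)]
    return base, learning_rate, stumps

def predict(model, x):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(lv if x <= t else rv for t, lv, rv in stumps)
```

Each round fits one weak regressor to what the current ensemble still gets wrong, and the learning rate shrinks its contribution; a production model would more likely use an established library implementation with deeper trees and several features.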

Methods for building regression trees in a distributed computing environment

Systems and methods are disclosed for building and using decision trees, preferably in a scalable and distributed manner. Our system can be used to create and use classification trees, regression trees, or a combination of regression trees called a gradient boosted regression tree (GBRT). Our system leverages approximate histograms in new ways to process large datasets, or data streams, while limiting inter-process communication bandwidth requirements. Further, in some embodiments, a scalable network of computers or processors is utilized for fast computation of decision trees. Preferably, the network comprises a tree structure of processors, comprising a master node and a plurality of worker nodes or “workers,” again arranged to limit necessary communications.
Owner:BIGML

Ocean buoy life prediction method based on multi-class machine learning method

Active | CN112288191A | Improve accuracy | Solve the problem of unknowable real life span | Forecasting | Artificial life | Optimal decision | Data set
The invention discloses an ocean buoy life prediction method based on a multi-class machine learning method, comprising the following steps: S1, building different buoy life prediction models based on machine learning methods: performing feature selection on the hardware features of a buoy to obtain its static attributes, taking the survival time of the buoy as a dynamic attribute, combining both into a data set for training the buoy life prediction models, and then evaluating the prediction accuracy of each model; the buoy life prediction models comprise a regression decision tree, a gradient boosting regression tree, a random forest and a support vector regression machine; and S2, inputting the data set to be predicted into each trained buoy life prediction model to obtain four prediction results, and deriving a final prediction result from the four. By weighing the prediction results of several models to make an optimal decision, the method effectively improves prediction accuracy.
Owner:NAT MARINE DATA & INFORMATION SERVICE

Satellite network flow prediction method based on space-time correlation

The invention discloses a satellite network flow prediction method based on space-time correlation. The method comprises the following steps: extracting satellite space-time correlated flow; reducing the dimensions of the correlated flow by singular matrix decomposition and extracting features; and establishing a satellite network traffic prediction model based on the gradient boosting regression tree. According to the method, singular matrix decomposition is carried out on the collected space-time flow to obtain the dimension-reduced space-time correlated flow, which serves as the prediction input of a gradient boosting regression tree; training and testing are then carried out, and finally an accurate prediction value is output. According to the method, the gradient boosting regression tree constructs each new model in the gradient descent direction, the convergence of the algorithm is optimized by improving the learning rate, and the model is continuously updated by minimizing the expected value of the loss function so that it tends to be stable; finally, a future value is predicted and verified using test data. Decision support is provided for planning of satellite network flow, and the method has a good application prospect.
Owner:DALIAN UNIV

Cigarette ventilation rate prediction method based on gradient boosting regression tree

The invention provides a cigarette ventilation rate prediction method based on a gradient boosting regression tree, comprising the steps of: carrying out data preprocessing to form an original data set Dataset, wherein the data comprise the feature data: cigarette paper air permeability, tipping paper air permeability, filter stick suction resistance, cigarette length, cigarette circumference, cigarette hardness, cigarette quality and cigarette suction resistance; dividing the original data set Dataset into a training set Transfer and a test set Test set; performing feature selection using the maximum information coefficient; optimizing the parameters of the gradient boosting regression tree based cigarette ventilation rate prediction model with a Bayesian optimization method; and, according to the parameter optimization result, verifying the model on the data in the test set Test set and predicting the cigarette ventilation rate with the verified model. The model established by the invention has the advantage of high precision and can accurately predict the ventilation rate of the cigarette.
Owner:HUBEI CHINA TOBACCO IND

Optical microscope automatic focusing method based on machine learning

The invention provides an optical microscope automatic focusing method based on machine learning, and belongs to the technical field of medical image processing. The method comprises the following steps: first, the grouped images collected by an optical microscope are represented by designed original features and combined features, and the sequence difference between a picture and the clearest picture in its group serves as the label of the picture; then the importance of the original features and the combined features is calculated by a random forest composed of regression trees, and the features with high importance are screened out through multiple iterations against a set threshold; the data are divided into a training set and a test set by the leave-one-out method, the gradient boosted regression trees are trained on the screened features, and finally automatic focusing is carried out with the strong regressor obtained through iterative training.
Owner:湖南品信生物工程有限公司

Power battery fault diagnosis method and system based on data driving

The invention provides a power battery fault diagnosis method and early warning system based on data driving. The method comprises the following steps: 1, collecting the performance parameters of a power battery under various working conditions and various states of the power battery, including the capacity, voltage, internal resistance and power of the power battery; 2, cleaning the acquired data; 3, calculating the state of charge SOC and the state of health SOH of the power battery according to the cleaned data; 4, formulating a fault level according to actual driving experience and automobile safety; 5, making the data obtained in steps 2, 3 and 4 into a data set; 6, putting the training set into a gradient boosting regression tree model, and carrying out iterative training on the training set; and 7, putting the test set into the model, evaluating the accuracy of the model, and adjusting model parameters according to the accuracy. The power battery fault can be accurately predicted, early warning is carried out on the fault, and the safety of the electric vehicle is greatly improved.
Owner:NANJING FORESTRY UNIV

Formation pore pressure prediction method based on machine learning

The invention relates to the technical field of logging engineering, aims to provide a formation pore pressure prediction method based on machine learning, and solves the problems that an existing prediction method is lower in prediction result accuracy and not ideal in effect. According to the technical scheme, the formation pore pressure prediction method based on machine learning comprises the following prediction steps of a, processing and preparing data, namely collecting the related logging data and the related rock physical property parameters; b, determining a sensitive curve, namely preparing a reference sequence and a comparison sequence of a grey relational degree method, and determining a sensitive logging curve; c, training and testing a model, namely dividing an original data set into a training set and a testing set, and inputting the training set into a gradient boosting regression tree model to obtain an optimal model; and d, predicting the formation pore pressure, namely taking the sensitive logging curve as an input feature vector of the optimal model to predict the reservoir formation pressure. The method has the advantages of better prediction precision, wide prediction range, high reliability and the like.
Owner:SOUTHWEST PETROLEUM UNIV

PM2.5 concentration satellite remote sensing estimation method in polluted weather

A PM2.5 concentration satellite remote sensing estimation method in polluted weather comprises establishing a data set by using satellite aerosol optical thickness data and corresponding ground PM2.5 concentration data; completing sample learning and data testing based on a gradient boost regression tree learning method; verifying the accuracy of a test result, and adjusting the parameters of the gradient boost regression tree to meet an accuracy requirement. By the final regression tree calculation model, the method can be effectively used for estimating the PM2.5 concentration in polluted weather, has high result accuracy and a fast estimation speed, can supplement the deficiency of a traditional method for estimating the PM2.5 concentration in polluted weather, and provides accurate data support for the prevention and control of air pollution.
Owner:INST OF REMOTE SENSING & DIGITAL EARTH CHINESE ACADEMY OF SCI +2

An urban bus emission rate estimation method based on a gradient lifting regression tree

The invention discloses an urban bus emission rate estimation method based on a gradient lifting regression tree. Firstly, the measured bus emission data are standardized with a Lagrangian interpolation method to obtain second-by-second emission data. Secondly, the VSP (Vehicle Specific Power) is used to characterize the current operating conditions of the bus, and the influence of the previous driving state on the emission is considered to establish a quantitative model of the emission rate. Finally, the gradient lifting regression tree is trained on the data and its parameters are adjusted to obtain the bus emission rate estimation model. The invention considers the joint influence of the current operating condition and the previous driving state on the current emission rate. The non-parametric gradient lifting regression tree model improves the estimation accuracy of the bus emission rate and handles the complex nonlinear relationship between the bus emission rate and its various influencing factors, which has practical significance for controlling traffic exhaust emissions and optimizing the road environment.
Owner:SOUTHEAST UNIV

Binary classification oriented factor screening method based on boosted regression trees

Active | CN107608938A | Address subjectivity | Solve the problem of multicollinearity | Complex mathematical operations | Factor screening | Regression tree model
The invention discloses a binary classification oriented factor screening method based on boosted regression trees. The method comprises the following steps: (1) searching data, and establishing a target variable-predictive factor dataset; (2) on the basis of the target variable and all factors, utilizing the boosted regression trees to carry out modeling, and calculating and sorting factor importance; (3) carrying out correlation analysis on all factors, analyzing a Pearson correlation matrix, and carrying out screening; (4) on the basis of the target variable and the retained factors, utilizing the boosted regression trees to establish a new model, calculating a predictive deviation, calculating and sorting the factor importance, and removing the factor with the lowest importance until the number of retained factors is less than or equal to 2; and (5) comparing the predictive deviation of each boosted regression tree model in step (4), and taking all factors adopted by the boosted regression tree model with the smallest predictive deviation as the optimal factor combination. By use of the method, a quantitative factor selection system is established, results are reliable, and the field of application is wide.
Owner:ANHUI NORMAL UNIV
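
Step (3) of the abstract above, screening factors with a Pearson correlation matrix, could look like the sketch below. The abstract does not state the screening rule, so the rule here is an assumption: factors are visited in descending importance and a factor is dropped when its absolute correlation with an already kept factor exceeds a threshold. The names pearson and screen_factors, the 0.8 threshold and the importance scores (assumed to come from step (2)) are all hypothetical.

```python
from math import sqrt

def pearson(a, b):
    """Pearson correlation of two equal-length sequences (neither may be constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)

def screen_factors(columns, importance, threshold=0.8):
    """columns: factor name -> values; importance: factor name -> score.

    Assumed rule: keep factors greedily from most to least important and skip
    any factor whose |r| with an already kept factor exceeds the threshold.
    """
    kept = []
    for name in sorted(columns, key=lambda k: importance[k], reverse=True):
        if all(abs(pearson(columns[name], columns[k])) <= threshold for k in kept):
            kept.append(name)
    return kept
```

With this rule, the less important member of each strongly correlated pair is the one removed, which is one plausible way to address the multicollinearity problem named in the tag line.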

Query time prediction method for time sequence database

The invention discloses a time sequence database-oriented query time prediction method, relates to the technical field of computers, and aims to solve the problem of low query time prediction speed in the prior art. 2, the time series data are written into a CnosDB, the CnosDB uses a CnoSQL query statement to conduct query retrieval on the time series data, and query time is recorded; 3, encoding the query statement into vectorized data; 4, extracting data distribution characteristics of the vectorized data; 5, performing dimension reduction on the data distribution characteristics by using PCA; step 6, using the vectorized data and the dimension-reduced data distribution characteristics as input, using query time as output, and training a gradient lifting regression tree model; and 7, performing query time prediction by using the trained gradient lifting regression tree model. In the aspect of prediction time, the model can give a prediction result within dozens of milliseconds in the experiment, and the response speed is very considerable.
Owner:北京诺司时空科技有限公司 +1

TBM tunneling optimization method based on rock slag physical characteristics

The invention discloses a TBM tunneling optimization method based on rock slag physical characteristics. The method comprises the steps that firstly, image acquisition and sensor equipment of a system is installed, and TBM field tunneling parameter data and parameter data of geometric characteristics and physical characteristics of rock slag are acquired to serve as a sample set of a model; secondly, a gradient lifting regression tree model optimized by a particle swarm algorithm is established for parameter learning and training feedback, and a TBM tunneling parameter suggestion interval is controlled; thirdly, the TBM net tunneling rate is output, an optimal prediction model is obtained, the working performance of the optimal prediction model is evaluated according to a test set in samples, and optimal tunneling control parameters are provided; and finally, after the optimal tunneling control parameters are compared with related specification requirements, feedback is conducted to a TBM console in time, and TBM tunneling parameters are adjusted. Optimization provided by the invention can be applied to TBM construction, rock slag information is predicted in advance, the tunneling parameters are dynamically adjusted, intelligent prediction of the TBM rock breaking efficiency is achieved, and the method has important significance in safe and efficient construction of tunnels.
Owner:CHINA RAILWAY 18TH BUREAU GRP CO LTD +2

Aero-engine remaining service life prediction model based on hybrid machine learning

The invention belongs to the technical field of aero-engine fault prediction and health management, discloses an aero-engine residual service life prediction model based on hybrid machine learning, and particularly relates to SGBRT, a hybrid machine learning model for predicting the residual service life of an aero-engine in time. The model combines a self-organizing mapping network and a gradient boosting regression tree algorithm, and predicts the residual service life of the aero-engine through the following steps: firstly, the self-organizing mapping network clusters the original sample set into clusters; then a gradient boosting regression tree is constructed for each cluster to predict the residual service life of the aero-engine. According to the method, the remaining service life of the aero-engine can be better predicted, and the intrinsic characteristics of the degradation data of the aero-engine are also revealed.
Owner:DALIAN UNIV OF TECH

Method for predicting online car-hailing order quantity based on multi-source data fusion

The invention discloses a method for predicting online car-hailing order quantity based on multi-source data fusion. A hierarchical prediction model based on proportional matrix weighted average (PMWA) is proposed to predict the OD order quantity. A gradient lifting regression tree algorithm is adopted to predict the total urban order quantity of a future time slice. The proportional matrix of the future time slice is then predicted by a weighted average of proportional matrices, whose weights are determined by a similarity measurement function over time, weather and other characteristics, so that the algorithm can effectively fuse the multi-source data. Finally, the total urban order quantity is allocated according to the corresponding values in the predicted proportional matrix to obtain the order quantity of each OD. The problem of multi-line prediction is effectively solved, and prediction precision is high.
Owner:SICHUAN UNIV

GBRT-based method for forecasting reaction property of solid fuel during chemical chain process

The invention relates to a method for forecasting the reaction property of a solid fuel during the chemical chain process. The method comprises the steps of (1) collecting data by solid fuel chemical chain experiment study; (2) organizing the data to obtain a training sample and a test sample; (3) training a gradient boosted regression tree model on the training sample; and (4) forecasting the reaction property of the solid fuel during the chemical chain process. A result is forecasted by traversing data combinations, and a corresponding chemical chain working condition parameter is obtained according to the demands of different chemical chain technologies; compared with the prior art, the method has the advantages that the reaction properties of various solid fuels during the chemical chain process are forecasted by the gradient boosted regression tree model, the number of experiments is substantially reduced, and labor and material are greatly saved; moreover, the fuel conversion rate during the chemical chain process can be forecasted intuitively and quantitatively, and the method has important significance for optimization of the chemical chain process.
Owner:SOUTHEAST UNIV

Multiphase flow virtual metering method based on gradient boosting regression tree model

The invention discloses a multiphase flow virtual metering method based on a gradient boosting regression tree model. The method comprises the following steps: firstly, acquiring sample data of a plurality of multiphase flows including pressure, temperature, differential pressure, liquid quantity, oil quantity and water quantity; using the sample data to train a GBDT gradient boosting regression tree model to obtain a virtual metering model; using the sample data to evaluate the virtual metering model; and finally inputting the pressure, temperature and differential pressure measured in real time into the virtual metering model for virtual metering. The method has the advantages of fast model training, little parameter tuning, high precision, small error, small equipment size, low manufacturing cost, no radiation and the like, and has high data reference value for production dynamic monitoring, flow assurance and oil reservoir management of an oil field.
Owner:HAIMO PANDORA DATA TECH(SHENZHEN) CO LTD

Cutter life dynamic prediction method

The invention relates to the field of cutter service life prediction, and discloses a cutter service life dynamic prediction method, which comprises the following steps: S1, determining features influencing the cutter service life, collecting related information data, obtaining historical data, and carrying out standardization processing on the historical data; S2, performing correlation analysis on the historical data, and deleting features of correlation within a critical range; S3, performing principal component analysis on the features, and performing dimension reduction and simplification on historical data to obtain modeling data; S4, using a gradient lifting regression tree to train modeling data, and establishing a cutter life prediction model; S5, collecting real-time data according to the characteristics of the modeling data, carrying out standardization processing on the real-time data, inputting the data into the cutter life prediction model, and outputting the cutter life. According to the dynamic prediction method for the service life of the cutter, the information data influencing the service life of the cutter is optimized, a perfect cutter service life prediction model is established, and the accuracy of cutter service life prediction is further improved.
Owner:CHENGDU AIRCRAFT INDUSTRY GROUP
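
Steps S1 and S5 of the abstract above both standardize data, once for the historical data and once for the real-time data. A minimal sketch follows, assuming that standardization means z-scoring and that the real-time data reuse the mean and standard deviation fitted on the historical data (the abstract states neither); fit_standardizer and standardize are hypothetical names.

```python
from statistics import mean, pstdev

def fit_standardizer(column):
    """Return (mean, population standard deviation) of one historical feature."""
    return mean(column), pstdev(column)

def standardize(values, params):
    """Z-score values with previously fitted parameters.

    The standard deviation must be non-zero, so a constant feature
    should be dropped before this step.
    """
    mu, sigma = params
    return [(v - mu) / sigma for v in values]

# Fit on historical data once, then apply the same parameters to new readings.
historical = [2, 4, 4, 4, 5, 5, 7, 9]
params = fit_standardizer(historical)
realtime_scaled = standardize([9, 5], params)
```

Reusing the fitted parameters keeps the real-time inputs on the same scale the model saw during training, which re-fitting on each new batch would not guarantee.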

Multi-source man-made thermal space-time quantization method based on machine learning

The man-made heat has obvious influence on urban climate and air quality, but an accurate and efficient estimation method for the multi-source man-made heat is lacked at present. The invention improves the process of man-made thermal modeling, and provides a multi-source man-made thermal time-space quantization method based on machine learning. The method comprises the following steps: step 1) calculating a county-level annual average artificial heat flux (AHF) on the basis of energy consumption and social economic data; 2) carrying out time dimension downscaling processing on artificial heat from different sources by using the substitution data to obtain county-level monthly average AHF; 3) calculating a monthly county-level average value of the man-made thermal correlation multi-source data as an explanatory variable, and forming a training sample with the corresponding AHF; 4) training the model based on two machine learning algorithms of a gradient lifting regression tree and Cubist, performing error analysis, and selecting an optimal algorithm for modeling for different heat sources; and 5) inputting specific raster data into the optimal model to calculate the multi-source artificial heat flux in a specific area at a specific time.
Owner:AEROSPACE INFORMATION RES INST CAS +1

Early warning system and method for temperature of engine cylinder of mining truck

The invention discloses an early warning system and method for the temperature of an engine cylinder of a mining truck. The method comprises the following steps that S1, a data acquisition terminal is erected on the truck, and the data acquisition terminal is used for real-time data acquisition and transmission; S2, the data acquisition terminal is connected with a background server through a network to upload real-time truck operation data; S3, the background server screens related independent variables based on Spearman correlation coefficients through big data analysis service, and constructs a cylinder temperature prediction model by adopting a gradient lifting regression tree algorithm; and S4, the background server estimates a cylinder temperature predicted value in real time through a monitoring module, compares the actual cylinder temperature with the predicted value through an alarm module, and sends early warning information to inform a user of necessary inspection of the truck when the actual temperature and the predicted value have a large deviation and are continuous.
Owner:HUANENG YIMIN COAL POWER CO LTD +2
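
The alarm rule in step S4 above, a warning when the deviation between actual and predicted temperature is large and continuous, can be sketched as a run-length check. The deviation threshold and the number of consecutive samples are assumptions, since the abstract gives no values, and early_warnings is a hypothetical name.

```python
def early_warnings(actual, predicted, max_deviation=10.0, min_consecutive=3):
    """Return the sample indices at which a warning fires.

    A warning fires once per run, at the moment the absolute deviation has
    exceeded max_deviation for min_consecutive samples in a row.
    """
    warnings = []
    run = 0  # length of the current run of large deviations
    for i, (a, p) in enumerate(zip(actual, predicted)):
        run = run + 1 if abs(a - p) > max_deviation else 0
        if run == min_consecutive:
            warnings.append(i)
    return warnings
```

Requiring a sustained run rather than a single large deviation is what keeps an isolated sensor spike from triggering an inspection request.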

Robust closed conduit structure stress anomaly identification method and system

Pending | CN114519225A | Timely assessment of safety and reliability | Reduce false positives | Geometric CAD | Measurement devices | Stress measurement | Engineering
The invention belongs to the technical field of hydraulic engineering safety monitoring, and particularly relates to a robust closed conduit structure stress anomaly identification method and system. Historical data of the closed conduit monitoring section are collected as sample data, the monitoring section data being composed of environment measurement point data and steel bar stress measurement point data that change with the environmental quantities; a machine learning modeling strategy of a two-stage gradient lifting regression tree is adopted to construct a prediction model for each steel bar stress measuring point, and historical monitoring data are used for training and optimization; monitoring section data of the closed conduit are collected at regular intervals and sent into the trained and optimized steel bar stress prediction model, and the predicted value is compared with the actually measured steel bar stress in the monitoring data to judge whether a measuring point is abnormal. According to the method, when the closed conduit structure stress anomaly identification model is constructed, environment measurement point information is used directly and steel bar stress measurement point information at other positions is used indirectly, which avoids possible endogeneity in the constructed model; closed conduit structure stress anomalies can be accurately identified, and the safety and reliability of the closed conduit structure can be evaluated in time.
Owner:中国南水北调集团中线有限公司

Method for Predicting Reaction Performance in Solid Fuel Chemical Looping Process Based on GBRT

The invention relates to a method for predicting the reaction performance of solid fuel in the chemical looping process, comprising: (1) collecting data through experimental research on the chemical looping of solid fuel; (2) organizing the data to obtain training samples and test samples; (3) training a gradient boosting regression tree model on the training samples; (4) predicting the reaction performance of solid fuel in the chemical looping process. Results are predicted by traversing the data combinations, and the corresponding chemical looping operating parameters are obtained according to the requirements of different chemical looping technologies. Compared with the prior art, the present invention predicts the reaction performance of various solid fuels in the chemical looping process through the gradient boosting regression tree model, greatly reduces the number of experiments, saves a lot of manpower and material resources, and supports intuitive and quantitative prediction of quantities such as the fuel conversion rate in the chemical looping process, which has guiding significance for optimizing the chemical looping process.
Owner:SOUTHEAST UNIV

A Network Planning Method for Low Power Wide Area Network Based on Data Mining

The invention discloses a network planning method for low power wide area networks based on data mining. Measured data of the low power wide area network are obtained; starting from the coverage target and comprehensively considering the factors that cause weak network coverage, a boosted regression tree algorithm is used to establish a signal quality prediction model, from which the spatial pattern of the network's coverage distribution is extracted; then the selection of base station locations is treated as a weighted problem based on that spatial pattern, and a weighted K-centroids clustering algorithm is used to obtain the optimal base station deployment for the current pattern; finally, the final base station topology is determined according to the overall objective function. The invention can substantially improve the coverage quality of the low power wide area network, and has reference value for its network planning.
Owner:NANJING UNIV OF POSTS & TELECOMM
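
The abstract does not spell out the weighted K-centroids algorithm, so the following is only a sketch of one plausible reading: Lloyd-style clustering in which each centroid is the weighted mean of its assigned locations. The weights are assumed here to come from the coverage pattern (for example, a higher weight where predicted signal quality is worse); the coordinates, weights, and initial centers are invented for illustration.

```python
import math

def weighted_kmeans(points, weights, centers, n_iter=20):
    """Lloyd-style iterations where each centroid is the weighted mean of its points."""
    for _ in range(n_iter):
        clusters = [[] for _ in centers]
        for p, w in zip(points, weights):
            nearest = min(range(len(centers)),
                          key=lambda k: math.dist(p, centers[k]))
            clusters[nearest].append((p, w))
        new_centers = []
        for k, members in enumerate(clusters):
            total = sum(w for _, w in members)
            if total == 0:
                new_centers.append(centers[k])  # leave an empty cluster's center in place
                continue
            new_centers.append(tuple(
                sum(w * p[d] for p, w in members) / total
                for d in range(len(centers[k]))
            ))
        centers = new_centers
    return centers

# Hypothetical measurement locations and weak-coverage weights.
points = [(0, 0), (2, 0), (10, 10), (12, 10)]
weights = [1, 3, 1, 1]
sites = weighted_kmeans(points, weights, centers=[(0, 0), (10, 10)])
```

With these toy inputs the first candidate site is pulled toward the heavily weighted location at (2, 0), which is the behavior the weighting is meant to produce.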

Soft measurement modeling method for the penicillin fermentation process based on a gradient boosting regression tree optimized by a fruit fly algorithm

The invention provides a soft measurement modeling method for the penicillin fermentation process based on a gradient boosting regression tree optimized by a fruit fly algorithm, and belongs to the field of soft measurement modeling and application for industrial fermentation production processes. In this method, the key parameters of a gradient boosting regression tree are optimized by a fruit fly optimization algorithm to build a soft measurement model of the penicillin fermentation process. The penicillin product concentration, which otherwise has to be measured offline, is estimated online from auxiliary variables that can be measured in real time during fermentation, and the model output is calibrated with an offset compensation technique, providing a method for online real-time measurement of product concentration in the penicillin fermentation process. The soft measurement modeling method improves the prediction precision of the product concentration and can be used effectively to guide penicillin production.
Owner:JIANGNAN UNIV
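
As a rough illustration of the two ideas in the abstract above, the sketch below shows a simplified fruit-fly-style random search over model parameters and a simple bias-update form of offset compensation. Both are assumptions: the search perturbs parameters directly instead of using the distance-based smell concentration of the original fruit fly optimization algorithm, `validation_error` is a made-up placeholder for the cross-validated error of a gradient boosting regression tree, and the abstract does not state the exact compensation formula.

```python
import random

def fruit_fly_search(objective, start, step, n_flies=20, n_iter=50, seed=0):
    """Simplified fruit-fly-style search: flies sample around the swarm
    location, and the swarm moves to the best fly if it improves the score."""
    rng = random.Random(seed)
    swarm = list(start)
    best_score = objective(swarm)
    for _ in range(n_iter):
        flies = [[x + rng.uniform(-s, s) for x, s in zip(swarm, step)]
                 for _ in range(n_flies)]
        scores = [objective(f) for f in flies]
        i = min(range(n_flies), key=scores.__getitem__)
        if scores[i] < best_score:
            best_score, swarm = scores[i], flies[i]
    return swarm, best_score

def validation_error(params):
    """Hypothetical placeholder for the validation error of a model trained
    with these parameters (max_depth is treated as continuous here)."""
    learning_rate, max_depth = params
    return (learning_rate - 0.1) ** 2 + ((max_depth - 4.0) / 10.0) ** 2

def compensate(prediction, last_measured, last_predicted):
    """Shift the model output by the most recent offline-assay error
    (an assumed bias-update form of offset compensation)."""
    return prediction + (last_measured - last_predicted)

start = [0.5, 8.0]
start_error = validation_error(start)
best_params, best_error = fruit_fly_search(validation_error, start, step=[0.05, 0.5])
corrected = compensate(1.0, last_measured=0.5, last_predicted=0.25)
```

In a real soft sensor the objective would retrain and validate the model for each candidate parameter set, and integer parameters such as tree depth would be rounded before use.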

A Satellite Remote Sensing Estimation Method of PM2.5 Concentration in Polluted Weather

A method for satellite remote sensing estimation of PM2.5 concentration in polluted weather. Satellite aerosol optical depth data and the corresponding ground PM2.5 concentration data are used to build a data set, and sample learning and data testing are carried out with the gradient boosting regression tree learning method. The accuracy of the test results is verified, and the parameters of the gradient boosting regression tree are adjusted until the accuracy requirements are met. The final regression tree model can be used effectively for PM2.5 concentration estimation in polluted weather, with more accurate results and faster estimation. It supplements traditional PM2.5 estimation methods where they fall short in polluted weather and provides more accurate data support for air pollution prevention and control.
Owner:INST OF REMOTE SENSING & DIGITAL EARTH CHINESE ACADEMY OF SCI +2

A Factor Screening Method for Binary Classification Based on Boosted Regression Tree Algorithm

Active; CN107608938B; Address subjectivity; Solve the problem of multicollinearity; Complex mathematical operations; Algorithm; Regression tree model
The invention discloses a factor screening method for binary classification based on boosted regression trees. The method comprises the following steps: (1) collecting data and establishing a dataset of the target variable and predictive factors; (2) modeling with boosted regression trees on the target variable and all factors, then calculating and ranking factor importance; (3) carrying out correlation analysis on all factors, analyzing the Pearson correlation matrix, and screening the factors; (4) establishing a new boosted regression tree model on the target variable and the retained factors, calculating the predictive deviation, calculating and ranking factor importance, and removing the factor with the lowest importance, repeating until the number of retained factors is less than or equal to 2; and (5) comparing the predictive deviation of each boosted regression tree model from step (4) and taking the factors used by the model with the smallest predictive deviation as the optimal factor combination. The method establishes a quantitative factor selection system, gives reliable results, and applies to a wide range of fields.
Owner:ANHUI NORMAL UNIV
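
Step (3) of the abstract above does not state the screening rule, so the sketch below assumes one common choice: walk the factors in descending importance and drop any factor whose absolute Pearson correlation with an already retained factor reaches a threshold. The factor names, values, importance scores, and the 0.8 threshold are all hypothetical. The importance scores would come from the boosted regression tree model of step (2), which is not reproduced here.

```python
import math

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)

def screen_by_correlation(columns, importance, threshold=0.8):
    """Keep factors in descending importance, skipping any factor that is
    too strongly correlated with a factor already kept."""
    kept = []
    for name in sorted(columns, key=importance.get, reverse=True):
        if all(abs(pearson(columns[name], columns[k])) < threshold for k in kept):
            kept.append(name)
    return kept

# Hypothetical factors; "temperature" is perfectly anti-correlated with "elevation".
columns = {
    "elevation": [1, 2, 3, 4, 5],
    "temperature": [10, 8, 6, 4, 2],
    "rainfall": [3, 1, 4, 1, 5],
}
importance = {"elevation": 0.5, "temperature": 0.3, "rainfall": 0.2}
retained = screen_by_correlation(columns, importance)
```

Here `temperature` is removed because it duplicates the information in the more important `elevation`, while `rainfall` survives. The backward elimination of step (4) would then run on the retained factors only.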

A method for predicting online car-hailing orders based on multi-source data fusion

The invention discloses a method for predicting online car-hailing order volume based on multi-source data fusion. A hierarchical forecasting model based on the proportion matrix weighted average (PMWA) is proposed to forecast the order volume of each origin-destination (OD) pair. The gradient boosting regression tree algorithm first predicts the city's total order volume in the future time slice. The proportion matrix of the future time slice is then predicted as a weighted average of proportion matrices, with weights determined by a similarity measurement function over time, weather, and other characteristics, so that the algorithm effectively fuses these multi-source data. Finally, the city's total order volume is distributed according to the corresponding values in the predicted proportion matrix to obtain the order volume of each OD pair. The method effectively solves the "multi-line forecasting" problem and achieves high forecasting accuracy.
Owner:SICHUAN UNIV
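
The distribution step described above can be sketched as follows, under the assumption that the future proportion matrix is a similarity-weighted average of proportion matrices from past time slices. The matrices, similarity scores, and the total order volume are invented. In the described method the total would come from the gradient boosting regression tree forecast and the similarities from a function of time, weather, and other features.

```python
def predict_od_orders(history, similarity, total_orders):
    """history: list of OD proportion matrices (each summing to 1);
    similarity: one non-negative weight per matrix;
    total_orders: forecast city-wide order volume for the future time slice."""
    weight_sum = sum(similarity)
    rows, cols = len(history[0]), len(history[0][0])
    proportion = [
        [sum(w * m[i][j] for w, m in zip(similarity, history)) / weight_sum
         for j in range(cols)]
        for i in range(rows)
    ]
    # Distribute the total order volume according to the predicted proportions.
    return [[total_orders * p for p in row] for row in proportion]

# Two hypothetical past time slices over a 2 x 2 grid of OD pairs.
history = [
    [[0.5, 0.25], [0.25, 0.0]],
    [[0.25, 0.25], [0.25, 0.25]],
]
similarity = [3.0, 1.0]   # the first slice resembles the target slice more closely
od_orders = predict_od_orders(history, similarity, total_orders=1000)
```

Because every proportion matrix sums to 1 and the weights are normalized, the per-OD forecasts always add up to the city-wide total.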