42 results about how to "Reduce data noise" patented technology

Method and apparatus for analysis and decomposition of classifier data anomalies

A human-assisted method of debugging training data used to train a machine learning classifier is provided. The method includes obtaining a classifier training data set. The training data set is then debugged using an integrated debugging tool configured to implement a debugging loop to obtain a debugged data set. The debugging tool can be configured to perform an estimation and simplification step to reduce data noise in the training data set prior to further analysis. The debugging tool also runs a panel of prediction-centric diagnostic metrics on the training data set and provides the user with prediction-based listings of the results of that panel.
Owner:MICROSOFT TECH LICENSING LLC

Semiconductor device

Data lines (D0, D1) are shared by a first storage portion (MA) and a second storage portion (MB). A first transistor (MC0) coupled to a first comparison data portion (CD0) and a second transistor (MCA) coupled to the storage node of the first storage portion are connected in series to form a first comparing circuit (11), and a third transistor (MC1) coupled to a second comparison data line (CD1) and a fourth transistor (MCB) coupled to the storage node of the second storage portion are connected in series to form a second comparing circuit (12). Consequently, the symmetry of the diffusion layer and wiring layer layout is enhanced, and it becomes easy to lay out a memory cell that is line-symmetrical about a center line passing through its center. Thus, manufacturing process conditions can be optimized easily and variation in the manufacturing process can be reduced, so that microfabrication of the memory cell can be achieved.
Owner:HITACHI LTD

Noise Reduction Apparatus, Systems, and Methods

This document describes a general system for noise reduction, as well as a specific system for Magnetic Resonance Imaging (MRI) and Nuclear Quadrupole Resonance (NQR). The general system, which is called Calculated Readout by Spectral Parallelism (CRISP), involves reconstruction and recombination of frequency-limited broadband data using separate narrowband data channels to create images or signal profiles. A multi-channel CRISP system can perform this separation using (1) frequency-tuned hardware, (2) a frequency filter bank (or equivalent), or (3) a combination of implementations (1) and (2). This system significantly reduces what we call cross-frequency noise, thereby increasing the signal-to-noise ratio (SNR). A multi-channel CRISP system applicable to MRI and NQR is described.
Owner:ARJAE SPECTRAL ENTERPRISES

Deep learning-based picture sentiment polarity analysis method

Active | CN106886580A | Large data size | Reduce labor costs | Web data indexing | Semantic analysis | Common word | Affective forecasting
The invention discloses a deep learning-based picture sentiment polarity analysis method and relates to the technical field of image content understanding and big data analysis. Conventional picture sentiment analysis methods rely on simple models and features, so their final prediction precision is unsatisfactory. Current deep learning methods train on large-scale training sets, but the noise in those sets is excessively high, which limits final performance. In the disclosed method, data is obtained directly from the network, so the data scale is large. Only the sentiment polarity of the common words collected during data preparation may need to be annotated manually; the subsequent image acquisition and cleaning can be completed automatically, so the required labor cost is very low. In the data acquisition stage, two data cleaning processes are introduced, which eliminate a large portion of the noise caused by inconsistency between pictures and tags. The method uses prior knowledge to filter the training set, reducing its noise, and additionally uses an improved network structure, improving picture sentiment prediction accuracy.
Owner:BEIJING UNIV OF TECH

Method and system for revising user word bank

The invention provides a method and system for revising a user word bank. The method comprises: checking whether the current input contents are completely or partially the same as or similar to the input contents in their input codes but different in their characters; and/or checking whether the current input contents are completely or partially the same as or similar to the input contents in their characters but different in their input codes; revising data in the user word bank based on the current input contents and error correction contents if the conditions are met; and making the error correction contents a part of the input contents corresponding to the current input contents. The method and system can intelligently record user input information, avoid learning mistyped words as far as possible, and reduce data noise in the user word bank. They impose no additional restrictions on user editing actions, greatly expand the range and depth of word bank revision, and can better remove data noise that cannot be found in the prior art.
Owner:BEIJING SOGOU TECHNOLOGY DEVELOPMENT CO LTD

DLO Hi-C (Digestion-Ligation-Only Hi-C) chromosome conformation capture method

The invention discloses a DLO Hi-C (Digestion-Ligation-Only Hi-C) chromosome conformation capture method. The method overcomes a series of shortcomings of conventional chromosome conformation capture technology (Hi-C), such as high noise, high cost, a complex experimental process, a low success rate and difficult data analysis, and it requires only simple digestion and ligation steps. The method has the following innovations: 1) target cells are double cross-linked with EGS (ethylene glycol bis(succinimidyl succinate)) and formaldehyde, which prevents decrosslinking later in the experiment; 2) biotin labeling is avoided, which greatly reduces cost; 3) little time is needed, and a library can be constructed within two and a half days using only simple digestion and ligation steps; 4) data noise is reduced, because target gene interaction fragments are selectively recovered by size using a PAGE gel, so that essentially all data obtained from sequencing is valid; and 5) a library quality assessment standard is proposed for the first time, so that library quality can be judged before high-throughput sequencing.
Owner:HUAZHONG AGRI UNIV

Time series data filling and restoring method based on machine learning

Active | CN110457867A | Raise the upper limit of forecast performance | Reduce data noise | Machine learning | Special data processing applications | Algorithm | Data filling
The invention relates to the technical field of computer time series data analysis and prediction, in particular to a time series data filling and restoring method based on machine learning. The method includes: filling missing values using a domain-based median and mean value filling method; estimating the true value at an expected sampling moment through a linear rule; detecting the wave crests and wave troughs of the time series and smoothing abnormal values; and taking hundreds of thousands of collected real data records as samples, designing and generating time series features, taking real results as labels, and training a machine learning model based on XGBoost (Extreme Gradient Boosting) to predict a large amount of unknown data. The method addresses problems such as multiple missing values, large volatility and error accumulation in specific time series data, and effectively improves the accuracy of data filling and restoring; moreover, the complexity of the machine learning model is well controlled, the filling and restoration of hundreds of millions of data records can be completed on the order of hours, and the method has high practical value.
Owner:杭州知衣科技有限公司
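
The filling step in the abstract above can be illustrated with a minimal sketch. It assumes that "domain-based" means a window of neighboring samples and that missing values are marked with `None`; the function name and the `window` parameter are hypothetical and do not come from the patent.

```python
from statistics import median


def fill_missing_with_neighbor_median(values, window=2):
    """Replace each None with the median of the known values that lie
    within `window` positions on either side (assumed neighborhood rule)."""
    filled = list(values)
    for i, value in enumerate(values):
        if value is None:
            lo = max(0, i - window)
            hi = min(len(values), i + window + 1)
            neighbors = [x for x in values[lo:hi] if x is not None]
            if neighbors:  # leave the gap if no neighbor is known
                filled[i] = median(neighbors)
    return filled


series = [10.0, None, 12.0, 11.0, None, None, 15.0]
print(fill_missing_with_neighbor_median(series))
# [10.0, 11.0, 12.0, 11.0, 12.0, 13.0, 15.0]
```

Only original (non-filled) neighbors are consulted, so one filled value never feeds another; the mean variant would swap `median` for `statistics.mean`.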

Unmanned aerial vehicle speed estimation method

The embodiments of the present invention disclose an unmanned aerial vehicle speed estimation method and relate to unmanned aerial vehicle navigation and information measurement. The purpose of the invention is to solve the inaccuracy and high delay of measuring unmanned aerial vehicle speed by the optical flow principle. The method comprises: calculating the median of the similarity of the sparse optical flow feature points, and filtering out the feature point pairs in the feature point set whose standard cross-correlation value is less than the median, to obtain first sparse optical flow field data; calculating the median FB error of the feature point pairs in the first sparse optical flow field, and filtering out the feature point pairs whose FB error is greater than the median, to obtain second sparse optical flow field data, from which an optical flow value capable of characterizing the actual motion of the unmanned aerial vehicle is calculated; and estimating the motion speed by combining the image optical flow value, the camera angular motion and the camera height information. The method is mainly used for estimating the speed of the unmanned aerial vehicle.
Owner:HIWING TECH ACAD OF CASIC
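
The two median filtering stages in the abstract above can be sketched as follows. This is an illustration under assumptions: each feature point pair is a dictionary with a similarity score `ncc` and an `fb_error` (read here as forward-backward tracking error), and both are computed elsewhere. The field names and the function name are hypothetical.

```python
from statistics import median


def two_stage_median_filter(pairs):
    """Keep pairs whose similarity is at least the median similarity, then
    keep those whose FB error is at most the median FB error of the rest."""
    if not pairs:
        return []
    ncc_median = median(p["ncc"] for p in pairs)
    first_field = [p for p in pairs if p["ncc"] >= ncc_median]
    fb_median = median(p["fb_error"] for p in first_field)
    return [p for p in first_field if p["fb_error"] <= fb_median]


pairs = [
    {"id": 0, "ncc": 0.9, "fb_error": 0.2},
    {"id": 1, "ncc": 0.5, "fb_error": 0.1},
    {"id": 2, "ncc": 0.8, "fb_error": 1.5},
    {"id": 3, "ncc": 0.3, "fb_error": 0.4},
]
print([p["id"] for p in two_stage_median_filter(pairs)])  # [0]
```

Each stage discards roughly half of the remaining pairs, so the surviving set is small but made of the most consistent tracks.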

Spoken language text processing method for removing stop words and predicting sentence boundaries

Active | CN111339750A | Optimum Processing Granularity | Reduced collaborative predictive power | Mathematical models | Semantic analysis | Conditional random field | Natural language understanding
The invention discloses a spoken language text processing method for removing stop words and predicting sentence boundaries. The method comprises the following steps: first, collecting a corpus of speech recognition text; then marking the stop words in the corpus; marking the words on both sides of sentence boundaries in the corpus; training a sequence labeling model by a machine learning method; and finally, processing spoken text with the model. Stop words in a text sequence are identified and removed by sequence labeling. A machine learning scheme combining text vector embedding, forward and backward bidirectional encoding and a conditional random field efficiently extracts deep semantic features of spoken text and improves tag sequence prediction accuracy. A single model performs stop word removal and sentence boundary prediction simultaneously. After processing, the key points of the speech recognition text are more prominent and the text is reasonably punctuated, which makes it easier for humans to read and lets the natural language understanding module select the optimal processing granularity.
Owner:网经科技(苏州)有限公司

Kinematics estimation and deviation calibration method for crawler tractor

The invention discloses a kinematics estimation and deviation calibration method for a crawler tractor, belonging to the technical field of agricultural machinery automatic driving. The method comprises constructing a kinematic model of the crawler tractor. In practice, factors such as ground undulation and GNSS dual-antenna installation deviation cause heading angle deviation and degrade path tracking; to simplify the model, the heading angle deviation is approximated as a fixed value. An eastward displacement coordinate component, a northward displacement coordinate component, the tractor speed and the tractor heading angle are used as system observations, and the constructed crawler tractor Kalman filter model is a nonlinear model. With the invention, the heading angle deviation caused by ground undulation, GNSS antenna installation deviation and the like can be estimated quickly and accurately, the heading angle is compensated, and the adaptability of the system to the ground is improved.
Owner:NANJING TIANCHENLIDA ELECTRONICS TECH CO LTD

Data compression and decompression method on basis of orthogonal wavelet packet transform and rotating door algorithm

The invention discloses a two-stage data compression and decompression method on the basis of orthogonal wavelet packet transformation and a rotating door algorithm. Data compression comprises the following steps of: (1) carrying out orthogonal wavelet packet transformation on original data to obtain a wavelet packet coefficient; (2) carrying out threshold processing on the wavelet packet coefficient obtained in the step (1); and (3) carrying out secondary compression on the wavelet packet coefficient subjected to threshold processing by adopting the rotating door algorithm. Compressed data is stored into a historical database or a disk. Decompression on the compressed data comprises the following steps of: (4) carrying out linear interpolation on the compressed data and recovering to obtain primary compressed data; and (5) carrying out wavelet packet reconstitution on the primary compressed data to obtain the original data. The invention solves the problem of difficulty in compressing a nonstationary analog signal in a large-scale real-time database and provides the data compression and decompression method which is simple to implement, has a high data compression ratio and has an obvious compressing effect on the nonstationary analog signal.
Owner:GUODIAN NANJING AUTOMATION
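
The second compression stage and the interpolation-based decompression described above can be sketched as follows. The sketch assumes that the "rotating door algorithm" is what is commonly called swinging door trending, that samples are `(t, y)` tuples with strictly increasing timestamps, and that `deviation` is the door half-width. It omits the wavelet packet stage entirely, and all names are hypothetical.

```python
def swinging_door_compress(points, deviation):
    """Single swinging-door pass over (t, y) samples; `deviation` sets the
    half-width of the doors pivoting around the last archived point."""
    if len(points) < 3:
        return list(points)
    kept = [points[0]]
    anchor_t, anchor_y = points[0]
    upper = float("-inf")  # steepest slope reached by the upper door
    lower = float("inf")   # shallowest slope reached by the lower door
    prev = points[0]
    for t, y in points[1:]:
        dt = t - anchor_t
        upper = max(upper, (y - anchor_y - deviation) / dt)
        lower = min(lower, (y - anchor_y + deviation) / dt)
        if upper > lower:
            # Doors opened past parallel: archive the previous sample and
            # restart both doors from it using the current sample.
            kept.append(prev)
            anchor_t, anchor_y = prev
            dt = t - anchor_t
            upper = (y - anchor_y - deviation) / dt
            lower = (y - anchor_y + deviation) / dt
        prev = (t, y)
    kept.append(points[-1])
    return kept


def linear_interpolate(kept, t):
    """Rebuild the value at time t from the archived points."""
    for (t0, y0), (t1, y1) in zip(kept, kept[1:]):
        if t0 <= t <= t1:
            return y0 + (y1 - y0) * (t - t0) / (t1 - t0)
    raise ValueError("t is outside the compressed time range")


ramp = [(float(t), 2.0 * t) for t in range(10)]
kept = swinging_door_compress(ramp, deviation=0.5)
print(kept)                           # [(0.0, 0.0), (9.0, 18.0)]
print(linear_interpolate(kept, 4.0))  # 8.0
```

A straight ramp collapses to its two endpoints, while a signal with a step keeps the samples on either side of the jump.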

Crowdsourcing-based auxiliary driving map real-time matching and updating method

A crowdsourcing map data acquisition, data processing and updating method comprises the steps of construction of a crowdsourcing data reliability evaluation model, evaluation and verification of the model, fusion and adoption of data meeting the reliability requirement, and updating of the fusion result into the map, so that the map data updating speed is higher, and the cost is lower. According to real-time matching of the high-precision map for assisting driving, track and event data reported by an automobile are matched to the high-precision map in real time, a method for real-time map matching of the data reported by the automobile is provided, and distributed map matching in a commercial environment is achieved; the credibility evaluation of crowdsourcing data enables the result of system evaluation to be closer to an objective actual environment, and meanwhile, cross validation is carried out by utilizing deep learning and a mathematical statistical model, so that the reliability of credibility evaluation is ensured; and the crowdsourcing map is fused and updated in real time to update the data of which the reliability meets the requirement into the map, and finally a high-precision map meeting the auxiliary driving production requirement is obtained.
Owner:王程

Oil-gas microorganism gene exploration method

The present invention discloses an oil-gas microorganism gene exploration method. The method comprises the following steps: collecting samples of the shallow surface layers above a known oil well, a gas well and a dry well in an exploration area; performing high-throughput sequencing after extracting DNA; establishing a microbial community composition pattern diagram for the exploration area according to the sequencing result; screening out the characteristic microorganisms in the surface soil above the oil / gas well in the exploration area according to the pattern diagram; and designing primers according to attribute characteristics and carrying out fluorescent quantitative PCR detection on the samples from the whole exploration area to determine the number of the characteristic microorganisms. A contour line of the characteristic microorganisms obtained by the method has a high coincidence rate with the known wells and a good coincidence rate with a trap line, and can describe an oil area in detail.
Owner:新方舟能源科技(天津)有限公司

Real-time streaming data preprocessing method for tobacco industry production field

The invention discloses a real-time streaming data preprocessing method for a tobacco industry production field. The method comprises the following steps: when process data of the production field is collected in real time, automatically matching the collected process data to a visualization model associated with the data source; during matching, splitting the diverse process data streams acquired in real time into real-time data streams corresponding to each attribute of the visualization model, according to the attributes defined in the model; and, during the data stream splitting, applying a defined data preprocessing rule to the real-time process data stream and storing the preprocessed process data in the visualization model. The method improves the real-time data quality of the production process and provides a stable data basis for applying big data modeling and artificial intelligence technology in the tobacco industry.
Owner:CHINA TOBACCO ZHEJIANG IND

Novel intelligent skylight system for environment prediction

Inactive | CN109113274A | Reliable prediction | Reliable Forecast Weather Index | Roof covering | Forecasting | Control system | Data acquisition
The invention provides a novel intelligent skylight system for environment prediction. The system comprises a window frame, a window sash installed on the window frame, a driving device for driving the window sash to open and close by rotating, translating, or folding and unfolding, a control system for controlling the starting and stopping of the driving device, and an environment prediction system. The environment prediction system monitors real-time weather, calculates a real-time weather index and predicts the weather index; when the weather changes or is about to change, different skylight control commands are sent according to the weather changes, and after receiving them the control system controls the connected driving device to rotate, translate, or fold and unfold the window sash of the skylight. The system collects and processes data on the environment surrounding the skylight, reasonably predicts the environment for a period of time in the future, and opens and closes the skylight in advance according to the predicted weather index, so that wasted manpower and loss of life and property are reduced.
Owner:广州小楠科技有限公司

Safety information supervision device applied to high-speed rail signal system

The invention relates to a safety information supervision device applied to a high-speed rail signal system. The device comprises a communication processor, an RBC interface machine, a CTC interface machine, a supervision analysis server, an analysis terminal and a storage device. The communication processor and the RBC interface machine are communicatively connected with the supervision analysis server through a detection network. The CTC interface machine is communicatively connected with the CTC system of a station and with the supervision analysis server; it acquires train information from the CTC system and transmits the analysis results of the supervision analysis server to the CTC system. The device adopts multi-source data fusion analysis and performs filtering and validity analysis on basic data, which reduces false alarms.
Owner:CASCO SIGNAL

Method for automatically correcting pitch angle of vehicle-mounted laser radar

Pending | CN112558044A | Reduce data noise | Solve the problem of a large drop in detection rate | Wave based measurement systems | Point cloud | In vehicle
The invention provides a method for automatically correcting the pitch angle of a vehicle-mounted laser radar. The method comprises the steps of: screening out the laser radar points that lie on the ground according to a denoising strategy; calculating the current pitch angle of the laser radar from the ground points according to a ground fitting strategy; and correcting the current pitch angle by adopting an updating strategy. In the method, a straight line is fitted to the denoised ground point cloud data according to the ground fitting strategy to estimate the current pitch angle, so the computational load is small and few computing resources are occupied.
Owner:英博超算(南京)科技有限公司
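
The ground fitting step can be illustrated with a minimal least-squares sketch. The geometry is an assumption rather than the patent's stated method: ground points are `(x, z)` pairs in the sensor frame (forward distance, height), already denoised, and the pitch angle is taken as the arctangent of the fitted slope. The function name is hypothetical.

```python
import math


def estimate_pitch_deg(ground_points):
    """Fit z = a * x + b by ordinary least squares and return atan(a) in
    degrees (assumed relation between ground slope and sensor pitch)."""
    n = len(ground_points)
    if n < 2:
        raise ValueError("need at least two ground points")
    mean_x = sum(x for x, _ in ground_points) / n
    mean_z = sum(z for _, z in ground_points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in ground_points)
    if sxx == 0:
        raise ValueError("ground points must span more than one x value")
    sxz = sum((x - mean_x) * (z - mean_z) for x, z in ground_points)
    return math.degrees(math.atan(sxz / sxx))


# Synthetic ground seen by a sensor mounted 1.5 m high and tilted by 2 degrees.
slope = math.tan(math.radians(2.0))
points = [(float(x), slope * x - 1.5) for x in range(5, 30, 5)]
print(round(estimate_pitch_deg(points), 3))  # 2.0
```

A single closed-form line fit needs only a few sums over the ground points, which is consistent with the small computational load the abstract claims.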

Method for generating traffic sign recognition model

The traffic sign recognition method provided by the invention markedly improves the detection of road traffic signs while meeting real-time requirements. First, a weighted bidirectional feature pyramid network, which balances accuracy and efficiency better, replaces the path aggregation network, so that channel features of road traffic signs are fused better. Second, ordinary convolution is replaced by dilated convolution, which is combined with spatial pyramid pooling to further expand the receptive field. Meanwhile, the number of detection scales is increased to four, improving small-target detection, and random cropping is added to the data augmentation so that the model learns more detailed features. Finally, digital image operations are used to increase the number of instances for low-precision categories.
Owner:SHANGHAI INST OF TECH

Method, system and device for wireless positioning of in-band pseudo satellite

Disclosed are an in-band pseudolite wireless positioning method, system and device, the system comprising: a base station, a pseudolite and a terminal; the base station is used to transmit identifier information to the pseudolite after correcting the transmission clock of the pseudolite, and transmit a pseudolite array and positioning correction information to a terminal; the pseudolite is used to generate a random positioning signal sequence according to the identifier information, and utilizes the positioning link having the same frequency band as the base station wireless system to transmit a positioning signal according to the transmission clock and the random positioning signal sequence; and the terminal is used to generate a random positioning signal sequence of the pseudolite according to the pseudolite array and the positioning correction information, and match the received positioning signal according to the random positioning signal sequence to obtain the arrival time of the positioning signal, and obtain through calculation the position coordinates of the terminal according to the position coordinates of the pseudolite and the arrival time.
Owner:ZTE CORP

Reflective digital holographic microscopic imaging system and method based on pulsed laser

The invention discloses a pulse laser-based reflective digital holographic microscopy imaging system and method. The reflective digital holographic microscopy imaging system comprises an optical imaging sub system and a synchronous control sub system for controlling the operation of the optical imaging sub system, wherein the optical imaging sub system comprises a pulse laser, a laser attenuator, a steering apparatus, a light beam transfer apparatus and a holographic imaging apparatus which are arranged in sequence; the synchronous control sub system comprises an industrial control host and a synchronous controller; the reflective digital holographic microscopy imaging method comprises the following steps of 1, establishing the holographic microscopy imaging system; 2, performing synchronous control on the pulse laser and a digital camera; 3, obtaining hologram data; and 4, performing three-dimensional appearance holographic image display on the surface of a test sample. The system and the method are creative in design; by performing synchronous control on the pulse laser and the digital camera, the high-frequency micro-vibration test sample hologram is obtained; light path interference is completed through four beam splitters, so that a condition that reflective stray light enters the hologram caused by self parts in the light path can be avoided; and the hologram is high in quality.
Owner:XIAN UNIV OF SCI & TECH

A hyperspectral remote sensing image classification method based on six-layer convolutional neural network and joint spectral-spatial information

The invention discloses a hyperspectral remote sensing image classification method based on a six-layer convolutional neural network combined with spectral-spatial information. Hyperspectral remote sensing image data of a certain number of bands is selected, and spatial mean filtering is performed on the two-dimensional image data of each selected band. The multi-band data corresponding to each pixel is then converted from a one-dimensional vector into a square matrix, so that each pixel corresponds to one square matrix. A six-layer deep learning classifier is then designed, comprising an input layer, a first convolutional layer, a max pooling layer, a second convolutional layer, a fully connected layer and an output layer. The square matrix data of a number of pixels is extracted as the training set and used to train the classifier; the square matrix data of a number of pixels is extracted as the test set and input into the trained classifier, and the classification results it outputs are compared with the true classification information to verify the performance of the classifier. The classification accuracy of the invention is higher than that of the existing 5-CNN method.
Owner:CENT SOUTH UNIV
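
The two preprocessing steps named in the abstract, per-band mean filtering and reshaping each pixel's band vector into a square matrix, can be sketched as follows. The sketch assumes a 3x3 window with averaging over in-bounds neighbors at the image border and a perfect-square number of selected bands; the abstract fixes neither choice. Function names are hypothetical.

```python
import math


def mean_filter_3x3(band):
    """Replace each pixel of a 2-D band with the mean of its in-bounds
    3x3 neighborhood (assumed window size and border handling)."""
    rows, cols = len(band), len(band[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = [
                band[i][j]
                for i in range(max(0, r - 1), min(rows, r + 2))
                for j in range(max(0, c - 1), min(cols, c + 2))
            ]
            out[r][c] = sum(window) / len(window)
    return out


def spectrum_to_square(spectrum):
    """Reshape one pixel's band vector of length n*n into an n-by-n matrix."""
    n = math.isqrt(len(spectrum))
    if n * n != len(spectrum):
        raise ValueError("number of bands must be a perfect square")
    return [spectrum[i * n:(i + 1) * n] for i in range(n)]


band = [[0, 0, 0], [0, 9, 0], [0, 0, 0]]
print(mean_filter_3x3(band)[1][1])          # 1.0
print(spectrum_to_square(list(range(9))))   # [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
```

The square matrices produced this way are what a two-dimensional convolutional layer can consume in place of the original one-dimensional spectrum.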

Self-balancing mounting device and method for large ground penetrating radar

The invention discloses a self-balancing mounting device and method for a large ground penetrating radar. The self-balancing mounting device comprises a support (1), a plurality of mounting steel wire ropes (2) and a tensioning device (3). Two mounting steel wire ropes (2), each with end heads at both ends, are arranged on the support (1), and the mounting steel wire ropes (2) are connected to the tensioning device (3). In the method, the tensioning device (3) tensions the mounting steel wire ropes (2) transversely, and the radar is lifted up to be connected with the support (1) as the ropes are tensioned. A rubber block (5) for shock insulation is arranged on the mounting steel wire rope (2), which protects the radar and reduces interference with the data. The ground penetrating radar can be mounted by a single person, which reduces labor intensity, and all four hanging lugs can be mounted with a single transverse tensioning, saving time and labor.
Owner:广州诚安路桥检测有限公司

Financial event extraction method based on combination of pre-training language and deep learning model

Inactive | CN113934909A | Save time cost and labor cost | Reduce data noise | Finance | Web data indexing | Engineering | Data mining
The invention provides a financial event extraction method based on combination of a pre-training language and a deep learning model. The method comprises the following operation steps: S1, data acquisition and preprocessing: crawling public financial event text corpora using a web crawler, and performing text preprocessing on the original financial event text corpora. In the method, financial event types and templates are defined by combining machine learning with domain knowledge, which greatly reduces the time and labor cost of defining events manually; large-scale automatic labeling of financial event corpus data is realized by distant supervision; data noise is effectively reduced by a heuristic pruning method; and the gap left by the lack of large-scale corpus data in the current financial event extraction field is filled.
Owner:中电积至(海南)信息技术有限公司