59 results about "Clustered data" patented technology

Clustering data is the process of grouping items so that items in a group (cluster) are similar and items in different groups are dissimilar. After data has been clustered, the results can be analyzed to see if any useful patterns emerge. For example, clustered sales data could reveal which items are often...

Enhancing item retrieval using fitment match

Pending | EP4769284A1 | Commerce | Clustered data | Data pack

Some aspects relate to technologies for performing item retrieval on a listing platform using clusters of interchangeable parts formed using fitment match. In accordance with some aspects, item data is accessed for part item listings on a listing platform, where the item data for each part item listing includes fitment data for each of a number of different fitments. For each part item listing, a fitment hash is generated using fitment data for each fitment of the part item listing. The part item listings are clustered based on overlap of fitment hashes for the part item listings. Cluster data is stored for the part item listing clusters. The cluster data for each part item listing cluster associates a cluster identifier and an item listing identifier for each part item listing in the part item listing cluster. The cluster data can be leveraged to perform item retrieval for the listing platform.

Owner:EBAY INC
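
The fitment-match abstract above describes hashing each fitment of a listing and clustering listings by the overlap of their hash sets. A minimal sketch of that idea follows; the fitment fields (make, model, year), the Jaccard overlap measure, the 0.5 threshold, and the function names are all assumptions made for illustration and are not taken from the patent.

```python
import hashlib
from collections import defaultdict


def fitment_hash(fitment: dict) -> str:
    """Hash one fitment record into a stable digest (field names are hypothetical)."""
    # Sort the keys so the same fitment always produces the same hash.
    canonical = "|".join(f"{key}={fitment[key]}" for key in sorted(fitment))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def cluster_listings(listings: dict, min_overlap: float = 0.5) -> dict:
    """Group listings whose fitment-hash sets overlap by at least min_overlap (Jaccard)."""
    hashes = {lid: {fitment_hash(f) for f in fits} for lid, fits in listings.items()}
    parent = {lid: lid for lid in listings}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    ids = sorted(listings)
    for i, a in enumerate(ids):
        for b in ids[i + 1:]:
            union = hashes[a] | hashes[b]
            if union and len(hashes[a] & hashes[b]) / len(union) >= min_overlap:
                parent[find(b)] = find(a)

    groups = defaultdict(list)
    for lid in ids:
        groups[find(lid)].append(lid)
    # Cluster data: cluster identifier -> listing identifiers in that cluster.
    return {f"cluster-{n}": members for n, members in enumerate(groups.values())}
```

A retrieval step could then look up the cluster identifier of a matched listing and return the other listings in the same cluster as interchangeable parts.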

A clinical term knowledge graph construction method

Pending | CN122314433A | Clustered data | Medicine

This invention relates to the field of data processing technology and specifically discloses a method for constructing a clinical terminology knowledge graph. The method includes: acquiring multi-source clinical data and medical domain dictionary data; determining the clinical entity set and data block dictionary set for each clinical data block of each medical clinical data from each data source; determining multiple business domain clustering data for each business domain label and the clustered medical entity set for each business domain clustering data; calculating the statistical entity relationship set and graph entity relationship set between every two medical entities in the clustered medical entity set of each business domain label; and constructing a clinical knowledge graph. This method enables deep collaboration between multi-source clinical data and medical dictionaries, improving the professionalism and reliability of medical entity associations, balancing the independence of business domain knowledge with the interoperability of global knowledge, and providing high-quality core support for intelligent medical assistance, medical research, and clinical knowledge transmission.

Owner:JINAN BEISEN ELECTRONIC INFORMATION CO LTD

Data clustering processing methods, devices, computer equipment, and storage media

Active | CN114358110B | Clustered data | Engineering

This application provides a method, apparatus, computer device, and storage medium for clustering data. The method includes: acquiring sample feature data of M samples, wherein the sample feature data includes N sample features, and each sample feature corresponds to a sample feature type; determining multiple outlier sample groups based on the sample features of Q samples under P sample feature types, wherein each sample feature under the i-th sample feature type in the i-th outlier sample group satisfies the outlier condition; wherein P is an integer greater than 1 and less than or equal to N, i is an integer greater than 1 and less than or equal to P, and Q is an integer greater than 1 and less than or equal to M; calculating the cluster centers of each outlier sample group based on the sample features included in each outlier sample group to obtain P first cluster centers; and performing clustering processing on the classification features of R classification samples based on the P first cluster centers to obtain clustering results, which can improve the accuracy of data clustering.

Owner:TENCENT TECHNOLOGY (SHENZHEN) CO LTD

Text data processing method, apparatus, device, storage medium, and program product

Pending | CN122285907A | Improve interpretability | Efficient integration | Clustered data | Data pack

This disclosure provides a text data processing method, apparatus, device, storage medium, and program product, relating to the field of big data processing. The method includes: acquiring incremental text data and historical clustering data; the incremental text data includes at least one text data item; performing incremental clustering processing on the text data items based on historical clusters in the historical clustering data, matching historical clusters or creating new clusters for each text data item, and generating corresponding cluster topics for clusters that meet preset conditions, thereby obtaining incremental clustering results; and updating historical clustering data according to the incremental clustering results. This method can effectively integrate newly added text data, dynamically adjust the clustering results as data is updated, avoid the high overhead of full re-clustering, and enhance the interpretability of clustering results by automatically generating cluster topics. It achieves efficient dynamic clustering of incremental text data, improving the automation and intelligence level of data management.

Owner:BEIJING XIAOMI MOBILE SOFTWARE CO LTD +1
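
The incremental clustering described in the text data processing abstract above (match each new item to a historical cluster or create a new one, then generate a topic for clusters that meet a condition) can be sketched roughly as follows. The bag-of-words vectors, cosine similarity, the 0.5 threshold, and the "two most frequent words" topic are stand-ins chosen for illustration; the abstract does not say how similarity or topics are computed.

```python
import math
from collections import Counter


def vectorize(text: str) -> Counter:
    """Very rough bag-of-words vector (an assumption, not the patent's representation)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def incremental_cluster(history: list, new_texts: list,
                        threshold: float = 0.5, topic_min_size: int = 2) -> list:
    """Update `history` in place. Each cluster is a dict with centroid, texts, topic."""
    for text in new_texts:
        vec = vectorize(text)
        best, best_sim = None, 0.0
        for cluster in history:
            sim = cosine(vec, cluster["centroid"])
            if sim > best_sim:
                best, best_sim = cluster, sim
        if best is not None and best_sim >= threshold:
            # Matched a historical cluster: fold the item into it.
            best["centroid"].update(vec)
            best["texts"].append(text)
        else:
            # No close match: open a new cluster for this item.
            best = {"centroid": Counter(vec), "texts": [text], "topic": None}
            history.append(best)
        if len(best["texts"]) >= topic_min_size:
            # Placeholder topic: the two most frequent words in the cluster.
            best["topic"] = " ".join(t for t, _ in best["centroid"].most_common(2))
    return history
```

Only the clusters touched by new items are updated, which is the point of the incremental approach: the historical data is never re-clustered in full.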

An elevator Internet of Things data management and backup method and system under a hybrid cloud architecture

Active | CN121397092B | Plaintext | Clustered data

The application relates to the technical field of elevator Internet of Things data processing and discloses an elevator Internet of Things data management and backup method and system under a hybrid cloud architecture. The method comprises the following steps: collecting raw data streams from elevator terminal equipment on the edge side and processing them to generate preprocessed data; identifying data sensitivity according to a security rule library, encrypting high- and medium-sensitivity data, and keeping low-sensitivity data in plaintext; dividing the data into three backup priority sets according to business criticality and, after clustering based on access frequency and storage occupation, distributing them to an edge-side local backup unit, a private cloud, and a public cloud backup cluster, respectively; monitoring storage load, link state, and backup progress, and scheduling resources when an exception occurs; and regularly running recovery drills on the backup data. Through hierarchical data processing, differentiated storage, dynamic monitoring and scheduling, and regular drills, the application supports intelligent management of large-scale elevator cluster data.

Owner:SHANDONG TIWANG INFORMATION TECHNOLOGY CO LTD

Mechanically intelligent incubator-tight blue light irradiation data management system

Active | CN121177664B | Clustered data | Data pack

The present application belongs to the technical field of medical apparatus and instruments and discloses a closed-incubator blue light irradiation data management system based on mechanical intelligence, which comprises: a state cluster data module, which collects multi-source heterogeneous data, corrects the clock deviation between the incubator and a preset biological sensor through a space-time alignment protocol, dynamically adjusts the sampling rate of the multi-source heterogeneous data based on the infant physiological data in the time-aligned data, and outputs a time-stamped state cluster data packet; and a hierarchical extraction module, which constructs a three-level feature processing pipeline based on the state cluster data packet, including a basic layer that calculates basic statistics through a sliding window, a logic layer that generates a temperature-light effect index and a sealing safety coefficient from the basic statistics in combination with the multi-source heterogeneous data, and a decision layer that establishes a treatment efficiency prediction matrix based on the temperature-light effect index and the sealing safety coefficient and predicts the associated trend of blue light irradiation and bilirubin metabolism. The system improves the safety of infant treatment and the stability of its curative effect.

Owner:AFFILIATED CHILDRENS HOSPITAL OF CAPITAL INST OF PEDIATRICS

Patent clustering method and system based on pre-trained heterogeneous graph neural network

Active | CN121256050B | Clustered data | Message delivery

This invention discloses a patent clustering method and system based on a pre-trained heterogeneous graph neural network, relating to the field of data analysis technology. In the pre-training stage, a patent heterogeneous graph network is constructed based on patent data from the target domain; based on the current heterogeneous graph network, message passing is performed through meta-paths to obtain patent meta-path representations; Transformer attention scores are calculated based on these representations; and the patent network is pre-trained based on the aggregated meta-path representations. In the downstream fine-tuning stage, patent meta-path representations are obtained based on patents from another domain, consistent with the pre-training stage; downstream meta-path representations are aggregated based on the pre-trained attention scores, and cue vectors are used to enhance the sensitivity of downstream clustering tasks; the patents are clustered based on the aggregated representations; and the patent clustering data is sent to the target terminal device in sorted order. This invention combines pre-training and cue learning to enhance low-resource adaptability, enabling accurate patent clustering and improving the efficiency of user domain technology retrieval and analysis.

Owner:GUANGDONG UNIV OF TECH

Electronic device, recording medium, and method for clustering data related to battery charging pattern thereof

Pending | EP4769184A1 | Relational databases | Special data processing applications | Clustered data | Cluster algorithm

Provided is an electronic device for clustering data related to a battery charging pattern. The electronic device may be configured to: acquire a plurality of pieces of battery charging data for a plurality of vehicles; generate a plurality of features related to a battery charging pattern on the basis of the plurality of pieces of battery charging data; classify the plurality of features into a preset number of data sets by sampling the plurality of features; determine the number of clusters corresponding to each of the preset number of data sets on the basis of a clustering algorithm and an index for evaluating a clustering result; set a reference number of clusters corresponding to the plurality of features on the basis of the optimal number of clusters of each of the preset number of data sets; and cluster the plurality of features into clusters of the reference number of clusters.

Owner:LG ENERGY SOLUTION LTD

MPP cluster data processing method and system

Pending | CN122346485A | Clustered data | Query plan

The application relates to an MPP cluster data processing method and system, wherein a local and remote MPP cluster master node and segment node are both deployed with a cloudberry_fdw plug-in of an Apache Cloudberry open source project. The local master node parses a request to generate a parallel query plan and sends a parallel cursor instruction; the local segment node and the remote segment node are directly connected to pull / write data in parallel, and the remote cluster responds cooperatively. The scheme completely breaks through the master node bottleneck, improves the data transmission efficiency by more than 3 times, guarantees that the data consistency meets the ACID characteristics, automatically adapts cluster expansion, adapts to new scenarios such as high-dimensional vectors and data lakes, and greatly reduces operation and maintenance costs.

Owner:HASHDATA LTD

Modular lead-carbon battery monitoring system

Pending | CN122410349A | Clustered data | Data stream

The application discloses a modular lead-carbon battery monitoring system and relates to the technical field of battery management. By combining a hierarchical distributed architecture with coordinated data and control flows, it addresses the rigidity of existing systems and the difficulty of expanding and maintaining them. The battery sampling module is bound directly to the lead-carbon battery cells and performs voltage and temperature acquisition and equalization. The battery main control module aggregates the data of its cluster and completes cluster-level SOC/SOH estimation, forming a first-level edge computing node. The system general control module acts as a local hub and carries out cross-cluster analysis and advanced diagnosis. The modules are connected by a standardized communication bus and network, so that the number of battery main control modules can grow linearly with the number of battery clusters and the physical topology of the system corresponds strictly to the cell-cluster logical topology of the battery system. Data flow converges from the bottom up and control flow is issued precisely from the top down, which allows the monitoring scale to be expanded flexibly and faulty modules to be maintained online quickly, overcoming the difficult reconstruction and high maintenance cost of centralized systems.

Owner:YANTAI JINCHAO YUKE STORAGE BATTERY CO LTD

A cluster data adaptive sampling method and apparatus

Pending | CN122309289A | Clustered data | Data acquisition

This application provides a cluster data adaptive sampling method and apparatus, belonging to the field of server data acquisition. The method includes: collecting real-time pressure values corresponding to multiple load indicators of the cluster; determining the pressure inflection point corresponding to each load indicator based on historical real-time pressure values; non-linearly normalizing the real-time pressure values according to the pressure inflection point of each load indicator to obtain standard pressure values corresponding to each load indicator; fusing the standard pressure values of each load indicator to obtain a cluster load value; predicting the future load of the cluster based on the cluster load value at a first possible time; and adjusting the sampling frequency of the cluster data based on the future load, wherein the adjusted sampling frequency change time is earlier than the actual load change time of the cluster. The cluster data adaptive sampling method and apparatus provided in this application can pre-adjust the sampling frequency before load changes.

Owner:ANQING (TIANJIN) COMPUTER CO LTD +1
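
The adaptive sampling abstract above chains four steps: normalize each load indicator non-linearly around its pressure inflection point, fuse the normalized values into one cluster load value, predict the future load, and adjust the sampling frequency ahead of the change. The sketch below is one possible reading of that chain. The logistic normalization, the weighted-average fusion, the two-point linear extrapolation, and the rule that higher load means a shorter sampling interval are all assumptions; the abstract names none of them.

```python
import math


def normalize(value: float, inflection: float, steepness: float = 4.0) -> float:
    """Logistic squashing centered on the inflection point (returns 0.5 at the inflection)."""
    if inflection <= 0:
        raise ValueError("inflection must be positive")
    return 1.0 / (1.0 + math.exp(-steepness * (value - inflection) / inflection))


def cluster_load(readings: dict, inflections: dict, weights: dict) -> float:
    """Fuse the normalized indicators into one load value with a weighted average."""
    total = sum(weights[name] for name in readings)
    return sum(weights[name] * normalize(readings[name], inflections[name])
               for name in readings) / total


def predict_next(load_history: list) -> float:
    """Extrapolate linearly from the last two load values, clamped to [0, 1]."""
    if len(load_history) < 2:
        return load_history[-1]
    step = load_history[-1] - load_history[-2]
    return min(1.0, max(0.0, load_history[-1] + step))


def sampling_interval(predicted_load: float, min_s: float = 1.0, max_s: float = 60.0) -> float:
    """Map predicted load to a sampling interval: higher load, shorter interval."""
    return max_s - (max_s - min_s) * predicted_load
```

Because the interval is derived from the predicted load rather than the current one, the sampling rate changes before the load does, which matches the ordering the abstract claims.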

Elevation detection and lofting marking method and system based on RTK and binocular vision

Pending | CN122345379A | Clustered data | Visual perception

This invention relates to the field of visual measurement technology, and more particularly to a method and system for elevation detection and stakeout marking based on RTK and binocular vision. The method involves dividing the construction area into grids to form the travel path of a stakeout trolley, with grid vertices designated as stakeout target points. A binocular camera acquires real-time images of the construction environment, and the trolley's travel path is adjusted based on obstacle point cloud cluster data. Once the trolley enters a distance threshold within the stakeout point, the RTK positioning error is calculated. World coordinate system point cloud data of the target stakeout area is acquired, and the stakeout action is performed. Images of the stakeout points are acquired using the binocular camera after stakeout, and the stakeout effect is evaluated. The earthwork volume is calculated after all four vertices of the grid unit have been staked. Through real-time data from the binocular camera, RTK module, and IMU unit, reliable obstacle avoidance during travel, millimeter-level perception of the actual ground elevation at the stakeout point, and quality verification after stakeout are achieved, along with real-time calculation of the earthwork volume during the stakeout process.

Owner:GUANGDONG UNIV OF TECH +1

A method and system for joint inspection control of a drone and a robot dog

Pending | CN122116503A | Checking time patrols | Internal combustion piston engines | Clustered data | Simulation

The application provides a joint inspection control method and system for an unmanned aerial vehicle and a robot dog. First, an inspection reference model and unmanned aerial vehicle inspection data are acquired, initial task data is obtained using a comparison strategy, and task cluster data is obtained from the initial task data, so that macroscopic abnormalities found by the unmanned aerial vehicle can be perceived in real time and converted into tasks for the robot dog. Next, robot dog waiting data is obtained from the task cluster data, and a scheduling objective function is constructed and solved to obtain task instruction data, which the robot dog and the unmanned aerial vehicle execute; when abnormalities occur at multiple points concurrently, this achieves optimal task allocation for multi-machine cooperation and improves the utilization of inspection resources. Update data is then collected by the robot dog and the unmanned aerial vehicle, and enhanced update data is obtained. Finally, the inspection reference model is updated with the enhanced update data, so that it stays current in real time, gives operation and maintenance personnel a more reliable and comprehensive basis, and saves subsequent operation and maintenance resources.

Owner:URUMQI TIANYAO WEIYE INFORMATION TECH SERVICE CO LTD

An advertisement delivery pace control method based on crowd behavior prediction

Pending | CN122288794A | Clustered data | Semantic clustering

A method for controlling the pace of advertising delivery based on group behavior prediction, relating to the field of advertising control, is proposed. It involves generating topic cluster data through semantic clustering of cross-platform public opinion data, and generating cross-platform topic alignment data; identifying bridging proxy data that acts as a propagation carrier between platforms based on the cross-platform topic alignment data, and calculating bridging strength data; configuring the bridging strength data into a preset propagation dynamics model to generate diffusion prediction results data; combining delivery log data and cross-platform topic alignment data to generate delivery coupling degree data; and generating delivery pace control plan data based on the delivery coupling degree data when the negative public opinion spillover risk data is greater than or equal to a first preset threshold and the predicted arrival time data falls within a preset monitoring window period. This improves the stability and accuracy of advertising delivery control.

Owner:SHANGHAI INTERNATIONAL STUDIES UNIVERSITY

A driving condition construction method based on a generative adversarial network

Active | CN121597984B | Clustered data | Algorithm

The application relates to the field of vehicle driving condition construction, and particularly discloses a driving condition construction method based on a generative adversarial network, which comprises the following steps: collecting vehicle driving data on expressways and urban roads and preprocessing; clustering by means of the elbow rule and K-means algorithm; extracting features from each clustering data subset and analyzing speed probability distribution by means of kernel density estimation to determine corresponding distribution models and parameters; constructing a generative adversarial network; training the generative adversarial network; and processing curve transition problems by means of a linear interpolation method and integrating into complete driving condition curves. The application improves sample generation efficiency, simultaneously gets rid of excessive dependence on labeled data, and generates condition data which is more in line with actual driving scenes, effectively overcoming the defects of a Markov model.

Owner:HEFEI UNIV OF TECH
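
The last step of the driving condition abstract above handles the transitions between generated curve pieces with linear interpolation before joining them into a complete driving condition curve. A small sketch of that splicing step is shown below. It assumes each segment is a list of speed samples at a fixed time step and that a hypothetical `max_step` bounds the allowed speed change between consecutive samples; neither detail comes from the abstract.

```python
import math


def join_segments(segments: list, max_step: float = 2.0) -> list:
    """Concatenate speed segments, bridging large jumps at the junctions.

    Wherever the end of the curve so far and the start of the next segment
    differ by more than max_step, evenly spaced interpolated points are
    inserted so that no junction step exceeds max_step.
    """
    curve = []
    for segment in segments:
        if not segment:
            continue
        if curve:
            start, end = curve[-1], segment[0]
            # Number of intermediate points needed to keep each step <= max_step.
            extra = math.ceil(abs(end - start) / max_step) - 1
            for i in range(1, extra + 1):
                curve.append(start + (end - start) * i / (extra + 1))
        curve.extend(segment)
    return curve
```

Only the junctions are smoothed here; steps inside a segment are left as generated.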

New energy cluster weak network adaptive data acquisition method, device, equipment and medium

Pending | CN122456745A | Clustered data | Dual mode

The application provides a new energy cluster weak network adaptive data acquisition method, device, equipment and medium, comprising: real-time monitoring network quality parameters through a dual-mode gateway deployed at a new energy station side, and dividing network states according to preset threshold values; dynamically adjusting data acquisition frequency based on the current network state; enabling corresponding data storage modes according to the network state, adopting a dual mode of real-time transmission and local backup in network fluctuation or weak network, switching to full local cache when the network is disconnected, and executing a preset local device control logic; after the network is restored, according to the received data time stamp fed back by the centralized control center, the cached data is transmitted to the centralized control center according to the priority for data consistency verification, and after the verification is passed, the dual-mode gateway is notified to clean up the confirmed cached data, so as to solve the problems of reliable acquisition of new energy cluster data in a complex network environment, safe operation of equipment, and ensuring data complete synchronization after the network is restored.

Owner:HUANENG RENEWABLES CORP LTD HEBEI BRANCH
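
The weak-network abstract above maps monitored network quality to a network state, and the state to an acquisition frequency and a storage mode. One way to picture that mapping is a small lookup, sketched below. The four state names, the latency and packet-loss thresholds, and the intervals are hypothetical values for illustration; the abstract only says that preset thresholds exist.

```python
from enum import Enum


class NetState(Enum):
    NORMAL = "normal"
    FLUCTUATING = "fluctuating"
    WEAK = "weak"
    DISCONNECTED = "disconnected"


def classify(latency_ms: float, loss_pct: float, connected: bool = True) -> NetState:
    """Divide network states by preset thresholds (threshold values are assumptions)."""
    if not connected:
        return NetState.DISCONNECTED
    if latency_ms > 500 or loss_pct > 10:
        return NetState.WEAK
    if latency_ms > 150 or loss_pct > 2:
        return NetState.FLUCTUATING
    return NetState.NORMAL


# State -> (acquisition interval in seconds, storage mode). Values are illustrative.
POLICY = {
    NetState.NORMAL: (1.0, "realtime"),
    NetState.FLUCTUATING: (5.0, "realtime+local_backup"),
    NetState.WEAK: (15.0, "realtime+local_backup"),
    NetState.DISCONNECTED: (30.0, "local_cache"),
}


def plan(latency_ms: float, loss_pct: float, connected: bool = True) -> tuple:
    """Return the acquisition interval and storage mode for the current network quality."""
    return POLICY[classify(latency_ms, loss_pct, connected)]
```

The resynchronization after the network is restored (timestamp feedback, prioritized upload, verification, cache cleanup) is outside this sketch.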

Cross-cluster data modification method, device, apparatus and storage medium

Pending | CN122240725A | Database updating | Database distribution/replication | Clustered data | Data set

This application discloses a cross-cluster data modification method, apparatus, device, and storage medium, relating to the field of data management technology. The method includes: responding to a user-submitted data modification request and determining the data distribution corresponding to the data to be modified in the data modification request; if the data distribution involves an archived data cluster, obtaining the syntax mapping relationship between the archived data cluster and the current data cluster; mapping the data modification request based on the syntax mapping relationship to obtain a data modification task that matches the syntax of the archived data cluster; and modifying the data in the archived data cluster according to the data modification task. Compared to existing methods involving multiple cross-cluster synchronizations, this application can directly modify data in the archived data cluster according to a data modification task with an adapted syntax, reducing resource consumption caused by frequent cross-cluster synchronization and thus improving data modification efficiency.

Owner:CHINA MERCHANTS BANK

Vertical domain natural language large model data processing method and device, equipment and medium

Pending | CN122114153A | Digital data information retrieval | Biological models | Clustered data | Engineering

The present application relates to the technical field of data processing and provides a vertical domain natural language large model data processing method, device, equipment, and medium. The method comprises: clustering user question-and-answer data according to the task types in the target scene to obtain the clustering data corresponding to each task type; determining the call volume and scoring condition of the clustering data corresponding to each task type; determining the classification condition of the clustering data according to the call volume and historical scoring condition; and determining a data processing strategy according to the classification condition and processing the clustering data according to that strategy. The present application links data and models efficiently and enables specialized model training on the data through data clustering, which significantly improves model iteration efficiency.

Owner:CHINA MOBILE JIUTIAN ARTIFICIAL INTELLIGENCE TECHNOLOGY (BEIJING) CO LTD +1

A privacy-sensitive federated learning method for differential forgetting based on trajectory clusters

Pending | CN122132803A | Semantic analysis | Biological models | Clustered data | Algorithm

A differential forgetting method for federated learning based on trajectory cluster privacy sensitivity includes the following steps: After cleaning and standardizing trajectory data, similar trajectories are clustered into trajectory clusters using a clustering algorithm; each trajectory cluster is quantitatively evaluated for privacy sensitivity from multiple dimensions, including semantic features, spatiotemporal features, and individual identifier correlation; model training is performed within a federated learning framework, and the contribution of each trajectory cluster to the model parameters is calculated; based on the privacy sensitivity evaluation results and contribution calculation results of the trajectory clusters, differential forgetting trigger conditions and forgetting intensity are determined; the client eliminates or reduces the influence of corresponding trajectory clusters in the local model according to the differential forgetting strategy and collaboratively updates the global model. This invention achieves accurate and efficient forgetting of high privacy-sensitive trajectory cluster data; minimizes the impact on the overall model performance, and improves the privacy protection level and model practicality of federated learning in trajectory data application scenarios.

Owner:HUAIYIN INSTITUTE OF TECHNOLOGY

Clustering method and system for vehicle line of crowdsourced map, and storage medium

Active | US12669348B2 | Clustered data | Current sample

Described herein is a method including S1: determining to-be-clustered data of different vehicle lines; S2: determining a mean Z value list and a centroid coordinate list of the vehicle line; S3: randomly obtaining a sample from the to-be-clustered data list of the vehicle line, and determining an E neighborhood sample list of a current sample; S4: computing a difference between a mean Z value of the current sample and a mean Z value of an E neighborhood sample, and putting samples that conform to distance threshold determination into a first list; S5: using the current sample as a nucleus seed if a length of the first list is greater than a minimum clustering sample number threshold; S6: repeating steps S3 to S5 until all nucleus seeds that meet requirements and data associated with the nucleus seeds are found; S7: clustering all the nucleus seeds; and S8: merging all the clusters.

Owner:CHONGQING CHANGAN AUTOMOBILE CO LTD
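
Steps S3 to S8 of the vehicle line abstract above read like a density-based clustering in which a sample becomes a nucleus (core) seed when enough neighbors are close both in position and in mean Z value. The sketch below follows that reading. It is a simplification under stated assumptions: each sample is reduced to a centroid `(x, y)` plus a mean Z value, samples are visited in order rather than randomly, and the parameter names `eps`, `z_tol`, and `min_pts` are hypothetical.

```python
import math


def cluster_lines(samples: list, eps: float, z_tol: float, min_pts: int) -> list:
    """Label each (x, y, mean_z) sample with a cluster index, or -1 if unclustered."""
    n = len(samples)

    def neighbors(i: int) -> list:
        xi, yi, zi = samples[i]
        # Neighborhood by centroid distance, filtered by mean-Z difference (S3, S4).
        return [j for j in range(n)
                if j != i
                and math.hypot(samples[j][0] - xi, samples[j][1] - yi) <= eps
                and abs(samples[j][2] - zi) <= z_tol]

    neigh = [neighbors(i) for i in range(n)]
    # S5: a sample is a nucleus seed when its filtered list is longer than the threshold.
    core = [len(nb) > min_pts for nb in neigh]

    labels = [-1] * n
    cluster_id = 0
    for i in range(n):
        if not core[i] or labels[i] != -1:
            continue
        # S7, S8: grow a cluster from this seed, merging any seeds reachable from it.
        labels[i] = cluster_id
        stack = [i]
        while stack:
            current = stack.pop()
            for j in neigh[current]:
                if labels[j] == -1:
                    labels[j] = cluster_id
                    if core[j]:
                        stack.append(j)
        cluster_id += 1
    return labels
```

The mean-Z filter is what separates lines that overlap in plan view but lie at different heights, such as a ramp passing over a road.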

Multi-baseline InSAR phase unwrapping method, system, device, and medium

Active | CN120446894B | Clustered data | Data set

The application discloses a multi-baseline InSAR phase unwrapping method, system, device, and medium, wherein the method comprises the following steps. Constructing a to-be-clustered data set: acquiring intercept information corresponding to each pixel from the interferograms corresponding to different vertical baselines, combining it with the position information of each pixel to form the pixel's multi-dimensional clustering features, and taking the pixels as to-be-clustered targets to obtain the to-be-clustered data set. Wavelet clustering processing: clustering the to-be-clustered data set based on wavelet clustering. Cluster result correction: correcting the cluster labels of pixels in each noise cluster. Cluster-by-cluster phase unwrapping: based on the corrected cluster distribution, calculating an ambiguity vector for each cluster using a closed-form solution or a sparse-TSPA method, and calculating the absolute phase of each pixel in the corresponding cluster from that cluster's ambiguity vector. When processing large interferograms, the application has advantages over existing multi-baseline phase unwrapping methods in efficiency, accuracy, and adaptability to the baseline ratio.

Owner:CHANGSHA UNIVERSITY OF SCIENCE AND TECHNOLOGY

Log regulation method, medium, product and equipment of database cluster

Pending | CN122451036A | Clustered data | Engineering

The application provides a log regulation method, medium, product and equipment of a database cluster. The method comprises: sampling leaf nodes of a pairing heap in a backup database of the database cluster, the pairing heap being used to receive prewrite logs sent by a master database in the database cluster to the backup database, the prewrite logs comprising business logs and special logs, the special logs being used to advance a playback version number of the backup database; in the case that all the prewrite logs received by the leaf nodes are special logs and the duration reaches a preset time threshold, controlling the backup database to send idle control information to the master database; and the master database, in response to the idle control information, entering a business idle state and stopping generation and transmission of the special logs. This method can effectively prevent continuous generation of meaningless special logs in the master database idle stage, greatly improve the resource utilization rate of the database cluster, and reduce the operation and maintenance cost of the cluster on the premise of guaranteeing the consistency of the cluster data and the basic function of log playback.

Owner:CETC JINCANG (BEIJING) TECH CO LTD
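
The idle-detection condition in the log regulation abstract above can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented implementation: the pairing heap is not modeled, the sampled leaf-node logs are passed in as a plain list of kind strings, and the class name `IdleDetector`, the constants and the treatment of an empty sample as "not idle" are all hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Optional

BUSINESS = "business"  # hypothetical tag for business logs
SPECIAL = "special"    # hypothetical tag for special logs

@dataclass
class IdleDetector:
    """Tracks how long the sampled logs have consisted only of special logs."""
    threshold_seconds: float
    _special_only_since: Optional[float] = None

    def observe(self, sampled_kinds: Iterable[str], now: float) -> bool:
        """Return True when the backup should send idle control information."""
        kinds = list(sampled_kinds)
        # Assumption: an empty sample does not count as "all special".
        all_special = bool(kinds) and all(k == SPECIAL for k in kinds)
        if not all_special:
            self._special_only_since = None  # a business log resets the timer
            return False
        if self._special_only_since is None:
            self._special_only_since = now
        return now - self._special_only_since >= self.threshold_seconds

# Hypothetical usage with a 5-second threshold and explicit timestamps.
detector = IdleDetector(threshold_seconds=5.0)
signals = [
    detector.observe([SPECIAL, SPECIAL], now=0.0),
    detector.observe([SPECIAL], now=3.0),
    detector.observe([SPECIAL], now=5.0),
    detector.observe([BUSINESS, SPECIAL], now=6.0),
]
```

Passing the timestamp in explicitly keeps the sketch deterministic; a real backup would presumably read a clock at each sampling pass.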

Data processing method and device, electronic equipment and storage medium

Active | CN115733614B | User identity/authority verification | Geographical information databases | Clustered data | User Privilege

The application relates to the technical field of high-precision map data processing and provides a data processing method and device, electronic equipment and a storage medium. The method comprises the following steps: receiving an access request for high-precision map data sent by a client; verifying the user authority based on an authentication token carried by the access request; and, if the user authority verification passes, clustering the high-precision map data in a database based on the access request and sending the resulting clustered data to the client, which displays the received clustered data. Clustering the high-precision map data based on the access request determines the data associated with the request, so that this data is displayed quickly and the display efficiency of the high-precision map data is improved.

Owner:ZHIDAO NETWORK TECH (BEIJING) CO LTD
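
The request flow in the map data abstract above (verify the token, then cluster only the data relevant to the request) might look like the sketch below. The abstract does not say how tokens are verified or which clustering algorithm is used, so everything here is an assumption: tokens are checked against a hypothetical in-memory set, the request is modeled as a bounding box plus a cell size, and clustering is a simple grid bucketing of 2D points.

```python
from collections import defaultdict

VALID_TOKENS = {"demo-token"}  # hypothetical token store

def handle_request(token, bbox, cell_size, points):
    """Verify the token, then grid-cluster the points inside the requested box.

    bbox      -- (min_x, min_y, max_x, max_y) taken from the access request
    cell_size -- grid cell edge length, e.g. derived from the client's zoom level
    points    -- iterable of (x, y) map features standing in for database rows
    """
    if token not in VALID_TOKENS:
        raise PermissionError("authentication token rejected")
    min_x, min_y, max_x, max_y = bbox
    cells = defaultdict(list)
    for x, y in points:
        if min_x <= x <= max_x and min_y <= y <= max_y:
            key = (int((x - min_x) // cell_size), int((y - min_y) // cell_size))
            cells[key].append((x, y))
    clusters = []
    for key in sorted(cells):
        members = cells[key]
        cx = sum(p[0] for p in members) / len(members)
        cy = sum(p[1] for p in members) / len(members)
        clusters.append({"center": (cx, cy), "count": len(members)})
    return clusters

# Hypothetical request: a 10 x 10 window with unit grid cells.
clusters = handle_request(
    "demo-token", (0, 0, 10, 10), 1.0,
    [(0.1, 0.1), (0.2, 0.3), (5.5, 5.5), (20, 20)],
)
```

Returning one center and a count per cell means the client receives far fewer objects than raw features, which is one plausible reading of the display-efficiency claim.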

Cross-cluster traffic optimization scheduling method and system for data processing workflow

Active | CN116339941B | Reduce overhead | Program initiation/switching | Energy efficient computing | Clustered data | Parallel computing

This invention discloses a cross-cluster traffic optimization scheduling method and system for data processing workflows. It provides a strategy for optimizing job scheduling of cross-cluster data processing workflows by analyzing bottleneck points and migrating jobs across clusters at those bottlenecks. The main steps include: constructing a branching-merging directed acyclic graph of the data processing workflow; performing a depth-first post-order traversal of the graph; analyzing the bottleneck points and bottleneck traffic of each job during the traversal, and determining the execution cluster for that job at the bottleneck point. This invention can reduce cross-cluster data traffic in data pipelines and container workflows.

Owner:COMP NETWORK INFORMATION CENT CHINESE ACADEMY OF SCI
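
The traversal step in the cross-cluster scheduling abstract above can be sketched as a depth-first post-order walk over the workflow DAG. The abstract does not define how bottleneck points or bottleneck traffic are computed, so the placement rule below is a stand-in heuristic and not the patented analysis: each job is simply placed in the cluster that supplies the largest share of its input traffic. The function names, the `deps`/`traffic`/`pinned` structures and the sample workflow are hypothetical.

```python
def post_order(deps, sinks):
    """Depth-first post-order: every upstream job appears before its consumer.

    deps  -- dict: job -> list of upstream jobs it reads from
    sinks -- final jobs of the workflow, used as traversal starting points
    """
    seen, order = set(), []

    def visit(job):
        if job in seen:
            return
        seen.add(job)
        for upstream in deps.get(job, []):
            visit(upstream)
        order.append(job)

    for sink in sinks:
        visit(sink)
    return order

def place_jobs(deps, traffic, pinned, sinks):
    """Assign each job to a cluster (stand-in heuristic, see lead-in).

    traffic -- dict: (upstream, job) -> data volume on that edge
    pinned  -- dict: source job -> cluster where its data already lives
    """
    placement = dict(pinned)
    for job in post_order(deps, sinks):
        if job in placement:
            continue
        volume = {}
        for upstream in deps.get(job, []):
            cluster = placement[upstream]
            volume[cluster] = volume.get(cluster, 0) + traffic[(upstream, job)]
        if not volume:
            raise ValueError(f"job {job!r} has no inputs and is not pinned")
        # Ties are broken by cluster name so the result is deterministic.
        placement[job] = max(sorted(volume), key=volume.get)
    return placement

# Hypothetical branching-merging workflow spanning clusters "A" and "B".
deps = {"merge": ["a", "b"], "a": ["src1"], "b": ["src2"]}
traffic = {("src1", "a"): 10, ("src2", "b"): 5, ("a", "merge"): 8, ("b", "merge"): 3}
placement = place_jobs(deps, traffic, {"src1": "A", "src2": "B"}, ["merge"])
```

Visiting upstream jobs first guarantees that every input of a job has a placement before the job itself is considered, which is what makes a single pass sufficient here.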

Enhancing product search with adaptive matching

Pending | CN122312243A | Clustered data | Data pack

Some aspects involve techniques for performing product retrieval on a publishing platform using clusters of interchangeable parts formed through adapter matching. According to some aspects, product data is accessed for each part product entry on the publishing platform, wherein the product data for each part product entry includes adapter data for each of a plurality of different adapters. For each part product entry, an adapter hash value is generated using the adapter data for each adapter of that part product entry. The part product entries are clustered based on the overlap between the adapter hash values for the part product entries. Cluster data is stored for each cluster of part product entries. The cluster data for each cluster of part product entries associates a cluster identifier with a product entry identifier for each part product entry in that cluster. The cluster data can be used to perform product retrieval on the publishing platform.

Owner:EBAY INC
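
The hash-and-cluster steps in the product search abstract above can be sketched as follows. The abstract only says that entries are clustered "based on the overlap" of adapter hash values, so the specifics here are assumptions: each adapter record is modeled as a hypothetical (year, make, model) tuple, hashed with SHA-256 after normalization, and two entries are linked when the Jaccard overlap of their hash sets reaches a threshold. Linked entries are merged with a small union-find.

```python
import hashlib
from itertools import combinations

def adapter_hash(adapter):
    """Hash one adapter record after normalizing case and whitespace."""
    text = "|".join(str(field).strip().lower() for field in adapter)
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def cluster_entries(entries, min_overlap=0.5):
    """Group product entries whose adapter-hash sets overlap enough.

    entries -- dict: product entry identifier -> list of adapter records
    Returns dict: cluster identifier -> sorted list of entry identifiers.
    """
    hashes = {eid: {adapter_hash(a) for a in adapters}
              for eid, adapters in entries.items()}
    parent = {eid: eid for eid in entries}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in combinations(entries, 2):
        union = hashes[a] | hashes[b]
        if union and len(hashes[a] & hashes[b]) / len(union) >= min_overlap:
            parent[find(a)] = find(b)

    groups = {}
    for eid in entries:
        groups.setdefault(find(eid), []).append(eid)
    ordered = sorted(groups.values(), key=min)
    return {f"cluster-{i}": sorted(members) for i, members in enumerate(ordered)}

# Hypothetical entries: p1 and p2 fit the same vehicles, p3 does not.
cluster_data = cluster_entries({
    "p1": [(2015, "Honda", "Civic"), (2016, "Honda", "Civic")],
    "p2": [(2015, "honda", "civic"), (2016, "Honda", "Civic")],
    "p3": [(2020, "Ford", "F-150")],
})
```

The returned mapping plays the role of the stored cluster data: given one entry identifier, retrieval can look up its cluster and return the other interchangeable entries. The pairwise comparison is quadratic and is only meant for illustration.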

A data item pairing method, device and storage medium

Active | CN115495549B | Natural language data processing | Special data processing applications | Clustered data | Datasheet

The application relates to a data item matching method, device and storage medium. The data item matching method comprises the following steps: obtaining data items to be matched from a data table to be matched; calculating the text similarity between the data items to be matched; clustering the data items according to the calculated text similarity; and matching the data items according to the clustering result. Because the data items are first clustered by text similarity, only a limited number of data items in each cluster need to be matched to complete the matching of all data items. This greatly improves the efficiency of data item matching and solves the problem of low data item matching efficiency in the prior art.

Owner:ZHEJIANG DAHUA TECH CO LTD
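
The cluster-then-match idea in the data item pairing abstract above can be sketched with the standard library. The abstract names neither a similarity measure nor a clustering algorithm, so the choices here are assumptions: similarity is `difflib.SequenceMatcher.ratio()`, clustering is a greedy pass that compares each item with the first member of each existing cluster, and only that first member (the representative) is matched, with the result copied to the rest of the cluster. The `match_representative` callback is a hypothetical stand-in for the expensive matching step.

```python
from difflib import SequenceMatcher

def similarity(a, b):
    """Case-insensitive text similarity in the range [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def cluster_items(items, threshold=0.8):
    """Greedy clustering: join the first cluster whose representative is similar."""
    clusters = []
    for item in items:
        for cluster in clusters:
            if similarity(item, cluster[0]) >= threshold:
                cluster.append(item)
                break
        else:
            clusters.append([item])  # no similar cluster found, start a new one
    return clusters

def propagate_matches(clusters, match_representative):
    """Match only each cluster's representative and reuse the result for all members."""
    result = {}
    for cluster in clusters:
        target = match_representative(cluster[0])
        for item in cluster:
            result[item] = target
    return result

# Hypothetical column names from a data table to be matched.
items = ["customer_name", "customer name", "ship_date", "ship date"]
clusters = cluster_items(items)
calls = []
matches = propagate_matches(clusters, lambda rep: calls.append(rep) or rep.upper())
```

With four items in two clusters, the matching callback runs twice instead of four times, which is the efficiency gain the abstract describes.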
