
40 results about "Quasi identifier" patented technology

Quasi-identifiers are pieces of information that are not of themselves unique identifiers, but are sufficiently well correlated with an entity that they can be combined with other quasi-identifiers to create a unique identifier.
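
As a minimal sketch of this definition, the following Python counts how many records are singled out by each attribute alone and by the attributes combined. The records and the attribute names `zip`, `birth_year` and `sex` are invented for illustration and do not come from any of the patents listed here.

```python
from collections import Counter

# Hypothetical toy records; none of these values come from a real dataset.
records = [
    {"zip": "02139", "birth_year": 1980, "sex": "F"},
    {"zip": "02139", "birth_year": 1980, "sex": "M"},
    {"zip": "02139", "birth_year": 1975, "sex": "F"},
    {"zip": "02140", "birth_year": 1980, "sex": "F"},
    {"zip": "02140", "birth_year": 1975, "sex": "M"},
    {"zip": "02140", "birth_year": 1975, "sex": "F"},
]

def unique_count(rows, attrs):
    """Number of rows whose values on `attrs` are shared with no other row."""
    counts = Counter(tuple(r[a] for a in attrs) for r in rows)
    return sum(1 for c in counts.values() if c == 1)

# No single attribute isolates anyone in this toy data ...
singles = {a: unique_count(records, [a]) for a in ("zip", "birth_year", "sex")}
# ... but the combination of all three isolates every record.
combined = unique_count(records, ["zip", "birth_year", "sex"])
```

In this toy data each attribute on its own leaves every record hidden among others, while the three together act as a unique identifier for all six records.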

Automated Determination of Quasi-Identifiers Using Program Analysis

A system and method for automated determination of quasi-identifiers for sensitive data fields in a dataset are provided. In one aspect, the system and method identifies quasi-identifier fields in the dataset based upon a static analysis of program statements in a computer program having access to sensitive data fields in the dataset. In another aspect, the system and method identifies quasi-identifier fields based upon a dynamic analysis of program statements in a computer program having access to sensitive data fields in the dataset. Once such quasi-identifiers have been identified, the data stored in such fields may be anonymized using techniques such as k-anonymity. As a result, the data in the anonymized quasi-identifier fields cannot be used to infer a value stored in a sensitive data field in the dataset.
Owner:TELCORDIA TECHNOLOGIES INC
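
The abstract above mentions k-anonymity as the anonymization step once quasi-identifier fields are known. A minimal sketch of a k-anonymity check follows. It assumes the quasi-identifier fields have already been determined, and the helper names `is_k_anonymous` and `generalize_age` and the sample rows are hypothetical. It does not implement the program analysis described in the patent.

```python
from collections import Counter

def is_k_anonymous(rows, quasi_identifiers, k):
    """True if every combination of quasi-identifier values occurs at least k times."""
    counts = Counter(tuple(r[q] for q in quasi_identifiers) for r in rows)
    return all(c >= k for c in counts.values())

def generalize_age(row, width=10):
    """Replace an exact age with a range such as '30-39' (illustrative generalization)."""
    low = (row["age"] // width) * width
    return {**row, "age": f"{low}-{low + width - 1}"}

# Hypothetical rows: 'age' and 'zip' are treated as quasi-identifiers,
# 'diagnosis' as the sensitive field.
rows = [
    {"age": 31, "zip": "021**", "diagnosis": "flu"},
    {"age": 35, "zip": "021**", "diagnosis": "asthma"},
    {"age": 42, "zip": "021**", "diagnosis": "flu"},
    {"age": 47, "zip": "021**", "diagnosis": "diabetes"},
]

before = is_k_anonymous(rows, ["age", "zip"], k=2)
after = is_k_anonymous([generalize_age(r) for r in rows], ["age", "zip"], k=2)
```

With exact ages every row is unique on (`age`, `zip`); after generalizing ages to decades each combination appears twice, so the table is 2-anonymous.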

Computer systems, methods and computer program products for data anonymization for aggregate query answering

Computer program products are provided for anonymizing a database that includes tuples. A respective tuple includes at least one quasi-identifier and sensitive attributes associated with the quasi-identifier. These computer program products include computer readable program code that is configured to (k,e)-anonymize the tuples over a number k of different values in a range e of values, while preserving coupling at least two of the sensitive attributes to one another in the sets of attributes that are anonymized to provide a (k,e)-anonymized database. Related computer systems and methods are also provided.
Owner:AT&T INTPROP I L P
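
A rough sketch of a (k,e)-style check is given below. It assumes one common reading of the condition for a numeric sensitive attribute: every group must contain at least k distinct sensitive values, and those values must span a range of at least e. The function name `is_ke_anonymous` and the salary figures are hypothetical, and the sketch does not cover the coupling of multiple sensitive attributes that the abstract describes.

```python
def is_ke_anonymous(groups, k, e):
    """Check each group of numeric sensitive values against the assumed (k, e) condition.

    Assumed condition: at least k distinct values per group (k >= 1),
    and max - min of those values is at least e.
    """
    for values in groups:
        distinct = set(values)
        if len(distinct) < k or max(distinct) - min(distinct) < e:
            return False
    return True

# Hypothetical salary groups, one list per anonymized group.
salary_groups = [
    [30000, 42000, 55000],  # 3 distinct values, range 25000
    [31000, 31500, 32000],  # 3 distinct values, range only 1000
]

ok_wide = is_ke_anonymous(salary_groups[:1], k=3, e=10000)
ok_all = is_ke_anonymous(salary_groups, k=3, e=10000)
```

The second group shows why the range matters: it has three distinct values, yet they are so close together that an observer still learns the salary to within 1000.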

Systems and associated computer program products that disguise partitioned data structures using transformations having targeted distributions

A data structure that includes at least one partition containing non-confidential quasi-identifier microdata and at least one other partition containing confidential microdata is formed. The partitioned confidential microdata is disguised by transforming the confidential microdata to conform to a target distribution. The disguised confidential microdata and the quasi-identifier microdata are combined to generate a disguised data structure. The disguised data structure is used to carry out statistical analysis and to respond to a statistical query that is directed to the use of confidential microdata. In this manner, the privacy of the confidential microdata is preserved.
Owner:AT&T INTPROP I L P

System and method to reduce a risk of re-identification of text de-identification tools

Active | US20170177907A1 | Improve de-identification | Not at risk of error | Digital data protection | Probabilistic networks | Medical record | Data set
A computer-implemented system and method to reduce re-identification risk of a data set. The method includes the steps of retrieving, via a database-facing communication channel, a data set from a database communicatively coupled to the processor, the data set selected to include patient medical records that meet predetermined criteria; identifying, by a processor coupled to a memory, direct identifiers in the data set; identifying, by the processor, quasi-identifiers in the data set; calculating, by the processor, a first probability of re-identification from the direct identifiers; calculating, by the processor, a second probability of re-identification from the quasi-identifiers; perturbing, by the processor, the data set if one of the first probability or second probability exceeds a respective predetermined threshold, to produce a perturbed data set; and providing, via a user-facing communication channel, the perturbed data set to the requestor.
Owner:PRIVACY ANALYTICS
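
The threshold test in the abstract above can be sketched as follows. The abstract does not say how the probabilities are computed or how the data set is perturbed, so this sketch makes two labeled assumptions: risk is taken as 1 divided by the size of the smallest group of records sharing the same quasi-identifier values, and perturbation is done by dropping records whose individual risk exceeds the threshold. All function names and sample rows are hypothetical.

```python
from collections import Counter

def class_sizes(rows, quasi_identifiers):
    """Size of each equivalence class (records sharing quasi-identifier values)."""
    return Counter(tuple(r[q] for q in quasi_identifiers) for r in rows)

def max_reid_risk(rows, quasi_identifiers):
    """Assumed risk measure: 1 / size of the smallest equivalence class."""
    sizes = class_sizes(rows, quasi_identifiers)
    return 1.0 / min(sizes.values()) if sizes else 0.0

def drop_risky(rows, quasi_identifiers, threshold):
    """Assumed perturbation: remove records whose 1/class-size exceeds the threshold."""
    sizes = class_sizes(rows, quasi_identifiers)
    return [
        r for r in rows
        if 1.0 / sizes[tuple(r[q] for q in quasi_identifiers)] <= threshold
    ]

# Hypothetical records with two quasi-identifiers.
rows = [
    {"sex": "F", "birth_year": 1980},
    {"sex": "F", "birth_year": 1980},
    {"sex": "F", "birth_year": 1980},
    {"sex": "M", "birth_year": 1975},  # alone in its class, risk 1.0
]
qi = ["sex", "birth_year"]

risk_before = max_reid_risk(rows, qi)
perturbed = drop_risky(rows, qi, threshold=0.5) if risk_before > 0.5 else rows
risk_after = max_reid_risk(perturbed, qi)
```

Dropping records is only one possible perturbation; generalizing or suppressing individual values would trade less data loss for more complex bookkeeping.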

Database anonymization for use in testing database-centric applications

At least one quasi-identifier attribute of a plurality of ranked attributes is selected for use in anonymizing a database. Each of the ranked attributes is ranked according to that attribute's effect on a database-centric application (DCA) being tested. In an embodiment, the selected quasi-identifier attribute(s) has the least effect on the DCA. The database is anonymized based on the selected quasi-identifier attribute(s) to provide a partially-anonymized database, which may then be provided to a testing entity for use in testing the DCA. In an embodiment, during execution of the DCA, instances of database queries are captured and analyzed to identify a plurality of attributes from the database and, for each such attribute identified, the effect of the attribute on the DCA is quantified. In this manner, databases can be selectively anonymized in order to balance the requirements of data privacy against the utility of the data for testing purposes.
Owner:ACCENTURE GLOBAL SERVICES LTD

Secondary k-anonymity privacy protection algorithm for differentiating quasi-identifier attributes

The invention discloses a secondary k-anonymity privacy protection algorithm for differentiating quasi-identifier attributes, pertaining to the technical field of privacy protection. The algorithm comprises the following steps: forming hierarchical grids with a single attribute through an Incognito function to determine whether generalization satisfies k-anonymity or not, deleting nodes not satisfying k-anonymity, iterating nodes satisfying k-anonymity to form a candidate node set and determining again whether candidate nodes satisfy k-anonymity, deleting nodes not satisfying k-anonymity, and repeating the above steps till all categorical attributes are iterated and outputting root nodes satisfying k-anonymity. Data table T is generalized through the root nodes. The MDAV algorithm is utilized for secondary generalization of the generalized table T'. The number of tuples in each input equivalence class is kept within the range k to 2k-1. When partitioning is finished, the information loss is computed and compared to obtain the data table with the least loss.
Owner:XUZHOU MEDICAL UNIV

K-cryptonym improving method

The invention discloses a K-cryptonym improving method, relating to the field of data mining. The K-cryptonym improving method comprises the following steps: selecting a quasi-identifier according to an original dataset; determining a generalizing mode and establishing an initial generalizing lattice corresponding to the generalizing mode; judging whether the initial generalizing lattice is empty or not; if not, selecting a global optimum node from all nodes of the initial generalizing lattice according to the optimum node selection mode and obtaining a first generalizing lattice; carrying out the cryptonym processing on the data to be issued according to the global optimum node and obtaining the quantity of cryptonym clusters; judging whether the quantity of cryptonym clusters is less than the prearranged quantity or not; if so, carrying out the optimum node selection mode calculation on the first generalizing lattice and obtaining the optimum node; if not, carrying out the secondary K-cryptonym calculation on the first generalizing lattice and obtaining the optimum node, as the cryptonym cluster is an isolated cluster; generalizing the data to be issued according to the generalizing mode corresponding to the optimum node; and issuing the generalized data. By adopting the K-cryptonym improving method, the execution time is shortened, and the information accuracy is improved.
Owner:TIANJIN UNIV

Privacy protection method in multi-sensitive-attribute data release

The invention discloses a privacy protection method in multi-sensitive-attribute data release, and solves the problem of poor quality of quasi-identifier data in multi-sensitive-attribute data release. The basic idea of the invention is as follows: firstly, clustering is executed on the data sets, the data whose quasi-identifiers are similar are aggregated into one aggregate, and a plurality of data aggregates are generated; secondly, a multi-dimension bucket structure is constructed on the basis of sensitive attributes, and data records are mapped into the multi-dimension bucket structure according to values of the sensitive attributes; and then, on the basis of the multi-dimension buckets, grouping is carried out, i.e., main sensitive attributes are selected, the dimension capacity of the main sensitive attributes is calculated, L (L is greater than or equal to 2) main sensitive attributes with the maximum dimension capacity are selected, one data record is respectively selected from the L main sensitive attributes, whether the data records meet the multi-sensitive-attribute L-diversity is judged, and if not, each bucket is sequentially traversed according to the capacity from big to small until the data records meet the multi-sensitive-attribute L-diversity. The process is repeated until the data in the buckets do not meet the multi-sensitive-attribute L-diversity. Finally, all groups are subjected to anonymization processing.
Owner:HUAZHONG UNIV OF SCI & TECH
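
The L-diversity test that the grouping step above relies on can be sketched as follows. This assumes a simple "distinct" reading of multi-sensitive-attribute L-diversity, in which a group passes if each sensitive attribute takes at least L different values within the group. The function name `is_multi_l_diverse` and the sample records are hypothetical, and the clustering and bucket traversal of the patent are not reproduced.

```python
def is_multi_l_diverse(group, sensitive_attrs, l):
    """True if every sensitive attribute has at least l distinct values in the group."""
    return all(len({r[a] for r in group}) >= l for a in sensitive_attrs)

# Hypothetical groups with two sensitive attributes each.
group_a = [
    {"disease": "flu", "salary_band": "low"},
    {"disease": "asthma", "salary_band": "mid"},
    {"disease": "diabetes", "salary_band": "high"},
]
group_b = [
    {"disease": "flu", "salary_band": "low"},
    {"disease": "flu", "salary_band": "mid"},  # 'disease' has a single value here
]

a_ok = is_multi_l_diverse(group_a, ["disease", "salary_band"], l=2)
b_ok = is_multi_l_diverse(group_b, ["disease", "salary_band"], l=2)
```

Group B fails even though its salary bands differ, because everyone in it shares the same disease; checking each sensitive attribute separately is what makes the condition "multi-sensitive".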

A privacy protection data publishing method based on conditional probability distribution

The invention belongs to the technical field of information security and privacy protection, and is a privacy protection data publishing method based on conditional probability distribution. According to the conditional probability distribution, an attacker's prior knowledge is modeled so that the attacker has different prior knowledge in different transactions. Then using the constructed model and quasi-identifier attribute value, the sensitive attribute value of each record is predicted, and the original value is replaced with the predicted value, and then published. There is no direct correlation between the predicted values of the published sensitive attributes and the original values, which effectively protects the privacy of user data. The predicted distribution of sensitive attribute values is similar to the real distribution, which effectively controls the distribution error and ensures the availability of the published dataset better than that of the generalized and stochastic response method. The invention can provide privacy protection mechanism for data release in various social fields such as medical treatment, finance, credit generation, transportation and the like, and provides support for application of data in scientific research and social service while protecting user data privacy.
Owner:FUDAN UNIV
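
A loose sketch of the replace-with-predicted-value idea follows. The patent's model of the attacker's prior knowledge is not specified in the abstract, so this sketch substitutes a plain empirical conditional distribution of the sensitive value given the quasi-identifier values and samples the published value from it. That substitution, the function names and the sample rows are all assumptions, and the sketch does not guarantee the privacy properties claimed in the abstract.

```python
import random
from collections import Counter, defaultdict

def fit_conditional(rows, qi_attrs, sensitive):
    """Empirical counts of sensitive values for each quasi-identifier combination."""
    table = defaultdict(Counter)
    for r in rows:
        table[tuple(r[a] for a in qi_attrs)][r[sensitive]] += 1
    return table

def publish(rows, qi_attrs, sensitive, rng):
    """Replace each sensitive value with one sampled from the fitted conditional."""
    table = fit_conditional(rows, qi_attrs, sensitive)
    out = []
    for r in rows:
        dist = table[tuple(r[a] for a in qi_attrs)]
        values, weights = zip(*sorted(dist.items()))
        sampled = rng.choices(values, weights=weights, k=1)[0]
        out.append({**r, sensitive: sampled})
    return out

# Hypothetical rows: 'age_band' is the quasi-identifier, 'disease' is sensitive.
rows = [
    {"age_band": "30-39", "disease": "flu"},
    {"age_band": "30-39", "disease": "asthma"},
    {"age_band": "30-39", "disease": "flu"},
    {"age_band": "40-49", "disease": "diabetes"},
    {"age_band": "40-49", "disease": "flu"},
]

published = publish(rows, ["age_band"], "disease", random.Random(0))
```

Because values are drawn from the conditional distribution, the published table keeps roughly the same relationship between age band and disease as the original, which is the utility argument the abstract makes.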

Anonymization apparatus, anonymization method, and computer program

Provided is an anonymization apparatus for optimally and fully performing anonymization, in anonymizing history information, in a state where a specific property existing in a plurality of records with an identical identifier is maximally maintained. This anonymization apparatus includes record extracting means for extracting, with respect to history information including a plurality of records each of which includes unique identification information associated with at least a quasi-identifier and sensitive information, on the basis of smallness of ambiguity of a property existing between the plurality of records that are able to satisfy desired anonymity and share a specific unique identifier, a record with other unique identifier different from the specific unique identifier from the history information and anonymizing means for giving commonality to and thereby abstracting the quasi-identifier each included in the plurality of records, so that an individual attribute in the plurality of records extracted by the record extracting means satisfies the desired anonymity.
Owner:NEC CORP

Method/system for the online identification and blocking of privacy vulnerabilities in data streams

A method and system for automatically identifying and protecting privacy vulnerabilities in data streams includes indexing data values for each attribute of the data stream received by local virtual machines based on a schema of each data stream, classifying the data attributes of the plurality of data streams into known data types, integrating the local virtual machine indexes into a global index data structure for the data streams including single attribute data values, identifying privacy vulnerabilities in the data as attributes that are direct identifiers based on the attribute data values stored in the global index and combinations of attributes that are quasi-identifiers based on the low frequency of certain combinations of attribute data value pairs by computing the frequency based on the single attribute data values stored in the global index and providing privacy protection to the data streams by applying data transformations on the discovered direct identifiers and the quasi-identifiers.
Owner:GREEN MARKET SQUARE LTD

Information processing device, information processing method and recording medium

Inactive | US20170161519A1 | Reduce ambiguity | Ambiguity of relationship among attributes of linked data | Relational databases | Digital data protection | Information processing | Ambiguity
Provided is an information processing device that can decrease ambiguity of relationship among attributes of linked data, to which relational diversification is performed, and can assess a common characteristic of a linked data group belonging to a cohort. The information processing device includes: relational diversification means that diversifies a relationship to make it difficult to identify a sensitive attribute value of the linked data from another sensitive attribute value; and anonymous cohort generating means which generates cohort information by extracting an attribute value or a characteristic and a property being common in a linked data group belonging to a cohort as a set of linked data assigned with a combination of same quasi-identifiers or a same group identifier and having similarity to one another, wherein the relational diversification means outputs the linked data group, of which a relationship is diversified, by adding the cohort information to the linked data group.
Owner:NEC CORP

System and method to reduce a risk of re-identification of text de-identification tools

Active | US10395059B2 | Improve de-identification | Not at risk of error | Mathematical models | Digital data protection | Medical record | Data set
A computer-implemented system and method to reduce re-identification risk of a data set. The method includes the steps of retrieving, via a database-facing communication channel, a data set from a database communicatively coupled to the processor, the data set selected to include patient medical records that meet predetermined criteria; identifying, by a processor coupled to a memory, direct identifiers in the data set; identifying, by the processor, quasi-identifiers in the data set; calculating, by the processor, a first probability of re-identification from the direct identifiers; calculating, by the processor, a second probability of re-identification from the quasi-identifiers; perturbing, by the processor, the data set if one of the first probability or second probability exceeds a respective predetermined threshold, to produce a perturbed data set; and providing, via a user-facing communication channel, the perturbed data set to the requestor.
Owner:PRIVACY ANALYTICS

Method and system for anonymising data stocks

Provided is a method for anonymising data stocks, including the steps of determining a combination of generalization stages for quasi-identifiers of a data stock at a central node; transmitting the combination of generalization stages to a plurality of sub-nodes; and performing, in parallel, an anonymisation of the data stock on the basis of the combination of generalization stages by the sub-nodes.
Owner:SIEMENS AG
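
The three steps above (choose generalization stages centrally, send them to sub-nodes, anonymise in parallel) can be sketched in a single process. In this sketch worker threads stand in for sub-nodes, and the generalization stages are an age bin width and a number of ZIP digits to keep. These choices, along with every name and value in the code, are assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

def generalize(row, levels):
    """Apply the centrally chosen generalization stages to one record."""
    width = levels["age"]      # width of the age bins
    keep = levels["zip"]       # number of leading ZIP digits to keep
    low = (row["age"] // width) * width
    z = row["zip"]
    return {
        "age": f"{low}-{low + width - 1}",
        "zip": z[:keep] + "*" * (len(z) - keep),
    }

def anonymise_partition(task):
    """Work done by one sub-node: generalize every record in its partition."""
    partition, levels = task
    return [generalize(r, levels) for r in partition]

# Step 1: the central node fixes one combination of generalization stages.
levels = {"age": 10, "zip": 3}

# Step 2: each sub-node receives its partition together with the same stages.
partitions = [
    [{"age": 34, "zip": "02139"}, {"age": 38, "zip": "02141"}],
    [{"age": 52, "zip": "10027"}, {"age": 57, "zip": "10025"}],
]

# Step 3: the partitions are anonymised in parallel (threads stand in for sub-nodes).
with ThreadPoolExecutor(max_workers=2) as pool:
    results = list(pool.map(anonymise_partition, [(p, levels) for p in partitions]))

anonymised = [row for part in results for row in part]
```

Sending every sub-node the same combination of stages is what keeps the partial results compatible, so they can simply be concatenated afterwards.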

Key information protection method and system based on an OpenID

Active | CN109829333A | Effectively control the spread | Control spread | Digital data protection | Information repository | Password
The invention provides a key information protection method and system based on an OpenID, relating to identity authentication technology, password technology and data desensitization techniques. According to the method, the user key information is vertically segmented into identifiers, quasi-identifiers, real-name identity information and sensitive information, which are classified into four types; a function key is adopted for protection and decentralized storage; the track information is encrypted; and anti-association cutting, encryption, noise adding and other desensitization processing can be carried out on different types of key data. The privacy of a user can be effectively protected, and the risk of leakage of a user information base is greatly reduced.
Owner:INST OF INFORMATION ENG CAS

Improved solving method for quasi-identifier in k-anonymization

The invention relates to an improved solving method for a quasi-identifier in k-anonymization, and belongs to the technical field of privacy protection in information security. The method comprises the following steps of converting a data table set into a bipartite graph of a hypergraph, calculating all paths between two points in a bipartite junction set by virtue of a method for solving the paths between the two points of the graph, and outputting all the paths. According to the method, the efficiency problem, caused by a Paths method, of a QUASI_IDENTIFIER method in a related data table solving process is effectively solved, and the time complexity O(V^4) of the Paths method is lowered to O(V^3) of the method provided by the invention.
Owner:BEIJING INSTITUTE OF TECHNOLOGY

Information processing device, method for verifying anonymity and method

The present invention provides an information processing device that enables a reduction in the processing cost of verifying anonymity during anonymization when multi-dimensional data is the subject of anonymization. The information processing device is provided with: a unit which generates information indicating the correspondence between a record contained in a data set and a class specifying a unique combination of quasi-identifier attribute values; a unit which verifies the anonymity of each record on the basis of the class thereof indicated in the information; and a unit which, on the basis of the results of verifying the anonymity, updates the information in a manner such that whether or not the record satisfies the anonymity can be identified and outputs the record-class correspondence information.
Owner:NEC CORP

Sensitive attribute data processing method and system

The invention provides a sensitive attribute data processing method and system. The method comprises the steps of obtaining a user data set; obtaining a plurality of sensitive attribute sub-data sets based on the quasi-identifier attributes and the sensitive attributes; dividing the plurality of sensitive attribute sub-data sets into a plurality of sensitive attribute data record groups; determining a first sensitive attribute data record group conforming to the composite multi-sensitive attribute L-diversity, and determining a second sensitive attribute data record group not conforming to the composite multi-sensitive attribute L-diversity; adding the data in the second sensitive attribute data record group to the first sensitive attribute data record group under the condition of not destroying the L-diversity of the composite multi-sensitive attribute; and anonymizing all the first sensitive attribute data record groups to obtain multiple groups of anonymous groups, performing random sorting, and publishing a random sorting result. The corresponding relations between the quasi-identifier attribute and the sensitive attribute are disorganized, so that the private information of the user is prevented from being speculated according to the user data, and the usability and the security of the personal information are ensured.
Owner:GUANGDONG UNIV OF TECH

Micro-aggregation anonymization method based on sorting

The invention provides a micro-aggregation anonymization method based on sorting. The method comprises the following steps: (1) a sorting operation: on the basis of a Q1 quasi-identifier, dividing the dataset into a plurality of categories so that the k-division of the dataset is based on the Q1 quasi-identifier; (2) a division operation based on sorting: independently and systematically forming equivalence classes starting from the first extreme record and the last extreme record of the dataset produced by the sorting operation, and keeping the number of records in each equivalence class at k; and (3) an aggregate operation: taking the center point of two extreme records as the centroid of each equivalence class, and replacing all sensitive attribute values with the mean value of the equivalence class to form an anonymous equivalence class. With this method, the mean value sorting technique improves the k-division process so that its information loss ratio is kept to a minimum and the execution efficiency of the algorithm is improved. In addition, introducing the sorting concept allows a multidimensional dataset to be processed, which improves privacy protection.
Owner:HOHAI UNIV
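
A simplified sketch of sort-then-aggregate micro-aggregation is shown below. It departs from the abstract in two labeled ways: groups are formed as consecutive runs of k sorted records rather than by working inward from the two extreme records, and the attribute being replaced by the group mean is the sorted numeric attribute itself. The function name `microaggregate` and the sample values are hypothetical.

```python
def microaggregate(values, k):
    """Sort numeric values, group them in runs of k, and replace each by its group mean.

    A short trailing run is merged into the previous group, so every group
    holds between k and 2k-1 values.
    """
    if len(values) < k:
        raise ValueError("need at least k values")
    order = sorted(range(len(values)), key=lambda i: values[i])
    groups = [order[i:i + k] for i in range(0, len(order), k)]
    if len(groups) > 1 and len(groups[-1]) < k:
        tail = groups.pop()
        groups[-1].extend(tail)
    out = [None] * len(values)
    for group in groups:
        mean = sum(values[i] for i in group) / len(group)
        for i in group:
            out[i] = mean  # written back at the record's original position
    return out

# Hypothetical ages; with k=3 the sorted runs are [25, 27, 30] and [33, 58, 61, 64].
ages = [25, 61, 30, 58, 27, 64, 33]
aggregated = microaggregate(ages, k=3)
```

Sorting first keeps similar values in the same group, which is why the group means stay close to the original values and the information loss remains small.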

Skyline-based data generalization method

The invention discloses a Skyline-based data generalization method. The method comprises the steps of processing a data table according to a data release privacy protection standard 10-anonymity to obtain a re-identified risk quantity R of a policy, recording the risk quantity R as a threshold T, and determining a policy space {S,(R,U)} according to a value domain of a quasi-identifier attribute and the threshold T, wherein an R value of the policy comprised in the policy space {S,(R,U)} is not greater than the threshold T; filtering the policy space {S,(R,U)} by adopting epsilon-approximate Skyline to obtain candidate policy spaces {G,(R,U)}; and performing Skyline calculation on the candidate policy space {G,(R,U)} to obtain a recommended policy space {F,(R,U)}, wherein the recommended policy space {F,(R,U)} is a private policy space recommended for the data table. According to the method, the accuracy of privacy protection policy recommendation is improved through an enumeration full policy space; the coverage range of an RU space is wide; multilevel demands of a user are met; the threshold T is set and the privacy protection policies not meeting the requirements are filtered, so that the policy space generation time is shortened; and the filtering is performed by adopting the epsilon-approximate Skyline, so that the scale of the candidate policy spaces is further reduced.
Owner:HUAZHONG UNIV OF SCI & TECH
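
The filtering and Skyline steps above can be sketched with an exact two-dimensional skyline. Each policy carries a risk R (lower is better) and a utility U (higher is better); policies with R above the threshold T are discarded, and the remaining non-dominated policies are recommended. This is a plain exact skyline, not the epsilon-approximate variant named in the abstract, and the policy names and numbers are invented.

```python
def dominates(a, b):
    """a dominates b if it is no worse on both R and U and strictly better on one."""
    return (
        a["R"] <= b["R"]
        and a["U"] >= b["U"]
        and (a["R"] < b["R"] or a["U"] > b["U"])
    )

def skyline(policies):
    """Policies that no other policy dominates."""
    return [p for p in policies if not any(dominates(q, p) for q in policies)]

# Hypothetical policies with risk R and utility U.
policies = [
    {"name": "S1", "R": 0.10, "U": 0.90},
    {"name": "S2", "R": 0.05, "U": 0.70},
    {"name": "S3", "R": 0.08, "U": 0.60},  # dominated by S2
    {"name": "S4", "R": 0.20, "U": 0.95},  # exceeds the threshold
    {"name": "S5", "R": 0.02, "U": 0.40},
]

T = 0.10  # risk threshold
candidates = [p for p in policies if p["R"] <= T]
recommended = [p["name"] for p in skyline(candidates)]
```

The recommended set contains only trade-offs: moving from one of its policies to another always gives up some utility for lower risk, or the reverse.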

Data processing method and device, electronic equipment and readable storage medium

The invention provides a data processing method and device, electronic equipment and a readable storage medium, and relates to the technical field of data security. The method comprises the steps of obtaining a user data set corresponding to each user in a plurality of users; determining target data corresponding to each quasi-identifier attribute and sensitive data corresponding to the sensitive attribute in a user data set corresponding to each user; based on the target data and the sensitive data, determining an association degree between each quasi-identifier attribute and the sensitive attribute; and determining a generalization sequence for performing K-anonymity processing on the plurality of quasi-identifier attributes according to the association degree. Because the quasi-identifier attribute with the largest association degree is generalized first, the quasi-identifier attributes most related to the sensitive attribute are generalized, an attacker cannot easily locate a user, and the problem that privacy information of the user is easily leaked is avoided.
Owner:BEIJING TOPSEC NETWORK SECURITY TECH +2
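
The ordering step above can be sketched by scoring each quasi-identifier attribute against the sensitive attribute and sorting by that score. The abstract does not say how the association degree is measured, so this sketch uses mutual information (in bits) as a stand-in; that choice, the function names and the sample rows are assumptions.

```python
from collections import Counter
from math import log2

def mutual_information(xs, ys):
    """Mutual information in bits between two equally long lists of categorical values."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * log2(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )

def generalization_order(rows, quasi_identifiers, sensitive):
    """Quasi-identifiers sorted from highest to lowest association with the sensitive attribute."""
    s = [r[sensitive] for r in rows]
    degree = {
        q: mutual_information([r[q] for r in rows], s)
        for q in quasi_identifiers
    }
    ordered = sorted(quasi_identifiers, key=lambda q: degree[q], reverse=True)
    return ordered, degree

# Hypothetical rows: 'job' fully determines 'disease', 'sex' is unrelated to it.
rows = [
    {"job": "nurse", "sex": "F", "disease": "flu"},
    {"job": "nurse", "sex": "M", "disease": "flu"},
    {"job": "miner", "sex": "F", "disease": "silicosis"},
    {"job": "miner", "sex": "M", "disease": "silicosis"},
]

order, degree = generalization_order(rows, ["sex", "job"], "disease")
```

Here `job` would be generalized before `sex`, since knowing someone's job reveals the disease in this toy data while knowing their sex reveals nothing.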

Anonymous method for missing data and storage medium

The invention provides a missing data anonymity method and a storage medium. The missing data anonymity method comprises the steps of setting a clustering parameter k; setting l-diversity model parameter l; clustering all the data records in the data set according to the similarity judgment values of the data records in the data set, the l-diversity model parameters and the clustering parameters, dividing the data set into a plurality of clustering cluster groups, and obtaining a data set with a plurality of clustering clusters; and performing generalization processing on each clustering cluster group in the data set, performing generalization on each clustering cluster to obtain an equivalence class, and in the obtained equivalence class, the values of the data records in the same equivalence class on the quasi-identifier attribute are the same, thereby finishing anonymous processing. According to the method, the availability of the data after anonymization processing is guaranteed to the maximum extent by processing the incomplete data set, information loss caused by a traditional anonymization method is reduced. Meanwhile, sensitive attributes related to a user are protected by l-diversity, and the safety of anonymization processing of the data set is improved.
Owner:ZHENGZHOU UNIV

Information processing device, method for verifying anonymity and medium

The present invention provides an information processing device that enables a reduction in the processing cost of verifying anonymity during anonymization when multi-dimensional data is the subject of anonymization. The information processing device is provided with: a unit which generates information indicating the correspondence between a record contained in a data set and a class specifying a unique combination of quasi-identifier attribute values; a unit which verifies the anonymity of each record on the basis of the class thereof indicated in the information; and a unit which, on the basis of the results of verifying the anonymity, updates the information in a manner such that whether or not the record satisfies the anonymity can be identified and outputs the record-class correspondence information.
Owner:NEC CORP

Database sensitive association attribute desensitization method based on invariant random response technology

The invention relates to a database sensitive association attribute desensitization method based on an invariant random response technology. Compared with the prior art, the defect that privacy risks related to data attributes are not fully considered is overcome. The method comprises the following steps: analyzing original data; and desensitizing sensitive associated attributes. On the basis of considering the dependency relationship between a quasi-identifier attribute and a sensitive attribute of the data stored in the database, the sensitive data in the database is desensitized to protect the privacy of a user, and the data utility is enhanced.
Owner:ANHUI UNIV OF SCI & TECH