A computer-implemented system and method to reduce re-identification risk of a data set. The method includes the steps of retrieving, via a database-facing communication channel, a data set from a database communicatively coupled to the processor, the data set selected to include patient medical records that meet a predetermined criteria; identifying, by a processor coupled to a memory, direct identifiers in the data set; identifying, by the processor, quasi-identifiers in the data set; calculating, by the processor, a first probability of re-identification from the direct identifiers; calculating, by the processor, a second probability of re-identification from the quasi-direct identifiers; perturbing, by the processor, the data set if one of the first probability or second probability exceeds a respective predetermined threshold, to produce a perturbed data set; and providing, via a user-facing communication channel, the perturbed data set to the requestor.