The disclosed information management system enables caregivers to make better decisions, faster, using aggregated genetic and phenotypic data. The system supports the integration, validation and analysis of genetic, phenotypic and clinical data from multiple subjects, who may be located at distributed facilities. A standardized data model stores a range of patient data in standardized data classes that encompass patient profile information, patient symptomatic information, patient treatment information, and patient diagnostic information, including genetic information. Data from other systems is converted into the format of the standardized data classes by a data parser, or cartridge, tailored specifically to the source system. Relationships between the standardized data classes are defined by expert rules and statistical models. These relationships are used both to validate new data and to predict phenotypic outcomes from the available data. A prediction may relate to a clinical outcome in response to an intervention proposed by a caregiver. The statistical models may be imported into the system from electronic publications that define the statistical models, and the methods for training them, according to a standardized template. Methods are described for selecting, creating and training the statistical models to operate on genetic, phenotypic and clinical data, in particular on the underdetermined data sets that are typical of genetic information. The disclosure also describes how the security of the data is maintained by a robust security architecture and by robust user authentication, such as biometric authentication, combined with application-level and data-level access privileges.
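
The conversion and validation steps described above can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the disclosed implementation: the `PatientRecord` class, the `clinic_a_cartridge` parser, the source field names (`PID`, `AGE`, `SYMPT`, `RX`, `GENO`) and both expert rules are hypothetical names invented for this example. The sketch shows a cartridge mapping one source system's record into the four standardized data classes, and expert rules being applied to validate the converted record.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class PatientRecord:
    """Hypothetical standardized record covering the four data classes."""
    profile: Dict[str, object] = field(default_factory=dict)
    symptoms: List[str] = field(default_factory=list)
    treatments: List[str] = field(default_factory=list)
    diagnostics: Dict[str, object] = field(default_factory=dict)  # holds genetic data


def _split(value: str) -> List[str]:
    """Split a semicolon-separated source field, dropping empty items."""
    return [item.strip() for item in value.split(";") if item.strip()]


def clinic_a_cartridge(raw: Dict[str, str]) -> PatientRecord:
    """Cartridge for an imaginary source system with its own field names."""
    return PatientRecord(
        profile={"patient_id": raw["PID"], "age_years": int(raw["AGE"])},
        symptoms=_split(raw.get("SYMPT", "")),
        treatments=_split(raw.get("RX", "")),
        diagnostics={"genotype": raw.get("GENO", "")},
    )


# An expert rule returns an error message, or None if the record passes.
Rule = Callable[[PatientRecord], Optional[str]]


def age_is_plausible(rec: PatientRecord) -> Optional[str]:
    age = rec.profile.get("age_years")
    if not isinstance(age, int) or not 0 <= age <= 130:
        return "age_years outside plausible range"
    return None


def treatment_requires_genotype(rec: PatientRecord) -> Optional[str]:
    # Assumed rule for illustration: a treated patient must have a genotype.
    if rec.treatments and not rec.diagnostics.get("genotype"):
        return "treatment recorded without genotype"
    return None


RULES: List[Rule] = [age_is_plausible, treatment_requires_genotype]


def validate(rec: PatientRecord, rules: List[Rule]) -> List[str]:
    """Apply every rule and collect the messages of those that fail."""
    messages = []
    for rule in rules:
        message = rule(rec)
        if message is not None:
            messages.append(message)
    return messages


if __name__ == "__main__":
    raw = {"PID": "p-001", "AGE": "42", "SYMPT": "fever; cough", "GENO": "AG"}
    record = clinic_a_cartridge(raw)
    print(record)
    print(validate(record, RULES))
```

Keeping each cartridge as a separate function per source system means that supporting a new facility requires only a new parser, while the standardized classes and the rules that relate them stay unchanged.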