A data integration system (100, 10-14) comprises a plurality of data sources (10-14) and a mapping system (120, 121, 122, 125, 126, 127, 128) for providing mapping between the data sources (10-14) and a global ontology. The global ontology comprises a plurality of elements including at least a plurality of concepts, at least some of which include one or more attributes. The data integration system further comprises a user interface (110). The user interface (110) is operable in use to provide an integrated, global view of the data contained in the data sources (10-14) and to permit a user to interact with the data sources (10-14) using the global ontology. The mapping system (120) includes a schema mapping portion (122) and a semantic identifier portion (127), wherein the schema mapping portion (122) includes a plurality of single data source element mappings, each of which specifies how one or more elements from a single data source map to one or more elements of the global ontology, and the semantic identifier portion (127) comprises a plurality of semantic identifiers, each of which is operable to specify, in terms of the global ontology, how to identify and merge duplicate rough instances of concepts of the global ontology derived from queries to the possibly heterogeneous data sources, which duplicate rough instances represent the same actual instances.
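
The division of labour between the schema mapping portion and the semantic identifier portion can be illustrated with a minimal Python sketch. It is not the claimed implementation. The source names (`crm_db`, `billing_xml`), the field names, the `Customer` concept, the choice of `email` as the identifying attribute, and the "first value wins" merge rule are all hypothetical assumptions made for illustration only.

```python
from collections import defaultdict

# Hypothetical single data source element mappings: for each source,
# a source field name maps to a (concept, attribute) pair of the global ontology.
SCHEMA_MAPPINGS = {
    "crm_db": {
        "cust_name": ("Customer", "name"),
        "mail": ("Customer", "email"),
    },
    "billing_xml": {
        "fullName": ("Customer", "name"),
        "emailAddr": ("Customer", "email"),
        "acct": ("Customer", "account"),
    },
}

# Hypothetical semantic identifiers, expressed purely in ontology terms:
# per concept, the attributes whose normalised values identify one actual instance.
SEMANTIC_IDENTIFIERS = {"Customer": ("email",)}


def to_rough_instances(source, records):
    """Translate raw records from one source into rough ontology instances."""
    mapping = SCHEMA_MAPPINGS[source]
    instances = []
    for record in records:
        by_concept = defaultdict(dict)
        for field, value in record.items():
            if field in mapping:  # unmapped source fields are ignored
                concept, attribute = mapping[field]
                by_concept[concept][attribute] = value
        instances.extend(by_concept.items())
    return instances


def merge_duplicates(rough_instances):
    """Merge rough instances that a semantic identifier deems the same instance."""
    merged = {}
    for index, (concept, attrs) in enumerate(rough_instances):
        key_attrs = SEMANTIC_IDENTIFIERS.get(concept, ())
        if key_attrs and all(a in attrs for a in key_attrs):
            # Assumed normalisation: trim whitespace and ignore case.
            key = (concept,) + tuple(
                str(attrs[a]).strip().lower() for a in key_attrs
            )
        else:
            # No usable identifier: keep this rough instance separate.
            key = (concept, "#unidentified", index)
        target = merged.setdefault(key, {})
        for name, value in attrs.items():
            target.setdefault(name, value)  # assumed rule: first value wins
    return [(key[0], attrs) for key, attrs in merged.items()]


rough = to_rough_instances(
    "crm_db", [{"cust_name": "Ann Lee", "mail": "Ann@Example.com"}]
) + to_rough_instances(
    "billing_xml",
    [
        {"fullName": "A. Lee", "emailAddr": "ann@example.com ", "acct": "42"},
        {"fullName": "Bob Roe", "emailAddr": "bob@example.com", "acct": "7"},
    ],
)
result = merge_duplicates(rough)
```

In this sketch the mappings are the only place where source-specific field names appear, while the identifiers refer only to ontology concepts and attributes, so adding a further source would require new mappings but no change to the merging rule.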