An Instance Based Schema Matching Between Opaque Database Schemas
2014 (English). In: 2014 4th International Conference on Engineering Technology and Technopreneurship (ICE2T), IEEE Computer Society, 2014, pp. 177-182. Conference paper (Refereed)
Schema matching between the schemas of relational datasets is an essential step in database integration applications, and it plays a significant role in heterogeneous database integration. Most previous solutions to the schema matching problem are based on identifying similarity between column names or on recognizing common domains in the data stored in the schemas. These approaches are not applicable to datasets with unaligned schemas in which both the column names and the data in the columns are opaque. In this paper we propose an instance-based approach for finding the matching between the schemas of heterogeneous datasets that share common primary keys, where it is unknown which columns are the primary keys. The proposed approach consists of two main phases: Row Similarity and Attribute Similarity. In the row similarity phase, the approach determines all pairs of rows across the datasets that represent the same real-world entity, based on identical primary key values. In the attribute similarity phase, the approach finds the corresponding attributes by comparing the data values within those similar pairs of rows. Experiments on real-world datasets were performed to validate the proposed approach, and the results demonstrate its viability.
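The two phases named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' algorithm, which the abstract does not specify. It assumes single-column keys, exact value equality, and tables given as lists of tuples with columns addressed by position. The function names (`unique_columns`, `match_rows`, `match_attributes`), the 0.8 threshold, and the sample data are all hypothetical.

```python
def unique_columns(rows):
    """Return positions of columns whose values are all distinct (key candidates)."""
    n_cols = len(rows[0])
    return [c for c in range(n_cols)
            if len({r[c] for r in rows}) == len(rows)]


def match_rows(a, b):
    """Row similarity (sketch): guess the key column pair with the largest
    value overlap, then pair rows that share the same key value."""
    best = (0, None, None)  # (overlap, key column in a, key column in b)
    for i in unique_columns(a):
        for j in unique_columns(b):
            overlap = len({r[i] for r in a} & {r[j] for r in b})
            if overlap > best[0]:
                best = (overlap, i, j)
    _, ka, kb = best
    if ka is None:
        return []  # no shared key values found
    index_b = {r[kb]: r for r in b}
    return [(r, index_b[r[ka]]) for r in a if r[ka] in index_b]


def match_attributes(pairs, threshold=0.8):
    """Attribute similarity (sketch): map each column of the first table to the
    column of the second table that agrees on most matched row pairs."""
    if not pairs:
        return {}
    n_a, n_b = len(pairs[0][0]), len(pairs[0][1])
    matches = {}
    for i in range(n_a):
        # Fraction of matched row pairs whose values agree, per candidate column.
        scores = [sum(ra[i] == rb[j] for ra, rb in pairs) / len(pairs)
                  for j in range(n_b)]
        j = max(range(n_b), key=scores.__getitem__)
        if scores[j] >= threshold:
            matches[i] = j
    return matches


# Hypothetical opaque tables: no column names, column order differs.
a = [("k1", "x9", 10), ("k2", "x7", 20), ("k3", "x8", 30)]
b = [(20, "zz", "k2"), (10, "yy", "k1"), (40, "ww", "k4")]

pairs = match_rows(a, b)
mapping = match_attributes(pairs)
print(mapping)  # {0: 2, 2: 0}
```

In this toy example, column 0 of the first table corresponds to column 2 of the second, and column 2 to column 0, while column 1 has no counterpart. Real data would likely need tolerance for formatting differences and composite keys, which this sketch does not attempt.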
Place, publisher, year, edition, pages
IEEE Computer Society, 2014. 177-182 p.
Attribute identification, Data integration, Schema matching, Heterogeneous databases, Relational datasets
Computer Science Information Systems
Identifiers: URN: urn:nbn:se:bth-10857; ISI: 000360302900036; ISBN: 978-1-4799-4621-1; OAI: oai:DiVA.org:bth-10857; DiVA: diva2:862002
4th International Conference on Engineering Technology and Technopreneurship (ICE2T), Kuala Lumpur, Malaysia