Abstract
Many electronic health records are now stored as XML documents. To integrate these heterogeneous XML medical documents efficiently, a number of studies have explored ways of finding structural and semantic similarity between XML Schemas. The main problem is how to identify the most appropriate relatedness between two schemas so that they can be combined into a global XML Schema for reuse and reference. In this paper, we propose a novel resemblance measure that considers both the structural and the semantic information of two healthcare XML Schemas at the same time. Specifically, we introduce new metrics for computing datatype and cardinality constraint similarities, which improve the quality of the semantic assessment. On the basis of the similarity between each element pair, we put forward an algorithm that calculates the similarity between XML Schema trees. Experimental results indicate that our methodology yields more accurate semantic and structural similarity values than the other approaches we compared against.

Acknowledgments
This work was supported by a grant from Kyung Hee University in 2010 (KHU-20101372).
Cite this article
Thuy, P.T.T., Lee, YK. & Lee, S. Semantic and structural similarities between XML Schemas for integration of ubiquitous healthcare data. Pers Ubiquit Comput 17, 1331–1339 (2013). https://doi.org/10.1007/s00779-012-0567-5
DOI: https://doi.org/10.1007/s00779-012-0567-5