WORLD JOURNAL OF INNOVATION AND MODERN TECHNOLOGY (WJIMT)

E-ISSN 2504-4766
P-ISSN 2682-5910
VOL. 8 NO. 4 2024
DOI: 10.56201/wjimt.v8.no4.2024.pg137.151


A Constraint Identification Method for Predicate Node Identification in Clustered XML Documents

B.A. Bodinga, A. Roko, A.B. Muhammad, I. Saidu


Abstract


A large number of documents on the web are now represented and stored using an XML document structure. These documents may originate from the same source (homogeneous) or from different sources (heterogeneous), which makes them challenging to manage and retrieve. Existing systems return irrelevant predicates. The predicate node identification methods employed in these search systems use only a simple constraint. To improve the effectiveness of XML retrieval, an Effective Constraints Identification Algorithm (E_CIA) is developed to identify relevant predicates. E_CIA uses a Constraints Operator Generator (COG) to identify the constraints to impose in order to generate the most relevant predicate node, thereby improving the effectiveness of the retrieval process. Experiments were conducted to evaluate the performance of the proposed E_CIA. The experimental results show that E_CIA outperforms StruX and StruXPlus in terms of precision.


Keywords:

XML retrieval, Constraint Identification, Predicate node, Constraint Operator



