Original Article

JJCIT. 2020; 6(2): 165-181


Tag Recommendation for Short Arabic Text by Using Latent Semantic Analysis of Wikipedia

IYAD ALAGHA, Yousef Abu Samra.




Abstract

Text tagging has gained growing attention as a way of associating metadata that supports information retrieval and classification. To resolve the difficulties of manual tagging, tag recommendation has emerged as a solution that assists users by presenting a list of relevant tags. However, the majority of existing approaches to tag recommendation have focused on domain-specific tagging and on long-form text. Open-domain tagging can be challenging due to the lack of comprehensive knowledge and the intensive computations involved. Furthermore, tagging short text can be problematic because statistical features are difficult to extract from it. In terms of language, most efforts have focused on tagging text written in English. The tagging of Arabic text has been hindered by the difficulty of processing the Arabic language and the lack of knowledge sources in Arabic.
This work proposes an approach for tag recommendation for short Arabic text. It exploits the Arabic Wikipedia as background knowledge and uses it to generate tags in response to short input text. Latent semantic analysis is used to analyze Wikipedia content and find articles relevant to the input text. Tags are then selected from the titles and categories of these articles and ranked according to relevance.
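The pipeline described above (latent semantic analysis over articles, retrieval of relevant articles, then tags drawn from titles and categories) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy English corpus stands in for the Arabic Wikipedia, NumPy's SVD stands in for the Apache Spark setting named in the keywords, raw term counts are used without any Arabic preprocessing or weighting, and scoring a tag by the summed cosine similarity of the articles that carry it is a hypothetical ranking rule, since the abstract does not state the one actually used.

```python
import numpy as np

# Hypothetical toy corpus: (title, categories, text). Not real Wikipedia data.
ARTICLES = [
    ("Football", ["Sports"], "football team player goal match league"),
    ("Basketball", ["Sports"], "basketball team player court match league"),
    ("Python (language)", ["Programming"], "python code programming language software"),
    ("Java (language)", ["Programming"], "java code programming language software"),
]


def fit_lsa(texts, k):
    """Build a term-document count matrix and keep the top-k singular triplets."""
    vocab = sorted({w for t in texts for w in t.split()})
    index = {w: i for i, w in enumerate(vocab)}
    A = np.zeros((len(vocab), len(texts)))
    for j, t in enumerate(texts):
        for w in t.split():
            A[index[w], j] += 1.0
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    U_k = U[:, :k]
    doc_vecs = Vt[:k, :].T * s[:k]  # rows are documents in the latent space
    return index, U_k, doc_vecs


def project(text, index, U_k):
    """Fold a short text into the same latent space (q^T U_k)."""
    q = np.zeros(U_k.shape[0])
    for w in text.split():
        if w in index:  # words outside the vocabulary are ignored
            q[index[w]] += 1.0
    return q @ U_k


def cosine(a, b):
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0


def recommend_tags(text, k=2, top_n=2):
    """Return (tag, score) pairs sorted by descending score."""
    index, U_k, doc_vecs = fit_lsa([a[2] for a in ARTICLES], k)
    q = project(text, index, U_k)
    sims = [cosine(q, d) for d in doc_vecs]
    best = sorted(range(len(ARTICLES)), key=lambda j: sims[j], reverse=True)[:top_n]
    scores = {}
    for j in best:
        title, categories, _ = ARTICLES[j]
        for tag in [title] + categories:
            # Assumed ranking rule: sum the similarity of every supporting article.
            scores[tag] = scores.get(tag, 0.0) + sims[j]
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


tags = recommend_tags("player goal match")
print(tags)
```

In this toy run the category shared by both retrieved articles accumulates the highest score, which shows why category tags can outrank individual titles under a summed-similarity rule.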
The approach was evaluated using experts' relevance ratings of 993 tags. Results showed that the approach achieved 84.39% mean average precision and 96.53% mean reciprocal rank. A thorough discussion of the results highlights the limitations and strengths of the approach.
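For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how mean average precision and mean reciprocal rank are commonly computed from ranked lists of binary relevance judgments. The function names and the sample judgments are hypothetical, and average precision is normalized here by the number of relevant tags found in each list, which is an assumption; the abstract does not specify the exact variant used in the study.

```python
def average_precision(relevance):
    """Average of precision@rank taken at each rank holding a relevant tag."""
    hits = 0
    total = 0.0
    for rank, is_relevant in enumerate(relevance, start=1):
        if is_relevant:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


def reciprocal_rank(relevance):
    """1 / rank of the first relevant tag, or 0 if none is relevant."""
    for rank, is_relevant in enumerate(relevance, start=1):
        if is_relevant:
            return 1.0 / rank
    return 0.0


def mean(values):
    values = list(values)
    return sum(values) / len(values) if values else 0.0


# Hypothetical expert judgments: one ranked list of tags per input text.
judged_lists = [
    [True, False, True],
    [False, True],
]

map_score = mean(average_precision(r) for r in judged_lists)
mrr_score = mean(reciprocal_rank(r) for r in judged_lists)
print(f"MAP = {map_score:.4f}, MRR = {mrr_score:.4f}")
```

Mean reciprocal rank only rewards the position of the first relevant tag, while mean average precision also reflects how relevant tags are spread over the rest of the list, which is one reason the two figures can differ noticeably.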

Keywords: Tag recommendation; Arabic; Short text; Latent semantic analysis; Wikipedia; Apache Spark







The articles in Bibliomed are open access articles licensed under Creative Commons Attribution 4.0 International License (CC BY), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.