The current study analyzes the lexico-semantic relationships of nouns in a Saraiki newspaper. Although spoken as a first language by almost 20 million people in Pakistan, Saraiki has only a limited corpus, in the form of the ijunoonSaraiki dictionary (2017). A functional corpus for Saraiki would help in developing a WordNet on which digital applications for Saraiki could be built. Of the four open-class categories included in WordNet, the present study explores the semantic relations found among Saraiki nouns. A corpus of 2 million words was created from the Saraiki newspaper Jhoke. After POS tagging, a list of 1,500 nouns was generated, and the nouns were semantically categorized using machine-readable dictionaries to identify the lexical relationships among them. Each lexical and semantic relation was quantitatively analyzed using AntConc 3.5.7. The 3A model of corpus linguistics was used for the annotation and analysis of Saraiki nouns (Wallis & Nelson, 2001). The results revealed ten lexico-semantic relationships frequently found among the nouns in the newspaper Jhoke, which are essential for the development of a lexical database such as WordNet. The analysis showed that the singular/plural relationship was the most frequent among the Saraiki nouns, while synonymy was the least frequent. Identifying the noun relationships of Saraiki is a step towards the development of a Saraiki WordNet that would support NLP and other digital applications.
Key words: Lexico-semantics, Nouns, Corpus linguistics, Saraiki, Newspaper.