publication name “Design and Implementation of Arabic Corpus”, Scientific Bulletin, Fac. of Eng., Ain-Shams Univ., Vol. 33, No. 2, 1998.
Authors A. El-Sammak
year 1998
keywords
journal
volume Not Available
issue Not Available
pages Not Available
publisher Not Available
Local/International Local
Paper Link Not Available
Supplementary materials Not Available
Abstract

An Arabic corpus is the infrastructure on which computerized Arabic language applications are built. The present work is motivated by the lack of a standard corpus for current modern Arabic. The proposed corpus design methodology is implemented and tested on a sample of 1.25 million words drawn from a 20-million-word corpus. The corpus data are morphologically analyzed to decompose words into their basic constituents, each supplemented with its linguistic information. The 20-million-word corpus is made up of texts from different sources: books, newspapers, magazines, technical reports, research theses, and leaflets covering a wide range of subject areas are included to represent a broad spectrum of Arabic as currently used. The output of the corpus consists of statistical and collocation information, together with retrieval of the concordance lines for one or more words.
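
The abstract does not say how the concordance lines or collocation information are computed. The following is only a minimal sketch of the two ideas, not the paper's method. It assumes the text is already split into tokens; the paper's morphological analysis is not modeled here. The function names `concordance` and `collocates`, the `width` and `window` parameters, and the English sample sentence are all hypothetical and chosen for illustration.

```python
from collections import Counter


def concordance(tokens, keyword, width=2):
    """Return (left context, keyword, right context) for each occurrence.

    `width` is the number of tokens kept on each side (an assumed parameter).
    """
    lines = []
    for i, tok in enumerate(tokens):
        if tok == keyword:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, tok, right))
    return lines


def collocates(tokens, keyword, window=2):
    """Count tokens occurring within `window` positions of the keyword."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == keyword:
            counts.update(tokens[max(0, i - window):i])      # left neighbours
            counts.update(tokens[i + 1:i + 1 + window])      # right neighbours
    return counts


# Hypothetical sample; whitespace splitting stands in for real tokenization.
sample = "the corpus is large and the corpus is balanced".split()

for left, word, right in concordance(sample, "corpus"):
    print(f"{left:>10} [{word}] {right}")

print(collocates(sample, "corpus").most_common(2))
```

Raw co-occurrence counts like these are the simplest form of collocation information; a real corpus tool would typically normalize them by word frequency, which this sketch does not attempt.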
