Publications

Text Classification of News Articles Using Machine Learning on Low-resourced Language: Tigrigna

Published in 3rd International Conference on Artificial Intelligence and Big Data (ICAIBD), 2020

Text categorization is an increasingly important method for tagging a textual document with its most relevant label. However, textual resources have not grown equally across languages; for Tigrigna, a low-resourced language with no freely available dataset, text categorization becomes an interesting problem. Our aim is to assign a given document to its category based on its linguistic features. To achieve this goal, we constructed a new dataset from different Tigrigna news sources. The dataset has six main categories: Agriculture, Sports, Health, Education, Religion, and Politics. Each collected article is preprocessed to remove Latin characters, punctuation, and stop words. We applied a collection of classical machine learning classifiers to investigate their effectiveness on our dataset. Seven popular classifiers were used, including Logistic Regression, Nearest Centroid, and Decision Tree (DT).

Effects of Light Stemming on Feature Extraction and Selection for Arabic Documents Classification

Published in Book - Part of the Studies in Computational Intelligence book series (SCI, volume 874), 2020

This chapter studies the effects of the light stemming technique on feature extraction, where Bag of Words (BoW) and Term Frequency-Inverse Document Frequency (TF-IDF) are employed for Arabic document classification. Moreover, feature selection methods such as Chi-square (Chi2), Information Gain (IG), and Singular Value Decomposition (SVD) are used to select the most relevant features. K-Nearest Neighbor (kNN), Logistic Regression (LR), and Support Vector Machine (SVM) classifiers are used to build the classification model.
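As a rough illustration of the two feature extraction schemes named above, the following sketch computes BoW counts and an unsmoothed textbook TF-IDF weighting in plain Python. It is not the chapter's implementation: the function names `bag_of_words` and `tf_idf`, the placeholder tokens, and the choice of idf = ln(N / df) are assumptions made for illustration, and the input is assumed to be already tokenized and stemmed.

```python
import math
from collections import Counter


def bag_of_words(docs):
    """Return one term-count Counter per tokenized document."""
    return [Counter(doc) for doc in docs]


def tf_idf(docs):
    """Weight each term by tf * idf, using the unsmoothed idf = ln(N / df)."""
    counts = bag_of_words(docs)
    n_docs = len(docs)
    # Document frequency: the number of documents that contain each term.
    df = Counter(term for c in counts for term in c)
    weights = []
    for c in counts:
        total = sum(c.values())
        weights.append(
            {t: (n / total) * math.log(n_docs / df[t]) for t, n in c.items()}
        )
    return weights


# Placeholder tokens standing in for stemmed Arabic words (assumption).
docs = [["a", "b", "a"], ["b", "c"]]
weights = tf_idf(docs)
```

A term that occurs in every document (here `"b"`) receives a weight of zero under this variant, which is the property that distinguishes TF-IDF from raw BoW counts. Library implementations often apply smoothing, so their values will differ from this sketch.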

Recent Advances in NLP: The Case of Arabic Language

Published in Book - Part of the Studies in Computational Intelligence book series (SCI, volume 874), 2019

In light of the rapid rise of new trends and applications in various natural language processing tasks, this book presents high-quality research in the field. Each chapter addresses a common challenge in a theoretical or applied aspect of intelligent natural language processing related to the Arabic language. Many challenges encountered during the development of the solutions can be resolved by incorporating language technology and artificial intelligence.

Multi-channel Embedding Convolutional Neural Network Model for Arabic Sentiment Classification

Published in ACM Transactions on Asian and Low-Resource Language Information Processing (TALLIP), 2019

In this paper, a multi-channel embedding convolutional neural network (MCE-CNN) is proposed to improve Arabic sentiment classification by learning sentiment features from different text domains and from the word and character n-gram levels. MCE-CNN encodes a combination of different pre-trained word embeddings into the embedding block at each embedding channel and trains these channels in parallel.

A Study of the Effects of Stemming Strategies on Arabic Document Classification

Published in IEEE Access, 2019

This paper studies the impact of stemming techniques, namely Information Science Research Institute (ISRI), Tashaphyne, and ARLStem, on Arabic document classification (DC). The classification algorithms used are Naïve Bayesian (NB), support vector machine (SVM), and K-nearest neighbors (KNN). In addition, chi-square feature selection is used to select the most relevant features.
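The chi-square selection step can be sketched as follows, assuming the standard 2x2 contingency formulation over term presence and class membership. This is an illustration rather than the paper's code: the helper names `chi_square` and `top_k_terms` are hypothetical, and the English tokens are placeholders for stemmed Arabic terms.

```python
def chi_square(docs, labels, term, target):
    """Chi-square statistic between presence of `term` and membership in `target`."""
    a = b = c = d = 0
    for tokens, label in zip(docs, labels):
        has_term = term in tokens
        in_class = label == target
        if has_term and in_class:
            a += 1  # term present, document in target class
        elif has_term:
            b += 1  # term present, document in another class
        elif in_class:
            c += 1  # term absent, document in target class
        else:
            d += 1  # term absent, document in another class
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    return n * (a * d - b * c) ** 2 / denom if denom else 0.0


def top_k_terms(docs, labels, target, k):
    """Return the k vocabulary terms with the highest chi-square score."""
    vocab = sorted({t for doc in docs for t in doc})
    ranked = sorted(
        vocab, key=lambda t: chi_square(docs, labels, t, target), reverse=True
    )
    return ranked[:k]


# Placeholder corpus (assumption): four tiny documents in two classes.
docs = [["goal", "match"], ["goal", "team"], ["vote", "team"], ["vote", "law"]]
labels = ["sports", "sports", "politics", "politics"]
selected = top_k_terms(docs, labels, "sports", 2)
```

Note that the statistic measures dependence in either direction, so a term that appears only outside the target class scores as highly as one that appears only inside it.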

Arabic Sentiment Classification Using Convolutional Neural Network and Differential Evolution Algorithm

Published in Computational Intelligence and Neuroscience, 2019

In this paper, we address this problem by combining the differential evolution (DE) algorithm and CNN, where the DE algorithm is used to automatically search for the optimal configuration, including the CNN architecture and network parameters. To achieve this goal, the DE algorithm searches five CNN parameters: the convolution filter sizes that control the CNN architecture, the number of filters per convolution filter size (NFCS), the number of neurons in the fully connected (FC) layer, the initialization mode, and the dropout rate.
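The search loop can be illustrated with a minimal DE sketch, assuming the common DE/rand/1/bin variant over a continuous box. It is not the paper's implementation: the control settings (`pop_size`, `f`, `cr`, `generations`) are arbitrary defaults, `toy_objective` is a stand-in for the validation error of a trained CNN, and decoding a five-dimensional vector into the five CNN parameters (including the categorical ones) is left out.

```python
import random


def differential_evolution(objective, bounds, pop_size=20, f=0.5, cr=0.9,
                           generations=100, seed=0):
    """Minimize `objective` over a box using DE/rand/1/bin."""
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    scores = [objective(ind) for ind in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Choose three distinct individuals, none of them the target i.
            r1, r2, r3 = rng.sample([j for j in range(pop_size) if j != i], 3)
            forced = rng.randrange(dim)  # at least one gene comes from the mutant
            trial = []
            for k, (lo, hi) in enumerate(bounds):
                if k == forced or rng.random() < cr:
                    gene = pop[r1][k] + f * (pop[r2][k] - pop[r3][k])
                    gene = min(max(gene, lo), hi)  # clip to the search box
                else:
                    gene = pop[i][k]
                trial.append(gene)
            trial_score = objective(trial)
            # Greedy selection: keep the trial only if it is no worse.
            if trial_score <= scores[i]:
                pop[i], scores[i] = trial, trial_score
    best = min(range(pop_size), key=scores.__getitem__)
    return pop[best], scores[best]


def toy_objective(vector):
    """Stand-in for CNN validation error (assumption): a shifted sphere."""
    return sum((x - 0.3) ** 2 for x in vector)


# Five dimensions, mirroring the five searched parameters (encoding is hypothetical).
bounds = [(0.0, 1.0)] * 5
best_vector, best_score = differential_evolution(toy_objective, bounds)
```

In a real search each objective call would train and evaluate a network, so the population size and generation count would be chosen with that cost in mind.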

Deep Knowledge Representation based on Compositional Semantics for Chinese Geography

Published in The 9th International Conference on Agents and Artificial Intelligence (ICAART), 2017

In this paper, we propose a novel directed acyclic graph (DAG) deep knowledge representation built upon the principle of compositional semantics. Knowledge is decomposed into nodes and edges, which are then inserted into the ontology knowledge base. Experimental results demonstrate the superiority of the proposed method on question answering, especially when the syntax of the question is complex and its representation is fuzzy.

Word Embeddings and Convolutional Neural Network for Arabic Sentiment Classification

Published in The 26th International Conference on Computational Linguistics, 2016

In this paper, a scheme for Arabic sentiment classification, which evaluates and detects the sentiment polarity of Arabic reviews and Arabic social media, is studied. We investigated several architectures for building high-quality neural word embeddings using a 3.4-billion-word corpus drawn from a collected 10-billion-word web-crawled corpus. Moreover, a convolutional neural network trained on top of pre-trained Arabic word embeddings is used for sentiment classification to evaluate the quality of these word embeddings.

Simulation Comparison and Analysis of DSR and DYMO Protocols in MANETs

Published in 2016 International Conference on Industrial Informatics and Computer Systems (CIICS), 2016

Ad hoc networks have opened a new dimension in wireless networking. They allow wireless nodes to communicate with each other in the absence of centralized support. They do not follow any fixed infrastructure because of the mobility of nodes and multi-path propagation. Link instability and node mobility make routing a core issue in mobile ad hoc networks (MANETs). A suitable and effective routing mechanism supports the successful deployment of MANETs.