Browsing by Author "Viriri, Serestina."
Now showing 1 - 20 of 22
Item Automatic dental caries detection in bitewing radiographs.(2022) Majanga, Vincent Idah.; Viriri, Serestina. Dental caries is one of the most prevalent chronic diseases around the globe. Distinguishing carious lesions has been a challenging task. Conventional computer-aided diagnosis and detection methods have heavily relied on visual inspection of teeth, which is only effective for large, clearly visible caries on affected teeth. Conventional methods have been limited in performance by the complex visual characteristics of dental caries images, which include hidden or inaccessible lesions. Early detection of dental caries is an important determinant for treatment and benefits greatly from the introduction of new tools such as dental radiography. A method for the segmentation of teeth in bitewing X-rays is presented in this thesis, as well as a method for the detection of dental caries on X-ray images using a supervised model. The proposed diagnostic method uses an assessment protocol that is evaluated according to a set of identifiers obtained from a learning model, and automatically detects hidden and inaccessible dental caries lesions in bitewing radiographs. Data augmentation was employed to increase the number of images in the data set to a total of 11,114 dental images. The data set was pre-processed with Gaussian blur filters. Image segmentation was handled through thresholding and erosion and dilation morphology, while image boundary detection was achieved through the active contours method. Furthermore, a deep learning network, built with the Keras sequential model, extracts features from the images through blob detection. Finally, a convexity threshold value of 0.9 is introduced to aid in classifying caries as either present or not present.
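As an illustrative aside, the convexity-threshold decision at the end of this pipeline can be sketched in plain Python/NumPy. The hull construction and the assumption that carious (irregular) blobs fall *below* the 0.9 convexity threshold are illustrative choices, not the thesis implementation:

```python
import numpy as np

def polygon_area(pts):
    # Shoelace formula; pts is an (n, 2) array of vertices in order.
    x, y = pts[:, 0], pts[:, 1]
    return 0.5 * abs(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1)))

def convex_hull(pts):
    # Andrew's monotone chain; returns hull vertices in CCW order.
    pts = sorted(map(tuple, pts))
    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])
    def half(points):
        h = []
        for p in points:
            while len(h) >= 2 and cross(h[-2], h[-1], p) <= 0:
                h.pop()
            h.append(p)
        return h
    lower, upper = half(pts), half(pts[::-1])
    return np.array(lower[:-1] + upper[:-1])

def caries_present(blob_pts, threshold=0.9):
    # Convexity = blob area / convex-hull area; an irregular blob
    # boundary drives the ratio below the threshold (assumed direction).
    blob_pts = np.asarray(blob_pts, dtype=float)
    convexity = polygon_area(blob_pts) / polygon_area(convex_hull(blob_pts))
    return convexity < threshold
```

A perfectly convex blob has convexity 1.0 and is classified as caries-free under this assumed rule.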
The relative efficacy of the supervised model in diagnosing dental caries, compared to current systems, is indicated by the results detailed in this thesis. The proposed model achieved 97% correct diagnoses, which is quite competitive with existing models.

Item Automatic lung segmentation using graph cut optimization.(2015) Oluyide, Oluwakorede Monica.; Viriri, Serestina.; Tapamo, Jules-Raymond. Medical imaging revolutionized the practice of diagnostic medicine by providing a means of visualizing the internal organs and structures of the body. Computer technologies have played an increasing role in the acquisition, handling, storage and transmission of these images. Due to further advances in computer technology, research efforts have turned towards adopting computers as assistants in detecting and diagnosing diseases, resulting in the incorporation of Computer-aided Detection (CAD) systems in medical practice. Computed Tomography (CT) images have been shown to improve accuracy of diagnosis in pulmonary imaging. Segmentation is an important preprocessing step necessary for high CAD performance. Lung segmentation is used to isolate the lungs for further analysis and has the advantage of reducing the search space and computation time involved in disease detection. This dissertation presents an automatic lung segmentation method using Graph Cut optimization. Graph Cut produces globally optimal solutions by modeling the image data and the spatial relationships among the pixels. Several objects in the thoracic CT image have pixel values similar to the lungs, so the global solutions of Graph Cut produce segmentation results in which the lungs, and all other objects of similar intensity, are included. A distance prior encoding the Euclidean distance of pixels from the set of pixels belonging to the object of interest is proposed to constrain the solution space of the Graph Cut algorithm.
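The distance prior just described can be sketched as a per-pixel penalty derived from the Euclidean distance to a set of seed pixels known to lie inside the object. This is a brute-force illustration (a distance transform would be used in practice), and folding it into the unary Graph Cut cost with a `weight` parameter is an assumption, not the thesis formulation:

```python
import numpy as np

def distance_prior(shape, seeds):
    # Euclidean distance from every pixel to its nearest seed pixel.
    rows, cols = np.indices(shape)
    d = np.full(shape, np.inf)
    for r, c in seeds:
        d = np.minimum(d, np.hypot(rows - r, cols - c))
    return d

def object_cost(intensity_cost, prior, weight=1.0):
    # Pixels far from the seeds pay an extra penalty for taking the
    # object label, suppressing distant objects of similar intensity.
    return intensity_cost + weight * prior
```

The constrained unary term then replaces the plain intensity-based t-link cost when building the graph.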
A segmentation method using the distance-constrained Graph Cut energy is also proposed to isolate the lungs in the image. The results indicate the suitability of the distance prior as a constraint for Graph Cut and show the effectiveness of the proposed segmentation method in accurately segmenting the lungs from a CT image.

Item Bi-modal biometrics authentication on iris and signature.(2010) Viriri, Serestina.; Tapamo, Jules-Raymond. Multi-modal biometrics is one of the most promising avenues for addressing the performance problems of biometrics-based personal authentication systems. While uni-modal biometric systems have improved personal authentication beyond traditional security methods, the main challenges remain the restricted degrees of freedom, non-universality and spoof attacks of the traits. In this research work, we investigate the performance improvement of bi-modal biometrics authentication systems based on a physiological trait, the iris, and a behavioral trait, the signature. We investigate a model to detect the largest non-occluded rectangular part of the iris as a region of interest (ROI), from which iris features are extracted by a cumulative-sums-based grey change analysis algorithm and Gabor filters. In order to address the space complexity of biometric systems, we propose two majority-vote-based algorithms which compute prototype iris feature codes as the reliable specimen templates. Experiments obtained a success rate of 99.6%. A text-based directional signature verification algorithm is investigated. The algorithm verifies signatures even when they are composed of symbols and special unconstrained cursive characters which are superimposed and embellished. The experimental results show that the proposed approach has an improved true positive rate of 94.95%. A user-specific weighting technique, the user-score-based technique, which is based on the different degrees of importance of the iris and signature traits of an individual, is proposed.
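The majority-vote computation of a prototype iris feature code can be sketched as follows. This is one plausible reading of a majority-vote prototype (each bit takes the value held by at least half of the enrolment samples), not the thesis algorithm itself:

```python
import numpy as np

def prototype_code(codes):
    # codes: (n_samples, n_bits) binary iris feature codes for one user.
    codes = np.asarray(codes)
    # A bit is 1 in the prototype when at least half the samples set it.
    return (2 * codes.sum(axis=0) >= codes.shape[0]).astype(int)
```

Storing one prototype per user instead of every enrolment template is what addresses the space complexity mentioned above.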
Then, an intelligent dual ν-support vector machine (2ν-SVM)-based fusion algorithm is used to integrate the weighted match scores of the iris and signature modalities at the matching score level. The bi-modal biometrics system obtained a false rejection rate (FRR) of 0.008 and a false acceptance rate (FAR) of 0.001.

Item Component-based ethnicity identification from facial images.(2017) Booysens, Aimée.; Viriri, Serestina. Abstract available in PDF file.

Item Deep learning for brain tumor segmentation and survival prediction.(2024) Magadza, Tirivangani Batanai Hendrix Takura.; Viriri, Serestina. A brain tumor is an abnormal growth of cells in the brain that multiply uncontrollably. Deaths due to brain tumors have increased over the past few decades. Early diagnosis of brain tumors is essential in improving treatment possibilities and increasing the survival rate of patients. The life expectancy of patients with glioblastoma multiforme (GBM), the most malignant glioma, under the current standard of care is, on average, 14 months after diagnosis despite aggressive surgery, radiation, and chemotherapies. Despite considerable efforts in brain tumor segmentation research, patient prognosis remains poor. Accurate segmentation of pathological regions may significantly impact treatment decisions, planning, and outcome monitoring. However, the large spatial and structural variability among brain tumors makes automatic segmentation a challenging problem, leaving brain tumor segmentation an open challenge that warrants further research endeavors. While several methods automatically segment brain tumors, deep learning methods are becoming widespread in medical imaging due to their strong performance. However, the boost in performance comes at the cost of high computational complexity.
Therefore, to improve the adoption rate of computer-assisted diagnosis in clinical setups, especially in developing countries, there is a need for more computation- and memory-efficient models. In this research, using few computational resources, we explore various techniques to develop accurate deep learning models for segmenting the different glioma sub-regions, namely the enhancing tumor, the tumor core, and the whole tumor. We quantitatively evaluate the performance of our proposed models against state-of-the-art methods using magnetic resonance imaging (MRI) datasets provided by the Brain Tumor Segmentation (BraTS) Challenge. Lastly, we use segmentation labels produced by the segmentation task together with multimodal MRI data to extract appropriate imaging/radiomic features to train a deep learning model for overall patient survival prediction.

Item Deep learning framework for speech emotion classification.(2024) Akinpelu, Samson Adebisi.; Viriri, Serestina. A robust deep learning-based approach for the recognition and classification of speech emotion is proposed in this research work. Emotion recognition and classification occupy a conspicuous position in human-computer interaction (HCI) and, by extension, help determine the reasons and justification for human action. Emotion also plays a critical role in decision-making. Distinguishing among the various emotions (anger, sadness, happiness, neutrality, disgust, fear, and surprise) present in speech signals has, however, been a long-standing challenge. Existing deep learning techniques have limitations arising from the complexity of features in human speech (sequential data): insufficient labelled datasets, noise and environmental factors, cross-cultural and linguistic differences, speaker variability, and temporal dynamics.
There is also a heavy reliance on tuning huge numbers of parameters, often millions, before the model can learn the emotional features necessary for classifying emotion, which often results in computational complexity, over-fitting, and poor generalization. This thesis presents an innovative deep learning framework-based approach for the recognition and classification of speech emotions. The deep learning techniques currently in use for speech emotion classification are exhaustively and analytically reviewed in this thesis. This research models various approaches and architectures based on deep learning to build a framework that is dependable and efficient for classifying emotions from speech signals. This research proposes a deep transfer learning model that addresses the shortcoming of inadequate training datasets for the classification of speech emotions. The research also models advanced deep transfer learning in conjunction with a feature selection algorithm to obtain more accurate speech emotion classification results. Speech emotion classification is further enhanced by combining regularized feature selection (RFS) techniques with attention-based networks, yielding a significant improvement in emotion recognition results. The problem of emotion misclassification is alleviated through the selection of salient features that are relevant to emotion classification from speech signals. By combining regularized feature selection with attention-based mechanisms, the model can better understand emotional complexities and outperform conventional machine learning emotion detection algorithms. The proposed approach is very resilient to background noise and cultural differences, which makes it suitable for real-world applications.
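One way to picture an attention mechanism over selected speech features is softmax-weighted pooling of frame-level features. This is a generic sketch only; the scoring vector `w`, the pooling form and the feature layout are assumptions, not the thesis architecture:

```python
import numpy as np

def attention_pool(frames, w):
    # frames: (T, d) frame-level speech features; w: (d,) learned
    # scoring vector. Softmax weights emphasise emotion-salient frames.
    scores = frames @ w
    a = np.exp(scores - scores.max())   # numerically stable softmax
    a /= a.sum()
    return a @ frames                   # (d,) utterance-level embedding
```

The pooled embedding would then feed a classifier head; selecting which feature dimensions enter the pool is where a feature selection step such as RFS could act.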
Having investigated the reasons behind the enormous computing resources required by many deep learning-based methods, the research proposes a lightweight deep learning approach that can be deployed on low-memory devices for speech emotion classification. A redesigned VGGNet with an overall model size of 7.94 MB is utilized, combined with the best-performing classifier (Random Forest). Extensive experiments and comparisons with other deep learning models (DenseNet, MobileNet, InceptionNet, and ResNet) over three publicly available speech emotion datasets show that the proposed lightweight model improves the performance of emotion classification with minimal parameter size. The research further devises a new method that minimizes computational complexity using a vision transformer (ViT) network for speech emotion classification. The ViT model's capabilities allow the mel-spectrogram input to be fed into the model, capturing spatial dependencies and high-level features from speech signals that are suitable indicators of emotional states. Finally, the research proposes a novel shifted-window transformer model for efficient classification of speech emotion on bilingual datasets. Because this method promotes feature reuse, it needs fewer parameters and works well with smaller datasets. The proposed model was evaluated using over 3000 speech emotion samples from the publicly available TESS, EMODB, EMOVO, and bilingual TESS-EMOVO datasets. The results showed 98.0%, 98.7%, and 97.0% accuracy, F1-score, and precision, respectively, across the 7 classes of emotion.

Item Detection and characterisation of vessels in retinal images.(2015) Mapayi, Temitope.; Viriri, Serestina.; Tapamo, Jules-Raymond. As retinopathies such as diabetic retinopathy (DR) and retinopathy of prematurity (ROP) continue to be major causes of blindness globally, regular retinal examinations of patients can assist in the early detection of these diseases.
The manual detection of retinal vessels is a very tedious and time-consuming task, as it requires about two hours to manually detect the vessels in each retinal image. Automatic vessel segmentation has been helpful in achieving speed, improved diagnosis and progress monitoring of these diseases, but has been challenging due to complexities such as the width of the retinal vessels varying from very large to very small, the low contrast of thin vessels with respect to the background, and noise due to non-homogeneous illumination in the retinal images. Although several supervised and unsupervised segmentation methods have been proposed in the literature, the segmentation of thinner vessels, connectivity loss of the vessels and time complexity remain the major challenges. In order to address these problems, this research work investigated different unsupervised segmentation approaches for the robust detection of large and thin retinal vessels in a time-efficient manner. Firstly, this thesis conducted a study on the use of different global thresholding techniques combined with different pre-processing and post-processing techniques. Two histogram-based global thresholding techniques, namely Otsu and Isodata, were able to detect large retinal vessels but failed to segment the thin vessels, because these thin vessels have very low contrast and are difficult to distinguish from the background tissue using the histogram of the retinal images. Two new multi-scale approaches for computing a global threshold, based on the inverse difference moment and sum entropy combined with phase congruence, were investigated to improve the detection of vessels. One of the findings of this study is that the multi-scale approaches to computing a global threshold combined with phase-congruence-based techniques improved the detection of large vessels and some of the thin vessels. They, however, failed to maintain the width of the detected vessels.
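Otsu's histogram-based global threshold, referenced above, can be sketched as an exhaustive search for the split that maximises between-class variance. This is a minimal NumPy illustration of the standard algorithm, not the thesis code:

```python
import numpy as np

def otsu_threshold(img, nbins=256):
    # Search every histogram split for maximal between-class variance.
    hist, edges = np.histogram(img, bins=nbins)
    p = hist / hist.sum()
    centers = (edges[:-1] + edges[1:]) / 2
    best_t, best_var = centers[0], -1.0
    for k in range(1, nbins):
        w0, w1 = p[:k].sum(), p[k:].sum()       # class probabilities
        if w0 == 0 or w1 == 0:
            continue
        m0 = (p[:k] * centers[:k]).sum() / w0   # class means
        m1 = (p[k:] * centers[k:]).sum() / w1
        var = w0 * w1 * (m0 - m1) ** 2          # between-class variance
        if var > best_var:
            best_var, best_t = var, centers[k]
    return best_t
```

As the abstract notes, such a single global threshold separates bright from dark well but cannot recover thin, low-contrast vessels whose intensities overlap the background histogram.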
The reduction in the width of the detected large and thin vessels resulted in low sensitivity rates, while relatively good accuracy rates were maintained. Another study, on the use of fuzzy c-means and GLCM sum entropy combined with phase congruence for vessel segmentation, showed that fuzzy c-means combined with phase congruence achieved higher average accuracy rates of 0.9431 and 0.9346 but a longer running time of 27.1 seconds when compared with the multi-scale sum-entropy thresholding combined with phase congruence, which achieved average accuracy rates of 0.9416 and 0.9318 with a running time of 10.3 seconds. The longer running time of fuzzy c-means compared with sum-entropy thresholding is attributed to the iterative nature of fuzzy c-means. When compared with the literature, both methods achieved considerably faster running times. This thesis investigated two novel local adaptive thresholding techniques for the segmentation of large and thin retinal vessels. The two techniques apply two different Haralick texture features, namely local homogeneity and energy. Although these two texture features have been applied to supervised image segmentation in the literature, their novelty in this thesis lies in their application within an unsupervised image segmentation approach. Each of these local adaptive thresholding techniques locally applies a multi-scale approach to the texture information, considering the pixel of interest in relation to its spatial neighbourhood, to compute the local adaptive threshold. The localised multi-scale approach to computing the thresholds handles the challenge of vessel width variation. Experiments showed significant improvements in the average accuracy and average sensitivity rates of these techniques when compared with the previously discussed global thresholding methods and the state of the art.
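The local adaptive idea, each pixel compared against a threshold computed from its own spatial neighbourhood, can be sketched as follows. This simplified version thresholds against the local-window mean; the thesis techniques use multi-scale Haralick homogeneity and energy features instead, so window size, offset and the vessels-are-darker assumption are all illustrative:

```python
import numpy as np

def local_adaptive_threshold(img, win=15, offset=0.02):
    # Flag pixels darker than (local mean - offset) as vessel.
    r = win // 2
    padded = np.pad(img, r, mode='reflect')
    out = np.zeros_like(img, dtype=bool)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            local = padded[i:i + win, j:j + win]  # win x win neighbourhood
            out[i, j] = img[i, j] < local.mean() - offset
    return out
```

Because the threshold follows the local background, a thin dark vessel survives even where a single global threshold would miss it.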
The two novel local adaptive thresholding techniques achieved a greater reduction of false vessels around the border of the optic disc when compared with some of the previous techniques in the literature. These techniques also achieved a highly improved computational time of 1.9 to 3.9 seconds to segment the vessels in each retinal image when compared with the state of the art. Hence, these two novel local adaptive thresholding techniques are proposed for the segmentation of the vessels in retinal images. This thesis further investigated the combination of a difference image and the k-means clustering technique for the segmentation of large and thin vessels in retinal images. The pre-processing phase computed a difference image, and the k-means clustering technique was used for vessel detection. While investigating this vessel segmentation method, this thesis established the need for a difference image that preserves the vessel details of the retinal image. Among the different low-pass filters investigated, the median filter yielded the best difference image for k-means segmentation of the retinal vessels. Experiments showed that median-filter-based difference images combined with the k-means clustering technique achieved higher average accuracy and average sensitivity rates when compared with the previously discussed global thresholding methods and the state of the art. The median-filter-based difference image combined with k-means clustering (that is, DIMDF) also achieved a greater reduction of false vessels around the border of the optic disc when compared with some previous techniques in the literature. This method also achieved a highly improved computational time of 3.4 to 4 seconds when compared with the literature. Hence, median-filter-based difference images combined with the k-means clustering technique are proposed for the segmentation of the vessels in retinal images.
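The difference-image-plus-k-means pipeline can be sketched in pure NumPy. The window size, the two-cluster 1-D k-means and the "background estimate minus original" form of the difference image are illustrative choices, not the DIMDF implementation:

```python
import numpy as np

def median_filter(img, win=5):
    # Median over a win x win neighbourhood (reflect-padded borders).
    r = win // 2
    p = np.pad(img, r, mode='reflect')
    out = np.empty_like(img, dtype=float)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = np.median(p[i:i + win, j:j + win])
    return out

def kmeans_vessels(img, iters=20):
    # Difference image: median-filtered background minus the original,
    # so dark vessels score high; then 2-class 1-D k-means on intensities.
    diff = median_filter(img) - img
    x = diff.ravel()
    c = np.array([x.min(), x.max()], dtype=float)  # initial centres
    for _ in range(iters):
        labels = np.abs(x[:, None] - c[None, :]).argmin(axis=1)
        for k in (0, 1):
            if np.any(labels == k):
                c[k] = x[labels == k].mean()
    # The cluster with the larger centre is the vessel class.
    return labels.reshape(img.shape) == np.argmax(c)
```
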
The characterisation of the detected vessels using a tortuosity measure was also investigated in this research. Although several vessel tortuosity methods have been discussed in the literature, there is still a need for an improved method that efficiently detects vessel tortuosity. The experimental study conducted in this research showed that the detection of stationary points helps in detecting changes of direction and twists in the vessels. The combination of the vessel twist frequency, obtained using the stationary points, and a distance metric for the computation of normalised and non-normalised tortuosity index (TI) measures was investigated. Experimental results showed that the non-normalised TI measure had a stronger correlation with the experts' ground truth than the distance metric and normalised TI measures. Hence, a non-normalised TI measure that combines the vessel twist frequency based on the stationary points with a distance metric is proposed for the measurement of vessel tortuosity.

Item Exploration of ear biometrics with deep learning.(2024) Booysens, Aimee Anne.; Viriri, Serestina. Biometrics is the recognition of a human using biometric characteristics, which may be physiological or behavioural, for identification. Numerous models have been proposed to distinguish biometric traits used in multiple applications, such as forensic investigations and security systems. During the COVID-19 pandemic, facial recognition systems failed because users wore masks; human ear recognition proved more suitable, as the ear remains visible. This thesis explores efficient deep learning-based models for accurate ear biometrics recognition. The ears were extracted and identified from 2D profile and facial images, focusing on both left and right ears. Numerous datasets were used, notably the BEAR, EarVN1.0, IIT, ITWE and AWE databases.
Many machine learning techniques were explored, such as Naïve Bayes, Decision Tree and K-Nearest Neighbour, alongside innovative deep learning techniques: a Transformer network architecture, lightweight deep learning with model compression, and EfficientNet. The experimental results showed that the Transformer network achieved high accuracies of 92.60% and 92.56% with 50 and 90 epochs, respectively. The proposed ReducedFireNet model reduces the input size and increases computation time, but it detects more robust ear features. The EfficientNet variant B8 achieved a classification accuracy of 98.45%. These results exceed those of other works, whose highest reported accuracy is 98.00%. The overall results showed that deep learning models can improve ear biometrics recognition when both ears are computed.

Item Facial expression recognition and intensity estimation.(2022) Ekundayo, Olufisayo Sunday.; Viriri, Serestina. Facial expression is one of the profound non-verbal channels through which the human emotional state is inferred from the deformation or movement of face components when facial muscles are activated. Facial Expression Recognition (FER) is one of the relevant research fields in Computer Vision (CV) and Human-Computer Interaction (HCI). Its applications include robotics, gaming, medicine, education, security and marketing. FER carries a wealth of information, and categorising that information into primary emotion states only limits its performance. This thesis investigates an approach that simultaneously predicts the emotional state of facial expression images and the corresponding degree of intensity. The task also extends to resolving FER's ambiguous nature and annotation inconsistencies with a label distribution learning method that considers correlation among data. We first propose a multi-label approach for FER and its intensity estimation using advanced machine learning techniques.
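The multi-label formulation can be pictured as independent sigmoid outputs, one per label, thresholded so that a single image yields both an emotion label and an intensity label. The label names and the 0.5 threshold here are illustrative assumptions, not the thesis configuration:

```python
import numpy as np

def multilabel_predict(logits, labels, thresh=0.5):
    # A sigmoid per label (rather than a softmax over classes) lets an
    # image carry several labels at once, e.g. an emotion + an intensity.
    probs = 1.0 / (1.0 + np.exp(-np.asarray(logits, dtype=float)))
    return [lab for lab, p in zip(labels, probs) if p >= thresh]
```

This is the key difference from single-label FER: the final layer's outputs are not forced to compete, so "happy" and "high intensity" can both fire.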
According to our findings, this approach had not previously been considered for emotion and intensity estimation in the field. The approach uses problem transformation to present FER as a multi-label task, such that every facial expression image has unique emotion information alongside the corresponding degree of intensity at which the emotion is displayed. A Convolutional Neural Network (CNN) with a sigmoid function at the final layer is the classifier for the model. The model, termed ML-CNN (Multi-label Convolutional Neural Network), successfully achieves concurrent prediction of emotion and intensity estimation. ML-CNN's prediction is challenged by overfitting and by intra-class and inter-class variations. We employ the Visual Geometry Group-16 (VGG-16) pretrained network to resolve the overfitting challenge, and the aggregation of island loss and binary cross-entropy loss to minimise the effect of intra-class and inter-class variations. The enhanced ML-CNN model shows promising results and outperforms other standard multi-label algorithms. Finally, we address data annotation inconsistency and ambiguity in FER data using isomap manifold learning with Graph Convolutional Networks (GCN). The GCN uses the distance along the isomap manifold as the edge weight, which appropriately models the similarity between adjacent nodes for emotion prediction. The proposed method produces promising results in comparison with state-of-the-art methods.

Item Gender classification using facial components.(2018) Bayana, Mayibongwe Handy.; Viriri, Serestina.; Angulu, Raphael. Gender classification is very important in facial analysis as it can be used as input into a number of systems, such as face recognition. Humans can classify gender with great accuracy; however, passing this ability to machines is a complex task because of many variables, such as lighting, to mention a few.
For the purposes of this research, gender classification is approached as a binary problem involving two classes, male and female. Two datasets are used: the FG-NET dataset and the Pilot Parliaments Benchmark (PPB). Two appearance-based feature extractors are used, the LBP and the LDP, with the Active Shape Model (ASM) included through fusion. The classifiers used are a Support Vector Machine with a Radial Basis Function kernel and an Artificial Neural Network with backpropagation. On FG-NET, an average detection rate of 90.6% was achieved, against 87.5% on the PPB. Gender is then detected from individual facial components, such as the nose and eyes. The forehead recorded the highest accuracy with 92%, followed by the nose with 90%, the cheeks with 89.2%, the eyes with 87%, and the mouth with the lowest accuracy of 75%. Feature fusion is then carried out to improve classification accuracies, especially those of the mouth and eyes, which had the lowest accuracies. Fusing the eyes (87%) with the forehead (92%) increases the accuracy to 93%; fusing the mouth (75%) with the nose (90%) yields 87%. This fusion, carried out through addition, showed improved results. Fusion is then carried out between appearance-based and shape-based features. On the FG-NET dataset, the LBP and LDP achieve accuracies of 85.33% and 89.53% respectively, with the PPB recording 83.13% and 89.3% for the LBP and LDP respectively. As expected, and as shown by previous researchers, the LDP clearly obtains higher classification accuracies than the LBP because it uses gradient rather than pixel intensity. The LBP and LDP vectors are then fused with that of the ASM, followed by dimensionality reduction and fusion by addition.
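For reference, the basic 3×3 LBP descriptor used here can be sketched as follows. This is the standard formulation; the bit ordering is a common convention, not taken from the thesis:

```python
import numpy as np

def lbp_image(img):
    # Basic 3x3 Local Binary Pattern: each of the 8 neighbours
    # contributes one bit, set when the neighbour >= the centre pixel.
    offs = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
            (1, 1), (1, 0), (1, -1), (0, -1)]
    h, w = img.shape
    out = np.zeros((h - 2, w - 2), dtype=np.uint8)
    for bit, (di, dj) in enumerate(offs):
        nb = img[1 + di:h - 1 + di, 1 + dj:w - 1 + dj]  # shifted view
        out |= (nb >= img[1:-1, 1:-1]).astype(np.uint8) << bit
    return out
```

LDP differs in that it thresholds directional edge responses (Kirsch-style gradients) instead of raw pixel intensities, which is why it tolerates illumination changes better.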
On the PPB dataset, fusion of the LDP and ASM records 81.56% and 94.53%, with FG-NET recording 89.53%.

Item Handwritten signature verification using locally optimized distance-based classification.(2012) Moolla, Yaseen.; Viriri, Serestina.; Nelwamondo, Fulufhelo Vincent.; Tapamo, Jules-Raymond. Although handwritten signature verification has been extensively researched, it has not achieved optimal accuracy rates. Efficient and accurate signature verification techniques are therefore required, since signatures are still widely used as a means of personal verification. This research work presents efficient distance-based classification techniques as an alternative to supervised learning classification techniques (SLTs). Two different feature extraction techniques were used, namely the Enhanced Modified Direction Feature (EMDF) and the Local Directional Pattern feature (LDP). These were used to analyze the effect of using several different distance-based classification techniques, among them the cosine similarity measure and the Mahalanobis, Canberra, Manhattan, Euclidean, weighted Euclidean and fractional distances. Additionally, novel weighted fractional distances, as well as locally optimized resampling of feature vector sizes, were tested. The best accuracy was achieved by applying a combination of the weighted fractional distances and locally optimized resampling classification techniques to the Local Directional Pattern features. This combination of multiple distance-based classification techniques achieved an accuracy rate of 89.2% when using the EMDF feature extraction technique, and 90.8% when using the LDP feature extraction technique. These results are comparable to those in the literature, where the same feature extraction techniques were classified with SLTs.
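The weighted fractional distance at the heart of such a classifier can be sketched as below. Using f < 1 gives a fractional norm, which tends to stay discriminative in high-dimensional feature spaces; the per-dimension weights `w` and the value of `f` here are illustrative, not the thesis parameters:

```python
import numpy as np

def weighted_fractional_distance(x, y, w, f=0.5):
    # d(x, y) = (sum_i w_i * |x_i - y_i|**f) ** (1/f), with 0 < f < 1.
    x, y, w = (np.asarray(a, dtype=float) for a in (x, y, w))
    return np.power(np.sum(w * np.abs(x - y) ** f), 1.0 / f)
```

Verification then reduces to comparing this distance between a questioned signature's feature vector and the enrolled reference templates against an acceptance threshold.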
The best of the distance-based classification techniques were found to produce greater accuracy than the SLTs.

Item Hybrid component-based face recognition.(2018) Gumede, Andile Martin.; Viriri, Serestina.; Gwetu, Mandlenkosi. Facial recognition (FR) is a trusted biometric method for authentication. Compared to other biometrics, such as the signature, which can be compromised, facial recognition is non-intrusive and can be captured at a distance in a concealed manner. It has a significant role in conveying the identity of a person in social interaction, and its performance largely depends on a variety of factors such as illumination, facial pose, expression, age span, hair, facial wear, and motion. In light of these considerations, this dissertation proposes a hybrid component-based approach that seeks to utilise any successfully detected components. This research proposes a facial recognition technique that recognizes faces at the component level. It employs the texture descriptors Grey-Level Co-occurrence Matrix (GLCM), Gabor filters, Speeded-Up Robust Features (SURF) and the Scale-Invariant Feature Transform (SIFT), and the shape descriptor Zernike Moments. The advantage of using the texture attributes is their simplicity; however, they cannot completely characterise the whole face, hence the Zernike Moments descriptor was used to compute the shape properties of the selected facial components. These descriptors are effective facial component feature representations and are robust to illumination and pose changes. Experiments were performed on four state-of-the-art facial databases (FERET, FEI, SCface and CMU), and Error-Correcting Output Codes (ECOC) were used for classification. The results show that component-based facial recognition is more effective than whole-face recognition, with the proposed methods achieving a recognition accuracy of 98.75%.
This approach performs well compared to other component-based facial recognition approaches.

Item Leaf recognition for accurate plant classification.(2017) Kala, Jules Raymond.; Viriri, Serestina.; Moodley, Deshendran. Plants are among the most important living organisms on our planet because they are sources of energy and protect our planet against global warming. Botanists were the first scientists to design techniques for plant species recognition using leaves. Although many techniques for plant recognition using leaf images have been proposed in the literature, the precision and quality of feature descriptors for shape, texture and color remain the major challenges. This thesis investigates the precision of geometric shape feature extraction and improves the determination of the Minimum Bounding Rectangle (MBR). The proposed improved MBR determination method is compared to Chaudhuri's method using the Mean Absolute Error (MAE) generated by each method at each corner point of the MBR. At the top-left point of the determined MBR, Chaudhuri's method has an MAE of 26.37, while the proposed method has an MAE of 8.14. This thesis also investigates the use of the Convexity Measure of Polygons for characterising the degree of convexity of a given leaf shape. Promising results are obtained when the Convexity Measure of Polygons is combined with other geometric features to characterise leaf images, and a classification rate of 92% was obtained with a Multilayer Perceptron Neural Network classifier. After observing the limitations of the Convexity Measure of Polygons, a new shape feature called the Convexity Moments of Polygons is presented in this thesis. This new feature has the invariance properties of the Convexity Measure of Polygons but is more precise, because it uses more than one value to characterise the degree of convexity of a given shape.
Promising results are obtained when using the Convexity Moments of Polygons combined with other geometric features to characterize the leaf images, and a classification rate of 95% was obtained with the Multilayer Perceptron Neural Network classifier. Leaf boundaries carry valuable information that can be used to distinguish between plant species. In this thesis, a new boundary-based shape characterization method called Sinuosity Coefficients is proposed. The underlying concept has been used in many fields of science, such as geography, to describe the meandering of rivers. The Sinuosity Coefficients are scale- and translation-invariant. Promising results are obtained when using Sinuosity Coefficients combined with other geometric features to characterize the leaf images; a classification rate of 80% was obtained with the Multilayer Perceptron Neural Network classifier. Finally, this thesis implements a model for plant classification using leaf images, where an input leaf image is described using the Convexity Moments, the Sinuosity Coefficients and the geometric features to generate a feature vector for the recognition of plant species using a Radial Basis Neural Network. With the model designed and implemented, an overall classification rate of 97% was obtained.Item Liver segmentation using 3D CT scans.(2018) Hiraman, Anura.; Viriri, Serestina.; Gwetu, Mandlenkosi.Abstract available in PDF file.Item Modelling of artificial intelligence based demand side management techniques for mitigating energy poverty in smart grids.(2018) Monyei, Chukwuka Gideon.; Viriri, Serestina.This research work proposes an artificial intelligence (AI) based model for smart grid initiatives (for South Africa and, by extension, sub-Saharan Africa (SSA)) and further incorporates energy justice principles.
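Returning to the Sinuosity Coefficients introduced in the leaf-recognition abstract above: sinuosity is conventionally the ratio of a curve's arc length to the straight-line distance between its endpoints, which is how river meandering is quantified in geography. Assuming that definition (the thesis's exact coefficient construction is not given in the abstract), a minimal sketch:

```python
import numpy as np

def sinuosity(boundary):
    """Path length of a boundary segment divided by its endpoint distance.
    Equals 1.0 for a straight segment; grows with lobing or serration."""
    steps = np.diff(boundary, axis=0)
    path = np.sum(np.hypot(steps[:, 0], steps[:, 1]))
    chord = np.hypot(*(boundary[-1] - boundary[0]))
    return path / chord

# hypothetical edges of equal span: one straight, one serrated
straight = np.array([[0.0, 0.0], [4.0, 0.0]])
toothed = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 0.0], [3.0, 1.0], [4.0, 0.0]])
print(sinuosity(straight), round(sinuosity(toothed), 3))   # → 1.0 1.414
```

The ratio is unchanged by translating or uniformly scaling the boundary, which matches the scale- and translation-invariance claimed for the feature.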
Spanning the social, technical, economic, environmental and policy dimensions, and the overall impact, of smart and just electricity grids, this research begins by investigating declining electricity consumption and demand side management (DSM) potential across South Africa. In addition, technical frameworks such as the combined energy management system (CEMS), co-ordinated centralized energy management system (ConCEMS) and biased load manager home energy management system (BLM-HEMS) are modelled. These systems provide for the integration of all aspects of the electricity grid and their optimization, achieving cost reduction for both the utility and consumers as well as improvement in the consumers' quality of life (QoL) and reduction of emissions. On the policy and economic side, this research further proposes and models an integrated electrification and expansion model (IEEM) for South Africa, and also addresses the issue of rural marginalization due to poor electricity access for off-grid communities. This is done by proposing a hybrid generation scheme (HGS), which is shown to sufficiently satisfy the requirements of the energy justice framework while significantly reducing the energy burden of households and cutting carbon emissions by over 70%.Item Multi-level parallelization for accurate and fast medical image retrieval.(2016) Chikamai, Keith Sasala.; Viriri, Serestina.; Tapamo, Jules-Raymond.Breast cancer is the most prevalent form of cancer diagnosed in women. Mammograms offer the best option for detecting the disease early, which allows early treatment and, by implication, a favorable prognosis. The Content-based Medical Image Retrieval (CBMIR) technique is increasingly gaining research attention as a Computer-Aided Diagnosis (CAD) approach for breast cancer diagnosis. Such systems work by availing mammogram images that are pathologically similar to a given query example, which are used to support the diagnostic decision on a referential basis.
In most cases, the query is of the form “return k images similar to the specified query image”. Similarity in the Content-based Image Retrieval (CBIR) context is based on the content of images, rather than text or keywords. The essence of CBIR systems is to enable indexing of pictorial content in databases and to eliminate the drawbacks of manual annotation. CBMIR is a relatively young technology that is yet to gain widespread use. One major challenge for CBMIR systems is bridging the “semantic gap” in the description of image content. The semantic gap describes the discord between the notion of similarity as perceived by humans and as computed by CBMIR systems. Low accuracy concerns inhibit the full adoption of CBMIR systems into regular practice, with research focusing on improving the accuracy of CBMIR systems. Nonetheless, the area is still an open problem. As a contribution towards improving the accuracy of CBMIR for mammogram images, this work proposes a novel feature modeling technique for CBMIR systems based on classifier scores and standard statistical calculations on the same. A set of gradient-based filters is first used to highlight possible calcification objects; an entropy-based thresholding technique is then used to segment the calcifications from the background. Experimental results show that the proposed model achieves a 100% detection rate, which shows the effectiveness of combining the likelihood maps from various filters in detecting calcification objects. Feature extraction considers established textural and geometric features, which are calculated from the detected calcification objects; these are then used to generate secondary features using the Support Vector Machine and Quadratic Discriminant Analysis classifiers. The model is validated through a range of benchmarks, and is shown to perform competitively in comparison to similar works.
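As an aside on the entropy-based thresholding step mentioned above: the abstract does not name the exact formulation, so the sketch below assumes a Kapur-style maximum-entropy threshold, which picks the grey level that maximises the summed entropies of the below- and above-threshold histograms. The synthetic patch and its values are illustrative, not thesis data.

```python
import numpy as np

def _entropy(q):
    """Shannon entropy of a normalised distribution (zero bins ignored)."""
    q = q[q > 0]
    return float(-np.sum(q * np.log(q)))

def kapur_threshold(image, levels=256):
    """Kapur's maximum-entropy threshold over a grey-level histogram."""
    hist = np.bincount(image.ravel(), minlength=levels).astype(float)
    p = hist / hist.sum()
    best_t, best_h = 0, -np.inf
    for t in range(1, levels):
        pb, pf = p[:t].sum(), p[t:].sum()
        if pb == 0 or pf == 0:
            continue
        h = _entropy(p[:t] / pb) + _entropy(p[t:] / pf)
        if h > best_h:
            best_t, best_h = t, h
    return best_t

# synthetic patch: near-uniform dark background, a few bright "calcifications"
img = np.full((32, 32), 30)
img[::4, :] = 25                  # mild background texture
img[10:13, 10:13] = 220           # bright 3x3 cluster
t = kapur_threshold(img)
mask = img >= t
print(t, int(mask.sum()))         # → 31 9
```

In the pipeline described above, such a threshold would be applied after the gradient-based filters have highlighted candidate calcification objects.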
Specifically, it scores 95%, 82%, 78%, and 98% on the accuracy, positive predictive value, sensitivity and specificity benchmarks respectively. Parallel computing is applied to the task of feature extraction to show its viability in reducing the cost of extracting features. This research considers two technologies for implementation: distributed computing using the Message Passing Interface (MPI) and multicore computing using OpenMP threads. Both technologies involve the division of tasks to facilitate sharing of the computational burden in order to reduce the overall time cost. Communication cost is one penalty of parallel systems and a significant design target where the efficiency of parallel models is concerned. This research focuses on mitigating the communication overhead to increase the efficacy of parallel computation; it proposes an adaptive task assignment model, dependent on network bandwidth, for the parallel extraction of features. Experimental results report speedup values of between 4.7x and 10.4x, and efficiency values of between 0.11 and 0.62. Both the speedup and efficiency values increase with the database size. The proposed adaptive assignment of tasks positively impacts the speedup and efficiency performance of the parallel model. All experiments are based on the Mammographic Image Analysis Society (MIAS) database, which is publicly available and has been widely used in related works. The results achieved for both the mammogram pathology-based retrieval model and its computational efficiency met the objectives set for the research. In the domain of breast cancer applications, the models proposed in this work should positively contribute to the improvement of retrieval results of computer-aided diagnosis/detection systems, where applicable.
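The speedup and efficiency figures quoted above follow the standard definitions: speedup is the serial time divided by the parallel time, and efficiency is the speedup divided by the number of workers. A minimal sketch with hypothetical timings (the times and worker count are illustrative, not measurements from the thesis):

```python
def speedup(t_serial, t_parallel):
    """How many times faster the parallel run is than the serial one."""
    return t_serial / t_parallel

def efficiency(t_serial, t_parallel, n_workers):
    """Speedup normalised by worker count; 1.0 means ideal linear scaling."""
    return speedup(t_serial, t_parallel) / n_workers

# hypothetical timings (seconds) for parallel feature extraction
t1, tp, workers = 104.0, 10.0, 16
print(speedup(t1, tp), efficiency(t1, tp, workers))   # → 10.4 0.65
```

Communication overhead is why efficiency falls below 1.0 in practice, which is what the adaptive, bandwidth-dependent task assignment described above aims to mitigate.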
The improved accuracy should lead to higher acceptance of such systems by radiologists, enhancing the quality of diagnosis both by reducing decision-making time and by improving the accuracy of the entire diagnostic process.Item Pancreatic cancer survival prediction using Deep learning techniques.(2023) Bakasa, Wilson.; Viriri, Serestina.Abstract available in PDF.Item A patch-based convolutional neural network for localized MRI brain segmentation.(2020) Vambe, Trevor Constantine.; Viriri, Serestina.; Gwetu, Mandlenkosi Victor.Accurate segmentation of the brain is an important prerequisite for effective diagnosis, treatment planning, and patient monitoring. The use of manual Magnetic Resonance Imaging (MRI) segmentation in treating brain medical conditions is slowly being phased out in favour of fully-automated and semi-automated segmentation algorithms, which are more efficient and objective. Manual segmentation has, however, remained the gold standard for supervised training in image segmentation. The advent of deep learning ushered in a new era in image segmentation, object detection, and image classification. The convolutional neural network has contributed the most to the success of deep learning models. Also, the availability of increased training data when using Patch-Based Segmentation (PBS) has facilitated improved neural network performance. On the other hand, even though deep learning models have achieved successful results, they still suffer from over-segmentation and under-segmentation due to several reasons, including visually unclear object boundaries. Even though there have been significant improvements, there is still room for better results, as all proposed algorithms still fall short of a 100% accuracy rate. In the present study, experiments were carried out to improve the performance of neural network models used in previous studies.
The revised algorithm was then used for segmenting the brain into three regions of interest: White Matter (WM), Grey Matter (GM), and Cerebrospinal Fluid (CSF). Particular emphasis was placed on localized component-based segmentation, because both disease diagnosis and treatment planning require localized information, and there is a need to improve local segmentation results, especially for small components. In the evaluation of the segmentation results, several metrics indicated the effectiveness of the localized approach. The localized segmentation resulted in the accuracy, recall, precision, null-error, false-positive rate, true-positive rate and F1-score changing by 1.08%, 2.52%, 5.43%, 16.79%, -8.94%, 8.94% and 3.39% respectively. Also, when the algorithm was compared against state-of-the-art algorithms, the proposed algorithm had an average predictive accuracy of 94.56%, while the next best algorithm had an accuracy of 90.83%.Item Retinal blood vessel segmentation using random forest Gabor feature selection and automatic thresholding.(2019) Gwetu, Mandlenkosi Victor.; Tapamo, Jules-Raymond.; Viriri, Serestina.Successful computer aided diagnosis of ocular diseases is normally dependent on the accurate detection of components such as blood vessels, optic disk, fovea and microaneurysms. The properties of these components can be indicative of the presence and/or severity of pathology. Since most prevalent forms of ocular diseases emanate from vascular disorders, it is expected that accurate detection of blood vessels is essential for ocular diagnosis. In this research work, we investigate several opportunities for improvement of retinal blood vessel segmentation with the hope that they will ultimately lead to improvement in the diagnosis of vascular related ocular diseases. We complement existing work in this domain by introducing new Gabor filter features and selecting the most effective of these using Random Forest feature selection.
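The localized-segmentation metrics reported for the MRI brain work above (accuracy, recall, precision, false-positive rate, F1-score) all derive from pixel-level confusion counts. A hedged sketch of those standard definitions, with hypothetical counts rather than thesis data:

```python
def segmentation_metrics(tp, fp, tn, fn):
    """Standard per-class metrics from pixel-level confusion counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)               # true-positive rate / sensitivity
    fpr = fp / (fp + tn)                  # false-positive rate
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, fpr, f1

# hypothetical counts for one tissue class (e.g. grey matter)
acc, prec, rec, fpr, f1 = segmentation_metrics(tp=900, fp=50, tn=980, fn=70)
print(round(acc, 3), round(rec, 3))       # → 0.94 0.928
```

A drop in false-positive rate alongside rises in the other metrics, as reported above, indicates fewer background pixels mislabelled as tissue.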
The actual segmentation of blood vessels is then done using an improved automatic thresholding scheme based on the preferred Gabor feature. We propose Random Forest (RF) feature ranking algorithms that demonstrate reliable feature set partitions over several University of California, Irvine (UCI) datasets. To circumvent instances of unreliable rankings, we also propose feature rank and RF strength correlation as an alternative indicator. Of the four proposed Gabor features, the maximum magnitude response is confirmed as the most effective, in line with the general trend in previous literature. The proposed Selective Valley Emphasis thresholding technique achieves segmentation results identical to the legacy approach while improving on computational efficiency. Sensitivity and specificity outcomes of up to 76.8% and 97.9%, as well as 78.8% and 97.8%, are achieved on the DRIVE and STARE datasets, respectively.
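The maximum-magnitude Gabor response found most effective above can be sketched as follows: filter the image at several orientations and keep, per pixel, the largest absolute response, so a vessel is captured whichever direction it runs. The kernel size, wavelength, sigma and orientation count below are illustrative assumptions, and a naive convolution stands in for whatever filtering backend the thesis used.

```python
import numpy as np

def gabor_kernel(size, theta, lam=4.0, sigma=2.0, gamma=0.5):
    """Real-valued Gabor kernel at orientation theta (radians)."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    yr = -x * np.sin(theta) + y * np.cos(theta)
    return (np.exp(-(xr**2 + (gamma * yr)**2) / (2 * sigma**2))
            * np.cos(2 * np.pi * xr / lam))

def max_magnitude_response(image, n_orient=8, size=7):
    """Per-pixel maximum |response| over a bank of oriented Gabor filters."""
    h, w = image.shape
    half = size // 2
    padded = np.pad(image, half, mode="edge")
    out = np.zeros((h, w))
    for k in range(n_orient):
        kern = gabor_kernel(size, theta=k * np.pi / n_orient)
        resp = np.zeros((h, w))
        for i in range(h):                # naive sliding-window convolution
            for j in range(w):
                resp[i, j] = np.sum(padded[i:i + size, j:j + size] * kern)
        out = np.maximum(out, np.abs(resp))
    return out

# toy image with one dark vertical "vessel"
img = np.ones((16, 16))
img[:, 8] = 0.0
resp = max_magnitude_response(img)
print(resp.shape)                         # → (16, 16)
```

In the thesis, the Selective Valley Emphasis threshold is then applied to a feature map of this kind; that thresholding step is not reproduced here.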
The attention-modified networks are trained and validated on the publicly available KITTI and BDD100k datasets. Comparisons are made between the models and the baseline. An improvement over the baseline is observed, with GAM attention achieving an accuracy rate of 93.3% on the KITTI dataset and 71.1% on the BDD100k dataset. The attention modules generally achieved incremental improvements over the baseline.
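The abstract does not detail the attention modules' internals, so as a generic, hypothetical illustration of channel attention (in the spirit of squeeze-and-excitation, not a reimplementation of GAM or of the YOLO architectures), the sketch below gates the channels of a feature map with a small two-layer network; the random weights stand in for learned parameters.

```python
import numpy as np

def channel_attention(feature_map, w1, w2):
    """Squeeze-and-excitation-style channel attention on a (C, H, W) map:
    global-average-pool each channel, pass through a bottleneck gate,
    then rescale the channels by the resulting weights."""
    squeeze = feature_map.mean(axis=(1, 2))          # (C,) channel descriptor
    hidden = np.maximum(0, w1 @ squeeze)             # ReLU bottleneck
    gate = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))      # sigmoid weights in (0, 1)
    return feature_map * gate[:, None, None]         # reweight channels

rng = np.random.default_rng(1)
fmap = rng.standard_normal((8, 4, 4))                # toy feature map
w1 = rng.standard_normal((2, 8))                     # reduction ratio 4
w2 = rng.standard_normal((8, 2))
out = channel_attention(fmap, w1, w2)
print(out.shape)                                     # → (8, 4, 4)
```

Because the gate lies in (0, 1), non-distinctive channels are suppressed rather than amplified, which is the suppression behaviour the abstract attributes to the attention modules.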