Doctoral Degrees (Computer Science)

Recent Submissions

  • Item
    Search and selection methods for hyper-heuristics.
    (2018) Akandwanaho, Stephen Mugisha.; Petruccione, Francesco.; Sinayskiy, Ilya.
    Hyper-heuristics (HHs) search a space of heuristics for an optimal heuristic that can be mapped onto a problem to generate good solutions. One of the heuristic selection methods used by hyper-heuristics is a choice function (CF), which assigns scores to heuristics according to their performance. An investigation is conducted on a choice function for single-point selective hyper-heuristics. The drawbacks of the existing choice function are the discarding of heuristics due to poor performance on one problem when they could perform well on another, and premature convergence in the heuristic search space. To address these drawbacks, a new selection method called an efficient choice function (ECF) is introduced, based on a three-pronged improvement approach. Firstly, a new element is introduced, which collects previously poor-performing heuristics. The best heuristic of the poorly performing heuristics is obtained and compared with the best heuristic from the general pool of heuristics. Secondly, the pairwise comparison of the best heuristics from both the poorly performing heuristics and the general pool of heuristics is applied at every point in the iteration to maintain competition throughout the search process and generate high-quality outcomes. Thirdly, another element is introduced that randomly divides heuristics into different groups, ranks the collective performance of each group and makes performance comparisons between disparate groups of heuristics. The proposed heuristic selection method is tested on several well-known combinatorial optimization problems, including the vehicle routing problem, the bin packing problem, the permutation flow shop problem, the personnel scheduling problem and the patient transportation problem. Results show that the efficient choice function performs better than the existing methods. 
The second contribution of this thesis is to enhance searching for optima in dynamic search spaces. An investigation of dynamic selection hyper-heuristics is performed and a new non-stationary covariance function is introduced. Gaussian process regression is used as a predictive measure for moving optima in dynamic search environments and the proposed method is applied to the dynamic traveling salesman problem, where it yields better performance than the existing approach. The third contribution is a spy search method (SSM) for a memetic algorithm (MA) in dynamic environments. Given that the proposed efficient choice function for hyper-heuristics relies on a memetic algorithm to perform the search for an optimal heuristic, improving an MA search enhances the capacity of the efficient choice function to find good solutions. The proposed SSM shows a better performance than the nine existing methods on a set of different dynamic problems.
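To make the idea of score-based heuristic selection in the abstract above concrete, the following is a minimal sketch of a generic choice function for a single-point selective hyper-heuristic. It is not the thesis's efficient choice function: the scoring rule (decayed recent improvement plus a bonus for time since last use), the parameters `phi` and `delta`, and the toy heuristics `shrink` and `swap` are all assumptions introduced for illustration.

```python
import random


def select_heuristic(scores, idle, delta):
    """Pick the heuristic with the highest combined score.

    scores[name]: decayed record of recent improvements (assumed scoring rule).
    idle[name]:   iterations since the heuristic was last applied.
    """
    return max(scores, key=lambda name: scores[name] + delta * idle[name])


def run_choice_function(heuristics, solution, cost, iterations=200,
                        phi=0.5, delta=0.1, seed=0):
    """Single-point search: apply one selected heuristic per iteration."""
    rng = random.Random(seed)
    scores = {name: 0.0 for name in heuristics}
    idle = {name: 0 for name in heuristics}
    for _ in range(iterations):
        name = select_heuristic(scores, idle, delta)
        candidate = heuristics[name](solution, rng)
        improvement = cost(solution) - cost(candidate)
        # Reward (or penalise) the heuristic; older evidence decays by phi.
        scores[name] = phi * scores[name] + improvement
        for other in idle:
            idle[other] += 1
        idle[name] = 0
        # Accept only non-worsening moves, so the cost never increases.
        if improvement >= 0:
            solution = candidate
    return solution


# Hypothetical toy problem: minimise the sum of squares of a list of integers.
def shrink(values, rng):
    """Move one randomly chosen entry a single step towards zero."""
    new = list(values)
    i = rng.randrange(len(new))
    if new[i] > 0:
        new[i] -= 1
    elif new[i] < 0:
        new[i] += 1
    return new


def swap(values, rng):
    """Swap two randomly chosen entries (never changes the cost here)."""
    new = list(values)
    i, j = rng.randrange(len(new)), rng.randrange(len(new))
    new[i], new[j] = new[j], new[i]
    return new


def sum_of_squares(values):
    return sum(v * v for v in values)
```

Calling `run_choice_function({"shrink": shrink, "swap": swap}, [5, -3, 4, -2], sum_of_squares)` returns a list whose cost is no higher than that of the starting list. The idle-time term is what keeps a low-scoring heuristic such as `swap` from being discarded permanently, which is the kind of drawback the abstract attributes to existing choice functions.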
  • Item
    Retinal blood vessel segmentation using random forest Gabor feature selection and automatic thresholding.
    (2019) Gwetu, Mandlenkosi Victor.; Tapamo, Jules-Raymond.; Viriri, Serestina.
    Successful computer aided diagnosis of ocular diseases is normally dependent on the accurate detection of components such as blood vessels, optic disk, fovea and microaneurysms. The properties of these components can be indicative of the presence and/or severity of pathology. Since most prevalent forms of ocular diseases emanate from vascular disorders, it is expected that accurate detection of blood vessels is essential for ocular diagnosis. In this research work, we investigate several opportunities for improvement of retinal blood vessel segmentation with the hope that they will ultimately lead to improvement in the diagnosis of vascular related ocular diseases. We complement existing work in this domain by introducing new Gabor filter features and selecting the most effective of these using Random Forests feature selection. The actual segmentation of blood vessels is then done using an improved automatic thresholding scheme based on the preferred Gabor feature. We propose Random Forest (RF) feature ranking algorithms that demonstrate reliable feature set partitions over several University of California, Irvine (UCI) datasets. To circumvent instances of unreliable rankings, we also propose feature rank and RF strength correlation as an alternative indicator. Of the four proposed Gabor features, the maximum magnitude response is confirmed as the most effective, as is the general trend in previous literature. The proposed Selective Valley Emphasis thresholding technique achieves identical segmentation results to the legacy approach while improving on computational efficiency. Sensitivity and specificity outcomes of up to 76.8% and 97.9% as well as 78.8% and 97.8% are achieved on the DRIVE and STARE datasets, respectively.
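As a rough illustration of automatic thresholding of the kind mentioned in the abstract above, the sketch below implements the commonly cited valley-emphasis criterion, which weights the Otsu between-class term by (1 - p_t) so that thresholds falling in a histogram valley are preferred. This is an assumption about the baseline the thesis builds on; it is not the thesis's Selective Valley Emphasis technique, and the function name and histogram input format are hypothetical.

```python
def valley_emphasis_threshold(histogram):
    """Return the grey level t maximising (1 - p_t) * (w1*mu1^2 + w2*mu2^2).

    histogram[i] is the number of pixels with grey level i. Pixels with
    level <= t form class 1 and the remaining pixels form class 2.
    """
    total = sum(histogram)
    probs = [count / total for count in histogram]
    best_t, best_score = 0, float("-inf")
    for t in range(len(probs) - 1):
        w1 = sum(probs[: t + 1])
        w2 = 1.0 - w1
        if w1 <= 0.0 or w2 <= 0.0:
            continue  # one class is empty, so the criterion is undefined
        mu1 = sum(i * probs[i] for i in range(t + 1)) / w1
        mu2 = sum(i * probs[i] for i in range(t + 1, len(probs))) / w2
        # The (1 - p_t) factor favours sparsely populated grey levels.
        score = (1.0 - probs[t]) * (w1 * mu1 ** 2 + w2 * mu2 ** 2)
        if score > best_score:
            best_t, best_score = t, score
    return best_t
```

For a bimodal histogram such as `[0, 5, 20, 5, 0, 0, 5, 20, 5, 0]`, the selected threshold falls in the empty valley between the two peaks. This direct form recomputes the class sums for every candidate threshold; cumulative sums would avoid that cost.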
  • Item
    A machine learning approach to facial-based ethnicity classification.
    (2017) Momin, Hajra Mehbub.; Tapamo, Jules-Raymond.
    The determination of ethnicity of an individual can be very useful in a face recognition and person identification system in general. The face displays a complex range of information about identity, age, sex, race as well as emotional and intentional state. It is commonly assumed that the biological unit of human classification is the ethnic group, with hereditary physical features making up the group classification, based on the qualities such as the skin colour, the build, the head shape, the hair, the face shape, and the blood type. In this thesis, the aim is to investigate methods and techniques to perform ethnicity classification of face images. Automatic face-based ethnicity classification has various applications in human computer interaction, surveillance, video and image retrieval, database indexing, and can give helpful insight for face recognition and identification. Since biometric systems have to deal with very large databases, it can be a good idea to partition the face database according to the ethnicity of a person. In addition, this has the potential to significantly improve the search speed, efficiency and accuracy of biometric systems. Automatic face and landmark detection on images is very important for face recognition, face identification and for ethnicity classification. This study presents an approach for detecting face and facial features such as the eyes, the nose and the mouth in gray-scale images. In addition, the study makes use of thresholding and connected component labelling algorithms in order to detect a face and extract features that characterize this face. This study investigates three different feature methods for the ethnicity classification of face images. A new ethnicity classification based on skin colour is proposed. Skin colour is one of the most important features in the human face. The skin colour differs from individual to individual belonging to different ethnic groups and from people across different regions. 
For instance, the skin colour of people belonging to White, Asian and Black groups differs from one group to another and ranges from white to yellow to dark brown. Based on this, different colour spaces are used to create a feature vector representing a given face image. A second feature model based on textures is proposed. Gabor filters are used to extract texture features. Thirdly, a combination of colour and texture features is used to further improve the ethnicity classification accuracy. Four different classifiers, namely K-Means clustering, Naive Bayesian (NB), Multilayer Perceptron (MLP) and Support Vector Machine (SVM), were used to test the effectiveness of the automatic characterization of ethnicity by using the proposed feature models. The ethnic groups considered were Asian, Indian, White and Black. Extensive experiments demonstrate that our models achieve very good results, confirming the consistently overwhelming performance of Asian classification. The proposed models also achieve very good classification results for different ethnic groups when compared with existing models.
  • Item
    Mathematical and numerical analysis of the discrete fragmentation coagulation equation with growth, decay and sedimentation.
    (2018) Joel, Luke Oluwaseye; Banasiak, Jacek.; Shindin, Sergey Konstantinovich.
    Fragmentation-coagulation equations arise naturally in many branches of engineering and science, the applications stretching from astrophysics, blood clotting, colloidal chemistry and polymer science to molecular beam epitaxy. In realistic applications, fragmentation and coagulation are often coupled with growth, decay and/or sedimentation processes. The resulting models are used to describe the evolution of a population in which individuals can grow, coalesce, split or divide, and die. For example, in phytoplankton dynamics, in addition to the forming or breaking of clusters, individuals within them are born or die, and so the latter processes must be adequately represented in the models. In the continuous case, the birth or death processes are incorporated into the model by adding an appropriate first order transport term, analogously to the age and size structured McKendrick models. In the discrete case, these vital processes are modelled by adding weighted difference operators. In this study, we focus on the discrete fragmentation-coagulation models with growth, decay and/or sedimentation. The problem is treated as an infinite-dimensional differential equation, which consists of a linear part (fragmentation, growth, decay and sedimentation term) and a nonlinear part (coagulation term), posed in a suitable Banach space, X. We use the theory of semigroups of linear operators, perturbation of positive semigroups and semilinear operators for the mathematical analysis of these models. The linear part of the models is shown to generate a semigroup which is analytic, compact and irreducible and thus has the asynchronous exponential growth property. These results are used to demonstrate the existence of global classical solutions to the semilinear fragmentation-coagulation equations with growth, decay and sedimentation for a class of unbounded coagulation kernels. Theoretical conclusions are supported by numerical simulations.
  • Item
    Modelling of artificial intelligence based demand side management techniques for mitigating energy poverty in smart grids.
    (2018) Monyei, Chukwuka Gideon.; Viriri, Serestina.
    This research work proposes an artificial intelligence (AI) based model for smart grid initiatives (for South Africa and, by extension, sub-Saharan Africa (SSA)) and further incorporates energy justice principles. Spanning the social, technical, economic, environmental, policy and overall impact of smart and just electricity grids, this research begins by investigating declining electricity consumption and demand side management (DSM) potential across South Africa. In addition, technical frameworks such as the combined energy management system (CEMS), co-ordinated centralized energy management system (ConCEMS) and biased load manager home energy management system (BLM-HEMS) are modelled. These systems provide for the integration of all aspects of the electricity grid and their optimization in achieving cost reduction for both the utility and consumers as well as improvement in the consumers' quality of life (QoL) and reduction of emissions. Policy and economy-wise, this research work further proposes and models an integrated electrification and expansion model (IEEM) for South Africa, and also addresses the issue of rural marginalization due to poor electricity access for off-grid communities. This is done by proposing a hybrid generation scheme (HGS) which is shown to satisfy sufficiently the requirements of the energy justice framework while significantly reducing the energy burden of households and reducing carbon emissions by over 70%.
  • Item
    Structure based partial solution search for the examination timetabling problem.
    (2021) Rajah, Christopher Bradley.; Pillay, Nelishia.
    The aim of this work is to present a new approach, namely, Structure Based Partial Solution Search (SBPSS), to solve the Examination Timetabling Problem. The success of the Developmental Approach in this problem domain suggested that the strategy of searching the spaces of partial timetables whilst constructing them is promising and worth pursuing. This work adopts a similar strategy. Multiple timetables are incrementally constructed at the same time. The quality of the partial timetables is improved upon by searching their partial solution spaces at every iteration during construction. Another key finding from the literature survey is that although timetables may exhibit the same behaviour in terms of their objective values, their structures or exam schedules may be different. The challenge with this finding is to decide on which regions to pursue because some regions may not be worth investigating due to the difficulty in searching them. These problematic areas may have solutions that are not amenable to change, which makes it difficult to improve them. Another reason is that the neighbourhoods of solutions in these areas may be less connected than others, which may restrict the ability of the search to move to a better solution in that neighbourhood. By moving to these problematic areas of the search space the search may stagnate and waste expensive computational resources. One way to overcome this challenge is to use both structure and behaviour, rather than behaviour alone, to guide the search. A search that is guided by structure is able to find new regions by considering the structural components of the candidate solutions, which indicate which part of the search space the same candidates occupy. Another benefit to making use of a structure-based search is that it has no objective value bias because it is not guided by only the objective value. 
This statement is consistent with the literature survey, where it is suggested that in order to achieve good performance the search should not be guided by only the objective value. The proposed method has been tested on three popular benchmark sets for examination timetabling, namely, the Carter benchmark set, the benchmark set from the International Timetabling Competition in 2007 and the Yeditepe benchmark set. The SBPSS found the best solutions for two of the Carter problem instances and for four of the competition problem instances. Lastly, the SBPSS improved on the best results for all the Yeditepe problem instances.
  • Item
    Tuberculosis diagnosis from pulmonary chest x-ray using deep learning.
    (2022) Oloko-Oba, Mustapha Olayemi.; Viriri, Serestina.
    Tuberculosis (TB) remains a life-threatening disease, and it is one of the leading causes of mortality in developing countries. This is due to poverty and inadequate medical resources. While treatment for TB is possible, it requires an accurate diagnosis first. Several screening tools are available, and the most reliable is Chest X-Ray (CXR), but the radiological expertise for accurately interpreting the CXR images is often lacking. Over the years, CXR has been manually examined; this process is time-consuming and expensive, delays diagnosis, and is prone to misdiagnosis, which could further spread the disease among individuals. Consequently, an algorithm could increase diagnosis efficiency, improve performance, reduce the cost of manual screening and ultimately result in early/timely diagnosis. Several algorithms have been implemented to diagnose TB automatically. However, these algorithms are characterized by low accuracy and sensitivity, leading to misdiagnosis. In recent years, Convolutional Neural Networks (CNN), a class of Deep Learning, have demonstrated tremendous success in object detection and image classification tasks. Hence, this thesis proposes an efficient Computer-Aided Diagnosis (CAD) system with high accuracy and sensitivity for TB detection and classification. The proposed model is based firstly on a novel end-to-end CNN architecture, then a pre-trained Deep CNN model that is fine-tuned and employed as a feature extractor from CXR. Finally, Ensemble Learning was explored to develop an Ensemble model for TB classification. The Ensemble model achieved a new state-of-the-art diagnosis accuracy of 97.44% with a 99.18% sensitivity, 96.21% specificity and 0.96% AUC. These results are comparable with state-of-the-art techniques and outperform existing TB classification models.
  • Item
    Automatic dental caries detection in bitewing radiographs.
    (2022) Majanga, Vincent Idah.; Viriri, Serestina.
    Dental caries is one of the most prevalent chronic diseases around the globe. Distinguishing carious lesions has been a challenging task. Conventional computer aided diagnosis and detection methods in the past have heavily relied on visual inspection of teeth. These are only effective on large and clearly visible caries on affected teeth. Conventional methods have been limited in performance due to the complex visual characteristics of dental caries images, which consist of hidden or inaccessible lesions. Early detection of dental caries is an important determinant for treatment and benefits much from the introduction of new tools such as dental radiography. A method for the segmentation of teeth in bitewing X-rays is presented in this thesis, as well as a method for the detection of dental caries on X-ray images using a supervised model. The diagnostic method proposed uses an assessment protocol that is evaluated according to a set of identifiers obtained from a learning model. The proposed technique automatically detects hidden and inaccessible dental caries lesions in bitewing radiographs. The approach employed data augmentation to increase the number of images in the data set to a total of 11,114 dental images. Image pre-processing on the data set was through the use of Gaussian blur filters. Image segmentation was handled through thresholding, erosion and dilation morphology, while image boundary detection was achieved through the active contours method. Furthermore, the deep learning based network through the sequential model in Keras extracts features from the images through blob detection. Finally, a convexity threshold value of 0.9 is introduced to aid in the classification of caries as either present or not present. The relative efficacy of the supervised model in diagnosing dental caries when compared to current systems is indicated by the results detailed in this thesis. 
The proposed model achieved a 97% correct diagnosis rate, which proved quite competitive with existing models.
  • Item
    Facial expression recognition and intensity estimation.
    (2022) Ekundayo, Olufisayo Sunday.; Viriri, Serestina.
    Facial Expression is one of the profound non-verbal channels through which human emotion state is inferred from the deformation or movement of face components when facial muscles are activated. Facial Expression Recognition (FER) is one of the relevant research fields in Computer Vision (CV) and Human-Computer Interaction (HCI). Its applications include, but are not limited to, robotics, gaming, medicine, education, security and marketing. FER consists of a wealth of information. Categorising the information into primary emotion states only limits its performance. This thesis investigates an approach that simultaneously predicts the emotional state of facial expression images and the corresponding degree of intensity. The task also extends to resolving FER's ambiguous nature and annotation inconsistencies with a label distribution learning method that considers correlation among data. We first proposed a multi-label approach for FER and its intensity estimation using advanced machine learning techniques. According to our findings, this approach has not been considered for emotion and intensity estimation in the field before. The approach used problem transformation to present FER as a multilabel task, such that every facial expression image has unique emotion information alongside the corresponding degree of intensity at which the emotion is displayed. A Convolutional Neural Network (CNN) with a sigmoid function at the final layer is the classifier for the model. The model, termed ML-CNN (Multilabel Convolutional Neural Network), successfully achieves concurrent prediction of emotion and intensity estimation. ML-CNN prediction is challenged by overfitting and by intraclass and interclass variations. We employ the Visual Geometry Group 16 (VGG-16) pretrained network to resolve the overfitting challenge and the aggregation of island loss and binary cross-entropy loss to minimise the effect of intraclass and interclass variations. 
The enhanced ML-CNN model shows promising results and outperforms other standard multilabel algorithms. Finally, we approach data annotation inconsistency and ambiguity in FER data using isomap manifold learning with Graph Convolutional Networks (GCN). The GCN uses the distance along the isomap manifold as the edge weight, which appropriately models the similarity between adjacent nodes for emotion predictions. The proposed method produces a promising result in comparison with the state-of-the-art methods.
  • Item
    The adoption of Web 2.0 tools in teaching and learning by in-service secondary school teachers: the Mauritian context.
    (2018) Pyneandee, Marday.; Govender, Desmond Wesley.; Oogarah-Pratap, Brinda.
    With the current rapid increase in use of Web 2.0 tools by students, it is becoming necessary for teachers to understand this social networking phenomenon, so that they can better understand the new spaces that students inhabit and the implications for students’ learning, investigate the wealth of available Web 2.0 tools, and work to incorporate some into their pedagogical and learning practices. Teachers are using the Internet and social networking tools in their personal lives. However, there is little empirical evidence on teachers’ viewpoints and usage of social media and other online technologies to support their classroom practice. This study stemmed from the urgent need to address this gap by exploring teachers’ perceptions and experience of integrating online technologies and social media into their personal lives and professional practice, in order to find the best predictors of the possibility of teachers using Web 2.0 tools in their professional practice. Underpinning the study is a conceptual framework consisting of core ideas found in the unified theory of acceptance and use of technology (UTAUT) and technology pedagogy and content knowledge (TPACK) models. The conceptual framework, together with a review of relevant literature, enabled the formulation of a theoretical model for understanding teachers’ intention to exploit the potential of Web 2.0 tools. The model was then further developed using a mixed-method, two-phase methodology. In the first phase, a survey instrument was designed and distributed to in-service teachers following a Postgraduate Certificate in Education course at the institution where the researcher works. Using the data collected from the survey, exploratory factor analysis, correlational analysis and multiple regression analysis were used to refine the theoretical model. 
Other statistical methods were also used to gain further insights into teachers’ perceptions of use of Web 2.0 tools in their practices. In the second phase of the study, survey respondents were purposefully selected, based on quantitative results, to participate in interviews. The qualitative data yielded from the interviews was used to support and enrich understanding of the quantitative findings. The constructs teacher knowledge and technology pedagogy knowledge from the TPACK model and the constructs effort expectancy, facilitating conditions and performance expectancy are the best predictors of teachers’ intentions to use Web 2.0 tools in their professional practice. There was an interesting finding on the relationship between UTAUT and TPACK constructs. The constructs performance expectancy and effort expectancy had a significant relationship with all the TPACK constructs – technology knowledge, technology pedagogy knowledge, pedagogical content knowledge (PCK), technology and content knowledge and TPACK – except for content knowledge and pedagogical knowledge. The association between the TPACK construct PCK and the UTAUT constructs performance expectancy and effort expectancy was an unexpected finding because PCK concerns only pedagogical content knowledge and has no technology component. The theoretical contribution of this study is the model of teachers’ intention of future use of Web 2.0 tools in their professional practice. The predictive model, together with other findings, enhances understanding of the nature of teachers’ intention to utilise Web 2.0 tools in their professional practice. Findings from this study have implications for school infrastructure, professional development of teachers and an ICT learning environment to support the adoption of Web 2.0 tools in teaching practices and are presented as guiding principles at the end of the study.
  • Item
    Automated design of genetic programming of classification algorithms.
    (2018) Nyathi, Thambo.; Pillay, Nelishia.
    Over the past decades, there has been an increase in the use of evolutionary algorithms (EAs) for data mining and knowledge discovery in a wide range of application domains. Data classification, a real-world application problem, is one of the areas in which EAs have been widely applied. Data classification has been extensively researched, resulting in the development of a number of EA based classification algorithms. Genetic programming (GP) in particular has been shown to be one of the most effective EAs at inducing classifiers. It is widely accepted that the effectiveness of a parameterised algorithm like GP depends on its configuration. Currently, the design of GP classification algorithms is predominantly performed manually. Manual design follows an iterative trial and error approach which has been shown to be a menial, non-trivial, time-consuming task that has a number of vulnerabilities. The research presented in this thesis is part of a large-scale initiative by the machine learning community to automate the design of machine learning techniques. The study investigates the hypothesis that automating the design of GP classification algorithms for data classification can still lead to the induction of effective classifiers. This research proposes using two evolutionary algorithms, namely, a genetic algorithm (GA) and grammatical evolution (GE), to automate the design of GP classification algorithms. The proof-by-demonstration research methodology is used in the study to achieve the set out objectives. To that end, two systems, namely, a genetic algorithm system and a grammatical evolution system, were implemented for automating the design of GP classification algorithms. The classification performance of the automated designed GP classifiers, i.e., GA designed GP classifiers and GE designed GP classifiers, was compared to that of manually designed GP classifiers on real-world binary class and multiclass classification problems. 
The evaluation was performed on multiple domain problems obtained from the UCI machine learning repository and on two specific domains, cybersecurity and financial forecasting. The automated designed classifiers were found to outperform the manually designed GP classifiers on all the problems considered in this study. GP classifiers evolved by GE were found to be suitable for classifying binary classification problems while those evolved by a GA were found to be suitable for multiclass classification problems. Furthermore, the automated design time was found to be less than manual design time. Fitness landscape analysis of the design spaces searched by a GA and GE was carried out on all the classes of problems considered in this study. Grammatical evolution found the search to be smoother on binary classification problems while the GA found multiclass problems to be less rugged than binary class problems.
  • Item
    The enhanced best performance algorithm for global optimization with applications.
    (2016) Chetty, Mervin.; Adewumi, Aderemi Oluyinka.
    Abstract available in PDF file.
  • Item
    Leaf recognition for accurate plant classification.
    (2017) Kala, Jules Raymond.; Viriri, Serestina.; Moodley, Deshendran.
    Plants are the most important living organisms on our planet because they are sources of energy and protect our planet against global warming. Botanists were the first scientists to design techniques for plant species recognition using leaves. Although many techniques for plant recognition using leaf images have been proposed in the literature, the precision and the quality of feature descriptors for shape, texture, and color remain the major challenges. This thesis investigates the precision of geometric shape feature extraction and improves the determination of the Minimum Bounding Rectangle (MBR). The comparison of the proposed improved MBR determination method to Chaudhuri's method is performed using the Mean Absolute Error (MAE) generated by each method on each edge point of the MBR. On the top left point of the determined MBR, Chaudhuri's method has an MAE value of 26.37 and the proposed method has an MAE value of 8.14. This thesis also investigates the use of the Convexity Measure of Polygons for the characterization of the degree of convexity of a given leaf shape. Promising results are obtained when using the Convexity Measure of Polygons combined with other geometric features to characterize leaf images, and a classification rate of 92% was obtained with a Multilayer Perceptron Neural Network classifier. After observing the limitations of the Convexity Measure of Polygons, a new shape feature called Convexity Moments of Polygons is presented in this thesis. This new feature has the invariant properties of the Convexity Measure of Polygons, but is more precise because it uses more than one value to characterize the degree of convexity of a given shape. Promising results are obtained when using the Convexity Moments of Polygons combined with other geometric features to characterize the leaf images, and a classification rate of 95% was obtained with the Multilayer Perceptron Neural Network classifier. 
Leaf boundaries carry valuable information that can be used to distinguish between plant species. In this thesis, a new boundary-based shape characterization method called Sinuosity Coefficients is proposed. This method has been used in many fields of science, such as Geography, to describe river meandering. The Sinuosity Coefficients are scale and translation invariant. Promising results are obtained when using Sinuosity Coefficients combined with other geometric features to characterize the leaf images, and a classification rate of 80% was obtained with the Multilayer Perceptron Neural Network classifier. Finally, this thesis implements a model for plant classification using leaf images, where an input leaf image is described using the Convexity Moments, the Sinuosity Coefficients and the geometric features to generate a feature vector for the recognition of plant species using a Radial Basis Neural Network. With the model designed and implemented, an overall classification rate of 97% was obtained.
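The scale and translation invariance claimed for sinuosity-based features in the abstract above can be checked on the standard geographic definition of sinuosity: the length of a curve divided by the straight-line distance between its endpoints. The sketch below uses that textbook definition on a polyline. How the thesis derives its Sinuosity Coefficients from a leaf boundary is not stated in the abstract, so the function name and input format are assumptions.

```python
import math


def sinuosity(points):
    """Path length of a polyline divided by the distance between its endpoints.

    points is a sequence of (x, y) tuples. A straight segment gives 1.0 and
    more winding curves give larger values.
    """
    if len(points) < 2:
        raise ValueError("need at least two points")
    path_length = sum(math.dist(a, b) for a, b in zip(points, points[1:]))
    chord = math.dist(points[0], points[-1])
    if chord == 0:
        raise ValueError("endpoints coincide; sinuosity is undefined")
    return path_length / chord
```

Scaling every coordinate by the same factor multiplies both the path length and the chord by that factor, and a translation changes neither, so the ratio is unchanged. For a closed leaf contour the endpoints coincide, so a measure of this kind would have to be computed on boundary segments rather than on the whole outline.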
  • Item
    A semantic sensor web framework for proactive environmental monitoring and control.
    (2017) Adeleke, Jude Adekunle.; Moodley, Deshendran.; Rens, Gavin Brian.; Adewumi, Aderemi Oluyinka.
    Observing and monitoring the natural and built environments is crucial for maintaining and preserving human life. Environmental monitoring applications typically incorporate some sensor technology to continually observe specific features of interest in the physical environment and transmit data emanating from these sensors to a computing system for analysis. Semantic Sensor Web technology supports semantic enrichment of sensor data and provides expressive analytic techniques for data fusion, situation detection and situation analysis. Despite the promising successes of Semantic Sensor Web technology, current Semantic Sensor Web frameworks are typically focused on developing applications for detecting and reacting to situations detected from current or past observations. While these reactive applications provide a quick response to detected situations to minimize adverse effects, they are limited when it comes to anticipating future adverse situations and determining proactive control actions to prevent or mitigate these situations. Most current Semantic Sensor Web frameworks lack two essential mechanisms required to achieve proactive control, namely, mechanisms for anticipating the future and coherent mechanisms for consistent decision processing and planning. Designing and developing proactive monitoring and control Semantic Sensor Web applications is challenging. It requires incorporating and integrating different techniques for supporting situation detection, situation prediction, decision making and planning in a coherent framework. This research proposes a coherent Semantic Sensor Web framework for proactive monitoring and control. It incorporates ontology to facilitate situation detection from streaming sensor observations, statistical machine learning for situation prediction and Markov Decision Processes for decision making and planning.
The efficacy and use of the framework are evaluated through the development of two different prototype applications. The first application is for proactive monitoring and control of indoor air quality to avoid poor air quality situations. The second is for proactive monitoring and control of electricity usage in blocks of residential houses to prevent strain on the national grid. These applications show the effectiveness of the proposed framework for developing Semantic Sensor Web applications that proactively avert unwanted environmental situations before they occur.
  • Item
    Hierarchical age estimation using enhanced facial features.
    (2018) Angulu, Raphael.; Tapamo, Jules-Raymond.; Adewumi, Aderemi Oluyinka.
    Ageing is a stochastic, inevitable and uncontrollable process that constantly affects the shape, texture and general appearance of the human face. Humans can determine a person's gender, identity and ethnicity with higher accuracy than they can determine age. This makes the development of automatic age estimation techniques that surpass human performance an attractive yet challenging task. Automatic age estimation requires extraction of robust and reliable age-discriminative features. The sensitivity of local binary patterns (LBP) to noise makes them insufficiently reliable in capturing age-discriminative features. Although local ternary patterns (LTP) are insensitive to noise, they use a single static threshold for all images regardless of varied image conditions. Local directional patterns (LDP) use k directional responses to encode image gradient and disregard not only the central pixel in the local neighborhood but also the remaining 8 − k directional responses. Every pixel in an image carries subtle information. Discarding 8 − k directional responses leads to loss of discriminative texture features. This study proposes two variations of the LDP operator for texture extraction. Significant orientation response LDP (SOR-LDP) encodes image gradient by grouping eight directional responses into four pairs. Each pair represents the orientation of an edge with respect to the central reference pixel. Values in each pair are compared and the bit corresponding to the maximum value in the pair is set to 1 while the other is set to 0. The resultant binary code is converted to decimal and assigned to the central pixel as its SOR-LDP code. Texture features are contained in the histogram of the SOR-LDP encoded image. Local ternary directional patterns (LTDP) first compute the difference between the neighboring pixels and the central pixel in a 3 × 3 image region. These differential values are convolved with Kirsch edge detectors to obtain directional responses.
These responses are normalized and used as the probability of an edge occurring in the respective direction. An adaptive threshold is applied to derive the LTDP code. The LTDP code is split into its positive and negative LTDP codes. Histograms of the negative and positive LTDP encoded images are concatenated to obtain the texture feature. Although there is evidence of spatial frequency processing in the primary visual cortex, biologically inspired features (BIF) that model the visual cortex use only scale and orientation selectivity in feature extraction. Furthermore, these BIF are extracted using holistic (global) pooling across scales and orientations, leading to loss of substantive information. This study proposes multi-frequency BIF (MF-BIF), in which frequency selectivity is introduced in BIF modelling. Local statistical BIF (LS-BIF) uses local pooling within scale, orientation and frequency in an n × n region for BIF extraction. Using the Leave-one-person-out (LOPO) validation protocol, this study investigated the performance of the proposed feature extractors in age estimation in a hierarchical way, by performing age-group classification using a Multi-layer Perceptron (MLP) followed by within-age-group exact age regression using support vector regression (SVR). Mean absolute error (MAE) and cumulative score (CS) were used to evaluate the performance of the proposed face descriptors. Experimental results on the FG-NET ageing dataset show that SOR-LDP, LTDP, MF-BIF and LS-BIF outperform state-of-the-art feature descriptors in age estimation. Experimental results also show that performing gender discrimination before age-group and age estimation further improves age estimation accuracy. Shape, appearance, wrinkle and texture features are simultaneously extracted by the visual system in primates for the brain to process and understand an image or a scene. However, age estimation systems in the literature use a single feature for age estimation.
A single feature is not sufficient to capture subtle age-discriminative traits because of the stochastic and personalized nature of ageing. This study proposes the fusion of different facial features to enhance their discriminative power. Experimental results show that fusing shape, texture, wrinkle and appearance features results in robust age-discriminative features that achieve a lower MAE than any single feature.
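The SOR-LDP encoding step described in the abstract above can be sketched roughly as follows. This is an illustrative reading, not the author's implementation, and it rests on several assumptions: directional responses come from the eight standard Kirsch compass masks applied to the 3 × 3 neighborhood, each pair consists of two opposite directions, signed responses are compared, and ties go to the first direction of the pair. All function names are hypothetical.

```python
# Ring of the eight neighbors of the central pixel, clockwise from top-left.
RING = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]

def kirsch_responses(patch):
    """Eight Kirsch compass responses for a 3x3 patch (list of 3 rows).

    Each mask weights three consecutive ring positions by 5 and the
    remaining five by -3; the central pixel has weight 0.
    """
    ring = [patch[r][c] for r, c in RING]
    total = sum(ring)
    responses = []
    for d in range(8):
        strong = sum(ring[(d + j) % 8] for j in range(3))
        responses.append(5 * strong - 3 * (total - strong))
    return responses

def sor_ldp_code(patch):
    """8-bit code with one bit set per pair of opposite directions."""
    resp = kirsch_responses(patch)
    code = 0
    for d in range(4):
        # Assumption: pair direction d with its opposite d + 4 and set the
        # bit of the larger response; ties go to direction d.
        winner = d if resp[d] >= resp[d + 4] else d + 4
        code |= 1 << winner
    return code

def sor_ldp_histogram(image):
    """256-bin histogram of codes over all interior pixels of a 2D list."""
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            patch = [row[c - 1:c + 2] for row in image[r - 1:r + 2]]
            hist[sor_ldp_code(patch)] += 1
    return hist
```

Under this reading every code has exactly four bits set, one per orientation pair, so all eight responses contribute to the code instead of only the k strongest.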
  • Item
    Multi-level parallelization for accurate and fast medical image retrieval.
    (2016) Chikamai, Keith Sasala.; Viriri, Serestina.; Tapamo, Jules-Raymond.
    Breast cancer is the most prevalent form of cancer diagnosed in women. Mammograms offer the best option for detecting the disease early, which allows early treatment and, by implication, a favorable prognosis. Content-based Medical Image Retrieval (CBMIR) is increasingly gaining research attention as a Computer Aided Diagnosis (CAD) approach for breast cancer diagnosis. Such systems work by making available mammogram images that are pathologically similar to a given query example, which are used to support the diagnostic decision on a referential basis. In most cases, the query is of the form “return k images similar to the specified query image”. Similarity in the Content-based Image Retrieval (CBIR) context is based on the content of images, rather than text or keywords. The essence of CBIR systems is to enable indexing of pictorial content in databases and to eliminate the drawbacks of manual annotation. CBMIR is a relatively young technology that is yet to gain widespread use. One major challenge for CBMIR systems is bridging the “semantic gap” in the description of image content. The semantic gap describes the discord in the notion of similarity between the descriptions of humans and those of CBMIR systems. Low accuracy concerns inhibit the full adoption of CBMIR systems into regular practice, with research focusing on improving the accuracy of CBMIR systems. Nonetheless, the area is still an open problem. As a contribution towards improving the accuracy of CBMIR for mammogram images, this work proposes a novel feature modeling technique for CBMIR systems based on classifier scores and standard statistical calculations on the same. A set of gradient-based filters is first used to highlight possible calcification objects; an entropy-based thresholding technique is then used to segment the calcifications from the background.
Experimental results show that the proposed model achieves a 100% detection rate, which shows the effectiveness of combining the likelihood maps from various filters in detecting calcification objects. Feature extraction considers established textural and geometric features, which are calculated from the detected calcification objects; these are then used to generate secondary features using the Support Vector Machine and Quadratic Discriminant Analysis classifiers. The model is validated through a range of benchmarks, and is shown to perform competitively in comparison to similar works. Specifically, it scores 95%, 82%, 78%, and 98% on the accuracy, positive predictive value, sensitivity and specificity benchmarks respectively. Parallel computing is applied to the task of feature extraction to show its viability in reducing the cost of extracting features. This research considers two technologies for implementation: distributed computing using the message passing interface (MPI) and multicore computing using OpenMP threads. Both technologies involve the division of tasks to facilitate sharing of the computational burden in order to reduce the overall time cost. Communication cost is one penalty implied with parallel systems and a significant design target where the efficiency of parallel models is concerned. This research focuses on mitigating the communication overhead to increase the efficacy of parallel computation; it proposes an adaptive task assignment model dependent on network bandwidth for the parallel extraction of features. Experimental results report speedup values of between 4.7x and 10.4x, and efficiency values of between 0.11 and 0.62. Both the speedup and efficiency values increase with the database size. The proposed adaptive assignment of tasks improves the speedup and efficiency of the parallel model.
All experiments are based on the Mammographic Image Analysis Society (MIAS) database, a publicly available database that has been widely used in related works. The results achieved for both the mammogram pathology-based retrieval model and its computational efficiency met the objectives set for the research. In the domain of breast cancer applications, the models proposed in this work should contribute to the improvement of retrieval results of computer aided diagnosis/detection systems, where applicable. The improved accuracy will lead to higher acceptability of such systems by radiologists, which will enhance the quality of diagnosis both by reducing the decision-making time and by improving the accuracy of the entire diagnostic process.
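The abstract above reports speedup and efficiency without defining them. Assuming the conventional definitions (speedup is serial time divided by parallel time, and efficiency is speedup divided by the number of workers), the two metrics can be computed as in the sketch below. The function names and the timings in the usage note are hypothetical and are not taken from the thesis.

```python
def speedup(serial_seconds, parallel_seconds):
    """Conventional speedup: how many times faster the parallel run is."""
    if serial_seconds <= 0 or parallel_seconds <= 0:
        raise ValueError("timings must be positive")
    return serial_seconds / parallel_seconds

def efficiency(serial_seconds, parallel_seconds, workers):
    """Conventional efficiency: speedup per worker, 1.0 being ideal."""
    if workers < 1:
        raise ValueError("at least one worker is required")
    return speedup(serial_seconds, parallel_seconds) / workers
```

For example, a hypothetical job taking 94 s serially and 20 s on 16 workers has a speedup of 4.7 and an efficiency of about 0.29. Efficiency well below 1 typically reflects communication and coordination overhead, which is the cost the adaptive task assignment in the abstract aims to reduce.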
  • Item
    An ontology-driven approach for structuring scientific knowledge for predicting treatment adherence behaviour: a case study of tuberculosis in Sub-Saharan African communities.
    (2016) Ogundele, Olukunle Ayodeji.; Moodley, Deshendran.; Pillay, Anban Woolaganathan.; Seebregts, Christopher.
    Poor adherence to prescribed treatment is a complex phenomenon and has been identified as a major contributor to patients developing drug resistance and failing treatment in sub-Saharan African countries. Treatment adherence behaviour is influenced by diverse personal, cultural and socio-economic factors that may vary drastically between communities in different regions. Computer based predictive models can be used to identify individuals and communities at risk of non-adherence and aid in supporting resource allocation and intervention planning in disease control programs. However, constructing effective predictive models is challenging, and requires detailed expert knowledge to identify factors and determine their influence on treatment adherence in specific communities. While many clinical studies and abstract conceptual models exist in the literature, there is no known concrete, unambiguous and comprehensive computer based conceptual model that categorises factors that influence treatment adherence behaviour. The aim of this research was to develop an ontology-driven approach for structuring knowledge of factors that influence treatment adherence behaviour and for constructing adherence risk prediction models for specific communities. Tuberculosis treatment adherence in sub-Saharan Africa was used as a case study to explore and validate the approach. The approach provides guidance for knowledge acquisition, for building a comprehensive conceptual model, its formalisation into an OWL ontology, and generation of probabilistic risk prediction models. The ontology was evaluated for its comprehensiveness and correctness, and its effectiveness for constructing Bayesian decision networks for predicting adherence risk. The approach introduces a novel knowledge acquisition step that guides the capturing of influencing factors from peer-reviewed clinical studies and the scientific literature. 
Furthermore, the ontology takes an evidence based approach by explicitly relating each factor to published clinical studies, an important consideration for health practitioners. The approach was shown to be effective in constructing a flexible and extendable ontology and automatically generating the structure of a Bayesian decision network, a crucial step towards automated, computer based prediction of adherence risk for individuals in specific communities.
  • Item
    A knowledge-based system for automated discovery of ecological interactions in flower-visiting data.
    (2017) Coetzer, Willem Gabriël.; Moodley, Deshendran.; Gerber, Aurona Jacoba.
    Studies on the community ecology of flower-visiting insects, which can be inferred to pollinate flowers, are important in agriculture and nature conservation. Many scientific observations of flower-visiting insects are associated with digitized records of insect specimens preserved in natural history collections. Specimen annotations include heterogeneous and incomplete in situ field documentation of ecologically significant relationships between individual organisms (i.e. insects and plants), which are nevertheless potentially valuable. A wealth of unrepresented biodiversity and ecological knowledge can be unlocked from such detailed data by augmenting the data with expert knowledge encoded in knowledge models. An analysis of the knowledge representation requirements of flower-visiting community ecologists is presented, as well as an implementation and evaluation of a prototype knowledge-based system for automated semantic enrichment, semantic mediation and interpretation of flower-visiting data. A novel component of the system is a semantic architecture which incorporates knowledge models validated by experts. The system combines ontologies and a Bayesian network to enrich, integrate and interpret flower-visiting data, specifically to discover ecological interactions in the data. The system’s effectiveness in acquiring and representing expert knowledge and in simulating the inferencing ability of expert flower-visiting ecologists is evaluated and discussed. The knowledge-based system will allow a novice ecologist to use standardised semantics to construct interaction networks automatically and objectively. This could be useful, inter alia, when comparing interaction networks for different periods of time at the same place or different places at the same time.
While the system architecture encompasses three levels of biological organization, data provenance can be traced back to occurrences of individual organisms preserved as evidence in natural history collections. The potential impact of the semantic architecture could be significant in the field of biodiversity and ecosystem informatics because ecological interactions are important in applied ecological studies, e.g. in freshwater biomonitoring or animal migration.
  • Item
    Intelligent instance selection techniques for support vector machine speed optimization with application to e-fraud detection.
    (2017) Akinyelu, Ayobami Andronicus.; Adewumi, Aderemi Oluyinka.
    Decision-making is a very important aspect of many businesses. Wrong decisions carry serious penalties, including financial loss, damage to company reputation and reduced company productivity. Hence, it is of great importance that managers make the right decisions. Machine Learning (ML) simplifies the process of decision-making: it helps to discover useful patterns from historical data, which can be used for meaningful decision-making. The ability to make strategic and meaningful decisions is dependent on the reliability of data. Currently, many organizations are overwhelmed with vast amounts of data, and unfortunately, ML algorithms cannot effectively handle large datasets. This thesis therefore proposes seven filter-based and five wrapper-based intelligent instance selection techniques for optimizing the speed and predictive accuracy of ML algorithms, with a particular focus on the Support Vector Machine (SVM). This thesis also proposes a novel fitness function for instance selection. The primary difference between the filter-based and wrapper-based techniques is their method of selection. The filter-based techniques use the proposed fitness function for selection, while the wrapper-based techniques use the SVM algorithm for selection. The proposed techniques are obtained by fusing the SVM algorithm with the following nature-inspired algorithms: flower pollination algorithm, social spider algorithm, firefly algorithm, cuckoo search algorithm and bat algorithm. In addition, two of the filter-based techniques are boundary detection algorithms, inspired by edge detection in image processing and edge selection in ant colony optimization. Two different sets of experiments were performed in order to evaluate the performance of the proposed techniques (wrapper-based and filter-based). All experiments were performed on four datasets containing three popular e-fraud types: credit card fraud, email spam and phishing email.
In addition, experiments were performed on 20 datasets provided by the well-known UCI data repository. The results show that the proposed filter-based techniques improved SVM training speed in 100% (24 out of 24) of the datasets used for evaluation, without significantly affecting SVM classification quality. Moreover, experimental results also show that the wrapper-based techniques consistently improved SVM predictive accuracy in 78% (18 out of 23) of the datasets used for evaluation and simultaneously improved SVM training speed in all cases. Furthermore, two different statistical tests were conducted to further validate the credibility of the results: Friedman’s test and Holm’s post-hoc test. The statistical test results reveal that the proposed filter-based and wrapper-based techniques are significantly faster than standard SVM and some existing instance selection techniques in all cases. Moreover, the statistical test results also reveal that the Cuckoo Search Instance Selection Algorithm outperforms all the other proposed techniques in terms of speed. Overall, the proposed techniques have proven to be fast and accurate ML-based e-fraud detection techniques, with improved training speed, predictive accuracy and storage reduction. In real-life applications, such as video surveillance and intrusion detection systems, that require a classifier to be trained very quickly for speedy classification of new target concepts, the filter-based techniques provide the best solutions, while the wrapper-based techniques are better suited for applications, such as email filters, that are very sensitive to slight changes in predictive accuracy.
  • Item
    Practical reasoning for defeasible description logics.
    (2016) Moodley, Kodylan.; Meyer, Thomas Andreas.
    Description Logics (DLs) are a family of logic-based languages for formalising ontologies. They have useful computational properties allowing the development of automated reasoning engines to infer implicit knowledge from ontologies. However, classical DLs do not tolerate exceptions to specified knowledge. This led to the prominent research area of nonmonotonic or defeasible reasoning for DLs, where most techniques were adapted from seminal works for propositional and first-order logic. Despite the topic's attention in the literature, there remains no consensus on what “sensible” defeasible reasoning means for DLs. Furthermore, there are solid foundations for several approaches and yet no serious implementations and practical tools. In this thesis we address the aforementioned issues in a broad sense. We identify the preferential approach, by Kraus, Lehmann and Magidor (KLM) in propositional logic, as a suitable abstract framework for defining and studying the precepts of sensible defeasible reasoning. We give a generalisation of KLM's precepts, and their arguments motivating them, to the DL case. We also provide several preferential algorithms for defeasible entailment in DLs; evaluate these algorithms, and the main alternatives in the literature, against the agreed upon precepts; extensively test the performance of these algorithms; and ultimately consolidate our implementation in a software tool called Defeasible-Inference Platform (DIP). We found some useful entailment regimes within the preferential context that satisfy all the KLM properties, and some that have scalable performance in real world ontologies even without extensive optimisation.