Multi-level parallelization for accurate and fast medical image retrieval.
Date
2016
Abstract
Breast cancer is the most prevalent form of cancer diagnosed in women. Mammograms offer
the best option for detecting the disease early, which allows early treatment and, by implication, a
favorable prognosis. Content-based Medical Image Retrieval (CBMIR) is increasingly
gaining research attention as a Computer Aided Diagnosis (CAD) approach for breast cancer
diagnosis. Such systems work by retrieving mammogram images that are pathologically similar
to a given query example; these images then serve as references to support the diagnostic decision.
In most cases, the query is of the form “return k images similar to the specified query image”.
Similarity in the Content-based Image Retrieval (CBIR) context is based on the content of images,
rather than text or keywords. The essence of CBIR systems is to enable indexing of pictorial
content in databases and to eliminate the drawbacks of manual annotation. CBMIR is a relatively
young technology that is yet to gain widespread use. One major challenge for CBMIR systems
is bridging the “semantic gap” in the description of image content. The semantic gap describes the
discord between the human notion of similarity and that of CBMIR systems. Concerns about low
accuracy inhibit the full adoption of CBMIR systems into regular practice, and research
has focused on improving their accuracy. Nonetheless, the area remains an open
problem.
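The query form "return k images similar to the specified query image" can be illustrated with a minimal sketch. It assumes each image is already represented by a numeric feature vector and that similarity is measured by Euclidean distance; the function name `retrieve_top_k`, the image ids, and the vectors are hypothetical and are not taken from the thesis.

```python
import heapq
import math


def retrieve_top_k(query, database, k):
    """Return the ids of the k database images closest to the query vector."""
    # Assumption: smaller Euclidean distance means more similar content.
    nearest = heapq.nsmallest(
        k, database.items(), key=lambda item: math.dist(query, item[1])
    )
    return [image_id for image_id, _ in nearest]


# Hypothetical two-dimensional feature vectors keyed by image id.
database = {
    "img_a": [0.1, 0.9],
    "img_b": [0.8, 0.2],
    "img_c": [0.2, 0.8],
}
result = retrieve_top_k([0.0, 1.0], database, k=2)
```

A real CBMIR system would replace the distance function with whatever similarity measure its feature model calls for.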
As a contribution towards improving the accuracy of CBMIR for mammogram images, this
work proposes a novel feature modeling technique for CBMIR systems based on classifier scores
and standard statistical calculations on those scores. A set of gradient-based filters is first used
to highlight possible calcification objects; an entropy-based thresholding technique is then used
to segment the calcifications from the background. Experimental results show that the proposed
model achieves a 100% detection rate, demonstrating the effectiveness of combining the likelihood
maps from various filters in detecting calcification objects.
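The abstract does not say which entropy-based thresholding technique is used. As an illustration only, the sketch below assumes a maximum-entropy rule of the Kapur type, applied to a gray-level histogram: it picks the threshold that maximizes the sum of the entropies of the two resulting classes. The function name `max_entropy_threshold` and the sample histogram are hypothetical.

```python
import math


def max_entropy_threshold(histogram):
    """Return the bin index t that maximizes the summed entropies of
    the classes [0..t] and [t+1..end] (assumed Kapur-style rule)."""
    total = sum(histogram)
    probs = [count / total for count in histogram]
    best_t, best_score = 0, float("-inf")
    for t in range(len(probs) - 1):
        low, high = probs[: t + 1], probs[t + 1:]
        w0, w1 = sum(low), sum(high)
        if w0 <= 0.0 or w1 <= 0.0:
            continue  # one class is empty, so the split is not meaningful
        h0 = -sum((p / w0) * math.log(p / w0) for p in low if p > 0)
        h1 = -sum((p / w1) * math.log(p / w1) for p in high if p > 0)
        if h0 + h1 > best_score:
            best_t, best_score = t, h0 + h1
    return best_t


# Hypothetical bimodal histogram: dark background bins, bright object bins.
histogram = [10, 10, 10, 0, 0, 0, 0, 10, 10, 10]
threshold = max_entropy_threshold(histogram)
```

Pixels above the returned bin would be treated as candidate calcification objects, and pixels at or below it as background.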
Feature extraction considers established textural and geometric features, which are calculated
from the detected calcification objects; these are then used to generate secondary features using the
Support Vector Machine and Quadratic Discriminant Analysis classifiers. The model is validated
through a range of benchmarks and is shown to perform competitively in comparison to similar
works. Specifically, it scores 95%, 82%, 78%, and 98% for accuracy, positive predictive value,
sensitivity, and specificity, respectively.
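For reference, the four benchmarks follow from the standard confusion-matrix definitions. The sketch below uses those textbook formulas; the counts passed in are hypothetical and are not the thesis data.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute standard metrics from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "positive_predictive_value": tp / (tp + fp),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


# Hypothetical counts chosen only to exercise the formulas.
metrics = classification_metrics(tp=8, fp=2, tn=85, fn=5)
```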
Parallel computing is applied to the task of feature extraction to show its viability in reducing
the cost of extracting features. This research considers two technologies for implementation:
distributed computing using the message passing interface (MPI) and multicore computing using
OpenMP threads. Both technologies involve the division of tasks to facilitate sharing of the computational
burden in order to reduce the overall time cost. Communication cost is an inherent penalty
of parallel systems and a significant design concern where the efficiency of parallel models
is concerned. This research focuses on mitigating the communication overhead to increase the
efficacy of parallel computation; it proposes an adaptive task assignment model dependent on network
bandwidth for the parallel extraction of features. Experimental results report speedup values
of between 4.7x and 10.4x, and efficiency values of between 0.11 and 0.62. Both the speedup
and the efficiency increase as the database size grows. The proposed
adaptive assignment of tasks improves the speedup and efficiency of the
parallel model. All experiments are based on the Mammographic Image Analysis Society (MIAS)
database, which is a publicly available database that has been widely used in related works.
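Speedup and efficiency are conventionally defined as serial time divided by parallel time, and speedup divided by the number of workers. The sketch below computes both and adds one possible reading of bandwidth-dependent task assignment: tasks are split in proportion to each worker's measured bandwidth. The proportional rule, the function names, and all numbers are assumptions for illustration; the abstract does not specify the actual assignment model.

```python
def speedup(serial_time, parallel_time):
    """Ratio of serial to parallel execution time."""
    return serial_time / parallel_time


def efficiency(serial_time, parallel_time, workers):
    """Speedup per worker; 1.0 would be ideal linear scaling."""
    return speedup(serial_time, parallel_time) / workers


def assign_tasks(num_tasks, bandwidths):
    """Split tasks in proportion to bandwidth (assumed rule), using
    largest-remainder rounding so the counts sum to num_tasks."""
    total = sum(bandwidths)
    exact = [num_tasks * b / total for b in bandwidths]
    counts = [int(share) for share in exact]
    leftover = num_tasks - sum(counts)
    by_remainder = sorted(
        range(len(exact)), key=lambda i: exact[i] - counts[i], reverse=True
    )
    for i in by_remainder[:leftover]:
        counts[i] += 1
    return counts


# Hypothetical timings (seconds) and link bandwidths (Mbit/s).
s = speedup(100.0, 20.0)
e = efficiency(100.0, 20.0, workers=8)
counts = assign_tasks(10, [100.0, 50.0, 50.0])
```

Giving better-connected workers a larger share is one way to keep slow links from dominating the total transfer time.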
The results achieved for both the mammogram pathology-based retrieval model and its
computational efficiency met the objectives set for the research. In the domain of breast cancer
applications, the models proposed in this work should positively contribute to the improvement of
retrieval results of computer aided diagnosis/detection systems, where applicable. The improved
accuracy should lead to higher acceptability of such systems by radiologists, which in turn should enhance the
quality of diagnosis both by reducing the decision-making time and by improving the accuracy
of the entire diagnostic process.
Description
Doctor of Philosophy in Mathematics, Statistics and Computer Science. University of KwaZulu-Natal, Durban 2016.
Keywords
Theses - Computer Science.