A parallel implementation for determining genomic distances under deletion and insertion


As the need for comparing genomes of different species has grown dramatically with the fast progress of the Human Genome Project, the evolution at the level of whole genomes has attracted more and more attention from both biologists and computer scientists. They are especially interested in the scenarios in which the genome evolves through insertions, deletions, and movements of genes along its chromosomes. Marron et al proposed a polynomial-time approximation algorithm to compute (near) minimum edit distances under inversions, deletions, and unrestricted insertions. Our work is based on their algorithm, which carries out lots of comparisons and sorting to calculate the edit distance. These comparisons and sorting are extremely time-consuming, and they result in decrease of computational efficiency. We believe the time of the algorithm can be improved through parallelization. We parallelize their algorithm via OpenMP using Intel C++ compiler for Linux 7.1, and compare three levels of parallelism: coarse grain, fine grain and combination of both. The experiments are conducted for a varying number of threads and different lengths of the gene sequences. The experimental results show that either coarse grain parallelism or fine grain parallelism alone does not improve the performance of the algorithm very much. However, the use of combination of both fine grain and coarse grain parallelism improves the performance of the algorithm drastically.

Document Type

Conference Proceeding



Publication Date


Journal Title

Lecture Notes in Computer Science