J4 ›› 2016, Vol. 1 ›› Issue (6): 4-7.


Improvement of Genome Assembly Algorithm Based on MapReduce


  1. (College of Mathematics and Computer, Dali University, Dali, Yunnan 671003, China)
  • Received: 2016-01-25 Online: 2016-06-15 Published: 2016-06-15


The rapid growth of biological sequence data calls for the introduction of new technology. Current genome assembly
algorithms are neither sufficiently accurate nor parallelized. After an analysis of existing assembly algorithms, a new algorithm
based on MapReduce is proposed. Erroneous data are removed by statistical methods, and duplicate data are eliminated by increasing
the k-mer length during assembly. Finally, the parallel assembly algorithm is implemented on the MapReduce platform. The experimental
results show that both the accuracy and the speed of the algorithm are improved.
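
The abstract does not give implementation details, so the following is only a minimal single-machine sketch of the general idea, not the authors' algorithm. It assumes that statistical error removal means discarding k-mers whose frequency falls below a threshold, which is a common heuristic in de Bruijn graph assemblers. The function names and the `min_count` parameter are hypothetical, and the map and reduce steps are imitated with plain Python functions rather than a real MapReduce framework.

```python
from collections import defaultdict
from itertools import chain


def map_kmers(read, k):
    """Map step: emit (k-mer, 1) for every k-length window of a read."""
    return [(read[i:i + k], 1) for i in range(len(read) - k + 1)]


def reduce_counts(pairs):
    """Reduce step: sum the counts emitted for each distinct k-mer."""
    counts = defaultdict(int)
    for kmer, n in pairs:
        counts[kmer] += n
    return dict(counts)


def filter_low_frequency(counts, min_count):
    """Drop k-mers seen fewer than min_count times.

    Assumption: rare k-mers are treated as likely sequencing errors.
    """
    return {kmer: n for kmer, n in counts.items() if n >= min_count}


def count_kmers(reads, k, min_count=2):
    """Run map, reduce, and the frequency filter over all reads."""
    pairs = chain.from_iterable(map_kmers(read, k) for read in reads)
    return filter_low_frequency(reduce_counts(pairs), min_count)


# Toy input: the third read differs from the others by one base.
reads = ["ACGTAC", "ACGTAC", "ACGTTC"]
solid = count_kmers(reads, k=4, min_count=2)
```

In this toy input, the k-mers that occur only in the third read (`CGTT`, `GTTC`) fall below the threshold and are dropped. Choosing a larger `k` makes each k-mer more specific, which is the intuition behind using longer k-mers to reduce duplicate data.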

Key words: genome assembly, high throughput sequencing, de Bruijn, MapReduce
