ISBN-13: 9783659680502 / English / Paperback / 2015 / 56 pp.
Identifying the Longest Common Subsequence (LCS) of biological sequences has significant applications in bioinformatics. As bioinformatics applications grow, ever longer biological sequences must be processed, which poses a great challenge for sequential LCS algorithms. A few parallel LCS algorithms have been proposed, but their efficiency and effectiveness are unsatisfactory as the complexity and size of biological data increase. To overcome the limitations of existing LCS algorithms, and considering the MapReduce programming model a promising technology for cost-effective, high-performance parallel computing, a MapReduce-based parallel algorithm for LCS has been developed. The approach uses successor tables, identical character pairs, a successor tree, and traversal of the successor tree to find the Longest Common Subsequence. The Hadoop framework is used to realize the MapReduce model.
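For readers unfamiliar with the terms in the description, the following minimal Python sketch may help. It is not the book's MapReduce algorithm. `lcs_dp` is the textbook sequential dynamic-programming baseline, and `successor_table` assumes one common definition of a successor table (for each character, the next position at which it occurs after index i). The function names, the default DNA alphabet "ACGT", and the sample strings are all illustrative assumptions.

```python
def lcs_dp(a: str, b: str) -> str:
    """Textbook O(len(a) * len(b)) dynamic-programming LCS (sequential baseline)."""
    m, n = len(a), len(b)
    # length[i][j] holds the LCS length of the prefixes a[:i] and b[:j].
    length = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if a[i - 1] == b[j - 1]:
                length[i][j] = length[i - 1][j - 1] + 1
            else:
                length[i][j] = max(length[i - 1][j], length[i][j - 1])
    # Walk back from the bottom-right corner to recover one LCS.
    out = []
    i, j = m, n
    while i > 0 and j > 0:
        if a[i - 1] == b[j - 1]:
            out.append(a[i - 1])
            i -= 1
            j -= 1
        elif length[i - 1][j] >= length[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return "".join(reversed(out))


def successor_table(s: str, alphabet: str = "ACGT") -> dict:
    """Assumed definition: table[c][i] is the smallest 1-based position p > i
    with s[p - 1] == c, or None if c does not occur after position i."""
    n = len(s)
    table = {c: [None] * (n + 1) for c in alphabet}
    # Fill from right to left so each entry can reuse the one after it.
    for i in range(n - 1, -1, -1):
        for c in alphabet:
            table[c][i] = i + 1 if s[i] == c else table[c][i + 1]
    return table


if __name__ == "__main__":
    print(lcs_dp("TGCATA", "ATCTGAT"))
    print(successor_table("TGCATA")["A"])
```

The quadratic table in `lcs_dp` is what becomes costly for long sequences. Under the assumed definition, a successor table lets a search jump directly from one matching character pair to the next candidate pair instead of scanning every cell.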