ISBN-13: 9783330005068 / English / Paperback / 2017 / 168 pp.
Data Mining of protein sequence databases poses challenges because many protein sequences are non-relational, whereas most Data Mining algorithms assume that the input data comes from a relational database. Further, a raw protein sequence database does not provide meaningful information until it is segregated into meaningful categories. In this book, a dataset of 1700 VEGF (Vascular Endothelial Growth Factor) protein sequences is used, and Data Mining algorithms are applied to it for prediction. In Biocomputing, Data Mining (DM) techniques are widely used for the prediction of protein structure. Interpreting voluminous biological data is complex, and the need for Data Mining concepts is significant. Molecular data such as DNA/protein sequences, levels of genetic expression, biochemical pathways, biomarkers and protein structures constitute a major part of biological data. The book discusses how standard Data Mining techniques, such as extraction of protein data, segregation by clustering, association and visualization, are performed on a real-time protein sequence dataset. The existing integrated tool BioParisodhana is compared with the novel tool BioBCDM, which outperforms it.