Modern biological databases comprise not only data, but also sophisticated query facilities and bioinformatics data analysis tools. This book explores the world of Bioinformatics Database Systems.
The book summarizes the popular and innovative bioinformatics repositories currently available, including primary genetic and protein sequence databases, phylogenetic databases, structure and pathway databases, microarray databases, and boutique databases. It also explores the data quality and information integration issues involved in managing bioinformatics databases, including quality problems that have been observed and efforts in the data cleaning field.
Biological data integration issues are also covered in depth, and the book demonstrates how data integration can create new repositories to address the needs of the biological communities. It also presents typical data integration architectures employed in current bioinformatics databases.
The latter part of the book covers biological data mining and biological data processing approaches using cloud-based technologies. General data mining approaches are discussed, as well as specific methodologies that have been successfully deployed in biological applications. Two biological data mining case studies are also included to illustrate how data, query, and analysis methods are integrated into user-friendly systems.
Aimed at researchers and developers of bioinformatics database systems, the book is also useful as a supplementary textbook for a one-semester upper-level undergraduate course, or an introductory graduate bioinformatics course.
About the Authors
Kevin Byron is a PhD candidate in the Department of Computer Science at the New Jersey Institute of Technology.
Katherine G. Herbert is Associate Professor of Computer Science at Montclair State University.
Jason T.L. Wang is Professor of Bioinformatics and Computer Science at the New Jersey Institute of Technology.