A Grid-based Bioinformatics Wrapper for Biological Databases
CAFARO, Massimo; ALOISIO, Giovanni
2008-01-01
Abstract
With the growing trend towards grid-based data repositories and data analysis services, scientific data analysis often involves accessing multiple data sources and analyzing the data with a variety of analysis programs. A closely related challenge is that data sources often hold the same type of data in a number of different formats; moreover, the formats expected and generated by the various data analysis services often differ. In bioinformatics, data are often stored in flat files, so retrieving a subset of records that satisfy given constraints is slower than with other approaches, such as a relational DBMS. We have developed a data grid system, built on top of specific biological data sources in flat file format, that ingests the data into a relational DBMS for data integration, reducing the data redundancy present in the biological flat files. In this work, we describe the prototype for ingesting the Swiss-2D PAGE flat file into a relational DBMS.
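The ingestion idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' prototype: it assumes a Swiss-Prot-style flat file layout (a two-letter line code, three spaces, then the value, with "//" closing each record), uses SQLite in place of whichever relational DBMS the system targets, and the table name `entry`, the function names, and the sample entries are all hypothetical.

```python
import sqlite3

# Hypothetical sample in an assumed Swiss-Prot-style layout: two-letter line
# code, three spaces, value; "//" ends a record. The entries are invented.
SAMPLE = """\
ID   EXAMPLE1_HUMAN
AC   X00001
DE   Hypothetical protein one
//
ID   EXAMPLE2_HUMAN
AC   X00002
DE   Hypothetical protein two
//
"""


def parse_records(text):
    """Yield one dict per record, mapping each line code to its value."""
    record = {}
    for line in text.splitlines():
        if line.startswith("//"):
            # End of record: emit it and start a fresh one.
            if record:
                yield record
            record = {}
            continue
        code, value = line[:2], line[5:].strip()
        if not code.strip():
            continue  # skip blank lines
        # Repeated line codes are joined into a single value.
        record[code] = f"{record[code]} {value}" if code in record else value


def ingest(text, conn):
    """Load parsed records into a relational table; return the row count."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS entry ("
        "accession TEXT PRIMARY KEY, name TEXT, description TEXT)"
    )
    rows = [(r.get("AC"), r.get("ID"), r.get("DE")) for r in parse_records(text)]
    conn.executemany("INSERT OR REPLACE INTO entry VALUES (?, ?, ?)", rows)
    conn.commit()
    return len(rows)


conn = sqlite3.connect(":memory:")
count = ingest(SAMPLE, conn)

# A constrained lookup becomes an indexed query instead of a file scan.
hit = conn.execute(
    "SELECT name FROM entry WHERE accession = ?", ("X00002",)
).fetchone()
```

Once the records are in a table, selecting a subset by constraint is a single SQL query over the primary key rather than a sequential pass over the flat file, which is the performance argument the abstract makes.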