The Journal of Biological Physics and Chemistry

2004

Volume 4, Number 2, p. 64–73


Exploration by visualization of numerical and textual genomic data

N. Férey, P.E. Gros, J. Hérisson and R. Gherbi

LIMSI-CNRS, Université Paris-Sud XI, BP 133, 91403 Orsay CEDEX, France

Biologists are leading current research on genome characterization (sequencing, alignment, transcription). However, genomic information has several characteristics that make it very difficult to exploit. The data are heterogeneous, huge in quantity, geographically distributed and recorded within public or private databanks, where they constitute an important factual data source (GenBank, SwissProt and Decrypthon). Genome knowledge is not limited to DNA or annotated protein sequences, however: a significant quantity of information relating to genes is recorded in an unstructured format within many publications (Medline). This paper presents GenomeExplorer, a new modelling and software solution for exploring textual and numerical genomic data, based on an adapted federator description language. GenomeExplorer offers biologists a user-friendly visualization of data within a virtual reality environment, using a well-adapted graphical paradigm. This solution allows biologists to explore huge sets of genomic data, and it could be applied to other fields. This kind of graph-based exploration has the advantage of displaying global topological characteristics, which are not easily visible using traditional exploration tools. Finally, some results produced by the GenomeExplorer software from various sets of biological data are presented.

Keywords: huge graph representation, virtual reality framework, visual data clustering, XML-based language to represent biological data
