LDR 03743naa a2200313 a 4500
001 1466823
005 2007-07-27
008 2004 bl uuuu u00u1 u #d
100 1 $a BINNECK, E.
245 $a Bioinformatics tools for sequence analysis and annotation applied to soybean functional genome.
260 $c 2004
300 $a p. 248-249.
490 $a (Embrapa Soja. Documentos, 228).
500 $a Edited by Flávio Moscardi, Clara Beatriz Hoffmann-Campo, Odilon Ferreira Saraiva, Paulo Roberto Galerani, Francisco Carlos Krzyzanowski, Mercedes Concordia Carrão-Panizzi.
520 $a Worldwide functional genomics studies are playing an important role in biotechnology by identifying genes that can be used to improve specific biological processes in plants. Large-scale gene discovery projects such as these depend on highly accurate data. The data must not only be trustworthy but also correctly annotated for the various features they contain. In this work we report a bioinformatics system designed to process and annotate the expressed sequence tags (ESTs) obtained by the project Functional Genome of Soybean Roots at Embrapa Soybean (http://www.cnpso.embrapa.br/bioinformatica). The system consists of Perl and PHP scripts that perform automated sequence analysis and support the annotation process, backed by a MySQL database. Various Perl scripts were written to assist the sequence analysis process, which includes base calling, clustering and assembly of the reads, filtering of contaminating, repetitive and low-quality sequences, identification of sequence features, BLAST searches and report generation. BLAST (Altschul et al., 1997, Nucleic Acids Res. 25:3389-3402) outputs are processed and arranged so that they are easily accessed by the personnel who handle the bulk of the data for annotation. In addition, PHP scripts were written to provide a user-friendly annotation interface through dynamic Web pages that perform the database operations needed for complete annotation of the genes, which comprise data retrieval aided by keyword-linked query searches, data insertion, update, and generation of intuitive reports describing the results. These tools are helping to direct our work on the identification, cloning and characterization of genes and regulatory sequences potentially useful in the improvement of soybean through genetic engineering. As a result, nearly 8,000 ESTs were obtained from cDNA clones derived from soybean roots under drought stress and nematode infection conditions. Consensus sequences are being functionally annotated and used to construct cDNA microarrays that will be useful for analyzing gene expression under a broad variety of conditions. Initially we are studying drought stress and nematode infection conditions. Analysis of the interactions of soybean roots with these challenging conditions will be used to identify new possible sources of resistance and tolerance. Candidate genes will be studied in depth and can be used in the production of transgenic plants. This work was supported by grants from CNPq, PRODETAB, Jircas and Embrapa.
700 1 $a SILVA, J. F. V.
700 1 $a NEUMAIER, N.
700 1 $a FARIAS, J. R. B.
700 1 $a ARIAS, C. A. A.
700 1 $a ALMEIDA, A. M. R.
700 1 $a MARIN, S. R. R.
700 1 $a WENDLAND, A.
700 1 $a SILVEIRA, C. A. da
700 1 $a MOLINA, J. C.
700 1 $a LEMOS, N. G.
700 1 $a FUGANTI, R.
700 1 $a STOLF, R.
700 1 $a NEPOMUCENO, A. L.
773 $t In: WORLD SOYBEAN RESEARCH CONFERENCE, 7.; INTERNATIONAL SOYBEAN PROCESSING AND UTILIZATION CONFERENCE, 4.; CONGRESSO BRASILEIRO DE SOJA, 3., 2004, Foz do Iguassu. Abstracts of contributed papers and posters. Londrina: Embrapa Soybean, 2004.