Author(s): Lourenço, Anália; Carneiro, S.; Rocha, I.; Ferreira, E. C.
Date: 2010
Persistent Identifier: http://hdl.handle.net/1822/23826
Source: RepositóriUM - Universidade do Minho
Description
Book of abstracts of the Meeting of the Institute for Biotechnology and Bioengineering, 2, Braga, Portugal, 2010
Here we present some of our research efforts towards the reconstruction of
genome-scale models. Specifically, we focus on the development of cross-cutting computational
strategies for the integration and validation of heterogeneous data in support of traditional
manual curation, and we describe application scenarios on the model organism E. coli.
We address the systematic comparison of database contents and the harvest and extraction
of contents from scientific literature. To help researchers assess the gains and losses
to be accounted for in biological repositories, and thus choose the most content-bearing
repositories for each particular integration problem or domain, we have implemented a Web-alike
report tool [1]. This tool analyses the contents of well-known repositories under user-specified
integration scenarios considering the coverage of main biological entities (genes,
proteins and compounds) and the evaluation of standard nomenclatures, common names
and repository cross-links as elements of integration. Also, acknowledging that most
biological data still lies in the scientific literature and requires extensive and time-consuming
manual curation, we have been developing literature screening and processing tools [2]. The
goal is to systematise the search of relevant literature based on user-specified keywords and
the extraction of relevant information by applying statistical approaches that exploit simple
pattern matching, machine learning and ontological enrichment.
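
The repository comparison described above can be illustrated with a minimal sketch. It is not the report tool from [1]. It only assumes that each repository can be reduced to a set of entity identifiers (for example, gene names), and it measures coverage and pairwise overlap with plain set operations. The repository names, the function names and the toy gene lists are hypothetical.

```python
from itertools import combinations


def coverage(repos):
    """Fraction of all known identifiers that each repository contains.

    `repos` maps a (hypothetical) repository name to a set of identifiers.
    """
    all_ids = set().union(*repos.values())
    if not all_ids:
        return {name: 0.0 for name in repos}
    return {name: len(ids) / len(all_ids) for name, ids in repos.items()}


def pairwise_overlap(repos):
    """Shared and exclusive identifier counts for every pair of repositories."""
    result = {}
    for first, second in combinations(sorted(repos), 2):
        ids_first, ids_second = repos[first], repos[second]
        result[(first, second)] = {
            "shared": len(ids_first & ids_second),
            "only_first": len(ids_first - ids_second),
            "only_second": len(ids_second - ids_first),
        }
    return result


if __name__ == "__main__":
    # Toy data: the memberships below are invented for illustration only.
    toy_repos = {
        "repo_A": {"thrA", "thrB", "thrC"},
        "repo_B": {"thrB", "thrC", "lacZ"},
    }
    print(coverage(toy_repos))
    print(pairwise_overlap(toy_repos))
```

A real comparison would also need to reconcile standard nomenclatures, common names and cross-links before identifiers can be treated as equal, which this sketch leaves out.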
Considering the wide scope of current applications that can benefit from the analysis of large
amounts of data, all our tools are publicly available through our group’s Web pages
(http://biopseg.deb.uminho.pt).