Document details

Grabbing parallel corpora from the web

Author(s): Almeida, J. J.; Simões, Alberto; Castro, José Alves de

Date: 2002

Persistent ID: http://hdl.handle.net/1822/599

Origin: RepositóriUM - Universidade do Minho

Subject(s): Parallel corpora; Web-mining


Description
Multilingual resources are useful for linguistic studies, translation, and many other tasks. Unfortunately, these resources are difficult to obtain and organize. In this document we describe a set of tools designed to help mine bilingual resources from the web, from a specific site, from a file system, from a list of URLs, or from a translation memory. As a design goal, we intend to build tools that can be used both cooperatively (in a pipeline) and independently.
Document Type: Article
Language: English