Found 5 documents, showing page 1 of 1

Sorted by date

Efficient clustering of web-derived data sets

Luís António Diniz Fernandes de Morais Sarmento; Eugénio da Costa Oliveira; Alexander P. Kehlenbeck; Lyle Ungar

Many data sets derived from the web are large, high-dimensional, sparse and have a Zipfian distribution of both classes and features. On such data sets, current scalable clustering methods such as streaming clustering suffer from fragmentation, where large classes are incorrectly divided into many smaller clusters, and computational efficiency drops significantly. We present a new clustering algorithm based on ...

Date: 2009 | Source: Repositório Aberto da Universidade do Porto

Automatic Extraction of Quotes and Topics from News Feeds

Luís António Diniz Fernandes de Morais Sarmento; Sérgio Sobral Nunes

The explosive growth in information production poses increasing challenges to consumers, who are confronted with problems often described as "information overflow". We present verbatim, a software system that can be used as a personal information butler to help structure and filter information. We address a small part of the information landscape, namely quote extraction from Portuguese news. This problem ...

Date: 2009 | Source: Repositório Aberto da Universidade do Porto

An Approach to Web-scale Named-Entity Disambiguation

Luís António Diniz Fernandes de Morais Sarmento; Eugénio da Costa Oliveira

We present a multi-pass clustering approach to large-scale, wide-scope named-entity disambiguation (NED) on collections of web pages. Our approach uses name co-occurrence information to cluster and hence disambiguate entities, and is designed to handle NED on the entire web. We show that on web collections, NED becomes increasingly difficult as the corpus size increases, not only because of the challenge of sca...

Date: 2009 | Source: Repositório Aberto da Universidade do Porto
