Besides source code, the fundamental source of information about Open Source Software lies in documentation, and other non source code files, like README, INSTALL, or HowTo files, commonly available in the software ecosystem. These documents, written in natural language, provide valuable information during the software development stage, but also in future maintenance and evolution tasks. DMOSS is a toolkit des...
This work aims at pointing out the benefits of a topology-oriented wide scope, but differentiated, profile analysis. The goal was to conciliate advanced common website usage profiling techniques with the analysis of the website's topology information, outputting valuable knowledge in an intuitive and comprehensible way. Server load balancing, crawler activity evaluation and Web site restructuring are the primar...
There are different definitions for ontologies. Different knowledge areas tend to define ontologies in a different way. For computer science, an ontology can be used to describe, in a well defined and structured way, knowledge about a specific domain. These artifacts store rich information that can be reasoned about, this information can also be target of many structured processing functions. There is a diversi...
In this paper we describe how Dicionáio-Aberto, an online dictionary for the Portuguese language, is being used as the base to construct diverse resources that are relevant in the processing of the Portuguese language. We will briefly present its history, explaining how we got here. Then, we will describe the resources already available to download and use, followed by the discussion on the resources that are b...
Perl is known for its versatile regular expressions. Nevertheless, using Perl regular expressions for creating fast lexical analyzer is not easy. As an alternative, the authors defend the automated generation of the lexical analyzer in a well known fast application (flex) based on a simple Perl definition in the syntactic analyzer. In this paper we extend the syntax used by Parse::Yapp, one of the most used par...
The eXtensible Mark-up Language (XML) is probably one of the most popular markup languages available today. It is very typical to find all kind of services or programs representing data in this format. This situation is even more common in web development environments or Service Oriented Architectures (SOA), where data flows from one service to another, being consumed and produced by an heterogeneous set of app...
Most existing programming languages can be categorized as general purpose programming languages, meaning that they can be used to implement solutions for any given domain. They are not, in any way, optimized for a specific set of problems. In contrast, Domain Specific Languages (DSL) are used to solve specific problems in a well defined domain. DSL are optimized to a particular set of problems, but they lack su...
Today, most developers prefer to store information in databases. But plain filesystems were used for years, and are still used, to store information, commonly in files of heterogeneous formats that are organized in directory trees. This approach is a very flexible and natural way to create hierarchical organized structures of documents. We can devise a formal notation to describe a filesystem tree structure, si...
Music sources are most commonly shared in music scores scanned or printed on paper sheets. These artifacts are rich in information, but since they are images it is hard to re-use and share their content in todays’ digital world. There are modern languages that can be used to transcribe music sheets, this is still a time consuming task, because of the complexity involved in the process and the typical huge size ...
Synonyms dictionaries are useful resources for natural language processing. Unfortunately their availability in digital format is limited, as publishing companies do not release their dictionaries in open digital formats. Dicionário-Aberto is an open and free digital synonyms dictionary for the Portuguese language. It is under public domain which makes it usable for any task. Synonyms dictionaries are commonly ...
Financiadores do RCAAP | |||||||
![]() |
![]() |
![]() |
![]() |
![]() |
![]() |