velharia1


year
1987
number
1
journal
Revista de Informática
docpage
jj.bib.dp.html#velharia1
volume
6
title
Descrição de um Núcleo Gráfico e Aplicação em {CAD
chave
velharia1
tipo
article
author
note
(KGUM - kernel gráfico U.Minho)


velharia2


journal
Revista de Informática
docpage
jj.bib.dp.html#velharia2
volume
9
year
1988
number
6
chave
velharia2
tipo
article
author
title
Sistemas de Programação Modular


graminteractivas1990


title
Mecanismos para Especificação e Prototipagem de Interfaces Utilizador-Sistema
note
(Gramáticas Interactivas guardadas)
author
tipo
inproceedings
chave
graminteractivas1990
year
1990
address
Coimbra
booktitle
3$º$ Encontro Português de Computação Gráfica
docpage
jj.bib.dp.html#graminteractivas1990


tlc89


year
1988
docpage
jj.bib.dp.html#tlc89
type
Teaching text
title
Teoria das Linguagens
keyword
institution
Universidade do Minho, Departamento de Informática
chave
tlc89
tipo
techreport
author


estruturasdedados90


title
Estruturas de Dados
keyword
institution
Universidade do Minho, Departamento de Informática
chave
estruturasdedados90
tipo
techreport
author
year
1990
docpage
jj.bib.dp.html#estruturasdedados90
type
Teaching text


Camila


title
\textsc{Camila} - A Platform for Software Mathematical Development
tipo
techreport
docpage
jj.bib.dp.html#Camila
type
(Project pages)
keyword
chave
Camila
institution
Universidade do Minho, Departamento de Informática
author
year
1998
url
http://camila.di.uminho.pt
editor
L.S. Barbosa and J.J. Almeida and J.N. Oliveira and Luís Neves


Natura


title
{Natura} - Natural language processing
tipo
techreport
note
\url{http://natura.di.uminho.pt/}
docpage
jj.bib.dp.html#Natura
type
(Project pages)
keyword
institution
Universidade do Minho, Departamento de Informática
chave
Natura
author
year
1997
url
http://natura.di.uminho.pt/


PDavid


author
institution
Universidade do Minho, Departamento de Informática
chave
PDavid
keyword
editor
J.C. Ramalho and J.J. Almeida and P.R. Henriques
url
http://www.di.uminho.pt/~jcr/projectos/david/princ.html
year
1998
tipo
techreport
note
\url{http://www.di.uminho.pt/~jcr/projectos/david/princ.html}
title
David -- Processamento estruturado de documentos
docpage
jj.bib.dp.html#PDavid
type
(Project pages)


nllex


chave
nllex
tipo
misc
author
title
NLlex -- Natural Language LEX
keyword
url
http://natura.di.uminho.pt/~jj/pln/pln.html#nllex
docpage
jj.bib.dp.html#nllex
type
tool
year
1996


jspell


year
1997
type
tool
docpage
jj.bib.dp.html#jspell
url
http://natura.di.uminho.pt/~jj/pln/pln.html#jspell
keyword
title
Jspell a module for morphological analyser for natural language
author
tipo
misc
chave
jspell


jspell1


docpage
jj.bib.dp.html#jspell1
type
Manual
tipo
techreport
title
Manual de Utilizador do {JSpell}
url
http://natura.di.uminho.pt/~jj/pln/jspellman.ps.gz
year
1994
abstract
month
Jul
chave
jspell1
institution
Universidade do Minho, Departamento de Informática
author
keyword


Almeida94b


year
1994
url
http://natura.di.uminho.pt/~jj/pln/yalg3.ps.gz
docpage
jj.bib.dp.html#Almeida94b
editor
Carlos Martin Vide
booktitle
Actas del X Congreso de Lenguajes Naturales y Lenguajes Formales, Sevilla
title
{GPC} -- a Tool for higher-order grammar specification
keyword
chave
Almeida94b
tipo
inproceedings
author


Almeida95a


title
{YaLG} -- extending {DCG} for natural language processing
tipo
inproceedings
pages
621--628
docpage
jj.bib.dp.html#Almeida95a
booktitle
Actas del XI Congreso de Lenguajes Naturales y Lenguajes Formales, Tortosa
keyword
chave
Almeida95a
author
year
1995
url
http://natura.di.uminho.pt/~jj/pln/yalg.ps.gz
editor
Carlos Martin Vide


Almeida94c


title
Jspell -- um módulo para análise léxica genérica de linguagem natural
tipo
inproceedings
pages
1--15
booktitle
Actas do X Encontro da Associação Portuguesa de Linguística
docpage
jj.bib.dp.html#Almeida94c
keyword
author
chave
Almeida94c
year
1995
address
Évora 1994
url
http://natura.di.uminho.pt/~jj/pln/jspell1.ps.gz


Almeida94a


tipo
inproceedings
author
chave
Almeida94a
keyword
title
Documents in an Informatic Academic environment
docpage
jj.bib.dp.html#Almeida94a
booktitle
Congresso Nacional de Bibliotecários, Arquivistas e Documentalistas
year
1994
address
Lisboa


jj95


number
UM-DI-95.04
year
1995
url
http://natura.di.uminho.pt/~jj/pln/nllex.ps.gz
docpage
jj.bib.dp.html#jj95
title
{NLlex} -- a tool to generate lexical analysers for natural language
keyword
institution
Universidade do Minho, Departamento de Informática
chave
jj95
author
tipo
techreport


Barbosa95


url
http://www.di.uminho.pt/~lsb/pub_camila/LNcam.ps.gz
year
1995
institution
University of Minho
chave
Barbosa95
author
keyword
docpage
jj.bib.dp.html#Barbosa95
number
DI-CAM-95:11:1
note
Lecture notes for the System Design Course, Computer System Engineering, University of Bristol
tipo
techreport
title
System Prototyping in \textsc{Camila}


Barbosa95a


year
1995
number
DI-CAM-95:11:2
url
http://www.di.uminho.pt/~lsb/pub_camila/RMcam.ps.gz
docpage
jj.bib.dp.html#Barbosa95a
keyword
title
\textsc{Camila}: A reference Manual
tipo
techreport
author
chave
Barbosa95a
institution
University of Minho


BA97a


number
DI-CAM-95:11:1:v98
year
1998
type
{Lecture Notes for the Bristol Course (1st ed. 1995)}
docpage
jj.bib.dp.html#BA97a
keyword
title
Systems Prototyping in \textsc{Camila}
author
tipo
techreport
institution
DI (U. Minho)
chave
BA97a


Barbosa95b


number
DI-CAM-95:7:1
year
1995
url
http://www.di.uminho.pt/~lsb/pub_camila/romantic.ps.gz
docpage
jj.bib.dp.html#Barbosa95b
keyword
title
Growing Up With \textsc{Camila}
author
tipo
techreport
institution
Universidade do Minho, Departamento de Informática
chave
Barbosa95b


Almeida96a


chave
Almeida96a
author
keyword
url
http://natura.di.uminho.pt/~jj/pln/etdic.ps.gz
address
Lisboa 1995
year
1996
tipo
inproceedings
title
Especificação e tratamento de Dicionários
booktitle
Actas do XI Encontro da Associação Portuguesa de Linguística
docpage
jj.bib.dp.html#Almeida96a
volume
2


Ulisses96


docpage
jj.bib.dp.html#Ulisses96
booktitle
Actas do XI Encontro da Associação Portuguesa de Linguística
volume
2
tipo
inproceedings
title
Tratamento automático de termos compostos
url
http://natura.di.uminho.pt/~jj/pln/ptc.ps.gz
address
Lisboa 1995
year
1996
chave
Ulisses96
author
keyword


Almeida96b


year
1996
booktitle
II International Conference on Mathematical Linguistics, Tarragona, Spain
url
http://natura.di.uminho.pt/~jj/pln/yalg2.ps.gz
docpage
jj.bib.dp.html#Almeida96b
title
{YaLG} a tool for higher-order grammar specification
keyword
chave
Almeida96b
author
tipo
inproceedings


jj96


url
http://natura.di.uminho.pt/~jj/pln/nllex2.ps.gz
month
Sep
year
1996
chave
jj96
author
keyword
journal
Procesamiento del Lenguaje Natural
docpage
jj.bib.dp.html#jj96
volume
19
pages
81--90
publisher
Sociedad Española para el Procesamiento del Lenguaje Natural
tipo
article
title
{NLlex} -- a tool to generate lexical analysers for natural language


SGML97


year
1997
month
Dec.
address
Washington D.C. - USA
docpage
jj.bib.dp.html#SGML97
booktitle
SGML/XML'97 Conference
keyword
title
SGML Documents: where does quality go?
tipo
inproceedings
author
chave
SGML97


Almeida98


title
Programação de dicionários
tipo
inproceedings
pages
21--28
docpage
jj.bib.dp.html#Almeida98
booktitle
Actas do XIII Encontro da Associação Portuguesa de Linguística
volume
1
keyword
chave
Almeida98
author
address
Lisboa 1997
year
1998
url
http://natura.di.uminho.pt/~jj/bib/progDic.ps.gz


Reis98


title
Etiquetador morfo-sintáctico para o Português
keyword
chave
Reis98
author
tipo
inproceedings
address
Lisboa 1997
year
1998
booktitle
Actas do XIII Encontro da Associação Portuguesa de Linguística
docpage
jj.bib.dp.html#Reis98
url
http://natura.di.uminho.pt/~jj/bib/etiquetador2.ps.gz


ABNO97a


keyword
author
chave
ABNO97a
month
October
year
1997
address
La Plata, Argentina
url
http://camila.di.uminho.pt/camila-doc/CLaPF97.ps.gz
editor
De Giusti, A. and Diaz, J. and Pesado, P.
title
\textsc{Camila}: Formal Software Engineering Supported by Functional Programming
tipo
inproceedings
pages
1343--1358
booktitle
Proc. II Conf. Latino Americana de Programación Funcional ({CLaPF97})
docpage
jj.bib.dp.html#ABNO97a


ABNO97b


author
chave
ABNO97b
keyword
editor
Johnson, M.
month
December
year
1997
address
Sydney, Australia
publisher
Springer Lect. Notes Comp. Sci. (1349)
tipo
inproceedings
title
\textsc{Camila}: Prototyping and Refinement of Constructive Specifications
booktitle
6th International Conference on Algebraic Methods and Software Technology ({AMAST'97})
docpage
jj.bib.dp.html#ABNO97b
pages
554--559


AH97


docpage
jj.bib.dp.html#AH97
booktitle
Proc. II Conference on Knowledge-based Intelligent Electronic Systems ({Kes98})
title
Dynamic Dictionary = cooperative information sources
tipo
inproceedings
address
Australia
year
1998
month
April
url
http://natura.di.uminho.pt/~jj/bib/agentes97.ps.gz
keyword
chave
AH97
author


museums98


title
Adapting Museum Structures for the Web: No Changes Needed!
chave
museums98
note
Toronto - Canadá
author
tipo
inproceedings
year
1998
booktitle
Museums and the Web 1998
docpage
jj.bib.dp.html#museums98


ABBN98


tipo
inproceedings
author
publisher
Proc. 3rd Summer School on Advan. Funct. Prog., Braga
chave
ABBN98
title
On The Development of \textsc{Camila}
editor
L.S. Barbosa and J.A. Saraiva
docpage
jj.bib.dp.html#ABBN98
booktitle
Workshop on Research Themes on Functional Programming
year
1998
month
18 Sep.


Gis99


docpage
jj.bib.dp.html#Gis99
booktitle
Conferência da Association of Geographic Information Laboratories for Europe (AGILE)
address
Roma
year
1999
chave
Gis99
tipo
inproceedings
author
title
Formal Methods for {GI
keyword


RPA99


author
tipo
inproceedings
chave
RPA99
keyword
title
{MAPit
booktitle
Conferência da Association of Geographic Information Laboratories for Europe (AGILE)
docpage
jj.bib.dp.html#RPA99
year
1999
address
Roma


RSea99


year
1999
keyword
title
Sobre a Utilização de Metodologias Formais no Desenvolvimento de {SIG
tipo
inproceedings
author
docpage
jj.bib.dp.html#RSea99
booktitle
GISBRASIL'99, Salvador
chave
RSea99


xmldt99


chave
xmldt99
tipo
inproceedings
author
title
{XML::DT
keyword
docpage
jj.bib.dp.html#xmldt99
booktitle
XML-Europe'99, Granada - Espanha
year
1999
month
May


RRAH99


year
1999
chave
RRAH99
author
keyword
docpage
jj.bib.dp.html#RRAH99
journal
Markup Languages: theory and practice
pages
75--90
volume
1
publisher
MIT Press
tipo
article
title
SGML documents: Where does quality go?


Barbosa2000


author
tipo
inproceedings
chave
Barbosa2000
keyword
title
Polytypic Recursion Patterns
booktitle
{SBLP'2000} (to appear as a ENTCS volume)
docpage
jj.bib.dp.html#Barbosa2000
month
May
year
2000
address
{UFP}, Recife, Brasil


jj2001x


title
Smallbook -- comando para produção de livros em pequena escala
keyword
chave
jj2001x
tipo
inproceedings
author
address
Braga
pages
445--450
year
2000
docpage
jj.bib.dp.html#jj2001x
booktitle
Actas da II Conferência Internacional de Tecnologias de Informação e Comunicação na Educação


speaker:sepln2001


chave
speaker:sepln2001
author
keyword
address
Sevilha
month
Sep.
year
2001
publisher
Sociedad Española para el Procesamiento del Lenguaje Natural
tipo
article
title
Text to speech -- a rewriting system approach
docpage
jj.bib.dp.html#speaker:sepln2001
journal
Procesamiento del Lenguaje Natural
volume
27
pages
247--255


mp2001


month
May
year
2001
address
Porto
booktitle
Congresso Nacional de Bibliotecários, Arquivistas e Documentalistas
url
http://natura.di.uminho.pt/~jj/bib/museuDaPessoa2001.ps.gz
docpage
jj.bib.dp.html#mp2001
title
{Museu da Pessoa
author
tipo
inproceedings
chave
mp2001


alfarrabio2001


tipo
inproceedings
author
chave
alfarrabio2001
title
Alfarrábio: Adding value to an Heterogeneous Site Collection
docpage
jj.bib.dp.html#alfarrabio2001
url
http://natura.di.uminho.pt/~jj/bib/alfarrabio2001.ps.gz
booktitle
Congresso Nacional de Bibliotecários, Arquivistas e Documentalistas
year
2001
month
May
address
Porto


freq2002


title
Cálculo de frequências de palavras para entradas de dicionários através do uso conjunto de analisadores morfológicos, taggers e corpora
tipo
inproceedings
pages
407--418
booktitle
Actas do XVII Encontro da Associação Portuguesa de Linguística
docpage
jj.bib.dp.html#freq2002
author
chave
freq2002
abstract
This document presents a possible approach to extracting word frequencies from corpora, based on the cooperative use of several Natural Language Processing tools.
year
2002
address
Lisboa 2001
url
http://natura.di.uminho.pt/~jj/bib/apl:freqnormpt.ps.gz


jspell2002


address
Lisboa 2001
pages
485--495
year
2002
abstract
This document presents the features of the jspell morphological analyser and their consequences for natural language processing applications. Because jspell is frequently integrated into more specific software, we present a Perl module developed to make it easier to connect the morphological analyser to small applications written in scripting languages. Given the constant need to improve dictionaries, and morphological analysers in particular, we discuss the properties these should have to ease their automatic processing and enrichment.
docpage
jj.bib.dp.html#jspell2002
booktitle
Actas do XVII Encontro da Associação Portuguesa de Linguística
title
Jspell.pm -- um módulo de análise morfológica para uso em processamento de linguagem natural
chave
jspell2002
tipo
inproceedings
author


dag2002


chave
dag2002
author
tipo
inproceedings
title
Directory Attribute Grammars
booktitle
VI Simpósio Brasileiro de Linguagens de Programação
docpage
jj.bib.dp.html#dag2002
pages
297--308
address
Rio de Janeiro, Brasil
year
2002


elpub2002


booktitle
Elpub 2002 -- International Conference on Electronic Publishing
docpage
jj.bib.dp.html#elpub2002
month
Nov.
abstract
In recent years the amount of digital documents has increased dramatically. Unfortunately, the same did not occur with the structure and organization of the information. Traditionally, we build a digital library using a catalog with documents' meta-information, including a conceptual classification and an ontology of concepts. In this document we present a set of modules to help in the task of building and maintaining a digital library. It includes a module to work with ontologies, a set of modules to handle specific catalog formats (like Bib\TeX), a module to define new catalog formats and a tool to integrate ontologies and multi-format catalogs in a web browse-able knowledge-base.
year
2002
pages
203--211
address
Karlovy Vary, República Checa
author
tipo
inproceedings
chave
elpub2002
isbn
3-89700-357-0
title
Library::* -- a toolkit for digital libraries


parguess2002


month
Sep.
abstract
Multilingual resources are useful for linguistic studies, translation, and many other tasks. Unfortunately, these resources are difficult to obtain and organize. In this document we describe a set of tools designed to help in the task of mining bilingual resources from the web, from a specific site, from a file system, from a list of URLs, or from a translation memory. As a design goal we intend to build tools that can be used both cooperatively (in a pipeline) and independently.
year
2002
chave
parguess2002
author
pages
13--20
journal
Procesamiento del Lenguaje Natural
docpage
jj.bib.dp.html#parguess2002
volume
29
title
Grabbing parallel corpora from the web
publisher
Sociedad Española para el Procesamiento del Lenguaje Natural
tipo
article


cP


number
3
journal
The Perl Review
docpage
jj.bib.dp.html#cP
volume
0
title
Cooking Perl with flex
tipo
article
year
2002
month
May
abstract
There are a lot of tools for parser generation using Perl. As we know, Perl has flexible data structures which makes it easy to generate generic trees. While it is easy to write a grammar and a lexical analyzer using modules like \texttt{Parse::Yapp
chave
cP
author


APL2k2.Parguess


docpage
jj.bib.dp.html#APL2k2.Parguess
booktitle
Actas do XVIII Encontro da Associação Portuguesa de Linguística
title
Extracção de corpora paralelo a partir da web: construção e disponibilização
tipo
inproceedings
lang
PT
year
2003
abstract
This document describes a set of tools built for the automatic extraction of bilingual resources from the Web, from a \emph{site
address
Porto 2002
url
http://alfarrabio.di.uminho.pt/~albie/publications/APL2k2.Parguess.pdf
author
chave
APL2k2.Parguess


APL2k2.Synthesis


booktitle
Actas do XVIII Encontro da Associação Portuguesa de Linguística
docpage
jj.bib.dp.html#APL2k2.Synthesis
title
Geração de voz com sotaque
tipo
inproceedings
lang
PT
abstract
As is well known, accents can be tied to a geographic area or a social group, and can even be a personal trait. Their study and description has interested many researchers, although that study has usually been carried out rather informally. The work reported here attempts to describe accents and dysfunctions formally, by creating rules to be integrated as variants in a speech generator. The aim was an environment for experimenting with the models built to describe some characteristics of certain accents or dysfunctions, so that those models can be validated. We found that certain dysfunctions and accents are easily obtained simply by adding optional rules at certain stages of speech generation. Others appear to be harder, either because we do not know the phenomena involved well enough or because they involve greater prosodic complexity.
year
2003
address
Porto 2002
url
http://alfarrabio.di.uminho.pt/~albie/publications/APL2k2.Synthesis.pdf
author
chave
APL2k2.Synthesis


xata:xmldt


tipo
inproceedings
lang
PT
title
Engenharia reversa de {HTML} usando tecnologia {XML}
docpage
jj.bib.dp.html#xata:xmldt
booktitle
{XATA --- XML}, Aplicações e Tecnologias Associadas
author
chave
xata:xmldt
irreditor
José Carlos Ramalho
url
http://alfarrabio.di.uminho.pt/~albie/publications/xata2003xml.pdf
year
2003
abstract
The proliferation of HTML authoring tools and the appearance-driven use of HTML have been ruining its conceptual side. This problem was recognised and gave rise to several formats and technologies intended to separate appearance from concept. Even so, there is today an enormous number of HTML pages whose conceptual and structural reading is very poor, which rules out many possible uses of the information they contain. This paper presents work (at an early stage) that aims to reverse-engineer HTML in order to increase its accessibility, so that it can be used in a \emph{browser


xata:museudapessoa


chave
xata:museudapessoa
author
strutural
M
url
http://alfarrabio.di.uminho.pt/~albie/publications/xata2003mp.pdf
editor
José Carlos Ramalho
abstract
This article presents the current architecture of the Museu da Pessoa, covering how documents are being edited, catalogued, archived, and processed to create the structures the Museum needs.
year
2003
lang
PT
tipo
inproceedings
title
{H
booktitle
{XATA --- XML}, Aplicações e Tecnologias Associadas
docpage
jj.bib.dp.html#xata:museudapessoa


elpub2003


pages
288--298
booktitle
ElPub 2003 -- International conference on electronic publishing
docpage
jj.bib.dp.html#elpub2003
title
Music publishing
publisher
Universidade do Minho
note
Guimarães
tipo
inproceedings
lang
EN
isbn
972-98921-2-1
abstract
Current music publishing on the Internet is mainly concerned with sound publishing. We claim that music publishing is not only about making sound available but also about defining relations between a set of music objects like music scores, guitar chords, lyrics and their meta-data. We want an easy way to publish music on the Internet, to make high quality paper booklets and even to create audio CDs. In this document we present a workbench for music publishing based on open formats, using open-source tools and script programming over them. The workbench is based on an archive specification written in a text-based format which includes sound references, music scores, chords and lyrics and their meta-information.
month
June
year
2003
editor
Sely Costa et al.
url
http://alfarrabio.di.uminho.pt/~albie/publications/elpub2003.pdf
keyword
author
chave
elpub2003


cp3a:terminum2003


abstract
The main goals of the TerminUM project are the study of, experimentation with, and creation of resources in the areas of parallel corpora, (descriptive) terminology, and multilingual resources linked to corpora: extracting corpora from the web as automatically as possible; extracting dictionaries, terminology, and other translation-related resources; creating and interconnecting the tools developed; and creating and making available (1) lists of bitexts, corpora, and parallel corpora, (2) tools for creating and transforming corpora, and (3) multilingual resources derived from or linked to corpora. This presentation covers some tasks currently under way in the project, namely: the life cycle of corpus construction and transformation; a summary of the tools developed (and under development); construction of parallel corpora from film subtitles, internationalisation files (.po software messages), and translation memory files (TMX); and web-based animation of parallel corpora (building query engines using various tools).
month
Jun.
year
2003
url
http://alfarrabio.di.uminho.pt/~albie/publications/cp3a2003-terminum.pdf
keyword
author
chave
cp3a:terminum2003
pages
7--14
booktitle
CP3A 2003 -- Workshop em Corpora Paralelos: aplicações e algoritmos associados
docpage
jj.bib.dp.html#cp3a:terminum2003
title
Projecto {TerminUM}
publisher
Universidade do Minho
note
Braga
tipo
inproceedings


cp3a:kvec2003


month
Jun.
year
2003
pages
65--70
booktitle
CP3A 2003 -- Workshop em Corpora Paralelos: aplicações e algoritmos associados
docpage
jj.bib.dp.html#cp3a:kvec2003
keyword
title
{Lingua-Biterm}: um módulo Perl para extracção de terminologia bilingue
publisher
Universidade do Minho
note
Braga
author
tipo
inproceedings
chave
cp3a:kvec2003


cp3a:natools2003


chave
cp3a:natools2003
note
Braga
publisher
Universidade do Minho
author
tipo
inproceedings
title
Alinhamento de corpora paralelos
keyword
booktitle
CP3A 2003 -- Workshop em Corpora Paralelos: aplicações e algoritmos associados
docpage
jj.bib.dp.html#cp3a:natools2003
pages
71--77
month
Jun.
year
2003


sepln2003


volume
31
docpage
jj.bib.dp.html#sepln2003
journal
Procesamiento del Lenguaje Natural
pages
217--224
tipo
article
publisher
Sociedad Española para el Procesamiento del Lenguaje Natural
title
{NATools} -- A Statistical Word Aligner Workbench
year
2003
month
Sep.
abstract
This document presents the TerminUM project and the work done in its statistical word aligner workbench (NATools). It shows a variety of alignment methods for parallel corpora and discusses the resulting terminological dictionaries and their use: evaluation of sentence translations; construction of a multi-level navigation system for linguistic studies or statistical translations.
author
chave
sepln2003
keyword


tesejj


author
chave
tesejj
url
http://natura.di.uminho.pt/~jj/bib/tesejj.pdf
year
2003
tipo
phdthesis
lang
PT
title
Dicionários dinâmicos multi-fonte
docpage
jj.bib.dp.html#tesejj
school
Universidade do Minho
type
PhD thesis
superviser
Pedro Rangel Henriques


teseambs


year
2004
url
http://alfarrabio.di.uminho.pt/~albie/publications/msc.pdf
chave
teseambs
author
docpage
jj.bib.dp.html#teseambs
superviser
José João Almeida and Pedro Rangel Henriques
type
Master's thesis
school
Escola de Engenharia - Universidade do Minho
title
Parallel Corpora word alignment and applications
lang
EN
tipo
mastersthesis


xata04:tx


title
{TX
lang
PT
isbn
972-99166-0-8
tipo
inproceedings
pages
217--224
booktitle
{XATA 2004
docpage
jj.bib.dp.html#xata04:tx
irreditor
José Carlos Ramalho and Alberto Simões
chave
xata04:tx
author
month
February
abstract
Document validation has been a focus of attention since the advent of SGML and, later, XML. This validation arose to check the structure of SGML and XML documents using DTDs. In addition, because XML is more restrictive than SGML, well-formedness checking of XML has also been used. More recently, Schema and Schematron have enabled validation at a higher level: not only the structure of the document but also some validation of its content. In this article we present TX, a tool aimed at a further level of validation, in which types can be richer and/or computed dynamically, and in which functions can be defined to annotate and/or correct the portions of a document that do not follow the specification.
year
2004
url
http://alfarrabio.di.uminho.pt/~albie/publications/xata04-tx.pdf


xata04:mtd


year
2004
month
February
abstract
This document presents the concept of distributed translation memories, discussing their relevance to translation and the advantages a translation tool can gain from using them. A possible implementation of distributed translation memories using WebServices in a cooperative architecture is presented. The messages (API) that such a service must implement, so that a translation tool can benefit from collaboration among translators, are defined.
url
http://alfarrabio.di.uminho.pt/~albie/publications/xata04-mtd.pdf
irreditor
José Carlos Ramalho and Alberto Simões
author
chave
xata04:mtd
pages
59--68
docpage
jj.bib.dp.html#xata04:mtd
booktitle
{XATA 2004
title
Memórias de Tradução Distribuídas
tipo
inproceedings
isbn
972-99166-0-8
lang
PT


xmldt2


year
2004
number
1
volume
1
docpage
jj.bib.dp.html#xmldt2
journal
The Perl Review
title
{XML::DT
tipo
article
author
chave
xmldt2


sepln2004


tipo
article
publisher
Sociedad Española para el Procesamiento del Lenguaje Natural
lang
EN
title
Distributed Translation Memories implementation using WebServices
volume
33
docpage
jj.bib.dp.html#sepln2004
journal
Procesamiento del Lenguaje Natural
pages
89--94
author
chave
sepln2004
keyword
url
http://alfarrabio.di.uminho.pt/~albie/publications/dtm-sepln.pdf
year
2004
abstract
Translation Memories are very useful for translators but are difficult to share and reuse in a community of translators. This article presents the concept of Distributed Translation Memories, where all users can contribute and share translations. Implementation details using WebServices are shown, as well as an example of a distributed system between Portugal and Spain.
month
July


linguateca


lang
PT
tipo
inproceedings
title
Linguateca: um centro de recursos distribuído para o processamento computacional da língua portuguesa
booktitle
Workshop on Linguistic Tools and Resources for Spanish and Portuguese
docpage
jj.bib.dp.html#linguateca
pages
147--154
chave
linguateca
author
editor
IBERAMIA 2004
url
http://alfarrabio.di.uminho.pt/~albie/publications/linguateca.pdf
address
Puebla, México
abstract
This article gives an overview of Linguateca's activity in creating and making available resources and tools for the Portuguese language. We begin with a description of Linguateca's goals and assumptions and a brief history of its work, and end with some considerations on how best to proceed in organising the field.
year
2004


xata05:fs


tipo
inproceedings
author
publisher
Departamento de Informática, Universidade do Minho
location
Braga
chave
xata05:fs
keyword
title
Representação em {XML} da {F}loresta {S}intáctica
irreditor
José Carlos Ramalho and Alberto Simões and João Correia Lopes
docpage
jj.bib.dp.html#xata05:fs
booktitle
XATA 2005, Aplicações e Tecnologias Associadas
year
2005
month
Feb.


xata05:tdt


year
2005
month
Feb.
docpage
jj.bib.dp.html#xata05:tdt
booktitle
XATA 2005, Aplicações e Tecnologias Associadas
keyword
title
Inferência de tipos em documentos {XML}
irreditor
José Carlos Ramalho and Alberto Simões and João Correia Lopes
tipo
inproceedings
author
publisher
Departamento de Informática, Universidade do Minho
location
Braga
chave
xata05:tdt


xata06:navegante


pages
376--377
address
Portalegre
month
Feb.
year
2006
booktitle
XATA 2006, Aplicações e Tecnologias Associadas
docpage
jj.bib.dp.html#xata06:navegante
note
poster
irreditor
José Carlos Ramalho and Alberto Simões and João Correia Lopes
title
Navegante: um proxy de ordem superior para navegação intrusiva
keyword
chave
xata06:navegante
author
publisher
ESTGP
tipo
inproceedings


xata06:xmlauto


irreditor
José Carlos Ramalho and Alberto Simões and João Correia Lopes
keyword
chave
xata06:xmlauto
author
address
Portalegre
year
2006
month
Feb.
abstract
It is widely agreed that XML has been taking on an important role as a language for structuring documents. The advantage of using XML as an interchange language is also evident. However, its syntax is overly verbose, so generating documents by hand is painful, and modules that simplify the task are useful. In this article we propose a Perl module (XML::Writer::Simple), configurable via a DTD, that simplifies the task of generating XML.
url
http://alfarrabio.di.uminho.pt/~albie/publications/xata2006-xmlwritersimple.pdf
title
Geração dinâmica de {API
isbn
972-99166-2-4
lang
PT
tipo
inproceedings
publisher
ESTGP
pages
307--314
docpage
jj.bib.dp.html#xata06:xmlauto
booktitle
{XATA 2006


sepln06


abstract
Parallel corpora are important resources for most Natural Language Processing tasks. From common applications, like machine translation, to usually monolingual tasks such as paraphrase detection and word sense disambiguation, most researchers are using massive parallel corpora. Thus, the availability of an efficient way to manage them is very important. This paper presents a Client-Server architecture to query parallel corpora and probabilistic translation dictionaries efficiently.
month
September
year
2006
address
Zaragoza, Spain
url
http://alfarrabio.di.uminho.pt/~albie/publications/sepln06.pdf
author
chave
sepln06
pages
91--97
volume
37
docpage
jj.bib.dp.html#sepln06
journal
Procesamiento del Lenguaje Natural
title
{NatServer:
tipo
article
lang
EN


eamt06


url
http://alfarrabio.di.uminho.pt/~albie/publications/eamt06.pdf
editor
Jan Tore Lønning and Stephan Oepen
address
Oslo, Norway
shortin
{EAMT
year
2006
month
19--20, June
abstract
One of the bottlenecks of example-based machine translation (EBMT) is automatically amassing quantities of good examples. In our work on EBMT, we are investigating how far one can go by performing example extraction from parallel corpora using Probabilistic Translation Dictionaries to obtain example segmentation points. In fact, the success of EBMT depends heavily on the quality and quantity of examples, but also on their length. Thus, we give special importance to methods that extract examples of different sizes from the same translation unit. With this article we show that it is possible to extract quantities of examples from parallel corpora just using probabilistic translation dictionaries extracted from the same corpora.
chave
eamt06
author
docpage
jj.bib.dp.html#eamt06
booktitle
11th Annual Conference of the European Association for Machine Translation
pages
27--32
isbn
82-7368-294-3
lang
EN
tipo
inproceedings
title
Combinatory Examples Extraction for Machine Translation


lrec06


docpage
jj.bib.dp.html#lrec06
booktitle
Fifth international conference on Language Resources and Evaluation, LREC 2006
title
{$T_2O$
lang
EN
tipo
inproceedings
address
Genova, Italy
shortin
{LREC
year
2006
abstract
In this article we present $T_2O$ --- a workbench to assist the process of translating heterogeneous resources into ontologies, to enrich and add multilingual information, to help programming with them, and to support ontology publishing. $T_2O$ is an ontology algebra.
month
May
url
http://alfarrabio.di.uminho.pt/~albie/publications/lrec06.pdf
chave
lrec06
author


elpub06-t2o


url
http://alfarrabio.di.uminho.pt/~albie/publications/elpub06-t2o.pdf
address
Bansko, Bulgaria
month
June
abstract
Dictionaries and thesauri are valuable resources for Natural Language Processing, but they are not as freely available as expected, especially for languages other than English, and when they exist they are only available for querying online. Our main goal with T2O --- Thesaurus to Ontology framework --- is to create a multilingual ontology: freely available online and to download; in a computer-readable format; with a good API; with a structure as rich as possible; reusing all the structured information we can get.
year
2006
chave
elpub06-t2o
author
booktitle
{ElPub 2006
docpage
jj.bib.dp.html#elpub06-t2o
pages
373--374
lang
EN
note
poster
tipo
inproceedings
title
Publishing multilingual ontologies: a quick way of obtaining feedback


elpub06-blind


url
http://alfarrabio.di.uminho.pt/~albie/publications/elpub06-blind.pdf
shortin
{ElPub
address
Bansko, Bulgaria
abstract
True accessibility requires minimizing the scanning time needed to find a particular piece of information. Reading web pages sequentially does not provide this type of accessibility: for instance, before users get to the actual text content of a page they have to go through many menus and headers. However, if users could navigate a web page through semantically classified blocks, they could jump faster to the actual content of the page, skipping all the menus and other parts of the page. We propose a transcoding engine that tackles accessibility at two distinct, yet complementary, levels: for specific known sites and for general unknown sites. We present a tool for building customized scripts for known sites that turns this process into an extremely simple task, which can be performed by anyone, without any expertise. For general unknown sites, our approach relies on statistical analysis of the structural blocks that define a web page to infer the semantics of each block.
month
June
year
2007
chave
elpub06-blind
author
booktitle
The 31st Annual Conference of the German Classification Society on Data Analysis, Machine Learning, and Applications
docpage
jj.bib.dp.html#elpub06-blind
pages
123--134
lang
EN
note
\textbf{forthcoming
tipo
inproceedings
title
Mining Classical Music Scores for Epoch Classification


avalon:jspell


docpage
jj.bib.dp.html#avalon:jspell
booktitle
Avaliação conjunta: um novo paradigma no processamento computacional da língua portuguesa
pages
83--90
tipo
incollection
publisher
{IST Press
title
Jspellando nas morfolimpíadas: Sobre a participação do {Jspell
editor
Diana Santos
year
2007
shortin
Avaliação conjunta, cap. 8
author
chave
avalon:jspell


avalon:avalinha


editor
Diana Santos
year
2007
shortin
Avaliação conjunta, cap. 18
author
chave
avalon:avalinha
docpage
jj.bib.dp.html#avalon:avalinha
booktitle
Avaliação conjunta: um novo paradigma no processamento computacional da língua portuguesa
pages
219--230
tipo
incollection
publisher
{IST Press
title
Avaliação de alinhadores


xata07:xmltmx


editor
José Carlos Ramalho and João Correia Lopes and Luís Carríço
url
http://alfarrabio.di.uminho.pt/~albie/publications/xmlyamljson07.pdf
shortin
{XATA
year
2007
abstract
month
February
institution
Universidade do Minho, Departamento de Informática
chave
xata07:xmltmx
author
irreditor
José Carlos Ramalho and João Correia Lopes and Luís Carríço
keyword
docpage
jj.bib.dp.html#xata07:xmltmx
booktitle
{XATA 2007
type
Manual
pages
33--46
isbn
978-972-99166-4-9
tipo
inproceedings
title
Alternativas ao {XML


MP07


chave
MP07
author
address
Rennes, France
year
2007
abstract
Some processes are not easy to program from scratch for parallel machines (clusters), but can easily be split into simple steps. Makefile::Parallel is a tool that lets users specify how processes depend on each other. The language syntax resembles the well-known Makefile format, but instead of specifying file or target dependencies, Makefile::Parallel specifies process (or job) dependencies. The scheduler submits jobs to the cluster scheduler (in our case, Rocks PBS) and waits for them to end. When each process finishes, dependencies are calculated and directly dependent jobs are submitted. The Makefile::Parallel language includes features to specify parametric rules, used to split and join process dependencies. Some tasks can be split into n smaller jobs working on different portions of files. At the end, another process can be used to join the results.
month
August
editor
Anne-Marie Kermarrec and Luc Bougé and Thierry Priol
title
{Makefile::Parallel
tipo
inproceedings
publisher
Springer-Verlag
pages
33--41
docpage
jj.bib.dp.html#MP07
booktitle
Euro-Par 2007
series
LNCS
volume
4641


epia-bio-2007


chave
epia-bio-2007
tipo
inproceedings
author
title
An Ontology-Based Approach To Systems Biology Literature Retrieval and Processing
irreditor
José Neves and Manuel Filipe Santos and José Manuel Machado
docpage
jj.bib.dp.html#epia-bio-2007
booktitle
New Trends in Artificial Intelligence
shortin
Epia, CMBSB
pages
541--552
year
2007
abstract
This paper details the \emph{SysBio Explorer
month
December


epia-music-2007


shortin
Epia, TEMA
pages
791--799
year
2007
abstract
Music Classification is a particular area of Computational Musicology that provides valuable insights into the evolution of composition patterns and assists in catalogue generation. The proposed work departs from earlier work by classifying music based on music score information. Text Mining techniques support music score processing, while Classification techniques are used in the construction of decision models. Although the research is still at an early stage, the work already provides valuable contributions to the processing and subsequent analysis of symbolic music representations. Score processing involved counting ascending and descending chromatic intervals, note duration and meta-information tagging. Analysis involved feature selection and the evaluation of several data mining algorithms, ensuring extensibility towards larger repositories or more complex problems. Experiments report the analysis of composition epochs on a subset of the Mutopia project open archive of classical LilyPond-annotated music scores.
month
December
docpage
jj.bib.dp.html#epia-music-2007
booktitle
New Trends in Artificial Intelligence
title
Using Text Mining Techniques for Classical Music Scores Analysis
irreditor
José Neves and Manuel Filipe Santos and José Manuel Machado
chave
epia-music-2007
tipo
inproceedings
author


harem:rena


note
Documentação e actas do HAREM, a primeira avaliação conjunta na área
publisher
Linguateca
tipo
incollection
title
{RENA
booktitle
Reconhecimento de entidades mencionadas em português
docpage
jj.bib.dp.html#harem:rena
pages
157--172
chave
harem:rena
author
irreditor
Diana Santos and Nuno Cardoso
url
http://acdc.linguateca.pt/aval_conjunta/LivroHAREM/Cap13-SantosCardoso2007-Almeida.pdf
shortin
{HAREM
year
2007


sepln07


title
Parallel Corpora based Translation Resources Extraction
lang
EN
tipo
article
pages
265--272
docpage
jj.bib.dp.html#sepln07
journal
Procesamiento del Lenguaje Natural
volume
39
chave
sepln07
author
year
2007
month
September
abstract
This paper describes NATools, a toolkit to process, analyze and extract translation resources from parallel corpora. It includes a sentence aligner, a probabilistic translation dictionary extractor, a word aligner, a corpus server, a set of tools to query corpora and dictionaries, and a set of tools to extract bilingual resources.


cgiauto08


booktitle
{XATA 2008
docpage
jj.bib.dp.html#cgiauto08
pages
22--27
tipo
inproceedings
isbn
978-972-99166-5-6
title
{CGI::Auto
url
http://alfarrabio.di.uminho.pt/~albie/publications/cgiauto08.pdf
month
February
abstract
The creation of a CGI or a WebService as an interface for a command line tool is not as unusual as it may seem. It is extremely common and useful. There are applications developed as command line tools that can be useful for different purposes and different kinds of users. Some of these users might not be able to run these tools directly. For instance, it is not easy to install a bunch of Perl modules to get a small tool working. In these situations, it is easier to make the tool available on the Web or as a WebService. The problem with making the tool available in this way is that programmers tend to rewrite the tools to incorporate the CGI or XML specific layers. We argue that these CGI or WebService interfaces should use the already available command line tool, without any change. The interface should be able to read a simple textual specification of how the command line tool works and build the CGI or XML specific layers automatically. The CGI::Auto module serves this purpose: to encapsulate command line tools in a CGI layer based on a textual specification, transforming the command line tool into a web application.
year
2008
author
chave
cgiauto08
irreditor
José Carlos Ramalho and João Correia Lopes and Salvador Abreu


navegante08


irreditor
José Carlos Ramalho and João Correia Lopes and Salvador Abreu
author
chave
navegante08
year
2008
abstract
NAVEGANTE is a generic framework to build higher-order proxies for intrusive browsing. This framework provides the means for developing tools that behave as proxies but perform some processing task on the content being browsed. In parallel with this content processing, applications can also run other user-defined functions with different purposes and interfaces, which are explained later. Currently, NAVEGANTE only builds applications that run as CGIs, but this is intended to change in the near future. Applications are built by writing programs in NAVEGANTE's Domain Specific Language (DSL). NAVEGANTE is a work in progress. This article describes the current state of development: what applications can be built and how. We also identify some implementation problems and briefly discuss some future improvements. Finally, we illustrate most of the concepts described using a couple of case studies.
month
February
url
http://alfarrabio.di.uminho.pt/~albie/publications/navegante08.pdf
title
{NAVEGANTE
tipo
inproceedings
isbn
978-972-99166-5-6
pages
52--63
docpage
jj.bib.dp.html#navegante08
booktitle
{XATA 2008


sepln08


title
Bilingual Terminology Extraction based on Translation Patterns
tipo
article
lang
EN
pages
281--288
volume
41
journal
Procesamiento del Lenguaje Natural
docpage
jj.bib.dp.html#sepln08
author
chave
sepln08
year
2008
month
September
abstract
Parallel corpora are rich sources of translation resources. This document presents a methodology for the extraction of bilingual nominals (terminology candidates) from parallel corpora, using translation patterns. The patterns proposed in this work specify the order changes that occur during translation and that are intrinsic to the syntaxes of the languages involved. These patterns are described in a domain specific language named PDL (Pattern Description Language), and are extremely efficient for the detection of nominal phrases.


propor-apslt08


year
2008
pages
35--42
docpage
jj.bib.dp.html#propor-apslt08
booktitle
Applications of Portuguese Speech and Language Technologies, PROPOR 2008 Special session
title
A Textual Rewriting system for NLP
irreditor
António Teixeira and Daniela Braga
tipo
inproceedings
author
chave
propor-apslt08


epia:DruryA09


editor
Luis Seabra Lopes and Nuno Lau and Pedro Mariano and Luis Mateus Rocha
url
http://dx.doi.org/10.1007/978-3-642-04686-5_33
year
2009
author
chave
epia:DruryA09
series
Lecture Notes in Computer Science
volume
5816
docpage
jj.bib.dp.html#epia:DruryA09
booktitle
EPIA
pages
400--410
tipo
inproceedings
note
Progress in Artificial Intelligence, EPIA 2009, Aveiro, Portugal, October 12-15
publisher
Springer
title
Construction of a Local Domain Ontology from News Stories


markers09


isbn
978-989-96278-1-9
lang
EN
tipo
inproceedings
title
Bilingual Example Segmentation based on Markers Hypothesis
docpage
jj.bib.dp.html#markers09
booktitle
I Iberian SLTech 2009
pages
95--98
chave
markers09
author
editor
António Teixeira and Miguel Sales Dias and Daniela Braga
address
Porto Salvo, Portugal
year
2009
abstract
The Marker Hypothesis was first defined by Thomas Green in 1979. It is a psycholinguistic hypothesis stating that every language has a set of words that marks the boundaries of phrases in a sentence. While it remains a hypothesis because nobody has proved it, tests have shown that its results are comparable to those of basic shallow parsers, with higher efficiency. The chunking algorithm based on the Marker Hypothesis is simple, fast and almost language independent. It depends on a list of closed-class words, which is already available for most languages. This makes it suitable for bilingual chunking (there is no need for a separate shallow parser for each language). This paper discusses the use of the Marker Hypothesis combined with Probabilistic Translation Dictionaries for extracting example-based machine translation resources from parallel corpora.
month
September, 3--4


xata2010-rewritexml


booktitle
{XATA 2010
docpage
jj.bib.dp.html#xata2010-rewritexml
pages
27--38
lang
EN
tipo
inproceedings
title
Processing {XML:
editor
Alberto Simões and Daniela da Cruz and José Carlos Ramalho
address
Vila do Conde
abstract
Nowadays XML processing is performed using one of two approaches: SAX (Simple API for XML) or the DOM (Document Object Model). While these two approaches are adequate for most cases, there are situations where other approaches can make the solution easier to write, read and, therefore, maintain. This document presents a rewriting approach to XML document processing, focusing on the task of transforming XML documents (into other XML formats or other textual documents) and the task of rewriting other textual formats into XML dialects. These approaches were validated with some case studies, ranging from an XML authoring tool to a dictionary publishing mechanism.
month
May
year
2010
chave
xata2010-rewritexml
author


ocr2010


title
A Case Study of Rule Based and Probabilistic Word Error Correction of Portuguese OCR Text in a "Real World" Environment for Inclusion in a Digital Library
tipo
article
note
presented in {CICLING2010
number
1-2
volume
1
pages
307--315
journal
International Journal of Computational Linguistics
docpage
jj.bib.dp.html#ocr2010
author
chave
ocr2010
year
2010
url
http://10.255.0.115/pub/2010/DA10


lrec10:bigorna


editor
Nicoletta Calzolari and others
address
Valletta, Malta
shortin
{LREC
language
english
year
2010
month
may
chave
lrec10:bigorna
author
docpage
jj.bib.dp.html#lrec10:bigorna
booktitle
Proceedings of the Seventh conference on International Language Resources and Evaluation (LREC'10)
isbn
2-9517408-6-7
tipo
inproceedings
publisher
European Language Resources Association (ELRA)
title
Bigorna -- A Toolkit for Orthography Migration Challenges
date
19-21


lrec10:dicaberto


booktitle
Proceedings of the Seventh conference on International Language Resources and Evaluation (LREC'10)
docpage
jj.bib.dp.html#lrec10:dicaberto
date
19-21
title
Processing and Extracting Data from Dicionário Aberto
publisher
European Language Resources Association (ELRA)
tipo
inproceedings
isbn
2-9517408-6-7
month
may
year
2010
language
english
shortin
{LREC
address
Valletta, Malta
editor
Nicoletta Calzolari and others
author
chave
lrec10:dicaberto


bucc2010


pages
50--55
docpage
jj.bib.dp.html#bucc2010
booktitle
BUCC2010 -- 3rd Workshop on Building and Using Comparable Corpora, lrec2010
title
Automatic Parallel Corpora and Bilingual Terminology extraction from Parallel WebSites
tipo
inproceedings
lang
EN
year
2010
month
May
abstract
Nowadays, the importance and significance of parallel corpora is so great that it needs no special introduction. Unfortunately, publicly available parallel corpora are somewhat limited in range. There are big corpora about politics or legislation, about medicine and other specific areas, but corpora for other areas are missing. Currently there is a huge investment in using the Web as a corpus. This article presents GWB, a tool that aims at the automatic construction of parallel corpora from the web. We argue that it is possible to build high quality terminological corpora automatically, just by specifying a sensible Internet domain and using an appropriate set of seed keywords. GWB is a web spider that works in conjunction with a set of other open-source tools, defining a pipeline that includes document retrieval from the web, sentence-level alignment and its quality analysis, bilingual dictionary and terminology extraction, and construction of off-line dictionaries.
address
Valletta, Malta
url
http://alfarrabio.di.uminho.pt/~albie/publications/bucc2010.pdf
editor
Reinhard Rapp and Pierre Zweigenbaum and Serge Sharoff
author
chave
bucc2010


brett:lrec


pages
19--22
booktitle
Entity2010 -- Workshop on Resources and Evaluation for Entity Resolution and Entity Management, lrec2010
docpage
jj.bib.dp.html#brett:lrec
title
Identification, extraction and population of collective named entities from business news
lang
EN
tipo
inproceedings
address
Valletta, Malta
abstract
Sentiment analysis of business news has become an increasingly popular area of research for both the practitioner and academic. The future financial prospects of companies can be estimated through the aggregation of sentiment over a period of time. The aggregation of sentiment for a specific company is only possible if the company is explicitly mentioned in the news text. In certain instances, news text may refer to groups or collections of companies, for example "The Automotive Sector" or "The Russell Group of Universities". Widely available named entity dictionaries will not recognize these groups of companies, and consequently, it may not be possible to assign sentiment attributed to these groups of companies to their individual members. This paper describes a method for identifying groups of companies, which for the purposes of this paper will be known as "Collective Entities". The described method is corpus based: it uses linguistic patterns to identify Collective Entity Names, their members and their natural relations with other Collective Entities. The described methodology contains the following steps: 1. Identify and validate seed extraction patterns, 2. Expand seed patterns, 3. Extract and validate Collective Named Entities, 4. Extract related Collective Named Entities, 5. Construct and populate an Ontology and 6. Expand the members of Collective Entity sets with Linked Data.
month
May
year
2010
chave
brett:lrec
author


fala2010-triPsi


pages
217--220
docpage
jj.bib.dp.html#fala2010-triPsi
booktitle
FALA2010 -- II Iberian SLTech Workshop
title
Automating psycholinguistic statistics computation: Procura-Palavras
tipo
inproceedings
address
Vigo
year
2010
abstract
This article describes psycholinguistic lexical databases available in various languages, including English, Spanish and Portuguese. These lexical databases are important for researchers in Psycholinguistics and related areas, providing a pool of experimental materials and allowing these materials to be selected efficiently. The process of gathering statistics is slow, resulting in a small pool of materials in the short term. The need for an alternative method to gather limited or as yet unavailable statistics for a specific language led us to consider gathering statistics from other languages and computing their triangulation. Our aim was to automate the computation of statistics such as Familiarity, Imageability, Age of Acquisition and Written Word Frequency for that specific language. We describe the process of preparing this data, and of triangulating and comparing statistics for some languages in an attempt to find a relationship between them. The results were analysed by considering correlations between each statistic in each pair of languages and by computing the mean of absolute differences between each language's values.
month
November
editor
Carmen Mateo and Francisco Díaz and Francisco Pazó
chave
fala2010-triPsi
author


opencert2010


editor
Luis Barbosa and Antonio Cerone and Siraj Shaikh (Guest Eds.)
url
http://journal.ub.tu-berlin.de/index.php/eceasst/article/view/458/446
year
2010
author
chave
opencert2010
volume
33
journal
Electronic Communications of the EASST
docpage
jj.bib.dp.html#opencert2010
tipo
article
note
Foundations and Techniques for Open Source Software Certification
title
Testing as a Certification Approach


p-pal-linguamatica


pages
67--72
number
3
journal
Linguamática
docpage
jj.bib.dp.html#p-pal-linguamatica
volume
2
title
{P-PAL:
tipo
article
month
December
abstract
In this work we present the Procura-PALavras (P-PAL) project, whose main goal is to develop an electronic tool that provides information on objective and subjective psycholinguistic indices of European Portuguese (EP) words. P-PAL will be made freely available to the scientific community, in a user-friendly format, through a website to be built for that purpose. When using P-PAL, researchers can tailor the program to their needs by selecting, from the wide range of analyses offered, the indices that suit the purposes of their research, and can use it in two ways: asking the program to analyse previously compiled word lists on the indices considered relevant to the research, or obtaining word lists that satisfy defined parameters. P-PAL thus presents itself as a fundamental tool for the promotion and internationalisation of research in Portugal.
year
2010
url
http://linguamatica.com/index.php/linguamatica/article/download/80/108
irreditor
Alberto Simões and José João Almeida and Xavier Gómez Guinovart
chave
p-pal-linguamatica
issn
1647-0818
author


drury-torgo-almeida:2011:ROBUS


title
Guided Self Training for Sentiment Classification
tipo
inproceedings
pages
9--16
booktitle
Proceedings of Workshop on Robust Unsupervised and Semisupervised Methods in Natural Language Processing
docpage
jj.bib.dp.html#drury-torgo-almeida:2011:ROBUS
author
chave
drury-torgo-almeida:2011:ROBUS
month
September
year
2011
address
Hissar, Bulgaria
url
http://www.aclweb.org/anthology/W11-3902


drury1


title
Classifying News Stories to Estimate the Direction of a Stock Market Index
author
tipo
inproceedings
chave
drury1
location
Chaves
year
2011
pages
1--4
booktitle
Third Workshop on Intelligent Systems and Applications (WISA)
docpage
jj.bib.dp.html#drury1


drury2


title
Magellan: An Adaptive Ontology Driven "breaking Financial News" Recommender
author
tipo
inproceedings
chave
drury2
location
Chaves
year
2011
booktitle
CISTI-2011
docpage
jj.bib.dp.html#drury2


drury3


title
An Error Correction Methodology for Time Dependent Ontologies
isbn
978-3-642-22055-5
publisher
Springer
tipo
inproceedings
pages
501--512
ee
http://dx.doi.org/10.1007/978-3-642-22056-2_52
booktitle
{CAiSE
part
8
docpage
jj.bib.dp.html#drury3
volume
83
series
Lecture Notes in Business Information Processing
chave
drury3
author
year
2011
editor
Camille Salinesi and Oscar Pastor


nuno1


year
2011
booktitle
CISTI-2011
docpage
jj.bib.dp.html#nuno1
title
Oml: A Scripting Approach For Manipulating Ontologies
author
tipo
inproceedings
chave
nuno1
location
Chaves


corta2011-pftl


title
{PFTL
isbn
978-989-96001-5-7
tipo
inproceedings
publisher
Dep. de Eng. Informática da Universidade de Coimbra
pages
222--233
docpage
jj.bib.dp.html#corta2011-pftl
booktitle
INForum'11 --- Simpósio de Informática (CoRTA2011 track)
pdf
http://ambs.perl-hackers.net/publications/corta2011-pftl.pdf
chave
corta2011-pftl
author
address
Coimbra, Portugal
language
EN
year
2011
month
September
abstract
Today, most developers prefer to store information in databases. But plain filesystems were used for years, and are still used, to store information, commonly in files of heterogeneous formats that are organized in directory trees. This approach is a very flexible and natural way to create hierarchically organized structures of documents. We can devise a formal notation to describe a filesystem tree structure, similar to a grammar, assuming that filenames can be considered terminal symbols and directory names non-terminal symbols. This specification would allow us to derive correct language sentences (combinations of terminal symbols) and to associate semantic actions, which can produce arbitrary side effects, with each valid sentence, just as we do in common parser generation tools. These specifications can be used to systematically process files in directory trees, and the final result depends on the semantic actions associated with each production rule. In this paper we revamp an old idea of using a domain specific language to implement these specifications, similar to context-free grammars, and introduce some examples of applications that can be built using this approach.
editor
Raul Barbosa and Luis Caires


corta2011-oml


pdf
http://ambs.perl-hackers.net/publications/corta2011-oml.pdf
chave
corta2011-oml
author
editor
Raul Barbosa and Luis Caires
address
Coimbra, Portugal
language
EN
year
2011
abstract
Most existing programming languages can be categorized as general purpose programming languages, meaning that they can be used to implement solutions for any given domain. They are not, in any way, optimized for a specific set of problems. In contrast, Domain Specific Languages (DSLs) are used to solve specific problems in a well-defined domain. DSLs are optimized for a particular set of problems, but they lack support for a wide range of operations that are required when dealing with real-world problems. So, in a perfect world, we would like to implement applications using a general purpose programming language, but use a set of different DSLs to handle domain-specific tasks. In this paper we describe a DSL named Ontology Manipulation Language (OML), designed to describe operations over ontologies. Programs can be written using only the OML syntax and be executed independently. The OML syntax was designed to deal with ontologies and the language itself is optimized to perform these tasks, which means that other, relatively simpler tasks cannot be easily done. To overcome this challenge, a mechanism was developed so that small snippets of OML code can be woven into Perl programs, meaning we have the power of OML to manipulate ontologies and, at the same time, all the paraphernalia of modules that Perl offers to handle everything else.
month
September
isbn
978-989-96001-5-7
tipo
inproceedings
publisher
Dep. de Eng. Informática da Universidade de Coimbra
title
Weaving {OML
docpage
jj.bib.dp.html#corta2011-oml
booktitle
INForum'11 --- Simpósio de Informática (CoRTA2011 track)
pages
184--197


wims2011


year
2011
full
Proceedings of the International Conference on Web Intelligence, Mining and Semantics, WIMS 2011, Sogndal, Norway, May 25 - 27, 2011
editor
Rajendra Akerkar
author
chave
wims2011
ee
http://doi.acm.org/10.1145/1988688.1988720
pages
27--34
bibsource
DBLP, http://dblp.uni-trier.de
booktitle
WIMS
docpage
jj.bib.dp.html#wims2011
title
Identification of fine grained feature based event and sentiment phrases from business news stories
publisher
ACM
tipo
inproceedings
isbn
978-1-4503-0148-0


sepln:bookcleaner


year
2011
url
http://natura.di.uminho.pt/~jj/pln/sepln2011-boolcleaner.pdf
docpage
jj.bib.dp.html#sepln:bookcleaner
booktitle
Actas del XXVII Congreso de la Sociedad Española para el Procesamiento del Lenguaje Natural
title
{Text::Perfide::BookCleaner
tipo
inproceedings
pp
433--441
author
location
Huelva, 5 - 7 Set
chave
sepln:bookcleaner


drury4


number
3/4
pages
219--233
volume
6
journal
IJMSO
docpage
jj.bib.dp.html#drury4
title
Construction and maintenance of a fuzzy temporal ontology from news stories
tipo
article
year
2011
journalfull
International Journal of Metadata, Semantics and Ontologies
doi
http://dx.doi.org/10.1504/IJMSO.2011.048028
author
chave
drury4


xml2pm-xata2011


year
2011
month
1--2 June
abstract
The eXtensible Mark-up Language (XML) is probably one of the most popular markup languages available today. It is very common to find all kinds of services or programs representing data in this format. This situation is even more common in web development environments or Service Oriented Architectures (SOA), where data flows from one service to another, being consumed and produced by a heterogeneous set of applications whose sole requirement is to understand XML. This workflow of data represented in XML implies some tasks that applications have to perform if they are required to consume or produce information: the task of parsing an XML document, giving specific semantics to the information parsed, and the task of producing an XML document. Our main goal is to analyze an XML document and automatically create an object definition that can be used abstractly by the application. These objects are able to parse the XML document and gather all the data required to mimic all the information present in the document. This paper introduces xml2pm, a simple tool that can inspect the structure of an XML document and create an object definition (a Perl module) that stores the same information present in the original document, but as a runtime object. We also introduce a simple case of how this approach allows the creation of applications based on Web Services in an elegant and simple way.
address
Vila do Conde, Portugal
editor
Alberto Simões
author
pdf
http://ambs.perl-hackers.net/publications/xml2pm-xata2011.pdf
chave
xml2pm-xata2011
pages
103--114
docpage
jj.bib.dp.html#xml2pm-xata2011
booktitle
{XATA 2011
title
xml2pm: A Tool for Automatic Creation of Object Definitions Based on {XML
tipo
inproceedings
isbn
978-989-96863-1-1
lang
EN


drury5


author
chave
drury5
year
2012
full
International Journal of Computer Science and Applications
url
http://www.tmrfindia.org/ijcsa/v9i11.pdf
title
Classifying News Stories with a Constrained Learning Strategy to Estimate the Direction of a Market Index
tipo
article
number
1
pages
1--22
bibsource
DBLP, http://dblp.uni-trier.de
volume
9
docpage
jj.bib.dp.html#drury5
journal
IJCSA


da2012


chave
da2012
author
address
Coimbra, Portugal
month
April
year
2012
editor
Helena Caseli and Aline Villavicencio and António Teixeira and Fernando Perdigão
title
Dicionário-Aberto -- A Source of Resources for the Portuguese Language Processing
publisher
Springer
tipo
article
pages
121--127
docpage
jj.bib.dp.html#da2012
journal
Computational Processing of the Portuguese Language, Lecture Notes for Artificial Intelligence
volume
7243


LREC12.967


date
23-25
title
Structural alignment of plain text books
tipo
inproceedings
publisher
European Language Resources Association (ELRA)
isbn
978-2-9517408-7-7
docpage
jj.bib.dp.html#LREC12.967
booktitle
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)
author
chave
LREC12.967
year
2012
month
may
address
Istanbul, Turkey
language
english
editor
Nicoletta Calzolari and others


LREC12.611


author
chave
LREC12.611
editor
Nicoletta Calzolari and others
year
2012
month
may
address
Istanbul, Turkey
language
english
tipo
inproceedings
publisher
European Language Resources Association (ELRA)
isbn
978-2-9517408-7-7
date
23-25
title
The Minho Quotation Resource
docpage
jj.bib.dp.html#LREC12.611
booktitle
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)


CAPH12a


pages
239--253
year
2012
abstract
Concept location is a common task in program comprehension techniques, essential in many approaches used for software care and software evolution. An important goal of this process is to discover a mapping between source code and human-oriented concepts. Although programs are written in a strict and formal language, natural language terms and sentences, such as identifiers (variable or function names), constant strings or comments, can still be found embedded in programs. Using terminology concepts and natural language processing techniques, these terms can be exploited to discover clues about which real-world concepts the source code is addressing. This work extends symbol tables built by compilers with ontology-driven constructs, and extends synonym sets defined by linguists with Probabilistic SynSets automatically created from software-domain parallel corpora. Using a relational algebra, it then creates semantic bridges between program elements and human-oriented concepts, to enhance concept location tasks.
month
June
docpage
jj.bib.dp.html#CAPH12a
booktitle
SLATe'12 --- Symposium on Languages, Applications and Technologies
volume
21
title
Probabilistic SynSet Based Concept Location
irreditor
Alberto Simões and Ricardo Queirós and Daniela da Cruz
chave
CAPH12a
tipo
inproceedings
author
publisher
OASIcs -- Open Access Series in Informatics, Schloss Dagstuhl - Leibniz-Zentrum für Informatik, Dagstuhl Publishing, Germany


wikiscore


year
2012
chave
wikiscore
author
journal
Information, Services and Use (ISU)
docpage
jj.bib.dp.html#wikiscore
volume
31
comment
elpub 2012
pages
177--187
number
3-4/2011
ee
DOI 10.3233/ISU-2012-0647
tipo
article
publisher
IOS Press
title
{Wiki::Score
small
ISU


flapp


series
OpenAccess Series in Informatics (OASIcs)
volume
21
docpage
jj.bib.dp.html#flapp
booktitle
1st Symposium on Languages, Applications and Technologies
pages
41--50
tipo
inproceedings
publisher
Schloss Dagstuhl--Leibniz-Zentrum fuer Informatik
title
Generating flex Lexical Scanners for Perl Parse::Yapp
idx
DBLP
url
http://drops.dagstuhl.de/opus/volltexte/2012/3513
year
2012
abstract
Perl is known for its versatile regular expressions. Nevertheless, using Perl regular expressions to create fast lexical analyzers is not easy. As an alternative, the authors defend the automated generation of the lexical analyzer in a well-known, fast application (flex), based on a simple Perl definition in the syntactic analyzer. In this paper we extend the syntax used by Parse::Yapp, one of the most widely used parser generators for Perl, making the automatic generation of flex lexical scanners possible. We explain how this is performed and conclude with some benchmarks that show the relevance of the approach.
address
Dagstuhl, Germany
doi
http://dx.doi.org/10.4230/OASIcs.SLATE.2012.41
author
chave
flapp
irreditor
Alberto Simões and Ricardo Queirós and Daniela da Cruz


DBLP:conf/slate/DruryA12


publisher
Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik
tipo
inproceedings
title
Predicting Market Direction from Direct Speech by Business Leaders
volume
21
series
OASICS
booktitle
SLATE
docpage
jj.bib.dp.html#DBLP:conf/slate/DruryA12
pages
163-172
bibsource
DBLP, http://dblp.uni-trier.de
author
chave
DBLP:conf/slate/DruryA12
irreditor
Alberto Sim{õ
year
2012
doi
http://dx.doi.org/10.4230/OASIcs.SLATE.2012.163


ptd2013


irreditor
Luís Correia and Luís Paulo Reis and José Cascalho and Luís Gomes and Hélia Guerra and Pedro Cardoso
author
Alberto Simões and José João Almeida and Nuno Ramos Carvalho
chave
ptd2013
address
Angra do Heroismo, Azores
url
http://natura.di.uminho.pt/~jj/bib/ptd-algebra.pdf
year
2013
title
Defining a Probabilistic Translation Dictionaries Algebra
tipo
inproceedings
booktitle
XVI Portuguese Conference on Artificial Intelligence - EPIA
month
September
pages
444--455
docpage
jj.bib.dp.html#ptd2013


algarve-cross2013


url
http://alfarrabio.di.uminho.pt/~albie/publications/wcist2012-dmoss.pdf
abstract
Besides source code, the fundamental source of information about Open Source Software lies in documentation and other non-source-code files, like README, INSTALL, or HowTo files, commonly available in the software ecosystem. These documents, written in natural language, provide valuable information during the software development stage, but also in future maintenance and evolution tasks. DMOSS is a toolkit designed to systematically assess the quality of non-source-code text found in software packages. The toolkit handles a package as an attribute tree, and performs several tree traversal algorithms through a set of plugins, specialized in retrieving specific metrics from text, gathering information about the software. These metrics are later used to infer knowledge about the software, and composed together to build reports that assess the quality of specific features of the software. This paper discusses the motivations for this work and then describes the toolkit's implementation and design goals. An example of its usage to process a software package follows, together with the report produced. Finally, some closing remarks and directions for future work are presented.
chave
algarve-cross2013
author
Nuno Ramos Carvalho and Alberto Simões and José João Almeida
docpage
jj.bib.dp.html#algarve-cross2013
series
Advances in Intelligent Systems and Computing
volume
206
isbn
978-3-642-36980-3
pages
785--794
booktitle
Advances in Information Systems and Technologies
tipo
inproceedings
title
Open Source Software Documentation Mining for Quality Assessment
publisher
Springer Berlin Heidelberg
year
2013
editor
Rocha, Álvaro and Correia, Ana Maria and Wilson, Tom and Stroetmann, Karl A.


algarve2013


chave
algarve2013
author
Alberto Simões and Anália Lourenço and José João Almeida
abstract
This work points out the benefits of a topology-oriented, wide-scope but differentiated profile analysis. The goal was to reconcile advanced common website usage profiling techniques with the analysis of the website's topology information, outputting valuable knowledge in an intuitive and comprehensible way. Server load balancing, crawler activity evaluation and Web site restructuring are the primary analysis concerns and, in this regard, experiments over six months of data from a real-world Web site were considered successful.
url
http://alfarrabio.di.uminho.pt/~albie/publications/wcist2012-webtopology.pdf
title
Evaluating Web Site Structure Based on Navigation Profiles and Site Topology
publisher
Springer Berlin Heidelberg
year
2013
editor
Rocha, Álvaro and Correia, Ana Maria and Wilson, Tom and Stroetmann, Karl A.
pages
305-311
tipo
inproceedings
booktitle
Advances in Information Systems and Technologies
volume
206
series
Advances in Intelligent Systems and Computing
isbn
978-3-642-36980-3
docpage
jj.bib.dp.html#algarve2013


Passarola2013


docpage
jj.bib.dp.html#Passarola2013
booktitle
CISTI-2013
pages
763--768
location
Lisboa
tipo
inproceedings
title
PASSAROLA: High-Order Exercise Generation
url
http://natura.di.uminho.pt/~jj/bib/passarola-cisti2013.pdf
year
2013
abstract
In order to be robust and achieve multi-domain coverage, exercise generation systems usually work with answers of simple types (e.g. multiple-choice, Boolean, integer, or file comparison). In this paper we describe PASSAROLA, an exercise generation system with a simple yet powerful language that anyone, even without a computer science background, can use to develop exercises. Exercises may include a collection of heterogeneous objects, and complex elements are allowed. Its main features are simple reusable templates; simple and rich types; a rich, LaTeX-based notation and syntax for questions, solutions, and answers; transformations and calculations; and external calculators.
chave
Passarola2013
author


ticames2013


tipo
inproceedings
location
Lisboa
title
Math exercise generation and smart assessment
booktitle
Workshop of TICAMES (Information and Communication Technology in Higher Education: Learning Mathematics), CISTI-2013
docpage
jj.bib.dp.html#ticames2013
pages
1014--1019
author
chave
ticames2013
url
http://natura.di.uminho.pt/~jj/bib/passarola-ticames2013.pdf
abstract
In this paper we concentrate on the field of mathematics education where the aim is to generate exercises going beyond those with answers of simple types (e.g. multiple-choice, Boolean, integer, or file comparison). We present three examples from introductory college mathematics and emphasize the key points that should be taken into account in order to develop a "well-posed" exercise together with its verification. All the presented examples were implemented in the system
year
2013


crossportal


irrbooktitle
Computational Science and Its Applications - ICCSA 2013 - 13th International Conference, Ho Chi Minh City, Vietnam, June 24-27, 2013, Proceedings, Part II
year
2013
doi
http://dx.doi.org/10.1007/978-3-642-39643-4_32
offcrossref
DBLP:conf/iccsa/2013-2
editor
Beniamino Murgante and others
author
chave
crossportal
ee
http://dx.doi.org/10.1007/978-3-642-39643-4
bibsource
DBLP, http://dblp.uni-trier.de
pages
443-458
series
Lecture Notes in Computer Science
volume
7972
docpage
jj.bib.dp.html#crossportal
booktitle
ICCSA (2)
title
A Framework for Modular and Customizable Software Analysis
tipo
inproceedings
publisher
Springer
isbn
978-3-642-39642-7


icaicte13


docpage
jj.bib.dp.html#icaicte13
booktitle
ICAICTE-13, Advances in Intelligent Systems Research
title
Exercise generation with the system Passarola
isbn
978-90786-77-79-6
tipo
inproceedings
doi
doi:10.2991/icaicte.2013.64
year
2013
abstract
A robust multi-domain coverage exercise generation system usually works with answers of simple types (e.g. multiple-choice, Boolean, integer, or file comparison). In this paper we describe Passarola, a simple, yet powerful, exercise generation system and its language that anyone with no computer science background can use to develop exercises. It may include a collection of heterogeneous objects allowing the usage of complex elements. Its main characteristics are the use of simple reusable templates, simple and rich types, and rich notation and syntax (LaTeX based) for questions, solutions, and answers.
url
http://natura.di.uminho.pt/~jj/bib/ecaicte2013.pdf
issn
1951-6851
chave
icaicte13
keywords
Passarola, exercise generation system, self-regulating study
author


slate/AzevedoA13


title
ABC with a UNIX Flavor
isbn
978-3-939897-52-1
publisher
Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik
tipo
inproceedings
pages
203-218
bibsource
DBLP, http://dblp.uni-trier.de
booktitle
2nd Symposium on Languages, Applications and Technologies, SLATE 2013, June 20-21, 2013 - Porto, Portugal
docpage
jj.bib.dp.html#slate/AzevedoA13
volume
29
series
OASICS
irreditor
José Paulo Leal and Ricardo Rocha and Alberto Simões
chave
slate/AzevedoA13
author
doi
http://dx.doi.org/10.4230/OASIcs.SLATE.2013.203
abstract
ABC is a simple, yet powerful, textual musical notation. This paper presents ABC::DT, a rule-based domain-specific language (embedded in Perl), designed to simplify the creation of ABC processing tools. Inspired by the Unix philosophy, these tools are intended to be simple and composable, in the manner of Unix filters. From ABC::DT's rules we obtain an ABC processing tool whose main algorithm follows a traditional compiler architecture, consisting of three stages: 1) ABC parser (based on the abcmtops parser), 2) ABC semantic transformation (associated with ABC attributes), 3) output generation (either a user-defined or a system-provided ABC generator).
year
2013
url
http://drops.dagstuhl.de/opus/volltexte/2013/4039/pdf/14.pdf


escolex2013


chave
escolex2013
tipo
article
author
title
Escolex: A grade-level lexical database from European Portuguese elementary to middle school textbooks
journal
Behavior Research Methods
docpage
jj.bib.dp.html#escolex2013
url
http://p-pal.di.uminho.pt/static/files/db/Soares_et_al.__in_press_ESCOLEX.pdf
pages
1--14
year
2013
abstract
In this article, we introduce ESCOLEX, the first European Portuguese children's lexical database with grade-level-adjusted word frequency statistics. Computed from a 3.2-million-word corpus, ESCOLEX provides 48,381 word forms extracted from 171 elementary and middle school textbooks for 6- to 11-year-old children attending the first six grades in the Portuguese educational system. Like other children's grade-level databases, ESCOLEX provides four frequency indices for each grade: overall word frequency (F), index of dispersion across the selected textbooks (D), estimated frequency per million words (U), and standard frequency index (SFI). It also provides a new measure, contextual diversity (CD). In addition, the number of letters in the word and its part(s) of speech, number of syllables, syllable structure, and adult frequencies taken from P-PAL (a European Portuguese corpus-based lexical database) are provided. ESCOLEX will be a useful tool both for researchers interested in language processing and development and for professionals in need of verbal materials adjusted to children's developmental stages. ESCOLEX can be downloaded along with this article or from http://p-pal.di.uminho.pt/about/databases.


coloquiosOutono2013


booktitle
Humanidades: Novos Paradigmas do Conhecimento e da Investigação, XIV Colóquio de Outono
editor
Ana Gabriela Macedo and Carlos Mendes de Sousa and Vitor Moura
docpage
jj.bib.dp.html#coloquiosOutono2013
year
2013
pages
323--339
author
publisher
húmus, Universidade do Minho
tipo
inproceedings
chave
coloquiosOutono2013
title
{Per-fide


sardinha2014


month
April
year
2014
chapter
9
editor
Tony Berber Sardinha and Telma São-Bento Ferreira
url
http://ambs.perl-hackers.net/publications/perfide_ch9_sardinha.pdf
chave
sardinha2014
author
pages
177--200
booktitle
Working with Portuguese Corpora
docpage
jj.bib.dp.html#sardinha2014
title
The {Per-Fide
isbn
978-1441190505
publisher
Bloomsbury Publishing
tipo
incollection


SOARES2014


title
Procura-PALavras (P-Pal): uma nova medida de frequência lexical do português europeu contemporâneo
script
sci_arttext
publisher
scielo
tipo
article
pages
110 - 123
docpage
jj.bib.dp.html#SOARES2014
journal
Psicologia: Reflexão e Crítica
crossref
10.1590/S0102-79722014000100013
pid
S0102
chave
SOARES2014
author
nrm
iso
language
pt
month
03
year
2014
url
http://www.scielo.br/scielo.php?&-79722014000100013
volume
27


ppal2014


volume
27
journal
Psicologia: Reflexão e Crítica
docpage
jj.bib.dp.html#ppal2014
number
1
pages
110-123
tipo
article
title
Procura-PALavras (P-PAL): A new measure of word frequency for contemporary European Portuguese | Procura-PALavras (P-PAL): Uma nova medida de frequência lexical do Português Europeu contemporâneo
year
2014
doi
10.1590/S0102-79722014000100013
author
chave
ppal2014


conclave-iccsa2104


address
annote
Document Type: Conference Paper; SCOPUS
doi
10.1007/978-3-319-09153-2_9
year
2014
chave
conclave-iccsa2104
author
journal
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
docpage
jj.bib.dp.html#conclave-iccsa2104
offbooktitle
14th International Conference on Computational Science and its Applications, ICCSA 2014; Guimaraes; Portugal
volume
8584 LNCS
pages
116-131
number
PART 6
tipo
article
publisher
Springer Verlag
title
Conclave: Ontology-driven measurement of semantic relatedness between source code elements and problem domain concepts


comsys-dmoss


number
4
pages
1191-1207
volume
11
docpage
jj.bib.dp.html#comsys-dmoss
journal
Computer Science and Information Systems
title
{DMOSS
tipo
article
abstract
Besides source code, the fundamental source of information about open source software lies in documentation and other non-source-code files, like README, INSTALL, or How-To files, commonly available in the software ecosystem. These documents, written in natural language, provide valuable information during the software development stage, but also in future maintenance and evolution tasks. DMOSS is a toolkit designed to systematically assess the quality of non-source-code content found in software packages. The toolkit handles a package as an attribute tree, and performs several tree traversal algorithms through a set of plugins, specialized in retrieving specific metrics from text, gathering information about the software. These metrics are later used to infer knowledge about the software, and composed together to build reports that assess the quality of specific features. This paper discusses the motivations for this work and then describes the toolkit's implementation and design goals. This is followed by an example of its usage to process a software package, and the produced report.
year
2014
show
pprwc110
url
http://www.comsis.org/archive.php?-1308
author
chave
comsys-dmoss


jss-Carvalho2014


url
http://www.sciencedirect.com/science/article/pii/S0164121214002179
doi
http://dx.doi.org/10.1016/j.jss.2014.10.013
year
2014
abstract
Program comprehension techniques often explore program identifiers to infer knowledge about programs. The relevance of source code identifiers as a source of information about programs is already established in the literature, as is their direct impact on future comprehension tasks. Most programming languages enforce some constraints on identifier strings (e.g., white spaces or commas are not allowed). Also, programmers often use word combinations and abbreviations to devise strings that represent single, or multiple, domain concepts in order to increase programming linguistic efficiency (convey more semantics writing less). These strings do not always use explicit marks to distinguish the terms used (e.g., CamelCase or underscores), so techniques often referred to as hard splitting are not enough. This paper introduces Lingua::IdSplitter, a dictionary-based algorithm for splitting and expanding strings that compose multi-term identifiers. It explores the use of general programming and abbreviation dictionaries, but also a custom dictionary automatically generated from software natural language content, prone to include application domain terms and specific abbreviations. This approach was applied to two software packages, written in C, achieving an f-measure of around 90% for correctly splitting and expanding identifiers. A comparison with current state-of-the-art approaches is also presented.
chave
jss-Carvalho2014
issn
0164-1212
keywords
Identifier splitting
author
journal
Journal of Systems and Software
docpage
jj.bib.dp.html#jss-Carvalho2014
volume
number
0
tipo
article
title
From source code identifiers to natural language terms


conclave-slate2014


irreditor
Maria João Varanda Pereira and José Paulo Leal and Alberto Simões
author
chave
conclave-slate2014
year
2014
annote
Keywords: software maintenance, software evolution, program comprehension, feature location, concept location, natural language processing
doi
http://dx.doi.org/10.4230/OASIcs.SLATE.2014.19
address
Dagstuhl, Germany
title
Conclave: Writing Programs to Understand Programs
publisher
Schloss Dagstuhl--Leibniz-Zentrum fuer Informatik
tipo
inproceedings
pages
19--34
volume
38
series
OpenAccess Series in Informatics (OASIcs)
booktitle
3rd Symposium on Languages, Applications and Technologies
docpage
jj.bib.dp.html#conclave-slate2014


DBLP:conf/slate/BritoA14


title
A Workflow Description Language to Orchestrate Multi-Lingual Resources
biburl
http://dblp.uni-trier.de/rec/bib/conf/slate/BritoA14
isbn
978-3-939897-68-2
tipo
inproceedings
publisher
Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik
pages
77--83
docpage
jj.bib.dp.html#DBLP:conf/slate/BritoA14
booktitle
3rd Symposium on Languages, Applications and Technologies, {SLATE
series
OASICS
volume
38
irreditor
Maria João Varanda Pereira and José Paulo Leal and Alberto Simões
chave
DBLP:conf/slate/BritoA14
author
doi
10.4230/OASIcs.SLATE.2014.77
year
2014
url
http://dx.doi.org/10.4230/OASIcs.SLATE.2014.77


DBLP:conf/slate/SimoesAB14


pages
251--265
booktitle
3rd Symposium on Languages, Applications and Technologies, {SLATE
docpage
jj.bib.dp.html#DBLP:conf/slate/SimoesAB14
volume
38
series
OASICS
title
Language Identification: a Neural Network Approach
biburl
http://dblp.uni-trier.de/rec/bib/conf/slate/SimoesAB14
isbn
978-3-939897-68-2
publisher
Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik
tipo
inproceedings
doi
10.4230/OASIcs.SLATE.2014.251
year
2014
url
http://dx.doi.org/10.4230/OASIcs.SLATE.2014.251
irreditor
Maria João Varanda Pereira and José Paulo Leal and Alberto Simões
chave
DBLP:conf/slate/SimoesAB14
author


DBLP:conf/slate/CarvalhoA14


irreditor
Maria João Varanda Pereira and José Paulo Leal and Alberto Simões
author
chave
DBLP:conf/slate/CarvalhoA14
year
2014
doi
10.4230/OASIcs.SLATE.2014.283
url
http://dx.doi.org/10.4230/OASIcs.SLATE.2014.283
biburl
http://dblp.uni-trier.de/rec/bib/conf/slate/CarvalhoA14
title
MLT-prealigner: a Tool for Multilingual Text Alignment
tipo
inproceedings
publisher
Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik
isbn
978-3-939897-68-2
pages
283--290
series
OASICS
volume
38
docpage
jj.bib.dp.html#DBLP:conf/slate/CarvalhoA14
booktitle
3rd Symposium on Languages, Applications and Technologies, {SLATE


tmxa


chave
tmxa
author
url
http://ambs.perl-hackers.net/publications/tmxa.pdf
address
Las Palmas de Gran Canaria, Spain
year
2014
abstract
In recent years the amount of freely available multilingual corpora has grown exponentially. Unfortunately, the way these corpora are made available is very diverse, ranging from simple text files or specific XML schemas to supposedly standard formats like the XML Corpus Encoding Initiative, the Text Encoding Initiative, or even the Translation Memory Exchange formats. In this document we defend the usage of Translation Memory Exchange documents, but we enrich their structure in order to support the annotation of the documents with different information such as lemmas, multi-words or entities. To support the adoption of the proposed formats, we present a set of tools to manipulate the different formats in an agile way.
month
November
tipo
inproceedings
title
Processing Annotated {TMX
docpage
jj.bib.dp.html#tmxa
booktitle
IberSpeech 2014 --- VIII Jornadas en Tecnologías del Habla and IV Iberian SLTech Workshop
pages
188--197


jss-CarvalhoAHP15


url
http://dx.doi.org/10.1016/j.jss.2014.10.013
doi
10.1016/j.jss.2014.10.013
abstract
Program comprehension techniques often explore program identifiers to infer knowledge about programs. The relevance of source code identifiers as a source of information about programs is already established in the literature, as is their direct impact on future comprehension tasks. Most programming languages enforce some constraints on identifier strings (e.g., white spaces or commas are not allowed). Also, programmers often use word combinations and abbreviations to devise strings that represent single, or multiple, domain concepts in order to increase programming linguistic efficiency (convey more semantics writing less). These strings do not always use explicit marks to distinguish the terms used (e.g., CamelCase or underscores), so techniques often referred to as hard splitting are not enough. This paper introduces Lingua::IdSplitter, a dictionary-based algorithm for splitting and expanding strings that compose multi-term identifiers. It explores the use of general programming and abbreviation dictionaries, but also a custom dictionary automatically generated from software natural language content, prone to include application domain terms and specific abbreviations. This approach was applied to two software packages, written in C, achieving an f-measure of around 90% for correctly splitting and expanding identifiers. A comparison with current state-of-the-art approaches is also presented.
year
2015
chave
jss-CarvalhoAHP15
author
keywords
Identifier splitting
timestamp
Mon, 22 Dec 2014 09:51:10 +0100
journal
Journal of Systems and Software
docpage
jj.bib.dp.html#jss-CarvalhoAHP15
volume
100
pages
117--128
bibsource
dblp computer science bibliography, http://dblp.org
tipo
article
title
From source code identifiers to natural language terms
biburl
http://dblp.uni-trier.de/rec/bib/journals/jss/CarvalhoAHP15


acores-wordcist2015


author
tipo
article
chave
acores-wordcist2015
title
New algorithms for smart assessment of math exercises
volume
353
docpage
jj.bib.dp.html#acores-wordcist2015
journal
Advances in Intelligent Systems and Computing
year
2015
pages
1221-1230


cisti-almeida2015


year
2015
booktitle
2015 10th Iberian Conference on Information Systems and Technologies, CISTI 2015
docpage
jj.bib.dp.html#cisti-almeida2015
url
http://www.scopus.com/inward/record.url?-s2.0-84943328958&partnerID=MN8TOARS
title
Gröbner bases and mathematical exercises generation with nondetermined structure
author
eid
2
tipo
inproceedings
chave
cisti-almeida2015
titlept
Bases de Gröbner e geração de exercícios matemáticos com estrutura não determinada


subtitles2015


docpage
jj.bib.dp.html#subtitles2015
journal
Quarterly Journal of Experimental Psychology
volume
68
pages
680-696
number
4
year
2015
chave
subtitles2015
author
tipo
article
title
On the advantages of word frequency and contextual diversity measures extracted from subtitles: The case of Portuguese


PULO:springer


title
Experiments on Enlarging a Lexical Ontology
isbn
978-3-319-27652-6
tipo
incollection
publisher
Springer International Publishing
pages
49--56
docpage
jj.bib.dp.html#PULO:springer
booktitle
Languages, Applications and Technologies
series
Communications in Computer and Information Science
volume
563
irreditor
Sierra-Rodríguez, José-Luis and Leal, José-Paulo and Simões, Alberto
chave
PULO:springer
author
language
English
doi
10.1007/978-3-319-27653-3_5
year
2015


SIMES16.1052


author
chave
SIMES16.1052
editor
Nicoletta Calzolari (Conference Chair) and Khalid Choukri and Thierry Declerck and Marko Grobelnik and Bente Maegaard and Joseph Mariani and Asuncion Moreno and Jan Odijk and Stelios Piperidis
month
may
year
2016
language
english
address
Portoroz, Slovenia
publisher
European Language Resources Association (ELRA)
tipo
inproceedings
isbn
978-2-9517408-9-1
title
Enriching a {P
date
23-28
booktitle
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC 2016)
docpage
jj.bib.dp.html#SIMES16.1052


almeida_et_al2016


booktitle
5th Symposium on Languages, Applications and Technologies (SLATE'16)
docpage
jj.bib.dp.html#almeida_et_al2016
volume
51
series
OpenAccess Series in Informatics (OASIcs)
pages
1--8
offeditor
Marjan Mernik and José Paulo Leal and Hugo Gonçalo Oliveira
publisher
Schloss Dagstuhl--Leibniz-Zentrum fuer Informatik
tipo
inproceedings
title
Context-Free Grammars: Exercise Generation and Probabilistic Assessment
offaddress
Dagstuhl, Germany
annote
Keywords: Exercise generation, context-free grammars, assessment
doi
http://dx.doi.org/10.4230/OASIcs.SLATE.2016.10
year
2016
chave
almeida_et_al2016
author


cisti2016


journal
Iberian Conference on Information Systems and Technologies, CISTI
docpage
jj.bib.dp.html#cisti2016
volume
2016-July
doi
https://doi.org/10.1109/CISTI.2016.7521367
year
2016
chave
cisti2016
author
tipo
article
title
Architectural approaches to build the museum of the person


exercise-composition2016


author
chave
exercise-composition2016
year
2016
doi
https://doi.org/10.1007/978-3-319-31307-8_24
tipo
article
note
WorldCIST'16
title
Exercise composition: From environment properties to composed problems
volume
445
docpage
jj.bib.dp.html#exercise-composition2016
journal
Advances in Intelligent Systems and Computing
pages
235-244


ontoMP2016


tipo
article
note
WorldCIST'16
title
OntoMP, an ontology to build the museum of the person
journal
Advances in Intelligent Systems and Computing
docpage
jj.bib.dp.html#ontoMP2016
volume
445
pages
653-661
chave
ontoMP2016
author
doi
https://doi.org/10.1007/978-3-319-31307-8_67
year
2016


portosanto-worldcist2017


pages
277-286
abstract
Exercise generation on language specification is a challenging problem because of the richness of the objects in the domain. In this paper we discuss Mgbeg (Meta-Grammar-Based Exercise Generator), a toolkit for exercise generation on context-free languages. The Mgbeg approach is based on a meta-grammar formalism and tool, used to define a set of similar exercises. A Mgbeg specification is a simple attributed grammar used to describe the set of valid exercises (and to randomly generate one of them). Each exercise typically contains several attributes calculated during the generation steps: namely, one or more formal specifications of the language (context-free grammars); the exercise statement; and other information, such as examples, common mistakes, and validation data, to be used in the construction of the exercise statement, solution, and assessment steps. In addition, the toolkit provides a grammar module, with functionality for grammar comparison, sentence generation and recognition, and a template engine (to help in the calculation of textual attributes).
year
2017
booktitle
Recent Advances in Information Systems and Technologies
docpage
jj.bib.dp.html#portosanto-worldcist2017
series
Advances in Intelligent Systems and Computing, vol. 659
title
Exercise generation on language specification
chave
portosanto-worldcist2017
note
WorldCIST'17
author
tipo
inproceedings


Martins2018a


pages
763-772
volume
745
series
Advances in Intelligent Systems and Computing
booktitle
Trends and Advances in Information Systems and Technologies, WorldCist2018
docpage
jj.bib.dp.html#Martins2018a
title
Increasing authorship identification through emotional analysis
publisher
Springer International Publishing
tipo
incollection
offeditor
Álvaro Rocha and Hojjat Adeli and Luís Paulo Reis and Sandra Costanzo
isbn
978-3-319-77702-3
month
March
year
2018
doi
https://doi.org/10.1007/978-3-319-77703-0_76
author
chave
Martins2018a
edition
1


DBLP:conf/ideal/MarcondesAN18


year
2018
pages
374--384
volume
11314
series
Lecture Notes in Computer Science
booktitle
{IDEAL
docpage
jj.bib.dp.html#DBLP:conf/ideal/MarcondesAN18
title
Chatbot Theory - A Naïve and Elementary Theory for Dialogue Management
author
publisher
Springer
tipo
inproceedings
chave
DBLP:conf/ideal/MarcondesAN18


DBLP:conf/bracis/Martins0ANH18


pages
61--66
year
2018
booktitle
{BRACIS
docpage
jj.bib.dp.html#DBLP:conf/bracis/Martins0ANH18
title
Hate Speech Classification in Social Media Using Emotional Analysis
chave
DBLP:conf/bracis/Martins0ANH18
author
publisher
IEEE
tipo
inproceedings


DBLP:conf/dcai/MartinsAHN18


year
2018
pages
276--283
series
Advances in Intelligent Systems and Computing
volume
800
docpage
jj.bib.dp.html#DBLP:conf/dcai/MartinsAHN18
booktitle
{DCAI
title
Domain Identification Through Sentiment Analysis
tipo
inproceedings
publisher
Springer
author
chave
DBLP:conf/dcai/MartinsAHN18


DBLP:conf/slate/MendesA18


tipo
inproceedings
publisher
Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik
author
chave
DBLP:conf/slate/MendesA18
title
eOS: The Exercise Operating System
series
OASICS
volume
62
docpage
jj.bib.dp.html#DBLP:conf/slate/MendesA18
booktitle
{SLATE
year
2018
pages
5:1--5:13


DBLP:conf/slate/Almeida18


year
2018
pages
8:1--8:8
series
OASICS
volume
62
docpage
jj.bib.dp.html#DBLP:conf/slate/Almeida18
booktitle
{SLATE
title
Abcl: Abc music notation with rich chord support (Short Paper)
tipo
inproceedings
publisher
Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik
author
chave
DBLP:conf/slate/Almeida18


DBLP:conf/slate/MartinsAHN18


series
OASICS
volume
62
docpage
jj.bib.dp.html#DBLP:conf/slate/MartinsAHN18
booktitle
{SLATE
year
2018
pages
19:1--19:9
tipo
inproceedings
publisher
Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik
author
chave
DBLP:conf/slate/MartinsAHN18
title
Predicting Performance Problems Through Emotional Analysis (Short Paper)


DBLP:conf/webmedia/MartinsANH18


title
Creating a social media-based personal emotional lexicon
tipo
inproceedings
publisher
ACM
author
chave
DBLP:conf/webmedia/MartinsANH18
year
2018
pages
261--264
docpage
jj.bib.dp.html#DBLP:conf/webmedia/MartinsANH18
booktitle
WebMedia


DBLP:conf/worldcist/MartinsAHN18


title
Increasing Authorship Identification Through Emotional Analysis
chave
DBLP:conf/worldcist/MartinsAHN18
tipo
inproceedings
publisher
Springer
author
pages
763--772
year
2018
docpage
jj.bib.dp.html#DBLP:conf/worldcist/MartinsAHN18
booktitle
WorldCIST {(1)
series
Advances in Intelligent Systems and Computing
volume
745


cola19


keywords
Formal languages, Context-free grammars, Automatic assessment
abstract
In this paper we consider the problem of equivalence of cycle-free context-free grammars. To every context-free grammar there corresponds a system of formal equations. Formally applying the iteration method to this system, we obtain the grammar axiom in the form of a formal power series composed of the words generated by the grammar multiplied by the respective ambiguities. We define a transform that attributes a matrix meaning to the system of formal equations and to formal power series: terminal symbols are substituted by matrices, and formal sum and product are substituted by the matrix ones. In order to effectively compute the sum of a matrix series we numerically solve the system of matrix equations. We prove distinguishability theorems showing that if two formal power series generated by cycle-free context-free grammars are different, then there exists a matrix substitution such that the sums of the respective matrix series are different. Based on this result, we suggest a procedure that can resolve the problem of equivalence of cycle-free context-free grammars in many practical cases. The results obtained in this paper form a theoretical basis for algorithms oriented to automatic assessment of students' answers in computer science. We present the respective algorithms. Then we compare our approach with a simple heuristic method based on the CYK algorithm and discuss the limitations of our method.
chave
cola19
author
docpage
jj.bib.dp.html#cola19
journal
Journal of Computer Languages
volume
51
pages
48-56
tipo
article
publisher
Elsevier
title
On solving cycle-free context-free grammar equivalence problem using numerical analysis


Almeida2019


tipo
inproceedings
title
Hunting ancestors: A unified approach for discovering genealogical information
volume
74
type
Conference Paper
docpage
jj.bib.dp.html#Almeida2019
journal
OpenAccess Series in Informatics
number
22
author
eid
2
chave
Almeida2019
url
https://www.scopus.com/inward/record.uri?-s2.0-85071097688&.4230%2fOASIcs.SLATE.2019.22&partnerID=40&md5=8e2f42806d411bdfa553dcfa27be17a9
abstract
This paper presents a unified approach for discovering genealogical information. It presents a framework for storing information concerning ancestors, locations, dates and documents. It also intends to provide a framework that is able to perform inference concerning dates by using constraints and for handling relations, locations and sources. The DSL presented also aims to help users store information from heterogeneous sources along with the evidence contained therein. © José J. Almeida and Rui C. Mendes.
source
Scopus
year
2019
doi
10.4230/OASIcs.SLATE.2019.22


Simões2019453


abstract
The digital era has brought some challenges to lexicographers, but it has also brought new opportunities as part of the rise of information technology and, more recently, the emergence of digital humanities. This paper provides a description of LeXmart, the framework that supports the digital development of the Portuguese Academy of Sciences Dictionary. LeXmart is a smart tool framework to support lexicographers' work that offers different types of tools, ranging from a structural editor to a set of validation tools. Given that the dictionary is stored in eXist-DB, LeXmart is developed on top of its ecosystem, using W3C standard languages, and offering default functionalities offered by eXist-DB, namely a RESTful API. © 2019 Lexical Computing CZ s.r.o.. All rights reserved.
source
Scopus
year
2019
url
https://www.scopus.com/inward/record.uri?-s2.0-85075350281&partnerID=40&md5=c5171c547089e5728c1cec0d5c755df1
eid
2
author
chave
Simões2019453
pages
453-466
volume
2019-October
type
Conference Paper
docpage
jj.bib.dp.html#Simões2019453
journal
Proceedings of Electronic Lexicography in the 21st Century Conference
title
LexMart: A smart tool for lexicographers
tipo
inproceedings


Martins2019


type
Article
docpage
jj.bib.dp.html#Martins2019
journal
Expert Systems
number
e12469
tipo
article
title
A sentiment analysis approach to increase authorship identification
url
https://www.scopus.com/inward/record.uri?-s2.0-85074844787&.1111%2fexsy.12469&partnerID=40&md5=bb5b7acab849e47b90246393026a4ba4
abstract
Writing style is considered the manner in which an author expresses his thoughts, influenced by language characteristics, period, school, or nation. Often, this writing style can identify the author. One of the most famous examples comes from 1914 in Portuguese literature. Fernando Pessoa and his heteronyms Alberto Caeiro, Álvaro de Campos, and Ricardo Reis, who had completely different writing styles, led people to believe that they were different individuals. Currently, the discussion of authorship identification is more relevant because of the considerable amount of widespread fake news in social media, in which it is hard to identify who authored a text and even a simple quote can impact the public image of an author, especially if these texts or quotes are from politicians. This paper presents a process to analyse the emotion contained in social media messages such as Facebook to identify the author's emotional profile and use it to improve the ability to predict the author of the message. Using preprocessing techniques, lexicon-based approaches, and machine learning, we achieved an authorship identification improvement of approximately 5% in the whole dataset and more than 50% in specific authors when considering the emotional profile on the writing style, thus increasing the ability to identify the author of a text by considering only the author's emotional profile, previously detected from prior texts. © 2019 John Wiley & Sons, Ltd.
source
Scopus
year
2019
doi
10.1111/exsy.12469
eid
2
author
chave
Martins2019


Martins2019276


tipo
inproceedings
title
Domain identification through sentiment analysis
volume
800
type
Conference Paper
docpage
jj.bib.dp.html#Martins2019276
journal
Advances in Intelligent Systems and Computing
pages
276-283
eid
2
author
chave
Martins2019276
url
https://www.scopus.com/inward/record.uri?-s2.0-85049987273&.1007%2f978-3-319-94649-8_33&partnerID=40&md5=3fe3521d746330d391ee8ec0dd7bd4e9
source
Scopus
abstract
When dealing with chatbots, domain identification is an important feature to adapt the interactions between user and computer in order to increase the reliability of the communication and, consequently, the audience and decrease its rejection avoiding misunderstandings. In order to adapt to different domains, the writing style will be different for the same author. For example, the same person in the role of a student writes to his professor in a different style than he does for his brother. This article presents a process that uses sentiment analysis to identify the average emotional profile of the communication scenario where the conversation is done. Using Natural Language Processing and Machine Learning techniques, it was possible to obtain an index of 96.21% of correct classifications in the identification of where these communications have occurred only analysing the emotional profile of these texts. © Springer International Publishing AG, part of Springer Nature 2019.
year
2019
doi
10.1007/978-3-319-94649-8_33


Silva2020


tipo
inproceedings
title
Musikla: Language for generating musical events
type
Conference Paper
journal
OpenAccess Series in Informatics
docpage
jj.bib.dp.html#Silva2020
volume
83
number
A6
chave
Silva2020
eid
2
author
url
https://www.scopus.com/inward/record.uri?-s2.0-85091704838&.4230%2fOASIcs.SLATE.2020.6&partnerID=40&md5=1c450e4e7bb940f5855eafaedb4ccba3
doi
10.4230/OASIcs.SLATE.2020.6
source
Scopus
abstract
In this paper, we'll discuss a simple approach to integrating musical events, such as notes or chords, into a programming language. This means treating music sequences as a first class citizen. It will be possible to save those sequences into variables or play them right away, pass them into functions or apply operators on them (like transposing or repeating the sequence). Furthermore, instead of just allowing static sequences to be generated, we'll integrate a music keyboard system that easily allows the user to bind keys (or other kinds of events) to expressions. Finally, it is important to provide the user with multiple and extensible ways of outputting their music, such as synthesizing it into a file or directly into the speakers, or writing a MIDI or music sheet file. We'll structure this paper first with an analysis of the problem and its particular requirements. Then we will discuss the solution we developed to meet those requirements. Finally we'll analyze the result and discuss possible alternative routes we could've taken. © 2020 Schloss Dagstuhl - Leibniz-Zentrum für Informatik GmbH, Dagstuhl Publishing. All rights reserved.
year
2020


Oliveira2020


title
BhTSL, behavior trees specification and processing
tipo
inproceedings
number
A4
type
Conference Paper
docpage
jj.bib.dp.html#Oliveira2020
journal
OpenAccess Series in Informatics
volume
83
chave
Oliveira2020
author
eid
2
doi
10.4230/OASIcs.SLATE.2020.4
abstract
In the context of game development, there is always the need for describing behaviors for various entities, whether NPCs or even the world itself. That need requires a formalism to describe properly such behaviors. As the gaming industry has been growing, many approaches were proposed. First, finite state machines were used and evolved to hierarchical state machines. As that formalism was not enough, a more powerful concept appeared. Instead of using states for describing behaviors, people started to use tasks. This concept was incorporated in behavior trees. This paper focuses on the specification and processing of Behavior Trees. A DSL designed for that purpose will be introduced. A generator that produces LaTeX diagrams to document the trees, and a Python module to implement the behavior described, will also be discussed. Additionally, a simulator will be presented. These achievements will be illustrated using a concrete game as a case study. © 2020 Schloss Dagstuhl - Leibniz-Zentrum für Informatik GmbH, Dagstuhl Publishing. All rights reserved.
source
Scopus
year
2020
url
https://www.scopus.com/inward/record.uri?-s2.0-85091707856&.4230%2fOASIcs.SLATE.2020.4&partnerID=40&md5=3b2daa7d548eeed77224386d6790adc7


Simões2020


author
eid
2
chave
Simões2020
source
Scopus
abstract
In this document we present the first developments on an Umbundu dictionary for jSpell, a morphological analyzer. Initially some comments are made regarding the Umbundu language morphology, followed by a discussion of the jSpell dictionary structure and its environment. Last, we describe the Umbundu dictionary bootstrap process and perform some final experiments on its coverage. © 2020 Schloss Dagstuhl - Leibniz-Zentrum für Informatik GmbH, Dagstuhl Publishing. All rights reserved.
year
2020
doi
10.4230/OASIcs.SLATE.2020.10
url
https://www.scopus.com/inward/record.uri?-s2.0-85091700212&.4230%2fOASIcs.SLATE.2020.10&partnerID=40&md5=26f3c0eacb3fc1ea35f005c08377b083
title
Towards a morphological analyzer for the umbundu language
tipo
inproceedings
number
A10
volume
83
type
Conference Paper
journal
OpenAccess Series in Informatics
docpage
jj.bib.dp.html#Simões2020


Marcondes2020


chave
Marcondes2020
author
eid
2
source
Scopus
abstract
The username hints for most of the on-line social networks are mostly unpleasant for human beings since they are mostly a simple name variation followed by numbers. This paper shows that it is possible to generate human likable usernames through heuristics guided by structural onomastics. The objective then is to conceive heuristics as such and check its availability in Twitter in order to verify if it is possible to generate a sufficiently big and available username data-set that is able to justify the transition from an unpleasant to a pleasant username suggestion. This paper finds that it is possible to generate 8281 handles on average through the proposed heuristics and their permutations; therefore, the number of various possibilities is comfortable. This is a partial account since not all possibilities were explored and some improvements are required, but it suffices as a proof of concept and to indicate paths. © 2020 CEUR-WS. All rights reserved.
year
2020
url
https://www.scopus.com/inward/record.uri?-s2.0-85090898082&partnerID=40&md5=3bee224fddd1133fbeb306d5c88737fa
title
Structural onomatology for username generation: A partial account
tipo
inproceedings
type
Conference Paper
docpage
jj.bib.dp.html#Marcondes2020
journal
CEUR Workshop Proceedings
volume
2655


Marcondes202028


title
A short survey on chatbot technology: Failure in raising the state of the art
tipo
inproceedings
pages
28-36
volume
1003
docpage
jj.bib.dp.html#Marcondes202028
journal
Advances in Intelligent Systems and Computing
type
Conference Paper
eid
2
author
chave
Marcondes202028
year
2020
abstract
This short survey aimed initially to explore the existing state of the art for the application of chatbots to fighting (and not spreading) fake news. It was then realized that it is not common to use chatbots with this "virtuous" purpose. Therefore, after two surveys and a meta-analysis, the topic had to be withdrawn since there were no survey results to discuss besides the absence of results. The survey result then raised a need to understand how chatbots are currently being used and designed, and what their primary sources are. The result was once again confusing since, on the sample: (1) no significant concentration of usage could be found; (2) no widely adopted design strategies were identified, and (3) no significant crosscutting references could be considered as primary sources. Certainly, this can be due to a biased sample but may also be a symptom of a methodological issue in chatbot research. If the second possibility is proved to be right, it means that chatbot research is still at a pre-paradigm stage according to Kuhn's conception. For this paper, 4 surveys were performed with a total sample of 50 papers, mostly from the last 3 years. © Springer Nature Switzerland AG 2020.
source
Scopus
doi
10.1007/978-3-030-23887-2_4
url
https://www.scopus.com/inward/record.uri?-s2.0-85068602421&.1007%2f978-3-030-23887-2_4&partnerID=40&md5=cbf6fb00a51eb082aa7e1097f926fece


Marcondes2020170


type
Conference Paper
journal
Advances in Intelligent Systems and Computing
docpage
jj.bib.dp.html#Marcondes2020170
volume
1160 AISC
pages
170-180
tipo
inproceedings
title
Fact-Check spreading behavior in twitter: A qualitative profile for false-claim news
url
https://www.scopus.com/inward/record.uri?-s2.0-85086245198&.1007%2f978-3-030-45691-7_16&partnerID=40&md5=6547f11464462d6bfdb1505e6142b733
doi
10.1007/978-3-030-45691-7_16
source
Scopus
abstract
Fact-check spread is usually performed by a plain tweet with just the link. Since it is not proper human behavior, it may cause uncanny, hinder the reader's attention and harm the counter-propaganda influence. This paper presents a profile of fact-check link spread in Twitter (suiting for TRL-1) and, as an additional outcome, proposes a preliminary behavior design based on it (suiting for TRL-2). The underlying hypothesis is by simulating human-like behavior, a bot gets more attention and exerts more influence on its followers. © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2020.
year
2020
chave
Marcondes2020170
author
eid
2


Martins2020134


url
https://www.scopus.com/inward/record.uri?-s2.0-85085513930&.1007%2f978-3-030-45688-7_14&partnerID=40&md5=d559e334a2140bea6ea02051264b73c4
abstract
Political debate - in its essence - carries a robust emotional charge, and social media have become a vast arena for voters to disseminate and discuss the ideas proposed by candidates. The Brazilian presidential elections of 2018 were marked by a high level of polarization, making the discussion of the candidates' ideas an ideological battlefield, full of accusations and verbal aggression, creating an excellent source for sentiment analysis. In this paper, we analyze the emotions of the tweets posted about the presidential candidates of Brazil on Twitter, so that it was possible to identify the emotional profile of the adherents of each of the leading candidates, and thus to discern which emotions had the strongest effects upon the election results. Also, we created a model using sentiment analysis and machine learning, which predicted with a correlation of 0.90 the final result of the election. © 2020, The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG.
source
Scopus
year
2020
doi
10.1007/978-3-030-45688-7_14
eid
2
author
chave
Martins2020134
volume
1159 AISC
type
Conference Paper
journal
Advances in Intelligent Systems and Computing
docpage
jj.bib.dp.html#Martins2020134
pages
134-143
tipo
inproceedings
title
Predicting an Election's Outcome Using Sentiment Analysis


Martins201861


chave
Martins201861
eid
2
author
url
https://www.scopus.com/inward/record.uri?-s2.0-85060849408&.1109%2fBRACIS.2018.00019&partnerID=40&md5=10284a22b511c161a903debd79e5619a
doi
10.1109/BRACIS.2018.00019
year
2018
abstract
In this paper, we examine methods to classify hate speech in social media. We aim to establish lexical baselines for this task by applying classification methods using a dataset annotated for this purpose. As features, our system uses Natural Language Processing (NLP) techniques in order to expand the original dataset with emotional information and provide it for machine learning classification. We obtain results of 80.56% accuracy in hate speech identification, which represents an increase of almost 100% from the original analysis used as a reference. © 2018 IEEE.
source
Scopus
tipo
inproceedings
title
Hate speech classification in social media using emotional analysis
journal
Proceedings - 2018 Brazilian Conference on Intelligent Systems, BRACIS 2018
docpage
jj.bib.dp.html#Martins201861
type
Conference Paper
pages
61-66
number
8575590