- year
- 1987
- number
- 1
- journal
- Revista de Informática
- docpage
- jj.bib.dp.html#velharia1
- volume
- 6
- title
- Descrição de um Núcleo Gráfico e Aplicação em {CAD
- chave
- velharia1
- tipo
- article
- author
- note
- (KGUM - kernel gráfico U.Minho)
- journal
- Revista de Informática
- docpage
- jj.bib.dp.html#velharia2
- volume
- 9
- year
- 1988
- number
- 6
- chave
- velharia2
- tipo
- article
- author
- C. Ferreira
- F. Ferreira
- F. Martins
- J.J. Almeida
- L. Barbosa
- title
- Sistemas de Programação Modular
- title
- Mecanismos para Especificação e Prototipagem de Interfaces
- note
- (Gramáticas Interactivas guardadas)
- author
- F. Mário Martins
- J.J. Almeida
- P.R. Henriques
- tipo
- inproceedings
- chave
- graminteractivas1990
- year
- 1990
- address
- Coimbra
- booktitle
- 3$º$ Encontro Português de Computação Gráfica
- docpage
- jj.bib.dp.html#graminteractivas1990
- year
- 1988
- docpage
- jj.bib.dp.html#tlc89
- type
- Texto didáctico
- title
- Teoria das Linguagens
- keyword
- institution
- Universidade do Minho, Departamento de Informática
- chave
- tlc89
- tipo
- techreport
- author
- title
- Estruturas de Dados
- keyword
- institution
- Universidade do Minho, Departamento de Informática
- chave
- estruturasdedados90
- tipo
- techreport
- author
- year
- 1990
- docpage
- jj.bib.dp.html#estruturasdedados90
- type
- Texto didáctico
- title
- \textsc{Camila} - A Platform for Software Mathematical Development
- tipo
- techreport
- docpage
- jj.bib.dp.html#Camila
- type
- (Páginas do projecto)
- keyword
- chave
- Camila
- institution
- Universidade do Minho, Departamento de Informática
- author
- year
- 1998
- url
- http://camila.di.uminho.pt
- editor
- L.S. Barbosa and J.J. Almeida and J.N. Oliveira and Luís Neves
- title
- {Natura} - Natural language processing
- tipo
- techreport
- note
- \url{http://natura.di.uminho.pt/}
- docpage
- jj.bib.dp.html#Natura
- type
- (Páginas do projecto)
- keyword
- institution
- Universidade do Minho, Departamento de Informática
- chave
- Natura
- author
- year
- 1997
- url
- http://natura.di.uminho.pt/
- author
- institution
- Universidade do Minho, Departamento de Informática
- chave
- PDavid
- keyword
- editor
- J.C. Ramalho and J.J. Almeida and P.R. Henriques
- url
- http://www.di.uminho.pt/~jcr/projectos/david/princ.html
- year
- 1998
- tipo
- techreport
- note
- \url{http://www.di.uminho.pt/~jcr/projectos/david/princ.html}
- title
- David -- Processamento estruturado de documentos
- docpage
- jj.bib.dp.html#PDavid
- type
- (Páginas do projecto)
- chave
- nllex
- tipo
- misc
- author
- title
- NLlex -- Natural Language LEX
- keyword
- lexical analysis
- Natura
- lex
- misc
- url
- http://natura.di.uminho.pt/~jj/pln/pln.html#nllex
- docpage
- jj.bib.dp.html#nllex
- type
- tool
- year
- 1996
- year
- 1997
- type
- tool
- docpage
- jj.bib.dp.html#jspell
- url
- http://natura.di.uminho.pt/~jj/pln/pln.html#jspell
- keyword
- lexical analysis
- Natura
- morphology
- misc
- title
- Jspell a module for morphological analyser for natural language
- author
- J.J. Almeida
- Ulisses Pinto
- tipo
- misc
- chave
- jspell
- docpage
- jj.bib.dp.html#jspell1
- type
- Manual
- tipo
- techreport
- title
- Manual de Utilizador do {JSpell}
- url
- http://natura.di.uminho.pt/~jj/pln/jspellman.ps.gz
- year
- 1994
- abstract
- month
- Jul
- chave
- jspell1
- institution
- Universidade do Minho, Departamento de Informática
- author
- J.J. Almeida
- Ulisses Pinto
- keyword
- morphology
- lexical analysis
- jspell
- techreport
- year
- 1994
- url
- http://natura.di.uminho.pt/~jj/pln/yalg3.ps.gz
- docpage
- jj.bib.dp.html#Almeida94b
- editor
- Carlos Martin Vide
- booktitle
- Actas del X Congreso de Lenguajes Naturales y Lenguajes Formales, Sevilla
- title
- {GPC} -- a Tool for higher-order grammar specification
- keyword
- grammar
- inproceedings
- chave
- Almeida94b
- tipo
- inproceedings
- author
- title
- {YaLG} -- extending {DCG} for natural language processing
- tipo
- inproceedings
- pages
- 621--628
- docpage
- jj.bib.dp.html#Almeida95a
- booktitle
- Actas del XI Congreso de Lenguajes Naturales y Lenguajes Formales, Tortosa
- keyword
- jspell
- morphology
- nllex
- inproceedings
- chave
- Almeida95a
- author
- year
- 1995
- url
- http://natura.di.uminho.pt/~jj/pln/yalg.ps.gz
- editor
- Carlos Martin Vide
- title
- Jspell -- um módulo para análise léxica genérica de linguagem natural
- tipo
- inproceedings
- pages
- 1--15
- booktitle
- Actas do X Encontro da Associação Portuguesa de Linguística
- docpage
- jj.bib.dp.html#Almeida94c
- keyword
- jspell
- morphology
- perl
- inproceedings
- author
- J.J. Almeida
- Ulisses Pinto
- chave
- Almeida94c
- year
- 1995
- address
- Évora 1994
- url
- http://natura.di.uminho.pt/~jj/pln/jspell1.ps.gz
- tipo
- inproceedings
- author
- chave
- Almeida94a
- keyword
- librarian studies
- IR
- inproceedings
- title
- Documents in an Informatic Academic environment
- docpage
- jj.bib.dp.html#Almeida94a
- booktitle
- Congresso Nacional de Bibliotecários, Arquivistas e
- year
- 1994
- address
- Lisboa
- number
- UM-DI-95.04
- year
- 1995
- url
- http://natura.di.uminho.pt/~jj/pln/nllex.ps.gz
- docpage
- jj.bib.dp.html#jj95
- title
- {NLlex} -- a tool to generate lexical analysers for natural language
- keyword
- jspell
- morphology
- lex
- nllex
- techreport
- institution
- Universidade do Minho, Departamento de Informática
- chave
- jj95
- author
- tipo
- techreport
- url
- http://www.di.uminho.pt/~lsb/pub_camila/LNcam.ps.gz
- year
- 1995
- institution
- University of Minho
- chave
- Barbosa95
- author
- L.S. Barbosa
- J.J. Almeida
- keyword
- Camila
- formal specification
- techreport
- docpage
- jj.bib.dp.html#Barbosa95
- number
- DI-CAM-95:11:1
- note
- Lecture notes for the System Design Course,
Computer System Engineering, University of Bristol
- tipo
- techreport
- title
- System Prototyping in \textsc{Camila}
- year
- 1995
- number
- DI-CAM-95:11:2
- url
- http://www.di.uminho.pt/~lsb/pub_camila/RMcam.ps.gz
- docpage
- jj.bib.dp.html#Barbosa95a
- keyword
- title
- \textsc{Camila}: A reference Manual
- tipo
- techreport
- author
- L.S. Barbosa
- J.J. Almeida
- chave
- Barbosa95a
- institution
- University of Minho
- number
- DI-CAM-95:11:1:v98
- year
- 1998
- type
- {Lecture Notes for the Bristol Course (1st ed. 1995)}
- docpage
- jj.bib.dp.html#BA97a
- keyword
- Formal Methods
- Prototyping
- Camila
- techreport
- title
- Systems Prototyping in \textsc{Camila}
- author
- L.S. Barbosa
- J.J. Almeida
- tipo
- techreport
- institution
- DI (U. Minho)
- chave
- BA97a
- number
- DI-CAM-95:7:1
- year
- 1995
- url
- http://www.di.uminho.pt/~lsb/pub_camila/romantic.ps.gz
- docpage
- jj.bib.dp.html#Barbosa95b
- keyword
- Camila
- formal specification
- didactics
- techreport
- title
- Growing Up With \textsc{Camila}
- author
- L.S. Barbosa
- J.J. Almeida
- tipo
- techreport
- institution
- Universidade do Minho, Departamento de Informática
- chave
- Barbosa95b
- chave
- Almeida96a
- author
- keyword
- perl
- morphology
- lexical analysis
- dictionary
- inproceedings
- url
- http://natura.di.uminho.pt/~jj/pln/etdic.ps.gz
- address
- Lisboa 1995
- year
- 1996
- tipo
- inproceedings
- title
- Especificação e tratamento de Dicionários
- booktitle
- Actas do XI Encontro da Associação Portuguesa de Linguística
- docpage
- jj.bib.dp.html#Almeida96a
- volume
- 2
- docpage
- jj.bib.dp.html#Ulisses96
- booktitle
- Actas do XI Encontro da Associação Portuguesa de Linguística
- volume
- 2
- tipo
- inproceedings
- title
- Tratamento automático de termos compostos
- url
- http://natura.di.uminho.pt/~jj/pln/ptc.ps.gz
- address
- Lisboa 1995
- year
- 1996
- chave
- Ulisses96
- author
- Ulisses Pinto
- J.J. Almeida
- keyword
- jspell
- morphology
- lexical analysis
- inproceedings
- year
- 1996
- booktitle
- II International Conference on Mathematical Linguistics, Tarragona, Spain
- url
- http://natura.di.uminho.pt/~jj/pln/yalg2.ps.gz
- docpage
- jj.bib.dp.html#Almeida96b
- title
- {YaLG} a tool for higher-order grammar specification
- keyword
- yalg
- RS
- inproceedings
- chave
- Almeida96b
- author
- tipo
- inproceedings
- url
- http://natura.di.uminho.pt/~jj/pln/nllex2.ps.gz
- month
- Sep
- year
- 1996
- chave
- jj96
- author
- keyword
- jspell
- morphology
- lex
- nllex
- article
- journal
- Procesamiento del Lenguaje Natural
- docpage
- jj.bib.dp.html#jj96
- volume
- 19
- pages
- 81--90
- publisher
- Sociedad Española para el Procesamiento del Lenguaje Natural
- tipo
- article
- title
- {NLlex} -- a tool to generate lexical analysers for natural language
- year
- 1997
- month
- Dec.
- address
- Washington D.C. - USA
- docpage
- jj.bib.dp.html#SGML97
- booktitle
- SGML/XML'97 Conference
- keyword
- PDavid
- Semantics
- inproceedings
- title
- SGML Documents: where does quality go?
- tipo
- inproceedings
- author
- J.C. Ramalho
- J.G. Rocha
- J.J. Almeida
- P.R. Henriques
- chave
- SGML97
- title
- Programação de dicionários
- tipo
- inproceedings
- pages
- 21--28
- docpage
- jj.bib.dp.html#Almeida98
- booktitle
- Actas do XIII Encontro da Associação Portuguesa de Linguística
- volume
- 1
- keyword
- perl
- morphology
- dictionary
- parser
- inproceedings
- chave
- Almeida98
- author
- address
- Lisboa 1997
- year
- 1998
- url
- http://natura.di.uminho.pt/~jj/bib/progDic.ps.gz
- title
- Etiquetador morfo-sintáctico para o Português
- keyword
- chave
- Reis98
- author
- Ricardo Reis
- J.J. Almeida
- tipo
- inproceedings
- address
- Lisboa 1997
- year
- 1998
- booktitle
- Actas do XIII Encontro da Associação Portuguesa de Linguística
- docpage
- jj.bib.dp.html#Reis98
- url
- http://natura.di.uminho.pt/~jj/bib/etiquetador2.ps.gz
- keyword
- Camila
- formal specification
- inproceedings
- author
- J.J. Almeida
- Barbosa, L.S.
- Neves, F.L.
- Oliveira, J.N.
- chave
- ABNO97a
- month
- October
- year
- 1997
- address
- La Plata, Argentina
- url
- http://camila.di.uminho.pt/camila-doc/CLaPF97.ps.gz
- editor
- De Giusti, A. and Diaz, J. and Pesado, P.
- title
- \textsc{Camila}: Formal Software Engineering Supported by Functional Programming
- tipo
- inproceedings
- pages
- 1343--1358
- booktitle
- Proc. II Conf. Latino Americana de Programación Funcional ({CLaPF97})
- docpage
- jj.bib.dp.html#ABNO97a
- author
- J.J. Almeida
- Barbosa, L.S.
- Neves, F.L.
- Oliveira, J.N.
- chave
- ABNO97b
- keyword
- Camila
- formal specification
- inproceedings
- editor
- Johnson, M.
- month
- December
- year
- 1997
- address
- Sydney, Australia
- publisher
- Springer Lect. Notes Comp. Sci. (1349)
- tipo
- inproceedings
- title
- \textsc{Camila}: Prototyping and Refinement of Constructive Specifications
- booktitle
- 6th International Conference on Algebraic Methods and Software Technology ({AMAST'97})
- docpage
- jj.bib.dp.html#ABNO97b
- pages
- 554--559
- docpage
- jj.bib.dp.html#AH97
- booktitle
- Proc. II Conference on Knowledge-based Intelligent Electronic Systems ({Kes98})
- title
- Dynamic Dictionary = cooperative information sources
- tipo
- inproceedings
- address
- Australia
- year
- 1998
- month
- April
- url
- http://natura.di.uminho.pt/~jj/bib/agentes97.ps.gz
- keyword
- dictionary
- Agentes
- inproceedings
- chave
- AH97
- author
- J.J. Almeida
- P.R. Henriques
- title
- Adapting Museum Structures for the Web: No Changes Needed!
- chave
- museums98
- note
- Toronto - Canadá
- author
- J.G. Rocha
- M.R. Henriques
- J.C. Ramalho
- J.J. Almeida
- J.L. Faria
- P.R. Henriques
- tipo
- inproceedings
- year
- 1998
- booktitle
- Museums and the Web 1998
- docpage
- jj.bib.dp.html#museums98
- tipo
- inproceedings
- author
- Almeida, J.J.
- Barbosa, L.S.
- Barros, J.B.
- Neves, L.F.
- publisher
- Proc. 3rd Summer School on Advan. Funct. Prog., Braga
- chave
- ABBN98
- title
- On The Development of \textsc{Camila}
- editor
- L.S. Barbosa and J.A. Saraiva
- docpage
- jj.bib.dp.html#ABBN98
- booktitle
- Workshop on Research Themes on Functional Programming
- year
- 1998
- month
- 18 Sep.
- docpage
- jj.bib.dp.html#Gis99
- booktitle
- Conferência da Association of Geographic Information
Laboratories for Europe (AGILE)
- address
- Roma
- year
- 1999
- chave
- Gis99
- tipo
- inproceedings
- author
- Jorge Rocha
- Ana Silva
- Ricardo Henriques
- J.J. Almeida
- Pedro Henriques
- title
- Formal Methods for {GI
- keyword
- author
- Jorge Rocha
- Tiago Pedroso
- J.J. Almeida
- tipo
- inproceedings
- chave
- RPA99
- keyword
- mapit
- inproceedings
- title
- {MAPit
- booktitle
- Conferência da Association of Geographic Information
Laboratories for Europe (AGILE)
- docpage
- jj.bib.dp.html#RPA99
- year
- 1999
- address
- Roma
- year
- 1999
- keyword
- title
- Sobre a Utilização de Metodologias Formais no Desenvolvimento de
- tipo
- inproceedings
- author
- Jorge Gustavo Rocha
- Ana Silva
- J.J. Almeida
- Mario Ricardo Henriques
- Pedro Rangel Henriques
- docpage
- jj.bib.dp.html#RSea99
- booktitle
- GISBRASIL'99, Salvador
- chave
- RSea99
- chave
- xmldt99
- tipo
- inproceedings
- author
- J.J. Almeida
- José Carlos Ramalho
- title
- {XML::DT
- keyword
- docpage
- jj.bib.dp.html#xmldt99
- booktitle
- XML-Europe'99, Granada - Espanha
- year
- 1999
- month
- May
- year
- 1999
- chave
- RRAH99
- author
- J.C. Ramalho
- J.G. Rocha
- J.J. Almeida
- P.R. Henriques
- keyword
- docpage
- jj.bib.dp.html#RRAH99
- journal
- Markup Languages: theory and practice
- pages
- 75--90
- volume
- 1
- publisher
- MIT Press
- tipo
- article
- title
- SGML documents: Where does quality go?
- author
- L.S. Barbosa
- J.B. Barros
- J.J. Almeida
- tipo
- inproceedings
- chave
- Barbosa2000
- keyword
- title
- Polytypic Recursion Patterns
- booktitle
- {SBLP'2000} (to appear as a ENTCS volume)
- docpage
- jj.bib.dp.html#Barbosa2000
- month
- May
- year
- 2000
- address
- {UFP}, Recife, Brasil
- title
- Smallbook -- comando para produção de livros em pequena escala
- keyword
- publishing
- latex
- smallbook
- inproceedings
- chave
- jj2001x
- tipo
- inproceedings
- author
- address
- Braga
- pages
- 445--450
- year
- 2000
- docpage
- jj.bib.dp.html#jj2001x
- booktitle
- Actas da II Conferência Internacional de Tecnologias de
Informação e Comunicação na Educação
- chave
- speaker:sepln2001
- author
- J.J. Almeida
- A. M. Simões
- keyword
- address
- Sevilha
- month
- Sep.
- year
- 2001
- publisher
- Sociedad Española para el Procesamiento del Lenguaje Natural
- tipo
- article
- title
- Text to speech -- a rewriting system approach
- docpage
- jj.bib.dp.html#speaker:sepln2001
- journal
- Procesamiento del Lenguaje Natural
- volume
- 27
- pages
- 247--255
- month
- Maio
- year
- 2001
- address
- Porto
- booktitle
- Congresso Nacional de Bibliotecários, Arquivistas e
- url
- http://natura.di.uminho.pt/~jj/bib/museuDaPessoa2001.ps.gz
- docpage
- jj.bib.dp.html#mp2001
- title
- {Museu da Pessoa
- author
- J.J. Almeida
- J. Gustavo Rocha
- P. Rangel Henriques
- Sónia Moreira
- Alberto Simões
- tipo
- inproceedings
- chave
- mp2001
- tipo
- inproceedings
- author
- J.J. Almeida
- P. Rangel Henriques
- J. Gustavo Rocha
- Alberto Simões
- chave
- alfarrabio2001
- title
- Alfarrábio: Adding value to an Heterogeneous Site Collection
- docpage
- jj.bib.dp.html#alfarrabio2001
- url
- http://natura.di.uminho.pt/~jj/bib/alfarrabio2001.ps.gz
- booktitle
- Congresso Nacional de Bibliotecários, Arquivistas e
- year
- 2001
- month
- Maio
- address
- Porto
- title
- Cálculo de frequências de
palavras para entradas de dicionários através do uso conjunto de analisadores
morfológicos, taggers e corpora
- tipo
- inproceedings
- pages
- 407--418
- booktitle
- Actas do XVII Encontro da Associação Portuguesa de Linguística
- docpage
- jj.bib.dp.html#freq2002
- author
- Paulo A. Rocha
- Alberto M. Simões
- J.J. Almeida
- chave
- freq2002
- abstract
- Apresentamos neste documento uma possível abordagem à
extracção de frequências de palavras a partir de
corpora, baseada numa utilização cooperativa de várias
ferramentas de Processamento de Linguagem Natural.
- year
- 2002
- address
- Lisboa 2001
- url
- http://natura.di.uminho.pt/~jj/bib/apl:freqnormpt.ps.gz
- address
- Lisboa 2001
- pages
- 485--495
- year
- 2002
- abstract
- Neste documento é nosso propósito apresentar as
características presentes no analisador morfológico
jspell e quais as suas consequências ao nível de
aplicações de processamento de linguagem natural. Como
ferramenta que é frequentemente integrada em software
mais específico, apresentamos um módulo Perl
desenvolvido com o objectivo de facilitar a interligação
do analisador morfológico com pequenas aplicações
desenvolvidas em linguagens de scripting. Devido à
constante necessidade de melhoramento de dicionários, e
em particular dos analisadores morfológicos, discutimos
as propriedades que estes devem conter para facilitar o
seu processamento e enriquecimento automático.
- docpage
- jj.bib.dp.html#jspell2002
- booktitle
- Actas do XVII Encontro da Associação Portuguesa de Linguística
- title
- Jspell.pm -- um módulo de análise morfológica
para uso em processamento de linguagem natural
- chave
- jspell2002
- tipo
- inproceedings
- author
- Alberto M. Simões
- J.J. Almeida
- chave
- dag2002
- author
- Alberto M. Simões
- J.J. Almeida
- Pedro R. Henriques
- tipo
- inproceedings
- title
- Directory Attribute Grammars
- booktitle
- VI Simpósio Brasileiro de Linguagens de Programação
- docpage
- jj.bib.dp.html#dag2002
- pages
- 297--308
- address
- Rio de Janeiro, Brasil
- year
- 2002
- booktitle
- Elpub 2002 -- International Conference on Electronic Publishing
- docpage
- jj.bib.dp.html#elpub2002
- month
- Nov.
- abstract
- In recent years the number of digital documents has increased
dramatically. Unfortunately the same did not occur with the
structure and organization of the information. Traditionally we
built a digital library using a catalog with documents'
meta-information including a conceptual classification and an
ontology of concepts.
In this document we present a set of modules to help in the task of
building and maintaining a digital library. It includes a module to
work with ontologies, a set of modules to handle specific catalog
formats (like Bib\TeX), a module to define new catalog formats
and a tool to integrate ontologies and multi-format
catalogs in a web browse-able knowledge-base.
- year
- 2002
- pages
- 203--211
- address
- Karlovy Vary, República Checa
- author
- Alberto M. Simões
- J.J. Almeida
- tipo
- inproceedings
- chave
- elpub2002
- isbn
- 3-89700-357-0
- title
- Library::* -- a toolkit for digital libraries
- month
- Sep.
- abstract
- Multilingual resources are useful for linguistic studies, translation,
and many other tasks. Unfortunately, these resources are difficult to obtain
and organize.
In this document we describe a set of tools designed to help in the
task of mining bilingual resources from the web, from a specific site,
from a file system, from a list of URLs, or from a translation memory.
As a design goal we intend to build tools that can be used both
cooperatively (in a pipeline) and also in an independent way.
- year
- 2002
- chave
- parguess2002
- author
- J.J. Almeida
- Alberto M. Simões
- J. Alves de Castro
- pages
- 13--20
- journal
- Procesamiento del Lenguaje Natural
- docpage
- jj.bib.dp.html#parguess2002
- volume
- 29
- title
- Grabbing parallel corpora from the web
- publisher
- Sociedad Española para el Procesamiento del Lenguaje Natural
- tipo
- article
- number
- 3
- journal
- The Perl Review
- docpage
- jj.bib.dp.html#cP
- volume
- 0
- title
- Cooking Perl with flex
- tipo
- article
- year
- 2002
- month
- May
- abstract
- There are a lot of tools for parser generation using Perl. As we
know, Perl has flexible data structures which makes it easy to
generate generic trees. While it is easy to write a grammar and a
lexical analyzer using modules like \texttt{Parse::Yapp
- chave
- cP
- author
- docpage
- jj.bib.dp.html#APL2k2.Parguess
- booktitle
- Actas do XVIII Encontro da Associação Portuguesa de Linguística
- title
- Extracção de corpora paralelo a partir da web: construção e
- tipo
- inproceedings
- lang
- PT
- year
- 2003
- abstract
- Ao longo deste documento descrever-se-á um conjunto de ferramentas
construídas para extracção automática de recursos bilingues a partir
da Web, a partir de um \emph{site
- address
- Porto 2002
- url
- http://alfarrabio.di.uminho.pt/~albie/publications/APL2k2.Parguess.pdf
- author
- J.J. Almeida
- Alberto Manuel Simões
- José Alves Castro
- chave
- APL2k2.Parguess
- booktitle
- Actas do XVIII Encontro da Associação Portuguesa de Linguística
- docpage
- jj.bib.dp.html#APL2k2.Synthesis
- title
- Geração de voz com sotaque
- tipo
- inproceedings
- lang
- PT
- abstract
- Como é sabido os sotaques podem estar ligados a uma zona geográfica,
a um grupo social, podem até ser uma característica pessoal. O seu
estudo e descrição tem interessado muitos investigadores embora
normalmente esse estudo tenha sido feito de modo pouco formal.
No trabalho que aqui se relata, tentou-se descrever formalmente
sotaques e disfunções através de criação de regras a integrar como
variantes num gerador de voz. Deste modo, pretendeu-se criar um
ambiente de experimentação dos modelos construídos para descrever
algumas características de certos sotaques ou certas disfunções, de
modo a permitir a sua validação.
Constatou-se que se consegue obter certas disfunções e certos
sotaques com facilidade por simples acrescento de regras opcionais
em certas fases da geração da voz. Outros, aparentam ser de maior
dificuldade, ou por não conhecermos suficientemente bem os fenómenos
neles envolvidos ou por envolverem maior complexidade prosódica.
- year
- 2003
- address
- Porto 2002
- url
- http://alfarrabio.di.uminho.pt/~albie/publications/APL2k2.Synthesis.pdf
- author
- J.J. Almeida
- Alberto Manuel Simões
- chave
- APL2k2.Synthesis
- tipo
- inproceedings
- lang
- PT
- title
- Engenharia reversa de {HTML} usando tecnologia {XML}
- docpage
- jj.bib.dp.html#xata:xmldt
- booktitle
- {XATA --- XML}, Aplicações e Tecnologias Associadas
- author
- J.J. Almeida
- Alberto Manuel Simões
- chave
- xata:xmldt
- irreditor
- José Carlos Ramalho
- url
- http://alfarrabio.di.uminho.pt/~albie/publications/xata2003xml.pdf
- year
- 2003
- abstract
- O proliferar de ferramentas criadoras de HTML e o uso
de HTML guiado pelo aspecto, tem vindo a arruinar o
seu lado conceptual. Este problema foi reconhecido e
deu origem a vários formatos ou tecnologias com o
objectivo de separar o aspecto do conceito. No
entanto a realidade actual mostra uma enorme
quantidade de páginas HTML com péssima leitura
conceptual e estrutural, invalidando uma série de
usos possíveis da informação nelas contida. Nesta
comunicação apresenta-se um trabalho (em fase
inicial) que pretende fazer engenharia reversa de
HTML para permitir aumentar a sua acessibilidade, a
fim de ser usada num \emph{browser
- chave
- xata:museudapessoa
- author
- Alberto Manuel Simões
- J.J. Almeida
- url
- http://alfarrabio.di.uminho.pt/~albie/publications/xata2003mp.pdf
- editor
- José Carlos Ramalho
- abstract
- Este artigo apresenta a arquitectura actual do Museu da Pessoa,
contemplando a forma como os documentos estão a ser editados,
catalogados, arquivados, e processados para a criação das estruturas
necessárias ao Museu.
- year
- 2003
- lang
- PT
- tipo
- inproceedings
- title
- {H
- booktitle
- {XATA --- XML}, Aplicações e Tecnologias Associadas
- docpage
- jj.bib.dp.html#xata:museudapessoa
- pages
- 288--298
- booktitle
- ElPub 2003 -- International conference on electronic publishing
- docpage
- jj.bib.dp.html#elpub2003
- title
- Music publishing
- publisher
- Universidade do Minho
- note
- Guimarães
- tipo
- inproceedings
- lang
- EN
- isbn
- 972-98921-2-1
- abstract
- Current music publishing on the Internet is mainly concerned with
sound publishing. We claim that music publishing is not only to make
sound available but also to define relations between a set of music
objects like music scores, guitar chords, lyrics and their
meta-data. We want an easy way to publish music on the Internet, to
make high-quality paper booklets and even to create audio CDs.
In this document we present a workbench for music publishing based
on open formats, using open-source tools and script programming over
them. The workbench is based on an archive specification written in
a text-based format which includes sound references, music scores,
chords and lyrics and their meta-information.
- month
- June
- year
- 2003
- editor
- Sely Costa et al.
- url
- http://alfarrabio.di.uminho.pt/~albie/publications/elpub2003.pdf
- keyword
- música
- bibliotecas digitais
- inproceedings
- author
- Alberto Manuel Simões
- J.J. Almeida
- chave
- elpub2003
- abstract
- O projecto TerminUM tem como objectivos principais o
estudo, experimentação e a criação de recursos na
área dos corpora paralelos, terminologia
(descritiva) e recursos multilingues ligados a
corpora: fazer extracção tão automática quanto
possível de corpora a partir da web; fazer extracção
de dicionários, de terminologia e de outros recursos
ligados à tradução; criar e interligar as
ferramentas desenvolvidas; criar e disponibilizar:
(1) listas de Bitextos, corpora e corpora paralelos,
(2) ferramentas de criação e transformação de
corpora, (3) recursos multilingues derivados/ligados
a corpora. Nesta apresentação serão abordadas
algumas tarefas presentemente a decorrer no âmbito
do projecto, nomeadamente: ciclo de vida da
construção e transformação de corpora; resumo das
ferramentas desenvolvidas (e em desenvolvimento);
construção de corpora paralelos tomando como base
legendas de filmes (subtitles), ficheiros de
internacionalização (mensagens de software .po) e
ficheiros de memórias de tradução (TMX); animação de
corpora paralelos via web (criação de motores de
consulta usando diversas ferramentas).
- month
- Jun.
- year
- 2003
- url
- http://alfarrabio.di.uminho.pt/~albie/publications/cp3a2003-terminum.pdf
- keyword
- terminum
- parallel corpora
- inproceedings
- author
- J.J. Almeida
- Alberto Simões
- José Castro
- Bruno
- Paulo Silva
- chave
- cp3a:terminum2003
- pages
- 7--14
- booktitle
- CP3A 2003 -- Workshop em Corpora Paralelos: aplicações e
algoritmos associados
- docpage
- jj.bib.dp.html#cp3a:terminum2003
- title
- Projecto {TerminUM}
- publisher
- Universidade do Minho
- note
- Braga
- tipo
- inproceedings
- month
- Jun.
- year
- 2003
- pages
- 65--70
- booktitle
- CP3A 2003 -- Workshop em Corpora Paralelos: aplicações e
algoritmos associados
- docpage
- jj.bib.dp.html#cp3a:kvec2003
- keyword
- kvec
- terminum
- parallel corpora
- word alignment
- inproceedings
- title
- {Lingua-Biterm}: um módulo Perl para extracção de terminologia bilingue
- publisher
- Universidade do Minho
- note
- Braga
- author
- tipo
- inproceedings
- chave
- cp3a:kvec2003
- chave
- cp3a:natools2003
- note
- Braga
- publisher
- Universidade do Minho
- author
- tipo
- inproceedings
- title
- Alinhamento de corpora paralelos
- keyword
- natools
- terminum
- parallel corpora
- word alignment
- inproceedings
- booktitle
- CP3A 2003 -- Workshop em Corpora Paralelos: aplicações e
algoritmos associados
- docpage
- jj.bib.dp.html#cp3a:natools2003
- pages
- 71--77
- month
- Jun.
- year
- 2003
- volume
- 31
- docpage
- jj.bib.dp.html#sepln2003
- journal
- Procesamiento del Lenguaje Natural
- pages
- 217--224
- tipo
- article
- publisher
- Sociedad Española para el Procesamiento del Lenguaje Natural
- title
- {NATools} -- A Statistical Word Aligner Workbench
- year
- 2003
- month
- Sep.
- abstract
- This document presents the TerminUM project and the
work done in its statistical word aligner workbench (NATools). It
shows a variety of alignment methods for parallel corpora and
discusses the resulting terminological dictionaries and their use:
evaluation of sentence translations; construction of a multi-level
navigation system for linguistic studies or statistical
- author
- Alberto M. Simões
- J.J. Almeida
- chave
- sepln2003
- keyword
- natools
- terminum
- parallel corpora
- word alignment
- article
- author
- José João Dias de Almeida
- chave
- tesejj
- url
- http://natura.di.uminho.pt/~jj/bib/tesejj.pdf
- year
- 2003
- tipo
- phdthesis
- lang
- PT
- title
- Dicionários dinâmicos multi-fonte
- docpage
- jj.bib.dp.html#tesejj
- school
- Universidade do Minho
- type
- Tese de Doutoramento
- superviser
- Pedro Rangel Henriques
- year
- 2004
- url
- http://alfarrabio.di.uminho.pt/~albie/publications/msc.pdf
- chave
- teseambs
- author
- Alberto Manuel Brandão Simões
- docpage
- jj.bib.dp.html#teseambs
- superviser
- José João Almeida and Pedro Rangel Henriques
- type
- Tese de Mestrado
- school
- Escola de Engenharia - Universidade do Minho
- title
- Parallel Corpora word alignment and applications
- lang
- EN
- tipo
- mastersthesis
- title
- {TX
- lang
- PT
- isbn
- 972-99166-0-8
- tipo
- inproceedings
- pages
- 217--224
- booktitle
- {XATA 2004
- docpage
- jj.bib.dp.html#xata04:tx
- irreditor
- José Carlos Ramalho and Alberto Simões
- chave
- xata04:tx
- author
- José João Almeida
- Alberto Simões
- month
- February
- abstract
- Desde o advento do SGML e posteriormente do XML, que a validação de
documentos tem sido focada.
Esta validação surgiu para analisar a estrutura dos documentos SGML
e XML usando DTDs. Além dessa, e devido às restrições do XML em
relação ao SGML, a validação de XML bem formado também tem sido
usada. Mais recentemente, os Schema e Schematron vieram permitir a
validação a um nível superior: não só a estrutura do documento mas
também alguma validação de conteúdo.
Neste artigo apresentamos a ferramenta TX que visa outro nível de
validação, em que os tipos possam ser mais ricos e/ou calculados
dinamicamente, e onde se possa definir funções de anotação e/ou
correcção das porções do documento que não sigam as especificações.
- year
- 2004
- url
- http://alfarrabio.di.uminho.pt/~albie/publications/xata04-tx.pdf
- year
- 2004
- month
- February
- abstract
- Neste documento apresenta-se o conceito de memórias de tradução
distribuídas, discutindo-se o seu interesse na área da tradução, bem
como as vantagens que uma ferramenta de tradução pode tirar do seu
É apresentada uma possível implementação de memórias de tradução
distribuídas usando WebServices numa arquitectura de cooperativismo.
São definidas as mensagens (API) que um serviço deste género deve
implementar para que uma ferramenta de tradução possa tirar partido
da colaboração entre tradutores.
- url
- http://alfarrabio.di.uminho.pt/~albie/publications/xata04-mtd.pdf
- irreditor
- José Carlos Ramalho and Alberto Simões
- author
- Alberto Simões
- José João Almeida
- Xavier Gomez
- chave
- xata04:mtd
- pages
- 59--68
- docpage
- jj.bib.dp.html#xata04:mtd
- booktitle
- {XATA 2004
- title
- Memórias de Tradução Distribuídas
- tipo
- inproceedings
- isbn
- 972-99166-0-8
- lang
- PT
- year
- 2004
- number
- 1
- volume
- 1
- docpage
- jj.bib.dp.html#xmldt2
- journal
- The Perl Review
- title
- {XML::DT
- tipo
- article
- author
- chave
- xmldt2
- tipo
- article
- publisher
- Sociedad Española para el Procesamiento del Lenguaje Natural
- lang
- EN
- title
- Distributed Translation Memories implementation using WebServices
- volume
- 33
- docpage
- jj.bib.dp.html#sepln2004
- journal
- Procesamiento del Lenguaje Natural
- pages
- 89--94
- author
- Alberto Simões
- Xavier Gómez Guinovart
- J.J. Almeida
- chave
- sepln2004
- keyword
- TMs
- MT
- distributed translation memories
- WebServices
- article
- url
- http://alfarrabio.di.uminho.pt/~albie/publications/dtm-sepln.pdf
- year
- 2004
- abstract
- Translation Memories are very useful for translators
but are difficult to share and reuse in a community of translators.
This article presents the concept of Distributed Translation
Memories, where all users can contribute and share translations.
Implementation details using WebServices are shown, as well as an
example of a distributed system between Portugal and Spain.
- month
- July
- lang
- EN
- tipo
- inproceedings
- title
- Linguateca: um centro de recursos distribuído para o processamento
computacional da língua portuguesa
- booktitle
- Workshop on Linguistic Tools and Resources for Spanish and
- docpage
- jj.bib.dp.html#linguateca
- pages
- 147--154
- chave
- linguateca
- author
- Diana Santos
- Alberto Simões
- Ana Frankenberg-Garcia
- Ana Pinto
- Anabela Barreiro
- Belinda Maia
- Cristina Mota
- Débora
- Eckhard Bick
- Elisabete Ranchhod
- J.J. Almeida
- Luís Cabral
- Luís Costa
- Luís Sarmento
- Marcirio Chaves
- Nuno
- Paulo Rocha
- Rachel Aires
- Rosário Silva
- Rui Vilela
- Susana Afonso
- editor
- url
- http://alfarrabio.di.uminho.pt/~albie/publications/linguateca.pdf
- address
- Puebla, México
- abstract
- This article gives an overview of Linguateca's work in creating and
distributing resources and tools for the Portuguese language. We begin
with a description of Linguateca's goals and assumptions and a brief history
of its activity, and conclude with some considerations on the best way
to proceed in organizing the field.
- year
- 2004
- tipo
- inproceedings
- author
- Rui Vilela
- Alberto Simões
- Eckhard Bick
- J.J. Almeida
- publisher
- Departamento de Informática, Universidade do Minho
- location
- Braga
- chave
- xata05:fs
- keyword
- Floresta Sintáctica
- tigerXML
- Lingua::PT::Dirty
- inproceedings
- title
- Representação em {XML} da {F}loresta {S}intáctica
- irreditor
- José Carlos Ramalho and Alberto Simões and João Correia Lopes
- docpage
- jj.bib.dp.html#xata05:fs
- booktitle
- XATA 2005, Aplicações e Tecnologias Associadas
- year
- 2005
- month
- Fev.
- year
- 2005
- month
- Fev.
- docpage
- jj.bib.dp.html#xata05:tdt
- booktitle
- XATA 2005, Aplicações e Tecnologias Associadas
- keyword
- inproceedings
- title
- Inferência de tipos em documentos {XML}
- irreditor
- José Carlos Ramalho and Alberto Simões and João Correia Lopes
- tipo
- inproceedings
- author
- J.J. Almeida
- Alberto Simões
- publisher
- Departamento de Informática, Universidade do Minho
- location
- Braga
- chave
- xata05:tdt
- pages
- 376--377
- address
- Portalegre
- month
- Fev.
- year
- 2006
- booktitle
- XATA 2006, Aplicações e Tecnologias Associadas
- docpage
- jj.bib.dp.html#xata06:navegante
- ote
- poster
- irreditor
- José Carlos Ramalho and Alberto Simões and João Correia Lopes
- title
- Navegante: um proxy de ordem superior para navegação intrusiva
- keyword
- inproceedings
- chave
- xata06:navegante
- author
- J.J. Almeida
- Alberto Simões
- publisher
- tipo
- inproceedings
- irreditor
- José Carlos Ramalho and Alberto Simões and João Correia Lopes
- keyword
- inproceedings
- chave
- xata06:xmlauto
- author
- J.J. Almeida
- Alberto Simões
- address
- Portalegre
- year
- 2006
- month
- Fev.
- abstract
- It is widely agreed that XML has been gaining a relevant place
as a language for structuring documents. The advantage
of using XML as an interchange language is also evident.
However, its syntax is
too verbose, so generating documents by hand
is painful, and it is useful to have modules
that simplify this task.
In this article we propose a Perl module (XML::Writer::Simple), configurable via
a DTD, that simplifies the task of generating XML.
- url
- http://alfarrabio.di.uminho.pt/~albie/publications/xata2006-xmlwritersimple.pdf
- title
- Geração dinâmica de {API
- isbn
- 972-99166-2-4
- lang
- PT
- tipo
- inproceedings
- publisher
- pages
- 307--314
- docpage
- jj.bib.dp.html#xata06:xmlauto
- booktitle
- {XATA 2006
- abstract
- Parallel corpora are important resources for most
Natural Language Processing tasks. From common
applications, like machine translation, to
usually monolingual tasks such as paraphrase detection
and word sense disambiguation, most researchers are
using massive parallel corpora. Thus, the
availability of an efficient way to manage them is
very important. This paper presents a client-server
architecture to efficiently query parallel corpora
and probabilistic translation dictionaries.
- month
- September
- year
- 2006
- address
- Zaragoza, Spain
- url
- http://alfarrabio.di.uminho.pt/~albie/publications/sepln06.pdf
- author
- Alberto Simões
- J. João Almeida
- chave
- sepln06
- pages
- 91--97
- volume
- 37
- docpage
- jj.bib.dp.html#sepln06
- journal
- Procesamiento del Lenguaje Natural
- title
- {NatServer:
- tipo
- article
- lang
- EN
- url
- http://alfarrabio.di.uminho.pt/~albie/publications/eamt06.pdf
- editor
- Jan Tore Lønning and Stephan Oepen
- address
- Oslo, Norway
- shortin
- year
- 2006
- month
- 19--20, June
- abstract
- One of the bottlenecks of example-based machine
translation (EBMT) is automatically amassing
quantities of good examples. In our
work in EBMT, we are investigating how far one can
go by performing example extraction from parallel
corpora using Probabilistic Translation Dictionaries
to obtain example segmentation points. In fact, the
success of EBMT depends heavily on the quality
and quantity of the examples, but also on their length. Thus, we
give special importance to methods that extract
examples of different sizes from the same translation
unit. With this article we show that it is possible
to extract quantities of examples from parallel
corpora using only probabilistic translation
dictionaries extracted from the same corpora.
- chave
- eamt06
- author
- Alberto Simões
- J. João Almeida
- docpage
- jj.bib.dp.html#eamt06
- booktitle
- 11th Annual Conference of the European Association for Machine Translation
- pages
- 27--32
- isbn
- 82-7368-294-3
- lang
- EN
- tipo
- inproceedings
- title
- Combinatory Examples Extraction for Machine Translation
- docpage
- jj.bib.dp.html#lrec06
- booktitle
- Fifth international conference on Language Resources and Evaluation, LREC 2006
- title
- {$T_2O$
- lang
- EN
- tipo
- inproceedings
- address
- Genova, Italy
- shortin
- year
- 2006
- abstract
- In this article we present $T_2O$ --- a workbench to
assist the process of translating heterogeneous
resources into ontologies, to enrich and add
multilingual information, to help programming with
them, and to support ontology publishing. $T_2O$ is
an ontology algebra.
- month
- May
- url
- http://alfarrabio.di.uminho.pt/~albie/publications/lrec06.pdf
- chave
- lrec06
- author
- José João Almeida
- Alberto Simões
- url
- http://alfarrabio.di.uminho.pt/~albie/publications/elpub06-t2o.pdf
- address
- Bansko, Bulgaria
- month
- June
- abstract
- Dictionaries and thesauri are valuable resources for
Natural Language Processing but are not as
freely available as expected, especially for
languages other than English and, when they exist,
they are available only for querying online. Our
main goal with T2O --- Thesaurus to Ontology
framework --- is to create a multilingual ontology:
freely available online and to download; with a
computer readable format; with a good API; with a
structure as rich as possible; reusing all the
structured information we can get;
- year
- 2006
- chave
- elpub06-t2o
- author
- J. João Almeida
- Alberto Simões
- booktitle
- {ElPub 2006
- docpage
- jj.bib.dp.html#elpub06-t2o
- pages
- 373--374
- lang
- EN
- note
- poster
- tipo
- inproceedings
- title
- Publishing multilingual ontologies: a quick way of obtaining feedback
- url
- http://alfarrabio.di.uminho.pt/~albie/publications/elpub06-blind.pdf
- shortin
- {ElPub
- address
- Bansko, Bulgaria
- abstract
- True accessibility requires minimizing the scanning time
needed to find a particular piece of
information. Sequentially reading web pages does not
provide this type of accessibility: for instance,
before users get to the actual text content of
the page they have to go through a lot of menus and
headers. However, if the user could navigate a web
page through semantically classified blocks,
then the user could jump faster to the actual
content of the page, skipping all the menus and
other parts of the page. We propose a transcoding
engine that tackles accessibility at two distinct,
yet complementary, levels: for specific known sites
and general unknown sites. We present a tool for
building customized scripts for known sites that
turns this process into an extremely simple task,
which can be performed by anyone, without any
expertise. For general unknown sites, our approach
relies on statistical analysis of the structural
blocks that define a web page to infer the semantics
of each block.
- month
- June
- year
- 2007
- chave
- elpub06-blind
- author
- Alberto Simões
- Anália Lourenço
- José João Almeida
- booktitle
- The 31st Annual Conference of the German Classification Society on Data Analysis, Machine Learning, and Applications
- docpage
- jj.bib.dp.html#elpub06-blind
- pages
- 123--134
- lang
- EN
- note
- forthcoming
- tipo
- inproceedings
- title
- Mining Classical Music Scores for Epoch Classification
- docpage
- jj.bib.dp.html#avalon:jspell
- booktitle
- Avaliação conjunta: um novo paradigma no processamento computacional da língua portuguesa
- pages
- 83--90
- tipo
- incollection
- publisher
- IST Press
- title
- Jspellando nas morfolimpíadas: Sobre a participação do {Jspell
- editor
- Diana Santos
- year
- 2007
- shortin
- Avaliação conjunta, cap. 8
- author
- José João Almeida
- Alberto Simões
- chave
- avalon:jspell
- editor
- Diana Santos
- year
- 2007
- shortin
- Avaliação conjunta, cap. 18
- author
- Alberto Simões
- José João Almeida
- chave
- avalon:avalinha
- docpage
- jj.bib.dp.html#avalon:avalinha
- booktitle
- Avaliação conjunta: um novo paradigma no processamento computacional da língua portuguesa
- pages
- 219--230
- tipo
- incollection
- publisher
- IST Press
- title
- Avaliação de alinhadores
- editor
- José Carlos Ramalho and João Correia Lopes and Luís Carríço
- url
- http://alfarrabio.di.uminho.pt/~albie/publications/xmlyamljson07.pdf
- shortin
- year
- 2007
- abstract
- month
- February
- institution
- Universidade do Minho, Departamento de Informática
- chave
- xata07:xmltmx
- author
- Rúben Fonseca
- Alberto Simões
- irreditor
- José Carlos Ramalho and João Correia Lopes and Luís Carríço
- keyword
- docpage
- jj.bib.dp.html#xata07:xmltmx
- booktitle
- {XATA 2007
- type
- Manual
- pages
- 33--46
- isbn
- 978-972-99166-4-9
- tipo
- inproceedings
- title
- Alternativas ao {XML
- chave
- MP07
- author
- Alberto Simões
- Rúben Fonseca
- José João Almeida
- address
- Rennes, France
- year
- 2007
- abstract
- Some processes are not easy to program from scratch for
parallel machines (clusters), but can be easily split into simple
steps. Makefile::Parallel is a tool which lets users specify how processes
depend on each other.
The language syntax resembles the well-known
Makefile format, but instead of specifying file or target
dependencies, Makefile::Parallel specifies process (or job) dependencies.
The scheduler submits jobs to the cluster scheduler (in our case,
Rocks PBS) and waits for them to end. When each process finishes,
dependencies are calculated and directly dependent jobs are submitted.
The Makefile::Parallel language includes features to specify parametric rules,
and to split and join process dependencies. Some tasks can be split
into n smaller jobs working on different portions of files. At the
end, another process can be used to join the results.
- month
- August
- editor
- Anne-Marie Kermarrec and Luc Bougé and Thierry Priol
- title
- {Makefile::Parallel
- tipo
- inproceedings
- publisher
- Springer-Verlag
- pages
- 33--41
- docpage
- jj.bib.dp.html#MP07
- booktitle
- Euro-Par 2007
- series
- volume
- 4641
- chave
- epia-bio-2007
- tipo
- inproceedings
- author
- Anália Lourenço
- Alberto Simões
- José João Almeida
- Miguel Rocha
- Isabel Rocha
- Eugénio Ferreira
- title
- An Ontology-Based Approach To Systems Biology Literature
Retrieval and Processing
- irreditor
- José Neves and Manuel Filipe Santos and José Manuel Machado
- docpage
- jj.bib.dp.html#epia-bio-2007
- booktitle
- New Trends in Artificial Intelligence
- shortin
- Epia, CMBSB
- pages
- 541--552
- year
- 2007
- abstract
- This paper details the \emph{SysBio Explorer
- month
- December
- shortin
- Epia, TEMA
- pages
- 791--799
- year
- 2007
- abstract
- Music Classification is a particular area
of Computational Musicology that provides valuable
insights into the evolution of composition patterns
and assists in catalogue generation. The proposed work
departs from earlier work by classifying music based
on music score information. Text Mining techniques
support music score processing while Classification
techniques are used in the construction of decision
models. Although the research is still at an early
stage, the work already provides valuable
contributions to symbolic music representation processing
and subsequent analysis. Score processing involved
the counting of ascending and descending chromatic
intervals, note duration and meta-information
tagging. Analysis involved feature selection and
the evaluation of several data mining algorithms,
ensuring extensibility towards larger repositories or
more complex problems. Experiments report the analysis
of composition epochs on a subset of the Mutopia project
open archive of classical LilyPond-annotated
music scores.
- month
- December
- docpage
- jj.bib.dp.html#epia-music-2007
- booktitle
- New Trends in Artificial Intelligence
- title
- Using Text Mining Techniques for Classical Music Scores
- irreditor
- José Neves and Manuel Filipe Santos and José Manuel Machado
- chave
- epia-music-2007
- tipo
- inproceedings
- author
- Alberto Simões
- Anália Lourenço
- José João Almeida
- note
- Documentação e actas do HAREM, a primeira avaliação conjunta na
- publisher
- Linguateca
- tipo
- incollection
- title
- booktitle
- Reconhecimento de entidades mencionadas em português
- docpage
- jj.bib.dp.html#harem:rena
- pages
- 157--172
- chave
- harem:rena
- author
- irreditor
- Diana Santos and Nuno Cardoso
- url
- http://acdc.linguateca.pt/aval_conjunta/LivroHAREM/Cap13-SantosCardoso2007-Almeida.pdf
- shortin
- year
- 2007
- title
- Parallel Corpora based Translation Resources Extraction
- lang
- EN
- tipo
- article
- pages
- 265--272
- docpage
- jj.bib.dp.html#sepln07
- journal
- Procesamiento del Lenguaje Natural
- volume
- 39
- chave
- sepln07
- author
- Alberto Simões
- José João Almeida
- year
- 2007
- month
- September
- abstract
- This paper describes NATools, a toolkit to process,
analyze and extract translation resources from
Parallel Corpora. It includes tools like a
sentence-aligner, a probabilistic translation
dictionaries extractor, word-aligner, a corpus
server, a set of tools to query corpora and
dictionaries, as well as a set of tools to extract
bilingual resources.
- booktitle
- {XATA 2008
- docpage
- jj.bib.dp.html#cgiauto08
- pages
- 22--27
- tipo
- inproceedings
- isbn
- 978-972-99166-5-6
- title
- {CGI::Auto
- url
- http://alfarrabio.di.uminho.pt/~albie/publications/cgiauto08.pdf
- month
- February
- abstract
- The creation of a CGI or a WebService as an interface for
a command line tool is
not as unusual as it may seem. It is extremely common and useful.
There are applications developed as command line tools that can be useful for
different purposes,
and different kinds of users. Some of these users might not be able to run
these tools directly.
For instance, it
is not easy to install a bunch of Perl modules to have a small tool working.
For these situations, it is easier to make the tool available on the Web or as
The problem with making the tool available in these ways is that
programmers tend to rewrite
the tools to incorporate the CGI or XML specific layers.
We argue that these CGI or WebService interfaces should use the already
available command line
tool, without any change. This interface should be able to read a simple
specification of how the command line tool works, and build the CGI or XML
specific layers
The CGI::Auto module serves this purpose:
to encapsulate command line tools in a CGI layer based on a textual
specification, transforming
the command line tool into a web application.
- year
- 2008
- author
- Davide Sousa
- Alberto Simões
- José João Almeida
- chave
- cgiauto08
- irreditor
- José Carlos Ramalho and João Correia Lopes and Salvador
- irreditor
- José Carlos Ramalho and João Correia Lopes and Salvador
- author
- Nuno Carvalho
- José João Almeida
- Alberto Simões
- chave
- navegante08
- year
- 2008
- abstract
- NAVEGANTE is a generic framework to build higher-order
proxies for
intrusive browsing. This framework provides the means for developing
tools that behave as proxies, but perform some processing task on
the content that is being browsed. In parallel with this content processing,
applications can also run other user-defined functions with different
purposes and interfaces, which we explain later. Currently,
NAVEGANTE only builds applications that run as CGIs, but this is intended
to change in the near future. Applications are built by writing programs in
NAVEGANTE's Domain Specific Language (DSL).
NAVEGANTE is a work in progress. This article describes the current
state of development: what applications can be built and how. We also
identify some implementation problems, and briefly discuss some future
improvements. Finally, we illustrate most of the concepts described
using a couple of case studies.
- month
- February
- url
- http://alfarrabio.di.uminho.pt/~albie/publications/navegante08.pdf
- title
- tipo
- inproceedings
- isbn
- 978-972-99166-5-6
- pages
- 52--63
- docpage
- jj.bib.dp.html#navegante08
- booktitle
- {XATA 2008
- title
- Bilingual Terminology Extraction based on Translation
- tipo
- article
- lang
- EN
- pages
- 281--288
- volume
- 41
- journal
- Procesamiento del Lenguaje Natural
- docpage
- jj.bib.dp.html#sepln08
- author
- Alberto Simões
- José João Almeida
- chave
- sepln08
- year
- 2008
- month
- September
- abstract
- Parallel corpora are rich sources of translation
resources. This document presents a methodology for the extraction
of bilingual
nominals (terminology candidates) from parallel corpora, using
translation patterns.
The patterns proposed in this work specify the order changes that
occur during translation
and that are intrinsic to the syntax of the languages involved.
These patterns are described in a domain specific language
named PDL (Pattern Description Language), and are extremely
efficient for the detection of nominal phrases.
- year
- 2008
- pages
- 35--42
- docpage
- jj.bib.dp.html#propor-apslt08
- booktitle
- Applications of Portuguese Speech and Language Technologies,
PROPOR 2008 Special session
- title
- A Textual Rewriting system for NLP
- irreditor
- António Teixeira and Daniela Braga
- tipo
- inproceedings
- author
- J. J. Almeida
- Alberto Simões
- chave
- propor-apslt08
- editor
- Luis Seabra Lopes and
Nuno Lau and
Pedro Mariano and
Luis Mateus Rocha
- url
- http://dx.doi.org/10.1007/978-3-642-04686-5_33
- year
- 2009
- author
- Brett Drury
- J. J. Almeida
- chave
- epia:DruryA09
- series
- Lecture Notes in Computer Science
- volume
- 5816
- docpage
- jj.bib.dp.html#epia:DruryA09
- booktitle
- pages
- 400--410
- tipo
- inproceedings
- note
- Progress in Artificial Intelligence, EPIA 2009, Aveiro, Portugal, October 12-15
- publisher
- Springer
- title
- Construction of a Local Domain Ontology from News Stories
- isbn
- 978-989-96278-1-9
- lang
- EN
- tipo
- inproceedings
- title
- Bilingual Example Segmentation based on Markers
- docpage
- jj.bib.dp.html#markers09
- booktitle
- I Iberian SLTech 2009
- pages
- 95--98
- chave
- markers09
- author
- Alberto Simões
- José João Almeida
- editor
- António Teixeira and Miguel Sales Dias and Daniela Braga
- address
- Porto Salvo, Portugal
- year
- 2009
- abstract
- The Marker Hypothesis was first defined by Thomas Green
in 1979. It
is a psycholinguistic hypothesis stating that there is a set of
words in every language that marks the boundaries of phrases in a
sentence. While it remains a hypothesis because nobody has proved
it, tests have shown that results are comparable to basic shallow
parsers, with higher efficiency.
The chunking algorithm based on the Marker Hypothesis is simple,
fast and almost language independent. It depends on a list of
closed-class words, which are already available for most languages.
This makes it suitable for bilingual chunking (there is no
requirement for a separate shallow parser for each language).
This paper discusses the use of the Marker Hypothesis combined
with Probabilistic Translation Dictionaries for extracting example-based machine
translation resources from parallel corpora.
- month
- September, 3--4
- booktitle
- {XATA 2010
- docpage
- jj.bib.dp.html#xata2010-rewritexml
- pages
- 27--38
- lang
- EN
- tipo
- inproceedings
- title
- Processing {XML:
- editor
- Alberto Simões and Daniela da Cruz and José Carlos Ramalho
- address
- Vila do Conde
- abstract
- Nowadays XML processing is performed using one of
two approaches: using SAX (Simple API for XML)
or using the DOM (Document Object Model). While
these two approaches are adequate for most cases,
there are situations where other approaches can make
the solution easier to write, read and, therefore,
to maintain. This document presents a rewriting
approach for processing XML documents, focusing on
the task of transforming XML documents (into other
XML formats or other textual documents) and the task
of rewriting other textual formats into XML
dialects. These approaches were validated with some
case studies, ranging from an XML authoring tool to
a dictionary publishing mechanism.
- month
- Maio
- year
- 2010
- chave
- xata2010-rewritexml
- author
- Alberto Simões
- José João Almeida
- title
- A Case Study of Rule Based and Probabilistic Word Error Correction of
Portuguese OCR Text in a "Real World" Environment for Inclusion in a Digital
- tipo
- article
- note
- presented in CICLING2010
- umber
- 1-2
- olume
- 1
- pages
- 307--315
- journal
- International Journal of Computational Linguistics
- docpage
- jj.bib.dp.html#ocr2010
- author
- Brett Drury
- José João Almeida
- chave
- ocr2010
- year
- 2010
- url
- editor
- Nicoletta Calzolari and others
- address
- Valletta, Malta
- shortin
- language
- english
- year
- 2010
- month
- may
- chave
- lrec10:bigorna
- author
- José João Almeida
- André Santos
- Alberto Simões
- docpage
- jj.bib.dp.html#lrec10:bigorna
- booktitle
- Proceedings of the Seventh conference on International Language
Resources and Evaluation (LREC'10)
- isbn
- 2-9517408-6-7
- tipo
- inproceedings
- publisher
- European Language Resources Association (ELRA)
- title
- Bigorna -- A Toolkit for Orthography Migration Challenges
- date
- 19-21
- booktitle
- Proceedings of the Seventh conference on International Language
Resources and Evaluation (LREC'10)
- docpage
- jj.bib.dp.html#lrec10:dicaberto
- date
- 19-21
- title
- Processing and Extracting Data from Dicionário Aberto
- publisher
- European Language Resources Association (ELRA)
- tipo
- inproceedings
- isbn
- 2-9517408-6-7
- month
- may
- year
- 2010
- language
- english
- shortin
- address
- Valletta, Malta
- editor
- Nicoletta Calzolari and others
- author
- Alberto Simões
- José João Almeida
- Rita Farinha
- chave
- lrec10:dicaberto
- pages
- 50--55
- docpage
- jj.bib.dp.html#bucc2010
- booktitle
- BUCC2010 -- 3rd Workshop on Building and Using Comparable Corpora, lrec2010
- title
- Automatic Parallel Corpora and Bilingual Terminology extraction from Parallel WebSites
- tipo
- inproceedings
- lang
- EN
- year
- 2010
- month
- May
- abstract
- Nowadays, the notion, the importance and the
significance of parallel corpora are so great that they need
no special introduction. Unfortunately, publicly
available parallel corpora are somewhat limited in
range. There are big corpora about politics or
legislation, about medicine and other specific areas,
but we lack corpora for other
areas. Currently there is a huge investment in using
the Web as a corpus. This article presents GWB, a
tool that aims at the automatic construction of parallel
corpora from the web. We argue that it is possible
to build high quality terminological corpora in an
automatic fashion, just by specifying a sensible
Internet domain and using an appropriate set of seed
keywords. GWB is a web-spider that works in
conjunction with a set of other Open-Source tools,
defining a pipeline that includes document
retrieval from the web, alignment at sentence level
and its quality analysis, bilingual dictionaries and
terminology extraction and construction of off-line
- address
- Valletta, Malta
- url
- http://alfarrabio.di.uminho.pt/~albie/publications/bucc2010.pdf
- editor
- Reinhard Rapp and Pierre Zweigenbaum and Serge Sharoff
- author
- José João Almeida
- Alberto Simões
- chave
- bucc2010
- pages
- 19--22
- booktitle
- Entity2010 -- Workshop on Resources and Evaluation for Entity Resolution and Entity
Management, lrec2010
- docpage
- jj.bib.dp.html#brett:lrec
- title
- Identification, extraction and population of collective named
entities from business news
- lang
- EN
- tipo
- inproceedings
- address
- Valletta, Malta
- abstract
- Sentiment analysis of business news has become an increasingly popular
area of research for both the practitioner and academic. The future
financial prospects of companies can be estimated through the aggregation
of sentiment over a period of time. The aggregation of sentiment
for a specific company is only possible if the company is explicitly
mentioned in the news text. In certain instances, news text may refer
to groups or collections of companies, for example "The Automotive
Sector" or "The Russell Group of Universities". Widely available named
entity dictionaries will not recognize these groups of companies, and
consequently, it may not be possible to assign sentiment attributed
to these groups of companies to their individual members. This paper
describes a method for identifying groups of companies, which for the
purposes of this paper will be known as "Collective Entities". The
described method is corpus based: it uses linguistic patterns to
identify Collective Entity Names, their members and their natural
relations with other Collective Entities. The described methodology
contains the following steps: 1. Identify and validate seed extraction
patterns, 2. Expand seed patterns, 3. Extract and validate Collective
Named Entities, 4. Extract related Collective Named Entities, 5. Construct
and populate an Ontology and 6. Expand the members of Collective Entity
sets with Linked Data.
- month
- May
- year
- 2010
- chave
- brett:lrec
- author
- José João Almeida
- Brett Drury
- pages
- 217--220
- docpage
- jj.bib.dp.html#fala2010-triPsi
- booktitle
- FALA2010 -- II Iberian SLTech Workshop
- title
- Automating psycholinguistic statistics computation:
- tipo
- inproceedings
- address
- Vigo
- year
- 2010
- abstract
- This article describes psycholinguistic lexical databases
available in various languages, including English, Spanish and
Portuguese. These lexical databases are important for researchers
in Psycholinguistics and other related areas, providing
a pool of experimental materials and allowing for an efficient
process of selection of these experimental materials.
The process of gathering statistics is slow, resulting in a
small pool of materials in the short term. The need to find an
alternative method to gather limited or as yet unavailable statistics
for a specific language led us to consider gathering statistics
from other languages and computing their triangulation. Our
aim was to automate the computation of statistics such as
Familiarity, Imageability, Age of Acquisition and Written Word
Frequency for that specific language.
We describe the process of preparing this data and of triangulating and
comparing statistics for some languages in an attempt to find a
relationship between them. The results were
analysed considering correlations between each statistic in each
pair of languages and by computing the mean of absolute differences between
each language's values.
- month
- November
- editor
- Carmen Mateo and Francisco Díaz and Francisco Pazó
- chave
- fala2010-triPsi
- author
- João Filipe Machado
- José João Almeida
- Alberto Simões
- Ana Soares
- editor
- Luis Barbosa and Antonio Cerone and Siraj Shaikh (Guest Eds.)
- url
- http://journal.ub.tu-berlin.de/index.php/eceasst/article/view/458/446
- year
- 2010
- author
- Alberto Simões
- Nuno Carvalho
- José João Almeida
- chave
- opencert2010
- volume
- 33
- journal
- Electronic Communications of the EASST
- docpage
- jj.bib.dp.html#opencert2010
- tipo
- article
- note
- Foundations and Techniques for Open Source Software Certification
- title
- Testing as a Certification Approach
- pages
- 67--72
- number
- 3
- journal
- Linguamática
- docpage
- jj.bib.dp.html#p-pal-linguamatica
- volume
- 2
- title
- {P-PAL:
- tipo
- article
- month
- December
- abstract
- In this work we present the Procura-PALavras (P-PAL) project,
whose main goal is to develop an electronic tool
that provides information on objective and subjective
psycholinguistic indices of European Portuguese (EP) words.
P-PAL will be made available free of charge
to the scientific community in a user-friendly format from
a website to be built for that purpose. When
using P-PAL, researchers will be able to tailor their use
of the program by selecting, from the wide variety of
analyses offered, the indices that suit the purposes of
their research, in two modes of use:
asking the program to analyse previously assembled word lists
on the indices considered relevant to the
research, or obtaining lists of words that meet the
defined parameters. P-PAL thus presents itself as a
fundamental tool for the promotion and internationalization of
research in Portugal.
- year
- 2010
- url
- http://linguamatica.com/index.php/linguamatica/article/download/80/108
- irreditor
- Alberto Simões and José João Almeida and Xavier Gómez
- chave
- p-pal-linguamatica
- issn
- 1647-0818
- author
- Ana Paula Soares
- Montserrat Comesaña
- Álvaro Iriarte
- José João Almeida
- Alberto Manuel Brandão Simões
- Ana Costa
- Patrícia Cunha França
- João Machado
- title
- Guided Self Training for Sentiment Classification
- tipo
- inproceedings
- pages
- 9--16
- booktitle
- Proceedings of Workshop on Robust Unsupervised and Semisupervised
Methods in Natural Language Processing
- docpage
- jj.bib.dp.html#drury-torgo-almeida:2011:ROBUS
- author
- Brett Drury
- Luis Torgo
- J.J. Almeida
- chave
- drury-torgo-almeida:2011:ROBUS
- month
- September
- year
- 2011
- address
- Hissar, Bulgaria
- url
- http://www.aclweb.org/anthology/W11-3902
- title
- Classifying News Stories to Estimate the Direction of a Stock Market Index
- author
- Brett Drury
- Luis Torgo
- J.J. Almeida
- tipo
- inproceedings
- chave
- drury1
- location
- Chaves
- year
- 2011
- pages
- 1--4
- booktitle
- Third Workshop on Intelligent Systems and Applications (WISA)
- docpage
- jj.bib.dp.html#drury1
- title
- Magellan: An Adaptive Ontology Driven "Breaking Financial News" Recommender
- author
- Brett Drury
- J.J. Almeida
- Helena Morais
- tipo
- inproceedings
- chave
- drury2
- location
- Chaves
- year
- 2011
- booktitle
- CISTI-2011
- docpage
- jj.bib.dp.html#drury2
- title
- An Error Correction Methodology for Time Dependent Ontologies
- isbn
- 978-3-642-22055-5
- publisher
- Springer
- tipo
- inproceedings
- pages
- 501--512
- ee
- http://dx.doi.org/10.1007/978-3-642-22056-2_52
- booktitle
- {CAiSE
- part
- 8
- docpage
- jj.bib.dp.html#drury3
- volume
- 83
- series
- Lecture Notes in Business Information Processing
- chave
- drury3
- author
- Brett Drury
- J.J. Almeida
- Helena Morais
- year
- 2011
- editor
- Camille Salinesi and Oscar Pastor
- year
- 2011
- booktitle
- CISTI-2011
- docpage
- jj.bib.dp.html#nuno1
- title
- Oml: A Scripting Approach For Manipulating Ontologies
- author
- Nuno Carvalho
- Alberto Simões
- J.J. Almeida
- tipo
- inproceedings
- chave
- nuno1
- location
- Chaves
- title
- isbn
- 978-989-96001-5-7
- tipo
- inproceedings
- publisher
- Dep. de Eng. Informática da Universidade de Coimbra
- pages
- 222--233
- docpage
- jj.bib.dp.html#corta2011-pftl
- booktitle
- INForum'11 --- Simpósio de Informática (CoRTA2011 track)
- pdf
- http://ambs.perl-hackers.net/publications/corta2011-pftl.pdf
- chave
- corta2011-pftl
- author
- Nuno Carvalho
- Alberto Simões
- José João Almeida
- Pedro Rangel Henriques
- Maria João Varanda Pereira
- address
- Coimbra, Portugal
- language
- EN
- year
- 2011
- month
- September
- abstract
- Today, most
developers prefer to store information in databases. But
plain filesystems were used for years, and are still used, to store
information, commonly in files of heterogeneous formats that are
organized in directory trees. This approach is a very flexible and
natural way to create hierarchical organized structures of
We can devise a formal notation to describe a filesystem tree structure,
similar to a grammar, assuming that filenames can be considered terminal
symbols, and directory names non-terminal symbols. This specification
would allow to derive correct language sentences (combination of terminal
symbols) and to associate semantic actions, that can produce arbitrary
side effects, to each valid sentence, just as we do in common parser
generation tools. These specifications can be used to systematically
process files in directory trees, and the final result depends on the
semantic actions associated with each production rule.
In this paper we revamp an old idea of using a domain-specific
language to implement these specifications, similar to context-free
grammars, and introduce some examples of applications that can be
built using this approach.
- editor
- Raul Barbosa and Luis Caires
- pdf
- http://ambs.perl-hackers.net/publications/corta2011-oml.pdf
- chave
- corta2011-oml
- author
- Nuno Carvalho
- José João Almeida
- Alberto Simões
- editor
- Raul Barbosa and Luis Caires
- address
- Coimbra, Portugal
- language
- EN
- year
- 2011
- abstract
Most existing programming languages can be categorized as general
purpose programming languages, meaning that they can be used to
implement solutions for any given domain. They are not, in any way,
optimized for a specific set of problems. In contrast, Domain
Specific Languages (DSL) are used to solve specific problems in a
well defined domain. DSL are optimized to a particular set of
problems, but they lack support for a wide range of operations that
are required when dealing with real world problems. So, in a
perfect world, we would like to implement applications using a
general purpose programming language, but use a set of different DSL
to handle specific domains' tasks.
In this paper we describe a DSL named Ontology Manipulation Language
(OML), designed to describe operations over
with ontologies. Programs can be written
using only the OML syntax and be executed independently. OML syntax
was designed to deal with ontologies and the language itself is
optimized to perform these tasks, which means that other relatively
simpler tasks can not be easily done. To overcome this challenge a
mechanism was developed so that you can weave small snippets of OML code
inside Perl programs, meaning we have the power of OML to manipulate
ontologies and, at the same time, all the paraphernalia of modules
that Perl offers to handle everything else.
- month
- September
- isbn
- 978-989-96001-5-7
- tipo
- inproceedings
- publisher
- Dep. de Eng. Informática da Universidade de Coimbra
- title
- Weaving {OML
- docpage
- jj.bib.dp.html#corta2011-oml
- booktitle
- INForum'11 --- Simpósio de Informática (CoRTA2011 track)
- pages
- 184--197
- year
- 2011
- full
- Proceedings of the International Conference on Web
Intelligence, Mining and Semantics, WIMS 2011, Sogndal, Norway, May 25-27, 2011
- editor
- Rajendra Akerkar
- author
- chave
- wims2011
- ee
- http://doi.acm.org/10.1145/1988688.1988720
- pages
- 27--34
- bibsource
- DBLP, http://dblp.uni-trier.de
- booktitle
- docpage
- jj.bib.dp.html#wims2011
- title
- Identification of fine grained feature based event and sentiment
phrases from business news stories
- publisher
- tipo
- inproceedings
- isbn
- 978-1-4503-0148-0
- year
- 2011
- url
- http://natura.di.uminho.pt/~jj/pln/sepln2011-boolcleaner.pdf
- docpage
- jj.bib.dp.html#sepln:bookcleaner
- booktitle
- Actas del XXVII Congreso de la Sociedad Española
para el Procesamiento del Lenguaje Natural
- title
- {Text::Perfide::BookCleaner
- tipo
- inproceedings
- pp
- 433-441
- author
- Santos, André
- José João Almeida
- location
- Huelva, 5-7 September
- chave
- sepln:bookcleaner
- number
- 3/4
- pages
- 219-233
- volume
- 6
- journal
- docpage
- jj.bib.dp.html#drury4
- title
- Construction and maintenance of a fuzzy temporal ontology
from news stories
- tipo
- article
- year
- 2011
- journalfull
- International Journal of Metadata, Semantics and Ontologies
- doi
- http://dx.doi.org/10.1504/IJMSO.2011.048028
- author
- Brett Drury
- J.J. Almeida
- Helena Morais
- chave
- drury4
- year
- 2011
- month
- 1--2 June
- abstract
The eXtensible Mark-up Language (XML) is probably one of the
most popular markup languages available today. It is very typical to find all
of services or programs representing data in this format. This situation is
more common in web development environments or Service Oriented Architectures
(SOA), where data flows from one service to another, being consumed and
produced by a heterogeneous set of applications, whose sole requirement is to
understand XML.
This workflow of data represented in XML implies some tasks that applications
have to perform if they are required to consume or produce information: the
task of parsing an XML document, giving specific semantics to the information
parsed, and the task of producing an XML document.
Our main goal is to create object definitions that can analyze an XML document
and automatically create an object definition that can be used abstractly by
application. These objects are able to parse the XML document and gather all
data required to mimic all the information present in the document.
This paper introduces xml2pm, a simple tool that can inspect the structure of
an XML document and create an object definition (a Perl module) that stores
the same information present in the original document, but as a runtime object. We
introduce a simple case of how this approach allows the creation of
based on Web Services in an elegant and simple way.
- address
- Vila do Conde, Portugal
- editor
- Alberto Simões
- author
- Nuno Carvalho
- Alberto Simões
- José João Almeida
- pdf
- http://ambs.perl-hackers.net/publications/xml2pm-xata2011.pdf
- chave
- xml2pm-xata2011
- pages
- 103--114
- docpage
- jj.bib.dp.html#xml2pm-xata2011
- booktitle
- {XATA 2011
- title
- xml2pm: A Tool for Automatic Creation of Object Definitions Based
on {XML
- tipo
- inproceedings
- isbn
- 978-989-96863-1-1
- lang
- EN
- author
- Brett Drury
- Luis Torgo
- J.J. Almeida
- chave
- drury5
- year
- 2012
- full
- International Journal of Computer Science and Applications
- url
- http://www.tmrfindia.org/ijcsa/v9i11.pdf
- title
- Classifying News Stories with a Constrained Learning Strategy
to Estimate the Direction of a Market Index
- tipo
- article
- number
- 1
- pages
- 1-22
- bibsource
- DBLP, http://dblp.uni-trier.de
- volume
- 9
- docpage
- jj.bib.dp.html#drury5
- journal
- chave
- da2012
- author
- Alberto Simões
- Álvaro Iriarte Sanromán
- José João Almeida
- address
- Coimbra, Portugal
- month
- April
- year
- 2012
- editor
- Helena Caseli and Aline Villavicencio and António Teixeira
and Fernando Perdigão
- title
- Dicionário-Aberto -- A Source of Resources for the
Portuguese Language Processing
- publisher
- Springer
- tipo
- article
- pages
- 121--127
- docpage
- jj.bib.dp.html#da2012
- journal
- Computational Processing of the Portuguese Language,
Lecture Notes in Artificial Intelligence
- volume
- 7243
- date
- 23-25
- title
- Structural alignment of plain text books
- tipo
- inproceedings
- publisher
- European Language Resources Association (ELRA)
- isbn
- 978-2-9517408-7-7
- docpage
- jj.bib.dp.html#LREC12.967
- booktitle
- Proceedings of the Eighth International Conference on Language
Resources and Evaluation (LREC'12)
- author
- André Santos
- José João Almeida
- Nuno Carvalho
- chave
- LREC12.967
- year
- 2012
- month
- May
- address
- Istanbul, Turkey
- language
- english
- editor
- Nicoletta Calzolari and others
- author
- Brett Drury
- José João Almeida
- chave
- LREC12.611
- editor
- Nicoletta Calzolari and others
- year
- 2012
- month
- May
- address
- Istanbul, Turkey
- language
- english
- tipo
- inproceedings
- publisher
- European Language Resources Association (ELRA)
- isbn
- 978-2-9517408-7-7
- date
- 23-25
- title
- The Minho Quotation Resource
- docpage
- jj.bib.dp.html#LREC12.611
- booktitle
- Proceedings of the Eighth International Conference on Language
Resources and Evaluation (LREC'12)
- pages
- 239-253
- year
- 2012
- abstract
- Concept location is a common task in program comprehension
techniques, essential in many approaches used for software care and
software evolution. An important goal of this process is to discover
a mapping between source code and human oriented concepts.
Although programs are written in a strict and formal language, natural
language terms and sentences like identifiers (variables or functions
names), constant strings or comments, can still be found embedded in
programs. Using terminology concepts and natural language processing
techniques these terms can be exploited to discover clues about which
real world concepts source code is addressing.
This work extends symbol tables built by compilers with ontology
driven constructs, extends synonym sets defined by linguistics, with
automatically created Probabilistic SynSets from software
domain parallel corpora. And using a relational algebra, creates
semantic bridges between program elements and human oriented concepts,
to enhance concept location tasks.
- month
- June
- docpage
- jj.bib.dp.html#CAPH12a
- booktitle
- SLATe'12 --- Symposium on Languages, Applications and Technologies
- volume
- 21
- title
- Probabilistic SynSet Based Concept Location
- irreditor
- Alberto Simões and Ricardo Queirós and Daniela da Cruz
- chave
- CAPH12a
- tipo
- inproceedings
- author
- Nuno Ramos Carvalho
- Jose Joao Almeida
- Maria João Varanda Pereira
- Pedro Rangel Henriques
- publisher
- OASIcs -- Open Access Series in Informatics, Schloss
Dagstuhl - Leibniz-Zentrum für Informatik, Dagstuhl Publishing, Germany
- year
- 2012
- chave
- wikiscore
- author
- J.J. Almeida
- Nuno Ramos Carvalho
- José Nuno Oliveira
- journal
- Information, Services and Use (ISU)
- docpage
- jj.bib.dp.html#wikiscore
- volume
- 31
- comment
- elpub 2012
- pages
- 177--187
- number
- 3-4/2011
- ee
- DOI 10.3233/ISU-2012-0647
- tipo
- article
- publisher
- IOS Press
- title
- {Wiki::Score
- small
- series
- OpenAccess Series in Informatics (OASIcs)
- volume
- 21
- docpage
- jj.bib.dp.html#flapp
- booktitle
- 1st Symposium on Languages, Applications and Technologies
- pages
- 41--50
- tipo
- inproceedings
- publisher
- Schloss Dagstuhl--Leibniz-Zentrum fuer Informatik
- title
- Generating flex Lexical Scanners for Perl Parse::Yapp
- idx
- url
- http://drops.dagstuhl.de/opus/volltexte/2012/3513
- year
- 2012
- abstract
Perl is known for its versatile regular expressions. Nevertheless, using Perl regular
expressions for creating a fast lexical analyzer is not easy. As an alternative, the authors
defend the automated generation of the lexical analyzer in a well known fast application
(flex) based on a simple Perl definition in the syntactic analyzer. In this paper we
extend the syntax used by Parse::Yapp, one of the most used parser generators for Perl,
making the automatic generation of flex lexical scanners possible. We explain how this is
performed and conclude with some benchmarks that show the relevance of the approach.
- address
- Dagstuhl, Germany
- doi
- http://dx.doi.org/10.4230/OASIcs.SLATE.2012.41
- author
- Alberto Simões
- Nuno Ramos Carvalho
- José João Almeida
- chave
- flapp
- irreditor
- Alberto Simões and Ricardo Queirós and Daniela da Cruz
- publisher
- Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik
- tipo
- inproceedings
- title
- Predicting Market Direction from Direct Speech by Business
- volume
- 21
- series
- booktitle
- docpage
- jj.bib.dp.html#DBLP:conf/slate/DruryA12
- pages
- 163-172
- bibsource
- DBLP, http://dblp.uni-trier.de
- author
- Brett Drury
- José João Almeida
- chave
- DBLP:conf/slate/DruryA12
- irreditor
- Alberto Sim{õ
- year
- 2012
- doi
- http://dx.doi.org/10.4230/OASIcs.SLATE.2012.163
- irreditor
- Luís Correia and Luís Paulo Reis and José Cascalho and Luís
Gomes and Hélia Guerra and Pedro Cardoso
- author
- Alberto Simões
- José João Almeida
- Nuno Ramos Carvalho
- chave
- ptd2013
- address
- Angra do Heroismo, Azores
- url
- http://natura.di.uminho.pt/~jj/bib/ptd-algebra.pdf
- year
- 2013
- title
- Defining a Probabilistic Translation Dictionaries Algebra
- tipo
- inproceedings
- booktitle
- XVI Portuguese Conference on Artificial Intelligence - EPIA
- month
- September
- pages
- 444--455
- docpage
- jj.bib.dp.html#ptd2013
- url
- http://alfarrabio.di.uminho.pt/~albie/publications/wcist2012-dmoss.pdf
- abstract
- Besides source code, the fundamental source of information about Open Source
Software lies in documentation, and other non source code files, like README,
INSTALL, or HowTo files, commonly available in the software ecosystem. These
documents, written in natural language, provide valuable information during the
software development stage, but also in future maintenance and evolution tasks.
DMOSS is a toolkit designed to systematically assess the quality of non source
code text found in software packages. The toolkit handles a package as an
attribute tree, and performs several tree traverse algorithms through a set of
plugins, specialized in retrieving specific metrics from text, gathering
information about the software. These metrics are later used to infer knowledge
about the software, and composed together to build reports that assess the
quality of specific features of the software. This paper discusses the
motivations for this work, continues with a description of the toolkit
implementation and design goals, and follows with an example of its usage to
process a software package and the produced report. Finally, some remarks and
trends for future work are presented.
- chave
- algarve-cross2013
- author
- Nuno Ramos Carvalho
- Alberto Simões
- José João Almeida
- docpage
- jj.bib.dp.html#algarve-cross2013
- series
- Advances in Intelligent Systems and Computing
- volume
- 206
- isbn
- 978-3-642-36980-3
- pages
- 785--794
- booktitle
- Advances in Information Systems and Technologies
- tipo
- inproceedings
- title
- Open Source Software Documentation Mining for Quality Assessment
- publisher
- Springer Berlin Heidelberg
- year
- 2013
- editor
- Rocha, Álvaro and Correia, Ana Maria and Wilson, Tom and Stroetmann,
Karl A.
- chave
- algarve2013
- author
- Alberto Simões
- Anália Lourenço
- José João Almeida
- abstract
- This work aims at pointing out the benefits of a topology-oriented
wide scope, but differentiated, profile analysis. The goal was to conciliate
advanced common website usage profiling techniques with the analysis of the
website's topology information, outputting valuable knowledge in an intuitive
and comprehensible way. Server load balancing, crawler activity evaluation and
Web site restructuring are the primary analysis concerns and, in this regard,
experiments over six month data of a real-world Web site were considered
- url
- http://alfarrabio.di.uminho.pt/~albie/publications/wcist2012-webtopology.pdf
- title
- Evaluating Web Site Structure Based on Navigation Profiles and Site Topology
- publisher
- Springer Berlin Heidelberg
- year
- 2013
- editor
- Rocha, Álvaro and Correia, Ana Maria and Wilson, Tom and Stroetmann,
Karl A.
- pages
- 305-311
- tipo
- inproceedings
- booktitle
- Advances in Information Systems and Technologies
- volume
- 206
- series
- Advances in Intelligent Systems and Computing
- isbn
- 978-3-642-36980-3
- docpage
- jj.bib.dp.html#algarve2013
- docpage
- jj.bib.dp.html#Passarola2013
- booktitle
- CISTI-2013
- pages
- 763--768
- location
- Lisboa
- tipo
- inproceedings
- title
- PASSAROLA: High-Order Exercise Generation
- url
- http://natura.di.uminho.pt/~jj/bib/passarola-cisti2013.pdf
- year
- 2013
- abstract
- In order to be robust and achieve multi-domain
coverage, exercise generation systems usually work with answers
of simple types (e.g. multiple-choice, Boolean, integer, or file
comparison). In this paper we describe an exercise generation
system PASSAROLA, a simple, yet powerful, language that anyone
with no computer science background, can use to develop
exercises, that include a collection of heterogeneous objects, and
allows the usage of complex elements. Its main characteristic
features are the use of simple reusable templates, simple and rich
types, rich notation and syntax (LaTeX based) for questions,
solutions, and answers, transformations and calculations,
external calculators.
- chave
- Passarola2013
- author
- J.João Almeida
- Isabel Araújo
- Irene Brito
- Nuno Carvalho
- Gaspar J. Machado
- Rui M.S. Pereira
- Georgi Smirnov
- tipo
- inproceedings
- location
- Lisboa
- title
- Math exercise generation and smart assessment
- booktitle
- Workshop of TICAMES (Information and Communication Technology in
Higher Education: Learning Mathematics), CISTI-2013
- docpage
- jj.bib.dp.html#ticames2013
- pages
- 1014--1019
- author
- J.João Almeida
- Isabel Araújo
- Irene Brito
- Nuno Carvalho
- Gaspar J. Machado
- Rui M.S. Pereira
- Georgi Smirnov
- chave
- ticames2013
- url
- http://natura.di.uminho.pt/~jj/bib/passarola-ticames2013.pdf
- abstract
- In this paper we concentrate on the field of
mathematics education where the aim is to generate exercises
going beyond those with answers of simple types (e.g. multiple-choice,
Boolean, integer, or file comparison). We present three
examples from introductory college mathematics and emphasize
the key points that should be taken into account in order to
develop a "well-posed" exercise together with its verification. All
the presented examples were implemented in the system
- year
- 2013
- irrbooktitle
- Computational Science and Its Applications - ICCSA 2013,
13th International Conference, Ho Chi Minh City, Vietnam,
June 24-27, 2013, Proceedings, Part II
- year
- 2013
- doi
- http://dx.doi.org/10.1007/978-3-642-39643-4_32
- offcrossref
- DBLP:conf/iccsa/2013-2
- editor
- Beniamino Murgante and others
- author
- Pedro Martins
- Nuno Ramos Carvalho
- João Paulo Fernandes
- José João Almeida
- João Saraiva
- chave
- crossportal
- ee
- http://dx.doi.org/10.1007/978-3-642-39643-4
- bibsource
- DBLP, http://dblp.uni-trier.de
- pages
- 443-458
- series
- Lecture Notes in Computer Science
- volume
- 7972
- docpage
- jj.bib.dp.html#crossportal
- booktitle
- ICCSA (2)
- title
- A Framework for Modular and Customizable Software Analysis
- tipo
- inproceedings
- publisher
- Springer
- isbn
- 978-3-642-39642-7
- docpage
- jj.bib.dp.html#icaicte13
- booktitle
- ICAICTE-13, Advances in Intelligent Systems Research
- title
- Exercise generation with the system Passarola
- isbn
- 978-90786-77-79-6
- tipo
- inproceedings
- doi
- doi:10.2991/icaicte.2013.64
- year
- 2013
- abstract
- A robust multi-domain coverage exercise generation system
usually works with answers of simple types (e.g. multiple-choice,
Boolean, integer, or file comparison). In this paper we describe
Passarola, a simple, yet powerful, exercise generation system and its
language that anyone with no computer science background can use to
develop exercises. It may include a collection of heterogeneous objects
allowing the usage of complex elements. Its main characteristics are the
use of simple reusable templates, simple and rich types, and rich notation
and syntax (LaTeX based) for questions, solutions, and answers.
- url
- http://natura.di.uminho.pt/~jj/bib/ecaicte2013.pdf
- issn
- 1951-6851
- chave
- icaicte13
- keywords
- Passarola, exercise generation system, self-regulating study
- author
- José João Almeida
- Isabel Araújo
- Irene Brito
- Nuno Carvalho
- Gaspar J.
- Rui M. S. Pereira
- Georgi Smirnov
- title
- ABC with a UNIX Flavor
- isbn
- 978-3-939897-52-1
- publisher
- Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik
- tipo
- inproceedings
- pages
- 203-218
- bibsource
- DBLP, http://dblp.uni-trier.de
- booktitle
- 2nd Symposium on Languages, Applications and Technologies,
SLATE 2013, June 20-21, 2013 - Porto, Portugal
- docpage
- jj.bib.dp.html#slate/AzevedoA13
- volume
- 29
- series
- irreditor
- José Paulo Leal and
Ricardo Rocha and
Alberto Simões
- chave
- slate/AzevedoA13
- author
- Bruno M. Azevedo
- José João Almeida
- doi
- http://dx.doi.org/10.4230/OASIcs.SLATE.2013.203
- abstract
ABC is a simple, yet powerful, textual musical notation.
This paper presents ABC::DT, a rule-based domain-specific
language (Perl embedded), designed to simplify the
creation of ABC processing tools. Inspired by the Unix philosophy,
those tools intend to be simple and compositional in a Unix filters' way.
From ABC::DT's rules we obtain an ABC processing tool whose main
algorithm follows a traditional compiler architecture, thus consisting of
three stages:
1) ABC parser (based on abcmtops parser),
2) ABC semantic transformation (associated with ABC attributes),
3) output generation (either a user defined or system provided ABC generator).
- year
- 2013
- url
- http://drops.dagstuhl.de/opus/volltexte/2013/4039/pdf/14.pdf
- chave
- escolex2013
- tipo
- article
- author
- Soares, Ana Paula
- José Carlos Medeiros
- Alberto Simões
- João Machado
- Ana Costa
- Álvaro Iriarte
- José João Almeida
- Ana P. Pinheiro
- Montserrat Comesaña
- title
- Escolex: A grade-level lexical database from European Portuguese
elementary to middle school textbooks
- journal
- Behavior Research Methods
- docpage
- jj.bib.dp.html#escolex2013
- url
- http://p-pal.di.uminho.pt/static/files/db/Soares_et_al.__in_press_ESCOLEX.pdf
- pages
- 1--14
- year
- 2013
- abstract
In this article, we introduce ESCOLEX, the first European Portuguese children's
lexical database with grade-level-adjusted word frequency statistics. Computed
from a 3.2-million-word corpus, ESCOLEX provides 48,381 word forms extracted
from 171 elementary and middle school textbooks for 6- to 11-year-old children
attending the first six grades in the Portuguese educational system. Like other
children's grade-level databases, ESCOLEX provides four frequency indices for
each grade: overall word frequency (F), index of dispersion across the selected
textbooks (D), estimated frequency per million words (U), and standard
frequency index (SFI). It also provides a new measure, contextual diversity
(CD). In addition, the number of letters in the word and its part(s) of speech,
number of syllables, syllable structure, and adult frequencies taken from P-PAL
(a European Portuguese corpus-based lexical database) are provided. ESCOLEX
will be a useful tool both for researchers interested in language processing
and development and for professionals in need of verbal materials adjusted to
children's developmental stages. ESCOLEX can be downloaded along with this
article or from http://p-pal.di.uminho.pt/about/databases.
- booktitle
- Humanidades: Novos Paradigmas do Conhecimento e da Investigação,
XIV Colóquio de Outono
- editor
- Ana Gabriela Macedo and
Carlos Mendes de Sousa and
Vitor Moura
- docpage
- jj.bib.dp.html#coloquiosOutono2013
- year
- 2013
- pages
- 323--339
- author
- José João Almeida
- Sílvia Araújo
- Idalete Dias
- Ana Correio
- publisher
- húmus, Universidade do Minho
- tipo
- inproceedings
- chave
- coloquiosOutono2013
- title
- {Per-fide
- month
- April
- year
- 2014
- chapter
- 9
- editor
- Tony Berber Sardinha and Telma São-Bento Ferreira
- url
- http://ambs.perl-hackers.net/publications/perfide_ch9_sardinha.pdf
- chave
- sardinha2014
- author
- José João Almeida
- Sílvia Araújo
- Nuno Carvalho
- Idalete Dias
- Ana Oliveira
- André Santos
- Alberto Simões
- pages
- 177--200
- booktitle
- Working with Portuguese Corpora
- docpage
- jj.bib.dp.html#sardinha2014
- title
- The {Per-Fide
- isbn
- 978-1441190505
- publisher
- Bloomsbury Publishing
- tipo
- incollection
- title
- Procura-PALavras (P-Pal): uma nova medida de frequência lexical do português europeu contemporâneo
- script
- sci_arttext
- publisher
- scielo
- tipo
- article
- pages
- 110 - 123
- docpage
- jj.bib.dp.html#SOARES2014
- journal
- Psicologia: Reflexão e Crítica
- crossref
- 10.1590/S0102-79722014000100013
- pid
- S0102
- chave
- SOARES2014
- author
- Soares, Ana Paula
- Iriarte, Álvaro
- Almeida, José João
- Simões, Alberto
- Costa, Ana
- França, Patricia
- Machado, João
- Comesaña, Montserrat
- nrm
- iso
- language
- pt
- month
- 03
- year
- 2014
- url
- http://www.scielo.br/scielo.php?&-79722014000100013
- volume
- 27
- journal
- Psicologia: Reflexao e Critica
- docpage
- jj.bib.dp.html#ppal2014
- number
- 1
- pages
- 110-123
- tipo
- article
- title
- Procura-PALavras (P-PAL): A new measure
of word frequency for contemporary European Portuguese | Procura-PALavras
(P-PAL): Uma nova medida de frequência lexical do Português Europeu
- year
- 2014
- doi
- 10.1590/S0102-79722014000100013
- author
- Soares, A.P.
- Iriarte, A.
- Almeida, J.J.
- Simões, A.
- Costa, A.
- França, P.
- Machado, J.
- Comesaña, M.
- chave
- ppal2014
- address
- annote
- Document Type: Conference Paper; SCOPUS
- doi
- 10.1007/978-3-319-09153-2_9
- year
- 2014
- chave
- conclave-iccsa2104
- author
- Nuno Ramos Carvalho
- José João Almeida
- Maria João Varanda Pereira
- Pedro Rangel Henriques
- journal
- Lecture Notes in Computer Science (including subseries Lecture
Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
- docpage
- jj.bib.dp.html#conclave-iccsa2104
- offbooktitle
- 14th International Conference on Computational Science and its
Applications, ICCSA 2014; Guimaraes; Portugal
- volume
- 8584 LNCS
- pages
- 116-131
- number
- PART 6
- tipo
- article
- publisher
- Springer Verlag
- title
- Conclave: Ontology-driven measurement of semantic relatedness
between source code elements and problem domain concepts
- number
- 4
- pages
- 1191-1207
- volume
- 11
- docpage
- jj.bib.dp.html#comsys-dmoss
- journal
- Computer Science and Information Systems
- title
- tipo
- article
- abstract
- Besides source code, the fundamental source of information
about open source software lies in documentation, and other non source
code files, like README, INSTALL, or How-To files, commonly available in
the software ecosystem. These documents, written in natural language,
provide valuable information during the software development stage,
but also in future maintenance and evolution tasks. DMOSS is a toolkit
designed to systematically assess the quality of non source code content
found in software packages. The toolkit handles a package as an attribute
tree, and performs several tree traverse algorithms through a set of
plugins, specialized in retrieving specific metrics from text, gathering
information about the software. These metrics are later used to infer
knowledge about the software, and composed together to build reports
that assess the quality of specific features. This paper discusses the
motivations for this work, continues with a description of the toolkit
implementation and design goals. This is followed by an example of its
usage to process a software package, and the produced report.
- year
- 2014
- show
- pprwc110
- url
- http://www.comsis.org/archive.php?-1308
- author
- Carvalho, N. R.
- Simões, A.
- Almeida, J. J.
- chave
- comsys-dmoss
- url
- http://www.sciencedirect.com/science/article/pii/S0164121214002179
- doi
- http://dx.doi.org/10.1016/j.jss.2014.10.013
- year
- 2014
- abstract
- Program comprehension techniques often explore program
identifiers, to infer knowledge about programs. The relevance of source code
identifiers as one relevant source of information about programs is already
established in the literature, as well as their direct impact on future
comprehension tasks. Most programming languages enforce some constraints on
identifier strings (e.g., white spaces or commas are not allowed). Also,
programmers often use word combinations and abbreviations, to devise strings
that represent single, or multiple, domain concepts in order to increase
programming linguistic efficiency (convey more semantics writing less). These
strings do not always use explicit marks to distinguish the terms used (e.g.,
CamelCase or underscores), so techniques often referred to as hard splitting are
not enough. This paper introduces Lingua::IdSplitter, a dictionary-based
algorithm for splitting and expanding strings that compose multi-term
identifiers. It explores the use of general programming and abbreviations
dictionaries, but also a custom dictionary automatically generated from
software natural language content, prone to include application domain terms
and specific abbreviations. This approach was applied to two software packages,
written in C, achieving an f-measure of around 90% for correctly splitting and
expanding identifiers. A comparison with current state-of-the-art approaches is
also presented.
- chave
- jss-Carvalho2014
- issn
- 0164-1212
- keywords
- Identifier splitting
- author
- Nuno Ramos Carvalho
- José João Almeida
- Pedro Rangel Henriques
- Maria João Varanda
- journal
- Journal of Systems and Software
- docpage
- jj.bib.dp.html#jss-Carvalho2014
- volume
- number
- 0
- tipo
- article
- title
- From source code identifiers to natural language terms
- irreditor
- Maria João Varanda Pereira and José Paulo Leal and Alberto Simões
- author
- Nuno Ramos Carvalho
- José João Almeida
- Maria João Varanda Pereira
- Pedro Rangel Henriques
- chave
- conclave-slate2014
- year
- 2014
- annote
- Keywords: software maintenance, software evolution, program comprehension, feature location, concept location, natural language processing
- doi
- http://dx.doi.org/10.4230/OASIcs.SLATE.2014.19
- address
- Dagstuhl, Germany
- title
- Conclave: Writing Programs to Understand Programs
- publisher
- Schloss Dagstuhl--Leibniz-Zentrum fuer Informatik
- tipo
- inproceedings
- pages
- 19--34
- volume
- 38
- series
- OpenAccess Series in Informatics (OASIcs)
- booktitle
- 3rd Symposium on Languages, Applications and Technologies
- docpage
- jj.bib.dp.html#conclave-slate2014
- title
- A Workflow Description Language to Orchestrate Multi-Lingual Resources
- biburl
- http://dblp.uni-trier.de/rec/bib/conf/slate/BritoA14
- isbn
- 978-3-939897-68-2
- tipo
- inproceedings
- publisher
- Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik
- pages
- 77--83
- docpage
- jj.bib.dp.html#DBLP:conf/slate/BritoA14
- booktitle
- 3rd Symposium on Languages, Applications and Technologies, {SLATE
- series
- volume
- 38
- irreditor
- Maria João Varanda Pereira and
José Paulo Leal and
Alberto Simões
- chave
- DBLP:conf/slate/BritoA14
- author
- Rui Brito
- José João Almeida
- doi
- 10.4230/OASIcs.SLATE.2014.77
- year
- 2014
- url
- http://dx.doi.org/10.4230/OASIcs.SLATE.2014.77
- pages
- 251--265
- booktitle
- 3rd Symposium on Languages, Applications and Technologies, {SLATE
- docpage
- jj.bib.dp.html#DBLP:conf/slate/SimoesAB14
- volume
- 38
- series
- title
- Language Identification: a Neural Network Approach
- biburl
- http://dblp.uni-trier.de/rec/bib/conf/slate/SimoesAB14
- isbn
- 978-3-939897-68-2
- publisher
- Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik
- tipo
- inproceedings
- doi
- 10.4230/OASIcs.SLATE.2014.251
- year
- 2014
- url
- http://dx.doi.org/10.4230/OASIcs.SLATE.2014.251
- irreditor
- Maria João Varanda Pereira and
José Paulo Leal and
Alberto Simões
- chave
- DBLP:conf/slate/SimoesAB14
- author
- Alberto Simões
- José João Almeida
- Simon D. Byers
- irreditor
- Maria João Varanda Pereira and
José Paulo Leal and
Alberto Simões
- author
- Pedro Carvalho
- José João Almeida
- chave
- DBLP:conf/slate/CarvalhoA14
- year
- 2014
- doi
- 10.4230/OASIcs.SLATE.2014.283
- url
- http://dx.doi.org/10.4230/OASIcs.SLATE.2014.283
- biburl
- http://dblp.uni-trier.de/rec/bib/conf/slate/CarvalhoA14
- title
- MLT-prealigner: a Tool for Multilingual Text Alignment
- tipo
- inproceedings
- publisher
- Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik
- isbn
- 978-3-939897-68-2
- pages
- 283--290
- series
- volume
- 38
- docpage
- jj.bib.dp.html#DBLP:conf/slate/CarvalhoA14
- booktitle
- 3rd Symposium on Languages, Applications and Technologies, {SLATE
- chave
- tmxa
- author
- Rui Brito
- José João Almeida
- Alberto Simões
- url
- http://ambs.perl-hackers.net/publications/tmxa.pdf
- address
- Las Palmas de Gran Canaria, Spain
- year
- 2014
- abstract
- In recent years the amount of freely available multilingual
corpora has grown exponentially. Unfortunately, the way these
corpora are made available is very diverse, ranging from simple text
files or specific XML schemas to supposedly standard formats like
the XML Corpus Encoding Initiative, the Text Encoding Initiative, or
even the Translation Memory Exchange formats.
In this document we defend the usage of Translation Memory Exchange
documents, but we enrich their structure in order to support the
annotation of the documents with different information like lemmas,
multi-words or entities.
To support the adoption of the proposed formats, we present a set of
tools to manipulate the different formats in an agile way.
- month
- November
- tipo
- inproceedings
- title
- Processing Annotated {TMX
- docpage
- jj.bib.dp.html#tmxa
- booktitle
- IberSpeech 2014 --- VIII Jornadas en Tecnologías del Habla and IV Iberian SLTech Workshop
- pages
- 188--197
- url
- http://dx.doi.org/10.1016/j.jss.2014.10.013
- doi
- 10.1016/j.jss.2014.10.013
- abstract
- Program comprehension techniques often explore program
identifiers to infer knowledge about programs. The relevance of source code
identifiers as a source of information about programs is already
established in the literature, as is their direct impact on future
comprehension tasks. Most programming languages enforce some constraints on
identifier strings (e.g., white spaces or commas are not allowed). Also,
programmers often use word combinations and abbreviations to devise strings
that represent single, or multiple, domain concepts in order to increase
programming linguistic efficiency (convey more semantics writing less). These
strings do not always use explicit marks to distinguish the terms used (e.g.,
CamelCase or underscores), so techniques often referred to as hard splitting are
not enough. This paper introduces Lingua::IdSplitter, a dictionary-based
algorithm for splitting and expanding strings that compose multi-term
identifiers. It explores the use of general programming and abbreviations
dictionaries, but also a custom dictionary automatically generated from
software natural language content, prone to include application domain terms
and specific abbreviations. This approach was applied to two software packages,
written in C, achieving an f-measure of around 90% for correctly splitting and
expanding identifiers. A comparison with current state-of-the-art approaches is
also presented.
- year
- 2015
- chave
- jss-CarvalhoAHP15
- author
- Nuno Ramos Carvalho
- Jos{é
- keywords
- Identifier splitting
- timestamp
- Mon, 22 Dec 2014 09:51:10 +0100
- journal
- Journal of Systems and Software
- docpage
- jj.bib.dp.html#jss-CarvalhoAHP15
- volume
- 100
- pages
- 117--128
- bibsource
- dblp computer science bibliography, http://dblp.org
- tipo
- article
- title
- From source code identifiers to natural language terms
- biburl
- http://dblp.uni-trier.de/rec/bib/journals/jss/CarvalhoAHP15
- author
- Araújo, I.
- Brito, I.
- Machado, G.J.
- Pereira, R.M.S.
- Almeida, J.J.
- Smirnov, G.
- tipo
- article
- chave
- acores-wordcist2015
- title
- New algorithms for smart assessment of math exercises
- volume
- 353
- docpage
- jj.bib.dp.html#acores-wordcist2015
- journal
- Advances in Intelligent Systems and Computing
- year
- 2015
- pages
- 1221-1230
- year
- 2015
- booktitle
- 2015 10th Iberian Conference on Information Systems and Technologies,
CISTI 2015
- docpage
- jj.bib.dp.html#cisti-almeida2015
- url
- http://www.scopus.com/inward/record.url?eid=2-s2.0-84943328958&partnerID=MN8TOARS
- title
- Gröbner bases and mathematical exercises generation with nondetermined structure
- author
- Araújo, I.
- Smirnov, G.
- Almeida, J.J.
- eid
- 2
- tipo
- inproceedings
- chave
- cisti-almeida2015
- titlept
- Bases de Gröbner e geração de exercícios matemáticos com estrutura não determinada
- docpage
- jj.bib.dp.html#subtitles2015
- journal
- Quarterly Journal of Experimental Psychology
- volume
- 68
- pages
- 680-696
- number
- 4
- year
- 2015
- chave
- subtitles2015
- author
- Soares, A.P.
- Machado, J.
- Costa, A.
- Iriarte, Á.
- Simões, A.
- Almeida, J.J.
- Comesaña, M.
- Perea, M.
- tipo
- article
- title
- On the advantages of word frequency and contextual
diversity measures extracted from subtitles: The case of Portuguese
- title
- Experiments on Enlarging a Lexical Ontology
- isbn
- 978-3-319-27652-6
- tipo
- incollection
- publisher
- Springer International Publishing
- pages
- 49--56
- docpage
- jj.bib.dp.html#PULO:springer
- booktitle
- Languages, Applications and Technologies
- series
- Communications in Computer and Information Science
- volume
- 563
- irreditor
- Sierra-Rodríguez, José-Luis and Leal, José-Paulo and Simões, Alberto
- chave
- PULO:springer
- author
- Simões, Alberto
- Almeida, José João
- language
- English
- doi
- 10.1007/978-3-319-27653-3_5
- year
- 2015
- author
- Alberto Simões
- Xavier Gómez Guinovart
- J. João Almeida
- chave
- SIMES16.1052
- editor
- Nicoletta Calzolari (Conference Chair) and Khalid Choukri
and Thierry Declerck and Marko Grobelnik and Bente Maegaard and Joseph
Mariani and Asuncion Moreno and Jan Odijk and Stelios Piperidis
- month
- may
- year
- 2016
- language
- english
- address
- Portoroz, Slovenia
- publisher
- European Language Resources Association (ELRA)
- tipo
- inproceedings
- isbn
- 978-2-9517408-9-1
- title
- Enriching a {P
- date
- 23-28
- booktitle
- Proceedings of the Ninth International Conference on
Language Resources and Evaluation (LREC 2016)
- docpage
- jj.bib.dp.html#SIMES16.1052
- booktitle
- 5th Symposium on Languages, Applications and Technologies
- docpage
- jj.bib.dp.html#almeida_et_al2016
- volume
- 51
- series
- OpenAccess Series in Informatics (OASIcs)
- pages
- 1--8
- offeditor
- Marjan Mernik and José Paulo Leal and Hugo Gonçalo Oliveira
- publisher
- Schloss Dagstuhl--Leibniz-Zentrum fuer Informatik
- tipo
- inproceedings
- title
- {Context-Free Grammars: Exercise Generation and Probabilistic
- offaddress
- Dagstuhl, Germany
- annote
- Keywords: Exercise generation, context-free grammars, assessment
- doi
- http://dx.doi.org/10.4230/OASIcs.SLATE.2016.10
- year
- 2016
- chave
- almeida_et_al2016
- author
- José João Almeida
- Eliana Grande
- Georgi Smirnov
- journal
- Iberian Conference on Information Systems and Technologies, CISTI
- docpage
- jj.bib.dp.html#cisti2016
- volume
- 2016-July
- doi
- https://doi.org/10.1109/CISTI.2016.7521367
- year
- 2016
- chave
- cisti2016
- author
- Araujo, C.
- Henriques, P.R.
- Martini, R.G.
- Almeida, J.J.
- tipo
- article
- title
- Architectural approaches to build the museum of the person
- author
- Araújo, I.
- Almeida, J.J.
- Smirnov, G.
- chave
- exercise-composition2016
- year
- 2016
- doi
- https://doi.org/10.1007/978-3-319-31307-8_24
- tipo
- article
- note
- WorldCIST'16
- title
- Exercise composition: From environment properties to composed problems
- volume
- 445
- docpage
- jj.bib.dp.html#exercise-composition2016
- journal
- Advances in Intelligent Systems and Computing
- pages
- 235-244
- tipo
- article
- note
- WorldCIST'16
- title
- OntoMP, an ontology to build the museum of the person
- journal
- Advances in Intelligent Systems and Computing
- docpage
- jj.bib.dp.html#ontoMP2016
- volume
- 445
- pages
- 653-661
- chave
- ontoMP2016
- author
- doi
- https://doi.org/10.1007/978-3-319-31307-8_67
- year
- 2016
- pages
- 277-286
- abstract
- Exercise generation on language specification is a challenging
problem, because of the richness of the objects in the domain.
In this paper we discuss Mgbeg (Meta-Grammar-Based Exercise Generator), a
toolkit for exercise generation on context-free languages.
The Mgbeg approach is based on a meta-grammar formalism and tool, used to define
a set of similar exercises.
Mgbeg is a simple attributed grammar used to describe the set of valid
exercises (and randomly generate one of them).
Each exercise typically contains several attributes calculated during the
generation steps: namely, one or more formal specifications of the language
(context-free grammar); the exercise statement; and other information such as
examples, common mistakes, and validation data, to be used in the construction
of the exercise statement, solution, and assessment steps.
In addition, the toolkit provides a grammar module, with functionality
for grammar comparison, sentence generation and recognition, and a template
engine (to help in the calculation of textual attributes).
- year
- 2017
- booktitle
- Recent Advances in Information Systems and Technologies
- docpage
- jj.bib.dp.html#portosanto-worldcist2017
- series
- Advances in Intelligent Systems and Computing, vol. 659
- title
- Exercise generation on language specification
- chave
- portosanto-worldcist2017
- note
- WorldCIST'17
- author
- Almeida, J.J.
- Eliana Grande
- Smirnov, G.
- tipo
- inproceedings
- pages
- 763-772
- volume
- 745
- series
- Advances in Intelligent Systems and Computing
- booktitle
- Trends and Advances in Information Systems and Technologies, WorldCist2018
- docpage
- jj.bib.dp.html#Martins2018a
- title
- Increasing authorship identification through emotional analysis
- publisher
- Springer International Publishing
- tipo
- incollection
- offeditor
- Álvaro Rocha and Hojjat Adeli and Luís Paulo Reis and Sandra Costanzo
- isbn
- 978-3-319-77702-3
- month
- March
- year
- 2018
- doi
- https://doi.org/10.1007/978-3-319-77703-0_76
- author
- Ricardo Martins
- J.João Almeida
- Pedro Rangel Henriques
- Paulo Novais
- chave
- Martins2018a
- edition
- 1
- year
- 2018
- pages
- 374--384
- volume
- 11314
- series
- Lecture Notes in Computer Science
- booktitle
- docpage
- jj.bib.dp.html#DBLP:conf/ideal/MarcondesAN18
- title
- Chatbot Theory - A Naïve and Elementary Theory for Dialogue
- author
- Francisco S. Marcondes
- José João Almeida
- Paulo Novais
- publisher
- Springer
- tipo
- inproceedings
- chave
- DBLP:conf/ideal/MarcondesAN18
- pages
- 61--66
- year
- 2018
- booktitle
- docpage
- jj.bib.dp.html#DBLP:conf/bracis/Martins0ANH18
- title
- Hate Speech Classification in Social Media Using Emotional Analysis
- chave
- DBLP:conf/bracis/Martins0ANH18
- author
- Ricardo Martins
- Marco Gomes
- José João Almeida
- Paulo Novais
- Pedro Rangel Henriques
- publisher
- tipo
- inproceedings
- year
- 2018
- pages
- 276--283
- series
- Advances in Intelligent Systems and Computing
- volume
- 800
- docpage
- jj.bib.dp.html#DBLP:conf/dcai/MartinsAHN18
- booktitle
- title
- Domain Identification Through Sentiment Analysis
- tipo
- inproceedings
- publisher
- Springer
- author
- Ricardo Martins
- José João Almeida
- Pedro Rangel Henriques
- Paulo Novais
- chave
- DBLP:conf/dcai/MartinsAHN18
- tipo
- inproceedings
- publisher
- Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik
- author
- Rui Mendes
- José João Almeida
- chave
- DBLP:conf/slate/MendesA18
- title
- eOS: The Exercise Operating System
- series
- volume
- 62
- docpage
- jj.bib.dp.html#DBLP:conf/slate/MendesA18
- booktitle
- year
- 2018
- pages
- 5:1--5:13
- year
- 2018
- pages
- 8:1--8:8
- series
- volume
- 62
- docpage
- jj.bib.dp.html#DBLP:conf/slate/Almeida18
- booktitle
- title
- Abcl: Abc music notation with rich chord support (Short Paper)
- tipo
- inproceedings
- publisher
- Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik
- author
- chave
- DBLP:conf/slate/Almeida18
- series
- volume
- 62
- docpage
- jj.bib.dp.html#DBLP:conf/slate/MartinsAHN18
- booktitle
- year
- 2018
- pages
- 19:1--19:9
- tipo
- inproceedings
- publisher
- Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik
- author
- Ricardo Martins
- José João Almeida
- Pedro Rangel Henriques
- Paulo Novais
- chave
- DBLP:conf/slate/MartinsAHN18
- title
- Predicting Performance Problems Through Emotional Analysis (Short
- title
- Creating a social media-based personal emotional lexicon
- tipo
- inproceedings
- publisher
- ACM
- author
- Ricardo Martins
- José João Almeida
- Paulo Novais
- Pedro Rangel Henriques
- chave
- DBLP:conf/webmedia/MartinsANH18
- year
- 2018
- pages
- 261--264
- docpage
- jj.bib.dp.html#DBLP:conf/webmedia/MartinsANH18
- booktitle
- WebMedia
- title
- Increasing Authorship Identification Through Emotional Analysis
- chave
- DBLP:conf/worldcist/MartinsAHN18
- tipo
- inproceedings
- publisher
- Springer
- author
- Ricardo Martins
- José João Almeida
- Pedro Rangel Henriques
- Paulo Novais
- pages
- 763--772
- year
- 2018
- docpage
- jj.bib.dp.html#DBLP:conf/worldcist/MartinsAHN18
- booktitle
- WorldCIST (1)
- series
- Advances in Intelligent Systems and Computing
- volume
- 745
- keywords
- Formal languages, Context-free grammars, Automatic assessment
- abstract
- In this paper we consider the problem of cycle-free context-free grammar equivalence. To every context-free
grammar there corresponds a system of formal equations. Formally applying the iteration method to this system
we obtain the grammar axiom in the form of a formal power series composed of the words generated by the
grammar multiplied by the respective ambiguities.
We define a transform that attributes a matrix meaning to the system of formal equations and to formal power
series: terminal symbols are substituted by matrices and formal sum and product are substituted by the matrix
ones. In order to effectively compute the sum of a matrix series we numerically solve the system of matrix
equations. We prove distinguishability theorems showing that if two formal power series generated by cycle-free
context-free grammars are different, then there exists a matrix substitution such that the sums of the respective
matrix series are different. Based on this result, we suggest a procedure that can resolve the problem of
equivalence of cycle-free context-free grammars in many practical cases.
The results obtained in this paper form a theoretical basis for algorithms oriented to automatic assessment of
students' answers in computer science. We present the respective algorithms. Then we compare our approach
with a simple heuristic method based on the CYK algorithm and discuss the limitations of our method.
- chave
- cola19
- author
- José João Almeida
- Eliana Grande
- Georgi Smirnov
- docpage
- jj.bib.dp.html#cola19
- journal
- Journal of Computer Languages
- volume
- 51
- pages
- 48-56
- tipo
- article
- publisher
- Elsevier
- title
- On solving cycle-free context-free grammar equivalence problem using numerical analysis
- tipo
- inproceedings
- title
- Hunting ancestors: A unified approach for discovering genealogical information
- volume
- 74
- type
- Conference Paper
- docpage
- jj.bib.dp.html#Almeida2019
- journal
- OpenAccess Series in Informatics
- number
- 22
- author
- Almeida, J.J.
- Mendes, R.C.
- eid
- 2
- chave
- Almeida2019
- url
- https://www.scopus.com/inward/record.uri?eid=2-s2.0-85071097688&doi=10.4230%2fOASIcs.SLATE.2019.22&partnerID=40&md5=8e2f42806d411bdfa553dcfa27be17a9
- abstract
- This paper presents a unified approach for discovering
genealogical information. It presents a framework for storing information
concerning ancestors, locations, dates and documents. It also intends
to provide a framework that is able to perform inference concerning
dates by using constraints and for handling relations, locations and
sources. The DSL presented also aims to help users store information
from heterogeneous sources along with the evidence contained therein. ©
José J. Almeida and Rui C. Mendes.
- source
- Scopus
- year
- 2019
- doi
- 10.4230/OASIcs.SLATE.2019.22
- abstract
- The digital era has brought some challenges to lexicographers,
but it has also brought new opportunities as part of the rise of
information technology and, more recently, the emergence of digital
humanities. This paper provides a description of LeXmart, the framework
that supports the digital development of the Portuguese Academy of
Sciences Dictionary. LeXmart is a smart tool framework to support
lexicographers' work that offers different types of tools, ranging from a
structural editor to a set of validation tools. Given that the dictionary
is stored in eXist-DB, LeXmart is developed on top of its ecosystem,
using W3C standard languages, and exposing the default functionalities
provided by eXist-DB, namely a RESTful API. © 2019 Lexical Computing CZ
s.r.o.. All rights reserved.
- source
- Scopus
- year
- 2019
- url
- https://www.scopus.com/inward/record.uri?eid=2-s2.0-85075350281&partnerID=40&md5=c5171c547089e5728c1cec0d5c755df1
- eid
- 2
- author
- Simões, Alberto
- Salgado, Ana
- Costa, Rute
- Almeida, J.J.
- chave
- Simões2019453
- pages
- 453-466
- volume
- 2019-October
- type
- Conference Paper
- docpage
- jj.bib.dp.html#Simões2019453
- journal
- Proceedings of Electronic Lexicography in the 21st Century Conference
- title
- LeXmart: A smart tool for lexicographers
- tipo
- inproceedings
- type
- Article
- docpage
- jj.bib.dp.html#Martins2019
- journal
- Expert Systems
- number
- e12469
- tipo
- article
- title
- A sentiment analysis approach to increase authorship identification
- url
- https://www.scopus.com/inward/record.uri?eid=2-s2.0-85074844787&doi=10.1111%2fexsy.12469&partnerID=40&md5=bb5b7acab849e47b90246393026a4ba4
- abstract
- Writing style is considered the manner in which an author
expresses his thoughts, influenced by language characteristics, period,
school, or nation. Often, this writing style can identify the author. One
of the most famous examples comes from 1914 in Portuguese literature:
Fernando Pessoa and his heteronyms Alberto Caeiro, Álvaro de Campos,
and Ricardo Reis, who had completely different writing styles, led
people to believe that they were different individuals. Currently,
the discussion of authorship identification is more relevant because
of the considerable amount of widespread fake news in social media,
in which it is hard to identify who authored a text and even a simple
quote can impact the public image of an author, especially if these
texts or quotes are from politicians. This paper presents a process to
analyse the emotion contained in social media messages such as Facebook to
identify the author's emotional profile and use it to improve the ability
to predict the author of the message. Using preprocessing techniques,
lexicon-based approaches, and machine learning, we achieved an authorship
identification improvement of approximately 5% in the whole dataset
and more than 50% in specific authors when considering the emotional
profile on the writing style, thus increasing the ability to identify
the author of a text by considering only the author's emotional profile,
previously detected from prior texts. © 2019 John Wiley & Sons, Ltd.
- source
- Scopus
- year
- 2019
- doi
- 10.1111/exsy.12469
- eid
- 2
- author
- Martins, R.
- Almeida, J.J.
- Henriques, P.
- Novais, P.
- chave
- Martins2019
- tipo
- inproceedings
- title
- Domain identification through sentiment analysis
- volume
- 800
- type
- Conference Paper
- docpage
- jj.bib.dp.html#Martins2019276
- journal
- Advances in Intelligent Systems and Computing
- pages
- 276-283
- eid
- 2
- author
- Martins, R.
- Almeida, J.J.
- Henriques, P.
- Novais, P.
- chave
- Martins2019276
- url
- https://www.scopus.com/inward/record.uri?eid=2-s2.0-85049987273&doi=10.1007%2f978-3-319-94649-8_33&partnerID=40&md5=3fe3521d746330d391ee8ec0dd7bd4e9
- source
- Scopus
- abstract
- When dealing with chatbots, domain identification is an
important feature to adapt the interactions between user and computer in
order to increase the reliability of the communication and, consequently,
the audience and decrease its rejection avoiding misunderstandings. In
order to adapt to different domains, the writing style will be different
for the same author. For example, the same person in the role of a
student writes to his professor in a different style than he does for
his brother. This article presents a process that uses sentiment analysis
to identify the average emotional profile of the communication scenario
where the conversation takes place. Using Natural Language Processing and
Machine Learning techniques, it was possible to obtain an index of
96.21% of correct classifications in the identification of where these
communications have occurred by analysing only the emotional profile of
these texts. © Springer International Publishing AG, part of Springer
Nature 2019.
- year
- 2019
- doi
- 10.1007/978-3-319-94649-8_33
- tipo
- inproceedings
- title
- Musikla: Language for generating musical events
- type
- Conference Paper
- journal
- OpenAccess Series in Informatics
- docpage
- jj.bib.dp.html#Silva2020
- volume
- 83
- number
- A6
- chave
- Silva2020
- eid
- 2
- author
- Silva, Pedro
- Almeida, J.J.
- url
- https://www.scopus.com/inward/record.uri?eid=2-s2.0-85091704838&doi=10.4230%2fOASIcs.SLATE.2020.6&partnerID=40&md5=1c450e4e7bb940f5855eafaedb4ccba3
- doi
- 10.4230/OASIcs.SLATE.2020.6
- source
- Scopus
- abstract
- In this paper, we'll discuss a simple approach to integrating
musical events, such as notes or chords, into a programming language. This
means treating music sequences as a first class citizen. It will be
possible to save those sequences into variables or play them right away,
pass them into functions or apply operators on them (like transposing or
repeating the sequence). Furthermore, instead of just allowing static
sequences to be generated, we'll integrate a music keyboard system
that easily allows the user to bind keys (or other kinds of events) to
expressions. Finally, it is important to provide the user with multiple
and extensible ways of outputting their music, such as synthesizing it into
a file or directly into the speakers, or writing a MIDI or music sheet
file. We'll structure this paper first with an analysis of the problem
and its particular requirements. Then we will discuss the solution we
developed to meet those requirements. Finally we'll analyze the result
and discuss possible alternative routes we could've taken. © 2020 Schloss
Dagstuhl - Leibniz-Zentrum für Informatik GmbH, Dagstuhl Publishing. All
rights reserved.
- year
- 2020
- title
- BhTSL, behavior trees specification and processing
- tipo
- inproceedings
- number
- A4
- type
- Conference Paper
- docpage
- jj.bib.dp.html#Oliveira2020
- journal
- OpenAccess Series in Informatics
- volume
- 83
- chave
- Oliveira2020
- author
- Oliveira, M.
- Silva, P.M.
- Moura, Pedro
- Almeida, J.J.
- Henriques, P.R.
- eid
- 2
- doi
- 10.4230/OASIcs.SLATE.2020.4
- abstract
- In the context of game development, there is always the
need for describing behaviors for various entities, whether NPCs or
even the world itself. That need requires a formalism to describe
properly such behaviors. As the gaming industry has been growing,
many approaches were proposed. First, finite state machines were used
and evolved to hierarchical state machines. As that formalism was not
enough, a more powerful concept appeared. Instead of using states for
describing behaviors, people started to use tasks. This concept was
incorporated in behavior trees. This paper focuses on the specification
and processing of Behavior Trees. A DSL designed for that purpose will
be introduced. A generator that produces LaTeX diagrams to document
the trees and a Python module to implement the behavior described
will also be discussed. Additionally, a simulator will be presented. These
achievements will be illustrated using a concrete game as a case study. ©
2020 Schloss Dagstuhl - Leibniz-Zentrum für Informatik GmbH, Dagstuhl
Publishing. All rights reserved.
- source
- Scopus
- year
- 2020
- url
- https://www.scopus.com/inward/record.uri?eid=2-s2.0-85091707856&doi=10.4230%2fOASIcs.SLATE.2020.4&partnerID=40&md5=3b2daa7d548eeed77224386d6790adc7
- author
- Simões, Alberto
- Sacanene, B.
- Iriarte, Alvaro
- Almeida, J.J.
- Macedo, J.
- eid
- 2
- chave
- Simões2020
- source
- Scopus
- abstract
- In this document we present the first developments on an Umbundu
dictionary for jSpell, a morphological analyzer. Initially some comments
are made regarding the Umbundu language morphology, followed by a
discussion of the jSpell dictionary structure and its environment. Last, we
describe the Umbundu dictionary bootstrap process and perform some final
experiments on its coverage. © 2020 Schloss Dagstuhl - Leibniz-Zentrum
für Informatik GmbH, Dagstuhl Publishing. All rights reserved.
- year
- 2020
- doi
- 10.4230/OASIcs.SLATE.2020.10
- url
- https://www.scopus.com/inward/record.uri?eid=2-s2.0-85091700212&doi=10.4230%2fOASIcs.SLATE.2020.10&partnerID=40&md5=26f3c0eacb3fc1ea35f005c08377b083
- title
- Towards a morphological analyzer for the Umbundu language
- tipo
- inproceedings
- number
- A10
- volume
- 83
- type
- Conference Paper
- journal
- OpenAccess Series in Informatics
- docpage
- jj.bib.dp.html#Simões2020
- chave
- Marcondes2020
- author
- Marcondes, F.S.
- Almeida, J.J.
- Novais, P.
- eid
- 2
- source
- Scopus
- abstract
- The username hints offered by most on-line social networks are
unpleasant for human beings, since they are mostly a simple name
variation followed by numbers. This paper shows that it is possible to
generate human-likable usernames through heuristics guided by structural
onomastics. The objective then is to conceive such heuristics and
check the availability of the generated usernames on Twitter, in order to
verify whether it is possible to generate a sufficiently big and available
username data-set that is able to justify the transition from unpleasant
to pleasant username suggestions. This paper finds that it is possible to
generate 8281 handles on average through the proposed heuristics and their
permutations; therefore, the number of possibilities is comfortable. This is
a partial account, since not all possibilities were explored and some
improvements are required, but it suffices as a proof of concept and
indicates paths forward. © 2020 CEUR-WS. All rights reserved.
- year
- 2020
- url
- https://www.scopus.com/inward/record.uri?eid=2-s2.0-85090898082&partnerID=40&md5=3bee224fddd1133fbeb306d5c88737fa
- title
- Structural onomatology for username generation: A partial account
- tipo
- inproceedings
- type
- Conference Paper
- docpage
- jj.bib.dp.html#Marcondes2020
- journal
- CEUR Workshop Proceedings
- volume
- 2655
- title
- A short survey on chatbot technology: Failure in raising the state of the art
- tipo
- inproceedings
- pages
- 28-36
- volume
- 1003
- docpage
- jj.bib.dp.html#Marcondes202028
- journal
- Advances in Intelligent Systems and Computing
- type
- Conference Paper
- eid
- 2
- author
- Marcondes, F.S.
- Almeida, J.J.
- Novais, P.
- chave
- Marcondes202028
- year
- 2020
- abstract
- This short survey aimed initially to explore the existing
state of the art for the application of chatbots to fighting (and not
spreading) fake news. It was then realized that it is not common to
use chatbots for this "virtuous" purpose. Therefore, after two surveys
and a meta-analysis, the topic had to be withdrawn since there were no
survey results to discuss besides the absence of results. The survey
result then raised a need to understand how chatbots are currently being
used and designed, and what their primary sources are. The result was once
again confusing since, in the sample: (1) no significant concentration of usage
could be found; (2) no widely adopted design strategies were identified;
and (3) no significant crosscutting references could be considered as primary
sources. Certainly, this can be due to a biased sample but may also be a
symptom of a methodological issue in chatbot research. If the second
possibility proves to be right, it means that chatbot research is still
in a pre-paradigm stage according to Kuhn's conception. For this paper,
4 surveys were performed with a total sample of 50 papers, mostly
from the last 3 years. © Springer Nature Switzerland AG 2020.
- source
- Scopus
- doi
- 10.1007/978-3-030-23887-2_4
- url
- https://www.scopus.com/inward/record.uri?eid=2-s2.0-85068602421&doi=10.1007%2f978-3-030-23887-2_4&partnerID=40&md5=cbf6fb00a51eb082aa7e1097f926fece
- type
- Conference Paper
- journal
- Advances in Intelligent Systems and Computing
- docpage
- jj.bib.dp.html#Marcondes2020170
- volume
- 1160 AISC
- pages
- 170-180
- tipo
- article
- title
- Fact-Check spreading behavior in Twitter: A qualitative profile for false-claim news
- url
- https://www.scopus.com/inward/record.uri?eid=2-s2.0-85086245198&doi=10.1007%2f978-3-030-45691-7_16&partnerID=40&md5=6547f11464462d6bfdb1505e6142b733
- doi
- 10.1007/978-3-030-45691-7_16
- source
- Scopus
- abstract
- Fact-check spread is usually performed by a plain tweet with
just the link. Since this is not typical human behavior, it may feel
uncanny, hinder the reader's attention and harm the counter-propaganda
influence. This paper presents a profile of fact-check link spread in
Twitter (suitable for TRL-1) and, as an additional outcome, proposes
a preliminary behavior design based on it (suitable for TRL-2). The
underlying hypothesis is that, by simulating human-like behavior, a bot gets
more attention and exerts more influence on its followers. © The Editor(s)
(if applicable) and The Author(s), under exclusive license to Springer
Nature Switzerland AG 2020.
- year
- 2020
- chave
- Marcondes2020170
- author
- Marcondes, F.S.
- Almeida, J.J.
- Durães, D.
- Novais, P.
- eid
- 2
- url
- https://www.scopus.com/inward/record.uri?eid=2-s2.0-85085513930&doi=10.1007%2f978-3-030-45688-7_14&partnerID=40&md5=d559e334a2140bea6ea02051264b73c4
- abstract
- Political debate - in its essence - carries a robust
emotional charge, and social media have become a vast arena for voters
to disseminate and discuss the ideas proposed by candidates. The
Brazilian presidential elections of 2018 were marked by a high level
of polarization, making the discussion of the candidates' ideas an
ideological battlefield, full of accusations and verbal aggression,
creating an excellent source for sentiment analysis. In this paper,
we analyze the emotions of the tweets posted about the presidential
candidates of Brazil on Twitter, so that it was possible to identify the
emotional profile of the adherents of each of the leading candidates,
and thus to discern which emotions had the strongest effects upon the
election results. Also, we created a model using sentiment analysis and
machine learning, which predicted with a correlation of 0.90 the final
result of the election. © 2020, The Editor(s) (if applicable) and The
Author(s), under exclusive license to Springer Nature Switzerland AG.
- source
- Scopus
- year
- 2020
- doi
- 10.1007/978-3-030-45688-7_14
- eid
- 2
- author
- Martins, R.
- Almeida, J.
- Henriques, P.
- Novais, P.
- chave
- Martins2020134
- volume
- 1159 AISC
- type
- Conference Paper
- journal
- Advances in Intelligent Systems and Computing
- docpage
- jj.bib.dp.html#Martins2020134
- pages
- 134-143
- tipo
- article
- title
- Predicting an Election's Outcome Using Sentiment Analysis
- chave
- Martins201861
- eid
- 2
- author
- Martins, R.
- Gomes, M.
- Almeida, J.J.
- Novais, P.
- Henriques, P.
- url
- https://www.scopus.com/inward/record.uri?eid=2-s2.0-85060849408&doi=10.1109%2fBRACIS.2018.00019&partnerID=40&md5=10284a22b511c161a903debd79e5619a
- doi
- 10.1109/BRACIS.2018.00019
- year
- 2018
- abstract
- In this paper, we examine methods to classify hate speech
in social media. We aim to establish lexical baselines for this task
by applying classification methods using a dataset annotated for this
purpose. As features, our system uses Natural Language Processing (NLP)
techniques in order to expand the original dataset with emotional
information and provide it for machine learning classification. We
obtain results of 80.56% accuracy in hate speech identification, which
represents an increase of almost 100% from the original analysis used
as a reference. © 2018 IEEE.
- source
- Scopus
- tipo
- inproceedings
- title
- Hate speech classification in social media using emotional analysis
- journal
- Proceedings - 2018 Brazilian Conference on Intelligent Systems, BRACIS 2018
- docpage
- jj.bib.dp.html#Martins201861
- type
- Conference Paper
- pages
- 61-66
- number
- 8575590