- year
- 1987
- number
- 1
- journal
- Revista de Informática
- docpage
- jj.bib.dp.html#velharia1
- volume
- 6
- title
- Descrição de um Núcleo Gráfico e Aplicação em {CAD
- chave
- velharia1
- tipo
- article
- author
-
- note
- (KGUM - kernel gráfico U.Minho)
- journal
- Revista de Informática
- docpage
- jj.bib.dp.html#velharia2
- volume
- 9
- year
- 1988
- number
- 6
- chave
- velharia2
- tipo
- article
- author
-
- C. Ferreira
- F. Ferreira
- F. Martins
- J.J. Almeida
- L. Barbosa
- title
- Sistemas de Programação Modular
- title
- Mecanismos para Especificação e Prototipagem de Interfaces
Utilizador-Sistema
- note
- (Gramáticas Interactivas guardadas)
- author
-
- F. Mário Martins
- J.J. Almeida
- P.R. Henriques
- tipo
- inproceedings
- chave
- graminteractivas1990
- year
- 1990
- address
- Coimbra
- booktitle
- 3$º$ Encontro Português de Computação Gráfica
- docpage
- jj.bib.dp.html#graminteractivas1990
- year
- 1988
- docpage
- jj.bib.dp.html#tlc89
- type
- Texto didáctico
- title
- Teoria das Linguagens
- keyword
-
- institution
- Universidade do Minho, Departamento de Informática
- chave
- tlc89
- tipo
- techreport
- author
-
- title
- Estruturas de Dados
- keyword
-
- institution
- Universidade do Minho, Departamento de Informática
- chave
- estruturasdedados90
- tipo
- techreport
- author
-
- year
- 1990
- docpage
- jj.bib.dp.html#estruturasdedados90
- type
- Texto didáctico
- title
- \textsc{Camila} - A Platform for Software Mathematical Development
- tipo
- techreport
- docpage
- jj.bib.dp.html#Camila
- type
- (Páginas do projecto)
- keyword
-
- chave
- Camila
- institution
- Universidade do Minho, Departamento de Informática
- author
-
- year
- 1998
- url
- http://camila.di.uminho.pt
- editor
- L.S. Barbosa and J.J. Almeida and J.N. Oliveira and Luís Neves
- title
- {Natura} - Natural language processing
- tipo
- techreport
- note
- \url{http://natura.di.uminho.pt/}
- docpage
- jj.bib.dp.html#Natura
- type
- (Páginas do projecto)
- keyword
-
- institution
- Universidade do Minho, Departamento de Informática
- chave
- Natura
- author
-
- year
- 1997
- url
- http://natura.di.uminho.pt/
- author
-
- institution
- Universidade do Minho, Departamento de Informática
- chave
- PDavid
- keyword
-
- editor
- J.C. Ramalho and J.J. Almeida and P.R. Henriques
- url
- http://www.di.uminho.pt/~jcr/projectos/david/princ.html
- year
- 1998
- tipo
- techreport
- note
- \url{http://www.di.uminho.pt/~jcr/projectos/david/princ.html}
- title
- David -- Processamento estruturado de documentos
- docpage
- jj.bib.dp.html#PDavid
- type
- (Páginas do projecto)
- chave
- nllex
- tipo
- misc
- author
-
- title
- NLlex -- Natural Language LEX
- keyword
-
- lexical analysis
- Natura
- lex
- misc
- url
- http://natura.di.uminho.pt/~jj/pln/pln.html#nllex
- docpage
- jj.bib.dp.html#nllex
- type
- tool
- year
- 1996
- year
- 1997
- type
- tool
- docpage
- jj.bib.dp.html#jspell
- url
- http://natura.di.uminho.pt/~jj/pln/pln.html#jspell
- keyword
-
- lexical analysis
- Natura
- morphology
- misc
- title
- Jspell a module for morphological analyser for natural language
- author
-
- J.J. Almeida
- Ulisses Pinto
- tipo
- misc
- chave
- jspell
- docpage
- jj.bib.dp.html#jspell1
- type
- Manual
- tipo
- techreport
- title
- Manual de Utilizador do {JSpell}
- url
- http://natura.di.uminho.pt/~jj/pln/jspellman.ps.gz
- year
- 1994
- abstract
-
- month
- Jul
- chave
- jspell1
- institution
- Universidade do Minho, Departamento de Informática
- author
-
- J.J. Almeida
- Ulisses Pinto
- keyword
-
- morphology
- lexical analysis
- jspell
- techreport
- year
- 1994
- url
- http://natura.di.uminho.pt/~jj/pln/yalg3.ps.gz
- docpage
- jj.bib.dp.html#Almeida94b
- editor
- Carlos Martin Vide
- booktitle
- Actas del X Congreso de Lenguajes Naturales y Lenguajes Formales, Sevilla
- title
- {GPC} -- a Tool for higher-order grammar specification
- keyword
-
- DCG
- grammar
- inproceedings
- chave
- Almeida94b
- tipo
- inproceedings
- author
-
- title
- {YaLG} -- extending {DCG} for natural language processing
- tipo
- inproceedings
- pages
- 621--628
- docpage
- jj.bib.dp.html#Almeida95a
- booktitle
- Actas del XI Congreso de Lenguajes Naturales y Lenguajes Formales, Tortosa
- keyword
-
- jspell
- morphology
- PLN
- DCG
- nllex
- inproceedings
- chave
- Almeida95a
- author
-
- year
- 1995
- url
- http://natura.di.uminho.pt/~jj/pln/yalg.ps.gz
- editor
- Carlos Martin Vide
- title
- Jspell -- um módulo para análise léxica genérica de linguagem natural
- tipo
- inproceedings
- pages
- 1--15
- booktitle
- Actas do X Encontro da Associação Portuguesa de Linguística
- docpage
- jj.bib.dp.html#Almeida94c
- keyword
-
- jspell
- morphology
- PLN
- perl
- inproceedings
- author
-
- J.J. Almeida
- Ulisses Pinto
- chave
- Almeida94c
- year
- 1995
- address
- Évora 1994
- url
- http://natura.di.uminho.pt/~jj/pln/jspell1.ps.gz
- tipo
- inproceedings
- author
-
- chave
- Almeida94a
- keyword
-
- librarian studies
- WWW
- WAIS
- IR
- inproceedings
- title
- Documents in an Informatic Academic environment
- docpage
- jj.bib.dp.html#Almeida94a
- booktitle
- Congresso Nacional de Bibliotecários, Arquivistas e
Documentalistas
- year
- 1994
- address
- Lisboa
- number
- UM-DI-95.04
- year
- 1995
- url
- http://natura.di.uminho.pt/~jj/pln/nllex.ps.gz
- docpage
- jj.bib.dp.html#jj95
- title
- {NLlex} -- a tool to generate lexical analysers for natural language
- keyword
-
- jspell
- morphology
- lex
- PLN
- nllex
- techreport
- institution
- Universidade do Minho, Departamento de Informática
- chave
- jj95
- author
-
- tipo
- techreport
- url
- http://www.di.uminho.pt/~lsb/pub_camila/LNcam.ps.gz
- year
- 1995
- institution
- University of Minho
- chave
- Barbosa95
- author
-
- L.S. Barbosa
- J.J. Almeida
- keyword
-
- Camila
- formal specification
- techreport
- docpage
- jj.bib.dp.html#Barbosa95
- number
- DI-CAM-95:11:1
- note
- Lecture notes for the System Design Course,
Computer System Engineering, University of Bristol
- tipo
- techreport
- title
- System Prototyping in \textsc{Camila}
- year
- 1995
- number
- DI-CAM-95:11:2
- url
- http://www.di.uminho.pt/~lsb/pub_camila/RMcam.ps.gz
- docpage
- jj.bib.dp.html#Barbosa95a
- keyword
-
- title
- \textsc{Camila}: A reference Manual
- tipo
- techreport
- author
-
- L.S. Barbosa
- J.J. Almeida
- chave
- Barbosa95a
- institution
- University of Minho
- number
- DI-CAM-95:11:1:v98
- year
- 1998
- type
- {Lecture Notes for the Bristol Course (1st ed. 1995)}
- docpage
- jj.bib.dp.html#BA97a
- keyword
-
- Formal Methods
- Prototyping
- Camila
- techreport
- title
- Systems Prototyping in \textsc{Camila}
- author
-
- L.S. Barbosa
- J.J. Almeida
- tipo
- techreport
- institution
- DI (U. Minho)
- chave
- BA97a
- number
- DI-CAM-95:7:1
- year
- 1995
- url
- http://www.di.uminho.pt/~lsb/pub_camila/romantic.ps.gz
- docpage
- jj.bib.dp.html#Barbosa95b
- keyword
-
- Camila
- formal specification
- didactics
- techreport
- title
- Growing Up With \textsc{Camila}
- author
-
- L.S. Barbosa
- J.J. Almeida
- tipo
- techreport
- institution
- Universidade do Minho, Departamento de Informática
- chave
- Barbosa95b
- chave
- Almeida96a
- author
-
- keyword
-
- perl
- morphology
- lexical analysis
- dictionary
- inproceedings
- url
- http://natura.di.uminho.pt/~jj/pln/etdic.ps.gz
- address
- Lisboa 1995
- year
- 1996
- tipo
- inproceedings
- title
- Especificação e tratamento de Dicionários
- booktitle
- Actas do XI Encontro da Associação Portuguesa de Linguística
- docpage
- jj.bib.dp.html#Almeida96a
- volume
- 2
- docpage
- jj.bib.dp.html#Ulisses96
- booktitle
- Actas do XI Encontro da Associação Portuguesa de Linguística
- volume
- 2
- tipo
- inproceedings
- title
- Tratamento automático de termos compostos
- url
- http://natura.di.uminho.pt/~jj/pln/ptc.ps.gz
- address
- Lisboa 1995
- year
- 1996
- chave
- Ulisses96
- author
-
- Ulisses Pinto
- J.J. Almeida
- keyword
-
- jspell
- morphology
- lexical analysis
- PLN
- inproceedings
- year
- 1996
- booktitle
- II International Conference on Mathematical Linguistics, Tarragona, Spain
- url
- http://natura.di.uminho.pt/~jj/pln/yalg2.ps.gz
- docpage
- jj.bib.dp.html#Almeida96b
- title
- {YaLG} a tool for higher-order grammar specification
- keyword
-
- yalg
- DCG
- RS
- inproceedings
- chave
- Almeida96b
- author
-
- tipo
- inproceedings
- url
- http://natura.di.uminho.pt/~jj/pln/nllex2.ps.gz
- month
- Sep
- year
- 1996
- chave
- jj96
- author
-
- keyword
-
- jspell
- morphology
- lex
- PLN
- nllex
- article
- journal
- Procesamiento del Lenguaje Natural
- docpage
- jj.bib.dp.html#jj96
- volume
- 19
- pages
- 81--90
- publisher
- Sociedad Española para el Procesamiento del Lenguaje Natural
- tipo
- article
- title
- {NLlex} -- a tool to generate lexical analysers for natural language
- year
- 1997
- month
- Dec.
- address
- Washington D.C. - USA
- docpage
- jj.bib.dp.html#SGML97
- booktitle
- SGML/XML'97 Conference
- keyword
-
- PDavid
- SGML
- Semantics
- inproceedings
- title
- SGML Documents: where does quality go?
- tipo
- inproceedings
- author
-
- J.C. Ramalho
- J.G. Rocha
- J.J. Almeida
- P.R. Henriques
- chave
- SGML97
- title
- Programação de dicionários
- tipo
- inproceedings
- pages
- 21--28
- docpage
- jj.bib.dp.html#Almeida98
- booktitle
- Actas do XIII Encontro da Associação Portuguesa de Linguística
- volume
- 1
- keyword
-
- perl
- morphology
- dictionary
- parser
- inproceedings
- chave
- Almeida98
- author
-
- address
- Lisboa 1997
- year
- 1998
- url
- http://natura.di.uminho.pt/~jj/bib/progDic.ps.gz
- title
- Etiquetador morfo-sintáctico para o Português
- keyword
-
- chave
- Reis98
- author
-
- Ricardo Reis
- J.J. Almeida
- tipo
- inproceedings
- address
- Lisboa 1997
- year
- 1998
- booktitle
- Actas do XIII Encontro da Associação Portuguesa de Linguística
- docpage
- jj.bib.dp.html#Reis98
- url
- http://natura.di.uminho.pt/~jj/bib/etiquetador2.ps.gz
- keyword
-
- Camila
- formal specification
- inproceedings
- author
-
- J.J. Almeida
- Barbosa, L.S.
- Neves, F.L.
- Oliveira, J.N.
- chave
- ABNO97a
- month
- October
- year
- 1997
- address
- La Plata, Argentina
- url
- http://camila.di.uminho.pt/camila-doc/CLaPF97.ps.gz
- editor
- De Giusti, A. and Diaz, J. and Pesado, P.
- title
- \textsc{Camila}: Formal Software Engineering Supported by Functional Programming
- tipo
- inproceedings
- pages
- 1343--1358
- booktitle
- Proc. II Conf. Latino Americana de Programación Funcional ({CLaPF97})
- docpage
- jj.bib.dp.html#ABNO97a
- author
-
- J.J. Almeida
- Barbosa, L.S.
- Neves, F.L.
- Oliveira, J.N.
- chave
- ABNO97b
- keyword
-
- Camila
- formal specification
- inproceedings
- editor
- Johnson, M.
- month
- December
- year
- 1997
- address
- Sydney, Australia
- publisher
- Springer Lect. Notes Comp. Sci. (1349)
- tipo
- inproceedings
- title
- \textsc{Camila}: Prototyping and Refinement of Constructive Specifications
- booktitle
- 6th International Conference on Algebraic Methods and Software Technology ({AMAST'97})
- docpage
- jj.bib.dp.html#ABNO97b
- pages
- 554--559
- docpage
- jj.bib.dp.html#AH97
- booktitle
- Proc. II Conference on Knowledge-based Intelligent Electronic Systems ({Kes98})
- title
- Dynamic Dictionary = cooperative information sources
- tipo
- inproceedings
- address
- Australia
- year
- 1998
- month
- April
- url
- http://natura.di.uminho.pt/~jj/bib/agentes97.ps.gz
- keyword
-
- dictionary
- Agentes
- inproceedings
- chave
- AH97
- author
-
- J.J. Almeida
- P.R. Henriques
- title
- Adapting Museum Structures for the Web: No Changes Needed!
- chave
- museums98
- note
- Toronto - Canadá
- author
-
- J.G. Rocha
- M.R. Henriques
- J.C. Ramalho
- J.J. Almeida
- J.L. Faria
- P.R. Henriques
- tipo
- inproceedings
- year
- 1998
- booktitle
- Museums and the Web 1998
- docpage
- jj.bib.dp.html#museums98
- tipo
- inproceedings
- author
-
- Almeida, J.J.
- Barbosa, L.S.
- Barros, J.B.
- Neves, L.F.
- publisher
- Proc. 3rd Summer School on Advan. Funct. Prog., Braga
- chave
- ABBN98
- title
- On The Development of \textsc{Camila}
- editor
- L.S. Barbosa and J.A. Saraiva
- docpage
- jj.bib.dp.html#ABBN98
- booktitle
- Workshop on Research Themes on Functional Programming
- year
- 1998
- month
- 18 Sep.
- docpage
- jj.bib.dp.html#Gis99
- booktitle
- Conferência da Association of Geographic Information
Laboratories for Europe (AGILE)
- address
- Roma
- year
- 1999
- chave
- Gis99
- tipo
- inproceedings
- author
-
- Jorge Rocha
- Ana Silva
- Ricardo Henriques
- J.J. Almeida
- Pedro Henriques
- title
- Formal Methods for {GI
- keyword
-
- author
-
- Jorge Rocha
- Tiago Pedroso
- J.J. Almeida
- tipo
- inproceedings
- chave
- RPA99
- keyword
-
- GIS
- XML
- mapit
- inproceedings
- title
- {MAPit
- booktitle
- Conferência da Association of Geographic Information
Laboratories for Europe (AGILE)
- docpage
- jj.bib.dp.html#RPA99
- year
- 1999
- address
- Roma
- year
- 1999
- keyword
-
- title
- Sobre a Utilização de Metodologias Formais no Desenvolvimento de
{SIG
- tipo
- inproceedings
- author
-
- Jorge Gustavo Rocha
- Ana Silva
- J.J. Almeida
- Mario Ricardo Henriques
- Pedro Rangel Henriques
- docpage
- jj.bib.dp.html#RSea99
- booktitle
- GISBRASIL'99, Salvador
- chave
- RSea99
- chave
- xmldt99
- tipo
- inproceedings
- author
-
- J.J. Almeida
- José Carlos Ramalho
- title
- {XML::DT
- keyword
-
- docpage
- jj.bib.dp.html#xmldt99
- booktitle
- XML-Europe'99, Granada - Espanha
- year
- 1999
- month
- May
- year
- 1999
- chave
- RRAH99
- author
-
- J.C. Ramalho
- J.G. Rocha
- J.J. Almeida
- P.R. Henriques
- keyword
-
- docpage
- jj.bib.dp.html#RRAH99
- journal
- Markup Languages: theory and practice
- pages
- 75--90
- volume
- 1
- publisher
- MIT Press
- tipo
- article
- title
- SGML documents: Where does quality go?
- author
-
- L.S. Barbosa
- J.B. Barros
- J.J. Almeida
- tipo
- inproceedings
- chave
- Barbosa2000
- keyword
-
- title
- Polytypic Recursion Patterns
- booktitle
- {SBLP'2000} (to appear as a ENTCS volume)
- docpage
- jj.bib.dp.html#Barbosa2000
- month
- May
- year
- 2000
- address
- {UFP}, Recife, Brasil
- title
- Smallbook -- comando para produção de livros em pequena escala
- keyword
-
- publishing
- latex
- smallbook
- inproceedings
- chave
- jj2001x
- tipo
- inproceedings
- author
-
- address
- Braga
- pages
- 445--450
- year
- 2000
- docpage
- jj.bib.dp.html#jj2001x
- booktitle
- Actas da II Conferência Internacional de Tecnologias de
Informação e Comunicação na Educação
- chave
- speaker:sepln2001
- author
-
- J.J. Almeida
- A. M. Simões
- keyword
-
- address
- Sevilha
- month
- Sep.
- year
- 2001
- publisher
- Sociedad Española para el Procesamiento del Lenguaje Natural
- tipo
- article
- title
- Text to speech -- a rewriting system approach
- docpage
- jj.bib.dp.html#speaker:sepln2001
- journal
- Procesamiento del Lenguaje Natural
- volume
- 27
- pages
- 247--255
- month
- May
- year
- 2001
- address
- Porto
- booktitle
- Congresso Nacional de Bibliotecários, Arquivistas e
Documentalistas
- url
- http://natura.di.uminho.pt/~jj/bib/museuDaPessoa2001.ps.gz
- docpage
- jj.bib.dp.html#mp2001
- title
- {Museu da Pessoa
- author
-
- J.J. Almeida
- J. Gustavo Rocha
- P. Rangel Henriques
- Sónia Moreira
- Alberto Simões
- tipo
- inproceedings
- chave
- mp2001
- tipo
- inproceedings
- author
-
- J.J. Almeida
- P. Rangel Henriques
- J. Gustavo Rocha
- Alberto Simões
- chave
- alfarrabio2001
- title
- Alfarrábio: Adding value to an Heterogeneous Site Collection
- docpage
- jj.bib.dp.html#alfarrabio2001
- url
- http://natura.di.uminho.pt/~jj/bib/alfarrabio2001.ps.gz
- booktitle
- Congresso Nacional de Bibliotecários, Arquivistas e
Documentalistas
- year
- 2001
- month
- May
- address
- Porto
- title
- Cálculo de frequências de
palavras para entradas de dicionários através do uso conjunto de analisadores
morfológicos, taggers e corpora
- tipo
- inproceedings
- pages
- 407--418
- booktitle
- Actas do XVII Encontro da Associação Portuguesa de Linguística
- docpage
- jj.bib.dp.html#freq2002
- author
-
- Paulo A. Rocha
- Alberto M. Simões
- J.J. Almeida
- chave
- freq2002
- abstract
- In this document we present a possible approach to
extracting word frequencies from corpora, based on the
cooperative use of several natural language processing
tools.
- year
- 2002
- address
- Lisboa 2001
- url
- http://natura.di.uminho.pt/~jj/bib/apl:freqnormpt.ps.gz
- address
- Lisboa 2001
- pages
- 485--495
- year
- 2002
- abstract
- In this document we present the features of the
jspell morphological analyser and their consequences
for natural language processing applications. Because
the tool is often integrated into more specific
software, we present a Perl module developed to make it
easier to connect the morphological analyser to small
applications written in scripting languages. Given the
constant need to improve dictionaries, and morphological
analysers in particular, we discuss the properties they
should have to ease their automatic processing and
enrichment.
- docpage
- jj.bib.dp.html#jspell2002
- booktitle
- Actas do XVII Encontro da Associação Portuguesa de Linguística
- title
- Jspell.pm -- um módulo de análise morfológica
para uso em processamento de linguagem natural
- chave
- jspell2002
- tipo
- inproceedings
- author
-
- Alberto M. Simões
- J.J. Almeida
- chave
- dag2002
- author
-
- Alberto M. Simões
- J.J. Almeida
- Pedro R. Henriques
- tipo
- inproceedings
- title
- Directory Attribute Grammars
- booktitle
- VI Simpósio Brasileiro de Linguagens de Programação
- docpage
- jj.bib.dp.html#dag2002
- pages
- 297--308
- address
- Rio de Janeiro, Brasil
- year
- 2002
- booktitle
- Elpub 2002 -- International Conference on Electronic Publishing
- docpage
- jj.bib.dp.html#elpub2002
- month
- Nov.
- abstract
-
In recent years the number of digital documents has increased
dramatically. Unfortunately, the same has not happened with the
structure and organization of the information. Traditionally, we
build a digital library using a catalog with documents'
meta-information, including a conceptual classification and an
ontology of concepts.
In this document we present a set of modules to help in the task of
building and maintaining a digital library. It includes a module to
work with ontologies, a set of modules to handle specific catalog
formats (like Bib\TeX), a module to define new catalog formats
and a tool to integrate ontologies and multi-format
catalogs in a web browse-able knowledge-base.
- year
- 2002
- pages
- 203--211
- address
- Karlovy Vary, República Checa
- author
-
- Alberto M. Simões
- J.J. Almeida
- tipo
- inproceedings
- chave
- elpub2002
- isbn
- 3-89700-357-0
- title
- Library::* -- a toolkit for digital libraries
- month
- Sep.
- abstract
-
Multilingual resources are useful for linguistic studies, translation,
and many other tasks. Unfortunately, these resources are difficult to obtain
and organize.
In this document we describe a set of tools designed to help in the
task of mining bilingual resources from the web, from a specific site,
from a file system, from a list of URLs, or from a translation memory.
As a design goal, we intend to build tools that can be used both
cooperatively (in a pipeline) and independently.
- year
- 2002
- chave
- parguess2002
- author
-
- J.J. Almeida
- Alberto M. Simões
- J. Alves de Castro
- pages
- 13--20
- journal
- Procesamiento del Lenguaje Natural
- docpage
- jj.bib.dp.html#parguess2002
- volume
- 29
- title
- Grabbing parallel corpora from the web
- publisher
- Sociedad Española para el Procesamiento del Lenguaje Natural
- tipo
- article
- number
- 3
- journal
- The Perl Review
- docpage
- jj.bib.dp.html#cP
- volume
- 0
- title
- Cooking Perl with flex
- tipo
- article
- year
- 2002
- month
- May
- abstract
-
There are a lot of tools for parser generation using Perl. As we
know, Perl has flexible data structures which makes it easy to
generate generic trees. While it is easy to write a grammar and a
lexical analyzer using modules like \texttt{Parse::Yapp
- chave
- cP
- author
-
- docpage
- jj.bib.dp.html#APL2k2.Parguess
- booktitle
- Actas do XVIII Encontro da Associação Portuguesa de Linguística
- title
- Extracção de corpora paralelo a partir da web: construção e
disponibilização
- tipo
- inproceedings
- lang
- PT
- year
- 2003
- abstract
-
This document describes a set of tools built for the automatic
extraction of bilingual resources from the Web,
from a \emph{site
- address
- Porto 2002
- url
- http://alfarrabio.di.uminho.pt/~albie/publications/APL2k2.Parguess.pdf
- author
-
- J.J. Almeida
- Alberto Manuel Simões
- José Alves Castro
- chave
- APL2k2.Parguess
- booktitle
- Actas do XVIII Encontro da Associação Portuguesa de Linguística
- docpage
- jj.bib.dp.html#APL2k2.Synthesis
- title
- Geração de voz com sotaque
- tipo
- inproceedings
- lang
- PT
- abstract
-
As is well known, accents may be tied to a geographic area or to a
social group, and may even be a personal trait. Their study and
description has interested many researchers, although that study
has usually been carried out rather informally.
In the work reported here, we tried to describe accents and speech
dysfunctions formally by creating rules to be integrated as
variants into a speech generator. The aim was to create an
environment for experimenting with the models built to describe
some characteristics of certain accents or dysfunctions, so that
they can be validated.
We found that certain dysfunctions and accents can be obtained
easily by simply adding optional rules at certain stages of speech
generation. Others appear to be harder, either because we do not
know the phenomena involved well enough or because they involve
greater prosodic complexity.
- year
- 2003
- address
- Porto 2002
- url
- http://alfarrabio.di.uminho.pt/~albie/publications/APL2k2.Synthesis.pdf
- author
-
- J.J. Almeida
- Alberto Manuel Simões
- chave
- APL2k2.Synthesis
- tipo
- inproceedings
- lang
- PT
- title
- Engenharia reversa de {HTML} usando tecnologia {XML}
- docpage
- jj.bib.dp.html#xata:xmldt
- booktitle
- {XATA --- XML}, Aplicações e Tecnologias Associadas
- author
-
- J.J. Almeida
- Alberto Manuel Simões
- chave
- xata:xmldt
- irreditor
- José Carlos Ramalho
- url
- http://alfarrabio.di.uminho.pt/~albie/publications/xata2003xml.pdf
- year
- 2003
- abstract
- The proliferation of HTML authoring tools and the
appearance-driven use of HTML have been ruining its
conceptual side. This problem was recognised and
gave rise to several formats and technologies aimed
at separating appearance from concept. Nevertheless,
there is currently a huge number of HTML pages with
very poor conceptual and structural readability,
which rules out a range of possible uses of the
information they contain. This paper presents work
(at an early stage) that aims to reverse-engineer
HTML to increase its accessibility, so that it can
be used in a \emph{browser
- chave
- xata:museudapessoa
- author
-
- Alberto Manuel Simões
- J.J. Almeida
- strutural
- M
- url
- http://alfarrabio.di.uminho.pt/~albie/publications/xata2003mp.pdf
- editor
- José Carlos Ramalho
- abstract
-
This article presents the current architecture of the Museu da Pessoa,
covering how documents are being edited, catalogued, archived, and
processed to create the structures the Museum needs.
- year
- 2003
- lang
- PT
- tipo
- inproceedings
- title
- {H
- booktitle
- {XATA --- XML}, Aplicações e Tecnologias Associadas
- docpage
- jj.bib.dp.html#xata:museudapessoa
- pages
- 288--298
- booktitle
- ElPub 2003 -- International conference on electronic publishing
- docpage
- jj.bib.dp.html#elpub2003
- title
- Music publishing
- publisher
- Universidade do Minho
- note
- Guimarães
- tipo
- inproceedings
- lang
- EN
- isbn
- 972-98921-2-1
- abstract
-
Current music publishing on the Internet is mainly concerned with
sound publishing. We claim that music publishing means not only making
sound available but also defining relations between a set of music
objects like music scores, guitar chords, lyrics and their
meta-data. We want an easy way to publish music on the Internet, to
make high-quality paper booklets and even to create audio CDs.
In this document we present a workbench for music publishing based
on open formats, using open-source tools and script programming over
them. The workbench is based on an archive specification written in
a text-based format which includes sound references, music scores,
chords and lyrics and their meta-information.
- month
- June
- year
- 2003
- editor
- Sely Costa et al.
- url
- http://alfarrabio.di.uminho.pt/~albie/publications/elpub2003.pdf
- keyword
-
- música
- bibliotecas digitais
- inproceedings
- author
-
- Alberto Manuel Simões
- J.J. Almeida
- chave
- elpub2003
- abstract
- The main goals of the TerminUM project are the study
of, experimentation with, and creation of resources in
the area of parallel corpora, (descriptive)
terminology, and corpus-related multilingual
resources: extracting corpora from the web as
automatically as possible; extracting dictionaries,
terminology, and other translation-related resources;
creating and interconnecting the tools developed; and
creating and making available (1) lists of bitexts,
corpora, and parallel corpora, (2) tools for creating
and transforming corpora, and (3) multilingual
resources derived from or linked to corpora. This
presentation covers some tasks currently under way in
the project, namely: the life cycle of corpus
construction and transformation; a summary of the
tools developed (and under development); the
construction of parallel corpora from film subtitles,
internationalisation files (.po software messages),
and translation memory files (TMX); and making
parallel corpora available on the web (building
query engines with various tools).
- month
- Jun.
- year
- 2003
- url
- http://alfarrabio.di.uminho.pt/~albie/publications/cp3a2003-terminum.pdf
- keyword
-
- terminum
- parallel corpora
- inproceedings
- author
-
- J.J. Almeida
- Alberto Simões
- José Castro
- Bruno
Martins
- Paulo Silva
- chave
- cp3a:terminum2003
- pages
- 7--14
- booktitle
- CP3A 2003 -- Workshop em Corpora Paralelos: aplicações e
algoritmos associados
- docpage
- jj.bib.dp.html#cp3a:terminum2003
- title
- Projecto {TerminUM}
- publisher
- Universidade do Minho
- note
- Braga
- tipo
- inproceedings
- month
- Jun.
- year
- 2003
- pages
- 65--70
- booktitle
- CP3A 2003 -- Workshop em Corpora Paralelos: aplicações e
algoritmos associados
- docpage
- jj.bib.dp.html#cp3a:kvec2003
- keyword
-
- kvec
- terminum
- parallel corpora
- word alignment
- inproceedings
- title
- {Lingua-Biterm}: um módulo Perl para extracção de terminologia bilingue
- publisher
- Universidade do Minho
- note
- Braga
- author
-
- tipo
- inproceedings
- chave
- cp3a:kvec2003
- chave
- cp3a:natools2003
- note
- Braga
- publisher
- Universidade do Minho
- author
-
- tipo
- inproceedings
- title
- Alinhamento de corpora paralelos
- keyword
-
- natools
- terminum
- parallel corpora
- word alignment
- inproceedings
- booktitle
- CP3A 2003 -- Workshop em Corpora Paralelos: aplicações e
algoritmos associados
- docpage
- jj.bib.dp.html#cp3a:natools2003
- pages
- 71--77
- month
- Jun.
- year
- 2003
- volume
- 31
- docpage
- jj.bib.dp.html#sepln2003
- journal
- Procesamiento del Lenguaje Natural
- pages
- 217--224
- tipo
- article
- publisher
- Sociedad Española para el Procesamiento del Lenguaje Natural
- title
- {NATools} -- A Statistical Word Aligner Workbench
- year
- 2003
- month
- Sep.
- abstract
- This document presents the TerminUM project and the
work done in its statistical word aligner workbench (NATools). It
shows a variety of alignment methods for parallel corpora and
discusses the resulting terminological dictionaries and their use:
evaluation of sentence translations; construction of a multi-level
navigation system for linguistic studies or statistical
translations.
- author
-
- Alberto M. Simões
- J.J. Almeida
- chave
- sepln2003
- keyword
-
- natools
- terminum
- parallel corpora
- word alignment
- article
- author
-
- José João Dias de Almeida
- chave
- tesejj
- url
- http://natura.di.uminho.pt/~jj/bib/tesejj.pdf
- year
- 2003
- tipo
- phdthesis
- lang
- PT
- title
- Dicionários dinâmicos multi-fonte
- docpage
- jj.bib.dp.html#tesejj
- school
- Universidade do Minho
- type
- Tese de Doutoramento
- superviser
- Pedro Rangel Henriques
- year
- 2004
- url
- http://alfarrabio.di.uminho.pt/~albie/publications/msc.pdf
- chave
- teseambs
- author
-
- Alberto Manuel Brandão Simões
- docpage
- jj.bib.dp.html#teseambs
- superviser
- José João Almeida and Pedro Rangel Henriques
- type
- Tese de Mestrado
- school
- Escola de Engenharia - Universidade do Minho
- title
- Parallel Corpora word alignment and applications
- lang
- EN
- tipo
- mastersthesis
- title
- {TX
- lang
- PT
- isbn
- 972-99166-0-8
- tipo
- inproceedings
- pages
- 217--224
- booktitle
- {XATA 2004
- docpage
- jj.bib.dp.html#xata04:tx
- irreditor
- José Carlos Ramalho and Alberto Simões
- chave
- xata04:tx
- author
-
- José João Almeida
- Alberto Simões
- month
- February
- abstract
-
Ever since the advent of SGML, and later XML, document validation
has been a focus of attention.
This validation arose to analyse the structure of SGML and XML
documents using DTDs. In addition, and because of XML's restrictions
relative to SGML, validation of well-formed XML has also been
used. More recently, Schema and Schematron made validation possible
at a higher level: not only the structure of the document but
also some content validation.
In this article we present the TX tool, which targets another level of
validation, where types can be richer and/or computed
dynamically, and where functions can be defined to annotate and/or
correct the parts of the document that do not follow the specifications.
- year
- 2004
- url
- http://alfarrabio.di.uminho.pt/~albie/publications/xata04-tx.pdf
- year
- 2004
- month
- February
- abstract
-
This document presents the concept of distributed translation
memories, discussing their relevance to the translation field as well
as the advantages a translation tool can gain from using
them.
A possible implementation of distributed translation memories is
presented, using WebServices in a cooperative architecture.
The messages (API) that such a service must implement are
defined, so that a translation tool can take advantage of
collaboration between translators.
- url
- http://alfarrabio.di.uminho.pt/~albie/publications/xata04-mtd.pdf
- irreditor
- José Carlos Ramalho and Alberto Simões
- author
-
- Alberto Simões
- José João Almeida
- Xavier Gomez
Guinovart
- chave
- xata04:mtd
- pages
- 59--68
- docpage
- jj.bib.dp.html#xata04:mtd
- booktitle
- {XATA 2004
- title
- Memórias de Tradução Distribuídas
- tipo
- inproceedings
- isbn
- 972-99166-0-8
- lang
- PT
- year
- 2004
- number
- 1
- volume
- 1
- docpage
- jj.bib.dp.html#xmldt2
- journal
- The Perl Review
- title
- {XML::DT
- tipo
- article
- author
-
- chave
- xmldt2
- tipo
- article
- publisher
- Sociedad Española para el Procesamiento del Lenguaje Natural
- lang
- EN
- title
- Distributed Translation Memories implementation using WebServices
- volume
- 33
- docpage
- jj.bib.dp.html#sepln2004
- journal
- Procesamiento del Lenguaje Natural
- pages
- 89--94
- author
-
- Alberto Simões
- Xavier Gómez Guinovart
- J.J. Almeida
- chave
- sepln2004
- keyword
-
- TMs
- MT
- distributed translation memories
- WebServices
- CAT
- article
- url
- http://alfarrabio.di.uminho.pt/~albie/publications/dtm-sepln.pdf
- year
- 2004
- abstract
- Translation Memories are very useful for translators
but are difficult to share and reuse in a community of translators.
This article presents the concept of Distributed Translation
Memories, where all users can contribute and share translations.
Implementation details using WebServices are shown, as well as an
example of a distributed system between Portugal and Spain.
- month
- July
- lang
- EN
- tipo
- inproceedings
- title
- Linguateca: um centro de recursos distribuído para o processamento
computacional da língua portuguesa
- booktitle
- Workshop on Linguistic Tools and Resources for Spanish and
Portuguese
- docpage
- jj.bib.dp.html#linguateca
- pages
- 147--154
- chave
- linguateca
- author
-
- Diana Santos
- Alberto Simões
- Ana Frankenberg-Garcia
- Ana Pinto
- Anabela Barreiro
- Belinda Maia
- Cristina Mota
- Débora Oliveira
- Eckhard Bick
- Elisabete Ranchhod
- J.J. Almeida
- Luís Cabral
- Luís Costa
- Luís Sarmento
- Marcirio Chaves
- Nuno Cardoso
- Paulo Rocha
- Rachel Aires
- Rosário Silva
- Rui Vilela
- Susana Afonso
- editor
- IBERAMIA 2004
- url
- http://alfarrabio.di.uminho.pt/~albie/publications/linguateca.pdf
- address
- Puebla, México
- abstract
-
In this article we present an overview of Linguateca's activity in creating
and making available resources and tools for the Portuguese language. We begin
with a description of Linguateca's goals and assumptions and a brief history
of its intervention, and conclude with some considerations on the best way
to proceed in organizing the field.
- year
- 2004
- tipo
- inproceedings
- author
-
- Rui Vilela
- Alberto Simões
- Eckhard Bick
- J.J. Almeida
- publisher
- Departamento de Informática, Universidade do Minho
- location
- Braga
- chave
- xata05:fs
- keyword
-
- XML
- Floresta Sintáctica
- tigerXML
- Lingua::PT::Dirty
- inproceedings
- title
- Representação em {XML} da {F}loresta {S}intáctica
- irreditor
- José Carlos Ramalho and Alberto Simões and João Correia Lopes
- docpage
- jj.bib.dp.html#xata05:fs
- booktitle
- XATA 2005, Aplicações e Tecnologias Associadas
- year
- 2005
- month
- February
- year
- 2005
- month
- February
- docpage
- jj.bib.dp.html#xata05:tdt
- booktitle
- XATA 2005, Aplicações e Tecnologias Associadas
- keyword
-
- XML
- XML::DT
- inproceedings
- title
- Inferência de tipos em documentos {XML}
- irreditor
- José Carlos Ramalho and Alberto Simões and João Correia Lopes
- tipo
- inproceedings
- author
-
- J.J. Almeida
- Alberto Simões
- publisher
- Departamento de Informática, Universidade do Minho
- location
- Braga
- chave
- xata05:tdt
- pages
- 376--377
- address
- Portalegre
- month
- February
- year
- 2006
- booktitle
- XATA 2006, Aplicações e Tecnologias Associadas
- docpage
- jj.bib.dp.html#xata06:navegante
- ote
- poster
- irreditor
- José Carlos Ramalho and Alberto Simões and João Correia Lopes
- title
- Navegante: um proxy de ordem superior para navegação intrusiva
- keyword
-
- XML
- XML::DT
- HTML
- inproceedings
- chave
- xata06:navegante
- author
-
- J.J. Almeida
- Alberto Simões
- publisher
- ESTGP
- tipo
- inproceedings
- irreditor
- José Carlos Ramalho and Alberto Simões and João Correia Lopes
- keyword
-
- XML
- XML::DT
- HTML
- inproceedings
- chave
- xata06:xmlauto
- author
-
- J.J. Almeida
- Alberto Simões
- address
- Portalegre
- year
- 2006
- month
- February
- abstract
-
It is widely agreed that XML has taken on a relevant role as a language
for structuring documents. The advantage of using XML as an
interchange language is also evident.
However, its syntax is
too verbose, so generating documents by hand
is painful, and it is useful to have modules
that simplify that task.
In this article we propose a Perl module (XML::Writer::Simple), configurable via
DTD, that simplifies the task of generating XML.
- url
- http://alfarrabio.di.uminho.pt/~albie/publications/xata2006-xmlwritersimple.pdf
- title
- Geração dinâmica de {API
- isbn
- 972-99166-2-4
- lang
- PT
- tipo
- inproceedings
- publisher
- ESTGP
- pages
- 307--314
- docpage
- jj.bib.dp.html#xata06:xmlauto
- booktitle
- {XATA 2006
- abstract
- Parallel corpora are important resources for most
Natural Language processing tasks. From the common
applications, like machine translation, to the
usually mono-lingual tasks as paraphrase detection
and word sense disambiguation, most researchers are
using massive parallel corpora. Thus, the
availability of an efficient way to manage them is
very important. This paper presents a Client-Server
architecture to query efficiently parallel corpora
and probabilistic translation dictionaries.
- month
- September
- year
- 2006
- address
- Zaragoza, Spain
- url
- http://alfarrabio.di.uminho.pt/~albie/publications/sepln06.pdf
- author
-
- Alberto Simões
- J. João Almeida
- chave
- sepln06
- pages
- 91--97
- volume
- 37
- docpage
- jj.bib.dp.html#sepln06
- journal
- Procesamiento del Lenguaje Natural
- title
- {NatServer:
- tipo
- article
- lang
- EN
- url
- http://alfarrabio.di.uminho.pt/~albie/publications/eamt06.pdf
- editor
- Jan Tore Lønning and Stephan Oepen
- address
- Oslo, Norway
- shortin
- {EAMT
- year
- 2006
- month
- 19--20, June
- abstract
- One of the bottlenecks of example-based machine
translation (EBMT) is to be able to amass
automatically quantities of good examples. In our
work in EBMT, we are investigating how far one can
go by performing example extraction from parallel
corpora using Probabilistic Translation Dictionaries
to obtain example segmentation points. In fact, the
success of EBMT highly depends on example quality
and quantity, but also on their length. Thus, we
give special importance to methods to extract
different size examples from the same translation
unit. With this article we show that it is possible
to extract quantities of examples from parallel
corpora just using probabilistic translation
dictionaries extracted from the same corpora.
- chave
- eamt06
- author
-
- Alberto Simões
- J. João Almeida
- docpage
- jj.bib.dp.html#eamt06
- booktitle
- 11th Annual Conference of the European Association for Machine Translation
- pages
- 27--32
- isbn
- 82-7368-294-3
- lang
- EN
- tipo
- inproceedings
- title
- Combinatory Examples Extraction for Machine Translation
- docpage
- jj.bib.dp.html#lrec06
- booktitle
- Fifth international conference on Language Resources and Evaluation, LREC 2006
- title
- {$T_2O$
- lang
- EN
- tipo
- inproceedings
- address
- Genova, Italy
- shortin
- {LREC
- year
- 2006
- abstract
- In this article we present $T_2O$ --- a workbench to
assist the process of translating heterogeneous
resources into ontologies, to enrich and add
multilingual information, to help programming with
them, and to support ontology publishing. $T_2O$ is
an ontology algebra.
- month
- May
- url
- http://alfarrabio.di.uminho.pt/~albie/publications/lrec06.pdf
- chave
- lrec06
- author
-
- José João Almeida
- Alberto Simões
- url
- http://alfarrabio.di.uminho.pt/~albie/publications/elpub06-t2o.pdf
- address
- Bansko, Bulgaria
- month
- June
- abstract
- Dictionaries and thesauri are valuable resources for
Natural Language Processing but are not as
freely available as expected, especially for
languages other than English and, when they exist,
they are just available for querying online. Our
main goal with T2O --- Thesaurus to Ontology
framework --- is to create a multilingual ontology:
freely available online and to download; with a
computer readable format; with a good API; with a
structure as rich as possible; reusing all the
structured information we can get;
- year
- 2006
- chave
- elpub06-t2o
- author
-
- J. João Almeida
- Alberto Simões
- booktitle
- {ElPub 2006
- docpage
- jj.bib.dp.html#elpub06-t2o
- pages
- 373--374
- lang
- EN
- note
- poster
- tipo
- inproceedings
- title
- Publishing multilingual ontologies: a quick way of obtaining feedback
- url
- http://alfarrabio.di.uminho.pt/~albie/publications/elpub06-blind.pdf
- shortin
- {ElPub
- address
- Bansko, Bulgaria
- abstract
- True accessibility requires minimizing the scanning time
to find a particular piece of
information. Sequentially reading web pages does not
provide this type of accessibility: for instance,
before the user gets to the actual text content of
the page, they have to go through a lot of menus and
headers. However, if the user could navigate a web
page through semantically classified blocks,
then the user could jump faster to the actual
content of the page, skipping all the menus and
other parts of the page. We propose a transcoding
engine that tackles accessibility at two distinct,
yet complementary, levels: for specific known sites
and general unknown sites. We present a tool for
building customized scripts for known sites that
turns this process into an extremely simple task,
which can be performed by anyone, without any
expertise. For general unknown sites, our approach
relies on statistical analysis of the structural
blocks that define a web page to infer the semantics
of each block.
- month
- June
- year
- 2007
- chave
- elpub06-blind
- author
-
- Alberto Simões
- Anália Lourenço
- José João Almeida
- booktitle
- The 31st Annual Conference of the German Classification Society on Data Analysis, Machine Learning, and Applications
- docpage
- jj.bib.dp.html#elpub06-blind
- pages
- 123-134
- lang
- EN
- note
- \textbf{forthcoming
- tipo
- inproceedings
- title
- Mining Classical Music Scores for Epoch Classification
- docpage
- jj.bib.dp.html#avalon:jspell
- booktitle
- Avaliação conjunta: um novo paradigma no processamento computacional da língua portuguesa
- pages
- 83--90
- tipo
- incollection
- publisher
- {IST Press
- title
- Jspellando nas morfolimpíadas: Sobre a participação do {Jspell
- editor
- Diana Santos
- year
- 2007
- shortin
- Avaliação conjunta, cap. 8
- author
-
- José João Almeida
- Alberto Simões
- chave
- avalon:jspell
- editor
- Diana Santos
- year
- 2007
- shortin
- Avaliação conjunta, cap. 18
- author
-
- Alberto Simões
- José João Almeida
- chave
- avalon:avalinha
- docpage
- jj.bib.dp.html#avalon:avalinha
- booktitle
- Avaliação conjunta: um novo paradigma no processamento computacional da língua portuguesa
- pages
- 219--230
- tipo
- incollection
- publisher
- {IST Press
- title
- Avaliação de alinhadores
- editor
- José Carlos Ramalho and João Correia Lopes and Luís Carriço
- url
- http://alfarrabio.di.uminho.pt/~albie/publications/xmlyamljson07.pdf
- shortin
- {XATA
- year
- 2007
- abstract
-
- month
- February
- institution
- Universidade do Minho, Departamento de Informática
- chave
- xata07:xmltmx
- author
-
- Rúben Fonseca
- Alberto Simões
- irreditor
- José Carlos Ramalho and João Correia Lopes and Luís Carriço
- keyword
-
- docpage
- jj.bib.dp.html#xata07:xmltmx
- booktitle
- {XATA 2007
- type
- Manual
- pages
- 33--46
- isbn
- 978-972-99166-4-9
- tipo
- inproceedings
- title
- Alternativas ao {XML
- chave
- MP07
- author
-
- Alberto Simões
- Rúben Fonseca
- José João Almeida
- address
- Rennes, France
- year
- 2007
- abstract
- Some processes are not easy to program from scratch
for parallel machines (clusters), but can be easily split into simple
steps. Makefile::Parallel is a tool which lets users specify how processes
depend on each other.
The language syntax resembles the well-known Makefile
format, but instead of specifying file or target
dependencies, Makefile::Parallel specifies process (or job) dependencies.
The scheduler submits jobs to the cluster scheduler (in our case,
Rocks PBS) and waits for them to end. When each process finishes,
dependencies are calculated and directly dependent jobs are submitted.
The Makefile::Parallel language includes features to specify parametric rules,
used to split and join process dependencies. Some tasks can be split
into n smaller jobs working on different portions of files. At the
end, another process can be used to join the results.
- month
- August
- editor
- Anne-Marie Kermarrec and Luc Bougé and Thierry Priol
- title
- {Makefile::Parallel
- tipo
- inproceedings
- publisher
- Springer-Verlag
- pages
- 33--41
- docpage
- jj.bib.dp.html#MP07
- booktitle
- Euro-Par 2007
- series
- LNCS
- volume
- 4641
- chave
- epia-bio-2007
- tipo
- inproceedings
- author
-
- Anália Lourenço
- Alberto Simões
- José João Almeida
- Miguel Rocha
- Isabel Rocha
- Eugénio Ferreira
- title
- An Ontology-Based Approach To Systems Biology Literature
Retrieval and Processing
- irreditor
- José Neves and Manuel Filipe Santos and José Manuel Machado
- docpage
- jj.bib.dp.html#epia-bio-2007
- booktitle
- New Trends in Artificial Intelligence
- shortin
- Epia, CMBSB
- pages
- 541--552
- year
- 2007
- abstract
- This paper details the \emph{SysBio Explorer
- month
- December
- shortin
- Epia, TEMA
- pages
- 791--799
- year
- 2007
- abstract
- Music Classification is a particular area
of Computational Musicology that provides valuable
insights about the evolution of composition patterns
and assists in catalogue generation. The proposed work
departs from earlier work by classifying music based
on music score information. Text Mining techniques
support music score processing while Classification
techniques are used in the construction of decision
models. Although research is still at an early
stage, the work already provides valuable
contributions to symbolic music representation processing
and subsequent analysis. Score processing involved
the counting of ascending and descending chromatic
intervals, note duration and meta-information
tagging. Analysis involved feature selection and
the evaluation of several data mining algorithms,
ensuring extensibility towards larger repositories or
more complex problems. Experiments report the analysis
of composition epochs on a subset of the Mutopia project
open archive of classical LilyPond-annotated
music scores.
- month
- December
- docpage
- jj.bib.dp.html#epia-music-2007
- booktitle
- New Trends in Artificial Intelligence
- title
- Using Text Mining Techniques for Classical Music Scores
Analysis
- irreditor
- José Neves and Manuel Filipe Santos and José Manuel Machado
- chave
- epia-music-2007
- tipo
- inproceedings
- author
-
- Alberto Simões
- Anália Lourenço
- José João Almeida
- note
- Documentation and proceedings of HAREM, the first joint evaluation in the
area
- publisher
- Linguateca
- tipo
- incollection
- title
- {RENA
- booktitle
- Reconhecimento de entidades mencionadas em português
- docpage
- jj.bib.dp.html#harem:rena
- pages
- 157-172
- chave
- harem:rena
- author
-
- irreditor
- Diana Santos and Nuno Cardoso
- url
- http://acdc.linguateca.pt/aval_conjunta/LivroHAREM/Cap13-SantosCardoso2007-Almeida.pdf
- shortin
- {HAREM
- year
- 2007
- title
- Parallel Corpora based Translation Resources Extraction
- lang
- EN
- tipo
- article
- pages
- 265--272
- docpage
- jj.bib.dp.html#sepln07
- journal
- Procesamiento del Lenguaje Natural
- volume
- 39
- chave
- sepln07
- author
-
- Alberto Simões
- José João Almeida
- year
- 2007
- month
- September
- abstract
- This paper describes NATools, a toolkit to process,
analyze and extract translation resources from
Parallel Corpora. It includes tools like a
sentence-aligner, a probabilistic translation
dictionaries extractor, word-aligner, a corpus
server, a set of tools to query corpora and
dictionaries, as well as a set of tools to extract
bilingual resources.
- booktitle
- {XATA 2008
- docpage
- jj.bib.dp.html#cgiauto08
- pages
- 22--27
- tipo
- inproceedings
- isbn
- 978-972-99166-5-6
- title
- {CGI::Auto
- url
- http://alfarrabio.di.uminho.pt/~albie/publications/cgiauto08.pdf
- month
- February
- abstract
- The creation of a CGI or a WebService as an interface for a command line
tool is not as unusual as it may seem. It is extremely usual and useful.
There are applications developed as command line tools that can be useful for
different purposes and different kinds of users. Some of these users might not
be able to run these tools directly. For instance, it is not easy to install a
bunch of Perl modules to have a small tool working. For these situations, it is
easier to make the tool available on the Web or as a WebService.
The problem with making the tool available in these fashions is that
programmers tend to rewrite the tools to incorporate the CGI or XML specific
layers. We argue that these CGI or WebService interfaces should use the already
available command line tool, without any change. This interface should be able
to read a simple textual specification of how the command line tool works, and
build the CGI or XML specific layers automatically.
The CGI::Auto module serves this purpose: to encapsulate command line tools in
a CGI layer based on a textual specification, transforming the command line
tool into a web application.
- year
- 2008
- author
-
- Davide Sousa
- Alberto Simões
- José João Almeida
- chave
- cgiauto08
- irreditor
- José Carlos Ramalho and João Correia Lopes and Salvador Abreu
- irreditor
- José Carlos Ramalho and João Correia Lopes and Salvador Abreu
- author
-
- Nuno Carvalho
- José João Almeida
- Alberto Simões
- chave
- navegante08
- year
- 2008
- abstract
- NAVEGANTE is a generic framework to build higher-order proxies for
intrusive browsing. This framework provides the means for developing
tools that behave as proxies, but perform some processing task on
the content that is being browsed. Parallel to this content processing,
applications can also run other user-defined functions with different
purposes and interfaces, but we'll explain those later. Currently,
NAVEGANTE only builds applications that run as CGIs, but this is intended
to change in the near future. Applications are built by writing programs in
NAVEGANTE's Domain Specific Language (DSL).
NAVEGANTE is a work in progress. This article aims to describe the current
state of development: what applications can be built and how. Also, we
identify some implementation problems, and briefly discuss some future
improvements. Finally, we try to illustrate most of the concepts described
using a couple of case studies.
- month
- February
- url
- http://alfarrabio.di.uminho.pt/~albie/publications/navegante08.pdf
- title
- {NAVEGANTE
- tipo
- inproceedings
- isbn
- 978-972-99166-5-6
- pages
- 52--63
- docpage
- jj.bib.dp.html#navegante08
- booktitle
- {XATA 2008
- title
- Bilingual Terminology Extraction based on Translation
Patterns
- tipo
- article
- lang
- EN
- pages
- 281--288
- volume
- 41
- journal
- Procesamiento del Lenguaje Natural
- docpage
- jj.bib.dp.html#sepln08
- author
-
- Alberto Simões
- José João Almeida
- chave
- sepln08
- year
- 2008
- month
- September
- abstract
- Parallel corpora are rich sources of translation
resources. This document presents a methodology for the extraction
of bilingual nominals (terminology candidates) from parallel corpora, using
translation patterns.
The patterns proposed in this work specify the order changes that
occur during translation
and that are intrinsic to the syntaxes of the languages involved.
These patterns are described in a domain specific language
named PDL (Pattern Description Language), and are extremely
efficient for the detection of nominal phrases.
- year
- 2008
- pages
- 35--42
- docpage
- jj.bib.dp.html#propor-apslt08
- booktitle
- Applications of Portuguese Speech and Language Technologies,
PROPOR 2008 Special session
- title
- A Textual Rewriting system for NLP
- irreditor
- António Teixeira and Daniela Braga
- tipo
- inproceedings
- author
-
- J. J. Almeida
- Alberto Simões
- chave
- propor-apslt08
- editor
- Luis Seabra Lopes and Nuno Lau and Pedro Mariano and Luis Mateus Rocha
- url
- http://dx.doi.org/10.1007/978-3-642-04686-5_33
- year
- 2009
- author
-
- Brett Drury
- J. J. Almeida
- chave
- epia:DruryA09
- series
- Lecture Notes in Computer Science
- volume
- 5816
- docpage
- jj.bib.dp.html#epia:DruryA09
- booktitle
- EPIA
- pages
- 400-410
- tipo
- inproceedings
- note
- Progress in Artificial Intelligence, EPIA 2009, Aveiro, Portugal, October 12-15
- publisher
- Springer
- title
- Construction of a Local Domain Ontology from News Stories
- isbn
- 978-989-96278-1-9
- lang
- EN
- tipo
- inproceedings
- title
- Bilingual Example Segmentation based on Markers
Hypothesis
- docpage
- jj.bib.dp.html#markers09
- booktitle
- I Iberian SLTech 2009
- pages
- 95--98
- chave
- markers09
- author
-
- Alberto Simões
- José João Almeida
- editor
- António Teixeira and Miguel Sales Dias and Daniela Braga
- address
- Porto Salvo, Portugal
- year
- 2009
- abstract
- The Marker Hypothesis was first defined by Thomas Green
in 1979. It
is a psycho-linguistic hypothesis stating that there is a set of
words in every language that marks boundaries of phrases in a
sentence. While it remains a hypothesis because nobody has proved
it, tests have shown that results are comparable to basic shallow
parsers with higher efficiency.
The chunking algorithm based on the Marker Hypothesis is simple,
fast and almost language independent. It depends on a list of
closed-class words, that are already available for most languages.
This makes it suitable for bilingual chunking (there is not the
requirement for separate language shallow parsers).
This paper discusses the use of the Marker Hypothesis combined
with Probabilistic Translation Dictionaries for example-based machine
translation resources extraction from parallel corpora.
- month
- September, 3--4
- booktitle
- {XATA 2010
- docpage
- jj.bib.dp.html#xata2010-rewritexml
- pages
- 27--38
- lang
- EN
- tipo
- inproceedings
- title
- Processing {XML:
- editor
- Alberto Simões and Daniela da Cruz and José Carlos Ramalho
- address
- Vila do Conde
- abstract
- Nowadays XML processing is performed using one of
two approaches: using the SAX (Simple API for XML)
or using the DOM (Document Object Model). While
these two approaches are adequate for most cases
there are situations where other approaches can make
the solution easier to write, read and, therefore,
to maintain. This document presents a rewriting
approach for XML document processing, focusing on
the tasks of transforming XML documents (into other
XML formats or other textual documents) and the task
of rewriting other textual formats into XML
dialects. These approaches were validated with some
case studies, ranging from an XML authoring tool to
a dictionary publishing mechanism.
- month
- May
- year
- 2010
- chave
- xata2010-rewritexml
- author
-
- Alberto Simões
- José João Almeida
- title
- A Case Study of Rule Based and Probabilistic Word Error Correction of
Portuguese OCR Text in a "Real World" Environment for Inclusion in a Digital
Library
- tipo
- article
- note
- presented in {CICLING2010
- umber
- 1-2
- olume
- 1
- pages
- 307--315
- journal
- International Journal of Computational Linguistics
- docpage
- jj.bib.dp.html#ocr2010
- author
-
- Brett Drury
- José João Almeida
- chave
- ocr2010
- year
- 2010
- url
- http://10.255.0.115/pub/2010/DA10
- editor
- Nicoletta Calzolari and others
- address
- Valletta, Malta
- shortin
- {LREC
- language
- english
- year
- 2010
- month
- may
- chave
- lrec10:bigorna
- author
-
- José João Almeida
- André Santos
- Alberto Simões
- docpage
- jj.bib.dp.html#lrec10:bigorna
- booktitle
- Proceedings of the Seventh conference on International Language
Resources and Evaluation (LREC'10)
- isbn
- 2-9517408-6-7
- tipo
- inproceedings
- publisher
- European Language Resources Association (ELRA)
- title
- Bigorna -- A Toolkit for Orthography Migration Challenges
- date
- 19-21
- booktitle
- Proceedings of the Seventh conference on International Language
Resources and Evaluation (LREC'10)
- docpage
- jj.bib.dp.html#lrec10:dicaberto
- date
- 19-21
- title
- Processing and Extracting Data from Dicionário Aberto
- publisher
- European Language Resources Association (ELRA)
- tipo
- inproceedings
- isbn
- 2-9517408-6-7
- month
- may
- year
- 2010
- language
- english
- shortin
- {LREC
- address
- Valletta, Malta
- editor
- Nicoletta Calzolari and others
- author
-
- Alberto Simões
- José João Almeida
- Rita Farinha
- chave
- lrec10:dicaberto
- pages
- 50--55
- docpage
- jj.bib.dp.html#bucc2010
- booktitle
- BUCC2010 -- 3rd Workshop on Building and Using Comparable Corpora, lrec2010
- title
- Automatic Parallel Corpora and Bilingual Terminology extraction from Parallel WebSites
- tipo
- inproceedings
- lang
- EN
- year
- 2010
- month
- May
- abstract
- Nowadays, the notion, the importance and the
significance of parallel corpora are so great that they need
no special introduction. Unfortunately, publicly
available parallel corpora are somewhat limited in
range. There are big corpora about politics or
legislation, about medicine and other specific areas,
but we lack corpora for other
areas. Currently there is a huge investment in using
the Web as a corpus. This article presents GWB, a
tool that aims at the automatic construction of parallel
corpora from the web. We argue that it is possible
to build high quality terminological corpora in an
automatic fashion, just by specifying a sensible
Internet domain and using an appropriate set of seed
keywords. GWB is a web-spider that works in
conjunction with a set of other Open-Source tools,
defining a pipeline that includes document
retrieval from the web, alignment at sentence level
and its quality analysis, bilingual dictionary and
terminology extraction, and construction of off-line
dictionaries.
- address
- Valletta, Malta
- url
- http://alfarrabio.di.uminho.pt/~albie/publications/bucc2010.pdf
- editor
- Reinhard Rapp and Pierre Zweigenbaum and Serge Sharoff
- author
-
- José João Almeida
- Alberto Simões
- chave
- bucc2010
- pages
- 19--22
- booktitle
- Entity2010 -- Workshop on Resources and Evaluation for Entity Resolution and Entity
Management, lrec2010
- docpage
- jj.bib.dp.html#brett:lrec
- title
- Identification, extraction and population of collective named
entities from business news
- lang
- EN
- tipo
- inproceedings
- address
- Valletta, Malta
- abstract
-
Sentiment analysis of business news has become an increasingly popular
area of research for both the practitioner and academic. The future
financial prospects of companies can be estimated through the aggregation
of sentiment over a period of time. The aggregation of sentiment
for a specific company is only possible if the company is explicitly
mentioned in the news text. In certain instances, news text may refer
to groups or collections of companies, for example "The Automotive
Sector" or "The Russell Group of Universities". Widely available named
entity dictionaries will not recognize these groups of companies, and
consequently, it may not be possible to assign sentiment attributed
to these groups of companies to their individual members. This paper
describes a method for identifying groups of companies, which for the
purposes of this paper will be known as "Collective Entities". The
described method is corpus based: it uses linguistic patterns to
identify Collective Entity Names, their members and their natural
relations with other Collective Entities. The described methodology
contains the following steps: 1. Identify and validate seed extraction
patterns, 2. Expand seed patterns, 3. Extract and validate Collective
Named Entities, 4. Extract related Collective Named Entities, 5. Construct
and populate an Ontology and 6. Expand the members of Collective Entity
sets with Linked Data.
- month
- May
- year
- 2010
- chave
- brett:lrec
- author
-
- José João Almeida
- Brett Drury
- pages
- 217--220
- docpage
- jj.bib.dp.html#fala2010-triPsi
- booktitle
- FALA2010 -- II Iberian SLTech Workshop
- title
- Automating psycholinguistic statistics computation:
Procura-Palavras
- tipo
- inproceedings
- address
- Vigo
- year
- 2010
- abstract
-
This article describes psycholinguistic lexical databases
available in various languages, including English, Spanish and
Portuguese. These lexical databases are important for researchers
in Psycholinguistics and other related areas, providing
a pool of experimental materials and allowing for an efficient
process of selection of these experimental materials.
The process of gathering statistics is slow, resulting in a
small pool of materials in the short-term. The need to find an
alternative method to gather limited or yet unavailable statistics
for a specific language led us to consider gathering statistics
from other languages and to compute their triangulation. Our
aim was to automate the computation of statistics such as
Familiarity, Imageability, Age of Acquisition and Written Word
Frequency for that specific language.
We will describe the process of preparing this data and triangulating and
comparing statistics for some languages in an attempt to find a
relationship between them. The results were
analysed considering correlations between each statistic in each
pair of languages and by computing the mean of absolute differences between
each language's values.
- month
- November
- editor
- Carmen Mateo and Francisco Díaz and Francisco Pazó
- chave
- fala2010-triPsi
- author
-
- João Filipe Machado
- José João Almeida
- Alberto Simões
- Ana Soares
- editor
- Luis Barbosa and Antonio Cerone and Siraj Shaikh (Guest Eds.)
- url
- http://journal.ub.tu-berlin.de/index.php/eceasst/article/view/458/446
- year
- 2010
- author
-
- Alberto Simões
- Nuno Carvalho
- José João Almeida
- chave
- opencert2010
- volume
- 33
- journal
- Electronic Communications of the EASST
- docpage
- jj.bib.dp.html#opencert2010
- tipo
- article
- note
- Foundations and Techniques for Open Source Software Certification
- title
- Testing as a Certification Approach
- pages
- 67--72
- number
- 3
- journal
- Linguamática
- docpage
- jj.bib.dp.html#p-pal-linguamatica
- volume
- 2
- title
- {P-PAL:
- tipo
- article
- month
- December
- abstract
-
In this work we present the Procura-PALavras (P-PAL) project,
whose main goal is to develop an electronic tool
that provides information on objective and subjective
psycholinguistic indices of European Portuguese (EP)
words. P-PAL will be made freely available to the
scientific community in a user-friendly format through
a website to be built for that purpose. When
using P-PAL, researchers will be able to customize
their use of the program by selecting, from the wide variety of
analyses offered, the indices that suit the purposes of
their research, in two modes of use:
asking the program to analyse previously built word lists
on the indices considered relevant to the
research, or obtaining word lists that satisfy the
defined parameters. P-PAL thus presents itself as a
fundamental tool for promoting and internationalizing
research in Portugal.
- year
- 2010
- url
- http://linguamatica.com/index.php/linguamatica/article/download/80/108
- irreditor
- Alberto Simões and José João Almeida and Xavier Gómez Guinovart
- chave
- p-pal-linguamatica
- issn
- 1647-0818
- author
-
- Ana Paula Soares
- Montserrat Comesaña
- Álvaro Iriarte Sanroman
- José João Almeida
- Alberto Manuel Brandão Simões
- Ana Costa
- Patrícia Cunha França
- João Machado
- title
- Guided Self Training for Sentiment Classification
- tipo
- inproceedings
- pages
- 9--16
- booktitle
- Proceedings of Workshop on Robust Unsupervised and Semisupervised
Methods in Natural Language Processing
- docpage
- jj.bib.dp.html#drury-torgo-almeida:2011:ROBUS
- author
-
- Drury, Brett
- Torgo, Luis
- J.J. Almeida
- chave
- drury-torgo-almeida:2011:ROBUS
- month
- September
- year
- 2011
- address
- Hissar, Bulgaria
- url
- http://www.aclweb.org/anthology/W11-3902
- title
- Classifying News Stories to Estimate the Direction of a Stock Market Index
- author
-
- Brett Drury
- Luis Torgo
- J.J. Almeida
- tipo
- inproceedings
- chave
- drury1
- location
- Chaves
- year
- 2011
- pages
- 1-4
- booktitle
- Third Workshop on Intelligent Systems and Applications (WISA)
- docpage
- jj.bib.dp.html#drury1
- title
- Magellan: An Adaptive Ontology Driven "breaking Financial News" Recommender
- author
-
- Brett Drury
- J.J. Almeida
- Helena Morais
- tipo
- inproceedings
- chave
- drury2
- location
- Chaves
- year
- 2011
- booktitle
- CISTI-2011
- docpage
- jj.bib.dp.html#drury2
- title
- An Error Correction Methodology for Time Dependent Ontologies
- isbn
- 978-3-642-22055-5
- publisher
- Springer
- tipo
- inproceedings
- pages
- 501-512
- ee
- http://dx.doi.org/10.1007/978-3-642-22056-2_52
- booktitle
- {CAiSE
- part
- 8
- docpage
- jj.bib.dp.html#drury3
- volume
- 83
- series
- Lecture Notes in Business Information Processing
- chave
- drury3
- author
-
- Brett Drury
- J.J. Almeida
- Helena Morais
- year
- 2011
- editor
- Camille Salinesi and Oscar Pastor
- year
- 2011
- booktitle
- CISTI-2011
- docpage
- jj.bib.dp.html#nuno1
- title
- Oml: A Scripting Approach For Manipulating Ontologies
- author
-
- Nuno Carvalho
- Alberto Simões
- J.J. Almeida
- tipo
- inproceedings
- chave
- nuno1
- location
- Chaves
- title
- {PFTL
- isbn
- 978-989-96001-5-7
- tipo
- inproceedings
- publisher
- Dep. de Eng. Informática da Universidade de Coimbra
- pages
- 222--233
- docpage
- jj.bib.dp.html#corta2011-pftl
- booktitle
- INForum'11 --- Simpósio de Informática (CoRTA2011 track)
- pdf
- http://ambs.perl-hackers.net/publications/corta2011-pftl.pdf
- chave
- corta2011-pftl
- author
-
- Nuno Carvalho
- Alberto Simões
- José João Almeida
- Pedro Rangel Henriques
- Maria João Varanda Pereira
- address
- Coimbra, Portugal
- language
- EN
- year
- 2011
- month
- September
- abstract
- Today, most
developers prefer to store information in databases. But
plain filesystems were used for years, and are still used, to store
information, commonly in files of heterogeneous formats that are
organized in directory trees. This approach is a very flexible and
natural way to create hierarchically organized structures of
documents.
We can devise a formal notation to describe a filesystem tree structure,
similar to a grammar, assuming that filenames can be considered terminal
symbols, and directory names non-terminal symbols. This specification
would allow us to derive correct language sentences (combinations of terminal
symbols) and to associate semantic actions, which can produce arbitrary
side effects, with each valid sentence, just as we do in common parser
generation tools. These specifications can be used to systematically
process files in directory trees, and the final result depends on the
semantic actions associated with each production rule.
In this paper we revamp an old idea of using a domain specific
language to implement these specifications, similar to context free
grammars, and introduce some examples of applications that can be
built using this approach.
- editor
- Raul Barbosa and Luis Caires
- pdf
- http://ambs.perl-hackers.net/publications/corta2011-oml.pdf
- chave
- corta2011-oml
- author
-
- Nuno Carvalho
- José João Almeida
- Alberto Simões
- editor
- Raul Barbosa and Luis Caires
- address
- Coimbra, Portugal
- language
- EN
- year
- 2011
- abstract
-
Most existing programming languages can be categorized as general
purpose programming languages, meaning that they can be used to
implement solutions for any given domain. They are not, in any way,
optimized for a specific set of problems. In contrast, Domain
Specific Languages (DSL) are used to solve specific problems in a
well defined domain. DSL are optimized to a particular set of
problems, but they lack support for a wide range of operations that
are required when dealing with real world problems. So, in a
perfect world, we would like to implement applications using a
general purpose programming language, but use a set of different DSL
to handle specific domains' tasks.
In this paper we describe a DSL named Ontology Manipulation Language
(OML), designed to describe operations over ontologies. Programs can be written
using only the OML syntax and be executed independently. OML syntax
was designed to deal with ontologies and the language itself is
optimized to perform these tasks, which means that other relatively
simpler tasks cannot be easily done. To overcome this challenge a
mechanism was developed so that you can weave small snippets of OML code
inside Perl programs, meaning we have the power of OML to manipulate
ontologies and, at the same time, all the paraphernalia of modules
that Perl offers to handle everything else.
- month
- September
- isbn
- 978-989-96001-5-7
- tipo
- inproceedings
- publisher
- Dep. de Eng. Informática da Universidade de Coimbra
- title
- Weaving {OML
- docpage
- jj.bib.dp.html#corta2011-oml
- booktitle
- INForum'11 --- Simpósio de Informática (CoRTA2011 track)
- pages
- 184--197
- year
- 2011
- full
- Proceedings of the International Conference on Web
Intelligence, Mining and Semantics, WIMS 2011, Sogndal, Norway, May 25
- 27, 2011
- editor
- Rajendra Akerkar
- author
-
- chave
- wims2011
- ee
- http://doi.acm.org/10.1145/1988688.1988720
- pages
- 27--34
- bibsource
- DBLP, http://dblp.uni-trier.de
- booktitle
- WIMS
- docpage
- jj.bib.dp.html#wims2011
- title
- Identification of fine grained feature based event and sentiment
phrases from business news stories
- publisher
- ACM
- tipo
- inproceedings
- isbn
- 978-1-4503-0148-0
- year
- 2011
- url
- http://natura.di.uminho.pt/~jj/pln/sepln2011-boolcleaner.pdf
- docpage
- jj.bib.dp.html#sepln:bookcleaner
- booktitle
- Actas del XXVII Congreso de la Sociedad Española
para el Procesamiento del Lenguaje Natural
- title
- {Text::Perfide::BookCleaner
- tipo
- inproceedings
- pp
- 433-441
- author
-
- Santos, André
- José João Almeida
- location
- Huelva, 5 - 7 Set
- chave
- sepln:bookcleaner
- number
- 3/4
- pages
- 219-233
- volume
- 6
- journal
- IJMSO
- docpage
- jj.bib.dp.html#drury4
- title
- Construction and maintenance of a fuzzy temporal ontology
from news stories
- tipo
- article
- year
- 2011
- journalfull
- International Journal of Metadata, Semantics and Ontologies
- doi
- http://dx.doi.org/10.1504/IJMSO.2011.048028
- author
-
- Brett Drury
- J.J. Almeida
- Helena Morais
- chave
- drury4
- year
- 2011
- month
- 1--2 June
- abstract
-
The eXtensible Mark-up Language (XML) is probably one of the
most popular markup languages available today. It is very typical to find all
kinds of services or programs representing data in this format. This situation is
even more common in web development environments or Service Oriented Architectures
(SOA), where data flows from one service to another, being consumed and
produced by a heterogeneous set of applications, whose sole requirement is to
understand XML.
This workflow of data represented in XML implies some tasks that applications
have to perform if they are required to consume or produce information: the
task of parsing an XML document, giving specific semantics to the information
parsed, and the task of producing an XML document.
Our main goal is to create object definitions that can analyze an XML document
and automatically create an object definition that can be used abstractly by
the application. These objects are able to parse the XML document and gather all
the data required to mimic all the information present in the document.
This paper introduces xml2pm, a simple tool that can inspect the structure of
an XML document and create an object definition (a Perl module) that stores
the same information present in the original document, but as a runtime object. We
also introduce a simple case of how this approach allows the creation of
applications based on Web Services in an elegant and simple way.
- address
- Vila do Conde, Portugal
- editor
- Alberto Simões
- author
-
- Nuno Carvalho
- Alberto Simões
- José João Almeida
- pdf
- http://ambs.perl-hackers.net/publications/xml2pm-xata2011.pdf
- chave
- xml2pm-xata2011
- pages
- 103--114
- docpage
- jj.bib.dp.html#xml2pm-xata2011
- booktitle
- {XATA 2011
- title
- xml2pm: A Tool for Automatic Creation of Object Definitions Based
on {XML
- tipo
- inproceedings
- isbn
- 978-989-96863-1-1
- lang
- EN
- author
-
- Brett Drury
- Luis Torgo
- J.J. Almeida
- chave
- drury5
- year
- 2012
- full
- International Journal of Computer Science and Applications
- url
- http://www.tmrfindia.org/ijcsa/v9i11.pdf
- title
- Classifying News Stories with a Constrained Learning Strategy
to Estimate the Direction of a Market Index
- tipo
- article
- number
- 1
- pages
- 1-22
- bibsource
- DBLP, http://dblp.uni-trier.de
- volume
- 9
- docpage
- jj.bib.dp.html#drury5
- journal
- IJCSA
- chave
- da2012
- author
-
- Alberto Simões
- Álvaro Iriarte Sanromán
- José João Almeida
- address
- Coimbra, Portugal
- month
- April
- year
- 2012
- editor
- Helena Caseli and Aline Villavicencio and António Teixeira
and Fernando Perdigão
- title
- Dicionário-Aberto -- A Source of Resources for the
Portuguese Language Processing
- publisher
- Springer
- tipo
- article
- pages
- 121--127
- docpage
- jj.bib.dp.html#da2012
- journal
- Computational Processing of the Portuguese Language,
Lecture Notes in Artificial Intelligence
- volume
- 7243
- date
- 23-25
- title
- Structural alignment of plain text books
- tipo
- inproceedings
- publisher
- European Language Resources Association (ELRA)
- isbn
- 978-2-9517408-7-7
- docpage
- jj.bib.dp.html#LREC12.967
- booktitle
- Proceedings of the Eighth International Conference on Language
Resources and Evaluation (LREC'12)
- author
-
- André Santos
- José João Almeida
- Nuno Carvalho
- chave
- LREC12.967
- year
- 2012
- month
- may
- address
- Istanbul, Turkey
- language
- english
- editor
- Nicoletta Calzolari and others
- author
-
- Brett Drury
- José João Almeida
- chave
- LREC12.611
- editor
- Nicoletta Calzolari and others
- year
- 2012
- month
- may
- address
- Istanbul, Turkey
- language
- english
- tipo
- inproceedings
- publisher
- European Language Resources Association (ELRA)
- isbn
- 978-2-9517408-7-7
- date
- 23-25
- title
- The Minho Quotation Resource
- docpage
- jj.bib.dp.html#LREC12.611
- booktitle
- Proceedings of the Eighth International Conference on Language
Resources and Evaluation (LREC'12)
- pages
- 239-253
- year
- 2012
- abstract
- Concept location is a common task in program comprehension
techniques, essential in many approaches used for software care and
software evolution. An important goal of this process is to discover
a mapping between source code and human oriented concepts.
Although programs are written in a strict and formal language, natural
language terms and sentences like identifiers (variable or function
names), constant strings or comments, can still be found embedded in
programs. Using terminology concepts and natural language processing
techniques these terms can be exploited to discover clues about which
real world concepts source code is addressing.
This work extends symbol tables built by compilers with ontology
driven constructs, and extends synonym sets defined by linguistics with
automatically created Probabilistic SynSets from software
domain parallel corpora. Using a relational algebra, it creates
semantic bridges between program elements and human oriented concepts,
to enhance concept location tasks.
- month
- June
- docpage
- jj.bib.dp.html#CAPH12a
- booktitle
- SLATe'12 --- Symposium on Languages, Applications and Technologies
- volume
- 21
- title
- Probabilistic SynSet Based Concept Location
- irreditor
- Alberto Simões and Ricardo Queirós and Daniela da Cruz
- chave
- CAPH12a
- tipo
- inproceedings
- author
-
- Nuno Ramos Carvalho
- Jose Joao Almeida
- Maria João Varanda Pereira
- Pedro Rangel Henriques
- publisher
- OASIcs -- Open Access Series in Informatics, Schloss
Dagstuhl - Leibniz-Zentrum für Informatik, Dagstuhl Publishing, Germany
- year
- 2012
- chave
- wikiscore
- author
-
- J.J. Almeida
- Nuno Ramos Carvalho
- José Nuno Oliveira
- journal
- Information, Services and Use (ISU)
- docpage
- jj.bib.dp.html#wikiscore
- volume
- 31
- comment
- elpub 2012
- pages
- 177--187
- number
- 3-4/2011
- ee
- DOI 10.3233/ISU-2012-0647
- tipo
- article
- publisher
- IOS Press
- title
- {Wiki::Score
- small
- ISU
- series
- OpenAccess Series in Informatics (OASIcs)
- volume
- 21
- docpage
- jj.bib.dp.html#flapp
- booktitle
- 1st Symposium on Languages, Applications and Technologies
- pages
- 41--50
- tipo
- inproceedings
- publisher
- Schloss Dagstuhl--Leibniz-Zentrum fuer Informatik
- title
- {Generating flex Lexical Scanners for Perl Parse::Yapp
- idx
- DBLP
- url
- http://drops.dagstuhl.de/opus/volltexte/2012/3513
- year
- 2012
- abstract
-
Perl is known for its versatile regular expressions. Nevertheless, using Perl regular
expressions for creating a fast lexical analyzer is not easy. As an alternative, the authors
defend the automated generation of the lexical analyzer in a well known fast application
(flex) based on a simple Perl definition in the syntactic analyzer. In this paper we
extend the syntax used by Parse::Yapp, one of the most used parser generators for Perl,
making the automatic generation of flex lexical scanners possible. We explain how this is
performed and conclude with some benchmarks that show the relevance of the approach.
- address
- Dagstuhl, Germany
- doi
- http://dx.doi.org/10.4230/OASIcs.SLATE.2012.41
- author
-
- Alberto Simões
- Nuno Ramos Carvalho
- José João Almeida
- chave
- flapp
- irreditor
- Alberto Simões and Ricardo Queirós and Daniela da Cruz
- publisher
- Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik
- tipo
- inproceedings
- title
- Predicting Market Direction from Direct Speech by Business
Leaders
- volume
- 21
- series
- OASICS
- booktitle
- SLATE
- docpage
- jj.bib.dp.html#DBLP:conf/slate/DruryA12
- pages
- 163-172
- bibsource
- DBLP, http://dblp.uni-trier.de
- author
-
- Brett Drury
- José João Almeida
- chave
- DBLP:conf/slate/DruryA12
- irreditor
- Alberto Sim{�
- year
- 2012
- doi
- http://dx.doi.org/10.4230/OASIcs.SLATE.2012.163
- irreditor
- Luís Correia and Luís Paulo Reis and José Cascalho and Luís
Gomes and Hélia Guerra and Pedro Cardoso
- author
- Alberto Simões and José João Almeida and Nuno Ramos Carvalho
- chave
- ptd2013
- address
- Angra do Heroismo, Azores
- url
- http://natura.di.uminho.pt/~jj/bib/ptd-algebra.pdf
- year
- 2013
- title
- Defining a Probabilistic Translation Dictionaries Algebra
- tipo
- inproceedings
- booktitle
- XVI Portuguese Conference on Artificial Intelligence - EPIA
- month
- September
- pages
- 444--455
- docpage
- jj.bib.dp.html#ptd2013
- url
- http://alfarrabio.di.uminho.pt/~albie/publications/wcist2012-dmoss.pdf
- abstract
- Besides source code, the fundamental source of information about Open Source
Software lies in documentation, and other non source code files, like README,
INSTALL, or HowTo files, commonly available in the software ecosystem. These
documents, written in natural language, provide valuable information during the
software development stage, but also in future maintenance and evolution tasks.
DMOSS is a toolkit designed to systematically assess the quality of non source
code text found in software packages. The toolkit handles a package as an
attribute tree, and performs several tree traverse algorithms through a set of
plugins, specialized in retrieving specific metrics from text, gathering
information about the software. These metrics are later used to infer knowledge
about the software, and composed together to build reports that assess the
quality of specific features of the software. This paper discusses the
motivations for this work, continues with a description of the toolkit
implementation and design goals. This is followed by an example of its usage to process a
software package, and the produced report. Finally, some remarks and
trends for future work are presented.
- chave
- algarve-cross2013
- author
- Nuno Ramos Carvalho and Alberto Simões and José João Almeida
- docpage
- jj.bib.dp.html#algarve-cross2013
- series
- Advances in Intelligent Systems and Computing
- volume
- 206
- isbn
- 978-3-642-36980-3
- pages
- 785--794
- booktitle
- Advances in Information Systems and Technologies
- tipo
- inproceedings
- title
- Open Source Software Documentation Mining for Quality Assessment
- publisher
- Springer Berlin Heidelberg
- year
- 2013
- editor
- Rocha, Álvaro and Correia, Ana Maria and Wilson, Tom and Stroetmann,
Karl A.
- chave
- algarve2013
- author
- Alberto Simões and Anália Lourenço and José João Almeida
- abstract
- This work aims at pointing out the benefits of a topology-oriented
wide scope, but differentiated, profile analysis. The goal was to reconcile
advanced common website usage profiling techniques with the analysis of the
website's topology information, outputting valuable knowledge in an intuitive
and comprehensible way. Server load balancing, crawler activity evaluation and
Web site restructuring are the primary analysis concerns and, in this regard,
experiments over six months of data from a real-world Web site were considered
successful.
- url
- http://alfarrabio.di.uminho.pt/~albie/publications/wcist2012-webtopology.pdf
- title
- Evaluating Web Site Structure Based on Navigation Profiles and Site Topology
- publisher
- Springer Berlin Heidelberg
- year
- 2013
- editor
- Rocha, Álvaro and Correia, Ana Maria and Wilson, Tom and Stroetmann,
Karl A.
- pages
- 305-311
- tipo
- inproceedings
- booktitle
- Advances in Information Systems and Technologies
- volume
- 206
- series
- Advances in Intelligent Systems and Computing
- isbn
- 978-3-642-36980-3
- docpage
- jj.bib.dp.html#algarve2013
- docpage
- jj.bib.dp.html#Passarola2013
- booktitle
- CISTI-2013
- pages
- 763--768
- location
- Lisboa
- tipo
- inproceedings
- title
- PASSAROLA: High-Order Exercise Generation
- url
- http://natura.di.uminho.pt/~jj/bib/passarola-cisti2013.pdf
- year
- 2013
- abstract
- In order to be robust and achieve multi-domain
coverage, exercise generation systems usually work with answers
of simple types (e.g. multiple-choice, Boolean, integer, or file
comparison). In this paper we describe the exercise generation
system PASSAROLA, with a simple, yet powerful, language that anyone
with no computer science background can use to develop
exercises that include a collection of heterogeneous objects and
allow the usage of complex elements. Its main characteristic
features are the use of simple reusable templates, simple and rich
types, rich notation and syntax (LaTeX based) for questions,
solutions, and answers, transformations and calculations, and
external calculators.
- chave
- Passarola2013
- author
-
- J. João Almeida
- Isabel Araújo
- Irene Brito
- Nuno Carvalho
- Gaspar J. Machado
- Rui M.S. Pereira
- Georgi Smirnov
- tipo
- inproceedings
- location
- Lisboa
- title
- Math exercise generation and smart assessment
- booktitle
- Workshop of TICAMES (Information and Communication Technology in
Higher Education: Learning Mathematics), CISTI-2013
- docpage
- jj.bib.dp.html#ticames2013
- pages
- 1014--1019
- author
-
- J. João Almeida
- Isabel Araújo
- Irene Brito
- Nuno Carvalho
- Gaspar J. Machado
- Rui M.S. Pereira
- Georgi Smirnov
- chave
- ticames2013
- url
- http://natura.di.uminho.pt/~jj/bib/passarola-ticames2013.pdf
- abstract
- In this paper we concentrate on the field of
mathematics education where the aim is to generate exercises
going beyond those with answers of simple types (e.g. multiple-choice,
Boolean, integer, or file comparison). We present three
examples from introductory college mathematics and emphasize
the key points that should be taken into account in order to
develop a "well-posed" exercise together with its verification. All
the presented examples were implemented in the system
- year
- 2013
- irrbooktitle
- Computational Science and Its Applications - ICCSA 2013
- 13th International Conference, Ho Chi Minh City, Vietnam,
June 24-27, 2013, Proceedings, Part II
- year
- 2013
- doi
- http://dx.doi.org/10.1007/978-3-642-39643-4_32
- offcrossref
- DBLP:conf/iccsa/2013-2
- editor
- Beniamino Murgante and others
- author
-
- Pedro Martins
- Nuno Ramos Carvalho
- João Paulo Fernandes
- José João Almeida
- João Saraiva
- chave
- crossportal
- ee
- http://dx.doi.org/10.1007/978-3-642-39643-4
- bibsource
- DBLP, http://dblp.uni-trier.de
- pages
- 443-458
- series
- Lecture Notes in Computer Science
- volume
- 7972
- docpage
- jj.bib.dp.html#crossportal
- booktitle
- ICCSA (2)
- title
- A Framework for Modular and Customizable Software Analysis
- tipo
- inproceedings
- publisher
- Springer
- isbn
- 978-3-642-39642-7
- docpage
- jj.bib.dp.html#icaicte13
- booktitle
- ICAICTE-13, Advances in Intelligent Systems Research
- title
- Exercise generation with the system Passarola
- isbn
- 978-90786-77-79-6
- tipo
- inproceedings
- doi
- doi:10.2991/icaicte.2013.64
- year
- 2013
- abstract
- A robust multi-domain coverage exercise generation system
usually works with answers of simple types (e.g. multiple-choice,
Boolean, integer, or file comparison). In this paper we describe
Passarola, a simple, yet powerful, exercise generation system and its
language that anyone with no computer science background can use to
develop exercises. It may include a collection of heterogeneous objects
allowing the usage of complex elements. Its main characteristics are the
use of simple reusable templates, simple and rich types, and rich notation
and syntax (LaTeX based) for questions, solutions, and answers.
- url
- http://natura.di.uminho.pt/~jj/bib/ecaicte2013.pdf
- issn
- 1951-6851
- chave
- icaicte13
- keywords
- Passarola, exercise generation system, self-regulating study
- author
-
- José João Almeida
- Isabel Araújo
- Irene Brito
- Nuno Carvalho
- Gaspar J. Machado
- Rui M. S. Pereira
- Georgi Smirnov
- title
- ABC with a UNIX Flavor
- isbn
- 978-3-939897-52-1
- publisher
- Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik
- tipo
- inproceedings
- pages
- 203-218
- bibsource
- DBLP, http://dblp.uni-trier.de
- booktitle
- 2nd Symposium on Languages, Applications and Technologies,
SLATE 2013, June 20-21, 2013 - Porto, Portugal
- docpage
- jj.bib.dp.html#slate/AzevedoA13
- volume
- 29
- series
- OASICS
- irreditor
- José Paulo Leal and
Ricardo Rocha and
Alberto Simões
- chave
- slate/AzevedoA13
- author
-
- Bruno M. Azevedo
- José João Almeida
- doi
- http://dx.doi.org/10.4230/OASIcs.SLATE.2013.203
- abstract
-
ABC is a simple, yet powerful, textual musical notation.
This paper presents ABC::DT, a rule-based domain-specific
language (Perl embedded), designed to simplify the
creation of ABC processing tools. Inspired by the Unix philosophy,
those tools intend to be simple and compositional in a Unix filters' way.
From ABC::DT's rules we obtain an ABC processing tool whose main
algorithm follows a traditional compiler architecture, thus consisting of
three stages:
1) ABC parser (based on the abcm2ps parser),
2) ABC semantic transformation (associated with ABC attributes),
3) output generation (either a user defined or system provided ABC generator).
- year
- 2013
- url
- http://drops.dagstuhl.de/opus/volltexte/2013/4039/pdf/14.pdf
- chave
- escolex2013
- tipo
- article
- author
-
- Soares, Ana Paula
- José Carlos Medeiros
- Alberto Simões
- João Machado
- Ana Costa
- Álvaro Iriarte
- José João Almeida
- Ana P. Pinheiro
- Montserrat Comesaña
- title
- Escolex: A grade-level lexical database from European Portuguese
elementary to middle school textbooks.
- journal
- Behavior Research Methods
- docpage
- jj.bib.dp.html#escolex2013
- url
- http://p-pal.di.uminho.pt/static/files/db/Soares_et_al.__in_press_ESCOLEX.pdf
- pages
- 1--14
- year
- 2013
- abstract
-
In this article, we introduce ESCOLEX, the first European Portuguese children's
lexical database with grade-level-adjusted word frequency statistics. Computed
from a 3.2-million-word corpus, ESCOLEX provides 48,381 word forms extracted
from 171 elementary and middle school textbooks for 6- to 11-year-old children
attending the first six grades in the Portuguese educational system. Like other
children's grade-level databases, ESCOLEX provides four frequency indices for
each grade: overall word frequency (F), index of dispersion across the selected
textbooks (D), estimated frequency per million words (U), and standard
frequency index (SFI). It also provides a new measure, contextual diversity
(CD). In addition, the number of letters in the word and its part(s) of speech,
number of syllables, syllable structure, and adult frequencies taken from P-PAL
(a European Portuguese corpus-based lexical database) are provided. ESCOLEX
will be a useful tool both for researchers interested in language processing
and development and for professionals in need of verbal materials adjusted to
children's developmental stages. ESCOLEX can be downloaded along with this
article or from http://p-pal.di.uminho.pt/about/databases.
- booktitle
- Humanidades: Novos Paradigmas do Conhecimento e da Investigação,
XIV Colóquio de Outono
- editor
- Ana Gabriela Macedo and
Carlos Mendes de Sousa and
Vitor Moura
- docpage
- jj.bib.dp.html#coloquiosOutono2013
- year
- 2013
- pages
- 323--339
- author
-
- José João Almeida
- Sílvia Araújo
- Idalete Dias
- Ana Correio
- publisher
- húmus, Universidade do Minho
- tipo
- inproceedings
- chave
- coloquiosOutono2013
- title
- {Per-fide
- month
- April
- year
- 2014
- chapter
- 9
- editor
- Tony Berber Sardinha and Telma São-Bento Ferreira
- url
- http://ambs.perl-hackers.net/publications/perfide_ch9_sardinha.pdf
- chave
- sardinha2014
- author
-
- José João Almeida
- Sílvia Araújo
- Nuno Carvalho
- Idalete Dias
- Ana Oliveira
- André Santos
- Alberto Simões
- pages
- 177--200
- booktitle
- Working with Portuguese Corpora
- docpage
- jj.bib.dp.html#sardinha2014
- title
- The {Per-Fide
- isbn
- 978-1441190505
- publisher
- Bloomsbury Publishing
- tipo
- incollection
- title
- {Procura-PALavras (P-Pal): uma nova medida de frequência lexical do português europeu contemporâneo
- script
- sci_arttext
- publisher
- scielo
- tipo
- article
- pages
- 110 - 123
- docpage
- jj.bib.dp.html#SOARES2014
- journal
- {Psicologia: Reflexão e Crítica
- crossref
- 10.1590/S0102-79722014000100013
- pid
- S0102
- chave
- SOARES2014
- author
-
- Soares, Ana Paula
- Iriarte, Álvaro
- Almeida, José João
- Simões, Alberto
- Costa, Ana
- França, Patricia
- Machado, João
- Comesaña, Montserrat
- nrm
- iso
- language
- pt
- month
- 03
- year
- 2014
- url
- http://www.scielo.br/scielo.php?&-79722014000100013&volume = {27
- volume
- 27
- journal
- {Psicologia: Reflexao e Critica
- docpage
- jj.bib.dp.html#ppal2014
- number
- 1
- pages
- 110-123
- tipo
- article
- title
- Procura-PALavras (P-PAL): A new measure
of word frequency for contemporary European Portuguese | Procura-PALavras
(P-PAL): Uma nova medida de frequência lexical do Português Europeu
contemporâneo
- year
- 2014
- doi
- 10.1590/S0102-79722014000100013
- author
-
- Soares, A.P.
- Iriarte, A.
- Almeida, J.J.
- Simões, A.
- Costa, A.
- França, P.
- Machado, J.
- Comesaña, M.
- chave
- ppal2014
- address
-
- annote
- Document Type: Conference Paper; SCOPUS
- doi
- 10.1007/978-3-319-09153-2_9
- year
- 2014
- chave
- conclave-iccsa2104
- author
-
- Nuno Ramos Carvalho
- José João Almeida
- Maria João Varanda Pereira
- Pedro Rangel Henriques
- journal
- Lecture Notes in Computer Science (including subseries Lecture
Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
- docpage
- jj.bib.dp.html#conclave-iccsa2104
- offbooktitle
- 14th International Conference on Computational Science and its
Applications, ICCSA 2014; Guimaraes; Portugal
- volume
- 8584 LNCS
- pages
- 116-131
- number
- PART 6
- tipo
- article
- publisher
- Springer Verlag
- title
- {Conclave: Ontology-driven measurement of semantic relatedness
between source code elements and problem domain concepts
- number
- 4
- pages
- 1191-1207
- volume
- 11
- docpage
- jj.bib.dp.html#comsys-dmoss
- journal
- Computer Science and Information Systems
- title
- {DMOSS
- tipo
- article
- abstract
- Besides source code, the fundamental source of information
about open source software lies in documentation, and other non source
code files, like README, INSTALL, or How-To files, commonly available in
the software ecosystem. These documents, written in natural language,
provide valuable information during the software development stage,
but also in future maintenance and evolution tasks. DMOSS is a toolkit
designed to systematically assess the quality of non source code content
found in software packages. The toolkit handles a package as an attribute
tree, and performs several tree traverse algorithms through a set of
plugins, specialized in retrieving specific metrics from text, gathering
information about the software. These metrics are later used to infer
knowledge about the software, and composed together to build reports
that assess the quality of specific features. This paper discusses the
motivations for this work, continues with a description of the toolkit
implementation and design goals. This is followed by an example of its
usage to process a software package, and the produced report.
- year
- 2014
- show
- pprwc110
- url
- http://www.comsis.org/archive.php?-1308
- author
-
- Carvalho, N. R.
- Simões, A.
- Almeida, J. J.
- chave
- comsys-dmoss
- url
- http://www.sciencedirect.com/science/article/pii/S0164121214002179
- doi
- http://dx.doi.org/10.1016/j.jss.2014.10.013
- year
- 2014
- abstract
- Program comprehension techniques often explore program
identifiers, to infer knowledge about programs. The relevance of source code
identifiers as one relevant source of information about programs is already
established in the literature, as well as their direct impact on future
comprehension tasks. Most programming languages enforce some constraints on
identifier strings (e.g., white spaces or commas are not allowed). Also,
programmers often use word combinations and abbreviations, to devise strings
that represent single, or multiple, domain concepts in order to increase
programming linguistic efficiency (convey more semantics writing less). These
strings do not always use explicit marks to distinguish the terms used (e.g.,
CamelCase or underscores), so techniques often referred to as hard splitting are
not enough. This paper introduces Lingua::IdSplitter, a dictionary based
algorithm for splitting and expanding strings that compose multi-term
identifiers. It explores the use of general programming and abbreviations
dictionaries, but also a custom dictionary automatically generated from
software natural language content, prone to include application domain terms
and specific abbreviations. This approach was applied to two software packages,
written in C, achieving an f-measure of around 90% for correctly splitting and
expanding identifiers. A comparison with current state-of-the-art approaches is
also presented.
- chave
- jss-Carvalho2014
- issn
- 0164-1212
- keywords
- Identifier splitting
- author
-
- Nuno Ramos Carvalho
- José João Almeida
- Pedro Rangel Henriques
- Maria João Varanda
- journal
- Journal of Systems and Software
- docpage
- jj.bib.dp.html#jss-Carvalho2014
- volume
-
- number
- 0
- tipo
- article
- title
- From source code identifiers to natural language terms
- irreditor
- Maria João Varanda Pereira and José Paulo Leal and Alberto Simões
- author
-
- Nuno Ramos Carvalho
- José João Almeida
- Maria João Varanda Pereira
- Pedro Rangel Henriques
- chave
- conclave-slate2014
- year
- 2014
- annote
- Keywords: software maintenance, software evolution, program comprehension, feature location, concept location, natural language processing
- doi
- http://dx.doi.org/10.4230/OASIcs.SLATE.2014.19
- address
- Dagstuhl, Germany
- title
- {Conclave: Writing Programs to Understand Programs
- publisher
- Schloss Dagstuhl--Leibniz-Zentrum fuer Informatik
- tipo
- inproceedings
- pages
- 19--34
- volume
- 38
- series
- OpenAccess Series in Informatics (OASIcs)
- booktitle
- 3rd Symposium on Languages, Applications and Technologies
- docpage
- jj.bib.dp.html#conclave-slate2014
- title
- A Workflow Description Language to Orchestrate Multi-Lingual Resources
- biburl
- http://dblp.uni-trier.de/rec/bib/conf/slate/BritoA14
- isbn
- 978-3-939897-68-2
- tipo
- inproceedings
- publisher
- Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik
- pages
- 77--83
- docpage
- jj.bib.dp.html#DBLP:conf/slate/BritoA14
- booktitle
- 3rd Symposium on Languages, Applications and Technologies, SLATE
- series
- OASICS
- volume
- 38
- irreditor
- Maria João Varanda Pereira and
José Paulo Leal and
Alberto Simões
- chave
- DBLP:conf/slate/BritoA14
- author
-
- Rui Brito
- José João Almeida
- doi
- 10.4230/OASIcs.SLATE.2014.77
- year
- 2014
- url
- http://dx.doi.org/10.4230/OASIcs.SLATE.2014.77
- pages
- 251--265
- booktitle
- 3rd Symposium on Languages, Applications and Technologies, SLATE
- docpage
- jj.bib.dp.html#DBLP:conf/slate/SimoesAB14
- volume
- 38
- series
- OASICS
- title
- Language Identification: a Neural Network Approach
- biburl
- http://dblp.uni-trier.de/rec/bib/conf/slate/SimoesAB14
- isbn
- 978-3-939897-68-2
- publisher
- Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik
- tipo
- inproceedings
- doi
- 10.4230/OASIcs.SLATE.2014.251
- year
- 2014
- url
- http://dx.doi.org/10.4230/OASIcs.SLATE.2014.251
- irreditor
- Maria João Varanda Pereira and
José Paulo Leal and
Alberto Simões
- chave
- DBLP:conf/slate/SimoesAB14
- author
-
- Alberto Simões
- José João Almeida
- Simon D. Byers
- irreditor
- Maria João Varanda Pereira and
José Paulo Leal and
Alberto Simões
- author
-
- Pedro Carvalho
- José João Almeida
- chave
- DBLP:conf/slate/CarvalhoA14
- year
- 2014
- doi
- 10.4230/OASIcs.SLATE.2014.283
- url
- http://dx.doi.org/10.4230/OASIcs.SLATE.2014.283
- biburl
- http://dblp.uni-trier.de/rec/bib/conf/slate/CarvalhoA14
- title
- MLT-prealigner: a Tool for Multilingual Text Alignment
- tipo
- inproceedings
- publisher
- Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik
- isbn
- 978-3-939897-68-2
- pages
- 283--290
- series
- OASICS
- volume
- 38
- docpage
- jj.bib.dp.html#DBLP:conf/slate/CarvalhoA14
- booktitle
- 3rd Symposium on Languages, Applications and Technologies, SLATE
- chave
- tmxa
- author
-
- Rui Brito
- José João Almeida
- Alberto Simões
- url
- http://ambs.perl-hackers.net/publications/tmxa.pdf
- address
- Las Palmas de Gran Canaria, Spain
- year
- 2014
- abstract
- In recent years the amount of freely available multilingual
corpora has grown exponentially. Unfortunately the way these
corpora are made available is very diverse, ranging from simple text
files or specific XML schemas to supposedly standard formats like
the XML Corpus Encoding Initiative, the Text Encoding Initiative, or
even the Translation Memory Exchange formats.
In this document we defend the usage of Translation Memory Exchange
documents, but we enrich its structure in order to support the
annotation of the documents with different information like lemmas,
multi-words or entities.
To support the adoption of the proposed formats, we present a set of
tools to manipulate the different formats in an agile way.
- month
- November
- tipo
- inproceedings
- title
- Processing Annotated {TMX
- docpage
- jj.bib.dp.html#tmxa
- booktitle
- IberSpeech 2014 --- VIII Jornadas en Tecnologías del Habla and IV Iberian SLTech Workshop
- pages
- 188--197
- url
- http://dx.doi.org/10.1016/j.jss.2014.10.013
- doi
- 10.1016/j.jss.2014.10.013
- abstract
- Program comprehension techniques often explore program
identifiers, to infer knowledge about programs. The relevance of source code
identifiers as one relevant source of information about programs is already
established in the literature, as well as their direct impact on future
comprehension tasks. Most programming languages enforce some constraints on
identifiers strings (e.g., white spaces or commas are not allowed). Also,
programmers often use word combinations and abbreviations, to devise strings
that represent single, or multiple, domain concepts in order to increase
programming linguistic efficiency (convey more semantics writing less). These
strings do not always use explicit marks to distinguish the terms used (e.g.,
CamelCase or underscores), so techniques often referred to as hard splitting are
not enough. This paper introduces Lingua::IdSplitter, a dictionary-based
algorithm for splitting and expanding strings that compose multi-term
identifiers. It explores the use of general programming and abbreviations
dictionaries, but also a custom dictionary automatically generated from
software natural language content, prone to include application domain terms
and specific abbreviations. This approach was applied to two software packages,
written in C, achieving an f-measure of around 90% for correctly splitting and
expanding identifiers. A comparison with current state-of-the-art approaches is
also presented.
- year
- 2015
- chave
- jss-CarvalhoAHP15
- author
-
- Nuno Ramos Carvalho
- Jos{�
- keywords
- Identifier splitting
- timestamp
- Mon, 22 Dec 2014 09:51:10 +0100
- journal
- Journal of Systems and Software
- docpage
- jj.bib.dp.html#jss-CarvalhoAHP15
- volume
- 100
- pages
- 117--128
- bibsource
- dblp computer science bibliography, http://dblp.org
- tipo
- article
- title
- From source code identifiers to natural language terms
- biburl
- http://dblp.uni-trier.de/rec/bib/journals/jss/CarvalhoAHP15
- author
-
- Araújo, I.
- Brito, I.
- Machado, G.J.
- Pereira, R.M.S.
- Almeida, J.J.
- Smirnov, G.
- tipo
- article
- chave
- acores-wordcist2015
- title
- New algorithms for smart assessment of math exercises
- volume
- 353
- docpage
- jj.bib.dp.html#acores-wordcist2015
- journal
- Advances in Intelligent Systems and Computing
- year
- 2015
- pages
- 1221-1230
- year
- 2015
- booktitle
- 2015 10th Iberian Conference on Information Systems and Technologies,
CISTI 2015
- docpage
- jj.bib.dp.html#cisti-almeida2015
- url
- http://www.scopus.com/inward/record.url?-s2.0-84943328958&partnerID=MN8TOARS
- title
- Gröbner bases and mathematical exercises generation with nondetermined structure
- author
-
- Araújo, I.
- Smirnov, G.
- Almeida, J.J.
- eid
- 2
- tipo
- inproceedings
- chave
- cisti-almeida2015
- titlept
- Bases de Gröbner e geração de exercícios matemáticos com estrutura não determinada
- docpage
- jj.bib.dp.html#subtitles2015
- journal
- Quarterly Journal of Experimental Psychology
- volume
- 68
- pages
- 680-696
- number
- 4
- year
- 2015
- chave
- subtitles2015
- author
-
- Soares, A.P.
- Machado, J.
- Costa, A.
- Iriarte, Á.
- Simões, A.
- Almeida, J.J.
- Comesaña, M.
- Perea, M.
- tipo
- article
- title
- On the advantages of word frequency and contextual
diversity measures extracted from subtitles: The case of Portuguese
- title
- Experiments on Enlarging a Lexical Ontology
- isbn
- 978-3-319-27652-6
- tipo
- incollection
- publisher
- Springer International Publishing
- pages
- 49--56
- docpage
- jj.bib.dp.html#PULO:springer
- booktitle
- Languages, Applications and Technologies
- series
- Communications in Computer and Information Science
- volume
- 563
- irreditor
- Sierra-Rodríguez, José-Luis and Leal, José-Paulo and Simões, Alberto
- chave
- PULO:springer
- author
-
- Simões, Alberto
- Almeida, José João
- language
- English
- doi
- 10.1007/978-3-319-27653-3_5
- year
- 2015
- author
-
- Alberto Simões
- Xavier Gómez Guinovart
- J. João Almeida
- chave
- SIMES16.1052
- editor
- Nicoletta Calzolari (Conference Chair) and Khalid Choukri
and Thierry Declerck and Marko Grobelnik and Bente Maegaard and Joseph
Mariani and Asuncion Moreno and Jan Odijk and Stelios Piperidis
- month
- may
- year
- 2016
- language
- english
- address
- Portoroz, Slovenia
- publisher
- European Language Resources Association (ELRA)
- tipo
- inproceedings
- isbn
- 978-2-9517408-9-1
- title
- Enriching a {P
- date
- 23-28
- booktitle
- Proceedings of the Ninth International Conference on
Language Resources and Evaluation (LREC 2016)
- docpage
- jj.bib.dp.html#SIMES16.1052
- booktitle
- 5th Symposium on Languages, Applications and Technologies
(SLATE'16)
- docpage
- jj.bib.dp.html#almeida_et_al2016
- volume
- 51
- series
- OpenAccess Series in Informatics (OASIcs)
- pages
- 1--8
- offeditor
- Marjan Mernik and José Paulo Leal and Hugo Gonçalo Oliveira
- publisher
- Schloss Dagstuhl--Leibniz-Zentrum fuer Informatik
- tipo
- inproceedings
- title
- Context-Free Grammars: Exercise Generation and Probabilistic
Assessment
- offaddress
- Dagstuhl, Germany
- annote
- Keywords: Exercise generation, context-free grammars, assessment
- doi
- http://dx.doi.org/10.4230/OASIcs.SLATE.2016.10
- year
- 2016
- chave
- almeida_et_al2016
- author
-
- José João Almeida
- Eliana Grande
- Georgi Smirnov
- journal
- Iberian Conference on Information Systems and Technologies, CISTI
- docpage
- jj.bib.dp.html#cisti2016
- volume
- 2016-July
- doi
- https://doi.org/10.1109/CISTI.2016.7521367
- year
- 2016
- chave
- cisti2016
- author
-
- Araujo, C.
- Henriques, P.R.
- Martini, R.G.
- Almeida, J.J.
- tipo
- article
- title
- Architectural approaches to build the museum of the person
- author
-
- Araújo, I.
- Almeida, J.J.
- Smirnov, G.
- chave
- exercise-composition2016
- year
- 2016
- doi
- https://doi.org/10.1007/978-3-319-31307-8_24
- tipo
- article
- note
- WorldCIST'16
- title
- Exercise composition: From environment properties to composed problems
- volume
- 445
- docpage
- jj.bib.dp.html#exercise-composition2016
- journal
- Advances in Intelligent Systems and Computing
- pages
- 235-244
- tipo
- article
- note
- WorldCIST'16
- title
- OntoMP, an ontology to build the museum of the person
- journal
- Advances in Intelligent Systems and Computing
- docpage
- jj.bib.dp.html#ontoMP2016
- volume
- 445
- pages
- 653-661
- chave
- ontoMP2016
- author
-
- doi
- https://doi.org/10.1007/978-3-319-31307-8_67
- year
- 2016
- pages
- 277-286
- abstract
- Exercise generation on language specification is a challenging
problem, because of the richness of the objects in the domain.
In this paper we discuss Mgbeg (Meta-Grammar-Based Exercise Generator) -- a
toolkit for exercise generation on context-free languages.
The Mgbeg approach is based on a meta-grammar formalism and tool, used to define
a set of similar exercises.
Mgbeg is a simple attributed grammar used to describe the set of valid
exercises (and randomly generate one of them).
Each exercise typically contains several attributes calculated during the
generation steps: namely, one or more formal specifications of the language
(context-free grammar); the exercise statement; other information such as
examples, common mistakes, validation data, to be used in the construction
of the exercise statement, solution, and assessment steps.
In addition, the toolkit provides a grammar module, with functionality
for grammar comparison, sentence generation and recognition, and a template
engine (to help in calculating textual attributes).
- year
- 2017
- booktitle
- Recent Advances in Information Systems and Technologies
- docpage
- jj.bib.dp.html#portosanto-worldcist2017
- series
- Advances in Intelligent Systems and Computing, vol. 659
- title
- Exercise generation on language specification
- chave
- portosanto-worldcist2017
- note
- WorldCIST'17
- author
-
- Almeida, J.J.
- Eliana Grande
- Smirnov, G.
- tipo
- inproceedings
- pages
- 763-772
- volume
- 745
- series
- Advances in Intelligent Systems and Computing
- booktitle
- Trends and Advances in Information Systems and Technologies, WorldCist2018
- docpage
- jj.bib.dp.html#Martins2018a
- title
- Increasing authorship identification through emotional analysis
- publisher
- Springer International Publishing
- tipo
- incollection
- offeditor
- Álvaro Rocha and Hojjat Adeli and Luís Paulo Reis and Sandra Costanzo
- isbn
- 978-3-319-77702-3
- month
- March
- year
- 2018
- doi
- https://doi.org/10.1007/978-3-319-77703-0_76
- author
-
- Ricardo Martins
- J. João Almeida
- Pedro Rangel Henriques
- Paulo Novais
- chave
- Martins2018a
- edition
- 1
- year
- 2018
- pages
- 374--384
- volume
- 11314
- series
- Lecture Notes in Computer Science
- booktitle
- IDEAL
- docpage
- jj.bib.dp.html#DBLP:conf/ideal/MarcondesAN18
- title
- Chatbot Theory - A Naïve and Elementary Theory for Dialogue
Management
- author
-
- Francisco S. Marcondes
- José João Almeida
- Paulo Novais
- publisher
- Springer
- tipo
- inproceedings
- chave
- DBLP:conf/ideal/MarcondesAN18
- pages
- 61--66
- year
- 2018
- booktitle
- BRACIS
- docpage
- jj.bib.dp.html#DBLP:conf/bracis/Martins0ANH18
- title
- Hate Speech Classification in Social Media Using Emotional Analysis
- chave
- DBLP:conf/bracis/Martins0ANH18
- author
-
- Ricardo Martins
- Marco Gomes
- José João Almeida
- Paulo Novais
- Pedro Rangel Henriques
- publisher
- IEEE
- tipo
- inproceedings
- year
- 2018
- pages
- 276--283
- series
- Advances in Intelligent Systems and Computing
- volume
- 800
- docpage
- jj.bib.dp.html#DBLP:conf/dcai/MartinsAHN18
- booktitle
- DCAI
- title
- Domain Identification Through Sentiment Analysis
- tipo
- inproceedings
- publisher
- Springer
- author
-
- Ricardo Martins
- José João Almeida
- Pedro Rangel Henriques
- Paulo Novais
- chave
- DBLP:conf/dcai/MartinsAHN18
- tipo
- inproceedings
- publisher
- Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik
- author
-
- Rui Mendes
- José João Almeida
- chave
- DBLP:conf/slate/MendesA18
- title
- eOS: The Exercise Operating System
- series
- OASICS
- volume
- 62
- docpage
- jj.bib.dp.html#DBLP:conf/slate/MendesA18
- booktitle
- SLATE
- year
- 2018
- pages
- 5:1--5:13
- year
- 2018
- pages
- 8:1--8:8
- series
- OASICS
- volume
- 62
- docpage
- jj.bib.dp.html#DBLP:conf/slate/Almeida18
- booktitle
- SLATE
- title
- Abcl: Abc music notation with rich chord support (Short Paper)
- tipo
- inproceedings
- publisher
- Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik
- author
-
- chave
- DBLP:conf/slate/Almeida18
- series
- OASICS
- volume
- 62
- docpage
- jj.bib.dp.html#DBLP:conf/slate/MartinsAHN18
- booktitle
- SLATE
- year
- 2018
- pages
- 19:1--19:9
- tipo
- inproceedings
- publisher
- Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik
- author
-
- Ricardo Martins
- José João Almeida
- Pedro Rangel Henriques
- Paulo Novais
- chave
- DBLP:conf/slate/MartinsAHN18
- title
- Predicting Performance Problems Through Emotional Analysis (Short
Paper)
- title
- Creating a social media-based personal emotional lexicon
- tipo
- inproceedings
- publisher
- ACM
- author
-
- Ricardo Martins
- José João Almeida
- Paulo Novais
- Pedro Rangel Henriques
- chave
- DBLP:conf/webmedia/MartinsANH18
- year
- 2018
- pages
- 261--264
- docpage
- jj.bib.dp.html#DBLP:conf/webmedia/MartinsANH18
- booktitle
- WebMedia
- title
- Increasing Authorship Identification Through Emotional Analysis
- chave
- DBLP:conf/worldcist/MartinsAHN18
- tipo
- inproceedings
- publisher
- Springer
- author
-
- Ricardo Martins
- José João Almeida
- Pedro Rangel Henriques
- Paulo Novais
- pages
- 763--772
- year
- 2018
- docpage
- jj.bib.dp.html#DBLP:conf/worldcist/MartinsAHN18
- booktitle
- WorldCIST (1)
- series
- Advances in Intelligent Systems and Computing
- volume
- 745
- eywords
- Formal languages, Context-free grammars, Automatic assessment
- abstract
-
In this paper we consider the problem of cycle-free context-free grammars equivalence. To every context-free
grammar there corresponds a system of formal equations. Formally applying the iteration method to this system
we obtain the grammar axiom in the form of a formal power series composed of the words generated by the
grammar multiplied by the respective ambiguities.
We define a transform that attributes a matrix meaning to the system of formal equations and to formal power
series: terminal symbols are substituted by matrices and formal sum and product are substituted by the matrix
ones. In order to effectively compute the sum of a matrix series we numerically solve the system of matrix
equations. We prove distinguishability theorems showing that if two formal power series generated by cycle-free
context-free grammars are different, then there exists a matrix substitution such that the sums of the respective
matrix series are different. Based on this result, we suggest a procedure that can resolve the problem of
equivalence of cycle-free context-free grammars in many practical cases.
The results obtained in this paper form a theoretical basis for algorithms oriented to automatic assessment of
students' answers in computer science. We present the respective algorithms. Then we compare our approach
with a simple heuristic method based on CYK algorithm and discuss the limitations of our method.
- chave
- cola19
- author
-
- José João Almeida
- Eliana Grande
- Georgi Smirnov
- docpage
- jj.bib.dp.html#cola19
- journal
- Journal of Computer Languages
- volume
- 51
- pages
- 48-56
- tipo
- article
- publisher
- Elsevier
- title
- On solving cycle-free context-free grammar equivalence problem using numerical analysis
- tipo
- inproceedings
- title
- Hunting ancestors: A unified approach for discovering genealogical information
- volume
- 74
- type
- Conference Paper
- docpage
- jj.bib.dp.html#Almeida2019
- journal
- OpenAccess Series in Informatics
- number
- 22
- author
-
- Almeida, J.J.
- Mendes, R.C.
- eid
- 2
- chave
- Almeida2019
- url
- https://www.scopus.com/inward/record.uri?-s2.0-85071097688&.4230%2fOASIcs.SLATE.2019.22&partnerID=40&md5=8e2f42806d411bdfa553dcfa27be17a9
- abstract
- This paper presents a unified approach for discovering
genealogical information. It presents a framework for storing information
concerning ancestors, locations, dates and documents. It also intends
to provide a framework that is able to perform inference concerning
dates by using constraints and for handling relations, locations and
sources. The DSL presented also aims to help users store information
from heterogeneous sources along with the evidence contained therein. ©
José J. Almeida and Rui C. Mendes.
- source
- Scopus
- year
- 2019
- doi
- 10.4230/OASIcs.SLATE.2019.22
- abstract
- The digital era has brought some challenges to lexicographers,
but it has also brought new opportunities as part of the rise of
information technology and, more recently, the emergence of digital
humanities. This paper provides a description of LeXmart, the framework
that supports the digital development of the Portuguese Academy of
Sciences Dictionary. LeXmart is a smart tool framework to support
lexicographers' work that offers different types of tools, ranging from a
structural editor to a set of validation tools. Given that the dictionary
is stored in eXist-DB, LeXmart is developed on top of its ecosystem,
using W3C standard languages, and offering default functionalities
offered by eXist-DB, namely a RESTful API. © 2019 Lexical Computing CZ
s.r.o. All rights reserved.
- source
- Scopus
- year
- 2019
- url
- https://www.scopus.com/inward/record.uri?-s2.0-85075350281&partnerID=40&md5=c5171c547089e5728c1cec0d5c755df1
- eid
- 2
- author
-
- Simões, Alberto
- Salgado, Ana
- Costa, Rute
- Almeida, J.J.
- chave
- Simões2019453
- pages
- 453-466
- volume
- 2019-October
- type
- Conference Paper
- docpage
- jj.bib.dp.html#Simões2019453
- journal
- Proceedings of Electronic Lexicography in the 21st Century Conference
- title
- LexMart: A smart tool for lexicographers
- tipo
- inproceedings
- type
- Article
- docpage
- jj.bib.dp.html#Martins2019
- journal
- Expert Systems
- number
- e12469
- tipo
- article
- title
- A sentiment analysis approach to increase authorship identification
- url
- https://www.scopus.com/inward/record.uri?-s2.0-85074844787&.1111%2fexsy.12469&partnerID=40&md5=bb5b7acab849e47b90246393026a4ba4
- abstract
- Writing style is considered the manner in which an author
expresses his thoughts, influenced by language characteristics, period,
school, or nation. Often, this writing style can identify the author. One
of the most famous examples comes from 1914 in Portuguese literature:
Fernando Pessoa and his heteronyms Alberto Caeiro, Álvaro de Campos,
and Ricardo Reis, who had completely different writing styles, led
people to believe that they were different individuals. Currently,
the discussion of authorship identification is more relevant because
of the considerable amount of widespread fake news in social media,
in which it is hard to identify who authored a text and even a simple
quote can impact the public image of an author, especially if these
texts or quotes are from politicians. This paper presents a process to
analyse the emotion contained in social media messages such as Facebook to
identify the author's emotional profile and use it to improve the ability
to predict the author of the message. Using preprocessing techniques,
lexicon-based approaches, and machine learning, we achieved an authorship
identification improvement of approximately 5% in the whole dataset
and more than 50% in specific authors when considering the emotional
profile on the writing style, thus increasing the ability to identify
the author of a text by considering only the author's emotional profile,
previously detected from prior texts. © 2019 John Wiley & Sons, Ltd.
- source
- Scopus
- year
- 2019
- doi
- 10.1111/exsy.12469
- eid
- 2
- author
-
- Martins, R.
- Almeida, J.J.
- Henriques, P.
- Novais, P.
- chave
- Martins2019
- tipo
- inproceedings
- title
- Domain identification through sentiment analysis
- volume
- 800
- type
- Conference Paper
- docpage
- jj.bib.dp.html#Martins2019276
- journal
- Advances in Intelligent Systems and Computing
- pages
- 276-283
- eid
- 2
- author
-
- Martins, R.
- Almeida, J.J.
- Henriques, P.
- Novais, P.
- chave
- Martins2019276
- url
- https://www.scopus.com/inward/record.uri?-s2.0-85049987273&.1007%2f978-3-319-94649-8_33&partnerID=40&md5=3fe3521d746330d391ee8ec0dd7bd4e9
- source
- Scopus
- abstract
- When dealing with chatbots, domain identification is an
important feature to adapt the interactions between user and computer in
order to increase the reliability of the communication and, consequently,
the audience and decrease its rejection avoiding misunderstandings. In
order to adapt to different domains, the writing style will be different
for the same author. For example, the same person in the role of a
student writes to his professor in a different style than he does for
his brother. This article presents a process that uses sentiment analysis
to identify the average emotional profile of the communication scenario
where the conversation is done. Using Natural Language Processing and
Machine Learning techniques, it was possible to obtain an index of
96.21% of correct classifications in the identification of where these
communications have occurred only analysing the emotional profile of
these texts. © Springer International Publishing AG, part of Springer
Nature 2019.
- year
- 2019
- doi
- 10.1007/978-3-319-94649-8_33
- tipo
- inproceedings
- title
- Musikla: Language for generating musical events
- type
- Conference Paper
- journal
- OpenAccess Series in Informatics
- docpage
- jj.bib.dp.html#Silva2020
- volume
- 83
- number
- A6
- chave
- Silva2020
- eid
- 2
- author
-
- Silva, Pedro
- Almeida, J.J.
- url
- https://www.scopus.com/inward/record.uri?-s2.0-85091704838&.4230%2fOASIcs.SLATE.2020.6&partnerID=40&md5=1c450e4e7bb940f5855eafaedb4ccba3
- doi
- 10.4230/OASIcs.SLATE.2020.6
- source
- Scopus
- abstract
- In this paper, we'll discuss a simple approach to integrating
musical events, such as notes or chords, into a programming language. This
means treating music sequences as a first class citizen. It will be
possible to save those sequences into variables or play them right away,
pass them into functions or apply operators on them (like transposing or
repeating the sequence). Furthermore, instead of just allowing static
sequences to be generated, we'll integrate a music keyboard system
that easily allows the user to bind keys (or other kinds of events) to
expressions. Finally, it is important to provide the user with multiple
and extensible ways of outputting their music, such as synthesizing it into
a file or directly into the speakers, or writing a MIDI or music sheet
file. We'll structure this paper first with an analysis of the problem
and its particular requirements. Then we will discuss the solution we
developed to meet those requirements. Finally we'll analyze the result
and discuss possible alternative routes we could've taken. © 2020 Schloss
Dagstuhl - Leibniz-Zentrum für Informatik GmbH, Dagstuhl Publishing. All
rights reserved.
- year
- 2020
- title
- BhTSL, behavior trees specification and processing
- tipo
- inproceedings
- number
- A4
- type
- Conference Paper
- docpage
- jj.bib.dp.html#Oliveira2020
- journal
- OpenAccess Series in Informatics
- volume
- 83
- chave
- Oliveira2020
- author
-
- Oliveira, M.
- Silva, P.M.
- Moura, Pedro
- Almeida, J.J.
- Henriques, P.R.
- eid
- 2
- doi
- 10.4230/OASIcs.SLATE.2020.4
- abstract
- In the context of game development, there is always the
need for describing behaviors for various entities, whether NPCs or
even the world itself. That need requires a formalism to describe
properly such behaviors. As the gaming industry has been growing,
many approaches were proposed. First, finite state machines were used
and evolved to hierarchical state machines. As that formalism was not
enough, a more powerful concept appeared. Instead of using states for
describing behaviors, people started to use tasks. This concept was
incorporated in behavior trees. This paper focuses on the specification
and processing of Behavior Trees. A DSL designed for that purpose will
be introduced. We will also discuss a generator that produces LaTeX
diagrams to document the trees, and a Python module to implement the
behavior described. Additionally, a simulator will be presented. These
achievements will be illustrated using a concrete game as a case study. ©
2020 Schloss Dagstuhl - Leibniz-Zentrum für Informatik GmbH, Dagstuhl
Publishing. All rights reserved.
- source
- Scopus
- year
- 2020
- url
- https://www.scopus.com/inward/record.uri?-s2.0-85091707856&.4230%2fOASIcs.SLATE.2020.4&partnerID=40&md5=3b2daa7d548eeed77224386d6790adc7
- author
-
- Simões, Alberto
- Sacanene, B.
- Iriarte, Alvaro
- Almeida, J.J.
- Macedo, J.
- eid
- 2
- chave
- Simões2020
- source
- Scopus
- abstract
- In this document we present the first developments on an Umbundu
dictionary for jSpell, a morphological analyzer. Initially some comments
are performed regarding the Umbundu language morphology, followed by the
discussion on jSpell dictionaries structure and its environment. Last, we
describe the Umbundu dictionary bootstrap process and perform some final
experiments on its coverage. © 2020 Schloss Dagstuhl - Leibniz-Zentrum
für Informatik GmbH, Dagstuhl Publishing. All rights reserved.
- year
- 2020
- doi
- 10.4230/OASIcs.SLATE.2020.10
- url
- https://www.scopus.com/inward/record.uri?-s2.0-85091700212&.4230%2fOASIcs.SLATE.2020.10&partnerID=40&md5=26f3c0eacb3fc1ea35f005c08377b083
- title
- Towards a morphological analyzer for the umbundu language
- tipo
- inproceedings
- number
- A10
- volume
- 83
- type
- Conference Paper
- journal
- OpenAccess Series in Informatics
- docpage
- jj.bib.dp.html#Simões2020
- chave
- Marcondes2020
- author
-
- Marcondes, F.S.
- Almeida, J.J.
- Novais, P.
- eid
- 2
- source
- Scopus
- abstract
- The username hints for most of the on-line social networks are
mostly unpleasant for human beings since they are mostly a simple name
variation followed by numbers. This paper shows that it is possible to
generate human likable usernames through heuristics guided by structural
onomastics. The objective then is to conceive heuristics as such and
check its availability in Twitter in order to verify if it is possible
to generate a sufficiently big and available username data-set that is
able to justify the transitions from unpleasant to a pleasant username
suggestion. This paper finds that it is possible to generate 8281 handles
on average through the proposed heuristics and their permutations,
therefore, the number of various possibilities is comfortable. This is
a partial account since not all possibilities were explored and some
improvements are required, but suits for a proof of concept and to
indicate paths. © 2020 CEUR-WS. All rights reserved.
- year
- 2020
- url
- https://www.scopus.com/inward/record.uri?-s2.0-85090898082&partnerID=40&md5=3bee224fddd1133fbeb306d5c88737fa
- title
- Structural onomatology for username generation: A partial account
- tipo
- inproceedings
- type
- Conference Paper
- docpage
- jj.bib.dp.html#Marcondes2020
- journal
- CEUR Workshop Proceedings
- volume
- 2655
- title
- A short survey on chatbot technology: Failure in raising the state of the art
- tipo
- inproceedings
- pages
- 28-36
- volume
- 1003
- docpage
- jj.bib.dp.html#Marcondes202028
- journal
- Advances in Intelligent Systems and Computing
- type
- Conference Paper
- eid
- 2
- author
-
- Marcondes, F.S.
- Almeida, J.J.
- Novais, P.
- chave
- Marcondes202028
- year
- 2020
- abstract
- This short survey aimed initially to explore the existing
state of the art for the application of chatbot on fighting (and not on
spreading) of fake-news. It was then realized that it is not common to
use chatbots with this "virtuous" purpose. Therefore, after two surveys
and a meta-analysis, the topic had to be withdrawn since there were no
survey results to discuss besides the absence of results. The survey
result raised then a need to realize how chatbots are being currently
used, designed and their primary sources. The result was once again
confusing since, on the sample: (1) no significant concentration of usage
could be found; (2) no widely adopted design strategies were identified,
and (3) no significant crosscutting references to be considered as primary
sources. Certainly, this can be due to a biased sample but may also be a
symptom of a methodological issue on the chatbot researches. If the second
possibility is proved to be right it means that chatbot research is still
on a pre-paradigm stage according to Kuhn's conception. For this paper,
there were performed 4 surveys with a total sample of 50 papers mostly
from the last 3 years. © Springer Nature Switzerland AG 2020.
- source
- Scopus
- doi
- 10.1007/978-3-030-23887-2_4
- url
- https://www.scopus.com/inward/record.uri?-s2.0-85068602421&.1007%2f978-3-030-23887-2_4&partnerID=40&md5=cbf6fb00a51eb082aa7e1097f926fece
- type
- Conference Paper
- journal
- Advances in Intelligent Systems and Computing
- docpage
- jj.bib.dp.html#Marcondes2020170
- volume
- 1160 AISC
- pages
- 170-180
- tipo
- article
- title
- Fact-Check spreading behavior in twitter: A qualitative profile for false-claim news
- url
- https://www.scopus.com/inward/record.uri?-s2.0-85086245198&.1007%2f978-3-030-45691-7_16&partnerID=40&md5=6547f11464462d6bfdb1505e6142b733
- doi
- 10.1007/978-3-030-45691-7_16
- source
- Scopus
- abstract
- Fact-check spread is usually performed by a plain tweet with
just the link. Since it is not proper human behavior, it may cause
uncanny, hinder the reader's attention and harm the counter-propaganda
influence. This paper presents a profile of fact-check link spread in
Twitter (suiting for TRL-1) and, as an additional outcome, proposes
a preliminary behavior design based on it (suiting for TRL-2). The
underlying hypothesis is by simulating human-like behavior, a bot gets
more attention and exerts more influence on its followers. © The Editor(s)
(if applicable) and The Author(s), under exclusive license to Springer
Nature Switzerland AG 2020.
- year
- 2020
- chave
- Marcondes2020170
- author
-
- Marcondes, F.S.
- Almeida, J.J.
- Durães, D.
- Novais, P.
- eid
- 2
- url
- https://www.scopus.com/inward/record.uri?-s2.0-85085513930&.1007%2f978-3-030-45688-7_14&partnerID=40&md5=d559e334a2140bea6ea02051264b73c4
- abstract
- Political debate - in its essence - carries a robust
emotional charge, and social media have become a vast arena for voters
to disseminate and discuss the ideas proposed by candidates. The
Brazilian presidential elections of 2018 were marked by a high level
of polarization, making the discussion of the candidates' ideas an
ideological battlefield, full of accusations and verbal aggression,
creating an excellent source for sentiment analysis. In this paper,
we analyze the emotions of the tweets posted about the presidential
candidates of Brazil on Twitter, so that it was possible to identify the
emotional profile of the adherents of each of the leading candidates,
and thus to discern which emotions had the strongest effects upon the
election results. Also, we created a model using sentiment analysis and
machine learning, which predicted with a correlation of 0.90 the final
result of the election. © 2020, The Editor(s) (if applicable) and The
Author(s), under exclusive license to Springer Nature Switzerland AG.
- source
- Scopus
- year
- 2020
- doi
- 10.1007/978-3-030-45688-7_14
- eid
- 2
- author
-
- Martins, R.
- Almeida, J.
- Henriques, P.
- Novais, P.
- chave
- Martins2020134
- volume
- 1159 AISC
- type
- Conference Paper
- journal
- Advances in Intelligent Systems and Computing
- docpage
- jj.bib.dp.html#Martins2020134
- pages
- 134-143
- tipo
- article
- title
- Predicting an Election's Outcome Using Sentiment Analysis
- chave
- Martins201861
- eid
- 2
- author
-
- Martins, R.
- Gomes, M.
- Almeida, J.J.
- Novais, P.
- Henriques, P.
- url
- https://www.scopus.com/inward/record.uri?-s2.0-85060849408&.1109%2fBRACIS.2018.00019&partnerID=40&md5=10284a22b511c161a903debd79e5619a
- doi
- 10.1109/BRACIS.2018.00019
- year
- 2018
- abstract
- In this paper, we examine methods to classify hate speech
in social media. We aim to establish lexical baselines for this task
by applying classification methods using a dataset annotated for this
purpose. As features, our system uses Natural Language Processing (NLP)
techniques in order to expand the original dataset with emotional
information and provide it for machine learning classification. We
obtain results of 80.56% accuracy in hate speech identification, which
represents an increase of almost 100% from the original analysis used
as a reference. © 2018 IEEE.
- source
- Scopus
- tipo
- inproceedings
- title
- Hate speech classification in social media using emotional analysis
- journal
- Proceedings - 2018 Brazilian Conference on Intelligent Systems, BRACIS 2018
- docpage
- jj.bib.dp.html#Martins201861
- type
- Conference Paper
- pages
- 61-66
- number
- 8575590