Name

Package trad - a Perl module for naive translation


Synopsis

use trad;

trans_dic(<dictname>)
trans_prefix(<undefWordPrefix>);
trans(<string1>):string
trans_ppp(<PrePostProcessingFunction>);   # pre- and postprocessing
trans_und(<filename>)
trans_dont_touch(<patternstring>)


Description

Setting the dictionaries

In order to define the list of dictionaries to be used, call

trans_dic(<filename1>, ... , <filenameN>);

Text not to be touched

In order to define the parts of the text not to be touched, call

trans_dont_touch(<regExpressionString>);
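How trad protects the matched text internally is not described here. As a rough sketch of the idea only, the code below copies every span matching a protection pattern unchanged and passes the text in between to a translation step. The names translate_except and fake_translate are hypothetical, and the upper-casing stands in for the real dictionary lookup.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the real translation step: upper-case the text.
sub fake_translate { return uc $_[0] }

# Hypothetical helper (not part of trad): translate everything except the
# spans that match $protect, which are copied through untouched.
sub translate_except {
    my ($text, $protect) = @_;
    my ($out, $pos) = ('', 0);
    while ($text =~ /$protect/g) {
        my ($start, $end) = ($-[0], $+[0]);        # offsets of the whole match
        $out .= fake_translate(substr($text, $pos, $start - $pos));
        $out .= substr($text, $start, $end - $start);
        $pos = $end;
    }
    $out .= fake_translate(substr($text, $pos));   # text after the last match
    return $out;
}

# The pattern below matches a backslash followed by a word (a LaTeX command).
print translate_except('see \alpha and \beta here', '\\\\(\w+)'), "\n";
# prints: SEE \alpha AND \beta HERE
```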

Translating a phrase

To translate a phrase, call

trans(<phrase>)

The translated phrase is returned.

Undefined word prefix

In order to define the prefix used to tag undefined words, call:

trans_prefix(<newprefix>)

By default, the prefix is "@".
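The effect of the prefix can be pictured with a small stand-alone sketch. It is an assumption about the behaviour, not the module's code: it looks up one word at a time in a tiny in-memory dictionary and tags every word it cannot find, whereas trad itself also handles multi-word entries. The name naive_lookup and the sample entries are hypothetical.

```perl
use strict;
use warnings;

# Tiny illustrative dictionary (hypothetical contents).
my %dict = (abrir => 'open', algumas => 'some');

# Hypothetical helper (not part of trad): replace known words and
# tag unknown ones with $prefix.
sub naive_lookup {
    my ($text, $prefix) = @_;
    $text =~ s/(\w+)/exists $dict{$1} ? $dict{$1} : $prefix . $1/ge;
    return $text;
}

print naive_lookup('abrir algumas portas', '@'), "\n";
# prints: open some @portas
```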

Pre- and postprocessing

The user can define a preprocessing function

preProc();

and a postprocessing function

posProc();

to adapt notations.

In order to do that, create a new file with those functions and call

trans_ppp(<filename>);

to activate them.

Undefined word log file

In order to keep a log file of undefined words, call:

trans_und(<filename>);


Example


Dictionary format

# Portuguese-English dictionary
# #1 - used for adjectives (to swap the word order)
# #2 - for proper nouns (O Joao => Joao)
# #3 - only for "a" (an elephant)
Português=Portuguese
a partir de=from
a=the
abrir=open
aceder=access
acrescentado=added#1
actividades de investigação=research activities
algumas=some
alterado=changed#1
alterar o nome de=rename
desconhecida=unknown#1
não alterado=unchanged#1
não foi alterado=is unchanged
não reconhecida=unrecognized#1
não se conseguiu=could not
não é um=not a#3
não é=is not
não=not
um=a#3
uma=a#3
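To make the format concrete, here is a minimal sketch of how such a file could be read into a hash. It assumes that lines starting with "#" are comments, that each entry is split at its first "=", and that a trailing tag such as "#1" stays attached to the translation for the postprocessing step. The function name parse_dict is hypothetical, and the module's own loader may differ.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of trad): parse dictionary text into a hash.
sub parse_dict {
    my ($text) = @_;
    my %dict;
    for my $line (split /\n/, $text) {
        next if $line =~ /^\s*(?:#|$)/;             # skip comments and blank lines
        my ($source, $target) = split /=/, $line, 2;
        next unless defined $target;                # ignore lines without "="
        $dict{$source} = $target;                   # tags like #1 stay in the value
    }
    return \%dict;
}

my $dict = parse_dict(<<'END');
# comment line
a partir de=from
acrescentado=added#1
um=a#3
END

print "$_ => $dict->{$_}\n" for sort keys %$dict;
```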


Postprocessing function

sub posProc {
    while (
        # o gato bonito -> the beautiful cat
        # é bonito -> is beautiful
        s/\b(the|a#3|some|all) (\w+) (\w+)#1/$1 $3 $2/g ||
        s/\b(is|are|were) (\w+)#1/$1 $2/g ||
        # O Joao -> Joao
        s/\b([Tt]he) (\w+)#2/$2/g ||
        s/#2//g ||
        # an elephant
        # a table
        s/#3 ([aeiouAEIOU]\w*)/n $1/g ||
        s/#3//g
    ) {}
}
1; # to keep perl happy
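The effect of these rules can be checked without the module. The sketch below applies a copy of the same substitutions to a few hand-written tagged strings. The inputs are assumptions about what the dictionary lookup produces before postprocessing, and the %result hash is only there for illustration.

```perl
use strict;
use warnings;

# Copy of the postprocessing rules above; operates on $_.
sub posProc {
    while (
        s/\b(the|a#3|some|all) (\w+) (\w+)#1/$1 $3 $2/g ||   # swap noun and adjective
        s/\b(is|are|were) (\w+)#1/$1 $2/g ||                 # predicative adjective
        s/\b([Tt]he) (\w+)#2/$2/g ||                         # drop article before proper noun
        s/#2//g ||
        s/#3 ([aeiouAEIOU]\w*)/n $1/g ||                     # a -> an before a vowel
        s/#3//g
    ) {}
}

my %result;
for my $tagged ('the cat beautiful#1', 'a#3 elephant', 'a#3 table', 'the Joao#2') {
    local $_ = $tagged;
    posProc();
    $result{$tagged} = $_;
    print "$tagged -> $_\n";
}
# prints:
# the cat beautiful#1 -> the beautiful cat
# a#3 elephant -> an elephant
# a#3 table -> a table
# the Joao#2 -> Joao
```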


A script example

#!/usr/local/bin/perl
use trad;

trans_dic("dict");
trans_prefix("@@@");
trans_und("dict.und");
trans_ppp("dict.pos");
trans_dont_touch('\\\\(\w+)');   # LaTeX commands

while (<>) {
    print(trans($_));
}