This toolkit is a set of Perl scripts for generating and processing CWB (Corpus Workbench) corpora.

makefile   - try 'make t1' or 'make t2'

cqp.cgi    - CGI for querying all the corpora (those with read permission for all)
            This script also guesses whether a corpus is a parallel one

txt2cqp    - makes a CWB corpus from a list of text, HTML, or XML files
   Ex: txt2cqp -lema -html corpusname ~/public_html/musica/html/*.html
     then see http://natura.di.uminho.pt/jjbin/cqp5 (corpus mmm)

addradical - (used by txt2cqp) uses jspell to add lemma and POS to a CQP corpus

html2pml   - transforms HTML into a PML file (= HTML with almost nothing but <p> tags)

html2pml -listofpair - takes a file whose lines are pairs of filenames and
           produces a pair of files, each the concatenation of the PML
           of one side. Good for aligning HTML files.
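
A minimal sketch of the concatenation step described above, not the
script's actual code. It assumes each line of the list holds two
whitespace-separated filenames (the separator is not documented here), and
it leaves out the HTML-to-PML conversion: read_text is a hypothetical
callback that returns the already converted text of a file.

```python
def concatenate_pairs(pair_lines, read_text):
    """Build the two concatenated sides from a list of 'left right' lines.

    pair_lines: iterable of strings, one filename pair per line
                (assumed whitespace-separated).
    read_text:  hypothetical callback mapping a filename to its PML text.
    """
    left_parts, right_parts = [], []
    for line in pair_lines:
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or malformed lines
        left_name, right_name = fields
        left_parts.append(read_text(left_name))
        right_parts.append(read_text(right_name))
    return "".join(left_parts), "".join(right_parts)


# Usage with in-memory stand-ins for the converted files.
FILES = {
    "a.en.html": "<p>Hello</p>\n",
    "a.pt.html": "<p>Ola</p>\n",
    "b.en.html": "<p>Bye</p>\n",
    "b.pt.html": "<p>Adeus</p>\n",
}
LEFT, RIGHT = concatenate_pairs(
    ["a.en.html a.pt.html", "", "b.en.html b.pt.html"], FILES.__getitem__
)
```

Keeping both sides in the same order is what makes the two output files
usable as input for an aligner.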

my.tmx     - a translation memory in TMX format for test t1

quebraxmlsent - (used by txt2cqp) splits texts, or a specific XML element, into sentences

tmx2cqp    - builds a CQP parallel corpus from a TMX

tmxsplit   - splits a TMX into XML files (one for each language)
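
To illustrate what "one for each language" means, here is a small sketch
that groups the segments of a TMX by language. It is not how tmxsplit is
implemented. It only relies on the TMX layout (tu > tuv > seg, with the
language in the xml:lang attribute, or lang in older TMX versions).
split_tmx_by_language and SAMPLE are names made up for this example.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# ElementTree exposes the xml:lang attribute under this qualified name.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def split_tmx_by_language(tmx_text):
    """Return {language: [segment, ...]} for every <tuv> in the TMX."""
    root = ET.fromstring(tmx_text)
    by_lang = defaultdict(list)
    for tu in root.iter("tu"):
        for tuv in tu.findall("tuv"):
            # Fall back to the plain 'lang' attribute, then to 'und'.
            lang = tuv.get(XML_LANG) or tuv.get("lang") or "und"
            seg = tuv.find("seg")
            text = "".join(seg.itertext()) if seg is not None else ""
            by_lang[lang].append(text)
    return dict(by_lang)


# Hypothetical two-language sample, not the my.tmx shipped with the toolkit.
SAMPLE = """<tmx version="1.4">
  <header creationtool="example" creationtoolversion="0"
          segtype="sentence" o-tmf="none" adminlang="en"
          srclang="en" datatype="plaintext"/>
  <body>
    <tu>
      <tuv xml:lang="en"><seg>Good morning</seg></tuv>
      <tuv xml:lang="pt"><seg>Bom dia</seg></tuv>
    </tu>
  </body>
</tmx>"""

SEGMENTS = split_tmx_by_language(SAMPLE)
```

Because every language list is filled in translation-unit order, the n-th
segment of one language stays paired with the n-th segment of the other
whenever each unit contains both languages.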

xmlalign2cqp - makes the CWB corpora and aligns them
   (tags: f for synchronization and
          p for alignment)

align2tmx - see cqpalign2tmx

cqpalign2tmx - makes a TMX (Translation Memory eXchange format) from an alignment file.
   The alignment files are created with EasyAlign (CQP)

filealigner - aligns a pair of files
   uses 
     html2pml (to convert to PML)
     xmlalign2cqp ... (to align with EasyAlign)
     align2tmx      (to build a TMX)

pdfaligner
htmlaligner

mkbitextra  - see mkterminum

mkterminum
   dir           directory
            -> paths        .paths           list of files
            -> blocks       .blocks          list of blocks
            -> _pairs       ._pairs          list of bitext candidate pairs
            -> pairs        .pairs           list of bitexts
            -> tmx          .tmxdir          directory with the TMXs
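
The table above can be read as a chain of stages, each leaving behind a
file or directory. The sketch below lists what to expect for a given
directory. It assumes that every output is named after the input directory
plus the suffix shown in the table, which this README does not confirm.
stage_outputs is a hypothetical helper, not part of the toolkit.

```python
from pathlib import Path

# (stage, suffix, description), copied from the table above.
STAGES = [
    ("paths", ".paths", "list of files"),
    ("blocks", ".blocks", "list of blocks"),
    ("_pairs", "._pairs", "list of bitext candidate pairs"),
    ("pairs", ".pairs", "list of bitext"),
    ("tmx", ".tmxdir", "directory with the TMXs"),
]


def stage_outputs(directory):
    """Map each stage to its expected output path.

    Assumption: outputs are written next to the directory as
    '<directory><suffix>'.
    """
    base = Path(directory)
    return {
        stage: base.with_name(base.name + suffix)
        for stage, suffix, _description in STAGES
    }


# Usage: print the expected outputs for a directory called 'site'.
for stage, path in stage_outputs("site").items():
    print(f"{stage:8} {path}")
```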


Files used in tests:
   makefile
   listofpairs.ex
   listofurls.ex
   my.tmx

Installation needs:
   cwb
   jspell and the jspell.pt dictionary (or jspell.en, but that one is poor)
   Lingua::PT::PLN