M4t1ss/ChunkMT

Combining machine translated sentence chunks from multiple MT systems

ChunkMT is a hybrid system that aims to produce the best translation of an input sentence by combining translated chunks from multiple online MT engines.

Included software

Requirements

Supported APIs

  • Google Translate
  • Bing Translator
  • LetsMT

Usage

Upload the files to your server and set execute permissions (chmod 755) on exp.sh, query, and utils/translated_chunks_to_hybrid/query.

ChunkMT requires three parameters, in this order: the language model, the input sentences file, and the grammar file. Run it with the following command:

php ChunkMT.php <language model> <input sentences> <grammar>

For example:

php ChunkMT.php languageModel.binary inputSentences.txt eng_sm6.gr
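Since the three arguments are positional, a quick pre-flight check can catch a missing file before ChunkMT is started. The following is a minimal sketch and not part of ChunkMT: the function name `check_chunkmt_inputs` is hypothetical, and it assumes only that each argument should be a readable, non-empty regular file.

```shell
# Hypothetical helper (not part of ChunkMT): verify that the three
# positional arguments point at readable, non-empty regular files.
check_chunkmt_inputs() {
    if [ "$#" -ne 3 ]; then
        echo "usage: check_chunkmt_inputs <language model> <input sentences> <grammar>" >&2
        return 2
    fi
    for f in "$@"; do
        # -f: regular file, -r: readable, -s: size greater than zero
        if [ ! -f "$f" ] || [ ! -r "$f" ] || [ ! -s "$f" ]; then
            echo "missing, unreadable, or empty file: $f" >&2
            return 1
        fi
    done
    return 0
}
```

It could be chained with the documented command, for example `check_chunkmt_inputs languageModel.binary inputSentences.txt eng_sm6.gr && php ChunkMT.php languageModel.binary inputSentences.txt eng_sm6.gr`.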

ChunkMT generates four output files:

  • output.google.txt
  • output.bing.txt
  • output.letsmt.txt
  • output.hybrid.txt
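After a run, it can be useful to confirm that all four files were written. The sketch below is a hypothetical helper, not part of ChunkMT. It assumes the four files are in a single directory (the current one by default) and simply prints each file's line count.

```shell
# Hypothetical helper (not part of ChunkMT): print the line count of each
# expected output file and return non-zero if any of them is missing.
# Assumption: all four files are in one directory (default: current directory).
summarize_chunkmt_outputs() {
    dir="${1:-.}"
    rc=0
    for name in google bing letsmt hybrid; do
        file="$dir/output.$name.txt"
        if [ -f "$file" ]; then
            # Reading from stdin makes wc print only the count, no file name.
            count=$(wc -l < "$file")
            # $((count)) strips any padding that some wc implementations add.
            echo "output.$name.txt $((count))"
        else
            echo "output.$name.txt missing" >&2
            rc=1
        fi
    done
    return "$rc"
}
```

Calling `summarize_chunkmt_outputs` with no argument checks the current directory; passing a path checks that directory instead.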

Utils

The utils directory contains parts of the ChunkMT system that can be run standalone:

  • utils/chunking/ contains files for standalone chunking and unchunking

    • to parse an input file with the Berkeley Parser (the chunker requires a parsed file as input), run:
     java -Xmx1024m -jar BerkeleyParser-1.7.jar -gr grammar.gr < input.txt
    
  • utils/chunks_to_translated_chunks/ contains files for standalone translation of chunked files

  • utils/translated_chunks_to_hybrid/ contains files for running the hybrid system on chunked, translated files
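
For the parsing step in utils/chunking/, the Berkeley Parser reads sentences from standard input and writes parse trees to standard output, so the result has to be redirected to a file before the chunker can use it. A minimal sketch of a wrapper follows; the function name `parse_for_chunker` and the output file name are hypothetical, and the jar and grammar paths are passed in rather than assumed.

```shell
# Hypothetical wrapper (not part of ChunkMT): parse a file with the
# Berkeley Parser and save the parse trees to a file for the chunker.
parse_for_chunker() {
    if [ "$#" -ne 4 ]; then
        echo "usage: parse_for_chunker <parser jar> <grammar> <input> <output>" >&2
        return 2
    fi
    jar="$1"; grammar="$2"; input="$3"; output="$4"
    # Fail early, before starting the JVM, if any input file is absent.
    for f in "$jar" "$grammar" "$input"; do
        if [ ! -f "$f" ]; then
            echo "file not found: $f" >&2
            return 1
        fi
    done
    # Same invocation as in the chunking step, with stdout saved to a file.
    java -Xmx1024m -jar "$jar" -gr "$grammar" < "$input" > "$output"
}
```

For example, `parse_for_chunker BerkeleyParser-1.7.jar grammar.gr input.txt parsed.txt` would write the parse to parsed.txt, a file name chosen here only for illustration.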
