

bonsai

Bonsai is a tree-to-string decoder for statistical machine translation. It requires its input sentences to be parsed before translation. When using the predefined translation rules, it is best to keep the default options unchanged, as their weights have already been optimized.

You can run a toy Polish-to-English translation model using the following pipeline:

gobio --lang pl ! bonsai --lang pl --trg-lang en
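
The pipeline above can also be driven from a script. The following is a minimal sketch, not an official interface: it assumes the PSI-Toolkit command-line runner is installed under the name psi-pipe, that it reads the input text from standard input, and that it accepts the pipeline as separate arguments. The function name translate is hypothetical.

```python
import shutil
import subprocess

# The pipeline from the text, split into arguments; "!" separates processors.
PIPELINE = ["gobio", "--lang", "pl", "!", "bonsai", "--lang", "pl", "--trg-lang", "en"]


def translate(text, executable="psi-pipe"):
    """Send `text` through the pipeline and return the decoder output.

    Assumption: `executable` is the PSI-Toolkit command-line runner and
    reads its input from stdin. Returns None if it is not on PATH.
    """
    if shutil.which(executable) is None:
        return None
    result = subprocess.run(
        [executable, *PIPELINE],
        input=text,
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    return result.stdout
```

Calling translate("Ala ma kota.") would return whatever the pipeline prints for that sentence, or None when the runner is not installed.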

It is possible to create your own translation rule set using any parser integrated with PSI-Toolkit and the training-specific tools described in the bonsai tutorial. This, however, is an advanced topic, recommended for people with some experience in statistical machine translation. It is also quite resource- and time-intensive.

Options

Allowed options:
  --lang arg (=guess)                   language
  --force-language                      force using specified language even if 
                                        a text was recognised otherwise
  --trg-lang arg                        target language
  --config arg (=%ITSDATA%/%LANG%%TRGLANG%/%LANG%%TRGLANG%.cfg)
                                        Path to configuration
  --rs arg                              Paths to translation rule sets
  --lm arg                              Paths to language models
  --stacksize arg (=20)                 Node translation stack size
  --max_trans arg (=20)                 Maximal number of transformations per 
                                        hyper edge
  --max_hyper arg (=20)                 Maximal number of hyper edges per 
                                        symbol
  --eps arg (=-1)                       Allowed transformation cost factor
  --nbest arg (=1)                      Display n best translations
  --verbose arg (=0)                    Level of verbosity: 0, 1, 2
  --pedantic                            Pedantic cost calculation (for 
                                        debugging)
  --mert                                Output for MERT (combine with nbest)
  --tm_weight arg                       Weights for translation model 
                                        parameters
  --rs_weight arg                       Weights for different translation rule 
                                        sets
  --lm_weight arg                       Weights for different language models
  --word_penalty arg                    Weight for word penalty
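
As an illustration of how these options combine, the sketch below assembles a bonsai argument list that requests n-best output formatted for MERT. Only flags from the listing above are used; the helper name bonsai_args and its validation rules are hypothetical, and the defaults mirror the listed ones (nbest 1, verbose 0).

```python
def bonsai_args(src, trg, nbest=1, mert=False, verbose=0):
    """Build the argument list for the bonsai processor.

    Hypothetical helper: it emits only flags shown in the option listing
    and leaves every other setting at its default.
    """
    if nbest < 1:
        raise ValueError("nbest must be at least 1")
    if verbose not in (0, 1, 2):
        raise ValueError("verbose must be 0, 1 or 2")

    args = ["bonsai", "--lang", src, "--trg-lang", trg]
    if nbest != 1:
        args += ["--nbest", str(nbest)]
    if mert:
        # The listing suggests combining --mert with --nbest.
        args.append("--mert")
    if verbose:
        args += ["--verbose", str(verbose)]
    return args


# Example: request the 10 best translations in MERT output format.
print(" ".join(bonsai_args("pl", "en", nbest=10, mert=True)))
```

The resulting arguments can replace the bonsai part of a pipeline; the weight options (--tm_weight, --rs_weight, --lm_weight, --word_penalty) are deliberately left out, since the predefined weights are already optimized.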
