Some parts of this website may do not work correctly, because your browser doesn't support JavaScript or you have disabled it. In order to use all features please enable JavaScript in your browser.

Specification for writer > bracketing-writer

bracketing-writer

Tags the input text with the language units (e.g. parses) marked with various types of "brackets", e.g. with square brackets with the category prepended (NP[AP[very large] house] or with XML tags (<np><ap>very large</ap> house</np>).

The actual format of opening and closing brackets is specified with, respectively, --opening-bracket and --closing-bracket. The following expressions have a special meaning when given in --opening/closing-bracket option:

  • %T - edge tags,
  • %c - edge category,
  • %t - edge text,
  • %A - all edge attributes,
  • %a[...] - the value of the given attribute,
  • %s - score,
  • %r - edge role (only in --disambig mode)
  • %*[...][...], %*(...)(...)%, %*<...><...> or %*{...}{...} - join operators (see below).

Let us call the elements of an edge description that were referred to in the --opening/closing-bracket (but not inside join operators) active elements. (E.g. for the specification <edge tags="%T" category="%c"> the tags and the category are active elements). Edges covering the same span with same values of active elements will be collapsed (i.e. only one bracket pair will be generated). This is the default behavior, however it can be reverted by the option --no-collapse.

The alternative values of the elements of an edge description that are not active can be given with a join operator. A join operator has two arguments - a separator and the expression to be joined. Separator cannot contain %-specification, the expression to be joined must contain at least one %-specification. For instance the specification <%c text="%*[,][%t]> means that for each category one bracket will be generated, alternative text values will be separated with commas.

No more than one join operator can be given. Join operators can be, hovewer, nested, e.g. <%c>%*[][<text val="%t" attrs="%*(;)(%A)"/>] is acceptable.

The string that will substitute %T is obtained by joining tags with commas. An alternative separator can be given with --tag-separator. The tag names that will appear in %T substitions can be limited with the --show-only-tags options, if --tags is specified and --show-only-tags was not given, the --tags will be assumed for --show-only-tags.

The string that will substtitue %A is obtainted by joining the attribute-value pairs with commas. An alternative separator can be given with --av-pairs-separator option. The attribute and its value are separated with =, an alternative separator can be given with --av-separator. The attribute to be shown can be chosen with the --show-attributes option.

By default, symbol edges are displayed but blank edges are omitted. Both these behaviors can be changed by using options --skip-symbol-edges and --with-blank respectively.

Options

Allowed options:
  --opening-bracket arg (=[)    the actual format of opening brackets
  --closing-bracket arg (=])    the actual format of closing brackets
  --tag-separator arg (=,)      separates tags
  --show-only-tags arg          limits the tag names that will appear in `%T` 
                                substitions
  --tags arg                    filters the edges by tags
  --av-pairs-separator arg (=,) separates the attribute-value pairs
  --av-separator arg (==)       separates the attribute and its value
  --show-attributes arg         the attributes to be shown
  --skip-symbol-edges           skip symbol edges
  --with-blank                  do not skip edges with whitespace text
  --no-collapse                 do not collapse duplicate edge labels
  --disambig                    choose only one partition

Other help resources