Frequently Asked Questions
If you have a question that is not covered in this FAQ but should be, please email it to firstname.lastname@example.org.
Is the graphical user interface available offline?
Yes, but currently the GUI is available only to Linux users. After installing one of the PSI Linux packages, you get the same web service as this one and can run it offline with the command:
Which types of input files does PSI-Toolkit support?
PSI-Toolkit handles all kinds of text files, including the internal formats
of some NLP tools, such as the PSI and UTT formats. Other supported file formats are:
Which languages does PSI-Toolkit support?
It depends on the processor. Some of them, such as tokenizer and segmentizer,
support a wide range of languages, whereas others are designed
for a specific language, as in the case of
The documentation lists all supported
languages for each processor.
How can I display all text fragments with a particular tag?
You can filter the display of the PSI-lattice by using the
tag option of
See the page Working with PSI-lattice for details.
How can I extract the grammatical classes for each word in a sentence?
One possible solution is to run PSI-Toolkit with the following pipeline:
lemmatize ! write --tags lexeme --fallback-tags token
It returns the list of all known lemmas and their grammatical classes for each
token, or simply the token itself if no lemma is found. The order of the returned
lexemes is exactly the same as the order of the tokens for the
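The pipeline above can also be driven from a script. The following is a minimal Python sketch, not an official PSI-Toolkit API. It assumes that the psi-pipe command is installed and on PATH, that it accepts the pipeline as command-line arguments, and that it reads the input text from standard input. The helper names build_command and run_pipeline are hypothetical.

```python
import shlex
import shutil
import subprocess

# Pipeline copied verbatim from the FAQ answer above.
PIPELINE = "lemmatize ! write --tags lexeme --fallback-tags token"


def build_command(pipeline, executable="psi-pipe"):
    """Split a pipeline string into an argument list for subprocess.

    Passing a list instead of a shell string keeps the "!" separator
    away from the shell, where it could trigger history expansion.
    """
    return [executable] + shlex.split(pipeline)


def run_pipeline(text, pipeline=PIPELINE):
    """Send text to psi-pipe on stdin and return what it prints.

    Assumption: psi-pipe is on PATH and reads UTF-8 text from stdin.
    """
    command = build_command(pipeline)
    if shutil.which(command[0]) is None:
        raise FileNotFoundError(f"{command[0]} not found on PATH")
    result = subprocess.run(
        command,
        input=text.encode("utf-8"),
        stdout=subprocess.PIPE,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout.decode("utf-8")
```

Calling run_pipeline with a sentence returns the raw output of the pipeline as a string. Building the argument list separately from running it makes it easy to inspect the exact command before anything is executed.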
Which character sets are supported?
PSI-Toolkit natively supports text in the UTF-8 character encoding,
and libraries for automatic detection and conversion of
character sets are integrated into the
psi-pipe command. Please check its
Additionally, conversion to UTF-8 is enabled for all files sent to PSI-Server.
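If you prefer to convert input to UTF-8 yourself before passing it to PSI-Toolkit, a fallback decoder is one simple option. The sketch below is only an illustration and is not the detection mechanism used by PSI-Toolkit. The function name to_utf8 and the list of candidate encodings are assumptions chosen for the example, and the first encoding that decodes without error is not guaranteed to be the correct one.

```python
def to_utf8(raw, candidates=("utf-8", "cp1250", "iso-8859-2")):
    """Return the given bytes re-encoded as UTF-8.

    Each candidate encoding is tried in order, and the first one that
    decodes the input without error is used. This is a heuristic: a
    successful decode does not prove the encoding was the right one.
    """
    for encoding in candidates:
        try:
            text = raw.decode(encoding)
        except UnicodeDecodeError:
            continue  # not valid in this encoding, try the next one
        return text.encode("utf-8")
    raise ValueError("input could not be decoded with any candidate encoding")
```

Putting UTF-8 first means that input which is already valid UTF-8 passes through unchanged, and the legacy encodings are tried only when UTF-8 decoding fails.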