text_quality.settings
Global settings.
Module Contents
- text_quality.settings.EMPTY_PAGE_OUTPUT: int | None = 0
Output value for empty pages. If None, empty pages are handled through the standard pipeline.
- text_quality.settings.SHORT_COLUMN_WIDTH: int = 5
If every line (column) in a page is shorter than this value, the page is considered broken.
- text_quality.settings.ENCODING = 'utf-8'
Encoding to be used throughout all text file processing operations.
- text_quality.settings.TOKEN_DICT_FILE: pathlib.Path
- text_quality.settings.QGRAMS_FILE: pathlib.Path
- text_quality.settings.PIPELINE_FILE: pathlib.Path
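
The following is a minimal sketch of how the first three settings might be consumed. It is not the package's actual implementation. The constants are local stand-ins that mirror the documented defaults, and the helpers `is_short_column_page`, `score_page`, and `read_page` are hypothetical names introduced for illustration. The sketch also assumes that line length is counted in characters after stripping surrounding whitespace, that blank lines are ignored, and that the pipeline is a callable returning an integer.

```python
from pathlib import Path
from typing import Callable, Optional

# Local stand-ins mirroring the documented defaults (assumption: real code
# would import these from text_quality.settings instead).
EMPTY_PAGE_OUTPUT: Optional[int] = 0
SHORT_COLUMN_WIDTH: int = 5
ENCODING = "utf-8"


def is_short_column_page(text: str, width: int = SHORT_COLUMN_WIDTH) -> bool:
    """Hypothetical check: True if the page has at least one non-blank line
    and every non-blank line is shorter than `width` characters."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    return bool(lines) and all(len(line) < width for line in lines)


def score_page(
    text: str,
    pipeline: Callable[[str], int],
    empty_output: Optional[int] = EMPTY_PAGE_OUTPUT,
) -> int:
    """Hypothetical helper: return `empty_output` for an empty page, unless it
    is None, in which case the page goes through the pipeline like any other."""
    if not text.strip() and empty_output is not None:
        return empty_output
    return pipeline(text)


def read_page(path: Path) -> str:
    """Read a page from disk using the shared encoding setting."""
    return path.read_text(encoding=ENCODING)
```

With these defaults, `score_page("   ", pipeline)` returns 0 without calling the pipeline, while passing `empty_output=None` sends the same empty page through `pipeline`.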