text_quality.feature.tokenizer

Module Contents

Classes

Tokenizer

Helper class that provides a standard way to create an ABC using inheritance.

NautilusOcrTokenizer

Helper class that provides a standard way to create an ABC using inheritance.

class text_quality.feature.tokenizer.Tokenizer

Bases: abc.ABC

Helper class that provides a standard way to create an ABC using inheritance.

abstract tokenize(text: str) → List[str]

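As a minimal sketch of how this interface might be used, the block below re-declares a stand-in `Tokenizer` with the same abstract `tokenize` signature, so the example runs without the package installed, and adds a hypothetical `WhitespaceTokenizer` subclass. `WhitespaceTokenizer` is not part of `text_quality`; it only illustrates that a concrete subclass has to implement `tokenize`.

```python
from abc import ABC, abstractmethod
from typing import List


class Tokenizer(ABC):
    """Stand-in mirroring the documented interface (not the library's source)."""

    @abstractmethod
    def tokenize(self, text: str) -> List[str]:
        """Split text into a list of tokens."""


class WhitespaceTokenizer(Tokenizer):
    """Hypothetical subclass: splits on runs of whitespace."""

    def tokenize(self, text: str) -> List[str]:
        return text.split()


tokens = WhitespaceTokenizer().tokenize("scanned  text\nsample")
print(tokens)

try:
    Tokenizer()  # the abstract base cannot be instantiated directly
    instantiated = True
except TypeError:
    instantiated = False
```

Because `tokenize` is declared abstract, instantiating the base class raises `TypeError`; only subclasses that override the method can be created.
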
class text_quality.feature.tokenizer.NautilusOcrTokenizer

Bases: Tokenizer

Helper class that provides a standard way to create an ABC using inheritance.

_HYPHENS

tokenize(text: str) → List[str]

Nautilus-OCR tokenizer
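
This reference does not say what `_HYPHENS` contains or how `tokenize` uses it. Purely as an illustration of the kind of handling an OCR-oriented tokenizer might need, the sketch below defines a hypothetical `HyphenAwareTokenizer` that rejoins words broken by a hyphen at a line end before splitting on whitespace. The class name, the `HYPHENS` set, and the rejoining rule are all assumptions, not the behaviour of `NautilusOcrTokenizer`.

```python
import re
from abc import ABC, abstractmethod
from typing import List


class Tokenizer(ABC):
    """Stand-in for the documented base class."""

    @abstractmethod
    def tokenize(self, text: str) -> List[str]:
        ...


class HyphenAwareTokenizer(Tokenizer):
    """Hypothetical example; not the library's NautilusOcrTokenizer."""

    # Assumed hyphen characters: hyphen-minus, Unicode hyphen, soft hyphen.
    HYPHENS = "-\u2010\u00ad"

    def tokenize(self, text: str) -> List[str]:
        # Match "<letter><hyphen><line break><letter>" and drop the hyphen
        # and the break, so a word split across lines becomes one token.
        pattern = rf"([^\W\d_])[{re.escape(self.HYPHENS)}]\s*\n\s*([^\W\d_])"
        joined = re.sub(pattern, r"\1\2", text)
        return joined.split()


tokenizer = HyphenAwareTokenizer()
broken = tokenizer.tokenize("tokeni-\nzation works")
intact = tokenizer.tokenize("a well-known word")
print(broken, intact)
```

In this sketch a hyphen inside a line is left alone, so compounds such as `well-known` stay as a single token; only a hyphen followed by a line break is treated as a split word.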