Character-based
augmenty.character.casing
- augmenty.character.casing.create_random_casing_augmenter_v1(level: float) → Callable[[Language, Example], Iterator[Example]]
Create an augment that randomly changes the casing of the document.
- Parameters:
level – The proportion of characters whose casing will be changed (e.g. 0.1 for roughly one character in ten).
- Returns:
The augmenter.
Example
>>> import augmenty
>>> import spacy
>>> nlp = spacy.blank("en")
>>> random_casing_augmenter = augmenty.load("random_casing_v1", level=0.1)
>>> texts = ["A sample text"]
>>> list(augmenty.texts(texts, random_casing_augmenter, nlp))
["A saMple texT"]
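To make the role of level concrete, the following is a minimal pure-Python sketch of per-character case flipping. It is an illustration only and not augmenty's implementation: the function name random_casing and the explicit rng argument are assumptions introduced here, and it operates on a plain string rather than a spaCy Example.

```python
import random


def random_casing(text: str, level: float, rng: random.Random) -> str:
    """Flip the case of each character independently with probability `level`.

    Hypothetical helper for illustration; not part of augmenty.
    """
    return "".join(
        ch.swapcase() if rng.random() < level else ch
        for ch in text
    )


# A seeded generator makes repeated runs reproducible.
rng = random.Random(0)
print(random_casing("A sample text", level=0.1, rng=rng))
```

With level=0.0 the text is returned unchanged, and with level=1.0 every cased character is flipped.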
augmenty.character.replace
Augmenters for randomly or semi-randomly replacing characters.
- augmenty.character.replace.create_char_random_augmenter_v1(level: float, keyboard: str = 'en_qwerty_v1') → Callable[[Language, Example], Iterator[Example]]
Creates an augmenter that replaces a character with a random character from the keyboard.
- Parameters:
level – The probability of replacing a character with a random character from the keyboard.
keyboard – A defined keyboard in the keyboard registry. To see a list of all keyboards you can run augmenty.keyboards(). Defaults to “en_qwerty_v1”.
- Returns:
The augmenter.
Example
>>> import augmenty
>>> from spacy.lang.en import English
>>> nlp = English()
>>> char_random_augmenter = augmenty.load("char_replace_random_v1", level=0.1)
>>> texts = ["A sample text"]
>>> list(augmenty.texts(texts, char_random_augmenter, nlp))
["A sabple tex3"]
- augmenty.character.replace.create_char_replace_augmenter_v1(level: float, replace: dict) → Callable[[Language, Example], Iterator[Example]]
Creates an augmenter that replaces a character with a random character from the replace dictionary.
- Parameters:
level – The probability of augmenting a character, given that the document is augmented.
replace – A dictionary mapping each character to a list of its potential replacements.
- Returns:
The augmenter function.
Example
>>> from augmenty.character.replace import create_char_replace_augmenter_v1
>>> create_char_replace_augmenter_v1(level=0.02,
...                                  replace={"æ": ["ae"], "ß": ["ss"]})
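The shape of the replace dictionary can be illustrated with a small pure-Python sketch. This is an assumption-laden stand-in rather than augmenty's code: replace_chars and its rng argument are hypothetical names, and the function works on a plain string instead of a spaCy Example.

```python
import random
from typing import Dict, List


def replace_chars(
    text: str, level: float, replace: Dict[str, List[str]], rng: random.Random
) -> str:
    """Replace each character that has candidates with probability `level`.

    Hypothetical helper for illustration; not part of augmenty.
    """
    out = []
    for ch in text:
        if ch in replace and rng.random() < level:
            # Pick one of the listed candidate replacements at random.
            out.append(rng.choice(replace[ch]))
        else:
            out.append(ch)
    return "".join(out)


replace = {"æ": ["ae"], "ß": ["ss"]}
# level=1.0 forces every eligible character to be replaced.
print(replace_chars("Straße", level=1.0, replace=replace, rng=random.Random(0)))
# prints "Strasse"
```

Characters without an entry in the dictionary are never touched, regardless of level.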
- augmenty.character.replace.create_keystroke_error_augmenter_v1(level: float, distance: float = 1.5, keyboard: str = 'en_qwerty_v1') → Callable[[Language, Example], Iterator[Example]]
Creates an augmenter that augments a text with plausible typos based on keyboard distance.
- Parameters:
level – The probability of replacing a character with a neighbouring character.
distance – The keyboard distance. Defaults to 1.5, corresponding to neighbouring keys including diagonals.
keyboard – A defined keyboard in the keyboard registry. To see a list of all keyboards you can run augmenty.keyboards.get_all(). Defaults to “en_qwerty_v1”.
- Returns:
The augmenter.
Example
>>> import augmenty
>>> from spacy.lang.en import English
>>> nlp = English()
>>> keystroke_error_augmenter = augmenty.load("keystroke_error_v1",
...                                           level=0.1,
...                                           keyboard="en_qwerty_v1")
>>> texts = ["A sample text"]
>>> list(augmenty.texts(texts, keystroke_error_augmenter, nlp))
["A sajple texr"]
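Why a distance of 1.5 covers diagonal neighbours can be seen with a small sketch. It assumes a hypothetical, unstaggered QWERTY grid and Euclidean distance between (row, column) positions; ROWS, POSITIONS and neighbours are names introduced here for illustration, and augmenty's own keyboard definitions and distance computation may differ.

```python
import math
from typing import Dict, List, Tuple

# Hypothetical, unstaggered QWERTY grid used only for illustration.
ROWS = ["qwertyuiop", "asdfghjkl", "zxcvbnm"]

# Map each key to its (row, column) position on the grid.
POSITIONS: Dict[str, Tuple[int, int]] = {
    key: (row, col)
    for row, keys in enumerate(ROWS)
    for col, key in enumerate(keys)
}


def neighbours(key: str, distance: float = 1.5) -> List[str]:
    """Return keys within Euclidean `distance` of `key`, excluding `key`."""
    r0, c0 = POSITIONS[key]
    return [
        other
        for other, (r, c) in POSITIONS.items()
        if other != key and math.hypot(r - r0, c - c0) <= distance
    ]


# Diagonal keys are sqrt(2) ≈ 1.41 away, so 1.5 includes them.
print(sorted(neighbours("s", distance=1.5)))
# prints ['a', 'c', 'd', 'e', 'q', 'w', 'x', 'z']

# A distance of 1.0 keeps only horizontal and vertical neighbours.
print(sorted(neighbours("s", distance=1.0)))
# prints ['a', 'd', 'w', 'x']
```

Under these assumptions, a larger distance yields a larger pool of candidate typos for each key.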
augmenty.character.spacing
Augmenters for modifying spacing.
- augmenty.character.spacing.create_remove_spacing_augmenter_v1(level: float) → Callable[[Language, Example], Iterator[Example]]
Creates an augmenter that removes spacing with a given probability.
- Parameters:
level – The probability of removing a space.
- Returns:
The augmenter.
Example
>>> import augmenty
>>> import spacy
>>> nlp = spacy.blank("en")
>>> remove_spacing_augmenter = augmenty.load("remove_spacing_v1", level=0.5)
>>> texts = ["A sample text"]
>>> list(augmenty.texts(texts, remove_spacing_augmenter, nlp))
["A sampletext"]
augmenty.character.swap
Augmenters for swapping characters.
- augmenty.character.swap.create_char_swap_augmenter_v1(level: float) → Callable[[Language, Example], Iterator[Example]]
Creates an augmenter that swaps two neighbouring characters in a token with a given probability.
- Parameters:
level – The probability of swapping a character with its neighbour.
- Returns:
The augmenter.
Example
>>> import augmenty
>>> from spacy.lang.en import English
>>> nlp = English()
>>> char_swap_augmenter = augmenty.load("char_swap_v1", level=0.1)
>>> texts = ["A sample text"]
>>> list(augmenty.texts(texts, char_swap_augmenter, nlp))
["A smaple txet"]
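Swapping neighbouring characters within a token can be sketched in plain Python as follows. This is an illustrative approximation, not augmenty's implementation: swap_neighbours and swap_text are hypothetical names, tokens are obtained by splitting on single spaces rather than with a spaCy tokenizer, and skipping past a swapped pair is a design choice made here so that a character is moved at most once.

```python
import random


def swap_neighbours(token: str, level: float, rng: random.Random) -> str:
    """Swap adjacent characters inside one token with probability `level`.

    Hypothetical helper for illustration; not part of augmenty.
    """
    chars = list(token)
    i = 0
    while i < len(chars) - 1:
        if rng.random() < level:
            chars[i], chars[i + 1] = chars[i + 1], chars[i]
            i += 2  # skip the pair that was just swapped
        else:
            i += 1
    return "".join(chars)


def swap_text(text: str, level: float, rng: random.Random) -> str:
    """Apply swap_neighbours to each space-separated token."""
    return " ".join(swap_neighbours(tok, level, rng) for tok in text.split(" "))


print(swap_text("A sample text", level=0.1, rng=random.Random(0)))
```

Because swaps never cross a token boundary, each token keeps its length and its multiset of characters.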