Character-based

augmenty.character.casing

augmenty.character.casing.create_random_casing_augmenter_v1(level: float) → Callable[[Language, Example], Iterator[Example]]

Creates an augmenter that randomly changes the casing of characters in the document.

Parameters:

level – The proportion of characters that will have their casing changed (e.g. 0.1 for 10%).

Returns:

The augmenter.

Example

>>> import augmenty
>>> import spacy
>>> nlp = spacy.blank("en")
>>> random_casing_augmenter = augmenty.load("random_casing_v1", level=0.1)
>>> texts = ["A sample text"]
>>> list(augmenty.texts(texts, random_casing_augmenter, nlp))
["A saMple texT"]
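The idea behind random casing can be sketched in plain Python without spaCy. This is only an illustration, not augmenty's implementation: the helper name random_casing is hypothetical, and it assumes that level acts as an independent per-character probability.

```python
import random


def random_casing(text: str, level: float, rng: random.Random) -> str:
    """Flip the case of each character independently with probability `level`.

    Hypothetical helper for illustration; not part of augmenty.
    """
    return "".join(
        ch.swapcase() if rng.random() < level else ch
        for ch in text
    )


rng = random.Random(0)  # seeded so repeated runs give the same result
augmented = random_casing("A sample text", level=0.1, rng=rng)
print(augmented)
```

Only the casing changes, so lowercasing the augmented text gives the same string as lowercasing the original.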

augmenty.character.replace

Augmenters for randomly or semi-randomly replacing characters.

augmenty.character.replace.create_char_random_augmenter_v1(level: float, keyboard: str = 'en_qwerty_v1') → Callable[[Language, Example], Iterator[Example]]

Creates an augmenter that replaces a character with a random character from the keyboard.

Parameters:
  • level – The probability of replacing a character with a random character from the keyboard.

  • keyboard – A defined keyboard in the keyboard registry. To see a list of all keyboards, run augmenty.keyboards(). Defaults to “en_qwerty_v1”.

Returns:

The augmenter.

Example

>>> import augmenty
>>> from spacy.lang.en import English
>>> nlp = English()
>>> char_random_augmenter = augmenty.load("char_replace_random_v1", level=0.1)
>>> texts = ["A sample text"]
>>> list(augmenty.texts(texts, char_random_augmenter, nlp))
["A sabple tex3"]

augmenty.character.replace.create_char_replace_augmenter_v1(level: float, replace: dict) → Callable[[Language, Example], Iterator[Example]]

Creates an augmenter that replaces a character with a random character from the replace dict.

Parameters:
  • level – The probability of augmenting a character, if the document is augmented.

  • replace – A dictionary mapping each character to a list of its potential replacements.

Returns:

The augmenter function.
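As a rough illustration of how the replace dictionary is used, the following pure-Python sketch applies the same kind of mapping to a plain string. The helper replace_chars is hypothetical and is not augmenty's implementation; it assumes each character found in the dictionary is replaced independently with probability level, with the replacement drawn uniformly from its list.

```python
import random


def replace_chars(text: str, level: float, replace: dict, rng: random.Random) -> str:
    """Replace each character found in `replace` with probability `level`.

    Hypothetical helper for illustration; not part of augmenty.
    """
    out = []
    for ch in text:
        if ch in replace and rng.random() < level:
            out.append(rng.choice(replace[ch]))  # pick one candidate replacement
        else:
            out.append(ch)  # characters without an entry are never changed
    return "".join(out)


replace = {"æ": ["ae"], "ß": ["ss"]}
# level=1.0 forces every eligible character to be replaced
print(replace_chars("Straße", level=1.0, replace=replace, rng=random.Random(0)))
# prints: Strasse
```

Each value is a list, so a character can have several candidate replacements, and a replacement may be longer than one character.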

Example

>>> from augmenty.character.replace import create_char_replace_augmenter_v1
>>> char_replace_augmenter = create_char_replace_augmenter_v1(
...     level=0.02, replace={"æ": ["ae"], "ß": ["ss"]})

augmenty.character.replace.create_keystroke_error_augmenter_v1(level: float, distance: float = 1.5, keyboard: str = 'en_qwerty_v1') → Callable[[Language, Example], Iterator[Example]]

Creates an augmenter that augments a text with plausible typos based on keyboard distance.

Parameters:
  • level – The probability of replacing a character with a neighbouring character.

  • distance – The keyboard distance. Defaults to 1.5, corresponding to neighbouring keys including diagonals.

  • keyboard – A defined keyboard in the keyboard registry. To see a list of all keyboards, run augmenty.keyboards.get_all(). Defaults to “en_qwerty_v1”.

Returns:

The augmenter.

Example

>>> import augmenty
>>> from spacy.lang.en import English
>>> nlp = English()
>>> keystroke_error_augmenter = augmenty.load("keystroke_error_v1",
...                                           level=0.1,
...                                           keyboard="en_qwerty_v1")
>>> texts = ["A sample text"]
>>> list(augmenty.texts(texts, keystroke_error_augmenter, nlp))
["A sajple texr"]
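To see why a distance of 1.5 covers diagonal neighbours, consider a simplified sketch in which keys sit on a grid and distance is Euclidean. The layout ROWS and the helper neighbours are assumptions made for illustration and do not reflect how augmenty stores its keyboards. A diagonal key is sqrt(2) ≈ 1.41 grid units away, which is within 1.5, while the next key along the same row is 2 units away.

```python
import math

# Hypothetical grid layout for the letter rows of a QWERTY keyboard.
ROWS = ["qwertyuiop", "asdfghjkl", "zxcvbnm"]
POSITIONS = {
    key: (row, col)
    for row, keys in enumerate(ROWS)
    for col, key in enumerate(keys)
}


def neighbours(key: str, distance: float = 1.5) -> list[str]:
    """Return the keys within `distance` (Euclidean, in grid units) of `key`."""
    r0, c0 = POSITIONS[key]
    return sorted(
        other
        for other, (r, c) in POSITIONS.items()
        if other != key and math.hypot(r - r0, c - c0) <= distance
    )


print(neighbours("s"))                # ['a', 'c', 'd', 'e', 'q', 'w', 'x', 'z']
print(neighbours("s", distance=1.0))  # ['a', 'd', 'w', 'x']
```

Under this layout the substitutions in the example output above (m to j, t to r) are both between keys one grid unit apart.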

augmenty.character.spacing

Augmenters for modifying spacing.

augmenty.character.spacing.create_remove_spacing_augmenter_v1(level: float) → Callable[[Language, Example], Iterator[Example]]

Creates an augmenter that removes spacing with a given probability.

Parameters:

level – The probability to remove a space.

Returns:

The augmenter.

Example

>>> import augmenty
>>> import spacy
>>> nlp = spacy.blank("en")
>>> remove_spacing_augmenter = augmenty.load("remove_spacing_v1", level=0.5)
>>> texts = ["A sample text"]
>>> list(augmenty.texts(texts, remove_spacing_augmenter, nlp))
["A sampletext"]
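The effect of level on spacing can be sketched on a plain string. The helper remove_spacing below is hypothetical and not augmenty's implementation (which operates on spaCy documents); it assumes each space is dropped independently with probability level.

```python
import random


def remove_spacing(text: str, level: float, rng: random.Random) -> str:
    """Drop each space character independently with probability `level`.

    Hypothetical helper for illustration; not part of augmenty.
    """
    return "".join(
        ch for ch in text
        if not (ch == " " and rng.random() < level)
    )


rng = random.Random(0)  # seeded so repeated runs give the same result
augmented = remove_spacing("A sample text", level=0.5, rng=rng)
print(augmented)
```

With level=0.5, each of the two spaces in "A sample text" is removed about half the time, so outputs such as "A sampletext" or "Asample text" are equally likely.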

augmenty.character.swap

Augmenters for swapping characters.

augmenty.character.swap.create_char_swap_augmenter_v1(level: float) → Callable[[Language, Example], Iterator[Example]]

Creates an augmenter that swaps two neighbouring characters in a token with a given probability.

Parameters:

level – The probability of swapping a character with its neighbour.

Returns:

The augmenter.

Example

>>> import augmenty
>>> from spacy.lang.en import English
>>> nlp = English()
>>> char_swap_augmenter = augmenty.load("char_swap_v1", level=0.1)
>>> texts = ["A sample text"]
>>> list(augmenty.texts(texts, char_swap_augmenter, nlp))
["A smaple txet"]