Augmentation Utilities

Augmentation utilities include functions for combining and moderating augmentations.

augmenty.augment_utilities

Utility functions used for augmentation.

augmenty.augment_utilities.combine(augmenters: Iterable[Callable[[Language, Example], Iterator[Example]]]) -> Callable[[Language, Example], Iterator[Example]]

Combines a series of spaCy-style augmenters into a single augmenter.

Parameters:

augmenters – A list of spaCy augmenters.

Returns:

The combined augmenter

Example

>>> char_swap_augmenter = augmenty.load("char_swap_v1", level=.02)
>>> synonym_augmenter = augmenty.load("wordnet_synonym_v1", level=1, lang="en")
>>> combined_aug = augmenty.combine([char_swap_augmenter, synonym_augmenter])
>>> # augment the docs with both augmenters (assumes `docs` and `nlp` are already defined)
>>> augmented_docs = list(augmenty.docs(docs, augmenter=combined_aug, nlp=nlp))
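
The chaining idea behind combine can be pictured with a small stand-alone sketch. It is not augmenty's implementation: `combine_sketch`, `upper` and `exclaim` are hypothetical names, and plain strings stand in for spaCy `Example` objects so the snippet runs without spaCy installed. It assumes that each augmenter consumes the output of the previous one.

```python
from typing import Callable, Iterable, Iterator, List

# Hypothetical stand-in type: real augmenters take (Language, Example);
# here `nlp` is unused and an "example" is just a string.
ToyAugmenter = Callable[[object, str], Iterator[str]]


def combine_sketch(augmenters: Iterable[ToyAugmenter]) -> ToyAugmenter:
    """Chain augmenters so each one consumes the previous one's output."""
    augmenter_list = list(augmenters)

    def combined(nlp: object, example: str) -> Iterator[str]:
        examples: List[str] = [example]
        for augmenter in augmenter_list:
            # Feed every example produced so far through the next augmenter.
            examples = [out for ex in examples for out in augmenter(nlp, ex)]
        yield from examples

    return combined


def upper(nlp: object, example: str) -> Iterator[str]:
    """Toy augmenter: upper-case the text."""
    yield example.upper()


def exclaim(nlp: object, example: str) -> Iterator[str]:
    """Toy augmenter: append an exclamation mark."""
    yield example + "!"


combined = combine_sketch([upper, exclaim])
result = list(combined(None, "hello"))
print(result)  # ['HELLO!']
```

Because the augmenters are applied in sequence, the order of the list matters whenever the augmenters do not commute.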

augmenty.augment_utilities.repeat(augmenter: Callable[[Language, Example], Iterator[Example]], n: int) -> Callable[[Language, Example], Iterator[Example]]

Repeats an augmenter n times over the same example, thus increasing the sample size.

Parameters:
  • augmenter – An augmenter.

  • n – Number of times the augmenter should be repeated

Returns:

The repeated augmenter

Example

>>> augmenter = augmenty.load("char_swap_v1", level=.02)
>>> repeated_augmenter = augmenty.repeat(augmenter=augmenter, n=3)
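
To illustrate why repeating a stochastic augmenter increases the sample size, here is a minimal stand-alone sketch. It is an assumption about the behaviour, not augmenty's code: `repeat_sketch` and `drop_one_char` are hypothetical names, and strings stand in for spaCy `Example` objects.

```python
import random
from typing import Callable, Iterator

# Hypothetical stand-in type: real augmenters take (Language, Example).
ToyAugmenter = Callable[[object, str], Iterator[str]]


def repeat_sketch(augmenter: ToyAugmenter, n: int) -> ToyAugmenter:
    """Run the same augmenter n times on each incoming example."""

    def repeated(nlp: object, example: str) -> Iterator[str]:
        for _ in range(n):
            yield from augmenter(nlp, example)

    return repeated


rng = random.Random(0)  # seeded only to keep the demo reproducible


def drop_one_char(nlp: object, example: str) -> Iterator[str]:
    """Toy stochastic augmenter: delete one randomly chosen character."""
    i = rng.randrange(len(example))
    yield example[:i] + example[i + 1:]


repeated = repeat_sketch(drop_one_char, n=3)
result = list(repeated(None, "hello"))
print(len(result))  # 3 augmented variants from a single input
```

With a deterministic augmenter the three outputs would be identical, so repetition is mainly useful for augmenters that involve randomness.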

augmenty.augment_utilities.set_doc_level(augmenter: Callable[[Language, Example], Iterator[Example]], level: float) -> Callable[[Language, Example], Iterator[Example]]

Sets the percentage of examples that the augmenter should be applied to.

Parameters:
  • augmenter – A spaCy augmenter that you only want to apply to a certain percentage of docs.

  • level – The percentage of docs that should be augmented.

Returns:

The augmenter, which is now only applied to the specified percentage of docs.
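
The document-level gating described above could be sketched as follows. This is a hedged illustration rather than augmenty's implementation: `set_doc_level_sketch` and `upper` are hypothetical names, strings stand in for spaCy `Example` objects, `level` is treated as a fraction between 0 and 1, and the sketch assumes that examples which are not selected pass through unchanged.

```python
import random
from typing import Callable, Iterator

# Hypothetical stand-in type: real augmenters take (Language, Example).
ToyAugmenter = Callable[[object, str], Iterator[str]]


def set_doc_level_sketch(augmenter: ToyAugmenter, level: float) -> ToyAugmenter:
    """Apply `augmenter` to roughly a fraction `level` of the examples."""

    def gated(nlp: object, example: str) -> Iterator[str]:
        # random.random() returns a float in [0.0, 1.0).
        if random.random() < level:
            yield from augmenter(nlp, example)
        else:
            # Assumption: unselected examples are yielded unchanged.
            yield example

    return gated


def upper(nlp: object, example: str) -> Iterator[str]:
    """Toy augmenter: upper-case the text."""
    yield example.upper()


always = list(set_doc_level_sketch(upper, level=1.0)(None, "hello"))
never = list(set_doc_level_sketch(upper, level=0.0)(None, "hello"))
print(always, never)  # ['HELLO'] ['hello']
```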

augmenty.augment_utilities.yield_original(augmenter: Callable[[Language, Example], Iterator[Example]], doc_level: float = 1.0) -> Callable[[Language, Example], Iterator[Example]]

Wraps an augmenter such that it yields both the original and the augmented example.

Parameters:
  • augmenter – A spaCy augmenter.

  • doc_level – The percentage of documents the augmenter should be applied to. Only yield the original when the original doc is augmented.

Returns:

The augmenter, which now yields both the original and augmented example.
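
A minimal sketch of the wrapping idea, covering only the default case where every document is augmented (doc_level=1.0). The `doc_level` sampling is deliberately left out, and the ordering (original first) is an assumption. `yield_original_sketch` and `upper` are hypothetical names, and strings stand in for spaCy `Example` objects.

```python
from typing import Callable, Iterator

# Hypothetical stand-in type: real augmenters take (Language, Example).
ToyAugmenter = Callable[[object, str], Iterator[str]]


def yield_original_sketch(augmenter: ToyAugmenter) -> ToyAugmenter:
    """Yield the untouched example, then whatever the augmenter produces."""

    def wrapped(nlp: object, example: str) -> Iterator[str]:
        yield example  # assumption: the original comes first
        yield from augmenter(nlp, example)

    return wrapped


def upper(nlp: object, example: str) -> Iterator[str]:
    """Toy augmenter: upper-case the text."""
    yield example.upper()


result = list(yield_original_sketch(upper)(None, "hello"))
print(result)  # ['hello', 'HELLO']
```

Keeping the original alongside the augmented copy means the augmented data extends the training set instead of replacing it.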