Document-based#
Document-based augmenters are augmenters that operate on the entire document, e.g. changing the casing of a whole document or extracting a subset of a document.
augmenty.doc.casing#
- augmenty.doc.casing.create_spongebob_augmenter_v1(level: float) Callable[[Language, Example], Iterator[Example]] [source]#
Create an augmenter that converts documents to SpOnGeBoB casing.
- Parameters:
level – The percentage of examples that will be augmented.
- Returns:
The augmenter.
Example
>>> import augmenty
>>> import spacy
>>> nlp = spacy.blank("en")
>>> spongebob_augmenter = augmenty.load("spongebob_v1", level=1)
>>> texts = ["A sample text"]
>>> list(augmenty.texts(texts, spongebob_augmenter, nlp))
["A SaMpLe tExT"]
- augmenty.doc.casing.create_upper_casing_augmenter_v1(level: float) Callable[[Language, Example], Iterator[Example]] [source]#
Create an augmenter that converts documents to uppercase.
- Parameters:
level – The percentage of examples that will be augmented.
- Returns:
The augmenter.
Example
>>> import augmenty
>>> import spacy
>>> nlp = spacy.blank("en")
>>> upper_case_augmenter = augmenty.load("upper_case_v1", level=0.1)
>>> texts = ["A sample text"]
>>> list(augmenty.texts(texts, upper_case_augmenter, nlp))
["A SAMPLE TEXT"]
augmenty.doc.subset#
- augmenty.doc.subset.create_paragraph_subset_augmenter_v1(min_paragraph: Union[float, int] = 1, max_paragraph: Union[float, int] = 1.0, respect_sentences: bool = True) Callable[[Language, Example], Iterator[Example]] [source]#
Create an augmenter that extracts a subset of a document.
- Parameters:
min_paragraph – A float indicating the minimum percentage of the document to include, or an int indicating the minimum number of paragraphs to include (tokens if respect_sentences is False). E.g. 1 indicates at least one sentence.
max_paragraph – A float indicating the maximum percentage of the document to include, or an int indicating the maximum number of paragraphs to include (tokens if respect_sentences is False). E.g. 1.0 indicates 100%.
respect_sentences – Should the augmenter respect sentence boundaries?
- Returns:
The augmenter.
Example
>>> import augmenty
>>> import spacy
>>> nlp = spacy.blank("en")
>>> nlp.add_pipe("sentencizer")
>>> augmenter = augmenty.load("sent_subset_v1", level=0.7)
>>> text = (
...     "Augmenty is a wonderful tool for augmentation. "
...     "It has tons of different augmenters. "
...     "Augmenty is developed using spaCy."
... )
>>> list(augmenty.texts([text], augmenter, nlp))
["Augmenty is a wonderful tool for augmentation. Augmenty is developed using spaCy."]
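As a minimal sketch of the documented min_paragraph/max_paragraph parameters themselves (assuming the function is importable from augmenty.doc.subset as listed above; the example texts and values below are illustrative, and the output is random so only a possible result is shown as a comment):

>>> import augmenty
>>> import spacy
>>> from augmenty.doc.subset import create_paragraph_subset_augmenter_v1
>>> nlp = spacy.blank("en")
>>> nlp.add_pipe("sentencizer")
>>> # Keep at least 1 sentence (int) and at most 50% of each document (float),
>>> # respecting sentence boundaries.
>>> subset_augmenter = create_paragraph_subset_augmenter_v1(
...     min_paragraph=1, max_paragraph=0.5, respect_sentences=True
... )
>>> texts = ["First sentence. Second sentence. Third sentence. Fourth sentence."]
>>> list(augmenty.texts(texts, subset_augmenter, nlp))  # e.g. ["First sentence. Second sentence."]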