spaCy - Utility Functions



spaCy keeps a small collection of utility functions in spacy/util.py. Let us look at these functions and what each one is used for.

The utility functions are listed below in a table with their descriptions.

Sr.No. Utility Function & Description
1 util.get_data_path

Gets the path to the data directory where spaCy looks for models.

2 util.set_data_path

Sets a custom path to the data directory where spaCy looks for models.

3 util.get_lang_class

Imports and loads a Language class.

4 util.set_lang_class

Sets a custom Language class.

5 util.lang_class_is_loaded

Checks whether a Language class is already loaded.

6 util.load_model

Loads a model.

7 util.load_model_from_path

Loads a model from a data directory path.

8 util.load_model_from_init_py

A helper function used in the load() method of a model package.

9 util.get_model_meta

Gets a model’s meta.json from a directory path.

10 util.update_exc

Updates, validates, and overwrites tokenizer exceptions.

11 util.is_in_jupyter

Checks whether spaCy is running from a Jupyter notebook.

12 util.get_package_path

Gets the path to an installed package.

13 util.is_package

Checks whether a string maps to an installed package. It is mainly used to validate model packages.

14 util.compile_prefix_regex

Compiles a sequence of prefix rules into a regex object.

15 util.compile_suffix_regex

Compiles a sequence of suffix rules into a regex object.

16 util.compile_infix_regex

Compiles a sequence of infix rules into a regex object.

17 util.compounding

Yields an infinite series of compounding values.

18 util.decaying

Yields an infinite series of linearly decaying values.

19 util.itershuffle

Shuffles an iterator.

20 util.filter_spans

Filters a sequence of Span objects and removes duplicates or overlaps.
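
To make entries 14 to 16 concrete, here is a minimal sketch of what "compiling a sequence of rules into a regex object" can look like. It uses only Python's re module and does not call spaCy. The function names ending in _sketch and the sample rules are assumptions made for illustration, and the anchoring scheme (^ for prefixes, $ for suffixes, no anchor for infixes) is one plausible reading of the descriptions above rather than a copy of spaCy's code.

```python
import re

def compile_prefix_sketch(rules):
    # Hypothetical helper: anchor every rule at the start of the string.
    return re.compile("|".join("^" + rule for rule in rules if rule.strip()))

def compile_suffix_sketch(rules):
    # Hypothetical helper: anchor every rule at the end of the string.
    return re.compile("|".join(rule + "$" for rule in rules if rule.strip()))

def compile_infix_sketch(rules):
    # Hypothetical helper: infix rules may match anywhere, so no anchor.
    return re.compile("|".join(rule for rule in rules if rule.strip()))

# Sample rules, chosen only for this illustration.
prefix_re = compile_prefix_sketch([r"\(", r"\$", '"'])
suffix_re = compile_suffix_sketch([r"\)", r"\.", '"'])
infix_re = compile_infix_sketch([r"-", r"/"])

prefix_match = prefix_re.search("$100")
suffix_match = suffix_re.search("end.")
infix_spans = [m.span() for m in infix_re.finditer("state-of-the-art")]
```

Each compiled object is an ordinary regex, so search and finditer work on it as usual. A tokenizer can use such patterns to decide where to split punctuation off a chunk of text.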
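
The difference between entries 17 and 18 is easier to see with numbers. The sketch below is plain Python and does not import spaCy. The generator names and the parameter names (start, stop, rate, step) are assumptions made for illustration. It only mirrors the behaviour described in the table: one series grows by multiplication, the other shrinks by a fixed amount, and both stop changing once they reach a limit.

```python
from itertools import islice

def compounding_sketch(start, stop, rate):
    # Hypothetical generator: multiply by `rate` each step, never exceed `stop`.
    # Assumes rate > 1, so the series grows towards `stop`.
    value = float(start)
    while True:
        yield min(value, stop)
        value *= rate

def decaying_sketch(start, stop, step):
    # Hypothetical generator: subtract `step` each time, never drop below `stop`.
    value = float(start)
    while True:
        yield max(value, stop)
        value -= step

# Both generators are infinite, so islice takes only the first few values.
compounded = list(islice(compounding_sketch(1.0, 10.0, 2.0), 6))
decayed = list(islice(decaying_sketch(4.0, 1.0, 1.0), 5))

print(compounded)  # [1.0, 2.0, 4.0, 8.0, 10.0, 10.0]
print(decayed)     # [4.0, 3.0, 2.0, 1.0, 1.0]
```

Series like these are typically used for values that should change gradually during training, such as a batch size that grows or a rate that shrinks.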
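
Entry 20 can also be sketched without spaCy. In the example below, each span is represented as a plain (start, end) tuple of token indices instead of a Span object, and the function name filter_spans_sketch is an assumption made for illustration. The policy shown, where the longest span wins and an earlier start breaks ties, is one reasonable way to remove duplicates and overlaps. It should be read as a sketch and not as spaCy's implementation.

```python
def filter_spans_sketch(spans):
    # Hypothetical helper working on (start, end) tuples, end exclusive.
    # Longest spans come first; among equal lengths, the earlier start wins.
    ordered = sorted(spans, key=lambda s: (s[1] - s[0], -s[0]), reverse=True)
    kept = []
    seen_tokens = set()
    for start, end in ordered:
        tokens = range(start, end)
        # Keep the span only if none of its tokens is already covered.
        if seen_tokens.isdisjoint(tokens):
            kept.append((start, end))
            seen_tokens.update(tokens)
    # Return the surviving spans in document order.
    return sorted(kept)

# (0, 2) appears twice, and both (0, 2) and (3, 4) overlap the longer (1, 4).
candidates = [(0, 2), (0, 2), (1, 4), (5, 6), (3, 4)]
filtered = filter_spans_sketch(candidates)

print(filtered)  # [(1, 4), (5, 6)]
```

A filter of this kind is useful before operations that require non-overlapping spans, for example when merging spans or assigning entities.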
