Doc Class ContextManager and Property
In this chapter, let us learn about the context manager and the properties of Doc Class in spaCy.
Context Manager
The Doc class provides a context manager, Doc.retokenize, which is used to handle retokenization of a Doc object. Let us now learn about it in detail.
Doc.retokenize
This context manager lets you modify the Doc’s tokenization. The modifications you request inside the block are stored first and then applied all at once when the context manager exits.
The advantage of applying the changes in one batch is that it is more efficient and less error-prone than changing the tokenization one step at a time.
Example 1
Refer to the example of the Doc.retokenize context manager given below −
import spacy
from spacy.tokens import Doc

nlp_model = spacy.load("en_core_web_sm")
doc = nlp_model("This is Tutorialspoint.com.")
with doc.retokenize() as retokenizer:
    retokenizer.merge(doc[0:0])
doc
Output
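You will see the following output. Note that doc[0:0] is an empty span, and recent spaCy versions may raise an error when asked to merge a zero-length span, so the result can depend on the installed version −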
is Tutorialspoint.com.
Example 2
Here is another example of the Doc.retokenize context manager −
import spacy
from spacy.tokens import Doc

nlp_model = spacy.load("en_core_web_sm")
doc = nlp_model("This is Tutorialspoint.com.")
with doc.retokenize() as retokenizer:
    retokenizer.merge(doc[0:2])
doc
Output
You will see the following output −
This is Tutorialspoint.com.
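The printed text is identical before and after a merge, so the effect is easier to see by comparing the tokens themselves. The following is a minimal sketch, not the tutorial's own code: it assumes spaCy v3 is installed and uses spacy.blank("en") (a tokenizer-only pipeline, an assumption made here so that no trained model has to be downloaded) in place of en_core_web_sm. The names tokens_before and tokens_after are illustrative.

```python
import spacy

# Assumption: a blank English pipeline (tokenizer only) stands in for
# en_core_web_sm so that no trained model download is required.
nlp_model = spacy.blank("en")
doc = nlp_model("This is Tutorialspoint.com.")

# Token texts before retokenization.
tokens_before = [token.text for token in doc]

with doc.retokenize() as retokenizer:
    # Mark the first two tokens ("This", "is") for merging.
    # The change is applied when the with-block exits.
    retokenizer.merge(doc[0:2])

# Token texts after the merge has been applied.
tokens_after = [token.text for token in doc]

print(tokens_before)
print(tokens_after)
print(doc.text)  # the underlying text is unchanged by the merge
```

After the block exits, the Doc should contain one token fewer, with "This is" as its first token, while doc.text stays the same.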
Retokenize Methods
The table given below summarizes the two retokenizer methods that are available inside the Doc.retokenize context manager.
Sr.No. | Method | Description |
---|---|---|
1 | Retokenizer.merge | Marks a span for merging. |
2 | Retokenizer.split | Marks a token for splitting into the specified orths. |
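Since the examples above only cover merging, here is a hedged sketch of Retokenizer.split. It assumes spaCy v3 and again uses spacy.blank("en") rather than en_core_web_sm (an assumption made here to avoid a model download). The sentence and the name split_tokens are illustrative. The orths passed to split must join back to the original token text, and heads tells spaCy where each new subtoken attaches.

```python
import spacy

# Assumption: tokenizer-only pipeline, used so the sketch runs without
# downloading a trained model.
nlp_model = spacy.blank("en")
doc = nlp_model("I live in NewYork")

with doc.retokenize() as retokenizer:
    # One head per new subtoken:
    #   "New"  attaches to subtoken 1 of the split token, that is "York"
    #   "York" attaches to the existing token "in"
    heads = [(doc[3], 1), doc[2]]
    # The orths "New" + "York" must concatenate to the original "NewYork".
    retokenizer.split(doc[3], ["New", "York"], heads=heads)

split_tokens = [token.text for token in doc]
print(split_tokens)
```

As with merge, the split is only marked inside the block and takes effect when the context manager exits.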
Properties
The properties of Doc Class in spaCy are explained below −
Sr.No. | Doc Property | Description |
---|---|---|
1 | Doc.ents | The named entities in the document, returned as a tuple of Span objects. |
2 | Doc.noun_chunks | Used to iterate over the base noun phrases in the document. It requires a dependency parse. |
3 | Doc.sents | Used to iterate over the sentences in the document. |
4 | Doc.has_vector | A Boolean value which indicates whether a word vector is associated with the object. |
5 | Doc.vector | A real-valued meaning representation of the document. By default, it is the average of the token vectors. |
6 | Doc.vector_norm | The L2 norm of the document’s vector representation. |
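A few of these properties can be tried out with the sketch below. It is an illustration under stated assumptions, not output from a trained model: it assumes spaCy v3, uses spacy.blank("en") with the rule-based "sentencizer" component to provide sentence boundaries, and sets one entity by hand with an illustrative "ORG" label because a blank pipeline has no named entity recognizer. Doc.noun_chunks is left out because it needs a dependency parser such as the one in en_core_web_sm.

```python
import spacy
from spacy.tokens import Span

# Assumption: blank pipeline plus the rule-based sentencizer (spaCy v3
# string-name API), so no trained model download is needed.
nlp_model = spacy.blank("en")
nlp_model.add_pipe("sentencizer")

doc = nlp_model("Apple is a company. It makes phones.")

# Doc.sents: iterate over the sentences in the document.
sentences = [sent.text for sent in doc.sents]

# Doc.ents: a blank pipeline predicts no entities, so one is assigned
# manually here. The "ORG" label is chosen for illustration only.
doc.ents = [Span(doc, 0, 1, label="ORG")]
entities = [(ent.text, ent.label_) for ent in doc.ents]

# Doc.has_vector: expected to be falsy here, because a blank pipeline
# ships without word vectors.
has_vectors = doc.has_vector

print(sentences)
print(entities)
print(has_vectors)
```

With a trained pipeline that includes word vectors, Doc.has_vector would instead be True, and Doc.vector and Doc.vector_norm would carry meaningful values.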