POS - Parts of Speech Tagging

POS tagging, or part-of-speech tagging, labels each token in a corpus with its grammatical category (noun, verb, adjective, and so on). Below we use spaCy to demonstrate POS tagging:

import spacy
import en_core_web_sm

nlp = en_core_web_sm.load()

# sample corpus
sample_text = "sofia is an amazing data scientist"

# pair each token's text with its coarse-grained POS tag
[f"{token.text} ==> {token.pos_}" for token in nlp(sample_text)]

The output is (exact tags can vary with the spaCy and model version):

['sofia ==> NOUN', 'is ==> VERB', 'an ==> DET', 'amazing ==> ADJ', 'data ==> NOUN', 'scientist ==> NOUN']
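
Once tokens are tagged, a common next step is to group them by tag, for example to pull out all the nouns. The following is a minimal sketch in plain Python that works on the "token ==> TAG" strings shown above; the helper name `group_by_pos` and its `sep` parameter are hypothetical and not part of spaCy.

```python
from collections import defaultdict

# Tagged output copied from the example above, in "token ==> TAG" form.
tagged = [
    "sofia ==> NOUN", "is ==> VERB", "an ==> DET",
    "amazing ==> ADJ", "data ==> NOUN", "scientist ==> NOUN",
]


def group_by_pos(tagged_tokens, sep=" ==> "):
    """Map each POS tag to the tokens that carry it, in order of appearance.

    Hypothetical helper for illustration; not a spaCy function.
    """
    groups = defaultdict(list)
    for item in tagged_tokens:
        # Split on the last separator so the tag is always the final field.
        token, pos = item.rsplit(sep, 1)
        groups[pos].append(token)
    return dict(groups)


groups = group_by_pos(tagged)
print(groups["NOUN"])  # ['sofia', 'data', 'scientist']
```

When working with spaCy directly, the same grouping could be built from `token.text` and `token.pos_` without going through strings.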