The Handbook of NLP with Gensim

I will use the classic song “Theme from New York, New York” by Frank Sinatra. Its repeated “New York” makes it a good example for NLP. As I write this chapter, the song’s jazz accompanies a calm, snowy midnight in New York.
“Start spreading the news
You’re leaving today (tell him friend)
I want to be a part of it, New York, New York
Your vagabond shoes, they are longing to stray
And steps around the heart of it, New York, New York”
Let’s learn how to do BoW with Gensim.
Let’s import several modules. The Gensim simple_preprocess function converts a document into a list of lowercase tokens. The Gensim Dictionary() class implements the concept of a dictionary in Gensim: it maps each unique token to a unique integer ID. I will also import pprint for pretty-printing, which displays output in a more readable form:
import gensim
from gensim.utils import simple_preprocess
from gensim.corpora...
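To make the two roles described above concrete, here is a minimal plain-Python sketch of the BoW idea applied to the lyrics. It is not Gensim's implementation: tokenize, build_vocab, and to_bow are hypothetical helpers that only stand in for the jobs of simple_preprocess (document to tokens) and Dictionary (token to integer ID). The tokenization rule and the order in which IDs are assigned are assumptions made for illustration, so Gensim's actual tokens and IDs may differ.

```python
import re
from collections import Counter
from pprint import pprint

# The five lyric lines quoted above, one "document" per line.
lyrics = [
    "Start spreading the news",
    "You're leaving today (tell him friend)",
    "I want to be a part of it, New York, New York",
    "Your vagabond shoes, they are longing to stray",
    "And steps around the heart of it, New York, New York",
]


def tokenize(line):
    """Hypothetical helper: lowercase a line and keep alphabetic runs
    of at least two letters (an assumed, simplified tokenization rule)."""
    return [t for t in re.findall(r"[a-z]+", line.lower()) if len(t) >= 2]


def build_vocab(token_lists):
    """Hypothetical helper: give each distinct token an integer ID,
    assigned in order of first appearance (an assumption)."""
    vocab = {}
    for tokens in token_lists:
        for tok in tokens:
            vocab.setdefault(tok, len(vocab))
    return vocab


def to_bow(tokens, vocab):
    """Hypothetical helper: return sorted (token_id, count) pairs
    for one tokenized line."""
    counts = Counter(vocab[t] for t in tokens)
    return sorted(counts.items())


tokenized = [tokenize(line) for line in lyrics]
vocab = build_vocab(tokenized)
bow_corpus = [to_bow(tokens, vocab) for tokens in tokenized]

pprint(vocab)       # token -> integer ID
pprint(bow_corpus)  # one list of (token_id, count) pairs per lyric line
```

In the third and fifth lines, the IDs for "new" and "york" each carry a count of 2, which is the repetition that makes this song a convenient BoW example. Word order is discarded; only the counts per token ID remain.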