Natural Language Understanding with Python

We’ll start our exploration of topic modeling by looking at some considerations relating to grouping semantically similar documents in general, and then we’ll look at a specific example.
Like most of the machine learning problems we’ve discussed so far, the overall task generally breaks down into two sub-problems: representing the data, and performing a task based on those representations. We’ll look at these two sub-problems next.
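To make the two sub-problems concrete, here is a minimal sketch in plain Python. It is an illustration only, not the book's own code: the toy documents, the function names (`tfidf_vectors`, `cosine`, `most_similar`), the whitespace tokenization, and the choice of TF-IDF with cosine similarity are all assumptions made for the example. The first function handles representation; the last one performs a simple task on the representations, finding the most similar other document.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Sub-problem 1: represent each document as a sparse TF-IDF vector (a dict)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: the number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def most_similar(index, vectors):
    """Sub-problem 2: use the representations to find the nearest other document."""
    scores = [
        (cosine(vectors[index], vec), j)
        for j, vec in enumerate(vectors)
        if j != index
    ]
    return max(scores)[1]


# Hypothetical toy corpus: two documents about pets, two about finance.
docs = [
    "my cat chased the kitten",
    "the kitten and the cat slept",
    "stock prices fell sharply",
    "investors sold stock as prices fell",
]
vectors = tfidf_vectors(docs)
nearest = [most_similar(i, vectors) for i in range(len(docs))]
print(nearest)  # [1, 0, 3, 2]
```

Each document's nearest neighbor is the other document on the same subject, which is the kind of grouping by meaning that this section is concerned with. Keeping the representation step separate from the task step means either one could be replaced independently, for example by swapping in a different representation while leaving the similarity search unchanged.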
The data representations we’ve looked at so far were reviewed in Chapter 7. These approaches included the simple bag of words (BoW) variants, term frequency-inverse document frequency (TF-IDF), and newer approaches, including Word2Vec. Word2Vec is based on word vectors, which represent each word with a single vector, without taking into account the context in which a particular occurrence of the word appears. A newer...