Learning Data Mining with Python

By: Robert Layton

Overview of this book

If you are a programmer who wants to get started with data mining, then this book is for you.

Chapter 9 – Authorship Attribution

Increasing the sample size

The Enron application in this chapter used only a portion of the overall dataset, and much more data is available. Increasing the number of authors will likely lower accuracy, but similar methods can still push accuracy beyond what was achieved in this chapter. Using a grid search, try different n-gram values and different parameters for the support vector machine to get better performance on a larger number of authors.
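As a starting point for this exercise, the following is a minimal sketch of such a grid search with scikit-learn. It assumes character n-grams counted by `CountVectorizer` feeding an `SVC`; the tiny corpus, the step names `ngrams` and `svm`, and the parameter values are illustrative assumptions rather than the chapter's actual data or settings. Replace `documents` and `authors` with the larger Enron sample.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

# Toy stand-in corpus, made up for illustration (NOT Enron data).
documents = [
    "Please send me the quarterly report by Friday.",
    "Please forward the signed contract by Monday.",
    "Please review the attached budget by Tuesday.",
    "Please confirm the meeting agenda by Thursday.",
    "hey, r u coming 2 lunch today??",
    "hey, did u see the game last nite??",
    "hey, can u call me l8r??",
    "hey, r u free 2 chat now??",
]
authors = [0, 0, 0, 0, 1, 1, 1, 1]  # one integer label per author

# Character n-gram counts followed by a support vector machine.
pipeline = Pipeline([
    ("ngrams", CountVectorizer(analyzer="char")),
    ("svm", SVC()),
])

# Hypothetical search space: widen or narrow it to fit your time budget.
param_grid = {
    "ngrams__ngram_range": [(2, 2), (3, 3), (1, 3)],
    "svm__kernel": ["linear", "rbf"],
    "svm__C": [0.1, 1, 10],
}

# cv=2 only because the toy corpus is tiny; use more folds on real data.
search = GridSearchCV(pipeline, param_grid, scoring="f1_macro", cv=2)
search.fit(documents, authors)

print(search.best_params_)
print(round(search.best_score_, 3))
```

The grid grows multiplicatively with each added parameter, so on the full dataset it may be worth searching a coarse grid first and refining around the best combination.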

Blogs dataset

The dataset used in Chapter 12, Working with Big Data, provides authorship-based classes (each blogger ID is a separate author), so it can be tested with the same kind of method. The dataset also labels each blogger's gender, age, industry, and star sign. Are authorship-based methods good for these classification tasks as well?
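One way to approach this question is to score the same model against each target in turn. The sketch below assumes a character trigram SVM and compares cross-validated macro F1 across label sets; the helper `score_target`, the posts, and the label values are all hypothetical placeholders, not data from the blogs corpus.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC


def score_target(documents, labels, folds=2):
    """Mean cross-validated macro F1 of a character n-gram SVM for one target."""
    model = make_pipeline(
        CountVectorizer(analyzer="char", ngram_range=(3, 3)),
        SVC(kernel="linear"),
    )
    scores = cross_val_score(model, documents, labels,
                             scoring="f1_macro", cv=folds)
    return scores.mean()


# Made-up posts: four hypothetical bloggers with two posts each.
posts = [
    "Went hiking again today, the trail was muddy but great.",
    "Another hike this weekend, the trail was steep but great.",
    "omg the new album is sooo good, cant stop listening!!",
    "omg that concert was sooo loud, cant hear anything!!",
    "The quarterly numbers are in; margins look thin this year.",
    "The annual numbers are in; revenue looks flat this year.",
    "lol my cat knocked over the plant again, typical",
    "lol my cat slept on the keyboard again, typical",
]

# Each target labels the same posts differently (values are invented).
targets = {
    "blogger_id": [1, 1, 2, 2, 3, 3, 4, 4],
    "gender": ["f", "f", "m", "m", "f", "f", "m", "m"],
}

results = {name: score_target(posts, labels) for name, labels in targets.items()}
for name, score in results.items():
    print(name, round(score, 3))
```

On the real dataset, targets such as age would first need to be binned into classes, and a fair comparison should keep all posts by a given blogger in the same fold so the model cannot simply recognize the author.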
