Gutenberg Analysis

Vocabulary from top 100 books on Project Gutenberg

The visualization summarizes the vocabulary of the top 100 books on Project Gutenberg. The data was gathered from the Project Gutenberg website using Python and analyzed with the NLTK library. The relevant data was then saved in JSON format, and some missing dataset fields were fetched from the Goodreads API. Minor formatting and cleanup were done in Sublime Text, and the visualization was created with D3.js.
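The analysis step can be illustrated with a minimal sketch. This is not the project's actual code: a plain regular-expression tokenizer stands in for NLTK so that the example runs without extra downloads, and the function name `vocabulary_summary`, the JSON field names, and the sample text are all assumptions made for illustration.

```python
import json
import re
from collections import Counter

# Assumption: a simple regex tokenizer replaces NLTK in this sketch.
# It matches lowercase words, optionally with one internal apostrophe.
TOKEN_RE = re.compile(r"[a-z]+(?:'[a-z]+)?")


def vocabulary_summary(title, text, top_n=3):
    """Return a JSON-serializable vocabulary summary for one book.

    Hypothetical helper; the field names are illustrative only.
    """
    tokens = TOKEN_RE.findall(text.lower())
    counts = Counter(tokens)
    return {
        "title": title,
        "total_words": len(tokens),    # every token, repeats included
        "unique_words": len(counts),   # size of the vocabulary
        "top_words": counts.most_common(top_n),
    }


# Illustrative input; a real run would read each book's full text.
sample = "It was the best of times, it was the worst of times."
summary = vocabulary_summary("Sample", sample)

# Serialize the summary so a front end such as D3.js could load it.
print(json.dumps(summary, indent=2))
```

Running one such summary per book and collecting the results in a list would give a single JSON file for the visualization to read.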

Project Link