evekeys: Isolate keywords from an event-based and custom-grouped textual corpus
By Chris Lindgren firstname.lastname@example.org
Distributed under the BSD 3-clause license. See LICENSE.txt or http://opensource.org/licenses/BSD-3-Clause for details.
evekeys is a set of functions that use scikit-learn to run a TF-IDF analysis and isolate keywords from event-based documents. It answers the following questions:
- What keywords represent a particular period of content?
- What keywords represent a particular group of content from a particular period?
It assumes you have:
- imported your corpus as a pandas DataFrame,
- included metadata information, such as a list of dates and list of groups to reorganize your corpus, and
- pre-processed your documents.
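A minimal sketch of what such an input might look like, using only pandas. The column names `date`, `group`, and `text`, the helper `select_documents`, and the sample rows are assumptions for illustration; evekeys may expect a different layout.

```python
import pandas as pd

# Hypothetical corpus: one pre-processed document per row, plus metadata.
corpus = pd.DataFrame({
    "date": ["2020-01-01", "2020-01-01", "2020-01-02"],
    "group": ["a", "b", "a"],
    "text": ["flood warning river", "road closed flood", "election vote count"],
})
corpus["date"] = pd.to_datetime(corpus["date"])


def select_documents(df, start, end, group=None):
    """Return documents dated between start and end (inclusive),
    optionally restricted to one group."""
    mask = df["date"].between(pd.Timestamp(start), pd.Timestamp(end))
    if group is not None:
        mask &= df["group"] == group
    return df.loc[mask, "text"].tolist()


day_one = select_documents(corpus, "2020-01-01", "2020-01-01")
day_one_group_a = select_documents(corpus, "2020-01-01", "2020-01-01", group="a")
```

Keeping dates as real datetime values rather than strings makes the period comparisons unambiguous.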
It runs only on Python 3.x and is not backwards-compatible with Python 2.
Warning: evekeys performs little to no custom error-handling, so make sure your inputs are formatted properly. If you have questions, please let me know via email.
pip install evekeys