Web Resources
- AI::Categorizer
An easy-to-use Perl package for text classification, with several classification algorithms.
- MALLET
A collection of Java tools for statistical language processing, including text classification.
- NLTK
A comprehensive toolkit for natural language processing, which includes software for text classification.
Datasets:
- Reuters topic classification data
- Language identification data
See also the resources under Supervised Learning.