tag	b6dff4e3e2ab107c11a7a033b3d002e99de32be6
tagger	Kenneth Chan <kenneth@prediction.io>	Thu Jan 28 15:59:52 2016 -0800
object	3dce40022c465b74ad39ad836f4215ad27ffe0f9

commit	3dce40022c465b74ad39ad836f4215ad27ffe0f9
author	Kenneth Chan <kenneth@prediction.io>	Thu Jan 28 15:59:21 2016 -0800
committer	Kenneth Chan <kenneth@prediction.io>	Thu Jan 28 15:59:21 2016 -0800
tree	0ed191c3a06d3f2af4b15807dc823680dd0c0944
parent	55fd981c684d3d8744d9189d06da28e97b25a6ac

README.md

Text Classification Engine

Look at the following tutorial for a Quick Start guide and implementation details.

Release Information

Version 3.1

Fixed DataSource to read “content” and “e-mail”, and to use the label “spam” for the tutorial data. Fixed the default algorithm setting in engine.json.

Version 2.2

Modified PreparedData to use MLlib's hashing and tf-idf implementations.
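
For readers unfamiliar with the technique named above, here is a minimal pure-Python sketch of hashed term frequencies followed by tf-idf weighting. It is not the engine's code and does not call MLlib: `NUM_FEATURES`, `hashing_tf`, `fit_idf`, `tf_idf`, and the sample documents are hypothetical names chosen for illustration, `zlib.crc32` is only a stand-in for MLlib's own hash function, and the smoothed formula `log((m + 1) / (df + 1))` is assumed to match the one MLlib documents.

```python
import math
import zlib
from collections import Counter

NUM_FEATURES = 16  # hypothetical vector size, kept small for illustration


def hashing_tf(tokens, num_features=NUM_FEATURES):
    """Map tokens to a sparse {index: count} vector via the hashing trick."""
    counts = Counter()
    for token in tokens:
        # crc32 is a stand-in hash (assumption); MLlib uses its own function.
        index = zlib.crc32(token.encode("utf-8")) % num_features
        counts[index] += 1
    return dict(counts)


def fit_idf(tf_vectors, num_features=NUM_FEATURES):
    """Return one smoothed IDF weight per index: log((m + 1) / (df + 1))."""
    m = len(tf_vectors)
    doc_freq = Counter()
    for vec in tf_vectors:
        doc_freq.update(vec.keys())  # each index counts once per document
    return [math.log((m + 1) / (doc_freq[i] + 1)) for i in range(num_features)]


def tf_idf(tf_vector, idf):
    """Scale each term frequency by the IDF weight of its index."""
    return {i: tf * idf[i] for i, tf in tf_vector.items()}


# Hypothetical tokenized documents.
docs = [["free", "money", "now"], ["meeting", "notes", "now"]]
tf_vectors = [hashing_tf(d) for d in docs]
idf = fit_idf(tf_vectors)
weighted = [tf_idf(v, idf) for v in tf_vectors]
```

Hashing fixes the vector size up front and needs no vocabulary, at the cost of occasional collisions between unrelated tokens. A token that appears in every document, such as "now" here, receives an IDF weight of zero.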

Version 2.1

Fixed dot product implementation in the predict methods to work with batch predict method for evaluation.

Version 2.0

Included three data sets: e-mail spam, 20 newsgroups, and the Rotten Tomatoes sentiment analysis set. Added a Multinomial Logistic Regression algorithm for text classification.

Version 1.2

Fixed an import script bug occurring with Python 2.

Version 1.1

Changed data import Python script to pull straight from the 20 newsgroups page.