Merge pull request #11 from thygesen/tfnerpoc

added files for test
diff --git a/.gitignore b/.gitignore
index fe06e66..126d4a6 100644
--- a/.gitignore
+++ b/.gitignore
@@ -5,3 +5,6 @@
 nbactions.xml
 nb-configuration.xml
 *.DS_Store
+
+.idea
+*.iml
diff --git a/aws-ec2-testing-scripts/README.md b/aws-ec2-testing-scripts/README.md
new file mode 100644
index 0000000..13f0ad6
--- /dev/null
+++ b/aws-ec2-testing-scripts/README.md
@@ -0,0 +1,58 @@
+# OpenNLP Testing Scripts
+
+These scripts automate running OpenNLP's evaluation tests on AWS EC2. They consist of a Packer template, a CloudFormation template, and a few supporting bash scripts.
+
+## Running OpenNLP Tests
+
+Running the tests involves two steps:
+
+1. Create an AMI that contains the required tools and OpenNLP test data.
+1. Launch a CloudFormation stack that creates an instance from the AMI and runs the tests.
+
+These two steps are described in detail below.
+
+### Creating the AMI used for Testing
+
+Creating the AMI requires the [Packer](https://www.packer.io/intro/index.html) tool. To create the AMI, execute the `build-ami.sh` script; you may need to modify the location of the Packer executable in that script. The OpenNLP test data must exist as `opennlp-data.zip` in the current directory before creating the AMI so that the Packer script can upload the test data to the instance.
+
+You only need to create the AMI once; the same AMI can be reused for testing future OpenNLP versions. The only reason to create a new AMI is to include updated OpenNLP test data.
+
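+For example, the whole flow looks like this (a sketch; the `ontonotes4` layout matches what `packer.json` later extracts, and the `packer` executable path depends on your installation):
+
+```
+# Package the test data expected by the Packer provisioner.
+zip -r opennlp-data.zip ontonotes4/
+
+# Build the AMI (equivalent to running build-ami.sh).
+packer build ./packer.json
+```
+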
+### Creating the CloudFormation Stack
+
+Create a new stack using the `cf-template.json` CloudFormation template. The `Image` parameter should reference the AMI created by Packer. Be sure to check your email and confirm the subscription to the newly created SNS topic, or you will not receive the build emails.
+
+You can create a stack from the template either through the AWS Console or using the AWS CLI:
+
+```
+aws cloudformation create-stack \
+  --stack-name OpenNLP-Testing \
+  --template-body file://./cf-template.json \
+  --parameters \
+    ParameterKey=InstanceType,ParameterValue=m4.xlarge \
+    ParameterKey=KeyName,ParameterValue=keyname \
+    ParameterKey=NotificationsEmail,ParameterValue=your@email.com \
+    ParameterKey=Branch,ParameterValue=opennlp-1.8.3 \
+    ParameterKey=Tests,ParameterValue=run-eval-tests.sh
+```
+
+When the tests complete (whether they succeed or fail), the email address specified in the `NotificationsEmail` parameter receives a notification. The email's subject indicates whether the tests passed or failed, and the body contains approximately the last 200 KB of the Maven build log. Once you receive the notification, you can SSH into the EC2 instance to debug or re-run tests, or delete the stack:
+
+```
+aws cloudformation delete-stack --stack-name OpenNLP-Testing
+```
+
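+If you need to debug, you can follow the build log directly on the instance (a hypothetical session; substitute your key file and the instance's public DNS name from the stack output):
+
+```
+ssh -i keyname.pem ubuntu@<instance-public-dns>
+tail -f /opt/build.log
+```
+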
+## AWS Infrastructure
+
+The `cf-template.json` CloudFormation template creates a new VPC to contain the EC2 instance that runs the tests. The template creates all the necessary components such as the route table, subnet, IAM policies and roles, and security group.
+
+### Instance Directory Structure
+
+These scripts are written expecting the following directory structure:
+
+* `/opt/` - Contains these scripts.
+* `/opt/opennlp` - Contains the OpenNLP code as cloned from https://github.com/apache/opennlp.
+* `/opt/opennlp-data` - Contains the data required for OpenNLP's eval tests.
+
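+A quick way to confirm this layout on a running instance (a sketch; run after the instance boots):
+
+```
+# Each of these should exist if provisioning and UserData succeeded.
+for d in /opt/opennlp /opt/opennlp-data; do
+  [ -d "$d" ] && echo "ok: $d" || echo "missing: $d"
+done
+ls /opt/*.sh
+```
+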
+## License
+
+Licensed under the Apache Software License, version 2.
diff --git a/aws-ec2-testing-scripts/build-ami.sh b/aws-ec2-testing-scripts/build-ami.sh
new file mode 100755
index 0000000..fe887ef
--- /dev/null
+++ b/aws-ec2-testing-scripts/build-ami.sh
@@ -0,0 +1,4 @@
+#!/bin/bash
+
+packer build ./packer.json
+
diff --git a/aws-ec2-testing-scripts/cf-template.json b/aws-ec2-testing-scripts/cf-template.json
new file mode 100644
index 0000000..2d8b3ca
--- /dev/null
+++ b/aws-ec2-testing-scripts/cf-template.json
@@ -0,0 +1,318 @@
+{
+  "AWSTemplateFormatVersion": "2010-09-09",
+  "Description": "Stack for running OpenNLP testing. Licensed under the ASLv2.",
+  "Parameters": {
+    "InstanceType": {
+      "Description": "EC2 instance type.",
+      "Type": "String",
+      "Default": "m4.xlarge"
+    },
+    "Image": {
+      "Description": "The OpenNLP testing AMI made with the Packer script.",
+      "Type": "String",
+      "Default": "ami-6191821a"
+    },
+    "KeyName": {
+      "Description": "An existing EC2 keypair.",
+      "Type": "AWS::EC2::KeyPair::KeyName",
+      "ConstraintDescription": "Must be the name of an existing EC2 keypair."
+    },
+    "NotificationsEmail": {
+      "Description": "Email address to receive notifications.",
+      "Type": "String"
+    },
+    "SSHCIDR": {
+      "Description": "IP to allow SSH.",
+      "Type": "String",
+      "Default": "0.0.0.0/0"
+    },
+    "Branch": {
+      "Description": "The OpenNLP git branch or tag to test.",
+      "Type": "String",
+      "Default": "opennlp-1.8.2"
+    },
+    "Tests": {
+      "Description": "The OpenNLP tests to run.",
+      "Type": "String",
+      "AllowedValues": ["run-eval-tests.sh", "run-high-memory-tests.sh"],
+      "Default": "run-eval-tests.sh"
+    }
+  },
+  "Resources": {
+    "VPC": {
+      "Type": "AWS::EC2::VPC",
+      "Properties": {
+        "CidrBlock": "10.0.0.0/16",
+        "EnableDnsSupport": true,
+        "EnableDnsHostnames": true,
+        "Tags": [
+          {
+            "Key": "Application",
+            "Value": {
+              "Ref": "AWS::StackId"
+            }
+          }
+        ]
+      }
+    },
+    "Subnet": {
+      "Type": "AWS::EC2::Subnet",
+      "Properties": {
+        "CidrBlock": "10.0.0.0/24",
+        "MapPublicIpOnLaunch": true,
+        "Tags": [
+          {
+            "Key": "Application",
+            "Value": {
+              "Ref": "AWS::StackId"
+            }
+          }
+        ],
+        "VpcId": {
+          "Ref": "VPC"
+        }
+      }
+    },
+    "InternetGateway": {
+      "Type": "AWS::EC2::InternetGateway",
+      "Properties": {
+        "Tags": [
+          {
+            "Key": "Application",
+            "Value": {
+              "Ref": "AWS::StackId"
+            }
+          }
+        ]
+      }
+    },
+    "AttachGateway": {
+      "Type": "AWS::EC2::VPCGatewayAttachment",
+      "Properties": {
+        "VpcId": {
+          "Ref": "VPC"
+        },
+        "InternetGatewayId": {
+          "Ref": "InternetGateway"
+        }
+      }
+    },
+    "RouteTable": {
+      "Type": "AWS::EC2::RouteTable",
+      "Properties": {
+        "VpcId": {
+          "Ref": "VPC"
+        },
+        "Tags": [
+          {
+            "Key": "Application",
+            "Value": {
+              "Ref": "AWS::StackId"
+            }
+          }
+        ]
+      }
+    },
+    "Route": {
+      "Type": "AWS::EC2::Route",
+      "DependsOn": "AttachGateway",
+      "Properties": {
+        "RouteTableId": {
+          "Ref": "RouteTable"
+        },
+        "DestinationCidrBlock": "0.0.0.0/0",
+        "GatewayId": {
+          "Ref": "InternetGateway"
+        }
+      }
+    },
+    "SubnetRouteTableAssociation": {
+      "Type": "AWS::EC2::SubnetRouteTableAssociation",
+      "Properties": {
+        "SubnetId": {
+          "Ref": "Subnet"
+        },
+        "RouteTableId": {
+          "Ref": "RouteTable"
+        }
+      }
+    },
+    "InstanceSecurityGroup": {
+      "Type": "AWS::EC2::SecurityGroup",
+      "Properties": {
+        "GroupDescription": "Enable SSH access via port 22",
+        "SecurityGroupIngress": [
+          {
+            "IpProtocol": "tcp",
+            "FromPort": "22",
+            "ToPort": "22",
+            "CidrIp": {
+              "Ref": "SSHCIDR"
+            }
+          }
+        ],
+        "VpcId": {
+          "Ref": "VPC"
+        }
+      }
+    },
+    "RolePolicies": {
+      "Type": "AWS::IAM::Policy",
+      "Properties": {
+        "PolicyName": "root",
+        "PolicyDocument": {
+          "Version": "2012-10-17",
+          "Statement": [
+            {
+              "Effect": "Allow",
+              "Action": "s3:*",
+              "Resource": "*"
+            },
+            {
+              "Action": [
+                "sns:*"
+              ],
+              "Effect": "Allow",
+              "Resource": "*"
+            }
+          ]
+        },
+        "Roles": [
+          {
+            "Ref": "InstanceRole"
+          }
+        ]
+      }
+    },
+    "InstanceRole": {
+      "Type": "AWS::IAM::Role",
+      "Properties": {
+        "AssumeRolePolicyDocument": {
+          "Version": "2012-10-17",
+          "Statement": [
+            {
+              "Effect": "Allow",
+              "Principal": {
+                "Service": [
+                  "ec2.amazonaws.com"
+                ]
+              },
+              "Action": [
+                "sts:AssumeRole"
+              ]
+            }
+          ]
+        },
+        "Path": "/"
+      }
+    },
+    "InstanceProfile": {
+      "Type": "AWS::IAM::InstanceProfile",
+      "Properties": {
+        "Path": "/",
+        "Roles": [
+          {
+            "Ref": "InstanceRole"
+          }
+        ]
+      }
+    },
+    "SNSTopic": {
+      "Type": "AWS::SNS::Topic",
+      "Properties": {
+        "Subscription": [
+          {
+            "Endpoint": {
+              "Ref": "NotificationsEmail"
+            },
+            "Protocol": "email"
+          }
+        ]
+      }
+    },
+    "OpenNLPInstance": {
+      "Type": "AWS::EC2::Instance",
+      "DependsOn": "AttachGateway",
+      "Properties": {
+        "IamInstanceProfile": {
+          "Ref": "InstanceProfile"
+        },
+        "ImageId": {
+          "Ref": "Image"
+        },
+        "InstanceType": {
+          "Ref": "InstanceType"
+        },
+        "KeyName": {
+          "Ref": "KeyName"
+        },
+        "SecurityGroupIds": [
+          {
+            "Ref": "InstanceSecurityGroup"
+          }
+        ],
+        "SubnetId": {
+          "Ref": "Subnet"
+        },
+        "Tags": [
+          {
+            "Key": "Application",
+            "Value": {
+              "Ref": "AWS::StackId"
+            }
+          },
+          {
+            "Key": "Name",
+            "Value": "OpenNLP Testing"
+          }
+        ],
+        "UserData": {
+          "Fn::Base64": {
+            "Fn::Join": [
+              "",
+              [
+                "#!/bin/bash -xe\n",
+                "# Clone OpenNLP.\n",
+                "git clone https://github.com/apache/opennlp.git\n",
+                "mv opennlp /opt/\n",
+                "chown ubuntu:ubuntu /opt/ -R\n",
+                "# Checkout the branch or tag that we want to test.\n",
+                "cd /opt/opennlp\n",
+                "git checkout ", {"Ref": "Branch"}, "\n",
+                "sed -i 's/TOPICARNPARAM/", {"Ref": "SNSTopic"}, "/g' /opt/notify.sh\n",
+                "# Start the tests\n",
+                "cd /opt\n",
+                "./", {"Ref": "Tests"}, "\n"
+              ]
+            ]
+          }
+        }
+      }
+    }
+  },
+  "Outputs": {
+    "Instance": {
+      "Description": "The instance public IP.",
+      "Value": {
+        "Fn::GetAtt": [
+          "OpenNLPInstance",
+          "PublicDnsName"
+        ]
+      }
+    }
+  }
+}
diff --git a/aws-ec2-testing-scripts/notify.sh b/aws-ec2-testing-scripts/notify.sh
new file mode 100755
index 0000000..963f6f9
--- /dev/null
+++ b/aws-ec2-testing-scripts/notify.sh
@@ -0,0 +1,38 @@
+#!/bin/bash
+
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+SUBJECT=$1
+TOPIC_ARN="TOPICARNPARAM"
+LOG_FILE="/opt/build.log"
+
+# The maximum size of an SNS message body is 256 KB.
+# Round down a bit to stay safely under that limit.
+tail -c 200000 "$LOG_FILE" > /tmp/subset.log
+
+OUTCOME="SUCCESS"
+
+# Look to see if the build failed.
+if grep -q 'BUILD FAILURE' "$LOG_FILE"; then
+  OUTCOME="FAILURE"
+fi
+
+# Publish the message to SNS.
+aws sns publish \
+  --region us-east-1 \
+  --topic-arn "$TOPIC_ARN" \
+  --subject "$OUTCOME - $SUBJECT" \
+  --message file:///tmp/subset.log
diff --git a/aws-ec2-testing-scripts/packer.json b/aws-ec2-testing-scripts/packer.json
new file mode 100644
index 0000000..945f77f
--- /dev/null
+++ b/aws-ec2-testing-scripts/packer.json
@@ -0,0 +1,60 @@
+{
+  "variables": {},
+  "builders": [
+    {
+      "type": "amazon-ebs",
+      "region": "us-east-1",
+      "source_ami": "ami-cd0f5cb6",
+      "instance_type": "m3.medium",
+      "ssh_username": "ubuntu",
+      "ami_name": "OpenNLP Testing {{timestamp}}",
+      "tags": {
+        "Name": "OpenNLP Testing"
+      }
+    }
+  ],
+  "provisioners": [
+    {
+      "type": "file",
+      "source": "notify.sh",
+      "destination": "/tmp/"
+    },
+    {
+      "type": "file",
+      "source": "run-eval-tests.sh",
+      "destination": "/tmp/"
+    },
+    {
+      "type": "file",
+      "source": "run-high-memory-tests.sh",
+      "destination": "/tmp/"
+    },
+    {
+      "type": "file",
+      "source": "opennlp-data.zip",
+      "destination": "/tmp/"
+    },
+    {
+      "type": "shell",
+      "inline": [
+        "sudo apt-get update",
+        "sudo apt-get install -y openjdk-8-jdk maven git awscli unzip",
+        "sudo mv /tmp/*.sh /opt/",
+        "sudo chown ubuntu:ubuntu /opt/*.sh",
+        "sudo chmod +x /opt/*.sh",
+        "sudo mkdir /opt/opennlp-data",
+        "sudo chown ubuntu:ubuntu /opt/opennlp-data",
+        "unzip /tmp/opennlp-data.zip -d /opt/opennlp-data",
+        "tar -xzf /opt/opennlp-data/ontonotes4/data.tar.gz -C /opt/opennlp-data/ontonotes4/",
+        "sudo sed -i 's/PermitRootLogin without-password/PermitRootLogin forced-commands-only/g' /etc/ssh/sshd_config",
+        "sudo passwd -l root",
+        "sudo shred -n 50 -fuzv /etc/ssh/*_key /etc/ssh/*_key.pub",
+        "sudo find /root/.ssh -type f -exec shred -n 30 -z -u {} \\;",
+        "sudo find /home/ubuntu/.ssh -type f -exec shred -n 30 -z -u {} \\;",
+        "sudo shred -n 50 -fuzv /var/log/wtmp",
+        "sudo shred -n 50 -fuzv /var/log/btmp",
+        "sudo shred -n 50 -fuzv /var/log/lastlog"
+      ]
+    }
+  ]
+}
diff --git a/aws-ec2-testing-scripts/run-eval-tests.sh b/aws-ec2-testing-scripts/run-eval-tests.sh
new file mode 100755
index 0000000..2d44456
--- /dev/null
+++ b/aws-ec2-testing-scripts/run-eval-tests.sh
@@ -0,0 +1,18 @@
+#!/bin/bash
+
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+# Use ';' rather than '&&' before notify.sh so a notification is sent on failure too.
+nohup sh -c 'cd /opt/opennlp && mvn clean install -l /opt/build.log -Peval-tests -DOPENNLP_DATA_DIR=/opt/opennlp-data/; /opt/notify.sh "OpenNLP eval-tests complete"' &
diff --git a/aws-ec2-testing-scripts/run-high-memory-tests.sh b/aws-ec2-testing-scripts/run-high-memory-tests.sh
new file mode 100755
index 0000000..fbd0066
--- /dev/null
+++ b/aws-ec2-testing-scripts/run-high-memory-tests.sh
@@ -0,0 +1,18 @@
+#!/bin/bash
+
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+# Use ';' rather than '&&' before notify.sh so a notification is sent on failure too.
+nohup sh -c 'cd /opt/opennlp && mvn clean install -l /opt/build.log -Phigh-memory-tests -DOPENNLP_DATA_DIR=/opt/opennlp-data/; /opt/notify.sh "OpenNLP High-memory tests complete"' &
diff --git a/opennlp-dl/pom.xml b/opennlp-dl/pom.xml
index 3d15d8f..cfb1a1b 100644
--- a/opennlp-dl/pom.xml
+++ b/opennlp-dl/pom.xml
@@ -26,14 +26,14 @@
 
   <properties>
     <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
-    <nd4j.version>0.7.2</nd4j.version>
+    <nd4j.version>0.9.1</nd4j.version>
   </properties>
 
   <dependencies>
       <dependency>
           <groupId>org.apache.opennlp</groupId>
           <artifactId>opennlp-tools</artifactId>
-          <version>1.7.2</version>
+          <version>1.8.3</version>
       </dependency>
 
       <dependency>
@@ -41,8 +41,6 @@
           <artifactId>deeplearning4j-core</artifactId>
           <version>${nd4j.version}</version>
       </dependency>
-
-
       <dependency>
           <groupId>org.deeplearning4j</groupId>
           <artifactId>deeplearning4j-nlp</artifactId>
@@ -64,6 +62,16 @@
       <artifactId>nd4j-native-platform</artifactId>
       <version>${nd4j.version}</version>
     </dependency>
+    <dependency>
+      <groupId>args4j</groupId>
+      <artifactId>args4j</artifactId>
+      <version>2.33</version>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.commons</groupId>
+      <artifactId>commons-collections4</artifactId>
+      <version>4.1</version>
+    </dependency>
   </dependencies>
   <build>
     <plugins>
diff --git a/opennlp-dl/src/main/java/opennlp/tools/dl/DataReader.java b/opennlp-dl/src/main/java/opennlp/tools/dl/DataReader.java
new file mode 100644
index 0000000..86af123
--- /dev/null
+++ b/opennlp-dl/src/main/java/opennlp/tools/dl/DataReader.java
@@ -0,0 +1,308 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *  http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package opennlp.tools.dl;
+
+import org.apache.commons.io.FileUtils;
+import org.nd4j.linalg.api.ndarray.INDArray;
+import org.nd4j.linalg.dataset.DataSet;
+import org.nd4j.linalg.dataset.api.DataSetPreProcessor;
+import org.nd4j.linalg.dataset.api.iterator.DataSetIterator;
+import org.nd4j.linalg.factory.Nd4j;
+import org.nd4j.linalg.indexing.INDArrayIndex;
+import org.nd4j.linalg.indexing.NDArrayIndex;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.File;
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.Random;
+import java.util.function.Function;
+
+/**
+ * This class provides a reader for loading training and test datasets from the file system for text classifiers.
+ * In addition to reading the content, it
+ * (1) vectorizes the text using embeddings such as GloVe, and
+ * (2) divides the datasets into mini batches of the specified size.
+ *
+ * The data is expected to be organized as per the following convention:
+ * <pre>
+ * data-dir/
+ *     +- label1 /
+ *     |    +- example11.txt
+ *     |    +- example12.txt
+ *     |    +- example13.txt
+ *     |    +- .....
+ *     +- label2 /
+ *     |    +- example21.txt
+ *     |    +- .....
+ *     +- labelN /
+ *          +- exampleN1.txt
+ *          +- .....
+ * </pre>
+ *
+ * In addition, the dataset should be divided into training and test sets as follows:
+ * <pre>
+ * data-dir/
+ *     + train/
+ *     |   +- label1 /
+ *     |   +- labelN /
+ *     + test /
+ *         +- label1 /
+ *         +- labelN /
+ * </pre>
+ *
+ * <h2>Usage: </h2>
+ * <pre>
+ *     // label names should match the subdirectory names
+ *     labels = Arrays.asList("label1", "label2", ..., "labelN");
+ *     train = new DataReader("data-dir/train", labels, embeds, ...);
+ *     test = new DataReader("data-dir/test", labels, embeds, ...);
+ * </pre>
+ *
+ * @see GlobalVectors
+ * @see NeuralDocCat
+ * <br/>
+ * @author Thamme Gowda (thammegowda@apache.org)
+ *
+ */
+public class DataReader implements DataSetIterator {
+
+    private static final Logger LOG = LoggerFactory.getLogger(DataReader.class);
+
+    private File dataDir;
+    private List<File> records;
+    private List<Integer> labels;
+    private Map<String, Integer> labelToId;
+    private String extension = ".txt";
+    private GlobalVectors embedder;
+    private int cursor = 0;
+    private int batchSize;
+    private int vectorLen;
+    private int maxSeqLen;
+    private int numLabels;
+    // default tokenizer
+    private Function<String, String[]> tokenizer = s -> s.toLowerCase().split(" ");
+
+
+    /**
+     * Creates a reader with the specified arguments
+     * @param dataDirPath data directory
+     * @param labelNames list of labels (names should match sub directory names)
+     * @param embedder embeddings to convert words to vectors
+     * @param batchSize mini batch size for DL4j training
+     * @param maxSeqLength truncate sequences that are longer than this.
+     *                     If truncation is not desired, use {@code Integer.MAX_VALUE}
+     */
+    DataReader(String dataDirPath, List<String> labelNames, GlobalVectors embedder,
+               int batchSize, int maxSeqLength){
+        this.batchSize = batchSize;
+        this.embedder = embedder;
+        this.maxSeqLen = maxSeqLength;
+        this.vectorLen = embedder.getVectorSize();
+        this.numLabels = labelNames.size();
+        this.dataDir = new File(dataDirPath);
+        this.labelToId = new HashMap<>();
+        for (int i = 0; i < labelNames.size(); i++) {
+            labelToId.put(labelNames.get(i), i);
+        }
+        this.labelToId = Collections.unmodifiableMap(this.labelToId);
+        this.scanDir();
+        this.reset();
+    }
+
+    private void scanDir(){
+        assert dataDir.exists();
+        List<Integer> labels = new ArrayList<>();
+        List<File> files = new ArrayList<>();
+        for (String labelName: this.labelToId.keySet()) {
+            Integer labelId = this.labelToId.get(labelName);
+            assert labelId != null;
+            File labelDir = new File(dataDir, labelName);
+            if (!labelDir.exists()){
+                throw new IllegalStateException("No examples found for "
+                        + labelName + ". Looked at:" + labelDir);
+            }
+            File[] examples = labelDir.listFiles(f ->
+                    f.isFile() && f.getName().endsWith(this.extension));
+            if (examples == null || examples.length == 0){
+                throw new IllegalStateException("No examples found for "
+                        + labelName + ". Looked at:" + labelDir
+                        + " for files having extension: " + extension);
+            }
+            LOG.info("Found {} examples for label {}", examples.length, labelName);
+            for (File example: examples) {
+                files.add(example);
+                labels.add(labelId);
+            }
+        }
+        this.records = files;
+        this.labels = labels;
+    }
+
+    /**
+     * sets tokenizer for converting text to tokens
+     * @param tokenizer tokenizer to use for converting text to tokens
+     */
+    public void setTokenizer(Function<String, String[]> tokenizer) {
+        this.tokenizer = tokenizer;
+    }
+
+    /**
+     * @return Tokenizer function used for converting text into words
+     */
+    public Function<String, String[]> getTokenizer() {
+        return tokenizer;
+    }
+
+    @Override
+    public DataSet next(int batchSize) {
+        batchSize = Math.min(batchSize, records.size() - cursor);
+        INDArray features = Nd4j.create(batchSize, vectorLen, maxSeqLen);
+        INDArray labels = Nd4j.create(batchSize, numLabels, maxSeqLen);
+
+        //Because we deal with text of different lengths and only one output at the final time step, we use mask arrays.
+        //Mask arrays contain 1 if data is present at that time step for that example, or 0 if the entry is just padding.
+        INDArray featuresMask = Nd4j.zeros(batchSize, maxSeqLen);
+        INDArray labelsMask = Nd4j.zeros(batchSize, maxSeqLen);
+
+        // Optimizations to speed up this code block by reusing memory
+        int[] _2dIndex = new int[2];
+        int[] _3dIndex = new int[3];
+        INDArrayIndex[] _3dNdIndex = new INDArrayIndex[]{null, NDArrayIndex.all(), null};
+
+        for (int i = 0; i < batchSize && cursor < records.size(); i++, cursor++) {
+            _2dIndex[0] = i;
+            _3dIndex[0] = i;
+            _3dNdIndex[0] = NDArrayIndex.point(i);
+
+            try {
+                // Read
+                File file = records.get(cursor);
+                int labelIdx = this.labels.get(cursor);
+                String text = FileUtils.readFileToString(file);
+                // Tokenize and Filter
+                String[] tokens = tokenizer.apply(text);
+                tokens = Arrays.stream(tokens).filter(embedder::hasWord).toArray(String[]::new);
+                //Get word vectors for each word in review, and put them in the training data
+                int j;
+                for(j = 0; j < tokens.length && j < maxSeqLen; j++ ){
+                    String token = tokens[j];
+                    INDArray vector = embedder.toVector(token);
+                    _3dNdIndex[2] = NDArrayIndex.point(j);
+                    features.put(_3dNdIndex, vector);
+                    //Word is present (not padding) for this example + time step -> 1.0 in features mask
+                    _2dIndex[1] = j;
+                    featuresMask.putScalar(_2dIndex, 1.0);
+                }
+                int lastIdx = j - 1;
+                _2dIndex[1] = lastIdx;
+                _3dIndex[1] = labelIdx;
+                _3dIndex[2] = lastIdx;
+
+                labels.putScalar(_3dIndex,1.0);   //Set label: one of k encoding
+                // Specify that an output exists at the final time step for this example
+                labelsMask.putScalar(_2dIndex,1.0);
+            } catch (IOException e) {
+                throw new RuntimeException(e);
+            }
+        }
+        return new DataSet(features, labels, featuresMask, labelsMask);
+    }
+
+    @Override
+    public int totalExamples() {
+        return this.records.size();
+    }
+
+    @Override
+    public int inputColumns() {
+        return this.embedder.getVectorSize();
+    }
+
+    @Override
+    public int totalOutcomes() {
+        return this.numLabels;
+    }
+
+    @Override
+    public boolean resetSupported() {
+        return true;
+    }
+
+    @Override
+    public boolean asyncSupported() {
+        return false;
+    }
+
+    @Override
+    public void reset() {
+        assert this.records.size() == this.labels.size();
+        long seed = System.nanoTime(); // shuffle both the lists in the same order
+        Collections.shuffle(this.records, new Random(seed));
+        Collections.shuffle(this.labels, new Random(seed));
+        this.cursor = 0; // from beginning
+    }
+
+    @Override
+    public int batch() {
+        return this.batchSize;
+    }
+
+    @Override
+    public int cursor() {
+        return this.cursor;
+    }
+
+    @Override
+    public int numExamples() {
+        return totalExamples();
+    }
+
+    @Override
+    public void setPreProcessor(DataSetPreProcessor preProcessor) {
+        throw new UnsupportedOperationException();
+    }
+
+    @Override
+    public DataSetPreProcessor getPreProcessor() {
+        throw new UnsupportedOperationException();
+    }
+
+    @Override
+    public List<String> getLabels() {
+        return new ArrayList<>(this.labelToId.keySet());
+    }
+
+    @Override
+    public boolean hasNext() {
+        return cursor < totalExamples();
+    }
+
+    @Override
+    public DataSet next() {
+        return next(this.batchSize);
+    }
+}
diff --git a/opennlp-dl/src/main/java/opennlp/tools/dl/GlobalVectors.java b/opennlp-dl/src/main/java/opennlp/tools/dl/GlobalVectors.java
new file mode 100644
index 0000000..fdf3a95
--- /dev/null
+++ b/opennlp-dl/src/main/java/opennlp/tools/dl/GlobalVectors.java
@@ -0,0 +1,199 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *  http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package opennlp.tools.dl;
+
+import org.apache.commons.io.IOUtils;
+import org.nd4j.linalg.api.ndarray.INDArray;
+import org.nd4j.linalg.dataset.DataSet;
+import org.nd4j.linalg.factory.Nd4j;
+import org.nd4j.linalg.indexing.INDArrayIndex;
+import org.nd4j.linalg.indexing.NDArrayIndex;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.*;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+
+/**
+ * GlobalVectors (GloVe) for projecting words into a vector space.
+ * This tool utilizes word vectors pre-trained on large datasets.
+ *
+ * Visit https://nlp.stanford.edu/projects/glove/ for the full GloVe documentation.
+ *
+ * <h2>Usage</h2>
+ * <pre>
+ * String path = "work/datasets/glove.6B/glove.6B.100d.txt";
+ * int vocabSize = 20000; // max number of words to use
+ * GlobalVectors glove;
+ * try (InputStream stream = new FileInputStream(path)) {
+ *    glove = new GlobalVectors(stream, vocabSize);
+ * }
+ * </pre>
+ *
+ * @author Thamme Gowda (thammegowda@apache.org)
+ *
+ */
+public class GlobalVectors {
+
+    private static final Logger LOG = LoggerFactory.getLogger(GlobalVectors.class);
+
+    private final INDArray embeddings;
+    private final Map<String, Integer> wordToId;
+    private final List<String> idToWord;
+    private final int vectorSize;
+    private final int maxWords;
+
+    /**
+     * Reads GloVe vectors from the stream, using the full vocabulary
+     * @param stream GloVe word vectors stream (plain text)
+     * @throws IOException when the stream cannot be read
+     */
+    public GlobalVectors(InputStream stream) throws IOException {
+        this(stream, Integer.MAX_VALUE);
+    }
+
+    /**
+     * Reads GloVe vectors from the stream, keeping at most {@code maxWords} words
+     * @param stream vector stream
+     * @param maxWords maximum number of words to use, i.e. the vocabulary size
+     * @throws IOException when the stream cannot be read
+     */
+    public GlobalVectors(InputStream stream, int maxWords) throws IOException {
+        List<String> words = new ArrayList<>();
+        List<INDArray> vectors = new ArrayList<>();
+        int vectorSize = -1;
+        try (BufferedReader reader = new BufferedReader(new InputStreamReader(stream))){
+            String line;
+            while ((line = reader.readLine()) != null) {
+                String[] parts = line.split(" ");
+                if (vectorSize == -1) {
+                    vectorSize = parts.length - 1;
+                } else if (vectorSize != parts.length - 1) {
+                    throw new IOException("Inconsistent vector dimension: expected " + vectorSize
+                            + " but found " + (parts.length - 1));
+                }
+                float[] vector = new float[vectorSize];
+                for (int i = 1; i < parts.length; i++) {
+                    vector[i-1] = Float.parseFloat(parts[i]);
+                }
+                vectors.add(Nd4j.create(vector));
+                words.add(parts[0]);
+                if (words.size() >= maxWords) {
+                    LOG.info("Max words limit reached at {}; stopping read", words.size());
+                    break;
+                }
+            }
+            LOG.info("Found {} words; Vector dimensions={}", words.size(), vectorSize);
+            this.vectorSize = vectorSize;
+            this.maxWords = Math.min(words.size(), maxWords);
+            this.embeddings = Nd4j.create(vectors, new int[]{vectors.size(), vectorSize});
+            this.idToWord = words;
+            this.wordToId = new HashMap<>();
+            for (int i = 0; i < words.size(); i++) {
+                wordToId.put(words.get(i), i);
+            }
+        }
+    }
+
+    /**
+     * @return size or dimensions of vectors
+     */
+    public int getVectorSize() {
+        return vectorSize;
+    }
+
+    public int getMaxWords() {
+        return maxWords;
+    }
+
+    /**
+     * Checks whether the given word is in the vocabulary
+     * @param word the word to look up
+     * @return {@code true} if the word is known; {@code false} otherwise
+     */
+    public boolean hasWord(String word){
+        return wordToId.containsKey(word);
+    }
+
+    /**
+     * Converts a word to its vector
+     * @param word the word to be converted to a vector
+     * @return the vector if the word exists; {@code null} otherwise
+     */
+    public INDArray toVector(String word){
+        if (wordToId.containsKey(word)){
+            return embeddings.getRow(wordToId.get(word));
+        }
+        return null;
+    }
+
+    public INDArray embed(String text, int maxLen){
+        return embed(text.toLowerCase().split(" "), maxLen);
+    }
+
+    public INDArray embed(String[] tokens, int maxLen){
+        List<String> tokensFiltered = new ArrayList<>();
+        for(String t: tokens ){
+            if(hasWord(t)){
+                tokensFiltered.add(t);
+            }
+        }
+        int seqLen = Math.min(maxLen, tokensFiltered.size());
+
+        INDArray features = Nd4j.create(1, vectorSize, seqLen);
+
+        for( int j = 0; j < seqLen; j++ ){
+            String token = tokensFiltered.get(j);
+            INDArray vector = toVector(token);
+            features.put(new INDArrayIndex[]{NDArrayIndex.point(0), NDArrayIndex.all(), NDArrayIndex.point(j)}, vector);
+        }
+        return features;
+    }
+
+    public void writeOut(OutputStream stream, boolean closeStream) throws IOException {
+        writeOut(stream, "%.5f", closeStream);
+    }
+
+    public void writeOut(OutputStream stream,
+                         String floatPrecisionFormatString, boolean closeStream) throws IOException {
+        if (!Character.isWhitespace(floatPrecisionFormatString.charAt(0))) {
+            floatPrecisionFormatString = " " + floatPrecisionFormatString;
+        }
+        LOG.info("Writing {} vectors out, float precision {}", idToWord.size(), floatPrecisionFormatString);
+
+        PrintWriter out = new PrintWriter(stream);
+        try {
+            for (int i = 0; i < idToWord.size(); i++) {
+                out.printf("%s", idToWord.get(i));
+                INDArray row = embeddings.getRow(i);
+                for (int j = 0; j < vectorSize; j++) {
+                    out.printf(floatPrecisionFormatString, row.getDouble(j));
+                }
+                out.println();
+            }
+        } finally {
+            out.flush(); // flush buffered output even when the caller keeps the stream open
+            if (closeStream) {
+                IOUtils.closeQuietly(out);
+            } // else don't close: closing the PrintWriter would also close the underlying stream
+        }
+    }
+}
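The `GlobalVectors` constructor above reads one word per line: the token, followed by its vector components. A minimal standalone sketch of that per-line parsing, using plain arrays instead of ND4J (`GloveLineParser` is an illustrative helper, not part of the class):

```java
// Sketch of the per-line parsing done in the GlobalVectors constructor:
// each line is "<word> v1 v2 ... vN"; the first token is the word, the rest are floats.
class GloveLineParser {
    static Object[] parseGloveLine(String line) {
        String[] parts = line.split(" ");
        float[] vector = new float[parts.length - 1];
        for (int i = 1; i < parts.length; i++) {
            vector[i - 1] = Float.parseFloat(parts[i]);
        }
        // return the word and its vector as a pair
        return new Object[]{parts[0], vector};
    }
}
```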
diff --git a/opennlp-dl/src/main/java/opennlp/tools/dl/NeuralDocCat.java b/opennlp-dl/src/main/java/opennlp/tools/dl/NeuralDocCat.java
new file mode 100644
index 0000000..299a742
--- /dev/null
+++ b/opennlp-dl/src/main/java/opennlp/tools/dl/NeuralDocCat.java
@@ -0,0 +1,161 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *  http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package opennlp.tools.dl;
+
+import opennlp.tools.doccat.DocumentCategorizer;
+import opennlp.tools.tokenize.Tokenizer;
+import opennlp.tools.tokenize.WhitespaceTokenizer;
+import org.apache.commons.io.FileUtils;
+import org.apache.commons.lang3.NotImplementedException;
+import org.kohsuke.args4j.CmdLineException;
+import org.kohsuke.args4j.CmdLineParser;
+import org.kohsuke.args4j.Option;
+import org.nd4j.linalg.api.ndarray.INDArray;
+import org.nd4j.linalg.indexing.NDArrayIndex;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.File;
+import java.io.IOException;
+import java.util.*;
+
+/**
+ * An implementation of {@link DocumentCategorizer} using Neural Networks.
+ * This class provides prediction functionality from the model of {@link NeuralDocCatTrainer}.
+ *
+ */
+public class NeuralDocCat implements DocumentCategorizer {
+
+    private static final Logger LOG = LoggerFactory.getLogger(NeuralDocCat.class);
+
+    private NeuralDocCatModel model;
+
+    public NeuralDocCat(NeuralDocCatModel model) {
+        this.model = model;
+    }
+
+    @Override
+    public double[] categorize(String[] tokens) {
+        return categorize(tokens, Collections.emptyMap());
+    }
+
+    @Override
+    public double[] categorize(String[] text, Map<String, Object> extraInformation) {
+        INDArray seqFeatures = this.model.getGloves().embed(text, this.model.getMaxSeqLen());
+
+        INDArray networkOutput = this.model.getNetwork().output(seqFeatures);
+        int timeSeriesLength = networkOutput.size(2);
+        INDArray probsAtLastWord = networkOutput.get(NDArrayIndex.point(0),
+                NDArrayIndex.all(), NDArrayIndex.point(timeSeriesLength - 1));
+
+        int nLabels = this.model.getLabels().size();
+        double[] probs = new double[nLabels];
+        for (int i = 0; i < nLabels; i++) {
+            probs[i] = probsAtLastWord.getDouble(i);
+        }
+        return probs;
+    }
+
+    @Override
+    public String getBestCategory(double[] outcome) {
+        int maxIdx = 0;
+        double maxProb = outcome[0];
+        for (int i = 1; i < outcome.length; i++) {
+            if (outcome[i] > maxProb) {
+                maxIdx = i;
+                maxProb = outcome[i];
+            }
+        }
+        return model.getLabels().get(maxIdx);
+    }
+
+    @Override
+    public int getIndex(String category) {
+        return model.getLabels().indexOf(category);
+    }
+
+    @Override
+    public String getCategory(int index) {
+        return model.getLabels().get(index);
+    }
+
+    @Override
+    public int getNumberOfCategories() {
+        return model.getLabels().size();
+    }
+
+
+    @Override
+    public String getAllResults(double[] results) {
+        throw new NotImplementedException("Not implemented");
+    }
+
+    @Override
+    public Map<String, Double> scoreMap(String[] text) {
+        double[] scores = categorize(text);
+        Map<String, Double> result = new HashMap<>();
+        for (int i = 0; i < scores.length; i++) {
+            result.put(model.getLabels().get(i), scores[i]);
+        }
+        return result;
+    }
+
+    @Override
+    public SortedMap<Double, Set<String>> sortedScoreMap(String[] text) {
+        throw new NotImplementedException("Not implemented");
+    }
+
+    public static void main(String[] argss) throws CmdLineException, IOException {
+        class Args {
+
+            @Option(name = "-model", required = true, usage = "Path to NeuralDocCatModel stored file")
+            String modelPath;
+
+            @Option(name = "-files", required = true, usage = "One or more document paths whose category is " +
+                    "to be predicted by the model")
+            List<File> files;
+        }
+
+        Args args = new Args();
+        CmdLineParser parser = new CmdLineParser(args);
+        try {
+            parser.parseArgument(argss);
+        } catch (CmdLineException e) {
+            System.out.println(e.getMessage());
+            e.getParser().printUsage(System.out);
+            System.exit(1);
+        }
+
+        NeuralDocCatModel model = NeuralDocCatModel.loadModel(args.modelPath);
+        NeuralDocCat classifier = new NeuralDocCat(model);
+
+        System.out.println("Labels: " + model.getLabels());
+        Tokenizer tokenizer = WhitespaceTokenizer.INSTANCE;
+
+        for (File file: args.files) {
+            String text = FileUtils.readFileToString(file);
+            String[] tokens = tokenizer.tokenize(text.toLowerCase());
+            double[] probs = classifier.categorize(tokens);
+            System.out.println(">> " + file);
+            System.out.println("Probabilities: " + Arrays.toString(probs));
+        }
+
+    }
+}
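`getBestCategory()` above is a plain argmax scan over the outcome probabilities. The same logic can be sketched in isolation; `BestCategory` is a hypothetical helper used only for illustration:

```java
import java.util.Arrays;
import java.util.List;

// Standalone sketch of the argmax scan used by getBestCategory():
// pick the label whose probability is highest.
class BestCategory {
    static String pick(double[] outcome, List<String> labels) {
        int maxIdx = 0;
        for (int i = 1; i < outcome.length; i++) {
            if (outcome[i] > outcome[maxIdx]) {
                maxIdx = i; // track the index of the current maximum
            }
        }
        return labels.get(maxIdx);
    }
}
```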
diff --git a/opennlp-dl/src/main/java/opennlp/tools/dl/NeuralDocCatModel.java b/opennlp-dl/src/main/java/opennlp/tools/dl/NeuralDocCatModel.java
new file mode 100644
index 0000000..f1b6247
--- /dev/null
+++ b/opennlp-dl/src/main/java/opennlp/tools/dl/NeuralDocCatModel.java
@@ -0,0 +1,179 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *  http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package opennlp.tools.dl;
+
+import org.apache.commons.io.IOUtils;
+import org.apache.commons.lang3.StringUtils;
+import org.deeplearning4j.nn.conf.MultiLayerConfiguration;
+import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;
+import org.nd4j.linalg.api.ndarray.INDArray;
+import org.nd4j.linalg.factory.Nd4j;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.*;
+import java.util.*;
+import java.util.zip.ZipEntry;
+import java.util.zip.ZipInputStream;
+import java.util.zip.ZipOutputStream;
+
+/**
+ * This class is a wrapper for DL4J's {@link MultiLayerNetwork} and {@link GlobalVectors}
+ * that provides features to serialize and deserialize the necessary data to and from a zip file.
+ *
+ * It can be used by a neural trainer tool to serialize the network, and by a predictor tool
+ * to restore the same network with its weights.
+ *
+ * @author Thamme Gowda (thammegowda@apache.org)
+ */
+public class NeuralDocCatModel {
+
+    public static final int VERSION = 1;
+    public static final String MODEL_NAME = NeuralDocCatModel.class.getName();
+    public static final String MANIFEST = "model.mf";
+    public static final String NETWORK = "network.json";
+    public static final String WEIGHTS = "weights.bin";
+    public static final String GLOVES = "gloves.tsv";
+    public static final String LABELS = "labels";
+    public static final String MAX_SEQ_LEN = "maxSeqLen";
+
+    private static final Logger LOG = LoggerFactory.getLogger(NeuralDocCatModel.class);
+
+    private final MultiLayerNetwork network;
+    private final GlobalVectors gloves;
+    private final Properties manifest;
+    private final List<String> labels;
+    private final int maxSeqLen;
+
+    /**
+     *
+     * @param stream Input stream of a Zip File
+     * @throws IOException
+     */
+    public NeuralDocCatModel(InputStream stream) throws IOException {
+        ZipInputStream zipIn = new ZipInputStream(stream);
+
+        Properties manifest = null;
+        MultiLayerNetwork model = null;
+        INDArray params = null;
+        GlobalVectors gloves = null;
+        ZipEntry entry;
+        while ((entry = zipIn.getNextEntry()) != null) {
+            String name = entry.getName();
+            switch (name) {
+                case MANIFEST:
+                    manifest = new Properties();
+                    manifest.load(zipIn);
+                    break;
+                case NETWORK:
+                    String json = IOUtils.toString(new UnclosableInputStream(zipIn));
+                    model = new MultiLayerNetwork(MultiLayerConfiguration.fromJson(json));
+                    break;
+                case WEIGHTS:
+                    params = Nd4j.read(new DataInputStream(new UnclosableInputStream(zipIn)));
+                    break;
+                case GLOVES:
+                    gloves = new GlobalVectors(new UnclosableInputStream(zipIn));
+                    break;
+                default:
+                    LOG.warn("Unexpected entry in the zip : {}", name);
+            }
+        }
+
+        if (model == null || manifest == null) {
+            throw new IOException("Invalid model stream: missing " + NETWORK + " or " + MANIFEST + " entry");
+        }
+        model.init(params, false);
+        this.network = model;
+        this.manifest = manifest;
+        this.gloves = gloves;
+
+        assert manifest.containsKey(LABELS);
+        String[] labels = manifest.getProperty(LABELS).split(",");
+        this.labels = Collections.unmodifiableList(Arrays.asList(labels));
+
+        assert manifest.containsKey(MAX_SEQ_LEN);
+        this.maxSeqLen = Integer.parseInt(manifest.getProperty(MAX_SEQ_LEN));
+
+    }
+
+    /**
+     *
+     * @param network any compatible multi layer neural network
+     * @param vectors Global vectors
+     * @param labels list of labels
+     * @param maxSeqLen max sequence length
+     */
+    public NeuralDocCatModel(MultiLayerNetwork network, GlobalVectors vectors, List<String> labels, int maxSeqLen) {
+        this.network = network;
+        this.gloves = vectors;
+        this.manifest = new Properties();
+        this.manifest.setProperty(LABELS, StringUtils.join(labels, ","));
+        this.manifest.setProperty(MAX_SEQ_LEN, maxSeqLen + "");
+        this.labels = Collections.unmodifiableList(labels);
+        this.maxSeqLen = maxSeqLen;
+    }
+
+    public MultiLayerNetwork getNetwork() {
+        return network;
+    }
+
+    public GlobalVectors getGloves() {
+        return gloves;
+    }
+
+    public List<String> getLabels() {
+        return labels;
+    }
+
+    public int getMaxSeqLen() {
+        return this.maxSeqLen;
+    }
+
+    /**
+     * Zips the current state of the model and writes it to the stream
+     * @param stream stream to write to
+     * @throws IOException when the stream cannot be written
+     */
+    public void saveModel(OutputStream stream) throws IOException {
+        try (ZipOutputStream zipOut = new ZipOutputStream(new BufferedOutputStream(stream))) {
+            // Write out manifest
+            zipOut.putNextEntry(new ZipEntry(MANIFEST));
+
+            String comments = "Created-By:" + System.getenv("USER") + " at " + new Date().toString()
+                    + "\nModel-Version: " + VERSION
+                    + "\nModel-Schema:" + MODEL_NAME;
+
+            manifest.store(zipOut, comments);
+            zipOut.closeEntry();
+
+            // Write out the network
+            zipOut.putNextEntry(new ZipEntry(NETWORK));
+            byte[] jModel = network.getLayerWiseConfigurations().toJson().getBytes();
+            zipOut.write(jModel);
+            zipOut.closeEntry();
+
+            //Write out the network coefficients
+            zipOut.putNextEntry(new ZipEntry(WEIGHTS));
+            Nd4j.write(network.params(), new DataOutputStream(zipOut));
+            zipOut.closeEntry();
+
+            // Write out vectors
+            zipOut.putNextEntry(new ZipEntry(GLOVES));
+            gloves.writeOut(zipOut, false);
+            zipOut.closeEntry();
+
+            zipOut.finish();
+        }
+    }
+
+    /**
+     * Creates a model from a file on the local file system
+     * @param modelPath path to model file
+     * @return an instance of this class
+     * @throws IOException
+     */
+    public static NeuralDocCatModel loadModel(String modelPath) throws IOException {
+        try (InputStream modelStream = new FileInputStream(modelPath)) {
+            return new NeuralDocCatModel(modelStream);
+        }
+    }
+}
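`saveModel()` and the stream constructor above round-trip the model through named zip entries. A minimal sketch of that mechanism, reduced to just the manifest entry (`model.mf`) and plain `java.util.Properties`; `ManifestZipDemo` is a hypothetical helper, not the real class:

```java
import java.io.*;
import java.util.Properties;
import java.util.zip.*;

// Minimal sketch of the zip round-trip used by NeuralDocCatModel, for the manifest entry only.
class ManifestZipDemo {
    static byte[] save(Properties manifest) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zipOut = new ZipOutputStream(bytes)) {
            zipOut.putNextEntry(new ZipEntry("model.mf"));
            manifest.store(zipOut, "sketch"); // Properties serializes itself as key=value lines
            zipOut.closeEntry();
        }
        return bytes.toByteArray();
    }

    static Properties load(byte[] data) throws IOException {
        Properties manifest = new Properties();
        try (ZipInputStream zipIn = new ZipInputStream(new ByteArrayInputStream(data))) {
            ZipEntry entry;
            while ((entry = zipIn.getNextEntry()) != null) {
                if ("model.mf".equals(entry.getName())) {
                    // reads until the end of the current entry; does not close the zip stream
                    manifest.load(zipIn);
                }
            }
        }
        return manifest;
    }
}
```

The real class applies the same entry-by-entry dispatch to the network JSON, the weights, and the GloVe vectors, wrapping `zipIn` so that library readers cannot close it prematurely.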
diff --git a/opennlp-dl/src/main/java/opennlp/tools/dl/NeuralDocCatTrainer.java b/opennlp-dl/src/main/java/opennlp/tools/dl/NeuralDocCatTrainer.java
new file mode 100644
index 0000000..9ce3a3f
--- /dev/null
+++ b/opennlp-dl/src/main/java/opennlp/tools/dl/NeuralDocCatTrainer.java
@@ -0,0 +1,253 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *  http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package opennlp.tools.dl;
+
+import org.deeplearning4j.eval.Evaluation;
+import org.deeplearning4j.nn.conf.GradientNormalization;
+import org.deeplearning4j.nn.conf.MultiLayerConfiguration;
+import org.deeplearning4j.nn.conf.NeuralNetConfiguration;
+import org.deeplearning4j.nn.conf.Updater;
+import org.deeplearning4j.nn.conf.layers.GravesLSTM;
+import org.deeplearning4j.nn.conf.layers.RnnOutputLayer;
+import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;
+import org.deeplearning4j.nn.weights.WeightInit;
+import org.deeplearning4j.optimize.listeners.ScoreIterationListener;
+import org.kohsuke.args4j.CmdLineException;
+import org.kohsuke.args4j.CmdLineParser;
+import org.kohsuke.args4j.Option;
+import org.kohsuke.args4j.spi.StringArrayOptionHandler;
+import org.nd4j.linalg.activations.Activation;
+import org.nd4j.linalg.api.ndarray.INDArray;
+import org.nd4j.linalg.dataset.DataSet;
+import org.nd4j.linalg.learning.config.RmsProp;
+import org.nd4j.linalg.lossfunctions.LossFunctions;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.*;
+import java.util.List;
+
+
+/**
+ * This class provides functionality to construct and train neural networks that can be used for
+ * {@link opennlp.tools.doccat.DocumentCategorizer}
+ *
+ * @see NeuralDocCat
+ * @see NeuralDocCatModel
+ * @author Thamme Gowda (thammegowda@apache.org)
+ */
+public class NeuralDocCatTrainer {
+
+    public static class Args {
+
+        @Option(name = "-batchSize", usage = "Number of examples in minibatch")
+        int batchSize = 128;
+
+        @Option(name = "-nEpochs", usage = "Number of epochs (i.e. full passes over the training data) to train on." +
+                " Applicable for training only.")
+        int nEpochs = 2;
+
+        @Option(name = "-maxSeqLen", usage = "Max Sequence Length. Sequences longer than this will be truncated")
+        int maxSeqLen = 256;    //Truncate text with length (# words) greater than this
+
+        @Option(name = "-vocabSize", usage = "Vocabulary Size.")
+        int vocabSize = 20000;   //vocabulary size
+
+        @Option(name = "-nRNNUnits", usage = "Number of RNN cells to use.")
+        int nRNNUnits = 128;
+
+        @Option(name = "-lr", aliases = "-learnRate", usage = "Learning Rate." +
+                " Adjust it when the scores bounce to NaN or Infinity.")
+        double learningRate = 2e-3;
+
+        @Option(name = "-glovesPath", required = true, usage = "Path to GloVe vectors file." +
+                " Download and unzip from https://nlp.stanford.edu/projects/glove/")
+        String glovesPath = null;
+
+        @Option(name = "-modelPath", required = true, usage = "Path to model file. " +
+                "This will be used for serializing the model after the training phase." )
+        String modelPath = null;
+
+        @Option(name = "-trainDir", required = true, usage = "Path to train data directory." +
+                " Setting this value will take the system to training mode. ")
+        String trainDir = null;
+
+        @Option(name = "-validDir", usage = "Path to validation data directory. Optional.")
+        String validDir = null;
+
+        @Option(name = "-labels", required = true, handler = StringArrayOptionHandler.class,
+                usage = "Names of targets or labels separated by spaces. " +
+                        "The order of labels matters. Make sure to use the same sequence for training and predicting. " +
+                        "Also, these names should match subdirectory names of -trainDir and -validDir when those are " +
+                        "applicable. \n Example -labels pos neg")
+        List<String> labels = null;
+
+        @Override
+        public String toString() {
+            return "Args{" +
+                    "batchSize=" + batchSize +
+                    ", nEpochs=" + nEpochs +
+                    ", maxSeqLen=" + maxSeqLen +
+                    ", vocabSize=" + vocabSize +
+                    ", learningRate=" + learningRate +
+                    ", nRNNUnits=" + nRNNUnits +
+                    ", glovesPath='" + glovesPath + '\'' +
+                    ", modelPath='" + modelPath + '\'' +
+                    ", trainDir='" + trainDir + '\'' +
+                    ", validDir='" + validDir + '\'' +
+                    ", labels=" + labels +
+                    '}';
+        }
+    }
+
+    private static final Logger LOG = LoggerFactory.getLogger(NeuralDocCatTrainer.class);
+
+    private NeuralDocCatModel model;
+    private Args args;
+    private DataReader trainSet;
+    private DataReader validSet;
+
+
+    public NeuralDocCatTrainer(Args args) throws IOException {
+        this.args = args;
+        GlobalVectors gloves;
+        MultiLayerNetwork network;
+
+        try (InputStream stream = new FileInputStream(args.glovesPath)) {
+            gloves = new GlobalVectors(stream, args.vocabSize);
+        }
+
+        LOG.info("Training data from {}", args.trainDir);
+        this.trainSet = new DataReader(args.trainDir, args.labels, gloves, args.batchSize, args.maxSeqLen);
+        if (args.validDir != null) {
+            LOG.info("Validation data from {}", args.validDir);
+            this.validSet = new DataReader(args.validDir, args.labels, gloves, args.batchSize, args.maxSeqLen);
+        }
+
+        //create network
+        network = this.createNetwork(gloves.getVectorSize());
+        this.model = new NeuralDocCatModel(network, gloves, args.labels, args.maxSeqLen);
+    }
+
+    public MultiLayerNetwork createNetwork(int vectorSize) {
+        int totalOutcomes = this.trainSet.totalOutcomes();
+        assert totalOutcomes >= 2;
+        LOG.info("Number of classes " + totalOutcomes);
+
+        //TODO: the below network params should be configurable from CLI or settings file
+        //Set up network configuration
+        MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
+                .updater(new RmsProp(0.9)) // ADAM .adamMeanDecay(0.9).adamVarDecay(0.999)
+                .regularization(true).l2(1e-5)
+                .weightInit(WeightInit.XAVIER)
+                .gradientNormalization(GradientNormalization.ClipElementWiseAbsoluteValue)
+                .gradientNormalizationThreshold(1.0)
+                .learningRate(args.learningRate)
+                .list()
+                .layer(0, new GravesLSTM.Builder()
+                        .nIn(vectorSize)
+                        .nOut(args.nRNNUnits)
+                        .activation(Activation.RELU).build())
+                .layer(1, new RnnOutputLayer.Builder()
+                        .nIn(args.nRNNUnits)
+                        .nOut(totalOutcomes)
+                        .activation(Activation.SOFTMAX)
+                        .lossFunction(LossFunctions.LossFunction.MCXENT)
+                        .build())
+                .pretrain(false)
+                .backprop(true)
+                .build();
+
+        MultiLayerNetwork net = new MultiLayerNetwork(conf);
+        net.init();
+        net.setListeners(new ScoreIterationListener(1));
+        return net;
+    }
+
+    public void train() {
+        train(args.nEpochs, this.trainSet, this.validSet);
+    }
+
+    /**
+     * Trains model
+     *
+     * @param nEpochs    number of epochs (i.e. iterations over the training dataset)
+     * @param train      training data set
+     * @param validation validation data set for evaluation after each epoch.
+     *                   Setting this to null will skip the evaluation
+     */
+    public void train(int nEpochs, DataReader train, DataReader validation) {
+        assert model != null;
+        assert train != null;
+        LOG.info("Starting training...\nTotal epochs={}, Training Size={}, Validation Size={}", nEpochs,
+                train.totalExamples(), validation == null ? null : validation.totalExamples());
+        for (int i = 0; i < nEpochs; i++) {
+            model.getNetwork().fit(train);
+            train.reset();
+            LOG.info("Epoch {} complete", i);
+
+            if (validation != null) {
+                LOG.info("Starting evaluation");
+                //Run evaluation over the full validation set; this can take some time
+                Evaluation evaluation = new Evaluation();
+                while (validation.hasNext()) {
+                    DataSet t = validation.next();
+                    INDArray features = t.getFeatureMatrix();
+                    INDArray labels = t.getLabels();
+                    INDArray inMask = t.getFeaturesMaskArray();
+                    INDArray outMask = t.getLabelsMaskArray();
+                    INDArray predicted = this.model.getNetwork().output(features, false, inMask, outMask);
+                    evaluation.evalTimeSeries(labels, predicted, outMask);
+                }
+                validation.reset();
+                LOG.info(evaluation.stats());
+            }
+        }
+    }
+
+    /**
+     * Saves the model to specified path
+     *
+     * @param path model path
+     * @throws IOException
+     */
+    public void saveModel(String path) throws IOException {
+        assert model != null;
+        LOG.info("Saving the model at {}", path);
+        try (OutputStream stream = new FileOutputStream(path)) {
+            model.saveModel(stream);
+        }
+    }
+
+    /**
+     * <pre>
+     *   # Download pre-trained GloVe vectors (this is a large file)
+     *   wget http://nlp.stanford.edu/data/glove.6B.zip
+     *   unzip glove.6B.zip -d glove.6B
+     *
+     *   # Download the dataset
+     *   wget http://ai.stanford.edu/~amaas/data/sentiment/aclImdb_v1.tar.gz
+     *   tar xzf aclImdb_v1.tar.gz
+     *
+     *   mvn compile exec:java
+     *    -Dexec.mainClass=opennlp.tools.dl.NeuralDocCatTrainer
+     *    -Dexec.args="-glovesPath $HOME/work/datasets/glove.6B/glove.6B.100d.txt
+     *    -labels pos neg -modelPath imdb-sentiment-neural-model.zip
+     *    -trainDir $HOME/work/datasets/aclImdb/train -lr 0.001"
+     * </pre>
+     */
+    public static void main(String[] argss) throws CmdLineException, IOException {
+        Args args = new Args();
+        CmdLineParser parser = new CmdLineParser(args);
+        try {
+            parser.parseArgument(argss);
+        } catch (CmdLineException e) {
+            System.out.println(e.getMessage());
+            e.getParser().printUsage(System.out);
+            System.exit(1);
+        }
+        NeuralDocCatTrainer classifier = new NeuralDocCatTrainer(args);
+        classifier.train();
+        classifier.saveModel(args.modelPath);
+    }
+
+}
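The `train()` method above follows a fixed pattern: one `fit` pass per epoch, reset the iterator, then optionally evaluate. That control flow can be sketched independently of DL4J; `Trainable` and `EpochLoop` are hypothetical stand-ins for `MultiLayerNetwork` plus its iterators, used only for illustration:

```java
// Schematic of the epoch loop in NeuralDocCatTrainer.train():
// fit one full pass per epoch, then (optionally) evaluate after each epoch.
interface Trainable {
    void fitOnePass();   // one full pass over the training iterator, then reset
    double evaluate();   // evaluation over the validation iterator
}

class EpochLoop {
    static double run(Trainable model, int nEpochs, boolean validate) {
        double lastScore = Double.NaN;
        for (int epoch = 0; epoch < nEpochs; epoch++) {
            model.fitOnePass();
            if (validate) {
                lastScore = model.evaluate(); // as in train(), validation runs every epoch
            }
        }
        return lastScore;
    }
}
```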
diff --git a/opennlp-dl/src/main/java/opennlp/tools/dl/RNN.java b/opennlp-dl/src/main/java/opennlp/tools/dl/RNN.java
index 417b98c..e297cc5 100644
--- a/opennlp-dl/src/main/java/opennlp/tools/dl/RNN.java
+++ b/opennlp-dl/src/main/java/opennlp/tools/dl/RNN.java
@@ -50,7 +50,7 @@
 public class RNN {
 
   // hyperparameters
-  protected final float learningRate; // size of hidden layer of neurons
+  protected float learningRate;
   protected final int seqLength; // no. of steps to unroll the RNN for
   protected final int hiddenLayerSize;
   protected final int epochs;
@@ -60,7 +60,8 @@
   protected final Map<String, Integer> charToIx;
   protected final Map<Integer, String> ixToChar;
   protected final List<String> data;
-  private final static double reg = 1e-8;
+  private final static double eps = 1e-8;
+  private final static double decay = 0.9;
 
   // model parameters
   private final INDArray wxh; // input to hidden
@@ -171,23 +172,23 @@
         System.out.printf("iter %d, loss: %f\n", n, smoothLoss); // print progress
       }
 
-      if (n% batch == 0) {
+      if (n % batch == 0) {
 
-        // perform parameter update with Adagrad
-        mWxh.addi(dWxh.mul(dWxh));
-        wxh.subi((dWxh.mul(learningRate)).div(Transforms.sqrt(mWxh).add(reg)));
+        // perform parameter update with RMSprop
+        mWxh = mWxh.mul(decay).add(dWxh.mul(dWxh).mul(1 - decay));
+        wxh.subi(dWxh.mul(learningRate).div(Transforms.sqrt(mWxh).add(eps)));
 
-        mWhh.addi(dWhh.mul(dWhh));
-        whh.subi(dWhh.mul(learningRate).div(Transforms.sqrt(mWhh).add(reg)));
+        mWhh = mWhh.mul(decay).add(dWhh.mul(dWhh).mul(1 - decay));
+        whh.subi(dWhh.mul(learningRate).div(Transforms.sqrt(mWhh).add(eps)));
 
-        mWhy.addi(dWhy.mul(dWhy));
-        why.subi(dWhy.mul(learningRate).div(Transforms.sqrt(mWhy).add(reg)));
+        mWhy = mWhy.mul(decay).add(dWhy.mul(dWhy).mul(1 - decay));
+        why.subi(dWhy.mul(learningRate).div(Transforms.sqrt(mWhy).add(eps)));
 
-        mbh.addi(dbh.mul(dbh));
-        bh.subi(dbh.mul(learningRate).div(Transforms.sqrt(mbh).add(reg)));
+        mbh = mbh.mul(decay).add(dbh.mul(dbh).mul(1 - decay));
+        bh.subi(dbh.mul(learningRate).div(Transforms.sqrt(mbh).add(eps)));
 
-        mby.addi(dby.mul(dby));
-        by.subi(dby.mul(learningRate).div(Transforms.sqrt(mby).add(reg)));
+        mby = mby.mul(decay).add(dby.mul(dby).mul(1 - decay));
+        by.subi(dby.mul(learningRate).div(Transforms.sqrt(mby).add(eps)));
       }
 
       p += seqLength; // move data pointer
@@ -245,7 +246,7 @@
         ps = init(inputs.length(), pst.shape());
       }
       ps.putRow(t, pst);
-      loss += -Math.log(pst.getDouble(targets.getInt(t))); // softmax (cross-entropy loss)
+      loss += -Math.log(pst.getDouble(targets.getInt(t), 0)); // softmax (cross-entropy loss)
     }
 
     // backward pass: compute gradients going backwards
@@ -286,7 +287,7 @@
 
     INDArray x = Nd4j.zeros(vocabSize, 1);
     x.putScalar(seedIx, 1);
-    int sampleSize = 2 * seqLength;
+    int sampleSize = 144;
     INDArray ixes = Nd4j.create(sampleSize);
 
     INDArray h = hPrev.dup();
@@ -300,13 +301,16 @@
       for (int pi = 0; pi < vocabSize; pi++) {
         d.add(new Pair<>(pi, pm.getDouble(0, pi)));
       }
-      EnumeratedDistribution<Integer> distribution = new EnumeratedDistribution<>(d);
+      try {
+        EnumeratedDistribution<Integer> distribution = new EnumeratedDistribution<>(d);
 
-      int ix = distribution.sample();
+        int ix = distribution.sample();
 
-      x = Nd4j.zeros(vocabSize, 1);
-      x.putScalar(ix, 1);
-      ixes.putScalar(t, ix);
+        x = Nd4j.zeros(vocabSize, 1);
+        x.putScalar(ix, 1);
+        ixes.putScalar(t, ix);
+      } catch (Exception e) { // ignore: sampling may fail transiently if probabilities underflow
+      }
     }
 
     return getSampleString(ixes);
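
The sampling loop above delegates the categorical draw to Commons Math's `EnumeratedDistribution`. What that draw does can be sketched in a few lines (a hypothetical plain-Java version for illustration only):

```java
// Inverse-CDF sampling from a categorical distribution: walk the cumulative
// probabilities until they exceed a uniform draw u in [0, 1).
class CategoricalSampler {
    static int sample(double[] probs, double u) {
        double cumulative = 0.0;
        for (int i = 0; i < probs.length; i++) {
            cumulative += probs[i];
            if (u < cumulative) {
                return i;
            }
        }
        // guard against rounding error leaving u just above the final sum
        return probs.length - 1;
    }
}
```

`EnumeratedDistribution` additionally validates the probability mass function at construction time, which is why the diff wraps it in a try/catch.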
@@ -333,14 +337,14 @@
   @Override
   public String toString() {
     return getClass().getName() + "{" +
-            "learningRate=" + learningRate +
-            ", seqLength=" + seqLength +
-            ", hiddenLayerSize=" + hiddenLayerSize +
-            ", epochs=" + epochs +
-            ", vocabSize=" + vocabSize +
-            ", useChars=" + useChars +
-            ", batch=" + batch +
-            '}';
+        "learningRate=" + learningRate +
+        ", seqLength=" + seqLength +
+        ", hiddenLayerSize=" + hiddenLayerSize +
+        ", epochs=" + epochs +
+        ", vocabSize=" + vocabSize +
+        ", useChars=" + useChars +
+        ", batch=" + batch +
+        '}';
   }
 
   public void serialize(String prefix) throws IOException {
diff --git a/opennlp-dl/src/main/java/opennlp/tools/dl/StackedRNN.java b/opennlp-dl/src/main/java/opennlp/tools/dl/StackedRNN.java
index e9a5f7e..fe56d8f 100644
--- a/opennlp-dl/src/main/java/opennlp/tools/dl/StackedRNN.java
+++ b/opennlp-dl/src/main/java/opennlp/tools/dl/StackedRNN.java
@@ -55,8 +55,8 @@
   private final INDArray bh2; // hidden2 bias
   private final INDArray by; // output bias
 
-  private final double eps = 1e-4;
-  private final double decay = 0.9;
+  private final double eps = 1e-8;
+  private final double decay = 0.95;
   private final boolean rmsProp;
 
   private INDArray hPrev = null; // memory state
@@ -137,9 +137,14 @@
 
       // forward seqLength characters through the net and fetch gradient
       double loss = lossFun(inputs, targets, dWxh, dWhh, dWxh2, dWhh2, dWh2y, dbh, dbh2, dby);
-      smoothLoss = smoothLoss * 0.999 + loss * 0.001;
+      double newLoss = smoothLoss * 0.999 + loss * 0.001;
+
+      if (newLoss > smoothLoss) {
+        learningRate *= 0.999;
+      }
+      smoothLoss = newLoss;
       if (Double.isNaN(smoothLoss) || Double.isInfinite(smoothLoss)) {
-        System.out.println("loss is " + smoothLoss + " (over/underflow occured, try adjusting hyperparameters)");
+        System.out.println("loss is " + smoothLoss + " (" + loss + ") (over/underflow occurred, try adjusting hyperparameters)");
         break;
       }
       if (n % 100 == 0) {
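
The hunk above introduces a simple loss-driven learning-rate schedule: the smoothed loss is an exponential moving average, and the learning rate is shrunk slightly whenever that average rises. A standalone sketch (constants mirror the diff; names are illustrative):

```java
// One bookkeeping step of the smoothed-loss / learning-rate-decay logic.
class LrDecaySketch {
    // Returns {newSmoothLoss, newLearningRate}.
    static double[] step(double smoothLoss, double loss, double learningRate) {
        // exponential moving average of the raw loss
        double newLoss = smoothLoss * 0.999 + loss * 0.001;
        if (newLoss > smoothLoss) {
            learningRate *= 0.999; // back off when the smoothed loss is not improving
        }
        return new double[] {newLoss, learningRate};
    }
}
```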
@@ -252,7 +257,7 @@
       }
       ps.putRow(t, pst);
 
-      loss += -Math.log(pst.getDouble(targets.getInt(t),0)); // softmax (cross-entropy loss)
+      loss += -Math.log(pst.getDouble(targets.getInt(t), 0)); // softmax (cross-entropy loss)
     }
 
     // backward pass: compute gradients going backwards
diff --git a/opennlp-dl/src/main/java/opennlp/tools/dl/UnclosableInputStream.java b/opennlp-dl/src/main/java/opennlp/tools/dl/UnclosableInputStream.java
new file mode 100644
index 0000000..701fc48
--- /dev/null
+++ b/opennlp-dl/src/main/java/opennlp/tools/dl/UnclosableInputStream.java
@@ -0,0 +1,56 @@
+package opennlp.tools.dl;
+
+import java.io.IOException;
+import java.io.InputStream;
+import java.io.Reader;
+import java.io.Writer;
+
+/**
+ * This class offers a wrapper for {@link InputStream};
+ * The sole purpose of this wrapper is to bypass the close calls that are usually
+ * propagated from the readers.
+ * A use case of this wrapper is for reading multiple files from the {@link java.util.zip.ZipInputStream},
+ * especially because tools like {@link org.apache.commons.io.IOUtils#copy(Reader, Writer)}
+ * and {@link org.nd4j.linalg.factory.Nd4j#read(InputStream)} automatically close the input stream.
+ *
+ * Note:
+ *  1. This wrapper ignores calls to the {@link #close()} method
+ *  2. Remember to call {@link #forceClose()} when the inner stream needs to be closed
+ *  3. This wrapper doesn't hold any resources; if you close the innerStream, you can safely ignore closing this wrapper
+ *
+ * @author Thamme Gowda (thammegowda@apache.org)
+ */
+public class UnclosableInputStream extends InputStream {
+
+    private InputStream innerStream;
+
+    public UnclosableInputStream(InputStream stream){
+        this.innerStream = stream;
+    }
+
+    @Override
+    public int read() throws IOException {
+        return innerStream.read();
+    }
+
+    /**
+     * NOP - Does not close the stream - intentional
+     * @throws IOException
+     */
+    @Override
+    public void close() throws IOException {
+        // intentionally ignored;
+        // Use forceClose() when needed to close
+    }
+
+    /**
+     * Closes the stream
+     * @throws IOException
+     */
+    public void forceClose() throws IOException {
+        if (innerStream != null) {
+            innerStream.close();
+            innerStream = null;
+        }
+    }
+}
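
The use case the javadoc describes — reading several entries from one `ZipInputStream` through utilities that close their input — can be demonstrated with a self-contained sketch (the non-closing wrapper is reproduced inline so the example compiles on its own; `readAndClose` stands in for tools like `IOUtils.copy` or `Nd4j.read`):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

class UnclosableDemo {
    // Same idea as UnclosableInputStream above, inlined for self-containment.
    static class NonClosing extends InputStream {
        private final InputStream inner;
        NonClosing(InputStream inner) { this.inner = inner; }
        @Override public int read() throws IOException { return inner.read(); }
        @Override public void close() { /* intentionally a no-op */ }
    }

    // Simulates a utility (like IOUtils.copy or Nd4j.read) that closes its input.
    static String readAndClose(InputStream in) throws IOException {
        StringBuilder sb = new StringBuilder();
        int b;
        while ((b = in.read()) != -1) {
            sb.append((char) b);
        }
        in.close(); // harmless: the wrapper swallows this
        return sb.toString();
    }

    static List<String> readAllEntries(byte[] zipBytes) throws IOException {
        List<String> contents = new ArrayList<>();
        ZipInputStream zis = new ZipInputStream(new ByteArrayInputStream(zipBytes));
        InputStream guarded = new NonClosing(zis);
        while (zis.getNextEntry() != null) {
            // without the wrapper, the first close() call would kill the whole zip;
            // with it, the next entry remains readable
            contents.add(readAndClose(guarded));
        }
        zis.close();
        return contents;
    }

    // Builds a two-entry zip in memory for the demonstration.
    static byte[] makeZip() throws IOException {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        try (ZipOutputStream zos = new ZipOutputStream(baos)) {
            zos.putNextEntry(new ZipEntry("a.txt"));
            zos.write("alpha".getBytes());
            zos.closeEntry();
            zos.putNextEntry(new ZipEntry("b.txt"));
            zos.write("beta".getBytes());
            zos.closeEntry();
        }
        return baos.toByteArray();
    }
}
```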
diff --git a/opennlp-dl/src/test/java/opennlp/tools/dl/NeuralDocCatTest.java b/opennlp-dl/src/test/java/opennlp/tools/dl/NeuralDocCatTest.java
new file mode 100644
index 0000000..9d0cb83
--- /dev/null
+++ b/opennlp-dl/src/test/java/opennlp/tools/dl/NeuralDocCatTest.java
@@ -0,0 +1,67 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *  http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package opennlp.tools.dl;
+
+import java.util.Arrays;
+import java.util.Map;
+
+import org.junit.Ignore;
+import org.junit.Test;
+
+import static org.junit.Assert.assertNotNull;
+import static org.junit.Assert.assertTrue;
+
+/**
+ * Tests for {@link NeuralDocCat}
+ */
+@Ignore
+public class NeuralDocCatTest {
+
+  @Test
+  public void testDocCatTrainingOnTweets() throws Exception {
+    NeuralDocCatTrainer.Args args = new NeuralDocCatTrainer.Args();
+    args.glovesPath = "/path/to/glove.6B/glove.6B.50d.txt";
+    args.labels = Arrays.asList("0", "1");
+    String modelPathPrefix = "target/ndcmodel";
+    args.modelPath = modelPathPrefix + ".out";
+    args.trainDir = getClass().getResource("/ltweets").getFile();
+    NeuralDocCatTrainer trainer = new NeuralDocCatTrainer(args);
+    trainer.train();
+    String modelPath = modelPathPrefix + ".zip";
+    trainer.saveModel(modelPath);
+
+    /* TODO : this fails with:
+     * java.lang.AssertionError
+     * at opennlp.tools.dl.GlobalVectors.<init>(GlobalVectors.java:92)
+     */
+    NeuralDocCatModel neuralDocCatModel = NeuralDocCatModel.loadModel(modelPath);
+    assertNotNull(neuralDocCatModel);
+
+    NeuralDocCat neuralDocCat = new NeuralDocCat(neuralDocCatModel);
+    Map<String, Double> scoreMap = neuralDocCat.scoreMap(new String[] {"u r so dumb"});
+    assertNotNull(scoreMap);
+    Double negativeScore = scoreMap.get("0");
+    assertNotNull(negativeScore);
+    Double positiveScore = scoreMap.get("1");
+    assertNotNull(positiveScore);
+    assertTrue(negativeScore > positiveScore);
+  }
+
+}
+
diff --git a/opennlp-dl/src/test/java/opennlp/tools/dl/RNNTest.java b/opennlp-dl/src/test/java/opennlp/tools/dl/RNNTest.java
index 88a9413..bc3904f 100644
--- a/opennlp-dl/src/test/java/opennlp/tools/dl/RNNTest.java
+++ b/opennlp-dl/src/test/java/opennlp/tools/dl/RNNTest.java
@@ -18,7 +18,6 @@
  */
 package opennlp.tools.dl;
 
-import java.io.FileInputStream;
 import java.io.InputStream;
 import java.util.Arrays;
 import java.util.Collection;
@@ -64,24 +63,17 @@
   @Parameterized.Parameters
   public static Collection<Object[]> data() {
     return Arrays.asList(new Object[][] {
-        {1e-1f, 15, 20, 5},
+        {1e-3f, 25, 50, 5},
     });
   }
 
   @Test
   public void testVanillaCharRNNLearn() throws Exception {
-    RNN rnn = new RNN(learningRate, seqLength, hiddenLayerSize, epochs, text, 5, true);
+    RNN rnn = new RNN(learningRate, seqLength, hiddenLayerSize, epochs, text, 10, true);
     evaluate(rnn, true);
     rnn.serialize("target/crnn-weights-");
   }
 
-  @Test
-  public void testVanillaWordRNNLearn() throws Exception {
-    RNN rnn = new RNN(learningRate, seqLength, hiddenLayerSize, epochs, text, 1, false);
-    evaluate(rnn, true);
-    rnn.serialize("target/wrnn-weights-");
-  }
-
   private void evaluate(RNN rnn, boolean checkRatio) {
     System.out.println(rnn);
     rnn.learn();
diff --git a/opennlp-dl/src/test/java/opennlp/tools/dl/StackedRNNTest.java b/opennlp-dl/src/test/java/opennlp/tools/dl/StackedRNNTest.java
index 265426f..6a61642 100644
--- a/opennlp-dl/src/test/java/opennlp/tools/dl/StackedRNNTest.java
+++ b/opennlp-dl/src/test/java/opennlp/tools/dl/StackedRNNTest.java
@@ -18,7 +18,6 @@
  */
 package opennlp.tools.dl;
 
-import java.io.FileInputStream;
 import java.io.InputStream;
 import java.util.Arrays;
 import java.util.Collection;
@@ -64,24 +63,17 @@
   @Parameterized.Parameters
   public static Collection<Object[]> data() {
     return Arrays.asList(new Object[][] {
-        {1e-1f, 15, 20, 5},
+        {1e-2f, 25, 50, 4},
     });
   }
 
   @Test
   public void testStackedCharRNNLearn() throws Exception {
-    RNN rnn = new StackedRNN(learningRate, seqLength, hiddenLayerSize, epochs, text, 5, true, true);
+    RNN rnn = new StackedRNN(learningRate, seqLength, hiddenLayerSize, epochs, text, 10, true, true);
     evaluate(rnn, true);
     rnn.serialize("target/scrnn-weights-");
   }
 
-  @Test
-  public void testStackedWordRNNLearn() throws Exception {
-    RNN rnn = new StackedRNN(learningRate, seqLength, hiddenLayerSize, epochs, text, 1, false, false);
-    evaluate(rnn, true);
-    rnn.serialize("target/swrnn-weights-");
-  }
-
   private void evaluate(RNN rnn, boolean checkRatio) {
     System.out.println(rnn);
     rnn.learn();
diff --git a/opennlp-dl/src/test/resources/ltweets/0/1.txt b/opennlp-dl/src/test/resources/ltweets/0/1.txt
new file mode 100644
index 0000000..cfc942a
--- /dev/null
+++ b/opennlp-dl/src/test/resources/ltweets/0/1.txt
@@ -0,0 +1 @@
+The painting is ugly, will return it tomorrow...
\ No newline at end of file
diff --git a/opennlp-dl/src/test/resources/ltweets/0/10.txt b/opennlp-dl/src/test/resources/ltweets/0/10.txt
new file mode 100644
index 0000000..79d5725
--- /dev/null
+++ b/opennlp-dl/src/test/resources/ltweets/0/10.txt
@@ -0,0 +1 @@
+The dark side of a selfie.
\ No newline at end of file
diff --git a/opennlp-dl/src/test/resources/ltweets/0/11.txt b/opennlp-dl/src/test/resources/ltweets/0/11.txt
new file mode 100644
index 0000000..379a1ff
--- /dev/null
+++ b/opennlp-dl/src/test/resources/ltweets/0/11.txt
@@ -0,0 +1 @@
+False hopes for the people attending the meeting
\ No newline at end of file
diff --git a/opennlp-dl/src/test/resources/ltweets/0/12.txt b/opennlp-dl/src/test/resources/ltweets/0/12.txt
new file mode 100644
index 0000000..2636da2
--- /dev/null
+++ b/opennlp-dl/src/test/resources/ltweets/0/12.txt
@@ -0,0 +1 @@
+The ugliest car ever!
\ No newline at end of file
diff --git a/opennlp-dl/src/test/resources/ltweets/0/13.txt b/opennlp-dl/src/test/resources/ltweets/0/13.txt
new file mode 100644
index 0000000..1678f71
--- /dev/null
+++ b/opennlp-dl/src/test/resources/ltweets/0/13.txt
@@ -0,0 +1 @@
+Feeling bored
\ No newline at end of file
diff --git a/opennlp-dl/src/test/resources/ltweets/0/14.txt b/opennlp-dl/src/test/resources/ltweets/0/14.txt
new file mode 100644
index 0000000..796c742
--- /dev/null
+++ b/opennlp-dl/src/test/resources/ltweets/0/14.txt
@@ -0,0 +1 @@
+Need urgently a pause
\ No newline at end of file
diff --git a/opennlp-dl/src/test/resources/ltweets/0/15.txt b/opennlp-dl/src/test/resources/ltweets/0/15.txt
new file mode 100644
index 0000000..24ca80f
--- /dev/null
+++ b/opennlp-dl/src/test/resources/ltweets/0/15.txt
@@ -0,0 +1 @@
+I didn't see that one coming
\ No newline at end of file
diff --git a/opennlp-dl/src/test/resources/ltweets/0/16.txt b/opennlp-dl/src/test/resources/ltweets/0/16.txt
new file mode 100644
index 0000000..97239ad
--- /dev/null
+++ b/opennlp-dl/src/test/resources/ltweets/0/16.txt
@@ -0,0 +1 @@
+Sorry mate, there is no more room for you
\ No newline at end of file
diff --git a/opennlp-dl/src/test/resources/ltweets/0/17.txt b/opennlp-dl/src/test/resources/ltweets/0/17.txt
new file mode 100644
index 0000000..635a745
--- /dev/null
+++ b/opennlp-dl/src/test/resources/ltweets/0/17.txt
@@ -0,0 +1 @@
+Who could have possibly done this?
\ No newline at end of file
diff --git a/opennlp-dl/src/test/resources/ltweets/0/18.txt b/opennlp-dl/src/test/resources/ltweets/0/18.txt
new file mode 100644
index 0000000..e05029d
--- /dev/null
+++ b/opennlp-dl/src/test/resources/ltweets/0/18.txt
@@ -0,0 +1 @@
+I feel bad for what I did
\ No newline at end of file
diff --git a/opennlp-dl/src/test/resources/ltweets/0/19.txt b/opennlp-dl/src/test/resources/ltweets/0/19.txt
new file mode 100644
index 0000000..b61e375
--- /dev/null
+++ b/opennlp-dl/src/test/resources/ltweets/0/19.txt
@@ -0,0 +1 @@
+I just did a big mistake
\ No newline at end of file
diff --git a/opennlp-dl/src/test/resources/ltweets/0/2.txt b/opennlp-dl/src/test/resources/ltweets/0/2.txt
new file mode 100644
index 0000000..af7484e
--- /dev/null
+++ b/opennlp-dl/src/test/resources/ltweets/0/2.txt
@@ -0,0 +1 @@
+Too early to travel..need a coffee
\ No newline at end of file
diff --git a/opennlp-dl/src/test/resources/ltweets/0/20.txt b/opennlp-dl/src/test/resources/ltweets/0/20.txt
new file mode 100644
index 0000000..cff8057
--- /dev/null
+++ b/opennlp-dl/src/test/resources/ltweets/0/20.txt
@@ -0,0 +1 @@
+I never loved so hard in my life
\ No newline at end of file
diff --git a/opennlp-dl/src/test/resources/ltweets/0/21.txt b/opennlp-dl/src/test/resources/ltweets/0/21.txt
new file mode 100644
index 0000000..a63d697
--- /dev/null
+++ b/opennlp-dl/src/test/resources/ltweets/0/21.txt
@@ -0,0 +1 @@
+I hate you Mike!!
\ No newline at end of file
diff --git a/opennlp-dl/src/test/resources/ltweets/0/22.txt b/opennlp-dl/src/test/resources/ltweets/0/22.txt
new file mode 100644
index 0000000..6023984
--- /dev/null
+++ b/opennlp-dl/src/test/resources/ltweets/0/22.txt
@@ -0,0 +1 @@
+I hate to say goodbye
\ No newline at end of file
diff --git a/opennlp-dl/src/test/resources/ltweets/0/23.txt b/opennlp-dl/src/test/resources/ltweets/0/23.txt
new file mode 100644
index 0000000..cb2d3ee
--- /dev/null
+++ b/opennlp-dl/src/test/resources/ltweets/0/23.txt
@@ -0,0 +1 @@
+Never try this at home
\ No newline at end of file
diff --git a/opennlp-dl/src/test/resources/ltweets/0/24.txt b/opennlp-dl/src/test/resources/ltweets/0/24.txt
new file mode 100644
index 0000000..e78360b
--- /dev/null
+++ b/opennlp-dl/src/test/resources/ltweets/0/24.txt
@@ -0,0 +1 @@
+Don't spoil it!
\ No newline at end of file
diff --git a/opennlp-dl/src/test/resources/ltweets/0/25.txt b/opennlp-dl/src/test/resources/ltweets/0/25.txt
new file mode 100644
index 0000000..c4c9cf2
--- /dev/null
+++ b/opennlp-dl/src/test/resources/ltweets/0/25.txt
@@ -0,0 +1 @@
+The more I hear you, the more annoyed I get
\ No newline at end of file
diff --git a/opennlp-dl/src/test/resources/ltweets/0/26.txt b/opennlp-dl/src/test/resources/ltweets/0/26.txt
new file mode 100644
index 0000000..6e53332
--- /dev/null
+++ b/opennlp-dl/src/test/resources/ltweets/0/26.txt
@@ -0,0 +1 @@
+I just lost my appetite
\ No newline at end of file
diff --git a/opennlp-dl/src/test/resources/ltweets/0/27.txt b/opennlp-dl/src/test/resources/ltweets/0/27.txt
new file mode 100644
index 0000000..6c5a367
--- /dev/null
+++ b/opennlp-dl/src/test/resources/ltweets/0/27.txt
@@ -0,0 +1 @@
+Sad end for this movie
\ No newline at end of file
diff --git a/opennlp-dl/src/test/resources/ltweets/0/28.txt b/opennlp-dl/src/test/resources/ltweets/0/28.txt
new file mode 100644
index 0000000..3e130cb
--- /dev/null
+++ b/opennlp-dl/src/test/resources/ltweets/0/28.txt
@@ -0,0 +1 @@
+Lonely, I am so lonely
\ No newline at end of file
diff --git a/opennlp-dl/src/test/resources/ltweets/0/29.txt b/opennlp-dl/src/test/resources/ltweets/0/29.txt
new file mode 100644
index 0000000..6d73372
--- /dev/null
+++ b/opennlp-dl/src/test/resources/ltweets/0/29.txt
@@ -0,0 +1 @@
+Hate to wait on a long queue
\ No newline at end of file
diff --git a/opennlp-dl/src/test/resources/ltweets/0/3.txt b/opennlp-dl/src/test/resources/ltweets/0/3.txt
new file mode 100644
index 0000000..43fc3c3
--- /dev/null
+++ b/opennlp-dl/src/test/resources/ltweets/0/3.txt
@@ -0,0 +1 @@
+Damn..the train is late again...
\ No newline at end of file
diff --git a/opennlp-dl/src/test/resources/ltweets/0/30.txt b/opennlp-dl/src/test/resources/ltweets/0/30.txt
new file mode 100644
index 0000000..0bb7832
--- /dev/null
+++ b/opennlp-dl/src/test/resources/ltweets/0/30.txt
@@ -0,0 +1 @@
+No cab available
\ No newline at end of file
diff --git a/opennlp-dl/src/test/resources/ltweets/0/31.txt b/opennlp-dl/src/test/resources/ltweets/0/31.txt
new file mode 100644
index 0000000..1d7dbe8
--- /dev/null
+++ b/opennlp-dl/src/test/resources/ltweets/0/31.txt
@@ -0,0 +1 @@
+Electricity outage, this is a nightmare
\ No newline at end of file
diff --git a/opennlp-dl/src/test/resources/ltweets/0/32.txt b/opennlp-dl/src/test/resources/ltweets/0/32.txt
new file mode 100644
index 0000000..24d3571
--- /dev/null
+++ b/opennlp-dl/src/test/resources/ltweets/0/32.txt
@@ -0,0 +1 @@
+Nobody to ask about directions
\ No newline at end of file
diff --git a/opennlp-dl/src/test/resources/ltweets/0/33.txt b/opennlp-dl/src/test/resources/ltweets/0/33.txt
new file mode 100644
index 0000000..42441e8
--- /dev/null
+++ b/opennlp-dl/src/test/resources/ltweets/0/33.txt
@@ -0,0 +1 @@
+I feel sick
\ No newline at end of file
diff --git a/opennlp-dl/src/test/resources/ltweets/0/34.txt b/opennlp-dl/src/test/resources/ltweets/0/34.txt
new file mode 100644
index 0000000..b1d2562
--- /dev/null
+++ b/opennlp-dl/src/test/resources/ltweets/0/34.txt
@@ -0,0 +1 @@
+I am very tired
\ No newline at end of file
diff --git a/opennlp-dl/src/test/resources/ltweets/0/35.txt b/opennlp-dl/src/test/resources/ltweets/0/35.txt
new file mode 100644
index 0000000..89a24c5
--- /dev/null
+++ b/opennlp-dl/src/test/resources/ltweets/0/35.txt
@@ -0,0 +1 @@
+Such a bad taste
\ No newline at end of file
diff --git a/opennlp-dl/src/test/resources/ltweets/0/36.txt b/opennlp-dl/src/test/resources/ltweets/0/36.txt
new file mode 100644
index 0000000..e303474
--- /dev/null
+++ b/opennlp-dl/src/test/resources/ltweets/0/36.txt
@@ -0,0 +1 @@
+I don't recommend this restaurant
\ No newline at end of file
diff --git a/opennlp-dl/src/test/resources/ltweets/0/37.txt b/opennlp-dl/src/test/resources/ltweets/0/37.txt
new file mode 100644
index 0000000..cb5bc60
--- /dev/null
+++ b/opennlp-dl/src/test/resources/ltweets/0/37.txt
@@ -0,0 +1 @@
+I will never ever call you again
\ No newline at end of file
diff --git a/opennlp-dl/src/test/resources/ltweets/0/38.txt b/opennlp-dl/src/test/resources/ltweets/0/38.txt
new file mode 100644
index 0000000..060e90f
--- /dev/null
+++ b/opennlp-dl/src/test/resources/ltweets/0/38.txt
@@ -0,0 +1 @@
+I just got kicked out of the contest
\ No newline at end of file
diff --git a/opennlp-dl/src/test/resources/ltweets/0/39.txt b/opennlp-dl/src/test/resources/ltweets/0/39.txt
new file mode 100644
index 0000000..2bef847
--- /dev/null
+++ b/opennlp-dl/src/test/resources/ltweets/0/39.txt
@@ -0,0 +1 @@
+Big pain to see my team loosing
\ No newline at end of file
diff --git a/opennlp-dl/src/test/resources/ltweets/0/4.txt b/opennlp-dl/src/test/resources/ltweets/0/4.txt
new file mode 100644
index 0000000..c1f1ade
--- /dev/null
+++ b/opennlp-dl/src/test/resources/ltweets/0/4.txt
@@ -0,0 +1 @@
+Bad news, my flight just got cancelled.
\ No newline at end of file
diff --git a/opennlp-dl/src/test/resources/ltweets/0/40.txt b/opennlp-dl/src/test/resources/ltweets/0/40.txt
new file mode 100644
index 0000000..b2b7840
--- /dev/null
+++ b/opennlp-dl/src/test/resources/ltweets/0/40.txt
@@ -0,0 +1 @@
+Bitter defeat tonight
\ No newline at end of file
diff --git a/opennlp-dl/src/test/resources/ltweets/0/41.txt b/opennlp-dl/src/test/resources/ltweets/0/41.txt
new file mode 100644
index 0000000..559dbe5
--- /dev/null
+++ b/opennlp-dl/src/test/resources/ltweets/0/41.txt
@@ -0,0 +1 @@
+My bike was stollen
\ No newline at end of file
diff --git a/opennlp-dl/src/test/resources/ltweets/0/42.txt b/opennlp-dl/src/test/resources/ltweets/0/42.txt
new file mode 100644
index 0000000..23a1fcd
--- /dev/null
+++ b/opennlp-dl/src/test/resources/ltweets/0/42.txt
@@ -0,0 +1 @@
+I lost every hope for seeing him again
\ No newline at end of file
diff --git a/opennlp-dl/src/test/resources/ltweets/0/43.txt b/opennlp-dl/src/test/resources/ltweets/0/43.txt
new file mode 100644
index 0000000..1b9b5ee
--- /dev/null
+++ b/opennlp-dl/src/test/resources/ltweets/0/43.txt
@@ -0,0 +1 @@
+Cold winter ahead
\ No newline at end of file
diff --git a/opennlp-dl/src/test/resources/ltweets/0/44.txt b/opennlp-dl/src/test/resources/ltweets/0/44.txt
new file mode 100644
index 0000000..46ddb14
--- /dev/null
+++ b/opennlp-dl/src/test/resources/ltweets/0/44.txt
@@ -0,0 +1 @@
+Hopless struggle..
\ No newline at end of file
diff --git a/opennlp-dl/src/test/resources/ltweets/0/45.txt b/opennlp-dl/src/test/resources/ltweets/0/45.txt
new file mode 100644
index 0000000..cf53150
--- /dev/null
+++ b/opennlp-dl/src/test/resources/ltweets/0/45.txt
@@ -0,0 +1 @@
+Ugly hat
\ No newline at end of file
diff --git a/opennlp-dl/src/test/resources/ltweets/0/5.txt b/opennlp-dl/src/test/resources/ltweets/0/5.txt
new file mode 100644
index 0000000..7c73c0a
--- /dev/null
+++ b/opennlp-dl/src/test/resources/ltweets/0/5.txt
@@ -0,0 +1 @@
+Had a bad evening, need urgently a beer.
\ No newline at end of file
diff --git a/opennlp-dl/src/test/resources/ltweets/0/6.txt b/opennlp-dl/src/test/resources/ltweets/0/6.txt
new file mode 100644
index 0000000..2b0a59d
--- /dev/null
+++ b/opennlp-dl/src/test/resources/ltweets/0/6.txt
@@ -0,0 +1 @@
+I put on weight again
\ No newline at end of file
diff --git a/opennlp-dl/src/test/resources/ltweets/0/7.txt b/opennlp-dl/src/test/resources/ltweets/0/7.txt
new file mode 100644
index 0000000..caa942f
--- /dev/null
+++ b/opennlp-dl/src/test/resources/ltweets/0/7.txt
@@ -0,0 +1 @@
+I lost my keys
\ No newline at end of file
diff --git a/opennlp-dl/src/test/resources/ltweets/0/8.txt b/opennlp-dl/src/test/resources/ltweets/0/8.txt
new file mode 100644
index 0000000..c8fdad9
--- /dev/null
+++ b/opennlp-dl/src/test/resources/ltweets/0/8.txt
@@ -0,0 +1 @@
+I hate Mondays
\ No newline at end of file
diff --git a/opennlp-dl/src/test/resources/ltweets/0/9.txt b/opennlp-dl/src/test/resources/ltweets/0/9.txt
new file mode 100644
index 0000000..87c8042
--- /dev/null
+++ b/opennlp-dl/src/test/resources/ltweets/0/9.txt
@@ -0,0 +1 @@
+He killed our good mood
\ No newline at end of file
diff --git a/opennlp-dl/src/test/resources/ltweets/1/1.txt b/opennlp-dl/src/test/resources/ltweets/1/1.txt
new file mode 100644
index 0000000..4839031
--- /dev/null
+++ b/opennlp-dl/src/test/resources/ltweets/1/1.txt
@@ -0,0 +1 @@
+Watching a nice movie
\ No newline at end of file
diff --git a/opennlp-dl/src/test/resources/ltweets/1/10.txt b/opennlp-dl/src/test/resources/ltweets/1/10.txt
new file mode 100644
index 0000000..0f8519e
--- /dev/null
+++ b/opennlp-dl/src/test/resources/ltweets/1/10.txt
@@ -0,0 +1 @@
+I fell in love again
\ No newline at end of file
diff --git a/opennlp-dl/src/test/resources/ltweets/1/11.txt b/opennlp-dl/src/test/resources/ltweets/1/11.txt
new file mode 100644
index 0000000..9055a8f
--- /dev/null
+++ b/opennlp-dl/src/test/resources/ltweets/1/11.txt
@@ -0,0 +1 @@
+On a trip to Iceland
\ No newline at end of file
diff --git a/opennlp-dl/src/test/resources/ltweets/1/12.txt b/opennlp-dl/src/test/resources/ltweets/1/12.txt
new file mode 100644
index 0000000..17b7b9e
--- /dev/null
+++ b/opennlp-dl/src/test/resources/ltweets/1/12.txt
@@ -0,0 +1 @@
+Happy in Berlin
\ No newline at end of file
diff --git a/opennlp-dl/src/test/resources/ltweets/1/13.txt b/opennlp-dl/src/test/resources/ltweets/1/13.txt
new file mode 100644
index 0000000..ba1296c
--- /dev/null
+++ b/opennlp-dl/src/test/resources/ltweets/1/13.txt
@@ -0,0 +1 @@
+Love the new book I reveived for Christmas
\ No newline at end of file
diff --git a/opennlp-dl/src/test/resources/ltweets/1/14.txt b/opennlp-dl/src/test/resources/ltweets/1/14.txt
new file mode 100644
index 0000000..abdd40f
--- /dev/null
+++ b/opennlp-dl/src/test/resources/ltweets/1/14.txt
@@ -0,0 +1 @@
+I am in good spirits again
\ No newline at end of file
diff --git a/opennlp-dl/src/test/resources/ltweets/1/15.txt b/opennlp-dl/src/test/resources/ltweets/1/15.txt
new file mode 100644
index 0000000..4f0590e
--- /dev/null
+++ b/opennlp-dl/src/test/resources/ltweets/1/15.txt
@@ -0,0 +1 @@
+This guy creates the most awesome pics ever 
\ No newline at end of file
diff --git a/opennlp-dl/src/test/resources/ltweets/1/16.txt b/opennlp-dl/src/test/resources/ltweets/1/16.txt
new file mode 100644
index 0000000..78eb30e
--- /dev/null
+++ b/opennlp-dl/src/test/resources/ltweets/1/16.txt
@@ -0,0 +1 @@
+Cool! John is back!
\ No newline at end of file
diff --git a/opennlp-dl/src/test/resources/ltweets/1/17.txt b/opennlp-dl/src/test/resources/ltweets/1/17.txt
new file mode 100644
index 0000000..ed1e8cf
--- /dev/null
+++ b/opennlp-dl/src/test/resources/ltweets/1/17.txt
@@ -0,0 +1 @@
+Many rooms and many hopes for new residents
\ No newline at end of file
diff --git a/opennlp-dl/src/test/resources/ltweets/1/18.txt b/opennlp-dl/src/test/resources/ltweets/1/18.txt
new file mode 100644
index 0000000..0964334
--- /dev/null
+++ b/opennlp-dl/src/test/resources/ltweets/1/18.txt
@@ -0,0 +1 @@
+I set my new year's resolution
\ No newline at end of file
diff --git a/opennlp-dl/src/test/resources/ltweets/1/19.txt b/opennlp-dl/src/test/resources/ltweets/1/19.txt
new file mode 100644
index 0000000..06984df
--- /dev/null
+++ b/opennlp-dl/src/test/resources/ltweets/1/19.txt
@@ -0,0 +1 @@
+Nice to see Ana made it
\ No newline at end of file
diff --git a/opennlp-dl/src/test/resources/ltweets/1/2.txt b/opennlp-dl/src/test/resources/ltweets/1/2.txt
new file mode 100644
index 0000000..1344c0b
--- /dev/null
+++ b/opennlp-dl/src/test/resources/ltweets/1/2.txt
@@ -0,0 +1 @@
+One of the best soccer games, worth seeing it
\ No newline at end of file
diff --git a/opennlp-dl/src/test/resources/ltweets/1/20.txt b/opennlp-dl/src/test/resources/ltweets/1/20.txt
new file mode 100644
index 0000000..25f7ea4
--- /dev/null
+++ b/opennlp-dl/src/test/resources/ltweets/1/20.txt
@@ -0,0 +1 @@
+My dream came true
\ No newline at end of file
diff --git a/opennlp-dl/src/test/resources/ltweets/1/21.txt b/opennlp-dl/src/test/resources/ltweets/1/21.txt
new file mode 100644
index 0000000..bef74e8
--- /dev/null
+++ b/opennlp-dl/src/test/resources/ltweets/1/21.txt
@@ -0,0 +1 @@
+I won the challenge
\ No newline at end of file
diff --git a/opennlp-dl/src/test/resources/ltweets/1/22.txt b/opennlp-dl/src/test/resources/ltweets/1/22.txt
new file mode 100644
index 0000000..010654d
--- /dev/null
+++ b/opennlp-dl/src/test/resources/ltweets/1/22.txt
@@ -0,0 +1 @@
+I had a great time tonight
\ No newline at end of file
diff --git a/opennlp-dl/src/test/resources/ltweets/1/23.txt b/opennlp-dl/src/test/resources/ltweets/1/23.txt
new file mode 100644
index 0000000..4232520
--- /dev/null
+++ b/opennlp-dl/src/test/resources/ltweets/1/23.txt
@@ -0,0 +1 @@
+It was a lot of fun
\ No newline at end of file
diff --git a/opennlp-dl/src/test/resources/ltweets/1/24.txt b/opennlp-dl/src/test/resources/ltweets/1/24.txt
new file mode 100644
index 0000000..260d854
--- /dev/null
+++ b/opennlp-dl/src/test/resources/ltweets/1/24.txt
@@ -0,0 +1 @@
+Thank you Molly making this possible
\ No newline at end of file
diff --git a/opennlp-dl/src/test/resources/ltweets/1/25.txt b/opennlp-dl/src/test/resources/ltweets/1/25.txt
new file mode 100644
index 0000000..f8852a9
--- /dev/null
+++ b/opennlp-dl/src/test/resources/ltweets/1/25.txt
@@ -0,0 +1 @@
+I love it!!
\ No newline at end of file
diff --git a/opennlp-dl/src/test/resources/ltweets/1/26.txt b/opennlp-dl/src/test/resources/ltweets/1/26.txt
new file mode 100644
index 0000000..c0f882c
--- /dev/null
+++ b/opennlp-dl/src/test/resources/ltweets/1/26.txt
@@ -0,0 +1 @@
+Lovely!
\ No newline at end of file
diff --git a/opennlp-dl/src/test/resources/ltweets/1/27.txt b/opennlp-dl/src/test/resources/ltweets/1/27.txt
new file mode 100644
index 0000000..31caa55
--- /dev/null
+++ b/opennlp-dl/src/test/resources/ltweets/1/27.txt
@@ -0,0 +1 @@
+Like and share if you feel the same
\ No newline at end of file
diff --git a/opennlp-dl/src/test/resources/ltweets/1/28.txt b/opennlp-dl/src/test/resources/ltweets/1/28.txt
new file mode 100644
index 0000000..6fc9194
--- /dev/null
+++ b/opennlp-dl/src/test/resources/ltweets/1/28.txt
@@ -0,0 +1 @@
+I love rock and roll
\ No newline at end of file
diff --git a/opennlp-dl/src/test/resources/ltweets/1/29.txt b/opennlp-dl/src/test/resources/ltweets/1/29.txt
new file mode 100644
index 0000000..3ec2437
--- /dev/null
+++ b/opennlp-dl/src/test/resources/ltweets/1/29.txt
@@ -0,0 +1 @@
+Finnaly passed my exam!
\ No newline at end of file
diff --git a/opennlp-dl/src/test/resources/ltweets/1/3.txt b/opennlp-dl/src/test/resources/ltweets/1/3.txt
new file mode 100644
index 0000000..c3e3c97
--- /dev/null
+++ b/opennlp-dl/src/test/resources/ltweets/1/3.txt
@@ -0,0 +1 @@
+Very tasty, not only for vegetarians
\ No newline at end of file
diff --git a/opennlp-dl/src/test/resources/ltweets/1/30.txt b/opennlp-dl/src/test/resources/ltweets/1/30.txt
new file mode 100644
index 0000000..51b996a
--- /dev/null
+++ b/opennlp-dl/src/test/resources/ltweets/1/30.txt
@@ -0,0 +1 @@
+Lovely kittens
\ No newline at end of file
diff --git a/opennlp-dl/src/test/resources/ltweets/1/31.txt b/opennlp-dl/src/test/resources/ltweets/1/31.txt
new file mode 100644
index 0000000..6d70c44
--- /dev/null
+++ b/opennlp-dl/src/test/resources/ltweets/1/31.txt
@@ -0,0 +1 @@
+Beautiful morning
\ No newline at end of file
diff --git a/opennlp-dl/src/test/resources/ltweets/1/32.txt b/opennlp-dl/src/test/resources/ltweets/1/32.txt
new file mode 100644
index 0000000..bada0ef
--- /dev/null
+++ b/opennlp-dl/src/test/resources/ltweets/1/32.txt
@@ -0,0 +1 @@
+She is amazing
\ No newline at end of file
diff --git a/opennlp-dl/src/test/resources/ltweets/1/33.txt b/opennlp-dl/src/test/resources/ltweets/1/33.txt
new file mode 100644
index 0000000..26bfb48
--- /dev/null
+++ b/opennlp-dl/src/test/resources/ltweets/1/33.txt
@@ -0,0 +1 @@
+Enjoying some time with my friends
\ No newline at end of file
diff --git a/opennlp-dl/src/test/resources/ltweets/1/34.txt b/opennlp-dl/src/test/resources/ltweets/1/34.txt
new file mode 100644
index 0000000..2e48c2b
--- /dev/null
+++ b/opennlp-dl/src/test/resources/ltweets/1/34.txt
@@ -0,0 +1 @@
+Special thanks to Marty
\ No newline at end of file
diff --git a/opennlp-dl/src/test/resources/ltweets/1/35.txt b/opennlp-dl/src/test/resources/ltweets/1/35.txt
new file mode 100644
index 0000000..b72daf5
--- /dev/null
+++ b/opennlp-dl/src/test/resources/ltweets/1/35.txt
@@ -0,0 +1 @@
+Thanks God I left on time
\ No newline at end of file
diff --git a/opennlp-dl/src/test/resources/ltweets/1/36.txt b/opennlp-dl/src/test/resources/ltweets/1/36.txt
new file mode 100644
index 0000000..5a0540f
--- /dev/null
+++ b/opennlp-dl/src/test/resources/ltweets/1/36.txt
@@ -0,0 +1 @@
+Greateful for a wonderful meal
\ No newline at end of file
diff --git a/opennlp-dl/src/test/resources/ltweets/1/37.txt b/opennlp-dl/src/test/resources/ltweets/1/37.txt
new file mode 100644
index 0000000..140ed00
--- /dev/null
+++ b/opennlp-dl/src/test/resources/ltweets/1/37.txt
@@ -0,0 +1 @@
+So happy to be home
\ No newline at end of file
diff --git a/opennlp-dl/src/test/resources/ltweets/1/38.txt b/opennlp-dl/src/test/resources/ltweets/1/38.txt
new file mode 100644
index 0000000..531f8e5
--- /dev/null
+++ b/opennlp-dl/src/test/resources/ltweets/1/38.txt
@@ -0,0 +1 @@
+Great game!
\ No newline at end of file
diff --git a/opennlp-dl/src/test/resources/ltweets/1/39.txt b/opennlp-dl/src/test/resources/ltweets/1/39.txt
new file mode 100644
index 0000000..2d8d4e9
--- /dev/null
+++ b/opennlp-dl/src/test/resources/ltweets/1/39.txt
@@ -0,0 +1 @@
+Nice trip
\ No newline at end of file
diff --git a/opennlp-dl/src/test/resources/ltweets/1/4.txt b/opennlp-dl/src/test/resources/ltweets/1/4.txt
new file mode 100644
index 0000000..46ebb6f
--- /dev/null
+++ b/opennlp-dl/src/test/resources/ltweets/1/4.txt
@@ -0,0 +1 @@
+Super party!
\ No newline at end of file
diff --git a/opennlp-dl/src/test/resources/ltweets/1/40.txt b/opennlp-dl/src/test/resources/ltweets/1/40.txt
new file mode 100644
index 0000000..beda37e
--- /dev/null
+++ b/opennlp-dl/src/test/resources/ltweets/1/40.txt
@@ -0,0 +1 @@
+I just received a pretty flower
\ No newline at end of file
diff --git a/opennlp-dl/src/test/resources/ltweets/1/41.txt b/opennlp-dl/src/test/resources/ltweets/1/41.txt
new file mode 100644
index 0000000..ff8f2cf
--- /dev/null
+++ b/opennlp-dl/src/test/resources/ltweets/1/41.txt
@@ -0,0 +1 @@
+Excellent idea
\ No newline at end of file
diff --git a/opennlp-dl/src/test/resources/ltweets/1/42.txt b/opennlp-dl/src/test/resources/ltweets/1/42.txt
new file mode 100644
index 0000000..0823b22
--- /dev/null
+++ b/opennlp-dl/src/test/resources/ltweets/1/42.txt
@@ -0,0 +1 @@
+Got a new watch. Feeling happy
\ No newline at end of file
diff --git a/opennlp-dl/src/test/resources/ltweets/1/43.txt b/opennlp-dl/src/test/resources/ltweets/1/43.txt
new file mode 100644
index 0000000..40fe32e
--- /dev/null
+++ b/opennlp-dl/src/test/resources/ltweets/1/43.txt
@@ -0,0 +1 @@
+Such a good taste 
\ No newline at end of file
diff --git a/opennlp-dl/src/test/resources/ltweets/1/44.txt b/opennlp-dl/src/test/resources/ltweets/1/44.txt
new file mode 100644
index 0000000..3929ff8
--- /dev/null
+++ b/opennlp-dl/src/test/resources/ltweets/1/44.txt
@@ -0,0 +1 @@
+Enjoying brunch
\ No newline at end of file
diff --git a/opennlp-dl/src/test/resources/ltweets/1/45.txt b/opennlp-dl/src/test/resources/ltweets/1/45.txt
new file mode 100644
index 0000000..579762e
--- /dev/null
+++ b/opennlp-dl/src/test/resources/ltweets/1/45.txt
@@ -0,0 +1 @@
+Thank you mom for supporting me
\ No newline at end of file
diff --git a/opennlp-dl/src/test/resources/ltweets/1/46.txt b/opennlp-dl/src/test/resources/ltweets/1/46.txt
new file mode 100644
index 0000000..98ae307
--- /dev/null
+++ b/opennlp-dl/src/test/resources/ltweets/1/46.txt
@@ -0,0 +1 @@
+Smiling
\ No newline at end of file
diff --git a/opennlp-dl/src/test/resources/ltweets/1/47.txt b/opennlp-dl/src/test/resources/ltweets/1/47.txt
new file mode 100644
index 0000000..4f4cb4f
--- /dev/null
+++ b/opennlp-dl/src/test/resources/ltweets/1/47.txt
@@ -0,0 +1 @@
+Great to see you!
\ No newline at end of file
diff --git a/opennlp-dl/src/test/resources/ltweets/1/48.txt b/opennlp-dl/src/test/resources/ltweets/1/48.txt
new file mode 100644
index 0000000..19111a1
--- /dev/null
+++ b/opennlp-dl/src/test/resources/ltweets/1/48.txt
@@ -0,0 +1 @@
+Nice dress!
\ No newline at end of file
diff --git a/opennlp-dl/src/test/resources/ltweets/1/49.txt b/opennlp-dl/src/test/resources/ltweets/1/49.txt
new file mode 100644
index 0000000..1680d33
--- /dev/null
+++ b/opennlp-dl/src/test/resources/ltweets/1/49.txt
@@ -0,0 +1 @@
+Stop wasting my time
\ No newline at end of file
diff --git a/opennlp-dl/src/test/resources/ltweets/1/5.txt b/opennlp-dl/src/test/resources/ltweets/1/5.txt
new file mode 100644
index 0000000..e28605b
--- /dev/null
+++ b/opennlp-dl/src/test/resources/ltweets/1/5.txt
@@ -0,0 +1 @@
+Happy birthday mr. president
\ No newline at end of file
diff --git a/opennlp-dl/src/test/resources/ltweets/1/50.txt b/opennlp-dl/src/test/resources/ltweets/1/50.txt
new file mode 100644
index 0000000..1a9b19f
--- /dev/null
+++ b/opennlp-dl/src/test/resources/ltweets/1/50.txt
@@ -0,0 +1 @@
+I have a great idea
\ No newline at end of file
diff --git a/opennlp-dl/src/test/resources/ltweets/1/51.txt b/opennlp-dl/src/test/resources/ltweets/1/51.txt
new file mode 100644
index 0000000..bf1d877
--- /dev/null
+++ b/opennlp-dl/src/test/resources/ltweets/1/51.txt
@@ -0,0 +1 @@
+Excited to go to the pub
\ No newline at end of file
diff --git a/opennlp-dl/src/test/resources/ltweets/1/52.txt b/opennlp-dl/src/test/resources/ltweets/1/52.txt
new file mode 100644
index 0000000..c2cabb1
--- /dev/null
+++ b/opennlp-dl/src/test/resources/ltweets/1/52.txt
@@ -0,0 +1 @@
+Feeling proud
\ No newline at end of file
diff --git a/opennlp-dl/src/test/resources/ltweets/1/53.txt b/opennlp-dl/src/test/resources/ltweets/1/53.txt
new file mode 100644
index 0000000..45100ea
--- /dev/null
+++ b/opennlp-dl/src/test/resources/ltweets/1/53.txt
@@ -0,0 +1 @@
+Cute bunnies
\ No newline at end of file
diff --git a/opennlp-dl/src/test/resources/ltweets/1/54.txt b/opennlp-dl/src/test/resources/ltweets/1/54.txt
new file mode 100644
index 0000000..da4779c
--- /dev/null
+++ b/opennlp-dl/src/test/resources/ltweets/1/54.txt
@@ -0,0 +1 @@
+Big hug and lots of love
\ No newline at end of file
diff --git a/opennlp-dl/src/test/resources/ltweets/1/55.txt b/opennlp-dl/src/test/resources/ltweets/1/55.txt
new file mode 100644
index 0000000..1d901a7
--- /dev/null
+++ b/opennlp-dl/src/test/resources/ltweets/1/55.txt
@@ -0,0 +1 @@
+I hope you have a wonderful celebration
\ No newline at end of file
diff --git a/opennlp-dl/src/test/resources/ltweets/1/6.txt b/opennlp-dl/src/test/resources/ltweets/1/6.txt
new file mode 100644
index 0000000..1c720eb
--- /dev/null
+++ b/opennlp-dl/src/test/resources/ltweets/1/6.txt
@@ -0,0 +1 @@
+Just watch it. Respect.
\ No newline at end of file
diff --git a/opennlp-dl/src/test/resources/ltweets/1/7.txt b/opennlp-dl/src/test/resources/ltweets/1/7.txt
new file mode 100644
index 0000000..29c7e90
--- /dev/null
+++ b/opennlp-dl/src/test/resources/ltweets/1/7.txt
@@ -0,0 +1 @@
+Wonderful sunset.
\ No newline at end of file
diff --git a/opennlp-dl/src/test/resources/ltweets/1/8.txt b/opennlp-dl/src/test/resources/ltweets/1/8.txt
new file mode 100644
index 0000000..a67a3ac
--- /dev/null
+++ b/opennlp-dl/src/test/resources/ltweets/1/8.txt
@@ -0,0 +1 @@
+Bravo, first title in 2014!
\ No newline at end of file
diff --git a/opennlp-dl/src/test/resources/ltweets/1/9.txt b/opennlp-dl/src/test/resources/ltweets/1/9.txt
new file mode 100644
index 0000000..0a4683b
--- /dev/null
+++ b/opennlp-dl/src/test/resources/ltweets/1/9.txt
@@ -0,0 +1 @@
+On today's show we met Angela, a woman with an amazing story
\ No newline at end of file
diff --git a/opennlp-dl/src/test/resources/text/tweets.txt b/opennlp-dl/src/test/resources/text/tweets.txt
new file mode 100644
index 0000000..ba7ac0d
--- /dev/null
+++ b/opennlp-dl/src/test/resources/text/tweets.txt
@@ -0,0 +1,100 @@
+1	Watching a nice movie
+0	The painting is ugly, will return it tomorrow...
+1	One of the best soccer games, worth seeing it
+1	Very tasty, not only for vegetarians
+1	Super party!
+0	Too early to travel..need a coffee
+0	Damn..the train is late again...
+0	Bad news, my flight just got cancelled.
+1	Happy birthday mr. president
+1	Just watch it. Respect.
+1	Wonderful sunset.
+1	Bravo, first title in 2014!
+0	Had a bad evening, need urgently a beer.
+0	I put on weight again
+1	On today's show we met Angela, a woman with an amazing story
+1	I fell in love again
+0	I lost my keys
+1	On a trip to Iceland
+1	Happy in Berlin
+0	I hate Mondays
+1	Love the new book I reveived for Christmas
+0	He killed our good mood
+1	I am in good spirits again
+1	This guy creates the most awesome pics ever 
+0	The dark side of a selfie.
+1	Cool! John is back!
+1	Many rooms and many hopes for new residents
+0	False hopes for the people attending the meeting
+1	I set my new year's resolution
+0	The ugliest car ever!
+0	Feeling bored
+0	Need urgently a pause
+1	Nice to see Ana made it
+1	My dream came true
+0	I didn't see that one coming
+0	Sorry mate, there is no more room for you
+0	Who could have possibly done this?
+1	I won the challenge
+0	I feel bad for what I did		
+1	I had a great time tonight
+1	It was a lot of fun
+1	Thank you Molly making this possible
+0	I just did a big mistake
+1	I love it!!
+0	I never loved so hard in my life
+0	I hate you Mike!!
+0	I hate to say goodbye
+1	Lovely!
+1	Like and share if you feel the same
+0	Never try this at home
+0	Don't spoil it!
+1	I love rock and roll
+0	The more I hear you, the more annoyed I get
+1	Finnaly passed my exam!
+1	Lovely kittens
+0	I just lost my appetite
+0	Sad end for this movie
+0	Lonely, I am so lonely
+1	Beautiful morning
+1	She is amazing
+1	Enjoying some time with my friends
+1	Special thanks to Marty
+1	Thanks God I left on time
+1	Greateful for a wonderful meal
+1	So happy to be home
+0	Hate to wait on a long queue		
+0	No cab available
+0	Electricity outage, this is a nightmare
+0	Nobody to ask about directions
+1	Great game!
+1	Nice trip
+1	I just received a pretty flower
+1	Excellent idea
+1	Got a new watch. Feeling happy
+0	I feel sick
+0	I am very tired
+1	Such a good taste 
+0	Such a bad taste
+1	Enjoying brunch
+0	I don't recommend this restaurant
+1	Thank you mom for supporting me
+0	I will never ever call you again
+0	I just got kicked out of the contest
+1	Smiling
+0	Big pain to see my team loosing
+0	Bitter defeat tonight
+0	My bike was stollen
+1	Great to see you!
+0	I lost every hope for seeing him again
+1	Nice dress!
+1	Stop wasting my time
+1	I have a great idea
+1	Excited to go to the pub
+1	Feeling proud
+1	Cute bunnies
+0	Cold winter ahead
+0	Hopless struggle..
+0	Ugly hat
+1	Big hug and lots of love
+1	I hope you have a wonderful celebration
diff --git a/opennlp-dl/src/test/resources/tweets/1/1.txt b/opennlp-dl/src/test/resources/tweets/1/1.txt
new file mode 100644
index 0000000..e69de29
--- /dev/null
+++ b/opennlp-dl/src/test/resources/tweets/1/1.txt