User Profiling Offline Training

Quick Start

  1. Compile and build the package

     mvn clean compile package -DskipTests
  2. Run with the following command, or directly from an IDE

     spark-submit --class --master local target/eagle-security-mltraining-spark-0.1.0.jar

Problems and Solutions


  1. Problem: 2015-07-19 17:18:42,083 ERROR [main] spark.SparkContext (Logging.scala:logError(96)) - Error initializing SparkContext. Failed to bind to: / Service ‘sparkDriver’ failed after 16 retries!

    Solution: Add the following environment variables:

     export SPARK_LOCAL_IP=
     export SPARK_MASTER_IP=
  2. Problem: Detected both slf4j-log4j12 and log4j-over-slf4j in classpath

    Solution: Exclude the log4j- and slf4j-related dependencies

  3. Problem: Method not found in a fasterxml-related class

    Solution: Exclude the following modules in version 2.4.1:


    and explicitly use version 2.3.1 instead

  4. Problem: “org.apache.commons.math3.exception.MathIllegalArgumentException: insufficient data: only 10 rows and 1 columns.” breaks Spark in a distributed environment, but the same job works locally.

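    Solution: This happens because Spark itself depends on math3 version “3.1.1”, so the math3 version found on the cluster classpath may differ from the one the application was built against.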

  5. Problem: Exception in thread “delete Spark temp dirs” java.lang.NoClassDefFoundError: Could not initialize class org.apache.log4j.LogManager

    Solution: TODO
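
For the logging conflict in item 2, a Maven exclusion might look roughly like the fragment below. This is only a sketch: the enclosing dependency (`com.example:some-library`, version `1.0.0`) is a hypothetical placeholder, and choosing to exclude `log4j-over-slf4j` rather than `slf4j-log4j12` is an assumption, not something this project prescribes. Running `mvn dependency:tree` shows which dependency actually brings each artifact in.

```xml
<!-- Hypothetical dependency: replace groupId/artifactId/version with the
     library that actually pulls in the conflicting logging artifact. -->
<dependency>
  <groupId>com.example</groupId>
  <artifactId>some-library</artifactId>
  <version>1.0.0</version>
  <exclusions>
    <!-- Assumption: slf4j-log4j12 stays and log4j-over-slf4j is dropped.
         Depending on the logging backend in use, excluding slf4j-log4j12
         instead may be the right choice. -->
    <exclusion>
      <groupId>org.slf4j</groupId>
      <artifactId>log4j-over-slf4j</artifactId>
    </exclusion>
  </exclusions>
</dependency>
```

Only one of the two artifacts should remain on the classpath; they route logging in opposite directions, which is why SLF4J refuses to start when it detects both.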