
Join the chat at https://gitter.im/gitterHQ/gitter

Marvin Toolbox v0.0.3

Quick Start

Review

Marvin is an open-source Artificial Intelligence platform that focuses on helping data scientists deliver meaningful solutions to complex problems. Supported by a standardized large-scale, language-agnostic architecture, Marvin simplifies the process of exploration and modeling.

Getting Started

Installing Marvin on Ubuntu and macOS

Perform the following steps to install the Marvin Toolbox:

  1. libsasl2-dev, python-pip, and Graphviz installation
Ubuntu: 
sudo apt-get install libsasl2-dev python-pip graphviz -y

MacOS: 
sudo easy_install pip
brew install openssl graphviz
  1. VirtualEnvWrapper Installation
sudo pip install --upgrade pip
sudo pip install virtualenvwrapper --ignore-installed six
  1. Spark installation
curl https://d3kbcqa49mib13.cloudfront.net/spark-2.1.1-bin-hadoop2.6.tgz -o /tmp/spark-2.1.1-bin-hadoop2.6.tgz

sudo tar -xf /tmp/spark-2.1.1-bin-hadoop2.6.tgz -C /opt/
sudo ln -s /opt/spark-2.1.1-bin-hadoop2.6 /opt/spark

echo "export SPARK_HOME=/opt/spark" >> $HOME/.bash_profile

If the /opt directory does not exist, create it before unpacking Spark:

sudo mkdir /opt
  1. Set environment variables
echo "export WORKON_HOME=$HOME/.virtualenvs" >> $HOME/.bash_profile
echo "export MARVIN_HOME=$HOME/marvin" >> $HOME/.bash_profile
echo "export MARVIN_DATA_PATH=$HOME/marvin/data" >> $HOME/.bash_profile
echo "source virtualenvwrapper.sh" >> $HOME/.bash_profile

source ~/.bash_profile
  1. Clone and install python-toolbox
mkdir $MARVIN_HOME
mkdir $MARVIN_DATA_PATH
cd $MARVIN_HOME

git clone https://github.com/marvin-ai/marvin-python-toolbox.git
cd marvin-python-toolbox

mkvirtualenv python-toolbox-env
setvirtualenvproject

make marvin
  1. Test the installation
marvin test
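
Before running the tests, it can help to confirm that the variables exported in the steps above are visible in the current shell. The following is a minimal sketch; `check_marvin_env` is a hypothetical helper name and is not part of the Marvin toolbox. It only inspects the four variables named in this guide.

```shell
# Sketch only: check_marvin_env is a hypothetical helper, not a Marvin command.
# It reports whether each variable exported in the steps above is set.
check_marvin_env() {
    rc=0
    for name in SPARK_HOME WORKON_HOME MARVIN_HOME MARVIN_DATA_PATH; do
        # Indirect lookup through eval keeps this usable in plain POSIX sh
        eval "value=\${$name:-}"
        if [ -n "$value" ]; then
            echo "ok: $name=$value"
        else
            echo "missing: $name"
            rc=1
        fi
    done
    # Non-zero exit status when at least one variable is unset or empty
    return "$rc"
}
```

Calling `check_marvin_env` after `source ~/.bash_profile` prints one line per variable. A `missing:` line suggests that the corresponding `echo ... >> $HOME/.bash_profile` step was skipped or that the profile has not been reloaded.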

Installing Marvin on Other Operating Systems

Perform the following steps to install Marvin Toolbox using Vagrant:

  1. Install requirements
  1. Clone the repository and start provisioning
git clone https://github.com/marvin-ai/marvin-vagrant-dev.git
cd marvin-vagrant-dev
  1. Prepare the dev (engine creation) box
vagrant up dev
vagrant ssh dev

Wait for the provisioning process to finish, then access the dev box with the vagrant ssh command and follow the interactive configuration script.

  1. The Marvin source projects will be located in your home folder. To compile and use the Marvin toolbox, run:
workon python-toolbox-env
make marvin

Creating a new engine

  1. To create a new engine
workon python-toolbox-env
marvin engine-generate

Respond to the prompts and wait for the engine environment preparation to complete. If you are using Vagrant, remember to start the dev box first.

  1. Test the new engine
workon <new_engine_name>-env
marvin test
  1. For more information
marvin --help

Working in an existing engine

  1. Activate the virtualenv and go to the engine's path
workon <engine_name>-env
  1. Test your engine
marvin test
  1. Bring up the notebook and access it from your browser
marvin notebook

Command line interface

Usage: marvin [OPTIONS] COMMAND [ARGS]

Options:

  --debug       #Enable debug mode.
  --version     #Show the version and exit.
  --help        #Show this command line interface and exit.

Commands:

  engine-generate     #Generate a new marvin engine project.
  engine-generateenv  #Generate a new marvin engine environment.
  engine-grpcserver   #Marvin gRPC engine action server starts.
  engine-httpserver   #Marvin http api server starts.
  hive-dataimport     #Import data samples from a hive database to the hive running in this toolbox.
  hive-generateconf   #Generate default configuration file.
  hive-resetremote    #Drop all remote tables from the specified engine on host.
  notebook            #Start the Jupyter notebook server.
  pkg-bumpversion     #Bump the package version.
  pkg-createtag       #Create git tag using the package version.
  pkg-showchanges     #Show the package changelog.
  pkg-showinfo        #Show information about the package.
  pkg-showversion     #Show the package version.
  pkg-updatedeps      #Update requirements.txt.
  test                #Run tests.
  test-checkpep8      #Check python code style.
  test-tdd            #Watch for changes to run tests automatically.
  test-tox            #Run tests using a new virtualenv.
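
The `marvin` command is only available while a toolbox or engine virtualenv is active. As a small illustration, the sketch below wraps the CLI so that a missing command produces a clearer hint. `run_marvin` is a hypothetical wrapper name and is not part of the Marvin CLI; it passes its arguments through unchanged.

```shell
# Sketch only: run_marvin is a hypothetical wrapper, not part of the Marvin CLI.
run_marvin() {
    # command -v is a POSIX builtin check that avoids running marvin itself
    if ! command -v marvin >/dev/null 2>&1; then
        echo "marvin not found on PATH; run 'workon <engine_name>-env' first" >&2
        return 127
    fi
    # Forward every argument exactly as given
    marvin "$@"
}
```

For example, `run_marvin --version` or `run_marvin test` behaves like the plain command once the virtualenv is active, and returns status 127 with a hint otherwise.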

Running an example engine

  1. Clone the example engine from the repository
git clone https://github.com/marvin-ai/engines.git
  1. Generate a new Marvin engine environment for the Iris species engine
workon python-toolbox-env
marvin engine-generateenv ../engines/iris-species-engine/
  1. Run the Iris species engine
workon iris-species-engine-env
marvin engine-dryrun 
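
In the steps above, the virtualenv for the engine in `iris-species-engine/` is called `iris-species-engine-env`. Assuming that this pattern (engine directory name plus `-env`) holds for other engines too, the name can be derived from the engine path. `engine_env_name` below is a hypothetical helper written for this illustration, not a Marvin command.

```shell
# Sketch only: engine_env_name is a hypothetical helper, not a Marvin command.
# Assumption: the virtualenv is named after the engine directory plus "-env".
engine_env_name() {
    # basename drops the trailing slash, so "../engines/iris-species-engine/" works
    printf '%s-env\n' "$(basename "$1")"
}

engine_env_name ../engines/iris-species-engine/   # prints iris-species-engine-env
```

With virtualenvwrapper loaded, this could be combined with the existing step as `workon "$(engine_env_name ../engines/iris-species-engine/)"`.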

Marvin started as a project at the B2W Digital offices and was released as open source in September 2017.