Merge pull request #1159 from dcslin/feature/train_multiprocess

Add multiprocess training implementations for the cnn ms example