blob: 61929e96902d4968429c037fe51608984933fb3e [file] [log] [blame]
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Load images into table\n",
"\n",
"This demonstrates different ways to load images into a database table.\n",
"\n",
"We use the script called <em>madlib_image_loader.py</em> located at https://github.com/apache/madlib-site/tree/asf-site/community-artifacts/Deep-learning which uses the Python Imaging Library so supports multiple formats http://www.pythonware.com/products/pil/\n",
"\n",
"## Table of contents\n",
"\n",
"<a href=\"#setup\">1. Setup image loader</a>\n",
"\n",
"<a href=\"#fetch_numpy\">2. Fetch images then load NumPy array into table</a>\n",
"\n",
"<a href=\"#file_system\">3. Load from file system into table</a>\n",
"\n",
"<a href=\"#fetch_numpy_label\">4. Fetch images then load NumPy array into table with label datatype as INT</a>"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"/Users/fmcquillan/anaconda/lib/python2.7/site-packages/IPython/config.py:13: ShimWarning: The `IPython.config` package has been deprecated since IPython 4.0. You should import from traitlets.config instead.\n",
" \"You should import from traitlets.config instead.\", ShimWarning)\n",
"/Users/fmcquillan/anaconda/lib/python2.7/site-packages/IPython/utils/traitlets.py:5: UserWarning: IPython.utils.traitlets has moved to a top-level traitlets package.\n",
" warn(\"IPython.utils.traitlets has moved to a top-level traitlets package.\")\n"
]
}
],
"source": [
"%load_ext sql"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"u'Connected: fmcquillan@madlib'"
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Greenplum Database 5.x on GCP for deep learning (PM demo machine)\n",
"#%sql postgresql://gpadmin@35.239.240.26:5432/madlib\n",
" \n",
"# PostgreSQL local\n",
"%sql postgresql://fmcquillan@localhost:5432/madlib"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"1 rows affected.\n"
]
},
{
"data": {
"text/html": [
"<table>\n",
" <tr>\n",
" <th>version</th>\n",
" </tr>\n",
" <tr>\n",
" <td>MADlib version: 1.16, git revision: rc/1.16-rc1, cmake configuration time: Mon Jul 1 17:45:09 UTC 2019, build type: Release, build system: Darwin-16.7.0, C compiler: Clang, C++ compiler: Clang</td>\n",
" </tr>\n",
"</table>"
],
"text/plain": [
"[(u'MADlib version: 1.16, git revision: rc/1.16-rc1, cmake configuration time: Mon Jul 1 17:45:09 UTC 2019, build type: Release, build system: Darwin-16.7.0, C compiler: Clang, C++ compiler: Clang',)]"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%sql select madlib.version();\n",
"#%sql select version();"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<a id=\"setup\"></a>\n",
"# 1. Set up image loader\n",
"\n",
"We use the script called <em>madlib_image_loader.py</em> located at https://github.com/apache/madlib-site/tree/asf-site/community-artifacts/Deep-learning"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"Using TensorFlow backend.\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Couldn't import dot_parser, loading of dot files will not be possible.\n"
]
}
],
"source": [
"import sys\n",
"import os\n",
"from keras.datasets import cifar10\n",
"\n",
"madlib_site_dir = '/Users/fmcquillan/Documents/Product/MADlib/Demos/data'\n",
"sys.path.append(madlib_site_dir)\n",
"\n",
"# Import image loader module\n",
"from madlib_image_loader import ImageLoader, DbCredentials\n",
"\n",
"# Specify database credentials, for connecting to db\n",
"#db_creds = DbCredentials(user='gpadmin',\n",
"# host='35.239.240.26',\n",
"# port='5432',\n",
"# password='')\n",
"\n",
"# Specify database credentials, for connecting to db\n",
"db_creds = DbCredentials(user='fmcquillan',\n",
" host='localhost',\n",
" port='5432',\n",
" password='')\n",
"\n",
"# Initialize ImageLoader (increase num_workers to run faster)\n",
"iloader = ImageLoader(num_workers=5, db_creds=db_creds)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<a id=\"fetch_numpy\"></a>\n",
"# 2. Fetch images then load NumPy array into table\n",
"\n",
"<em>iloader.load_dataset_from_np(data_x, data_y, table_name, append=False)</em>\n",
"\n",
"- <em>data_x</em> contains image data in np.array format\n",
"\n",
"\n",
"- <em>data_y</em> is a 1D np.array of the image categories (labels).\n",
"\n",
"\n",
"- If the user passes a <em>table_name</em> while creating ImageLoader object, it will be used for all further calls to load_dataset_from_np. It can be changed by passing it as a parameter during the actual call to load_dataset_from_np, and if so future calls will load to that table name instead. This avoids needing to pass the table_name again every time, but also allows it to be changed at any time."
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [],
"source": [
"# Load dataset into np array\n",
"(x_train, y_train), (x_test, y_test) = cifar10.load_data()"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Done.\n",
"MainProcess: Connected to madlib db.\n",
"Executing: CREATE TABLE cifar_10_train_data (id SERIAL, x REAL[], y TEXT)\n",
"CREATE TABLE\n",
"Created table cifar_10_train_data in madlib db\n",
"Spawning 5 workers...\n",
"Initializing PoolWorker-1 [pid 82412]\n",
"PoolWorker-1: Created temporary directory /tmp/madlib_Bt85aChbv0\n",
"Initializing PoolWorker-2 [pid 82413]\n",
"PoolWorker-2: Created temporary directory /tmp/madlib_cSyCSiEhHT\n",
"Initializing PoolWorker-3 [pid 82414]\n",
"PoolWorker-3: Created temporary directory /tmp/madlib_uvtHjGCU5S\n",
"PoolWorker-1: Connected to madlib db.\n",
"Initializing PoolWorker-4 [pid 82415]\n",
"PoolWorker-4: Created temporary directory /tmp/madlib_eJmkoDZTr8\n",
"PoolWorker-2: Connected to madlib db.\n",
"Initializing PoolWorker-5 [pid 82417]\n",
"PoolWorker-5: Created temporary directory /tmp/madlib_websbk05x2\n",
"PoolWorker-3: Connected to madlib db.\n",
"PoolWorker-4: Connected to madlib db.\n",
"PoolWorker-5: Connected to madlib db.\n",
"PoolWorker-1: Wrote 1000 images to /tmp/madlib_Bt85aChbv0/cifar_10_train_data0000.tmp\n",
"PoolWorker-2: Wrote 1000 images to /tmp/madlib_cSyCSiEhHT/cifar_10_train_data0000.tmp\n",
"PoolWorker-3: Wrote 1000 images to /tmp/madlib_uvtHjGCU5S/cifar_10_train_data0000.tmp\n",
"PoolWorker-4: Wrote 1000 images to /tmp/madlib_eJmkoDZTr8/cifar_10_train_data0000.tmp\n",
"PoolWorker-5: Wrote 1000 images to /tmp/madlib_websbk05x2/cifar_10_train_data0000.tmp\n",
"PoolWorker-1: Loaded 1000 images into cifar_10_train_data\n",
"PoolWorker-2: Loaded 1000 images into cifar_10_train_data\n",
"PoolWorker-3: Loaded 1000 images into cifar_10_train_data\n",
"PoolWorker-4: Loaded 1000 images into cifar_10_train_data\n",
"PoolWorker-5: Loaded 1000 images into cifar_10_train_data\n",
"PoolWorker-1: Wrote 1000 images to /tmp/madlib_Bt85aChbv0/cifar_10_train_data0001.tmp\n",
"PoolWorker-2: Wrote 1000 images to /tmp/madlib_cSyCSiEhHT/cifar_10_train_data0001.tmp\n",
"PoolWorker-3: Wrote 1000 images to /tmp/madlib_uvtHjGCU5S/cifar_10_train_data0001.tmp\n",
"PoolWorker-4: Wrote 1000 images to /tmp/madlib_eJmkoDZTr8/cifar_10_train_data0001.tmp\n",
"PoolWorker-5: Wrote 1000 images to /tmp/madlib_websbk05x2/cifar_10_train_data0001.tmp\n",
"PoolWorker-1: Loaded 1000 images into cifar_10_train_data\n",
"PoolWorker-2: Loaded 1000 images into cifar_10_train_data\n",
"PoolWorker-3: Loaded 1000 images into cifar_10_train_data\n",
"PoolWorker-4: Loaded 1000 images into cifar_10_train_data\n",
"PoolWorker-5: Loaded 1000 images into cifar_10_train_data\n",
"PoolWorker-1: Wrote 1000 images to /tmp/madlib_Bt85aChbv0/cifar_10_train_data0002.tmp\n",
"PoolWorker-2: Wrote 1000 images to /tmp/madlib_cSyCSiEhHT/cifar_10_train_data0002.tmp\n",
"PoolWorker-3: Wrote 1000 images to /tmp/madlib_uvtHjGCU5S/cifar_10_train_data0002.tmp\n",
"PoolWorker-4: Wrote 1000 images to /tmp/madlib_eJmkoDZTr8/cifar_10_train_data0002.tmp\n",
"PoolWorker-5: Wrote 1000 images to /tmp/madlib_websbk05x2/cifar_10_train_data0002.tmp\n",
"PoolWorker-1: Loaded 1000 images into cifar_10_train_data\n",
"PoolWorker-2: Loaded 1000 images into cifar_10_train_data\n",
"PoolWorker-3: Loaded 1000 images into cifar_10_train_data\n",
"PoolWorker-4: Loaded 1000 images into cifar_10_train_data\n",
"PoolWorker-5: Loaded 1000 images into cifar_10_train_data\n",
"PoolWorker-1: Wrote 1000 images to /tmp/madlib_Bt85aChbv0/cifar_10_train_data0003.tmp\n",
"PoolWorker-2: Wrote 1000 images to /tmp/madlib_cSyCSiEhHT/cifar_10_train_data0003.tmp\n",
"PoolWorker-3: Wrote 1000 images to /tmp/madlib_uvtHjGCU5S/cifar_10_train_data0003.tmp\n",
"PoolWorker-4: Wrote 1000 images to /tmp/madlib_eJmkoDZTr8/cifar_10_train_data0003.tmp\n",
"PoolWorker-5: Wrote 1000 images to /tmp/madlib_websbk05x2/cifar_10_train_data0003.tmp\n",
"PoolWorker-1: Loaded 1000 images into cifar_10_train_data\n",
"PoolWorker-2: Loaded 1000 images into cifar_10_train_data\n",
"PoolWorker-3: Loaded 1000 images into cifar_10_train_data\n",
"PoolWorker-4: Loaded 1000 images into cifar_10_train_data\n",
"PoolWorker-5: Loaded 1000 images into cifar_10_train_data\n",
"PoolWorker-1: Wrote 1000 images to /tmp/madlib_Bt85aChbv0/cifar_10_train_data0004.tmp\n",
"PoolWorker-2: Wrote 1000 images to /tmp/madlib_cSyCSiEhHT/cifar_10_train_data0004.tmp\n",
"PoolWorker-3: Wrote 1000 images to /tmp/madlib_uvtHjGCU5S/cifar_10_train_data0004.tmp\n",
"PoolWorker-4: Wrote 1000 images to /tmp/madlib_eJmkoDZTr8/cifar_10_train_data0004.tmp\n",
"PoolWorker-5: Wrote 1000 images to /tmp/madlib_websbk05x2/cifar_10_train_data0004.tmp\n",
"PoolWorker-1: Loaded 1000 images into cifar_10_train_data\n",
"PoolWorker-2: Loaded 1000 images into cifar_10_train_data\n",
"PoolWorker-3: Loaded 1000 images into cifar_10_train_data\n",
"PoolWorker-4: Loaded 1000 images into cifar_10_train_data\n",
"PoolWorker-5: Loaded 1000 images into cifar_10_train_data\n",
"PoolWorker-1: Wrote 1000 images to /tmp/madlib_Bt85aChbv0/cifar_10_train_data0005.tmp\n",
"PoolWorker-2: Wrote 1000 images to /tmp/madlib_cSyCSiEhHT/cifar_10_train_data0005.tmp\n",
"PoolWorker-3: Wrote 1000 images to /tmp/madlib_uvtHjGCU5S/cifar_10_train_data0005.tmp\n",
"PoolWorker-4: Wrote 1000 images to /tmp/madlib_eJmkoDZTr8/cifar_10_train_data0005.tmp\n",
"PoolWorker-5: Wrote 1000 images to /tmp/madlib_websbk05x2/cifar_10_train_data0005.tmp\n",
"PoolWorker-1: Loaded 1000 images into cifar_10_train_data\n",
"PoolWorker-2: Loaded 1000 images into cifar_10_train_data\n",
"PoolWorker-3: Loaded 1000 images into cifar_10_train_data\n",
"PoolWorker-4: Loaded 1000 images into cifar_10_train_data\n",
"PoolWorker-5: Loaded 1000 images into cifar_10_train_data\n",
"PoolWorker-1: Wrote 1000 images to /tmp/madlib_Bt85aChbv0/cifar_10_train_data0006.tmp\n",
"PoolWorker-2: Wrote 1000 images to /tmp/madlib_cSyCSiEhHT/cifar_10_train_data0006.tmp\n",
"PoolWorker-3: Wrote 1000 images to /tmp/madlib_uvtHjGCU5S/cifar_10_train_data0006.tmp\n",
"PoolWorker-4: Wrote 1000 images to /tmp/madlib_eJmkoDZTr8/cifar_10_train_data0006.tmp\n",
"PoolWorker-5: Wrote 1000 images to /tmp/madlib_websbk05x2/cifar_10_train_data0006.tmp\n",
"PoolWorker-1: Loaded 1000 images into cifar_10_train_data\n",
"PoolWorker-2: Loaded 1000 images into cifar_10_train_data\n",
"PoolWorker-3: Loaded 1000 images into cifar_10_train_data\n",
"PoolWorker-4: Loaded 1000 images into cifar_10_train_data\n",
"PoolWorker-5: Loaded 1000 images into cifar_10_train_data\n",
"PoolWorker-1: Wrote 1000 images to /tmp/madlib_Bt85aChbv0/cifar_10_train_data0007.tmp\n",
"PoolWorker-2: Wrote 1000 images to /tmp/madlib_cSyCSiEhHT/cifar_10_train_data0007.tmp\n",
"PoolWorker-3: Wrote 1000 images to /tmp/madlib_uvtHjGCU5S/cifar_10_train_data0007.tmp\n",
"PoolWorker-4: Wrote 1000 images to /tmp/madlib_eJmkoDZTr8/cifar_10_train_data0007.tmp\n",
"PoolWorker-5: Wrote 1000 images to /tmp/madlib_websbk05x2/cifar_10_train_data0007.tmp\n",
"PoolWorker-1: Loaded 1000 images into cifar_10_train_data\n",
"PoolWorker-2: Loaded 1000 images into cifar_10_train_data\n",
"PoolWorker-3: Loaded 1000 images into cifar_10_train_data\n",
"PoolWorker-4: Loaded 1000 images into cifar_10_train_data\n",
"PoolWorker-5: Loaded 1000 images into cifar_10_train_data\n",
"PoolWorker-1: Wrote 1000 images to /tmp/madlib_Bt85aChbv0/cifar_10_train_data0008.tmp\n",
"PoolWorker-2: Wrote 1000 images to /tmp/madlib_cSyCSiEhHT/cifar_10_train_data0008.tmp\n",
"PoolWorker-3: Wrote 1000 images to /tmp/madlib_uvtHjGCU5S/cifar_10_train_data0008.tmp\n",
"PoolWorker-4: Wrote 1000 images to /tmp/madlib_eJmkoDZTr8/cifar_10_train_data0008.tmp\n",
"PoolWorker-5: Wrote 1000 images to /tmp/madlib_websbk05x2/cifar_10_train_data0008.tmp\n",
"PoolWorker-1: Loaded 1000 images into cifar_10_train_data\n",
"PoolWorker-2: Loaded 1000 images into cifar_10_train_data\n",
"PoolWorker-3: Loaded 1000 images into cifar_10_train_data\n",
"PoolWorker-4: Loaded 1000 images into cifar_10_train_data\n",
"PoolWorker-5: Loaded 1000 images into cifar_10_train_data\n",
"PoolWorker-1: Wrote 1000 images to /tmp/madlib_Bt85aChbv0/cifar_10_train_data0009.tmp\n",
"PoolWorker-2: Wrote 1000 images to /tmp/madlib_cSyCSiEhHT/cifar_10_train_data0009.tmp\n",
"PoolWorker-1: Loaded 1000 images into cifar_10_train_data\n",
"PoolWorker-2: Loaded 1000 images into cifar_10_train_data\n",
"PoolWorker-1: Wrote 1000 images to /tmp/madlib_Bt85aChbv0/cifar_10_train_data0010.tmp\n",
"PoolWorker-2: Wrote 1000 images to /tmp/madlib_cSyCSiEhHT/cifar_10_train_data0010.tmp\n",
"PoolWorker-1: Loaded 1000 images into cifar_10_train_data\n",
"PoolWorker-2: Loaded 1000 images into cifar_10_train_data\n",
"PoolWorker-1: Wrote 1000 images to /tmp/madlib_Bt85aChbv0/cifar_10_train_data0011.tmp\n",
"PoolWorker-1: Loaded 1000 images into cifar_10_train_data\n",
"PoolWorker-2: Removed temporary directory /tmp/madlib_cSyCSiEhHT\n",
"PoolWorker-3: Removed temporary directory /tmp/madlib_uvtHjGCU5S\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"PoolWorker-5: Removed temporary directory /tmp/madlib_websbk05x2\n",
"PoolWorker-4: Removed temporary directory /tmp/madlib_eJmkoDZTr8\n",
"PoolWorker-1: Removed temporary directory /tmp/madlib_Bt85aChbv0\n",
"Done! Loaded 50000 images in 24.2222080231s\n",
"5 workers terminated.\n",
"MainProcess: Connected to madlib db.\n",
"Executing: CREATE TABLE cifar_10_test_data (id SERIAL, x REAL[], y TEXT)\n",
"CREATE TABLE\n",
"Created table cifar_10_test_data in madlib db\n",
"Spawning 5 workers...\n",
"Initializing PoolWorker-6 [pid 82423]\n",
"PoolWorker-6: Created temporary directory /tmp/madlib_e615zVgkaE\n",
"Initializing PoolWorker-7 [pid 82424]\n",
"PoolWorker-7: Created temporary directory /tmp/madlib_iRi2oMNIFA\n",
"Initializing PoolWorker-8 [pid 82425]\n",
"PoolWorker-8: Created temporary directory /tmp/madlib_kkSktVCq3n\n",
"PoolWorker-6: Connected to madlib db.\n",
"Initializing PoolWorker-9 [pid 82426]\n",
"PoolWorker-7: Connected to madlib db.\n",
"PoolWorker-9: Created temporary directory /tmp/madlib_0To3XX96yI\n",
"Initializing PoolWorker-10 [pid 82428]\n",
"PoolWorker-8: Connected to madlib db.\n",
"PoolWorker-10: Created temporary directory /tmp/madlib_8zwK04IJsc\n",
"PoolWorker-9: Connected to madlib db.\n",
"PoolWorker-10: Connected to madlib db.\n",
"PoolWorker-6: Wrote 1000 images to /tmp/madlib_e615zVgkaE/cifar_10_test_data0000.tmp\n",
"PoolWorker-7: Wrote 1000 images to /tmp/madlib_iRi2oMNIFA/cifar_10_test_data0000.tmp\n",
"PoolWorker-8: Wrote 1000 images to /tmp/madlib_kkSktVCq3n/cifar_10_test_data0000.tmp\n",
"PoolWorker-9: Wrote 1000 images to /tmp/madlib_0To3XX96yI/cifar_10_test_data0000.tmp\n",
"PoolWorker-10: Wrote 1000 images to /tmp/madlib_8zwK04IJsc/cifar_10_test_data0000.tmp\n",
"PoolWorker-6: Loaded 1000 images into cifar_10_test_data\n",
"PoolWorker-7: Loaded 1000 images into cifar_10_test_data\n",
"PoolWorker-8: Loaded 1000 images into cifar_10_test_data\n",
"PoolWorker-10: Loaded 1000 images into cifar_10_test_data\n",
"PoolWorker-9: Loaded 1000 images into cifar_10_test_data\n",
"PoolWorker-6: Wrote 1000 images to /tmp/madlib_e615zVgkaE/cifar_10_test_data0001.tmp\n",
"PoolWorker-7: Wrote 1000 images to /tmp/madlib_iRi2oMNIFA/cifar_10_test_data0001.tmp\n",
"PoolWorker-8: Wrote 1000 images to /tmp/madlib_kkSktVCq3n/cifar_10_test_data0001.tmp\n",
"PoolWorker-9: Wrote 1000 images to /tmp/madlib_0To3XX96yI/cifar_10_test_data0001.tmp\n",
"PoolWorker-10: Wrote 1000 images to /tmp/madlib_8zwK04IJsc/cifar_10_test_data0001.tmp\n",
"PoolWorker-6: Loaded 1000 images into cifar_10_test_data\n",
"PoolWorker-7: Loaded 1000 images into cifar_10_test_data\n",
"PoolWorker-8: Loaded 1000 images into cifar_10_test_data\n",
"PoolWorker-9: Loaded 1000 images into cifar_10_test_data\n",
"PoolWorker-10: Loaded 1000 images into cifar_10_test_data\n",
"PoolWorker-10: Removed temporary directory /tmp/madlib_8zwK04IJsc\n",
"PoolWorker-8: Removed temporary directory /tmp/madlib_kkSktVCq3n\n",
"PoolWorker-7: Removed temporary directory /tmp/madlib_iRi2oMNIFA\n",
"PoolWorker-6: Removed temporary directory /tmp/madlib_e615zVgkaE\n",
"PoolWorker-9: Removed temporary directory /tmp/madlib_0To3XX96yI\n",
"Done! Loaded 10000 images in 4.6932258606s\n",
"5 workers terminated.\n"
]
}
],
"source": [
"%sql DROP TABLE IF EXISTS cifar_10_train_data, cifar_10_test_data;\n",
"\n",
"# Save images to temporary directories and load into database\n",
"iloader.load_dataset_from_np(x_train, y_train, 'cifar_10_train_data', append=False)\n",
"iloader.load_dataset_from_np(x_test, y_test, 'cifar_10_test_data', append=False)"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"1 rows affected.\n"
]
},
{
"data": {
"text/html": [
"<table>\n",
" <tr>\n",
" <th>count</th>\n",
" </tr>\n",
" <tr>\n",
" <td>50000</td>\n",
" </tr>\n",
"</table>"
],
"text/plain": [
"[(50000L,)]"
]
},
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%%sql\n",
"SELECT COUNT(*) FROM cifar_10_train_data;"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"1 rows affected.\n"
]
},
{
"data": {
"text/html": [
"<table>\n",
" <tr>\n",
" <th>count</th>\n",
" </tr>\n",
" <tr>\n",
" <td>10000</td>\n",
" </tr>\n",
"</table>"
],
"text/plain": [
"[(10000L,)]"
]
},
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%%sql\n",
"SELECT COUNT(*) FROM cifar_10_test_data;"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<a id=\"file_system\"></a>\n",
"# 3. Load from file system\n",
"\n",
"Uses the Python Imaging Library so supports multiple formats\n",
"http://www.pythonware.com/products/pil/\n",
"\n",
"<em>load_dataset_from_disk(root_dir, table_name, num_labels='all', append=False)</em>\n",
"\n",
"- Calling this function will look in <em>root_dir</em> on the local disk of wherever this is being run. It will skip over any files in that directory, but will load images contained in each of its subdirectories. The images should be organized by category/class, where the name of each subdirectory is the label for the images contained within it.\n",
"\n",
"\n",
"- The <em>table_name</em> and <em>append</em> parameters are the same as above The parameter <em>num_labels</em> is an optional parameter which can be used to restrict the number of labels (image classes) loaded, even if more are found in <em>root_dir</em>. For example, for a large dataset you may have hundreds of labels, but only wish to use a subset of that containing a few dozen.\n",
"\n",
"For example, if we put the CIFAR-10 training data is in 10 subdirectories under directory <em>cifar10</em>, with one subdirectory for each class:"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Done.\n",
"MainProcess: Connected to madlib db.\n",
"Executing: CREATE TABLE cifar_10_train_data_filesystem (id SERIAL, x REAL[], y TEXT, img_name TEXT)\n",
"CREATE TABLE\n",
"Created table cifar_10_train_data_filesystem in madlib db\n",
".DS_Store is not a directory, skipping\n",
"number of labels = 10\n",
"Found 10 image labels in /Users/fmcquillan/tmp/cifar10\n",
"Spawning 5 workers...\n",
"Initializing PoolWorker-11 [pid 82438]\n",
"PoolWorker-11: Created temporary directory /tmp/madlib_aEC1lF2HqL\n",
"Initializing PoolWorker-12 [pid 82439]\n",
"PoolWorker-12: Created temporary directory /tmp/madlib_70qpwFzzqW\n",
"Initializing PoolWorker-13 [pid 82440]\n",
"PoolWorker-13: Created temporary directory /tmp/madlib_r2u4Zo5bPt\n",
"PoolWorker-11: Connected to madlib db.\n",
"Initializing PoolWorker-14 [pid 82441]\n",
"PoolWorker-12: Connected to madlib db.\n",
"PoolWorker-14: Created temporary directory /tmp/madlib_aTPESoNjVi\n",
"Initializing PoolWorker-15 [pid 82443]\n",
"PoolWorker-13: Connected to madlib db.\n",
"PoolWorker-15: Created temporary directory /tmp/madlib_rhVwjLTbWI\n",
"PoolWorker-14: Connected to madlib db.\n",
"PoolWorker-15: Connected to madlib db.\n",
"PoolWorker-13: Wrote 1000 images to /tmp/madlib_r2u4Zo5bPt/cifar_10_train_data_filesystem0000.tmp\n",
"PoolWorker-14: Wrote 1000 images to /tmp/madlib_aTPESoNjVi/cifar_10_train_data_filesystem0000.tmp\n",
"PoolWorker-12: Wrote 1000 images to /tmp/madlib_70qpwFzzqW/cifar_10_train_data_filesystem0000.tmp\n",
"PoolWorker-11: Wrote 1000 images to /tmp/madlib_aEC1lF2HqL/cifar_10_train_data_filesystem0000.tmp\n",
"PoolWorker-15: Wrote 1000 images to /tmp/madlib_rhVwjLTbWI/cifar_10_train_data_filesystem0000.tmp\n",
"PoolWorker-13: Loaded 1000 images into cifar_10_train_data_filesystem\n",
"PoolWorker-14: Loaded 1000 images into cifar_10_train_data_filesystem\n",
"PoolWorker-11: Loaded 1000 images into cifar_10_train_data_filesystem\n",
"PoolWorker-12: Loaded 1000 images into cifar_10_train_data_filesystem\n",
"PoolWorker-15: Loaded 1000 images into cifar_10_train_data_filesystem\n",
"PoolWorker-14: Wrote 1000 images to /tmp/madlib_aTPESoNjVi/cifar_10_train_data_filesystem0001.tmp\n",
"PoolWorker-11: Wrote 1000 images to /tmp/madlib_aEC1lF2HqL/cifar_10_train_data_filesystem0001.tmp\n",
"PoolWorker-13: Wrote 1000 images to /tmp/madlib_r2u4Zo5bPt/cifar_10_train_data_filesystem0001.tmp\n",
"PoolWorker-15: Wrote 1000 images to /tmp/madlib_rhVwjLTbWI/cifar_10_train_data_filesystem0001.tmp\n",
"PoolWorker-12: Wrote 1000 images to /tmp/madlib_70qpwFzzqW/cifar_10_train_data_filesystem0001.tmp\n",
"PoolWorker-14: Loaded 1000 images into cifar_10_train_data_filesystem\n",
"PoolWorker-11: Loaded 1000 images into cifar_10_train_data_filesystem\n",
"PoolWorker-13: Loaded 1000 images into cifar_10_train_data_filesystem\n",
"PoolWorker-15: Loaded 1000 images into cifar_10_train_data_filesystem\n",
"PoolWorker-12: Loaded 1000 images into cifar_10_train_data_filesystem\n",
"PoolWorker-14: Wrote 1000 images to /tmp/madlib_aTPESoNjVi/cifar_10_train_data_filesystem0002.tmp\n",
"PoolWorker-15: Wrote 1000 images to /tmp/madlib_rhVwjLTbWI/cifar_10_train_data_filesystem0002.tmp\n",
"PoolWorker-13: Wrote 1000 images to /tmp/madlib_r2u4Zo5bPt/cifar_10_train_data_filesystem0002.tmp\n",
"PoolWorker-11: Wrote 1000 images to /tmp/madlib_aEC1lF2HqL/cifar_10_train_data_filesystem0002.tmp\n",
"PoolWorker-12: Wrote 1000 images to /tmp/madlib_70qpwFzzqW/cifar_10_train_data_filesystem0002.tmp\n",
"PoolWorker-14: Loaded 1000 images into cifar_10_train_data_filesystem\n",
"PoolWorker-11: Loaded 1000 images into cifar_10_train_data_filesystem\n",
"PoolWorker-15: Loaded 1000 images into cifar_10_train_data_filesystem\n",
"PoolWorker-13: Loaded 1000 images into cifar_10_train_data_filesystem\n",
"PoolWorker-12: Loaded 1000 images into cifar_10_train_data_filesystem\n",
"PoolWorker-15: Wrote 1000 images to /tmp/madlib_rhVwjLTbWI/cifar_10_train_data_filesystem0003.tmp\n",
"PoolWorker-14: Wrote 1000 images to /tmp/madlib_aTPESoNjVi/cifar_10_train_data_filesystem0003.tmp\n",
"PoolWorker-13: Wrote 1000 images to /tmp/madlib_r2u4Zo5bPt/cifar_10_train_data_filesystem0003.tmp\n",
"PoolWorker-12: Wrote 1000 images to /tmp/madlib_70qpwFzzqW/cifar_10_train_data_filesystem0003.tmp\n",
"PoolWorker-11: Wrote 1000 images to /tmp/madlib_aEC1lF2HqL/cifar_10_train_data_filesystem0003.tmp\n",
"PoolWorker-15: Loaded 1000 images into cifar_10_train_data_filesystem\n",
"PoolWorker-14: Loaded 1000 images into cifar_10_train_data_filesystem\n",
"PoolWorker-13: Loaded 1000 images into cifar_10_train_data_filesystem\n",
"PoolWorker-11: Loaded 1000 images into cifar_10_train_data_filesystem\n",
"PoolWorker-12: Loaded 1000 images into cifar_10_train_data_filesystem\n",
"PoolWorker-15: Wrote 1000 images to /tmp/madlib_rhVwjLTbWI/cifar_10_train_data_filesystem0004.tmp\n",
"PoolWorker-14: Wrote 1000 images to /tmp/madlib_aTPESoNjVi/cifar_10_train_data_filesystem0004.tmp\n",
"PoolWorker-12: Wrote 1000 images to /tmp/madlib_70qpwFzzqW/cifar_10_train_data_filesystem0004.tmp\n",
"PoolWorker-13: Wrote 1000 images to /tmp/madlib_r2u4Zo5bPt/cifar_10_train_data_filesystem0004.tmp\n",
"PoolWorker-11: Wrote 1000 images to /tmp/madlib_aEC1lF2HqL/cifar_10_train_data_filesystem0004.tmp\n",
"PoolWorker-14: Loaded 1000 images into cifar_10_train_data_filesystem\n",
"PoolWorker-15: Loaded 1000 images into cifar_10_train_data_filesystem\n",
"PoolWorker-12: Loaded 1000 images into cifar_10_train_data_filesystem\n",
"PoolWorker-11: Loaded 1000 images into cifar_10_train_data_filesystem\n",
"PoolWorker-13: Loaded 1000 images into cifar_10_train_data_filesystem\n",
"PoolWorker-15: Wrote 1000 images to /tmp/madlib_rhVwjLTbWI/cifar_10_train_data_filesystem0005.tmp\n",
"PoolWorker-14: Wrote 1000 images to /tmp/madlib_aTPESoNjVi/cifar_10_train_data_filesystem0005.tmp\n",
"PoolWorker-12: Wrote 1000 images to /tmp/madlib_70qpwFzzqW/cifar_10_train_data_filesystem0005.tmp\n",
"PoolWorker-11: Wrote 1000 images to /tmp/madlib_aEC1lF2HqL/cifar_10_train_data_filesystem0005.tmp\n",
"PoolWorker-13: Wrote 1000 images to /tmp/madlib_r2u4Zo5bPt/cifar_10_train_data_filesystem0005.tmp\n",
"PoolWorker-14: Loaded 1000 images into cifar_10_train_data_filesystem\n",
"PoolWorker-15: Loaded 1000 images into cifar_10_train_data_filesystem\n",
"PoolWorker-12: Loaded 1000 images into cifar_10_train_data_filesystem\n",
"PoolWorker-13: Loaded 1000 images into cifar_10_train_data_filesystem\n",
"PoolWorker-11: Loaded 1000 images into cifar_10_train_data_filesystem\n",
"PoolWorker-15: Wrote 1000 images to /tmp/madlib_rhVwjLTbWI/cifar_10_train_data_filesystem0006.tmp\n",
"PoolWorker-14: Wrote 1000 images to /tmp/madlib_aTPESoNjVi/cifar_10_train_data_filesystem0006.tmp\n",
"PoolWorker-12: Wrote 1000 images to /tmp/madlib_70qpwFzzqW/cifar_10_train_data_filesystem0006.tmp\n",
"PoolWorker-13: Wrote 1000 images to /tmp/madlib_r2u4Zo5bPt/cifar_10_train_data_filesystem0006.tmp\n",
"PoolWorker-11: Wrote 1000 images to /tmp/madlib_aEC1lF2HqL/cifar_10_train_data_filesystem0006.tmp\n",
"PoolWorker-15: Loaded 1000 images into cifar_10_train_data_filesystem\n",
"PoolWorker-14: Loaded 1000 images into cifar_10_train_data_filesystem\n",
"PoolWorker-12: Loaded 1000 images into cifar_10_train_data_filesystem\n",
"PoolWorker-11: Loaded 1000 images into cifar_10_train_data_filesystem\n",
"PoolWorker-13: Loaded 1000 images into cifar_10_train_data_filesystem\n",
"PoolWorker-15: Wrote 1000 images to /tmp/madlib_rhVwjLTbWI/cifar_10_train_data_filesystem0007.tmp\n",
"PoolWorker-14: Wrote 1000 images to /tmp/madlib_aTPESoNjVi/cifar_10_train_data_filesystem0007.tmp\n",
"PoolWorker-12: Wrote 1000 images to /tmp/madlib_70qpwFzzqW/cifar_10_train_data_filesystem0007.tmp\n",
"PoolWorker-13: Wrote 1000 images to /tmp/madlib_r2u4Zo5bPt/cifar_10_train_data_filesystem0007.tmp\n",
"PoolWorker-11: Wrote 1000 images to /tmp/madlib_aEC1lF2HqL/cifar_10_train_data_filesystem0007.tmp\n",
"PoolWorker-15: Loaded 1000 images into cifar_10_train_data_filesystem\n",
"PoolWorker-14: Loaded 1000 images into cifar_10_train_data_filesystem\n",
"PoolWorker-12: Loaded 1000 images into cifar_10_train_data_filesystem\n",
"PoolWorker-13: Loaded 1000 images into cifar_10_train_data_filesystem\n",
"PoolWorker-11: Loaded 1000 images into cifar_10_train_data_filesystem\n",
"PoolWorker-14: Wrote 1000 images to /tmp/madlib_aTPESoNjVi/cifar_10_train_data_filesystem0008.tmp\n",
"PoolWorker-15: Wrote 1000 images to /tmp/madlib_rhVwjLTbWI/cifar_10_train_data_filesystem0008.tmp\n",
"PoolWorker-12: Wrote 1000 images to /tmp/madlib_70qpwFzzqW/cifar_10_train_data_filesystem0008.tmp\n",
"PoolWorker-13: Wrote 1000 images to /tmp/madlib_r2u4Zo5bPt/cifar_10_train_data_filesystem0008.tmp\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"PoolWorker-11: Wrote 1000 images to /tmp/madlib_aEC1lF2HqL/cifar_10_train_data_filesystem0008.tmp\n",
"PoolWorker-14: Loaded 1000 images into cifar_10_train_data_filesystem\n",
"PoolWorker-15: Loaded 1000 images into cifar_10_train_data_filesystem\n",
"PoolWorker-12: Loaded 1000 images into cifar_10_train_data_filesystem\n",
"PoolWorker-13: Loaded 1000 images into cifar_10_train_data_filesystem\n",
"PoolWorker-11: Loaded 1000 images into cifar_10_train_data_filesystem\n",
"PoolWorker-14: Wrote 1000 images to /tmp/madlib_aTPESoNjVi/cifar_10_train_data_filesystem0009.tmp\n",
"PoolWorker-15: Wrote 1000 images to /tmp/madlib_rhVwjLTbWI/cifar_10_train_data_filesystem0009.tmp\n",
"PoolWorker-12: Wrote 1000 images to /tmp/madlib_70qpwFzzqW/cifar_10_train_data_filesystem0009.tmp\n",
"PoolWorker-13: Wrote 1000 images to /tmp/madlib_r2u4Zo5bPt/cifar_10_train_data_filesystem0009.tmp\n",
"PoolWorker-11: Wrote 1000 images to /tmp/madlib_aEC1lF2HqL/cifar_10_train_data_filesystem0009.tmp\n",
"PoolWorker-14: Loaded 1000 images into cifar_10_train_data_filesystem\n",
"PoolWorker-12: Loaded 1000 images into cifar_10_train_data_filesystem\n",
"PoolWorker-15: Loaded 1000 images into cifar_10_train_data_filesystem\n",
"PoolWorker-13: Loaded 1000 images into cifar_10_train_data_filesystem\n",
"PoolWorker-11: Loaded 1000 images into cifar_10_train_data_filesystem\n",
"PoolWorker-12: Removed temporary directory /tmp/madlib_70qpwFzzqW\n",
"PoolWorker-13: Removed temporary directory /tmp/madlib_r2u4Zo5bPt\n",
"PoolWorker-15: Removed temporary directory /tmp/madlib_rhVwjLTbWI\n",
"PoolWorker-11: Removed temporary directory /tmp/madlib_aEC1lF2HqL\n",
"PoolWorker-14: Removed temporary directory /tmp/madlib_aTPESoNjVi\n",
"Done! Loaded 10 image categories in 27.9927430153s\n",
"5 workers terminated.\n"
]
}
],
"source": [
"%sql drop table if exists cifar_10_train_data_filesystem;\n",
"# Load images from file system\n",
"iloader.load_dataset_from_disk('/Users/fmcquillan/tmp/cifar10', 'cifar_10_train_data_filesystem', num_labels='all', append=False)"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"1 rows affected.\n"
]
},
{
"data": {
"text/html": [
"<table>\n",
" <tr>\n",
" <th>count</th>\n",
" </tr>\n",
" <tr>\n",
" <td>50000</td>\n",
" </tr>\n",
"</table>"
],
"text/plain": [
"[(50000L,)]"
]
},
"execution_count": 15,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%%sql\n",
"SELECT COUNT(*) FROM cifar_10_train_data_filesystem;"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<a id=\"fetch_numpy_label\"></a>\n",
"# 4. Fetch images then load NumPy array into table with label datatype as INT\n",
"\n",
"<em>iloader.load_dataset_from_np(data_x, data_y, table_name, append=False, label_datatype= 'int')</em>\n",
"\n",
"- <em>data_x</em> contains image data in np.array format\n",
"\n",
"\n",
"- <em>data_y</em> is a 1D np.array of the image categories (labels).\n",
"\n",
"\n",
"- If the user passes a <em>table_name</em> while creating ImageLoader object, it will be used for all further calls to load_dataset_from_np. It can be changed by passing it as a parameter during the actual call to load_dataset_from_np, and if so future calls will load to that table name instead. This avoids needing to pass the table_name again every time, but also allows it to be changed at any time.\n",
"\n",
"\n",
"- If the user passes a <em>label_datatype</em> is the datatype of the image categories (labels) column in the output table."
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [],
"source": [
"# Load dataset into np array\n",
"(x_train, y_train), (x_test, y_test) = cifar10.load_data()"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Done.\n",
"MainProcess: Connected to madlib db.\n",
"Executing: CREATE TABLE cifar_10_train_data_int (id SERIAL, x REAL[], y int)\n",
"CREATE TABLE\n",
"Created table cifar_10_train_data_int in madlib db\n",
"Spawning 5 workers...\n",
"Initializing PoolWorker-11 [pid 42137]\n",
"PoolWorker-11: Created temporary directory /tmp/madlib_8Pc3rPCJJL\n",
"Initializing PoolWorker-12 [pid 42138]\n",
"PoolWorker-12: Created temporary directory /tmp/madlib_QOyefNO7bi\n",
"Initializing PoolWorker-13 [pid 42139]\n",
"Initializing PoolWorker-14 [pid 42140]\n",
"PoolWorker-13: Created temporary directory /tmp/madlib_iFiR9zT8Za\n",
"PoolWorker-14: Created temporary directory /tmp/madlib_EAo5JPfg55\n",
"Initializing PoolWorker-15 [pid 42142]\n",
"PoolWorker-11: Connected to madlib db.\n",
"PoolWorker-15: Created temporary directory /tmp/madlib_AjkjHXX3lr\n",
"PoolWorker-12: Connected to madlib db.\n",
"PoolWorker-13: Connected to madlib db.\n",
"PoolWorker-14: Connected to madlib db.\n",
"PoolWorker-15: Connected to madlib db.\n",
"PoolWorker-11: Wrote 1000 images to /tmp/madlib_8Pc3rPCJJL/cifar_10_train_data_int0000.tmp\n",
"PoolWorker-12: Wrote 1000 images to /tmp/madlib_QOyefNO7bi/cifar_10_train_data_int0000.tmp\n",
"PoolWorker-13: Wrote 1000 images to /tmp/madlib_iFiR9zT8Za/cifar_10_train_data_int0000.tmp\n",
"PoolWorker-14: Wrote 1000 images to /tmp/madlib_EAo5JPfg55/cifar_10_train_data_int0000.tmp\n",
"PoolWorker-15: Wrote 1000 images to /tmp/madlib_AjkjHXX3lr/cifar_10_train_data_int0000.tmp\n",
"PoolWorker-11: Loaded 1000 images into cifar_10_train_data_int\n",
"PoolWorker-12: Loaded 1000 images into cifar_10_train_data_int\n",
"PoolWorker-13: Loaded 1000 images into cifar_10_train_data_int\n",
"PoolWorker-14: Loaded 1000 images into cifar_10_train_data_int\n",
"PoolWorker-15: Loaded 1000 images into cifar_10_train_data_int\n",
"PoolWorker-11: Wrote 1000 images to /tmp/madlib_8Pc3rPCJJL/cifar_10_train_data_int0001.tmp\n",
"PoolWorker-12: Wrote 1000 images to /tmp/madlib_QOyefNO7bi/cifar_10_train_data_int0001.tmp\n",
"PoolWorker-13: Wrote 1000 images to /tmp/madlib_iFiR9zT8Za/cifar_10_train_data_int0001.tmp\n",
"PoolWorker-14: Wrote 1000 images to /tmp/madlib_EAo5JPfg55/cifar_10_train_data_int0001.tmp\n",
"PoolWorker-15: Wrote 1000 images to /tmp/madlib_AjkjHXX3lr/cifar_10_train_data_int0001.tmp\n",
"PoolWorker-11: Loaded 1000 images into cifar_10_train_data_int\n",
"PoolWorker-12: Loaded 1000 images into cifar_10_train_data_int\n",
"PoolWorker-13: Loaded 1000 images into cifar_10_train_data_int\n",
"PoolWorker-14: Loaded 1000 images into cifar_10_train_data_int\n",
"PoolWorker-15: Loaded 1000 images into cifar_10_train_data_int\n",
"PoolWorker-11: Wrote 1000 images to /tmp/madlib_8Pc3rPCJJL/cifar_10_train_data_int0002.tmp\n",
"PoolWorker-12: Wrote 1000 images to /tmp/madlib_QOyefNO7bi/cifar_10_train_data_int0002.tmp\n",
"PoolWorker-11: Loaded 1000 images into cifar_10_train_data_int\n",
"PoolWorker-13: Wrote 1000 images to /tmp/madlib_iFiR9zT8Za/cifar_10_train_data_int0002.tmp\n",
"PoolWorker-15: Wrote 1000 images to /tmp/madlib_AjkjHXX3lr/cifar_10_train_data_int0002.tmp\n",
"PoolWorker-14: Wrote 1000 images to /tmp/madlib_EAo5JPfg55/cifar_10_train_data_int0002.tmp\n",
"PoolWorker-12: Loaded 1000 images into cifar_10_train_data_int\n",
"PoolWorker-13: Loaded 1000 images into cifar_10_train_data_int\n",
"PoolWorker-15: Loaded 1000 images into cifar_10_train_data_int\n",
"PoolWorker-14: Loaded 1000 images into cifar_10_train_data_int\n",
"PoolWorker-11: Wrote 1000 images to /tmp/madlib_8Pc3rPCJJL/cifar_10_train_data_int0003.tmp\n",
"PoolWorker-11: Loaded 1000 images into cifar_10_train_data_int\n",
"PoolWorker-12: Wrote 1000 images to /tmp/madlib_QOyefNO7bi/cifar_10_train_data_int0003.tmp\n",
"PoolWorker-13: Wrote 1000 images to /tmp/madlib_iFiR9zT8Za/cifar_10_train_data_int0003.tmp\n",
"PoolWorker-15: Wrote 1000 images to /tmp/madlib_AjkjHXX3lr/cifar_10_train_data_int0003.tmp\n",
"PoolWorker-14: Wrote 1000 images to /tmp/madlib_EAo5JPfg55/cifar_10_train_data_int0003.tmp\n",
"PoolWorker-12: Loaded 1000 images into cifar_10_train_data_int\n",
"PoolWorker-11: Wrote 1000 images to /tmp/madlib_8Pc3rPCJJL/cifar_10_train_data_int0004.tmp\n",
"PoolWorker-13: Loaded 1000 images into cifar_10_train_data_int\n",
"PoolWorker-15: Loaded 1000 images into cifar_10_train_data_int\n",
"PoolWorker-14: Loaded 1000 images into cifar_10_train_data_int\n",
"PoolWorker-11: Loaded 1000 images into cifar_10_train_data_int\n",
"PoolWorker-12: Wrote 1000 images to /tmp/madlib_QOyefNO7bi/cifar_10_train_data_int0004.tmp\n",
"PoolWorker-13: Wrote 1000 images to /tmp/madlib_iFiR9zT8Za/cifar_10_train_data_int0004.tmp\n",
"PoolWorker-12: Loaded 1000 images into cifar_10_train_data_int\n",
"PoolWorker-15: Wrote 1000 images to /tmp/madlib_AjkjHXX3lr/cifar_10_train_data_int0004.tmp\n",
"PoolWorker-14: Wrote 1000 images to /tmp/madlib_EAo5JPfg55/cifar_10_train_data_int0004.tmp\n",
"PoolWorker-11: Wrote 1000 images to /tmp/madlib_8Pc3rPCJJL/cifar_10_train_data_int0005.tmp\n",
"PoolWorker-13: Loaded 1000 images into cifar_10_train_data_int\n",
"PoolWorker-15: Loaded 1000 images into cifar_10_train_data_int\n",
"PoolWorker-12: Wrote 1000 images to /tmp/madlib_QOyefNO7bi/cifar_10_train_data_int0005.tmp\n",
"PoolWorker-14: Loaded 1000 images into cifar_10_train_data_int\n",
"PoolWorker-11: Loaded 1000 images into cifar_10_train_data_int\n",
"PoolWorker-12: Loaded 1000 images into cifar_10_train_data_int\n",
"PoolWorker-13: Wrote 1000 images to /tmp/madlib_iFiR9zT8Za/cifar_10_train_data_int0005.tmp\n",
"PoolWorker-15: Wrote 1000 images to /tmp/madlib_AjkjHXX3lr/cifar_10_train_data_int0005.tmp\n",
"PoolWorker-14: Wrote 1000 images to /tmp/madlib_EAo5JPfg55/cifar_10_train_data_int0005.tmp\n",
"PoolWorker-11: Wrote 1000 images to /tmp/madlib_8Pc3rPCJJL/cifar_10_train_data_int0006.tmp\n",
"PoolWorker-12: Wrote 1000 images to /tmp/madlib_QOyefNO7bi/cifar_10_train_data_int0006.tmp\n",
"PoolWorker-13: Loaded 1000 images into cifar_10_train_data_int\n",
"PoolWorker-15: Loaded 1000 images into cifar_10_train_data_int\n",
"PoolWorker-14: Loaded 1000 images into cifar_10_train_data_int\n",
"PoolWorker-11: Loaded 1000 images into cifar_10_train_data_int\n",
"PoolWorker-12: Loaded 1000 images into cifar_10_train_data_int\n",
"PoolWorker-13: Wrote 1000 images to /tmp/madlib_iFiR9zT8Za/cifar_10_train_data_int0006.tmp\n",
"PoolWorker-15: Wrote 1000 images to /tmp/madlib_AjkjHXX3lr/cifar_10_train_data_int0006.tmp\n",
"PoolWorker-13: Loaded 1000 images into cifar_10_train_data_int\n",
"PoolWorker-14: Wrote 1000 images to /tmp/madlib_EAo5JPfg55/cifar_10_train_data_int0006.tmp\n",
"PoolWorker-11: Wrote 1000 images to /tmp/madlib_8Pc3rPCJJL/cifar_10_train_data_int0007.tmp\n",
"PoolWorker-15: Loaded 1000 images into cifar_10_train_data_int\n",
"PoolWorker-12: Wrote 1000 images to /tmp/madlib_QOyefNO7bi/cifar_10_train_data_int0007.tmp\n",
"PoolWorker-14: Loaded 1000 images into cifar_10_train_data_int\n",
"PoolWorker-11: Loaded 1000 images into cifar_10_train_data_int\n",
"PoolWorker-12: Loaded 1000 images into cifar_10_train_data_int\n",
"PoolWorker-13: Wrote 1000 images to /tmp/madlib_iFiR9zT8Za/cifar_10_train_data_int0007.tmp\n",
"PoolWorker-15: Wrote 1000 images to /tmp/madlib_AjkjHXX3lr/cifar_10_train_data_int0007.tmp\n",
"PoolWorker-13: Loaded 1000 images into cifar_10_train_data_int\n",
"PoolWorker-15: Loaded 1000 images into cifar_10_train_data_int\n",
"PoolWorker-14: Wrote 1000 images to /tmp/madlib_EAo5JPfg55/cifar_10_train_data_int0007.tmp\n",
"PoolWorker-11: Wrote 1000 images to /tmp/madlib_8Pc3rPCJJL/cifar_10_train_data_int0008.tmp\n",
"PoolWorker-12: Wrote 1000 images to /tmp/madlib_QOyefNO7bi/cifar_10_train_data_int0008.tmp\n",
"PoolWorker-13: Wrote 1000 images to /tmp/madlib_iFiR9zT8Za/cifar_10_train_data_int0008.tmp\n",
"PoolWorker-15: Wrote 1000 images to /tmp/madlib_AjkjHXX3lr/cifar_10_train_data_int0008.tmp\n",
"PoolWorker-14: Loaded 1000 images into cifar_10_train_data_int\n",
"PoolWorker-11: Loaded 1000 images into cifar_10_train_data_int\n",
"PoolWorker-12: Loaded 1000 images into cifar_10_train_data_int\n",
"PoolWorker-13: Loaded 1000 images into cifar_10_train_data_int\n",
"PoolWorker-15: Loaded 1000 images into cifar_10_train_data_int\n",
"PoolWorker-14: Wrote 1000 images to /tmp/madlib_EAo5JPfg55/cifar_10_train_data_int0008.tmp\n",
"PoolWorker-11: Wrote 1000 images to /tmp/madlib_8Pc3rPCJJL/cifar_10_train_data_int0009.tmp\n",
"PoolWorker-12: Wrote 1000 images to /tmp/madlib_QOyefNO7bi/cifar_10_train_data_int0009.tmp\n",
"PoolWorker-14: Loaded 1000 images into cifar_10_train_data_int\n",
"PoolWorker-11: Loaded 1000 images into cifar_10_train_data_int\n",
"PoolWorker-12: Loaded 1000 images into cifar_10_train_data_int\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"PoolWorker-11: Wrote 1000 images to /tmp/madlib_8Pc3rPCJJL/cifar_10_train_data_int0010.tmp\n",
"PoolWorker-12: Wrote 1000 images to /tmp/madlib_QOyefNO7bi/cifar_10_train_data_int0010.tmp\n",
"PoolWorker-11: Loaded 1000 images into cifar_10_train_data_int\n",
"PoolWorker-12: Loaded 1000 images into cifar_10_train_data_int\n",
"PoolWorker-11: Wrote 1000 images to /tmp/madlib_8Pc3rPCJJL/cifar_10_train_data_int0011.tmp\n",
"PoolWorker-11: Loaded 1000 images into cifar_10_train_data_int\n",
"PoolWorker-15: Removed temporary directory /tmp/madlib_AjkjHXX3lr\n",
"PoolWorker-14: Removed temporary directory /tmp/madlib_EAo5JPfg55\n",
"PoolWorker-12: Removed temporary directory /tmp/madlib_QOyefNO7bi\n",
"PoolWorker-11: Removed temporary directory /tmp/madlib_8Pc3rPCJJL\n",
"PoolWorker-13: Removed temporary directory /tmp/madlib_iFiR9zT8Za\n",
"Done! Loaded 50000 images in 40.9305319786s\n",
"5 workers terminated.\n",
"MainProcess: Connected to madlib db.\n",
"Executing: CREATE TABLE cifar_10_test_data_int (id SERIAL, x REAL[], y int)\n",
"CREATE TABLE\n",
"Created table cifar_10_test_data_int in madlib db\n",
"Spawning 5 workers...\n",
"Initializing PoolWorker-16 [pid 42208]\n",
"PoolWorker-16: Created temporary directory /tmp/madlib_J8m9MJp916\n",
"Initializing PoolWorker-17 [pid 42209]\n",
"PoolWorker-17: Created temporary directory /tmp/madlib_jH7fmFz0Ku\n",
"Initializing PoolWorker-18 [pid 42210]\n",
"PoolWorker-18: Created temporary directory /tmp/madlib_lRPK3C5DIL\n",
"Initializing PoolWorker-19 [pid 42211]\n",
"PoolWorker-19: Created temporary directory /tmp/madlib_KkQlXdU0Vd\n",
"PoolWorker-16: Connected to madlib db.\n",
"PoolWorker-17: Connected to madlib db.\n",
"PoolWorker-18: Connected to madlib db.\n",
"Initializing PoolWorker-20 [pid 42215]\n",
"PoolWorker-20: Created temporary directory /tmp/madlib_RfyStrRruL\n",
"PoolWorker-19: Connected to madlib db.\n",
"PoolWorker-20: Connected to madlib db.\n",
"PoolWorker-16: Wrote 1000 images to /tmp/madlib_J8m9MJp916/cifar_10_test_data_int0000.tmp\n",
"PoolWorker-17: Wrote 1000 images to /tmp/madlib_jH7fmFz0Ku/cifar_10_test_data_int0000.tmp\n",
"PoolWorker-18: Wrote 1000 images to /tmp/madlib_lRPK3C5DIL/cifar_10_test_data_int0000.tmp\n",
"PoolWorker-19: Wrote 1000 images to /tmp/madlib_KkQlXdU0Vd/cifar_10_test_data_int0000.tmp\n",
"PoolWorker-20: Wrote 1000 images to /tmp/madlib_RfyStrRruL/cifar_10_test_data_int0000.tmp\n",
"PoolWorker-16: Loaded 1000 images into cifar_10_test_data_int\n",
"PoolWorker-20: Loaded 1000 images into cifar_10_test_data_int\n",
"PoolWorker-19: Loaded 1000 images into cifar_10_test_data_int\n",
"PoolWorker-17: Loaded 1000 images into cifar_10_test_data_int\n",
"PoolWorker-18: Loaded 1000 images into cifar_10_test_data_int\n",
"PoolWorker-16: Wrote 1000 images to /tmp/madlib_J8m9MJp916/cifar_10_test_data_int0001.tmp\n",
"PoolWorker-20: Wrote 1000 images to /tmp/madlib_RfyStrRruL/cifar_10_test_data_int0001.tmp\n",
"PoolWorker-19: Wrote 1000 images to /tmp/madlib_KkQlXdU0Vd/cifar_10_test_data_int0001.tmp\n",
"PoolWorker-17: Wrote 1000 images to /tmp/madlib_jH7fmFz0Ku/cifar_10_test_data_int0001.tmp\n",
"PoolWorker-18: Wrote 1000 images to /tmp/madlib_lRPK3C5DIL/cifar_10_test_data_int0001.tmp\n",
"PoolWorker-16: Loaded 1000 images into cifar_10_test_data_int\n",
"PoolWorker-19: Loaded 1000 images into cifar_10_test_data_int\n",
"PoolWorker-18: Loaded 1000 images into cifar_10_test_data_int\n",
"PoolWorker-17: Loaded 1000 images into cifar_10_test_data_int\n",
"PoolWorker-20: Loaded 1000 images into cifar_10_test_data_int\n",
"PoolWorker-16: Removed temporary directory /tmp/madlib_J8m9MJp916\n",
"PoolWorker-18: Removed temporary directory /tmp/madlib_lRPK3C5DIL\n",
"PoolWorker-17: Removed temporary directory /tmp/madlib_jH7fmFz0Ku\n",
"PoolWorker-20: Removed temporary directory /tmp/madlib_RfyStrRruL\n",
"PoolWorker-19: Removed temporary directory /tmp/madlib_KkQlXdU0Vd\n",
"Done! Loaded 10000 images in 7.7835650444s\n",
"5 workers terminated.\n"
]
}
],
"source": [
"%sql DROP TABLE IF EXISTS cifar_10_train_data_int, cifar_10_test_data_int;\n",
"\n",
"# Save images to temporary directories and load into database\n",
"iloader.load_dataset_from_np(x_train, y_train, 'cifar_10_train_data_int', append=False, label_datatype= 'int')\n",
"iloader.load_dataset_from_np(x_test, y_test, 'cifar_10_test_data_int', append=False, label_datatype= 'int')"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"1 rows affected.\n"
]
},
{
"data": {
"text/html": [
"<table>\n",
" <tr>\n",
" <th>count</th>\n",
" </tr>\n",
" <tr>\n",
" <td>50000</td>\n",
" </tr>\n",
"</table>"
],
"text/plain": [
"[(50000L,)]"
]
},
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%%sql\n",
"SELECT COUNT(*) FROM cifar_10_train_data_int;"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"1 rows affected.\n"
]
},
{
"data": {
"text/html": [
"<table>\n",
" <tr>\n",
" <th>count</th>\n",
" </tr>\n",
" <tr>\n",
" <td>10000</td>\n",
" </tr>\n",
"</table>"
],
"text/plain": [
"[(10000L,)]"
]
},
"execution_count": 17,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%%sql\n",
"SELECT COUNT(*) FROM cifar_10_test_data_int;"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 2",
"language": "python",
"name": "python2"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 2
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython2",
"version": "2.7.16"
}
},
"nbformat": 4,
"nbformat_minor": 2
}