python generate_pycarbon_mnist.py --carbon-sdk-path /your_path/carbondata/store/sdk/target/carbondata-sdk.jar
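When the example script is launched from another Python program rather than typed at the shell, the same invocation can be assembled as an argument list. This is a minimal sketch: the helper name `build_command` is hypothetical and not part of pycarbon, and only the `--carbon-sdk-path` flag and the script name are taken from the command above.

```python
import sys


def build_command(script, sdk_jar):
    """Assemble the argv list for an example script (hypothetical helper)."""
    # The flag expects the CarbonData SDK jar, so catch obvious mistakes early.
    if not sdk_jar.endswith(".jar"):
        raise ValueError("expected a path to a .jar file, got: " + sdk_jar)
    # sys.executable keeps the child process on the current interpreter.
    return [sys.executable, script, "--carbon-sdk-path", sdk_jar]


command = build_command(
    "generate_pycarbon_mnist.py",
    "/your_path/carbondata/store/sdk/target/carbondata-sdk.jar",
)
print(command[1:])
```

The resulting list can be passed to `subprocess.run(command, check=True)` once the jar path points at a real build.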
# Create a dataframe object from carbon files
spark.sql("create table readcarbon using carbon location '" + str(dataset_path) + "'")
dataframe = spark.sql("select * from readcarbon")

# Show a schema
dataframe.printSchema()

# Count all
dataframe.count()

# Show just some columns
dataframe.select('id').show()

# Also use a standard SQL to query a dataset
spark.sql('SELECT count(id) from carbon.`{}` '.format(dataset_url)).collect()
Further details are illustrated in `pyspark_hello_world_carbon.py` in test/hello_world.
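The PySpark snippet embeds the dataset location in SQL text in two ways: inside single quotes for `create table ... location`, and inside backticks for the direct `carbon.` query. The sketch below builds those statements as plain strings so they can be inspected before being handed to `spark.sql`. The helper name `carbon_queries` is hypothetical, and it assumes one location string serves both forms, whereas the snippet above uses separate `dataset_path` and `dataset_url` variables.

```python
def carbon_queries(dataset_location, table_name="readcarbon"):
    """Build the SQL statements for a carbon dataset (hypothetical helper)."""
    location = str(dataset_location)
    # A quote or backtick in the location would end the quoted span early,
    # so reject such locations instead of producing broken SQL.
    if "'" in location or "`" in location:
        raise ValueError("dataset location must not contain quotes or backticks")
    return {
        "create_table": "create table {} using carbon location '{}'".format(
            table_name, location
        ),
        "select_all": "select * from {}".format(table_name),
        "count_direct": "SELECT count(id) from carbon.`{}`".format(location),
    }


queries = carbon_queries("file:///some/localpath/a_dataset")
for name, sql in queries.items():
    print(name, "->", sql)
```

Each value can then be passed to `spark.sql(...)` in the order shown, with `select_all` run only after the table has been created.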
python tf_example_carbon_unified_api.py --carbon-sdk-path /your_path/carbondata/store/sdk/target/carbondata-sdk.jar
with make_reader('file:///some/localpath/a_dataset') as reader:
    dataset = make_dataset(reader)
    iterator = dataset.make_one_shot_iterator()
    tensor = iterator.get_next()
    with tf.Session() as sess:
        sample = sess.run(tensor)
        print(sample.id)

Further details are illustrated in `tf_example_carbon_unified_api.py` in test/mnist.
2020-01-20 21:12:31 INFO  DictionaryBasedVectorResultCollector:72 - Direct pagewise vector fill collector is used to scan and collect the data
2020-01-20 21:12:32 INFO  UnsafeMemoryManager:176 - Total offheap working memory used after task 2642c969-6c43-4e31-b8b0-450dff1f7821 is 0. Current running tasks are
2020-01-20 21:12:32 INFO  UnsafeMemoryManager:176 - Total offheap working memory used after task 67ecf75e-e097-486d-b787-8b7db5f1d7c1 is 0. Current running tasks are
After 0 training iterations, the accuracy of the model is: 0.27
After 10 training iterations, the accuracy of the model is: 0.48
After 20 training iterations, the accuracy of the model is: 0.78
After 30 training iterations, the accuracy of the model is: 0.69
After 40 training iterations, the accuracy of the model is: 0.73
After 50 training iterations, the accuracy of the model is: 0.79
After 60 training iterations, the accuracy of the model is: 0.85
After 70 training iterations, the accuracy of the model is: 0.73
After 80 training iterations, the accuracy of the model is: 0.86
After 90 training iterations, the accuracy of the model is: 0.80
After 99 training iterations, the accuracy of the model is: 0.79
all time: 185.28250288963318
Finish
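The accuracy lines in the output above are interleaved with INFO logging, so it can help to extract them when comparing runs. The following sketch uses only the standard library. The function name `parse_accuracies` is hypothetical, and the regular expression assumes the exact message wording shown in the sample output.

```python
import re

# Matches lines such as:
# "After 10 training iterations, the accuracy of the model is: 0.48"
ACCURACY_LINE = re.compile(
    r"After (\d+) training iterations, the accuracy of the model is: ([0-9.]+)"
)


def parse_accuracies(log_text):
    """Return (iteration, accuracy) pairs found in the log text."""
    return [(int(step), float(acc)) for step, acc in ACCURACY_LINE.findall(log_text)]


# A few lines copied from the sample output above.
sample_log = """\
2020-01-20 21:12:31 INFO  DictionaryBasedVectorResultCollector:72 - Direct pagewise vector fill collector is used to scan and collect the data
After 0 training iterations, the accuracy of the model is: 0.27
After 10 training iterations, the accuracy of the model is: 0.48
After 80 training iterations, the accuracy of the model is: 0.86
After 99 training iterations, the accuracy of the model is: 0.79
"""

results = parse_accuracies(sample_log)
best_step, best_accuracy = max(results, key=lambda pair: pair[1])
print(results)
print("best:", best_step, best_accuracy)  # best: 80 0.86
```

In the run shown, accuracy peaks at iteration 80 and is lower at the final iteration, so reporting the best checkpoint and the last one separately gives a fairer picture than the final value alone.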