Before starting, make sure you have downloaded and deployed SeaTunnel as described in Deployment.
Download Spark first (required version >= 2.4.0). For more information, see Getting Started: Standalone.
Configure SeaTunnel: edit ${SEATUNNEL_HOME}/config/seatunnel-env.sh and set SPARK_HOME to the Spark deployment directory.
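As a minimal sketch of this step, the setting could look like the line below. The path /opt/spark is an assumed placeholder, not a value taken from this guide, and the check afterwards only reports whether a spark-submit binary is present in that directory.

```shell
# In ${SEATUNNEL_HOME}/config/seatunnel-env.sh
# /opt/spark is a placeholder (assumption); use your actual Spark directory.
# An already exported SPARK_HOME takes precedence over the placeholder.
SPARK_HOME="${SPARK_HOME:-/opt/spark}"

# Optional sanity check: does the directory look like a Spark deployment?
if [ -x "${SPARK_HOME}/bin/spark-submit" ]; then
  spark_status="found"
else
  spark_status="missing"
fi
echo "spark-submit ${spark_status} under ${SPARK_HOME}"
```

If the check reports "missing", the start scripts are unlikely to find Spark, so it is worth correcting the path before going further.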
Edit config/seatunnel.streaming.conf.template, which determines how data is read, processed, and written after SeaTunnel starts. The following is an example of the configuration file, which is the same as the example application mentioned above.
env {
  parallelism = 1
  job.mode = "BATCH"
}

source {
  FakeSource {
    plugin_output = "fake"
    row.num = 16
    schema = {
      fields {
        name = "string"
        age = "int"
      }
    }
  }
}

transform {
  FieldMapper {
    plugin_input = "fake"
    plugin_output = "fake1"
    field_mapper = {
      age = age
      name = new_name
    }
  }
}

sink {
  Console {
    plugin_input = "fake1"
  }
}
For more information about the configuration, see Config Concept.
You can start the application with the following commands:
Spark 2.4.x
cd "apache-seatunnel-${version}"
./bin/start-seatunnel-spark-2-connector-v2.sh \
--master local[4] \
--deploy-mode client \
--config ./config/v2.streaming.conf.template
Spark 3.x.x
cd "apache-seatunnel-${version}"
./bin/start-seatunnel-spark-3-connector-v2.sh \
--master local[4] \
--deploy-mode client \
--config ./config/v2.streaming.conf.template
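The two commands differ only in the start script, so the choice can be sketched as a small helper. The function pick_start_script is hypothetical (it is not part of SeaTunnel), and the version 3.3.0 is an assumed value used only for illustration. The script names are the ones given above.

```shell
# Hypothetical helper (not part of SeaTunnel): map a Spark version string
# to the matching start script named in this guide.
pick_start_script() {
  case "$1" in
    2.4.*) echo "./bin/start-seatunnel-spark-2-connector-v2.sh" ;;
    3.*)   echo "./bin/start-seatunnel-spark-3-connector-v2.sh" ;;
    *)     echo "unsupported Spark version: $1" >&2; return 1 ;;
  esac
}

# Example: 3.3.0 is an assumed version used only for illustration.
start_script="$(pick_start_script "3.3.0")"
echo "${start_script}"
```

The selected script would then be run from the SeaTunnel directory with the same --master, --deploy-mode, and --config arguments as in the commands above.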
See the output: when you run the command, its output appears in your console. Use it to determine whether the command ran successfully.
The SeaTunnel console prints logs like the following:
fields : name, age
types : STRING, INT
row=1 : elWaB, 1984352560
row=2 : uAtnp, 762961563
row=3 : TQEIB, 2042675010
row=4 : DcFjo, 593971283
row=5 : SenEb, 2099913608
row=6 : DHjkg, 1928005856
row=7 : eScCM, 526029657
row=8 : sgOeE, 600878991
row=9 : gwdvw, 1951126920
row=10 : nSiKE, 488708928
row=11 : xubpl, 1420202810
row=12 : rHZqb, 331185742
row=13 : rciGD, 1112878259
row=14 : qLhdI, 1457046294
row=15 : ZTkRx, 1240668386
row=16 : SGZCr, 94186144
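One way to check the result without reading every line is to count the printed rows. The sketch below assumes the console output has been captured to a file and that each row is logged on its own line containing "row=<n> :". The helper count_rows is hypothetical, and the demo uses a two-row excerpt of the output shown above rather than a real job run.

```shell
# Hypothetical helper: count Console sink rows in captured SeaTunnel output.
# Assumes each row is logged on its own line containing "row=<n> :".
count_rows() {
  grep -c -E 'row=[0-9]+ *:' "$1"
}

# Self-contained demo on a two-row excerpt of the output shown above.
sample_log="$(mktemp)"
printf '%s\n' 'row=1 : elWaB, 1984352560' 'row=2 : uAtnp, 762961563' > "${sample_log}"
row_count="$(count_rows "${sample_log}")"
echo "rows printed: ${row_count}"
rm -f "${sample_log}"
```

For the example configuration, which sets row.num = 16, a complete run would be expected to yield a count of 16 under these assumptions.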
SeaTunnel also has its own engine, Zeta, which is the default engine of SeaTunnel. You can follow Quick Start to configure and run a data synchronization job.