| :py:mod:`tests.system.providers.apache.hive.example_twitter_dag` |
| ================================================================ |
| |
| .. py:module:: tests.system.providers.apache.hive.example_twitter_dag |
| |
| .. autoapi-nested-parse:: |
| |
| This is an example DAG for managing Twitter data. |
| |
| |
| |
| Module Contents |
| --------------- |
| |
| |
| Functions |
| ~~~~~~~~~ |
| |
| .. autoapisummary:: |
| |
| tests.system.providers.apache.hive.example_twitter_dag.fetch_tweets |
| tests.system.providers.apache.hive.example_twitter_dag.clean_tweets |
| tests.system.providers.apache.hive.example_twitter_dag.analyze_tweets |
| tests.system.providers.apache.hive.example_twitter_dag.transfer_to_db |
| |
| |
| |
| Attributes |
| ~~~~~~~~~~ |
| |
| .. autoapisummary:: |
| |
| tests.system.providers.apache.hive.example_twitter_dag.ENV_ID |
| tests.system.providers.apache.hive.example_twitter_dag.DAG_ID |
| tests.system.providers.apache.hive.example_twitter_dag.fetch |
| tests.system.providers.apache.hive.example_twitter_dag.test_run |
| |
| |
| .. py:data:: ENV_ID |
| |
| |
| |
| |
| .. py:data:: DAG_ID |
| :annotation: = example_twitter_dag |
| |
| |
| |
| .. py:function:: fetch_tweets() |
| |
| This task should call the Twitter API and retrieve yesterday's tweets sent from and to each of the four |
| Twitter users (Twitter_A,..,Twitter_D). It should generate eight CSV output files, named using the |
| convention direction(from or to)_twitterHandle_date.csv. |
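
As a minimal sketch of the naming convention described for ``fetch_tweets``, the following lists the eight expected file names for a given run date. The helper name ``expected_tweet_files``, the intermediate handles, and the ``YYYYMMDD`` date format are assumptions made for illustration; the module itself only states the ``direction_twitterHandle_date.csv`` pattern.

```python
from datetime import date, timedelta

# Handles follow the "Twitter_A,..,Twitter_D" range in the docstring;
# the two middle names are assumed from that range.
TWITTER_HANDLES = ["Twitter_A", "Twitter_B", "Twitter_C", "Twitter_D"]
DIRECTIONS = ["from", "to"]


def expected_tweet_files(run_date: date) -> list[str]:
    """Hypothetical helper: list the CSV names for the day before run_date."""
    yesterday = run_date - timedelta(days=1)
    # The date format is an assumption; the docstring only says "date".
    stamp = yesterday.strftime("%Y%m%d")
    return [
        f"{direction}_{handle}_{stamp}.csv"
        for handle in TWITTER_HANDLES
        for direction in DIRECTIONS
    ]
```

Four handles times two directions gives the eight files the task description mentions.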
| |
| |
| .. py:function:: clean_tweets() |
| |
| This is a placeholder for cleaning the eight files. In this step you can drop or cherry-pick columns |
| and parts of the text. |
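
One possible shape for such a cleaning step, using only the standard library, is sketched here. The column names (``id``, ``created_at``, ``text``) and the choice to strip URLs are assumptions for illustration; the module does not specify the CSV layout or what should be removed.

```python
import csv
import io
import re

# Hypothetical column names; the real CSV layout is not specified.
KEEP_COLUMNS = ["id", "created_at", "text"]
URL_PATTERN = re.compile(r"https?://\S+")


def clean_rows(raw_csv: str) -> list[dict[str, str]]:
    """Keep only KEEP_COLUMNS and strip URLs and extra whitespace from the text."""
    cleaned = []
    for row in csv.DictReader(io.StringIO(raw_csv)):
        # Cherry-pick the columns of interest and drop everything else.
        slim = {name: row[name] for name in KEEP_COLUMNS}
        # Remove URLs, then collapse runs of whitespace into single spaces.
        text = URL_PATTERN.sub("", slim["text"])
        slim["text"] = " ".join(text.split())
        cleaned.append(slim)
    return cleaned
```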
| |
| |
| .. py:function:: analyze_tweets() |
| |
| This is a placeholder for analyzing the Twitter data. This could be a simple sentiment analysis using |
| an approach such as bag of words, or something more complicated. You can also use web services for such |
| tasks. |
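
To make the bag-of-words idea concrete, here is a deliberately small sketch that scores a tweet by counting positive and negative words. The word lists and the ``sentiment_score`` name are illustrative assumptions and are not part of the module; a real analysis would use a proper sentiment lexicon or a trained model.

```python
from collections import Counter

# Tiny illustrative word lists; these are assumptions, not a real lexicon.
POSITIVE_WORDS = {"good", "great", "love", "happy"}
NEGATIVE_WORDS = {"bad", "awful", "hate", "sad"}


def sentiment_score(text: str) -> int:
    """Return the positive-word count minus the negative-word count."""
    # Bag of words: ignore word order and keep only per-word counts.
    words = Counter(word.strip(".,!?").lower() for word in text.split())
    positive = sum(words[w] for w in POSITIVE_WORDS)
    negative = sum(words[w] for w in NEGATIVE_WORDS)
    return positive - negative
```

A score above zero suggests a positive tweet, and a score below zero a negative one.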
| |
| |
| .. py:function:: transfer_to_db() |
| |
| This is a placeholder for extracting a summary from the Hive data and storing it in MySQL. |
| |
| |
| .. py:data:: fetch |
| |
| |
| |
| |
| .. py:data:: test_run |
| |
| |
| |
| |