<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<meta name="viewport" content="width=device-width, initial-scale=1">
<!-- The above 3 meta tags *must* come first in the head; any other head content must come *after* these tags -->
<title>Apache Flink: Apache Flink Code Style and Quality Guide — Components</title>
<link rel="shortcut icon" href="/favicon.ico" type="image/x-icon">
<link rel="icon" href="/favicon.ico" type="image/x-icon">
<!-- Bootstrap -->
<link rel="stylesheet" href="https://maxcdn.bootstrapcdn.com/bootstrap/3.4.1/css/bootstrap.min.css">
<link rel="stylesheet" href="/css/flink.css">
<link rel="stylesheet" href="/css/syntax.css">
<!-- Blog RSS feed -->
<link href="/blog/feed.xml" rel="alternate" type="application/rss+xml" title="Apache Flink Blog: RSS feed" />
<!-- jQuery (necessary for Bootstrap's JavaScript plugins) -->
<!-- We need to load Jquery in the header for custom google analytics event tracking-->
<script src="/js/jquery.min.js"></script>
<!-- HTML5 shim and Respond.js for IE8 support of HTML5 elements and media queries -->
<!-- WARNING: Respond.js doesn't work if you view the page via file:// -->
<!--[if lt IE 9]>
<script src="https://oss.maxcdn.com/html5shiv/3.7.2/html5shiv.min.js"></script>
<script src="https://oss.maxcdn.com/respond/1.4.2/respond.min.js"></script>
<![endif]-->
</head>
<body>
<!-- Main content. -->
<div class="container">
<div class="row">
<div id="sidebar" class="col-sm-3">
<!-- Top navbar. -->
<nav class="navbar navbar-default">
<!-- The logo. -->
<div class="navbar-header">
<button type="button" class="navbar-toggle collapsed" data-toggle="collapse" data-target="#bs-example-navbar-collapse-1">
<span class="icon-bar"></span>
<span class="icon-bar"></span>
<span class="icon-bar"></span>
</button>
<div class="navbar-logo">
<a href="/">
<img alt="Apache Flink" src="/img/flink-header-logo.svg" width="147px" height="73px">
</a>
</div>
</div><!-- /.navbar-header -->
<!-- The navigation links. -->
<div class="collapse navbar-collapse" id="bs-example-navbar-collapse-1">
<ul class="nav navbar-nav navbar-main">
<!-- First menu section explains visitors what Flink is -->
<!-- What is Stream Processing? -->
<!--
<li><a href="/streamprocessing1.html">What is Stream Processing?</a></li>
-->
<!-- What is Flink? -->
<li><a href="/flink-architecture.html">What is Apache Flink?</a></li>
<!-- What is Stateful Functions? -->
<li><a href="/stateful-functions.html">What is Stateful Functions?</a></li>
<!-- Use cases -->
<li><a href="/usecases.html">Use Cases</a></li>
<!-- Powered by -->
<li><a href="/poweredby.html">Powered By</a></li>
&nbsp;
<!-- Second menu section aims to support Flink users -->
<!-- Downloads -->
<li><a href="/downloads.html">Downloads</a></li>
<!-- Getting Started -->
<li class="dropdown">
<a class="dropdown-toggle" data-toggle="dropdown" href="#">Getting Started<span class="caret"></span></a>
<ul class="dropdown-menu">
<li><a href="https://ci.apache.org/projects/flink/flink-docs-release-1.11/getting-started/index.html" target="_blank">With Flink <small><span class="glyphicon glyphicon-new-window"></span></small></a></li>
<li><a href="https://ci.apache.org/projects/flink/flink-statefun-docs-release-2.1/getting-started/project-setup.html" target="_blank">With Flink Stateful Functions <small><span class="glyphicon glyphicon-new-window"></span></small></a></li>
<li><a href="/training.html">Training Course</a></li>
</ul>
</li>
<!-- Documentation -->
<li class="dropdown">
<a class="dropdown-toggle" data-toggle="dropdown" href="#">Documentation<span class="caret"></span></a>
<ul class="dropdown-menu">
<li><a href="https://ci.apache.org/projects/flink/flink-docs-release-1.11" target="_blank">Flink 1.11 (Latest stable release) <small><span class="glyphicon glyphicon-new-window"></span></small></a></li>
<li><a href="https://ci.apache.org/projects/flink/flink-docs-master" target="_blank">Flink Master (Latest Snapshot) <small><span class="glyphicon glyphicon-new-window"></span></small></a></li>
<li><a href="https://ci.apache.org/projects/flink/flink-statefun-docs-release-2.1" target="_blank">Flink Stateful Functions 2.1 (Latest stable release) <small><span class="glyphicon glyphicon-new-window"></span></small></a></li>
<li><a href="https://ci.apache.org/projects/flink/flink-statefun-docs-master" target="_blank">Flink Stateful Functions Master (Latest Snapshot) <small><span class="glyphicon glyphicon-new-window"></span></small></a></li>
</ul>
</li>
<!-- getting help -->
<li><a href="/gettinghelp.html">Getting Help</a></li>
<!-- Blog -->
<li><a href="/blog/"><b>Flink Blog</b></a></li>
<!-- Flink-packages -->
<li>
<a href="https://flink-packages.org" target="_blank">flink-packages.org <small><span class="glyphicon glyphicon-new-window"></span></small></a>
</li>
&nbsp;
<!-- Third menu section aim to support community and contributors -->
<!-- Community -->
<li><a href="/community.html">Community &amp; Project Info</a></li>
<!-- Roadmap -->
<li><a href="/roadmap.html">Roadmap</a></li>
<!-- Contribute -->
<li><a href="/contributing/how-to-contribute.html">How to Contribute</a></li>
<ul class="nav navbar-nav navbar-subnav">
<li >
<a href="/contributing/contribute-code.html">Contribute Code</a>
</li>
<li >
<a href="/contributing/reviewing-prs.html">Review Pull Requests</a>
</li>
<li >
<a href="/contributing/code-style-and-quality-preamble.html">Code Style and Quality Guide</a>
</li>
<li >
<a href="/contributing/contribute-documentation.html">Contribute Documentation</a>
</li>
<li >
<a href="/contributing/docs-style.html">Documentation Style Guide</a>
</li>
<li >
<a href="/contributing/improve-website.html">Contribute to the Website</a>
</li>
</ul>
<!-- GitHub -->
<li>
<a href="https://github.com/apache/flink" target="_blank">Flink on GitHub <small><span class="glyphicon glyphicon-new-window"></span></small></a>
</li>
&nbsp;
<!-- Language Switcher -->
<li>
<a href="/zh/contributing/code-style-and-quality-components.html">中文版</a>
</li>
</ul>
<ul class="nav navbar-nav navbar-bottom">
<hr />
<!-- Twitter -->
<li><a href="https://twitter.com/apacheflink" target="_blank">@ApacheFlink <small><span class="glyphicon glyphicon-new-window"></span></small></a></li>
<!-- Visualizer -->
<li class=" hidden-md hidden-sm"><a href="/visualizer/" target="_blank">Plan Visualizer <small><span class="glyphicon glyphicon-new-window"></span></small></a></li>
<hr />
<li><a href="https://apache.org" target="_blank">Apache Software Foundation <small><span class="glyphicon glyphicon-new-window"></span></small></a></li>
<li>
<style>
.smalllinks:link {
display: inline-block !important; background: none; padding-top: 0px; padding-bottom: 0px; padding-right: 0px; min-width: 75px;
}
</style>
<a class="smalllinks" href="https://www.apache.org/licenses/" target="_blank">License</a> <small><span class="glyphicon glyphicon-new-window"></span></small>
<a class="smalllinks" href="https://www.apache.org/security/" target="_blank">Security</a> <small><span class="glyphicon glyphicon-new-window"></span></small>
<a class="smalllinks" href="https://www.apache.org/foundation/sponsorship.html" target="_blank">Donate</a> <small><span class="glyphicon glyphicon-new-window"></span></small>
<a class="smalllinks" href="https://www.apache.org/foundation/thanks.html" target="_blank">Thanks</a> <small><span class="glyphicon glyphicon-new-window"></span></small>
</li>
</ul>
</div><!-- /.navbar-collapse -->
</nav>
</div>
<div class="col-sm-9">
<div class="row-fluid">
<div class="col-sm-12">
<h1>Apache Flink Code Style and Quality Guide — Components</h1>
<ul class="list-group" style="padding-top: 30px; font-weight: bold;">
<li class="list-group-item">
<a href="/contributing/code-style-and-quality-preamble.html">
Preamble
</a>
</li>
<li class="list-group-item">
<a href="/contributing/code-style-and-quality-pull-requests.html">
Pull Requests &amp; Changes
</a>
</li>
<li class="list-group-item">
<a href="/contributing/code-style-and-quality-common.html">
Common Coding Guide
</a>
</li>
<li class="list-group-item">
<a href="/contributing/code-style-and-quality-java.html">
Java Language Guide
</a>
</li>
<li class="list-group-item">
<a href="/contributing/code-style-and-quality-scala.html">
Scala Language Guide
</a>
</li>
<li class="list-group-item">
<a href="/contributing/code-style-and-quality-components.html">
Component Guides
</a>
</li>
<li class="list-group-item">
<a href="/contributing/code-style-and-quality-formatting.html">
Formatting Guide
</a>
</li>
</ul>
<hr />
<div class="page-toc">
<ul id="markdown-toc">
<li><a href="#component-specific-guidelines" id="markdown-toc-component-specific-guidelines">Component Specific Guidelines</a> <ul>
<li><a href="#configuration-changes" id="markdown-toc-configuration-changes">Configuration Changes</a></li>
<li><a href="#connectors" id="markdown-toc-connectors">Connectors</a></li>
<li><a href="#examples" id="markdown-toc-examples">Examples</a></li>
<li><a href="#table--sql-api" id="markdown-toc-table--sql-api">Table &amp; SQL API</a></li>
</ul>
</li>
</ul>
</div>
<h2 id="component-specific-guidelines">Component Specific Guidelines</h2>
<p><em>Additional guidelines about changes in specific components.</em></p>
<h3 id="configuration-changes">Configuration Changes</h3>
<p>Where should the config option go?</p>
<ul>
<li>
<p><span style="text-decoration:underline;">‘flink-conf.yaml’:</span> All configuration that pertains to execution behavior that one may want to standardize across jobs. Think of it as parameters someone would set wearing an “ops” hat, or someone who provides a stream processing platform to other teams.</p>
</li>
<li><span style="text-decoration:underline;">‘ExecutionConfig’</span>: Parameters specific to an individual Flink application, needed by the operators during execution. Typical examples are watermark interval, serializer parameters, object reuse.</li>
<li><span style="text-decoration:underline;">ExecutionEnvironment (in code)</span>: Everything that is specific to an individual Flink application and is only needed to build the program / dataflow, not inside the operators during execution.</li>
</ul>
<p>How to name config keys:</p>
<ul>
<li>
<p>Config key names should be hierarchical.
Think of the configuration as nested objects (JSON style).</p>
<div class="highlight"><pre><code>taskmanager: {
jvm-exit-on-oom: true,
network: {
detailed-metrics: false,
request-backoff: {
initial: 100,
max: 10000
},
memory: {
fraction: 0.1,
min: 64MB,
max: 1GB,
buffers-per-channel: 2,
floating-buffers-per-gate: 16
}
}
}
</code></pre></div>
</li>
<li>
<p>The resulting config keys should hence be:</p>
<p><strong>NOT</strong> <code>"taskmanager.detailed.network.metrics"</code></p>
<p><strong>But rather</strong> <code>"taskmanager.network.detailed-metrics"</code></p>
</li>
</ul>
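<p>As a minimal sketch of the naming rule above, the following plain-Java helper joins the path through the nested configuration objects into one flat key. The class <code>ConfigKeys</code> and its method <code>key</code> are hypothetical names used only for illustration; they are not part of Flink.</p>

```java
/** Hypothetical helper (not part of Flink) that derives flat config keys from nested paths. */
final class ConfigKeys {

    private ConfigKeys() {}

    /**
     * Joins the names on the path from the outermost config object to the option with dots.
     * Words inside a single name stay separated by hyphens, e.g. "detailed-metrics".
     */
    static String key(String... path) {
        if (path.length == 0) {
            throw new IllegalArgumentException("A config key needs at least one path element");
        }
        for (String element : path) {
            // A dot inside an element would silently add a nesting level.
            if (element.isEmpty() || element.contains(".")) {
                throw new IllegalArgumentException("Invalid path element: '" + element + "'");
            }
        }
        return String.join(".", path);
    }

    public static void main(String[] args) {
        System.out.println(key("taskmanager", "network", "detailed-metrics"));
        System.out.println(key("taskmanager", "network", "memory", "fraction"));
    }
}
```

<p>Each dot in the resulting key corresponds to one level of nesting, so <code>detailed-metrics</code> stays a single hyphenated name under <code>network</code> instead of being split into further levels.</p>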
<h3 id="connectors">Connectors</h3>
<p>Connectors are historically hard to implement and need to deal with many aspects of threading, concurrency, and checkpointing.</p>
<p>As part of <a href="https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface">FLIP-27</a> we are working on making this much simpler for sources. New sources should not have to deal with any aspect of concurrency/threading and checkpointing any more.</p>
<p>A similar FLIP can be expected for sinks in the near future.</p>
<h3 id="examples">Examples</h3>
<p>Examples should be self-contained and not require systems other than Flink to run, except for examples that show how to use specific connectors, like the Kafka connector. Sources/sinks that are ok to use are <code>StreamExecutionEnvironment.socketTextStream</code>, which should not be used in production but is quite handy for exploring how things work, and file-based sources/sinks. (For streaming, there is the continuous file source.)</p>
<p>Examples should also not be pure toy-examples but strike a balance between real-world code and purely abstract examples. The WordCount example is quite long in the tooth by now but it’s a good showcase of simple code that highlights functionality and can do useful things.</p>
<p>Examples should also be heavy in comments. They should describe the general idea of the example in the class-level Javadoc and describe what is happening and what functionality is used throughout the code. The expected input data and output data should also be described.</p>
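<p>The commenting style described above could look roughly like the following sketch. It is deliberately plain Java rather than a Flink job, so that it stays self-contained; the class name <code>CommentedWordCount</code> and the output format are assumptions made for illustration only.</p>

```java
import java.util.Arrays;
import java.util.Locale;

/**
 * Counts how often each word occurs in a line of text.
 *
 * <p>Input: a single line of text, for example "to be or not to be".
 *
 * <p>Output: one "word=count" pair per distinct word, sorted alphabetically and separated
 * by single spaces, for example "be=2 not=1 or=1 to=2".
 *
 * <p>Functionality shown: normalizing input, grouping equal elements, and counting per group.
 */
final class CommentedWordCount {

    private CommentedWordCount() {}

    static String countWords(String line) {
        // Normalize the case and split on every run of characters that are not letters a-z.
        String[] words = line.toLowerCase(Locale.ROOT).split("[^a-z]+");
        // Sorting places equal words next to each other, so one pass can count each group.
        Arrays.sort(words);

        StringBuilder result = new StringBuilder();
        int start = 0;
        while (start < words.length) {
            // Advance 'end' to the first position that holds a different word.
            int end = start;
            while (end < words.length && words[end].equals(words[start])) {
                end++;
            }
            // split() may produce an empty token (e.g. for leading punctuation); skip it.
            if (!words[start].isEmpty()) {
                if (result.length() > 0) {
                    result.append(' ');
                }
                result.append(words[start]).append('=').append(end - start);
            }
            start = end;
        }
        return result.toString();
    }
}
```

<p>The class-level Javadoc states the idea, the expected input, and the expected output, while the inline comments explain each step as it happens.</p>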
<p>Examples should include parameter parsing, so that you can run an example from the Jar that is created for each example using <code>bin/flink run path/to/myExample.jar --param1 … --param2</code>.</p>
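<p>Flink examples would normally rely on Flink’s own <code>ParameterTool</code> utility for this. To stay self-contained, the sketch below uses only the JDK to show the general idea of reading <code>--name value</code> pairs with sensible defaults. The class <code>ExampleParams</code>, the parameter names <code>input</code> and <code>output</code>, and the rule that a flag without a value means <code>true</code> are all assumptions of this sketch.</p>

```java
import java.util.Properties;

/** Hypothetical, JDK-only argument parser for example programs (not Flink's own utility). */
final class ExampleParams {

    private ExampleParams() {}

    /**
     * Parses "--name value" pairs. A "--name" that is not followed by a value is stored
     * as "true", which is an assumption of this sketch.
     */
    static Properties parse(String... args) {
        Properties params = new Properties();
        int i = 0;
        while (i < args.length) {
            String arg = args[i];
            if (!arg.startsWith("--") || arg.length() == 2) {
                throw new IllegalArgumentException("Expected --name but got: " + arg);
            }
            String name = arg.substring(2);
            // The next argument is a value unless it is missing or starts another option.
            boolean hasValue = i + 1 < args.length && !args[i + 1].startsWith("--");
            params.setProperty(name, hasValue ? args[i + 1] : "true");
            i += hasValue ? 2 : 1;
        }
        return params;
    }

    public static void main(String[] args) {
        Properties params = parse(args);
        // Fall back to defaults so that the example also runs without any arguments.
        String input = params.getProperty("input", "built-in sample data");
        String output = params.getProperty("output", "stdout");
        System.out.println("Reading from " + input + ", writing to " + output);
    }
}
```

<p>Providing defaults for every parameter keeps an example runnable out of the box, while still letting readers point it at their own data.</p>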
<h3 id="table--sql-api">Table &amp; SQL API</h3>
<h4 id="semantics">Semantics</h4>
<p><strong>The SQL standard should be the main source of truth.</strong></p>
<ul>
<li>Syntax, semantics, and features should be aligned with SQL!</li>
<li>We don’t need to reinvent the wheel. Most problems have already been discussed industry-wide and written down in the SQL standard.</li>
<li>We rely on the newest standard (SQL:2016, or ISO/IEC 9075:2016, at the time of writing this document <a href="https://standards.iso.org/ittf/PubliclyAvailableStandards/c065143_ISO_IEC_TR_19075-5_2016.zip">[download]</a>). Not every part is available online, but a quick web search might help here.</li>
</ul>
<p>Discuss divergence from the standard or vendor-specific interpretations.</p>
<ul>
<li>Once a syntax or behavior is defined it cannot be undone easily.</li>
<li>Contributions that need to extend or interpret the standard need a thorough discussion with the community.</li>
<li>Please help committers by performing some initial research about how other vendors such as Postgres, Microsoft SQL Server, Oracle, Hive, Calcite, Beam are handling such cases.</li>
</ul>
<p>Consider the Table API as a bridge between the SQL and Java/Scala programming world.</p>
<ul>
<li>The Table API is an Embedded Domain Specific Language for analytical programs following the relational model.
It is not required to strictly follow the SQL standard with regard to syntax and names, but can be closer to the way a programming language would do/name functions and features, if that helps make it feel more intuitive.</li>
<li>The Table API might have some non-SQL features (e.g. map(), flatMap(), etc.) but should nevertheless “feel like SQL”. Functions and operations should have equal semantics and naming if possible.</li>
</ul>
<h4 id="common-mistakes">Common mistakes</h4>
<ul>
<li>Support SQL’s type system when adding a feature.
<ul>
<li>A SQL function, connector, or format should natively support most SQL types from the very beginning.</li>
<li>Unsupported types lead to confusion, limit the usability, and create overhead by touching the same code paths multiple times.</li>
<li>For example, when adding a <code>SHIFT_LEFT</code> function, make sure that the contribution is general enough not only for <code>INT</code> but also <code>BIGINT</code> or <code>TINYINT</code>.</li>
</ul>
</li>
</ul>
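<p>To illustrate the <code>SHIFT_LEFT</code> point, here is a hedged sketch in plain Java with one variant per integer type. The class <code>ShiftLeftSketch</code> is hypothetical and is not how Flink implements SQL functions; the overflow and shift-distance behavior simply follows Java’s shift operator, which is an assumption rather than a statement about the semantics such a SQL function should have.</p>

```java
/** Hypothetical sketch of one function implemented for several integer SQL types. */
final class ShiftLeftSketch {

    private ShiftLeftSketch() {}

    /** TINYINT variant: the shifted value is narrowed back to 8 bits. */
    static byte shiftLeft(byte value, int bits) {
        return (byte) (value << bits);
    }

    /** INT variant: Java uses only the lowest 5 bits of the shift distance for int. */
    static int shiftLeft(int value, int bits) {
        return value << bits;
    }

    /** BIGINT variant: shifting a long keeps all 64 bits of the result. */
    static long shiftLeft(long value, int bits) {
        return value << bits;
    }
}
```

<p>The difference matters: with only the <code>INT</code> variant, shifting <code>1</code> by 40 bits yields 256 in Java, whereas the <code>BIGINT</code> variant yields 1099511627776. Covering each type from the start avoids such surprises.</p>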
<h4 id="testing">Testing</h4>
<p>Test for nullability.</p>
<ul>
<li>SQL natively supports <code>NULL</code> for almost every operation and has a 3-valued boolean logic.</li>
<li>Make sure to test every feature for nullability as well.</li>
</ul>
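<p>As a small illustration of the 3-valued logic mentioned above, the following sketch models SQL’s <code>AND</code> in plain Java, with a <code>null</code> <code>Boolean</code> standing in for SQL <code>NULL</code> (UNKNOWN). The class name <code>ThreeValuedLogic</code> is hypothetical and not part of Flink.</p>

```java
/** Sketch of SQL's three-valued AND, using a null Boolean for UNKNOWN (NULL). */
final class ThreeValuedLogic {

    private ThreeValuedLogic() {}

    static Boolean and(Boolean left, Boolean right) {
        // FALSE dominates: FALSE AND anything is FALSE, even if the other side is NULL.
        if (Boolean.FALSE.equals(left) || Boolean.FALSE.equals(right)) {
            return Boolean.FALSE;
        }
        // No operand is FALSE here, so a NULL operand makes the result UNKNOWN.
        if (left == null || right == null) {
            return null;
        }
        return Boolean.TRUE;
    }
}
```

<p>A nullability test for a two-argument boolean function would then cover all nine combinations of <code>TRUE</code>, <code>FALSE</code>, and <code>NULL</code>, not only the four combinations without <code>NULL</code>.</p>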
<p>Avoid full integration tests.</p>
<ul>
<li>Spawning a Flink mini-cluster and performing compilation of generated code for a SQL query is expensive.</li>
<li>Avoid integration tests for planner tests or variations of API calls.</li>
<li>Instead, use unit tests that validate the optimized plan that comes out of a planner, or test the behavior of a runtime operator directly.</li>
</ul>
<h4 id="compatibility">Compatibility</h4>
<p>Don’t introduce physical plan changes in minor releases!</p>
<ul>
<li>Backwards compatibility for state in streaming SQL relies on the fact that the physical execution plan remains stable. Otherwise the generated Operator Names/IDs change and state cannot be matched and restored.</li>
<li>Every bug fix that leads to changes in the optimized physical plan of a streaming pipeline hence breaks compatibility.</li>
<li>As a consequence, changes of the kind that lead to different optimizer plans can only be merged in major releases for now.</li>
</ul>
<h4 id="scala--java-interoperability-legacy-code-parts">Scala / Java interoperability (legacy code parts)</h4>
<p>Keep Java in mind when designing interfaces.</p>
<ul>
<li>Consider whether a class will need to interact with a Java class in the future.</li>
<li>Use Java collections and Java Optional in interfaces for a smooth integration with Java code.</li>
<li>Don’t use features of case classes such as .copy() or apply() for construction if a class is likely to be converted to Java.</li>
<li>Pure Scala user-facing APIs should use pure Scala collections/iterables/etc. for natural and idiomatic (“scalaesk”) integration with Scala.</li>
</ul>
</div>
</div>
</div>
</div>
<hr />
<div class="row">
<div class="footer text-center col-sm-12">
<p>Copyright © 2014-2019 <a href="http://apache.org">The Apache Software Foundation</a>. All Rights Reserved.</p>
<p>Apache Flink, Flink®, Apache®, the squirrel logo, and the Apache feather logo are either registered trademarks or trademarks of The Apache Software Foundation.</p>
<p><a href="/privacy-policy.html">Privacy Policy</a> &middot; <a href="/blog/feed.xml">RSS feed</a></p>
</div>
</div>
</div><!-- /.container -->
<!-- Include all compiled plugins (below), or include individual files as needed -->
<script src="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.4/js/bootstrap.min.js"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/jquery.matchHeight/0.7.0/jquery.matchHeight-min.js"></script>
<script src="/js/codetabs.js"></script>
<script src="/js/stickysidebar.js"></script>
<!-- Google Analytics -->
<script>
(function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
(i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),
m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
})(window,document,'script','//www.google-analytics.com/analytics.js','ga');
ga('create', 'UA-52545728-1', 'auto');
ga('send', 'pageview');
</script>
</body>
</html>