blob: 01ee1121baabfe8d4335900f1a2e2d0e08ec5ab4 [file] [log] [blame]
////
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
////
Contributing
============
Contributions via GitHub pull requests are gladly accepted from their original author. By submitting any copyrighted
material via pull request, email, or other means you agree to license the material under the project's open source
license and warrant that you have the legal authority to do so.
Getting Started
---------------
New contributors can start development with TinkerPop by first link:https://help.github.com/articles/fork-a-repo/[forking
then cloning] the Apache TinkerPop link:https://github.com/apache/incubator-tinkerpop[GitHub repository]. Generally
speaking it is best to tie any work done to an issue in link:https://issues.apache.org/jira/browse/TINKERPOP3[JIRA].
Either scan through JIRA for an existing open issue to work on or create a new one.
NOTE: For those who are trying to find a place to start to contribute, consider looking at unresolved issues that
have the "trivial" priority as these issues are specifically set aside as
link:https://issues.apache.org/jira/issues/?jql=project%20%3D%20TINKERPOP3%20AND%20resolution%20%3D%20Unresolved%20AND%20priority%20%3D%20Trivial%20ORDER%20BY%20key%20DESC[low-hanging fruit]
for newcomers.
After making changes, submit a link:https://help.github.com/articles/using-pull-requests/[pull request] through
GitHub, where the name of the pull request is prefixed with the JIRA issue number. In this way, the pull request
and its comments get tied back to the JIRA issue it references.
Before issuing your pull request, please be sure of the following:
. `mvn clean install` works successfully.
. If the change requires modification to the documentation, which can be found in `docs/src`, please be sure to try to
generate the docs. If the changes are minimal and do not include code examples, it might be sufficient to test
generate the docs to validate formatting by just doing `bin/process-docs.sh --dryRun`. If there are code examples,
please be sure to have Zookeeper and Hadoop running when doing a `bin/process-docs.sh`. The documentation is
generated to `/target/docs/htmlsingle`.
. If necessary, run the integration tests. For example, if the changes affect serialization or Gremlin Server/Driver
operations then running the integration tests assures in addition to unit tests will definitely be necessary. After
a successful `mvn clean install`, do `mvn verify -DskipIntegrationTests=false -pl gremlin-server`.
Once a pull request is submitted it must go through <<rtc,review>> and will be merged once three TinkerPop committers
offer positive vote and achieve Apache consensus.
Building and Testing
--------------------
TinkerPop requires `Java 1.8.0_40+` for proper building and proper operations.
* Build Project: `mvn clean install`
** Specify specific tests in a TinkerPop Suite to run with the `GREMLIN_TESTS` environment variable, along with the
Maven project list argument, e.g.:
+
----
export GREMLIN_TESTS='org.apache.tinkerpop.gremlin.process.traversal.step.map.PathTest$Traversals,org.apache.tinkerpop.gremlin.process.traversal.PathTest'
mvn -Dmaven.javadoc.skip=true --projects tinkergraph-gremlin test
----
** Clean the `.groovy/grapes/org.apache.tinkerpop` directory on build: `mvn clean install -DcleanGrapes`
** Turn off "heavy" logging in the "process" tests: `mvn clean install -DargLine="-DmuteTestLogs=true"`
** The test suite for `neo4j-gremlin` is disabled by default - to turn it on: `mvn clean install -DincludeNeo4j`
* Regenerate test data (only necessary given changes to IO classes): `mvn clean install -Dio` from `tinkergraph-gremlin` directory
** If there are changes to the Gryo format, it may be necessary to generate the Grateful Dead dataset from GraphSON (see `IoDataGenerationTest.shouldWriteGratefulDead`)
* Check license headers are present: `mvn apache-rat:check`
* Build AsciiDocs (Hadoop and ZooKeeper must be running): `bin/process-docs.sh`
** Build AsciiDocs (but don't evaluate code blocks): `bin/process-docs.sh --dryRun`
** Process a single AsciiDoc file: +pass:[docs/preprocessor/preprocess-file.sh `pwd`/gremlin-console/target/apache-gremlin-console-*-standalone `pwd`/docs/src/xyz.asciidoc]+
* Build JavaDocs: `mvn process-resources -Djavadoc`
* Check for Apache License headers: `mvn apache-rat:check`
* Check for newer dependencies: `mvn versions:display-dependency-updates` or `mvn versions:display-plugin-updates`
* Deploy JavaDocs/AsciiDocs: `bin/publish-docs.sh svn-username`
* Integration Tests: `mvn verify -DskipIntegrationTests=false`
** Execute with the `-DincludeNeo4j` option to include transactional tests.
** Execute with the `-DuseEpoll` option to try to use Netty native transport (works on Linux, but will fallback to Java NIO on other OS).
* Performance Tests: `mvn verify -DskipPerformanceTests=false`
IDE Setup with Intellij
-----------------------
This section refers specifically to setup within Intellij. TinkerPop has a module called `gremlin-shaded` which
contains shaded dependencies for some libraries that are widely used and tend to introduce conflicts. To ensure
that Intellij properly interprets this module after importing the Maven `pom.xml` perform the following steps:
. Build `gremlin-shaded` from the command line with `mvn clean install`.
. Right-click on the `gremlin-shaded` module in the project viewer of Intellij and select "Remove module".
. In the "Maven Projects" Tool window and click the tool button for "Reimport All Maven projects" (go to
`View | Tool Windows | Maven Projects` on the main menu if this panel is not activated).
. At this point it should be possible to compile and run the tests within Intellij, but in the worst case, use
`File | Invalidate Caches/Restart` to ensure that indices properly rebuild.
Note that it maybe be necessary to re-execute these steps if the `gremlin-shaded` `pom.xml` is ever updated.
Developers working on the `neo4j-gremlin` module should enabled the `include-neo4j` Maven profile in Intellij.
This will ensure that tests will properly execute within the IDE.
If Intellij complains about "duplicate sources" for the Groovy files when attempting to compile/run tests, then
install the link:http://plugins.jetbrains.com/plugin/7442?pr=idea[GMavenPlus Intellij plugin].
For Committers
--------------
The guidelines that follow apply to those with commit access to the main repository:
Communication
~~~~~~~~~~~~~
TinkerPop has a link:http://groups.google.com/group/gremlin-users[user mailing list] and a
link:http://mail-archives.apache.org/mod_mbox/incubator-tinkerpop-dev/[developer mailing list]. As a committer,
it is a good idea to join both.
It would also be helpful to join the public link:https://s.apache.org/tinkerpop[TinkerPop HipChat room] for developer
discussion. This helps contributors to communicate in a more real-time way. Anyone can join as a guest, but for
regular contributors it may be best to request that an Apache HipChat account be created.
Release Notes
~~~~~~~~~~~~~
There is a two-pronged approach to maintaining the change log and preparing the release notes.
1. For work that is documented in JIRA, run the release notes report to include all of
the tickets targeted for a specific release. This report can be included in the
release announcement.
2. The manual change log (`CHANGELOG.asciidoc`) can be used to highlight large
changes, describe themes (e.g. "We focused on performance improvements") or to
give voice to undocumented changes.
Given the dependence on the JIRA report for generating additions to the `CHANGELOG.asciidoc`,
which uses the title of the issue as the line presented in the release note report, titles should
be edited prior to release to be useful in that context. In other words, an issue title should
be understandable as a change in the fewest words possible while still conveying the gist of the
change.
Changes that break the public APIs should be marked with a "breaking" label and should be
distinguished from other changes in the release notes.
Branches
~~~~~~~~
The "master" branch is used for the main line of development and release branches are constructed
for ongoing maintenance work. For example, the "tp30" branch is used to maintain the 3.0.x line
while work on 3.1.x proceeds on "master".
Other branches may be created for collaborating on features or for RFC's that
other developers may want to inspect. It is suggested that the JIRA issue ID be
used as the prefix, since that triggers certain automation, and it provides a
way to account for the branch lifecycle, i.e. "Who's branch is this, and can I
delete it?"
For branches that are NOT associated with JIRA issues, developers should utilize their Apache ID as
a branch name prefix. This provides a unique namespace, and also a way to account for the branch lifecycle.
Developers should remove their own branches when they are no longer needed.
Tags
~~~~
Tags are used for milestones, release candidates, and approved releases. Please refrain from creating arbitrary
tags, as they produce permanent clutter.
Issue Tracker Conventions
~~~~~~~~~~~~~~~~~~~~~~~~~
TinkerPop uses Apache JIRA as its link:https://issues.apache.org/jira/browse/TINKERPOP3[issue tracker]. JIRA is a
very robust piece of software with many options and configurations. To simplify usage and ensure consistency across
issues, the following conventions should be adhered to:
* An issue's "status" should generally be in one of two states: `open` or `closed` (`reopened` is equivalent to `open`
for our purposes).
** An `open` issue is newly created, under consideration or otherwise in progress.
** A `closed` issue is completed for purposes of release (i.e. code, testing, and documentation complete).
** Issues in a `resolved` state should immediately be evaluated for movement to `closed` - issue become `resolved`
by those who don't have the permissions to `close`.
* An issue's "type" should be one of two options: `bug` or `improvement`.
** A `bug` has a very specific meaning, referring to an error that prevents usage of TinkerPop AND does not have a
reasonable workaround. Given that definition, a `bug` should generally have very high priority for a fix.
** Everything else is an `improvement` in the sense that any other work is an enhancement to the current codebase.
* The "component" should be representative of the primary area of code that it applies to and all issues should have
this property set.
* Issues are not assigned "labels" with two exceptions:
** The "breaking" label which marks an issue as one that is representative of a change in the API that might
affect users or vendors. This label is important when organizing release notes.
** The "deprecation" label which is assigned to an issue that is about removing a deprecated portion of the API.
* The "affects/fix version(s)" fields should be appropriately set, where the "fix version" implies the version on
which that particular issue will completed.
* The "priority" field can be arbitrarily applied with one exception. The "trivial" option should be reserved for
tasks that are "easy" for a potential new contributor to jump into and do not have significant impact to urgently
required improvements.
Code Style
~~~~~~~~~~
Contributors should examine the current code base to determine what the code style patterns are and should match their
style to what is already present. Of specific note however, TinkerPop does not use "import wildcards" - IDEs should
be adjusted accordingly to not auto-wildcard the imports.
Deprecation
~~~~~~~~~~~
When possible, committers should avoid direct "breaking" change (e.g. removing a method from a class) and favor
deprecation. Deprecation should come with sufficient documentation and notice especially when the change involves
public APIs that might be utilized by users or implemented by vendors:
* Mark the code with the `@Deprecated` annotation.
* Use javadoc to further document the change with the following content:
** `@deprecated As of release x.y.z, replaced by {@link SomeOtherClass#someNewMethod()}` - if the method is not
replaced then the comment can simply read "not replaced". Additional comments that provide more context are
encouraged.
** `@see <a href="https://issues.apache.org/jira/browse/TINKERPOP3-XXX">TINKERPOP3-XXX</a>` - supply a link to the
JIRA issue for reference.
* All deprecation should typically be tied to a JIRA issue with a "breaking" label - the issue itself does not need to
specifically or solely be about "deprecation" but it should be documented very clearly in the comments what was
deprecated and what the path forward should be.
* Be sure that deprecated methods are still under test - consider using javadoc/comments in the tests themselves to
call out this fact.
* Create a new JIRA issue to track removal of the deprecation for future evaluation - this issue should have the
"breaking" label as well as a "deprecation" label.
* Update the "upgrade documentation" to reflect the API change and how the reader should resolve it.
The JIRA issues that track removal of deprecated methods should be periodically evaluated to determine if it is
prudent to schedule them into a release.
Gremlin Language Test Cases
~~~~~~~~~~~~~~~~~~~~~~~~~~~
When writing a test case for a Gremlin step, be sure to use the following conventions.
* The name of the traversal generator should start with `get`, use `X` for brackets, `_` for space, and the Gremlin-Groovy sugar syntax.
** `get_g_V_hasLabelXpersonX_groupXaX_byXageX_byXsumX_name()`
* When creating a test for a step that has both a barrier and sideEffect form (e.g. `group()`, `groupCount()`, etc.), test both representations.
** `get_g_V_groupCount_byXnameX()`
** `get_g_V_groupCountXaX_byXnameX_capXaX()`
* The name of the actual test case should be the name of the traversal generator minus the `get_` prefix.
* The Gremlin-Groovy version of the test should use the sugar syntax in order to test sugar (as Gremlin-Java8 tests test standard syntax).
** `g.V.age.sum`
* Avoid using lambdas in the test case unless that is explicitly what is being tested as OLAP systems will typically not be able to execute those tests.
* `AbstractGremlinProcessTest` has various static methods to make writing a test case easy.
** `checkResults(Arrays.asList("marko","josh"), traversal)`
** `checkMap(new HashMap<String,Long>() {{ put("marko",1l); }}, traversal.next())`
[[rtc]]
Review then Commit
~~~~~~~~~~~~~~~~~~
Code modifications must go through a link:http://www.apache.org/foundation/glossary.html#ReviewThenCommit[review-then-committ] (RTC)
process before being merged into a release branch. All committers should follow the pattern below, where "you" refers to the
committer wanting to put code into a release branch.
* Make a JIRA ticket for the software problem you want to solve (i.e. a fix).
* Fork the release branch that the fix will be put into.
** The branch name should be the JIRA issue identifier (e.g. `TINKERPOP3-XXX`).
* Develop your fix in your branch.
* When your fix is complete and ready to merge, issue a link:https://git-scm.com/docs/git-request-pull[pull request].
** Be certain that the test suite is passing.
** If you updated documentation, be sure that the `process-docs.sh` is building the documentation correctly.
* Before you can merge your branch into the release branch, you must have at least 3 +1 link:http://www.apache.org/foundation/glossary.html#ConsensusApproval[consensus votes].
** Please see the Apache Software Foundations regulations regarding link:http://www.apache.org/foundation/voting.html#votes-on-code-modification[Voting on Code Modifications].
* Votes are issued by TinkerPop committers as comments to the pull request.
* Once 3 +1 votes are received, you are responsible for merging to the release branch and handling any merge conflicts.
** If there is a higher version release branch that requires your fix (e.g. `3.y-1.z` fix going to a `3.y.z` release), be sure to merge to that release branch as well.
* Be conscious of deleting your branch if it is no longer going to be used so stale branches don't pollute the repository.
NOTE: These steps also generally apply to external pull requests from those who are not official Apache committers. In
this case, the person responsible for the merge after voting is typically the first person available
who is knowledgeable in the area that the pull request affects. Any additional coordination on merging can be handled
via the pull request comment system.
The following exceptions to the RTC (review-then-commit) model presented above are itemized below. It is up to the
committer to self-regulate as the itemization below is not complete and only hints at the types of commits that do not
require a review.
* You are responsible for a release and need to manipulate files accordingly for the release.
** `Gremlin.version()`, CHANGELOG dates, `pom.xml` version bumps, etc.
* You are doing an minor change and it is obvious that an RTC is not required (would be a pointless burden to the community).
** The fix is under the link:http://www.apache.org/foundation/glossary.html#CommitThenReview[commit-then-review] (CTR) policy and lazy consensus is sufficient, where a single -1 vote requires you to revert your changes.
** Adding a test case, fixing spelling/grammar mistakes in the documentation, fixing LICENSE/NOTICE/etc. files, fixing a minor issue in an already merged branch.
When the committer chooses CTR, it is considered good form to include something in the commit message that explains
that CTR was invoked and the reason for doing so. For example, "Invoking CTR as this change encompasses minor
adjustments to text formatting."
Pull Request Format
^^^^^^^^^^^^^^^^^^^
When you submit a pull request, be sure it uses the following style.
* The title of the pull request is the JIRA ticket number + "colon" + the title of the JIRA ticket.
* The first line of the pull request message should contain a link to the JIRA ticket.
* Discuss what you did to solve the problem articulated in the JIRA ticket.
* Discuss any "extra" work done that go beyond the assumed requirements of the JIRA ticket.
* Be sure to explain what you did to prove that the issue is resolved.
** Test cases written.
** Integration tests run (if required for the work accomplished).
** Documentation building (if required for the work accomplished).
** Any manual testing (though this should be embodied in a test case).
* Notes about what you will do when you merge to the respective release branch (e.g. update CHANGELOG).
** These types of "on merge tweaks" are typically done to extremely dynamic files to combat and merge conflicts.
* If you are a TinkerPop committer, you can VOTE on your own pull request, so please do so.
[[dependencies]]
Dependencies
~~~~~~~~~~~~
There are many dependencies on other open source libraries in TinkerPop modules. When adding dependencies or
altering the version of a dependency, developers must consider the implications that may apply to the TinkerPop
LICENSE and NOTICE files. There are two implications to consider:
. Does the dependency fit an Apache _approved_ license?
. Given the addition or modification to a dependency, does it mean any change for TinkerPop LICENSE and NOTICE files?
Understanding these implications is important for insuring that TinkerPop stays compliant with the Apache 2 license
that it releases under.
Regarding the first item, refer to the Apache Legal for a list of link:http://www.apache.org/legal/resolved.html[approved licenses]
that are compatible with the Apache 2 license.
The second item requires a bit more effort to follow. The Apache website offers a
link:http://www.apache.org/dev/licensing-howto.html[how-to guide] on the approach to maintaining appropriate LICENSE
and NOTICE files, but this guide is designed to offer some more specific guidance as it pertains to TinkerPop
and its distribution.
To get started, TinkerPop has both "source" and "binary" LICENSE/NOTICE files:
* Source LICENSE/NOTICE relate to files packaged with the released source code distribution:
link:https://github.com/apache/incubator-tinkerpop/blob/master/LICENSE[LICENSE] / link:https://github.com/apache/incubator-tinkerpop/blob/master/NOTICE[NOTICE]
* Binary LICENSE/NOTICE relate to files packaged with the released binary distributions:
** Gremlin Console link:https://github.com/apache/incubator-tinkerpop/blob/master/gremlin-console/src/main/LICENSE[LICENSE]
/ link:https://github.com/apache/incubator-tinkerpop/blob/master/gremlin-console/src/main/NOTICE[NOTICE]
** Gremlin Server link:https://github.com/apache/incubator-tinkerpop/blob/master/gremlin-server/src/main/LICENSE[LICENSE]
/ link:https://github.com/apache/incubator-tinkerpop/blob/master/gremlin-server/src/main/NOTICE[NOTICE]
As we don't usually include any dependencies in the source distribution (i.e. the source zip distribution), there is
typically no need to edit source LICENSE/NOTICE when editing a TinkerPop `pom.xml`. These files only need to be edited
if the distribution has a file added to it. Such a situation may arise from several scenarios, but it would most
likely come from the addition of a source file from another library. In that case, edit LICENSE to include this file
following the pattern already present there. Refer to the
link:http://www.apache.org/dev/licensing-howto.html#mod-notice[Modifications to Notice] section from the Apache
"how-to" to determine if changes are necessary there.
The binary LICENSE/NOTICE is perhaps most impacted by changes to the various `pom.xml` files. After altering the
`pom.xml` file of any module, build both Gremlin Console and Gremlin Server and examine the contents of both binary
distributions, either:
* target/apache-gremlin-console-x.y.z-distribution.zip
* target/apache-gremlin-server-x.y.z-distribution.zip
There is a _near_ one-to-one mapping between the jar files in those distributions and their respective LICENSE files.
It is described as "near" because the distributions include TinkerPop jars which don't need to be represented there
and there are entries in LICENSE that refer to libraries that are present, but shaded in the gremlin-shaded module.
Everything else should be listed and present under the appropriate license section.
There is also a good chance that the TinkerPop binary NOTICE should be updated. Check if the newly added dependency
contains a NOTICE of its own. If so, include that NOTICE in the TinkerPop NOTICE.