<html>
<head>
<title>Interview with Tom Ball</title>
<meta name="description" content="Interview with Tom Ball">
<meta name="author" content="Ruth Kusterer">
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<link rel="stylesheet" type="text/css" href="../../../netbeans.css">
</head>
<body>
<h1><a name="Interview_with_Tom_Ball"></a>&quot;Standing on the Shoulders of Giants&quot;
&mdash; Interview with Jackpot Team Leader Tom Ball</h1>
<table border="2" cellpadding="7" cellspacing="0" style="float:right;margin:10px" width="200" bgcolor="#FFFFFF">
<tr>
<td align="center" valign="top"><img src="../../../images_www/articles/interviews/tom_ball.jpg"><br />
<div style="text-align:justify"><p>Tom Ball is a senior staff engineer at Sun Microsystems, working on Java language tools.
He has been working with Java for twelve years now, on the JDK, AWT, Swing, Jackpot, and NetBeans teams.</p><p>
Tom considers programming a craft, and is always looking for new tools and techniques to improve it.</p></div>
<p style="text-align:center"><br /><a href="https://netbeans.org/projects/contrib/index.html">Jackpot Project Page</a><br />
<a href="http://weblogs.java.net/blog/tball/">Tom Ball's Blog</a></p>
</td></tr>
</table>
<p><h2><a name="Question_1"> </a> Question: Tom, tell us in a few words, what is Jackpot? </h2>
<p>
At its heart, <a href="https://netbeans.org/projects/contrib/index.html">Jackpot</a> is a platform library for easily finding and correcting patterns
in Java source code using static analysis.
<p />
On top of that, there are two Jackpot modules for NetBeans.
The first module provides a user interface, project integration,
and some built-in queries (more to come).
The second module adds developer support for creating and deploying new queries easily.
Finally, underlying Jackpot is the Mustang (Java 6) version of the javac compiler,
which builds the data model of your project sources that Jackpot manipulates.
Your IDE only needs to run Tiger (Java 5) to use Jackpot, however.
<p />
Jackpot attempts to address several problems that software
researchers considered impossible, such as making radical but
semantically correct changes to large bodies of code, or preserving the
format of the source code when it is transformed. We use an approach
similar to how limits are used in calculus, where you keep moving
closer to a solution until that solution is close enough to be used effectively.
</p>
<p>
</p><h2><a name="Question_2"> </a> Question: How is Jackpot different from refactoring tools? </h2>
<p>
There are two key differences:
<ol>
<li><p> Most refactoring tools are fairly difficult to extend,
so a lot of effort has gone into making Jackpot easy (or at least easier) to extend.
In addition to a rich API for creating query and transformation classes,
Jackpot provides a pattern-matching (rule) language for Java statements and expressions.
We are also putting work into making it easy to share these extensions
with co-workers and fellow community members.
<li><p> The data model that Jackpot searches and transforms has more detailed and
correct information about the source code than most, if not all, other refactoring tools.
This is because it utilizes every scrap of information the javac compiler gleans
during its parsing and attribution phases.
Most refactoring tools use parse trees, and many resolve symbol references,
but it is extremely difficult to determine Java semantic information
such as type attribution correctly, especially with newer language additions such as generic types.
I don't think the Jackpot team is any more capable of creating a correct semantic model
for all Java source code than other refactoring teams (they are all excellent engineers),
but I do believe that Sun's javac team has the best chance of getting it right
(and the most feedback when they don't).
By leveraging javac, we are "standing on the shoulders of giants".
</ol>
</p>
<p>
</p><h2><a name="Question_3"> </a> Question: How does Jackpot compare to static analysis tools like FindBugs and PMD? </h2>
<p>
These are both great tools with different strengths.
Although I haven't done a query-by-query analysis,
my impression is that any query offered by either of these tools
could be written using Jackpot's rule language or its API &mdash;
there is nothing missing from Jackpot's model that would preclude any of these analyses.
FindBugs and PMD also have the advantage of being able to run from the command line and from the Ant build tool.
<p />
A usability problem with these types of tools is that they can wind up reporting hundreds of small issues.
So people often run them once, fix the most glaring bugs by hand,
and then don't use them much after that because during project development
all the unfixed issues obscure new ones when they surface.
Since many of these types of bugs have an obvious fix associated with them,
Jackpot can automatically correct these problems and write the changes to your source files.
Before modifying any source files, it reports the problems it found
and the changes it wants to make, and you have the option of skipping any changes before applying the rest.
So even if it finds hundreds of potential problems,
it can fix most of them in a single pass so that future queries will only find the new issues.
</p>
<p>
</p><h2><a name="Question_4"> </a> Question: How does Jackpot compare to the "Inspect Code" feature in IntelliJ IDEA?</h2>
<p>
My limited understanding is that the "Inspect Code" command in IntelliJ IDEA
is another "anti-pattern" detector like FindBugs and PMD.
Like Jackpot, it has the ability to fix many of these problems,
but its UI requires that you select each problem separately and click on the command to fix it.
It also doesn't show you a diff window with the old and new source like Jackpot can,
so you just have to trust that it will be correct before invoking the command.
Finally, it takes IDEA significantly longer to read in a large project;
I don't mean that as a criticism of their parser, but rather as a demonstration of how fast javac is.
</p>
<p>
</p><h2><a name="Question_5"> </a> Question: How well does Jackpot scale - how much RAM would my system need in order to use Jackpot on a large project (>20,000 lines of code)? </h2>
<p>
You've been working with sample code too long &mdash; these days 20,000 lines is pretty small
for many Java projects!
At James Gosling's keynote at JavaOne 2006, I demonstrated Jackpot with a medium-sized project
that had a quarter of a million lines of code, and it took less than half a minute on my old laptop
to read the project and execute several queries.
It's true that big projects take more memory and time,
but for NetBeans 6.0 we have some ideas on ways to minimize some of the overhead.
On the other hand, if you are working on a big project it's easy to justify
a maxed-out Sun Ultra 40, as it has more than enough memory for the biggest projects
and should pay for itself quickly from the productivity boost.
(No, I'm not in sales, but I am rehearsing these justifications so my boss will buy me one soon.)
</p>
<p>
</p><h2><a name="Question_6"> </a> Question: Can Jackpot be used outside of the NetBeans IDE? </h2>
<p>
No, it requires the project support (where the source files and libraries are, how the project is built, etc.)
and the user interface support that the NetBeans IDE provides.
We certainly could have implemented Jackpot as a separate tool,
but it would have taken a lot longer to recreate all of the infrastructure the NetBeans IDE provides.
</p>
<p>
</p><h2><a name="Question_7"> </a> Question: What is an Abstract Syntax Tree? (And why should I care?) </h2>
<p>
If you are just using Jackpot, either with its built-in queries
or with queries a team member wrote, you shouldn't care.
And if you are using Jackpot's rule language, such knowledge is optional (but useful).
But if you want to write a custom query, here are the gory details.
<p />
An AST (Abstract Syntax Tree) is a data type which represents the syntax of source code.
For example, this HelloWorld.java example:
<pre>
public class HelloWorld {
    public static void main(String[] args) {
        System.out.println("Hello, world!");
    }
}
</pre>
could be expressed using a simple AST as:
<pre>source-file: HelloWorld.java
+ class
  + name
    + identifier: HelloWorld
  + modifiers: public
  + members
    + method
      + name: main
      + modifiers: public, static
      + arguments
        + parameter
          + type: String[]
          + name: args
      + return-type: void
      + block
        + method-invocation: System.out.println
          + arguments
            + string-literal: "Hello, world!"
</pre>
Now, most parsers generate much more explicit AST representations, as does javac.
Because there can be millions of AST nodes in a large project,
queries inspect this tree using the visitor pattern, so only the node types you care about are inspected.
If we wanted to query for static method declarations in the above,
for example, our query would implement a single "visitMethod" method which inspects each method's modifiers
and reports the ones which include static.
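<p />
As a rough sketch of the visitor idea described above, the following uses the JDK's public <code>com.sun.source</code> tree API rather than Jackpot's own query API, which is not shown in this interview. The names <code>StaticMethodQuery</code>, <code>StringSource</code>, <code>StaticMethodScanner</code>, and <code>findStaticMethods</code> are invented for illustration, and the code assumes it runs on a JDK (not a bare JRE) so that the system compiler is available.

```java
import com.sun.source.tree.CompilationUnitTree;
import com.sun.source.tree.MethodTree;
import com.sun.source.util.JavacTask;
import com.sun.source.util.TreeScanner;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import javax.lang.model.element.Modifier;
import javax.tools.JavaCompiler;
import javax.tools.SimpleJavaFileObject;
import javax.tools.ToolProvider;

/** Illustrative query (hypothetical name): lists the static methods found in a source string. */
class StaticMethodQuery {

    /** Wraps source text so javac can parse it without touching the file system. */
    static final class StringSource extends SimpleJavaFileObject {
        private final String code;

        StringSource(String className, String code) {
            super(URI.create("string:///" + className + ".java"), Kind.SOURCE);
            this.code = code;
        }

        @Override
        public CharSequence getCharContent(boolean ignoreEncodingErrors) {
            return code;
        }
    }

    /** Only visitMethod is overridden; all other node types get the default traversal. */
    static final class StaticMethodScanner extends TreeScanner<Void, List<String>> {
        @Override
        public Void visitMethod(MethodTree method, List<String> found) {
            if (method.getModifiers().getFlags().contains(Modifier.STATIC)) {
                found.add(method.getName().toString());
            }
            // Keep descending so methods of nested or local classes are seen too.
            return super.visitMethod(method, found);
        }
    }

    /** Parses the source (no symbol resolution or typing) and collects static method names. */
    static List<String> findStaticMethods(String className, String code) {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler(); // null on a JRE without javac
        JavacTask task = (JavacTask) compiler.getTask(
                null, null, null, null, null, List.of(new StringSource(className, code)));
        List<String> found = new ArrayList<>();
        try {
            for (CompilationUnitTree unit : task.parse()) {
                new StaticMethodScanner().scan(unit, found);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return found;
    }
}
```

Calling <code>findStaticMethods</code> with the HelloWorld source above would report only <code>main</code>. Because this query needs nothing beyond syntax, parsing alone is enough.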
<p />
But ASTs are only half the story, since Java source fragments do not execute in isolation.
For example, if you have three "println" method calls,
you will get three similar trees with no indication that they are referring to the same method.
To query for all references to a method, you need to resolve all references
so the ones that are the same can be easily found
(and correctly resolved so you have confidence in your query).
Symbol resolution seems easy, but it is complicated by all the ways
you can overload, hide, and obscure class members when inheritance is involved.
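<p />
To make the "three println calls" point concrete, here is a minimal sketch that asks javac to resolve each call site to a symbol and then groups the call sites by that symbol. It again uses the JDK's <code>com.sun.source</code> API as a stand-in for Jackpot's model; <code>MethodReferenceQuery</code>, <code>StringSource</code>, and <code>countCallsByMethod</code> are hypothetical names, and a full JDK is assumed.

```java
import com.sun.source.tree.CompilationUnitTree;
import com.sun.source.tree.MethodInvocationTree;
import com.sun.source.util.JavacTask;
import com.sun.source.util.TreePath;
import com.sun.source.util.TreePathScanner;
import com.sun.source.util.Trees;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.lang.model.element.Element;
import javax.lang.model.element.ElementKind;
import javax.tools.JavaCompiler;
import javax.tools.SimpleJavaFileObject;
import javax.tools.ToolProvider;

/** Illustrative query (hypothetical name): groups call sites by the method they resolve to. */
class MethodReferenceQuery {

    /** Wraps source text so javac can read it without touching the file system. */
    static final class StringSource extends SimpleJavaFileObject {
        private final String code;

        StringSource(String className, String code) {
            super(URI.create("string:///" + className + ".java"), Kind.SOURCE);
            this.code = code;
        }

        @Override
        public CharSequence getCharContent(boolean ignoreEncodingErrors) {
            return code;
        }
    }

    /** Returns call-site counts keyed by resolved method symbol, in order of first appearance. */
    static Map<Element, Integer> countCallsByMethod(String className, String code) {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler(); // null on a JRE without javac
        JavacTask task = (JavacTask) compiler.getTask(
                null, null, null, null, null, List.of(new StringSource(className, code)));
        Trees trees = Trees.instance(task);
        Map<Element, Integer> counts = new LinkedHashMap<>();
        try {
            Iterable<? extends CompilationUnitTree> units = task.parse();
            task.analyze(); // attribution: resolves names to symbols and computes types
            for (CompilationUnitTree unit : units) {
                new TreePathScanner<Void, Void>() {
                    @Override
                    public Void visitMethodInvocation(MethodInvocationTree call, Void unused) {
                        // The symbol hangs off the "System.out.println" part of the call.
                        TreePath selectPath = new TreePath(getCurrentPath(), call.getMethodSelect());
                        Element target = trees.getElement(selectPath);
                        if (target != null && target.getKind() == ElementKind.METHOD) {
                            counts.merge(target, 1, Integer::sum);
                        }
                        return super.visitMethodInvocation(call, unused);
                    }
                }.scan(unit, null);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return counts;
    }
}
```

In this sketch, three calls to <code>println</code> with a string argument land under one key, while a call to <code>println</code> with an <code>int</code> argument lands under a different key, because the overloads are distinct symbols even though the call sites look alike in the tree.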
<p />
ASTs and symbol resolution can support a lot of queries,
but many more advanced queries require type information, which comes from a third step that compilers run.
Type attribution can be the most complicated analysis a Java compiler performs,
and it became much more difficult with Java 5 and the introduction of generic types.
Accurate type information is critical for making safe transformations to source code
(ones that preserve its logic),
and the rich type system javac provides allows Jackpot to make more types of changes
than would otherwise be feasible.
<p />
But in the end, all that matters is:
Did the tool find potential problems in my source code and fix them without breaking anything?
</p>
<p>
</p><h2><a name="Question_8"> </a> Question: What did you use as a guideline when creating the scripting language that is used with Jackpot? </h2>
<p>
By "you" I hope you meant past and present members of the Jackpot team,
because the language was designed by James Gosling.
He didn't create the language to solve all refactoring problems,
but instead wanted a simple way to express the sorts of changes we were making for API migration.
API migration involves moving source code from one API to another,
such as rewriting uses of deprecated methods to call their non-deprecated replacements.
Using the Java core classes as an example,
it's easy to find hundreds of methods to be mapped using very similar patterns,
and writing a visitor to handle all of them would be a maintenance nightmare.
He therefore created a "little language" to handle those sorts of transformations,
with the key guideline being simplicity.
It cannot handle all queries, but the transformations it can handle can be expressed
in a short fragment of very legible code.
<p />
Jackpot's rule language isn't really a scripting language
(since it has no support for things like flow control),
but instead a pattern matching language similar in concept (but not syntax) to regular expressions.
The brilliant core idea behind Gosling's language is to use Java to express patterns
you want to find in Java statements and expressions,
with meta-variables (wildcards) for the parts you don't care about.
By using Java to define your pattern, you avoid exposing the AST structure of the model
and making query-writers learn it; the query-writers already know Java and understand it deeply.
It also allows the javac team to
change the AST model as the Java language evolves without breaking rule
definitions.
<p />
See if you can figure out what this transformation does.
Meta-variables (wildcards) are Java identifiers with an initial '$'.
The Java fragment to the left of the "=>" token is the pattern to search for,
the fragment between the "=>" and the "::" tokens is the pattern to convert it to,
and the test expression after the "::" token restricts the rule so it only matches if it passes the test:
<pre>
$object.show() => $object.setVisible(true) ::
$object instanceof java.awt.Component;
</pre>
As you no doubt figured out, this rule converts any statement which invokes the deprecated <code>Component.show()</code>
method to <code>Component.setVisible(true)</code>,
but only when the object's class is derived from <code>Component</code>.
The <code>instanceof</code> test demonstrates why type information is necessary for accurate matching,
as we don't want to match any non-<code>Component</code> classes which have a <code>show()</code> method,
and we also don't want to miss any <code>Component</code> subclasses that may be hiding in obscure corners.
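<p />
The role of the <code>instanceof</code> guard can be approximated outside Jackpot with the JDK's compiler API. The sketch below is not how Jackpot's rule engine is implemented; it only illustrates the match side of such a rule: find no-argument calls to a named method, then keep those whose receiver type is a subtype of a required type. <code>TypedCallMatcher</code>, <code>StringSource</code>, and <code>findReceivers</code> are hypothetical names, and a full JDK is assumed. Passing <code>"show"</code> and <code>"java.awt.Component"</code> would correspond to the rule above.

```java
import com.sun.source.tree.CompilationUnitTree;
import com.sun.source.tree.ExpressionTree;
import com.sun.source.tree.MemberSelectTree;
import com.sun.source.tree.MethodInvocationTree;
import com.sun.source.util.JavacTask;
import com.sun.source.util.TreePath;
import com.sun.source.util.TreePathScanner;
import com.sun.source.util.Trees;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import javax.lang.model.element.TypeElement;
import javax.lang.model.type.TypeKind;
import javax.lang.model.type.TypeMirror;
import javax.lang.model.util.Types;
import javax.tools.JavaCompiler;
import javax.tools.SimpleJavaFileObject;
import javax.tools.ToolProvider;

/** Illustrative matcher (hypothetical name) for "$object.method() :: $object instanceof Type". */
class TypedCallMatcher {

    /** Wraps source text so javac can read it without touching the file system. */
    static final class StringSource extends SimpleJavaFileObject {
        private final String code;

        StringSource(String className, String code) {
            super(URI.create("string:///" + className + ".java"), Kind.SOURCE);
            this.code = code;
        }

        @Override
        public CharSequence getCharContent(boolean ignoreEncodingErrors) {
            return code;
        }
    }

    /** Returns the receiver expressions of matching calls, in source order. */
    static List<String> findReceivers(String className, String code,
                                      String methodName, String requiredTypeName) {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler(); // null on a JRE without javac
        JavacTask task = (JavacTask) compiler.getTask(
                null, null, null, null, null, List.of(new StringSource(className, code)));
        Trees trees = Trees.instance(task);
        List<String> receivers = new ArrayList<>();
        try {
            Iterable<? extends CompilationUnitTree> units = task.parse();
            task.analyze(); // attribution is what makes receiver types available
            Types types = task.getTypes();
            TypeElement required = task.getElements().getTypeElement(requiredTypeName);
            if (required == null) {
                throw new IllegalArgumentException("Unknown type: " + requiredTypeName);
            }
            TypeMirror requiredType = types.erasure(required.asType());
            for (CompilationUnitTree unit : units) {
                new TreePathScanner<Void, Void>() {
                    @Override
                    public Void visitMethodInvocation(MethodInvocationTree call, Void unused) {
                        ExpressionTree select = call.getMethodSelect();
                        if (select instanceof MemberSelectTree && call.getArguments().isEmpty()) {
                            MemberSelectTree member = (MemberSelectTree) select;
                            if (member.getIdentifier().contentEquals(methodName)) {
                                TreePath receiverPath = new TreePath(
                                        new TreePath(getCurrentPath(), select), member.getExpression());
                                TypeMirror type = trees.getTypeMirror(receiverPath);
                                // The type test: a name match alone is not enough.
                                if (type != null && type.getKind() == TypeKind.DECLARED
                                        && types.isSubtype(types.erasure(type), requiredType)) {
                                    receivers.add(member.getExpression().toString());
                                }
                            }
                        }
                        return super.visitMethodInvocation(call, unused);
                    }
                }.scan(unit, null);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return receivers;
    }
}
```

A purely syntactic search for <code>.show()</code> would also match unrelated classes that happen to declare a method of that name; the subtype check is what filters them out while still catching every subclass of the required type.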
</p>
<p>
</p><h2><a name="Question_9"> </a> Question: What does the roadmap for Jackpot include? </h2>
<p>
For NetBeans 6.0, Jackpot will become a part of the core IDE,
and its engine will merge into a new Java source model facility which will be shared by the rest of the IDE.
Its user interface and developer support features will mostly remain,
but will be better integrated with the existing refactoring support.
One feature I'm looking forward to is the ability for anyone to create Jackpot queries
and add them as Java editor tips (the little light bulbs that show when a code improvement is suggested).
<p />
The other big piece is the work on catching up with the other inspection tools
in terms of increasing the number of built-in inspections.
I've spent so long making Jackpot easy to extend
that I hope soon to have some fun actually doing so.
<p />
In general, the overall roadmap for Jackpot is to get developers to use
the tool and report back what they don't like, and we'll continually
incorporate their feedback to keep improving the tool.
</p>
</body>