Summary
No longer require super() and this() to appear first in a constructor.
Goals
Change the Java Language Specification and make corresponding changes to the Java compiler so that:
- super() and this() no longer must appear as the first statement in a constructor
- The language preserves existing safety and initialization guarantees afforded to constructors
- Existing programs continue to compile and function as they did before
Non-Goals
Modifications to the JVM. These changes may prompt reconsideration of the JVM’s current restrictions on constructors. However, to avoid unnecessary linkage between JLS and JVM changes, any such modifications should be proposed in a follow-on JEP. This JEP assumes no change to the current JVM behavior.
Changes to current behavior. There is no intention to change the behavior of any program that adheres to the current JLS.
Addressing larger language concerns. Thinking about the interplay between superclass constructors and subclass initialization has evolved since the Java language was first designed. This work should be considered a pragmatic tweak rather than a statement on language design.
Motivation
Currently, the Java language requires that invocations of this() or super() appear as the first statement in a constructor.
However, the Java Virtual Machine actually allows more flexibility:
- Multiple invocations of this() and/or super() may appear in a constructor, as long as on any code path there is exactly one invocation
- Arbitrary code may appear before this()/super(), as long as that code doesn’t reference the instance under construction, with an exception carved out for field assignments
- However, invocations of this()/super() may not appear within a try { } block (i.e., within a bytecode exception range)
Note that these more permissive rules do not cause any reduction in existing safety guarantees regarding proper initialization: (a) the uninitialized instance is still “off limits”, except for field assignments (which do not affect outcomes), until superclass initialization is performed, and (b) superclass initialization always happens exactly once, either directly via super() or indirectly via this().
So a basic motivation is simply that the JLS is being needlessly restrictive. In fact, this inconsistency is a historical artifact: the original JVM specification was also more restrictive, but this led to issues with initialization of synthetic fields generated by the compiler to support new language features such as inner classes and captured free variables. As a result, the JVM specification was relaxed to accommodate the compiler, but this new flexibility never made its way back up to the language level.
There is also a practical motivation, which is that it’s often convenient to be able to do “housekeeping” before invoking super() or this().
Here’s a somewhat contrived example:
import java.math.*;
public class BigPositiveValue extends BigInteger {
    /**
     * Constructor taking a {@code long} value.
     *
     * @param value value, must be one or greater
     */
    public BigPositiveValue(long value) {
        if (value < 1)
            throw new IllegalArgumentException("non-positive value");
        super(String.valueOf(value));
    }
    /**
     * Constructor taking a base and exponent. Negative exponents are clipped to zero.
     *
     * @param base base
     * @param power exponent
     */
    public BigPositiveValue(int base, float power) {
        if (base < 2)
            throw new IllegalArgumentException("invalid base");
        if (!Float.isFinite(power))
            throw new IllegalArgumentException("invalid power");
        if (power <= 0) // clip negative exponents to zero
            super("1");
        else
            this(Math.round(Math.pow(base, power)));
    }
}
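For comparison, here is a minimal sketch of how the first constructor has to be written under the current rules: the validation is pushed into a static helper so that it can run inside the argument expression of super(). The class name BigPositiveValueToday and the helper requirePositive are hypothetical names chosen for illustration.

```java
import java.math.BigInteger;

// Hypothetical class: the workaround needed while super() must come first.
class BigPositiveValueToday extends BigInteger {

    BigPositiveValueToday(long value) {
        // Validation has to hide inside the argument to super().
        super(String.valueOf(requirePositive(value)));
    }

    // Static helper: it never touches the instance under construction,
    // so calling it before superclass initialization is already legal.
    private static long requirePositive(long value) {
        if (value < 1)
            throw new IllegalArgumentException("non-positive value");
        return value;
    }
}
```

This works, but the check no longer reads as a plain statement in the constructor, and the second constructor of BigPositiveValue (which chooses between super() and this()) has no equally direct equivalent.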
Another reason is to provide a way to avoid bugs caused by a 'this' escape in a superclass constructor. A 'this' escape is when a superclass constructor does something that could cause a subclass method to be invoked before the superclass constructor returns; in such cases the subclass method would operate on an incompletely initialized instance.
For example, consider this class:
import java.util.*;
import java.util.function.*;
/**
 * A {@link Set} that rejects elements not accepted by the configured {@link Predicate}.
 */
public class FilteredSet<E> extends HashSet<E> {
    private final Predicate<? super E> filter;
    public FilteredSet(Predicate<? super E> filter, Collection<? extends E> elems) {
        super(elems);
        this.filter = filter;
    }
    @Override
    public boolean add(E elem) {
        if (!this.filter.test(elem))
            throw new IllegalArgumentException("disallowed element");
        return super.add(elem);
    }
    public static void main(String[] args) {
        new FilteredSet<>(s -> true, Arrays.asList("abc", "def")); // NullPointerException
    }
}
It appears bug-free, but actually it throws a NullPointerException. The reason is not apparent until you realize that the HashSet(Collection) constructor invokes AbstractCollection.addAll(), which invokes add(), which as overridden in FilteredSet dereferences this.filter before that field is initialized. In other words, the bug results from the trap laid by the 'this' escape in the HashSet(Collection) constructor.
Moreover, there's no simple way for the FilteredSet constructor to work around that trap. But the problem could be easily avoided if the constructor could simply do this:
public FilteredSet(Predicate<? super E> filter, Collection<? extends E> elems) {
    this.filter = filter;
    super(elems);
}
Even if a superclass has no 'this' escape, that fact is not going to be obvious to a developer, because verifying it requires recursive inspection of each superclass constructor's code. Moreover, 'this' escape behavior in constructors is rarely part of their documented behavior (either way), and so is subject to change; it's unwise to rely on some other class's unspecified implementation details for correct code. By initializing fields prior to superclass initialization, developers can confidently dismiss any concerns about superclass 'this' escapes.
Description
Language Changes
The JLS will be modified as follows:
- Remove the requirement that super() or this() appear as the first statement in a constructor
- Add the requirement that, in any constructor with explicit super() and/or this() invocations, either super() or this() must be invoked exactly once (assuming the constructor returns normally). This may be specified economically by stating that the compiler treats superclass initialization like a non-static blank final field.
- Add the requirement that no access to the new instance in a constructor, other than assignments to fields, may occur prior to an invocation of super() or this()
- Add the requirement that super() and this() may not appear within any try { } block
- Specify that non-static field initializers and initialization blocks are executed immediately after super() invocation, wherever it occurs
Note: there is no change to the implicit addition of super() at the beginning of any constructor having no explicit super() or this() invocation.
try { } Blocks
The restriction that super() and this() may not appear inside a try { } block comes from the JVM itself, and is due to how StackMaps are represented. The logic is that when a superclass constructor throws an exception, the new instance on the stack is neither fully uninitialized nor fully initialized, so it should be considered unusable, and therefore such a constructor must never return. However, the JVM doesn't allow the bytecode to discard the unusable instance and throw another exception; instead, it doesn't allow it to exist on the stack at all. The net effect is that constructors can't catch exceptions thrown by superclass initialization, even if rethrown.
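One practical consequence, sketched here under the current rules (which this restriction leaves unchanged), is that code wanting to translate an exception thrown by a superclass constructor has to do so outside the constructor, for example in a static factory method. The class names StrictBase and WrappedChild are hypothetical.

```java
// Hypothetical superclass whose constructor may throw.
class StrictBase {
    StrictBase(int size) {
        if (size < 0)
            throw new IllegalArgumentException("negative size");
    }
}

class WrappedChild extends StrictBase {

    private WrappedChild(int size) {
        // Wrapping this call in try { } catch { } is not permitted.
        super(size);
    }

    // The try/catch has to live outside the constructor.
    static WrappedChild create(int size) {
        try {
            return new WrappedChild(size);
        } catch (IllegalArgumentException e) {
            throw new IllegalStateException("could not create WrappedChild", e);
        }
    }
}
```

With this arrangement the partially constructed instance is never visible to the handler, which matches the JVM's view that such an instance is unusable.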
Initialization Order
The JLS specifies that field initializers and initialization blocks execute after superclass initialization via super(). So this class:
class Test1 {
    final int x;
    {
        x = 123;
    }
    public Test1() {
        super();
        this.x = 456;
    }
}
generates this error:
Test1.java:8: error: variable x might already have been assigned
        this.x = 456;
            ^
However, now that super() can appear anywhere in a constructor, an assignment in an initializer block can now happen after an earlier assignment in a constructor. So this class:
class Test1 {
    final int x;
    {
        x = 123;
    }
    public Test1() {
        this.x = 456;
        super();
    }
}
will now generate this error:
Test1.java:4: error: variable x might already have been assigned
        x = 123;
        ^
As before, initializers and initialization blocks happen immediately after superclass initialization, which happens when super() is invoked. But now this can be anywhere in the constructor.
One might ask why not move initializers and initialization blocks to the start of every constructor, but that doesn't work. First, they could have early references (e.g., by invoking an instance method), and second, the constructor might invoke this() and super() on different code branches, so you'd be executing the initialization twice in the this() case.
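The ordering that exists today (superclass initialization, then field initializers and initialization blocks, then the rest of the constructor body) can be observed with a small experiment. This is a sketch that compiles under the current rules; the class names OrderBase and OrderDemo and the shared LOG list are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class OrderBase {
    // Shared log so the order of events can be inspected afterwards.
    static final List<String> LOG = new ArrayList<>();

    OrderBase() {
        LOG.add("superclass constructor");
    }
}

class OrderDemo extends OrderBase {
    // Field initializer: runs right after super() returns.
    private final int x = log("field initializer", 1);

    // Initialization block: runs after the field initializer (textual order).
    {
        log("initialization block", 2);
    }

    OrderDemo() {
        super();
        log("constructor body", 3);
    }

    private static int log(String event, int value) {
        LOG.add(event);
        return value;
    }

    // Constructs one instance and returns the recorded events.
    static List<String> run() {
        LOG.clear();
        new OrderDemo();
        return new ArrayList<>(LOG);
    }
}
```

OrderDemo.run() returns the events in the order: superclass constructor, field initializer, initialization block, constructor body. Under this proposal the middle two steps stay attached to the super() invocation, wherever in the constructor it appears.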
Records
Record constructors are subject to more restrictions than normal constructors. In particular:
- Canonical record constructors may not contain any explicit super() or this() invocation
- Non-canonical record constructors may invoke this(), but not super()
These restrictions remain in place, but otherwise record constructors benefit from these changes. The net change is that non-canonical record constructors can now invoke this() multiple times, as long as it is invoked exactly once along any code path.
Compiler Changes
All constructors except java.lang.Object() must initialize their superclass. Currently, there are three options for superclass initialization:
- Invoke super() as the first statement
- Invoke this() as the first statement
- Do not invoke super() or this() → the compiler adds a super() for you
In the compiler, constructors are currently divided into two categories:
- Initial constructors invoke super() (either explicitly or implicitly)
- Non-initial constructors invoke this()
In the current code, non-initial constructors are treated almost the same as normal methods, because once this() is invoked at the start of the constructor, the object is fully initialized. Initial constructors, however, must be more closely watched to ensure final fields are initialized correctly. Initial constructors also must be modified during compilation to execute any non-static field initializers and initialization blocks. All constructors are modified to handle non-static nested class references to outer instances, and free variable proxies.
Overall, the following "syntactic sugar" adjustments are applied to constructors during compilation:
- If the constructor doesn't invoke this() or super(), an initial super() invocation is inserted
- If the class has non-static field initializers or initialization blocks:
  - Code is added after super() invocations to initialize fields and run initialization blocks
- If the class has an outer instance:
  - A synthetic this$0 field is added to the class
  - Constructors have an extra parameter prepended to carry it
  - Code is added prior to super() invocations to initialize this$0 from the new parameter
- If the class has proxies for free variables:
  - Synthetic val$x fields are added to the class
  - Constructors have extra parameters appended
  - Code is added prior to super() invocations to initialize each val$x from its new parameter
By initializing this$0 and val$x fields before invoking super(), the compiler is already taking advantage of the looser JVM requirements for its own purposes. A side effect is that this alternate version of FilteredSet works fine:
import java.util.*;
import java.util.function.*;
public class FilteredSet {
    public static <E> Set<E> create(Predicate<? super E> filter, Collection<? extends E> elems) {
        return new HashSet<E>(elems) {
            @Override
            public boolean add(E elem) {
                if (!filter.test(elem))
                    throw new IllegalArgumentException("disallowed element");
                return super.add(elem);
            }
        };
    }
    public static void main(String[] args) {
        FilteredSet.create(s -> true, Arrays.asList("abc", "def")); // works!
    }
}
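The extra constructor parameter that carries the outer instance can be observed with reflection on an ordinary inner class. The following is a minimal sketch using the hypothetical names OuterDemo and Inner; the inner class deliberately reads a field of the outer instance so that the compiler has to carry the reference.

```java
import java.lang.reflect.Constructor;

class OuterDemo {
    private final String name = "outer";

    // Non-static nested (inner) class that uses its outer instance.
    class Inner {
        Inner() {
        }

        String outerName() {
            return name;
        }
    }

    // Returns the parameter types of the constructor as actually compiled.
    static Class<?>[] innerConstructorParameterTypes() {
        Constructor<?> ctor = Inner.class.getDeclaredConstructors()[0];
        return ctor.getParameterTypes();
    }
}
```

Although Inner() is declared with no parameters, the reflected constructor reports a single parameter of type OuterDemo, which is the prepended outer-instance parameter.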
Compiler Change Overview
This change impacts a few different areas of the compiler. In all cases, existing classes should compile the same way as they did before; we are strictly expanding the set of accepted source inputs.
At a high level, here's what changes in the compiler:
- Relax checks so that this()/super() may appear anywhere in constructors except for try { } blocks
- Add DA/DU analysis for superclass initialization
- Add checks to disallow early this references, except for field assignments
- Refactor/replace any code that currently assumes this()/super() is always first in constructors
Changes to Specific Files
Below are per-file descriptions of the changes being made.
comp/Attr.java
The check that super()/this() is the first statement of a constructor is relaxed to just check that super()/this() occurs within a constructor.
Non-canonical record constructors may now invoke this() more than once on different code branches, but (as before) they must invoke this() exactly once on any code path and they must never invoke super().
comp/Check.java
The check for recursive constructor invocation is adjusted to handle the fact that a constructor may invoke more than one other constructor, i.e., the invocation call graph is now one-to-many instead of one-to-one.
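To illustrate what checking a one-to-many invocation graph involves, here is a small sketch of cycle detection by depth-first search. It is not the code in Check.java; the class ConstructorCallGraph and its string keys standing in for constructors are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical model: each constructor may invoke several others via this(...).
class ConstructorCallGraph {
    private final Map<String, List<String>> calls = new HashMap<>();

    void addCall(String from, String to) {
        calls.computeIfAbsent(from, k -> new ArrayList<>()).add(to);
    }

    // True if some constructor can reach itself through this(...) invocations.
    boolean hasCycle() {
        Set<String> finished = new HashSet<>();
        Set<String> onPath = new HashSet<>();
        for (String ctor : calls.keySet()) {
            if (visit(ctor, finished, onPath))
                return true;
        }
        return false;
    }

    private boolean visit(String ctor, Set<String> finished, Set<String> onPath) {
        if (onPath.contains(ctor))
            return true;            // reached a constructor already on the current path
        if (finished.contains(ctor))
            return false;           // already fully explored, no cycle through it
        onPath.add(ctor);
        for (String next : calls.getOrDefault(ctor, Collections.emptyList())) {
            if (visit(next, finished, onPath))
                return true;
        }
        onPath.remove(ctor);
        finished.add(ctor);
        return false;
    }
}
```

When every constructor has at most one outgoing edge, following a single chain is enough; once a constructor can have several, each edge has to be explored, which is what the loop in visit() does.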
comp/Flow.java
Flow.FlowAnalyzer checks for uncaught checked exceptions. For initializer blocks, this was previously done by requiring that any checked exceptions thrown be declared as thrown by all initial constructors. This list of checked exceptions is pre-calculated before recursing into the initial constructors. This works because initializer blocks are executed at the beginning of each initial constructor right after super() is called.
In the new version of FlowAnalyzer, initializer blocks are traversed in the flow analysis after each super() invocation, reflecting what actually will happen at runtime (see below), and the pre-calculation is removed. The effect is the same as before, namely, any checked exceptions thrown by initializer blocks must be declared as thrown by all constructors that invoke super().
Flow.AssignAnalyzer is responsible for DA/DU analysis for fields and variables. We piggy-back on the existing machinery for tracking assignments to final instance fields to track superclass initialization, which acts like an assignment to a blank final field, in that it must happen exactly once in each constructor no matter what code branch is taken. To do this we allocate an additional bit in the existing DA/DU bitmaps, and for the most part the existing machinery takes care of the rest.
Previously, the code worked as follows:
- For initial constructors:
  - Assume final fields with initializers or assigned within initialization blocks start out DA.
    - Note: This is an optimization based on the assumption that super() is always first and then followed by initializers
  - Assume all blank final fields start out DU.
  - Upon seeing an assignment to a blank final field:
    - Before, the blank final field must be DU
    - After, the blank final field is DA
  - Require all final fields to be DA on any return.
- For non-initial constructors, don't do DA/DU analysis for fields (i.e., treat non-initial constructors like a normal method)
  - Note: This is another optimization, based on the assumption that this() is always first
Now that super() and this() can appear anywhere in constructors, there is no longer such a thing as an "initial" constructor. The new code works as follows:
- For all constructors:
  - Assume all final fields start out DU.
  - Upon seeing an assignment to a blank final field:
    - Before, the blank final field must be DU
    - After, the blank final field is DA
  - Upon seeing super():
    - Superclass initialization must be DU
    - Mark superclass initialization as DA
    - Recurse on initializers and initialization blocks normally to process field assignments therein
  - Upon seeing this():
    - Superclass initialization must be DU
    - Mark superclass initialization as DA
    - "Infer" assignments to all blank final fields, i.e.:
      - All blank final fields must be DU
      - Mark all blank final fields as DA
  - Require all final fields to be DA on any return.
  - Require superclass initialization to be DA on any return.
The result is that on every path through every constructor, each blank final field must be assigned exactly once, and superclass initialization must also happen exactly once.
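The rules for the superclass-initialization bit can be modeled in a few lines. The following is a hypothetical sketch, not javac's actual data structure: the class SuperInitState and its method names are invented for illustration, and it tracks only the single extra bit rather than the full DA/DU bitmaps. The messages reuse the new compiler error texts.

```java
// Hypothetical model of the extra DA/DU bit for superclass initialization.
final class SuperInitState {

    // State at the start of every constructor: definitely unassigned.
    static final SuperInitState START = new SuperInitState(false, true);

    final boolean definitelyAssigned;    // initialized on every path to this point
    final boolean definitelyUnassigned;  // initialized on no path to this point

    private SuperInitState(boolean da, boolean du) {
        this.definitelyAssigned = da;
        this.definitelyUnassigned = du;
    }

    // Models encountering super() or this(): must be DU before, is DA after.
    SuperInitState invoke() {
        if (!definitelyUnassigned)
            throw new IllegalStateException("superclass constructor might already have been invoked");
        return new SuperInitState(true, false);
    }

    // Models the point where two code branches (e.g. if/else) rejoin.
    SuperInitState join(SuperInitState other) {
        return new SuperInitState(
            definitelyAssigned && other.definitelyAssigned,
            definitelyUnassigned && other.definitelyUnassigned);
    }

    // Models a normal return from the constructor: must be DA.
    void checkReturn() {
        if (!definitelyAssigned)
            throw new IllegalStateException("superclass constructor might not have been invoked");
    }
}
```

If both branches of an if/else invoke a constructor, the joined state is DA and the return check passes. If only one branch does, the joined state is neither DA nor DU, so both a return and a further invocation are rejected.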
AssignAnalyzer is also augmented to enforce these new restrictions:
- Disallow any reference to the current instance prior to super() or this(), except for assignments to fields.
- Disallow invocations of this() or super() within try { } blocks.
comp/Lower.java
This is where the adjustments are made for initializing outer instances and free variable proxies. This now must be done at every super() invocation instead of just at the presumed first and only one, so the new code goes and finds all super() invocations. Otherwise the adjustments made are the same.
jvm/Code.java
This class requires a change because of the following problem: while the class Code.State is used to model the JVM state on each code branch, the "uninitialized" status of each local variable is not part of Code.State but rather stored in the LocalVar fields themselves (which are not cloned per code branch). Previously this was not a problem because the initial this() or super() invocation was always on the (only) initial branch of the code. Now that different branches of code may or may not initialize the superclass, we have to keep track of the "uninitialized" status of each LocalVar separately in each Code.State instance.
This is done by adding a bitmap indicating which local variables are initialized. As a result, to get the current type of a LocalVar, now you access the State instead of accessing the LocalVar directly.
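The idea of moving the "initialized" status into per-branch state can be sketched as follows. This is only an illustration: BranchState and its methods are hypothetical stand-ins and do not reproduce the fields or API of Code.State.

```java
import java.util.BitSet;

// Hypothetical stand-in for per-branch code generation state.
final class BranchState {
    // One bit per local variable slot: set means "initialized".
    private final BitSet initialized;

    BranchState() {
        this(new BitSet());
    }

    private BranchState(BitSet initialized) {
        this.initialized = initialized;
    }

    // Called when the code splits into a new branch: the bitmap is copied,
    // so later changes on one branch are invisible to the other.
    BranchState fork() {
        return new BranchState((BitSet) initialized.clone());
    }

    void markInitialized(int slot) {
        initialized.set(slot);
    }

    boolean isInitialized(int slot) {
        return initialized.get(slot);
    }
}
```

Because the bitmap travels with the state, a branch that has invoked super() sees slot 0 (the instance under construction) as initialized, while a sibling branch that has not yet done so still sees it as uninitialized.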
jvm/Gen.java
Previously, the method Gen.normalizeMethod() added initialization code to initial constructors after the initial super() invocation. This is now done at every super() invocation instead of just after the presumed first and only one.
tree/TreeInfo.java
Removed these utility methods:
public static Name getConstructorInvocationName(List<? extends JCTree> trees, Names names)
public static boolean isInitialConstructor(JCTree tree)
Added these utility methods:
public static boolean hasConstructorCalls(JCTree tree, Name target)
public static boolean hasAnyConstructorCalls(JCTree tree)
public static List findConstructorCalls(JCTree tree, Name target)
public static List findAllConstructorCalls(JCTree tree)
public static void mapSuperCalls(JCBlock block, Function<? super JCExpressionStatement, ? extends JCStatement> mapper)
resources/compiler.properties
There are some changes to error messages:
Removed these errors:
call to {0} must be first statement in constructor
Added these errors:
calls to {0}() may only appear within constructors
calls to {0}() may not appear within try statements
superclass constructor might not have been invoked
superclass constructor might already have been invoked
Changed these errors:
Old: canonical constructor must not contain explicit constructor invocation
New: canonical constructor must not contain explicit constructor invocations
Old: constructor is not canonical, so its first statement must invoke another constructor of class {0}
New: constructor is not canonical, so it must invoke other constructors of class {0}
Testing
Testing of compiler changes will be done using the existing unit tests, which are unchanged except for those tests that verify changed compiler behavior, plus new positive and negative test cases related to this new feature.
All existing JDK classes will be compiled using the previous and new versions of the compiler, and the bytecode compared, to verify there is no change to existing bytecode.
No platform-specific testing should be required.
Risks and Assumptions
An explicit goal of this work is to not change the behavior of existing programs. Therefore, other than any newly created bugs, the risk to existing software should be low.
From a technical point of view, the most complicated aspect of this change is proper DA/DU analysis of superclass initialization. It is believed that risk here is reduced by relying on the existing, well-tested code for blank final field DA/DU analysis.
It's possible that compiling and/or executing newly valid code could trigger bugs in existing code that were not previously accessible.
Dependencies
Java compiler changes - JDK-8194743