This contribution diary post is much longer than normal because its subject matter is deeper. It also assumes you’ve read through previous entries and/or are already familiar with how JavaScript compilers and type checkers work. If that’s not the case, no worries! Read through a previous entry such as TypeScript Contribution Diary: Improved Syntax Error for Enum Member Colons and Andrew Branch’s Debugging the TypeScript Codebase.
My previous TypeScript Contribution Diary posts were structured as stories explaining the timeline of how those changes made it in. This entry’s pull request had 159 comments over three years — far too many for that format. I’ll instead give a high-level overview of the backing issue’s context, the pull request’s strategy, and general code changes.
Project Scope
There ended up being two areas of source code I had to change:
- Updating the Type Checker: Adjusting TypeScript’s type errors to be more lenient
- Updating Transformers: Adjusting output JavaScript for more varieties of constructors
I’ll give a high-level overview for each. I’d strongly recommend referring back to the pull request in your local editor to understand the flow of code.
Let’s dig in! 🎂
Updating the Type Checker
Most use cases for including non-this
, non-super
code in the constructor of a derived class are fairly small.
The ones I’d seen in the wild were generally about logging and/or creating a temporary variable to be passed as an argument to the super()
call.
I also didn’t want to spend a great deal of time to handle complicated logical cases.
Thus, I thought it’d be best to tweak TypeScript’s type system logic without overhauling it.
Instead of requiring the super()
call be the first expression in the constructor, I would make two requirements:
- It would need to be a root-level expression: meaning it couldn’t be contained in a block such as an
if
orfor
- Runtime uses of
this
andsuper
keywords would not be allowed before that root-level expression
You can see the changes in the pull request’s src/compiler/checker.ts
file view.
These next two blog post sections will give a high-level overview of them.
Checking for a Root Level super()
TypeScript’s type checker already found the first super()
call in a constructor using a call to an existing findFirstSuperCall
function:
const superCall = findFirstSuperCall(node.body!);
That function returns the first node that matches isSuperCall
, skipping any function boundary and recursively searching through all other child nodes:
function findFirstSuperCall(node: Node): SuperCall | undefined {
return isSuperCall(node)
? node
: isFunctionLike(node)
? undefined
: forEachChild(node, findFirstSuperCall);
}
I fortunately didn’t need to change findFirstSuperCall
for my changes.
I used the existing superCall
variable for a check to make sure it was root level with a new superCallIsRootLevelInConstructor
function:
if (!superCallIsRootLevelInConstructor(superCall, node.body!)) {
error(
superCall,
Diagnostics.A_super_call_must_be_a_root_level_statement_within_a_constructor_of_a_derived_class_that_contains_initialized_properties_parameter_properties_or_private_identifiers,
);
}
superCallIsRootLevelInConstructor
checks whether a super()
call expression’s parent expression statement is in the body of a constructor:
function superCallIsRootLevelInConstructor(superCall: Node, body: Block) {
const superCallParent = walkUpParenthesizedExpressions(superCall.parent);
return (
isExpressionStatement(superCallParent) && superCallParent.parent === body
);
}
To recap TypeScript’s AST behavior around call statements:
- Block: Area containing lines of code, commonly surrounded by
{}
- Statement: Line of code, commonly a child of a block
- Examples include expression statements, for statements, and if statements
- Expression Statement: Contains a call expression as its child expression
- Call Expression: A call to a function
I find it easier to remember the distinction by recalling that statements may optionally have a semicolon. In codebases that include semicolons, expression statements contain a child such as a binary expression or call expression plus one character for a semicolon:
super();
|------| <- expression statement
|-----| <- call expression
Checking Constructor Statement Order
Next up was making sure nothing in the constructor accessed super
or this
before the super()
call.
I did that with a for loop over the statements in the constructor.
For each statement:
- If the statement is an expression statement that contains a
super()
call, mark that we found it and break the loop - If the statement is a “prologue directive”, continue
- If the statement “immediately” references
super
orthis
, break the loop
for (const statement of node.body!.statements) {
if (
isExpressionStatement(statement) &&
isSuperCall(skipOuterExpressions(statement.expression))
) {
superCallStatement = statement;
break;
}
if (
!isPrologueDirective(statement) &&
nodeImmediatelyReferencesSuperOrThis(statement)
) {
break;
}
}
After the loop, if we hadn’t found the super()
call, issue a type error with an amusingly long error message for failing to find it.
if (superCallStatement === undefined) {
error(
node,
Diagnostics.A_super_call_must_be_the_first_statement_in_the_constructor_to_refer_to_super_or_this_when_a_derived_class_contains_initialized_properties_parameter_properties_or_private_identifiers,
);
}
“A super call must be the first statement in the constructor to refer to super or this when a derived class contains initialized properties parameter properties or private identifiers.”
Prologue Directives
I had never heard of this term before this pull request.
It refers to string literals used as a statements such as "use asm;"
and "use strict";
.
They are by nature allowed to come before any code in a constructor.
In retrospect, I don’t recall why I added a special case for them to the function. Ah well.
Edit 4/13/2022: The ECMAScript Spec refers to them as “Directive Prologues”. Whoops.
Immediately Referencing super
or this
By “immediately” I mean a node accesses super
or this
in code that is known to execute immediately, such as children of expressions and blocks.
Another way of putting that is ignoring any code that won’t be immediately executed, such as function or property declaration.
There are a lot of edge cases in there!
For example, a class extends
clause immediately executes the base class being extended, but initial values for properties in any class aren’t used in runtime until the constructor for their class is called.
class Base {}
class Derived extends Base {
constructor() {
// class Middle { ... } executes immediately for Inside to extend it...
class Inside extends class Middle {
// ...while this property is created later, per-instance
woweeMiddle = this;
} {
// ...while this property is created later, per-instance
woweeInside = this;
}
super();
new Inside();
}
}
I wrote a nodeImmediatelyReferencesSuperOrThis
helper function that, similar to findFirstSuperCall
, recursively checks children of a node.
It stops searching when it encounters a node that creates a new class scope or delays execution of its contents, such as a function or class property.
function nodeImmediatelyReferencesSuperOrThis(node: Node): boolean {
if (
node.kind === SyntaxKind.SuperKeyword ||
node.kind === SyntaxKind.ThisKeyword
) {
return true;
}
if (isThisContainerOrFunctionBlock(node)) {
return false;
}
return !!forEachChild(node, nodeImmediatelyReferencesSuperOrThis);
}
/**
* @returns Whether the node creates a new 'this' scope for its children.
*/
export function isThisContainerOrFunctionBlock(node: Node): boolean {
switch (node.kind) {
// Arrow functions use the same scope, but may do
// so in a "delayed" manner
// For example, `const getThis = () => this` may be
// before a super() call in a derived constructor
case SyntaxKind.ArrowFunction:
case SyntaxKind.FunctionDeclaration:
case SyntaxKind.FunctionExpression:
case SyntaxKind.PropertyDeclaration:
return true;
case SyntaxKind.Block:
switch (node.parent.kind) {
case SyntaxKind.Constructor:
case SyntaxKind.MethodDeclaration:
case SyntaxKind.GetAccessor:
case SyntaxKind.SetAccessor:
// Object properties can have computed names;
// only method-like bodies start a new scope
return true;
default:
return false;
}
default:
return false;
}
}
With these approximate type checker changes, the type checker allows for code before the super()
call as long as it doesn’t immediately reference super
or this
.
The type checker was sufficiently updated for my changes.
Hooray!
That leaves us with making TypeScript’s code emit properly transform JavaScript for these new constructor variants.
Updating Transformers
TypeScript’s code emit converts input TypeScript syntax to output JavaScript syntax by passing each input AST through a series of transformers.
You can see the impacted transformers in the pull request under src/transformers
.
They’re coordinated by a getScriptTransformers
in src/compiler/transformer.ts
.
The transformers relevant to this pull request are, in order:
transformTypeScript
: Removes type system specific syntax, leaving pure glorious JavaScript.transformClassFields
: Massages class fields such as class properties and parameter properties into their JavaScript equivalents.transformES....
: For each language version recognized by TypeScript, a transformer of the next language version’s name transforms it.- These start at ESNext, then decrease sequentially from the newest known language version down to the configured output target language version.
- For example, if the configured output language version is
"es2019"
, then as of TypeScript 4.6 the transformers to be run would be:transformESNext
,transformES2021
, andtransformES2020
.
Transformers generally recursively crawl through the nodes in the file’s AST, applying transformations to specific node types as they find them. These next three blog post sections will give a high-level overview of each of the changed transformers.
transformTypeScript
transformTypeScript
includes a transformConstructorBody
function that turns any parameter properties into assignments within the constructor.
For example, this TypeScript class:
class HasParameterProperty {
constructor(public property: number) {
console.log("Hello, world!");
}
}
…would become this JavaScript class (or the equivalent with Object.defineProperty
if useDefineForClassFields
is enabled):
class HasParameterProperty {
constructor(property) {
this.property = property;
console.log("Hello, world!");
}
}
transformTypeScript
previously assumed it could add both prologue directives and the initial super call all at once when transforming a constructor with nothing between them.
It did so with a function named addPrologueDirectivesAndInitialSuperCall
that returned the index of the first statement after them.
I replaced that function with code that computed two important variables:
indexAfterLastPrologueStatement
: After copying any prologue statements, the index of the node just after themsuperStatementIndex
: Index of the first foundsuper()
call after prologue statements, or-1
if not found
const indexAfterLastPrologueStatement = factory.copyPrologue(
body.statements,
statements,
/*ensureUseStrict*/ false,
visitor,
);
const superStatementIndex = findSuperStatementIndex(
body.statements,
indexAfterLastPrologueStatement,
);
function findSuperStatementIndex(
statements: NodeArray<Statement>,
indexAfterLastPrologueStatement: number,
) {
for (let i = indexAfterLastPrologueStatement; i < statements.length; i += 1) {
const statement = statements[i];
if (getSuperCallFromStatement(statement)) {
return i;
}
}
return -1;
}
Using those two variables, this is the order the code now takes to create the transformed constructor’s body in the proper order:
- If
superStatementIndex
was found, first visit existing statements up to and including it - Visit any parameter properties and map them into nodes:
- If
superStatementIndex
was found, place those parameter properties immediately after it - If
superStatementIndex
wasn’t found, place the parameter properties first in the constructor
- If
- Add any remaining statements from the body, skipping the
superStatementIndex
index if it was found
// If there was a super call, visit existing statements up to and including it
if (superStatementIndex >= 0) {
addRange(
statements,
visitNodes(
body.statements,
visitor,
isStatement,
indexAfterLastPrologueStatement,
superStatementIndex + 1 - indexAfterLastPrologueStatement,
),
);
}
// Transform parameters into property assignments. Transforms this:
//
// constructor (public x, public y) {
// }
//
// Into this:
//
// constructor (x, y) {
// this.x = x;
// this.y = y;
// }
//
const parameterPropertyAssignments = mapDefined(
parametersWithPropertyAssignments,
transformParameterWithPropertyAssignment,
);
// If there is a super() call, the parameter properties go immediately after it
if (superStatementIndex >= 0) {
addRange(statements, parameterPropertyAssignments);
}
// Since there was no super() call, parameter properties are the first statements in the constructor
else {
statements = addRange(parameterPropertyAssignments, statements);
}
// Add remaining statements from the body, skipping the super() call if it was found
addRange(
statements,
visitNodes(body.statements, visitor, isStatement, superStatementIndex + 1),
);
transformClassFields
transformClassFields
also contains a transformConstructorBody
function.
This time it’s used to turn class properties into assignments within the constructor.
For example, this TypeScript class:
class HasClassProperty {
property = 1;
constructor() {
console.log("Hello, world!");
}
}
…would become this JavaScript class (or the equivalent with Object.defineProperty
if useDefineForClassFields
is enabled):
class HasClassProperty {
constructor() {
this.property = 1;
console.log("Hello, world!");
}
}
This transformConstructorBody
also inserts a “synthetic” super(...arguments)
if the class is a derived one with a property initializer and without its own constructor.
For example, this TypeScript class:
class HasJustClassProperty {
property = 1;
}
…needs to create its own constructor
and super(...arguments)
in order to hold the mapped property in its output JavaScript:
class HasJustClassProperty {
constructor() {
super(...arguments);
this.property = 1;
}
}
In order to account for code being emitted before any class properties and any constructor, the logic is roughly:
- Map any prologue directives and explicit
super()
call into the new constructor - If there was a
super()
call, splice any statements preceding it after the prologue statements and before thesuper()
call - Later depending on whether a
super()
call was found:- If it was, add parameter properties immediately after it
- If it wasn’t but a synthetic
super(...arguments)
was added, add those parameter properties just after it - If neither is the case, add those parameter properties to the top of the constructor
Ordering is tricky!
I also excluded parameter properties from being moved into the constructor when useDefineForClassFields
is enabled, as those properties are then handled elsewhere.
I don’t remember where else they’re handled but I do remember that when I didn’t filter them out, they appeared twice in the output JavaScript.
I’ve omitted code snippets from this transformer’s explanation for brevity.
transformES2015
The ES2015-to-ES5 transformer is the largest of TypeScript’s transformers and contains more lines of code than all the other ECMAScript transformers combined.
I suggested in #47573: Remove older emit support over time that TypeScript no longer target ECMAScript versions older than what any realistically used runtime environment needs…
but until dropping pre-ES2020 happens some years in the future (🙏), ES2015 classes still need to be transformed into function prototype
equivalents in TypeScript’s compiled output JavaScript.
This TypeScript class:
class HasPropertyAndLog {
message = "world";
constructor() {
console.log("Hello", this.message);
}
}
…becomes roughly this output JavaScript:
var HasPropertyAndLog = /** @class */ (function () {
function HasPropertyAndLog() {
this.message = "world";
console.log("Hello", this.message);
}
return HasPropertyAndLog;
})();
transformES2015
’s transformConstructorBody
keeps track of two arrays of statement nodes:
prologue
: Any existing prologue directives, as well as any nodes added during transformation meant to be added just after themstatements
: The rest of the statements output for the function body
My change started off by adding three pieces of logic:
- Captures any previously existing prologue directives in an
existingPrologue
array - Find the
super()
call, storing it in asuperCall
and its statement index insuperStatementIndex
- This is done with a new
findSuperCallAndStatementIndex
that loops through constructor body statements after those inexistingPrologue
- This is done with a new
- Create a
postSuperStatementsStart
variable to determine where post-er(...)
nodes are meant to be placed:- If a
super()
call wasn’t found, place them just afterexistingPrologue
- If a
super()
call was found, place them just aftersuperStatementIndex
- If a
transformConstructorBody
is then able to use that information to create constructor body statements:
- If the
super()
call wasn’t synthesized, copy prologue statements intoprologue
- Create a
superCallExpression
variable to store a newsuper()
call, if a previous one exists:- If the existing
super()
is synthesized, replace it with the ES5 equivalent:var _this = _super !== null && _super.apply(this, arguments) || this;
- If the existing
super()
wasn’t synthesized, store the result of visiting it
- If the existing
- Add any default property value assignments and constructor rest parameter to the end of
prologue
- Add any remaining statements from the constructor to
statements
The logic for where to place that superCallExpression
node changes based on a few potential cases commented in src/compiler/transformers/es2015.ts#1056:
- Whether the constructor is in a derived class
- If so, whether the constructor ends with a
super()
call and doesn’t refer tothis
- If so, whether the constructor ends with a
- Whether the
super()
call, if it exists, is the first call in the constructor
I’ve omitted code snippets from this transformer’s explanation for brevity.
I know that was a big wall of text, but if you read through the contents of transformConstructorBody
and use its comments as reference, I think it can be reasoned through.
The transformer code has to include a few extra function calls to properly massage this
scoping and source maps from ES2015+ classes to ES5 functions here and there.
Bewildered at that high-level walkthrough? Me too! Please upvote #47573: Remove older emit support over time to make it more likely we’ll no longer need to support ES5 eventually! 💖
Final Thanks
I’d like to extend a sincere heartfelt thanks to the several developers who reviewed the pull request over the years. In order of review:
- Klaus Meinhardt: An all-around knowledgeable developer who has previously created a linter (fimbullinter/wotan) and gave helpful pointers early in the pull request — all as a fellow external contributor.
- Wesley Wigham: For giving the pull request a helpful review and its first approval back in 2020.
- Ron Buckton: For an intensely thorough set of reviews containing deep insights into the wild and wacky world of JavaScript and TypeScript classes, along with the final approval in 2022.