Variable Assignment

Now that we know the basic premise of converting tokens to an AST node, let's walk through our assignment node.

As a quick recap, here's what the DSL looks like:


new hello = 'world'

Which we break up into 4 different tokens:

DSL Text	Token	AST Node
new	VariableDeclaration	?
hello	Literal	Literal
=	AssignmentOperator	?
'world'	String	String

We've already established that our Literal and String tokens map to AST nodes of the same name, so what about VariableDeclaration and AssignmentOperator?

If we look at our DSL, exchanging text for tokens, we end up with:


<VariableDeclaration> <Literal> <AssignmentOperator> <String>

Now, this is true for our case, but we may not always be assigning a string as a value:


<VariableDeclaration> <Literal> <AssignmentOperator> <Token>

Looking at this we can see that our AssignmentOperator token is actually pretty useless, even though it makes our DSL easier to read.

From an AST perspective, we can completely ignore it, as we know that after we encounter a VariableDeclaration, the next token should be a Literal (the variable name), and then the next token after that will be our assignment value.
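Written out as data, the positions described above look roughly like this. This is only a sketch: the `Token` shape (a `type` plus an optional `value`) and the `TokenType` object are assumptions made for illustration, and the tokenizer from earlier in the series may represent them differently.

```typescript
// Assumed token kinds, modelled as a const object for this sketch.
const TokenType = {
  VariableDeclaration: 'VariableDeclaration',
  Literal: 'Literal',
  AssignmentOperator: 'AssignmentOperator',
  String: 'String',
} as const
type TokenType = (typeof TokenType)[keyof typeof TokenType]

// Hypothetical token shape: keyword and operator tokens carry no value.
interface Token {
  type: TokenType
  value?: string
}

// Tokens for: new hello = 'world'
const recapTokens: Token[] = [
  { type: TokenType.VariableDeclaration }, // index 0: starts the assignment
  { type: TokenType.Literal, value: 'hello' }, // index 1: the variable name
  { type: TokenType.AssignmentOperator }, // index 2: readable, but carries no data
  { type: TokenType.String, value: 'world' }, // index 3: the assigned value
]

// Rebuild the pattern shown earlier from the token types alone.
const tokenPattern = recapTokens.map((token) => `<${token.type}>`).join(' ')
console.log(tokenPattern)
// <VariableDeclaration> <Literal> <AssignmentOperator> <String>
```

Only indexes 1 and 3 contribute anything to the AST node, which is why the operator can be skipped.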

So, let's just noop our AssignmentOperator token for the moment:


if (currentToken.type === TokenType.AssignmentOperator) {
  currentIndex++
  return null
}

Next, let's check for our variable declaration token:


if (currentToken.type === TokenType.VariableDeclaration) {
  currentIndex++

  // Process our name and value tokens

  return {
    type: ASTNodeType.Assignment,
    name: ??,
    value: ??
  }
}

We know from above that we need to process the next 3 tokens.


const variableNameNode = process()
const assignmentOperatorNode = process()
const variableValueNode = process()

TypeScript forces us to check whether they're null, so while we're at it, let's also check that they're the correct types.


const variableNameNode = process()
if (!variableNameNode || variableNameNode.type !== ASTNodeType.Literal) {
  throw new Error('Invalid variable node')
}

// Process our AssignmentOperator
process()

const variableValueNode = process()
if (!variableValueNode) {
  throw new Error('Invalid variable value')
}

There are a couple of things you may notice here:

We're not actually checking our assignment operator node.
We're not checking that our variable value is valid as a value.

Assignment Operator check

The first check needs a different approach. Even though we 'process' our assignment operator, process() returns null for it, so what do we check?

The easiest solution here is to check our actual token, instead of the processed AST node.

Change our single process() call to the following:


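// Read the raw token (not an AST node) and step past it
const assignmentToken = tokens[currentIndex++]

// Also guards against running off the end of the token list
if (!assignmentToken || assignmentToken.type !== TokenType.AssignmentOperator) {
  throw new Error('Must use = operator to assign value')
}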

We know that the correct token structure means that the next (and only the next) token is an AssignmentOperator, so we manually increment currentIndex and check this token. If it's not correct, we can now throw an error.
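To see the check in isolation, here is a minimal sketch that pulls it into a standalone helper. The helper name `expectAssignmentOperator`, the `Token` shape, and the `TokenType` object are all hypothetical and exist only for this example; in the parser itself the check lives inline and mutates `currentIndex` directly.

```typescript
// Assumed token kinds, modelled as a const object for this sketch.
const TokenType = {
  VariableDeclaration: 'VariableDeclaration',
  Literal: 'Literal',
  AssignmentOperator: 'AssignmentOperator',
  String: 'String',
} as const
type TokenType = (typeof TokenType)[keyof typeof TokenType]

// Hypothetical token shape.
interface Token {
  type: TokenType
  value?: string
}

// Hypothetical helper: checks the token at `index` and returns the index
// of the token that follows it, or throws if the operator is missing.
function expectAssignmentOperator(tokens: Token[], index: number): number {
  const token = tokens[index]
  // A missing token (end of input) is treated the same as a wrong token.
  if (!token || token.type !== TokenType.AssignmentOperator) {
    throw new Error('Must use = operator to assign value')
  }
  return index + 1
}

// new hello = 'world'  (operator at index 2)
const validTokens: Token[] = [
  { type: TokenType.VariableDeclaration },
  { type: TokenType.Literal, value: 'hello' },
  { type: TokenType.AssignmentOperator },
  { type: TokenType.String, value: 'world' },
]
console.log(expectAssignmentOperator(validTokens, 2)) // 3

// new hello 'world'  (operator left out)
const missingOperatorTokens: Token[] = [
  { type: TokenType.VariableDeclaration },
  { type: TokenType.Literal, value: 'hello' },
  { type: TokenType.String, value: 'world' },
]
try {
  expectAssignmentOperator(missingOperatorTokens, 2)
} catch (error) {
  console.log((error as Error).message) // Must use = operator to assign value
}
```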

Variable value check

If we have a look at the return type of variableValueNode, it's the following:


ASTValueNode<ASTNodeType.String, string> |
ASTValueNode<ASTNodeType.Literal, string> |
ASTProgramNode |
ASTAssignmentNode |
ASTLogNode |
null

This is ASTToken (our union of all node types) plus the null from our process() type signature.

Why is this a problem?

Because not all of our AST nodes have a value key on them, TypeScript won't let us access the value key until we've checked that it exists.

For the sake of this check, let's assume that any AST node that contains a value key is valid as a value for an assignment.

How do we do this check?

By using JavaScript's in operator ('key' in object), which TypeScript understands as a type guard.


const variableValueNode = process()
if (!variableValueNode || !('value' in variableValueNode)) {
  throw new Error('Invalid variable value')
}

If we now take a look at the type of variableValueNode after this if statement, we'll see that it's now restricted to the few types that have a value key:


ASTValueNode<ASTNodeType.String, string> |
ASTValueNode<ASTNodeType.Literal, string> |
ASTAssignmentNode

Technically ASTAssignmentNode isn't a valid node here, but this check is likely good enough for now, so let's continue.
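The narrowing can be tried out on its own with a reduced union. The sketch below is an illustration under assumptions: the `ASTNodeType` object, the `children` field on `ASTLogNode`, and the `describeValue` function are all made up for this example and are not the series' real definitions.

```typescript
// Assumed node kinds, modelled as a const object for this sketch.
const ASTNodeType = {
  String: 'String',
  Literal: 'Literal',
  Log: 'Log',
} as const

interface ASTValueNode<T extends string, V> {
  type: T
  value: V
}

// Hypothetical shape: the important part is that it has no `value` key.
interface ASTLogNode {
  type: typeof ASTNodeType.Log
  children: ASTNode[]
}

type ASTNode =
  | ASTValueNode<typeof ASTNodeType.String, string>
  | ASTValueNode<typeof ASTNodeType.Literal, string>
  | ASTLogNode

// Hypothetical function showing the guard from the text.
function describeValue(node: ASTNode | null): string {
  if (!node || !('value' in node)) {
    throw new Error('Invalid variable value')
  }
  // Past the guard, `node` is narrowed to the two ASTValueNode members,
  // so reading `node.value` type-checks.
  return `${node.type}: ${node.value}`
}

console.log(describeValue({ type: ASTNodeType.String, value: 'world' }))
// String: world
```

Removing the `'value' in node` half of the condition makes the `node.value` access a compile error, because `ASTLogNode` is still part of the union at that point.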

Returning our AST node

The final step in our variable assignment parser is to return our new AST node.

Up above, we had placeholders in our return code:


return {
  type: ASTNodeType.Assignment,
  name: ??,
  value: ??
}

But now that we have our two required nodes (variableNameNode & variableValueNode) parsed and narrowed to the correct types, we can fill this out.


return {
  type: ASTNodeType.Assignment,
  name: variableNameNode.value,
  value: variableValueNode
}

You may notice that we actually use these two nodes differently.

We know for a fact that our variable name is always going to be a string, so we unwrap the name Literal and store only its string value.

For our value though, we're not entirely sure what it will be. It could be a string, like in our example DSL ('world'), or it could be another literal:


new hello1 = 'world'
new hello2 = hello1

So instead of returning just the value, we need to capture the type of that value as well, so we are just going to return the entire AST node as our value.
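Putting the pieces together, a cut-down version of the whole assignment parser might look like the sketch below. It is an approximation under assumptions: the `Token` shape, the `TokenType` and `ASTNodeType` objects, and the `parseTokens` wrapper are hypothetical, and the Program and Log nodes from the real union are left out to keep it short.

```typescript
// Assumed token and node kinds; the series' real definitions may differ.
const TokenType = {
  VariableDeclaration: 'VariableDeclaration',
  Literal: 'Literal',
  AssignmentOperator: 'AssignmentOperator',
  String: 'String',
} as const
type TokenType = (typeof TokenType)[keyof typeof TokenType]

const ASTNodeType = { Literal: 'Literal', String: 'String', Assignment: 'Assignment' } as const

// Hypothetical shapes for this sketch.
interface Token { type: TokenType; value?: string }
interface ASTValueNode<T extends string, V> { type: T; value: V }
type ASTStringNode = ASTValueNode<typeof ASTNodeType.String, string>
type ASTLiteralNode = ASTValueNode<typeof ASTNodeType.Literal, string>
interface ASTAssignmentNode {
  type: typeof ASTNodeType.Assignment
  name: string
  value: ASTNode
}
type ASTNode = ASTStringNode | ASTLiteralNode | ASTAssignmentNode

function parseTokens(tokens: Token[]): ASTNode[] {
  let currentIndex = 0

  function process(): ASTNode | null {
    const currentToken = tokens[currentIndex]
    if (!currentToken) throw new Error('Unexpected end of input')

    // The operator produces no AST node of its own.
    if (currentToken.type === TokenType.AssignmentOperator) {
      currentIndex++
      return null
    }
    if (currentToken.type === TokenType.Literal) {
      currentIndex++
      return { type: ASTNodeType.Literal, value: currentToken.value ?? '' }
    }
    if (currentToken.type === TokenType.String) {
      currentIndex++
      return { type: ASTNodeType.String, value: currentToken.value ?? '' }
    }
    if (currentToken.type === TokenType.VariableDeclaration) {
      currentIndex++
      const variableNameNode = process()
      if (!variableNameNode || variableNameNode.type !== ASTNodeType.Literal) {
        throw new Error('Invalid variable node')
      }
      // Check the raw operator token instead of a processed node.
      const assignmentToken = tokens[currentIndex++]
      if (!assignmentToken || assignmentToken.type !== TokenType.AssignmentOperator) {
        throw new Error('Must use = operator to assign value')
      }
      const variableValueNode = process()
      if (!variableValueNode || !('value' in variableValueNode)) {
        throw new Error('Invalid variable value')
      }
      // The name is unwrapped to a string; the value keeps its whole node.
      return { type: ASTNodeType.Assignment, name: variableNameNode.value, value: variableValueNode }
    }
    throw new Error('Unexpected token')
  }

  const nodes: ASTNode[] = []
  while (currentIndex < tokens.length) {
    const node = process()
    if (node) nodes.push(node)
  }
  return nodes
}

// Tokens for: new hello1 = 'world' followed by new hello2 = hello1
const programTokens: Token[] = [
  { type: TokenType.VariableDeclaration },
  { type: TokenType.Literal, value: 'hello1' },
  { type: TokenType.AssignmentOperator },
  { type: TokenType.String, value: 'world' },
  { type: TokenType.VariableDeclaration },
  { type: TokenType.Literal, value: 'hello2' },
  { type: TokenType.AssignmentOperator },
  { type: TokenType.Literal, value: 'hello1' },
]
const assignmentNodes = parseTokens(programTokens)
console.log(JSON.stringify(assignmentNodes, null, 2))
```

The two resulting nodes differ only in the `type` of their `value`: a String node for `'world'` and a Literal node for `hello1`. That distinction is what would be lost if only the raw string were stored.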

Building a language in Typescript