Lexing

This is the process of converting your program's source text into a flat sequence of distinct base elements, usually called tokens.

Think of things like:

  • string
  • start block
  • end block
  • for loop
  • assignment

The list above is certainly not exhaustive; the actual set depends entirely on the syntax and semantics of the language that we're building.

An important thing to note about our lexing phase: we are not assigning meaning to anything here, we are only describing what our program is made of.

new hello = 'world'
print hello

Take the DSL snippet above as an example. When we lex it, we may end up with a list of tokens that looks something like the following:

[
  { type: 'LineBreak' },
  { type: 'VariableDeclaration' },
  { type: 'Literal', value: 'hello' },
  { type: 'AssignmentOperator' },
  { type: 'String', value: 'world' },
  { type: 'LineBreak' },
  { type: 'Log' },
  { type: 'Literal', value: 'hello' },
  { type: 'LineBreak' }
]

Ignore the Literal distinction if you need to - think of it as just a name. We'll cover this in a later chapter.

As mentioned above, we're not parsing or storing context or meaning here. The result is simply a one-dimensional array describing our source code.
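
To make this concrete, here is a minimal sketch of a hand-written lexer for the two-line DSL shown earlier. Only the token type names come from the token list in this chapter. The function name lex, the KEYWORDS table, the whitespace handling, and the choice to throw on unknown characters are all assumptions made for illustration, not necessarily how a real implementation would look.

```javascript
// Hypothetical keyword table: maps DSL words to the token types used above.
const KEYWORDS = new Map([
  ['new', 'VariableDeclaration'],
  ['print', 'Log'],
]);

// Walk the source one character at a time and emit a flat array of tokens.
function lex(source) {
  const tokens = [];
  let i = 0;

  while (i < source.length) {
    const ch = source[i];

    if (ch === '\n') {
      tokens.push({ type: 'LineBreak' });
      i++;
    } else if (ch === ' ' || ch === '\t' || ch === '\r') {
      i++; // assumed: spaces and tabs carry no meaning in this DSL
    } else if (ch === '=') {
      tokens.push({ type: 'AssignmentOperator' });
      i++;
    } else if (ch === "'") {
      // Single-quoted string: read up to the next quote (no escapes handled).
      const end = source.indexOf("'", i + 1);
      if (end === -1) throw new Error(`Unterminated string at index ${i}`);
      tokens.push({ type: 'String', value: source.slice(i + 1, end) });
      i = end + 1;
    } else if (/[A-Za-z_]/.test(ch)) {
      // A word is either a keyword or a plain name (a Literal in our list).
      const start = i;
      while (i < source.length && /[A-Za-z0-9_]/.test(source[i])) i++;
      const word = source.slice(start, i);
      if (KEYWORDS.has(word)) {
        tokens.push({ type: KEYWORDS.get(word) });
      } else {
        tokens.push({ type: 'Literal', value: word });
      }
    } else {
      throw new Error(`Unexpected character '${ch}' at index ${i}`);
    }
  }

  return tokens;
}

// The leading '\n' stands in for a blank first line in the source.
console.log(lex("\nnew hello = 'world'\nprint hello\n"));
```

Note that this sketch will happily lex something like print = new into three tokens without complaint. Deciding whether that sequence makes any sense is not the lexer's job.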