tokenize

fun tokenize(input: CharSequence, discardEnabled: Boolean = true): List<Token>

Transforms the given character input into a list of tokens.

The Lexer processes input from start to end, repeatedly attempting to match Lexemes at the current position. The Tokens produced by each Lexeme are collected and returned as a list for further use in parsing.

When discardEnabled is true (the default), tokens marked as discardable (such as whitespace) are excluded from the output list. When it is false, they are kept in the result.

Lexing succeeds only if the entire input sequence is consumed.
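The loop described above can be sketched roughly as follows. This is a minimal illustration, not the library's implementation: the Token, Lexeme, and Lexer classes here are hypothetical stand-ins, the "first matching Lexeme wins" rule is an assumption, and IllegalArgumentException is a placeholder for whatever exception the real tokenize throws.

```kotlin
// Hypothetical stand-ins: the real Token and Lexeme types are not shown on this page.
data class Token(val type: String, val text: String, val discardable: Boolean = false)

class Lexeme(val type: String, pattern: String, val discardable: Boolean = false) {
    private val regex = Regex(pattern)

    // Returns a Token only if the pattern matches starting exactly at `position`.
    fun matchAt(input: CharSequence, position: Int): Token? {
        val match = regex.find(input, position) ?: return null
        // Reject matches that start later in the input, and empty matches.
        if (match.range.first != position || match.value.isEmpty()) return null
        return Token(type, match.value, discardable)
    }
}

class Lexer(private val lexemes: List<Lexeme>) {
    fun tokenize(input: CharSequence, discardEnabled: Boolean = true): List<Token> {
        val tokens = mutableListOf<Token>()
        var position = 0
        while (position < input.length) {
            // Assumption: the first Lexeme that matches at the current position wins.
            val token = lexemes.firstNotNullOfOrNull { it.matchAt(input, position) }
                // Placeholder exception type; the real one is not named in this page.
                ?: throw IllegalArgumentException("Cannot tokenize input at index $position")
            position += token.text.length
            // Discardable tokens still advance the position but may be left out of the result.
            if (!(discardEnabled && token.discardable)) tokens += token
        }
        return tokens
    }
}

fun main() {
    val lexer = Lexer(
        listOf(
            Lexeme("NUMBER", "[0-9]+"),
            Lexeme("PLUS", "\\+"),
            Lexeme("WS", "\\s+", discardable = true),
        )
    )
    println(lexer.tokenize("1 + 23").map { it.text })               // [1, +, 23]
    println(lexer.tokenize("1 + 23", discardEnabled = false).size)  // 5
}
```

In this sketch, discardable tokens are still matched and consumed, so whitespace never blocks lexing; discardEnabled only controls whether they appear in the returned list. An input such as "1 ? 2" fails at the "?" because no Lexeme matches there, which mirrors the rule that the entire input must be consumed.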

Return

A list of tokens representing the lexical structure of input

Parameters

input

The character sequence to tokenize

discardEnabled

Whether to exclude discardable tokens from the result

Throws

An exception if the input cannot be fully tokenized