Lexeme

interface Lexeme<out T : Token>

Defines a pattern for recognizing and producing tokens from character input.

A Lexeme encapsulates the logic for matching a specific lexical pattern at the current position in the input stream. When a match is found, it returns a LexemeMatch describing the match length and a factory for creating the corresponding Token.

Implementations should define the match method to perform pattern matching within a LexerContext.
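As a rough illustration of what an implementation might look like, the sketch below matches a run of ASCII digits. Every type in it (Token, NumberToken, LexemeMatch, LexerContext, SketchLexeme, DigitsLexeme) is a simplified, hypothetical stand-in, since the real signatures are not shown on this page. In particular, the sketch passes the context to match as an explicit parameter, whereas the documented match() takes no arguments.

```kotlin
// Hypothetical stand-ins; the library's real Token, LexemeMatch and LexerContext may differ.
interface Token { val value: CharSequence }

data class NumberToken(override val value: CharSequence) : Token

// Describes a match: how many characters it covers and how to build the token.
class LexemeMatch<out T : Token>(val length: Int, val factory: (CharSequence) -> T)

class LexerContext(val input: CharSequence, val position: Int)

// Simplified version of the interface: the context is passed explicitly (an assumption).
interface SketchLexeme<out T : Token> {
    val name: String? get() = null
    fun match(context: LexerContext): LexemeMatch<T>?
}

// Matches one or more ASCII digits starting at the current position.
object DigitsLexeme : SketchLexeme<NumberToken> {
    override val name: String? = "digits"

    override fun match(context: LexerContext): LexemeMatch<NumberToken>? {
        var end = context.position
        while (end < context.input.length && context.input[end] in '0'..'9') end++
        val length = end - context.position
        // No digits at the current position means no match.
        return if (length == 0) null else LexemeMatch(length, ::NumberToken)
    }
}

fun main() {
    val context = LexerContext("123+4", position = 0)
    val match = DigitsLexeme.match(context)
    if (match != null) {
        val token = match.factory(context.input.subSequence(0, match.length))
        println("${token.value} (${match.length} chars)") // prints "123 (3 chars)"
    }
}
```

Returning null for "no match" mirrors the nullable return type of match documented below; the caller can then try the next Lexeme at the same position.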

Note on equals and hashCode

Lexeme implementations may optionally override equals and hashCode to provide custom equality semantics. By default, Lexemes inherit these functions from Any, so two Lexemes are equal only when they are the same instance.

If an implementation overrides these functions, ensure:

  • Both equals and hashCode are overridden together (not just one or the other)

  • Both equals and hashCode follow the semantic contracts specified in Any

  • For Lexemes a and b, the implementation of a.equals(b) must satisfy (at minimum) the following semantics:

      a) Return true when a === b

      b) Return false when the Lexemes have different name values

      c) Return false when there exists any input (i.e. LexerContext state) for which a.match()?.length != b.match()?.length

      d) If none of the above apply, equals may (but is not required to) return true
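One way to satisfy rules a) through d) is sketched below for a lexeme that matches a single fixed keyword. KeywordLexeme and its matchLength helper are hypothetical and do not implement the real Lexeme interface; they only illustrate how equals and hashCode can be derived from the state that determines matching behavior.

```kotlin
// Hypothetical keyword lexeme; not part of the documented API.
class KeywordLexeme(val name: String?, val keyword: String) {

    // Length of the match at `position`, or null when the keyword is not there.
    fun matchLength(input: CharSequence, position: Int): Int? =
        if (input.startsWith(keyword, position)) keyword.length else null

    override fun equals(other: Any?): Boolean {
        if (this === other) return true              // rule a
        if (other !is KeywordLexeme) return false
        if (name != other.name) return false         // rule b
        // Rule c: two different keywords always disagree on some input,
        // so they must compare unequal. Rule d: identical keywords match
        // identically everywhere, so returning true is permitted.
        return keyword == other.keyword
    }

    // Built from the same fields as equals, so equal lexemes share a hash code.
    override fun hashCode(): Int = 31 * (name?.hashCode() ?: 0) + keyword.hashCode()
}

fun main() {
    val a = KeywordLexeme("if", "if")
    val b = KeywordLexeme("if", "if")
    val c = KeywordLexeme("iff", "if")
    println(a == b)                       // true: same name, same keyword
    println(a == c)                       // false: names differ (rule b)
    println(a.hashCode() == b.hashCode()) // true
}
```

Deriving both functions from name and keyword keeps them consistent with each other, which is what the Any contract requires.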

Parameters

T

The type of Token produced by this Lexeme

Properties

open val defaultFactory: (value: CharSequence) -> T

A default Token factory that creates a new instance of T for a match of this Lexeme. It receives the matched CharSequence, which is equal to the Token.value of the created Token.

open val name: String?

An optional name for this Lexeme, used for equals, hashCode, debugging, and error messages.

Functions

abstract fun match(): LexemeMatch<T>?