Lexer

amyc.parsing.Lexer$
object Lexer extends Pipeline[List[File], Iterator[Token]], Lexers

Attributes

Supertypes
trait Lexers
trait Zippers
trait RegExps
class Pipeline[List[File], Iterator[Token]]
class Object
trait Matchable
class Any
Self type
Lexer.type

Members list

Type members

Inherited classlikes

class Lexer

Tokenizes an input source with respect to a sequence of token producers.

Attributes

Inherited from:
Lexers
Supertypes
class Object
trait Matchable
class Any
object Lexer

Contains utilities to build lexers.

Attributes

Inherited from:
Lexers
Supertypes
class Object
trait Matchable
class Any
case class Producer(regExp: RegExp, makeToken: TokenMaker)

Associates a regular expression with a token generator.

Attributes

Inherited from:
Lexers
Supertypes
trait Serializable
trait Product
trait Equals
class Object
trait Matchable
class Any
class ProducerDecorator(regExp: RegExp)

Adds methods to build a Producer from a RegExp.

Attributes

Inherited from:
Lexers
Supertypes
class Object
trait Matchable
class Any
sealed abstract class RegExp

Regular expressions over characters.

Attributes

Inherited from:
RegExps
Supertypes
class Object
trait Matchable
class Any
object RegExp

Contains primitive constructors for regular expressions.

Attributes

Inherited from:
RegExps
Supertypes
trait Sum
trait Mirror
class Object
trait Matchable
class Any

Types

type Character = Char

Tiny Silex reference

Silex's lexer essentially allows you to define a list of regular expressions in their order of priority. To tokenize a given input stream of characters, each individual regular expression is applied in turn. If a given expression matches, it is used to produce a token of maximal length. Whenever a regular expression does not match, the expression of next-highest priority is tried. The result is a stream of tokens.

Regular expressions r can be built using the following operators:

  • word("abc") matches the sequence "abc" exactly
  • r1 | r2 matches either expression r1 or expression r2
  • r1 ~ r2 matches r1 followed by r2
  • oneOf("xy") matches either "x" or "y" (i.e., it is a shorthand of word and | for single characters)
  • elem(c) matches character c
  • elem(f) matches any character for which the boolean predicate f holds
  • opt(r) matches r or nothing at all
  • many(r) matches any number of repetitions of r (including none at all)
  • many1(r) matches any non-zero number of repetitions of r

To define the token that should be output for a given expression, one can use the |> combinator with an expression on the left-hand side and a function producing the token on the right. The function is given the sequence of matched characters and the source-position range as arguments.

For instance,

elem(_.isDigit) ~ word("kg") |> { (cs, range) => WeightLiteralToken(cs.mkString).setPos(range._1) }

will match a single digit followed by the characters "kg" and turn them into a "WeightLiteralToken" whose value will be the full string matched (e.g. "1kg").

Attributes

type Position

Type of positions.

Attributes

type Token = Token

Type of tokens.

Attributes

Inherited types

type TokenMaker = (Iterable[Character], (Position, Position)) => Token

Type of functions that create tokens.

Attributes

Inherited from:
Lexers
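Since a TokenMaker is an ordinary function value, one matching the type above could be written as follows (IntLitToken is hypothetical here, standing in for whatever token class the grammar actually uses):

```scala
// Turns the matched characters into an integer-literal token,
// anchored at the start of the matched source range.
val makeIntLit: TokenMaker = (cs, range) =>
  IntLitToken(cs.mkString.toInt).setPos(range._1)
```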

Value members

Concrete methods

override def run(files: List[File])(using Context): Iterator[Token]

Attributes

Inherited methods

def andThen[G](thenn: Pipeline[Iterator[Token], G]): Pipeline[F, G]

Attributes

Inherited from:
Pipeline
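andThen chains this pipeline with a subsequent stage, feeding the lexer's token iterator into the next phase. A hedged usage sketch (Parser is assumed to be some Pipeline[Iterator[Token], G] stage elsewhere in the compiler):

```scala
// Pipeline[List[File], Iterator[Token]] andThen
// Pipeline[Iterator[Token], G]  yields  Pipeline[List[File], G]
val frontend = Lexer.andThen(Parser)
```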
def elem(char: Character): RegExp

Regular expression that accepts only the single character char.

Attributes

Inherited from:
RegExps
def elem(predicate: Character => Boolean): RegExp

Regular expression that accepts single characters based on a predicate.

Attributes

Inherited from:
RegExps
def many(regExp: RegExp): RegExp

Regular expression that accepts zero or more repetitions of regExp.

Attributes

Inherited from:
RegExps
def many1(regExp: RegExp): RegExp

Regular expression that accepts one or more repetitions of regExp.

Attributes

Inherited from:
RegExps
def oneOf(chars: Seq[Character]): RegExp

Regular expression that accepts any of the characters in chars.

Attributes

Inherited from:
RegExps
def opt(regExp: RegExp): RegExp

Regular expression that accepts zero or one instances of regExp.

Attributes

Inherited from:
RegExps
def word(chars: Seq[Character]): RegExp

Regular expression that accepts only the sequence of characters chars.

Attributes

Inherited from:
RegExps

Concrete fields

lazy val delimiters: Producer
lazy val identifiers: Producer
lazy val intLitToken: Producer
lazy val keywords: Producer
lazy val lexer: Lexer
lazy val modifiers: Producer
override val name: String
lazy val whitespace: Producer
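For illustration, a producer field like keywords might be defined along these lines (a hedged sketch: the token class and the keyword set are assumptions, not the real amyc definitions):

```scala
// Each keyword is a `word` alternative; the matched characters become
// a keyword token positioned at the start of the matched range.
lazy val keywordsSketch: Producer =
  (word("abstract") | word("case") | word("class") |
   word("def") | word("else") | word("end") |
   word("fn") | word("if") | word("match") | word("val")) |> {
    (cs, range) => KeywordToken(cs.mkString).setPos(range._1)
  }
```

Listing the keyword producer before the identifier producer matters: priority order ensures "match" lexes as a keyword rather than an identifier.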

Inherited fields

val any: RegExp

Regular expression that accepts any single character.

Attributes

Inherited from:
RegExps