A simple tokenizer written in TypeScript.
The Token constructor accepts an array of candidate RegExp patterns that may be used to extract a token, the original source string, and optionally a starting index.
When constructed, Token tests the given regular expressions in order and selects the first that matches at the starting index. The pattern and value properties are then set to that regular expression and the portion of the string it matched, respectively.
console.log(new Token( [ /[\s]+/, /[a-z]+/ ], "just a test string"));
will output:
Token { index: 0, pattern: /[a-z]+/, value: "just" }
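Note that /[\s]+/ is listed first but is not selected, which suggests that patterns are only tried at the starting index rather than anywhere in the string. A minimal sketch of that selection step follows. The class name SketchToken, the use of the sticky (y) flag for anchoring, and the decision to skip empty matches are all assumptions made for illustration, not the library's actual source.

```typescript
// Hypothetical re-implementation for illustration; not the library's source.
class SketchToken {
  pattern: RegExp | undefined;
  value: string | undefined;

  constructor(patterns: RegExp[], source: string, public index: number = 0) {
    for (const pattern of patterns) {
      // Copy the pattern with the sticky flag so it can only match at `index`.
      const flags = pattern.flags.includes("y") ? pattern.flags : pattern.flags + "y";
      const anchored = new RegExp(pattern.source, flags);
      anchored.lastIndex = index;
      const match = anchored.exec(source);
      // Assumption: an empty match does not count as a token.
      if (match !== null && match[0].length > 0) {
        this.pattern = pattern; // keep the caller's original RegExp object
        this.value = match[0];
        break; // first matching pattern wins
      }
    }
  }
}

const token = new SketchToken([/[\s]+/, /[a-z]+/], "just a test string");
console.log(token.index, token.value);
```

If no pattern matches at the starting index, this sketch leaves pattern and value undefined; how the real Token reports that case is not stated here.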
yield
The static generator method yield accepts the same arguments as the constructor, but extracts and yields tokens one after another for as long as one can be matched.
console.log( [ ...Token.yield( [ /[\s]+/, /[a-z]+/ ], "just a test string", 0 ) ] );
will output:
[
Token { index: 0, pattern: /[a-z]+/, value: "just" },
Token { index: 4, pattern: /[\s]+/, value: " " },
Token { index: 5, pattern: /[a-z]+/, value: "a" },
Token { index: 6, pattern: /[\s]+/, value: " " },
Token { index: 7, pattern: /[a-z]+/, value: "test" },
Token { index: 11, pattern: /[\s]+/, value: " " },
Token { index: 12, pattern: /[a-z]+/, value: "string" }
]
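The indices in this listing advance by the length of each matched value, so the loop behind such a generator can be sketched roughly as: match at the current index, yield the token, move the index past it, and stop when nothing matches. The function name sketchTokens, the SketchTokenLike shape, and the sticky-flag anchoring below are assumptions for illustration only, not the library's implementation.

```typescript
// Hypothetical sketch of the yield loop; names and details are assumptions.
interface SketchTokenLike {
  index: number;
  pattern: RegExp;
  value: string;
}

function* sketchTokens(
  patterns: RegExp[],
  source: string,
  index: number = 0
): Generator<SketchTokenLike> {
  while (index < source.length) {
    let found: SketchTokenLike | undefined;
    for (const pattern of patterns) {
      // Sticky copy: the pattern may only match exactly at `index`.
      const anchored = new RegExp(pattern.source, pattern.flags.replace("y", "") + "y");
      anchored.lastIndex = index;
      const match = anchored.exec(source);
      // Assumption: empty matches are skipped so the loop always advances.
      if (match !== null && match[0].length > 0) {
        found = { index, pattern, value: match[0] };
        break;
      }
    }
    // Stop as soon as no pattern matches at the current position.
    if (found === undefined) return;
    yield found;
    index += found.value.length;
  }
}

const tokens = Array.from(sketchTokens([/[\s]+/, /[a-z]+/], "just a test string"));
console.log(tokens.map((t) => t.index));
```

For the input above, this sketch produces seven tokens at indices 0, 4, 5, 6, 7, 11 and 12, in line with the listing. Input that no pattern can match simply ends the sequence early, so a caller who needs to detect unconsumed text can compare the end of the last token with the length of the source string.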