This file implements common functionality for lexical analysis, such as string comparison, tokenization (splitting strings by whitespace), and parsing arithmetic types from strings.
Classes, functions, and variables in this file | |
---|---|
bool | compare_strings (const array< char > & first, const char * second) |
bool | compare_strings (const string & first, const char * second, unsigned int second_length) |
bool | tokenize (const char * str, unsigned int length, array< unsigned int > & tokens, hash_map< string, unsigned int > & names) |
bool | parse_float (const CharArray & token, double & value) |
bool | parse_uint (const CharArray & token, unsigned int & value, unsigned int base = 0) |
bool | parse_ulonglong (const CharArray & token, unsigned long long & value) |
bool | parse_int (const CharArray & token, int & value) |
bool | parse_long (const CharArray & token, long & value) |
bool | parse_long_long (const CharArray & token, long long & value) |
bool | parse_uint (const char (&token)[N], unsigned int & value, unsigned int base = 0) |
struct | position |
struct | lexical_token |
bool | print (const lexical_token< TokenType > & token, Stream & stream, Printer & printer) |
void | read_error (const char * error, const position & pos) |
bool | emit_token (array< lexical_token< TokenType >> & tokens, const position & start, const position & end, TokenType type) |
bool | emit_token (array< lexical_token< TokenType >> & tokens, array< char > & token, const position & start, const position & end, TokenType type) |
void | free_tokens (array< lexical_token< TokenType >> & tokens) |
bool | expect_token (const array< lexical_token< TokenType >> & tokens, const unsigned int & index, TokenType type, const char * name) |
bool | append_to_token (array< char > & token, char32_t next, mbstate_t & shift) |
bool compare_strings (const array< char > & first, const char * second)

Compares the core::array of chars first and the null-terminated C string second. Returns true if the strings are equivalent, and false otherwise.
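A minimal usage sketch, assuming this function lives in the core namespace alongside the containers it operates on:

```cpp
#include "lex.h"  // assumed header name for this file

// Sketch: check whether a lexed token buffer spells a keyword.
// Namespace `core` is an assumption; only the documented overload is used.
static bool is_while_keyword(const core::array<char>& token_text) {
    return core::compare_strings(token_text, "while");
}
```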
bool compare_strings (const string & first, const char * second, unsigned int second_length)

Compares the core::string first and the native char array second, whose length is given by second_length. Returns true if the strings are equivalent, and false otherwise.
bool tokenize (const char * str, unsigned int length, array< unsigned int > & tokens, hash_map< string, unsigned int > & names)

Tokenizes the given native char array str, with length length, assigning to each unique token an unsigned int identifier. These identifiers are stored in the core::hash_map names, and the identifiers of the tokenized string are added to the core::array tokens.
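A sketch of splitting a sentence into integer identifiers. The container constructors and the length field used below are assumptions about the core container API, not something documented in this file:

```cpp
#include <cstring>
#include "lex.h"  // assumed header name for this file

// Sketch: tokenize a whitespace-delimited sentence into integer ids.
// The constructors and `length` field on core::array and core::hash_map,
// and the `core` namespace, are assumptions.
static bool example_tokenize() {
    const char* sentence = "the cat saw the dog";
    core::array<unsigned int> token_ids(16);               // assumed constructor (initial capacity)
    core::hash_map<core::string, unsigned int> names(64);  // assumed constructor (initial capacity)

    if (!core::tokenize(sentence, (unsigned int) strlen(sentence), token_ids, names))
        return false;  // insufficient memory

    // Repeated words ("the") map to the same identifier, so token_ids has
    // five entries while names contains only four distinct keys.
    return token_ids.length == 5;
}
```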
bool parse_float (const CharArray & token, double & value)

Parses token as a double.

Template parameters:
CharArray | a string type that implements two fields: (1) … |

Returns true if successful, or false if there is insufficient memory or token is not an appropriate string representation of a floating-point number.
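The CharArray requirement above is truncated, so the sketch below simply assumes the two required fields are a character buffer and its length; the simple_token type is hypothetical and may not match the real requirement:

```cpp
#include "lex.h"  // assumed header name for this file

// Hypothetical token type: assumes (without confirmation from the truncated
// documentation) that CharArray needs a character buffer and a length.
struct simple_token {
    const char* data;
    unsigned int length;
};

static bool example_parse_float() {
    simple_token tok = { "3.14159", 7 };
    double value;
    if (!core::parse_float(tok, value))  // `core` namespace is an assumption
        return false;  // invalid float literal, or insufficient memory
    return value > 3.141 && value < 3.142;
}
```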
bool parse_uint (const CharArray & token, unsigned int & value, unsigned int base = 0)

Parses token as an unsigned int.

Template parameters:
CharArray | a string type that implements two fields: (1) … |

Returns true if successful, or false if there is insufficient memory or token is not an appropriate string representation of an unsigned integer.
bool parse_ulonglong (const CharArray & token, unsigned long long & value)

Parses token as an unsigned long long.

Template parameters:
CharArray | a string type that implements two fields: (1) … |

Returns true if successful, or false if there is insufficient memory or token is not an appropriate string representation of an unsigned integer.
bool parse_int (const CharArray & token, int & value)

Parses token as an int.

Template parameters:
CharArray | a string type that implements two fields: (1) … |

Returns true if successful, or false if there is insufficient memory or token is not an appropriate string representation of an integer.
bool parse_long (const CharArray & token, long & value)

Parses token as a long.

Template parameters:
CharArray | a string type that implements two fields: (1) … |

Returns true if successful, or false if there is insufficient memory or token is not an appropriate string representation of a long.
bool parse_long_long (const CharArray & token, long long & value)

Parses token as a long long.

Template parameters:
CharArray | a string type that implements two fields: (1) … |

Returns true if successful, or false if there is insufficient memory or token is not an appropriate string representation of a long long.
bool parse_uint (const char (&token)[N], unsigned int & value, unsigned int base = 0)

Parses token as an unsigned int.

Parameters:
base | if … |

Returns true if successful, or false if token is not an appropriate string representation of an unsigned integer.
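This overload accepts string literals directly, since N is deduced from the array reference. A sketch, assuming base behaves like the radix argument of strtoul (its description is truncated above):

```cpp
#include "lex.h"  // assumed header name for this file

// Sketch: parse string literals with the array-reference overload.
// Namespace `core` is an assumption; the meaning of `base` is assumed to
// follow strtoul-style radix selection, since its documentation is truncated.
static bool example_parse_uint_literal() {
    unsigned int value;
    if (!core::parse_uint("42", value))      // default base
        return false;
    if (!core::parse_uint("2a", value, 16))  // explicit base 16
        return false;
    return value == 0x2au;
}
```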
Represents a position in a file. This structure is typically used to provide informative errors during lexical analysis of data from a file.
Public members | |
---|---|
unsigned int | line |
unsigned int | column |
position (unsigned int line, unsigned int column) | |
position (const position & p) | |
position | operator + (unsigned int i) const |
position | operator - (unsigned int i) const |
static bool | copy (const position & src, position & dst) |
The line number of the position in the file.
The column number of the position in the file.
position (unsigned int line, unsigned int column)

Constructs the position structure with the given line and column.
position (const position & p)

Constructs the position structure by copying from p.
position operator + (unsigned int i) const

Returns a position with the column number increased by i.

position operator - (unsigned int i) const

Returns a position with the column number decreased by i.
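A short sketch using only the documented members of position (the core namespace is an assumption):

```cpp
#include "lex.h"  // assumed header name for this file

// Sketch: positions are small value types; only documented members are used.
static void example_position() {
    core::position start(4, 10);     // line 4, column 10
    core::position end = start + 5;  // column advanced past a 5-character token
    core::read_error("unterminated token", end);  // `core` namespace is an assumption
}
```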
A structure representing a single token during lexical analysis. This structure is generic, intended for use across multiple lexical analyzers.
Public members | |
---|---|
TokenType | type |
position | start |
position | end |
string | text |
The generic type of this token.
The start position (inclusive) of the token in the source file.
The end position (exclusive) of the token in the source file.
An (optional) string representing the contents of the token.
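A sketch of a lexer-specific TokenType and of reading the documented fields (the enum is purely illustrative, and the core namespace is an assumption):

```cpp
#include "lex.h"  // assumed header name for this file

// Illustrative token kind supplied as the TokenType template argument.
enum my_token_type { MY_IDENTIFIER, MY_NUMBER, MY_SEMICOLON };

// Width of a single-line token, using the documented fields:
// start is inclusive and end is exclusive.
static unsigned int token_width(const core::lexical_token<my_token_type>& tok) {
    return tok.end.column - tok.start.column;
}
```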
bool print (const lexical_token< TokenType > & token, Stream & stream, Printer & printer)

Prints the given lexical_token token to the output stream.

Template parameters:
Printer | a scribe type for which the functions … |
void read_error (const char * error, const position & pos)

Reports the given error message error, a null-terminated C string, at the given source file position pos, by printing to stderr.
bool emit_token (array< lexical_token< TokenType >> & tokens, const position & start, const position & end, TokenType type)

Constructs a lexical_token with the given start and end positions and TokenType type, with an empty lexical_token::text, and appends it to the tokens array.
bool emit_token (array< lexical_token< TokenType >> & tokens, array< char > & token, const position & start, const position & end, TokenType type)

Constructs a lexical_token with the given start and end positions and TokenType type, with lexical_token::text copied from token, and appends it to the tokens array.
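A sketch of a lexer fragment that uses both overloads; the core namespace, the core::array constructor, and the memory handling shown here are assumptions:

```cpp
#include <cwchar>  // mbstate_t
#include "lex.h"   // assumed header name for this file

// Illustrative token kinds for this sketch.
enum my_token_type { MY_IDENTIFIER, MY_SEMICOLON };

static bool example_emit(core::array<core::lexical_token<my_token_type>>& tokens,
                         const core::position& start) {
    // Punctuation carries no text, so the first overload suffices.
    if (!core::emit_token(tokens, start, start + 1, MY_SEMICOLON))
        return false;  // insufficient memory

    // Identifiers carry text: build it in a scratch buffer, then emit a
    // token whose lexical_token::text is copied from that buffer.
    core::array<char> scratch(16);  // assumed constructor (initial capacity)
    mbstate_t shift = {};
    if (!core::append_to_token(scratch, U'x', shift)) return false;
    if (!core::emit_token(tokens, scratch, start + 1, start + 2, MY_IDENTIFIER))
        return false;
    return true;  // releasing scratch is omitted in this sketch
}
```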
void free_tokens (array< lexical_token< TokenType >> & tokens)

Frees every element in the given tokens array. This function does not free the array itself.
bool expect_token (const array< lexical_token< TokenType >> & tokens, const unsigned int & index, TokenType type, const char * name)

Inspects the element at the given index in the tokens array. If index is within bounds and the token at that index has a type matching the given type, the function returns true. Otherwise, an error message is printed to stderr indicating that the expected token, referred to by name, is missing, and false is returned.
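A sketch of a parsing step that validates token kinds before consuming them (the core namespace and the enum are illustrative assumptions):

```cpp
#include "lex.h"  // assumed header name for this file

// Illustrative token kinds for this sketch.
enum my_token_type { MY_LPAREN, MY_IDENTIFIER, MY_RPAREN };

static bool parse_parenthesized_name(
        const core::array<core::lexical_token<my_token_type>>& tokens,
        unsigned int& index) {
    if (!core::expect_token(tokens, index, MY_LPAREN, "opening parenthesis"))
        return false;  // expect_token has already reported the error to stderr
    index++;
    if (!core::expect_token(tokens, index, MY_IDENTIFIER, "identifier"))
        return false;
    index++;
    if (!core::expect_token(tokens, index, MY_RPAREN, "closing parenthesis"))
        return false;
    index++;
    return true;
}
```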
bool append_to_token (array< char > & token, char32_t next, mbstate_t & shift)

Appends the given wide character next to the char array token, which represents a multi-byte string. The multi-byte conversion state is given by shift.
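A sketch of building up a token buffer character by character; the core namespace, the core::array constructor, and its length field are assumptions:

```cpp
#include <cwchar>  // mbstate_t
#include "lex.h"   // assumed header name for this file

// Sketch: accumulate decoded characters into a multi-byte token buffer.
static bool example_append() {
    core::array<char> token(16);  // assumed constructor (initial capacity)
    mbstate_t shift = {};         // multi-byte conversion state, zero-initialized

    // A non-ASCII code point may occupy more than one char in `token`.
    if (!core::append_to_token(token, U'h', shift)) return false;
    if (!core::append_to_token(token, U'é', shift)) return false;
    return token.length >= 2;     // assumed length field
}
```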