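This class splits a given string (e.g. a String, an AString or a Substring object), which contains data separated by a delimiter character, into tokens of type Substring.
After an instance of this class is constructed, three methods are available: HasNext, which indicates whether a further token is available; Next, which returns the next token and also stores it in field Actual; and GetRest, which returns the complete remaining string without searching for further delimiters.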
After a token has been retrieved, it may be modified using the interface of class Substring. (In other words, the tokenizer does not rely on the bounds of the current token when retrieving the next one.) Consequently, a token may itself be tokenized by creating a different instance of class Tokenizer and providing the returned token as input.
If created or set using a reference to an AString, the buffer of the AString is not copied. This allows efficient operations on substrings of an AString. However, the source string must not be changed (or only in a controlled way) while the Tokenizer instance is in use.
Objects of this class can be reused by freshly initializing them using one of the overloaded Set methods.
Sample code:
The following code sample shows how to tokenize a string, including using one nested tokenizer:
The output will be:
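As a minimal sketch of the usage pattern described in this section (a loop over HasNext and Next, with one nested tokenizer per token), the following may help. It is not the library's own sample. The namespace cs.aworx.lib.strings, the class and method names NestedTokenizerSketch, Split and Main, and the input string are assumptions made for illustration; only the Tokenizer and Substring members listed on this page are used, plus Substring.ToString(), which is assumed to return the token's characters.

```csharp
using System;
using System.Collections.Generic;
using cs.aworx.lib.strings;   // assumed namespace of Tokenizer and Substring

public static class NestedTokenizerSketch   // hypothetical helper, not part of the library
{
    // Splits "src" at ';' and then splits each resulting token at ','.
    public static List<List<String>> Split( String src )
    {
        var result = new List<List<String>>();

        // outer tokenizer: ';' separates the tokens, default trimming applies
        Tokenizer outer = new Tokenizer( src, ';' );
        while ( outer.HasNext() )
        {
            Substring token = outer.Next();

            // nested tokenizer: works on the token just received
            Tokenizer inner = new Tokenizer( token, ',' );
            var parts = new List<String>();
            while ( inner.HasNext() )
            {
                // copy to String right away: a later call to Next() may reuse
                // the returned Substring object (assumption based on field Actual)
                parts.Add( inner.Next().ToString() );
            }
            result.Add( parts );
        }
        return result;
    }

    public static void Main()
    {
        foreach ( var parts in Split( "test;  abc ; 1,2 , 3 ; xyz" ) )
            Console.WriteLine( String.Join( " | ", parts ) );
    }
}
```

The inner tokenizer is a separate instance, so the outer one keeps its own position in the source string while the token is split further.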
Public Fields | |
Substring | Actual = new Substring() |
Substring | Rest = new Substring() |
char[] | Whitespaces = CString.DefaultWhitespaces |
Public Methods | |
Tokenizer () | |
Tokenizer (AString src, char delim, bool skipEmptyTokens=false) | |
Tokenizer (String src, char delim, bool skipEmptyTokens=false) | |
Tokenizer (Substring src, char delim, bool skipEmptyTokens=false) | |
Substring | GetRest (Whitespaces trimming=lang.Whitespaces.Trim) |
bool | HasNext () |
Substring | Next (Whitespaces trimming=lang.Whitespaces.Trim, char newDelim='\0') |
void | Set (AString src, char delim, bool skipEmptyTokens=false) |
void | Set (String src, char delim, bool skipEmptyTokens=false) |
void | Set (Substring src, char delim, bool skipEmptyTokens=false) |
override String | ToString () |
Protected Fields | |
char | delim = '\0' |
bool | skipEmptyTokens |
Tokenizer () | inline |
Constructs an empty tokenizer. Before use, method Set needs to be invoked to initialize it.
Tokenizer (String src, char delim, bool skipEmptyTokens=false) | inline |
Constructs a tokenizer to work on a given String.
src | The string to be tokenized. |
delim | The delimiter that separates the tokens. Can be changed with every next token. |
skipEmptyTokens | If true, empty tokens are omitted. Optional and defaults to false. |
Tokenizer (AString src, char delim, bool skipEmptyTokens=false) | inline |
Constructs a tokenizer to work on a given AString.
src | The string to be tokenized. |
delim | The delimiter that separates the tokens. Can be changed with every next token. |
skipEmptyTokens | If true, empty tokens are omitted. Optional and defaults to false. |
Tokenizer (Substring src, char delim, bool skipEmptyTokens=false) | inline |
Constructs a tokenizer to work on a given Substring.
src | The string to be tokenized. |
delim | The delimiter that separates the tokens. Can be changed with every next token. |
skipEmptyTokens | If true, empty tokens are omitted. Optional and defaults to false. |
Substring GetRest (Whitespaces trimming=lang.Whitespaces.Trim) | inline |
Returns the currently remaining string (without searching for further delimiter characters).
After this call, HasNext will return false and Next will return a nulled Substring.
trimming | Determines whether the token is trimmed with respect to the white space characters defined in field Whitespaces. Defaults to Whitespaces.Trim. |
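A typical use of GetRest is to split a line only at its first delimiter. The sketch below illustrates this under stated assumptions: the namespace cs.aworx.lib.strings, the helper names GetRestSketch and Parse, and the "key = value" input format are hypothetical, and Substring.ToString() is assumed to return the token's characters.

```csharp
using System;
using System.Collections.Generic;
using cs.aworx.lib.strings;   // assumed namespace of Tokenizer and Substring

public static class GetRestSketch   // hypothetical helper, not part of the library
{
    // Splits "key = value" at the first '=' only. The key is read with Next(),
    // everything behind the delimiter is taken unsplit with GetRest().
    public static KeyValuePair<String, String> Parse( String line )
    {
        Tokenizer tknzr = new Tokenizer( line, '=' );

        // copy the key before asking for the rest, as the returned Substring
        // may be reused by the tokenizer (assumption based on field Actual)
        String key = tknzr.Next().ToString();

        // GetRest() would return a nulled Substring if nothing is left,
        // hence the check with HasNext()
        String value = tknzr.HasNext() ? tknzr.GetRest().ToString() : "";

        return new KeyValuePair<String, String>( key, value );
    }

    public static void Main()
    {
        KeyValuePair<String, String> kv = Parse( "path = a=b=c" );
        Console.WriteLine( kv.Key + " -> " + kv.Value );
    }
}
```

Because GetRest does not search for further delimiters, any '=' characters inside the value stay part of the value.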
bool HasNext () | inline |
Indicates whether a further token is available.
Substring Next (Whitespaces trimming=lang.Whitespaces.Trim, char newDelim='\0') | inline |
Returns the next token, which is afterwards also available through field Actual. If no further token is available, the returned Substring will be 'nulled' (see Substring.IsNull). To prevent this, the availability of a next token should be checked using method HasNext().
For clarification, see the explanation and sample code in this class's documentation.
trimming | Determines whether the token is trimmed with respect to the white space characters defined in field Whitespaces. Defaults to Whitespaces.Trim. |
newDelim | The delimiter that separates the tokens. Defaults to 0, which keeps the current delimiter intact. A new delimiter can be provided for every next token. |
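The effect of parameter newDelim can be sketched as follows. The example assumes the namespace cs.aworx.lib.strings and uses the hypothetical names DelimiterSwitchSketch and Parse and a made-up input format "name: item, item, item". It relies on the parameter being named newDelim, as listed above, and on Substring.ToString() returning the token's characters.

```csharp
using System;
using System.Collections.Generic;
using cs.aworx.lib.strings;   // assumed namespace of Tokenizer and Substring

public static class DelimiterSwitchSketch   // hypothetical helper, not part of the library
{
    // Parses input of the form "name: item, item, item".
    public static List<String> Parse( String line )
    {
        var result = new List<String>();

        // the first token ends at ':'
        Tokenizer tknzr = new Tokenizer( line, ':' );
        result.Add( tknzr.Next().ToString() );

        // switch to ',' for this token; trimming keeps its default value
        if ( tknzr.HasNext() )
            result.Add( tknzr.Next( newDelim: ',' ).ToString() );

        // no newDelim given: the most recently set delimiter stays in use
        while ( tknzr.HasNext() )
            result.Add( tknzr.Next().ToString() );

        return result;
    }

    public static void Main()
    {
        foreach ( String token in Parse( "colors: red, green , blue" ) )
            Console.WriteLine( token );
    }
}
```

Using a named argument avoids having to spell out the trimming parameter just to reach newDelim.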
void Set (AString src, char delim, bool skipEmptyTokens=false) | inline |
Sets the tokenizer to work on a given AString.
src | The string to be tokenized. |
delim | The delimiter that separates the tokens. Can be changed with every next token. |
skipEmptyTokens | If true, empty tokens are omitted. Optional and defaults to false. |
void Set (String src, char delim, bool skipEmptyTokens=false) | inline |
Sets the tokenizer to the new source and delimiter.
src | The string to be tokenized. |
delim | The delimiter that separates the tokens. Can be changed with every next token. |
skipEmptyTokens | If true, empty tokens are omitted. Optional and defaults to false. |
void Set (Substring src, char delim, bool skipEmptyTokens=false) | inline |
Sets the tokenizer to work on a given Substring.
src | The string to be tokenized. |
delim | The delimiter that separates the tokens. Can be changed with every next token. |
skipEmptyTokens | If true, empty tokens are omitted. Optional and defaults to false. |
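Reusing one instance through Set, as mentioned in the class description, might look like the sketch below. The namespace cs.aworx.lib.strings and the names ReuseSketch and CountTokens are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using cs.aworx.lib.strings;   // assumed namespace of Tokenizer

public static class ReuseSketch   // hypothetical helper, not part of the library
{
    // Counts the comma-separated tokens of all given lines with a single
    // Tokenizer instance that is re-initialized for every line.
    public static int CountTokens( IEnumerable<String> lines )
    {
        Tokenizer tknzr = new Tokenizer();   // empty, Set must be called before use
        int count = 0;
        foreach ( String line in lines )
        {
            tknzr.Set( line, ',' );          // new source, skipEmptyTokens stays false
            while ( tknzr.HasNext() )
            {
                tknzr.Next();
                count++;
            }
        }
        return count;
    }

    public static void Main()
    {
        Console.WriteLine( CountTokens( new[] { "a,b,c", "d,e" } ) );
    }
}
```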
override String ToString () | inline |
This is for debugging purposes. E.g. this enables the MonoDevelop IDE to display object descriptions in the debugger.
char delim = '\0' | protected |
The most recently set delimiter used by default for the next token extraction.
Substring Rest = new Substring() |
A Substring that represents the part of the underlying data that has not been tokenized yet.
bool skipEmptyTokens | protected |
If true, empty tokens are omitted.
char[] Whitespaces = CString.DefaultWhitespaces |
The white space characters used to trim the tokens. Defaults to CString.DefaultWhitespaces.