[Figure: collaboration diagram for Tokenizer]

Class Description

This class splits a given character sequence (e.g. a String, an AString or a Substring object), which contains data separated by a delimiter character, into tokens of type Substring.

After an instance of this class is constructed, there are three methods available:

hasNext:
Indicates whether further tokens are available.
next:
Sets the Substring in field actual to reference the next token and returns it.
With each call to next, a different delimiter can be provided, which then serves as the delimiter for this and subsequent tokens.
By default, the returned token is trimmed according to the current whitespace characters (see field whitespaces).
getRest:
Like next, but returns the complete remaining region without searching for further delimiters (and tokens).
After this method has been invoked, hasNext() returns false.
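The interplay of these three methods can be modeled with a small stand-in that uses only java.lang.String. This is a simplified sketch for illustration, not the ALib implementation: the class name TokenizerSketch is hypothetical, tokens are copied Strings rather than Substring views, and String.trim() stands in for the configurable whitespace handling.

```java
// Simplified, hypothetical stand-in for illustration; not the ALib implementation.
final class TokenizerSketch {
    private String rest;        // part of the source not yet tokenized; null once exhausted
    private final char delim;   // delimiter that separates the tokens

    TokenizerSketch(String src, char delim) {
        this.rest  = src;
        this.delim = delim;
    }

    /** True while at least one more token can be retrieved. */
    boolean hasNext() {
        return rest != null;
    }

    /** Returns the next trimmed token, or null if none is left. */
    String next() {
        if (rest == null)
            return null;
        int idx = rest.indexOf(delim);
        String token;
        if (idx < 0) {              // no further delimiter: the remainder is the last token
            token = rest;
            rest  = null;
        } else {
            token = rest.substring(0, idx);
            rest  = rest.substring(idx + 1);
        }
        return token.trim();
    }

    /** Returns everything that is left, without searching for delimiters. */
    String getRest() {
        String remaining = rest;
        rest = null;                // afterwards hasNext() reports false
        return remaining == null ? null : remaining.trim();
    }

    public static void main(String[] args) {
        TokenizerSketch tknzr =
            new TokenizerSketch("cmd ; arg1 ; free text; with; semicolons", ';');
        System.out.println(tknzr.next());      // cmd
        System.out.println(tknzr.next());      // arg1
        System.out.println(tknzr.getRest());   // free text; with; semicolons
        System.out.println(tknzr.hasNext());   // false
    }
}
```

The sketch returns null where the documented class returns a nulled Substring.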

After a token has been retrieved, it may be modified using the interface of class Substring. (In other words, the tokenizer does not rely on the bounds of the current token when retrieving the next one.) Consequently, it is allowed to recursively tokenize a token by creating a different instance of class Tokenizer and providing the returned token as input.

If created or set using a reference of class AString, the buffer of AString is not copied. This allows efficient operations on sub-strings of class AString. However, the source string must not be changed (or only in a controlled way) during the use of the Tokenizer instance.
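Why the source must stay unchanged can be seen with a minimal non-copying view over a shared character buffer. The class CharView below is a hypothetical illustration written for this note; it is not ALib's Substring and only assumes that a token is a start/end window onto a buffer owned by someone else.

```java
// Hypothetical illustration of a non-copying view; not ALib's Substring class.
final class CharView {
    private final char[] buffer; // shared with the owner, never copied
    private final int start;     // inclusive
    private final int end;       // exclusive

    CharView(char[] buffer, int start, int end) {
        this.buffer = buffer;
        this.start  = start;
        this.end    = end;
    }

    /** Materializes the viewed region; only this call copies characters. */
    @Override
    public String toString() {
        return new String(buffer, start, end - start);
    }

    public static void main(String[] args) {
        char[] source = "abc;def".toCharArray();
        CharView token = new CharView(source, 0, 3);   // refers to "abc"
        System.out.println(token);                     // abc

        source[1] = 'X';                               // source changed while the view is alive
        System.out.println(token);                     // aXc
    }
}
```

Because the view holds no characters of its own, any write to the underlying buffer is immediately visible through every token that refers to it.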

Objects of this class can be reused by freshly initializing them using one of the overloaded set methods.

Sample code:
The following code sample shows how to tokenize a string, including using one nested tokenizer:

    // data string to tokenize
    AString data= new AString( "test;  abc ; 1,2 , 3 ; xyz ; including;separator" );
 
    // create tokenizer on data with ';' as delimiter
    Tokenizer tknzr= new Tokenizer( data, ';' );
 
    // read tokens
    System.out.println( tknzr.next().toString() ); // will print "test"
    System.out.println( tknzr.next().toString() ); // will print "abc"
    System.out.println( tknzr.next().toString() ); // will print "1,2 , 3"
 
    // tokenize actual (third) token (nested tokenizer)
    Tokenizer subTknzr= new Tokenizer( tknzr.actual,  ',');
    System.out.print( subTknzr.next().toString() );
 
    while( subTknzr.hasNext() )
        System.out.print( "~" + subTknzr.next().toString() );
 
    System.out.println();
 
    // continue with the main tokenizer
    System.out.println( tknzr.next().toString() ); // will print "xyz"
 
    // grab the rest, as we know that the last token might include our separator character
    System.out.println( tknzr.getRest().toString() ); // will print "including;separator"

The output will be:

test
abc
1,2 , 3
1~2~3
xyz
including;separator

Public Fields
Substring	actual = new Substring()

Substring	rest = new Substring()

char[]	whitespaces =CString.DEFAULT_WHITESPACES

Public Methods
	Tokenizer ()

	Tokenizer (AString src, char delim)

	Tokenizer (AString src, char delim, boolean skipEmptyTokens)

	Tokenizer (String src, char delim)

	Tokenizer (String src, char delim, boolean skipEmptyTokens)

	Tokenizer (Substring src, char delim)

	Tokenizer (Substring src, char delim, boolean skipEmptyTokens)

Substring	getRest ()

Substring	getRest (Whitespaces trimming)

boolean	hasNext ()

Substring	next ()

Substring	next (Whitespaces trimming)

Substring	next (Whitespaces trimming, char newDelim)

void	set (AString astring, char delim)

void	set (AString astring, char delim, boolean skipEmptyTokens)

void	set (String str, char delim)

void	set (String str, char delim, boolean skipEmptyTokens)

void	set (Substring substring, char delim)

void	set (Substring substring, char delim, boolean skipEmptyTokens)

String	toString ()

Protected Fields
char	delim

boolean	skipEmptyTokens

Constructor & Destructor Documentation

◆ Tokenizer() [1/7]

Tokenizer ( )

Constructs an empty tokenizer. Before use, method set to initialize needs to be invoked.

◆ Tokenizer() [2/7]

Tokenizer	(	String	src,
		char	delim,
		boolean	skipEmptyTokens
	)

Constructs a tokenizer to work on a given String.

Parameters

src	The string to be tokenized.
delim	The delimiter that separates the tokens. Can be changed with every next token.
skipEmptyTokens	If `true`, empty tokens are omitted. Optional and defaults to `false`.
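The effect of skipEmptyTokens can be illustrated with a small helper built only on the JDK. The class SkipEmptyDemo and its tokenize method are hypothetical names introduced for this sketch; the input deliberately contains no whitespace, so the sketch makes no assumption about how trimming and the emptiness check interact in the real class.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that models the skipEmptyTokens flag; not part of ALib.
final class SkipEmptyDemo {
    static List<String> tokenize(String src, char delim, boolean skipEmptyTokens) {
        List<String> tokens = new ArrayList<>();
        int start = 0;
        while (start <= src.length()) {
            int idx = src.indexOf(delim, start);
            int end = (idx < 0) ? src.length() : idx;   // last token runs to the end
            String token = src.substring(start, end);
            if (!(skipEmptyTokens && token.isEmpty()))
                tokens.add(token);
            start = end + 1;                            // continue behind the delimiter
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("a;;b;;;c", ';', false)); // [a, , b, , , c]
        System.out.println(tokenize("a;;b;;;c", ';', true));  // [a, b, c]
    }
}
```

With the flag set to false, every pair of adjacent delimiters yields an empty token; with true, only the three non-empty tokens remain.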

◆ Tokenizer() [3/7]

Tokenizer	(	String	src,
		char	delim
	)

Constructs a tokenizer to work on a given String.

Parameters

src	The String to use as the source for the tokens.
delim	The delimiter that separates the tokens. Can be changed with every next token.

◆ Tokenizer() [4/7]

Tokenizer	(	AString	src,
		char	delim,
		boolean	skipEmptyTokens
	)

Constructs a tokenizer to work on a given AString.

Parameters

src	The string to be tokenized.
delim	The delimiter that separates the tokens. Can be changed with every next token.
skipEmptyTokens	If `true`, empty tokens are omitted. Optional and defaults to `false`.

◆ Tokenizer() [5/7]

Tokenizer	(	AString	src,
		char	delim
	)

Constructs a tokenizer to work on a given AString.

Parameters

src	The AString to use as the source for the tokens.
delim	The delimiter that separates the tokens. Can be changed with every next token.

◆ Tokenizer() [6/7]

Tokenizer	(	Substring	src,
		char	delim,
		boolean	skipEmptyTokens
	)

Constructs a tokenizer to work on a given Substring.

Parameters

src	The string to be tokenized.
delim	The delimiter that separates the tokens. Can be changed with every next token.
skipEmptyTokens	If `true`, empty tokens are omitted. Optional and defaults to `false`.

◆ Tokenizer() [7/7]

Tokenizer	(	Substring	src,
		char	delim
	)

Constructs a tokenizer to work on a given Substring.

Parameters

src	The substring to use as the source for the tokens.
delim	The delimiter that separates the tokens. Can be changed with every next token.

Member Function Documentation

◆ getRest() [1/2]

Substring getRest ( )

Returns the currently remaining string (without searching for further delimiter characters). After this call hasNext will return false and next will return a nulled Substring.

Returns: The rest of the original source string that has not yet been returned by next().

◆ getRest() [2/2]

Substring getRest ( Whitespaces trimming )

Returns the currently remaining string (without searching for further delimiter characters).
After this call hasNext will return false and next will return a nulled Substring.

Parameters

trimming Determines whether the token is trimmed with respect to the whitespace characters defined in field whitespaces. Defaults to Whitespaces.TRIM.

Returns: The rest of the original source string that has not yet been returned by next().

◆ hasNext()

boolean hasNext ( )

If this returns true, a call to next will be successful and will return a Substring which is not nulled.

Returns: true if a next token is available.

◆ next() [1/3]

Substring next ( )

Returns the next token, which is afterwards also available through field actual. If no further token was available, the returned Substring will be 'nulled' (see Substring.isNull). To prevent this, the availability of a next token should be checked using method hasNext().

For clarification, see the explanation and sample code in this class's documentation.

Returns: The next token, or a nulled Substring if no further token was available.

◆ next() [2/3]

Substring next ( Whitespaces trimming )

Returns the next token, which is afterwards also available through field actual. If no further token was available, the returned Substring will be 'nulled' (see Substring.isNull). To prevent this, the availability of a next token should be checked using method hasNext.

For clarification, see the explanation and sample code in this class's documentation.

Parameters

trimming Determines whether the token is trimmed with respect to the whitespace characters defined in field whitespaces. Defaults to Whitespaces.TRIM.

Returns: The next token, or a nulled Substring if no further token was available.

◆ next() [3/3]

Substring next	(	Whitespaces	trimming,
		char	newDelim
	)

Returns the next token, which is afterwards also available through field actual. If no further token was available, the returned Substring will be 'nulled' (see Substring.isNull). To prevent this, the availability of a next token should be checked using method hasNext().

For clarification, see the explanation and sample code in this class's documentation.

Parameters

trimming	Determines whether the token is trimmed with respect to the whitespace characters defined in field whitespaces. Defaults to Whitespaces.TRIM.
newDelim	The delimiter that separates the tokens. Defaults to 0, which keeps the current delimiter intact. A new delimiter can be provided for every next token.

Returns: The next token, or a nulled Substring if no further token was available.
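Switching the delimiter between calls is useful for input such as a key followed by a value list. The following sketch models that behavior with plain Strings. DelimSwitchSketch is a hypothetical stand-in, not the ALib class; it mirrors only the documented rule that a newDelim of 0 keeps the current delimiter, and it always trims with String.trim().

```java
// Hypothetical stand-in modelling a per-call delimiter change; not the ALib implementation.
final class DelimSwitchSketch {
    private String rest;   // part of the source not yet tokenized; null once exhausted
    private char delim;    // current delimiter, replaced when a new one is passed

    DelimSwitchSketch(String src, char delim) {
        this.rest  = src;
        this.delim = delim;
    }

    boolean hasNext() {
        return rest != null;
    }

    /** A newDelim of '\0' keeps the current delimiter; any other value replaces it. */
    String next(char newDelim) {
        if (rest == null)
            return null;
        if (newDelim != '\0')
            delim = newDelim;
        int idx = rest.indexOf(delim);
        String token = (idx < 0) ? rest : rest.substring(0, idx);
        rest = (idx < 0) ? null : rest.substring(idx + 1);
        return token.trim();
    }

    public static void main(String[] args) {
        DelimSwitchSketch tknzr = new DelimSwitchSketch("colors = red, green ,blue", '=');
        System.out.println(tknzr.next('\0'));       // colors
        System.out.println(tknzr.next(','));        // red
        while (tknzr.hasNext())
            System.out.println(tknzr.next('\0'));   // green, then blue
    }
}
```

The new delimiter stays in effect for all later calls until another one is supplied, which is why the loop can pass '\0'.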

◆ set() [1/6]

void set	(	AString	astring,
		char	delim
	)

Sets the tokenizer to work on a given AString.

Parameters

astring	The AString to use as the source for the tokens.
delim	The delimiter that separates the tokens. Can be changed with every next token.

◆ set() [2/6]

void set	(	AString	astring,
		char	delim,
		boolean	skipEmptyTokens
	)

Sets the tokenizer to work on a given AString.

Parameters

astring	The AString to use as the source for the tokens.
delim	The delimiter that separates the tokens. Can be changed with every next token.
skipEmptyTokens	If `true`, empty tokens are omitted. Optional and defaults to `false`.

◆ set() [3/6]

void set	(	String	str,
		char	delim
	)

Sets the tokenizer to the new source and delim.

Parameters

str	The String to use as the source for the tokens.
delim	The delimiter that separates the tokens. Can be changed with every next token.

◆ set() [4/6]

void set	(	String	str,
		char	delim,
		boolean	skipEmptyTokens
	)

Sets the tokenizer to the new source and delim.

Parameters

str	The String to use as the source for the tokens.
delim	The delimiter that separates the tokens. Can be changed with every next token.
skipEmptyTokens	If `true`, empty tokens are omitted. Optional and defaults to `false`.

◆ set() [5/6]

void set	(	Substring	substring,
		char	delim
	)

Sets the tokenizer to work on a given Substring.

Parameters

substring	The substring to use as the source for the tokens.
delim	The delimiter that separates the tokens. Can be changed with every next token.

◆ set() [6/6]

void set	(	Substring	substring,
		char	delim,
		boolean	skipEmptyTokens
	)

Sets the tokenizer to work on a given Substring.

Parameters

substring	The substring to use as the source for the tokens.
delim	The delimiter that separates the tokens. Can be changed with every next token.
skipEmptyTokens	If `true`, empty tokens are omitted. Optional and defaults to `false`.

◆ toString()

String toString ( )

This is for debugging purposes. E.g. this enables the Eclipse IDE to display object descriptions in the debugger.

Returns: A human readable string representation of this object.

Member Data Documentation

◆ actual

Substring actual = new Substring()

The actual token, which was returned by the most recent invocation of next() or getRest(). It is allowed to manipulate this field at any time, for example by changing its whitespace characters definition.

◆ delim

char delim

protected

The most recently set delimiter used by default for the next token extraction.

◆ rest

Substring rest = new Substring()

A Substring that represents the part of the underlying data that has not yet been tokenized.

◆ skipEmptyTokens

boolean skipEmptyTokens

protected

If true, empty tokens are omitted.

◆ whitespaces

char [] whitespaces =CString.DEFAULT_WHITESPACES

The whitespace characters used to trim the tokens. Defaults to CString.DEFAULT_WHITESPACES.
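Trimming against a configurable character set can be sketched as follows. TrimSketch is a hypothetical helper for illustration. The contents of CString.DEFAULT_WHITESPACES are not listed here, so the array assumedDefaults (space, tab, carriage return, line feed) is an assumption made only for this example.

```java
// Hypothetical helper showing trimming against a custom character set; not part of ALib.
final class TrimSketch {
    /** Returns true if c is one of the characters in set. */
    static boolean contains(char[] set, char c) {
        for (char s : set)
            if (s == c)
                return true;
        return false;
    }

    /** Removes leading and trailing characters that occur in whitespaces. */
    static String trim(String token, char[] whitespaces) {
        int start = 0;
        int end   = token.length();
        while (start < end && contains(whitespaces, token.charAt(start)))
            start++;
        while (end > start && contains(whitespaces, token.charAt(end - 1)))
            end--;
        return token.substring(start, end);
    }

    public static void main(String[] args) {
        // Assumed default set; the real CString.DEFAULT_WHITESPACES may differ.
        char[] assumedDefaults = {' ', '\t', '\r', '\n'};
        System.out.println("[" + trim("  abc\t", assumedDefaults) + "]");    // [abc]
        System.out.println("[" + trim("__abc__", new char[] {'_'}) + "]");   // [abc]
    }
}
```

Replacing the character set changes what counts as trimmable, which corresponds to assigning a different array to the whitespaces field before retrieving tokens.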

The documentation for this class was generated from the following file:

Tokenizer.java
