ucommon  7.0.0
About: GNU uCommon C++ is a portable and optimized class framework for writing C++ applications that need to use threads and support concurrent synchronization, and that use sockets, XML parsing, object serialization, thread-optimized string and data structure classes, etc.
  Fossies Dox: ucommon-7.0.0.tar.gz  ("unofficial" and yet experimental doxygen-generated source code documentation)  

ost::StringTokenizer Class Reference

Splits delimited string into tokens.

#include <tokenizer.h>


Classes

class  iterator
 The input forward iterator for tokens.
 
class  NoSuchElementException
 Exception thrown if someone tried to read beyond the end of the tokens.
 

Public Member Functions

 StringTokenizer (const char *str, const char *delim, bool skipAllDelim=false, bool trim=false)
 creates a new StringTokenizer for a string and a given set of delimiters.
 
 StringTokenizer (const char *s)
 creates a new StringTokenizer which splits the input string at whitespace.
 
iterator begin () const
 returns the begin iterator.
 
void setDelimiters (const char *d)
 changes the set of delimiters used in subsequent iterations.
 
iterator begin (const char *d)
 returns a begin iterator with an alternate set of delimiters.
 
const iterator & end () const
 the iterator marking the end.
 

Static Public Attributes

static const char *const SPACE =" \t\n\r\f\v"
 a delimiter string containing all usual whitespace delimiters.
 

Private Attributes

const char * str
 
const char * delim
 
bool skipAll
 
bool trim
 
iterator itEnd
 

Friends

class StringTokenizer::iterator
 

Detailed Description

Splits delimited string into tokens.

The StringTokenizer takes a pointer to a string and a pointer to a string containing a number of possible delimiters. It provides an input forward iterator for iterating through all tokens. The iterator behaves like a logical pointer to the tokens: increment it to move to the next token, and dereference it to get the current token.

Memory consumption: This class operates on the original string and allocates memory only for the individual tokens actually requested, so at most it allocates the space required for the longest token in the given string. Because the memory of the previous token is reclaimed on each iteration, you MUST NOT store pointers to tokens; if you need them afterwards, copy them. You must also not modify the original string while the StringTokenizer operates on it; the behaviour is undefined in that case.
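
The copy-before-advancing rule can be illustrated without the library. The following is a minimal standalone sketch, not ucommon code: ReusingCursor and collect_copies are hypothetical names, and the single reused buffer is only an assumption about how token storage might be recycled between iterations.

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical stand-in for a tokenizer that reuses one buffer for every token.
class ReusingCursor {
public:
    ReusingCursor(const char *str, const char *delim)
        : pos_(str), delim_(delim), done_(false)
    {
        assert(str != nullptr && delim != nullptr);
    }

    // Returns the next token, or nullptr when the input is exhausted.
    // The pointer is valid only until the next call: the buffer is overwritten.
    const char *next()
    {
        if (done_)
            return nullptr;
        const std::size_t len = std::strcspn(pos_, delim_);
        buffer_.assign(pos_, len);
        if (pos_[len] == '\0')
            done_ = true;       // last token reached
        else
            pos_ += len + 1;    // step over the delimiter
        return buffer_.c_str();
    }

private:
    const char *pos_;
    const char *delim_;
    bool done_;
    std::string buffer_;
};

// Copies every token into its own std::string before advancing.
std::vector<std::string> collect_copies(const char *str, const char *delim)
{
    ReusingCursor cursor(str, delim);
    std::vector<std::string> kept;
    while (const char *tok = cursor.next())
        kept.emplace_back(tok); // copy now; tok is stale after the next call
    return kept;
}
```

With the real iterator, the analogous step would presumably be to construct a std::string from the dereferenced token before incrementing the iterator.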

The iterator has one special method, 'nextDelimiter()', which returns the delimiter character that follows the current token, or '\0' if no delimiter follows. With skipAllDelim, it returns the FIRST delimiter of the skipped run.
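
The pairing of each token with the delimiter that follows it can be sketched as below. This is a standalone illustration of the documented semantics, not the library's implementation: TokenAndDelim and tokens_with_delims are hypothetical names, and skipping leading delimiters in skipAllDelim mode is an assumption borrowed from strtok.

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical record: a token plus the delimiter that ended it ('\0' if none).
struct TokenAndDelim {
    std::string token;
    char nextDelim;
};

std::vector<TokenAndDelim> tokens_with_delims(const char *str, const char *delim,
                                              bool skipAllDelim)
{
    assert(str != nullptr && delim != nullptr);
    std::vector<TokenAndDelim> out;
    const char *p = str;
    if (skipAllDelim)
        p += std::strspn(p, delim);      // assumption: leading delimiters are skipped
    for (;;) {
        if (skipAllDelim && *p == '\0')
            break;                       // nothing left after a skipped run
        const std::size_t len = std::strcspn(p, delim);
        const char next = p[len];        // first delimiter after the token, or '\0'
        out.push_back({std::string(p, len), next});
        if (next == '\0')
            break;
        p += len + 1;                    // step over that one delimiter
        if (skipAllDelim)
            p += std::strspn(p, delim);  // swallow the rest of the run
    }
    return out;
}
```

For the input "a, b" with delimiters ", " and skipAllDelim set, the sketch reports ',' after "a", which is the first delimiter of the run rather than the space.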

With the method 'setDelimiters(const char*)' you may change the set of delimiters. It affects all running iterators.

Example:

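 // Split at spaces and semicolons; skipAllDelim and trim keep their defaults (false).
 StringTokenizer st("mary had a little lamb;its fleece was..", " ;");
 StringTokenizer::iterator i;
 for (i = st.begin(); i != st.end(); ++i) {
       // *i is the current token; nextDelimiter() is the delimiter that follows it.
       std::cout << "Token: '" << *i << "'\t";
       std::cout << " next Delim: '" << i.nextDelimiter() << "'" << std::endl;
 }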
 

Author
Henner Zeller <H.Zeller@acm.org>
License: LGPL

Definition at line 104 of file tokenizer.h.

Constructor & Destructor Documentation

◆ StringTokenizer() [1/2]

ost::StringTokenizer::StringTokenizer ( const char *  str,
const char *  delim,
bool  skipAllDelim = false,
bool  trim = false 
)

creates a new StringTokenizer for a string and a given set of delimiters.

Parameters
str  String to be split up. The StringTokenizer does not modify this string, and you must not modify it either while tokenizing is in progress, since that may lead to undefined behaviour.
delim  String containing the characters which should be regarded as delimiters.
skipAllDelim  OPTIONAL. true if consecutive delimiters should be skipped at once; false if an empty token should be returned for two delimiters with no other text in between. The first behaviour may be desirable for whitespace skipping, the second for input with delimited entries, e.g. /etc/passwd-like files or CSV input. NOTE that 'true' here resembles the ANSI-C strtok(char *s, char *d) behaviour. DEFAULT = false
trim  OPTIONAL. true if the returned tokens should be trimmed so that they have no whitespace at the beginning or end. Whitespace is any of the characters defined in StringTokenizer::SPACE. If delim itself is StringTokenizer::SPACE, this results in the same behaviour as skipAllDelim = true. DEFAULT = false
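
The two skipAllDelim behaviours can be contrasted with a small standalone sketch. It does not use the StringTokenizer class: split_strtok and split_fields are hypothetical helpers, the first built on the standard std::strtok and the second an assumed field-style splitter that keeps empty tokens.

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <vector>

// strtok style (compare skipAllDelim = true): runs of delimiters collapse,
// so no empty tokens appear. Note that std::strtok is not thread-safe.
std::vector<std::string> split_strtok(const char *str, const char *delim)
{
    assert(str != nullptr && delim != nullptr);
    // std::strtok writes into its input, so work on a private copy (incl. '\0').
    std::vector<char> buf(str, str + std::strlen(str) + 1);
    std::vector<std::string> out;
    for (char *t = std::strtok(buf.data(), delim); t != nullptr;
         t = std::strtok(nullptr, delim))
        out.emplace_back(t);
    return out;
}

// Field style (compare skipAllDelim = false): every delimiter ends a field,
// so two adjacent delimiters yield an empty token.
std::vector<std::string> split_fields(const char *str, const char *delim)
{
    assert(str != nullptr && delim != nullptr);
    std::vector<std::string> out;
    const char *p = str;
    for (;;) {
        const std::size_t len = std::strcspn(p, delim);
        out.emplace_back(p, len);
        if (p[len] == '\0')
            break;
        p += len + 1; // step over exactly one delimiter
    }
    return out;
}
```

For a passwd-like line such as "root:x::0:0" with ':' as the delimiter, the field-style splitter keeps the empty third entry, while the strtok-style splitter drops it and shifts the later fields forward.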

Definition at line 54 of file tokenizer.cpp.

References itEnd, and str.

◆ StringTokenizer() [2/2]

ost::StringTokenizer::StringTokenizer ( const char *  s)

creates a new StringTokenizer which splits the input string at whitespace.

The tokens are stripped of whitespace. This means that if you change the set of delimiters, either in the 'begin(const char *delim)' method or in 'setDelimiters()', you then get whitespace-trimmed tokens delimited by the new set. Behaves like StringTokenizer(s, StringTokenizer::SPACE, false, true);
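
What whitespace trimming does to a single token can be sketched as follows. This is a standalone illustration, not the library's code: trim_sketch and kSpace are hypothetical names, and kSpace merely copies the documented value of StringTokenizer::SPACE.

```cpp
#include <cassert>
#include <string>

// Same characters as the documented StringTokenizer::SPACE value.
static const char *const kSpace = " \t\n\r\f\v";

// Removes leading and trailing whitespace from one token.
std::string trim_sketch(const char *token)
{
    assert(token != nullptr);
    const std::string s(token);
    const std::string::size_type first = s.find_first_not_of(kSpace);
    if (first == std::string::npos)
        return std::string(); // empty or all-whitespace token
    const std::string::size_type last = s.find_last_not_of(kSpace);
    return s.substr(first, last - first + 1);
}
```

If the delimiter set is later changed to ";" for an input such as "a; b ;c", the middle token " b " would be returned as "b" under this kind of trimming.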

Definition at line 63 of file tokenizer.cpp.

References itEnd, and str.

Member Function Documentation

◆ begin() [1/2]

iterator ost::StringTokenizer::begin ( ) const
inline

returns the begin iterator

Definition at line 280 of file tokenizer.h.

◆ begin() [2/2]

iterator ost::StringTokenizer::begin ( const char *  d)
inline

returns a begin iterator with an alternate set of delimiters.

Definition at line 294 of file tokenizer.h.

References delim.

◆ end()

const iterator & ost::StringTokenizer::end ( ) const
inline

the iterator marking the end.

Definition at line 302 of file tokenizer.h.

◆ setDelimiters()

void ost::StringTokenizer::setDelimiters ( const char *  d)
inline

changes the set of delimiters used in subsequent iterations.

Definition at line 287 of file tokenizer.h.

References delim.

Friends And Related Function Documentation

◆ StringTokenizer::iterator

friend class StringTokenizer::iterator
friend

Definition at line 216 of file tokenizer.h.

Member Data Documentation

◆ delim

const char* ost::StringTokenizer::delim
private

Definition at line 218 of file tokenizer.h.

Referenced by ost::StringTokenizer::iterator::operator++().

◆ itEnd

iterator ost::StringTokenizer::itEnd
private

Definition at line 220 of file tokenizer.h.

Referenced by ost::StringTokenizer::iterator::operator++(), and StringTokenizer().

◆ skipAll

bool ost::StringTokenizer::skipAll
private

Definition at line 219 of file tokenizer.h.

Referenced by ost::StringTokenizer::iterator::operator++().

◆ SPACE

const char *const ost::StringTokenizer::SPACE =" \t\n\r\f\v"
static

a delimiter string containing all usual whitespace delimiters.

These are space, tab, newline, carriage return, form feed and vertical tab (see the isspace() manpage).
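
The relation to isspace() can be checked with a short standalone sketch. The names kSpace and all_isspace are hypothetical, and kSpace is only a copy of the value documented above; the check assumes the default "C" locale.

```cpp
#include <cassert>
#include <cctype>

// Copy of the documented StringTokenizer::SPACE value.
static const char *const kSpace = " \t\n\r\f\v";

// Returns true if every character in chars is whitespace according to isspace().
bool all_isspace(const char *chars)
{
    assert(chars != nullptr);
    for (const char *p = chars; *p != '\0'; ++p) {
        // Cast first: passing a negative char to std::isspace is undefined.
        if (!std::isspace(static_cast<unsigned char>(*p)))
            return false;
    }
    return true;
}
```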

Definition at line 111 of file tokenizer.h.

Referenced by ost::StringTokenizer::iterator::operator*().

◆ str

const char* ost::StringTokenizer::str
private

Definition at line 217 of file tokenizer.h.

Referenced by StringTokenizer().

◆ trim

bool ost::StringTokenizer::trim
private

Definition at line 219 of file tokenizer.h.

