Don't assume that characters
are one byte each, as in ASCII. A character can be made up of multiple bytes.
This happens when SetTextToASCII() is called with
FALSE or SetTextToUCS4() is called with TRUE.
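The width rule above can be sketched in a few lines. This is only an illustration: bytes_per_character() is a hypothetical helper, not a method of the class, and it assumes that the UCS-4 setting takes precedence over the ASCII setting and that non-ASCII, non-UCS-4 text is two-byte UNICODE.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not part of the class): how many bytes one
// character occupies for a given combination of text settings.
// Assumption: UCS-4 wins over ASCII when both flags are set.
std::size_t bytes_per_character(bool text_is_ascii, bool text_is_ucs4)
{
    if (text_is_ucs4)
    {
        return 4; // UCS-4: four bytes per character
    }

    if (text_is_ascii)
    {
        return 1; // ASCII: one byte per character
    }

    return 2; // otherwise assume two-byte UNICODE
}
```

Code that walks the data should multiply a character count by this width instead of assuming that one character is one byte.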
void GetLine( CParsePoint& parse_point, CString& line )
This method will retrieve a line of text from the data.
A line is terminated by a carriage return, line feed or both.
The returned line will not contain the terminating carriage return/line feed character(s).
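The line-termination rule can be sketched with standard C++ as follows. This is not the class's implementation: read_line() is a hypothetical free function, it assumes one-byte characters, it treats a carriage return followed by a line feed as a single terminator, and (unlike GetLine()) it returns a flag so the caller can detect the end of the data.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical sketch: copy one line into `line` and advance `position`
// past its terminator. Returns false when no data is left.
bool read_line(const std::string& data, std::size_t& position, std::string& line)
{
    line.clear();

    if (position >= data.size())
    {
        return false;
    }

    // Copy everything up to (but not including) the terminator.
    while (position < data.size() && data[position] != '\r' && data[position] != '\n')
    {
        line.push_back(data[position]);
        ++position;
    }

    if (position < data.size())
    {
        const char terminator = data[position];
        ++position;

        // Assumption: CR immediately followed by LF counts as one terminator.
        if (terminator == '\r' && position < data.size() && data[position] == '\n')
        {
            ++position;
        }
    }

    return true;
}
```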
BOOL GetNextCharacter( CParsePoint& parse_point, DWORD& character ) const
Like GetCharacter() except the parse point will be advanced
by however many bytes make up one character (1, 2 or 4). This lets you
enumerate the characters in the data stream one at a time. It will return TRUE if character was
filled or FALSE if you have reached the end (or passed the end) of the data.
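A minimal sketch of this kind of enumeration, using only standard C++, is shown below. next_character() and its parameters are hypothetical stand-ins for the class's internal state (character width and byte order); the sketch assumes that an incomplete trailing character counts as the end of the data.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical sketch: read one character of `width` bytes (1, 2 or 4)
// at `position`, then advance `position` by that many bytes.
bool next_character(const std::vector<std::uint8_t>& data,
                    std::size_t& position,
                    std::size_t width,
                    bool big_endian,
                    std::uint32_t& character)
{
    if (width != 1 && width != 2 && width != 4)
    {
        return false;
    }

    // Assumption: a partial character at the end is treated as end of data.
    if (position >= data.size() || data.size() - position < width)
    {
        return false;
    }

    std::uint32_t value = 0;

    for (std::size_t i = 0; i < width; ++i)
    {
        // Big endian: first byte is most significant. Little endian: least.
        const std::size_t shift = 8 * (big_endian ? (width - 1 - i) : i);
        value |= static_cast<std::uint32_t>(data[position + i]) << shift;
    }

    character = value;
    position += width;
    return true;
}
```

Calling the function in a loop until it returns false visits every character exactly once, which is the usage pattern the description above suggests.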
DWORD GetUCS4Order( void ) const
Returns one of the following:
BYTE GetUnicodeToASCIITranslationFailureCharacter( void ) const
Returns the ASCII character that will be substituted when a translation from
UNICODE to ASCII fails.
DWORD GetSize( void ) const
Returns the number of bytes in the data area.
BOOL GetUntilAndIncluding( CParsePoint& parse_point, BYTE termination_byte, CString& string_to_get ) const
BOOL GetUntilAndIncluding( CParsePoint& parse_point, BYTE termination_byte, CByteArray& bytes_to_get ) const
This method retrieves data (filling string_to_get or bytes_to_get)
until and including the termination_byte. The parse_point
is advanced in the process.
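The behavior can be sketched with standard containers as follows. get_until_and_including() here is a hypothetical free function, not the class method. What happens when the termination byte is never found is not stated above, so the sketch assumes it returns false and leaves the position at the end of the data.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical sketch: copy bytes into `bytes_to_get` up to and including
// `termination_byte`, advancing `position` as it goes.
bool get_until_and_including(const std::vector<std::uint8_t>& data,
                             std::size_t& position,
                             std::uint8_t termination_byte,
                             std::vector<std::uint8_t>& bytes_to_get)
{
    bytes_to_get.clear();

    while (position < data.size())
    {
        const std::uint8_t byte = data[position];
        ++position;
        bytes_to_get.push_back(byte);

        if (byte == termination_byte)
        {
            return true; // terminator found and included in the output
        }
    }

    // Assumption: running out of data before the terminator is a failure.
    return false;
}
```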
BOOL Initialize( CByteArray * data, BOOL automatically_delete = FALSE )
BOOL Initialize( const CStringArray& strings )
Tells the parser where to go for data.
BOOL IsTextASCII( void ) const
Returns TRUE if characters are to be treated as one byte each.
BOOL IsTextBigEndian( void ) const
Returns TRUE if text is big endian (Sun) format. This has meaning when the
underlying characters are treated as UNICODE or UCS-4.
BOOL IsTextUCS4( void ) const
Returns TRUE if characters are to be treated as four bytes per character.
BOOL PeekAtCharacter( const CParsePoint& parse_point, DWORD& character, const DWORD number_of_characters_ahead = 1 ) const
Allows you to peek ahead at characters. It will return TRUE if
character was filled with a character from the data stream.
It will return FALSE when you have tried to read past the end of the stream.
DWORD PeekCharacter( const CParsePoint& parse_point, const LONG number_of_characters_ahead ) const
Allows you to peek ahead at characters. It will return the character at the current location plus
number_of_characters_ahead. If you attempt to read a character past the
end of the data, it will return NULL.
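A rough sketch of peeking without advancing is given below. peek_character() is a hypothetical free function that assumes one-byte characters. Because the offset is signed, it also assumes that a negative offset looks backwards and that any out-of-range index yields zero.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical sketch: return the character at position + offset without
// moving the position. Returns 0 when the index is outside the data.
std::uint32_t peek_character(const std::vector<std::uint8_t>& data,
                             std::size_t position,
                             long number_of_characters_ahead)
{
    const long long index =
        static_cast<long long>(position) + number_of_characters_ahead;

    if (index < 0 || static_cast<unsigned long long>(index) >= data.size())
    {
        return 0; // out of range, mirrors the NULL return described above
    }

    return data[static_cast<std::size_t>(index)];
}
```

A zero return cannot be told apart from a genuine zero character in the data, which is one reason to prefer the PeekAtCharacter() form that reports success separately.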
BOOL SetTextToASCII( BOOL text_is_ascii = TRUE )
Tells the class to interpret characters as one byte each.
BOOL SetTextToBigEndian( BOOL unicode_is_big_endian = TRUE )
Tells the class to interpret UNICODE or UCS-4 characters as big endian (Sun) format.
Little endian is Intel format.
BOOL SetTextToUCS4( BOOL text_is_ucs4 = TRUE )
Tells the class to interpret characters as four bytes each.
BOOL SetUCS4Order( DWORD order = 4321 )
Tells the parser which byte order to use when interpreting UCS-4 characters.
The default is 4321 format.
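What an order value such as 4321 means is not spelled out here. The sketch below assumes the convention used by the XML 1.0 specification's encoding autodetection appendix, where each digit gives the significance of the byte in that position (1 is the most significant byte, 4 the least), so 1234 is big endian and 4321 is little endian. decode_ucs4() is a hypothetical helper, not a method of the class.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical sketch: assemble one UCS-4 character from four bytes using
// an order such as 1234, 4321, 2143 or 3412.
// Assumption: digit d in position i means bytes[i] is the d-th most
// significant byte of the character.
bool decode_ucs4(const std::uint8_t bytes[4], unsigned order, std::uint32_t& character)
{
    if (order > 9999)
    {
        return false;
    }

    std::uint32_t value   = 0;
    unsigned      seen    = 0;    // bit d is set once digit d has been used
    unsigned      divisor = 1000;

    for (int i = 0; i < 4; ++i)
    {
        const unsigned digit = (order / divisor) % 10;
        divisor /= 10;

        // Each of the digits 1..4 must appear exactly once.
        if (digit < 1 || digit > 4 || (seen & (1u << digit)) != 0)
        {
            return false;
        }

        seen |= 1u << digit;
        value |= static_cast<std::uint32_t>(bytes[i]) << (8 * (4 - digit));
    }

    character = value;
    return true;
}
```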
void SetUnicodeToASCIITranslationFailureCharacter( BYTE asci_character )
This sets the character that will be substituted when a translation
from UNICODE to ASCII fails. Since a one-byte character has only 256 possible values and a
two-byte UNICODE character has 65536, some provision must be made for bad translations.
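The substitution idea can be sketched as follows. unicode_to_ascii() is a hypothetical helper, and the cut-off is an assumption: this sketch treats any value above 0x7F as untranslatable, whereas the class itself may accept values up to 0xFF.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical sketch: narrow two-byte UNICODE values to one-byte
// characters, substituting `failure_character` when a value does not fit.
std::string unicode_to_ascii(const std::vector<std::uint16_t>& text,
                             char failure_character)
{
    std::string result;
    result.reserve(text.size());

    for (const std::uint16_t unit : text)
    {
        // Assumption: only 0x00..0x7F translate cleanly.
        if (unit <= 0x7F)
        {
            result.push_back(static_cast<char>(unit));
        }
        else
        {
            result.push_back(failure_character);
        }
    }

    return result;
}
```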