wxJSONReader Class Reference

The JSON parser.

#include <jsonreader.h>

Public Member Functions

 wxJSONReader (int flags=wxJSONREADER_TOLERANT, int maxErrors=30)
 Ctor.
virtual ~wxJSONReader ()
 Dtor - does nothing.
int Parse (const wxString &doc, wxJSONValue *val)
 Parse the JSON document.
int Parse (wxInputStream &doc, wxJSONValue *val)
int GetErrorCount () const
 Return the size of the error message's array.
int GetWarningCount () const
 Return the size of the warning message's array.
const wxArrayString & GetErrors () const
 Return a reference to the error message's array.
const wxArrayString & GetWarnings () const
 Return a reference to the warning message's array.

Static Public Member Functions

static int UTF8NumBytes (char ch)
 Compute the number of bytes that make up a UTF-8 encoded wide character.
static bool Strtoll (const wxString &str, wxInt64 *i64)
 Converts a decimal string to a 64-bit signed integer.
static bool Strtoull (const wxString &str, wxUint64 *ui64)
 Converts a decimal string to a 64-bit unsigned integer.

Protected Member Functions

int Parse (wxJSONValue *val)
 The general parsing function (internal use).
int DoRead (wxJSONValue &val)
 Reads the JSON text document (internal use).
void AddError (const wxString &descr)
 Add an error message to the error's array.
void AddError (const wxString &fmt, const wxString &str)
void AddError (const wxString &fmt, wxChar ch)
void AddWarning (int type, const wxString &descr)
 Add a warning message to the warning's array.
int GetStart ()
 Returns the start of the document.
int ReadChar ()
 Read a character from the input JSON document.
int GetChar ()
 Return a character from the input JSON document.
int PeekChar ()
 Peek a character from the input JSON document.
void StoreValue (int ch, const wxString &key, wxJSONValue &value, wxJSONValue &parent)
 Store a value in the parent object.
int SkipWhiteSpace ()
 Skip all whitespace.
int SkipComment ()
 Skip a comment.
void StoreComment (const wxJSONValue *parent)
 Store the comment string in the value it refers to.
int ReadString (wxJSONValue &val)
 Read a string value.
int ReadToken (int ch, wxString &s)
 Reads a token string.
int ReadValue (int ch, wxJSONValue &val)
 Read a value from input stream.
int ReadUnicode (long int &hex)
 Read a 4-hex-digit unicode character.
int AppendUnicodeSequence (wxString &s, int hex)
 The function appends the wide character to the string value.
int NumBytes (char ch)
 Return the number of bytes that make a character in stream input.

Static Protected Member Functions

static bool DoStrto_ll (const wxString &str, wxUint64 *ui64, wxChar *sign)
 Perform the actual conversion from a string to a 64-bit integer.

Protected Attributes

int m_flags
 Flags that control the parser behaviour.
int m_maxErrors
 Maximum number of errors stored in the error's array.
int m_lineNo
 The current line number (start at 1).
int m_colNo
 The current column number (start at 1).
int m_level
 The current level of object/array nesting (starts at ZERO).
wxJSONValue * m_current
 The pointer to the value object that is being read.
wxJSONValue * m_lastStored
 The pointer to the value object that was last stored.
wxJSONValue * m_next
 The pointer to the value object that will be read.
wxString m_comment
 The comment string read by SkipComment().
int m_commentLine
 The starting line of the comment string.
wxArrayString m_errors
 The array of error messages.
wxArrayString m_warnings
 The array of warning messages.
int m_charPos
 The current character position for string input.
int m_inType
 The input type (0=string, 1=stream).
void * m_inObject
 The pointer to the input object (a string or a stream).
int m_peekChar
 The character read by the PeekChar() function (-1 none).
wxMBConv * m_conv
 The conversion object used in stream input.


Detailed Description

The JSON parser.

The class is a JSON parser that reads JSON-formatted text and stores the values in a wxJSONValue structure. The ctor accepts two parameters: the style flag, which controls how error-tolerant the parser should be, and an integer that sets the maximum number of errors and warnings to be reported.

If the JSON text document does not contain an open/close JSON character, the function returns an invalid value object; in other words, the wxJSONValue::IsValid() function returns FALSE. This is the case for a document that is empty or contains only whitespace or comments. If the document contains a start-object/array character immediately followed by a close-object/array character (e.g. {} ), the function returns an empty object or array JSON value. This is a valid JSON value of type wxJSONTYPE_OBJECT or wxJSONTYPE_ARRAY whose wxJSONValue::Size() function returns ZERO.

JSON text
The wxJSON parser skips all characters read from the input JSON text until a start-object '{' or start-array '[' character is encountered (see the GetStart() function). This means that the JSON input text may contain anything before the first start-object/array character except these two characters themselves, unless they are inside a C/C++ comment. Comment lines that appear before the first start-object/array character are not ignored if the parser is constructed with the wxJSONREADER_STORE_COMMENTS flag: they are added to the comment array of the root JSON value.

Note that the parsing process stops when the internal DoRead() function returns. Because that function is recursive, the top-level close-object '}' or close-array ']' character causes the top-level DoRead() call to return, which stops the parsing process regardless of the EOF condition. This means that the JSON input text may contain anything after the top-level close-object/array character. Here are some examples:

Returns a wxJSONTYPE_INVALID value (invalid JSON value)

   // this text does not contain an open array/object character

Returns a wxJSONTYPE_OBJECT value of Size() = 0

   {
   }

Returns a wxJSONTYPE_ARRAY value of Size() = 0

   [
   ]

Text before and after the top-level open/close characters is ignored.

   This non-JSON text does not cause the parser to report errors or warnings
   {
   }
   This non-JSON text does not cause the parser to report errors or warnings

Extensions
The wxJSON parser recognizes all JSON text plus some extensions that are not part of the JSON syntax but that many other JSON implementations do recognize. If the input text contains the following non-JSON text, the parser reports the situation as warnings and not as errors unless the parser object was constructed with the wxJSONREADER_STRICT flag. In the latter case the wxJSON parser is not tolerant.

Note that you can control how error-tolerant the parser should be, and you can also specify how many and which extensions are recognized. See the constructor's parameters for more details.

Unicode vs ANSI
The parser can read JSON text from two very different kinds of objects: a string object (wxString) and a stream object (wxInputStream).

When the input is from a string object, the representation of the characters in the string depends on the platform and on the build mode: in ANSI builds it depends on the charset in use, and in Unicode builds it depends on the platform (UCS-2 on win32, UCS-4 on GNU/Linux).

When the input is from a stream object, the only recognized encoding format is UTF-8 for both ANSI and Unicode builds.

Example:
  wxJSONValue  value;
  wxJSONReader reader;

  // open a text file that contains the UTF-8 encoded JSON text
  wxFFileInputStream jsonText( _T("filename.utf8"), _T("r"));

  // read the file
  int numErrors = reader.Parse( jsonText, &value );

  if ( numErrors > 0 )  {
    // wxMessageBox() is the wxWidgets call; the Win32 ::MessageBox() needs four arguments
    wxMessageBox( _T("Error reading the input file"));
  }

To know more about ANSI and Unicode mode read Unicode support in wxJSON.


Constructor & Destructor Documentation

wxJSONReader::wxJSONReader ( int  flags = wxJSONREADER_TOLERANT,
int  maxErrors = 30 
)

Ctor.

Construct a JSON parser object with the given parameters.

JSON parser objects should always be constructed on the stack but it does not hurt to have a global JSON parser.

Parameters:
flags this parameter controls how error-tolerant the parser should be
maxErrors the maximum number of errors (and warnings, too) that are reported by the parser. When the number of errors reaches this limit, the parser stops reading the JSON input text and no further errors are reported.
The flags parameter is a combination of ZERO or more of the following constants OR'ed together:

You can also use the following shortcuts to specify some predefined flag's combinations:

Example:
The following code fragment constructs a JSON parser, turns on all wxJSON extensions and also stores C/C++ comments in the value object they refer to. The parser assumes that the comments appear before the value:

   wxJSONReader reader( wxJSONREADER_TOLERANT | wxJSONREADER_STORE_COMMENTS );
   wxJSONValue  root;
   int numErrors = reader.Parse( jsonText, &root );

wxJSONReader::~wxJSONReader (  )  [virtual]

Dtor - does nothing.


Member Function Documentation

void wxJSONReader::AddError ( const wxString &  fmt,
wxChar  ch 
) [protected]

void wxJSONReader::AddError ( const wxString &  fmt,
const wxString &  str 
) [protected]

void wxJSONReader::AddError ( const wxString &  msg  )  [protected]

Add an error message to the error's array.

The overloaded versions of this function add an error message to the error's array stored in m_errors. The error message is formatted as follows:

   Error: line xxx, col xxx - <error_description>

The msg parameter is the description of the error; the line and column numbers are added automatically by the functions. The fmt parameter is a format string with the same syntax as the printf function. Note that it is the user's responsibility to provide a format string that matches the argument: another string or a character.
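The message layout described above can be sketched in standard C++. This is only an illustration, not the wxJSON implementation: the ErrorLog type and its members are hypothetical stand-ins for m_lineNo, m_colNo, m_maxErrors and m_errors, and the rule of dropping messages once the limit is reached is an assumption based on the description of maxErrors.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for the reader's error bookkeeping; not wxJSON code.
struct ErrorLog {
    int lineNo = 1;      // plays the role of m_lineNo
    int colNo = 1;       // plays the role of m_colNo
    int maxErrors = 30;  // plays the role of m_maxErrors
    std::vector<std::string> errors;

    // Prefix the description with the current position and store it,
    // unless the (assumed) limit has already been reached.
    void AddError(const std::string& descr) {
        if (static_cast<int>(errors.size()) >= maxErrors) {
            return;
        }
        errors.push_back("Error: line " + std::to_string(lineNo) +
                         ", col " + std::to_string(colNo) + " - " + descr);
    }
};
```

With lineNo set to 3 and colNo set to 7, a call to AddError("missing ':'") stores the string "Error: line 3, col 7 - missing ':'".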

void wxJSONReader::AddWarning ( int  type,
const wxString &  msg 
) [protected]

Add a warning message to the warning's array.

The warning description is as follows:

   Warning: line xxx, col xxx - <warning_description>

Warning messages are generated by the parser when the JSON text that has been read is not well-formed but the error is not fatal and the parser recognizes the text as an extension to the JSON standard (see the parser's ctor for more info about wxJSON extensions).

Note that each individual wxJSON extension has to be turned on by a flag passed to the parser's constructor. If the warning message is related to an extension that is not enabled in the parser's m_flags data member, this function calls AddError() and the warning message becomes an error message. The type parameter is one of the same constants that specify the parser's extensions.
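The promotion of a warning to an error can be sketched as follows. The flag names and the MessageLog type are hypothetical (the real constants are the wxJSONREADER_xxx flags), and the line/column prefix is left out to keep the sketch short.

```cpp
#include <string>
#include <vector>

// Hypothetical extension bits, used only for this illustration.
enum SketchFlags {
    SKETCH_ALLOW_COMMENTS = 1,
    SKETCH_MULTISTRING    = 2
};

struct MessageLog {
    int flags = 0;  // plays the role of m_flags
    std::vector<std::string> errors;
    std::vector<std::string> warnings;

    void AddError(const std::string& msg) { errors.push_back("Error: " + msg); }

    // A warning for an extension that is not enabled is recorded as an error.
    void AddWarning(int type, const std::string& msg) {
        if ((flags & type) == 0) {
            AddError(msg);
            return;
        }
        warnings.push_back("Warning: " + msg);
    }
};
```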

int wxJSONReader::AppendUnicodeSequence ( wxString &  s,
int  hex 
) [protected]

The function appends the wide character to the string value.

This function is called by ReadString() when a unicode escape sequence is read from the input text, for example (the Greek letter alpha):

  \u03B1

In unicode mode, the function just appends the wide character code stored in hex to the string s. In ANSI mode, the function converts the wide character code to the corresponding character if it is convertible using the locale dependent character set. If the wide char cannot be converted, the function appends the unicode escape sequence to the string s. Returns ZERO if the character was not converted to a unicode escape sequence.
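The ANSI-mode fallback can be illustrated with a simplified sketch. It assumes, purely for illustration, a charset that can only represent 7-bit ASCII; the real function relies on the locale-dependent character set. The name AppendUnicodeSketch and the return value 1 for the escaped case are assumptions.

```cpp
#include <cstdio>
#include <string>

// Append the character for code point `hex` if the (assumed ASCII-only)
// charset can represent it; otherwise append the \uXXXX escape text.
// Returns 0 when the character itself was appended, 1 when the escape was.
int AppendUnicodeSketch(std::string& s, int hex) {
    if (hex >= 0 && hex <= 0x7F) {
        s += static_cast<char>(hex);
        return 0;
    }
    char buf[16];
    std::snprintf(buf, sizeof buf, "\\u%04X", static_cast<unsigned>(hex));
    s += buf;
    return 1;
}
```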

int wxJSONReader::DoRead ( wxJSONValue &  parent  )  [protected]

Reads the JSON text document (internal use).

This is a recursive function that is called by Parse() and by the DoRead() function itself when a new object / array is encountered. The function returns when an EOF condition is encountered or when the final close-object / close-array char is encountered. The function also increments the m_level data member when it is entered and decrements it on return.

The function is the heart of the wxJSON parser class but it is also very easy to understand because JSON syntax is very easy.

Returns the last close-object/array character read or -1 on EOF.

bool wxJSONReader::DoStrto_ll ( const wxString &  str,
wxUint64 *  ui64,
wxChar *  sign 
) [static, protected]

Perform the actual conversion from a string to a 64-bit integer.

This function is called internally by the Strtoll and Strtoull functions and it does the actual conversion. The function is also able to check numeric overflow.
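A conversion with overflow checking can be sketched in standard C++ as shown below. The function names, the use of std::string instead of wxString and the exact rejection rules are assumptions made for the example; the sketch only mirrors the split between a shared magnitude parser and a signed wrapper.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>
#include <string>

// Parse an optionally signed decimal string into a magnitude and a sign.
// Returns false on a missing digit string, a non-digit, or 64-bit overflow.
bool ParseMagnitude(const std::string& str, std::uint64_t* mag, char* sign) {
    std::size_t i = 0;
    *sign = ' ';
    if (!str.empty() && (str[0] == '+' || str[0] == '-')) {
        *sign = str[0];
        ++i;
    }
    if (i == str.size()) {
        return false;
    }
    const std::uint64_t max = std::numeric_limits<std::uint64_t>::max();
    std::uint64_t value = 0;
    for (; i < str.size(); ++i) {
        if (str[i] < '0' || str[i] > '9') {
            return false;
        }
        const std::uint64_t digit = static_cast<std::uint64_t>(str[i] - '0');
        if (value > (max - digit) / 10) {
            return false;  // value * 10 + digit would not fit in 64 bits
        }
        value = value * 10 + digit;
    }
    *mag = value;
    return true;
}

// Signed wrapper: range-check the magnitude against the int64 limits.
bool ParseInt64(const std::string& str, std::int64_t* out) {
    std::uint64_t mag = 0;
    char sign = ' ';
    if (!ParseMagnitude(str, &mag, &sign)) {
        return false;
    }
    const std::uint64_t limit =
        static_cast<std::uint64_t>(std::numeric_limits<std::int64_t>::max());
    if (sign == '-') {
        if (mag > limit + 1) {
            return false;
        }
        *out = (mag == limit + 1) ? std::numeric_limits<std::int64_t>::min()
                                  : -static_cast<std::int64_t>(mag);
        return true;
    }
    if (mag > limit) {
        return false;
    }
    *out = static_cast<std::int64_t>(mag);
    return true;
}
```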

int wxJSONReader::GetChar (  )  [protected]

Return a character from the input JSON document.

The function is called by ReadChar() and returns a single character from the input JSON document as an integer so that all 2^31 unicode characters can be represented as a positive integer value. In case of errors or EOF, the function returns -1. Note that this function behaves differently depending on the build mode (ANSI or Unicode) and the type of the object containing the JSON document.

wxString input
If the input JSON text is stored in a wxString object, there is no difference between ANSI and Unicode builds: the function just returns the next character in the string and updates the m_charPos data member, which points to the next character in the string. In Unicode mode, the function returns wide characters and in ANSI builds it returns only chars.

wxInputStream input
Stream input is always encoded in UTF-8 format in both ANSI and Unicode builds. In order to return a single character, the function calls the NumBytes() function, which returns the number of bytes that have to be read from the stream in order to get one character. The bytes read are then converted to a wide character and returned. Note that wide chars are also returned in ANSI mode but they are processed differently by the parser: before storing the wide character in the JSON value, it is converted to the locale-dependent character if one exists; if not, the unicode escape sequence is stored in the JSON value.
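The stream-input path (determine the byte count, read that many bytes, combine them into one wide character) can be sketched without wxWidgets. The sketch below decodes by hand from a std::string buffer instead of using a wxMBConv object, and it is limited to the 1- to 4-byte forms; both function names are hypothetical.

```cpp
#include <cstddef>
#include <string>

// Number of bytes announced by a UTF-8 lead byte, or -1 if it is not one.
// Limited to the 1..4 byte forms for brevity.
int LeadByteLength(unsigned char lead) {
    if (lead < 0x80) return 1;
    if ((lead & 0xE0) == 0xC0) return 2;
    if ((lead & 0xF0) == 0xE0) return 3;
    if ((lead & 0xF8) == 0xF0) return 4;
    return -1;
}

// Decode the character starting at `pos` and advance `pos` past it.
// Returns the code point, or -1 on EOF or malformed input.
int NextCodePoint(const std::string& bytes, std::size_t& pos) {
    if (pos >= bytes.size()) return -1;
    const unsigned char lead = static_cast<unsigned char>(bytes[pos]);
    const int len = LeadByteLength(lead);
    if (len < 0 || pos + static_cast<std::size_t>(len) > bytes.size()) return -1;
    // Mask off the length marker bits of the lead byte.
    int cp = (len == 1) ? lead : (lead & (0xFF >> (len + 1)));
    for (int i = 1; i < len; ++i) {
        const unsigned char c = static_cast<unsigned char>(bytes[pos + i]);
        if ((c & 0xC0) != 0x80) return -1;  // not a continuation byte
        cp = (cp << 6) | (c & 0x3F);
    }
    pos += static_cast<std::size_t>(len);
    return cp;
}
```

For the two bytes 0xCE 0xB1 the sketch yields 0x03B1, the Greek letter alpha.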

int wxJSONReader::GetErrorCount (  )  const

Return the size of the error message's array.

const wxArrayString & wxJSONReader::GetErrors (  )  const

Return a reference to the error message's array.

int wxJSONReader::GetStart (  )  [protected]

Returns the start of the document.

This is the first function called by the Parse() function: it searches the input stream for the starting character of a JSON text and returns it. JSON text starts with '{' or '['. If one of the two starting characters is inside a C/C++ comment, it is ignored. Returns the JSON-text start character or -1 on EOF.
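The search for the start character can be sketched over a plain std::string. The name FindStart and the position parameter are assumptions for the example; the real function reads characters through the parser's own input functions.

```cpp
#include <cstddef>
#include <string>

// Return the first '{' or '[' that is not inside a C/C++ comment, or -1 if
// the text has none. `pos` is left just past the returned character.
int FindStart(const std::string& text, std::size_t& pos) {
    while (pos < text.size()) {
        const char ch = text[pos++];
        if (ch == '{' || ch == '[') {
            return ch;
        }
        if (ch == '/' && pos < text.size()) {
            if (text[pos] == '/') {
                // C++ comment: skip to the end of the line.
                while (pos < text.size() && text[pos] != '\n') ++pos;
            } else if (text[pos] == '*') {
                // C comment: skip to the closing "*/" (or to EOF).
                const std::size_t end = text.find("*/", pos + 1);
                pos = (end == std::string::npos) ? text.size() : end + 2;
            }
        }
    }
    return -1;
}
```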

int wxJSONReader::GetWarningCount (  )  const

Return the size of the warning message's array.

const wxArrayString & wxJSONReader::GetWarnings (  )  const

Return a reference to the warning message's array.

int wxJSONReader::NumBytes ( char  ch  )  [protected]

Return the number of bytes that make a character in stream input.

This function is used by the GetChar() function when the JSON input is from a stream object and returns the number of bytes that have to be read from the stream in order to get a single wide character. Because the encoding format of a JSON text in streams is UTF-8 and no other formats are supported for now, the function just calls the UTF8NumBytes() function.

In order to implement new input formats from stream input, you have to implement this function in order to return the correct number of bytes and to set the appropriate value in the m_conv data member which has to point to the conversion object. For example, to implement UCS-4 Little Endian encoding:

int wxJSONReader::Parse ( wxJSONValue *  val  )  [protected]

The general parsing function (internal use).

This protected function is called by the public overloaded Parse() functions after setting up the internal data members.

int wxJSONReader::Parse ( wxInputStream &  doc,
wxJSONValue *  val 
)

int wxJSONReader::Parse ( const wxString &  doc,
wxJSONValue *  val 
)

Parse the JSON document.

The two overloaded versions of the Parse() function read a JSON text stored in a wxString object or in a wxInputStream object.

If val is a NULL pointer, the function does not store the values: it can be used as a JSON checker in order to check the syntax of the document. Returns the number of errors found in the document. If the returned value is ZERO and the parser was constructed with the wxJSONREADER_STRICT flag, then the parsed document is well-formed and it only contains valid JSON text.

If the wxJSONREADER_TOLERANT flag was used in the parser's constructor, then a return value of ZERO does not mean that the document is well-formed because it may contain comments and other extensions that are not fatal for the wxJSON parser but other parsers may fail to recognize. You can use the GetWarningCount() function to know how many wxJSON extensions are present in the JSON input text.

Note that the JSON value object val is not cleared by this function unless it is of the wrong type. In other words, if val is of type wxJSONTYPE_ARRAY and it already contains 10 elements and the input document starts with a '[' (open-array char), then the elements read from the document are appended to the existing ones.

On the other hand, if the text document starts with a '{' (open-object) char then this function must change the type of the val object to wxJSONTYPE_OBJECT and the old content of 10 array elements will be lost.

When reading from a wxInputStream the JSON text must be encoded in UTF-8 format for both Unicode and ANSI builds. When reading from a wxString object, the input text is encoded in different formats depending on the platform and the build mode: in Unicode builds, strings are encoded in UCS-2 format on Windows and in UCS-4 format on GNU/Linux; in ANSI builds, strings contain one-byte locale dependent characters.

int wxJSONReader::PeekChar (  )  [protected]

Peek a character from the input JSON document.

This function is much like GetChar() but it does not update the stream or string input position.
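One common way to obtain this behaviour is a one-character lookahead cache, which is what the m_peekChar data member (-1 when empty) suggests. The sketch below is a guess at that pattern over a std::string, not the wxJSON code; CharSource and Fetch() are hypothetical names.

```cpp
#include <cstddef>
#include <string>

// Hypothetical character source with a one-character lookahead cache.
class CharSource {
public:
    explicit CharSource(const std::string& text) : m_text(text) {}

    // Consume and return the next character, or -1 on EOF.
    int GetChar() {
        if (m_peekChar != -1) {
            const int ch = m_peekChar;  // hand out the cached character first
            m_peekChar = -1;
            return ch;
        }
        return Fetch();
    }

    // Return the next character without consuming it, or -1 on EOF.
    int PeekChar() {
        if (m_peekChar == -1) {
            m_peekChar = Fetch();
        }
        return m_peekChar;
    }

private:
    int Fetch() {
        if (m_pos >= m_text.size()) return -1;
        return static_cast<unsigned char>(m_text[m_pos++]);
    }

    std::string m_text;
    std::size_t m_pos = 0;
    int m_peekChar = -1;  // -1 means "nothing cached"
};
```

From the caller's point of view, any number of PeekChar() calls followed by one GetChar() call all return the same character.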

int wxJSONReader::ReadChar (  )  [protected]

Read a character from the input JSON document.

The function returns a single character from the input JSON document as an integer so that all 2^31 unicode characters can be represented as a positive integer value. In case of errors or EOF, the function returns -1. The function also updates the m_lineNo and m_colNo data members and converts every CR+LF sequence to LF.

Note that this function calls GetChar() in order to retrieve the next character in the JSON input text.
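The position tracking and the CR+LF folding can be sketched as below. PositionReader is a hypothetical type reading from a std::string, and the exact column-counting rule (column reset to 1 after a line feed, incremented otherwise) is an assumption.

```cpp
#include <cstddef>
#include <string>

// Hypothetical reader that tracks line/column and folds CR+LF into LF.
struct PositionReader {
    std::string text;
    std::size_t pos = 0;
    int lineNo = 1;  // plays the role of m_lineNo
    int colNo = 1;   // plays the role of m_colNo

    // Return the next character, or -1 on EOF.
    int ReadChar() {
        if (pos >= text.size()) return -1;
        int ch = static_cast<unsigned char>(text[pos++]);
        if (ch == '\r' && pos < text.size() && text[pos] == '\n') {
            ++pos;      // swallow the LF and report the pair as one LF
            ch = '\n';
        }
        if (ch == '\n') {
            ++lineNo;
            colNo = 1;
        } else {
            ++colNo;
        }
        return ch;
    }
};
```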

int wxJSONReader::ReadString ( wxJSONValue &  val  )  [protected]

Read a string value.

The function reads a string value from the input and is called by the DoRead() function when it encounters a double-quote character. The function reads all characters up to the next double quote, unless that quote is escaped. The function also recognizes the escaped characters defined in the JSON syntax.

The string is also stored in the provided wxJSONValue argument provided that it is empty or it contains a string value. This is because the parser class recognizes multi-line strings like the following one:

   [
      "This is a very long string value which is splitted into more"
      "than one line because it is more human readable"
   ]
Because of the lack of the value separator (,) the parser assumes that the string was split into several double-quoted strings. If the value does not contain a string then an error is reported. Split strings cause the parser to report a warning.

int wxJSONReader::ReadToken ( int  ch,
wxString &  s 
) [protected]

Reads a token string.

This function is called by ReadValue() when the first character encountered is not a special char and is not a string. It stores the token in the provided string argument and returns the next character read, which is a whitespace or a special JSON character.

A token cannot include unicode escaped sequences so this function does not try to interpret such sequences.

int wxJSONReader::ReadUnicode ( long int &  hex  )  [protected]

Read a 4-hex-digit unicode character.

The function is called by ReadString() when the \u sequence is encountered; the sequence introduces a unicode character. The function reads four chars from the input text by calling ReadChar() four times: if -1 (EOF) is encountered before four chars have been read, -1 is returned and no conversion takes place. The function tries to convert the 4-hex-digit sequence to an integer, which is returned in the hex parameter. If the string cannot be converted, the function stores -1 in hex and reports an error.

Returns the character after the hex sequence or -1 if EOF or if the four characters cannot be converted to a hex number.
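The hex conversion can be sketched over a std::string as follows. ReadUnicodeSketch is a hypothetical name; unlike the real function it does not report an error message, and it leaves the position untouched when the digits are invalid.

```cpp
#include <cstddef>
#include <string>

// Read four hex digits starting at `pos` and store their value in `hex`.
// Returns the character that follows the sequence, or -1 on EOF or when the
// digits are not valid hex (in which case `hex` is set to -1).
int ReadUnicodeSketch(const std::string& text, std::size_t& pos, long& hex) {
    hex = -1;
    if (pos + 4 > text.size()) {
        return -1;  // EOF before four characters were available
    }
    long value = 0;
    for (int i = 0; i < 4; ++i) {
        const char c = text[pos + i];
        int digit;
        if (c >= '0' && c <= '9')      digit = c - '0';
        else if (c >= 'a' && c <= 'f') digit = c - 'a' + 10;
        else if (c >= 'A' && c <= 'F') digit = c - 'A' + 10;
        else return -1;
        value = value * 16 + digit;
    }
    pos += 4;
    hex = value;
    return pos < text.size() ? static_cast<unsigned char>(text[pos++]) : -1;
}
```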

int wxJSONReader::ReadValue ( int  ch,
wxJSONValue &  val 
) [protected]

Read a value from input stream.

The function is called by DoRead() when it encounters a char that is neither a special char nor a double quote. It assumes that the string is a numeric value, a 'null' or a boolean value and stores it in the wxJSONValue object val.

The function also checks that val is of type wxJSONTYPE_INVALID; otherwise an error is reported because a value cannot follow another value: maybe a (,) or (:) is missing. Returns the next character or -1 on EOF.

int wxJSONReader::SkipComment (  )  [protected]

Skip a comment.

The function is called by DoRead() when a '/' (slash) character is read from the input stream assuming that a C/C++ comment is starting. Returns the first character that follows the comment or -1 on EOF. The function also adds a warning message because comments are not valid JSON text. The function also stores the comment, if any, in the m_comment data member: it can be used by the DoRead() function if comments have to be stored in the value they refer to.

int wxJSONReader::SkipWhiteSpace (  )  [protected]

Skip all whitespace.

The function reads characters from the input text and returns the first non-whitespace character read, or -1 on EOF. Note that the function does not rely on the isspace function of the C library but checks for the whitespace constants: space, TAB and LF.
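A minimal sketch of this check over a std::string is shown below; SkipWhiteSpaceSketch is a hypothetical name. A lone CR is not in the list because CR+LF sequences are described elsewhere in this reference as being folded to LF by ReadChar() before they reach this point.

```cpp
#include <cstddef>
#include <string>

// Return the first character that is not a space, TAB or LF, leaving `pos`
// just past it; return -1 if the end of the text is reached first.
int SkipWhiteSpaceSketch(const std::string& text, std::size_t& pos) {
    while (pos < text.size()) {
        const char ch = text[pos++];
        if (ch != ' ' && ch != '\t' && ch != '\n') {
            return static_cast<unsigned char>(ch);
        }
    }
    return -1;
}
```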

void wxJSONReader::StoreComment ( const wxJSONValue *  parent  )  [protected]

Store the comment string in the value it refers to.

The function searches for a suitable value object in which to store the comment line that was read by the parser and temporarily stored in m_comment. The function searches the three values pointed to by the m_current, m_lastStored and m_next data members.

The value that the comment refers to is:

Note that the comment line is only stored if the wxJSONREADER_STORE_COMMENTS flag was used when the parser object was constructed; otherwise, the function does nothing and returns immediately. Also note that if the comment line has to be stored but the function cannot find a suitable value to add the comment line to, an error is reported (note: not a warning but an error).

void wxJSONReader::StoreValue ( int  ch,
const wxString &  key,
wxJSONValue &  value,
wxJSONValue &  parent 
) [protected]

Store a value in the parent object.

The function is called by DoRead() when a comma or a close-object/array character is encountered and stores the current value read by the parser in the parent object. The function checks that value is not invalid and that key is not an empty string if parent is an object.

Parameters:
ch the character read: a comma or a close-object/array char
key the key string: may be empty if parent is an array
value the current JSON value to be stored in parent
parent the JSON value that holds value.

bool wxJSONReader::Strtoll ( const wxString &  str,
wxInt64 *  i64 
) [static]

Converts a decimal string to a 64-bit signed integer.

bool wxJSONReader::Strtoull ( const wxString &  str,
wxUint64 *  ui64 
) [static]

Converts a decimal string to a 64-bit unsigned integer.

int wxJSONReader::UTF8NumBytes ( char  ch  )  [static]

Compute the number of bytes that make up a UTF-8 encoded wide character.

The function counts the number of leading '1' bits in the character ch, which is the first byte of the sequence, and returns it. The UTF-8 encoding specifies the number of bytes needed by a wide character by coding it in the first byte. See below.

Note that if the character does not contain a valid UTF-8 encoding the function returns -1.

   UCS-4 range (hex.)    UTF-8 octet sequence (binary)
   -------------------   -----------------------------
   0000 0000-0000 007F   0xxxxxxx
   0000 0080-0000 07FF   110xxxxx 10xxxxxx
   0000 0800-0000 FFFF   1110xxxx 10xxxxxx 10xxxxxx
   0001 0000-001F FFFF   11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
   0020 0000-03FF FFFF   111110xx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx
   0400 0000-7FFF FFFF   1111110x 10xxxxxx ... 10xxxxxx
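The table can be turned into a short sketch that counts the leading '1' bits of the first byte. The name Utf8NumBytesSketch and the treatment of continuation bytes and of 0xFE/0xFF as invalid (-1) are assumptions; the sketch accepts the 5- and 6-byte forms only because the table above lists them.

```cpp
// Return the length in bytes of the UTF-8 sequence introduced by `ch`,
// or -1 if `ch` cannot be the first byte of a sequence.
int Utf8NumBytesSketch(char ch) {
    const unsigned char byte = static_cast<unsigned char>(ch);
    int ones = 0;
    // Count the consecutive '1' bits starting from the most significant bit.
    for (unsigned mask = 0x80; mask != 0 && (byte & mask) != 0; mask >>= 1) {
        ++ones;
    }
    if (ones == 0) return 1;               // 0xxxxxxx: single-byte character
    if (ones == 1 || ones > 6) return -1;  // continuation byte, or 0xFE/0xFF
    return ones;
}
```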


Member Data Documentation

int wxJSONReader::m_charPos [protected]

The current character position for string input.

int wxJSONReader::m_colNo [protected]

The current column number (start at 1).

wxString wxJSONReader::m_comment [protected]

The comment string read by SkipComment().

int wxJSONReader::m_commentLine [protected]

The starting line of the comment string.

wxMBConv* wxJSONReader::m_conv [protected]

The conversion object used in stream input.

This data member is set to NULL in the ctor and it is initialized in the Parse() function when the input is from a stream object. The pointer points to a wxMBConvUTF8 object used by the GetChar() function in order to convert a variable number of bytes into a wide character. Input from strings does not need this conversion: the GetChar() function can access a single character in the string input by using the subscript operator or the wxString::At() function, which returns a single character.

Please note that in order to support input formats other than UTF-8 in stream input it is not sufficient to change the object pointed to by this data member, because the GetChar() function also has to know the number of bytes that have to be read from the stream in order to get a character. See NumBytes() for more info.

wxJSONValue* wxJSONReader::m_current [protected]

The pointer to the value object that is being read.

wxArrayString wxJSONReader::m_errors [protected]

The array of error messages.

int wxJSONReader::m_flags [protected]

Flags that control the parser behaviour.

void* wxJSONReader::m_inObject [protected]

The pointer to the input object (a string or a stream).

int wxJSONReader::m_inType [protected]

The input type (0=string, 1=stream).

wxJSONValue* wxJSONReader::m_lastStored [protected]

The pointer to the value object that was last stored.

int wxJSONReader::m_level [protected]

The current level of object/array nesting (starts at ZERO).

int wxJSONReader::m_lineNo [protected]

The current line number (start at 1).

int wxJSONReader::m_maxErrors [protected]

Maximum number of errors stored in the error's array.

wxJSONValue* wxJSONReader::m_next [protected]

The pointer to the value object that will be read.

int wxJSONReader::m_peekChar [protected]

The character read by the PeekChar() function (-1 none).

wxArrayString wxJSONReader::m_warnings [protected]

The array of warning messages.


The documentation for this class was generated from the following files:
Generated on Mon Aug 18 22:54:23 2008 for wxJSON by  doxygen 1.4.7