wxJSON internals

Introduction

The wxJSONValue class is much like a variant type which can hold various types (see the documentation of the wxWidgets' wxVariant class). The JSON value class is a bit different from wxVariant because it cannot hold whatever type you want but only the following:

The type of the JSON value contained in a wxJSONValue object is an enumerated constant of type wxJSONType. Starting from version 0.5 the wxJSON library supports 64-bits integers which introduced new JSON types. For more info see 64-bits and 32-bits integers.

There is no need to specify the type of a wxJSONValue object: it is automatically set when you construct the object or when a value is assigned to it. For example:

 wxJSONValue v;            // a 'null' value
 wxJSONValue v1( 10 );     // signed integer type
 wxJSONValue v2( 12.90);   // double type
 v1 = "some string";       // now 'v1' is of type string

 wxJSONValue v3;           // a 'null' value
 v3.Append( 10 );          // 'v3' is now of type wxJSONTYPE_ARRAY

The only exception to this is when you want to set the wxJSONTYPE_INVALID. Note that you should cast the wxJSONTYPE_INVALID constant to a wxJSONType type because some compilers may assume the constant value to be an int:

 wxJSONValue value( (wxJSONType) wxJSONTYPE_INVALID );

The wxJSONRefData structure
All data is stored in the wxJSONRefData class, which is little more than a plain structure: it does not define an interface for accessing data but only the data members, the ctors and the dtor. The interface is entirely defined in the wxJSONValue class which, in turn, does not contain any data (with the exception of the pointer to the referenced data). This organization lets us use the reference counting technique (also known as copy-on-write) when copying wxJSONValue objects.

The data structure holds the type of the JSON value, the JSON value itself, the comment lines, if any, etc. To know more about the individual data member defined in the class see the documentation of wxJSONRefData. The data structure holds data in two different modes:

The union is defined as follows:

  union wxJSONValueHolder  {
    int              m_valInt;
    unsigned int     m_valUInt;
    short int        m_valShort;
    unsigned short   m_valUShort;
    long int         m_valLong;
    unsigned long    m_valULong;
    double           m_valDouble;
    const wxChar*    m_valCString;
    bool             m_valBool;

 #if defined( wxJSON_64BIT_INT )
    wxInt64          m_valInt64;
    wxUint64         m_valUInt64;
 #endif
  };

The wxJSONRefData structure also holds the three complex objects that represent the three JSON value types: strings, arrays and objects (meaning JSON objects, not instances of a C++ class):

    wxString             m_valString;
    wxJSONInternalArray  m_valArray;
    wxJSONInternalMap    m_valMap;

Note that primitive types are stored in a union and not in a structure: this means that when you store a value in one of the data members, all the others are affected too. An example makes this clearer.

Integers are stored using the widest storage size: (unsigned) long int by default and wx(U)Int64 on platforms that support 64-bits integers (to know more about 64-bits integer support read 64-bits and 32-bits integers). So if you store an int of value -1, every other data member gets a value that depends on its own data type. Below you find a hardcopy of the memory dump of a JSON value object which was assigned the integer value -1:

intern01.png

A value of -1 is stored as all binary '1' in the union but the value returned by the wxJSONValue class depends on the type you want. In other words, if you get the value as an integer you get -1 but if you get the value as an unsigned integer you get different values depending on the size of the requested type. Also note that when the same value is returned as a double, the wxJSONValue::AsDouble() function does not promote the int to a double: the function just returns the bits as they are stored and interpreted as a double thus returning a NaN.
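
The effect described above can be reproduced outside wxJSON with a few lines of standard C++. This is only a minimal sketch: the Storage structure and the ReadAs() helper are hypothetical names that do not exist in the library, and std::memcpy is used in place of reading another union member because the latter is undefined behaviour in standard C++.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for the widest integer slot of the union.
struct Storage {
    std::int64_t bits;
};

// Reinterpret the first sizeof(T) bytes of the stored pattern as type T.
// No conversion or promotion takes place: the bits are copied as they are.
template <typename T>
T ReadAs(const Storage& s)
{
    static_assert(sizeof(T) <= sizeof(std::int64_t), "T must fit in the storage");
    T out;
    std::memcpy(&out, &s.bits, sizeof(T));
    return out;
}
```

With a stored value of -1 every byte is 0xFF, so ReadAs<int>() gives -1, ReadAs<std::uint16_t>() gives 65535 and ReadAs<double>() gives a NaN, regardless of the byte order of the machine.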

wxJSON internals: reference counting

Starting from version 0.4.0 the internal representation of a JSON value has totally changed because of the implementation of the reference counting technique, also known as copy-on-write. Now the wxJSONValue class does not actually contain any data: every instance of this class holds a pointer to the actual data structure defined in the wxJSONRefData class. The structure contains a special data member that counts the number of wxJSONValue objects that share the data.

If you look at the example memory dump seen above, you will note the wxJSONValue::m_refData data member that points to the actual data structure and the wxJSONRefData::m_refCount data member that counts how many JSON value objects share the data structure (one, in the example).

Reference counting is very simple: if you copy an instance of a wxJSONValue object, the data contained in the wxJSONRefData structure is not really copied but, instead, it is shared by the two JSON value objects, whose data pointers point to the same memory area. Here is an example:

  wxJSONValue v1( 12 );
  wxJSONValue v2( v1 );

cow02.png
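
The sharing shown in the picture can be sketched with std::shared_ptr, whose use_count() plays the role of wxJSONRefData::m_refCount. This is not the wxJSON implementation: the Value and RefData types below are hypothetical and hold a single long, just to show that a copy shares the data and that a write detaches it.

```cpp
#include <cassert>
#include <memory>

// Hypothetical payload; the real wxJSONRefData holds much more.
struct RefData {
    long value;
};

class Value {
public:
    explicit Value(long v) : m_refData(std::make_shared<RefData>(RefData{v})) {}
    // The implicit copy constructor copies only the pointer, so the
    // reference counter is incremented and no data is duplicated.

    long Get() const { return m_refData->value; }

    // A writer detaches first, so the other owners keep the old data.
    void Set(long v)
    {
        if (m_refData.use_count() > 1) {
            m_refData = std::make_shared<RefData>(*m_refData);
        }
        m_refData->value = v;
    }

    long RefCount() const { return m_refData.use_count(); }
    bool SharesDataWith(const Value& other) const { return m_refData == other.m_refData; }

private:
    std::shared_ptr<RefData> m_refData;
};
```

After Value v2( v1 ) both objects report a reference count of two; calling v2.Set() gives v2 its own data and leaves v1 untouched.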

Reference counting is implemented in many wxWidgets classes such as wxBitmap, wxImage, etc. but the wxJSONValue class is a bit different because objects of this class may contain other wxJSONValue objects and they can be nested to a virtually infinite level. As a consequence, references are not propagated down the hierarchy. Also, because values are accessed using the subscript operators, which are non-const functions, COW for wxJSONValue objects is not as efficient as we may expect.

In the following paragraphs I try to explain what happens when you make a copy of a wxJSONValue and then call some non-const functions on one of the two instances.

Making a copy of an array type.

In the following example I create an array type and set a value to the fourth element. The subscript operator automatically creates the first four elements and initializes them to a null value. Then the integer value is assigned to the fourth element by the assignment operator. Note that the first three elements of the array share the same data: this is because the subscript operator automatically creates all the items up to the requested index. The needed items are created by copying (using COW) a temporary NULL JSON value:

  wxJSONValue v1;
  v1[3] = 12;           // set the value of the fourth item

cow08.png

Writing the value to a JSON text document we get the following:

 [
    null,
    null,
    null,
    12
 ]

Now copy the v1 JSON value to a v3 value. Note that the root JSON data structure is shared by the two instances.

  wxJSONValue v1;
  v1[3] = 12;           // set the value of the fourth item
  wxJSONValue v3(v1);   // make a copy of 'v1'

cow09.png

We already noted that the three null values in the array share the same data structure but because the root value is shared we only have a reference count of THREE for the NULL values. In fact, the data is shared by SIX JSON value objects: 3 items in v1 plus 3 items in v3 (six values in total) but as the parent object is shared, the wxJSONRefData::m_refCount data member only counts 3 shares.

Writing to a shared data

  wxJSONValue v1;
  v1[3] = 12;           // set the value of the fourth item
  wxJSONValue v3(v1);   // makes a copy of 'v1'

  v3[1] = 2;            // change the value of the second array's element

When we change the value of an array's element we would expect that a real copy of only that element is made and that the new value is assigned to it.

We would be wrong. In fact, the wxJSONValue object makes a copy of the whole root object v3. Why? The first impression is that the assignment operator is called for the second array element of v3 and that this causes a real copy of that element only. In reality, before calling the operator= member function, the code fragment seen above must obtain a reference to the second element of v3's array. This reference is returned by operator[] (the subscript operator), which is itself a non-const member function. So the subscript operator of the root value object makes a real copy of the referenced data: all the array's elements are copied from v1's instance to v3. You may notice from the memory dump that the copy of the elements is not a real copy because it uses COW. Below you find the memory dump of the two objects after we changed one array's element. As you can see, each root value now has an exclusive copy of the array:

cow10.png
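
The behaviour of the non-const subscript operator can be reproduced with a small model. The following is a sketch under my own assumptions, not the wxJSON source: Value, RefData, Detach() and SharesDataWith() are hypothetical names, std::shared_ptr replaces the hand-written reference counter and only integers and arrays are modelled.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

struct RefData;  // defined after Value

class Value {
public:
    Value();
    Value& operator=(long n);             // assign an integer
    Value& operator[](std::size_t i);     // non-const: detaches shared data
    Value ItemAt(std::size_t i) const;    // const: returns a shallow copy
    long AsLong() const;
    bool SharesDataWith(const Value& other) const { return m_refData == other.m_refData; }
private:
    void Detach();
    std::shared_ptr<RefData> m_refData;
};

struct RefData {
    bool isNull = true;
    long number = 0;
    std::vector<Value> items;
};

Value::Value() : m_refData(std::make_shared<RefData>()) {}

// Copy the referenced data only if another Value shares it. Copying
// 'items' copies Value handles, so the elements themselves stay shared.
void Value::Detach()
{
    if (m_refData.use_count() > 1) {
        m_refData = std::make_shared<RefData>(*m_refData);
    }
}

Value& Value::operator=(long n)
{
    Detach();
    m_refData->isNull = false;
    m_refData->number = n;
    m_refData->items.clear();
    return *this;
}

Value& Value::operator[](std::size_t i)
{
    Detach();  // happens even when the caller only wants to read
    m_refData->isNull = false;
    if (m_refData->items.size() <= i) {
        // New slots are copies of one temporary null value, so they
        // all refer to the same data.
        m_refData->items.resize(i + 1, Value());
    }
    return m_refData->items[i];
}

Value Value::ItemAt(std::size_t i) const { return m_refData->items.at(i); }

long Value::AsLong() const { return m_refData->number; }
```

In this model, after v1[3] = 12, Value v3( v1 ) and v3[1] = 2, the two roots no longer share their data, the second elements differ, but the first elements of v1 and v3 still point to the same null data. A read through ItemAt() leaves the root shared because it never calls Detach().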

In order to avoid the copy of the top-level array we would have to use a const member function to access the second array's element. Note that we cannot use the wxJSONValue::ItemAt() function because it returns a copy of the data, not a reference to it:

  wxJSONValue v1;
  v1[3] = 12;           // set the value of the fourth item
  wxJSONValue v2(v1);   // makes a copy of 'v1'

  // does not work!!! what we do is to change a temporary copy
  // of the second array's element
  v2.ItemAt( 1 ) = "value changed";

The only suitable function is the wxJSONValue::Find() function which is, unfortunately, protected so it cannot be called from outside the class.

Another drawback of the non-const subscript operators is that the copy operation is performed not only when we write to the JSON value object but also when we read from it. This is an example:

  wxJSONValue v1;
  v1[3] = 12;           // set the value of the fourth item
  wxJSONValue v2(v1);   // makes a copy of 'v1'

  int i = v1[3].AsInt();   // read from 'v1'

Because the operator[] member function is non-const, the read operation causes the wxJSONValue class to make an exclusive copy of the shared data even when the value is only accessed for reading. The good news is that we can use wxJSONValue::ItemAt() in this case, thus avoiding the copy of the shared data (tested: see the Test51() function in samples/test11.cpp):

  wxJSONValue v1;
  v1[3] = 12;           // set the value of the fourth item
  wxJSONValue v2(v1);   // makes a copy of 'v1'

  int i = v1.ItemAt( 3 ).AsInt();

The problem is that we can use ItemAt() for only one level in the JSON value's hierarchy.

So is COW totally useless? No, it is not!

Even when using the subscript operators, the real copy of shared data is done only down to the parent of the requested level: all the other JSON value objects of the same level and of lower levels are not really copied because COW is used for all of them. In the following image you see that in the above example of a four-element array, the JSON array value v1 is copied to v3 but the individual items are not really copied: 3 items of v1 and 2 items of v3 refer to the same referenced data (the NULL value):

cow11.png

In this example the array's items are NULL values, so the time saved by COW is not much, but remember that an array's item may contain another array which may contain one or more key/value hashmaps, which may contain one or more arrays, and so on.

wxJSON internals: the C string type

  wxJSONValue( const wxChar* str );
  wxJSONValue( const wxString& str );

You may ask yourself why there are 2 different constructors for strings. For convenience, you may think, in order to save an implicit conversion from const wxChar* to wxString. The answer is: NO. The two constructors store the string in a very different way.

Both ctors store strings and they could be stored just as wxString objects. In fact, this is the default behaviour of the class if the WXJSON_USE_CSTRING macro is not defined.

If this macro is defined, however, the first ctor stores the pointer in the wxJSONRefData structure assuming that the string is statically allocated and it does NOT copy the string. This behaviour saves a string's copy which can be time-consuming but, on the other hand, you must be sure that the pointed-to buffer is not freed / deallocated for the lifetime of the wxJSONValue object (this is always true for static strings). The following code fragment is an example on how to use the static string type:

  wxJSONValue aString( _T("This is a static string"));

The code above is correct, because the pointed-to string is really statically allocated (and, on most platforms, static strings are allocated in the code segment so that they cannot be changed for the lifetime of the application).

The following code is not correct and it would probably result in a SEGFAULT when you try to access the wxJSONValue data. The problem is that the string is constructed on the stack which will be deallocated when the function returns. So, the returned JSON object contains a pointer to a deallocated memory area.

  // Example 1
  wxJSONValue MyFunction()
  {
    char buffer[128];
    snprintf( buffer, 128, "This is a string constructed on the stack");
    wxJSONValue aString( buffer );
    return aString;
  }

The code above should be written as follows:

  // Example 2
  wxJSONValue MyFunction()
  {
    char buffer[128];
    snprintf( buffer, 128, "This is a string constructed on the stack");
    wxJSONValue aString( wxString( buffer));
    return aString;
  }

Now it is correct because the wxString object holds a copy of the buffer memory area. Note that if the WXJSON_USE_CSTRING macro is not defined, there is no need to construct a temporary wxString object in order to force the wxJSONValue class to store a wxString: it is automatically created by the wxJSONValue( const wxChar*) ctor. This means that you can use the code in Example 1 without worrying about C-strings. By default, the wxJSON_USE_CSTRING macro is not defined.
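
The difference between the two storage strategies can be sketched without wxWidgets. The StringValue class below is a hypothetical illustration written for this page, not part of wxJSON: one constructor copies the characters, while FromStatic() stores only the pointer and therefore relies on the caller to keep the buffer alive.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical holder that mimics the two storage strategies.
class StringValue {
public:
    // Copies the characters: safe for any source buffer.
    explicit StringValue(const std::string& s) : m_copy(s), m_static(nullptr) {}

    // Stores only the pointer: the caller must guarantee that the
    // buffer outlives this object (true for string literals).
    static StringValue FromStatic(const char* literal)
    {
        StringValue v{std::string()};
        v.m_static = literal;
        return v;
    }

    std::string AsString() const { return m_static ? std::string(m_static) : m_copy; }
    bool OwnsCopy() const { return m_static == nullptr; }

private:
    std::string m_copy;
    const char* m_static;
};

// Safe counterpart of Example 2: the stack buffer is copied before
// the function returns and the buffer disappears.
StringValue MakeFromStackBuffer()
{
    char buffer[128];
    std::snprintf(buffer, sizeof(buffer), "built on the stack: %d", 42);
    return StringValue(std::string(buffer));
}
```

Passing the stack buffer to FromStatic() instead would reproduce the mistake of Example 1: the returned object would hold a pointer to memory that is no longer valid.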

If your application uses many static strings that never change, you can save time by defining the symbol when compiling the wxJSON library.

NOTES: the static C-string type is probably useless and, in fact, it is never used in wxJSONValue by default. The only reason for using it is speed: time is saved when no string-copy operation is performed. But the wxString object uses copy-on-write to avoid unnecessary copies, so it is much wiser (and SAFER) to never use C-strings.

64-bits and 32-bits integers

Starting from version 0.5, the wxJSON library supports 64-bits integers but only on those platforms that have native support for 64-bits integers such as, for example, Win32 and GNU/Linux.

Starting from version 1.0 the wxJSONValue also handles long int and short int data types.

By default, the library checks if the wxLongLong_t macro is defined by wxWidgets and, if it is, the library enables 64-bits integer support. The wxLongLong_t macro is the wxWidgets platform independent data type for representing a 64-bits integer and it is defined by the GUI framework as a placeholder for the underlying compiler / platform specific data type: __int64 on Windows and long long on GNU/Linux systems. To know more about 64-bits integers in wxWidgets read the documentation of the wxLongLong class. If the system / platform does not support 64-bits integers, integer values are stored in a long int or, for unsigned values, in an unsigned long int.

The user can disable 64-bits integer support by defining the:

  wxJSON_NO_64BIT_INT

macro in the include/wx/json_defs.h header file (just uncomment the line where the macro is defined).

All integer values are stored in the widest storage size: wx(U)Int64 or long int depending on the platform. The m_type data member of the JSON value is set to the generic integer type, wxJSONTYPE_INT or wxJSONTYPE_UINT, regardless of its size. In other words, the type of the original value does not matter: the only thing that matters is whether the value is signed or unsigned.

  wxJSONValue i( 100 );               // an int
  wxJSONValue s( (short) 100 );       // a short int
  wxJSONValue l( (long) 100 );        // a long int
  wxJSONValue i64( (wxInt64) 100 );   // a long long int

All the above integer values are stored in the wxJSONValueHolder::m_valInt64 or in the wxJSONValueHolder::m_valLong data member. The JSON value type is set to wxJSONTYPE_INT in all cases. As the storage area of all primitive types is the same (it is a union), it is very easy to return an integer value in different sizes, provided that the requested integer type has enough bits to store the value.

How can the user know the storage needs of an integer data type? First of all you have to ask yourself if you really need this information. If your application only uses the int data type (for integers) and it only reads its own JSON data files, it is unlikely that an integer value stored in the JSON value class will hold anything other than an int. On the other hand, if your application communicates with other applications over a network connection, the JSON value class may hold integers which are so large that they cannot fit in a simple int data type.

In order to know the storage needs of the value stored in the class you call the wxJSONValue::GetType() function which returns different constants depending on the weight of the numeric value:

The GetType() function relies on the definition of the SHORT_MAX, SHORT_MIN, USHORT_MAX, LONG_MAX, LONG_MIN, ULONG_MAX, macros to check if the value fits in a particular data type. If the macros are not defined (I do not know if this could happen), the wxJSON library defines them by itself according to the rules of the C99 standard (see the include/wx/json_defs.h header file):

   C99 type      width (bits)         limits
   --------      ------------         ------
   short            16                -32.768 .. +32.767
   ushort           16                0 .. 65.535
   long             32                -2.147.483.648 .. +2.147.483.647
   ulong            32                0 .. +4.294.967.295

Note that the C99 standard only defines the minimum width of these types; in addition, the C++ language does not define a minimum size for these integer types.

Also note that the wxJSONValue::GetType() function never returns wxJSONTYPE_INT. This is because the int data type has a variable bit-width that depends on the platform: on Win32 and GNU/Linux, the int type is the same as long (32 bits wide) but on other platforms it may be only 16 bits wide because the minimum width of int is 16 in the C99 standard. For this reason, it is always good practice to avoid int in C/C++ and to use the long data type, which guarantees at least 32 bits.

The wxJSONValue class lets you use int as the returned data type because it defines the Is(U)Int member functions which return the correct result depending on the definition of the INT_MAX, INT_MIN and UINT_MAX macros.

The IsXxxxx() functions

The wxJSONValue class also defines functions of the form IsXxxxxx() to know the storage needs of an integer type:

Signed integers
Unsigned integers
Note that in version 1.x those functions behave differently than in versions 0.x. Now all functions return TRUE if the value can be stored in the desired type, with respect to the sign of the value.
In 0.x versions only one of the 8 functions returned TRUE: the one that matched the sign of the stored value AND the storage needs. In other words, if a value contained the signed integer value 100, the IsInt() function returned TRUE while IsInt64() returned FALSE. This was wrong because the value fits in a 64-bits integer. In the new 1.0 version, all the functions related to signed integers return TRUE if the contained value is 100.

All 64-bits integer member functions are only available if the platform on which your application runs natively supports 64-bits integers. This is in contrast with the wxLongLong class which is also available on 32-bits platforms.

If your system natively supports 64-bits integers but you do not want to include this feature in the wxJSON library, you can define the following macro when compiling the library (see include/wx/json_defs.h):

  wxJSON_NO_64BIT_INT

You can check whether or not 64-bits support is enabled in the wxJSON library, by checking if the wxJSON_64BIT_INT macro is defined.

The IsXxxxxxx() and AsXxxxxxx() functions

You can get the type of a JSON value object by calling the IsXxxxxx() functions where Xxxxxx is the type. For example, to know if a JSON value object contains an integer value, you call the wxJSONValue::IsInt() member function. Note that the function returns TRUE if the stored value is of type wxJSONTYPE_INT and it fits in an int data type. Example:

 wxJSONValue value(100);
 bool r = value.IsInt();        // return TRUE
      r = value.IsShort();      // return TRUE
      r = value.IsUInt();       // return FALSE (value is signed)
      r = value.IsLong();       // return TRUE

 value = (wxInt64) LONG_MAX + 1;  // needs 64-bits integer support

      r = value.IsInt64();        // return TRUE
      r = value.IsLong();         // return FALSE
      r = value.IsInt();          // return FALSE 
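
The "fits in the requested type" rule behind the signed IsXxxxxx() functions can be sketched as a range check on the widest stored value. FitsSigned() is a hypothetical helper written for this page: it uses std::numeric_limits, whereas the library relies on the INT_MAX, LONG_MAX and similar macros.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// True when the signed 64-bit value 'v' is representable in type T.
// T is expected to be a signed integer type no wider than 64 bits.
template <typename T>
bool FitsSigned(std::int64_t v)
{
    return v >= std::numeric_limits<T>::min()
        && v <= std::numeric_limits<T>::max();
}
```

With this rule the value 100 fits in every signed type, while a value one past the 32-bits maximum fits only in the 64-bits type, which is the same pattern as the TRUE / FALSE results listed in the example above.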

On the other hand, the AsXxxxxx functions return the value as the specified type, without checking if the stored value is actually of the same type as the one wanted. In other words, the AsXxxxxx() functions just return the bit pattern stored in the union, interpreting those bits as the desired type.

Example:

 wxJSONValue value(-1);
 int i             = value.AsLong();    // return -1
 unsigned int ui   = value.AsULong();   // return 4294967295
 unsigned short us = value.AsUShort();  // return 65535

Note that all primitive types are stored in a union so you can get the value of a primitive type as the one you want. Also note that the AsXxxxxx() function does not check that the value is compatible with the one you want. For example:

  wxJSONValue value( 10000 );
  const wxChar* ch = value.AsCString();    // returns 10000 which is garbage

However, in debug builds, all AsXxxxxx() functions wxASSERT that the type of the stored value is compatible with the one you want. In the previous example, an ASSERTION failure occurs in the AsCString() function because the returned value is surely garbage.

To know more about the return values of the IsXxxxxx and the AsXxxxxx memberfunctions in 64-bits platforms, see the Test55() function in the samples/test13.cpp source file. The test function calls all the mentioned memberfunctions for several signed and unsigned integers: results are commented.

Reading numeric values from JSON text.

If 64-bits support is enabled in the wxJSON library, reading values from JSON text may give different results. A numeric value that is too large for 32-bits integer storage is stored as a double in version 0.4 and earlier, but it may be stored as an integer if it fits in 64-bits integer storage.

When the wxJSON parser reads a token that starts with a digit or with a minus or plus sign, it assumes that the value read is a number and it tries to convert it to:

Note that if 64-bits storage is enabled, the wxJSON parser does not try to convert numeric tokens to a 32-bits integer but immediately tries 64-bits storage. As a consequence, numeric values between INT_MAX + 1 and UINT_MAX are stored as signed integers in a 64-bits environment and as unsigned integers in a 32-bits environment.

Examples:

  {
    // if 64 bits integer is not enabled
    100         // read as a signed long integer
    2147483647  // INT_MAX: read as a signed long integer
    2147483648  // INT_MAX + 1: read as an unsigned long integer

   -2147483648  // INT_MIN: read as a signed long integer
    4294967295  // UINT_MAX: read as an unsigned long integer

    4294967296  // UINT_MAX + 1: read as a double

    // if 64-bits integer is enabled
    4294967295  // UINT_MAX: read as a signed wxInt64
    4294967296  // UINT_MAX + 1: read as a signed wxInt64
  }
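
The fallback order suggested by the examples above (signed integer first, then unsigned integer, then double) can be sketched for the 64-bits case with the standard strtoll(), strtoull() and strtod() functions. This is only an illustration of the idea: ParseNumber(), Number and NumKind are hypothetical names and the real parser may convert tokens in a different way.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdint>
#include <cstdlib>
#include <string>

enum class NumKind { Int64, UInt64, Double, NotANumber };

struct Number {
    NumKind kind;
    std::int64_t i;
    std::uint64_t u;
    double d;
};

// Try the conversions in order: signed 64-bit, unsigned 64-bit, double.
// A conversion succeeds only if the whole token was consumed without
// a range error.
Number ParseNumber(const std::string& token)
{
    Number n{NumKind::NotANumber, 0, 0, 0.0};
    if (token.empty()) {
        return n;
    }
    const char* begin = token.c_str();
    char* end = nullptr;

    errno = 0;
    long long s = std::strtoll(begin, &end, 10);
    if (errno == 0 && end != begin && *end == '\0') {
        n.kind = NumKind::Int64;
        n.i = s;
        return n;
    }

    // strtoull() silently accepts a leading minus sign, so negative
    // tokens skip the unsigned attempt.
    if (token[0] != '-') {
        errno = 0;
        unsigned long long u = std::strtoull(begin, &end, 10);
        if (errno == 0 && end != begin && *end == '\0') {
            n.kind = NumKind::UInt64;
            n.u = u;
            return n;
        }
    }

    errno = 0;
    double d = std::strtod(begin, &end);
    if (errno == 0 && end != begin && *end == '\0') {
        n.kind = NumKind::Double;
        n.d = d;
    }
    return n;
}
```

In this sketch 4294967296 is reported as a signed 64-bits integer, a value one past the signed 64-bits maximum as unsigned, and anything larger or with a fractional part as a double.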

Also note that if a number is between INT64_MIN and INT64_MAX (or LONG_MIN and LONG_MAX, depending on the platform) it is always read as a signed integer regardless of its original type. You can use a special writer's flag in order to force the wxJSON library to recognize unsigned JSON values written to JSON text. See The wxJSONWRITER_RECOGNIZE_UNSIGNED flag for more info.

The array of values.

An object of this type holds an array of wxJSONValue objects. This means that you can have an array of integers, doubles, strings, arrays and key/value maps, too. Moreover, a single array can mix all these types: the first element can be an integer, the second element another array, and the third one a key/value map.

The type is implemented using a wxObjArray class which stores wxJSONValue objects. The declaration of this type follows the wxWidgets container classes declaration for arrays of objects:

  class wxJSONValue;
  WX_DECLARE_OBJARRAY( wxJSONValue, wxJSONInternalArray )

Note that the name of the type contains the word internal. This means that the type is used internally by the wxJSONValue class and should not be used by the application. However, the class's API defines a member function that can be used to get the internal array type:

  const wxJSONInternalArray* AsArray() const;

which returns a pointer to the array stored in the wxJSONValue::m_value.m_valArray data member. There is no need for the application to access the internal representation of the JSON array type. Use the wxJSONValue::Item, wxJSONValue::ItemAt and the subscript operator for retrieving the array's values.

The map of key/value pairs.

An object of this type is a map of key / value pairs where the key is a string and the value is a wxJSONValue object: it can hold bools, integers, strings, arrays and key/value maps, too.

This type is implemented using the wxHashMap class which is a simple, type-safe, and reasonably efficient hash map class whose interface is a subset of the interface of STL containers.

The definition of the hashmap for wxJSONValue objects is as follows:

  WX_DECLARE_STRING_HASH_MAP( wxJSONValue, wxJSONInternalMap );

Note that the name of the type contains the word internal. This means that the type is used internally by the wxJSONValue class and should not be used by the application. However, the wxJSONValue API defines a member function that can be used to get this object:

  const wxJSONInternalMap* AsMap() const;

There is no need for the application to access the internal representation of the JSON hashmap type. Use the wxJSONValue::Item(const wxString&), wxJSONValue::ItemAt and the subscript operator for retrieving the hashmap's values.

The comparison function and operator

You may have noticed that the wxJSONValue class does not define a comparison operator (the operator==() function). This is not an oversight but a precise design choice, because comparing wxJSONValue objects may be time-consuming and the usual meaning of equal does not apply well to JSON objects. Consider the following two JSON objects:

 // first object
 {
   "font" : {
     "size" : 12,
     "face" : "Arial",
     "bold" : true
   }
 }

 // second object
 {
   "font" : {
     "face" : "Arial",
     "bold" : true,
     "size" : 12
   }
 }

Strictly speaking, the two objects are not equal because the order of the key/value pairs is not the same, although the values that the two objects contain are the same.

For this reason the wxJSONValue class does not define the comparison operator but a similar function: wxJSONValue::IsSameAs(), which returns TRUE if the two objects contain the same values even if they are not in the same order. This applies to both key/value maps and arrays.

The comparison function is very time-consuming: for key/value maps, the function first compares the number of elements in the two objects and, if they are equal, for every key in the first object it searches the second object and compares the values.

The job of comparing arrays is even worse: the comparison function has to first compare the number of elements then, because the order of the elements is unimportant, for each element in the first array the function iterates through all elements in the second array searching for a matching element.

If the two objects are very complex, the comparison function is very slow and you are discouraged from using it unless it is strictly necessary. I have defined this function only for debugging purposes.
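
The two comparison strategies described above can be sketched on a simplified model in which values are plain integers. SameMap() and SameArray() are hypothetical helpers written for this page, and std::unordered_map stands in for the wxHashMap based type used by the library.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

// Simplified model: maps of string -> int and arrays of int.
using Map = std::unordered_map<std::string, int>;
using Array = std::vector<int>;

// Compare sizes first, then look up every key of 'a' in 'b'.
bool SameMap(const Map& a, const Map& b)
{
    if (a.size() != b.size()) {
        return false;
    }
    for (const auto& kv : a) {
        auto it = b.find(kv.first);
        if (it == b.end() || it->second != kv.second) {
            return false;
        }
    }
    return true;
}

// Compare sizes first, then search 'b' for every element of 'a'.
// The nested search is quadratic, which is why the text calls the
// array case "even worse". Duplicated elements are not counted in
// this sketch.
bool SameArray(const Array& a, const Array& b)
{
    if (a.size() != b.size()) {
        return false;
    }
    for (int x : a) {
        if (std::find(b.begin(), b.end(), x) == b.end()) {
            return false;
        }
    }
    return true;
}
```

In the real class the elements are themselves JSON values, so each element comparison may recurse into nested arrays and maps, which makes the cost grow quickly with the depth of the document.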

Comparing different types

A problem in the interpretation of IsSameAs arises when comparing different types that can be converted or promoted to another type. Consider the following two JSON values:

  wxJSONValue v1( 100 );
  wxJSONValue v2( 100.0 );
  bool r = v1.IsSameAs( v2 );  // should return TRUE

The above values will be stored as different types, the first as an integer and the second as a double, but they are, in fact, the same value and the function should return TRUE. Up to and including release 0.2.1, the wxJSON library had a bug that caused the IsSameAs() function to return FALSE in the above example. This was due to the fact that the function first compared the types and, if they differed, FALSE was immediately returned without trying a type conversion.

Starting from release 0.2.2, this bug was fixed and the wxJSONValue::IsSameAs() function now correctly compares compatible types: for now, these are only the numeric types. In other words, a string that contains the same value as a numeric one is not the same. Example:

  wxJSONValue v1( 100 );
  wxJSONValue v2( _T("100"));
  bool r = v1.IsSameAs( v2 );  // returns FALSE

The comparison function compares different numeric types by promoting them to double and then comparing the double values. In this way the function correctly handles this situation:

  wxJSONValue v1( -1 );            // this is -1
  wxJSONValue v2( (unsigned) -1);  // this is 4.294.967.295
  bool r = v1.IsSameAs( v2 );      // returns FALSE
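
The promotion to double can be sketched as follows. The Num structure and its helper functions are hypothetical and only model numeric values; the point is that each value is converted according to its stored type, signed or unsigned, instead of having its bits reinterpreted.

```cpp
#include <cassert>
#include <cstdint>

enum class Kind { Int, UInt, Double };

struct Num {
    Kind kind;
    std::int64_t i;
    std::uint64_t u;
    double d;
};

Num FromInt(std::int64_t v)   { return Num{Kind::Int, v, 0, 0.0}; }
Num FromUInt(std::uint64_t v) { return Num{Kind::UInt, 0, v, 0.0}; }
Num FromDouble(double v)      { return Num{Kind::Double, 0, 0, v}; }

// Promote according to the stored type, not by reinterpreting bits.
double ToDouble(const Num& n)
{
    switch (n.kind) {
        case Kind::Int:  return static_cast<double>(n.i);
        case Kind::UInt: return static_cast<double>(n.u);
        default:         return n.d;
    }
}

bool IsSameAs(const Num& a, const Num& b)
{
    return ToDouble(a) == ToDouble(b);
}
```

Keep in mind that a double has a 53-bit mantissa, so in a scheme like this two different 64-bits integers larger than 2^53 may compare as equal after the promotion.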

The multiline string

WARNING: this feature in old 0.x versions is a consequence of an error in the implementation of the wxJSONWriter's flags. The feature is still useful but its implementation has changed in the writer class: no change is needed in the parser.

The old 0.x documentation
This feature allows a JSON string to be split across two or more lines in the input JSON text:

 {
    "copyright" : "This library is distributed\n"
                  "under the GNU Public License\n"
                  "(C) 2007 XYZ Software"
 }

It is much more readable by humans because the new-line characters contained in the value string are enforced by splitting the string into three lines. The wxJSONReader allows this but reports a warning if the parser was not constructed using the wxJSONREADER_STRICT flag. If strict is used, strings cannot be split.

To provide this feature, the parser concatenates strings if a comma character is not found between them. Note that only strings can be split across more than one line: numbers and literals cannot be split.

The drawback is that this feature is error-prone. Consider the following example:

  {
    "seasons" :  [
      "spring",
      "summer"
      "autumn",
      "winter"
     ]
  }

The array has four elements but I forgot the comma character between the second and the third element. What I get in this case is a three-elements array where the second element is the concatenation of the two strings "summer" and "autumn" which is not what I wanted. Worse, the parser does not consider this an error: only a warning is reported.
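
The concatenation rule and its pitfall can be sketched on a toy token stream. Token and CollectStrings() are hypothetical names and the real reader works on characters rather than on pre-built tokens, but the logic is the one described above: two strings with no comma between them are merged and only a warning is counted.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical token stream: string literals and commas only.
struct Token {
    bool isComma;
    std::string text;   // empty for commas
};

// Collect the array elements. Two strings that are not separated by
// a comma are concatenated and counted as a warning, not as an error.
std::vector<std::string> CollectStrings(const std::vector<Token>& tokens, int& warnings)
{
    std::vector<std::string> out;
    bool previousWasString = false;
    warnings = 0;
    for (const Token& t : tokens) {
        if (t.isComma) {
            previousWasString = false;
        } else if (previousWasString) {
            out.back() += t.text;   // missing comma: concatenate
            ++warnings;
        } else {
            out.push_back(t.text);
            previousWasString = true;
        }
    }
    return out;
}
```

Feeding the "seasons" example to this function gives three elements, the second being "summerautumn", and one warning.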

For this reason, I am not sure that this feature is really a feature: maybe it will be dropped in next versions of the wxJSON library.

The actual documentation
I kept the old documentation just for historical purposes and for clarity: it helps to understand why the feature was changed in version 1.0. First of all, if you run test #24 in the test application (see the Test24() function in the samples/test6.cpp source file) you get the following output:

   "This is a multiline string\n"
      "this is line #2\n"
      "this is line #3"

which is just as explained before and it seems good but it isn't. The problem is that the LF character contained in the three strings would have to break the line just before the close-quote characters but this does not happen. The reason is that the LF character was not actually a LF (0x0A) but the lowercase letter 'n'. This is due to an implementation error in the wxJSONWriter::WriteStringValue() function which contains a table of characters that have to be escaped and the corresponding table of escape sequences. Here is the old code:

  // the characters that have to be escaped
  static const wxChar escapedChars[] = { 
        _T( '\"' ), // quotes 
        _T( '\\' ), // reverse solidus
        _T( '/' ),  // solidus
        _T( '\b' ), // backspace
        _T( '\f' ), // formfeed
        _T( '\n' ), // newline
        _T( '\r' ), // carriage-return
        _T( '\t' )  // horizontal tab
  };

  // the escape sequences that has to be written instead of the characters
  const wxString replaceChars[] = { 
        _T("\\\""),             // OK
        _T("\\\\"),             // OK
        _T("\\/"),              // OK
        _T("\\b"),              // should be "\\\b"
        _T("\\f"),              // should be "\\\f"
        _T("\\n"),              // should be "\\\n"
        _T("\\r"),              // should be "\\\r"
        _T("\\t")               // should be "\\\t"
  };

As you can see, I forgot one ESC character so the escape sequence for a LF character results in an ESCAPE character followed by a lowercase letter 'n'.

The consequence is that the multiline string feature is good for the parser but it is useless for the writer: there is no need to enforce a line-feed character by adding another line-feed character because the string is already split. If we add the line-feed after the close-quote character, the readability of the JSON text gets worse.

// the output without the MULTILINE string feature is more readable
// than using it: LF character are escaped.

"This is a multiline string\
this is line #2\
this is line #3"

Nevertheless, the flag is still useful in the JSON writer. If a string value is very long, it would be nice to split it into two or more lines provided that the string itself does not contain LF characters. Consider the following example:

  {
        "key" : "A very long string value that cause the user to scroll horizontally in order to actually read the whole line"
  }

It would be more readable if it was written as:

  {
        "key" : "A very long string value that cause the user to scroll"
                " horizontally in order to actually read the whole line"
  }

The reader is already capable of reading multiline strings such as the one seen above, regardless of whether they contain LF characters (escaped or not): just specify the wxJSONREADER_MULTISTRING flag in its constructor.

For the writer, the wxJSONWRITER_SPLIT_STRING flag has changed its implementation: strings are not split when a LF character is encountered but only when the 75th column has been reached in the output JSON text. Note that the writer splits the string when a space character is encountered so that a single word is not split across two lines.
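
The splitting rule can be sketched as follows. SplitAtSpaces() is a hypothetical helper, not the writer's code, and it makes a simplifying assumption: the column is measured on the chunk itself, while the writer counts columns in the whole output line, indentation included.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Split 'text' into chunks. A new chunk starts at the first space found
// once the current chunk has reached 'maxCol' characters, so a word is
// never broken. Concatenating the chunks gives back the original string.
std::vector<std::string> SplitAtSpaces(const std::string& text, std::size_t maxCol)
{
    std::vector<std::string> chunks;
    std::string current;
    for (char ch : text) {
        if (ch == ' ' && current.size() >= maxCol) {
            chunks.push_back(current);
            current.clear();
        }
        current += ch;
    }
    if (!current.empty() || chunks.empty()) {
        chunks.push_back(current);
    }
    return chunks;
}
```

Each chunk would then be written as a separate quoted string on its own line, with the space kept at the start of the following chunk as in the two-line example shown earlier.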

C/C++ comments in JSON text

Starting with release 0.2, the wxJSON library recognizes and stores C/C++ comments in JSON value objects. See wxJSON internals: C/C++ comments storage for a detailed implementation.
Generated on Mon Aug 18 22:54:23 2008 for wxJSON by  doxygen 1.4.7