zenXML
Straightforward C++ XML Processing
Overview

Rationale

zenXML is an XML library that enables convenient serialization of structured user data. Using compile-time information gathered through template metaprogramming, it minimizes manual overhead and frees the user from applying fundamental type conversions by hand. Basic data types are processed automatically: all built-in arithmetic types, all kinds of string classes and "string-like" types, and all types defined as STL containers. The library thereby solves a large number of recurring problems.

The design follows the philosophy of the Loki library:
http://loki-lib.sourceforge.net/index.php?n=Main.Philosophy

Quick Start

1. Download zenXML: http://sourceforge.net/projects/zenxml

2. Set up a preprocessor macro for your project to identify the platform (required for C-stream file I/O only):

    ZEN_PLATFORM_WINDOWS
    or
    ZEN_PLATFORM_OTHER

3. For optimal performance, define this global macro in release builds (following the convention of the assert macro):

    NDEBUG

4. Include the main header:

#include "zenxml.h"

5. Start serializing user data:

size_t a = 10;
double b = 2.0;
int    c = -1;
zen::XmlDoc doc; //empty XML document

zen::XmlOut out(doc); //fill the document via a data output proxy
out["elem1"](a); //
out["elem2"](b); //map data types to XML elements
out["elem3"](c); //

try
{
    save(doc, "file.xml"); //throw zen::XmlFileError
}
catch (const zen::XmlFileError& e) { /* handle error */ }

The following XML file will be created:

<?xml version="1.0" encoding="UTF-8"?>
<Root>
    <elem1>10</elem1>
    <elem2>2.000000</elem2>
    <elem3>-1</elem3>
</Root>

Load an XML file and map its content to user data:

zen::XmlDoc doc; //empty XML document

try
{
    load("file.xml", doc); //throw XmlFileError, XmlParsingError
}
catch (const zen::XmlError& e) { /* handle error */ } //catches both XmlFileError and XmlParsingError

zen::XmlIn in(doc); //read document into user data via an input proxy
in["elem1"](a); //
in["elem2"](b); //map XML elements into user data
in["elem3"](c); //

//check for mapping errors (missing element, failed conversion); these MAY be considered warnings only
if (in.errorsOccured())
{
   std::vector<std::wstring> failedElements = in.getErrorsAs<std::wstring>();
   /* show mapping errors */
}

Supported Platforms

zenXML is written in a platform-independent manner and should run on any compiler with at least rudimentary C++11 support. It has been tested successfully under:

Note: In order to enable C++11 features in GCC it is required to specify this compiler option:

-std=gnu++0x

Flexible Programming Model

Depending on the granularity of control a particular application requires, zenXML allows the user to choose between full control and simplicity.

The library is structured into the following parts, each of which can be used in isolation:

<File>
|
| zenxml_io.h
|
<Byte Stream>
|
| zenxml_parser.h
|
<Document Object Model>
|
| zenxml_bind.h
|
<C++ user data>

Structured XML element access

zen::XmlOut out(doc);
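//elem1 to elem4 are the string-like variables defined under "Automatic conversion for string-like types"
out[elem1][elem2][elem3][elem4]["elem5"][L"元素6"][L'元']['z'](-1234); //write value into one deeply nested XML element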

The resulting XML:

<?xml version="1.0" encoding="UTF-8"?>
<Root>
    <elemento1>
        <элемент2>
            <要素3>
                <στοιχείο4>
                    <elem5>
                        <元素6>
                            <元>
                                <z>-1234</z>
                            </元>
                        </元素6>
                    </elem5>
                </στοιχείο4>
            </要素3>
        </элемент2>
    </elemento1>
</Root>
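The chained call works because each operator[] returns a proxy for a child element, so the next operator[] descends one level further. The following is a minimal, self-contained sketch of that idea and not zenXML's implementation: Node, depth() and demoNestedAccess() are hypothetical names, and the sketch assumes that a missing child is created on first access.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

namespace sketch
{
// Hypothetical stand-in for an XML element (not a zenXML class).
struct Node
{
    std::string value;
    std::map<std::string, std::unique_ptr<Node>> children;

    // Return the child with the given name, creating it on first access.
    Node& operator[](const std::string& name)
    {
        std::unique_ptr<Node>& slot = children[name];
        if (!slot)
            slot.reset(new Node());
        return *slot;
    }

    // Store an integer as the text value of this element.
    void operator()(int v) { value = std::to_string(v); }
};

// Number of nesting levels below n, following the first child at each level.
inline int depth(const Node& n)
{
    return n.children.empty() ? 0 : 1 + depth(*n.children.begin()->second);
}

// Usage example: write one value into a deeply nested element.
inline void demoNestedAccess()
{
    Node root;
    root["elem5"]["elem6"]["z"](-1234);
    assert(depth(root) == 3);
    assert(root["elem5"]["elem6"]["z"].value == "-1234");
}
}
```

Returning a reference to the child keeps the chain cheap: no intermediate copies are made, and every link in the chain refers to the same tree.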

Access XML attributes

zen::XmlDoc doc;

zen::XmlOut out(doc);
out["elem"].attribute("attr1",   -1); //
out["elem"].attribute("attr2",  2.0); //write data into XML attributes
out["elem"].attribute("attr3", true); //

save(doc, "file.xml"); //throw XmlFileError

The resulting XML:

<?xml version="1.0" encoding="UTF-8"?>
<Root>
    <elem attr1="-1" attr2="2.000000" attr3="true"/>
</Root>

Automatic conversion for built-in arithmetic types

All built-in arithmetic types are detected at compile time and a proper conversion is applied. Common conversions for integer-like types such as long, size_t or __int64 as well as floating point types are optimized for maximum performance.

zen::XmlOut out(doc);

out["int"]      (-1234);
out["double"]   (1.23);
out["float"]    (4.56f);
out["usignlong"](1234UL);
out["bool"]     (false);

The resulting XML:

<?xml version="1.0" encoding="UTF-8"?>
<Root>
    <int>-1234</int>
    <double>1.230000</double>
    <float>4.560000</float>
    <usignlong>1234</usignlong>
    <bool>false</bool>
</Root>
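The compile-time detection described in this section can be approximated with standard type traits. The sketch below is an assumption about the general technique, not zenXML's code: toText() is a hypothetical helper, and the "%f" format is chosen only because it reproduces the six decimal places shown in the output above.

```cpp
#include <cassert>
#include <cstdio>
#include <string>
#include <type_traits>

namespace sketch
{
// bool gets its own overload so that it is written as true/false.
inline std::string toText(bool value) { return value ? "true" : "false"; }

// Selected at compile time for all integer types except bool.
template <class T>
typename std::enable_if<std::is_integral<T>::value && !std::is_same<T, bool>::value, std::string>::type
toText(T value)
{
    return std::to_string(value);
}

// Selected at compile time for float, double and long double.
// Note: long double is narrowed to double in this sketch.
template <class T>
typename std::enable_if<std::is_floating_point<T>::value, std::string>::type
toText(T value)
{
    char buffer[512] = {}; // large enough for any double printed with "%f"
    std::snprintf(buffer, sizeof(buffer), "%f", static_cast<double>(value));
    return std::string(buffer);
}

// Usage example mirroring the values written above.
inline void demoArithmetic()
{
    assert(toText(-1234) == "-1234");
    assert(toText(1.23) == "1.230000");
    assert(toText(false) == "false");
}
}
```

Because the overload is chosen during compilation, a call with an unsupported type fails to compile instead of failing at run time.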

Automatic conversion for string-like types

The document object model of zenXML internally stores all names and values as std::string. Consequently, everything that is not a std::string but is "string-like" is converted automatically into a std::string representation. By default zenXML accepts character arrays like char[], wchar_t[], char*, wchar_t*; single characters like char, wchar_t; standard string classes like std::string, std::wstring; and user-defined string classes. If the input string is based on char, it is simply copied, which preserves any local encoding. If the input string is based on wchar_t, it is converted to a UTF-8 encoded std::string. The wchar_t encoding of the system is detected at compile time, for example UTF-16 on Windows or UTF-32 on certain Linux variants.
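To make the wchar_t handling concrete, here is a minimal sketch of a wide-string to UTF-8 conversion that picks the source encoding from sizeof(wchar_t). It is an illustration under stated assumptions, not zenXML's converter: appendUtf8(), wideToUtf8() and demoWideToUtf8() are hypothetical names, and invalid input (for example an unpaired surrogate) is encoded as-is rather than rejected.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

namespace sketch
{
// Append one Unicode code point to a UTF-8 encoded string.
inline void appendUtf8(std::uint32_t cp, std::string& out)
{
    if (cp < 0x80)
        out += static_cast<char>(cp);
    else if (cp < 0x800)
    {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    else if (cp < 0x10000)
    {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    else
    {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Convert a wide string to UTF-8. sizeof(wchar_t) is a compile-time constant,
// so the surrogate handling is only active where wchar_t holds UTF-16.
inline std::string wideToUtf8(const std::wstring& in)
{
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i)
    {
        std::uint32_t cp = static_cast<std::uint32_t>(in[i]);
        if (sizeof(wchar_t) == 2 && cp >= 0xD800 && cp <= 0xDBFF && i + 1 < in.size())
        {
            // UTF-16: combine a high/low surrogate pair into one code point
            const std::uint32_t low = static_cast<std::uint32_t>(in[i + 1]);
            if (low >= 0xDC00 && low <= 0xDFFF)
            {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (low - 0xDC00);
                ++i;
            }
        }
        appendUtf8(cp, out);
    }
    return out;
}

// Usage example: U+00E4 needs two bytes in UTF-8.
inline void demoWideToUtf8()
{
    assert(wideToUtf8(L"abc") == "abc");
    assert(wideToUtf8(L"\u00E4") == "\xC3\xA4");
}
}
```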

Note: User-defined string classes are supported implicitly if they fulfill the following string concept by defining:

  1. A typedef named value_type for the underlying character type: must be char or wchar_t
  2. A member function c_str() returning something that can be converted into a const value_type*
  3. A member function length() returning the number of characters returned by c_str()

std::string  elem1 = "elemento1";
std::wstring elem2 = L"элемент2";
wxString     elem3 = L"要素3";
MyString     elem4 = L"στοιχείο4";

zen::XmlOut out(doc);

out["string"]    (elem1);
out["wstring"]   (elem2);
out["wxString"]  (elem3);
out["MyString"]  (elem4);
out["char[6]"]   ("elem5");
out["wchar_t[4]"](L"元素6");
out["wchar_t"]   (L'元');
out["char"]      ('z');

The resulting XML:

<?xml version="1.0" encoding="UTF-8"?>
<Root>
    <string>elemento1</string>
    <wstring>элемент2</wstring>
    <wxString>要素3</wxString>
    <MyString>στοιχείο4</MyString>
    <char[6]>elem5</char[6]>
    <wchar_t[4]>元素6</wchar_t[4]>
    <wchar_t>元</wchar_t>
    <char>z</char>
</Root>
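The three requirements of the string concept can be checked at compile time. The following sketch shows one way to do this with SFINAE; it is not taken from zenXML. IsStringClass, copyOf() and the MyString class shown here are hypothetical, and the sketch assumes that length() returns something convertible to std::size_t.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>
#include <utility>
#include <vector>

namespace sketch
{
// Primary template: types are not string classes unless proven otherwise.
template <class S, class = void>
struct IsStringClass : std::false_type {};

// Chosen only if value_type, c_str() and length() exist with suitable types.
template <class S>
struct IsStringClass<S, typename std::enable_if<
    (std::is_same<typename S::value_type, char>::value ||
     std::is_same<typename S::value_type, wchar_t>::value) &&
    std::is_convertible<decltype(std::declval<const S&>().c_str()),
                        const typename S::value_type*>::value &&
    std::is_convertible<decltype(std::declval<const S&>().length()),
                        std::size_t>::value>::type>
    : std::true_type {};

// Hypothetical user-defined string class that fulfills the concept.
class MyString
{
public:
    typedef wchar_t value_type;
    MyString(const wchar_t* s) : data_(s) {}
    const wchar_t* c_str() const { return data_.c_str(); }
    std::size_t length() const { return data_.length(); }
private:
    std::wstring data_;
};

// Generic access that relies only on the three members of the concept.
template <class S>
std::basic_string<typename S::value_type> copyOf(const S& s)
{
    static_assert(IsStringClass<S>::value, "S does not fulfill the string concept");
    const typename S::value_type* first = s.c_str();
    return std::basic_string<typename S::value_type>(first, s.length());
}

// Usage example.
inline void demoStringConcept()
{
    assert(IsStringClass<MyString>::value);
    assert(!IsStringClass<int>::value);
    assert(copyOf(MyString(L"abc")) == L"abc");
}
}
```

A container such as std::vector<char> has a matching value_type but no c_str(), so the trait reports false for it.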

Automatic conversion for STL container types

std::deque   <float>         testDeque;
std::list    <size_t>        testList;
std::map     <double, char>  testMap;
std::multimap<short, double> testMultiMap;
std::set     <int>           testSet;
std::multiset<std::string>   testMultiSet;
std::vector  <wchar_t>       testVector;
std::vector  <std::list<wchar_t>> testVectorList;
std::pair    <char, wchar_t> testPair;

/* fill container */

zen::XmlOut out(doc);

out["deque"]    (testDeque);
out["list"]     (testList);
out["map"]      (testMap);
out["multimap"] (testMultiMap);
out["set"]      (testSet);
out["multiset"] (testMultiSet);
out["vector"]   (testVector);
out["vect_list"](testVectorList);
out["pair" ]    (testPair);

The resulting XML:

<?xml version="1.0" encoding="UTF-8"?>
<Root>
    <deque>
        <Item>1.234000</Item>
        <Item>5.678000</Item>
    </deque>
    <list>
        <Item>1</Item>
        <Item>2</Item>
    </list>
    <map>
        <Item>
            <1st>1.100000</1st>
            <2nd>a</2nd>
        </Item>
        <Item>
            <1st>2.200000</1st>
            <2nd>b</2nd>
        </Item>
    </map>
    <multimap>
        <Item>
            <1st>3</1st>
            <2nd>99.000000</2nd>
        </Item>
        <Item>
            <1st>3</1st>
            <2nd>100.000000</2nd>
        </Item>
        <Item>
            <1st>4</1st>
            <2nd>101.000000</2nd>
        </Item>
    </multimap>
    <set>
        <Item>1</Item>
        <Item>2</Item>
    </set>
    <multiset>
        <Item>1</Item>
        <Item>1</Item>
        <Item>2</Item>
    </multiset>
    <vector>
        <Item>Ä</Item>
        <Item>Ö</Item>
    </vector>
    <vect_list>
        <Item>
            <Item>ä</Item>
            <Item>ö</Item>
            <Item>ü</Item>
        </Item>
        <Item>
            <Item>ä</Item>
            <Item>ö</Item>
            <Item>ü</Item>
        </Item>
    </vect_list>
    <pair>
        <1st>a</1st>
        <2nd>â</2nd>
    </pair>
</Root>
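The nested <Item> structure above suggests a recursive mapping: each container entry becomes an <Item> child, and each pair becomes a <1st>/<2nd> couple. The sketch below reproduces that shape with a miniature element tree. Element, write() and demoContainers() are hypothetical and only int leaves are supported; how zenXML implements this internally is not stated in this document.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

namespace sketch
{
// Hypothetical miniature element tree (zenXML's XmlElement is richer).
struct Element
{
    std::string name;
    std::string text;
    std::vector<Element> children;
};

// Leaf case: an integer becomes the element's text.
inline void write(int value, Element& out) { out.text = std::to_string(value); }

// Pair case: two children named "1st" and "2nd".
template <class A, class B>
void write(const std::pair<A, B>& value, Element& out)
{
    out.children.push_back(Element{"1st", "", {}});
    write(value.first, out.children.back());
    out.children.push_back(Element{"2nd", "", {}});
    write(value.second, out.children.back());
}

// Container case: enabled only for types that have begin().
// One "Item" child is created per entry; nested containers recurse.
template <class Container>
auto write(const Container& c, Element& out) -> decltype(c.begin(), void())
{
    for (const auto& item : c)
    {
        out.children.push_back(Element{"Item", "", {}});
        write(item, out.children.back());
    }
}

// Usage example: a map whose values are containers themselves.
inline void demoContainers()
{
    std::map<int, std::vector<int>> data;
    data[7] = {1, 2};
    Element root{"map", "", {}};
    write(data, root);
    assert(root.children.size() == 1);                         // one Item per map entry
    assert(root.children[0].children[0].name == "1st");
    assert(root.children[0].children[1].children.size() == 2); // 2nd holds two Items
}
}
```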

Support for user defined types

User types can be integrated into zenXML by providing specializations of zen::readText() and zen::writeText() or zen::readValue() and zen::writeValue(). The first pair should be used for all non-structured types that can be represented as a simple text string. This specialization is then used to convert the type to XML elements and XML attributes. The second pair should be specialized for structured types that require an XML representation as a hierarchy of elements. This specialization is used when converting the type to XML elements only.

See section Type Safety for a discussion of type categories.

Example: Specialization for an enum type

enum UnitTime
{
    UNIT_SECOND,
    UNIT_MINUTE,
    UNIT_HOUR
};

namespace zen
{
template <> inline
void writeText(const UnitTime& value, std::string& output)
{
    switch (value)
    {
        case UNIT_SECOND: output = "second"; break;
        case UNIT_MINUTE: output = "minute"; break;
        case UNIT_HOUR:   output = "hour"  ; break;
    }
}

template <> inline
bool readText(const std::string& input, UnitTime& value)
{
    std::string tmp = input;
    zen::trim(tmp);
    if (tmp == "second")
        value = UNIT_SECOND;
    else if (tmp == "minute")
        value = UNIT_MINUTE;
    else if (tmp == "hour")
        value = UNIT_HOUR;
    else
        return false;
    return true;
}
}

Example: Brute-force specialization for an enum type

namespace zen
{
template <> inline
void writeText(const EnumType& value, std::string& output)
{
    output = zen::toString<std::string>(value); //treat enum as an integer
}

template <> inline
bool readText(const std::string& input, EnumType& value)
{
    value = static_cast<EnumType>(zen::toNumber<int>(input)); //treat enum as an integer
    return true;
}
}

Example: Specialization for a structured user type

struct Config
{
    int a;
    std::wstring b;
};

namespace zen
{
template <> inline
void writeValue(const Config& value, XmlElement& output)
{
    XmlOut out(output);
    out["number" ](value.a);
    out["address"](value.b);
}

template <> inline
bool readValue(const XmlElement& input, Config& value)
{
    XmlIn in(input);
    bool rv1 = in["number" ](value.a);
    bool rv2 = in["address"](value.b);
    return rv1 && rv2;
}
}

int main()
{
    Config cfg;
    cfg.a = 2;
     ...
    std::vector<Config> cfgList;
    cfgList.push_back(cfg);

    zen::XmlDoc doc;
    zen::XmlOut out(doc);
    out["config"](cfgList);
    save(doc, "file.xml"); //throw XmlFileError
}

The resulting XML:

<?xml version="1.0" encoding="UTF-8"?>
<Root>
    <config>
        <Item>
            <number>2</number>
            <address>Abc 3</address>
        </Item>
    </config>
</Root>

Structured user types

Although it is possible to enable conversion of structured user types by specializing zen::readValue() and zen::writeValue() (see Support for user defined types), this approach has one drawback: if a mapping error occurs when converting an XML element to structured user data, such as a missing child element, the input proxy class zen::XmlIn can only detect that the whole conversion failed. It cannot say which child elements in particular failed to convert.

Therefore it may be appropriate to convert structured types by calling subroutines in order to enable fine-grained logging:

void readConfig(const zen::XmlIn& in, Config& cfg)
{
    in["number" ](cfg.a); //failed conversion will now be logged for each single item by XmlIn
    in["address"](cfg.b); //instead of once for the complete Config type!
}


void readConfig(const wxString& filename, Config& cfg)
{
    zen::XmlDoc doc; //empty XML document

    try
    {
        load(filename, doc); //throw XmlFileError, XmlParsingError
    }
    catch (const zen::XmlError& e) { /* handle error */ }

    zen::XmlIn in(doc); 
 
    zen::XmlIn inConfig = in["config"]; //get input proxy for child element "config"
  
    readConfig(inConfig, cfg); //map child element to user data by calling subroutine

    //check for mapping errors: errors occurring in subroutines are considered, too!
    if (in.errorsOccured())
    {
        /* show mapping errors */
    }
}
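One way to obtain the behavior described here, where errors raised through a child proxy are visible on the parent proxy, is to let all proxies share a single error log. The sketch below illustrates that design under explicit assumptions: the class In, its flat "path -> text" storage and the errors() accessor are hypothetical and are not zenXML's implementation. Only the name errorsOccured() is borrowed from the examples above.

```cpp
#include <cassert>
#include <exception>
#include <map>
#include <memory>
#include <string>
#include <vector>

namespace sketch
{
// Hypothetical miniature input proxy.
class In
{
public:
    typedef std::map<std::string, std::string> Data; // flat "path -> text" store

    explicit In(const Data& data)
        : data_(&data), log_(std::make_shared<std::vector<std::string>>()) {}

    // Child proxy: extends the path and shares the parent's error log.
    In operator[](const std::string& name) const
    {
        return In(*this, path_.empty() ? name : path_ + "/" + name);
    }

    // Map the element's text to an int; log the element path on failure.
    bool operator()(int& value) const
    {
        const Data::const_iterator it = data_->find(path_);
        if (it != data_->end())
        {
            try { value = std::stoi(it->second); return true; }
            catch (const std::exception&) {} // not a number: fall through to logging
        }
        log_->push_back(path_);
        return false;
    }

    bool errorsOccured() const { return !log_->empty(); }
    const std::vector<std::string>& errors() const { return *log_; }

private:
    In(const In& parent, const std::string& path)
        : data_(parent.data_), path_(path), log_(parent.log_) {}

    const Data* data_;
    std::string path_;
    std::shared_ptr<std::vector<std::string>> log_;
};

// Usage example: the failure is raised via the child but seen on the parent.
inline void demoSharedLog()
{
    In::Data data;
    data["config/number"] = "2";
    In in(data);
    In inConfig = in["config"];
    int number = 0, missing = 0;
    assert(inConfig["number"](number) && number == 2);
    assert(!inConfig["missing"](missing));
    assert(in.errorsOccured() && in.errors()[0] == "config/missing");
}
}
```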

Type Safety

zenXML makes heavy use of compile-time introspection to free the user from managing basic type conversions by hand. It is therefore important to find the right balance between automatic conversions and type safety so that program correctness is not compromised. In the context of XML processing three fundamental type categories can be recognized: string-like types, to-string-convertible types, and structured types.

These categories can be seen as a sequence of inclusive sets:

-----------------------------
| structured                |  Used as: XML element value                         Conversion via: readValue(), writeValue()
| ------------------------- |
| | to-string-convertible | |  Used as: XML element/attribute value               Conversion via: readText(), writeText()
| | ---------------       | |
| | | string-like |       | |  Used as: XML element/attribute value or XML name   Conversion via: stdStringTo(), toStdString()
| | ---------------       | |
| ------------------------- |
-----------------------------
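The nesting shown in the diagram can be expressed as type traits in which each outer category includes the inner one. The traits below are a hypothetical sketch for illustration: IsStringLike, IsTextConvertible and checkAttributeValue() are not zenXML names, and the list of string-like types is deliberately incomplete.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <vector>

namespace sketch
{
// Innermost set: usable as XML name as well as element or attribute value.
template <class T> struct IsStringLike : std::false_type {};
template <> struct IsStringLike<std::string>  : std::true_type {};
template <> struct IsStringLike<std::wstring> : std::true_type {};
template <> struct IsStringLike<char>         : std::true_type {};
template <> struct IsStringLike<wchar_t>      : std::true_type {};
template <> struct IsStringLike<const char*>  : std::true_type {};

// Middle set: everything string-like plus all arithmetic types.
template <class T>
struct IsTextConvertible
    : std::integral_constant<bool, IsStringLike<T>::value || std::is_arithmetic<T>::value> {};

// Guard for attribute values: rejects structured types during compilation.
template <class T>
void checkAttributeValue(const T&)
{
    static_assert(IsTextConvertible<T>::value,
                  "attribute values must be to-string-convertible");
}

// Usage example.
inline void demoCategories()
{
    checkAttributeValue(1234);              // compiles: int is to-string-convertible
    checkAttributeValue(std::string("x"));  // compiles: string-like is included
    assert(!IsStringLike<int>::value);      // an int cannot serve as an XML name
    assert(!IsTextConvertible<std::vector<int>>::value); // structured types only
}
}
```

Passing a std::vector<int> to checkAttributeValue() would stop compilation at the static_assert, which is the kind of early failure the type categories are meant to provide.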

A practical implication of this design is that conversions that do not make sense in a particular context simply lead to compile-time errors:

zen::XmlOut out(doc);
out[L'Z'](someValue); //fine: a wchar_t is acceptable as an element name
out[1234](someValue); //compiler error: an integer is NOT "string-like"!


int valInt = 0;
std::vector<int> valVec;

zen::XmlOut out(doc);
out["elem1"](valInt); //fine: both valInt and valVec can be converted to an XML element
out["elem2"](valVec); //

out["elem"].attribute("attr1", valInt); //fine: an integer can be converted to an XML attribute
out["elem"].attribute("attr2", valVec); //compiler error: a std::vector<int> is NOT "to-string-convertible"!

Author:
ZenJu

Email: zhnmju123 AT gmx DOT de