rwsf::XmlReader (derives from rwsf::HandleBase)
#include <rwsf/core/XmlReader.h>
Class rwsf::XmlReader implements the handle/body idiom in which rwsf::XmlReaderImp is the body and rwsf::XmlReader is the handle.
rwsf::XmlReader is a simple XML pull-parser. The XML document is typically parsed element by element using readElement(), or by iteratively calling readElementStart(), readElementValue(), and readElementEnd(). On each read, this object sets its internal state with information about what was just read. Member functions getLastNodeType(), getLastName(), and getLastContent() can then be used to retrieve portions of the rwsf::XmlReader's state.
rwsf::XmlReader throws an exception of type rwsf::XmlParseException when it encounters XML that is not well-formed. The rwsf::XmlParseException exception contains a description of the error and the line and column number of the source document where the error occurred.
rwsf::XmlReader can parse documents in the encodings UTF-8, UTF-16(BE), UTF-16LE, US-ASCII, and ISO-8859-1. In addition, if the rwsf_icu library is present, rwsf::XmlReader will also convert from any character encodings supported by the ICU.
For more information on how HydraSCA performs conversions and how to create custom conversions, see Chapter 20, "Internationalizing Your Services," in the HydraSCA Express Web Service Development Guide.
Currently, rwsf::XmlReader provides support only for reading elements and their content. No support for reading processing instructions, DOCTYPE declarations, or entity declarations is provided.
Note
rwsf::XmlReader converts all documents to UTF-8 regardless of the encoding of the source document.
XmlReader(const char * buf, size_t length);
Constructs a reader from the document pointed to by buf, which is length bytes long. Parses the prolog of the document if found, and determines the document encoding from the encoding= specifier in the optional XML declaration and from a guess based on the first few bytes of the document. Upon construction, the reader is positioned before the first tag in the document.
XmlReader(const unsigned char * buf, size_t length);
Constructs a reader from the document pointed to by buf, which is length bytes long. Parses the prolog of the document if found, and determines the document encoding from the encoding= specifier in the optional XML declaration and from a guess based on the first few bytes of the document. Upon construction, the reader is positioned before the first tag in the document.
XmlReader();
Default constructor. Constructs an invalid reader.
XmlReader(const std::string & document);
Convenience constructor for converting from a std::string. Constructs a reader from the document held in the string document. Parses the prolog of the document if found, and determines the document encoding from the encoding= specifier in the optional XML declaration and from a guess based on the first few bytes of the document. Upon construction, the reader is positioned before the first tag in the document.
~XmlReader();
Destructor.
void addNamespace(const rwsf::XmlNamespace & ns);
Adds ns to the list of namespaces known by the reader. This method is useful when parsing document fragments where namespaces are declared outside the scope of the fragment.
bool eof();
Returns true if at the end of the current document; false otherwise.
rwsf::XmlReader getElementReader(const rwsf::XmlReaderName & name = rwsf::XmlReaderName::Empty);
Returns an rwsf::XmlReader instance for the current element. The current reader will be moved past the end of the returned element.
std::string getEncoding() const;
Returns the name of the encoding of the original source document, either from the XML declaration's "encoding=" declaration, or as automatically sensed from the first few bytes of the XML document.
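As an illustration of what sensing an encoding from the first few bytes can look like, here is a minimal sketch. The function sniffEncoding is a hypothetical helper, not part of rwsf, and it is not a description of how rwsf::XmlReader detects encodings internally. It checks only for byte-order marks and for the byte pattern of a leading "<?" in UTF-16, and otherwise falls back to UTF-8, the default the XML specification assumes. UTF-32 and other encodings are deliberately ignored.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not an rwsf function): guesses an encoding name from
// the leading bytes of a document.
std::string sniffEncoding(const unsigned char* buf, std::size_t length)
{
    // UTF-8 byte-order mark: EF BB BF
    if (length >= 3 && buf[0] == 0xEF && buf[1] == 0xBB && buf[2] == 0xBF)
        return "UTF-8";
    if (length >= 2) {
        // UTF-16 byte-order marks
        if (buf[0] == 0xFE && buf[1] == 0xFF) return "UTF-16BE";
        if (buf[0] == 0xFF && buf[1] == 0xFE) return "UTF-16LE";
    }
    if (length >= 4) {
        // No BOM: "<?" in UTF-16 has a zero byte beside each ASCII character
        if (buf[0] == 0x00 && buf[1] == 0x3C && buf[2] == 0x00 && buf[3] == 0x3F)
            return "UTF-16BE";
        if (buf[0] == 0x3C && buf[1] == 0x00 && buf[2] == 0x3F && buf[3] == 0x00)
            return "UTF-16LE";
    }
    // Nothing recognized: fall back to the XML default
    return "UTF-8";
}
```

A guess of this kind is only a starting point; an explicit encoding= declaration, when present, carries more information than the leading bytes alone.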
bool getExpandAttributeReference() const;
Returns true if the reader is expanding entity references in attributes, false otherwise.
bool getExpandCommentReference() const;
Returns the value of ExpandCommentReference. A value of true expands references.
bool getExpandContentReference() const;
Returns the value of ExpandContentReference. A value of true expands references.
rwsf::XmlAttributeSet getLastAttributes() const;
Returns the set of attributes associated with the last node read of type rwsf::XmlReader::StartTag.
std::string getLastContent() const;
Returns the last content read, for nodes of type rwsf::XmlReader::Data. The content will be encoded in UTF-8, regardless of the encoding of the source document. This value is undefined if the last node read was not of type rwsf::XmlReader::Data.
rwsf::XmlName getLastName() const;
Returns the name of the last node read. This value is undefined if the last node read was of type rwsf::XmlReader::Data.
NodeType getLastNodeType() const;
Returns the type of the last node read. The following table summarizes each type of node:
rwsf::XmlReader::StartTag
An XML start tag.
Example: "<customer>".
rwsf::XmlReader::EndTag
An XML end tag.
Example: "</customer>".
rwsf::XmlReader::EmptyTag
An empty XML tag.
Example: "<customer/>".
rwsf::XmlReader::Data
Data that is the content of an element, not including any tags.
Example: "John Doe".
rwsf::XmlReader::Unknown
Set before the reader has read an element from the document.
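To make the node types above concrete, the following sketch shows a toy pull reader that records the type, name, and content of each node as it is read. ToyReader and its members are stand-ins written for this illustration; they borrow names from this page but are not the rwsf implementation. The sketch assumes well-formed input with no attributes, comments, or CDATA sections.

```cpp
#include <cstddef>
#include <string>

// Toy stand-in for illustration only; not the rwsf implementation.
enum class NodeType { Unknown, StartTag, EndTag, EmptyTag, Data };

struct ToyReader {
    std::string doc;
    std::size_t pos = 0;
    NodeType lastType = NodeType::Unknown;  // Unknown until the first read
    std::string lastName;                   // set when a tag is read
    std::string lastContent;                // set when Data is read

    bool eof() const { return pos >= doc.size(); }

    // Reads one node and records what was read.
    void readNextNode() {
        if (eof()) return;
        if (doc[pos] != '<') {
            // Everything up to the next tag is element content
            std::size_t end = doc.find('<', pos);
            if (end == std::string::npos) end = doc.size();
            lastContent = doc.substr(pos, end - pos);
            lastType = NodeType::Data;
            pos = end;
            return;
        }
        std::size_t close = doc.find('>', pos);
        if (close == std::string::npos) { pos = doc.size(); return; }  // malformed: stop
        std::string inner = doc.substr(pos + 1, close - pos - 1);
        pos = close + 1;
        if (!inner.empty() && inner.front() == '/') {
            lastType = NodeType::EndTag;          // </name>
            lastName = inner.substr(1);
        } else if (!inner.empty() && inner.back() == '/') {
            lastType = NodeType::EmptyTag;        // <name/>
            lastName = inner.substr(0, inner.size() - 1);
        } else {
            lastType = NodeType::StartTag;        // <name>
            lastName = inner;
        }
    }
};
```

Reading "<customer>John Doe</customer>" with this toy yields a StartTag, then Data, then an EndTag, which mirrors the sequence a caller inspects through getLastNodeType() after each read.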
std::string getPrefixForURI(const std::string & uri) const;
Looks up the provided uri in the current list of namespaces and returns the corresponding prefix. If the current list of namespaces does not contain the uri, returns the empty string.
std::string getStandalone() const;
Returns the value of the source document's "standalone=" declaration, if it exists.
std::string getURIForPrefix(const std::string & prefix) const;
Looks up the provided prefix in the current list of namespaces and returns the corresponding URI. If the current list of namespaces does not contain the prefix, returns the empty string.
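The two lookups, getPrefixForURI() and getURIForPrefix(), behave like searches over a list of prefix/URI pairs that return an empty string on a miss. The sketch below shows that contract with hypothetical free functions over a std::vector; NamespaceList, uriForPrefix, and prefixForURI are illustration names, not rwsf types. Searching from the most recently added entry, so that later declarations shadow earlier ones, is an assumption of this sketch and is not stated by this reference.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the reader's namespace list: (prefix, URI) pairs.
using NamespaceList = std::vector<std::pair<std::string, std::string>>;

// Returns the URI bound to prefix, or an empty string if none is found.
std::string uriForPrefix(const NamespaceList& namespaces, const std::string& prefix)
{
    // Assumption: search newest first so later declarations win
    for (auto it = namespaces.rbegin(); it != namespaces.rend(); ++it)
        if (it->first == prefix) return it->second;
    return std::string();
}

// Returns a prefix bound to uri, or an empty string if none is found.
std::string prefixForURI(const NamespaceList& namespaces, const std::string& uri)
{
    for (auto it = namespaces.rbegin(); it != namespaces.rend(); ++it)
        if (it->second == uri) return it->first;
    return std::string();
}
```

Note that an empty return value is ambiguous when the default namespace (empty prefix) is in play, so callers that care about that case need a separate check.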
std::string getVersion() const;
Returns the value of the source document's "version=" declaration, if it exists.
bool hasEncoding() const;
Returns true if the source XML document explicitly specified an encoding. Returns false if the document's encoding was automatically sensed from the first few bytes of the XML document.
bool hasStandalone() const;
Returns true if a "standalone=" declaration existed in the source document's XML declaration.
bool isElementNext(const rwsf::XmlName & name);
Returns true if the next element is the one given in name.
bool isElementNext(const std::string & name);
Returns true if the next element is the one given in name.
std::string readElement(const rwsf::XmlName & name = NullName);
Reads and returns the entire next element found in the XML document at the current depth. Skips past any content that may exist. If no element exists at the current depth, returns an empty string. The returned element includes the text of the starting and ending tags, along with the text of all content and child elements. The returned element will always be encoded in UTF-8, regardless of the encoding of the source document. If name is specified, the element's name must match name, otherwise throws an exception of type rwsf::XmlParseException.
std::string readElement(const std::string & name);
Reads and returns the entire next element found in the XML document at the current depth. Skips past any content that may exist. If no element exists at the current depth, returns an empty string. The returned element includes the text of the starting and ending tags, along with the text of all content and child elements. The returned element will always be encoded in UTF-8, regardless of the encoding of the source document. If name is specified, the element's name must match name, otherwise throws an exception of type rwsf::XmlParseException.
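The behavior described for readElement(), skipping content and returning a whole element including its tags and children, can be pictured as a scan that tracks tag depth. The sketch below is a simplified model of that idea. readWholeElement is a hypothetical helper, not the rwsf implementation, and it assumes well-formed input without comments, CDATA sections, processing instructions, or '>' characters inside attribute values. It also performs no name matching and no encoding conversion.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not an rwsf function). Starting at pos, skips any
// content and returns the text of the next element, tags and children
// included. Returns an empty string, leaving pos unchanged, if an end tag or
// the end of input comes first (no element remains at the current depth).
std::string readWholeElement(const std::string& doc, std::size_t& pos)
{
    std::size_t start = doc.find('<', pos);  // skip leading content
    if (start == std::string::npos || start + 1 >= doc.size() || doc[start + 1] == '/')
        return std::string();

    int depth = 0;
    std::size_t i = start;
    while (true) {
        std::size_t open = doc.find('<', i);
        if (open == std::string::npos) return std::string();   // unterminated element
        std::size_t close = doc.find('>', open);
        if (close == std::string::npos) return std::string();  // unterminated tag
        bool isEnd = doc[open + 1] == '/';     // </name>
        bool isEmpty = doc[close - 1] == '/';  // <name/>
        if (isEnd) --depth;
        else if (!isEmpty) ++depth;
        i = close + 1;
        if (depth == 0) break;  // matching end tag (or a lone empty tag) consumed
    }
    pos = i;
    return doc.substr(start, i - start);
}
```

Calling the helper repeatedly walks the sibling elements at one depth, and the empty return value signals that the enclosing element's end tag is next.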
void readElementEnd(const rwsf::XmlName & name);
Reads the next node in the document. If the node was not an end tag matching name, throws an exception of type rwsf::XmlParseException.
void readElementEnd();
Reads the next node in the document. If the node was not an end tag, throws an exception of type rwsf::XmlParseException.
rwsf::XmlAttributeSet readElementStart(const rwsf::XmlName & name);
Reads the next node in the document. If the node is not a start tag, or the node's name does not match name, throws an exception of type rwsf::XmlParseException. Returns any attributes found inside the tag.
void readElementStart();
Reads the next node in the document. If the node was not a start tag, throws an exception of type rwsf::XmlParseException.
std::string readElementValue();
Reads and returns the next content from the document.
void readNextNode();
Reads the next start tag, empty tag, end tag, or content from the document. Use getLastNodeType(), getLastName(), and getLastContent() to retrieve information about what was read. If a well-formedness error is encountered while reading the document, an exception of type rwsf::XmlParseException is thrown.
This method is not typically used directly. It is used by other methods such as readElementStart(), readElementValue(), and so on.
std::string readWellFormedElement(const rwsf::XmlName & name = NullName);
Reads and returns the entire next element found in the XML document at the current depth. Skips past any content that may exist. If no element exists at the current depth, returns an empty string. The returned element includes the text of the starting and ending tags, along with the text of all content and child elements. The returned element will always be encoded in UTF-8, regardless of the encoding of the source document. If name is specified, the element's name must match name, otherwise throws an exception of type rwsf::XmlParseException.
This method will also add any necessary namespaces to the element in order for it to be well-formed.
void setExpandAttributeReference(bool expandReference);
Sets whether the reader expands entity references in attributes. For example, when expandReference is true, the reader converts the attribute value
3 &lt; 4
to
3 < 4.
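Expanding a reference means replacing an escaped form such as &lt; with the character it stands for. The sketch below shows that transformation for the five entities predefined by XML. expandPredefinedEntities is a hypothetical helper, not the rwsf implementation; it does not handle numeric character references or entities declared in a DTD, and it leaves unrecognized references untouched.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not an rwsf function): expands the five predefined
// XML entity references in a single left-to-right pass.
std::string expandPredefinedEntities(const std::string& in)
{
    static const struct { const char* ref; char ch; } table[] = {
        {"&lt;", '<'}, {"&gt;", '>'}, {"&amp;", '&'}, {"&quot;", '"'}, {"&apos;", '\''}
    };
    std::string out;
    std::size_t i = 0;
    while (i < in.size()) {
        bool matched = false;
        if (in[i] == '&') {
            for (const auto& e : table) {
                const std::string ref(e.ref);
                if (in.compare(i, ref.size(), ref) == 0) {
                    out += e.ch;       // emit the character the reference names
                    i += ref.size();   // and skip past the reference
                    matched = true;
                    break;
                }
            }
        }
        if (!matched) out += in[i++];  // ordinary character: copy through
    }
    return out;
}
```

Because the pass never rescans its own output, the input "&amp;lt;" becomes "&lt;" rather than "<", which is the behavior XML requires for escaped ampersands.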
void setExpandCommentReference(bool expandComment);
Sets whether references in comments are expanded. A value of true expands references.
void setExpandContentReference(bool expandReference);
Sets whether references in content are expanded. A value of true expands references.
rwsf::XmlReaderImp & body() const;
Reimplements method in rwsf::HandleBase
Base class documentation:
Gets a reference for the body instance, if any; otherwise, throws an rwsf::Exception exception.
© Copyright Rogue Wave Software, Inc. All Rights Reserved. Rogue Wave is a registered trademark of Rogue Wave Software, Inc. in the United States and other countries. HydraExpress is a trademark of Rogue Wave Software, Inc. All other trademarks are the property of their respective owners.