Reading XML into a tree of XML nodes is typically done like this:

  import Orchard.XML

  document = Orchard.XML.load('file.xml')

XML node constructors are one way to create XML nodes in code. There will also be modules that provide convenience methods for creating nodes. This is an example of using node constructors to build an XML tree:

  import Orchard.XML

  document = Orchard.XML.Document()
  element = Orchard.XML.Element( name='my_element' )
  document.contents.append(element)
  element.attributes['an_attr'] = Orchard.XML.Attribute()
  characters = Orchard.XML.Characters( data='some chars' )
  element.contents.append(characters)

Using Orchard.XML in event-stream mode is described below.

Orchard.XML

This module provides basic functions for reading XML documents and building trees, along with the node objects used to access and work with them.

load(source)
This function reads an XML document from source and returns the document object. source can be a URI string, a file name, an XML string, or a file object.

Basic XML Nodes

Document
root The root element of the document.
contents An array containing any whitespace or comment nodes coming before or after the root element.

Element
name The element type name (including prefix).
namespace_uri The namespace of this element.
prefix The namespace prefix used on this element.
local_name The local name of this element.
attributes A mapping containing the attributes of this element, as Attribute objects. XML Namespace declarations are included, if any. See note below on Attribute indexes.
contents An array containing the contents of the element, which may be any type of node.

The attributes mapping is indexed both by name and by (namespace_uri, local_name). When assigning to this mapping, the prefix, local_name, and namespace_uri of the attribute override the index used to make the assignment.

Mixing name and (namespace_uri, local_name) accesses can produce undefined results when prefix properties are modified on Attribute objects. It is recommended to use name only for unqualified (non-namespace) attributes and to use (namespace_uri, local_name) for all namespace-qualified attributes.

Users may assign an array to the attributes property; the mapping will be created from the Attribute objects in the array.
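
The indexing rules above can be sketched with a small stand-in. This is not Orchard's implementation: Attribute and AttributeMap below are hypothetical classes written only to illustrate a mapping that is indexed two ways, ignores the key used in an assignment, and can be built from an array of attributes.

```python
class Attribute:
    """Hypothetical stand-in for Orchard.XML.Attribute (fields from the text above)."""

    def __init__(self, local_name, value='', namespace_uri=None, prefix=None):
        self.local_name = local_name
        self.value = value
        self.namespace_uri = namespace_uri
        self.prefix = prefix

    @property
    def name(self):
        # The qualified name includes the prefix when one is set.
        if self.prefix:
            return self.prefix + ':' + self.local_name
        return self.local_name


class AttributeMap:
    """Toy mapping indexed by name and by (namespace_uri, local_name)."""

    def __init__(self, attributes=()):
        self._by_name = {}
        self._by_ns_name = {}
        for attribute in attributes:
            self._store(attribute)

    def _store(self, attribute):
        self._by_name[attribute.name] = attribute
        self._by_ns_name[(attribute.namespace_uri, attribute.local_name)] = attribute

    def __setitem__(self, key, attribute):
        # The key is ignored on purpose: the attribute's own properties
        # decide where it is filed, as described above.
        self._store(attribute)

    def __getitem__(self, key):
        if isinstance(key, tuple):
            return self._by_ns_name[key]
        return self._by_name[key]

    def __len__(self):
        return len(self._by_ns_name)


XLINK = 'http://www.w3.org/1999/xlink'

# Build the mapping from an array, then assign under an unrelated key.
attributes = AttributeMap([Attribute('id', 'a1')])
attributes['ignored_key'] = Attribute('href', '#top', namespace_uri=XLINK, prefix='xlink')

plain = attributes['id']
linked = attributes[(XLINK, 'href')]
same_object = attributes['xlink:href'] is linked
```

In this toy version, changing linked.prefix afterwards would leave the name index stale ('xlink:href' would still resolve, the new name would not), which is the kind of inconsistency the recommendation above is meant to avoid.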

Attribute
name The attribute name (including prefix).
namespace_uri The namespace of this attribute.
prefix The namespace prefix used on this attribute.
local_name The local name of this attribute.
value The normalized value of the attribute.

Characters
data The characters from the XML document.

Orchard.Parsers.XML

Orchard.Parsers.XML provides an event-stream interface to an XML parser. The interface is based on SAX, but it passes Orchard.XML nodes as event parameters.

There are two basic interfaces for using the parser: the parser interface and the handler interface. The parser interface creates new parser instances, starts parsing, and provides additional information to handlers on request. The handler interface receives parse events from the parser. This pattern is also commonly called "Producer and Consumer" or "Generator and Sink".

The parser is typically used like this, where uri identifies the document to parse:

    import Orchard.Parsers.SAX
    handler = MyHandler()
    parser = Orchard.Parsers.SAX.SAX(handler=handler)
    parser.parse(uri)

Handlers are typically written like this:

    class MyHandler:
        # Each method receives the Orchard.XML node for the event.
        def startElement(self, element):
            print("Starting element " + element.name)

        def endElement(self, element):
            print("Ending element " + element.name)

        def characters(self, characters):
            print("characters: " + characters.data)

Basic Orchard XML Parser

An application must not invoke parse() again while a parse is in progress; it should create a new parser for each nested XML document instead. Once a parse is complete, an application may reuse the same parser object, possibly with a different input source.

During the parse, the parser will provide information about the XML document through the registered event handlers.

parse(source [, options])
Parses the XML instance identified by source. source can be a URI string, a file name, an XML string, or a file object. options are given as keyword arguments and include handler, features and properties, and advanced parser options. parse() returns the result of calling the endDocument() handler.

handler
The default handler object to receive all events from the parser. Applications may change handler in the middle of the parse and the parser will begin using the new handler immediately.

Basic XML Handler

These are the methods most commonly implemented by handlers.

startDocument(document)
Receive notification of the beginning of a document.

The parser will invoke this method only once, before any other methods (except for setDocumentLocator() in advanced SAX handlers).

endDocument(document)
Receive notification of the end of a document.

The parser will invoke this method only once, and it will be the last method invoked during the parse. The parser shall not invoke this method until it has either abandoned parsing (because of an unrecoverable error) or reached the end of input.

The return value of endDocument() is returned by the parser's parse() method.
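
The following sketch illustrates how a handler can use this return value to hand a result back to the caller. It does not use Orchard itself: Node and toy_parse are hypothetical stand-ins that only mimic the event order described in this section for a flat list of empty elements.

```python
class Node:
    """Hypothetical stand-in for an Orchard.XML node; only `name` is modeled."""

    def __init__(self, name):
        self.name = name


class CountingHandler:
    """Counts elements and reports the total from endDocument()."""

    def startDocument(self, document):
        self.count = 0

    def startElement(self, element):
        self.count += 1

    def endElement(self, element):
        pass

    def endDocument(self, document):
        # Whatever is returned here becomes the result of the parse call.
        return self.count


def toy_parse(element_names, handler):
    """Stand-in for parse(): fires events for a flat list of empty elements."""
    document = Node('#document')
    handler.startDocument(document)
    for name in element_names:
        element = Node(name)
        handler.startElement(element)
        handler.endElement(element)
    return handler.endDocument(document)


total = toy_parse(['a', 'b', 'c'], CountingHandler())
print(total)
```

Returning the result from endDocument() keeps the calling code free of any knowledge about the handler's internal state.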

startElement(element)
Receive notification of the start of an element.

The parser will invoke this method at the beginning of every element in the XML document; there will be a corresponding endElement() event for every startElement() event (even when the element is empty). All of the element's content will be reported, in order, before the corresponding endElement() event.

endElement(element)
Receive notification of the end of an element.

The parser will invoke this method at the end of every element in the XML document; there will be a corresponding startElement() event for every endElement() event (even when the element is empty).
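
The pairing of start and end events can be observed with any SAX-style parser. The sketch below uses Python's standard xml.sax module rather than Orchard.Parsers.XML, so the handler signatures differ (names and attribute collections instead of Orchard.XML nodes); NestingTracer is a name chosen for this illustration only.

```python
import xml.sax


class NestingTracer(xml.sax.ContentHandler):
    """Records an indented trace and checks that events are properly paired."""

    def __init__(self):
        super().__init__()
        self.open_elements = []
        self.trace = []

    def startElement(self, name, attrs):
        self.trace.append('  ' * len(self.open_elements) + 'start ' + name)
        self.open_elements.append(name)

    def endElement(self, name):
        # Every end event must close the most recently opened element.
        opened = self.open_elements.pop()
        if opened != name:
            raise ValueError('mismatched events: %s / %s' % (opened, name))
        self.trace.append('  ' * len(self.open_elements) + 'end ' + name)


tracer = NestingTracer()
# The first <item/> is empty but still produces a start/end pair.
xml.sax.parseString(b'<list><item/><item>x</item></list>', tracer)
print('\n'.join(tracer.trace))
```

A stack of open elements is a common way for handlers to keep track of context, since the events themselves carry no parent information in this style of interface.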

characters(characters)
Receive notification of character data.

The parser will call this method to report each chunk of character data. Parsers may return all contiguous character data in a single chunk, or they may split it into several chunks (however, all of the characters in any single event must come from the same external entity so that the Locator provides useful information).
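
Because character data may arrive in several chunks, a handler should not assume that one characters() call holds the complete text of an element. A minimal sketch of the usual remedy, buffering chunks until the element ends, is shown below. It uses Python's standard xml.sax module as a stand-in for Orchard.Parsers.XML, so characters() receives a string rather than a Characters node, and TextCollector is a hypothetical name. The sketch only handles leaf elements, since the buffer is reset at every start tag.

```python
import xml.sax


class TextCollector(xml.sax.ContentHandler):
    """Joins character chunks into one string per leaf element."""

    def __init__(self):
        super().__init__()
        self._chunks = []
        self.texts = []

    def startElement(self, name, attrs):
        # Start a fresh buffer for this element.
        self._chunks = []

    def characters(self, content):
        # May be called several times for one run of text.
        self._chunks.append(content)

    def endElement(self, name):
        self.texts.append((name, ''.join(self._chunks)))
        self._chunks = []


collector = TextCollector()
# The entity reference makes it likely that the text arrives in pieces.
xml.sax.parseString(b'<greeting>Hello, &amp; welcome</greeting>', collector)
print(collector.texts)
```

Joining the chunks at the end event gives the same result regardless of how the parser chose to split the data.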