iPhone SDK 3 Programming : XML Processing - Simple API for XML (SAX)

11/13/2013 6:52:48 PM

In some applications, the XML document may be too large to load in full given the limited memory of the device. The Simple API for XML (SAX) is an XML parsing model that differs from DOM: instead of building a tree of the whole document, the parser reads the document sequentially and reports events as it encounters them. In SAX, you configure the parser with callback functions. The SAX parser uses these function pointers to call your functions, informing you of important events. For example, if you are interested in the event Start of Document, you set up a function for this event and give the parser a pointer to it.

Listing 1 shows the remainder of the fetchAbsconders method pertaining to SAX parsing.

Example 1. SAX XML Parsing. Remainder of fetchAbsconders method.
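    else if(parser == XML_PARSER_SAX){
        xmlParserCtxtPtr ctxt =
            xmlCreateDocParserCtxt((const xmlChar *)XMLChars);
        // self is passed as user data and arrives in each callback as ctx.
        int parseResult =
            xmlSAXUserParseMemory(&rssSAXHandler, self, XMLChars,
                                  (int)strlen(XMLChars));
        if(parseResult != 0){
            // A nonzero result is an error number reported by the parser.
            NSLog(@"SAX parsing failed with code %d", parseResult);
        }
        xmlFreeParserCtxt(ctxt);
        xmlCleanupParser();
    }
    [pool release];
}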

To use SAX in libxml2, you first set up a parser context using the function xmlCreateDocParserCtxt(), which takes a single parameter: the XML document represented as a C-string. After that, you start the SAX parser by calling the xmlSAXUserParseMemory() function. The function is declared in parser.h as:

int xmlSAXUserParseMemory (xmlSAXHandlerPtr sax, void * user_data,
const char * buffer, int size)

This function parses an in-memory buffer and calls your registered functions as necessary. The first parameter to this function is a pointer to the SAX handler. The SAX handler is a structure holding the pointers to your callback functions. The second parameter is an optional pointer that is application-specific. The value specified will be used as the context when the SAX parser calls your callback functions. The third and fourth parameters are used for the C-string XML document in memory and its length, respectively.

The SAX handler is where you store the pointers to your callback functions. If you are not interested in an event type, just store a NULL value in its field. The following is the definition of the structure in parser.h:

struct _xmlSAXHandler {
    internalSubsetSAXFunc internalSubset;
    isStandaloneSAXFunc isStandalone;
    hasInternalSubsetSAXFunc hasInternalSubset;
    hasExternalSubsetSAXFunc hasExternalSubset;
    resolveEntitySAXFunc resolveEntity;
    getEntitySAXFunc getEntity;
    entityDeclSAXFunc entityDecl;
    notationDeclSAXFunc notationDecl;
    attributeDeclSAXFunc attributeDecl;
    elementDeclSAXFunc elementDecl;
    unparsedEntityDeclSAXFunc unparsedEntityDecl;
    setDocumentLocatorSAXFunc setDocumentLocator;
    startDocumentSAXFunc startDocument;
    endDocumentSAXFunc endDocument;
    startElementSAXFunc startElement;
    endElementSAXFunc endElement;
    referenceSAXFunc reference;
    charactersSAXFunc characters;
    ignorableWhitespaceSAXFunc ignorableWhitespace;
    processingInstructionSAXFunc processingInstruction;
    commentSAXFunc comment;
    warningSAXFunc warning;
    errorSAXFunc error;
    fatalErrorSAXFunc fatalError;
    getParameterEntitySAXFunc getParameterEntity;
    cdataBlockSAXFunc cdataBlock;
    externalSubsetSAXFunc externalSubset;
    unsigned int initialized;
    // The following fields are extensions
    void * _private;
    startElementNsSAX2Func startElementNs;
    endElementNsSAX2Func endElementNs;
    xmlStructuredErrorFunc serror;
};

Listing 2 shows our SAX handler.

Example 2. Our SAX handler.
static xmlSAXHandler rssSAXHandler = {
    NULL, /* internalSubset */
    NULL, /* isStandalone */
    NULL, /* hasInternalSubset */
    NULL, /* hasExternalSubset */
    NULL, /* resolveEntity */
    NULL, /* getEntity */
    NULL, /* entityDecl */
    NULL, /* notationDecl */
    NULL, /* attributeDecl */
    NULL, /* elementDecl */
    NULL, /* unparsedEntityDecl */
    NULL, /* setDocumentLocator */
    NULL, /* startDocument */
    NULL, /* endDocument */
    NULL, /* startElement */
    NULL, /* endElement */
    NULL, /* reference */
    charactersFoundSAX, /* characters */
    NULL, /* ignorableWhitespace */
    NULL, /* processingInstruction */
    NULL, /* comment */
    NULL, /* warning */
    errorEncounteredSAX, /* error */
    fatalErrorEncounteredSAX, /* fatalError */
    NULL, /* getParameterEntity */
    NULL, /* cdataBlock */
    NULL, /* externalSubset */
    XML_SAX2_MAGIC, /* initialized */
    NULL, /* _private */
    startElementSAX, /* startElementNs */
    endElementSAX, /* endElementNs */
    NULL, /* serror */
};

Aside from the function pointers, the initialized field should be set to the value XML_SAX2_MAGIC to indicate that the handler is used for a SAX2 parser. Once you call xmlSAXUserParseMemory(), the SAX parser starts parsing the document and calling your registered callback functions.
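The callback mechanism itself can be pictured without libxml2. The following is only a minimal sketch in plain C, assuming a tiny XML subset with no attributes, comments, or entities. MiniHandler, mini_parse(), Counter, and count_items() are hypothetical names invented for this illustration and are not part of libxml2. It shows the three ideas used so far: a structure of function pointers, NULL entries for events you do not care about, and a user-data pointer handed back to every callback.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical miniature handler; this is NOT libxml2's xmlSAXHandler. */
typedef struct {
    void (*start_element)(void *user_data, const char *name, size_t len);
    void (*end_element)(void *user_data, const char *name, size_t len);
    void (*characters)(void *user_data, const char *text, size_t len);
} MiniHandler;

/* Scans a tiny subset of XML and reports each event through the handler.
   Callbacks left as NULL are skipped.
   Returns 0 on success, -1 if a tag is never closed with '>'. */
static int mini_parse(const MiniHandler *h, void *user_data, const char *xml)
{
    const char *p = xml;
    while (*p != '\0') {
        if (*p == '<') {
            const char *close = strchr(p, '>');
            if (close == NULL)
                return -1;
            if (p[1] == '/') {
                if (h->end_element)
                    h->end_element(user_data, p + 2, (size_t)(close - (p + 2)));
            } else {
                if (h->start_element)
                    h->start_element(user_data, p + 1, (size_t)(close - (p + 1)));
            }
            p = close + 1;
        } else {
            /* Text runs until the next tag or the end of the input. */
            const char *lt = strchr(p, '<');
            size_t len = lt ? (size_t)(lt - p) : strlen(p);
            if (h->characters)
                h->characters(user_data, p, len);
            p += len;
        }
    }
    return 0;
}

/* Example user data and callback: count the <item> elements. */
typedef struct { int items; } Counter;

static void count_items(void *user_data, const char *name, size_t len)
{
    Counter *c = (Counter *)user_data;
    if (len == 4 && memcmp(name, "item", 4) == 0)
        c->items++;
}
```

To use the sketch, you would fill a MiniHandler with count_items in the start_element field and NULL elsewhere, then pass it to mini_parse() together with a pointer to a Counter. The Counter plays the role that self plays in the call to xmlSAXUserParseMemory().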

We are mainly interested in three functions: startElementNsSAX2Func(), endElementNsSAX2Func(), and charactersSAXFunc().

startElementNsSAX2Func() is called when the parser encounters the start of a new element. startElementNsSAX2Func() is defined in parser.h as:

void  startElementNsSAX2Func (void * ctx, const xmlChar * localname,
const xmlChar * prefix, const xmlChar *URI,
int nb_namespaces,
const xmlChar ** namespaces,
int nb_attributes, int nb_defaulted,
const xmlChar ** attributes)

ctx is the user data: the second argument you passed to xmlSAXUserParseMemory(). In our case, it is a pointer to an instance of the class XORSSFeedNebraska. localname is the local name of the element. prefix is the element's namespace prefix (if available). URI is the element's namespace name (if available). nb_namespaces is the number of namespace definitions on that node. namespaces is a pointer to the array of prefix/URI pairs of namespace definitions. nb_attributes is the number of attributes on that node. nb_defaulted is the number of defaulted attributes; the defaulted ones are at the end of the array. attributes is a pointer to the array of attribute values, with five entries per attribute (localname, prefix, URI, value, and end).
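Our example never reads attributes, but the five-entry layout is easy to misread, so here is a minimal sketch of how one attribute value could be extracted. It assumes the value is the byte range from the value pointer up to (not including) the end pointer and that it is not NUL-terminated. copy_attribute_value() is a hypothetical helper, and the xmlChar typedef is a stand-in so the sketch compiles without the libxml2 headers.

```c
#include <stddef.h>
#include <string.h>

/* Stand-in so the sketch compiles on its own; libxml2 declares
   xmlChar as unsigned char. */
typedef unsigned char xmlChar;

/* Hypothetical helper: copies the value of the attribute whose local
   name is `wanted` into `out` (NUL-terminated, truncated to fit).
   Returns 1 if the attribute was found, 0 otherwise. */
static int copy_attribute_value(const xmlChar **attributes, int nb_attributes,
                                const char *wanted, char *out, size_t out_size)
{
    if (attributes == NULL || out_size == 0)
        return 0;
    for (int i = 0; i < nb_attributes; i++) {
        /* Five entries per attribute:
           attr[0] = localname, attr[1] = prefix, attr[2] = URI,
           attr[3] = start of value, attr[4] = end of value. */
        const xmlChar **attr = attributes + i * 5;
        if (strcmp((const char *)attr[0], wanted) != 0)
            continue;
        size_t len = (size_t)(attr[4] - attr[3]);
        if (len >= out_size)
            len = out_size - 1;
        memcpy(out, attr[3], len);
        out[len] = '\0';
        return 1;
    }
    return 0;
}
```

Inside a start-element callback, such a helper would be called with the attributes and nb_attributes arguments exactly as the parser supplied them.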

Listing 3 shows the definition of our startElementNsSAX2Func() function.

Example 3. The startElementSAX() callback function.
static void
startElementSAX(void *ctx,
                const xmlChar *localname,
                const xmlChar *prefix,
                const xmlChar *URI,
                int nb_namespaces,
                const xmlChar **namespaces,
                int nb_attributes,
                int nb_defaulted,
                const xmlChar **attributes)
{
    NSAutoreleasePool *pool = [[NSAutoreleasePool alloc] init];
    XORSSFeedNebraska *feedNebraska = (XORSSFeedNebraska*) ctx;
    if (feedNebraska.currentElementContent) {
        [feedNebraska.currentElementContent release];
        feedNebraska.currentElementContent = nil;
    }
    if ((!xmlStrcmp(localname, (const xmlChar *)"item"))) {
        feedNebraska.currAbsconder = [[XOAbsconder alloc] init];
    }
    [pool release];
}

Each callback creates its own autorelease pool so that any autoreleased objects it produces are freed before it returns. We start by casting ctx to a pointer to our class XORSSFeedNebraska. The class and its parent are declared in Listings 4 and 5.

Example 4. The XORSSFeedNebraska class declaration.
#import "XORSSFeed.h"
@interface XORSSFeedNebraska : XORSSFeed {
}
@end

Example 5. The XORSSFeed class declaration.
@class XOAbsconder;
typedef enum {
    XML_PARSER_DOM,
    XML_PARSER_SAX
} XMLParser;

@interface XORSSFeed : NSObject {
    NSString *feedURL;
    NSMutableArray *absconders;
    XMLParser parser;
    NSMutableString *currentElementContent;
    XOAbsconder *currAbsconder;
}
@property(nonatomic, copy) NSString *feedURL;
@property(nonatomic, assign) XMLParser parser;
@property(nonatomic, assign) NSMutableString *currentElementContent;
@property(nonatomic, assign) XOAbsconder *currAbsconder;
-(id)init;
-(id)initWithURL:(NSString*) feedURL;
-(void)fetchAbsconders;
-(NSUInteger)numberOfAbsconders;
-(XOAbsconder*)absconderAtIndex:(NSUInteger) index;
-(void)addAbsconder:(XOAbsconder*)absconder;
@end

The XORSSFeedNebraska object has an instance variable of type NSMutableString called currentElementContent. This variable holds the text value inside an element. It is built up in our charactersFoundSAX() function and used in the endElementSAX() function. The function startElementSAX() always releases this instance variable and sets it to nil (if it is not already nil). This ensures that we start with an empty string for holding the text. If the element name is item, we create a new object of the XOAbsconder class. This is a simple class holding the three pieces of information about an individual absconder. Listing 6 shows the declaration of XOAbsconder and Listing 7 shows its definition.

Example 6. The XOAbsconder class declaration.
#import <UIKit/UIKit.h>
@interface XOAbsconder : NSObject {
    NSString *name;
    NSString *furtherInfoURL;
    NSString *desc;
}

@property(copy) NSString *name;
@property(copy) NSString *furtherInfoURL;
@property(copy) NSString *desc;
-(id)init;
-(id)initWithName:(NSString*)name
           andURL:(NSString*)url
   andDescription:(NSString*)desc;
-(NSString*)description;
@end

Example 7. The XOAbsconder class definition.
#import "XOAbsconder.h"

@implementation XOAbsconder
@synthesize name;
@synthesize furtherInfoURL;
@synthesize desc;

-(id)initWithName:(NSString*)name
           andURL:(NSString*)url
   andDescription:(NSString*)description{
    self = [super init];
    if(self){
        self.name = name;
        self.furtherInfoURL = url;
        self.desc = description;
    }
    return self;
}

-(id)init{
    return [self initWithName:@"" andURL:@"" andDescription:@""];
}

-(NSString*)description{
    return [NSString stringWithString:name];
}

-(void)dealloc{
    [name release];
    [furtherInfoURL release];
    [desc release];
    [super dealloc];
}
@end

Our endElementNsSAX2Func() function is called endElementSAX() and is shown in Listing 8.

Example 8. The endElementSAX() function definition.
static void
endElementSAX (void *ctx,
               const xmlChar *localname,
               const xmlChar *prefix,
               const xmlChar *URI)
{
    NSAutoreleasePool *pool = [[NSAutoreleasePool alloc] init];
    XORSSFeedNebraska *feedNebraska = (XORSSFeedNebraska*) ctx;
    if ((!xmlStrcmp(localname, (const xmlChar *)"item"))) {
        if(feedNebraska.currAbsconder){
            [feedNebraska addAbsconder:feedNebraska.currAbsconder];
        }
        [feedNebraska.currAbsconder release];
        feedNebraska.currAbsconder = nil;
    }
    else if ((!xmlStrcmp(localname,(const xmlChar *)"title"))) {
        if(feedNebraska.currAbsconder){
            feedNebraska.currAbsconder.name =
                feedNebraska.currentElementContent;
        }
    }
    else if ((!xmlStrcmp(localname, (const xmlChar *)"link"))) {
        if(feedNebraska.currAbsconder){
            feedNebraska.currAbsconder.furtherInfoURL =
                feedNebraska.currentElementContent;
        }
    }
    else if ((!xmlStrcmp(localname,(const xmlChar *)"description"))) {
        if(feedNebraska.currAbsconder){
            feedNebraska.currAbsconder.desc =
                feedNebraska.currentElementContent;
        }
    }

    if (feedNebraska.currentElementContent) {
        [feedNebraska.currentElementContent release];
        feedNebraska.currentElementContent = nil;
    }
    [pool release];
}

The function first checks to see if the element's name is item. If it is, then we add the XOAbsconder object that was constructed by the other callback functions. Otherwise, we check for the three element names: title, link, and description. For each of these elements, we store the text value gathered by our charactersSAXFunc() callback in the corresponding property. For example, the following sets the desc property to the current text value.

feedNebraska.currAbsconder.desc = feedNebraska.currentElementContent;
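The same element-name dispatch can be pictured in plain C. This is only a rough sketch under stated assumptions: Absconder is a hypothetical fixed-size stand-in for XOAbsconder, and set_field() and assign_field() are names invented for this illustration. The return value reports whether the element name mapped to a field, which makes unknown elements easy to ignore.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical plain-C stand-in for the XOAbsconder class. */
typedef struct {
    char name[64];
    char link[128];
    char desc[256];
} Absconder;

/* Copies text into dst, truncating to fit and always NUL-terminating. */
static void set_field(char *dst, size_t dst_size, const char *text)
{
    strncpy(dst, text, dst_size - 1);
    dst[dst_size - 1] = '\0';
}

/* Stores text in the field that corresponds to the element name.
   Returns 1 if the element maps to a field, 0 otherwise. */
static int assign_field(Absconder *a, const char *element, const char *text)
{
    if (strcmp(element, "title") == 0)
        set_field(a->name, sizeof a->name, text);
    else if (strcmp(element, "link") == 0)
        set_field(a->link, sizeof a->link, text);
    else if (strcmp(element, "description") == 0)
        set_field(a->desc, sizeof a->desc, text);
    else
        return 0;
    return 1;
}
```

As in endElementSAX(), the title element feeds the name field, so the mapping between element names and fields lives in one place.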

The text of the element is collected by our charactersSAXFunc() callback. The function type is declared in parser.h as:

void  charactersSAXFunc (void * ctx, const xmlChar * ch, int len)

This function is called by the parser informing you of new found characters. In addition to the context, you receive the string of characters and its length. Between the start of an element and the end of that element, this function might be called several times. Your function should take this into account and append the new text to the current string.
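The append-as-you-go pattern can be sketched in plain C before looking at the Objective-C version. This is a minimal illustration that assumes nothing beyond the C standard library. TextBuffer, text_buffer_append(), and text_buffer_reset() are hypothetical names. The important detail is that the incoming characters are not NUL-terminated, so only the reported length may be trusted.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable buffer for the text of the current element. */
typedef struct {
    char  *data;   /* NUL-terminated accumulated text, or NULL if empty */
    size_t length; /* bytes stored, not counting the terminator */
} TextBuffer;

/* Appends exactly len bytes of ch, which need not be NUL-terminated.
   Returns 0 on success, -1 if memory could not be allocated. */
static int text_buffer_append(TextBuffer *buf, const unsigned char *ch, int len)
{
    if (len <= 0)
        return 0;
    char *grown = realloc(buf->data, buf->length + (size_t)len + 1);
    if (grown == NULL)
        return -1; /* the existing contents are left untouched */
    memcpy(grown + buf->length, ch, (size_t)len);
    buf->length += (size_t)len;
    grown[buf->length] = '\0';
    buf->data = grown;
    return 0;
}

/* Discards the text so that the next element starts from an empty buffer. */
static void text_buffer_reset(TextBuffer *buf)
{
    free(buf->data);
    buf->data = NULL;
    buf->length = 0;
}
```

A characters callback would call text_buffer_append() each time it is invoked, and the start-element and end-element callbacks would call text_buffer_reset(), mirroring what the listings do with currentElementContent.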

Our charactersFoundSAX() function is shown in Listing 9.

Example 9. The charactersFoundSAX() function definition.
static void charactersFoundSAX(void * ctx, const xmlChar * ch, int len){
    NSAutoreleasePool *pool = [[NSAutoreleasePool alloc] init];
    XORSSFeedNebraska *feedNebraska =(XORSSFeedNebraska*) ctx;
    // ch is not NUL-terminated, so build the string from exactly len bytes.
    CFStringRef str =
        CFStringCreateWithBytes(kCFAllocatorDefault,
                                ch, len, kCFStringEncodingUTF8, false);
    // str is NULL if the bytes could not be converted; skip the append then.
    if (str) {
        if (!feedNebraska.currentElementContent) {
            feedNebraska.currentElementContent = [[NSMutableString alloc] init];
        }
        [feedNebraska.currentElementContent appendString:(NSString *)str];
        CFRelease(str);
    }
    [pool release];
}

The function starts by casting the ctx into a XORSSFeedNebraska instance. Using this pointer, we can call our Objective-C class. After that, we create a string from received characters by using the function CFStringCreateWithBytes(), which is declared as follows:

CFStringRef CFStringCreateWithBytes (
CFAllocatorRef alloc,
const UInt8 *bytes,
CFIndex numBytes,
CFStringEncoding encoding,
Boolean isExternalRepresentation
);

The first parameter specifies the memory allocator; kCFAllocatorDefault (or NULL) selects the current default allocator. The second parameter is the buffer that contains the characters. The third parameter specifies the number of bytes. The fourth parameter is the encoding; we use kCFStringEncodingUTF8 for UTF-8. The fifth parameter specifies whether the characters in the buffer are in an external representation format. Since they are not, we use false.

Once we have the string representation of the characters, we check to see if this is the first time charactersFoundSAX has been called for the current element. Recall that the parser can call this function multiple times, supplying the content of a single element. If it is the first time, we allocate our mutable string. After that, we append the string that we created from the character buffer to the mutable string. When the endElementSAX() function is called, we retrieve this string to build our Objective-C object, currAbsconder. When we are finished with the string str, we use the CFRelease() function to deallocate it.

Finally, the error handling functions are shown in Listings 10 and 11. As in all other event functions, what you do for error-handling depends on your application. In our example, we release the currAbsconder object that we are constructing and log the problem.

Example 10. The errorEncounteredSAX() function definition.
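static void errorEncounteredSAX (void * ctx, const char * msg, ...){
    XORSSFeedNebraska *feedNebraska = (XORSSFeedNebraska*) ctx;
    if(feedNebraska.currAbsconder){
        [feedNebraska.currAbsconder release];
        feedNebraska.currAbsconder = nil;
    }
    // msg is a printf-style format string; expand it with its arguments.
    char buffer[512];
    va_list args;
    va_start(args, msg);
    vsnprintf(buffer, sizeof(buffer), msg, args);
    va_end(args);
    NSLog(@"errorEncountered: %s", buffer);
}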

Example 11. The fatalErrorEncounteredSAX() function definition.
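static void fatalErrorEncounteredSAX (void * ctx, const char * msg, ...){
    XORSSFeedNebraska *feedNebraska = (XORSSFeedNebraska*) ctx;
    if(feedNebraska.currAbsconder){
        [feedNebraska.currAbsconder release];
        feedNebraska.currAbsconder = nil;
    }
    // msg is a printf-style format string; expand it with its arguments.
    char buffer[512];
    va_list args;
    va_start(args, msg);
    vsnprintf(buffer, sizeof(buffer), msg, args);
    va_end(args);
    NSLog(@"fatalErrorEncountered: %s", buffer);
}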