XML Parsing with the XmlSax library

Reply
 
LinkBack Thread Tools Display Modes
  #1 (permalink)  
Old 12-12-2010, 02:51 PM
Erel's Avatar
Administrator
 
Join Date: Apr 2007
Posts: 15,689
Awards Showcase
Basic4ppc Founder 
Total Awards: 1
Default XML Parsing with the XmlSax library

The XmlSax library provides an XML Sax parser.
This parser sequentially reads the stream and raises events at the beginning and end of each element.
It is up to the developer to do something useful with those events.

There are two events:
Code:
StartElement (Uri As String, Name As String, Attributes As Attributes)
EndElement (Uri As String, Name As String, Text As StringBuilder)
StartElement is raised when an element begins. This event includes the element's attribute list.
EndElement is raised when an element ends. This event includes the element's text.
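
To make the event order concrete, here is a small sketch in plain Java rather than B4A, using the SAX parser from the Java standard library. It is only an illustration of the start/end sequence described above. The class name SaxEventTrace and its trace method are made up for this example and are not part of XmlSax.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Illustrative Java sketch; SaxEventTrace and trace() are hypothetical names, not XmlSax API.
class SaxEventTrace {
    // Parses the XML text and returns the events in the order they were raised.
    static List<String> trace(String xml) {
        List<String> events = new ArrayList<>();
        StringBuilder text = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                text.setLength(0); // start collecting text for the new element
                events.add("start " + qName + " (" + attrs.getLength() + " attributes)");
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                events.add("end " + qName + " text=[" + text.toString().trim() + "]");
                text.setLength(0);
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("parsing failed", e);
        }
        return events;
    }

    public static void main(String[] args) {
        trace("<rss version=\"2.0\"><channel><title>Hello</title></channel></rss>")
                .forEach(System.out::println);
    }
}
```

Each element produces exactly one start event and one end event, and the end event of an outer element arrives only after all of its children have been reported.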

In this example we will parse the forum RSS feed. RSS is formatted using XML.
A simplified example of this RSS is:
Code:
<?xml version="1.0" encoding="ISO-8859-1"?>
<rss version="2.0">
    <channel>
        <title>Basic4ppc / Basic4android - Android programming</title>
        <link>http://www.basic4ppc.com/forum</link>
        <description>Basic4android - android programming and development</description>
        <ttl>60</ttl>
        <image>
            <url>http://www.basic4ppc.com/forum/images/misc/rss.jpg</url>
            <title>Basic4ppc / Basic4android - Android programming</title>
            <link>http://www.basic4ppc.com/forum</link>
        </image>
        <item>
            <title>Phone library was updated - V1.10</title>
            <link>http://www.basic4ppc.com/forum/additional-libraries-official-updates/6859-phone-library-updated-v1-10-a.html</link>
            <pubDate>Sun, 12 Dec 2010 09:27:38 GMT</pubDate>
            <guid isPermaLink="true">http://www.basic4ppc.com/forum/additional-libraries-official-updates/6859-phone-library-updated-v1-10-a.html</guid>
        </item>
        ...MORE ITEMS HERE
    </channel>
</rss>
The first line is the XML declaration. The parser raises no events for it.
On the second line the StartElement event is raised with Name = "rss", and the attributes include the "version" field.
The EndElement event of the rss element is raised only on the last line: </rss>.

We will populate a list view with all the items parsed from an offline file. When the user presses an item, we open the browser with the relevant link.
Every item represents a forum thread.



For each item we are interested in two values: the title and the link.
The SaxParser object includes a handy list that holds the names of all the current parent elements.
This is useful because it helps us find the "correct" 'title' and 'link' elements. The correct elements are the ones under the 'item' element.

The parsing code in this case is pretty simple:
Code:
Sub Parser_StartElement (Uri As String, Name As String, Attributes As Attributes)

End Sub
Sub Parser_EndElement (Uri As String, Name As String, Text As StringBuilder)
    If parser.Parents.IndexOf("item") > -1 Then
        If Name = "title" Then
            Title = Text.ToString
        Else If Name = "link" Then
            Link = Text.ToString
        End If
    End If
    If Name = "item" Then
        ListView1.AddSingleLine2(Title, Link) 'add the title as the text and the link as the value
    End If
End Sub
Title and Link are global variables.
We are only using EndElement events in this program.
First we check whether we are inside an 'item' element. If so, we check the element name and save the element's text if the name is 'title' or 'link'.

If the current element is 'item' it means that we are done parsing an item.
So we add the data collected to the list view.
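
For readers who want to try the same logic outside B4A, the following is a rough Java sketch of the approach, built on the standard Java SAX API. It keeps its own stack of open elements as a stand-in for the Parents list. That stand-in, the class name RssItemCollector and the collect method are assumptions made for this illustration, not the library's implementation.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Illustrative Java sketch; RssItemCollector and collect() are hypothetical names, not XmlSax API.
class RssItemCollector extends DefaultHandler {
    final List<String[]> items = new ArrayList<>();            // each entry is {title, link}
    private final Deque<String> parents = new ArrayDeque<>();  // names of the currently open elements
    private final StringBuilder text = new StringBuilder();
    private String title = "";
    private String link = "";

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        parents.push(qName);
        text.setLength(0);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        parents.pop(); // the stack now holds only the ancestors of the element that just ended
        if (parents.contains("item")) {
            if (qName.equals("title")) {
                title = text.toString().trim();
            } else if (qName.equals("link")) {
                link = text.toString().trim();
            }
        }
        if (qName.equals("item")) {
            items.add(new String[] {title, link}); // the item is complete
        }
        text.setLength(0);
    }

    static List<String[]> collect(String xml) {
        RssItemCollector handler = new RssItemCollector();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("parsing failed", e);
        }
        return handler.items;
    }

    public static void main(String[] args) {
        String xml = "<rss version=\"2.0\"><channel><title>Forum</title>"
                + "<item><title>First thread</title><link>http://example.com/1</link></item>"
                + "</channel></rss>";
        for (String[] item : collect(xml)) {
            System.out.println(item[0] + " -> " + item[1]);
        }
    }
}
```

The channel-level title is ignored because no 'item' element is open when it ends, which is the same reason the parents check is needed in the B4A code.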

We are using ListView.AddSingleLine2. This method receives two values: the item text, and the value that is returned when the user clicks the item. Here we store the link as the return value.

Later we will use it to open the browser:
Code:
Sub ListView1_ItemClick (Position As Int, Value As Object)
    StartActivity(PhoneIntents1.OpenBrowser(Value)) 'open the browser with the link
End Sub
The code that starts the parsing is:
Code:
Dim in As InputStream
in = File.OpenInput(File.DirAssets, "rss.xml") 'This file was added with the file manager.
parser.Parse(in, "Parser") '"Parser" is the events subs prefix.
in.Close
Attached file: XmlSax.zip (10.0 KB)

#2, ssg, 12-13-2010, 03:05 AM

Hi Erel,

Thank you for this excellent library... been waiting for it

I have a question: my sample file had an empty line as the first line. This threw a runtime error. Deleting the empty line fixed the problem.

Is it a must that the first line be the XML declaration?

Thank you.

#3, Erel (Administrator), 12-13-2010, 04:02 AM

Quote:
Is it a must that the first line be the XML declaration?
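Yes. When a file includes an XML declaration, the declaration must be the very first thing in the file, so an empty line before it makes the document invalid. The error was thrown by the underlying system parser.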
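
The effect is easy to reproduce outside B4A. Below is a minimal Java sketch using the standard Java SAX parser, not the Android parser that XmlSax wraps, so the exact error message will differ. The class name XmlDeclarationCheck and the isWellFormed helper are invented for this illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Illustrative Java sketch; XmlDeclarationCheck and isWellFormed() are hypothetical names.
class XmlDeclarationCheck {
    // Returns true if the parser accepts the text, false if it reports a parse error.
    static boolean isWellFormed(String xml) {
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), new DefaultHandler());
            return true;
        } catch (SAXException e) {
            return false; // the parser rejected the document
        } catch (Exception e) {
            throw new IllegalStateException("could not run the parser", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("<?xml version=\"1.0\"?><a/>"));   // declaration first
        System.out.println(isWellFormed("\n<?xml version=\"1.0\"?><a/>")); // empty line before it
        System.out.println(isWellFormed("<a/>"));                           // no declaration at all
    }
}
```

Note that a document with no declaration at all is still accepted. The problem is only a declaration that is present but not at the very start.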

#4, ssg, 12-13-2010, 05:07 AM

got it! thanks a bunch....

#5, susu, 12-17-2010, 07:55 AM

I use PHP to generate the xml file like this:

Code:
<?xml version="1.0" encoding="UTF-8"?>
<item>
<year>1431</year>
<content>Henry VI of England is crowned King of France.</content>
<year>1653</year>
<content>Oliver Cromwell takes on dictatorial powers with the title of Lord Protector.</content>
<year>1998</year>
<content>The United States launches a missile attack on Iraq for failing to comply with United Nations weapons inspectors.</content>
</item>
I use your tutorial code to load the content:

Code:
Sub Parser_EndElement (Uri As String, Name As String, Text As StringBuilder)
    If parser.Parents.IndexOf("item") > -1 Then
        If Name = "year" Then
            Title = Text.ToString
        Else If Name = "content" Then
            Link = Text.ToString
        End If
    End If
    If Name = "item" Then
        ListView1.AddTwoLines(Title, Link)
    End If
End Sub
It loads the XML, but only the last entry (year 1998) is shown. What's wrong? Do I need to revise the XML file?

#6, ssg, 12-17-2010, 08:06 AM

hi susu,

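I believe the following lines are causing the issue:

Code:
If Name = "item" Then
    ListView1.AddTwoLines(Title, Link)
End If
This adds the values to the list view only when the "item" tag closes. Your file has a single "item" element wrapping all the year/content pairs, so by the time it closes only the last pair is left in Title and Link.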

I'd change this to the following:

Code:
Sub Parser_EndElement (Uri As String, Name As String, Text As StringBuilder)
    If parser.Parents.IndexOf("item") > -1 Then
        If Name = "year" Then
            Title = Text.ToString
        Else If Name = "content" Then
            Link = Text.ToString
            ListView1.AddTwoLines(Title, Link)
        End If
    End If
End Sub
Not having access to B4A right now... but I hope that helps you out.

Cheers!
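
The same idea can be checked in plain Java with the standard Java SAX API. This is only a sketch of the "add a row when content ends" approach. The class name YearContentCollector and the collect method are made up for the example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Illustrative Java sketch; YearContentCollector and collect() are hypothetical names.
class YearContentCollector extends DefaultHandler {
    final List<String> rows = new ArrayList<>();
    private final StringBuilder text = new StringBuilder();
    private String year = "";

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        text.setLength(0);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (qName.equals("year")) {
            year = text.toString().trim(); // remember the year until its content arrives
        } else if (qName.equals("content")) {
            rows.add(year + ": " + text.toString().trim()); // one row per year/content pair
        }
        text.setLength(0);
    }

    static List<String> collect(String xml) {
        YearContentCollector handler = new YearContentCollector();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("parsing failed", e);
        }
        return handler.rows;
    }

    public static void main(String[] args) {
        String xml = "<item><year>1431</year><content>First event</content>"
                + "<year>1653</year><content>Second event</content></item>";
        collect(xml).forEach(System.out::println);
    }
}
```

An alternative that needs no code change is to generate the XML with one item element per year/content pair, so that the check for the end of 'item' fires once per entry.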

#7, susu, 12-17-2010, 09:25 AM

Yeah! You saved me! Thank you SSG.

#8, 02-13-2011, 04:46 PM

I'm trying to write my first Android app using B4A and I am having a problem parsing XML.

I am opening a URL that returns XML and saving that return/result to a string. Then I am trying to feed that string into the XML parser, but I am getting an error when compiling.

--------------
src\com\cognitial\vstream\main.java:276: inconvertible types
found : java.lang.String
required: java.io.Reader
_parser.Parse2((java.io.Reader)(_result),"Parser") ;
--------------

Is there no way to feed the parser a string? How would I go about feeding the XML result from a URL into the parser? Do I need to 'save' it to the device first? If so, how would I do that, and how would I delete it when I am finished?

#9, Erel (Administrator), 02-13-2011, 06:06 PM

Parse2 expects a TextReader, not a string.
Instead of saving the result to a string, just pass the InputStream directly to the XML parser.
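
The compile error quoted above shows that the generated Java code wants a java.io.Reader. As a rough illustration in plain Java with the standard Java SAX API (not B4A code), the sketch below parses once from a stream and once from a string wrapped in a reader. The class name ParseFromString and its methods are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Illustrative Java sketch; ParseFromString and its methods are hypothetical names.
class ParseFromString {
    // Parses the source and returns the name of the root element.
    private static String rootName(InputSource source) {
        String[] root = new String[1];
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                if (root[0] == null) {
                    root[0] = qName; // the first element reported is the root
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(source, handler);
        } catch (Exception e) {
            throw new IllegalStateException("parsing failed", e);
        }
        return root[0];
    }

    // A string has to be wrapped in a Reader before the parser can consume it.
    static String fromString(String xml) {
        return rootName(new InputSource(new StringReader(xml)));
    }

    // A stream (for example an HTTP response body) can be passed on directly.
    static String fromStream(InputStream in) {
        return rootName(new InputSource(in));
    }

    public static void main(String[] args) {
        String xml = "<result><value>1</value></result>";
        System.out.println(fromString(xml));
        System.out.println(fromStream(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8))));
    }
}
```

Passing the stream straight through avoids holding the whole response in memory and leaves encoding detection to the parser.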

#10, 02-26-2011, 04:30 PM

How is XML character encoding handled? I get an error when there are 'ä' or 'ö' characters in the XML stream. Is UTF-8 the only encoding that XmlSax handles, or is it okay to use:
Code:
<?xml version='1.0' encoding='ISO-8859-1'?>

The error was:
Code:
org.apache.harmony.xml.ExpatParser$ParseException: At line 8, column 197: not well-formed (invalid token)
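
The thread does not record an answer to this question. One possible cause, offered here only as an assumption, is that the bytes in the stream do not match the encoding named in the declaration. The plain Java sketch below uses the standard Java SAX parser, not the Android parser from the error message, to show the effect of such a mismatch. The class name EncodingCheck and the rootText helper are invented for the illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.helpers.DefaultHandler;

// Illustrative Java sketch; EncodingCheck and rootText() are hypothetical names.
class EncodingCheck {
    // Returns the text content of the document, or null if the parser rejects the bytes.
    static String rootText(byte[] xmlBytes) {
        StringBuilder text = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new ByteArrayInputStream(xmlBytes), handler);
            return text.toString();
        } catch (Exception e) {
            return null; // bytes invalid for the declared encoding, or malformed XML
        }
    }

    public static void main(String[] args) {
        String body = "<a>\u00e4\u00f6</a>"; // the characters from the question, written as escapes

        // Declaration and actual bytes agree: both ISO-8859-1.
        byte[] matching = ("<?xml version='1.0' encoding='ISO-8859-1'?>" + body)
                .getBytes(StandardCharsets.ISO_8859_1);
        // Declaration says UTF-8, but the bytes are ISO-8859-1.
        byte[] mismatched = ("<?xml version='1.0' encoding='UTF-8'?>" + body)
                .getBytes(StandardCharsets.ISO_8859_1);

        System.out.println(rootText(matching) != null);   // parsed
        System.out.println(rootText(mismatched) != null); // rejected
    }
}
```

If the declaration and the bytes agree, ISO-8859-1 input parses without trouble in this sketch, so it is worth checking which encoding the server really sends.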