StAX - get XML node as string

The XML looks like this:

<statements>
   <statement account="123">
      ...stuff...
   </statement>
   <statement account="456">
      ...stuff...
   </statement>
</statements>

I'm using StAX to process one "<statement>" at a time, and I got that working. I need to get the entire statement node as a string so I can create "123.xml" and "456.xml", or maybe even load it into a database table indexed by account.

I'm using this approach: http://www.devx.com/Java/Article/30298/1954

I'm looking to do something like this:

String statementXml = staxXmlReader.getNodeByName("statement");

//load statementXml into database

Answers


Why not just use XPath for this?

A fairly simple XPath expression would select all 'statement' nodes.

Like so:

//statement

EDIT #1: If possible, take a look at dom4j. You could read the String and get all 'statement' nodes fairly simply.

EDIT #2: Using dom4j, this is how you would do it: (from their cookbook)

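String text = "your xml here";
Document document = DocumentHelper.parseText(text);

public void bar(Document document) {
   // selectNodes evaluates the XPath expression against the document
   List<?> list = document.selectNodes( "//statement" );
   for (Object item : list) {
      // asXML() serializes the node, tags included
      String statementXml = ((Node) item).asXML();
   }
}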

You can use StAX for this. Advance the XMLStreamReader to the start element for statement and read the account attribute to get the file name. Then use the javax.xml.transform APIs to transform a StAXSource into a StreamResult wrapping a File. The transform consumes the statement element and advances the XMLStreamReader, so you just repeat this process for each remaining statement.

import java.io.File;
import java.io.FileReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stax.StAXSource;
import javax.xml.transform.stream.StreamResult;

public class Demo {

    public static void main(String[] args) throws Exception  {
        XMLInputFactory xif = XMLInputFactory.newInstance();
        XMLStreamReader xsr = xif.createXMLStreamReader(new FileReader("input.xml"));
        xsr.nextTag(); // Advance to statements element

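        // The factory is reusable, so create it once outside the loop
        TransformerFactory tf = TransformerFactory.newInstance();
        while(xsr.nextTag() == XMLStreamConstants.START_ELEMENT) {
            // Name the output file after the account attribute
            File file = new File("out" + xsr.getAttributeValue(null, "account") + ".xml");
            // The identity transform copies the current statement element to the file
            Transformer t = tf.newTransformer();
            t.transform(new StAXSource(xsr), new StreamResult(file));
        }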
    }

}
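
Since the question asks for a String rather than a file, here is a minimal sketch of the same StAXSource approach that writes each statement into a StringWriter instead. The class name StatementStrings, the method split, and the sample <amount> element are assumptions made for illustration. The loop checks the reader's current event instead of calling nextTag() after each transform, so it does not rely on where exactly the transform leaves the reader.

```java
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamReader;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stax.StAXSource;
import javax.xml.transform.stream.StreamResult;

public class StatementStrings {

    // Hypothetical helper: maps each account attribute to its serialized <statement> element.
    public static Map<String, String> split(Reader input) throws Exception {
        XMLStreamReader xsr = XMLInputFactory.newInstance().createXMLStreamReader(input);
        TransformerFactory tf = TransformerFactory.newInstance();
        Map<String, String> byAccount = new LinkedHashMap<>();

        while (xsr.hasNext()) {
            if (xsr.isStartElement() && "statement".equals(xsr.getLocalName())) {
                String account = xsr.getAttributeValue(null, "account");
                Transformer t = tf.newTransformer();
                // Leave out the <?xml ...?> declaration so only the element is captured
                t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
                StringWriter out = new StringWriter();
                // The identity transform consumes the whole statement element
                t.transform(new StAXSource(xsr), new StreamResult(out));
                byAccount.put(account, out.toString());
                // No explicit advance here: the transform has already moved the reader
            } else {
                xsr.next();
            }
        }
        return byAccount;
    }

    public static void main(String[] args) throws Exception {
        // Sample input; the <amount> child is made up for the demo
        String xml = "<statements>"
                + "<statement account=\"123\"><amount>10</amount></statement>"
                + "<statement account=\"456\"><amount>20</amount></statement>"
                + "</statements>";
        for (Map.Entry<String, String> e : split(new StringReader(xml)).entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
    }
}
```

Each map value can then be written to "123.xml" or stored in a database column keyed by the account.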

StAX is a low-level access API, and it has neither lookups nor methods that access content recursively. But what are you actually trying to do? And why are you considering StAX?

Beyond using a tree model (DOM, XOM, JDOM, Dom4j), which would work well with XPath, the best choice when dealing with data is usually a data binding library like JAXB. With it you can pass a StAX or SAX reader and ask it to bind the XML data into Java beans, so that you process Java objects instead of messing with XML. This is often more convenient, and it usually performs quite well. The only trick with larger files is that you do not want to bind the whole thing at once, but rather bind each sub-tree (in your case, one 'statement' at a time). This is most easily done by iterating with a StAX XMLStreamReader, then using JAXB to bind.
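
To illustrate handling one sub-tree at a time without adding a JAXB dependency, here is a minimal sketch that swaps in the JDK's built-in DOM and XPath for the binding step. It is not JAXB. With JAXB, the DOM step would be replaced by a call such as Unmarshaller.unmarshal(XMLStreamReader, Class). The class SubtreeBinding, the Statement bean, and the <amount> child element are assumptions made for illustration.

```java
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamReader;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.stax.StAXSource;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;

public class SubtreeBinding {

    // Hypothetical bean; <amount> is an assumed child element, not part of the question.
    public static final class Statement {
        public final String account;
        public final String amount;

        public Statement(String account, String amount) {
            this.account = account;
            this.amount = amount;
        }
    }

    public static List<Statement> read(Reader input) throws Exception {
        XMLStreamReader xsr = XMLInputFactory.newInstance().createXMLStreamReader(input);
        TransformerFactory tf = TransformerFactory.newInstance();
        XPath xpath = XPathFactory.newInstance().newXPath();
        List<Statement> result = new ArrayList<>();

        while (xsr.hasNext()) {
            if (xsr.isStartElement() && "statement".equals(xsr.getLocalName())) {
                String account = xsr.getAttributeValue(null, "account");
                // Build a small DOM tree for this one statement only
                DOMResult tree = new DOMResult();
                tf.newTransformer().transform(new StAXSource(xsr), tree);
                // XPath runs against the small tree, never the whole document
                String amount = xpath.evaluate("/statement/amount", tree.getNode());
                result.add(new Statement(account, amount));
            } else {
                xsr.next();
            }
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<statements>"
                + "<statement account=\"123\"><amount>10</amount></statement>"
                + "</statements>";
        for (Statement s : read(new StringReader(xml))) {
            System.out.println(s.account + ": " + s.amount);
        }
    }
}
```

Only one statement is held in memory as a tree at any time, which is the point of binding per sub-tree.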


I had a similar task and, although the original question is more than a year old, I couldn't find a satisfying answer. The most interesting answer so far was Blaise Doughan's, but I couldn't get it running on the XML I am expecting (maybe some parameters for the underlying parser could change that?). Here is the XML, greatly simplified:

<many-many-tags>
    <description>
        ...
        <p>Lorem ipsum...</p>
        Devils inside...
        ...
    </description>
</many-many-tags>

My solution:

public static String readElementBody(XMLEventReader eventReader)
    throws XMLStreamException {
    StringWriter buf = new StringWriter(1024);

    int depth = 0;
    while (eventReader.hasNext()) {
        // peek event
        XMLEvent xmlEvent = eventReader.peek();

        if (xmlEvent.isStartElement()) {
            ++depth;
        }
        else if (xmlEvent.isEndElement()) {
            --depth;

            // depth < 0 means this is the END_ELEMENT of the enclosing
            // element: break the loop and leave the event in the stream
            if (depth < 0)
                break;
        }

        // consume event
        xmlEvent = eventReader.nextEvent();

        // print out event
        xmlEvent.writeAsEncodedUnicode(buf);
    }

    return buf.getBuffer().toString();
}

Usage example:

XMLEventReader eventReader = ...;
while (eventReader.hasNext()) {
    XMLEvent xmlEvent = eventReader.nextEvent();
    if (xmlEvent.isStartElement()) {
        StartElement elem = xmlEvent.asStartElement();
        String name = elem.getName().getLocalPart();

        // XML names are case-sensitive, so match the tag as written
        if ("description".equals(name)) {
            String xmlFragment = readElementBody(eventReader);
            // do something with it...
            System.out.println("'" + xmlFragment + "'");
        }
    }
    else if (xmlEvent.isEndElement()) {
        // ...
    }
}

Note that the extracted XML fragment contains the complete body content of the element, including white space and comments, but not the enclosing tags: the start tag has already been consumed by the caller, and the end tag is left in the stream. Filtering on demand and a configurable buffer size have been left out for brevity. For the example above, the output is:

'
        ...
        <p>Lorem ipsum...</p>
        Devils inside...
        ...
    '
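
The original question asks for the entire node, enclosing tags included. A minimal sketch of that variant, assuming the caller uses peek() so the start tag is still in the stream, and copying events through an XMLEventWriter so that text is escaped on output. The names ElementCapture, readElement, and captureAll are hypothetical.

```java
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLEventWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

public class ElementCapture {

    // Expects the next event to be the START_ELEMENT to capture; consumes
    // everything up to and including the matching END_ELEMENT.
    public static String readElement(XMLEventReader reader) throws XMLStreamException {
        StringWriter buf = new StringWriter();
        XMLEventWriter writer = XMLOutputFactory.newInstance().createXMLEventWriter(buf);
        int depth = 0;
        do {
            XMLEvent event = reader.nextEvent();
            if (event.isStartElement()) {
                depth++;
            } else if (event.isEndElement()) {
                depth--;
            }
            writer.add(event); // copy the event to the output
        } while (depth > 0);
        writer.flush();
        writer.close();
        return buf.toString();
    }

    // Collects every element with the given local name as a string.
    public static List<String> captureAll(Reader input, String localName)
            throws XMLStreamException {
        XMLEventReader reader = XMLInputFactory.newInstance().createXMLEventReader(input);
        List<String> fragments = new ArrayList<>();
        while (reader.hasNext()) {
            XMLEvent next = reader.peek(); // look ahead without consuming
            if (next.isStartElement()
                    && localName.equals(next.asStartElement().getName().getLocalPart())) {
                fragments.add(readElement(reader));
            } else {
                reader.nextEvent();
            }
        }
        return fragments;
    }

    public static void main(String[] args) throws XMLStreamException {
        String xml = "<statements><statement account=\"123\"><a>x</a></statement></statements>";
        for (String fragment : captureAll(new StringReader(xml), "statement")) {
            System.out.println(fragment);
        }
    }
}
```

Because the events are re-serialized, details such as attribute quoting or empty-element syntax may differ from the input text.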
