Selecting specific text from an XML file using R

I need to select specific text from within an XML document using R. The text before and after the part I need to pull out is constant, so the same approach should work on many XML files when I run them through my script.

For example, using a mock XML document:

<head>
  <image name="test1">
    <nodes>
      <alt>Synthesis1</alt>
    </node>
    <body> There is a lot of text in this section, THIS IS WHAT I NEED TO SELECT, Here is some more text in the section
    </body>
    <body> Here is the next section, THIS IS AGAIN WHAT I NEED TO SELECT, Here is more text afterwards
    </body>
  </image>
</head>

I've been using the XML package in R with no luck. Any suggestions? Thanks!

Answers


Try parsing the document with htmlParse() and selecting the <body> nodes with an XPath expression:

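library(XML)
# htmlParse() is lenient about malformed markup, so it tolerates the
# mismatched <nodes> ... </node> tags in the sample document.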
doc <- htmlParse('<head>
  <image name="test1">
    <nodes>
      <alt>Synthesis1</alt>
    </node>
    <body> There is a lot of text in this section, THIS IS WHAT I NEED TO SELECT, Here is some more text in the section
    </body>
    <body> Here is the next section, THIS IS AGAIN WHAT I NEED TO SELECT, Here is more text afterwards
    </body>
  </image>
</head>')
# XPath query: select every <body> node in the parsed document.
doc["//body"]

or, to get only the text of each node as a character vector:

sapply(doc["//body"], xmlValue, trim = TRUE)
# [1] "There is a lot of text in this section, THIS IS WHAT I NEED TO SELECT, Here is some more text in the section"
# [2] "Here is the next section, THIS IS AGAIN WHAT I NEED TO SELECT, Here is more text afterwards" 
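The calls above return the whole text of each <body> node. To cut out only the part between the constant surrounding text, a regular expression can be applied to that character vector. The following is a minimal sketch in base R, not part of the original answer. It assumes that the wanted text starts after the first ", " and ends just before ", Here is"; these delimiters are guesses taken from the mock document and would need to be replaced with the constant text in the real files. The names body_text, pattern and selected are illustrative only.

```r
# Text values as returned by the sapply() call above (copied here so the
# example runs without the XML package).
body_text <- c(
  "There is a lot of text in this section, THIS IS WHAT I NEED TO SELECT, Here is some more text in the section",
  "Here is the next section, THIS IS AGAIN WHAT I NEED TO SELECT, Here is more text afterwards"
)

# Assumed delimiters: everything up to the first ", " comes before the
# target, and ", Here is" follows it. Adjust to the real constant text.
# The lazy quantifiers (.*?) need perl = TRUE.
pattern <- "^.*?, (.*?), Here is.*$"

# Keep only the captured group between the two delimiters.
selected <- sub(pattern, "\\1", body_text, perl = TRUE)
selected
# [1] "THIS IS WHAT I NEED TO SELECT"       "THIS IS AGAIN WHAT I NEED TO SELECT"
```

If a string does not match the pattern, sub() returns it unchanged, so it may be worth checking with grepl(pattern, body_text, perl = TRUE) before relying on the result.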
