Can't remove node in Nokogiri

I'm having a bit of a strange issue with Nokogiri in Rails. I'm trying to remove a "p" tag with a class of "why". I have the following code, which doesn't work:

def test_grab
  f = File.open("public/test.html")
  @doc = Nokogiri::HTML.parse(f)
  f.close
  @doc = @doc.css("p")
  @doc.each do |p|
    if p["class"] == "why"
      logger.info p.values
      p.remove
    end
  end
end

test.html:

<html>
<head>
    <title>Test</title>
</head>
<body>
    <p>Test data</p>
    <p>More <a href="http://stackoverflow.com">Test Data</a></p>
    <p class="why">Why is this still here?</p>
</body>
</html>

Output html source:

<p>Test data</p>
<p>More <a href="http://stackoverflow.com">Test Data</a></p>
<p class="why">Why is this still here?</p>

I know the Rails code is entering the if block because the logger.info output shows up in the server terminal.

Any ideas?

Answers


Is there any reason you're reusing your @doc instance variable?

When it comes to troubleshooting stuff like this, I find the best idea is to try evaluating the same code without the Rails overhead. For example:

require 'nokogiri'

doc = Nokogiri::HTML(DATA)
doc.css("p").each do |p|
  p.remove if p["class"] == "why"
end

# Print the document itself, not the NodeSet returned by css
puts doc.to_html

__END__
<html>
<head>
    <title>Test</title>
</head>
<body>
    <p>Test data</p>
    <p>More <a href="http://stackoverflow.com">Test Data</a></p>
    <p class="why">Why is this still here?</p>
</body>
</html>

Which outputs:

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
<html>
<head><title>Test</title></head>
<body>
    <p>Test data</p>
    <p>More <a href="http://stackoverflow.com">Test Data</a></p>

</body>
</html>

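Now try doing paragraphs = @doc.css("p") and then paragraphs.each ..., or just omit the assignment entirely as I have above. In your code @doc ends up holding the NodeSet returned by css rather than the document, and removing a node from the document does not drop it from a NodeSet you have already fetched, so the paragraph still shows up when you output @doc.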

