Can't remove node in Nokogiri

I'm having a bit of a strange issue with Nokogiri in Rails. I'm trying to remove a "p" tag with a class of "why". I have the following code, which doesn't work:

def test_grab
  f = File.open("public/test.html")
  @doc = Nokogiri::HTML.parse(f)
  f.close
  @doc = @doc.css("p")
  @doc.each do |p|
    if p["class"] == "why"
      logger.info p.values
      p.remove
    end
  end
end

test.html:

<html>
<head>
    <title>Test</title>
</head>
<body>
    <p>Test data</p>
    <p>More <a href="http://stackoverflow.com">Test Data</a></p>
    <p class="why">Why is this still here?</p>
</body>
</html>

Output html source:

<p>Test data</p>
<p>More <a href="http://stackoverflow.com">Test Data</a></p>
<p class="why">Why is this still here?</p>

I know the Rails code is entering the if block because the logger.info output shows up in the server terminal.

Any ideas?

Answers


Is there any reason you're reusing your @doc instance variable?

When it comes to troubleshooting stuff like this, I find the best idea is to try evaluating the same code without the Rails overhead. For example:

require 'nokogiri'

doc = Nokogiri::HTML(DATA)
doc.css("p").each do |p|
  p.remove if p["class"] == "why"
end

# Print the document after the removal
puts doc.to_html

__END__
<html>
<head>
    <title>Test</title>
</head>
<body>
    <p>Test data</p>
    <p>More <a href="http://stackoverflow.com">Test Data</a></p>
    <p class="why">Why is this still here?</p>
</body>
</html>

Which outputs:

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
<html>
<head><title>Test</title></head>
<body>
    <p>Test data</p>
    <p>More <a href="http://stackoverflow.com">Test Data</a></p>

</body>
</html>

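Now try assigning the node set to its own variable, paragraphs = @doc.css("p"), and iterating with paragraphs.each, or just omit the assignment entirely as I have above. Your original code overwrites @doc with the node set, and p.remove detaches the node from the document but not from the node set, so rendering @doc afterwards still shows all three paragraphs.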

