Get two sections from HTML in Python

I have HTML that looks like this:

    <tr><td align="center" class="listas" colspan="0">
<div id="b8b7c9523733e026bf89de9b3cf8f73811ddb579" style="display: none;">
<table border="0" cellpadding="2" cellspacing="2" width="100%">
<tbody><tr><td align="left" class="lista" width="100%"><table align="left" border="0" cellpadding="0" cellspacing="0" width="90%"><tbody><tr><td style="background-image: url(./style/classicx/line_a.gif); background-repeat: no-repeat; background-position:top right; width=60px; height:4px;" width="60"></td><td style="background-image:url(./style/classicx/line_b.gif); background-repeat: repeat-x; height:4px;" width="75%"></td><td style="background-image: url(./style/classicx/line_c.gif); background-repeat: no-repeat; height:4px;" width="60"></td></tr></tbody></table></td></tr>
<tr><td align="left" class="lista"><table align="left" border="0" cellpadding="0" cellspacing="0" width="70%"><tbody><tr><td align="left" class="lista">
<b>Options: </b></td>
<td align="left" class="lista" title="Download: The Wolverine 2013 Theatrical Cut 1080p Blu-ray AVC DTS-HD MA 7.1-o0o"><table align="left" border="0" cellpadding="0" cellspacing="0" onclick="window.open('download.php?id=b8b7c9523733e026bf89de9b3cf8f73811ddb579&amp;f=The+Wolverine+2013+Theatrical+Cut+1080p+Blu-ray+AVC+DTS-HD+MA+7.1-o0o.torrent','_self')" style="cursor:pointer; cursor:hand;"><tbody><tr><td align="center" style="background-image: url(images/download.gif); background-repeat: no-repeat; width:17px; height:17px;"></td><td> Download</td></tr></tbody></table></td>
<td align="left" class="lista" title="Details for: The Wolverine 2013 Theatrical Cut 1080p Blu-ray AVC DTS-HD MA 7.1-o0o"><table align="left" border="0" cellpadding="0" cellspacing="0" onclick="window.open('details.php?id=b8b7c9523733e026bf89de9b3cf8f73811ddb579&amp;hit=1','_self')" style="cursor:pointer; cursor:hand;"><tbody><tr><td align="center" style="background-image: url(images/torrent_name.gif); background-repeat: no-repeat; width:17px; height:17px;"></td><td> Details</td></tr></tbody></table></td>
<td align="left" class="lista" title="Add to WishList: The Wolverine 2013 Theatrical Cut 1080p Blu-ray AVC DTS-HD MA 7.1-o0o"><table align="left" border="0" cellpadding="0" cellspacing="0" onclick="window.open('wishlist.php?do=add&amp;torrent_id=b8b7c9523733e026bf89de9b3cf8f73811ddb579','_self')" style="cursor:pointer; cursor:hand;"><tbody><tr><td align="center" style="background-image: url(images/wishlist.gif); background-repeat: no-repeat; width:17px; height:17px;"></td><td> Add to WishList</td></tr></tbody></table></td>
<td align="left" class="lista" title="Report: The Wolverine 2013 Theatrical Cut 1080p Blu-ray AVC DTS-HD MA 7.1-o0o"><table align="left" border="0" cellpadding="0" cellspacing="0" onclick="window.open('report.php?torrent=b8b7c9523733e026bf89de9b3cf8f73811ddb579','_self')" style="cursor:pointer; cursor:hand;"><tbody><tr><td align="center" style="background-image: url(images/report.gif); background-repeat: no-repeat; width:16px; height:17px;"></td><td> Report</td></tr></tbody></table></td>
    <td align="left" class="lista" style="white-space:nowrap"><table align="left" border="0" cellpadding="0" cellspacing="0"><tbody><tr><td align="center" style="background-image: url(images/torrent_comments.gif); background-repeat: no-repeat; width:17px; height:17px;"></td><td> Comments (<b><span style="color:#006699">0</span></b>)</td></tr></tbody></table></td>
</tr></tbody></table></td>
</tr><tr><td align="left" class="lista" width="100%"><table align="left" border="0" cellpadding="0" cellspacing="0" width="60%"><tbody><tr><td style="background-image: url(./style/classicx/line_a.gif); background-repeat: no-repeat; background-position:top right; width=60px; height:4px;" width="60"></td><td style="background-image:url(./style/classicx/line_b.gif); background-repeat: repeat-x; height:4px;" width="75%"></td><td style="background-image: url(./style/classicx/line_c.gif); background-repeat: no-repeat; height:4px;" width="60"></td></tr></tbody></table></td></tr>
<tr><td align="left" class="lista"><b>Technical Info:</b></td></tr>
<tr><td align="left" class="lista"><table border="0" cellpadding="0" cellspacing="0" width="60%"><tbody><tr><td></td></tr></tbody></table></td></tr>
<tr><td align="left" class="lista" width="100%"><table align="left" border="0" cellpadding="0" cellspacing="0" width="90%"><tbody><tr><td style="background-image: url(./style/classicx/line_a.gif); background-repeat: no-repeat; background-position:top right; width=60px; height:4px;" width="60"></td><td style="background-image:url(./style/classicx/line_b.gif); background-repeat: repeat-x; height:4px;" width="75%"></td><td style="background-image: url(./style/classicx/line_c.gif); background-repeat: no-repeat; height:4px;" width="60"></td></tr></tbody></table></td></tr>
</tbody></table></div></td></tr>
<tr>
    <td align="center" class="mainblockcontent" max-width="25px"><a href="torrents.php?category=5"><img alt="Movie/1080p/i" border="0" src="images/categories/MOVIES-1080PI.png"/></a></td> <td align="left" class="mainblockcontent"><b><a href="details.php?id=948499d24362fc5d1da0bc83cc0afe2dd2d5bf55" onmouseout="return nd();" onmouseover="if(popup_mode){ overlib(' &lt;img src=\'cache/imdb/images/1430132.jpg\' width=\'200\' ', CAPTION, '');}">The Wolverine 2013 EXTENDED 1080p BluRay DTS-ES x264-PublicHD</a></b>                                        <br/><span style="color: #999999 ">Action, Adventure, Fantasy, Sci-Fi</span>  <span style="color:DarkSlateGray "> <a href="http://www.imdb.com/title/tt1430132/" target="_blank"><u>IMDB: 6.9</u></a></span></td>  <td align="center" class="mainblockcontent" title="Comments"><a href="details.php?id=948499d24362fc5d1da0bc83cc0afe2dd2d5bf55#comments" title="View details: The Wolverine 2013 EXTENDED 1080p BluRay DTS-ES x264-PublicHD">5</a></td>  <td align="center" class="mainblockcontent" title="Download"><a href="download.php?id=948499d24362fc5d1da0bc83cc0afe2dd2d5bf55&amp;f=The+Wolverine+2013+EXTENDED+1080p+BluRay+DTS-ES+x264-PublicHD.torrent"><img alt="torrent" border="0" src="images/download.gif"/></a></td>
    <td align="center" class="mainblockcontent" title="Add to Wishlist"><a href="wishlist.php?do=add&amp;torrent_id=948499d24362fc5d1da0bc83cc0afe2dd2d5bf55"><img alt="torrent" border="0" src="images/add_wishlist_star.png"/></a></td>
    <td align="center" class="mainblockcontent">09:03:59 <bthu, +0200="" 07="" 09:03:59="" 2013="" nov=""><bthu, +0200="" 07="" 09:03:59="" 2013="" nov=""> 07/11/2013</bthu,></bthu,></td>
    <td align="center" class="mainblockcontent">12.53 GB</td>
    <td align="center" class="mainblockcontent">Anonymous</td>
    <td align="center" class="green"><a href="peers.php?id=948499d24362fc5d1da0bc83cc0afe2dd2d5bf55" title="Click here to view peers details"><b>416</b></a></td>
    <td align="center" class="green"><a href="peers.php?id=948499d24362fc5d1da0bc83cc0afe2dd2d5bf55" title="Click here to view peers details"><b>3</b></a></td>
    <td align="center" class="mainblockcontent"><a href="torrent_history.php?id=948499d24362fc5d1da0bc83cc0afe2dd2d5bf55" title="History - The Wolverine 2013 EXTENDED 1080p BluRay DTS-ES x264-PublicHD">1,251</a></td></tr>

If the page displays more than one item, this pair of blocks repeats for each item until the last one. From what I understand, I need something like:

from bs4 import BeautifulSoup

result_table = BeautifulSoup(data, 'html.parser')
entries = result_table.find_all('td', attrs={'align': 'center', 'class': 'listas'})

for result in entries:
    pass  # process each block here

This works, but it only gets the first block. How can I adjust the code so that it also gets the second block?

Answers


Your code for finding the first block is correct, but since the snippet contains only one <td> element with those attributes, a single find is enough:

block1 = result_table.find('td', attrs={'align': 'center', 'class': 'listas'})

To find the second block, start from the first one, look up its <tr> parent, and then take that row's next sibling:

block2 = block1.find_parent('tr').find_next_sibling('tr')

EDIT: to find all items (not tested):

entries = result_table.find_all('td', attrs={'align': 'center', 'class': 'listas'})
for result in entries:
    # the row that follows the row containing this 'listas' cell
    block2 = result.find_parent('tr').find_next_sibling('tr')
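
For a self-contained illustration of the loop above, here is a minimal sketch. It assumes a heavily trimmed, made-up stand-in for the page (`SAMPLE_HTML`), the built-in `html.parser`, and a hypothetical helper `pair_blocks` that collects both blocks for every item instead of overwriting `block2` on each pass. The real page has nested tables, so treat it as a starting point rather than a drop-in solution.

```python
from bs4 import BeautifulSoup

# Hypothetical, trimmed stand-in for the page in the question:
# each "listas" row is followed by a "mainblockcontent" row.
SAMPLE_HTML = """
<table>
<tr><td align="center" class="listas"><div id="aaa">Options for item A</div></td></tr>
<tr><td align="center" class="mainblockcontent">Item A</td></tr>
<tr><td align="center" class="listas"><div id="bbb">Options for item B</div></td></tr>
<tr><td align="center" class="mainblockcontent">Item B</td></tr>
</table>
"""


def pair_blocks(html):
    """Return a list of (first_block, second_block) tag pairs, one per item."""
    soup = BeautifulSoup(html, "html.parser")
    pairs = []
    for first in soup.find_all("td", attrs={"align": "center", "class": "listas"}):
        # Climb to the row holding the first block, then step to the next row.
        row = first.find_parent("tr")
        second = row.find_next_sibling("tr") if row is not None else None
        if second is not None:  # skip a trailing block with no following row
            pairs.append((first, second))
    return pairs


pairs = pair_blocks(SAMPLE_HTML)
# Reduce each pair to something printable: the div id and the row's text.
summary = [(first.div["id"], second.td.get_text(strip=True)) for first, second in pairs]
print(summary)
```

Keeping the pairs in a list means both blocks of every item stay available after the loop. The `None` checks guard against a first block that has no enclosing or following row.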
