Programmatically log on to a forum and then screen-scrape

I'd like to log in to the forums part of a Community Server site (e.g. http://forums.timesnapper.com/login.aspx?ReturnUrl=/forums/default.aspx), then download a specific page and run a regex against it to see whether any posts are waiting for moderation. If there are, I'd like to send an email.

I'd like to do this from a Linux server.

Currently I know how to download a page (e.g. with wget), but I have a problem logging in. Any bright ideas on how that works?

Answers


Looking at the source of the login page, it appears to be an ASP.NET app, so you'd probably need to do a couple of things to achieve this:

Manage the hidden __VIEWSTATE form field and post it back when you submit the login details.

Once you get past that, I'm guessing you can request the specific page in question using an absolute URL, but you'd need to keep the ASP.NET Forms authentication cookie and send it as part of the GET request.
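
As a rough illustration of those two steps, here is a minimal Python sketch using only the standard library. It is untested against the actual forum and rests on several assumptions: the helper names (hidden_fields, build_login_body, fetch_after_login) are made up for this example, and the credential field names you pass in are placeholders that you would have to read from the real login form's HTML (ASP.NET control names are usually long generated strings, and the submit button's name/value pair may need to be posted too). The cookie jar is what carries the Forms authentication cookie from the login POST to the later GET.

```python
import urllib.parse
import urllib.request
from html.parser import HTMLParser
from http.cookiejar import CookieJar


class HiddenFieldParser(HTMLParser):
    """Collect name/value pairs of <input type="hidden"> elements."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attr = dict(attrs)
        if (attr.get("type") or "").lower() == "hidden" and attr.get("name"):
            self.fields[attr["name"]] = attr.get("value") or ""


def hidden_fields(html):
    """Return the hidden form fields (e.g. __VIEWSTATE) found in html."""
    parser = HiddenFieldParser()
    parser.feed(html)
    return parser.fields


def build_login_body(html, credentials):
    """Merge the page's hidden fields with the login fields and URL-encode them.

    `credentials` maps form field names to values; the field names are
    site-specific and must be copied from the real login page.
    """
    form = hidden_fields(html)
    form.update(credentials)
    return urllib.parse.urlencode(form).encode("utf-8")


def fetch_after_login(login_url, page_url, credentials):
    """Log in, then fetch page_url with the same cookie jar (hypothetical helper)."""
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(CookieJar()))
    # 1. GET the login page to obtain the hidden fields (and any session cookie).
    login_html = opener.open(login_url).read().decode("utf-8", "replace")
    # 2. POST the hidden fields back together with the login details.
    opener.open(login_url, data=build_login_body(login_html, credentials))
    # 3. The opener now sends the authentication cookie with this GET.
    return opener.open(page_url).read().decode("utf-8", "replace")
```

You would call fetch_after_login with the login URL, the absolute URL of the moderation page, and a dict of the form's real user name and password field names, then run your regex over the returned HTML. If the login silently fails, compare the POST body with what a browser sends; a missing hidden field or submit button value is a likely culprit.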


You might have better luck with Selenium, or see this question for more suggestions:

http://stackoverflow.com/questions/300788/script-for-college-class-registration

