The WebReaper home page
WebReaper is a web crawler or spider, which can work its way through a website, downloading the pages, pictures and objects that it finds so that they can be viewed locally, without needing to be connected to the internet. The sites can be saved locally as a fully-browsable website which can be viewed with any browser (such as Internet Explorer, Netscape, Opera, etc.), or they can be saved into the Internet Explorer cache and viewed using IE's offline mode as if you'd surfed the sites 'by hand'.
To use WebReaper, simply enter a starting URL, and hit the Go button. The program will then download the page at that URL, parsing the HTML as it goes, looking for links to other pages and objects. It will then extract this list of sub-links and download them. This process continues recursively until either no more links fulfil WebReaper's filter criteria or your hard disk becomes full - whichever happens first!
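The download loop described above can be sketched in a few lines of Python. This is only an illustration of the general idea, not WebReaper's actual code: the names `LinkExtractor`, `extract_links`, `crawl`, `fetch`, `accept` and `max_pages` are all made up for the example, and it walks the site with a queue (breadth-first) rather than true recursion. The `fetch` function is passed in so the sketch can be tried against an in-memory 'site' without touching the network.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin


class LinkExtractor(HTMLParser):
    """Collects href/src attribute values while the HTML is parsed."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.links.append(value)


def extract_links(base_url, html):
    """Return absolute, fragment-free URLs for every link found in html."""
    parser = LinkExtractor()
    parser.feed(html)
    return [urldefrag(urljoin(base_url, link)).url for link in parser.links]


def crawl(start_url, fetch, accept, max_pages=100):
    """Breadth-first crawl starting at start_url.

    fetch(url) returns the page text, or None if it cannot be downloaded.
    accept(url, depth) is the filter: only links it approves are followed.
    Returns a dict mapping each downloaded URL to its content.
    """
    pages = {}
    seen = {start_url}
    queue = deque([(start_url, 0)])
    while queue and len(pages) < max_pages:
        url, depth = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        pages[url] = html
        for link in extract_links(url, html):
            if link not in seen and accept(link, depth + 1):
                seen.add(link)
                queue.append((link, depth + 1))
    return pages


# Hypothetical in-memory website standing in for real HTTP downloads.
site = {
    "http://example.com/": '<a href="a.html">A</a> <a href="http://other.com/x">X</a>',
    "http://example.com/a.html": '<a href="/">home</a> <a href="b.html#top">B</a>',
    "http://example.com/b.html": "the end",
}
pages = crawl(
    "http://example.com/",
    fetch=site.get,
    accept=lambda url, depth: url.startswith("http://example.com/"),
)
```

Here the link to other.com is never followed because the filter rejects it, and the link back to the home page is skipped because it has already been seen.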
The locally saved files will have their HTML links adjusted so that they can be browsed as if they were being read directly from the internet.
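As a rough illustration of what 'adjusting' a link involves, the sketch below maps each URL to a local file path and then works out the relative link between two saved files. The function names (`local_path`, `relative_link`), the host-name-as-top-folder layout and the `index.html` default are all assumptions made for the example; WebReaper's real naming scheme may well differ.

```python
import posixpath
from urllib.parse import urlsplit


def local_path(url):
    """Map a URL to a relative file path, e.g. 'example.com/docs/index.html'.

    Query strings are ignored in this sketch.
    """
    parts = urlsplit(url)
    path = parts.path or "/"
    if path.endswith("/"):
        path += "index.html"  # assumed default name for directory URLs
    return parts.netloc + path


def relative_link(from_url, to_url):
    """Return the link to write into the saved copy of from_url so that it
    points at the saved copy of to_url."""
    from_dir = posixpath.dirname(local_path(from_url))
    return posixpath.relpath(local_path(to_url), start=from_dir)


# A page in /docs/ that refers to an image in /img/.
link = relative_link(
    "http://example.com/docs/page.html",
    "http://example.com/img/logo.png",
)
```

Because the rewritten link is relative, the saved folder can be moved or copied to another machine and the pages will still find each other.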
The download is fully configurable - custom hierarchical filters can be constructed from 12 different filter types to allow targeted downloads. Simple filters can be built using the Filter Wizard, or more complex ones can be hand-crafted.
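To give a feel for what a 'hierarchical' filter means, here is a small Python sketch in which simple filters are combined into a tree with and/or/not operators. Everything in it is hypothetical: the three leaf filters (`max_depth`, `same_server`, `has_extension`) and the combinators (`all_of`, `any_of`, `negate`) are invented for illustration and are not WebReaper's 12 filter types or its file format.

```python
from urllib.parse import urlsplit


# Leaf filters: each returns a function taking (url, depth) and returning a bool.
def max_depth(limit):
    """Accept links no more than `limit` hops from the starting page."""
    return lambda url, depth: depth <= limit


def same_server(start_url):
    """Accept links on the same host as the starting URL."""
    host = urlsplit(start_url).netloc
    return lambda url, depth: urlsplit(url).netloc == host


def has_extension(*extensions):
    """Accept links whose path ends with one of the given extensions."""
    return lambda url, depth: urlsplit(url).path.lower().endswith(extensions)


# Combinators: these build the hierarchy out of other filters.
def all_of(*filters):
    return lambda url, depth: all(f(url, depth) for f in filters)


def any_of(*filters):
    return lambda url, depth: any(f(url, depth) for f in filters)


def negate(f):
    return lambda url, depth: not f(url, depth)


# "Stay on the starting server, go at most 3 levels deep, skip big archives."
accept = all_of(
    same_server("http://example.com/"),
    max_depth(3),
    negate(has_extension(".zip", ".exe")),
)
```

Calling `accept("http://example.com/a.html", 1)` returns True, while the same call with a `.zip` link, a link on another server, or a depth of 4 returns False.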
How Did WebReaper Come About?
I wanted to read Internet Web Page content using my laptop on the way to work. Not wanting to run up massive bills using a mobile phone and GSM modem, I decided the best idea was to download a few entire websites 'en masse' and then use IE4's Offline Mode to view them.
Try as I might I could not find a program to download the web pages in batch mode; all of the existing software I could find was either no good, too complicated, shareware (I don't like dialog boxes whinging at me to register) or 30-day trial software. Being a bit tight-fisted, I decided that instead of paying for any software, it would be easier to write my own.
Features
- Multithreaded downloading
- Explorer-style interface
- Shockwave Flash support - downloads/fixes up SWF movies for local browsing
- User-customisable filters - limit by depth, time per object, total time and 'distance' from starting server, and many others.
- Simple-to-use filter wizard - helps you build complex filters quickly and easily.
- Full Drag & Drop support - drag links to/from Internet Explorer/Netscape.
- Save downloaded files using relative paths to recreate websites stored locally with links adjusted to make them fully browsable.
- 'Resume' mode reads files saved locally to avoid reloading unchanged pages
- Proxy & website authentication, allowing websites with passwords or behind firewalls to be reaped.
- 'URL Profiles' allow depth and filter configurations to be saved with associated URLs for easy re-reaping in future.
- Command-Line execution to run as a batch process, using a task scheduler (not provided).
- Works with GetRight® for large file downloads.
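The list above doesn't say how 'Resume' mode decides that a page is unchanged. One common way for a crawler to do this, shown here purely as an assumption and not as WebReaper's method, is an HTTP conditional request: send the saved file's modification time in an If-Modified-Since header, and keep the local copy if the server answers 304 Not Modified. The function names below are hypothetical.

```python
import os
from email.utils import formatdate


def conditional_headers(local_file):
    """Build request headers asking the server to send the page only if it
    has changed since local_file was saved.

    Returns an empty dict when there is no local copy, so the page is
    downloaded unconditionally.
    """
    try:
        mtime = os.path.getmtime(local_file)
    except OSError:
        return {}
    # HTTP dates are always expressed in GMT.
    return {"If-Modified-Since": formatdate(mtime, usegmt=True)}


def needs_download(status_code):
    """A 304 Not Modified response means the saved copy can be kept."""
    return status_code != 304
```

The headers returned by `conditional_headers` would be added to the request for each page that already exists on disk, so unchanged pages cost only a short round trip instead of a full download.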