Program: HTTrack v3.44-5
I remember that when I was really young and had only just discovered the Internet, I wanted to download the entire World Wide Web to my computer. Laugh all you want, but back then Internet connections were really poor.
Today this is possible. You can download anything you want from the Internet, including websites. The software that I chose for the job is called WinHTTrack, and it is excellent at mirroring websites.
WinHTTrack is a free and open source website copier and offline browser by Xavier Roche, licensed under the GNU General Public License. In short, the software is free to use and develop.
The software can be used for copying websites for offline browsing or backup purposes. The interface does not contain complicated elements designed to transform your work into a living hell. On the contrary, it is very easy to use. The only catch is that it has so many features that you are going to need some time to decide what you want to do.
The application window is as simple as can be. The menu bar is up there where it should be and it is composed of the File, Preferences, Mirror, Log and Window menus.
The File menu 'hides' options like New Project, opening a previous project, saving, deleting and the Browsing sites option. This last feature opens the sites that you have finished downloading to your computer.
In the Preferences menu the user can set default options, and save and load them. This way you can create different website download settings and load whichever set fits the job. This menu is also where you set your language preference; 28 languages are available.
The 'Mirror' menu allows pausing the transfer and modifying the options. In 'Log', the user can view the error log and the file transfers. These two menus are active only once the file transfer has begun.
Let's see how the software works. HTTrack downloads one or more sites from the Internet to a local directory and recursively builds all the directories, saving HTML, image and other files to your computer. The wizard-like interface helps you accomplish your goal step by step. At the beginning of a new project the user has to give the task a name and specify a category and a path for saving the project.
Next you have to define a mirroring mode. You have seven options, among them 'download website(s)', 'download website(s) + questions', 'get separate files' and 'continue interrupted download'. As updating a website is the key to its lifespan, HTTrack is also equipped with the option of updating an existing download.
If your connection is interrupted by some external factor, you can resume the download. In the URL box, enter the web address(es) of the sites you want to browse later.
You do not have to enter all the addresses at once. Start a project and, if you remember another site that you want to save to your disk, stop the current operation, add the address and restart the project. The software will continue the interrupted task and will begin downloading the newly added site.
In the 'Set options' menu of the application the user can configure how the websites are downloaded. You can use a proxy for FTP transfers by manually entering its name and port. Use filters (scan rules) to exclude or include subdirectories or to skip certain file formats.
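To make the idea of scan rules more concrete, here is a minimal Python sketch of include/exclude filtering. It is not HTTrack's own code. The function name `is_allowed`, the example addresses, and the use of shell-style wildcards are my own illustration, and it assumes that rules are checked in order and that the last matching rule wins.

```python
from fnmatch import fnmatchcase


def is_allowed(url, rules, default=True):
    """Return True if `url` passes the scan rules.

    Each rule is '+pattern' (include) or '-pattern' (exclude).
    Assumption for this sketch: rules are checked in order and the
    last rule that matches decides the outcome.
    """
    verdict = default
    for rule in rules:
        sign, pattern = rule[0], rule[1:]
        if sign not in ("+", "-"):
            raise ValueError(f"rule must start with + or -: {rule!r}")
        if fnmatchcase(url, pattern):
            verdict = (sign == "+")
    return verdict


# Hypothetical rule set: skip archives and a private folder,
# but still keep one page from that folder.
rules = [
    "-*.zip",
    "-www.example.com/private/*",
    "+www.example.com/private/readme.html",
]
```

With these rules, `is_allowed("www.example.com/files/a.zip", rules)` is rejected, while the readme page inside the private folder is still accepted because its include rule comes last.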
The software lets you limit how deep the engine should follow links. By default, the depth is infinite, but because you have specified a particular site, it will not mirror the entire web. You can limit the number of bytes that can be downloaded for a site or set a maximum transfer rate.
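Why an infinite depth does not end up mirroring the whole web can be shown with a small, self-contained Python sketch. It is only an illustration of the principle, not how HTTrack is implemented: the `PAGES` dictionary is a made-up in-memory "web", `mirror` is a hypothetical name, and the depth counting here (0 means only the start page) is my own convention and may not match HTTrack's.

```python
import re
from collections import deque
from urllib.parse import urljoin, urlparse

# Hypothetical in-memory "web" so the sketch runs without a network.
PAGES = {
    "http://site.example/": '<a href="/a.html">A</a> <a href="http://other.example/">X</a>',
    "http://site.example/a.html": '<a href="b.html">B</a>',
    "http://site.example/b.html": '<a href="/">home</a>',
    "http://other.example/": '<a href="http://site.example/">back</a>',
}

HREF = re.compile(r'href="([^"]+)"')


def mirror(start, max_depth=None):
    """Return the URLs visited breadth-first from `start`, staying on its host.

    max_depth=None means no depth limit; the crawl still terminates because
    links to other hosts are ignored and no page is visited twice.
    """
    host = urlparse(start).netloc
    seen = {start}
    order = []
    queue = deque([(start, 0)])
    while queue:
        url, depth = queue.popleft()
        order.append(url)
        if max_depth is not None and depth >= max_depth:
            continue  # depth limit reached: do not follow this page's links
        for link in HREF.findall(PAGES.get(url, "")):
            target = urljoin(url, link)  # resolve relative links
            if urlparse(target).netloc == host and target not in seen:
                seen.add(target)
                queue.append((target, depth + 1))
    return order
```

Calling `mirror("http://site.example/")` visits the three pages of `site.example` and never touches `other.example`, while `mirror("http://site.example/", max_depth=1)` stops after the start page and the pages it links to directly.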
The user can define the number of simultaneous connections initiated by the engine or set a number of retries in case the server is not responding. The application can automatically test the validity of the external links on a website. You can choose whether or not to accept cookies generated by the remote server. If you do not accept cookies, some "session-generated" pages will not be retrieved.
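For readers who prefer the command line, the same settings can be passed to the `httrack` command-line program. The Python sketch below only builds the argument list and does not run anything. The helper name `build_httrack_args` and the example values are hypothetical, and the flags (`-O`, `-r`, `-c`, `-R`, `-A`, `-b`) are given as I understand them from the command-line help, so check `httrack --help` on your version before relying on them.

```python
def build_httrack_args(url, out_dir, depth=None, connections=None,
                       retries=None, max_rate=None, accept_cookies=True,
                       filters=()):
    """Build an argument list for the httrack command-line tool.

    Flag meanings are assumptions taken from the tool's help text:
      -O path  output directory       -rN  mirror depth
      -cN      simultaneous sockets   -RN  number of retries
      -AN      max rate in bytes/sec  -bN  cookies (1 accept, 0 refuse)
    """
    args = ["httrack", url, "-O", out_dir]
    if depth is not None:
        args.append(f"-r{depth}")
    if connections is not None:
        args.append(f"-c{connections}")
    if retries is not None:
        args.append(f"-R{retries}")
    if max_rate is not None:
        args.append(f"-A{max_rate}")
    args.append("-b1" if accept_cookies else "-b0")
    args.extend(filters)  # scan rules such as "-*.zip" go last
    return args


# Hypothetical example values.
example = build_httrack_args(
    "http://www.example.com/", "/tmp/mirror",
    depth=3, connections=4, retries=2, max_rate=25000,
    accept_cookies=False, filters=["-*.zip"],
)
```

The resulting list could be handed to `subprocess.run(example)` once you have confirmed the flags. Keeping the arguments as a list rather than one string avoids shell quoting problems with the wildcard filters.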
A piece of advice: if you are not sure what to do in the 'Experts only' tab, leave it as it is. Here the advanced user can enable or disable the 'use a cache for updates' option (for later updating of the site). For primary filters, the options are HTML and/or Non-HTML. The 'Travel mode' sets the default spidering direction. 'Activate debug mode' enables some extra debug information, like header debugging and some interface information (for debugging purposes only).
After making all the settings, you have to choose whether or not you want your machine turned off once the download finishes. If you want to postpone the download, you can set the time at which it should begin.
With all that said, click 'Next' and HTTrack will begin its activity, either immediately or at the scheduled time.
The Good
The software has lots of features that can help you in your task. The wizard-like interface is supported by a help menu that opens the page for the window you are in.
The fact that it is free makes it really attractive. You do not have to worry about anything. I really liked that the documentation file has a section on 'How NOT to use it'.
The Bad
When trying to end a task, the software obeyed my orders only the second time. The 'Set options' menu is pretty tricky and it is not for everybody.
The Truth
I was impressed by the stability and the range of options the software comes equipped with. I began using it when my ISP called and announced that I would have no Internet connection for the next three days. I was in real need of information from one particular website. So I downloaded it.
I recommend paying full attention to the recommendations in the Help menu so that you don't get into trouble.
My advice for this software is: Use it, don't abuse it!