I need a script written to scrape a website from archive.org. The script will remove all [url removed, login to view] tags/ads in the code and download all files into the same folders and subfolders as the original.
The downloaded website should be as complete as it is on [url removed, login to view] and ready to upload without further code modification.
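As a rough sketch of the clean-up step, the snippet below strips the archive's injected toolbar and unwraps rewritten links. It assumes the snapshots come from the Wayback Machine (web.archive.org), that the toolbar sits between the `BEGIN/END WAYBACK TOOLBAR INSERT` comments, and that archived links have the form `/web/<timestamp>/<original URL>`. The name `clean_html` and the example.com URLs are placeholders, and other injected scripts are not covered here.

```python
import re

# Assumption: the injected toolbar is wrapped in these HTML comment markers.
TOOLBAR_RE = re.compile(
    r"<!-- BEGIN WAYBACK TOOLBAR INSERT -->.*?<!-- END WAYBACK TOOLBAR INSERT -->",
    re.DOTALL,
)

# Assumption: archived links look like
#   [https://web.archive.org]/web/<timestamp>[<modifier>_]/<original URL>
# where the modifier is a two-letter code such as "im" for images.
WAYBACK_URL_RE = re.compile(
    r"(?:https?:)?(?://web\.archive\.org)?/web/\d{1,14}(?:[a-z]{2}_)?/"
    r"(https?://[^\s\"'<>]+)"
)


def clean_html(html: str) -> str:
    """Remove the toolbar block and restore the original URLs (hypothetical helper)."""
    html = TOOLBAR_RE.sub("", html)
    return WAYBACK_URL_RE.sub(r"\1", html)
```

Regular expressions keep the sketch short; a production script would probably want a real HTML parser for pages with unusual markup.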
I provide a URL like [url removed, login to view]://[url removed, login to view] to the script, and it will get ALL content on the page (including sub-pages).
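To reach the sub-pages, the script would need to collect the links on each page and follow the ones that belong to the same site. A minimal sketch using only the Python standard library is below; `LinkCollector` and `internal_links` are hypothetical names, and treating "same host" as "same site" is an assumption.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlsplit


class LinkCollector(HTMLParser):
    """Collect absolute URLs from href and src attributes."""

    def __init__(self, base_url: str):
        super().__init__()
        self.base_url = base_url
        self.links = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                # Resolve relative links and drop "#fragment" parts.
                absolute, _fragment = urldefrag(urljoin(self.base_url, value))
                self.links.add(absolute)


def internal_links(html: str, base_url: str) -> list:
    """Return the sorted links that share the host of base_url."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    host = urlsplit(base_url).netloc
    return sorted(u for u in collector.links if urlsplit(u).netloc == host)
```

A crawler would call `internal_links` on every downloaded page and keep a set of visited URLs so that no page is fetched twice.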
The URL structure of the site must not change.
I need a simple web interface where I enter the starting [url removed, login to view] URL.
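The web interface could be as small as a single form. The sketch below uses Python's built-in `http.server` as one possible approach; the field name `start_url`, the handler name `RecoveryHandler`, and port 8000 are assumptions, and the POST handler only echoes the URL where a real tool would start the crawl.

```python
import html
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs

FORM = """<form method="post">
<input name="start_url" size="80" placeholder="Archived start URL">
<button>Recover site</button>
</form>"""


def parse_start_url(body: bytes) -> str:
    """Extract the start_url field from a URL-encoded form body."""
    fields = parse_qs(body.decode("utf-8"))
    return fields.get("start_url", [""])[0].strip()


class RecoveryHandler(BaseHTTPRequestHandler):
    def _send(self, markup: str) -> None:
        data = markup.encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)

    def do_GET(self):
        self._send(FORM)

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        start_url = parse_start_url(self.rfile.read(length))
        # Placeholder: a real tool would queue the site recovery here.
        self._send(f"<p>Queued: {html.escape(start_url)}</p>" + FORM)


def serve(port: int = 8000) -> None:
    """Run the form on localhost until interrupted."""
    HTTPServer(("127.0.0.1", port), RecoveryHandler).serve_forever()
```

Calling `serve()` and opening `http://127.0.0.1:8000/` in a browser would show the form.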
Each site recovery should contain all pages in HTML format.
All images that the site was using should be downloaded.
The URL structure of the site should be exactly as it was on the original site, including links to images and internal and outbound links.
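One way to keep the URL structure intact is to derive each local file path directly from the original URL path. The sketch below is an illustration only: `local_path` and the `site` output folder are hypothetical, and storing directory URLs as `index.html` is an assumption about how the recovered site will be served.

```python
from pathlib import PurePosixPath
from urllib.parse import unquote, urlsplit


def local_path(url: str, root: str = "site") -> str:
    """Map an original URL to a file path under root, mirroring its folders."""
    path = unquote(urlsplit(url).path) or "/"
    if ".." in PurePosixPath(path).parts:
        raise ValueError(f"refusing path that escapes the output folder: {url}")
    if path.endswith("/"):
        # Assumption: a directory URL is saved as its index.html.
        path += "index.html"
    return str(PurePosixPath(root, path.lstrip("/")))
```

For example, `local_path("http://example.com/blog/2014/post.html")` gives `site/blog/2014/post.html`, so the `site` folder can be uploaded as the document root.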
Files that pass variables (for example, URLs ending with ?dvar=variable) should also be saved as in the original.
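For URLs with variables, one possible reading of "saved as original" is to keep the query string as part of the saved file name. The sketch below assumes that reading and assumes the target filesystem allows `?` in file names (Windows does not); `filename_with_query` is a hypothetical helper.

```python
from urllib.parse import urlsplit


def filename_with_query(url: str) -> str:
    """Return the last path segment of url, with any query string kept."""
    parts = urlsplit(url)
    # Assumption: a URL ending in "/" is stored as index.html.
    name = parts.path.rsplit("/", 1)[-1] or "index.html"
    return f"{name}?{parts.query}" if parts.query else name
```

For example, `filename_with_query("http://example.com/shop/item.php?dvar=variable")` gives `item.php?dvar=variable`. A static web server would still need a rewrite rule to serve such a file for the matching request.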