Finding missing files

I hope this is not off topic. I'm not a web guy and I haven't been able to form a Google query that got me an answer.

I am moving a *very* messy web server to a new ISP. There are at least three completely different designs mixed together, and no one knows which files are used and which are not. The plan is to use this as an opportunity to clean house.

Complicating matters, the current website is outsourced and runs on a different server, but it still references files that are on the old server. www.my.domain is outsourced and generates HTML pages, but all the media files (audio, video and the occasional image) are on media.my.domain. The current media.my.domain is the old web server, with all manner of crud on it. There is a pretty new media.my.domain at Liquidweb, and I have moved the files I know are used onto it.

I would like to use my hosts file to point my PC at the new (Liquidweb) server, then crawl www.my.domain and see which media files are not found. In a perfect world, a spider on my PC would crawl www.my.domain and tell me which requests generated errors. I could also get the errors from the server's log files. In that perfect world, the spider would not save the downloads.
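To make the idea concrete, here is a rough Python sketch of that kind of spider, using only the standard library. It is an illustration under assumptions, not a finished tool: the `http://` scheme, the function names, and the 500-page limit are all made up for the example. It only finds links that appear as `href` or `src` attributes in the HTML, so media referenced from JavaScript or player configs would be missed. It assumes the hosts file override is already in place, so that media.my.domain resolves to the new server.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.error import HTTPError, URLError
from urllib.parse import urldefrag, urljoin, urlparse
from urllib.request import Request, urlopen

PAGE_HOST = "www.my.domain"     # host whose pages are crawled
MEDIA_HOST = "media.my.domain"  # host whose files are only checked


class LinkExtractor(HTMLParser):
    """Collect the values of href and src attributes from any tag."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.links.append(value)


def extract_links(html, base_url):
    """Return absolute URLs (fragments removed) referenced by a page."""
    parser = LinkExtractor()
    parser.feed(html)
    return [urldefrag(urljoin(base_url, link)).url for link in parser.links]


def fetch_page(url):
    """Download a page and return its text, or None on any error."""
    try:
        with urlopen(url, timeout=30) as resp:
            return resp.read().decode("utf-8", errors="replace")
    except (HTTPError, URLError, OSError):
        return None


def check_media(url):
    """Request the file with HEAD; the body is never read or saved."""
    try:
        with urlopen(Request(url, method="HEAD"), timeout=30) as resp:
            return resp.status
    except HTTPError as err:
        return err.code          # e.g. 404 for a missing file
    except (URLError, OSError) as err:
        return str(err)          # DNS failure, refused connection, ...


def find_missing(start_url, fetch=fetch_page, check=check_media, max_pages=500):
    """Crawl PAGE_HOST; return [(media_url, status, referring_page)]."""
    queue = deque([start_url])
    seen_pages = {start_url}
    checked = set()
    missing = []
    pages_done = 0
    while queue and pages_done < max_pages:
        page = queue.popleft()
        pages_done += 1
        html = fetch(page)
        if html is None:
            continue
        for url in extract_links(html, page):
            host = urlparse(url).hostname
            if host == MEDIA_HOST and url not in checked:
                checked.add(url)
                status = check(url)
                if status != 200:
                    missing.append((url, status, page))
            elif host == PAGE_HOST and url not in seen_pages:
                seen_pages.add(url)
                queue.append(url)
    return missing
```

Calling `find_missing("http://www.my.domain/")` and printing each returned tuple would list every media URL that did not answer 200, together with the page that referenced it. Nothing is written to disk.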

Is there any software that will do this? Several years ago I used a couple of programs that would suck down an entire site, but I don't remember what they were.

I would prefer a Windows application, but I can do Linux too. I could borrow a Mac if I need to.

TIA,

Dan