Greg Hewgill ([info]ghewgill) wrote,
@ 2005-01-24 21:46:00
Entry tags:web

search engine spider gone awry
So I happened to look at the log file for one of my web servers today, and noticed lots of requests for the Apache manual, which is installed and served by the default configuration. The requests were all coming from "msnbot/0.3", a search engine spider for MSN. I changed the Apache config to remove the manual, and it seems that the requests are coming in more frequently now that they return 404. I wonder how much time it will take before msnbot gives up.
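
For anyone who wants to check their own logs for the same pattern, here is a minimal sketch in Python that tallies a spider's requests by HTTP status. It assumes Apache's "combined" log format and that the manual lives under /manual/. The function name, the sample log lines, and the IP addresses are made up for illustration and are not taken from the actual server.

```python
import re
from collections import Counter

# Assumes Apache's "combined" log format; adjust the pattern for other formats.
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def tally(lines, agent_substring="msnbot", path_prefix="/manual/"):
    """Count requests from a given user agent under a path, keyed by status code."""
    counts = Counter()
    for line in lines:
        m = LOG_LINE.match(line)
        if not m:
            continue  # skip lines that do not look like combined-format entries
        if agent_substring in m.group("agent") and m.group("path").startswith(path_prefix):
            counts[m.group("status")] += 1
    return counts


# Hypothetical sample lines, invented for this example.
SAMPLE = [
    '192.0.2.1 - - [24/Jan/2005:21:00:00 +0000] "GET /manual/index.html HTTP/1.0" 200 1234 "-" "msnbot/0.3"',
    '192.0.2.1 - - [24/Jan/2005:21:40:00 +0000] "GET /manual/mod/core.html HTTP/1.0" 404 210 "-" "msnbot/0.3"',
    '198.51.100.7 - - [24/Jan/2005:21:41:00 +0000] "GET /index.html HTTP/1.1" 200 500 "-" "Mozilla/5.0"',
]

if __name__ == "__main__":
    print(dict(tally(SAMPLE)))
```

Running the same function over the real access log before and after removing the manual would show whether the 404 responses are actually arriving more often, or whether it only looks that way.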



(3 comments)


[info]thomasj
2005-01-24 09:18 pm UTC (link)
I wonder how many other Apache operators that happens to.



[info]pasketti
2005-01-24 09:56 pm UTC (link)
Doesn't the default Apache root web page point to the documentation?

If it's trying to spider a broken link, it could take a while to figure it out. And Apache 2 went all i18n, and has a couple dozen versions of the default page for different languages, so maybe it's trying them all...



[info]ghewgill
2005-01-24 10:23 pm UTC (link)
Yes, the default page is how the search bot found the manual. And yes, this is Apache 2, so there are about 10 megs of documents in there.


