This is just a reflection on the earlier name resolution incident. I 
find it remarkable how much goes into solving a problem and, as a 
corollary, how much impact a simple problem can have. Just my braindump 
as a relatively novice sysadmin.

Here's the chain of events:
- This morning at 9am, our web server chokes. I see apache is using up 
- After poking around the various daemons and looking at logs, I figure 
out that everything is running correctly
- I somehow narrow it down to the script that pings the OCLC chat 
availability service: it was waiting 20+ seconds before finally timing 
out, *despite* the fact that I thought it was set up with a 2-second 
timeout (I don't remember how I got it down to that)
- I shut that down temporarily and disabled our chat function, which got 
the server back to normal.
- I browsed the service manually, which worked, and tried two different 
techniques in the PHP (file_get_contents() and curl), both of which failed.
- I went to Brooklyn to do some vigilante digitization and have lunch 
with my boss
- I got back to the office, saw nothing had changed, and started digging 
deeper into the curl request
- I found the name resolution error, which blew my mind
- I tried resolving multiple ways, and failing that, came here
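
The lesson in that chain (a "2-second timeout" that does not cover name 
resolution) can be sketched roughly as below. This is only an 
illustration in Python, not the PHP script that was actually running; 
the names chat_available and example_check and the URL are made up for 
the example. The idea is to put a hard wall-clock deadline around the 
whole check, DNS lookup included, and to treat any failure as "chat 
unavailable" so the page can render without waiting.

```python
import concurrent.futures
import urllib.request

# Shared worker pool. A stuck lookup stays in a worker thread until the
# OS resolver gives up, but the caller is released at the deadline.
_pool = concurrent.futures.ThreadPoolExecutor(max_workers=4)


def chat_available(check, deadline_seconds=2.0):
    """Run check() under a hard deadline; return False on any failure."""
    future = _pool.submit(check)
    try:
        return bool(future.result(timeout=deadline_seconds))
    except concurrent.futures.TimeoutError:
        # Deadline passed (slow DNS, slow connect, slow response).
        return False
    except Exception:
        # Resolution errors, refused connections, HTTP errors, etc.
        return False


def example_check():
    # Hypothetical URL, standing in for the vendor's availability service.
    # The timeout here bounds socket operations; as far as I know it does
    # not bound the blocking name lookup, hence the outer deadline.
    url = "https://chat.example.org/availability"
    with urllib.request.urlopen(url, timeout=2) as resp:
        return resp.status == 200
```

A page would then call chat_available(example_check) and simply hide the 
chat widget when it returns False. The trade-off is that a hung lookup 
still occupies a worker thread in the background; it just no longer 
holds up the request that asked.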

Thanks to all who contributed ideas... amazing how one change to a 
vendor DNS server can lead to our web server DoS'ing itself. More 
networking knowledge... must get more networking knowledge...

Yitzchak Schaffer
Systems Manager
Touro College Libraries
212.742.8770 ext. 2432
