On Thu, 25 Mar 2010, Yitzchak Schaffer wrote:

> On 3/24/2010 17:43, Joe Hourcle wrote:
>> I know there's a lot of stuff written in it, but *please* don't
>> recommend PHP to beginners.
>>
>> Yes, you can get a lot of stuff done with it, but I've had way too many
>> incidents where newbie coders didn't check their inputs, and we've had
>> to clean up after them.
>
> Another way of looking at this: part of learning a language is learning its
> vulnerabilities and how to deal with them. And how to avoid security holes
> in web code in general.

Unfortunately, it's not all web code. Part of the issue is in selecting the correct tool for the job.

Case in point -- I've been working for the last year to integrate a new data system into our federation. The system officially hasn't gone live yet, and since the institution building it had replaced their full-time DBA with a contractor, the contractor decided he was going to replace all of the work that the DBA had already done to enable external sites to subscribe to collections within the system.

Unfortunately, he did the entire thing in shell, and he's passing around SQL scripts, applying them to the database without any validation, and he's hard-coded assumptions about how directories are laid out and where the script has permissions to write.

Needless to say, when you get someone reading stuff from config files with *no* taint checking and *no* escaping or even quoting of arguments passed to other commands, I have to clean it up. I even try passing my changes back upstream, but I'm told that the contractor has to make the changes (and he then picks and chooses which security changes he's going to make ... then decides to wrap each 'rm' and a dozen other commands in functions (so I can override what command's being called?)), and I now have a shell script that's over 1000 lines. (okay, that's not fair ...
his version is only 968 lines, it only gets over 1000 when I try to add my corrections to it, and it's only 702 lines when you strip out comments and blank lines)

Now, much of it's just plain bad programming -- I mean, would you test to see if variables were set BEFORE loading the config file? Would you run through a series of functions where each one required the previous one to complete, without actually testing to see if any of them worked? (and well, one of those functions was the one that removed a tarball that took an hour to generate at the server, and the next one reported back the 'success' to the server, so I couldn't get the server to run it again without getting someone to correct things manually)

... I probably wouldn't be so hot on the topic, if it hadn't occupied the better part of the last month of my life, and all of this last week. (well, it seems that scp'ing a file for the subscription manager service to process, and having it create a tarball response with the contents for your database, doesn't work too well when the service isn't actually running ... but the way it's written you have *no* idea what the status of the server is).

... sorry, I just needed to vent.

Anyway, part of what makes a good programmer is knowing the correct tools to use. (and unfortunately, by definition, any newbie isn't going to have enough languages in their toolbox to be able to make a good selection).

Yes, we always have to deal with determining the 'best' language based on what we know, who's going to maintain it, etc, so we sometimes have to go with sub-optimal choices. But much of it's trying to identify what's going to go wrong with what we build, and trying to make sure that it doesn't break in spectacularly bad ways.[1]

I guess most people don't have the men with guns show up and take their servers for forensic analysis when some types of things go wrong, which makes me a little more paranoid in my error handling.
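
For what it's worth, here is a rough sketch of the habits I'd have wanted to see: load the config first and only then check that values are set and sane, pass arguments as a list so no shell ever parses them, and stop at the first step that fails. It's in Python rather than shell, and every name in it (load_config, require_name, unpack_command, unpack, run_steps, the KEY=VALUE config layout, the allowed character set) is made up for illustration; none of it is taken from the actual system.

```python
import re
import subprocess
from pathlib import Path

# Assumption: legitimate names only ever use this conservative character set.
SAFE_NAME = re.compile(r"[A-Za-z0-9][A-Za-z0-9_.-]*")


def load_config(path):
    """Parse KEY=VALUE lines, skipping blank lines and '#' comments."""
    config = {}
    for raw in Path(path).read_text().splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"malformed config line: {raw!r}")
        config[key.strip()] = value.strip()
    return config


def require_name(config, key):
    """Return config[key] only if it is set and looks like a plain name."""
    value = config.get(key)
    if value is None:
        raise KeyError(f"{key} is not set")
    if not SAFE_NAME.fullmatch(value):
        raise ValueError(f"{key} has an unsafe value: {value!r}")
    return value


def unpack_command(tarball, dest):
    """Build an argument list; the values are never parsed by a shell."""
    return ["tar", "-xzf", tarball, "-C", dest]


def unpack(tarball, dest):
    """Run tar; check=True raises CalledProcessError on a non-zero exit."""
    subprocess.run(unpack_command(tarball, dest), check=True)
    return True


def run_steps(steps):
    """Run (name, callable) pairs in order, stopping at the first failure."""
    done = []
    for name, step in steps:
        if not step():
            raise RuntimeError(f"{name} failed; completed so far: {done}")
        done.append(name)
    return done
```

With something like run_steps, the step that deletes the hour-long tarball would go last, after the load and the report have both succeeded, so a failure partway through leaves you something to retry with.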
But if you put it out there on the internet, someone, sooner or later, will attempt to abuse it. It could be link spam on blogs, or usurping a guest book program to send spam, or even people claiming that compression artifacts in your data are UFOs[2], resulting in a DDoS of your servers. The bad ones are where they find a way to modify your database, add something to your filesystem, or get a shell on your system.

-Joe

[1] http://xkcd.com/327/
[2] http://www.google.com/search?q=disclosure+nasa+sun+2010