On Fri, 5 Jun 2009, Kenneth R. Irwin wrote:

> Hi folks,
>
> Can someone point me to some good information/how-to-guide/etc for 
> sanitizing files uploaded to a MySQL database through a web interface? 
> (This would be something much like the "Insert data from a textfile into 
> table" function in phpMyAdmin.) I want to make sure there aren't any 
> nasty queries inserted into the tab-delimited data.

Write it out to disk, and then use the 'LOAD DATA LOCAL INFILE' command, 
so you don't have to worry about escaping the values:

 	http://dev.mysql.com/doc/refman/5.1/en/load-data.html

You'll only run into problems if you're generating SQL commands as 
strings, and then sending those.  If you're using prepared statements, 
you'll never need to worry about bad characters in values.  If you're 
generating strings that have field or table names in them, check them 
against a list of known good values, or at least a strict pattern such 
as /\A[a-zA-Z0-9_]+\Z/, and reject any that aren't compliant.
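
A minimal sketch of both points, written in Python with the standard 
sqlite3 module standing in for MySQL so that it runs without a server 
(that substitution, the "books" table, its columns, and the insert_rows 
helper are all assumptions for illustration).  Values go through "?" 
placeholders; the table name can't be a placeholder, so it is checked 
against the pattern before being put into the string:

```python
import re
import sqlite3

# Letters, digits and underscore only; anything else is rejected.
IDENTIFIER = re.compile(r"\A[a-zA-Z0-9_]+\Z")


def insert_rows(conn, table, rows):
    """Insert (name, note) tuples into `table` (hypothetical schema)."""
    # Identifiers cannot be bound as parameters, so validate first.
    if not IDENTIFIER.match(table):
        raise ValueError("bad table name: %r" % (table,))
    sql = "INSERT INTO {} (name, note) VALUES (?, ?)".format(table)
    # The values are bound by the driver, never spliced into the SQL text.
    conn.executemany(sql, rows)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE books (name TEXT, note TEXT)")

# A hostile-looking value is stored as plain data, not run as SQL.
insert_rows(conn, "books", [("x'); DROP TABLE books; --", "hostile value")])
stored = conn.execute("SELECT name FROM books").fetchall()
```

With PHP and MySQL the same shape should apply through whichever 
prepared-statement interface you're using; only the placeholder syntax 
and the calls differ.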


> Is this whole-file sanitization any different than the sort of thing you 
> might use for individual pieces of data? E.g. 
> http://www.denhamcoote.com/php-howto-sanitize-database-inputs

Okay -- people trying XSS attacks and/or inserting JavaScript can be an 
issue ... but the suggestions about escaping characters are useless -- 
use prepared statements with placeholders.  As you're using MySQL and 
PHP, see:

 	http://dev.mysql.com/tech-resources/articles/guide-to-php-security-ch3.pdf

To deal with maliciously inserted HTML, I handle it on output, even 
though that may be slower -- as there may be multiple ways for data to 
get in, I sanitize the strings before emitting them.  (And I may use 
different sanitizing depending on how it's being emitted.)
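
As a rough sketch of sanitizing on output, with a different escape per 
destination (Python standard library only; the two helper names are 
made up for illustration, and a real application would likely have more 
output contexts than these two):

```python
import html
from urllib.parse import quote


def emit_html_text(value):
    """Escape a stored string for use in HTML text or attribute values."""
    return html.escape(value, quote=True)


def emit_url_param(value):
    """Percent-encode a stored string for use as a URL query value."""
    return quote(value, safe="")


# The raw string stays untouched in the database; escaping happens here.
stored = '<script>alert("hi")</script>'
as_html = emit_html_text(stored)
as_url = emit_url_param("a b&c=<d>")
```

The point of doing it this way is that the same stored value gets 
whichever escaping matches where it is going, no matter how it got into 
the database.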

And don't use the regexes from the page you linked to -- because of the 
order in which they strip out the tags, they're going to screw up.  
(They'll never match style tags, as those were removed in the step 
before; also, they need to remove SGML comments before removing any 
other tags, but their regex for SGML comments is flawed.)

-----
Joe Hourcle