> 1.       Is there a general consensus on what the best long-term
> alternative to the mysql_* functions is?

It's my impression that PDO is basically the way to go.

> 2.       Does anyone have advice about how to proceed with an enormous
> overhaul like this?

Define the scope of your project clearly at the beginning.  If all you 
want to do is replace mysql_* with PDO functions, then do that and don't 
let yourself get sidetracked with other bits of refactoring.  Make a 
note of other things that need revisiting, and visit them AFTER you've 
finished with your database functions.

On the other hand, you might decide that you want to do more than just a 
1:1 function replacement.  For example, you might want to write a small 
library of database functions that wrap around PDO (or whatever), and 
then refactor the rest of your code to use THOSE, so that the next time 
this happens you can just rewrite your library of database functions and 
everything else should keep working untouched.  This is getting into the 
territory of writing your own database abstraction layer, though, which 
can be involved.
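
As a rough sketch of what such a small library might look like, here is 
one possible shape.  The function names lib_connect(), lib_select() and 
lib_execute() are made up for illustration, and the demo uses an 
in-memory SQLite database purely so it can run without a MySQL server 
(it assumes PHP 7.1 or later with the pdo_sqlite extension).  A real 
version would use a mysql: DSN and your own credentials.

```php
<?php
// Hypothetical wrapper functions; the names are illustrative only.

// Open a connection and make PDO throw exceptions on errors.
function lib_connect(string $dsn, ?string $user = null, ?string $pass = null): PDO
{
    $db = new PDO($dsn, $user, $pass);
    $db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
    return $db;
}

// Run a SELECT with bound parameters; return all rows as associative arrays.
function lib_select(PDO $db, string $sql, array $params = []): array
{
    $query = $db->prepare($sql);
    $query->execute($params);
    return $query->fetchAll(PDO::FETCH_ASSOC);
}

// Run an INSERT/UPDATE/DELETE; return the number of affected rows.
function lib_execute(PDO $db, string $sql, array $params = []): int
{
    $query = $db->prepare($sql);
    $query->execute($params);
    return $query->rowCount();
}

// Demo against an in-memory SQLite database (an assumption for this sketch).
$DB = lib_connect('sqlite::memory:');
$DB->exec('CREATE TABLE students (id INTEGER PRIMARY KEY, lname TEXT)');
lib_execute($DB, 'INSERT INTO students (lname) VALUES (:name)', [':name' => "O'Grady"]);
$rows = lib_select($DB, 'SELECT * FROM students WHERE lname = :name', [':name' => "O'Grady"]);
```

The rest of the application would then call only the lib_* functions, so 
a future change of database extension stays confined to this one file.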

Regardless, sit down and plan out what you're going to do in detail.  
The more time spent on planning, the less likely it is that you'll 
overlook something and wind up having to backtrack and redo stuff you 
just did.

Besides that, may I suggest using something like grep to generate a list 
of places you need to visit?  Something like issuing the following 
command at the top of your working directory:

     grep -rn --include='*.php' 'mysql_' . > ~/mysql-functions

... will walk through the entire directory tree, examine the contents of 
every .php file, and write every line containing "mysql_" (with its file 
name and line number) to a file in your home directory, which is very 
helpful for not missing things.
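
If you'd rather have a summary than a raw list of lines, a short PHP 
script can tally how often each mysql_* function is called, which gives 
a feel for the size of the job.  This is only a sketch: the function 
name count_mysql_calls() is my own invention, and the pattern will also 
count calls that appear inside comments or strings.

```php
<?php
// Hypothetical helper: count mysql_* calls per function name under $root.
function count_mysql_calls(string $root): array
{
    $counts = [];
    $files = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($root, FilesystemIterator::SKIP_DOTS)
    );
    foreach ($files as $file) {
        // Only look at regular files ending in .php.
        if (!$file->isFile() || $file->getExtension() !== 'php') {
            continue;
        }
        // Match names such as mysql_query when followed by "(".
        $source = file_get_contents($file->getPathname());
        preg_match_all('/\bmysql_\w+(?=\s*\()/', $source, $m);
        foreach ($m[0] as $name) {
            $counts[$name] = ($counts[$name] ?? 0) + 1;
        }
    }
    arsort($counts); // most frequently used functions first
    return $counts;
}
```

Calling print_r(count_mysql_calls('.')) from the top of the working 
directory would print the tally.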

> 3.       I wonder what other broad-sweeping old-fashionednesses may
> also be about to rear up and bite me. If you imagine that I learned
> procedural (almost never object-oriented) PHP 4 in about 2000 and am
> slow to change my ways, can you predict what sort of deprecated
> foolishness I might still be perpetrating?

I learned about the same time, but I switched to PDO some time ago.  Are 
you familiar with using bound parameters?  If not, you should get 
familiar.  They make it a lot harder for potential attackers to inject 
hostile SQL into your code.

On the other hand, they also make it somewhat more difficult to debug 
your SQL.  When you're not using bound parameters, you might have some 
code like this:

$last_name = mysql_real_escape_string($last_name);

$SQL = "SELECT * FROM students WHERE lname = '".$last_name."'";

$result = mysql_query($SQL, $DB);

This is fairly straightforward.  You build the SQL query, and you send 
it to the database, then read back the result.  If something goes wrong, 
you have the exact SQL as it was executed against the database.  In the 
event that you're not getting sufficient information about what's going 
wrong from mysql_error(), you can always assemble the SQL manually and 
execute it yourself, without involving PHP at all.

That's more difficult with bound parameters.  Those might look something 
like:

$SQL = "SELECT * FROM students WHERE lname = :name";

$query = $DB->prepare($SQL);

$query->execute(array(
     ":name" => $last_name,
));

$result = $query->fetchAll();
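
If you want to try that pattern without a MySQL server handy, here is a 
self-contained version.  The in-memory SQLite database, the fname 
column, and the sample rows are all assumptions made up for the demo; 
with MySQL only the DSN passed to the PDO constructor would differ.

```php
<?php
// In-memory SQLite stands in for MySQL here (an assumption for the demo).
$DB = new PDO('sqlite::memory:');
// Throw exceptions on errors rather than failing silently.
$DB->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// Sample table and data, invented for illustration.
$DB->exec('CREATE TABLE students (id INTEGER PRIMARY KEY, fname TEXT, lname TEXT)');
$DB->exec("INSERT INTO students (fname, lname) VALUES ('Sean', 'O''Grady'), ('Ann', 'Smith')");

// No escaping needed: the value is bound, not concatenated into the SQL.
$last_name = "O'Grady";

$SQL = "SELECT * FROM students WHERE lname = :name";
$query = $DB->prepare($SQL);
$query->execute(array(
    ":name" => $last_name,
));
$result = $query->fetchAll(PDO::FETCH_ASSOC);

foreach ($result as $row) {
    echo $row['fname'] . ' ' . $row['lname'] . "\n";
}
```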

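In this code, which is not much longer than the original, the statement 
and the data travel separately.  Assuming PDO's emulated prepares are 
switched off (the MySQL driver emulates them on the client side by 
default), the effect is roughly equivalent to executing three separate 
SQL commands:

PREPARE studentSQL FROM 'SELECT * FROM students WHERE lname = ?';
SET @a = "O\'Grady";
EXECUTE studentSQL USING @a;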

First it prepares the SQL, without any of the actual data that you feed 
in as a limiter.  Then, the data is bound to a variable name within 
MySQL.  Finally, the prepared statement is executed, with the variables 
plugged in.  Oh, and there's a cleanup phase afterwards which unsets the 
statement and the variables, I believe.

Because the data is bound to a variable name separate from the SQL 
statement, it is effectively impossible to mix hostile content into the 
query.  MySQL knows that everything in that variable is data, not SQL, 
and should be treated as data pure and simple.  So it's great for 
eliminating SQL injections.
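
To make the difference concrete, here is a small demonstration.  It 
uses an in-memory SQLite database and invented sample data so that it 
runs anywhere; the hostile string is just one classic example of an 
injection attempt.

```php
<?php
// In-memory SQLite database with made-up sample data (demo assumption).
$DB = new PDO('sqlite::memory:');
$DB->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$DB->exec('CREATE TABLE students (id INTEGER PRIMARY KEY, lname TEXT)');
$DB->exec("INSERT INTO students (lname) VALUES ('Smith'), ('Jones')");

$hostile = "nobody' OR '1'='1";

// Unsafe: the input becomes part of the SQL text, so the OR clause runs.
$unsafe = $DB->query(
    "SELECT * FROM students WHERE lname = '" . $hostile . "'"
)->fetchAll();

// Safe: the input is bound as data and compared literally against lname.
$query = $DB->prepare("SELECT * FROM students WHERE lname = :name");
$query->execute(array(":name" => $hostile));
$safe = $query->fetchAll();

echo count($unsafe) . " rows from the concatenated query\n"; // 2
echo count($safe) . " rows from the bound query\n";          // 0
```

The concatenated query returns every row in the table, while the bound 
query simply finds no student with that odd last name.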

But it does make it a little harder to debug, because it's a bit harder 
to see exactly what is getting sent to MySQL.  I've sometimes set up 
query logging on my dev box just to track down EXACTLY what is getting 
sent to the server, or else gone to the trouble of manually setting up 
and executing prepared statements to test them.
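
One thing that can take some of the sting out of this is 
PDOStatement::debugDumpParams(), which prints what PDO knows about a 
prepared statement.  It does not necessarily show the bound values, so 
the sketch below logs those separately.  The in-memory SQLite database 
and the log format are assumptions for the demo.

```php
<?php
// In-memory SQLite database (demo assumption).
$DB = new PDO('sqlite::memory:');
$DB->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$DB->exec('CREATE TABLE students (id INTEGER PRIMARY KEY, lname TEXT)');

$params = array(":name" => "O'Grady");
$query = $DB->prepare("SELECT * FROM students WHERE lname = :name");
$query->execute($params);

// debugDumpParams() prints directly, so capture it with output buffering.
ob_start();
$query->debugDumpParams();
$dump = ob_get_clean();

// Log the statement details alongside the values that were bound.
error_log($dump . "Values: " . json_encode($params));
```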

Hope this is helpful.


Will Martin

Web Services Librarian
Chester Fritz Library
University of North Dakota