If you were talking about production code, you might need to worry about getting more clever with memory management...  but if it's a once-and-done processing script, I see nothing wrong with taking the associative array approach in the interest of writing clean, understandable code.  I think the easiest solution to your problem is simply to raise your PHP memory limit:

http://www.ducea.com/2008/02/14/increase-php-memory-limit/

Obviously this solution only works to a certain point...  but if you're hitting a 32MB limit right now, you've probably still got a lot of room to grow!
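
For a one-off script you don't even have to edit php.ini: a couple of
equivalent ways to raise the limit (the 256M figure is just an example,
use whatever your machine can spare):

    // at the top of the script, raise the limit for this run only
    ini_set('memory_limit', '256M');

    // or from the shell, with no code changes at all:
    //   php -d memory_limit=256M analysis.php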

- Demian

> -----Original Message-----
> From: Code for Libraries [mailto:[log in to unmask]] On Behalf Of
> Ken Irwin
> Sent: Tuesday, August 30, 2011 12:56 PM
> To: [log in to unmask]
> Subject: [CODE4LIB] memory management for grownups
> 
> I have a feeling it may be time for me to learn some grown-up
> programming skills, and I hope someone here might be able to help.
> 
> I have a PHP script chewing over a large MySQL query. It's creating a
> handful of big associative arrays in the process, and punks out after
> the arrays get to 32MB.
> Fatal error: Allowed memory size of 33554432 bytes exhausted (tried to
> allocate 242 bytes) in analysis.php on line 41
> 
> I'm basically building a big associative array encoding the name of the
> borrowing institution, the patron type (student, faculty, staff, etc.), the
> item barcode, and the various bits of data we want for each of these
> items.
> 
> // the first time we see a barcode
>     $stats[$inst][$patron_type][$item_barcode]['first'] = $date;
>     $stats[$inst][$patron_type][$item_barcode]['min']   = $renewals;
>     $stats[$inst][$patron_type][$item_barcode]['call']  = $call_no;
> // subsequent instances of the barcode
>     $stats[$inst][$patron_type][$item_barcode]['max']   = $renewals;
> 
> Once I've chewed over all 4 million records (40MB), I spit it out into
> a new MySQL table that has the collated data I want. Unfortunately
> this system breaks down when I get to so many millions of records.
> 
> Is there a more efficient way of doing this kind of data
> transformation? I *could* not keep so much in the big stats array and
> instead make millions of "UPDATE" calls to the MySQL table, but that
> doesn't sound like a winning proposition to me. I imagine that I could
> also increase the memory allotment, but it will eventually get too big
> too. Or I suppose that I could do it all in chunks - right now I'm
> parsing the whole raw-data SQL table at once; I could do one
> institution's data at a time and buy myself some wiggle-room.
> 
> But fundamentally, it seems to me that asking PHP to hold lots of data
> in an array might not be the most programmerly system; it's just what
> I've always done.
> 
> Any advice?
> 
> Thanks
> Ken
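
P.S. If you do eventually outgrow any reasonable limit, I think your
chunk-by-institution idea is the right instinct. Here's a rough sketch of
what that might look like, assuming the raw data lives in a table called
circ_raw with an inst column (all the names here are invented for
illustration):

    // fetch the list of institutions, then collate one at a time so
    // $stats never holds more than one institution's worth of data
    $db = new mysqli('localhost', 'user', 'pass', 'circ');
    $insts = $db->query('SELECT DISTINCT inst FROM circ_raw');
    while ($row = $insts->fetch_assoc()) {
        $stats = array();   // start fresh for each institution
        $res = $db->query(sprintf(
            "SELECT patron_type, item_barcode, date, renewals, call_no
               FROM circ_raw WHERE inst = '%s'",
            $db->real_escape_string($row['inst'])));
        while ($r = $res->fetch_assoc()) {
            // ... the same first/min/call/max collation logic as before ...
        }
        $res->free();
        // write this institution's collated rows to the output table
        // (e.g. a hypothetical write_stats($db, $row['inst'], $stats)),
        // then drop the array before the next pass
        unset($stats);
    }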