And don't forget to try using EXPLAIN to figure out what is going on with the query. This will tell you which indexes are being used, whether indexes are being ignored, the approximate number of rows being parsed, etc. Just add EXPLAIN to the front of the SELECT query. I find the EXPLAIN output easier to read if you end the query with \G instead of ;.
Oh, and be sure that your development environment is running the same version as your production environment. You can see big differences in performance, it's always better to compare apples with apples when you are troubleshooting anything.
--
Ryan Ordway E-mail: [log in to unmask]
Unix Systems Administrator [log in to unmask]
OSU Libraries, Corvallis, OR, 97331 Office: Valley Library #4657
On Sep 29, 2011, at 6:27 AM, Adam Wead wrote:
> Ken,
>
> If I understand the logic correctly, you need a list of institutions with dates after July 1, but only institutions that have records existing before July 1. Subqueries could work, but a view might be easier to work with, especially if you plan to a lot of queries with similar logic.
>
> First, create a view with institutions whose dates are prior to July 1:
>
> CREATE VIEW my_list AS SELECT DISTINCT institution FROM renewals WHERE snap_date < '2011-07-01'
>
> Then, match the institutions in your renewals table with the ones in the view:
>
> SELECT institution FROM renewals a, my_list b WHERE a.institution = b.institution AND snap_date > '2011-07-01'
>
> That's the way, at least, I might tackle it... there are probably a dozen others however.
>
> ...adam
>
>
>
> On Sep 28, 2011, at 12:41 PM, Ken Irwin wrote:
>
>> Hi all,
>>
>> I've not done much with MySQL subqueries, and I'm trying right now with what I find to be surprising results. I wonder if someone can help me understand.
>>
>> I have a pile of data that with columns for "institution" and "date". Institution gets repeated a lot, with many different dates. I want to select all the institutions that *only* have dates after July 1 and don't appear in the table before that. My solution was to do a first query for all the institutions that DO have dates before July 1
>> SELECT distinct institution FROM `renewals` WHERE snap_date < '2011-07-01'
>>
>> And then to do a SELECT query on all the institutions:
>> SELECT distinct institution from renewals
>>
>> And then try to do a NOT IN subquery subtracting the smaller query from the larger one:
>>
>> SELECT distinct institution from renewals
>> WHERE institution not in
>> (SELECT distinct institution FROM `renewals` WHERE snap_date < '2011-07-01')
>>
>> ...only it doesn't seem to work. Or rather, the query has been running for several minutes and never comes back with an answer. Each of these two queries takes just a few milliseconds to run on its own.
>>
>> Can someone tell me (a) am I just formatting the query wrong, (b) do subqueries like this just take forever, and/or (c) is there a better way to do this? (I don't really understand about JOIN queries, but from what I can tell they are only for mixing the results of two different tables so I think they might not apply here.)
>>
>> Any advice would be most welcome.
>>
>> Thanks
>> Ken
>
|