Hi..

You could consider at least two things:

1. Drop the DISTINCT keyword in the subquery. You don't need it: NOT IN only
tests membership, so duplicates in the subquery don't change the result.
2. Use NOT EXISTS instead of NOT IN; see
http://dev.mysql.com/doc/refman/5.0/en/exists-and-not-exists-subqueries.html.
It is normally considered more efficient.
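
Putting both suggestions together, the query could look roughly like the
sketch below. The table and column names (renewals, institution, snap_date)
are taken from your message; the aliases r and early and the index name are
only placeholders I made up, and I haven't run this against your data.

```sql
-- Institutions that have no row dated before 2011-07-01.
-- "r" and "early" are just illustrative aliases for the same table.
SELECT DISTINCT r.institution
FROM renewals AS r
WHERE NOT EXISTS (
    SELECT 1
    FROM renewals AS early
    WHERE early.institution = r.institution
      AND early.snap_date < '2011-07-01'
);

-- Assumption: there is no index covering these columns yet.
-- If so, something like this may help the correlated lookup
-- (the index name is hypothetical):
CREATE INDEX idx_renewals_inst_date ON renewals (institution, snap_date);
```

One thing to watch for: if institution can be NULL, NOT IN and NOT EXISTS
may return different rows, so it is worth comparing the results on a small
sample first.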


§;>heb
Hans Erik Büscher | Reindex
#45 7020 8434


2011/9/28 Ken Irwin

> Hi all,
>
> I've not done much with MySQL subqueries, and I'm trying right now with
> what I find to be surprising results. I wonder if someone can help me
> understand.
>
> I have a pile of data with columns for "institution" and "date".
> Institution gets repeated a lot, with many different dates. I want to select
> all the institutions that *only* have dates after July 1 and don't appear in
> the table before that. My solution was to do a first query for all the
> institutions that DO have dates before July 1:
> SELECT distinct institution FROM `renewals` WHERE snap_date < '2011-07-01'
>
> And then to do a SELECT query on all the institutions:
> SELECT distinct institution from renewals
>
> And then try to do a NOT IN subquery subtracting the smaller query from the
> larger one:
>
> SELECT distinct institution from renewals
> WHERE institution not in
> (SELECT distinct institution FROM `renewals` WHERE snap_date <
> '2011-07-01')
>
> ...only it doesn't seem to work. Or rather, the query has been running for
> several minutes and never comes back with an answer. Each of these two
> queries takes just a few milliseconds to run on its own.
>
> Can someone tell me (a) am I just formatting the query wrong, (b) do
> subqueries like this just take forever, and/or (c) is there a better way to
> do this? (I don't really understand about JOIN queries, but from what I can
> tell they are only for mixing the results of two different tables so I think
> they might not apply here.)
>
> Any advice would be most welcome.
>
> Thanks
> Ken
>