I may be wrong about this, but when you use a "not in" operator, you're necessarily doing a table scan. Any indexes you have are not used in such a query, so you'll definitely see a performance hit. But if each independent query runs quickly, why not use a temporary table to store the results of what you've got written as a subquery first, and then use that presumably small, distinct list of institutions in the subquery afterward? Something like this: SELECT INTO #my_institutions SELECT distinct institution FROM `renewals` WHERE snap_date < '2011-07-01' EXEC SELECT distinct institution from renewals WHERE institution not in (SELECT institution from #my_institutions) That assumes you're able to SELECT INTO, but if not, just create yourself a my_institutions table first. \-/-\-/-\-/-\-/-\-/-\-/-\-/-\-/-\-/-\-/-\-/-\-/ Scot Colford Web Services Manager Boston Public Library [log in to unmask] Phone 617.859.2399 Mobile 617.592.8669 Fax 617.536.7558 On 9/28/11 12:41 PM, "Ken Irwin" <[log in to unmask]> wrote: >Hi all, > >I've not done much with MySQL subqueries, and I'm trying right now with >what I find to be surprising results. I wonder if someone can help me >understand. > >I have a pile of data that with columns for "institution" and "date". >Institution gets repeated a lot, with many different dates. I want to >select all the institutions that *only* have dates after July 1 and don't >appear in the table before that. My solution was to do a first query for >all the institutions that DO have dates before July 1 >SELECT distinct institution FROM `renewals` WHERE snap_date < '2011-07-01' > >And then to do a SELECT query on all the institutions: >SELECT distinct institution from renewals > >And then try to do a NOT IN subquery subtracting the smaller query from >the larger one: > >SELECT distinct institution from renewals >WHERE institution not in >(SELECT distinct institution FROM `renewals` WHERE snap_date < >'2011-07-01') > >...only it doesn't seem to work. Or rather, the query has been running >for several minutes and never comes back with an answer. Each of these >two queries takes just a few milliseconds to run on its own. > >Can someone tell me (a) am I just formatting the query wrong, (b) do >subqueries like this just take forever, and/or (c) is there a better way >to do this? (I don't really understand about JOIN queries, but from what >I can tell they are only for mixing the results of two different tables >so I think they might not apply here.) > >Any advice would be most welcome. > >Thanks >Ken