How can I speed up a joined update in SQL? My statement seems to run indefinitely
I have two tables: a source table and a target table. The target table will have a subset of the columns of the source table. I need to update a single column in the target table by joining with the source table based on another column. The update statement is as follows:
UPDATE target_table tt SET special_id = ( SELECT source_special_id FROM source_table st WHERE tt.another_id = st.another_id )
For some reason, this statement seems to run indefinitely. The inner select happens almost immediately when executed by itself. The table has roughly 50,000 records and its hosted on a powerful machine (resources are not an issue).
Am I doing this correctly? Any reasons the above wouldn't work in a timely manner? Any better way to do this?
Your initial query executes the inner subquery once for every row in the outer table. See if Oracle likes this better:
UPDATE target_table SET special_id = st.source_special_id FROM target_table tt INNER JOIN source_table st WHERE tt.another_id = st.another_id
(edited after posted query was corrected)
Add: If the join syntax doesn't work on Oracle, how about:
UPDATE target_table SET special_id = st.source_special_id FROM target_table tt, source_table st WHERE tt.another_id = st.another_id
The point is to join the two tables rather than using the outer query syntax you are currently using.
Is there an index on source_table(another_id)? If not source_table will be fully scanned once for each row in target_table. This will take some time if target_table is big.
Is it possible for there to be no match in source_table for some target_table rows? If so, your update will set special_id to null for those rows. If you want to avoid that do this:
UPDATE target_table tt SET special_id = ( SELECT source_special_id FROM source_table st WHERE tt.another_id = st.another_id ) WHERE EXISTS( SELECT NULL FROM source_table st WHERE tt.another_id = st.another_id );
If target_table.another_id was declared as a foreign key referencing source_table.another_id (unlikely in this scenario), this would work:
UPDATE ( SELECT tt.primary_key, tt.special_id, st.source_special_id FROM tatget_table tt JOIN source_table st ON st.another_id = tt.another_id ) SET special_id = source_special_id;
Are you actually sure that it's running?
Have you looked for blocking locks? indefinitely is a long time and that's usually only achieved via something stalling execution.
Update: Ok, now that the query has been fixed -- I've done this exact thing many times, on unindexed tables well over 50K rows, and it worked fine in Oracle 10g and 9i. So something else is going on here; yes, you are calling for nested loops, but no, it shouldn't run forever, even so. What are the primary keys on these tables? Do you by any chance have multiple rows from the second table matching the same row for the first table? You could be trying to rewrite the whole table over and over, throwing the locking system into a fit.
That statement doesn't really make sense -- you are telling it to update all the rows where ids match, to the same id (meaning, no change happens!).
I imagine the real statement looks a bit different?
Please also provide table schema information (primary keys for the 2 tables, any available indexes, etc).
Not sure what Oracle has available, but MS Sql Server has a tuning advisor that you can feed your queries into and it will give recommendation for adding indexes, etc... I would assume Oracle has something similar.
That would be the quickest way to pinpoint the issue.
I don't know Oracle, but MSSQLServer optimizer would have no problem converting the subquery into a join for you.
It sounds like you might be doing a data import against a short-lived or newly created table. It is easy to overlook indexing these kinds of tables. I'd make sure there's an index on sourcetable.anotherid - or a covering index on sourcetable.anotherid, sourcetable.specialid (order matters, anotherid should be first).
In cases such as these (query running unexpectedly for longer than 1 second). It is best to figure out how your environment provides query plans. Examine that plan and the problem will become clear.
You see, there is no such thing as "optimized sql code". Sql code is never executed - query plans are generated from the code and then those plans are executed.
Check that the statistics are up to date on the tables - see this question
I had the same problem, and got a "SQL command not properly ended" when trying Codewerks' answer in Oracle 11g. A bit of Googling turned up the Oracle MERGE statement, which I adapted as follows:
MERGE INTO target_table tt USING source_table st ON (tt.another_id = st.another_id) WHEN MATCHED THEN UPDATE SET tt.special_id = st.special_id;
If you're not sure all the values of another_id will be in source_table then you can use the WHEN NOT MATCHED THEN clause to handle that case.
If you have constraints that let Oracle know there is a 1-1 relationship when you join on "another_id", I think this might work well:
UPDATE ( SELECT tt.rowid tt_rowid, tt.another_id, tt.special_id, st.source_special_id FROM target_table tt, source_table st WHERE tt.another_id = st.other_id ORDER BY tt.rowid ) SET special_id = source_special_id
Ordering by ROWID is important when you are updating and updatable view with a join.