How to scan for differences between two queries?

I have a table that loads new data every day and another table that contains a history of changes to that table. What's the best way to check if any of the data have changed since the last time data was loaded?

For example, table @a holds strategies for different countries, and table @b tracks the changes made to table @a. I can use checksum() to hash the fields that can change, and add a row to @b if the existing hash differs from the new hash. However, MSDN advises against this because "collisions" can occur, i.e. two different values can map to the same checksum.

MSDN link for checksum http://msdn.microsoft.com/en-us/library/aa258245(v=SQL.80).aspx

Sample code:

declare @a table
(
    ownerid bigint
    ,Strategy varchar(50)
    ,country char(3)
)
insert into @a
select 1,'Long','USA'

insert into @a
select 2,'Short','CAN'

insert into @a
select 3,'Neutral','AUS'

declare @b table
(
    Lastupdated datetime
    ,ownerid bigint
    ,Strategy varchar(50)
    ,country char(3)

)

insert into @b
(
    Lastupdated
    ,ownerid
    ,strategy
    ,country
)
select 
    getdate()
    ,a.ownerid
    ,a.strategy
    ,a.country
from @a a left join @b b
    on a.ownerid=b.ownerid
where
    b.ownerid is null

select * from @b

--get a different timestamp
waitfor delay '00:00:00.1'

--change source data
update @a 
set strategy='Short'
where ownerid=1

--add newly changed data into @b
insert into @b
select 
    getdate()
    ,a.ownerid
    ,a.strategy
    ,a.country
from 
    (select *,checksum(strategy,country) as hashval from @a) a 
    left join 
    (select *,checksum(strategy,country) as hashval from @b) b
    on a.ownerid=b.ownerid
where 
    a.hashval<>b.hashval

select * from @b

Answers


How about writing a query using EXCEPT? Just write queries for both tables and then add EXCEPT between them:

(SELECT * FROM table_new) EXCEPT (SELECT * FROM table_old) 

The result will be the entries in table_new that aren't in table_old (i.e. that have been updated or inserted).

Note: To get the rows that were deleted (along with the old versions of rows that have since changed), reverse the order of the queries so that table_old comes first.
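
One property of EXCEPT that matters for change detection is that it treats two NULLs as equal, whereas a plain <> comparison evaluates to unknown when either side is NULL. Here is a minimal sketch of the difference. The table variables @table_new and @table_old and their contents are made up for illustration.

```sql
-- Hypothetical stand-ins for table_new and table_old
declare @table_new table (ownerid bigint, strategy varchar(50))
declare @table_old table (ownerid bigint, strategy varchar(50))

insert into @table_new select 1, null
insert into @table_new select 2, 'Short'
insert into @table_new select 3, null

insert into @table_old select 1, null        -- unchanged (NULL on both sides)
insert into @table_old select 2, 'Long'      -- changed Long -> Short
insert into @table_old select 3, 'Neutral'   -- changed Neutral -> NULL

-- EXCEPT: returns ownerid 2 and 3, and skips ownerid 1
select ownerid, strategy from @table_new
EXCEPT
select ownerid, strategy from @table_old

-- Join with <>: returns only ownerid 2, because NULL <> 'Neutral'
-- is unknown and the row for ownerid 3 is filtered out
select n.ownerid, n.strategy
from @table_new n join @table_old o
    on n.ownerid=o.ownerid
where
    n.strategy<>o.strategy
```

So if the compared columns are nullable, EXCEPT catches changes to or from NULL that a <> filter would miss.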


There is no need to check for changes if you use a different approach to the problem.

On your master table create a trigger for INSERT, UPDATE and DELETE which tracks the changes for you by writing to table @b.

If you search the internet for "SQL audit table" you will find many pages describing the process, for example "Adding simple trigger-based auditing to your SQL Server database".
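
As a rough sketch of what such a trigger could look like: triggers cannot be created on table variables, so this assumes permanent tables. The names dbo.Strategy, dbo.StrategyHistory and trg_Strategy_Audit are hypothetical, with the same columns as @a and @b in the question. It only covers INSERT and UPDATE. Recording deletes would probably need an extra column in the history table to say what happened, which the question's @b does not have.

```sql
-- Hypothetical permanent tables standing in for @a and @b
create table dbo.Strategy
(
    ownerid bigint primary key
    ,strategy varchar(50)
    ,country char(3)
)

create table dbo.StrategyHistory
(
    Lastupdated datetime
    ,ownerid bigint
    ,strategy varchar(50)
    ,country char(3)
)
go  -- batch separator (SSMS/sqlcmd); CREATE TRIGGER must start its own batch

create trigger dbo.trg_Strategy_Audit
on dbo.Strategy
after insert, update
as
begin
    set nocount on

    -- "inserted" holds the new row versions, "deleted" the old ones.
    -- On INSERT "deleted" is empty, so every new row is logged.
    -- On UPDATE only rows whose values actually changed survive the EXCEPT.
    insert into dbo.StrategyHistory
    (
        Lastupdated
        ,ownerid
        ,strategy
        ,country
    )
    select
        getdate()
        ,x.ownerid
        ,x.strategy
        ,x.country
    from
    (
        select ownerid, strategy, country from inserted
        EXCEPT
        select ownerid, strategy, country from deleted
    ) x
end
go
```

With this in place, the daily load simply inserts into or updates dbo.Strategy and the history rows are written as a side effect, so no separate comparison step is needed.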


Thanks to @newenglander I was able to use EXCEPT to find the changed row. As @Tony said, I'm not sure how multiple changes will work, but here's the same sample code reworked to use EXCEPT instead of CHECKSUM:

declare @a table
(
    ownerid bigint
    ,Strategy varchar(50)
    ,country char(3)
)
insert into @a
select 1,'Long','USA'

insert into @a
select 2,'Short','CAN'

insert into @a
select 3,'Neutral','AUS'

declare @b table
(
    Lastupdated datetime
    ,ownerid bigint
    ,Strategy varchar(50)
    ,country char(3)

)

insert into @b
(
    Lastupdated
    ,ownerid
    ,strategy
    ,country
)
select 
    getdate()
    ,a.ownerid
    ,a.strategy
    ,a.country
from @a a left join @b b
    on a.ownerid=b.ownerid
where
    b.ownerid is null

select * from @b

--get a different timestamp
waitfor delay '00:00:00.1'

--change source data
update @a 
set strategy='Short'
where ownerid=1



--add newly changed data using EXCEPT
insert into @b 
select getdate(),
    ownerid,
    strategy,
    country
from 
(
    (
    select 
        ownerid
        ,strategy
        ,country 
    from @a changedtable
    ) 
    EXCEPT 
    (
    select 
        ownerid
        ,strategy
        ,country 
    from @b historicaltable
    )
) x

select * from @b
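
On the multiple-changes concern: because the EXCEPT above compares against every row in @b, a value that changes and later changes back (for example Long, then Short, then Long again) would not be logged the second time, since the old (ownerid, strategy, country) combination is already in the history. One possible way around this, as a sketch only, is to compare against just the most recent history row per ownerid. This assumes @a and @b are declared and populated as above and that it runs in the same batch. The aliases latest and rn are mine.

```sql
--add newly changed data, comparing only against the latest row per ownerid
insert into @b
(
    Lastupdated
    ,ownerid
    ,strategy
    ,country
)
select
    getdate()
    ,x.ownerid
    ,x.strategy
    ,x.country
from
(
    select
        ownerid
        ,strategy
        ,country
    from @a
    EXCEPT
    select
        ownerid
        ,strategy
        ,country
    from
    (
        --rn = 1 marks the most recent history row for each ownerid
        select
            ownerid
            ,strategy
            ,country
            ,row_number() over (partition by ownerid order by Lastupdated desc) as rn
        from @b
    ) latest
    where rn = 1
) x

select * from @b
```

If two history rows for the same ownerid share the same Lastupdated value, row_number() picks one of them arbitrarily, so a tie-breaking column would be needed in that case.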
