Bending the rules of UNIQUE column SQLITE
I am working with an extensive amount of third party data. Each data set has items with unique identifiers. So it is very easy for me to utilise UNIQUE column in SQLITE to enforce some data integrity.
Out of thousands of records I have id from third party source A matching 2 unique ids from third party source B.
Is there a way of bending the rules, and allowing a duplicate entry in a unique column? If not how should I reorganise my data to take care of this single edge case.
CREATE TABLE "trainer" ( "id" INTEGER PRIMARY KEY AUTOINCREMENT, "name" TEXT NOT NULL, "betfair_id" INTEGER NOT NULL UNIQUE, "racingpost_id" INTEGER NOT NULL UNIQUE );
Miss Beverley J Thomas http://www.racingpost.com/horses/trainer_home.sd?trainer_id=20514
vs. Miss Beverley J. Thomas http://form.horseracing.betfair.com/form/trainer/1/00008861
Both Racingpost entires (my primary data source) match a single Betfair entry. This is the only one (so far) out of thousands of records.
If racingpost should have had only 1 match it is an error condition.
If racingpost is allowed to have 2 matches per id, you must either have two ids, select one, or combine the data.
Since racingpost is your primary source, having 2 ids may make sense. However if you want to improve upon that data set combining that data or selecting the most useful may be more accurate. The real question is how much data overlaps between these two records and when it does can you detect it reliably. If the overlap is small or you have good detection of an overlap condition, then combining makes more sense. If the overlap is large and you cannot detect it reliably, then selecting the most recent updated or having two ids is more useful.