Using composite PRIMARY KEY or UNIQUE key on INNODB table with multiple inserts

I have been trying to figure this out, but no luck so far.

Which one is better: A table with a composite PRIMARY KEY OR a single PRIMARY KEY and a UNIQUE index?

My table looks like this:

CREATE TABLE data (
  bucket_id INTEGER,
  backend_id INTEGER,
  unique_id INTEGER,
  weight INTEGER,
  PRIMARY KEY (bucket_id, unique_id)
) ENGINE=InnoDB

I am doing multiple inserts. 6 billion+ rows to be inserted in multi value inserts in the form of

INSERT IGNORE INTO data VALUES (x1, x2, x3, x4), (y1, y2, y3, y4), .......)

with 500000 rows in each (limited by the client). These are all done on startup of the application and currently I need to speed this up as much as possible. I need the uniqueness of (backend_id, unique_id). Though I do not control these so there are duplicates in the imported data.

So the question is, will using a UNIQUE index instead of a composite PRIMARY KEY help me to increase speed of the insert statements? I know a lot of other factors affect this i.e. buffer pools and so on.

Answers


I'm pretty sure that primary key constraints in all modern database management systems are implemented using unique indexes. In SQL the declarations PRIMARY KEY and NOT NULL UNIQUE are behaviorally equivalent.

Your question boils down to this: is it faster to use a surrogate key in addition to the requisite constraint on {backend_id, unique_id}? Note carefully that using a surrogate key instead of the requisite constraint on {backend_id, unique_id} isn't generally acceptable, because it omits an important business requirement.

Adding a surrogate key

  • makes the table wider,
  • increases the number of bytes that have to be written to the table, and
  • increases the number of indexes that have to be written.

So adding a surrogate key will probably slow you down. This is probably the best structure for your stated requirements if you require concurrent access.

CREATE TABLE data (
  backend_id INTEGER,
  unique_id INTEGER,
  weight INTEGER,
  PRIMARY KEY (backend_id, unique_id)
) ENGINE=InnoDB

But if you can essentially run in single-user mode, it's fastest to load a table with no constraints, using the bulk loader. Then add the constraints later with ALTER TABLE statements.


Need Your Help

SSRS/MDX - A table with dataset pulled from a cube needs a column whose data is in sql db

sql reporting-services ssas mdx business-intelligence

I have a dataset that was build using MDX (analysis cube). I have a table with few columns. Now, I need to add one more column with specific data that is available in sql db and not cube. I know it...

JRadioButtons in a custom JScrollPane not appearing?

java swing jscrollpane jradiobutton buttongroup

I'm writing an IM client GUI for a project. On the left of the window, I want a scrolling pane that has radio buttons for each active user, so that when a New Chat button is pushed, a chat will be

About UNIX Resources Network

Original, collect and organize Developers related documents, information and materials, contains jQuery, Html, CSS, MySQL, .NET, ASP.NET, SQL, objective-c, iPhone, Ruby on Rails, C, SQL Server, Ruby, Arrays, Regex, ASP.NET MVC, WPF, XML, Ajax, DataBase, and so on.