Database design for multiple similar types?
Say I have two question types: Multiple Choice and Range. A Range question allows users to answer by specifying a range of values in their answer (1-10 or 2-4 for example).
I inherited a database where the answers to these question types are stored in the same table which is structured like so:
Answers ------- Id QuestionId choice range_from range_to
This results in data like below:
1 1 null 1 10 2 1 null 2 4 3 2 Pants null null 4 2 Hat null null
Does it make sense to include columns from every answer type in the answer table? Or should they be broken out into separate tables?
This is a very slimmed-down version of my real database. In reality there are about 8 question types, so with every answer there are several columns that are left unused.
Does it make sense to include columns from every answer type in the answer table?
This is "all classess in the same table" strategy for implementing inheritance, which is suitable for small number of classes. As the number of classes grows, you might consider one of the other strategies. There is no predefined "cut-off point" for that - you'll have to measure and decide for yourself.
The alternative would be an EAV-like system as proposed by blotto, but that would shift the enforcement of data consistency away from the DBMS. This is a valid solution if you don't know the structure of data at design-time and want to avoid DML at run-time, but if you do know the structire of data at design-time better stick with inheritance.
You could have a single field that represents the 'type' of question, that seems best suited in the Question table ( not the Answer table). For example:
question_type ENUM('choice', 'range', 'type_3', 'type_4'..)
Then make a one-to-many link ( a join table ) that represents the Question-to-Answers relationship
AnswerId (pk) | QuestionId (fk) 1 1 2 1 3 2 4 2
Finally, your Answer table is a collection of values for each Answer . It can designate each record more specifically by having its own ENUM.
answer_type ENUM('low_range', 'high_range', 'choice', etc) Id (pk)| AnswerId (fk) | Type | Value 1 1 low_range 1 2 1 high_range 10 3 2 low_range 2 4 2 high_range 4 5 3 choice Pants 6 4 choice Hat
This is much more scalable, and basically pivots the fields in your previous table to values in the answers table. So you can always add new 'Type's both for questions an answers without adding new fields to the schema.