How to represent inheritance in database?
P粉041856955
P粉041856955 2023-08-29 13:56:14
0
2
579
<p>I'm thinking about how to represent complex structures in a SQL Server database. </p> <p>Consider an application that needs to store details about a series of objects that share some properties but have many other uncommon properties. For example, a business insurance package might include liability, motor, property, and indemnity coverage in the same policy record. </p> <p>Achieving this in, for example, C# is simple, as you can create policies that contain collections of parts, with parts inherited as needed for various types of overrides. However, relational databases don't seem to allow this. </p> <p>I can see there are two main options:</p> <ol> <li><p>Create a strategy table and then a partial table with all the fields required for all possible variations, most of which are empty. </p></li> <li><p>Create a policy table and multiple partial tables, each table corresponding to a type of insurance. </p></li> </ol> <p>Both alternatives seem unsatisfactory, especially since the query would need to be written across all parts, which would involve either a lot of joins or a lot of null checks. </p> <p>What are the best practices for this scenario? </p>
P粉041856955
P粉041856955

reply all(2)
P粉476475551

The third option is to create a "Policy" table and then a "SectionsMain" table to store all the fields that are common across different types of sections. Then create additional tables for each type of section, containing only the uncommon fields.

Deciding which one is best depends mainly on how many fields you have and how you want to write your SQL. They will all work. If you only have a few fields, then I'd probably go with #1. For "a lot" of areas, I'd lean towards #2 or #3.

P粉722521204

@Bill Karwin In his SQL Antipatterns book, he proposed the SQL Entity Attribute Value anti-pattern. Here's a brief overview:

Single table inheritance (aka per-hierarchy table inheritance):

Using a single table like the first option is probably the simplest design. As you mentioned, many subtype-specific properties must be given NULL values ​​on rows where these properties do not apply. Using this model, you would have a policy table that looks like this:

+------+---------------------+----------+----------------+------------------+
| id   | date_issued         | type     | vehicle_reg_no | property_address |
+------+---------------------+----------+----------------+------------------+
|    1 | 2010-08-20 12:00:00 | MOTOR    | 01-A-04004     | NULL             |
|    2 | 2010-08-20 13:00:00 | MOTOR    | 02-B-01010     | NULL             |
|    3 | 2010-08-20 14:00:00 | PROPERTY | NULL           | Oxford Street    |
|    4 | 2010-08-20 15:00:00 | MOTOR    | 03-C-02020     | NULL             |
+------+---------------------+----------+----------------+------------------+

\------ COMMON FIELDS -------/          \----- SUBTYPE SPECIFIC FIELDS -----/

Keeping the design simple is an advantage, but the main problems with this approach are as follows:

  • As you add new subtypes, you must change the table to accommodate the properties that describe these new objects. This can quickly become a problem when you have many subtypes or when you plan to add subtypes regularly.

  • The database will not be able to enforce which properties apply and which do not, because there is no metadata to define which properties belong to which subtypes.

  • You also cannot enforce NOT NULL on a subtype attribute that should be enforced. You have to handle this in the application, which is usually not ideal.

Specific table inheritance:

Another way to solve the inheritance problem is to create a new table for each subtype, repeating all the common properties in each table. For example:

--// Table: policies_motor
+------+---------------------+----------------+
| id   | date_issued         | vehicle_reg_no |
+------+---------------------+----------------+
|    1 | 2010-08-20 12:00:00 | 01-A-04004     |
|    2 | 2010-08-20 13:00:00 | 02-B-01010     |
|    3 | 2010-08-20 15:00:00 | 03-C-02020     |
+------+---------------------+----------------+
                          
--// Table: policies_property    
+------+---------------------+------------------+
| id   | date_issued         | property_address |
+------+---------------------+------------------+
|    1 | 2010-08-20 14:00:00 | Oxford Street    |   
+------+---------------------+------------------+

This design will basically solve the problems identified by the single-table approach:

  • Mandatory attributes can now be enforced via NOT NULL.

  • Adding new subtypes requires adding a new table, not adding columns to an existing table.

  • There is also no risk of setting inappropriate attributes for specific subtypes, such as the vehicle_reg_no field of an attribute policy.

  • No need for the type attribute like in the single table method. The type is now defined by metadata: table name.

But this model also has some shortcomings:

  • Public properties are mixed with subtype-specific properties, and there is no easy way to identify them. The database doesn't know either.

  • When defining a table, you must repeat the common properties for each subtype table. This is definitely not .

  • Searching for all strategies regardless of subtype becomes difficult and requires a bunch of UNION.

Regardless of type, you must query all policies via:

SELECT     date_issued, other_common_fields, 'MOTOR' AS type
FROM       policies_motor
UNION ALL
SELECT     date_issued, other_common_fields, 'PROPERTY' AS type
FROM       policies_property;

Please note that adding new subtypes will require modifying the above query with additional UNION ALL for each subtype. If you forget to do this, you can easily cause errors in your application.

Class table inheritance (aka table inheritance per type):

This is the solution mentioned by @David in another answer . You create a table for the base class that includes all public properties. You would then create specific tables for each subtype, whose primary keys also serve as base tables. Example:

CREATE TABLE policies (
   policy_id          int,
   date_issued        datetime,

   -- // other common attributes ...
);

CREATE TABLE policy_motor (
    policy_id         int,
    vehicle_reg_no    varchar(20),

   -- // other attributes specific to motor insurance ...

   FOREIGN KEY (policy_id) REFERENCES policies (policy_id)
);

CREATE TABLE policy_property (
    policy_id         int,
    property_address  varchar(20),

   -- // other attributes specific to property insurance ...

   FOREIGN KEY (policy_id) REFERENCES policies (policy_id)
);

This solution solves the problems found in the other two designs:

  • Mandatory attributes can be enforced by NOT NULL.

  • Adding new subtypes requires adding a new table, not adding columns to an existing table.

  • There is no risk of setting inappropriate properties for specific subtypes.

  • No type attribute is required.

  • Now public properties are no longer mixed with subtype specific properties.

  • We can finally stay dry. Table creation does not require duplication of common properties for each subtype table.

  • Managing auto-increment policies for id becomes easier as this can be handled by the base table rather than each subtype table generating them independently.

  • Searching all strategies (regardless of subtype) is now very easy: no UNION required - just SELECT * FROM strategy.

I think the class table method is the most appropriate in most cases.


The names of these three models come from Martin Fowlera bookEnterprise Application Architecture Patterns.

Latest Downloads
More>
Web Effects
Website Source Code
Website Materials
Front End Template