Data modeling rules

Objects (main part):

  1. All public relations, schemas, i.e. PostgreSQL has public schema and a postgres user, also public rights must be deleted.
  2. All xxx_log tables have to be put to "log" schema.
  3. All additional plugins have to be put to "plugin" schema.
  4. All reports scripts, functions, viewses have to be put to "report" schema.
  5. All names of database objects: index, FK, PK, table, column name ant etc. have to be have singular lower case names.
  6. Primary key must be start with <schema name>_<table_name>_<column_name>_pkey. This rule rids duplicate names of PK. Have to be have type as serial (this will create valid sequence name with configurations).
  7. Unique must be sart with <schema_name>_<table_name>_<column_names>_key. This rule rids duplicate names of Unique.
  8. Foreign key must be start with <schema_name>_<table_name>_<table_column>_To_<schema_name>_<related_table_name>_<related_table_column>_fkey. For first view FK has long name, but this name not showing anywere only if error appears, what is good for understand problem.
  9. Foreign key have to have correct and logical CONSTRAINT when update and delete for values appears (do not use NO_ACTION. Must co-ordinate with architect or project lead or db owner). This will solve many problems with data. Not correct data will be in the past as headache. Data profiling will return no errors.
  10. Indexes must be start with <schema_name>_<table_name>_<table_columns>_idx.
  11. Foreign key which have indexes must be start with <schema_name>_<table_name>_<table_columns>_fki.
  12. Trigger functions and functions must be start with [tfn|fn]_[get|set|rep]_<logical description>.
  13. Trigger functions and functions required comments, to understand what they do and why required.
  14. Not used trigger functions and functions must be mark with * from start, which later will be deleted.
  15. Triggers must be start with trg_[repl|ins|upd|del]_<schema_name>_<table_name>_<logical description>.
  16. Sequence must be start with <schema_name>_<table_name>_<column_name>_seq. PostgreSQL create sequence automatically by this rule if you use serial datatype.
  17. Views must be start with vw_<schema_name>_<purpose|logical description>.
  18. Types must be start with type_ + copied function name without prefix.

p.s. suffix reasons:

  1.   You win 1-2 sec. of time when you looking table properties with all constraints and etc.
  2.   You see clear what object you want to open (especially if name is long)
  3.   When you write dbvs scripts and have big list of object is easy to see categorized block.

Tables:

  1. Table name must be lowercase.
  2. Table columns must be <short table name>_<column_name>. Example: Table name bil.receipt (where bil is schema). Column names of this table: rec_id, rec_person_id,...
  3. Table must to have columns: xxx_deleted boolean NOT NULL DEFAILT false
  4. Table which needs process logging need two additional tables: <same table name>_process and <same table name>_process_log (xxx_process_log should be in "log" schema).
  5. We need for all tables created_on, created_by, modified_by ant etc. Exept Archive, Log and temp tables. Having correct process for table and it process_log we will have monitoring possibility and reports correct realization.
  6. Table must to have <= 20 columns. If needed more, create extended table with sufix <table_name>_ext with FK relations to main table correctly.
  7. Not required column of table can be NULL value, all other important columns must be NOT NULL.
  8. Table doesn't have delete rules possibilities, only db owner could that.
  9. Stable completeness business rules should be modeled by not null constraints.

Audit:

  1. Every table must have audit columns: created_by, created_on. Except Archive, Log and temp tables.
  2. Every table must have columns (last) changed_by, changed_on. Except log, archive, temporary tables.
  3. All data changes to transnational tables must be logged. Recommended patterns.
    1. log table. Log table is copy of all columns of original table (including audit trail columns) plus columns log_created_by, log_created_datetime, log_oper_code (I, U, D)
    2. Reuse History table. History table must consist columns start_datetime, end_daterime
  4. Recommended data changes logging techniques in DB are triggers for Log tables and stored procedure for History tables. In case of Log tables we recommend to log all values, including first insert, all updates, delete values.
  5. In general delete data records from DB is not allowed. Use only "logical delete" and delete data only when data is archived. Temporary (outside normal business process) data may be deleted.

Archiving:

  1. Data Model (of each concrete solution) must be designed for business growth and scalability (solution must work if data volumes increases 10 times)
  2. Archiving must be analyzed and modeled.
  3. Archive database should be established and maintained to unload the production DB.
  4. Archive tables must have columns arch_datetime, arch_by
  5. Archiving between databases must be DATE column type based and follow “rolling-window” concept.
  6. Status based archiving must be converted to “rolling window” based archiving inside source database. Example: 1step CONTRACTS to CONTRACTS_ARCH); 2 step: Archiving of CONTRACTS_ARCH follows DATE column type.
  7. Archiving must use Pull method if database link is used for data transfer (Target connects/reads from source).
  8. Jobs scheduler should automate and coordinate archiving activities.
  9. Archiving scripts must be stored in version control systems and follow release process.
  10. Archiving logic recommended to store in DB.
  11. Partitions should be truncated and/or dropped automatically based on configuration.
  12. Daily full table copies (replacing) to archive should be avoided.
  13. Archiving procedures/process/program must be resilient in case of failure (must continue work from the point of failure without any manual clean-up).

Dropping:

  1. Business data is kept according to business rules
  2. Temporary tables should be marked with tmp_ prefix.
  3. Tables dropping should follow 2 step principle:
    1. rename with date suffix and check for possible dependencies.
    2. drop the tables and make the copy to archive DB (optional)


Define not used DB indexes in PostgreSQL