Can you explain the difference between Technical and Functional integrity?

Situation

In DATPROF Subset you can enforce Technical integrity and Functional Integrity. (also referred to as Functional Consistency)

In Subset I can select this in the Deployment Settings of the deployment screen of Subset.

I do not quite understand what they mean. Can you help?


Explanation

Technical Integrity

This is also named Referential integrity,

Given a Parent-Child relation the keyword here is the Parent table!

This means that after subsetting a check is done if all children records have a corresponding Parent record!

Subset performs a check after subsetting if all child records have a corresponding parent record.

If there are missing records these parent records are added

Only if the situation is consistent all constraints will be successfully restored.


SITUATION 1: Referential Integrity is NOT ok!

CHILD_TABLE

PARENT_TABLE


CHILD_KEY

CHILD_VALUE

PARENT_KEY

C1

Macron

P1

C2

Steinmeier

P2

C3

Trump

P3

C4

Rau

P2

C5

Mitterand

P1



PARENT_KEY

PARENT_VALUE

P1

France

P2

Germany


Missing is Parent Key 3 where CHILD_KEY 3 is referring to.

After subsetting Subset detects the missing record and adds this to the Parent Table


SITUATION 2: Referential Integrity is ok!

CHILD_TABLE

PARENT_TABLE


CHILD_KEY

CHILD_VALUE

PARENT_KEY

C1

Macron

P1

C2

Steinmeier

P2

C3

Trump

P3

C4

Rau

P2

C5

Mitterand

P1



PARENT_KEY

PARENT_VALUE

P1

France

P2

Germany

P3

United States of America



Functional Integrity

Functional Integrity is the other way around.

Given a Parent-Child relation the keyword here is the Child table!

First thing to bare in mind is that Functional Integrity really is functional. Technically the integrity is alright.

So, what ís it?

Functional Integrity means that you want all CHILD records belonging to a PARENT record.

Huh? But what is the use? We are subsetting, so why should I want all records.?

True, but....

In some situations you require not to have a subset but , due to calculations or checks, require all records.

Imagine you have:

  • a PARENT_TABLE with all Countries

  • a CHILD_TABLE With alle Presidents

  • a SUM_TABLE containing the total number of presidents per country.

The key is the SUM_TABLE.

Assume you want (FUNCTIONAL specification!) that the number in this table actually reflects to the number of presidents in the CHILD_TABLE.


SITUATION 1: Referential Integrity is ok but Functional Integrity is not.

PARENT_TABLE

CHILD_TABLE

SUM_TABLE


PARENT_KEY

PARENT_VALUE

P1

France

P2

Germany

P3

United States of America



CHILD_KEY

CHILD_VALUE

PARENT_KEY

C1

Macron

P1

C2

Steinmeier

P2

C3

Trump

P3

C4

Rau

P2

C5

Mitterand

P1




SUM_KEY

SUM_VALUE

PARENT_KEY

S1

2

P1

S2

3

P2

S3

2

P3


In this example you can see that PARENT_TABLE and CHILD_TABLE have a correct Technical Integrity. No problem.

But The SUM_TABLE contains the number of Presidents for that country. And we have decided that this number must match the actual presidents in the CHILD_TABLE.

SUM_TABLE says that France has 2, Germany has 3 and The USA has 2 presidents.

In the CHILD_TABLE we only see for France 2, Germany 2 for the USA 1 president.

In other words, if we do a reporting check we see that the addition of presidents in CHILD_TABLE does not match SUM_TABLE.

If the functional requirement is to have SUM_TABLE reflect the records in the CHILD_TABLE you should require ALL child records of PARENT_TABLE.

After Subsetting AND you have specified to use Functional Consistency all the CHILD_RECORDS are added. Once this is done the SUM_TABLE matches the number of records in CHILD_TABLE.


Functional Integrity is set on a individual Relation. It is not a general setting and should be used carefully.

Why? Since a adding records due to a Functional integrity rule might lead to a technical inconsistent situation which Subset will resolve. This might lead to an iterative situation (which can be limited by a setting ...)



Below the remaining presidents are added.


SITUATION 1: Referential Integrity is ok and Functional Integrity as well.

PARENT_TABLE

CHILD_TABLE

SUM_TABLE


PARENT_KEY

PARENT_VALUE

P1

France

P2

Germany

P3

United States of America



CHILD_KEY

CHILD_VALUE

PARENT_KEY

C1

Macron

P1

C2

Steinmeier

P2

C3

Trump

P3

C4

Rau

P2

C5

Mitterand

P1

C6

Obama

P3

C7

Weizsäcker

P2




SUM_KEY

SUM_VALUE

PARENT_KEY

S1

2

P1

S2

3

P2

S3

2

P3