-
Notifications
You must be signed in to change notification settings - Fork 3
Description
Motivation
BOYCE-CODD NORMAL FORM VIOLATION (P177) from Garey & Johnson, A4 SR29. A classical NP-complete problem in database theory. Determining whether a subset of attributes violates Boyce-Codd normal form is fundamental to relational database schema design and normalization.
Associated reduction rules:
- R123: Hitting Set -> Boyce-Codd Normal Form Violation (this problem is the target)
Definition
Name: BoyceCoddNormalFormViolation
Canonical name: Boyce-Codd Normal Form Violation (also: BCNF Violation, BCNF Testing)
Reference: Garey & Johnson, Computers and Intractability, A4 SR29
Mathematical definition:
Background: A functional dependency X → Y means that the values of attributes X uniquely determine the values of attributes Y. The closure X+ of a set X under F is the set of all attributes determined (directly or transitively) by X. A set X is a superkey if X+ = A' (it determines everything). A relation satisfies Boyce-Codd Normal Form (BCNF) if every non-trivial functional dependency X → Y has X as a superkey — i.e., whenever X determines something, it determines everything.
INSTANCE: A set A of attribute names, a collection F of functional dependencies on A, and a subset A' ⊆ A.
QUESTION: Is there a subset X ⊆ A' and two attributes y, z ∈ A' \ X such that y ∈ X+ but z ∉ X+? (In other words, X determines y but not z, meaning X is not a superkey — a BCNF violation.)
Variables
- Count:
target_subset.len()binary variables, one per attribute in A'. - Per-variable domain: binary {0, 1} — whether the attribute is included in X.
dims():vec![2; target_subset.len()]evaluate(): Let X = {target_subset[i] : config[i] = 1}. Compute the closure X+ under F. Then check all pairs (y, z) in A' \ X: returntrueif there exist y, z ∈ A' \ X such that y ∈ X+ and z ∉ X+ (i.e., X determines y but not z, so X is not a superkey).
Schema (data type)
Type name: BoyceCoddNormalFormViolation
Variants: none (no graph or weight type parameter; functional dependencies are stored directly)
| Field | Type | Description |
|---|---|---|
num_attributes |
usize |
Number of attributes in A (attributes indexed 0..num_attributes) |
functional_deps |
Vec<(Vec<usize>, Vec<usize>)> |
Collection F of functional dependencies; each is (lhs_attributes, rhs_attributes) |
target_subset |
Vec<usize> |
The subset A' ⊆ A to test for BCNF violation |
Size fields (getter methods for overhead expressions and declare_variants!):
num_attributes()— returnsnum_attributesnum_functional_deps()— returnsfunctional_deps.len()num_target_attributes()— returnstarget_subset.len()
Complexity
- Decision complexity: NP-complete (Beeri and Bernstein, 1979; transformation from HITTING SET).
- Best known exact algorithm: Brute-force: enumerate all 2^|A'| subsets X, compute closure X+ under F, check if X+ contains some but not all of A' \ X. Total: O(2^num_target_attributes * num_target_attributes^2 * num_functional_deps).
declare_variants!complexity string:"2^num_target_attributes * num_target_attributes^2 * num_functional_deps"- Parameterized: The problem remains NP-complete even when A' is required to satisfy third normal form (3NF). For the special case where functional dependencies form a hierarchy, BCNF testing is linear time.
- References:
- [Beeri and Bernstein, 1979] C. Beeri and P. A. Bernstein, "Computational Problems Related to the Design of Normal Form Relational Schemas", ACM Trans. Database Systems, 4(1), pp. 30-59, 1979.
- [Bernstein and Beeri, 1976] P. A. Bernstein and C. Beeri, "An Algorithmic Approach to Normalization of Relational Database Schemas", University of Toronto, 1976.
Specialization
- Related problems: The problem is closely related to key determination and superkey testing for relational schemas.
- Restriction: When all functional dependencies have single-attribute right-hand sides and the dependency graph is hierarchical, the problem is solvable in polynomial time.
- Generalization of: Testing whether a specific FD violates BCNF (which is polynomial) to testing whether any violation exists in a given attribute subset.
Extra Remark
Full book text:
INSTANCE: A set A of attribute names, a collection F of functional dependencies on A, and a subset A' ⊆ A.
QUESTION: Does A' violate Boyce-Codd normal form for the relational system <A,F>, i.e., is there a subset X ⊆ A' and two attribute names y,z in A' - X such that (X,{y}) in F* and (X,{z}) not in F*, where F* is the closure of F?
Reference: [Bernstein and Beeri, 1976], [Beeri and Bernstein, 1978]. Transformation from HITTING SET.
Comment: Remains NP-complete even if A' is required to satisfy "third normal form," i.e., if X ⊆ A' is a key for the system <A',F> and if two names y,z in A'-X satisfy (X,{y}) in F* and (X,{z}) not in F*, then z is a prime attribute for <A',F>.
How to solve
- It can be solved by (existing) bruteforce. Enumerate all subsets X of A'; for each X and each pair (y, z) in A' - X, compute the attribute closure X+ under F and check if y in X+ but z not in X+.
- It can be solved by reducing to integer programming.
- Other: (none identified)
Example Instance
Instance 1 (YES — BCNF violation exists):
num_attributes = 6 (attributes: {0, 1, 2, 3, 4, 5})
functional_deps = [({0,1}, {2}), ({2}, {3}), ({3,4}, {5})]
target_subset = [0, 1, 2, 3, 4, 5] (all attributes)
The only candidate key for A' is {0, 1, 4}: closure {0,1,4} -> {2} -> {3} -> {5} = A.
Consider X = {2}, y = 3, z = 0:
- Closure of {2}: {2} -> {3} via FD2, so {2}+ = {2, 3}
- y = 3: 3 ∈ {2}+ → X determines y. YES.
- z = 0: 0 ∉ {2}+ → X does not determine z. YES.
- {2} is not a superkey (closure is only {2, 3}, not all of A').
- Therefore X = {2} with y = 3, z = 0 witnesses a BCNF violation.
Answer: YES.
Instance 2 (NO — A' satisfies BCNF):
num_attributes = 4 (attributes: {0, 1, 2, 3})
functional_deps = [({0,1}, {2,3}), ({2,3}, {0,1}), ({0,2}, {1,3}), ({1,3}, {0,2})]
target_subset = [0, 1, 2, 3]
There are 4 candidate keys: {0,1}, {2,3}, {0,2}, {1,3} — a cyclic key structure.
Check all 16 subsets X ⊆ A':
- Singletons {0}, {1}, {2}, {3}: no FD fires, so X+ = X. Trivial (determines nothing new).
- {0,1}+: apply {0,1}->{2,3} → {0,1,2,3} = A'. Superkey.
- {0,2}+: apply {0,2}->{1,3} → {0,1,2,3} = A'. Superkey.
- {1,3}+: apply {1,3}->{0,2} → {0,1,2,3} = A'. Superkey.
- {2,3}+: apply {2,3}->{0,1} → {0,1,2,3} = A'. Superkey.
- {0,3}+: no FD fires (each FD needs a pair not contained in {0,3}). X+ = {0,3}. Trivial.
- {1,2}+: no FD fires. X+ = {1,2}. Trivial.
- All triples and A' itself: superkeys (contain at least one 2-element key).
Every subset either has trivial closure (X+ = X) or is a superkey (X+ = A'). No subset has a partial closure, so no BCNF violation exists.
Answer: NO, A' satisfies BCNF.
Metadata
Metadata
Assignees
Labels
Type
Projects
Status