This page covers essential concepts in designing databases and ensuring they meet processing and storage efficiency through normalization.
Data Modeling
Conceptual Data Modeling
- Definition: High-level view representing business concepts and their relationships
- Components: Entities, relationships, and attributes without implementation details
- Purpose: To understand and communicate business requirements
- Techniques: Entity-Relationship Diagrams (ERDs) with minimal technical specifications
- Audience: Business stakeholders and analysts
Logical Data Modeling
- Definition: Translation of conceptual model into a database-oriented structure
- Components: Tables, columns, keys, and relationships with data types
- Purpose: Bridge between business requirements and technical implementation
- Techniques: Normalized relational schema, detailed ERDs
- Key Considerations: Data integrity, referential constraints, and cardinality
Physical Data Modeling
- Definition: Implementation-specific design for the target database system
- Components: Indexes, partitions, tablespaces, storage parameters
- Optimization Focus: Performance, storage efficiency, query execution plans
- DBMS-Specific Features: Utilizing vendor-specific capabilities (Oracle, SQL Server, PostgreSQL)
- Implementation Concerns: Denormalization strategies, indexing strategies, clustering
Normalization Principles
First Normal Form (1NF)
- Rule: Eliminate repeating groups; each cell must contain a single value
- Process: Identify and remove repeating columns or multi-valued attributes
- Benefits: Simplifies queries and updates
- Example: Converting a spreadsheet with multiple phone numbers in one cell into separate records
Second Normal Form (2NF)
- Rule: Must be in 1NF, and all non-key attributes must depend on the entire primary key
- Process: Remove partial dependencies by creating separate tables
- Benefits: Reduces data redundancy and update anomalies
- Example: Separating order details from customer information in an order processing system
Third Normal Form (3NF)
- Rule: Must be in 2NF, and no non-key attributes can depend on other non-key attributes
- Process: Eliminate transitive dependencies
- Benefits: Further reduces redundancy and improves data integrity
- Example: Moving city and state information from a customer table to a separate locations table
Boyce-Codd Normal Form (BCNF)
- Rule: For any dependency A → B, A must be a superkey
- Process: Restructuring to ensure all determinants are candidate keys
- When to Apply: Complex systems with multiple candidate keys
- Trade-offs: Increased complexity vs. improved data integrity
Fourth and Fifth Normal Forms
- 4NF: Addresses multi-valued dependencies
- 5NF: Addresses join dependencies
- Advanced Applications: Complex data systems requiring high levels of data integrity
- Implementation Considerations: Performance impacts of high normalization levels
Entity-Relationship Diagrams (ERD)
ERD Components
- Entities: Rectangles representing business objects (e.g., Customer, Product)
- Attributes: Properties of entities (e.g., CustomerID, ProductName)
- Relationships: Lines connecting entities with cardinality notations
- Cardinality Types: One-to-one, one-to-many, many-to-many
- Primary and Foreign Keys: Representation in diagrams
ERD Notation Systems
- Chen Notation: Original notation focused on clarity for business stakeholders
- Crow's Foot Notation: Popular notation showing cardinality constraints visually
- UML Class Diagrams: Object-oriented approach to data modeling
- IDEF1X: Government and military standard for data modeling
Translating Business Rules to ERDs
- Identifying Entities: Converting nouns in requirements to entities
- Establishing Relationships: Interpreting verbs and business processes
- Determining Cardinality: Analyzing how entities relate quantitatively
- Validation Techniques: Verifying the model against user stories and use cases
- Iterative Refinement: Methods for evolving the model based on stakeholder feedback
Advanced ERD Concepts
- Weak Entities: Entities dependent on others for identification
- Associative Entities: Resolving many-to-many relationships
- Subtype/Supertype Relationships: Modeling hierarchies and inheritance
- Recursive Relationships: Entities related to themselves
- Constraints and Business Rules: Documenting additional requirements on ERDs
Last modified: Thursday, 10 April 2025, 3:56 PM