As commitment to our database literacy campaign, we're offering our Database Foundations course—for FREE!

Skip to main content
Completion requirements

1. Advanced Query Writing

In this section, we'll see how fine-tuned syntax and best practices can significantly impact query performance.

A. Syntax Refinements and Best Practices

  • Clarity and Readability:
    Write SQL statements that are not only correct but also easy to read and maintain. For instance:

    SELECT employee_id, first_name, last_name 
    FROM employees 
    WHERE department_id = 10;
          

    Instead of:

    SELECT e_id, fname, lname 
    FROM employees 
    WHERE d_id = 10;
          
  • Appropriate Use of Aliases:
    Use column and table aliases to improve readability, especially in complex queries:

    SELECT e.employee_id, e.first_name, e.last_name, d.department_name
    FROM employees AS e
    JOIN departments AS d ON e.department_id = d.department_id;
    
  • Consistent Formatting:
    Adopt a consistent formatting style for indentation, capitalization of SQL keywords, and alias naming. This not only aids review by others but can also reduce errors when modifying queries.

  • Parameterization:
    Use parameterized queries to avoid SQL injection and to allow query caching by the database engine.

B. Subqueries vs. JOINs: When to Use Each

  • Subqueries:
    Subqueries can be used in SELECT, FROM, or WHERE clauses. They are useful when you need to first compute a result before performing a join or filter operation. For example:

    SELECT employee_id, first_name, last_name
    FROM employees
    WHERE department_id IN (SELECT department_id FROM departments WHERE location_id = 1700);
    

    Use Cases and Considerations:

    • When the subquery returns a single or a small dataset.
    • When logical separation improves clarity, but know that some complex subqueries might be less performant than a well-indexed JOIN.
  • JOINs:
    JOINs are generally more performant as they allow the use of indexes and are generally optimized by the DBMS. For example, instead of using a subquery:

    SELECT e.employee_id, e.first_name, e.last_name
    FROM employees AS e
    JOIN departments AS d ON e.department_id = d.department_id
    WHERE d.location_id = 1700;
    

    Use Cases and Considerations:

    • When merging rows from two or more tables efficiently is required.
    • When relational integrity must be enforced by retrieving data from multiple tables.

C. Query Rewriting Techniques for Performance Gains

  • Avoid SELECT *:
    Only retrieve the columns you need. This minimizes I/O and reduces memory consumption.

    -- Problematic approach:
    SELECT * FROM orders;
    
    -- Optimized approach:
    SELECT order_id, order_date, customer_id FROM orders;
    
  • Filtering Early:
    Use filters (WHERE clauses) as early as possible in your queries to reduce the number of rows processed:

    SELECT order_id, order_date
    FROM orders
    WHERE order_date >= '2023-01-01';
    
  • Index Usage:
    Ensure that indexed columns appear in JOIN conditions and WHERE filters to take advantage of faster data access.

  • Using Derived Tables and Common Table Expressions (CTEs): Recursive Queries and Modular Query Design

    • Introduction to CTEs:
      CTEs improve the readability and maintainability of SQL queries. They provide a way to define temporary result sets that can be referenced within a SELECT, INSERT, UPDATE, or DELETE statement.

    • Non-Recursive CTE Example:

      WITH TopOrders AS (
          SELECT order_id, amount
          FROM orders
          WHERE amount > 1000
      )
      SELECT * FROM TopOrders;
      
    • Recursive CTE Example:
      Recursive queries are particularly useful for hierarchical data (e.g., organizational charts, bill of materials).

      WITH EmployeeHierarchy AS (
          SELECT employee_id, manager_id, first_name, last_name, 0 AS level
          FROM employees
          WHERE manager_id IS NULL
          
          UNION ALL
          
          SELECT e.employee_id, e.manager_id, e.first_name, e.last_name, eh.level + 1
          FROM employees e
          JOIN EmployeeHierarchy eh ON e.manager_id = eh.employee_id
      )
      SELECT * FROM EmployeeHierarchy;
      
    • Modular Query Design:
      CTEs enable decomposition of complex queries into smaller, logical units that can be developed, tested, and optimized individually.