As commitment to our database literacy campaign, we're offering our Certified Database Practitioner course—for FREE!

Skip to main content
Completion requirements

1. Advanced Query Writing

In this section, we'll see how fine-tuned syntax and best practices can significantly impact query performance.

A. Syntax Refinements and Best Practices

  • Clarity and Readability:
    Write SQL statements that are not only correct but also easy to read and maintain. For instance:

    SELECT employee_id, first_name, last_name 
    FROM employees 
    WHERE department_id = 10;
          

    Instead of:

    SELECT e_id, fname, lname 
    FROM employees 
    WHERE d_id = 10;
          
  • Appropriate Use of Aliases:
    Use column and table aliases to improve readability, especially in complex queries:

    SELECT e.employee_id, e.first_name, e.last_name, d.department_name
    FROM employees AS e
    JOIN departments AS d ON e.department_id = d.department_id;
    
  • Consistent Formatting:
    Adopt a consistent formatting style for indentation, capitalization of SQL keywords, and alias naming. This not only aids review by others but can also reduce errors when modifying queries.

  • Parameterization:
    Use parameterized queries to avoid SQL injection and to allow query caching by the database engine.

B. Subqueries vs. JOINs: When to Use Each

  • Subqueries:
    Subqueries can be used in SELECT, FROM, or WHERE clauses. They are useful when you need to first compute a result before performing a join or filter operation. For example:

    SELECT employee_id, first_name, last_name
    FROM employees
    WHERE department_id IN (SELECT department_id FROM departments WHERE location_id = 1700);
    

    Use Cases and Considerations:

    • When the subquery returns a single or a small dataset.
    • When logical separation improves clarity, but know that some complex subqueries might be less performant than a well-indexed JOIN.
  • JOINs:
    JOINs are generally more performant as they allow the use of indexes and are generally optimized by the DBMS. For example, instead of using a subquery:

    SELECT e.employee_id, e.first_name, e.last_name
    FROM employees AS e
    JOIN departments AS d ON e.department_id = d.department_id
    WHERE d.location_id = 1700;
    

    Use Cases and Considerations:

    • When merging rows from two or more tables efficiently is required.
    • When relational integrity must be enforced by retrieving data from multiple tables.

C. Query Rewriting Techniques for Performance Gains

  • Avoid SELECT *:
    Only retrieve the columns you need. This minimizes I/O and reduces memory consumption.

    -- Problematic approach:
    SELECT * FROM orders;
    
    -- Optimized approach:
    SELECT order_id, order_date, customer_id FROM orders;
    
  • Filtering Early:
    Use filters (WHERE clauses) as early as possible in your queries to reduce the number of rows processed:

    SELECT order_id, order_date
    FROM orders
    WHERE order_date >= '2023-01-01';
    
  • Index Usage:
    Ensure that indexed columns appear in JOIN conditions and WHERE filters to take advantage of faster data access.

  • Using Derived Tables and Common Table Expressions (CTEs): Recursive Queries and Modular Query Design

    • Introduction to CTEs:
      CTEs improve the readability and maintainability of SQL queries. They provide a way to define temporary result sets that can be referenced within a SELECT, INSERT, UPDATE, or DELETE statement.

    • Non-Recursive CTE Example:

      WITH TopOrders AS (
          SELECT order_id, amount
          FROM orders
          WHERE amount > 1000
      )
      SELECT * FROM TopOrders;
      
    • Recursive CTE Example:
      Recursive queries are particularly useful for hierarchical data (e.g., organizational charts, bill of materials).

      WITH EmployeeHierarchy AS (
          SELECT employee_id, manager_id, first_name, last_name, 0 AS level
          FROM employees
          WHERE manager_id IS NULL
          
          UNION ALL
          
          SELECT e.employee_id, e.manager_id, e.first_name, e.last_name, eh.level + 1
          FROM employees e
          JOIN EmployeeHierarchy eh ON e.manager_id = eh.employee_id
      )
      SELECT * FROM EmployeeHierarchy;
      
    • Modular Query Design:
      CTEs enable decomposition of complex queries into smaller, logical units that can be developed, tested, and optimized individually.