BI Query Patterns
Foundation Query Patterns
BI Query Architecture
How queries flow through the BI stack, from user interface to database and back
Legend
Components
Connection Types
Dimensional Analysis Patterns
Time-Based Analysis Patterns
Time-based analysis provides crucial context for business metrics:
- Identifies trends and seasonal patterns in key performance indicators
- Enables year-over-year and quarter-over-quarter performance tracking
- Measures the impact of business initiatives over time
- Supports forecasting and predictive modeling
- Aligns reporting with fiscal planning cycles
Time-Based Analysis Patterns
SQL implementations for time-based analysis:
- Period-over-period comparisons with window functions
- Moving averages and cumulative sums
- Fiscal vs. calendar period alignment
- Time-based cohort analysis
- Custom business calendar implementations
Period-over-Period Comparison Query
WITH current_period AS (
SELECT
date_trunc('month', order_date) AS period,
SUM(revenue) AS total_revenue
FROM orders
WHERE order_date BETWEEN '2023-01-01' AND '2023-03-31'
GROUP BY 1
),
previous_period AS (
SELECT
date_trunc('month', order_date) AS period,
SUM(revenue) AS total_revenue
FROM orders
WHERE order_date BETWEEN '2022-01-01' AND '2022-03-31'
GROUP BY 1
)
SELECT
cp.period,
cp.total_revenue AS current_revenue,
pp.total_revenue AS previous_revenue,
(cp.total_revenue - pp.total_revenue) AS absolute_change,
CASE
WHEN pp.total_revenue = 0 THEN NULL
ELSE ROUND(((cp.total_revenue - pp.total_revenue) / pp.total_revenue) * 100, 2)
END AS percentage_change
FROM current_period cp
LEFT JOIN previous_period pp ON
date_part('month', cp.period) = date_part('month', dateadd('year', 1, pp.period))
ORDER BY cp.period
Moving Average Pattern
SELECT
order_date,
daily_revenue,
AVG(daily_revenue) OVER (
ORDER BY order_date
ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
) AS seven_day_moving_avg,
AVG(daily_revenue) OVER (
ORDER BY order_date
ROWS BETWEEN 29 PRECEDING AND CURRENT ROW
) AS thirty_day_moving_avg
FROM (
SELECT
date_trunc('day', order_date) AS order_date,
SUM(revenue) AS daily_revenue
FROM orders
GROUP BY 1
) daily_totals
ORDER BY order_date
Hierarchical Analysis
Hierarchical analysis helps organizations understand relationships and aggregations:
- Organizational reporting that rolls up metrics by management structure
- Product categorization analysis with multiple classification levels
- Geographic grouping from local to regional to global metrics
- Budget allocation tracking across departmental hierarchies
- Customer segmentation with nested attributes
Hierarchical Analysis
Technical implementation of hierarchical data queries:
- Recursive CTEs for traversing organizational structures
- Self-referential joins for parent-child relationships
- Materialized path pattern for fast hierarchy lookups
- Nested set model for efficient subtree operations
- Bridge tables for many-to-many hierarchies
Recursive CTE for Organizational Hierarchy
WITH RECURSIVE org_hierarchy AS (
-- Base case: top-level managers
SELECT
employee_id,
employee_name,
manager_id,
department,
job_title,
salary,
1 AS hierarchy_level,
employee_name::TEXT AS path,
employee_id::TEXT AS id_path
FROM employees
WHERE manager_id IS NULL
UNION ALL
-- Recursive case: employees with managers
SELECT
e.employee_id,
e.employee_name,
e.manager_id,
e.department,
e.job_title,
e.salary,
oh.hierarchy_level + 1,
oh.path || ' > ' || e.employee_name,
oh.id_path || ',' || e.employee_id
FROM employees e
JOIN org_hierarchy oh ON e.manager_id = oh.employee_id
)
SELECT
employee_id,
employee_name,
manager_id,
department,
job_title,
salary,
hierarchy_level,
path AS reporting_structure,
id_path
FROM org_hierarchy
ORDER BY id_path
Aggregation Patterns
Multi-Level Aggregations
Multi-level aggregations provide flexibility in business reporting:
- Creates reports that combine detail and summary in one view
- Enables drill-down exploration from high-level KPIs to granular data
- Supports comparative analysis across multiple business dimensions
- Improves report readability with appropriate level of detail
- Facilitates exception reporting and anomaly detection
Multi-Level Aggregations
Technical implementation of multi-level aggregations:
- ROLLUP, CUBE, and GROUPING SETS operations
- Dynamic pivoting with CASE expressions
- Cross-tab queries with conditional aggregation
- Pre-aggregation tables for performance
- Handling sparse data in multi-dimensional analysis
ROLLUP Example for Sales Hierarchy
SELECT
COALESCE(region, 'All Regions') AS region,
COALESCE(country, 'All Countries') AS country,
COALESCE(product_category, 'All Categories') AS product_category,
SUM(sales_amount) AS total_sales,
COUNT(DISTINCT customer_id) AS customer_count,
SUM(sales_amount) / COUNT(DISTINCT customer_id) AS average_customer_spend,
GROUPING(region) AS is_region_subtotal,
GROUPING(country) AS is_country_subtotal,
GROUPING(product_category) AS is_category_subtotal
FROM sales
GROUP BY ROLLUP (region, country, product_category)
ORDER BY
GROUPING(region), region,
GROUPING(country), country,
GROUPING(product_category), product_category
Performance Optimization Patterns
Query Optimization Strategies
Key strategies for optimizing BI query performance in data warehouses
Legend
Components
Connection Types
Common Performance Optimizations
-
Query Rewriting Techniques:
- Use EXISTS instead of IN for better performance with subqueries
- Avoid unnecessary DISTINCT operations
- Replace complex JOINs with more efficient EXISTS clauses
- Push down predicates to filter data early
-
Materialization Strategies:
- Create summary tables for common aggregations
- Implement incremental refresh patterns
- Use materialized views with intelligent refresh policies
- Implement aggregate awareness in semantic layers
-
Partition Pruning:
- Design partition keys based on common filter patterns
- Implement time-based partitioning for historical analysis
- Use compound partitioning for multi-dimensional filtering
- Align partition strategy with query patterns
Advanced Analytical Patterns
Customer Journey Analysis
Customer journey analysis reveals actionable insights about customer behavior:
- Identifies drop-off points in conversion funnels
- Shows which marketing channels influence purchasing decisions
- Reveals typical paths customers take before making a purchase
- Helps optimize website and app user experience
- Improves marketing budget allocation based on touchpoint effectiveness
Customer Journey Analysis
Technical implementation of customer journey analysis:
- Path analysis using window functions and self-joins
- First-touch vs. last-touch attribution models
- Funnel analysis with sequential pattern matching
- Conversion rate calculation across touchpoints
- Time-to-conversion metrics with interval analysis
Funnel Analysis Query
WITH sessions AS (
SELECT
user_id,
session_id,
event_type,
event_timestamp,
ROW_NUMBER() OVER (PARTITION BY user_id, session_id ORDER BY event_timestamp) AS event_sequence
FROM user_events
WHERE event_timestamp BETWEEN '2023-01-01' AND '2023-03-31'
),
funnel_stages AS (
SELECT
date_trunc('week', event_timestamp) AS week,
COUNT(DISTINCT CASE WHEN event_type = 'page_view' THEN session_id END) AS stage_1_count,
COUNT(DISTINCT CASE WHEN event_type = 'product_view' THEN session_id END) AS stage_2_count,
COUNT(DISTINCT CASE WHEN event_type = 'add_to_cart' THEN session_id END) AS stage_3_count,
COUNT(DISTINCT CASE WHEN event_type = 'checkout_start' THEN session_id END) AS stage_4_count,
COUNT(DISTINCT CASE WHEN event_type = 'purchase' THEN session_id END) AS stage_5_count
FROM sessions
GROUP BY 1
)
SELECT
week,
stage_1_count,
stage_2_count,
stage_3_count,
stage_4_count,
stage_5_count,
ROUND(100.0 * stage_2_count / NULLIF(stage_1_count, 0), 2) AS stage_1_to_2_conversion,
ROUND(100.0 * stage_3_count / NULLIF(stage_2_count, 0), 2) AS stage_2_to_3_conversion,
ROUND(100.0 * stage_4_count / NULLIF(stage_3_count, 0), 2) AS stage_3_to_4_conversion,
ROUND(100.0 * stage_5_count / NULLIF(stage_4_count, 0), 2) AS stage_4_to_5_conversion,
ROUND(100.0 * stage_5_count / NULLIF(stage_1_count, 0), 2) AS overall_conversion
FROM funnel_stages
ORDER BY week
BI Query Anti-Patterns
Common Query Anti-Patterns to Avoid
-
Correlated Subqueries in Large Tables
- Problem: Performance degradation with large datasets
- Solution: Replace with JOINs or window functions
-
SELECT * in Production Queries
- Problem: Unnecessary data retrieval and network transfer
- Solution: Explicitly select only needed columns
-
Inefficient Date Handling
- Problem: Non-SARGable date conditions (e.g., using functions on indexed columns)
- Solution: Use range conditions that preserve index usage
-
Nested Views on Views
- Problem: Query optimizer struggles with deeply nested views
- Solution: Flatten view hierarchies or use materialization
-
Overuse of DISTINCT
- Problem: Expensive sort or hash operations
- Solution: Proper JOIN conditions or GROUP BY clauses