top of page
Search

Use Case – Evaluating Tableau Performance with Live Connection to Snowflake

  • Dganit Lebedev
  • Jul 1
  • 3 min read

Background

This article is intended for developers interested in using Tableau with live data sources in Snowflake to understand their behavior. By analyzing how queries are generated, executed, and optimized.


Tools Used for Data Analysis

Snowflake Query History

Provides access to query execution history data in Snowflake for performance analysis, monitoring, and troubleshooting.

Key Features:

  • Query Performance Metrics: Includes start time, end time, duration, sessions, number of rows returned, and data volume scanned.

  • Data Filtering: Allows filtering by time, user, or query text.

  • Bottleneck Identification: Helps identify queries that were queued, compiled, or took a long time to execute.

Accessible via the query_history table or Snowflake’s web interface, which displays queries executed in the last 14 days.

Snowflake Query Profile

A graphical tool for analyzing query execution stages and resource usage.

Key Features:

  • Graphical Representation: Displays a tree diagram of query execution stages.

  • Execution Time Analysis: Provides details on processing times, local and remote disk reads/writes, network communication, and more.

  • Statistics: Includes data volume scanned, row counts, and more.

Both tools are critical for optimizing Snowflake’s performance.

Use Case – Evaluating Tableau Performance with Live Connection to Snowflake

Findings

Tableau Behavior in Live Mode

Real-Time Queries:

  • Every interaction in Tableau (e.g., filter or parameter changes) sends individual queries to Snowflake.

  • Queries are executed per dashboard object, leading to sequential or parallel execution for affected elements.

  • Includes SELECT statements for fetching data required by visualizations.

Metadata and Context Queries:

  • Includes ALTER SESSION, SHOW, DESCRIBE,EXPLAIN and other queries to manage session context and retrieve metadata.

Temporary Tables:

  • Could be created for complex filters, sets, or action-based operations in Tableau.

  • Example: A "Submit All" button based on an Action Filter generates queries that create temporary tables.

Cache Usage:

  • Tableau Caching: Exists in both Tableau Desktop and Tableau Server/Cloud, significantly reducing rendering, calculation, and query loads. Clearing the Tableau Desktop Query Cache

  • Snowflake Caching:

    • Result Caching:

      • Query results persist for 24 hours after execution.

      • Cached results are reused if specific conditions are met, these are the main ones:

        • The new query matches the previous query syntactically.

        • The contributing table data remains unchanged.

        • Cached results are still available.

        • The query does not include real-time evaluation functions (e.g., CURRENT_TIMESTAMP).

      • Cached results eliminate query execution, reducing runtime significantly.

    • Warehouse Caching:

      • Active warehouses cache table data during query processing, improving performance for subsequent queries.

      • Cache size depends on the type and size of the warehouse.

      • Suspended warehouses clear their cache, causing slower initial performance until the cache is rebuilt.


Total of Round-Trip Time

Key Steps:

  1. Query Generation: Tableau dynamically generates SQL queries based on user interactions (e.g., applying filters or parameters).

  2. Connection Initialization: Time required to establish or reuse a connection, including authentication and session setup.

  3. Query Execution in Snowflake: Performance is influenced by:

    • Warehouse size (Scaling Up).

    • Concurrency levels (Scaling Out).

    • Query complexity.

  4. Data Transfer to Tableau: Affected by network latency, data volume, and data compression methods.

Rendering and Visualization in Tableau: Tableau processes and visualizes data, performing calculations, applying filters, and rendering visuals on the client or server side.


Performance Improvement Strategies

Vertical Scaling (Scale-Up)

Description: Increasing the size of the warehouse by adding more computing resources.

Advantages:

  • Enhances performance for large and complex queries.

  • Reduces query queuing when existing warehouse resources cannot handle high concurrency.

Notes:

  • Scaling up does not necessarily improve smaller or already optimized queries, as they may not utilize additional resources.

  • Resizing a running warehouse does not affect ongoing queries; new resources apply only to queued or subsequent queries.

Horizontal Scaling (Scale-Out)

Description: Expanding concurrency capabilities by adding multiple warehouse clusters (Multi-Cluster Warehouse) to an existing warehouse. Available in Snowflake Enterprise Edition or higher.

Advantages:

  • Improves the system’s ability to handle concurrent query loads.

  • Distributes load across clusters, minimizing query queuing times.

  • Optimizes resource utilization through Session ID allocation, enabling better parallel execution.

Dashboard Design Best Practices

  1. Simplify dashboard designs to enhance performance and usability.

  2. Reduce query complexity by leveraging filtering and summarization.


Identifying Parallel or Sequential Query Execution

Criteria:

  • Sequential execution occurs when one query begins only after another ends.

  • Queries sharing the same Session ID are part of the same session, therefore are necessarily sequential.

  • Overlapping start and end times indicate concurrent execution.

Optimization: By analyzing the queries sent, you can determine if queries spend significant time in the queue. This allows you to assess whether scaling out (adding additional clusters) is necessary to handle concurrency more effectively.


Auto-Suspend: 

Key Points

  1. Performance Impact:

    • When a warehouse suspends, its cache clears, and restarting takes time, affecting the first query's performance.

  2. Balancing Cost and Performance:

    • Short intervals save costs but can cause frequent restarts.

    • Longer intervals help maintain cache and improve performance for consistent workloads.

  3. Consider Startup Time:

    • Startup time should be factored into performance-critical scenarios.

  4. Additional Details:

For Additional Information


Comments


bottom of page