Boost BigQuery Performance
Share on facebook
Share on linkedin
Share on twitter
Share on reddit

5 ways to boost BigQuery performance with superQuery

Google BigQuery is arguably the most popular and powerful serverless data warehouses in the market. However, as powerful as it already is, improving BigQuery performance is constantly on the minds of data & analytics teams.

And while Google and others have given great tips for optimizing performance, in this post you’re going to learn how to take your BigQuery performance & productivity to the next level with superQuery.

Through our IDE for BigQuery, superQuery enables data & analytics teams to use BigQuery at peak performance.

However, the definition of “peak performance” varies, depending on your role.

  • If you’re a data scientist or analyst, that might mean shortening your workflows in BQ to do more work in less time.

     

  • In contrast, if you’re a Head of Data / Analytics / BI, peak performance could mean optimizing BigQuery resource usage across your team. Or having more fine-grain controls for managing costs.

Whatever your role is, if you want to improve the way you and your team use BigQuery, continue reading.

Let’s get started

BigQuery Performance Tip #1:

Work on multiple queries with Tabs

When working on several queries for a work assignment, you have several options:

Option #1: Go back and forth between queries, saving one before switching to work on another. 

However, in this case you’ll lose the query results of your first query to execute the second.

Option #2: Open multiple Chrome tabs running BigQuery. 

Yet, the more Chrome tabs open, the harder it becomes to find the query you’re looking for. 

Query tabs in superQuery address the drawbacks of both options.

Improve BigQuery Performance with Query Tabs

Query tabs allow you to work across multiple queries in a single window, but there’s much more beyond that:

  • The SQL you write persists over time. If you close your session or refresh the page, they’ll still be there when you return. No more needing to remember to hit that “Save Query” button!
  • Is one query taking 50 seconds to execute? Work on a different query in another tab while it runs.
  • Run queries in parallel or, as you would in Airflow, in a sequence with one query building off the previous one’s result. You can even schedule these in a single scheduler!
  • Each tab has its own results history, so you won’t need to re-execute queries to get previous results.
Query tab results
Results history of one tab

BigQuery Performance Tip #2:

Categorize your queries with Boards​

So now that your queries don’t disappear automatically, how do you organize all of those query tabs?

The answer is by using Boards in superQuery.

Boards are like folders where you store and manage your queries. They’re Saved Queries on steroids, allowing you to categorize your queries by department, data project, or however you’d like!

Rather than having one list containing 50+ unrelated Saved Queries, split up your queries into Boards. As result, queries will be easier to find and you’ll work with more clarity.

Diagram showing how to organize your BigQuery queries using superQuery

BigQuery Performance Tip #3:

Write dynamic queries with variables​

How much time do you spend repeatedly making small changes to your SQL? Or using CTRL-F + Replace All to change values that appear multiple times throughout a lengthy query?

When you find yourself in these situations, then it’s time to insert variables into your SQL.

Variables (sometimes called “parameters“) are simply placeholders for values. They make your queries dynamic and more readable. Plus, they allow you to perform quick changes to your query without modifying the SQL itself.

The best part? In superQuery, you can create variables without using any SQL

Let’s see how.

How to set up variables in superQuery

Rather than writing a DECLARE and SET clause, you just need to perform three quick steps to create a variable:
 
  1. Select your variable’s data type.
  2. Give your variable a name.
  3. Assign it an initial value.
Create Variables in superQuery
Creating a String variable for product category called "category" with a value of "Apparel"

Then, insert variables into your query by substituting the literal value with @your_variable_name, as shown below.

Inserting Variables SQ
How to insert your variable into a query

Changing variable values is also done without SQL — no SET clause necessary. Just edit the value in the text box below your variable’s name.

Change Variable Value SQ
Change variable values with zero SQL

Variables can represent DATE, DATETIME, or STRING data types. 

Outside of WHERE clauses, they can also be used to substitute column, project, dataset, or table values — even partially, which is great if your table names end with a date like “daily_sales20191130”.

Variables for Dates
Using variables for Date values
Using variables to substitute table values
Using variables to substitute table values

As seen above, using variables helps improve your BigQuery performance by enabling you to make quick edits to your queries. Less time using CTRL-F + Replace All means a more productive analyst.

BigQuery Performance Tip #4:

Explore your query results in one click

Sometimes viewing your query results in table format isn’t enough. It doesn’t tell the whole story.

Furthermore, you don’t want to go through exporting results and loading them into another tool.

You want a visualization of your results, and you want it now.

In superQuery, that’s achievable with one click of the Chart icon in your results grid.

Visualize query result with SuperQuery
Instantly visualize your query result

Other times, you may want to quickly combine the results of all your queries into one dashboard.

For that, click on the Visualize toggle, as shown below. This toggle visualizes your entire Board, into shareable dashboards.

Create dashboards SQ
Transform your queries into a dashboard

BigQuery Performance Tip #5:

Version-control your queries automatically

Have you ever written what you thought was a one-time query, only to have to recreate it from memory a few weeks later? 

Or have you ever found yourself needing a previous version of a query you wrote?

With a query library — a repository for saving, updating, and tracking changes in your queries — you’ll never find yourself troubled with these situations again.

However as your query count piles, it can get tedious manually managing your query library.

But by connecting your GitHub repository to superQuery, your query library will automatically update itself.

How does it work? 

Every time you execute a query and there is a change in the SQL, it will automatically get committed to your selected GitHub repository. 

Version History for a query with the superQuery integration for Github

 

Your queries are catalogued under a hierarchy of Boards > Query Tabs, making your library easy to navigate (as long as you give your boards & tabs good names).

Bonus Tip:

Explore your BigQuery data in Dark Mode

Some prefer Dark Mode when working at night. Others for accessibility reasons. Or, you might simply like the aesthetics.

Whatever the reason, Dark Mode is just a click away in superQuery!

Besides, is it *really* an IDE if there isn’t a Dark Mode?

 

Explore BigQuery Data in Dark Mode

We use 🍪 to ensure you get the best experience.