Introducing Parallel Steps For Workflows: Speed Up Workflow Executions By Running Steps Concurrently

  • aster.cloud
  • July 20, 2022
  • 5 minute read

We’re excited to launch a new feature for Workflows, a serverless orchestrator for developers that connects multiple Google Cloud and external services. Parallel Steps—now in Preview—enables developers to run multiple concurrent steps, which can help reduce the time it takes to execute a workflow, particularly one that includes long-running operations like HTTP requests and callbacks.

To create a workflow, developers define a series of steps and order of execution. Each step performs an operation, like assigning variables, returning a value, or calling an HTTP endpoint.
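
As a minimal sketch of what such a definition can look like, the workflow below assigns a variable, calls an HTTP endpoint, and returns a value, one step after another. The step names and the example.com URL are placeholders chosen for illustration, not part of any real service.

```yaml
# Sketch only: step names and the URL are hypothetical.
main:
  steps:
    # Step 1: assign a variable
    - assignName:
        assign:
          - name: "Workflows"
    # Step 2: call an HTTP endpoint and store the response
    - callEndpoint:
        call: http.get
        args:
          url: https://example.com/hello
          query:
            name: ${name}
        result: response
    # Step 3: return a value from the workflow
    - returnBody:
        return: ${response.body}
```

Each step starts only after the previous one has finished, which is the default sequential behavior.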


By default, Workflows executes steps in sequential order, one at a time. Running steps in serial order like this can prove inefficient for long-running operations that take minutes, hours, or days because they include, for example, long-running calls, callbacks, polling, or waiting for human approval. (Workflows’ execution duration limit is one year.)

To address this inefficiency, we’ve introduced the ability to execute steps concurrently using parallel branches and parallel iteration, to speed up overall workflow execution time. A workflow can now contain both serial steps for sequential operations, and parallel steps for non-linear ones.

For many users, Workflows with parallel steps will be the most efficient way on Google Cloud to run a batch of services in parallel and aggregate the results. Because serverless compute services like Cloud Functions and Cloud Run can autoscale, you can use Workflows to run those services with high concurrency when needed, without needing to provision high capacity when idle.

Let’s take a closer look at how this new feature works.

Parallel steps in action: Running concurrent BigQuery jobs to speed up data processing

To test the benefit of parallel steps, consider an example tutorial: a workflow that runs five BigQuery jobs to process a Wikipedia dataset. In this tutorial, we compare parallel and non-parallel execution of the workflow, and see a major improvement in execution time when the BigQuery jobs are parallelized.

First, we executed the workflow serially, using the for loop below. Each BigQuery job takes about 20 seconds, so the five jobs bring total execution time to roughly 100 seconds:

Serial iteration

steps:
    - init:
        assign:
            - results : {} # result from each iteration keyed by table name
            - tables:
                - 201201h
                - 201202h
                - 201203h
                - 201204h
                - 201205h
    - runQueries:
        for:
            value: table
            in: ${tables}
            steps:
            - runQuery:
                call: googleapis.bigquery.v2.jobs.query
                args:
                    projectId: ${sys.get_env("GOOGLE_CLOUD_PROJECT_ID")}
                    body:
                        useLegacySql: false
                        useQueryCache: false
                        timeoutMs: 30000
                        # Find top 100 titles with most views on Wikipedia
                        query: ${
                            "SELECT TITLE, SUM(views)
                            FROM `bigquery-samples.wikipedia_pageviews." + table + "`
                            WHERE LENGTH(TITLE) > 10
                            GROUP BY TITLE
                            ORDER BY SUM(VIEWS) DESC
                            LIMIT 100"
                            }
                result: queryResult
            - returnResult:
                assign:
                    # Return the top title from each table
                    - results[table]: {}
                    - results[table].title: ${queryResult.rows[0].f[0].v}
                    - results[table].views: ${queryResult.rows[0].f[1].v}
    - returnResults:
        return: ${results}

 

Next, we tried executing the BigQuery jobs concurrently. Changing from a non-parallel to a parallel iteration was simple: add the parallel parameter and declare results as a shared variable, as shown in the runQueries step below:

Parallel iteration

steps:
    - init:
        assign:
            - results : {} # result from each iteration keyed by table name
            - tables:
                - 201201h
                - 201202h
                - 201203h
                - 201204h
                - 201205h
    - runQueries:
        parallel:
            shared: [results]
            for:
                value: table
                in: ${tables}
                steps:
                - runQuery:
                    call: googleapis.bigquery.v2.jobs.query
                    args:
                        projectId: ${sys.get_env("GOOGLE_CLOUD_PROJECT_ID")}
                        body:
                            useLegacySql: false
                            useQueryCache: false
                            timeoutMs: 30000
                            # Find top 100 titles with most views on Wikipedia
                            query: ${
                                "SELECT TITLE, SUM(views)
                                FROM `bigquery-samples.wikipedia_pageviews." + table + "`
                                WHERE LENGTH(TITLE) > 10
                                GROUP BY TITLE
                                ORDER BY SUM(VIEWS) DESC
                                LIMIT 100"
                                }
                    result: queryResult
                - returnResult:
                    assign:
                        # Return the top title from each table
                        - results[table]: {}
                        - results[table].title: ${queryResult.rows[0].f[0].v}
                        - results[table].views: ${queryResult.rows[0].f[1].v}
    - returnResults:
        return: ${results}

 

Executing these BigQuery jobs concurrently in a parallel iteration took about 20 seconds in total. That’s roughly 5x faster than the non-parallel execution.

 

Running BigQuery jobs in parallel helped speed up total workflow execution time by 5x.

 

Note that when using parallel steps, all variable assignments are guaranteed to be atomic, meaning that you don’t have to worry about variable read/write ordering or race conditions. Declaring a variable as shared (like in the example above) allows parallel branches to write to that variable. With shared variables, the assigned value is determined and written without any intervening reads or writes by other branches. Shared variable writes are immediately seen by other branches. (It’s important to note, though, that execution order is not guaranteed.)
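
To make the atomicity guarantee concrete, here is a small sketch in which four parallel iterations each add a number to one shared variable. The variable and step names are made up for illustration. Because each assignment is evaluated and written as a single atomic operation, the final sum should not depend on the order in which the iterations happen to run.

```yaml
# Sketch only: variable and step names are hypothetical.
main:
  steps:
    - init:
        assign:
          - total: 0
    - sumInParallel:
        parallel:
          # total must be declared as shared to be writable in the iterations
          shared: [total]
          for:
            value: n
            in: [1, 2, 3, 4]
            steps:
              - add:
                  assign:
                    # read-modify-write happens as one atomic assignment
                    - total: ${total + n}
    - returnTotal:
        return: ${total}
```

Anything that does depend on ordering, such as appending to a list and expecting a particular sequence, should not rely on the iterations completing in input order.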

 

Parallel branches: Run a set of operations in parallel

If your workflow has several distinct sets of steps that can be executed at the same time, placing them in parallel branches can decrease the total time needed to complete those steps. You can define up to 10 branches per parallel step, and run up to 20 concurrent branches (after 20, additional parallel branches will be queued).
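
If the services you call cannot absorb that much concurrency, you may want a lower cap than the platform limit. The sketch below assumes the optional concurrency_limit field of the parallel step, as described in the Workflows syntax reference; check the current documentation before relying on it. The input.items list and the example.com URL are hypothetical.

```yaml
# Sketch only: assumes the concurrency_limit field; names and URL are hypothetical.
main:
  params: [input]
  steps:
    - processItems:
        parallel:
          # run at most 5 iterations at a time; the rest wait in a queue
          concurrency_limit: 5
          for:
            value: item
            in: ${input.items}
            steps:
              - callService:
                  call: http.post
                  args:
                    url: https://example.com/process
                    body:
                      item: ${item}
    - done:
        return: "ok"
```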

Here’s an example of using parallel branches to retrieve data in parallel from two different services:

 

main:
  params: [input]
  steps:
    - init:
        assign:
          - userProfile: {}
          - recentItems: []
    - enrichUserData:
        parallel:
          # userProfile and recentItems are shared to make them writeable in the branches
          shared: [userProfile, recentItems]
          branches:
            - getUserProfileBranch:
                steps:
                  - getUserProfile:
                      call: http.get
                      args:
                        url: '${"https://example.com/users/" + input.userId}'
                      result: userProfile
            - getRecentItemsBranch:
                steps:
                  - getRecentItems:
                      try:
                        call: http.get
                        args:
                          url: '${"https://example.com/items?userId=" + input.userId}'
                        result: recentItems
                      except:
                        as: e
                        steps:
                          - ignoreError:
                              assign:
                                # continue with an empty list if this call fails
                                - recentItems: []

 

When should I use parallel steps?

You’ll see the most efficiency gains by parallelizing long-running steps (~1 second or greater) that include operations like sleep, HTTP requests, or callbacks. For fast-running compute operations and steps that include operations like assign, switch, or next, you should continue running those in serial order, because you won’t see any efficiency gains by parallelizing them.
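
One simple way to see the effect for yourself is to parallelize two waits. The sketch below uses sys.sleep as a stand-in for any slow operation; the branch and step names are illustrative. Run serially, the two sleeps would take about 10 seconds in total, while in parallel branches the step should finish in roughly the time of the longer branch, plus some overhead.

```yaml
# Sketch only: sys.sleep stands in for a slow call; names are hypothetical.
main:
  steps:
    - waitInParallel:
        parallel:
          branches:
            - firstBranch:
                steps:
                  - sleepA:
                      call: sys.sleep
                      args:
                        seconds: 5
            - secondBranch:
                steps:
                  - sleepB:
                      call: sys.sleep
                      args:
                        seconds: 5
    # fast step kept in serial order; parallelizing it would gain nothing
    - done:
        return: "finished"
```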

 

For example, in the preceding tutorial, each BigQuery job takes approximately 20 seconds to run. For that reason, parallelizing those jobs makes a lot of sense to speed up execution time, because those jobs don’t need to run in sequential order.

Next steps: Getting started with Workflows Parallel Steps

 

  1. Check out our Parallel Steps codelab: Try out this codelab for a walkthrough on using parallel iteration to run multiple BigQuery jobs concurrently to process a data set.
  2. Check out our documentation for more information on parallel steps, and for more sample code.
  3. Test Workflows out for free: Workflows is pay-per-use, and your first 5,000 internal steps per month are free. Just head to Google Cloud Console to get started.

 

Parallel Steps is currently in Preview, and we’d love to hear your feedback on this feature as you’re using it. You can send us your feedback through this form.

 

If you’re a Google Cloud developer or a data engineer and you’re not using Workflows today, we encourage you to test it out—especially if you want to build an event-driven business process or application, or a lightweight data pipeline. Get familiar with Workflows in this free codelab or view Workflows product documentation. 

 

 

 

By: Megan Bruce (Outbound Product Manager, Google Cloud)
Source: Google Cloud Blog

