Use Parallelism¶

Multi-node tests have a natural advantage: many setup steps can run on several machines at the same time. If you accidentally serialize that work, the test becomes slower than necessary.

Still, this is not a hard rule. Sometimes sequential execution is the better choice, for example when you want to fail early before starting more machines and consuming more CPU, memory, or I/O. Most of the time, however, independent setup work should be started in parallel.

A common serial pattern¶

This loop is simple, but it fully finishes one machine before the next machine even starts the same work:

machines = [machine1, machine2, machine3]

for m in machines:
    m.systemctl("start network-online.target")
    m.wait_for_unit("network-online.target")

If network-online.target takes a few seconds per machine, the total runtime becomes roughly the sum of all three waits.

network-online.target is only an example here. The same pattern appears with other long-running and independent setup work, such as:

starting services
triggering initialization jobs
warming caches
waiting for background preparation to finish

The parallel version¶

When each machine can make progress independently, it is usually better to split the operation into two phases:

Start the work everywhere.
Wait for completion everywhere.

machines = [machine1, machine2, machine3]

for m in machines:
    m.systemctl("start network-online.target")

for m in machines:
    m.wait_for_unit("network-online.target")

This allows all machines to work at the same time. The total runtime is then closer to the slowest machine instead of the sum of all machines.

Why this is faster¶

The timing difference looks like this:

Sequential setup

machines = [machine1, machine2, machine3]

for m in machines:
    m.systemctl("start network-online.target")
    m.wait_for_unit("network-online.target")

Here, machine2 does not even begin until machine1 is done, and machine3 waits for both.

Parallel setup

machines = [machine1, machine2, machine3]

for m in machines:
    m.systemctl("start network-online.target")

for m in machines:
    m.wait_for_unit("network-online.target")

Here, all three machines make progress at the same time.

When sequential execution is still reasonable¶

Parallelism is usually the better default, but there are cases where sequential execution is intentional:

you want to fail fast on the first machine before starting more expensive work
later machines depend on an earlier machine becoming healthy first
starting everything at once would create avoidable resource pressure on the host

For example, if machine1 fails to come up cleanly, it can be better to stop there instead of also booting and preparing several more machines that are not yet needed.

The best practice is not "Always parallelize". It is: "Always parallelize independent work unless you have a concrete reason not to."

A practical rule of thumb¶

If an operation on several nodes is:

independent across machines
expected to take noticeable time
needed on all machines anyway

then this pattern is usually a good choice:

for m in machines:
    start_some_long_running_step(m)

for m in machines:
    wait_until_that_step_finished(m)

That keeps the test readable while still taking advantage of the parallelism available in a multi-node setup.