NEAR Chain Signatures enables signing and executing transactions across multiple blockchain protocols. The technology relies heavily on threshold cryptography, using signature schemes such as ECDSA and EdDSA. This blog post explores how newly integrated schemes can help scale our systems, demonstrating their impact through practical benchmarking techniques and results.

Threshold signing allows multiple parties, each holding a secret key share, to jointly compute a signature over a given message. The scheme cryptographically guarantees that a valid signature can be produced if and only if at least $t$ participants take part in the signing process, where $t$ is the system’s threshold parameter. In other words, any coalition of $t-1$ malicious parties cannot generate a valid signature without the help of at least one honest signer.
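As a minimal sketch, the threshold rule can be written down directly (the function name is hypothetical, and real schemes enforce the rule cryptographically rather than with a runtime check):

```rust
/// Sketch of the t-of-n threshold rule: a signing attempt can
/// succeed only when at least `t` share holders participate.
/// Illustrative only; actual schemes enforce this property
/// cryptographically, not via an explicit check.
fn can_produce_signature(participating: usize, t: usize) -> bool {
    participating >= t
}
```

For example, in a 5-of-5 configuration, any four parties fall short of the threshold.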

Our system maintains a pool of eight signers. For each protocol instance, a subset of five signers is selected to participate, and operations require approval from all five participants (i.e., a 5-of-5 configuration). As we scale, we plan to increase the total number of available signers while preserving strong security guarantees and high performance.

A preliminary performance assessment revealed that the primary bottlenecks stem from the complexity of our deployed ECDSA scheme (based on the Cait-Sith implementation), which we refer to as OT-Based ECDSA. This protocol requires more than eleven computation-heavy communication rounds and exhibits quadratic communication overhead with respect to the number of active participants.

We split this scheme (as well as all other schemes) into two phases: an offline phase and an online phase. The offline phase is a precomputation stage that occurs before the message is known to the protocol participants. During this phase, participants collaboratively generate cryptographic material that will be consumed in the online phase. Once the message becomes known to the signers, the online phase begins, and signatures can be produced and delivered with minimal delay, since all heavy computations have already been performed in advance. All the schemes mentioned in this blog post require a presigning protocol (Presign), which is part of the offline phase, and a signing protocol (Sign), which belongs to the online phase. Additionally, the OT-Based ECDSA scheme requires a Two Triples Generation protocol during the offline phase, which must be executed before the presigning protocol.
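The phase ordering described above can be sketched as a small pipeline; the stage names mirror the protocols in this post, but the types themselves are illustrative, not our production API:

```rust
/// Stages of a threshold-signing session, in execution order.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Stage {
    TwoTriplesGeneration, // offline; OT-Based ECDSA only
    Presign,              // offline; all schemes
    Sign,                 // online; runs once the message is known
}

/// Returns the stages a scheme executes, in order.
/// `needs_triples` is true only for OT-Based ECDSA.
fn pipeline(needs_triples: bool) -> Vec<Stage> {
    let mut stages = Vec::new();
    if needs_triples {
        stages.push(Stage::TwoTriplesGeneration);
    }
    stages.push(Stage::Presign); // offline phase ends here
    stages.push(Stage::Sign);    // online phase
    stages
}
```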

To address the OT-Based ECDSA performance limitations, we implemented and evaluated a more efficient ECDSA construction based on [DJNPO20], which we refer to as Robust ECDSA. To quantify the expected improvements, we developed a dedicated benchmarking methodology.

This blog post presents our approach, assumptions, and findings and shows results for the OT-Based ECDSA, Robust ECDSA and (for completeness) our EdDSA implementation based on FROST [KG20].

Benchmarking

Prior to this work, we had multiple threshold signature schemes implemented, but no systematic way to compare their practical performance. A key challenge in this setting is that these protocols are inherently distributed, yet benchmarking them on multiple machines is costly and difficult to control. As a result, our goal was to design benchmarking methodologies that run on a single machine while still providing results that are representative of real-world deployments.

To address this gap, we designed two benchmarking approaches based on Rust’s Criterion framework to measure computation time. To ensure a fair comparison between the implemented schemes, we evaluated them under a shared invariant: the security level, defined as the maximum number of malicious parties tolerated.

Importantly, for a fixed maximum number of malicious parties, different schemes may require different numbers of active participants. Our methodology accounts for this distinction, enabling an apples-to-apples comparison at an equivalent security level. Derandomizing our algorithms was a necessary precondition for producing consistent results, since it eliminates protocol-induced randomness between benchmark runs.

Our first approach measures the end-to-end execution time of each scheme by running all participants on a single machine. We refer to this as the naive technique. While straightforward and fast to implement, this method does not fully reflect realistic deployment conditions and may therefore produce results that are only partially representative. In the following section, we describe how this technique operates in practice and explain why we classify it as “naive.”

We then designed and implemented a more representative, advanced technique. This approach captures a snapshot of the communication during protocol execution and subsequently replays the protocol from the perspective of a single participant using the recorded data. By doing so, we are able to isolate and measure the per-participant computational cost, while still accounting for network latency and communication volume. The advantage of this approach is that it provides a more accurate approximation of real-world performance without requiring a fully distributed deployment for each benchmark run.

All benchmarks were executed on a laptop equipped with an AMD Ryzen 7 7730U processor with Radeon Graphics (clock speed ranging between 3.2 and 4.3 GHz) and 16 GB of RAM. Each experiment was run for a minimum of 15 iterations to ensure statistically meaningful results.

Naive Technique

A straightforward way to benchmark our schemes is to execute the entire protocol with all participants running side by side and then analyze the aggregated results. We refer to this approach as naive for several reasons:

  1. Sequential execution distorts computation costs.
    Participants are executed sequentially in a single environment. When combined with the quadratic or cubic communication complexity of some protocols, this makes it difficult to isolate per-participant computation time.

  2. Potential unfairness across schemes.
    Different signature schemes may require different numbers of active participants to tolerate the same maximum number of malicious parties. In our naive benchmarking approach, all participants are executed sequentially on a single machine, causing the total runtime to scale with the number of participants. This can introduce bias when comparing schemes, as protocols requiring more participants may appear slower, even though in a real distributed deployment these participants would operate in parallel.

  3. Overcounting communication costs.
    When a participant broadcasts a message to all others, the same send operation is effectively measured multiple times, artificially inflating the perceived communication overhead.

The table below reports the results of our Criterion benchmarks for Robust ECDSA and OT-Based ECDSA, with the maximum number of malicious parties fixed at 6.

Naive benchmarking of ECDSA threshold signing schemes.
Maximum number of malicious parties: 6

| Scheme | Parties | Two Triples Generation (offline) | Presign (offline) | Sign (online) |
| --- | --- | --- | --- | --- |
| OT-Based ECDSA | 7 | 1.4237 s | 1.4626 ms | 191.82 µs |
| Robust ECDSA | 13 | N/A | 66.060 ms | 278.13 µs |

In this configuration, Robust ECDSA requires 13 active participants to tolerate 6 malicious parties, whereas the OT-Based scheme requires only 7. Despite the higher number of participants, Robust ECDSA demonstrates substantially faster offline phase performance by eliminating the costly “Two Triples Generation” phase.
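The participant counts follow from each scheme's corruption model. Assuming, as the reported configurations suggest, that an honest-majority scheme such as Robust ECDSA needs $n = 2t + 1$ active signers while a dishonest-majority scheme such as OT-Based ECDSA needs only $n = t + 1$, the counts can be derived as:

```rust
/// Active participants needed to tolerate `t` malicious parties.
/// Assumption: honest-majority protocols need n = 2t + 1 and
/// dishonest-majority protocols need n = t + 1; these formulas are
/// inferred from the configurations reported in this post.
fn required_participants(t: usize, honest_majority: bool) -> usize {
    if honest_majority {
        2 * t + 1 // e.g. Robust ECDSA: 13 parties for t = 6
    } else {
        t + 1 // e.g. OT-Based ECDSA: 7 parties for t = 6
    }
}
```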

Advanced Technique

A more accurate benchmarking approach is the snap-then-simulate method. In this setup, the protocol is executed with a single real participant (the coordinator), while all other participants are simulated.

We study protocol scalability by increasing the number of simulated participants within this framework. Although only one participant runs as a real process, the simulator emulates the interactions of all other parties with the real participant. As the number of participants grows, the real participant must handle a proportionally larger workload: it processes more messages, performs more computation, and sends/receives more data over the network. This allows us to faithfully capture how the per-participant computational and communication costs scale with the total number of participants.

More specifically, we first enabled derandomization of our algorithms during benchmarking to ensure reproducibility. Next, we implemented a function that executes a protocol with all participants and records all exchanged messages in a dictionary. We then developed the simulator logic, including a function that allows the simulator to respond to a real participant in a simplified, dummy manner using the previously captured snapshot data.
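A condensed sketch of the snapshot idea (the names and message representation are simplified stand-ins for our actual communication-channel code):

```rust
use std::collections::HashMap;

/// Messages captured during the recording run, keyed by
/// (sender id, round number).
type Snapshot = HashMap<(u32, u32), Vec<u8>>;

/// During replay, the simulator answers the real participant with
/// the recorded bytes instead of performing any cryptographic work.
fn simulated_reply(snapshot: &Snapshot, sender: u32, round: u32) -> Option<&Vec<u8>> {
    snapshot.get(&(sender, round))
}
```

Derandomization is what makes the replay valid: with randomness fixed, the real participant sends the same messages in the replay run as in the recording run, so the recorded responses remain consistent.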

During the second (simulated) run, we benchmarked the real participant’s performance using Criterion. Additionally, we enabled measurement of the data volume received by each participant throughout the protocol execution.

Why is this technique better than the naive one?

  1. Fair benchmarking across protocols.
    Even when one scheme requires more participants than another, this method focuses on the performance of a single real participant rather than measuring all participants simultaneously.

  2. Accurate representation of $O(n^2)$ communication costs.
    By simulating all-but-one participants, we avoid artificially inflating communication complexity. This reduces the protocol from $O(n^2)$ to $O(n)$ for benchmarking purposes, allowing a clearer focus on the real participant’s computation and communication.

  3. Easy measurement of data transmitted.
    The snap-then-simulate approach makes it straightforward to track the amount of data sent and received by each participant during a protocol run.


Results & Analysis

In this section, we present selected benchmark results. The tables below show the time required for a single participant (or coordinator, where applicable) to complete each protocol. These measurements were obtained using the advanced snap-then-simulate benchmarking technique, providing a more accurate and representative view of per-participant performance.

Advanced benchmarking of both ECDSA and FROST threshold signing schemes.
Maximum number of malicious parties: 6   |   Network Latency: 0 ms

| Scheme | Parties | Two Triples Generation (offline) | Presign (offline) | Sign (online) |
| --- | --- | --- | --- | --- |
| OT-Based ECDSA | 7 | 198.95 ms | 206.52 µs | 111.76 µs |
| Robust ECDSA | 13 | N/A | 4.90 ms | 114.63 µs |
| FROST | 7 | N/A | 419.23 µs | 348.94 µs |

Note that in the naive benchmarking, the times reported for Two Triples Generation and Presign roughly correspond to the times measured in the advanced setting multiplied by the number of active participants, i.e., \(\text{time}_{\text{naive}} \approx \text{time}_{\text{advanced}} \times \text{number\_of\_participants}\).
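This relationship is straightforward to sanity-check in code against the two tables (times in milliseconds; a 10% relative tolerance, since the relationship is only approximate):

```rust
/// Checks whether a naive measurement is within `tol` (relative)
/// of the advanced per-participant time scaled by the party count.
/// Times are in milliseconds; the figures used in the tests are
/// taken from the tables in this post.
fn naive_matches_advanced(naive_ms: f64, advanced_ms: f64, parties: f64, tol: f64) -> bool {
    let predicted = advanced_ms * parties;
    (naive_ms - predicted).abs() / naive_ms <= tol
}
```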

For a higher number of tolerated malicious parties, the measured results are as follows:

Advanced benchmarking of both ECDSA and FROST threshold signing schemes.
Maximum number of malicious parties: 15   |   Network Latency: 0 ms

| Scheme | Parties | Two Triples Generation (offline) | Presign (offline) | Sign (online) |
| --- | --- | --- | --- | --- |
| OT-Based ECDSA | 16 | 544.94 ms | 257.05 µs | 119.65 µs |
| Robust ECDSA | 31 | N/A | 24.56 ms | 129.45 µs |
| FROST | 16 | N/A | 964.76 µs | 590.34 µs |

The Robust ECDSA offline phase is 40× faster than that of OT-Based ECDSA when the maximum number of malicious parties is 6, and 22× faster when the maximum rises to 15. We attribute the shrinking ratio primarily to the larger number of active participants that Robust ECDSA requires, since it relies on an honest-majority assumption.


Latency

Because the computation time of both schemes is relatively small, adding network latency has a proportionally larger impact, effectively dominating the total measured time. As a result, the observed performance under latency depends primarily on the number of communication rounds each scheme requires. The table below provides the number of rounds per protocol run:

Number of rounds of threshold signing schemes.

| Scheme | Two Triples Generation (offline) | Presign (offline) | Sign (online) |
| --- | --- | --- | --- |
| OT-Based ECDSA | 8* | 2 | 1 |
| Robust ECDSA | N/A | 3 | 1 |
| FROST | N/A | 1 | 1 |

*The OT-Based ECDSA Two Triples Generation protocol actually requires more than 8 communication rounds to complete; we use 8 as an estimate that captures the bulk of the network-latency cost in the benchmarks.

Our implementation does not yet support explicitly adding latency to the communication, but we are currently working on this feature. In the meantime, the total execution time under network latency can be estimated using the following formula: $\text{latency} \times \text{rounds} + \text{computation}$.
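Applied to the rounds table above, this estimate reproduces the measured figures closely (a sketch; all times in milliseconds):

```rust
/// Estimated wall-clock time of a protocol under network latency:
/// latency * rounds + computation, all in milliseconds.
fn estimated_time_ms(latency_ms: f64, rounds: u32, computation_ms: f64) -> f64 {
    latency_ms * rounds as f64 + computation_ms
}
```

For instance, with 100 ms latency, the OT-Based ECDSA Two Triples Generation estimate is 8 × 100 ms + 544.94 ms ≈ 1.345 s, and the Robust ECDSA Presign estimate is 3 × 100 ms + 24.56 ms ≈ 324.56 ms.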

For example, applying this estimate with 100 ms of latency yields the following results:

Advanced benchmarking of both ECDSA and FROST threshold signing schemes.
Maximum number of malicious parties: 15   |   Network Latency: 100 ms

| Scheme | Parties | Two Triples Generation (offline) | Presign (offline) | Sign (online) |
| --- | --- | --- | --- | --- |
| OT-Based ECDSA | 16 | 1.344 s | 200.25 ms | 100.11 ms |
| Robust ECDSA | 31 | N/A | 324.56 ms | 100.12 ms |
| FROST | 16 | N/A | 100.96 ms | 100.59 ms |

Notice that the Robust ECDSA offline phase is roughly 4.7× faster than the OT-Based ECDSA offline phase. In fact, as network latency increases, the ratio approaches 3.3×, reflecting the ratio of communication rounds (10 vs 3).


Bandwidth

In real systems, scalability is often limited by network bandwidth. To explore this, we measured the total amount of data received by a real participant during a snap-then-simulate protocol run. Inferring data per individual round from the snapshots would require a major refactor of our cryptographic communication channels—a complex task that could introduce breaking changes in our deployed product. As a result, computing per-participant, per-round data is ongoing work and is not included in this post. Instead, we report the total data received over the entire protocol execution.

For protocols that distinguish a coordinator from regular participants, we specifically measured the data received by the coordinator. As expected, the coordinator receives more data than other participants, since it is the only party capable of producing a signature. Across our benchmark runs, the amount of data received was stable, with zero variance across iterations.

The reported sizes are expressed in bytes and reflect the raw data transmitted by the protocol itself, including application-level metadata such as sender, receiver, and session identifiers. They do not include transport-layer overhead such as TLS or TCP headers, which would slightly increase the total byte count in a real deployment.

Data received by the real participant when the system is configured with 6 malicious parties and no latency. The unit of measurement is bytes.

| Scheme | Parties | Two Triples Generation (offline) | Presign (offline) | Sign (online) |
| --- | --- | --- | --- | --- |
| OT-Based ECDSA | 7 | 595260 | 1416 | 557 |
| Robust ECDSA | 13 | N/A | 6387 | 1096 |
| FROST | 7 | N/A | 918 | 609 |

Data received by the real participant when the system is configured with 15 malicious parties and no latency. The unit of measurement is bytes.

| Scheme | Parties | Two Triples Generation (offline) | Presign (offline) | Sign (online) |
| --- | --- | --- | --- | --- |
| OT-Based ECDSA | 16 | 2088966 | 3485 | 1360 |
| Robust ECDSA | 31 | N/A | 15986 | 2752 |
| FROST | 16 | N/A | 2274 | 1526 |
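As a quick consistency check, the offline-phase totals (Two Triples Generation, when present, plus Presign) can be summed directly from the tables (byte counts taken from this post):

```rust
/// Total offline-phase bytes received: the optional triples
/// generation traffic plus the presign traffic.
fn offline_bytes(triples: Option<u64>, presign: u64) -> u64 {
    triples.unwrap_or(0) + presign
}
```

At 15 tolerated malicious parties, this puts OT-Based ECDSA at 2,092,451 bytes of offline traffic versus 15,986 bytes for Robust ECDSA, roughly a 130× difference.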

Conclusion

The primary goal of this work was to determine whether the Robust ECDSA scheme can replace OT-based ECDSA as a more scalable and efficient alternative. Our benchmarking results confirm that it can: Robust ECDSA outperforms OT-based ECDSA across all measured dimensions — latency, communication rounds, and bandwidth. With 15 maximum malicious parties and 100 ms of latency, the Robust ECDSA offline phase is approximately 4.7× faster than OT-based ECDSA and transmits 130× less data over the network. We found no scenario in which OT-based ECDSA holds an advantage.

It is important to note that FROST is not directly comparable to the ECDSA schemes, as it implements EdDSA — a fundamentally different signature algorithm operating over a different curve. The choice between ECDSA and EdDSA is dictated by the target blockchain’s requirements, not by performance alone. We include FROST in our benchmarks to provide a complete picture of the threshold signing implementations available in our system. That said, FROST’s results are noteworthy: with only one communication round in the offline phase, it achieves 3–15× lower latency than both ECDSA variants while maintaining comparable online-phase performance and low bandwidth usage. For chains that support EdDSA, FROST is the most lightweight option available.

Overall, both Robust ECDSA and FROST exhibit reasonable bandwidth usage as the number of participants grows, demonstrating that our system is well-positioned to scale further with an increasing number of active signers.