This repository was archived by the owner on Mar 25, 2025. It is now read-only.

Commit 5be39db

committed
refactor
1 parent 64f6b01 commit 5be39db

8 files changed

Lines changed: 108 additions & 31 deletions

File tree

docs/_examples.md

Lines changed: 24 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,24 @@
1+
---
2+
sidebar_position: 3
3+
---
4+
5+
# Demo at Devcon 2024
6+
7+
We ran a demo at [Devcon 2024](https://www.youtube.com/watch?v=wCp7Zsjou7w) where participants could prove and share the ETH balance of their Binance spot account using the Binance API. We then derived the mean, median, max, and Gini index of all participants' balances without revealing any individual balance.
8+
9+
This demo shows how real-world data sources like Binance can be securely integrated into privacy-preserving statistical computations using MPC and TLSNotary. It's a foundational step toward building tools that enable collaborative data analysis while preserving user privacy.
10+
11+
You can explore our [implementation code](https://github.com/ZKStats/mpc-demo-infra) and read our detailed [Devcon Demo Report](https://pse-team.notion.site/MPCStats-Devcon-Demo-Report-3055bb69afd24d60bf8ee8d4fa5f774c) to learn more about the technical details and outcomes of this demonstration.
12+
13+
14+
Below is what our stats page looked like:
15+
16+
![Devcon demo interface](./devcon-demo.png)
17+
18+
19+
<!--
20+
## (TODO) 2. Simple Example
21+
22+
- **Branch:** [`simple`](https://github.com/ZKStats/mpc-demo-infra/tree/simple)
23+
24+
Three participants prove and share their followers count on [page 0](https://jernkunpittaya.github.io/followers-page/party_0.html), [page 1](https://jernkunpittaya.github.io/followers-page/party_1.html), and [page 2](https://jernkunpittaya.github.io/followers-page/party_2.html), respectively. We then derive the statistics of all participants using MPC. -->

docs/architecture.md

Lines changed: 7 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -8,17 +8,22 @@ We use a client-server architecture where data providers and consumers delegate
88

99
## Components
1010

11+
12+
The diagram below shows the workflow of the demo. Black arrows show the flow for sharing data, and blue arrows show the flow for querying results.
13+
14+
![Components flow diagram](./components-flow.png)
15+
1116
- **Coordination Server**: Coordinates communication between data providers, data consumers, and computation parties.
1217
- Implements basic congestion control, rate-limiting, and sybil prevention to maintain system robustness.
1318
- Does not access or store plaintext data.
14-
- **Computation Parties (3 servers)**:
19+
- **Computation Party Servers (3 servers)**:
1520
- Store encrypted data received from data providers.
1621
- Perform statistical operations defined in MPC while verifying the data matches TLSNotary proof.
1722
- Return results to data consumers.
1823
- Each party operates independently to ensure security.
1924
- **TLSNotary Server**: Data providers use it to generate proofs confirming their data is authenticated from verified websites.
2025
- **(Optional) Client API**:
21-
- Acts as a data consumer by periodically polling computation parties for results and caches them.
26+
- Not shown in the diagram above, since it is essentially a data consumer that polls computation parties for results and caches them.
2227
- Offers a simple REST API for end users to query statistical results without directly interacting with the Coordination Server, preventing unnecessary MPC computations.
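The caching behaviour of the optional Client API can be sketched as follows. This is a minimal illustration, not the project's implementation: `CachedStats`, `fetch`, and `ttl_seconds` are hypothetical names, and the cache refreshes lazily on read rather than on a background timer.

```python
import time


class CachedStats:
    """Serve a cached result and re-query only after `ttl_seconds` have passed."""

    def __init__(self, fetch, ttl_seconds, clock=time.monotonic):
        self._fetch = fetch          # hypothetical callable that queries the computation parties
        self._ttl = ttl_seconds
        self._clock = clock          # injectable so the behaviour can be tested without sleeping
        self._value = None
        self._fetched_at = None

    def get(self):
        now = self._clock()
        expired = self._fetched_at is None or now - self._fetched_at >= self._ttl
        if expired:
            # Only this branch triggers a (potentially expensive) MPC query.
            self._value = self._fetch()
            self._fetched_at = now
        return self._value
```

End users hitting the REST API would call `get()`, so repeated requests within the TTL never start a new MPC computation.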
2328

2429
## System Workflow
@@ -45,7 +50,6 @@ We use a client-server architecture where data providers and consumers delegate
4550
- **Coordination Server**:
4651
- Centralized to streamline coordination but does not access or store plaintext data.
4752
- Rate-limiting and participant verification reduce the risk of Sybil attacks and DoS.
48-
- In Binance case, we expose a field "uid" in the TLSNotary proof, which is a unique identifier for each data provider. This way we prevent Sybil attacks by checking if the uid is unique.
4953
- **Notary Server**:
5054
- Participants trust the Notary Server to generate a correct proof.
5155
- By default, we use a local notary whose private keys are exposed, so anyone can forge proofs. A trusted party running a remote notary server can mitigate this risk.

docs/components-flow.png

91 KB
93.1 KB

docs/examples.md

Lines changed: 0 additions & 20 deletions
This file was deleted.

docs/getting-started/customization.md

Lines changed: 64 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -4,17 +4,22 @@ sidebar_position: 3
44

55
# Customization
66

7-
Developers can modify and extend the boilerplate to create their own privacy-preserving applications by customizing these following components.
7+
Developers can modify and extend the boilerplate to create their own privacy-preserving applications. The components marked in red are the ones to change when customizing for new data sources or different statistical operations.
8+
9+
![Components to be customized](../components-to-be-customized.png)
10+
11+
12+
Follow the steps below to customize the components.
813

914
## 1. Modify the TLSNotary prover and verifier for new data sources.
1015

1116
> This differs from the original TLSNotary. In addition to "redacted" parts of the data, which are simply hidden from the proof, users can also specify "private" parts. Private parts are hidden from the proof too, but the proof includes their sha3 commitment, so they can be integrated with MP-SPDZ to ensure that the MP-SPDZ inputs actually come from the private parts of the TLSNotary data. This guide mostly focuses on customizing this additional "private" feature.
1217
13-
We have a few examples [here](https://github.com/ZKStats/tlsn/tree/mpspdz-compat/tlsn/examples), but to be fully compatible with end-to-end flow, let's take a look at how to customize this TLSNotary prover and verifier from Binance Eth Balance [Example](https://github.com/ZKStats/tlsn/tree/mpspdz-compat/tlsn/examples/binance) by modifying the following files:
18+
We have a few examples [here](https://github.com/ZKStats/mpc-demo-infra/tree/main/tlsn/tlsn/examples), but to be fully compatible with end-to-end flow, let's take a look at how to customize this TLSNotary prover and verifier from Binance Eth Balance [Example](https://github.com/ZKStats/mpc-demo-infra/tree/main/tlsn/tlsn/examples/binance) by modifying the following files:
1419

15-
#### binance_prover.rs
20+
### 1.1. TLSN Prover
1621

17-
Here is the main file for creating proof, where you can make this following customizations for "private" part of data.
22+
The TLSN prover for the Binance ETH balance is in [binance_prover.rs](https://github.com/ZKStats/mpc-demo-infra/blob/main/tlsn/tlsn/examples/binance/binance_prover.rs). It's the main file for creating the proof, where you can make the following customizations for the "private" part of the data.
1823

1924
**In main()**:
2025

@@ -31,13 +36,13 @@ Here is the main file for creating proof, where you can make this following cust
3136
- Specify "private" parts of the received message.
3237
We specify the private part (recv_private_ranges) with our preferred regex. This part is censored from the proof itself but accompanied by a sha3 commitment in the proof. In the Binance example, we make the ETH free balance, with only 2 decimals of precision, private.
3338

34-
> With this structure, there will be some parts of received message that is not in either recv_public_ranges or recv_private_ranges. Those will be just redacted data that are censored without its correponding commitment (like in original TLSNotary)
39+
> With this structure, there will be some parts of received message that is not in either recv_public_ranges or recv_private_ranges. Those will be just redacted data that are censored without its corresponding commitment (like in original TLSNotary)
3540
3641
> Since we decide which parts to censor based on regex, it is very important to make sure that the returned data is formatted as you expect when you write the regex; otherwise data may leak unexpectedly. In our case, we enforce a check that the recv transcript ends with uid, because this is the assumption we used to constrain the regex when determining recv_public_ranges.
3742
3843
> When getting data from the API, it's recommended to specify as many query arguments as possible, because we want the data sent back from the API to be as small as possible.
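The regex-based selection described in the notes above can be illustrated in plain Python. This is only a sketch of the idea, not the Rust prover code: the response body, both regexes, and the `private_range` helper are assumptions made for illustration, and the real transcript is raw bytes with more fields.

```python
import re

# Hypothetical response body; the real Binance payload contains more fields.
RESPONSE = '{"asset":"ETH","free":"1.23456789","uid":"12345678"}'

# Group 1 captures the free balance truncated to two decimals (the "private" part).
PRIVATE_RE = re.compile(r'"asset":"ETH","free":"(\d+\.\d{2})')
# Layout check: the transcript must end with the uid field.
UID_TAIL_RE = re.compile(r'"uid":"\d+"\}$')


def private_range(transcript: str):
    """Return (start, end) of the private part, refusing unexpected layouts."""
    if not UID_TAIL_RE.search(transcript):
        raise ValueError("unexpected layout: transcript does not end with uid")
    match = PRIVATE_RE.search(transcript)
    if match is None:
        raise ValueError("ETH free balance not found")
    return match.span(1)


start, end = private_range(RESPONSE)
print(RESPONSE[start:end])
```

Failing closed when the layout check does not match is the important design choice here: a regex that silently matches the wrong span could reveal data that was meant to stay private.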
3944
40-
#### binance_verifier.rs
45+
### 1.2. TLSN Verifier
4146

4247
Here, we just encapsulate the logic that distinguishes which part is just redacted, and which part is private.
4348

@@ -49,13 +54,64 @@ Here, we just encapsulate the logic that distinguishes which part is just redact
4954

5055
## 2. Update the MPC program to include additional or modified statistical operations.
5156

52-
Use MPCStats library for statistical operations. Can see the example and instruction what/how to customize [here](https://github.com/ZKStats/MP-SPDZ/tree/mpcstats-lib/mpcstats)
57+
We define a computation template in [query_computation.mpc](https://github.com/ZKStats/mpc-demo-infra/blob/d8de6b4dcf85ff434ca48cb2af3bd00de43aba8a/mpc_demo_infra/program/query_computation.mpc#L26-L64). Whenever there is a new stats query, each MPC party fills in the necessary information in this template and runs the program with the other MPC parties.
58+
59+
```python
60+
# Filled in by computation party
61+
PORTNUM = {client_port_base}
62+
MAX_DATA_PROVIDERS = {max_data_providers}
63+
NUM_DATA_PROVIDERS = {num_data_providers}
64+
```
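Filling in such a template could look roughly like the sketch below. It assumes the placeholders are substituted with Python's `str.format`; `TEMPLATE` is a shortened stand-in for the real file, and `render_program` and the example values are hypothetical.

```python
# Shortened, hypothetical stand-in for the header of query_computation.mpc.
TEMPLATE = """\
PORTNUM = {client_port_base}
MAX_DATA_PROVIDERS = {max_data_providers}
NUM_DATA_PROVIDERS = {num_data_providers}
"""


def render_program(client_port_base: int, max_data_providers: int, num_data_providers: int) -> str:
    """Fill the placeholders, rejecting more providers than the program allows."""
    if num_data_providers > max_data_providers:
        raise ValueError("num_data_providers exceeds max_data_providers")
    return TEMPLATE.format(
        client_port_base=client_port_base,
        max_data_providers=max_data_providers,
        num_data_providers=num_data_providers,
    )


program = render_program(client_port_base=14000, max_data_providers=1000, num_data_providers=3)
print(program)
```

Note that with `str.format`, any literal braces elsewhere in a template would have to be escaped as `{{` and `}}`.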
65+
66+
The actual statistical operations are done in the [`computation` function](https://github.com/ZKStats/mpc-demo-infra/blob/d8de6b4dcf85ff434ca48cb2af3bd00de43aba8a/mpc_demo_infra/program/query_computation.mpc#L26-L64). The function contains the following lines:
67+
```python
68+
result[0] = sint(num_data_providers)
69+
# Max
70+
result[1] = data[num_data_providers-1]
71+
# Sum
72+
result[2] = sum(data)
73+
# Median
74+
result[3] = mpcstats_lib.median(data)
75+
76+
# Note that Gini coefficient = (area/(num_data_providers*result[2])) - 1
77+
# But we leave that to client side handling to optimize calculation in mpc
78+
area = sint(0)
79+
@for_range(num_data_providers)
80+
def _(i):
81+
area.update(area+(2*i+1)*data[i])
82+
result[4] = area
83+
```
84+
We return the results in the `result` array.
85+
- `result[0]` is the number of data providers.
86+
- `result[1]` is the max value of data.
87+
- `result[2]` is the sum of data.
88+
- `result[3]` is the median of data.
89+
- `result[4]` is the `area` term from which the client derives the Gini coefficient.
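To see why `area` is enough to recover the Gini coefficient, here is a plaintext Python mirror of the `result` array. It is a sketch under assumptions: the data is taken to be sorted in ascending order (which reading the max from the last slot suggests), the even-length median convention is a guess, and all function names are hypothetical.

```python
def plaintext_results(balances):
    """Plaintext mirror of `result`; `balances` are non-negative integers."""
    data = sorted(balances)  # assumption: the MPC program works on ascending data
    n = len(data)
    # Assumption: for an even count, average the two middle values.
    median = data[n // 2] if n % 2 else (data[n // 2 - 1] + data[n // 2]) / 2
    area = sum((2 * i + 1) * x for i, x in enumerate(data))
    return [n, data[-1], sum(data), median, area]


def gini_from_area(n, total, area):
    """Client-side step: Gini = area / (n * sum) - 1."""
    return area / (n * total) - 1


def gini_pairwise(values):
    """Textbook definition: mean absolute difference over twice the mean."""
    n, total = len(values), sum(values)
    return sum(abs(a - b) for a in values for b in values) / (2 * n * total)


n, maximum, total, median, area = plaintext_results([3, 1, 4, 2])
print(gini_from_area(n, total, area), gini_pairwise([3, 1, 4, 2]))
```

Both functions give the same value for the same input, which is why the MPC program only needs to output the weighted sum `area` and can leave the division to the client.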
90+
91+
For `result[3]`, we use `mpcstats_lib.median(data)` to calculate the median of the data. Besides `median`, `mpcstats_lib` provides implementations of other common statistical operations that you can import and use:
92+
- stats operations: `mean`, `median`, `covariance`, `correlation`, `geometric_mean`, `mode`, `variance`, `linear_regression`, `harmonic_mean`
93+
- data operations: `where`, `join`
94+
95+
See the implementation [here](https://github.com/ZKStats/MP-SPDZ/blob/cdad13da73d4bcd7e10c04efd8c22cba7453f0c3/mpcstats/mpcstats_lib.py#L79-L303). To learn more about the DSL for MPC program, make sure to check out [MP-SPDZ documentation](https://mp-spdz.readthedocs.io/en/latest/readme.html).
96+
97+
After changing the MPC program, you also need to update the [Client CLI](https://github.com/ZKStats/mpc-demo-infra/blob/c57245417eec906947bd463e4651ecc528f949ce/mpc_demo_infra/client_lib/lib.py#L107-L113) so that it parses the result correctly.
98+
99+
```python
100+
results = StatsResults(
101+
num_data_providers = num_data_providers,
102+
max = safe_div(output_list[1], 10 * BINANCE_DECIMAL_SCALE),
103+
mean = safe_div(output_list[2], num_data_providers * 10 * BINANCE_DECIMAL_SCALE),
104+
median = safe_div(output_list[3], 10 * BINANCE_DECIMAL_SCALE),
105+
gini_coefficient = safe_div(output_list[4], num_data_providers * output_list[2]) - 1,
106+
)
107+
```
108+
Here, we parse the result from the MPC program. Values are divided by `10 * BINANCE_DECIMAL_SCALE` because the values from MPC are scaled up.
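A self-contained version of this parsing step might look like the following sketch. Several details are assumptions for illustration: the value `BINANCE_DECIMAL_SCALE = 100`, the behaviour of `safe_div` on a zero denominator, the fields of `StatsResults`, and reading the provider count from `output_list[0]`.

```python
from dataclasses import dataclass

BINANCE_DECIMAL_SCALE = 100  # assumption: balances carry 2 decimals of precision


def safe_div(numerator, denominator):
    """Hypothetical helper: return 0.0 instead of raising when nothing was shared yet."""
    return numerator / denominator if denominator else 0.0


@dataclass
class StatsResults:
    num_data_providers: int
    max: float
    mean: float
    median: float
    gini_coefficient: float


def parse_results(output_list):
    """Undo the MPC scaling and derive the Gini coefficient from `area`."""
    n = output_list[0]
    scale = 10 * BINANCE_DECIMAL_SCALE
    return StatsResults(
        num_data_providers=n,
        max=safe_div(output_list[1], scale),
        mean=safe_div(output_list[2], n * scale),
        median=safe_div(output_list[3], scale),
        # area and sum are scaled by the same factor, so no rescaling is needed here
        gini_coefficient=safe_div(output_list[4], n * output_list[2]) - 1,
    )


print(parse_results([4, 4000, 10000, 2500, 50000]))
```

If you change what the MPC program writes into `result`, the indices and scaling in this step are what need to change with it.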
53109

54110
## 3. Customize Coordination Server
55111

56112
This customization makes sure we properly control which data is allowed to participate in the MPC process. In the Binance example, we make sure that one "uid" (the user id of a Binance account) can only submit one proof. With a different data source, we need to adjust how the Coordination Server stores and processes data accordingly, as follows.
57113

58-
- Specify what to store in coordination server.
114+
- Specify what to store in Coordination Server.
59115
In Binance example, we store eth_address and uid [here](https://github.com/ZKStats/mpc-demo-infra/blob/e73b35aa487b8dc1efd403edddb80f10ebebf681/mpc_demo_infra/coordination_server/database.py#L31)
60116
- Modify how we extract those data fields from the proof and make sure they don't repeat what's already in the database.
61117
Just modify the code [here](https://github.com/ZKStats/mpc-demo-infra/blob/e73b35aa487b8dc1efd403edddb80f10ebebf681/mpc_demo_infra/coordination_server/routes.py#L142-L157)
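The "one uid, one proof" rule can be sketched with a uniqueness constraint. This is an illustration using the standard-library `sqlite3` module, not the project's database code: the table name, `open_db`, and `register_proof` are hypothetical, and only `uid` is assumed to be unique.

```python
import sqlite3


def open_db():
    """Create an in-memory table with a uniqueness constraint on uid."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE data_providers ("
        " eth_address TEXT NOT NULL,"
        " uid TEXT NOT NULL UNIQUE)"
    )
    return conn


def register_proof(conn, eth_address, uid):
    """Return True if the proof's uid is new and was stored, False if it was already used."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO data_providers (eth_address, uid) VALUES (?, ?)",
                (eth_address, uid),
            )
        return True
    except sqlite3.IntegrityError:
        return False


conn = open_db()
print(register_proof(conn, "0xabc", "1001"))
print(register_proof(conn, "0xdef", "1001"))
```

For a different data source, the column carrying the uniqueness constraint would be whichever field of the proof identifies a participant.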

docs/intro.md

Lines changed: 13 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -21,6 +21,19 @@ We provide these features:
2121
- **For data providers and consumers**: By delegating MPC to three non-colluding servers, participants experience a workflow similar to traditional statistical survey systems. This design allows users to submit data and disconnect, enhancing usability without compromising privacy.
2222

2323
MPCStats Demo Infrastructure ensures privacy-preserving and verifiable results, making it suitable for sensitive applications such as financial analysis, health data aggregation, and collaborative statistics.
24+
25+
## Demo at Devcon 2024
26+
27+
We ran a demo at [Devcon 2024](https://www.youtube.com/watch?v=wCp7Zsjou7w) where participants could prove and share the ETH balance of their Binance spot account using the Binance API. We then derived the mean, median, max, and Gini index of all participants' balances without revealing any individual balance.
28+
29+
This demo shows how real-world data sources like Binance can be securely integrated into privacy-preserving statistical computations using MPC and TLSNotary. It's a foundational step toward building tools that enable collaborative data analysis while preserving user privacy.
30+
31+
You can explore our [implementation code](https://github.com/ZKStats/mpc-demo-infra) and read our detailed [Devcon Demo Report](https://pse-team.notion.site/MPCStats-Devcon-Demo-Report-3055bb69afd24d60bf8ee8d4fa5f774c) to learn more about the technical details and outcomes of this demonstration. For the rest of the documentation, we'll focus on the components and workflow of the demo.
32+
33+
Below is what our stats page looked like:
34+
35+
![Devcon demo interface](./devcon-demo.png)
36+
2437
<!--
2538
MPCStats Demo Infrastructure is the demo built by the [MPCStats](https://pse.dev/en/projects/mpc-stats) team for [Devcon 7](https://www.youtube.com/watch?v=wCp7Zsjou7w). Data providers can share their data privately, and data consumers can learn statistical results from all shared data.
2639

docs/share-data.png

5.81 KB
