# DChain production deployment

Turn-key-ish stack: 3 validators + Caddy TLS edge + optional
Prometheus/Grafana, behind auto-HTTPS.

## Prerequisites

- Docker + Compose v2
- A public IP and open ports `80`, `443`, `4001` (libp2p) on every host
- A DNS `A` record pointing `DOMAIN` at the host running Caddy
- Basic familiarity with editing env files

## Layout (single-host pilot)

```
                  ┌─ Caddy :443 ── TLS terminate ──┬─ node1:8080 ──┐
internet ────────→│                                ├─ node2:8080   │ round-robin /api/*
                  └─ Caddy :4001 (passthrough)     └─ node3:8080   │ ip_hash /api/ws

Prometheus → node{1,2,3}:8080/metrics
Grafana    ← Prometheus data source
```
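
In a Caddy v2 Caddyfile, the `round-robin /api/*` vs `ip_hash /api/ws` split above might be expressed like this. This is a sketch, not the file shipped in `deploy/prod/`; the site name and upstream hostnames are illustrative, while `lb_policy` is Caddy's own directive:

```
dchain.example.com {
    handle /api/ws {
        reverse_proxy node1:8080 node2:8080 node3:8080 {
            lb_policy ip_hash      # keep each WS client pinned to one node
        }
    }
    handle /api/* {
        reverse_proxy node1:8080 node2:8080 node3:8080 {
            lb_policy round_robin  # spread stateless API calls evenly
        }
    }
}
```

`ip_hash` matters for `/api/ws` because a WebSocket session is stateful; bouncing it between nodes would force constant reconnects.
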

For a real multi-datacentre deployment, copy this whole directory onto each
VPS, edit `docker-compose.yml` to keep only the node that runs there, and
put Caddy on one dedicated edge host (or none — point clients at one node
directly and accept the lower availability).

## First-boot procedure

1. **Generate keys** for each validator. Easiest way:

   ```bash
   # On any box with the repo checked out
   docker build -t dchain-node-slim -f deploy/prod/Dockerfile.slim .
   mkdir -p deploy/prod/keys
   for i in 1 2 3; do
     docker run --rm -v "$PWD/deploy/prod/keys:/out" dchain-node-slim \
       /usr/local/bin/client keygen --out /out/node$i.json
   done
   cat deploy/prod/keys/node*.json | jq -r .pub_key  # → copy into DCHAIN_VALIDATORS
   ```

2. **Configure env files.** Copy `node.env.example` to `node1.env`,
   `node2.env`, and `node3.env`. Paste the three pubkeys from step 1 into
   `DCHAIN_VALIDATORS` in **all three** files. Set `DOMAIN` to your public
   hostname.
3. **Start the network**:
|
|
|
|
```bash
|
|
DOMAIN=dchain.example.com docker compose up -d
|
|
docker compose logs -f node1 # watch genesis + first blocks
|
|
```
|
|
|

   The first block is genesis (index 0), created only by `node1` because it
   has the `--genesis` flag. Once you see blocks #1, #2, #3… committing,
   **edit `docker-compose.yml` and remove the `--genesis` flag from node1's
   command section**, then run `docker compose up -d node1` to re-create the
   container without it. Leaving `--genesis` in is a no-op on a non-empty DB,
   but it adds noise to the logs.
4. **Verify HTTPS** and HTTP-to-HTTPS redirect:
|
|
|
|
```bash
|
|
curl -s https://$DOMAIN/api/netstats | jq
|
|
curl -s https://$DOMAIN/api/well-known-contracts | jq
|
|
```
|
|
|
|
Caddy should have issued a cert automatically from Let's Encrypt.
|
|
|

5. **(Optional) observability**:

   ```bash
   GRAFANA_ADMIN_PW=$(openssl rand -hex 24) docker compose --profile monitor up -d
   # Grafana at http://<host>:3000, user admin, password from env
   ```

   Add a "Prometheus" data source pointing at `http://prometheus:9090`,
   then import a dashboard that graphs:

   - `dchain_blocks_total` (rate)
   - `dchain_tx_submit_accepted_total` / `rejected_total`
   - `dchain_ws_connections`
   - `dchain_peer_count_live`
   - `rate(dchain_block_commit_seconds_sum[5m]) / rate(dchain_block_commit_seconds_count[5m])`
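
As a reference for step 2, a minimal `node1.env` might look like the sketch below. The variable names come from this guide, but the value format (comma separators, placeholder pubkeys) is an assumption; `node.env.example` is authoritative:

```
# node1.env — illustrative; real pubkeys come from step 1's keygen output
DOMAIN=dchain.example.com
DCHAIN_VALIDATORS=<node1-pubkey>,<node2-pubkey>,<node3-pubkey>
```
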
## Common tasks

### Add a 4th validator

The new node joins as an observer via `--join`; an existing validator then
promotes it on-chain:

```bash
# On the new box
docker run -d --name node4 \
  -v chaindata:/data \
  -e DCHAIN_ANNOUNCE=/ip4/<public-ip>/tcp/4001 \
  dchain-node-slim \
  --db=/data/chain --join=https://$DOMAIN --register-relay
```

Then from any existing validator:

```bash
docker compose exec node1 /usr/local/bin/client add-validator \
  --key /keys/node.json \
  --node http://localhost:8080 \
  --target <NEW_PUBKEY>
```

The new node starts signing as soon as it sees itself in the validator set
on-chain — no restart needed.

### Upgrade without downtime

PBFT tolerates `f` faulty nodes out of `3f+1`. For 3 validators that means
`f = 0` — any offline node halts consensus. So for 3-node clusters:

1. `docker compose pull && docker compose build` on all three hosts first.
2. Graceful one-at-a-time: `docker compose up -d --no-deps node1`, wait for
   `/api/netstats` to show it has caught up, then do node2, then node3.

For 4+ nodes you can afford one-at-a-time hot rolls.

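
The `3f+1` arithmetic above can be sanity-checked in a shell, using integer division for `f = (n - 1) / 3`:

```bash
# f = (n - 1) / 3 (integer division): how many validators can be
# offline simultaneously without halting PBFT consensus
for n in 3 4 7 10; do
  echo "$n validators tolerate $(( (n - 1) / 3 )) offline"
done
```

So 3 validators tolerate 0 offline nodes, 4 tolerate 1, and only at 7 do you get headroom for 2.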
### Back up the chain

```bash
docker run --rm -v node1_data:/data -v "$PWD":/bak alpine \
  tar czf /bak/dchain-backup-$(date +%F).tar.gz -C /data .
```

Restore by unpacking the archive into a fresh named volume before node
startup.

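
The pack/unpack pattern behind backup and restore can be exercised locally without Docker; the directory and file names below are made up for the demonstration:

```bash
# Demonstrate the tar -C pack/unpack cycle the backup and restore rely on
set -e
work=$(mktemp -d)
mkdir "$work/data" "$work/restore"
echo "badger-manifest" > "$work/data/MANIFEST"
tar czf "$work/bak.tar.gz" -C "$work/data" .    # pack, as in the backup step
tar xzf "$work/bak.tar.gz" -C "$work/restore"   # unpack, as in a restore
cat "$work/restore/MANIFEST"                    # → badger-manifest
```

The `-C` flag is what keeps paths inside the archive relative, so the tarball restores cleanly into any fresh volume mount.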
### Remove a bad validator

Same as adding, but with `remove-validator`. It only works if a majority of
CURRENT validators cosign the removal — intentional, so one rogue validator
cannot kick the others unilaterally (see ROADMAP P2.1).

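
Assuming "majority" means a strict majority (integer `n / 2` plus one, which is an interpretation rather than something this guide specifies), the required cosigner count works out as:

```bash
# Strict-majority threshold (assumed): floor(n / 2) + 1 cosigners
for n in 3 4 5; do
  echo "$n validators -> $(( n / 2 + 1 )) cosigners to remove one"
done
```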
## Security notes

- `/metrics` is firewalled to internal networks by Caddy. If you need
  external scraping, add proper auth (Caddy `basicauth` or mTLS).
- All public endpoints are rate-limited per-IP by the node itself — see
  `api_guards.go`. Adjust the limits before exposing a node to the open
  internet.
- Each node runs as non-root in a read-only-rootfs container with all
  capabilities dropped. If you need to exec into one: `docker compose exec
  --user root nodeN sh`.
- The Ed25519 key files mounted at `/keys/node.json` are your validator
  identities. Losing them means losing the ability to produce blocks. Get
  them onto the host via your normal secret management (Vault, sealed
  secrets, an encrypted tarball at deploy time). **Never commit them to
  git.**

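
For the external-scraping case, here is a hedged Caddyfile sketch of the `basicauth` option. The site name, matcher, and upstream are illustrative; the hash placeholder stands in for a real bcrypt hash produced by `caddy hash-password`:

```
dchain.example.com {
    @metrics path /metrics
    basicauth @metrics {
        prometheus <bcrypt-hash-from-caddy-hash-password>
    }
    reverse_proxy node1:8080
}
```
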
## Troubleshooting

| Symptom | Check |
|---------|-------|
| Caddy keeps logging `failed to get certificate` | Is port 80 open? Does the DNS A record point here? `docker compose logs caddy` |
| New node can't sync: `FATAL: genesis hash mismatch` | The `--db` volume holds data from a different chain. `docker volume rm nodeN_data` and re-up |
| Chain stops producing blocks | `docker compose logs nodeN \| tail -100`; look for `SLOW AddBlock` or validator silence |
| `/api/ws` returns 429 | The client opened more than `WSMaxConnectionsPerIP` (default 10) connections. See the per-IP cap in `ws.go` |
| Disk usage keeps growing | Background vlog GC runs every 5 min. Manual trigger: `docker compose exec nodeN /bin/sh -c 'kill -USR1 1'` (see `StartValueLogGC`) |