Nelson Kamga

How we run our Tempo RPC nodes

Hardware, setup and operational lessons from running Tempo testnet nodes


We've been running our own Tempo testnet nodes at Transfa for a couple of months now. This is an opinionated guide on how we run those nodes, what we've learnt in the process, and things we wish we knew when we started. If you're looking for a complete guide on how to run a Tempo node, the Tempo docs are a good starting point. Although this setup should work for a minimal production deployment, some of our choices may not fit your use case, so your mileage may vary.

Hardware requirements

Disk I/O

NVMe disks are non-negotiable. Network-attached block storage like EBS won't cut it. You can verify your disk performance with fio:

fio --name=nvme_test --directory="$DATADIR" --filename=fio.test --size=10G --rw=randread --bs=4k --ioengine=libaio --iodepth=64 --numjobs=4 --direct=1 --time_based --runtime=60 --group_reporting && rm $DATADIR/fio.test

$DATADIR is any directory on the same partition as the Tempo data directory. You can also point fio at the raw disk directly with --filename=<disk-name> (e.g. --filename=/dev/nvme0n1). This is safe only because we are running a read test; a write test against the raw device would corrupt the data on disk.

sudo fio --name=nvme_test --filename=/dev/nvme0n1 --size=10G --rw=randread --bs=4k --ioengine=libaio --iodepth=64 --numjobs=4 --direct=1 --time_based --runtime=60 --group_reporting

For an NVMe disk, you should be looking at 500K-1M IOPS, less than 500µs latency and about 3000-7000 MB/s depending on the generation. Here's the result of a test on a Scaleway Elastic Metal server (A610R-NVME): 749K IOPS, 337µs average latency, 2925 MiB/s throughput.

Full fio output
nvme_test: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=64
...
fio-3.36
Starting 4 processes
Jobs: 4 (f=4): [r(4)][100.0%][r=2919MiB/s][r=747k IOPS][eta 00m:00s]
nvme_test: (groupid=0, jobs=4): err= 0: pid=87055: Tue Feb 17 02:51:57 2026
  read: IOPS=749k, BW=2925MiB/s (3067MB/s)(171GiB/60001msec)
    slat (usec): min=2, max=255, avg= 3.88, stdev= 2.21
    clat (usec): min=35, max=8371, avg=337.51, stdev=27.76
     lat (usec): min=38, max=8388, avg=341.39, stdev=27.91
    clat percentiles (usec):
     |  1.00th=[  302],  5.00th=[  310], 10.00th=[  314], 20.00th=[  322],
     | 30.00th=[  326], 40.00th=[  330], 50.00th=[  338], 60.00th=[  343],
     | 70.00th=[  347], 80.00th=[  355], 90.00th=[  363], 95.00th=[  371],
     | 99.00th=[  388], 99.50th=[  396], 99.90th=[  453], 99.95th=[  578],
     | 99.99th=[  783]
   bw (  MiB/s): min= 2688, max= 2965, per=100.00%, avg=2926.42, stdev= 6.44, samples=476
   iops        : min=688268, max=759076, avg=749163.93, stdev=1648.77, samples=476
  lat (usec)   : 50=0.01%, 100=0.01%, 250=0.01%, 500=99.93%, 750=0.05%
  lat (usec)   : 1000=0.02%
  lat (msec)   : 2=0.01%, 10=0.01%
  cpu          : usr=21.06%, sys=78.92%, ctx=1281, majf=0, minf=305
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
     issued rwts: total=44934568,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=64
 
Run status group 0 (all jobs):
   READ: bw=2925MiB/s (3067MB/s), 2925MiB/s-2925MiB/s (3067MB/s-3067MB/s), io=171GiB (184GB), run=60001-60001msec
 
Disk stats (read/write):
  nvme0n1: ios=44855690/61, sectors=358845520/378, merge=0/6, ticks=1517197/195, in_queue=1517415, util=99.86%

Why does disk I/O matter so much? Reth (the execution engine for Tempo) runs the sync process in stages. The execution stage, which involves executing transactions and storing indexes, is the main bottleneck and is heavily I/O bound. A node with low IOPS will spend more time on this stage, and given Tempo has sub-second finality, if the sync rate is slower than the block creation rate the node will not be able to catch up.

RAM & CPU

We run our nodes on machines with 32GB of RAM and 8 cores. In terms of actual usage, our nodes sit comfortably at 8GB of RAM used with a 3.6 average CPU load. Nodes tend to use a lot of buffer/cache, so more memory means more headroom for caching, which reduces disk pressure.
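A quick way to check how much headroom a node actually has. These are standard Linux tools, nothing Tempo-specific:

```shell
# Memory: the "available" column accounts for reclaimable buffer/cache,
# so it is a better headroom signal than the "free" column.
free -h

# Load averages: compare the 1/5/15-minute figures against the core count.
uptime
nproc
```

A load average consistently above the core count reported by nproc means the node is CPU-starved.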

Storage

At the time of writing, the Moderato testnet is at roughly 450GB of data, growing at around 10GB per day. At this rate, a 2TB disk would be full within 5 months. Once that happens, we would have to increase storage capacity or run a full node with pruning enabled. The documentation mentions that consensus data can be separated from execution data and stored on less performant (thus cheaper) storage. We're not sure what proportion of the data is consensus data, but if it turns out to be sizeable, separating it could be an option.
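The runway estimate above is simple arithmetic; here it is as a sketch you can plug your own df numbers into:

```shell
# Rough disk runway: ~450GB used today, ~10GB/day growth, 2TB disk.
disk_gb=2000
used_gb=450
growth_gb_per_day=10

days_left=$(( (disk_gb - used_gb) / growth_gb_per_day ))
echo "~${days_left} days (~$(( days_left / 30 )) months) until the disk is full"
```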

For our disk setup, the baremetal provider we use ships two NVMe disks per host (most providers do), so we configured them in RAID0 for better performance and more usable storage. RAID0 means total data loss if either disk fails, but we make up for that risk with node redundancy, and in the worst case we can reinstall and sync a node within a couple of hours using snapshots.
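A minimal sketch of that RAID0 setup with mdadm, assuming the two disks show up as /dev/nvme0n1 and /dev/nvme1n1 (check with lsblk) and that /data is the mount point. This wipes both disks:

```shell
# WARNING: destroys any data on both disks.
mdadm --create /dev/md0 --level=0 --raid-devices=2 /dev/nvme0n1 /dev/nvme1n1

# Filesystem and mount point (ext4 here; xfs works just as well).
mkfs.ext4 /dev/md0
mkdir -p /data
mount /dev/md0 /data

# Persist the array and the mount across reboots.
mdadm --detail --scan >> /etc/mdadm/mdadm.conf
echo '/dev/md0 /data ext4 defaults,noatime 0 0' >> /etc/fstab
```

noatime avoids an extra metadata write on every read, which is a small but free win for an I/O-bound workload.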

Networking

In our experience, low network speed doesn't affect node performance much. It does, however, slow down the snapshot download during the initial sync. Although the recommended minimum is 1Gbps, we have been able to run nodes on 250 Mbps (albeit with a low connected peer count).

Providers

Tempo has a few baremetal provider recommendations which are a good starting point. In my experience, OVH Advance servers seem to be the best value (starting in the $100 range). If you don't mind the EU-US ping latency, Scaleway Elastic Metal and Hetzner Dedicated Rootserver are strong contenders.

Installing and running a node

A node is just a program you run on a machine, and installation is pretty simple. The Tempo docs do a good job of outlining how to do that. For our setup, we opted for systemd as the simplest way to run the node on a Linux machine. The other option is Docker, but it doesn't provide much benefit for the added complexity. We use Ansible to automate node provisioning on new baremetal servers. You can find the Ansible playbook in this GitHub repo.

Here's our systemd unit file:

[Unit]
Description=Tempo RPC Node
After=network.target
Wants=network.target
 
[Service]
Type=simple
User=tempo
Group=tempo
WorkingDirectory=/data/tempo
Environment=RUST_LOG=info
 
ExecStart=/usr/local/bin/tempo node \
  --datadir /data/tempo \
  --follow \
  --chain moderato \
  --port 30303 \
  --discovery.addr 0.0.0.0 \
  --discovery.port 30303 \
  --nat extip:<PUBLIC_IP> \
  --http \
  --http.addr 127.0.0.1 \
  --http.port 8545 \
  --http.api eth,net,web3,txpool,trace,debug \
  --log.file.directory /var/log/tempo \
  --rpc.max-logs-per-response 100000 \
  --metrics <TAILSCALE_IP>:9001
 
Restart=on-failure
RestartSec=10
 
StandardOutput=journal
StandardError=journal
SyslogIdentifier=tempo
LimitNOFILE=infinity
 
ProtectSystem=full
ProtectHome=true
NoNewPrivileges=true
PrivateTmp=true
ProtectKernelTunables=true
ProtectKernelModules=true
ProtectControlGroups=true
LockPersonality=true
MemoryDenyWriteExecute=true
RestrictSUIDSGID=true
RestrictRealtime=true
CapabilityBoundingSet=
 
[Install]
WantedBy=multi-user.target

--follow enables follower mode, which is what you want for an RPC node. --http.addr 127.0.0.1 binds the RPC server to localhost only; we then expose it via a reverse proxy (more on that below). We also enable the systemd security hardening directives (ProtectSystem, NoNewPrivileges, etc.) to limit the blast radius in case of a compromise.

Once everything is installed, the next step is to download a snapshot to speed up the initial syncing process. Tempo hosts daily compressed snapshots of the node database and static files at snapshots.tempoxyz.dev which can be downloaded with the tempo download command. The snapshots are quite large so downloading one could take anywhere from a few minutes to a couple of hours depending on network speed. The Tempo team has done some work to improve the reliability of the snapshot download process by making it resumable.
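If you ever need to fetch a snapshot outside of tempo download, a resumable fetch with curl looks like the sketch below. The archive path is a placeholder, not a real URL; check snapshots.tempoxyz.dev for the actual layout:

```shell
# -C - resumes a partial download instead of starting over;
# --fail avoids saving an HTML error page as the archive.
# NOTE: <snapshot-archive> is a placeholder path, not a real snapshot name.
curl --fail -L -C - -O https://snapshots.tempoxyz.dev/<snapshot-archive>
```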

Starting the node is as simple as running systemctl start tempo. We run journalctl -u tempo -f to view logs (and also ship those logs via promtail). cast comes in handy for quick debugging against the RPC node.
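A few cast one-liners we reach for (cast ships with Foundry; these assume the node's RPC is reachable on localhost):

```shell
# Latest block height the node has executed.
cast block-number --rpc-url http://127.0.0.1:8545

# Chain ID, a quick sanity check that you're talking to the right network.
cast chain-id --rpc-url http://127.0.0.1:8545

# Raw JSON-RPC, e.g. sync status (returns `false` once fully synced).
cast rpc eth_syncing --rpc-url http://127.0.0.1:8545
```

Comparing cast block-number against a public RPC endpoint is the fastest way to eyeball whether your node is lagging.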

Operating a node

Given we only use our nodes for internal purposes, the RPC port is only reachable over our Tailscale network. It is bound to localhost (127.0.0.1) for security, and we use a reverse proxy (Caddy) to expose it on the Tailscale interface:

<TAILSCALE_IP>:8545 {
    reverse_proxy localhost:8545
}

Tempo nodes expose a metrics server that can be scraped by Prometheus and used as a Grafana datasource. We use the Reth Grafana dashboard to monitor our nodes. The main things we look at are the connected peer count and the sync status. For the connected peer count, higher is better, but as long as it is not zero your node will keep running. Sync status helps you detect lag or a node that has fallen out of sync, and is also useful for tracking sync progress. We have two alerts configured on those metrics:

  • reth_network_connected_peers < 1 (no connected peers): This usually signals connectivity issues or a node running on an old version that is no longer supported. If your node has been down for a while, it will likely report 0 connected peers on startup; it can take a few minutes to find peers, so don't panic, just give it some time.

  • increase(reth_sync_checkpoint{stage="Finish"}[5m]) < 1 (node out of sync): If the Finish sync stage has not progressed in the last 5 minutes, sync has stalled. This usually happens when there are no connected peers.

Here are the Prometheus alert rules we use:

groups:
  - name: Node Alerts
    interval: 1m
    rules:
      - alert: NodeHasNoConnectedPeers
        expr: reth_network_connected_peers < 1
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: "Node {{ $labels.instance }} has no connected peers"
          description: "The node has been without peers for more than 2 minutes."
 
      - alert: NodeIsOutOfSync
        expr: increase(reth_sync_checkpoint{stage="Finish"}[5m]) < 1
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Node {{ $labels.instance }} is out of sync"
          description: "The node has not processed any new blocks in the last 5 minutes."

We've observed that nodes often fall out of sync when they are not upgraded in time for a hardfork or a backward-incompatible chain upgrade. These have happened a few times since the initial Tempo release and are usually announced in the Tempo GitHub release notes. We also track disk usage, with an alert that fires at 80% usage to leave a buffer for increasing storage capacity or switching to a full node if increasing capacity is not an option. We do this with node_exporter, which ships host metrics to our Grafana instance.
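The disk usage alert, as a sketch using standard node_exporter filesystem metrics. The /data mountpoint matches our setup; adjust it to yours:

```yaml
- alert: DiskUsageHigh
  expr: |
    (1 - node_filesystem_avail_bytes{mountpoint="/data"}
       / node_filesystem_size_bytes{mountpoint="/data"}) > 0.80
  for: 10m
  labels:
    severity: warning
  annotations:
    summary: "Disk usage above 80% on {{ $labels.instance }}"
    description: "Increase capacity or plan the switch to a pruned full node."
```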

What's next

There are a few things we're still figuring out. We haven't explored separating consensus data from execution data yet — if consensus data turns out to be a sizeable chunk, storing it on cheaper disks could help with the storage cost problem. Pruning is another option we'll likely need to evaluate as the chain grows.

We're also keeping an eye on how things change as Tempo moves towards mainnet. Higher throughput and more activity will likely shift the hardware requirements, so some of the numbers in this post may not age well.

If you're setting up your own node, the Tempo docs are the best starting point. You can also find our ansible playbook for automating node provisioning here.