Performance: Node benchmarking utility #6198

urtho · 2024-12-16T12:43:39Z

go-algorand could use a standardized way to compare and benchmark the underlying hardware, ideally with a repeatable workload that closely matches a real scenario.

Users could compare their results online and make sure their hardware's performance is above the median so that network peak performance can grow with the number of new nodes.

The catchpointdump utility is a natural first candidate for such a tool:

- It is already in the repo.
- It can closely simulate a fast catchup procedure in a repeatable setting.

This patch adds a bench command to the utility that combines the network and file restore scenarios, so download, SQLite loading, and Merkle tree build can all be benchmarked in one go.
It reuses dependencies that are already in go.mod to collect information about the hardware, at least on Linux.

Results can optionally be dumped to a JSON file, ready for submission to a central benchmark repository.

Examples

Simple Network, SSD and CPU test

Known catchpoint round, sourced from a random relay/archiver:

./catchpointdump bench -r 41600000 -n mainnet.algorand.network

# Benchmark report:
# >> stage:network duration_sec:89.1 duration_min:1.5 cpu_sec:101
# >> stage:database duration_sec:648.8 duration_min:10.8 cpu_sec:507
# >> stage:digest duration_sec:385.1 duration_min:6.4 cpu_sec:550

SSD and CPU test with local file

Benchmark only the disk and CPU stages, using an already downloaded ledger snapshot:

./catchpointdump bench -n mainnet.algorand.network -t mainnet/snap/41600000.tar

Full benchmark with JSON report and hosted snapshot

A repeatable benchmark with a Cloudflare-hosted catchpoint and a JSON report dump:

./catchpointdump bench -r 41600000 -n mainnet.algorand.network -p snap.nodely.io -j report.json

Report file

Sample report.json:

{
    "report": "a193cbc7-6e6a-732b-93cf-36f0c0589864",
    "stages": [
        {
            "stage": "network",
            "duration_sec": 39,
            "cpu_time_sec": 59
        },
        {
            "stage": "database",
            "duration_sec": 795,
            "cpu_time_sec": 629
        },
        {
            "stage": "digest",
            "duration_sec": 363,
            "cpu_time_sec": 482
        }
    ],
    "host": {
        "cores": 20,
        "log_cores": 20,
        "base_mhz": 2500,
        "max_mhz": 3500,
        "cpu_name": "13th Gen Intel(R) Core(TM) i5-13500",
        "cpu_vendor": "Intel",
        "mem_mb": 64105,
        "os": "linux",
        "uuid": "c3acdb4e-3937-a9a6-2266-d80ce615ef45"
    }
}
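
For anyone who wants to post-process such a report, here is a minimal sketch of decoding it in Go. Only the JSON keys come from the sample above; the Go type names, field names, and numeric types are assumptions, and the embedded sample is shortened.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// benchReport mirrors the sample report.json. Type and field names are
// hypothetical; only the JSON keys are taken from the sample.
type benchReport struct {
	Report string       `json:"report"`
	Stages []benchStage `json:"stages"`
	Host   hostInfo     `json:"host"`
}

type benchStage struct {
	Stage       string  `json:"stage"`
	DurationSec float64 `json:"duration_sec"`
	CPUTimeSec  int64   `json:"cpu_time_sec"`
}

type hostInfo struct {
	Cores    int    `json:"cores"`
	LogCores int    `json:"log_cores"`
	CPUName  string `json:"cpu_name"`
	MemMB    uint64 `json:"mem_mb"`
	OS       string `json:"os"`
}

// Shortened copy of the sample report.
const sample = `{
  "report": "a193cbc7-6e6a-732b-93cf-36f0c0589864",
  "stages": [
    {"stage": "network", "duration_sec": 39, "cpu_time_sec": 59},
    {"stage": "database", "duration_sec": 795, "cpu_time_sec": 629},
    {"stage": "digest", "duration_sec": 363, "cpu_time_sec": 482}
  ],
  "host": {"cores": 20, "log_cores": 20, "cpu_name": "13th Gen Intel(R) Core(TM) i5-13500", "mem_mb": 64105, "os": "linux"}
}`

// parseReport decodes a report from its JSON text.
func parseReport(data string) (benchReport, error) {
	var r benchReport
	err := json.Unmarshal([]byte(data), &r)
	return r, err
}

// totals sums wall-clock and CPU seconds across all stages.
func totals(r benchReport) (wallSec float64, cpuSec int64) {
	for _, s := range r.Stages {
		wallSec += s.DurationSec
		cpuSec += s.CPUTimeSec
	}
	return wallSec, cpuSec
}

func main() {
	r, err := parseReport(sample)
	if err != nil {
		panic(err)
	}
	wall, cpu := totals(r)
	fmt.Printf("%s: %d stages, wall %.0fs, cpu %ds on %d cores\n",
		r.Report, len(r.Stages), wall, cpu, r.Host.Cores)
}
```

Fields not declared in the structs (for example base_mhz or uuid) are simply ignored by encoding/json, so the sketch keeps working if the report gains keys.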

The file can be uploaded to a third-party benchmark site, for example:

curl -X POST https://benchmarks.nodely.io/api/report -d @report.json
#{"success":true,"goto":"https://benchmarks.nodely.io/edit/a193cbc7-6e6a-732b-93cf-36f0c0589864"}

codecov · 2024-12-16T13:08:53Z

Codecov Report

Attention: Patch coverage is 0% with 163 lines in your changes missing coverage. Please review.

Project coverage is 51.68%. Comparing base (269945c) to head (9d83da6).

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| cmd/catchpointdump/bench.go | 0.00% | 93 Missing ⚠️ |
| cmd/catchpointdump/bench_report.go | 0.00% | 59 Missing ⚠️ |
| util/util.go | 0.00% | 10 Missing ⚠️ |
| cmd/catchpointdump/commands.go | 0.00% | 1 Missing ⚠️ |

Additional details and impacted files

@@            Coverage Diff             @@
##           master    #6198      +/-   ##
==========================================
- Coverage   51.78%   51.68%   -0.10%     
==========================================
  Files         644      646       +2     
  Lines       86697    86860     +163     
==========================================
+ Hits        44894    44895       +1     
- Misses      38933    39098     +165     
+ Partials     2870     2867       -3


urtho · 2024-12-18T00:11:35Z

Submitting a report might be fun:

algorandskiy

Good work! I left a few comments.
The PR will need an update after #6177 gets merged.

algorandskiy · 2024-12-18T00:53:29Z

cmd/catchpointdump/bench_report.go

+	return fmt.Sprintf(">> stage:%s duration_sec:%.1f duration_min:%.1f cpu_sec:%d", bs.stage, bs.duration.Seconds(), bs.duration.Minutes(), bs.cpuTimeNS/1000000000)
+}
+
+func maybeGetTotalMemory() uint64 {


consider moving to util/util.go

algorandskiy · 2024-12-18T00:55:30Z

cmd/catchpointdump/bench.go

+	benchCmd.Flags().IntVarP(&round, "round", "r", 0, "Specify the round number ( i.e. 7700000 )")
+	benchCmd.Flags().StringVarP(&relayAddress, "relay", "p", "", "Relay address to use ( i.e. r-ru.algorand-mainnet.network:4160 )")
+	benchCmd.Flags().StringVarP(&catchpointFile, "tar", "t", "", "Specify the catchpoint file (either .tar or .tar.gz) to process")
+	benchCmd.Flags().StringVarP(&reportJsonPath, "report", "j", "", "Specify the file to save the Json formatted report to")


Suggested change

-	benchCmd.Flags().StringVarP(&reportJsonPath, "report", "j", "", "Specify the file to save the Json formatted report to")
+	benchCmd.Flags().StringVarP(&reportJsonPath, "report", "j", "", "Specify the file to save the JSON formatted report to")

algorandskiy · 2024-12-18T00:56:29Z

cmd/catchpointdump/bench_report.go

+}
+
+func GetCPU() int64 {
+	usage := new(syscall.Rusage)


same, move to util

algorandskiy · 2024-12-18T00:58:03Z

cmd/catchpointdump/bench.go

+		addrs = []string{relayAddress}
+	} else {
+		//append relays
+		dnsaddrs, err := tools.ReadFromSRV(context.Background(), "algobootstrap", "tcp", networkName, "", false)


"algobootstrap" probably should not be used here, since those nodes are not obliged to keep any catchpoints except the few most recent ones.

gmalouf · 2025-01-29T20:06:32Z

@urtho if you want to refresh this PR from master, now is a good time!

gmalouf · 2025-02-05T02:32:48Z

Some issues need to be worked out before this can move forward:

# github.com/algorand/go-algorand/cmd/catchpointdump
cmd/catchpointdump/bench.go:178:35: assignment mismatch: 4 variables but catchupAccessor.GetVerifyData returns 6 values
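
This error means the call site still unpacks four results while the function now returns six. A minimal illustration of the mismatch and one way to resolve it, using a hypothetical stand-in function (`sixValues` is not the real `GetVerifyData`, and which of the six values the benchmark actually needs is not shown here):

```go
package main

import "fmt"

// sixValues is a hypothetical stand-in for a function whose return list
// grew from four to six values; it is NOT the real GetVerifyData.
func sixValues() (a, b, c, d, e int, err error) {
	return 1, 2, 3, 4, 5, nil
}

func main() {
	// The old call shape no longer compiles:
	//   a, b, c, err := sixValues()
	//   // assignment mismatch: 4 variables but sixValues returns 6 values

	// Receive every result and discard the unneeded ones with the blank
	// identifier.
	a, b, c, _, _, err := sixValues()
	if err != nil {
		panic(err)
	}
	fmt.Println(a, b, c)
}
```

The positions of the discarded values here are arbitrary; the caller has to check the updated signature to see where the new results were added.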

algorandskiy · 2025-02-15T20:36:48Z

@urtho could you remerge/fix the build and go through my comments?

gmalouf · 2025-02-18T20:16:10Z

@urtho remaining failures can be tracked down and addressed by:

- running check_license.sh locally, which will add the license header to your new files (Codegen failure)
- running make lint, which reveals a number of new issues introduced by the changes in this PR (ReviewDog failure)

algorandskiy · 2025-03-09T03:20:31Z

cmd/catchpointdump/bench.go

+// Copyright (C) 2019-2024 Algorand, Inc.
+// This file is part of go-algorand
+//
+// go-algorand is free software: you can redistribute it and/or modify
+// it under the terms of the GNU Affero General Public License as
+// published by the Free Software Foundation, either version 3 of the
+// License, or (at your option) any later version.
+//
+// go-algorand is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU Affero General Public License for more details.
+//
+// You should have received a copy of the GNU Affero General Public License
+// along with go-algorand.  If not, see <https://www.gnu.org/licenses/>.


double copyright, remove

feat: cachpointdump benchmark

9436a7c

Merge branch 'algorand:master' into urtho-benchmark

a83a9dc

urtho added 2 commits February 9, 2025 12:08

fix Merkle Trie step logging

05c0b94

Merge branch 'urtho-benchmark' of github.com:AlgoNode/go-algorand int…

abd871b

…o urtho-benchmark

urtho added 2 commits February 16, 2025 12:50

only use archivers fro catchpoint source

ad3c975

Move utility functions to util

9ba8147

gmalouf changed the title ~~Node benchmarking utility~~ Performance: Node benchmarking utility Feb 18, 2025

gmalouf added the Enhancement label Feb 18, 2025

algorandskiy added the external contribution label Feb 18, 2025

urtho added 3 commits March 8, 2025 22:38

Merge branch 'algorand:master' into urtho-benchmark

e6b5a61

add license to benchmark files

471b662

make linter happy

9d83da6
