Add zinnia spark spot check #2

Merged
merged 21 commits into from
Feb 17, 2025
Conversation

pyropy
Contributor

@pyropy pyropy commented Feb 6, 2025

Adds a basic utility for performing spot checks on Spark retrievals. By default, it performs a full retrieval on all deals for the current round.

Implementation plan specified here
Closes CheckerNetwork/roadmap#226

This was referenced Feb 6, 2025
@pyropy pyropy force-pushed the add-zinnia-spark-spot-check branch from 72196c3 to eb1d72e Compare February 7, 2025 12:05
@pyropy
Contributor Author

pyropy commented Feb 7, 2025

I have rebased onto main, given that #3 is a predecessor to #2 but was opened at a later time (hence the force-push).

@pyropy pyropy self-assigned this Feb 7, 2025
@juliangruber
Member

I have rebased onto main, given that #3 is a predecessor to #2 but was opened at a later time (hence the force-push).

@pyropy GitHub has a feature for this. If PR A depends on PR B, set the target branch of PR A to PR B's branch. Then, when PR B gets merged, GitHub automatically changes PR A to target main.

@pyropy
Contributor Author

pyropy commented Feb 7, 2025

I have rebased onto main, given that #3 is a predecessor to #2 but was opened at a later time (hence the force-push).

@pyropy GitHub has a feature for this. If PR A depends on PR B, set the target branch of PR A to PR B's branch. Then, when PR B gets merged, GitHub automatically changes PR A to target main.

Did not know about that, that's awesome! Thanks for letting me know 🙏🏻

@pyropy pyropy marked this pull request as ready for review February 10, 2025 12:02
Member

@juliangruber juliangruber left a comment


Great work! 👏

lib/constants.js Outdated
export const SPARK_VERSION = '1.15.0'
export const MAX_CAR_SIZE = 200 * 1024 * 1024 // 200 MB
export const APPROX_ROUND_LENGTH_IN_MS = 20 * 60_000 // 20 minutes
Member


This constant can be removed since the spot checker isn't enforcing a CAR size limit anywhere

Contributor Author


It is actually used to define the maximum size of the buffer we read CAR bytes into:

const carBuffer = new ArrayBuffer(0, { maxByteLength: MAX_CAR_SIZE })

Member


Thanks, I didn't see that! Can it happen that we read more than MAX_CAR_SIZE into the buffer, and if so, what happens then?

Contributor Author


Usually, when we try to read more bytes into the buffer than it allows, the ArrayBuffer instance throws an error, for example:

RangeError: ArrayBuffer.prototype.resize: Invalid length parameter
    at ArrayBuffer.resize (<anonymous>)
    at SpotChecker.fetchCAR (file:///home/srdjan/Development/checker-network/spark-spot-check/lib/spot-checker.js:126:21)
    at eventLoopTick (ext:core/01_core.js:178:11)
    at async SpotChecker.executeSpotCheck (file:///home/srdjan/Development/checker-network/spark-spot-check/lib/spot-checker.js:75:5)
    at async SpotChecker.spotCheck (file:///home/srdjan/Development/checker-network/spark-spot-check/lib/spot-checker.js:158:5)
    at async SpotChecker.run (file:///home/srdjan/Development/checker-network/spark-spot-check/lib/spot-checker.js:183:24)
    at async file:///home/srdjan/Development/checker-network/spark-spot-check/main.js:35:1

I have reduced the size from 200 MB to 1 MB, as we're currently only reading 200 bytes (there is some overhead due to CAR encoding, hence the bigger upper limit of 1 MB).

Contributor Author


I’ve removed this limit since we’re no longer loading the CAR file into memory. Instead, we're now reading a chunked stream returned by Lassie.

We now use the maxByteLength parameter, which defines the threshold for the number of bytes downloaded before interrupting the retrieval.
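A minimal sketch of that threshold idea, with illustrative names (readUpTo and fakeBody are not part of the codebase): count bytes as chunks stream in and stop once maxByteLength is reached.

```javascript
// Consume a chunked stream, stopping once maxByteLength bytes have arrived.
async function readUpTo (stream, maxByteLength) {
  const chunks = []
  let received = 0
  for await (const chunk of stream) {
    chunks.push(chunk)
    received += chunk.length
    if (received >= maxByteLength) break // interrupt the retrieval here
  }
  return { chunks, received }
}

// Simulate a chunked response body with an async generator: ten 100-byte chunks.
async function * fakeBody () {
  for (let i = 0; i < 10; i++) yield new Uint8Array(100)
}

// With a 250-byte threshold, reading stops after the third chunk arrives.
const { received } = await readUpTo(fakeBody(), 250)
```

Note that the last chunk is kept whole, so the total downloaded can slightly exceed the threshold (here 300 bytes for a 250-byte limit).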

// '12' is the multihash code for sha2-256; '20' (hex) is the digest length (32 bytes = 256 bits)
stats.carChecksum = '1220' + encodeHex(digest)
}
await verifyContent(cid, carBytes)
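For reference, the '1220' prefix above is a multihash header: 0x12 identifies sha2-256 and 0x20 is the 32-byte digest length. A standalone sketch (using Node's crypto module, not the project's encodeHex helper) of how such a checksum string is built:

```javascript
import { createHash } from 'node:crypto'

// Build a sha2-256 multihash hex string: '12' (sha2-256 code) + '20'
// (32-byte digest length) + the hex-encoded digest itself.
function carChecksum (carBytes) {
  const digest = createHash('sha256').update(carBytes).digest('hex')
  return '1220' + digest
}

const checksum = carChecksum(new TextEncoder().encode('example CAR bytes'))
```

The resulting string is always 68 hex characters: 4 for the multihash header plus 64 for the 32-byte digest.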
Member


What exactly is verifyContent checking? Will it only verify that the root block payload in carBytes matches the hash in cid? IIUC, that's how I implemented this verification in Spark, where it makes sense, since we are fetching the root block only.

When we are fetching more than a root block, then it would be best to verify the entire content. Otherwise, the server can return a CAR with a root block followed by a random payload, and we will accept it as a valid response.

To verify the entire content, we need to download the entire Merkle tree (the full CAR file).

If we want to work with byte ranges, then we may be able to use IPFS HTTP Gateway feature where the client asks for a byte range in the file and the gateway is expected to send not only those bytes, but also additional content required to perform content-verification all the way up to the root cid.

Another option is to skip content verification. The downside is that without content verification, our spot check cannot tell whether the SP is serving the real content or just random bytes.

Let's discuss!

Member


I see that you are already using entity-bytes below, so maybe my comment above is not relevant as it's based on an incomplete understanding of this pull request.

@pyropy pyropy marked this pull request as draft February 11, 2025 12:39
Member

@bajtos bajtos left a comment


The PR looks great at the high level.

I think the biggest remaining question is how many bytes to retrieve while being able to do proper content verification. As we discussed in our call earlier today, one option is to use Lassie for all retrievals (both Graphsync and HTTP). Lassie performs content verification and can offload large payloads to disk.

I am concerned about having two subtly different copies of Spark's lib files. It's fine for a proof of concept, but once we move to a more production-ready version, I would like us to explore how to refactor the original Spark checker code to support both "regular" checks and these new "spot" checks. Ideally, the new spot checker should re-use Spark checker's lib files with no modifications needed. (We can even move the spot checker's entry point to the Spark checker repository, similarly to how we have manual-check.js, but that's open for discussion.)

@pyropy
Contributor Author

pyropy commented Feb 11, 2025

The PR looks great at the high level.

I think the biggest remaining question is how many bytes to retrieve while being able to do proper content verification. As we discussed in our call earlier today, one option is to use Lassie for all retrievals (both Graphsync and HTTP). Lassie performs content verification and can offload large payloads to disk.

I think that for now we should piggyback on the Lassie retrievals, as it enables us to perform content verification.

What I have found is that we're able to perform validation over each chunk that we receive as response:

const reader = await CarBlockIterator.fromIterable(res.body)
for await (const block of reader) {
    await validateBlock(block)
}

I am not sure if this is correct, but if it is, we can either perform a full retrieval or cancel the retrieval after some set period while we validate streamed blocks on the fly.

I am concerned about having two subtly different copies of Spark's lib files. It's fine for a proof of concept, but once we move to a more production-ready version, I would like us to explore how to refactor the original Spark checker code to support both "regular" checks and these new "spot" checks. Ideally, the new spot checker should re-use Spark checker's lib files with no modifications needed. (We can even move the spot checker's entry point to the Spark checker repository, similarly to how we have manual-check.js, but that's open for discussion.)

I completely agree with you on this one. The original idea was to add the spot check to the Spark codebase, but I was afraid of introducing too many changes at once to such a critical part of the codebase, so I decided to create a new repository as a PoC. I think we can move it to the main codebase once we are sure that spot checks make sense and that the ones we perform are correct.

@bajtos
Member

bajtos commented Feb 12, 2025

The PR looks great at the high level.
I think the biggest remaining question is how many bytes to retrieve while being able to do proper content verification. As we discussed in our call earlier today, one option is to use Lassie for all retrievals (both Graphsync and HTTP). Lassie performs content verification and can offload large payloads to disk.

I think that for now we should piggyback on the Lassie retrievals, as it enables us to perform content verification.

What I have found is that we're able to perform validation over each chunk that we receive as response:

const reader = await CarBlockIterator.fromIterable(res.body)
for await (const block of reader) {
    await validateBlock(block)
}

I am not sure if this is correct, but if it is, we can either perform a full retrieval or cancel the retrieval after some set period while we validate streamed blocks on the fly.

It looks correct to me. IIRC, Lassie is re-ordering the blocks inside the returned CAR file to enable streaming validation like the one you are performing in your for-await loop.

I believe Lassie already performs content verification; therefore, it's not strictly necessary to do content verification inside the spot checker. It should not harm, though.

lib/multiaddr.js Outdated
Member


I think we can remove this file and not call validateHttpMultiaddr() since we are delegating all retrievals to Lassie. We can let Lassie validate the multiaddr.

What do you think?

The only downside I see is that the spot-checker may be able to retrieve successfully from a multiaddr value that the real Spark checker does not support, which can be confusing.

So maybe it's better to keep the validation check in place 🤷🏻

Contributor Author


I originally removed this in favor of Lassie's validation logic but later decided to add it back. When given an invalid multiaddr, Lassie responds with a 400 status code and a generic error message like "invalid providers parameter", without providing much detail about the issue.

@pyropy pyropy marked this pull request as ready for review February 12, 2025 13:18
}

/**
* @param {object} args
Member


Indentation looks way off 😅

Member


This applies to more jsdoc comments

Member


Please in the future review PRs before submitting them :)

Member

@juliangruber juliangruber Feb 12, 2025


I will continue review after the indentation has been fixed, because that will make the diff easier to read

Contributor Author

@pyropy pyropy Feb 12, 2025


I ran npx standard --fix, and that’s the result 😅

By the way, what have you been using on Spark?

Contributor Author


I've reformatted everything using VSCode and then ran npx standard --fix again, and now it seems fine. I've been using neovim for the past couple of days, so maybe that's the issue.

Member


I haven't been using any code formatter, but I think @bajtos uses one

Member


Not sure how this got messed up 😅

Member


IIRC, standard has no opinions on whitespace in code comments. It annoys me, I hope we will find a better solution as part of CheckerNetwork/roadmap#166

@pyropy pyropy merged commit 255133e into main Feb 17, 2025
@pyropy pyropy deleted the add-zinnia-spark-spot-check branch February 17, 2025 07:41
Successfully merging this pull request may close these issues.

First round of spot checks on SPs to validate Spark scores