Add zinnia spark spot check #2
Conversation
@pyropy GitHub has a feature for this. If PR A depends on PR B, set the target branch of PR A to PR B's branch. Then, when PR B gets merged, GitHub automatically changes PR A to target the default branch.
Did not know about that, that's awesome! Thanks for letting me know 🙏🏻
Great work! 👏
lib/constants.js
Outdated
export const SPARK_VERSION = '1.15.0'
export const MAX_CAR_SIZE = 200 * 1024 * 1024 // 200 MB
export const APPROX_ROUND_LENGTH_IN_MS = 20 * 60_000 // 20 minutes
This constant can be removed since the spot checker isn't enforcing a CAR size limit anywhere
It is actually used to define the maximum size of the buffer that the CAR bytes are read into:
spark-spot-check/lib/spot-checker.js
Line 95 in 5ecb296
const carBuffer = new ArrayBuffer(0, { maxByteLength: MAX_CAR_SIZE })
Thanks, I didn't see that! Can it happen that we read more than MAX_CAR_SIZE into the buffer, and if so, what happens then?
Usually, when we try to read more bytes into the buffer than it can hold, the ArrayBuffer instance throws an error, for example:
RangeError: ArrayBuffer.prototype.resize: Invalid length parameter
at ArrayBuffer.resize (<anonymous>)
at SpotChecker.fetchCAR (file:///home/srdjan/Development/checker-network/spark-spot-check/lib/spot-checker.js:126:21)
at eventLoopTick (ext:core/01_core.js:178:11)
at async SpotChecker.executeSpotCheck (file:///home/srdjan/Development/checker-network/spark-spot-check/lib/spot-checker.js:75:5)
at async SpotChecker.spotCheck (file:///home/srdjan/Development/checker-network/spark-spot-check/lib/spot-checker.js:158:5)
at async SpotChecker.run (file:///home/srdjan/Development/checker-network/spark-spot-check/lib/spot-checker.js:183:24)
at async file:///home/srdjan/Development/checker-network/spark-spot-check/main.js:35:1
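For illustration, here is the resizable ArrayBuffer behaviour in isolation (a standalone sketch; the 1 KiB limit is just an example value):

```js
// A resizable ArrayBuffer can only grow up to its maxByteLength.
const buffer = new ArrayBuffer(0, { maxByteLength: 1024 })

buffer.resize(512) // fine, still within the limit
try {
  buffer.resize(2048) // larger than maxByteLength
} catch (err) {
  console.error(err) // RangeError (exact message varies by engine)
}
```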
I have reduced the size from 200 MB to 1 MB, as we're currently only reading 200 bytes (there is some overhead due to the CAR encoding, hence the larger upper limit of 1 MB).
I’ve removed this limit since we’re no longer loading the CAR file into memory. Instead, we're now reading a chunked stream returned by Lassie. We now use the maxByteLength parameter, which defines the threshold for the number of bytes downloaded before interrupting the retrieval.
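As a rough sketch of how such a byte threshold could be enforced on a streamed response (readUpTo, retrievalUrl, and the AbortController wiring are illustrative, not the actual spot-checker code):

```js
// Hypothetical helper: consume a streamed response, but abort the request
// once the number of downloaded bytes exceeds maxByteLength.
async function readUpTo (res, maxByteLength, controller) {
  let downloaded = 0
  for await (const chunk of res.body) {
    downloaded += chunk.length
    if (downloaded > maxByteLength) {
      controller.abort() // interrupt the retrieval
      break
    }
    // ...otherwise keep processing the chunk, e.g. feed it to a CarBlockIterator...
  }
  return downloaded
}

// Usage sketch (assumes top-level await)
const retrievalUrl = 'https://provider.example.com/ipfs/<root CID>' // placeholder
const controller = new AbortController()
const res = await fetch(retrievalUrl, { signal: controller.signal })
const downloaded = await readUpTo(res, 1024 * 1024, controller)
```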
lib/spot-checker.js
Outdated
// 20 is the digest length (32 bytes = 256 bits)
stats.carChecksum = '1220' + encodeHex(digest)
}
await verifyContent(cid, carBytes)
What exactly is verifyContent checking? Will it only verify that the root block payload in carBytes matches the hash in cid? IIUC, that's how I implemented this verification in Spark, where it makes sense, since we are fetching the root block only.
When we are fetching more than a root block, then it would be best to verify the entire content. Otherwise, the server can return a CAR with a root block followed by a random payload, and we will accept it as a valid response.
To verify the entire content, we need to download the entire Merkle tree (the full CAR file).
If we want to work with byte ranges, then we may be able to use IPFS HTTP Gateway feature where the client asks for a byte range in the file and the gateway is expected to send not only those bytes, but also additional content required to perform content-verification all the way up to the root cid.
Another option is to skip content verification. The downside is that without content verification, our spot check cannot tell whether the SP is serving the real content or just random bytes.
Let's discuss!
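For reference, a request for such a verifiable byte range might look roughly like this (based on my understanding of the IPFS Trustless Gateway spec; the gateway host, CID, and byte range are placeholders):

```js
// Ask the gateway for a CAR covering bytes 0..1048575 of the entity, plus
// every block needed to verify that range all the way up to the root CID.
const rootCid = 'bafy...' // placeholder root CID
const url = new URL(`https://gateway.example.com/ipfs/${rootCid}`)
url.searchParams.set('format', 'car')
url.searchParams.set('dag-scope', 'entity')
url.searchParams.set('entity-bytes', '0:1048575')

const res = await fetch(url, {
  headers: { Accept: 'application/vnd.ipld.car' }
})
```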
I see that you are already using entity-bytes below, so maybe my comment above is not relevant as it's based on an incomplete understanding of this pull request.
The PR looks great at the high level.
I think the biggest remaining question is how many bytes to retrieve while being able to do proper content verification. As we discussed in our call earlier today, one option is to use Lassie for all retrievals (both Graphsync and HTTP). Lassie performs content verification and can offload large payloads to disk.
I am concerned about having two subtly different copies of Spark's lib files. It's fine for a proof of concept, but once we move to a more production-ready version, I would like us to explore how to refactor the original Spark checker code to support both "regular" checks and these new "spot" checks. Ideally, the new spot checker should re-use Spark checker's lib files with no modifications needed. (We can even move the spot checker's entry point to the Spark checker repository, similarly to how we have manual-check.js, but that's open for discussion.)
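To make the "use Lassie for all retrievals" option more concrete: a retrieval through a locally running Lassie daemon could look roughly like this (the port, CID, and provider multiaddr are placeholders, and the query parameters should be double-checked against Lassie's documentation):

```js
// Hypothetical request to a local Lassie HTTP daemon: fetch the CAR for
// `cid` from a specific provider and let Lassie handle transport selection
// (Graphsync / Bitswap / HTTP) as well as content verification.
const cid = 'bafy...' // placeholder payload CID
const providerAddress = '/dns/provider.example.com/tcp/443/https' // placeholder
const lassieUrl = new URL(`http://127.0.0.1:62156/ipfs/${cid}`)
lassieUrl.searchParams.set('providers', providerAddress)

const res = await fetch(lassieUrl, {
  headers: { Accept: 'application/vnd.ipld.car' }
})
```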
I think that for now we should piggyback on the Lassie retrievals, as it enables us to perform content verification. What I have found is that we're able to perform validation over each chunk that we receive in the response:

const reader = await CarBlockIterator.fromIterable(res.body)
for await (const block of reader) {
  await validateBlock(block)
}

I am not sure if this is correct, but in case it is, we can either perform a full retrieval or cancel the retrieval after some set period while we validate the streamed blocks on the fly.
I completely agree with you on this one. The original idea was to add the spot check to the Spark codebase, but I was afraid of introducing too many changes at once to such a critical part of the codebase, so I decided to create a new repository as a PoC. I think we can move it to the main codebase once we are sure that spot checks make sense and that the ones we perform are correct.
It looks correct to me. IIRC, Lassie is re-ordering the blocks inside the returned CAR file to enable streaming validation like the one you are performing in your for-await loop. I believe Lassie already performs content verification; therefore, it's not strictly necessary to do content verification inside the spot checker. It should not harm, though.
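For completeness, a minimal sketch of what the validateBlock helper from the snippet above could do, assuming the blocks are hashed with sha2-256 (this is a hypothetical implementation, not an existing library function):

```js
import { sha256 } from 'multiformats/hashes/sha2'

// Hypothetical helper: recompute the block's hash and compare it with the
// digest embedded in the block's CID.
async function validateBlock ({ cid, bytes }) {
  if (cid.multihash.code !== sha256.code) {
    throw new Error(`Unsupported hash function: 0x${cid.multihash.code.toString(16)}`)
  }
  const hash = await sha256.digest(bytes)
  if (!bytesEqual(hash.digest, cid.multihash.digest)) {
    throw new Error(`Invalid block: bytes do not match CID ${cid}`)
  }
}

function bytesEqual (a, b) {
  return a.length === b.length && a.every((byte, i) => byte === b[i])
}
```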
lib/multiaddr.js
Outdated
I think we can remove this file and not call validateHttpMultiaddr() since we are delegating all retrievals to Lassie. We can let Lassie validate the multiaddr.
What do you think?
The only downside I see is that the spot-checker may be able to retrieve successfully from a multiaddr value that the real Spark checker does not support, which can be confusing.
So maybe it's better to keep the validation check in place 🤷🏻
I originally removed this in favor of Lassie's validation logic but later decided to add it back. When given an invalid multiaddr, Lassie responds with a 400 status code and a generic error message like invalid providers parameter, without providing much detail about the issue.
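As a rough illustration of the kind of check being kept here (a hypothetical sketch, not the actual validateHttpMultiaddr from Spark's lib):

```js
// Hypothetical sketch: accept only multiaddrs that describe an HTTP(S)
// endpoint, e.g. /dns/provider.example.com/tcp/443/https
function validateHttpMultiaddr (multiaddr) {
  // multiaddr.split('/') yields an empty string before the leading slash
  const [, hostType, host, transport, port, scheme] = multiaddr.split('/')
  if (!['ip4', 'ip6', 'dns', 'dns4', 'dns6'].includes(hostType)) {
    throw new Error(`Unsupported host type in multiaddr: ${hostType}`)
  }
  if (transport !== 'tcp' || !port) {
    throw new Error(`Unsupported transport in multiaddr: ${transport}`)
  }
  if (!['http', 'https'].includes(scheme)) {
    throw new Error(`Multiaddr does not describe an HTTP endpoint: ${multiaddr}`)
  }
  return { host, port: Number(port), scheme }
}
```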
lib/spot-checker.js
Outdated
}

/**
 * @param {object} args
Indentation looks way off 😅
This applies to more jsdoc comments
Please in the future review PRs before submitting them :)
I will continue review after the indentation has been fixed, because that will make the diff easier to read
I ran npx standard --fix, and that’s the result 😅
By the way, what have you been using on Spark?
I've reformatted everything using VSCode and then ran npx standard --fix again, and now it seems fine. I've been using Neovim for the past couple of days, so maybe that's the issue.
I haven't been using any code formatter, but I think @bajtos uses one
Not sure how this got messed up 😅
IIRC, standard has no opinions on whitespace in code comments. It annoys me; I hope we will find a better solution as part of CheckerNetwork/roadmap#166.
Adds a basic utility for performing spot checks on Spark retrievals. By default, it performs a full retrieval of all deals for the current round.
The implementation plan is specified here.
Closes CheckerNetwork/roadmap#226