Some random notes to keep in mind when refactoring the DAG modifier. Complemented by #5192.
Document the rules for writing and rewriting a file DAG: what can be expected in terms of DAG structure, block sizes, etc.? How is a DAG extended or modified?
Rename and document `expandSparse`: it is not really sparse, since all the zeros are actually written to the DAG (a tiny sketch of what this amounts to follows). (It would be interesting to add an encoding feature in UnixFS v2 to allow truly sparse file DAGs.)
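To make that concrete, here is a tiny self-contained sketch (the function name and signature are mine, not the actual implementation) of what the current behavior amounts to: the gap between the current size and the target offset becomes literal zero bytes, which are then chunked and written to the DAG like any other data.

```go
package main

import (
	"bytes"
	"fmt"
)

// materializeGap is illustrative only (a hypothetical helper, not the real
// expandSparse signature): the "sparse" region is produced as real zero
// bytes that get chunked and stored in the DAG like any other data.
func materializeGap(curSize, targetOffset int64) *bytes.Reader {
	return bytes.NewReader(make([]byte, targetOffset-curSize))
}

func main() {
	r := materializeGap(1024, 4096)
	fmt.Println("zero bytes that end up in the DAG:", r.Len()) // 3072
}
```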
`curNode`: similar to the one in the DAG reader, but actually used here; document where and how.
Entry point command: `ipfs files write`.
Any modification implies copying the node; we don't modify in place (is this correct?) because of the merkle DAG nature of the data. This should be obvious to the experienced reader, but it's worth clarifying, especially when functions like `WriteAt` are documented as "will modify a dag file in place". The modified node has to be re-added, so it's not in-place; or maybe "in-place" should be redefined, since merkle is not about the *where* but about the *what*.
Document and clarify that the write operations are buffered: `Sync` does the actual writing, while `Write` just writes to a buffer (that needs to be synced).
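For reference, a minimal sketch of the buffered-write flow, assuming the `mod.DagModifier` API from go-unixfs (import paths and exact signatures vary across versions, so treat this as illustrative rather than authoritative). It also illustrates the copy-on-write point above: after `Sync`, `GetNode` returns a new root with a different CID instead of mutating the old node.

```go
package main

import (
	"context"
	"fmt"

	chunker "github.com/ipfs/go-ipfs-chunker"
	mdtest "github.com/ipfs/go-merkledag/test"
	ft "github.com/ipfs/go-unixfs"
	"github.com/ipfs/go-unixfs/mod"
)

func main() {
	ctx := context.Background()
	dserv := mdtest.Mock() // in-memory DAG service, just for the example

	// Start from an empty UnixFS file node.
	root := ft.EmptyFileNode()
	if err := dserv.Add(ctx, root); err != nil {
		panic(err)
	}

	dmod, err := mod.NewDagModifier(ctx, root, dserv, chunker.DefaultSplitter)
	if err != nil {
		panic(err)
	}

	// WriteAt only fills the modifier's internal buffer; nothing hits the DAG yet.
	if _, err := dmod.WriteAt([]byte("hello"), 0); err != nil {
		panic(err)
	}

	// Sync flushes the buffered data into (new) DAG nodes.
	if err := dmod.Sync(); err != nil {
		panic(err)
	}

	// The result is a new root; the old node is untouched, so the
	// modification is copy-on-write rather than in-place.
	newRoot, err := dmod.GetNode()
	if err != nil {
		panic(err)
	}
	fmt.Println("old root:", root.Cid())
	fmt.Println("new root:", newRoot.Cid())
}
```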
The DAG modifier contains a DAG reader. Can these be combined into a single structure (and set to a read-only mode in the cases where only the DAG reader is used)?
The `appendData` function only works with the trickle layout, so a balanced DAG would be appended to under trickle-layout rules. Is a balanced append function needed?
The `modifyDag` function seems to be the core function that does the actual writing in a DAG. It is too long (100+ LoC). It has a base case for leaf nodes and then a case for internal nodes, where it iterates over the child sizes in a way similar to the reader (but it doesn't use the reader for this part; why? because it doesn't use the reader's offset?) and calls itself recursively. A simplified sketch of that recursion follows.
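This is a simplified, self-contained sketch with made-up types (not the real unixfs ones, and leaving out extension past the end of the file, i.e. the `appendData`/`expandSparse` path): the leaf base case overwrites bytes in the block's payload, and the internal-node case walks the cumulative child sizes, much like the reader does, to find the children the write overlaps, rebuilding the path to them.

```go
package main

import "fmt"

type node struct {
	data     []byte  // leaf payload (nil for internal nodes)
	children []*node // internal-node links
}

func (n *node) size() uint64 {
	if n.children == nil {
		return uint64(len(n.data))
	}
	var s uint64
	for _, c := range n.children {
		s += c.size()
	}
	return s
}

// writeAt returns a new tree with b written at offset; nothing is modified
// in place, mirroring the copy-on-write nature of a merkle DAG.
func writeAt(n *node, b []byte, offset uint64) *node {
	if n.children == nil { // base case: overwrite bytes in the leaf
		data := append([]byte(nil), n.data...)
		copy(data[offset:], b)
		return &node{data: data}
	}
	// Internal node: walk the cumulative child sizes (much like the reader
	// does) to find the children the write overlaps, and recurse into them.
	out := &node{children: append([]*node(nil), n.children...)}
	var cur uint64
	for i, c := range n.children {
		cs := c.size()
		if len(b) > 0 && offset < cur+cs {
			take := cur + cs - offset // bytes of b that fall inside this child
			if take > uint64(len(b)) {
				take = uint64(len(b))
			}
			out.children[i] = writeAt(c, b[:take], offset-cur)
			b, offset = b[take:], offset+take
		}
		cur += cs
	}
	return out
}

func main() {
	root := &node{children: []*node{{data: []byte("aaaa")}, {data: []byte("bbbb")}}}
	newRoot := writeAt(root, []byte("XYZ"), 3)
	fmt.Printf("%s%s\n", newRoot.children[0].data, newRoot.children[1].data) // aaaXYZbb
	fmt.Printf("%s%s\n", root.children[0].data, root.children[1].data)       // aaaabbbb (old tree untouched)
}
```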
Clarify the use of `writeStart` compared with `curWrOff`, which seems to be used only for reading (it doesn't leverage the reader's offset).
It may be useful to extract the `FSNodeOverDag` cache structure from the `builder` package into `unixfs` and start using it here. This may replace the `EncodeProtobuf` calls.
`Truncate` calls `dagTruncate` for the actual operation, which uses a recursive algorithm similar to `modifyDag`'s; that part of the code should be reused (and maybe even replaced with the reader's operations; a sketch of the truncation recursion follows). `expandSparse`, when `Truncate` is given an offset bigger than the current size, just uses `appendData` to fill with zeros.
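Reusing the illustrative `node` type from the sketch above (again, not the real code), the truncation recursion walks the child sizes in much the same way as `writeAt`, which is exactly the duplication this note points at; growing the file via `expandSparse`/`appendData` is left out.

```go
// truncate returns a new tree limited to size bytes: children wholly past
// the new size are dropped, the boundary child is truncated recursively,
// and a new parent is returned instead of mutating the old one.
func truncate(n *node, size uint64) *node {
	if n.children == nil { // base case: cut the leaf payload
		return &node{data: append([]byte(nil), n.data[:size]...)}
	}
	out := &node{}
	var cur uint64
	for _, c := range n.children {
		cs := c.size()
		switch {
		case cur+cs <= size:
			// Child entirely within the new size: keep it as-is.
			out.children = append(out.children, c)
		case cur < size:
			// Boundary child: truncate it recursively.
			out.children = append(out.children, truncate(c, size-cur))
		default:
			// Child starts at or after the new size: drop it.
		}
		cur += cs
	}
	return out
}
```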
`Seek`, when called with an offset bigger than the file size, ends up appending data. Is this the expected behavior from the Unix API?
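For comparison, the usual POSIX behavior on a regular file is that seeking past EOF does not by itself grow the file; the size only changes once something is written at the new offset, and the gap then reads back as zeros (typically as a sparse hole). A quick check with the standard library:

```go
package main

import (
	"fmt"
	"io"
	"log"
	"os"
)

func main() {
	f, err := os.CreateTemp("", "seek-demo")
	if err != nil {
		log.Fatal(err)
	}
	defer os.Remove(f.Name())
	defer f.Close()

	if _, err := f.Write([]byte("abc")); err != nil {
		log.Fatal(err)
	}

	// Seeking well past EOF does not change the file size by itself.
	if _, err := f.Seek(1000, io.SeekStart); err != nil {
		log.Fatal(err)
	}
	fi, _ := f.Stat()
	fmt.Println("size after seek:", fi.Size()) // 3

	// Only an actual write extends the file, leaving a zero-filled hole.
	if _, err := f.Write([]byte("x")); err != nil {
		log.Fatal(err)
	}
	fi, _ = f.Stat()
	fmt.Println("size after write:", fi.Size()) // 1001
}
```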