Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make trust graph consider paths (aka flow). #45

Open
dpc opened this issue Dec 8, 2018 · 13 comments
Open

Make trust graph consider paths (aka flow). #45

dpc opened this issue Dec 8, 2018 · 13 comments
Labels
design How should it work enhancement New feature or request

Comments

@dpc
Copy link
Collaborator

dpc commented Dec 8, 2018

This is related to #44 .

Right now WoT graph is build by just a cost-bounded flooding of a graph. This makes it possible for anyone to create a new CrevId, trust it, and this way artificially increase the possible count of reviewes for a given crate.

This algorithm should keep track of path (id(s) of that directly trusted this one), on each step to the root of the trust tree, and so that when calculating the count of reviews, it's possible to "merge" reviews coming out from a common path, for the purpose of calculating the total trust count.

Example: If you directly trust only one other CrevId, you can only have trust count equal to 1, for any given crate, no matter how many people reviewed it.

@dpc dpc added enhancement New feature or request design How should it work labels Dec 8, 2018
@dpc dpc changed the title Make trust grap consider paths. Make trust graph consider paths. Dec 9, 2018
@dpc dpc added this to the 0.1 - MVP: cargo-crev/cargo-trust milestone Dec 11, 2018
@dpc
Copy link
Collaborator Author

dpc commented Jan 5, 2019

To explain:

             /- b
root -> a - c
             \- d

In this case, if both b and d create a review, it will count only as one review, to prevent a from being able to create and trust fake IDs, to increase review count.

@ffranr
Copy link
Member

ffranr commented Jan 5, 2019

it will only count as one review

Am I correct in saying that the count, in this case, is made from a point of view?

Shouldn't we avoid trusting nodes who we suspect of trusting and creating fake nodes anyway?

@dpc
Copy link
Collaborator Author

dpc commented Jan 6, 2019

Shouldn't we avoid trusting nodes who we suspect of trusting and creating fake nodes anyway?

Of course. But trust is sometimes broken, and accounts get compromised. Currently I believe that after crev gets popular, everyone should require at least 2 reviews to trust anything. And such review should be "uncorrelated". This issue is a method to make sure no account can increase the review count by creating sock-puppet accounts.

@pimotte
Copy link
Contributor

pimotte commented Jan 7, 2019

I think this would be solved by computing the max-flow from the root to each package, where people are nodes. Does this match what you're looking for?

@dpc
Copy link
Collaborator Author

dpc commented Jan 7, 2019

@pimotte I am not sure which algorithm you refer to, and I am not that well versed with graph algorithms. After brief looking at https://en.wikipedia.org/wiki/Maximum_flow_problem, I guess one could say I'm looking for max-flow, if all edges have flow of 1, the source is root of the WoT, and all trusted users that have reviewed the package are sinks.

@dpc
Copy link
Collaborator Author

dpc commented Aug 28, 2019

Something to consider regarding max flow algorithm:

http://lists.libre-riscv.org/pipermail/libre-riscv-dev/2019-August/002600.html

@dpc dpc changed the title Make trust graph consider paths. Make trust graph consider paths (aka flow). Aug 28, 2019
@dpc
Copy link
Collaborator Author

dpc commented May 24, 2020

I've read http://lists.libre-riscv.org/pipermail/libre-riscv-dev/2019-August/002600.html and I pretty much agree with everything there.

This might affect #329 .

I was naively thinking about implementing a proper max-flow algorithm, but something like tracking number for each node, and splitting it between trusted 2nd-level node is probably easier and can be done breadth-first.

TODO:

@pimotte
Copy link
Contributor

pimotte commented May 24, 2020

Could you elaborate a bit on what you mean with "tracking number"?

Other thoughts:

#329 basically introduces the concept of "negative trust". The reporter doesn't phrase it as such, but if your network is saying someone has a certain level of trust and you want to downgrade that, that is what is going on. Introducing negative trust in the core mechanism will likely lead to an even more complex system. Advogato with a max-flow algorithm that works breadth-first seems like the perfect way to go about this. Personally, if that yields unexpected results for the user, I'd see that as a UX problem, rather than an algorithms problem. Those then could be resolved by hard-capping certain results based on reviews from the user or possbily allowing the user insight into why this result came to be, which might lead to a re-evaluation of the trust they have expressed in other parties.

This last bit is actually the counter-intuitive part about all these mechanisms. If I trust Alice and Alice trusts Eve, but I don't trust Eve, than that should lead me to re-evaluate my trust in Alice. If a crev result shows me this, it's very easy to say "oh, but I want to downgrade Eve, so let's tell that to crev". If the user can see that crev is trusting Eve because of Alice, then I get to choose: I downgrade my trust in Alice, since her judgment of Eve affects it, or I adjust my assessment of Eve, because I trust Alice's judgment of her above my own. In neither case I'll input something directly related to Eve, even though she "started" this.

@dpc
Copy link
Collaborator Author

dpc commented May 24, 2020

but something like tracking number for each node

That's just my way of summing up my shallow understanding of Advogato - keep track of how much flow each node has (some number).

@dpc
Copy link
Collaborator Author

dpc commented May 25, 2020

After reading through http://www.squarefree.com/trust/trust.pdf

Looking at these trust metrics etc. I think there are some fundamental differences between how they work, and how I perceive a problem.

First, for me the trust is not binary. For some reason most (all?) the existing trust solutions are binary. Something is either trustworthy or not. In crev, there are levels (and maybe also "confidence"). So the trust is more nuanced. Even without any flow analysis, the trust level can be easily made to "dissipate" as the distance from the root grows.

Then, I consider everything subjective and want to embrace that subjectivity. There is no one global "this is trustworthy" vs "this isn't". It's all opinions, and depending on circumstances two different agents, with same knowledge, can reach different conclusions about what is trustworthy.

Third, a combination of the previous two: there is no fixed point where something is trustworthy. Even the same person, with the same WoT, can run multiple verification runs, with different parameters (trading: depth, required trust-level, required understanding level, required redundancy, etc for each other), and look for the most susceptible pieces of code to review themselves.

If I trust Alice and Alice trusts Eve, but I don't trust Eve, than that should lead me to re-evaluate my trust in Alice.

The subjectivity is the reason why I don't think it's necessarily the case. It would make sense if we considered everything objective. But if it's subjective, then it's totally fine to have a different opinion. With the confidence stuff, there will be a way to possibly overwrite inferred trust if needed. But that will require a conscious decision of someone analyzing their WoT.

Originally I wanted flow analysis pretty much just to make the redundancy option (number of required reviews) to be attack-proof. So that someone in the WoT can't just create sock-puppets to create "copies" of the review that would count as many different reviews.

So my thinking was - after finding all the reviews, just run some kind of a max-flow algorithm from all the matching reviews to the root, and drop/merge multiple reviews coming through the same trust-bottleneck to count as one.

@dpc
Copy link
Collaborator Author

dpc commented May 25, 2020

The problem I'm looking at might not be a max-flow algorithm... . I'm only looking at flow of 1 everywhere, and I'm not interested in the flow value, but at actually pruning "duplicates". A naive version seems fast and simple:

  • when flooding the graph from root, build "previous node" list for each node
  • for every node with a review (sorted from closest to root to most distant) "traverse the path back to root":
    • go node by node to the root, and remember which nodes you've visited avoiding nodes that were already visited by any other previous iteration
  • nodes for which the traversal back to root can't succeed because the previous nodes were already used, must be merged/discarded.

However the result is not "stable" and depends on the order of how nodes were sorted, and how they picked the previous node to use.

But it seems to me that if every time when taking a previous node while having a choice of some other previous nodes, we note the unused previous nodes on the used previous node and allow subsequent "traverse the path back to root" to use the previous nodes that weren't taken like they were previous nodes of the currently consider node itself ... things might work OK? It seems to fix the problem where traversal uses a path that is needed for another traversal, instead of a path that no one else uses.

It's late and I'm probably rumbling, but maybe myself I'll be able to decipher it, once I get back to it.

@dpc
Copy link
Collaborator Author

dpc commented Apr 24, 2022

@dpc
Copy link
Collaborator Author

dpc commented Jun 9, 2022

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
design How should it work enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

3 participants