Replies: 4 comments
-
Thanks for raising this, @alwilson. Manipulating a SAT solver to produce randomly (or somewhat-randomly) distributed solutions is a very interesting topic, given the availability of great SAT/SMT solvers. A big challenge, in my opinion, is figuring out how to align the characteristics of the sampling techniques with the characteristics of randomization in a typical verification environment. To help inform this discussion, here is a brief description of the technique that PyVSC currently uses:
Initially, PyVSC selected ranges based on the approximated reachable variable domains. Unfortunately, that resulted in relatively poor randomness. One of the biggest challenges to adapting some of the "sampling" techniques described in papers (there were quite a few out of Stanford a couple years back) is the way that random stimulus generation is typically performed.
Worst case, a tool would perform the initial work of setting up a sampling problem 100 times over and, each time, only return the initial result. So, a bit of background there... If you're interested in trying to apply the region-sampling technique to PyVSC, I would be very interested. The 'swizzler' code is a separate module to allow experimentation, so we could even (for example) allow the swizzling/sampling technique to be selected on a per-randomization basis. One thing I can see you needing is the ability to associate some 'cache' with a randomizable object that you can set up on the initial randomization and reuse on subsequent randomizations. Are you interested in working toward some sort of prototype here? I'd definitely be interested in building in some 'hooks' in PyVSC to support that work.
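To make the 'cache' idea concrete, here is a minimal sketch; the helper and attribute names are hypothetical, not an existing PyVSC hook:

```python
# Hypothetical sketch (not an existing PyVSC API): expensive sampling setup
# runs on the first randomization of an object and is reused afterwards.

def get_sampler_cache(obj, build_fn):
    """Return obj's cached sampling data, building it on first use."""
    cache = getattr(obj, "_sampler_cache", None)
    if cache is None:
        cache = build_fn(obj)   # e.g. solver range queries, compiling a checker
        setattr(obj, "_sampler_cache", cache)
    return cache
```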
-
Definitely! I'm glad you're interested in it. Integrating it and making use of the existing test suite would be very useful. A per-randomization option would be good. We could even have fallback conditions: try one method and, if it hits some deterministic failure point, bail out to another method. That could be a good hybrid solution where you have multiple solvers that vary in speed and scope, so easy, trivial constraints get solved quickly, but a solver fails fast enough that other solvers can take over.

So far, the region sampler spends a good chunk of time querying Z3 for the min/max, compiling the Python function, and doing a little region bisecting and elimination with Z3. But once that's done, it can hit up to 100k randomizations per second just by running the Python function and avoiding the SMT-solver overhead. That all depends on how well the region mapping worked: I think it applies really well to some constraints and very poorly to others. I would imagine using each set or partition of constraints as a key in a cache (in memory or on disk) could speed things up quite a bit and persist between runs, especially if you're running multiple tests that share the same base set of constraints.

What was the impetus behind using Boolector? I stumbled upon Z3 after a Hacker News article and got used to its Python interface. I've mostly just used it for fun and some Advent of Code, and found one good use for it in a script at work. I imagine there are some licensing, feature, and performance differences. I'm also curious how both fare at debugging. I haven't done much in that regard with Z3, but I've run into tough constraint problems at work before, which is another reason why I'm interested in PyVSC and having open-source tooling, so I can better understand constraint violations and distribution issues.

Thanks for sharing what PyVSC is currently doing. I'll have to explore the code some more now with that context!
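To illustrate the fast path I mentioned above, here is a rough sketch (assumed shapes, not the actual smt2_rand_sampler code): once the per-variable ranges have been queried and the constraints compiled into a plain Python predicate, each sample is just a few `randint()` calls plus one function call, with no SMT solver in the loop.

```python
import random

def sample(ranges, check, max_tries=1000):
    """ranges: dict of var name -> (lo, hi); check: compiled constraint predicate."""
    for _ in range(max_tries):
        candidate = {v: random.randint(lo, hi) for v, (lo, hi) in ranges.items()}
        if check(**candidate):          # plain Python call, no solver involved
            return candidate
    raise RuntimeError("hit rate too low for these constraints")

# Example with a dense constraint (a + b < 100 over small ranges):
# sample({"a": (0, 99), "b": (0, 99)}, lambda a, b: a + b < 100)
```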
-
Okay, sounds like I definitely need to dig into your code to understand the approach a bit more. Can you point me to any papers, etc. that describe the technique you're using? My interpretation is that, fundamentally, you're using an SMT solver to build a problem-specific constraint propagator. While applying this approach to optimizing runtime randomization (i.e. as a sort of JIT solver) looks very promising, there are other applications that could also benefit from it (e.g. long-running validation tests, where a traditional solver is too heavyweight).

What limitations does this approach have in terms of operators? For example, does it work best on continuous functions (i.e. not work well with if/else, or with non-linear operators)? I think you mentioned something about working better with dense vs. sparse solution spaces. Is there anything we can do to quantify this, so as to identify constraint systems that work well vs. those that are suboptimal or might not work at all?

With PyVSC, and random generation in general, there are two technology directions I'm keeping in mind. One is the current PyVSC, which is pure Python around the natively-implemented Boolector solver. The other is PyVSC 2 (for lack of a better name), which will be a thin pure-Python modeling layer around a native random-solver core library (libvsc). libvsc does the 'heavy lifting' of constraint-system manipulation more efficiently in C++ than Python can, and is also designed with the use of multiple solvers in mind.

With respect to the first direction (adding this technique to the current PyVSC), I've been giving the approach more thought. Each PyVSC random class object has a companion random-state object that ensures random stability and gives the user control over, for example, snapshotting and restoring the random state of an object. I'm thinking this is a good location to place cached data used by a technique like the one you're working with. For experimentation purposes, I'm also thinking that we could allow a 'swizzler' class to be specified via the randomization API to customize how that process is approached. It would be up to the swizzler class to query the randstate object for cached data.

With respect to Boolector, I selected it based on two criteria. First, the API was quite straightforward. Second, the performance seemed to be quite good. I believe I started using Boolector the year it placed quite highly in the annual SMT competition. I had also been using SymbiYosys for property checking, and noticed a significant performance increase when moving from Z3 to Boolector. Hope this helps a bit in terms of initial motivations/justifications for Boolector, as well as what might be coming. Looking forward to any resources you can point me to that detail the sampling approach you're currently taking.
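For what it's worth, one possible shape for that swizzler experiment could look like the sketch below; every name here (SwizzlerBase, get_cache/set_cache, the swizzler= argument) is hypothetical, not current PyVSC API.

```python
# Hypothetical sketch of a pluggable swizzler that keeps its cached data on
# the randstate object rather than in global state.

class SwizzlerBase:
    def randomize(self, randstate, randsets):
        raise NotImplementedError

class RegionSamplingSwizzler(SwizzlerBase):
    CACHE_KEY = "region_sampler"

    def randomize(self, randstate, randsets):
        cache = randstate.get_cache(self.CACHE_KEY)     # assumed hook
        if cache is None:
            cache = self.build_regions(randsets)        # expensive one-time setup
            randstate.set_cache(self.CACHE_KEY, cache)
        return self.sample(cache, randstate)

    def build_regions(self, randsets):
        ...   # e.g. solver min/max queries, compiling a checker

    def sample(self, cache, randstate):
        ...   # fast-path sampling using the cached data

# Selection could then be made per-randomization, for example:
#   obj.randomize(swizzler=RegionSamplingSwizzler())
```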
-
I thought of my approach while reading the related-work section from this paper, which mentioned "interval-propagation-based sampling". I actually forget what method the paper proposed now, but I found that related-work section fascinating. I thought that a simple, low-effort way to randomly sample would be to calculate a valid interval or range of values for each variable with an SMT solver (Z3's min/max optimizer features), and then, outside the SMT solver, randomize variables in that range and check them against the constraints as a compiled function. If it passes, use it; if it fails, throw it away. Essentially a shotgun approach. I have some other functionality in there to split the variable ranges into sub-regions and then use the SMT solver again to find the new valid ranges within each sub-region. I think that can help improve the hit rate, but admittedly this is a pretty simple technique. It originally crossed my mind that generating numbers and checking a compiled function could potentially be done millions of times per second, so I went about writing up the basics to test it out. I mostly wanted to get my toes wet before diving into something else. I do think it might be useful as a first-pass sort of sampler, though. Oh, and it does have the added benefit that its sampling is very uniform: it samples the entire range of values as uniformly as the random.randrange() function does. A simple example of a constraint that this method struggles with is something as simple as

That section on "interval-propagation-based sampling" describes a system where the sampler finds the valid range for each variable, randomly selects and randomizes a single variable, and then works its way back through the constraints to calculate the new range for the next randomly-selected variable, and so on. I was thinking of exploring that some more and how to write up something like it. I think that's considerably more complicated, though. That paper is pretty old too, which makes me wonder what newer papers or examples might be out there.

I played around with randomizer.py a day ago to see better how it works. I wrote up a simple cache using the constraint pretty-printer string as a key in a dictionary. I was curious whether the Boolector and RandInfo objects could be cached, but I think there's some randomization information stored in the randsets, along with references to the Boolector object, that makes them hard to cache. I think the hard/soft constraint checking could be cached, since for a particular set of constraints that will always be the same. I looked at some cProfile results using flameprof, and it looks like about 1/3 of the runtime is spent in Boolector and another 1/3 visiting/building constraints.

All that said, I noticed that the current method already does all the work of checking whether the hard constraints are SAT and then which soft constraints are also SAT. I believe the same method I'm using to speed up constraint checking could be applied to your swizzling method. At least, I didn't see where the btor model values were being queried; I need to go back through your description of the swizzling and the code and see if that's true. Maybe using Boolector to prove that there are solutions and then building a compiled function to quickly test bit ranges would work with the swizzle flow?

I had seen your libvsc repo earlier and was curious what it was! Yeah, I think Z3 has some really neat features, but I think that impacts performance and probably goes beyond what's needed for this sort of problem.
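Coming back to the range query itself: in z3py it looks roughly like this (an illustrative sketch, not the exact code in my repo; the variables and constraints are just for the example).

```python
from z3 import Int, Optimize, sat

# Per-variable min/max query using Z3's Optimize engine.
def var_range(var, constraints):
    """Return (min, max) of `var` subject to `constraints`."""
    bounds = []
    for maximize in (False, True):
        opt = Optimize()
        opt.add(constraints)
        if maximize:
            opt.maximize(var)
        else:
            opt.minimize(var)
        assert opt.check() == sat
        bounds.append(opt.model().eval(var).as_long())
    return tuple(bounds)

a, b = Int("a"), Int("b")
cs = [a > 10, b >= 0, b <= 255, a + b < 100]
print(var_range(a, cs))   # -> (11, 99)
```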
-
While filing #161, @mballance mentioned 'bit swizzling', and I poked a little at the code where it gets mentioned as well. I'm curious how it works and what other methods are out there or have been explored. I've seen some other projects, such as CRAVE and SMTSampler, and I've also found a few papers out there vaguely describing how other systems work.
I also bring this up because in #161 I found that SV constraint solvers were on the order of 100-1000x faster than pyvsc, and I'd love to find ways to help close the gap for pyvsc, as well as to further tackle distribution issues.
I've done some exploration with Z3's optimization engine on what I call "region sampling", which seems to do fairly well at producing a uniform distribution at rates on par with SV. It fails for constraints that are too sparse or not 'square'. I want to explore using those regions and backtracking through constraints to find new regions (see the sketch after the link below), but I'm curious what feedback or direction @mballance has for pyvsc.
https://github.com/alwilson/smt2_rand_sampler
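As a rough sketch of the "backtracking through constraints to find new regions" idea (using Z3 directly; a hypothetical example, not the code in the repo above): pick a variable, compute its currently valid range, pin a random value from that range, and repeat with the pin added as a constraint.

```python
import random
from z3 import Int, Optimize, sat

def sample_one(variables, constraints):
    """Fix variables one at a time, recomputing each one's valid range."""
    pins = []
    for var in random.sample(variables, len(variables)):   # random order
        lo_hi = []
        for maximize in (False, True):
            opt = Optimize()
            opt.add(constraints + pins)
            if maximize:
                opt.maximize(var)
            else:
                opt.minimize(var)
            assert opt.check() == sat
            lo_hi.append(opt.model().eval(var).as_long())
        # The range is only an over-approximation for non-interval constraints,
        # so a real implementation would need to detect a bad pin and retry.
        pins.append(var == random.randint(*lo_hi))
    return pins

a, b = Int("a"), Int("b")
print(sample_one([a, b], [a > 0, b > 0, a + b < 100]))
```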