Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

request to support automatic interning of Built objects #750

Open
dave-doty opened this issue Nov 16, 2019 · 6 comments
Open

request to support automatic interning of Built objects #750

dave-doty opened this issue Nov 16, 2019 · 6 comments
Assignees

Comments

@dave-doty
Copy link

dave-doty commented Nov 16, 2019

It would be great to have an option to support automatic interning of Built objects. (caching them so that objects that are equal according to == are not duplicated, speeding up equality comparisons)

In Dart 2.6, it is possible to use extension methods to add a intern() method to Built types:

Set<Built> _cache_built_value = {};

extension BuiltExtensionIntern<V extends Built<V, B>, B extends Builder<V, B>> on Built<V, B> {
  V intern() {
    Set<Built> cache = _cache_built_value;
    Built<V, B> cached_value = cache.lookup(this);
    if (cached_value == null) {
      cache.add(this);
      return this;
    } else {
      return cached_value;
    }
  }
}

Now, I can call intern() on instances of Built, e.g.,

var myValue = (MyValueBuilder()..field1="abc").build().intern();

However, it would be great if this could be automated to happen on every call to build(). For example, if I have a nested builder, then I have to assign a Builder object to it, not a Built, so it's not an option for my code to call intern() immediately after. The call to build() happens in the library code where I cannot control it.

One option is a special subclass of Built that could be implemented instead of Built?

Of course the same thing would be useful for BuiltList, BuiltMap, etc.

@davidmorgan
Copy link
Collaborator

You could use the new _finalizeBuilder functionality. It's a little obscure, but if you build then replace, the builder will use that instance rather than building again.

static void _finalizeBuilder(void Function(MyBuilder b)) {
  b.replace(intern(b.build()));
}

I'd like to understand your use case a little better--aren't you worried about memory consumption?

@dave-doty
Copy link
Author

You mean I should add that static method as you typed it into each class I want to be automatically intern()'ed?

aren't you worried about memory consumption?

Why would memory consumption be a problem? It should actually be better this way (other than the overhead of the Set instance), because there will never be two objects both occupying memory that are == to each other.

My use case is that I'm using many instances of Built, BuiltList, BuiltSet, and BuiltMap to represent state in a redux store to use with OverReact:

https://pub.dev/packages/over_react
https://github.com/Workiva/over_react/blob/master/doc/over_react_redux_documentation.md

The main way that expensive DOM re-renders are avoided is by comparing objects for equality and skipping the re-render of a component when its new state data is == to its old state data (which in turn can skip recursively checking all the fields if the two objects are actually the same object). I haven't really gotten to the point yet were I've profiled to see how much time this would save, but it seems a straightforward idea to support.

In fact, if the library supported automatic guaranteed intern()'ing of all built_value and built_collection objects, then the library could assume that two objects are == if and only if they are the same object, so all equality tests would be a single step.

@davidmorgan
Copy link
Collaborator

Yes, adding that static method into all the classes should do it.

The memory problem is that the cache does not know when you're done using a particular object, so if you make lots of modifications to objects then the cache will fill up.

built_collection anyway caches hash codes and uses these to avoid comparing when hash codes are different; I don't know how much more interning would save compared to this. built_value does not currently cache hash codes or use them in this way.

I'd definitely be interested to learn how interning behaves in practice!

@dave-doty
Copy link
Author

Ah, that's a good point about building up old values in the cache and taking up memory. I suppose that's why Java can automatically intern() Strings, because it can talk to the garbage collector and ensure that Strings can be garbage collected once the only reference that can reach them is the cache.

@davidmorgan
Copy link
Collaborator

Yeah. So I wonder if we need some kind of eviction strategy.

It would be very interested to see a real world benchmark :)

@dave-doty
Copy link
Author

dave-doty commented Nov 29, 2019

@davidmorgan I tried your suggestion for _finalizeBuilder above in a built_value class called BoundSubstrand. Here is my type signature for the intern function (I made it a top-level function instead of an extension method):

V intern<V extends Built<V, B>, B extends Builder<V, B>>(V value) {...}

However, when I add this code (I expanded out the lines you wrote a bit to help me track down the error):

static void _finalizeBuilder(void Function(BoundSubstrandBuilder builder)) {
  BoundSubstrand bss = builder.build();
  BoundSubstrand bss_interned = intern(bss);
  builder.replace(bss_interned);
}

I get the following error:

Error in BuiltValueGenerator for abstract class BoundSubstrand implements Built<BoundSubstrand, dynamic>, Substrand.Please make the following changes to use BuiltValue:

1. Fix _finalizeBuilder signature: static void _finalizeBuilder(BoundSubstrandBuilder builder)

I guessed that maybe the signature as you write it was a typo (since I assume the named parameter builder is a parameter to _finalizeBuilder, not to the function being passed in as a parameter to _finalizeBuilder), so I tried this (note the slightly different parentheses location near the right end):

static void _finalizeBuilder(void Function(BoundSubstrandBuilder) builder)

but that generates the same compiler error.

I tried changing the signature to the one suggested by the compiler error, but then I get this error:

Error: Inferred type argument 'Builder<BoundSubstrand, dynamic>' doesn't conform to the bound 'Builder<V, B>' of the type variable 'B' on 'intern'.
 - 'Builder' is from 'package:built_value/built_value.dart' ('packages/built_value/built_value.dart').
 - 'BoundSubstrand' is from <my code; path omitted>
Try specifying type arguments explicitly so that they conform to the bounds.
    BoundSubstrand bss_interned = intern(bss);
                                  ^
packages/scadnano/src/built_intern.dart:9:33: Context: This is the type variable whose bound isn't conformed to.
V intern<V extends Built<V, B>, B extends Builder<V, B>>(V value) {
                                ^

A workaround to make the compiler happy is to pass a Builder to the interning function, instead of a Built:

V build_interned<V extends Built<V, B>, B extends Builder<V, B>>(B builder) {
  V value = builder.build();
  ... // similar to code implementing build()
}

But when I create an object of type BoundSubstrand:

BoundSubstrand bss = BoundSubstrand((b) => b
  ..field1=value1
  // etc.
);

there is infinite recursion.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants