StandardMaterial causes long delays on Firefox with WebGL #18142

Open
ashivaram23 opened this issue Mar 3, 2025 · 2 comments
Labels
C-Bug: An unexpected or incorrect behavior
S-Needs-Triage: This issue needs to be labelled

Comments

@ashivaram23

Bevy version

0.15.3

Relevant system information

AdapterInfo { name: "Apple M1, or similar", vendor: 4203, device: 0, device_type: IntegratedGpu, driver: "", driver_info: "WebGL 2.0", backend: Gl }

Browser is Firefox 134.0, "WebGL 2 Driver WSI Info" in about:support says it's using CGL

What you did

Open any Bevy project that uses StandardMaterial in Firefox with WebGL. This includes most of the WebGL examples on the official Bevy examples page, such as https://bevyengine.org/examples/3d-rendering/3d-scene/.

Alternatively, run any Bevy WebGL project that uses a custom material with a large uniform array of large structs:

main.rs
use bevy::{
    prelude::*,
    render::render_resource::{AsBindGroup, ShaderRef, ShaderType},
};

#[derive(Clone, Copy, Default, ShaderType)]
struct LargeStruct {
    a: f32,
    b: f32,
    c: f32,
    d: f32,
    e: f32,
    f: f32,
    g: f32,
    h: f32,
    i: f32,
    j: f32,
    k: f32,
    l: f32,
    m: f32,
    n: f32,
    o: f32,
    p: f32,
}

#[derive(AsBindGroup, Asset, Clone, TypePath)]
struct TestMaterial {
    #[uniform(0)]
    struct_array: [LargeStruct; 200],
}

impl Material for TestMaterial {
    fn fragment_shader() -> ShaderRef {
        ShaderRef::from("test.wgsl")
    }
}

fn main() {
    let mut app = App::new();
    app.add_plugins((DefaultPlugins, MaterialPlugin::<TestMaterial>::default()))
        .add_systems(Startup, setup)
        .run();
}

fn setup(
    mut commands: Commands,
    mut meshes: ResMut<Assets<Mesh>>,
    mut materials: ResMut<Assets<TestMaterial>>,
) {
    commands.spawn((Camera3d::default(), Transform::from_xyz(0.0, 0.0, 10.0)));
    commands.spawn((
        Mesh3d(meshes.add(Cuboid::new(4.0, 4.0, 4.0))),
        MeshMaterial3d(materials.add(TestMaterial {
            struct_array: [LargeStruct::default(); 200],
        })),
    ));
}

test.wgsl
struct LargeStruct {
    a: f32,
    b: f32,
    c: f32,
    d: f32,
    e: f32,
    f: f32,
    g: f32,
    h: f32,
    i: f32,
    j: f32,
    k: f32,
    l: f32,
    m: f32,
    n: f32,
    o: f32,
    p: f32,
}

@group(2) @binding(0) var<uniform> struct_array: array<LargeStruct, 200>;

@fragment
fn fragment() -> @location(0) vec4f {
    return vec4(struct_array[0].a, vec3(1.0));
}

What went wrong

The browser takes several seconds to link the WebGL program. This blocks the main thread and causes a long delay when loading the page or whenever shaders are recompiled.

Additional information

Any browser that uses OpenGL to implement WebGL runs into this problem. In addition to desktop Firefox on Mac and Linux, this includes desktop Chrome/Safari whenever the WebGL backend is manually set to OpenGL, as well as Android Chrome/Firefox on at least some devices. Profiling shows that most of the time is spent in calls to glGetActiveUniform or glGetActiveUniformsiv.

What seems to be going on is:

  • While linking a WebGL program, the browser has to collect information about each active uniform with calls to glGetActiveUniformsiv and glGetActiveUniform. OpenGL treats each member of a struct as a separate uniform resource (see OpenGL introspection documentation), so a large array of large structs can produce a very large number of uniforms to query.
  • The StandardMaterial PBR shader accesses clusterable_objects.data in pbr_lighting.wgsl. When storage buffers are unavailable, this is a uniform array of 204 ClusterableObject structs, each with 11 members as defined in mesh_view_types.wgsl. In Firefox's implementation, that makes 204 × 11 = 2244 uniforms. Five glGetActiveUniformsiv calls each have to check all 2244 of them, and there are a further 2244 calls to glGetActiveUniform.
  • Parts of a uniform array can be excluded from the list of active uniforms if the shader compiler finds that they're unused, but when the array is in a uniform block, all of its uniforms are apparently counted as active (this might be implementation dependent). Naga translates WGSL uniform buffers to GLSL ES uniform blocks, so all members of all 204 entries are queried if any one of them is accessed.
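
To make the arithmetic in the list above concrete, here is a minimal sketch in plain Rust. The function names are made up for illustration and are not part of Bevy, wgpu, or any browser. The model of one batched glGetActiveUniformsiv pass covering every uniform, repeated five times, plus one glGetActiveUniform call per uniform, is an assumption taken from the Firefox description above.

```rust
/// Number of per-member uniform resources that OpenGL introspection reports
/// for a uniform-block array of structs, assuming every element of the array
/// is counted as active (hypothetical helper, for illustration only).
const fn active_uniform_count(array_len: u32, members_per_struct: u32) -> u32 {
    array_len * members_per_struct
}

/// Total per-uniform lookups during one program link, assuming `siv_passes`
/// batched glGetActiveUniformsiv calls that each cover every uniform, plus
/// one glGetActiveUniform call per uniform (hypothetical helper).
const fn link_time_lookups(uniforms: u32, siv_passes: u32) -> u32 {
    uniforms * siv_passes + uniforms
}
```

Under these assumptions, `active_uniform_count(204, 11)` gives the 2244 uniforms mentioned above, and `link_time_lookups(2244, 5)` gives 13464 individual lookups per link. The repro material, with 200 structs of 16 members each, comes out at 3200 uniforms.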

If possible, it might help to allocate fewer than 204 entries in clusterable_objects initially when the scene has only a few point or spot lights, and then grow the capacity up to MAX_UNIFORM_BUFFER_CLUSTERABLE_OBJECTS if more lights are added at runtime. This might require recompiling shaders when the capacity changes, but it would only affect platforms without storage buffer support.
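
One way this idea could look is sketched below. It is not how Bevy currently works. `MIN_CAPACITY`, `clusterable_object_capacity`, and `needs_recompile` are hypothetical names, and the power-of-two tiering is just one possible policy for limiting how often shaders would need recompiling.

```rust
/// Upper bound from the issue: the uniform-buffer fallback holds 204 entries.
const MAX_UNIFORM_BUFFER_CLUSTERABLE_OBJECTS: u32 = 204;

/// Hypothetical smallest array length to compile the shader with.
const MIN_CAPACITY: u32 = 16;

/// Hypothetical helper (not a Bevy API): picks the uniform array length for
/// a given number of clusterable objects. Rounding up to a power of two means
/// the capacity only changes when the count crosses a tier boundary.
fn clusterable_object_capacity(object_count: u32) -> u32 {
    object_count
        // Clamp first so next_power_of_two cannot overflow on huge inputs.
        .min(MAX_UNIFORM_BUFFER_CLUSTERABLE_OBJECTS)
        .max(MIN_CAPACITY)
        .next_power_of_two()
        .min(MAX_UNIFORM_BUFFER_CLUSTERABLE_OBJECTS)
}

/// Grow-only policy: recompile only when the objects no longer fit in the
/// capacity that the current shader was compiled with.
fn needs_recompile(current_capacity: u32, object_count: u32) -> bool {
    clusterable_object_capacity(object_count) > current_capacity
}
```

With this policy, a scene with a handful of lights would compile with 16 entries, which by the counting above is 16 × 11 = 176 uniforms to introspect instead of 2244. The chosen capacity would still have to reach the shader somehow, for example through a shader definition, and that part is left out of the sketch.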

ashivaram23 added the C-Bug and S-Needs-Triage labels Mar 3, 2025

@mockersf
Member

mockersf commented Mar 3, 2025

Shaders are known to take a long time to compile in WebGL2, and they block the browser while doing so.

@ashivaram23
Author

I believe this particular delay only happens when using StandardMaterial, only on browsers with OpenGL backends for WebGL, and only because of the couple of lines in the PBR shader where the clusterable_objects array is accessed. Without that 204-element uniform array, I would expect shader link times to drop from ~5 seconds to a split second, giving roughly the same responsiveness as Bevy projects without StandardMaterial and as other web game engines and frameworks that also use WebGL.

2 participants