[Bug]: `is_image_space` works poorly with Gymnasium's `FrameStackObservation` #2090

fracapuano · 2025-02-25T14:48:46Z

🐛 Bug

MRC 👇

from stable_baselines3.common.preprocessing import is_image_space
import gymnasium as gym
import numpy as np

from gymnasium.wrappers import FrameStackObservation

image_space = gym.spaces.Box(0, 255, (3, 64, 64), np.uint8)  # a basic RGB 64x64 image

class DummyEnv(gym.Env):
    def __init__(self):
        super().__init__()
        self.observation_space = image_space

env = DummyEnv()
print(is_image_space(env.observation_space))  # True

env = FrameStackObservation(env, stack_size=2)

print(env.observation_space.shape)  # (2, 3, 64, 64), stacked obs on dim=0
print(is_image_space(env.observation_space))  # False

From here it seems to be due to the fact that sb3 expects images to be tensors of dimension strictly equal to 3.

I am wondering why not setting the check to be >=3 instead of strictly equal. In this way, one would still pass the image check (though I reckon might have problems with NatureCNN as I am not sure how it would handle 4-dimensional tensors).

Worth adding that:

A workaround this bug is to transform the observation using the TransformObservation env wrapper from gymnasium, merging the two first dimension into one
This is precisely what sb3 does in their VecFrameStack!

Worth saying this can have very unintended consequence: if one passes an image through Gymnasium's frame stacking and then uses sb3, the image won't be recognized as such, as the feature extractor for that image will be set to be Flatten (instead of NatureCNN).

Happy to open a PR to change this check, but I wanted to double check it made sense first!

To Reproduce

see above

Relevant log output / Error message

see above

System Info

OS: macOS-15.0.1-arm64-arm-64bit Darwin Kernel Version 24.0.0: Tue Sep 24 23:36:26 PDT 2024; root:xnu-11215.1.12~1/RELEASE_ARM64_T8103
Python: 3.12.9
Stable-Baselines3: 2.5.0
PyTorch: 2.6.0
GPU Enabled: False
Numpy: 2.2.3
Cloudpickle: 3.1.1
Gymnasium: 1.0.0

Checklist

My issue does not relate to a custom gym environment. (Use the custom gym env template instead)
I have checked that there is no similar issue in the repo
I have read the documentation
I have provided a minimal and working example to reproduce the bug
I've used the markdown code blocks for both code and stack traces.

The text was updated successfully, but these errors were encountered:

araffin · 2025-02-25T14:59:17Z

Hello,
probably related to #1500.

This is precisely what sb3 does in their VecFrameStack!

In general you should use VecFrameStack() instead yes.

fracapuano · 2025-02-25T15:11:53Z

Thank you! Are you satisfied with this at all? I think the check could be changed, or you reckon that would require too much work?

araffin · 2025-03-01T13:12:29Z

I am wondering why not setting the check to be >=3 instead of strictly equal.

stable-baselines3/stable_baselines3/common/preprocessing.py

Line 35 in fa21bce

The check is conservative, so that it returns False if there is a doubt.

To me (2, 3, 64, 64) looks like a batch of images and not an image, also it makes the detection of channel-first/last much harder (so the forward pass is likely to fail after).

So I would keep the check as-is but probably add a note in the documentation linking to this issue and VecFrameStack, on https://stable-baselines3.readthedocs.io/en/master/guide/custom_env.html#using-custom-environments

fracapuano · 2025-03-02T15:17:44Z

Thank you for the extra context---will sort this out opening a PR with this soon then ;)

it makes the detection of channel-first/last much harder
Regarding this point: I agree, and I actually think it'd be best to stack the use a flag (within FrameStackObservation in gymnasium) to "collapse" the stack into one dimension only (thus aligning it with VecFrameStack).

Edit: I actually think it'd be suboptimal, as they already define a TransformObservation wrapper which can be used to achieve exactly the same objective. I will reference this issue in the documentation, warning about unintended side effects (i.e., image observations being flattened!) and providing a small snippet to use TransformObservation in case one cannot use VecFrameStack (that was my case!)

Thanks for discussing! 🙏

$@fracapuano$ fracapuano added the bug Something isn't working label Feb 25, 2025

araffin added the documentation Improvements or additions to documentation label Mar 1, 2025

araffin added the help wanted Help from contributors is welcomed label Mar 1, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Bug]: `is_image_space` works poorly with Gymnasium's `FrameStackObservation` #2090

[Bug]: `is_image_space` works poorly with Gymnasium's `FrameStackObservation` #2090

fracapuano commented Feb 25, 2025

araffin commented Feb 25, 2025 •

edited

Loading

fracapuano commented Feb 25, 2025

araffin commented Mar 1, 2025

fracapuano commented Mar 2, 2025 •

edited

Loading

[Bug]: is_image_space works poorly with Gymnasium's FrameStackObservation #2090

[Bug]: is_image_space works poorly with Gymnasium's FrameStackObservation #2090

Comments

fracapuano commented Feb 25, 2025

🐛 Bug

To Reproduce

Relevant log output / Error message

System Info

Checklist

araffin commented Feb 25, 2025 • edited Loading

fracapuano commented Feb 25, 2025

araffin commented Mar 1, 2025

fracapuano commented Mar 2, 2025 • edited Loading

[Bug]: `is_image_space` works poorly with Gymnasium's `FrameStackObservation` #2090

[Bug]: `is_image_space` works poorly with Gymnasium's `FrameStackObservation` #2090

araffin commented Feb 25, 2025 •

edited

Loading

fracapuano commented Mar 2, 2025 •

edited

Loading