The dependencies for Sparrow-V2 are:

```
torch>=2.2.0        # GPU version is recommended
torchaudio>=2.2.0
torchvision>=0.17.0
numpy>=1.24.4
scipy>=1.10.1
pygame>=2.5.2
```
You can install torch by following the guidance on its official website. We strongly suggest CUDA>=12.4, though the CPU version or a lower CUDA version is also supported.
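For example, a typical pip command for CUDA 12.4 looks like the following (verify against the official website, since the exact command for your platform and CUDA version may differ):

```bash
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu124
```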
Then you can install the other packages via:

```bash
pip3 install numpy==1.24.4 scipy==1.10.1 pygame==2.5.2
```
Additionally, we recommend python>=3.11.10, although other versions might also work.
You can play with Sparrow using your keyboard (⇠/⇡/⇣/⇢) or mouse to test whether the installation succeeded:

```bash
python play_with_keyboard.py
python play_with_mouse.py
```
```python
from Sparrow_V2 import Sparrow
import torch

envs = Sparrow()
s, info = envs.reset()  # reset() returns the initial states and an info dict
while True:
    # action_dim: dimension of action, N: number of vectorized envs, dvc: running device of Sparrow
    a = torch.randint(low=0, high=envs.action_dim, size=(envs.N,), device=envs.dvc)
    s_next, r, terminated, truncated, info = envs.step(a)
```
- Vectorizable (super fast data collection)
- High diversity (randomly generated static obstacles, randomly moving dynamic obstacles)
- Easy Sim2Real (domain randomization and sensor noise are integrated)
- Discrete & continuous action (both discrete and continuous actions are accepted)
- Lightweight (consumes only a small amount of GPU memory, even for vectorized environments)
- Standard Gym API with PyTorch data flow
- GPU/CPU are both acceptable (fast & compatible)
- Conversion-free data flow (the states generated by Sparrow are torch tensors; if you build your DRL model with PyTorch, you can run both your model and Sparrow on the GPU, obviating the need to transfer data between CPU and GPU)
- Easy to use (Python files only; just import, never worry about installation)
- Ubuntu/Windows/MacOS are all supported
- Accepts an image as a map (customize your own environments easily and rapidly)
Please refer to this repo.
```python
from Sparrow_V2 import Sparrow, str2bool
import argparse
import torch

'''Hyperparameter Setting for Sparrow'''
parser = argparse.ArgumentParser()
parser.add_argument('--dvc', type=str, default='cpu', help='running device of Sparrow: cuda / cpu')
parser.add_argument('--render_mode', type=str, default='human', help=" None or 'human' ")
parser.add_argument('--action_type', type=str, default='Discrete', help='Action type: Discrete / Continuous')
parser.add_argument('--window_size', type=int, default=800, help='size of the map')
parser.add_argument('--D', type=int, default=400, help='maximal local planning distance')
parser.add_argument('--N', type=int, default=1, help='number of vectorized environments')
parser.add_argument('--O', type=int, default=3, help='number of obstacles in each environment')
parser.add_argument('--RdON', type=str2bool, default=False, help='whether to randomize the Number of dynamic obstacles')
parser.add_argument('--ScOV', type=str2bool, default=False, help='whether to scale the maximal velocity of dynamic obstacles')
parser.add_argument('--RdOV', type=str2bool, default=True, help='whether to randomize the Velocity of dynamic obstacles')
parser.add_argument('--RdOT', type=str2bool, default=True, help='whether to randomize the Type of dynamic obstacles')
parser.add_argument('--RdOR', type=str2bool, default=True, help='whether to randomize the Radius of obstacles')
parser.add_argument('--Obs_R', type=int, default=14, help='maximal obstacle radius, cm')
parser.add_argument('--Obs_V', type=int, default=30, help='maximal obstacle velocity, cm/s')
parser.add_argument('--MapObs', type=str, default=None, help="name of map file, e.g. 'map.png' or None")
parser.add_argument('--ld_a_range', type=int, default=270, help='max scanning angle of lidar (degree)')
parser.add_argument('--ld_d_range', type=int, default=300, help='max scanning distance of lidar (cm)')
parser.add_argument('--ld_num', type=int, default=27, help='number of lidar streams in each world')
parser.add_argument('--ld_GN', type=int, default=3, help='how many lidar streams are grouped into one group')
parser.add_argument('--ri', type=int, default=0, help='render index: the index of the world to be rendered')
parser.add_argument('--basic_ctrl_interval', type=float, default=0.1, help='control interval (s), 0.1 means 10 Hz control frequency')
parser.add_argument('--ctrl_delay', type=int, default=0, help='control delay, in basic_ctrl_interval, 0 means no control delay')
parser.add_argument('--K', type=tuple, default=(0.55,0.6), help='K_linear, K_angular')
parser.add_argument('--draw_auxiliary', type=str2bool, default=False, help='whether to draw auxiliary infos')
parser.add_argument('--render_speed', type=str, default='fast', help='fast / slow / real')
parser.add_argument('--max_ep_steps', type=int, default=500, help='maximum episodic steps')
parser.add_argument('--noise', type=str2bool, default=True, help='whether to add noise to the observations')
parser.add_argument('--DR', type=str2bool, default=True, help='whether to use Domain Randomization')
parser.add_argument('--DR_freq', type=int, default=int(3.2e3), help='frequency of Domain Randomization, in total steps')
parser.add_argument('--compile', type=str2bool, default=False, help='whether to use torch.compile to boost simulation speed')
opt = parser.parse_args()
opt.dvc = torch.device(opt.dvc)

envs = Sparrow(**vars(opt))
s, info = envs.reset()
while True:
    a = torch.randint(low=0, high=envs.action_dim, size=(envs.N,), device=envs.dvc)
    s_next, r, terminated, truncated, info = envs.step(a)
```
Note that Sparrow-V2 runs in a vectorized manner; thus the dimensions of s, a, r, terminated, and truncated are (N, *), (N,), (N,), (N,), and (N,) respectively, where N is the number of vectorized environments. In addition, Sparrow-V2 has its own AutoReset mechanism, so users only need to reset the envs once, at the beginning.
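For instance, with the default configuration (17-dimensional states; see the state description below), a single step produces batched tensors. A minimal sketch, assuming the constructor accepts keyword arguments as in the parser above:

```python
import torch
from Sparrow_V2 import Sparrow

envs = Sparrow(N=4)      # 4 vectorized copies; other arguments keep their defaults (assumption)
s, info = envs.reset()   # s: (4, state_dim); called only once thanks to AutoReset
a = torch.randint(low=0, high=envs.action_dim, size=(envs.N,), device=envs.dvc)
s_next, r, terminated, truncated, info = envs.step(a)
print(s_next.shape, r.shape, terminated.shape, truncated.shape)
# expected: torch.Size([4, 17]) torch.Size([4]) torch.Size([4]) torch.Size([4])
```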
The LiDAR perception range is 300 cm × 270° in this example, with an accuracy of 3 cm. The radius of the robot is 9 cm, and its collision threshold is 14 cm. The maximal linear and angular velocities of the robot are 50 cm/s and 2 rad/s, respectively, and its control frequency is 10 Hz (basic_ctrl_interval=0.1). We use a simple but useful first-order smoothing model to describe the kinematics of the robot:

$$V_{t+1} = K \cdot V^{target} + (1 - K) \cdot V_t$$

Here, $K = (K_{linear}, K_{angular})$ is the velocity smoothing coefficient, which can be found in Sparrow_V2.py and customized according to your own scenario.
In DRL-based navigation, the dimension of the LiDAR state is commonly far larger than that of the robot state (e.g. location, orientation, and speed). To prevent the LiDAR state from submerging the robot state, we provide a useful LiDAR grouping option (down-sampling): the raw LiDAR output is grouped every ld_GN streams, and each group outputs its minimal distance.
```python
# Example with ld_num=12, ld_GN=2:
raw_lidar_output     = [100,100, 40,60, 90,100, 100,100, 100,100, 100,100]
grouped_lidar_output = [100, 40, 90, 100, 100, 100]
```
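In PyTorch terms, the grouping is a min-pool over adjacent streams. A minimal sketch, assuming the raw readings form a tensor of shape (N, ld_num) with ld_num divisible by ld_GN:

```python
import torch

def group_lidar(raw: torch.Tensor, ld_GN: int) -> torch.Tensor:
    """Down-sample (N, ld_num) raw LiDAR readings to (N, ld_num // ld_GN)
    by taking the minimum over every ld_GN adjacent streams."""
    N, ld_num = raw.shape
    return raw.view(N, ld_num // ld_GN, ld_GN).min(dim=-1).values

raw = torch.tensor([[100, 100, 40, 60, 90, 100, 100, 100, 100, 100, 100, 100.]])
print(group_lidar(raw, ld_GN=2))  # tensor([[100., 40., 90., 100., 100., 100.]])
```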
As illustrated below (left), the basic task in Sparrow is to navigate the robot to the green target area as fast as possible without colliding with obstacles (RED: dynamic obstacles; BLACK: static obstacles). The positions of the robot and the target point are reset whenever the robot reaches the target, collides, or exceeds the maximum episodic steps. The obstacles are regenerated when the reset() function is called.
The state of the robot is a vector of length 17:

- $V_l^{ct}$: the current target linear velocity generated by the agent, normalized by the maximal linear velocity of the robot (50 cm/s), thus $V_l^{ct} \in [-1,1]$
- $V_a^{ct}$: the current target angular velocity generated by the agent, normalized by the maximal angular velocity of the robot (2 rad/s), thus $V_a^{ct} \in [-1,1]$
- $V_l^{rt}$: the real target linear velocity received by the robot (recall the control delay), normalized by the maximal linear velocity of the robot (50 cm/s), thus $V_l^{rt} \in [-1,1]$
- $V_a^{rt}$: the real target angular velocity received by the robot (recall the control delay), normalized by the maximal angular velocity of the robot (2 rad/s), thus $V_a^{rt} \in [-1,1]$
- $D2T$: the distance from the robot to the target, normalized by the radius of the map, thus $D2T \in [0,1]$
- $\alpha$: the normalized relative orientation of the robot, $\alpha \in [-1,1]$
- $V_l$: the real linear velocity of the robot, normalized by the maximal linear velocity (50 cm/s), thus $V_l \in [-1,1]$
- $V_a$: the real angular velocity of the robot, normalized by the maximal angular velocity (2 rad/s), thus $V_a \in [-1,1]$
- $ld_0, \dots, ld_8$: the grouped and normalized LiDAR readings, normalized by the maximal perception distance (300 cm), thus $ld \in [0,1]$. Note that in this example --ld_num=27 and --ld_GN=3, yielding 27/3 = 9 grouped streams (8 + 9 = 17 dimensions in total).
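For reference, here is a hedged sketch of how such a state batch could be sliced, assuming the entries are ordered as listed above (check Sparrow_V2.py to confirm the actual layout):

```python
# s: (N, 17) state batch; the index layout below is an assumption based on the list above
V_l_ct, V_a_ct = s[:, 0], s[:, 1]  # current target linear/angular velocity
V_l_rt, V_a_rt = s[:, 2], s[:, 3]  # real (delayed) target linear/angular velocity
D2T, alpha     = s[:, 4], s[:, 5]  # normalized distance and relative orientation to target
V_l, V_a       = s[:, 6], s[:, 7]  # real linear/angular velocity
lidar          = s[:, 8:]          # (N, 9) grouped LiDAR readings
```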
There are 8 discrete actions available in Sparrow-V2, controlling the target velocity [linear velocity, angular velocity] of the robot:
- 0 >> Turn Left: [ 10 cm/s, 2 rad/s ]
- 1 >> Turn Left + Move Forward: [ 50 cm/s, 2 rad/s ]
- 2 >> Move Forward: [ 50 cm/s, 0 rad/s ]
- 3 >> Turn Right + Move Forward: [ 50 cm/s, -2 rad/s ]
- 4 >> Turn Right: [ 10 cm/s, -2 rad/s ]
- 5 >> Move Backward: [ -50 cm/s, 0 rad/s ]
- 6 >> Slow Down: [ 5 cm/s, 0 rad/s ]
- 7 >> Stop: [ 0 cm/s, 0 rad/s ]
We strongly suggest not using the Stop action when training an RL model, because it may leave the robot standing still and generating low-quality data; thus, the action space of Sparrow-V2 is 7 by default. You might also have noticed that when the robot is turning left or right, we give it a small linear velocity as well. We do this to help the robot escape from deadlocks.
If you wish to use continuous actions to control the robot, you can set --action_type to Continuous. The action should then be a torch.tensor of shape (N, 2) with entries in (-1, 1), controlling the linear/angular velocities of the N vectorized environments. Note that the maximal linear/angular velocities are still 50 cm/s and 2 rad/s.
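A minimal sketch of random continuous control, assuming the environment was constructed with action_type='Continuous':

```python
# a: (N, 2) tensor in (-1, 1); column 0 scales linear velocity, column 1 angular velocity
a = 2 * torch.rand(envs.N, 2, device=envs.dvc) - 1
s_next, r, terminated, truncated, info = envs.step(a)
```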
The reward function of Sparrow-V2 is:

- $r = 200$, when arriving at the target point
- $r = -200$, when colliding with an obstacle
- $r = -200$, when exceeding the maximal local planning distance D
- $r = 0.5 \cdot r_d + r_a \cdot r_o - 0.5 \cdot r_p - 0.5$, otherwise

Here, the definitions of $r_d$, $r_a$, $r_o$, and $r_p$ can be found in Sparrow_V2.py.
The episode will be terminated when:

- the robot collides with an obstacle
- the robot reaches the target area
- D2T > D (the distance to the target point exceeds the maximal local planning distance)

The episode will be truncated only when the episode steps exceed --max_ep_steps.
The absolute location and orientation of the robot can be obtained from the info dict yielded by the step() and reset() functions. The coordinate system of the absolute location and orientation is illustrated below:
If render_mode=None, Sparrow runs at its maximum simulation speed (depending on the hardware). However, if render_mode='human', there are three options for the simulation speed:

- render_speed == 'fast': render Sparrow in a pygame window at maximum FPS
- render_speed == 'slow': render Sparrow in a pygame window at 30 FPS; might be useful when debugging
- render_speed == 'real': render Sparrow in a pygame window at 1/basic_ctrl_interval FPS, in accordance with real-world speed

You can configure --ri to render different worlds of the vectorized envs.
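For example, to watch the first world at real-time speed (assuming the argparse script above is saved as main.py, a hypothetical file name):

```bash
python main.py --render_mode human --render_speed real --ri 0
```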
The environment copies inside the vectorized environment may be done (terminated or truncated) at different timesteps. Consequently, it is inefficient, or even improper, to call env.reset() on all copies whenever one copy is done, necessitating an AutoReset mechanism, which is illustrated below:
Note:

- the AutoReset mechanism of Sparrow-V2 is the same as V1's, but different from V0's
- different environment copies are reset independently
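In practice, this means you reset once and then step forever; finished copies are reset internally. A minimal sketch (the exact AutoReset semantics, e.g. whether s_next already belongs to the new episode for finished copies, should be confirmed in Sparrow_V2.py):

```python
s, info = envs.reset()  # called once at the beginning
for t in range(10_000):
    a = torch.randint(low=0, high=envs.action_dim, size=(envs.N,), device=envs.dvc)
    s_next, r, terminated, truncated, info = envs.step(a)
    done = terminated | truncated  # (N,) bool mask of the copies that finished this step
    # Assumption: mask done transitions when bootstrapping in your RL algorithm,
    # since the corresponding s_next may come from a freshly reset episode.
    s = s_next
```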
- Sparrow-V1: Single Robot, Static environments
- Sparrow-V2: Single Robot, Dynamic/Static environments
- Sparrow-V3: Multiple/Single Robot, Dynamic/Static environments
To cite this repository in publications:

```bibtex
@article{ColorDynamic,
  title={ColorDynamic: Generalizable, Scalable, Real-time, End-to-end Local Planner for Unstructured and Dynamic Environments},
  author={Jinghao Xin and Zhichao Liang and Zihuan Zhang and Peng Wang and Ning Li},
  journal={arXiv preprint arXiv:2502.19892},
  year={2025}
}
```
The name "Sparrow" actually comes from an old saying “麻雀虽小,五脏俱全.”
Hope you enjoy using Sparrow!
Additionally, we have added detailed comments to the source code (Sparrow_V2.py) so that you can modify Sparrow to fit your own problem, but only for non-commercial purposes; all rights are reserved by Jinghao Xin.