The light and soft characteristics of Buoyancy Assisted Lightweight Legged
Unit (BALLU) robots have great potential to provide intrinsically safe
interactions in environments involving humans, unlike many heavy and rigid
robots. However, their unique and sensitive dynamics pose challenges for
obtaining robust control policies in the real world. In this work, we
demonstrate robust sim-to-real transfer of control policies on BALLU robots
via system identification and our novel residual physics learning method,
Environment Mimic (EnvMimic). First, we model the nonlinear dynamics of the
actuators by collecting hardware data and optimizing the simulation parameters.
Second, rather than relying on standard supervised learning formulations, we use
deep reinforcement learning to train an external force policy to match
real-world trajectories, which enables us to model residual physics with
greater fidelity. We evaluate the improvement in simulation fidelity by
comparing simulated trajectories against real-world ones. Finally, we
demonstrate that the improved simulator allows us to learn better walking and
turning policies that can be successfully deployed on the BALLU hardware.