I’m a research scientist at Toyota Research Institute, where I work on using internet-scale data to solve real-world robotics problems. In particular, I work on multimodal foundation models aimed at getting robots to do useful everyday tasks. This work ranges from training world-class vision-language-action models (VLAs) to combining these models with video and other output modalities to see how far we can push generalization with as little action data as possible.
Previously, I worked on optimization for robotics, specifically whole-body MPC methods for quadrupeds and humanoids. I believe that combining optimization and learning methods is essential for getting robots to work robustly in the real world, and I have experience on both sides of robotics.
If you’re ever interested in talking about research, please feel free to reach me at: paarthrs@gmail.com
Current Work
Large Behavior Models
At TRI we scaled up our internal diffusion policy work to study how scaling imitation learning across a wide variety of tasks can help visuomotor policies learn and adapt to new tasks as quickly as possible. During this project, we learned a lot about how to best evaluate robot policies (it takes a lot of rollouts!), as well as how to engineer a system that let us study our data and quickly understand how our design choices affect policy performance.
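For readers unfamiliar with diffusion policies, the sketch below shows the kind of receding-horizon rollout loop these policies use at evaluation time. The `policy`/`env` interfaces, chunk sizes, and the fixed denoising schedule are all illustrative assumptions of mine, not the actual LBM implementation.

```python
import torch

@torch.no_grad()
def rollout(policy, env, horizon=16, execute=8, denoise_steps=50):
    """Receding-horizon evaluation: denoise an action chunk, execute part of it, replan.

    `policy(noisy_actions, obs, t)` is assumed to perform one denoising step on the
    action chunk, conditioned on the current observation.
    """
    obs = env.reset()
    while not env.done():
        # Start each replanning cycle from pure Gaussian noise over a short action chunk.
        actions = torch.randn(1, horizon, env.action_dim)
        for t in reversed(range(denoise_steps)):
            actions = policy(actions, obs, t)
        # Execute only the first few actions, then replan from the new observation.
        for a in actions[0, :execute]:
            obs = env.step(a)
    return obs
```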
We were able to create policies that exhibited incredible dexterity, such as coring and cutting an apple (shown) as well as putting a bike rotor on a wheel.
This project actually started as my internship project, which allowed me to understand the entire pipeline of getting large-scale imitation learning working on real robots!
Unified World Models
In this work we aimed to find a scalable architecture that acts as a data sponge for multimodal foundation models (although we focused only on video here!). By using separate diffusion timesteps for each modality, we can leverage both action-free data (e.g. videos) and action data, training across a wide variety of datasets and demonstrating cross-modality transfer from video to actions.
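To make the idea concrete, here is a minimal sketch of what a training objective with per-modality diffusion timesteps might look like. The function names, the toy noise schedule, and the model interface are my own illustrative assumptions, not the actual implementation; the point is simply that video-only batches can mask out the action term while still providing signal.

```python
import torch

def add_noise(x, eps, t, T=1000):
    # Toy linear schedule (illustrative only; real systems use a proper DDPM schedule).
    alpha_bar = 1.0 - t.float() / T
    while alpha_bar.dim() < x.dim():
        alpha_bar = alpha_bar.unsqueeze(-1)
    return alpha_bar.sqrt() * x + (1.0 - alpha_bar).sqrt() * eps

def per_modality_diffusion_loss(model, video_latents, actions, has_actions, T=1000):
    """model(noisy_video, noisy_actions, t_video, t_action) -> (eps_video_pred, eps_action_pred)."""
    B, device = video_latents.shape[0], video_latents.device
    # Independent timesteps: video and actions are noised (and denoised) separately.
    t_video = torch.randint(0, T, (B,), device=device)
    t_action = torch.randint(0, T, (B,), device=device)

    eps_v, eps_a = torch.randn_like(video_latents), torch.randn_like(actions)
    pred_v, pred_a = model(add_noise(video_latents, eps_v, t_video, T),
                           add_noise(actions, eps_a, t_action, T),
                           t_video, t_action)

    loss_v = ((pred_v - eps_v) ** 2).flatten(1).mean(dim=1)
    loss_a = ((pred_a - eps_a) ** 2).flatten(1).mean(dim=1)
    # Action-free (video-only) batches simply drop the action term.
    return (loss_v + has_actions.float() * loss_a).mean()
```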
Previous Work
Picture taken from: https://awinkler.github.io/publications/mypdfs/18-ral-winkler.pdf
Phase-Based End-Effector Parameterization
This project has allowed me to research a tractable approach to “contact-implicit optimization.” Previously, Winkler showed his phase-based trajectory optimization implemented and running on a quadruped; however, no results have been shown for a real humanoid [1]. I am currently exploring the implementation of this method on the SARCOS humanoid at USC.
The phase-based end-effector parameterization framework is also set up to account for different dynamics models. Although initially built around the Single Rigid Body Dynamics model, I have extended it to generate trajectories for both the LIPM (Linear Inverted Pendulum Model) and the Centroidal Momentum model. Results forthcoming.
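As a quick refresher, the two reduced models mentioned above can be written as follows (my own notation here, not necessarily the exact formulation used inside the framework):

```latex
% Linear Inverted Pendulum Model (LIPM): horizontal CoM dynamics about the
% ZMP/contact point p, assuming a constant CoM height h.
\ddot{c}_{xy} = \frac{g}{h}\,\bigl(c_{xy} - p_{xy}\bigr)

% Centroidal momentum model: linear and angular momentum driven by the
% contact forces f_i applied at points r_i, plus gravity.
m\,\ddot{c} = \sum_i f_i + m\,\mathbf{g}, \qquad
\dot{L} = \sum_i (r_i - c) \times f_i
```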
[1]: https://awinkler.github.io/publications/mypdfs/18-ral-winkler.pdf
Joint Heat Constraint-Based Trajectory Optimization
As brushless DC (BLDC) motors become more prevalent in robotics, the limitations of these motors must be thoughtfully considered. Controllers on operational robots today rely on manufacturer-provided specifications for the motors' torque limits. These specifications provide a safe limit for continuous operation under ‘normal’ conditions. Similar to human muscles, however, these motors do allow brief overloading to achieve short-term tasks, as long as they do not exceed a critical temperature [1]. Although one could apply a simple heuristic on top of the torque limits (such as checking the current temperature and adjusting the limits every time a torque command is computed), it is more useful to handle the thermal limit predictively for certain short-term tasks (lifting heavy equipment, keeping the humanoid's center-of-mass trajectory within specific limits, etc.) by posing them as a trajectory optimization problem.
This optimization was implemented using a full rigid-body dynamics model, with the actual heat generation modeled as constraints rather than the traditional torque limits [1]. Initial results of the optimization look promising [2]; however, I found the approach limited for locomotion due to its reliance on a fixed mode schedule. This led me to explore contact-implicit optimization to better leverage the heat-constraint framework.
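As a rough illustration of what replaces the static torque bound (written in my own notation as a sketch, rather than the exact model from [1]), a first-order winding-temperature model can be enforced as a path constraint along the trajectory:

```latex
% Joule heating from the winding resistance R_w versus conduction to ambient
% through thermal resistance R_th; the winding temperature must stay below T_crit.
C_{th}\,\dot{T} = R_w\, i^2 - \frac{T - T_{amb}}{R_{th}},
\qquad i = \tau / k_t,
\qquad T(t) \le T_{crit} \quad \forall t
```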
[1]: https://ieeexplore.ieee.org/document/4651110
[2]: https://www.youtube.com/watch?v=rmBNeIo0ZPI
Inverted Pendulum Walking Controller
In order to become familiar with writing a walking controller and working with our SARCOS humanoid, I decided to implement an inverted pendulum walking controller on it. This project was inspired by [1] and [2].
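Below is a toy sketch of the kind of LIPM capture-point stepping rule such a controller builds on; the constants and the simple step-timing logic are my own illustrative assumptions, not the controller that actually ran on the robot.

```python
import numpy as np

g, h = 9.81, 0.9            # gravity and an assumed constant CoM height [m]
omega = np.sqrt(g / h)      # LIPM natural frequency

def lipm_step(c, cd, p, dt):
    """Integrate the 1D LIPM dynamics c_ddot = omega^2 * (c - p) for one tick."""
    cdd = omega**2 * (c - p)
    return c + cd * dt, cd + cdd * dt

def next_foothold(c, cd, step_offset=0.05):
    """Place the next foot slightly beyond the instantaneous capture point."""
    capture_point = c + cd / omega
    return capture_point + step_offset

# Roll the pendulum forward and trigger a new step every 0.5 s.
c, cd, p, dt = 0.0, 0.3, 0.0, 0.002
for k in range(1, int(2.0 / dt)):
    c, cd = lipm_step(c, cd, p, dt)
    if k % int(0.5 / dt) == 0:
        p = next_foothold(c, cd)
```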
This controller was eventually tested on the actual SARCOS humanoid with promising results (videos at the bottom). However, unmodeled dynamics (forces from the hoses, model simplifications, etc.) greatly affected the robot: the humanoid was able to take 3-4 steps successfully before these unmodeled effects compromised the stability of the system.
The project allowed me to become familiar with rigid body dynamics, trajectory optimization, and writing code for real-time systems. In addition, I learned about the difficulties of implementing code on actual hardware, and it pushed me to start exploring more modern techniques for robot controllers (contact-implicit trajectory optimization, etc.).
Results:
References:
[1]: https://ieeexplore.ieee.org/document/7363423
[2]: https://arxiv.org/pdf/1607.08729.pdf