Quad-Move | Eklavya 2025

Project Domains	Mentors	Project Difficulty
Reinforcement Learning, Robotics Simulation, Imitation Learning	Ansh, Prajwal	Hard

Kurma is a budget quadruped with a turtle-like frame.
Your mission: teach it to walk—first in simulation, then in the real world.

Policy Learning (PPO)
- Build a Gym-compatible MuJoCo environment for Kurma.
- Craft reward terms that encourage forward speed and stability while penalizing energy use.
- Train a continuous-action neural policy using Proximal Policy Optimization.
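
As a rough illustration of the reward-shaping step, the sketch below combines a speed term, a stability penalty, and an energy penalty into one scalar. Everything in it is an assumption made for illustration: the function name `walking_reward`, the `RewardWeights` defaults, the use of roll and pitch as the stability signal, and the use of |torque × joint velocity| as the energy proxy. Kurma's actual observation layout and weights would need to be tuned in simulation.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class RewardWeights:
    """Hypothetical weights; real values would be tuned during training."""
    speed: float = 1.0
    stability: float = 0.5
    energy: float = 0.005


def walking_reward(
    forward_velocity: float,
    roll: float,
    pitch: float,
    torques: Sequence[float],
    joint_velocities: Sequence[float],
    weights: RewardWeights = RewardWeights(),
) -> float:
    """Return a scalar reward for one simulation step.

    forward_velocity: body velocity along the heading direction (m/s).
    roll, pitch: body tilt in radians; zero means the body is level.
    torques, joint_velocities: per-joint values used as an energy proxy.
    """
    # Reward moving forward.
    speed_term = weights.speed * forward_velocity
    # Penalize tilting away from a level body.
    stability_term = -weights.stability * (roll ** 2 + pitch ** 2)
    # Penalize mechanical power, approximated as |torque * joint velocity|.
    power = sum(abs(t * v) for t, v in zip(torques, joint_velocities))
    energy_term = -weights.energy * power
    return speed_term + stability_term + energy_term
```

In a Gym-style environment this would typically be called once per `step()` with values read from the simulator state, and the returned scalar passed back as the step reward.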

Imitation / Inverse RL (stretch)
- Record a stable gait as a demonstration, either scripted or joystick-teleoperated.
- Use Inverse RL to learn a reward that reproduces the demonstration, then refine with PPO.
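
One possible way to script a demonstration gait is an open-loop trot, where diagonal leg pairs move together and the two pairs are half a cycle apart. The sketch below is only an assumption about how that might look: it supposes two joints per leg (a hip and a knee), and the leg names, amplitudes, period, and sign conventions are hypothetical rather than taken from Kurma's design.

```python
import math

LEGS = ("front_left", "front_right", "rear_left", "rear_right")

# Trot pattern: diagonal pairs share a phase, the other pair is half a cycle later.
PHASE_OFFSET = {
    "front_left": 0.0,
    "rear_right": 0.0,
    "front_right": 0.5,
    "rear_left": 0.5,
}


def trot_targets(t, period=0.8, hip_amplitude=0.3, knee_lift=0.4):
    """Return {leg: (hip_angle, knee_angle)} in radians at time t (seconds).

    All defaults are illustrative placeholders, not measured values.
    """
    targets = {}
    for leg in LEGS:
        # Phase of this leg within the gait cycle, in [0, 2*pi).
        phase = 2.0 * math.pi * ((t / period + PHASE_OFFSET[leg]) % 1.0)
        # The hip sweeps back and forth sinusoidally.
        hip = hip_amplitude * math.sin(phase)
        # Lift the knee only while the hip swings forward (cos(phase) > 0),
        # and keep it at zero for the rest of the cycle.
        knee = knee_lift * max(0.0, math.cos(phase))
        targets[leg] = (hip, knee)
    return targets
```

Sampling `trot_targets` at the control rate and logging the resulting states and joint targets would give a demonstration trajectory that a reward-learning step could try to reproduce.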

Deploy the final network to a Raspberry Pi controller, drive affordable servos, and film an untethered demo of Kurma on the move.
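
For the deployment step, the policy's continuous actions eventually have to become servo commands. The sketch below shows one plausible conversion, assuming actions in [-1, 1], a symmetric joint limit, and a hobby servo that maps 500 to 2500 microseconds of pulse width onto -90 to +90 degrees. Those ranges and both function names are assumptions; the real limits depend on the servos chosen, and sending the pulse is left to whatever PWM driver the build uses.

```python
def action_to_angle_deg(action, limit_deg=45.0):
    """Scale a policy action in [-1, 1] to a joint angle in degrees.

    limit_deg is a hypothetical joint limit, not a Kurma specification.
    """
    clipped = max(-1.0, min(1.0, action))
    return clipped * limit_deg


def angle_to_pulse_us(
    angle_deg,
    min_angle=-90.0,
    max_angle=90.0,
    min_pulse_us=500.0,
    max_pulse_us=2500.0,
):
    """Convert a joint angle to a servo pulse width in microseconds.

    The angle is clamped to the servo's range first so an out-of-range
    command can never produce a pulse outside [min_pulse_us, max_pulse_us].
    """
    clamped = max(min_angle, min(max_angle, angle_deg))
    fraction = (clamped - min_angle) / (max_angle - min_angle)
    return min_pulse_us + fraction * (max_pulse_us - min_pulse_us)
```

Clamping at both stages is a deliberate choice here: a policy trained in simulation can emit out-of-range actions on real hardware, and inexpensive servos are easy to damage by driving them past their end stops.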