MHaurel/ppo-PyramidsRND · Training metrics