Bridging Reasoning Trajectories in On-Policy… · DeepSignal