A Study on Hidden Layer Distillation for Large Language Model Pre-Training · DeepSignal