Measuring What Matters: Benchmarking Generative, Multimodal, and Agentic AI in Healthcare · DeepSignal