Model Configuration
Performance Metrics
Model Size: 28 GB
Accuracy: 85%
Inference Speed: 45 t/s
Memory Usage: 28 GB
Memory Savings: 0.0%
Speed Improvement: 0.0%
Weight Distribution Analysis
Histogram showing how quantization at the selected precision (FP32 shown here, the full-precision baseline, hence the 0.0% savings above) affects the weight value distribution. Lower precision yields fewer unique values and a stepped distribution.
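The "fewer unique values" effect can be sketched with simple symmetric round-to-nearest quantization; the function name and bit widths below are illustrative, not taken from any particular library:

```python
import numpy as np

def quantize(weights, bits):
    """Symmetrically quantize weights to the given bit width,
    then dequantize back to float to show the stepped values."""
    qmax = 2 ** (bits - 1) - 1            # e.g. 127 for INT8
    scale = np.abs(weights).max() / qmax  # map the largest weight to qmax
    q = np.clip(np.round(weights / scale), -qmax, qmax)
    return q * scale                      # only 2*qmax + 1 distinct levels

rng = np.random.default_rng(0)
w = rng.normal(0.0, 0.02, size=100_000).astype(np.float32)

print(f"FP32: {len(np.unique(w))} unique values")
for bits in (8, 4):
    wq = quantize(w, bits)
    print(f"INT{bits}: {len(np.unique(wq))} unique values")
```

At FP32 nearly every sampled weight is distinct, while INT8 collapses the distribution to at most 255 levels and INT4 to at most 15, which is exactly the stepped histogram the chart illustrates.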
Performance Triangle
Larger area indicates better overall performance across all metrics.
Cross-Phase Impact
Shows how quantization affects each phase of the AI pipeline differently.