Optimal Temperature
{{ report_data.hyperparameter_analysis.optimal_temperature|default("4.0") }}
Best performance
Alpha (α)
{{ report_data.hyperparameter_analysis.optimal_alpha|default("0.7") }}
Distillation weight
Distilled Layers
{{ report_data.hyperparameter_analysis.distilled_layers|default("3/6") }}
Intermediate layers
Configs Tested
{{ report_data.hyperparameter_analysis.configs_tested|default("48") }}
Grid search complete

Temperature Impact Analysis

Temperature Insights
  • Optimal range: 3.0 - 5.0
  • Performance peaks at T=4.0
  • Higher temperatures soften the teacher's output distribution, which improves soft-target quality up to a point
  • Diminishing returns above T=7.0
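
The effect of temperature on soft targets can be illustrated with a small sketch. The logits below are hypothetical and are not taken from the reported experiments; the helper names are illustrative only. The sketch shows how dividing logits by T before the softmax spreads probability mass across classes.

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Return softmax(logits / temperature) as a list of probabilities."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(probs):
    """Shannon entropy in nats; higher means a softer distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

# Hypothetical teacher logits for a 4-class problem (assumption, not report data).
teacher_logits = [6.0, 2.0, 1.0, -1.0]

for t in (1.0, 4.0, 7.0):
    probs = softmax_with_temperature(teacher_logits, t)
    print(f"T={t}: max prob={max(probs):.3f}, entropy={entropy(probs):.3f}")
```

As T grows, the distribution approaches uniform, so very high temperatures carry less information about which classes the teacher actually prefers. This is one plausible reading of the diminishing returns noted above.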

Alpha (α) Distribution Impact

3D Parameter Space

Layer-wise Distillation

Hyperparameter Interaction Heatmap

Optimization History

Grid Search Results

Rank | Temperature | Alpha (α) | Layers | Learning Rate | Batch Size | Accuracy | F1-Score | Model Size (MB)

Best Configuration

Temperature: 4.0
Alpha (α): 0.7
Distilled Layers: 3
Learning Rate: 0.001
Batch Size: 32
Final Accuracy: 94.2%
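
A minimal sketch of how a grid of configurations might be enumerated and ranked to pick a best configuration like the one above. The grid values are hypothetical, chosen only so that the product yields 48 combinations, and toy_score is a placeholder standing in for a real train-and-evaluate run. Neither reflects the actual search behind this report.

```python
from itertools import product

# Hypothetical search grid (assumption): 4 * 3 * 2 * 2 * 1 = 48 configurations.
GRID = {
    "temperature": [2.0, 4.0, 6.0, 8.0],
    "alpha": [0.5, 0.7, 0.9],
    "distilled_layers": [3, 6],
    "learning_rate": [0.001, 0.0001],
    "batch_size": [32],
}

def enumerate_configs(grid):
    """Expand a dict of value lists into a list of config dicts."""
    keys = list(grid)
    return [dict(zip(keys, values)) for values in product(*(grid[k] for k in keys))]

def rank_configs(configs, evaluate):
    """Score every config with `evaluate` and return rows sorted best-first."""
    scored = [(evaluate(cfg), cfg) for cfg in configs]
    scored.sort(key=lambda pair: pair[0], reverse=True)  # stable sort keeps tie order
    return [{"rank": i + 1, "score": s, **cfg} for i, (s, cfg) in enumerate(scored)]

def toy_score(cfg):
    """Placeholder objective, NOT a real accuracy: peaks at T=4.0, alpha=0.7."""
    return -abs(cfg["temperature"] - 4.0) - abs(cfg["alpha"] - 0.7)

configs = enumerate_configs(GRID)
ranked = rank_configs(configs, toy_score)
print(len(configs), "configurations; best:", ranked[0])
```

In practice, evaluate would train the student with the given configuration and return a validation metric such as accuracy or F1.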

Recommendations

  • A temperature of 4.0 provides the best knowledge transfer among the configurations tested
  • Alpha (α) = 0.7 balances the task and distillation losses well
  • Consider layer-specific temperatures for complex models
  • A batch size of 32 offers a good stability-speed trade-off
  • Learning rate scheduling improves final performance
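
How temperature and alpha interact can be sketched with a common formulation of the distillation objective. This is an assumption about the convention: the report only labels alpha as the distillation weight, so the sketch weights the softened KL term by alpha and the hard-label cross-entropy by (1 - alpha), with the usual T² scaling on the distillation term. The function names and example logits are hypothetical.

```python
import math

def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax of logits / temperature."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    log_sum = m + math.log(sum(math.exp(s - m) for s in scaled))
    return [s - log_sum for s in scaled]

def distillation_loss(student_logits, teacher_logits, label,
                      temperature=4.0, alpha=0.7):
    """alpha * T^2 * KL(teacher || student) + (1 - alpha) * cross-entropy.

    Assumed convention: alpha weights the distillation term.
    """
    # Task loss: cross-entropy against the hard label at T = 1.
    task_loss = -log_softmax(student_logits)[label]
    # Distillation loss: KL divergence between temperature-softened distributions.
    t_log = log_softmax(teacher_logits, temperature)
    s_log = log_softmax(student_logits, temperature)
    kd_loss = sum(math.exp(tl) * (tl - sl) for tl, sl in zip(t_log, s_log))
    return alpha * temperature ** 2 * kd_loss + (1 - alpha) * task_loss

# Hypothetical logits for a 3-class example (assumption, not report data).
student = [2.0, 1.0, 0.1]
teacher = [5.0, 1.0, -2.0]
print(f"loss = {distillation_loss(student, teacher, label=0):.4f}")
```

A layer-specific temperature, as suggested above, would amount to calling such a function with a different temperature value for each distilled layer's outputs.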