SVM

Before moving forward with the to-do list, let’s throw a Random Forest at it.

For many reasons, Random Forest is usually a very good baseline model. In this particular case I started with the polynomial OLS as the baseline model, simply because the correlations made it evident that the relationship between temperature and consumption follows a polynomial shape. But let’s go back to the beloved RF.

/home/runner/work/strom/strom/.venv/lib/python3.10/site-packages/sklearn/svm/_base.py:1250: ConvergenceWarning: Liblinear failed to converge, increase the number of iterations.
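
This warning comes from scikit-learn's liblinear-backed linear SVM solver. Two common remedies are standardizing the features and raising `max_iter`. Here is a minimal sketch of both, assuming the estimator is a `LinearSVR` inside a pipeline. The source does not show the actual pipeline, so the synthetic data and every parameter value below are illustrative assumptions, not the project's settings.

```python
import warnings

import numpy as np
from sklearn.exceptions import ConvergenceWarning
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVR

# Synthetic stand-in data (assumption): one unscaled feature, noisy linear target.
rng = np.random.default_rng(0)
X = rng.uniform(-10, 30, size=(500, 1))
y = 20.0 - 0.5 * X[:, 0] + rng.normal(0, 1, size=500)

# Standardized inputs and a larger iteration budget both help liblinear converge.
pipe = make_pipeline(
    StandardScaler(),
    LinearSVR(C=1.0, epsilon=0.0, max_iter=10_000, random_state=0),
)

# Record warnings so we can check whether a ConvergenceWarning was raised.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    pipe.fit(X, y)

converged = not any(issubclass(w.category, ConvergenceWarning) for w in caught)
print("fitted without ConvergenceWarning:", converged)
print("train R2:", round(pipe.score(X, y), 3))
```

If the warning persists even with scaled inputs, loosening `tol` or lowering `C` are other knobs worth trying. Whether any of this changes the metrics here would need to be checked on the real data.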

Model Cards provide a framework for transparent, responsible reporting. Use the vetiver `.qmd` Quarto template as a place to start, with vetiver.model_card()
Writing pin:
Name: 'wd-svm'
Version: 20251124T032611Z-164a4

⏩ stepit 'svm_raw': Starting execution of `strom.modelling.assess_model()` 2025-11-24 03:26:11

/home/runner/work/strom/strom/.venv/lib/python3.10/site-packages/sklearn/svm/_base.py:1250: ConvergenceWarning: Liblinear failed to converge, increase the number of iterations.

⏩ stepit 'get_single_split_metrics': Starting execution of `strom.modelling.get_single_split_metrics()` 2025-11-24 03:26:11

✅ stepit 'get_single_split_metrics': Successfully completed and cached [exec time 0.0 seconds, cache time 0.0 seconds, size 1.0 KB] `strom.modelling.get_single_split_metrics()` 2025-11-24 03:26:11

♻️  stepit 'cross_validate_pipe': is up-to-date. Using cached result for `strom.modelling.cross_validate_pipe()` 2025-11-24 03:26:11

✅ stepit 'svm_raw': Successfully completed and cached [exec time 0.1 seconds, cache time 0.0 seconds, size 14.8 KB] `strom.modelling.assess_model()` 2025-11-24 03:26:11

Metrics

Metric	Single Split train	Single Split test	CV test	CV train
MAE - Mean Absolute Error	2.253863	2.404185	2.684980	3.410023
MSE - Mean Squared Error	15.655275	21.344099	12.038683	27.480901
RMSE - Root Mean Squared Error	3.956675	4.619967	2.982166	5.136139
R2 - Coefficient of Determination	0.832035	0.774003	-2.236684	0.722929
MAPE - Mean Absolute Percentage Error	0.226225	0.250558	0.395428	0.309699
EVS - Explained Variance Score	0.833088	0.775890	0.532616	0.818702
MeAE - Median Absolute Error	1.362887	1.408742	2.555536	2.625422
D2 - D2 Absolute Error Score	0.674623	0.662013	-0.765698	0.520175
Pinball - Mean Pinball Loss	1.126932	1.202092	1.342490	1.705011
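
To make the table easier to read, here is a small sketch of how a few of these metrics are defined, computed on made-up numbers (not the project's data; the function names are illustrative). It also shows two things that are visible in the table: the pinball loss at the median (quantile 0.5) is exactly half the MAE, and R2 becomes negative whenever the model's squared error is larger than that of simply predicting the mean of the observed values.

```python
from math import sqrt


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def pinball(y_true, y_pred, alpha=0.5):
    """Mean pinball (quantile) loss for quantile level alpha."""
    return sum(
        alpha * (t - p) if t >= p else (1 - alpha) * (p - t)
        for t, p in zip(y_true, y_pred)
    ) / len(y_true)


# Made-up observations and two sets of predictions.
y_true = [10.0, 11.0, 12.0, 13.0]
good = [10.5, 10.5, 12.5, 12.5]      # small errors around the truth
biased = [14.0, 15.0, 16.0, 17.0]    # right shape, but shifted by +4

for name, y_pred in [("good", good), ("biased", biased)]:
    print(
        name,
        "MAE:", mae(y_true, y_pred),
        "RMSE:", rmse(y_true, y_pred),
        "R2:", r2(y_true, y_pred),
        "Pinball:", pinball(y_true, y_pred),
    )
```

For `biased`, every prediction is off by 4, so R2 comes out at -11.8 even though the predictions follow the shape of the data perfectly. A constant offset on a low-variance fold is one way a negative R2 can arise.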

Scatter plot matrix

Observed vs. Predicted and Residuals vs. Predicted

Check for …

Check the residuals to assess the goodness of fit:

Are they white noise, or is there a pattern?
Is there heteroscedasticity?
Is there non-linearity?
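
As a rough numerical companion to eyeballing the plot, one can check whether the spread of the residuals grows with the predicted value by correlating the absolute residuals with the predictions. This is a stdlib-only sketch on simulated data. The helper name `pearson` and all numbers are assumptions for illustration, and this is not the diagnostic used in the project.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


random.seed(0)
predicted = [random.uniform(5.0, 30.0) for _ in range(2000)]

# Constant noise level: the spread does not depend on the prediction.
homo = [random.gauss(0.0, 1.0) for _ in predicted]
# Noise grows with the prediction: a funnel shape in the residual plot.
hetero = [random.gauss(0.0, 0.1 * p) for p in predicted]

spread_trend_homo = pearson(predicted, [abs(r) for r in homo])
spread_trend_hetero = pearson(predicted, [abs(r) for r in hetero])

print("homoscedastic  |residual| vs predicted:", round(spread_trend_homo, 3))
print("heteroscedastic |residual| vs predicted:", round(spread_trend_hetero, 3))
```

A correlation near zero is consistent with constant variance, while a clearly positive value points to a funnel. Formal tests exist for this, but the plot plus a quick number like this is often enough to decide whether to dig further.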

Normality of Residuals:

Check for …

Are residuals normally distributed?
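
One rough numerical complement to a histogram or Q-Q plot is to look at the skewness and excess kurtosis of the residuals, both of which are close to zero for normally distributed data. Below is a minimal stdlib-only sketch on simulated residuals. The function name is hypothetical, and the project's own diagnostics are not shown in the source.

```python
import random
from statistics import fmean


def skew_and_excess_kurtosis(residuals):
    """Return (skewness, excess kurtosis) from central moments."""
    n = len(residuals)
    mean = fmean(residuals)
    m2 = sum((r - mean) ** 2 for r in residuals) / n
    m3 = sum((r - mean) ** 3 for r in residuals) / n
    m4 = sum((r - mean) ** 4 for r in residuals) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0


random.seed(0)
# Simulated residuals: one set roughly normal, one with a heavy right tail.
normal_like = [random.gauss(0.0, 1.0) for _ in range(5000)]
heavy_right_tail = [random.expovariate(1.0) - 1.0 for _ in range(5000)]

for name, res in [("normal-like", normal_like), ("right-tailed", heavy_right_tail)]:
    skew, kurt = skew_and_excess_kurtosis(res)
    print(f"{name}: skewness={skew:.2f}, excess kurtosis={kurt:.2f}")
```

Large positive skewness would mean the model occasionally under-predicts by a lot, which is worth knowing for consumption peaks.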

Leverage

Scale-Location plot

Residuals Autocorrelation Plot
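
For reference, the quantity such a plot shows at each lag can be sketched as follows: the sample autocorrelation of the residuals, compared against an approximate 95% band of ±1.96/√n for white noise. The data here is simulated and the function name is an assumption. In practice a library routine would normally be used instead.

```python
import random
from statistics import fmean


def autocorr(residuals, lag):
    """Sample autocorrelation of a series at the given lag."""
    n = len(residuals)
    mean = fmean(residuals)
    denom = sum((r - mean) ** 2 for r in residuals)
    num = sum(
        (residuals[t] - mean) * (residuals[t - lag] - mean)
        for t in range(lag, n)
    )
    return num / denom


random.seed(0)
# White-noise residuals: no memory from one step to the next.
white = [random.gauss(0.0, 1.0) for _ in range(2000)]

# AR(1) residuals: each value keeps 80% of the previous one plus fresh noise.
ar1 = [0.0]
for _ in range(1999):
    ar1.append(0.8 * ar1[-1] + random.gauss(0.0, 1.0))

band = 1.96 / len(white) ** 0.5  # approximate 95% band under white noise
print("band: +/-", round(band, 3))
for lag in (1, 2, 24):
    print(
        f"lag {lag}: white={autocorr(white, lag):.3f}, ar1={autocorr(ar1, lag):.3f}"
    )
```

Autocorrelations that stay well outside the band suggest the model is missing time structure, for example daily or weekly cycles in consumption.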

Residuals vs Time

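Well, not that bad, but it is overfitting quite a lot: on the single split, R2 drops from 0.83 (train) to 0.77 (test), and under cross-validation the test R2 is negative (-2.24) against 0.72 on train.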

♻️  stepit 'grid_search_pipe': is up-to-date. Using cached result for `strom.modelling.grid_search_pipe()` 2025-11-24 03:26:15

Writing pin:
Name: 'wd-svm'
Version: 20251124T032615Z-37e61

⏩ stepit 'svm_tuned': Starting execution of `strom.modelling.assess_model()` 2025-11-24 03:26:15

/home/runner/work/strom/strom/.venv/lib/python3.10/site-packages/sklearn/svm/_base.py:1250: ConvergenceWarning: Liblinear failed to converge, increase the number of iterations.

⏩ stepit 'get_single_split_metrics': Starting execution of `strom.modelling.get_single_split_metrics()` 2025-11-24 03:26:15

✅ stepit 'get_single_split_metrics': Successfully completed and cached [exec time 0.0 seconds, cache time 0.0 seconds, size 1.0 KB] `strom.modelling.get_single_split_metrics()` 2025-11-24 03:26:15

♻️  stepit 'cross_validate_pipe': is up-to-date. Using cached result for `strom.modelling.cross_validate_pipe()` 2025-11-24 03:26:15

✅ stepit 'svm_tuned': Successfully completed and cached [exec time 0.1 seconds, cache time 0.0 seconds, size 14.7 KB] `strom.modelling.assess_model()` 2025-11-24 03:26:15

Metrics

Metric	Single Split train	Single Split test	CV test	CV train
MAE - Mean Absolute Error	2.191867	2.337857	1.293039	2.436891
MSE - Mean Squared Error	15.723957	22.742093	2.980500	18.131235
RMSE - Root Mean Squared Error	3.965344	4.768867	1.645820	4.255542
R2 - Coefficient of Determination	0.831298	0.759200	0.090348	0.816841
MAPE - Mean Absolute Percentage Error	0.191461	0.199679	0.214997	0.194748
EVS - Explained Variance Score	0.832178	0.772519	0.505620	0.817841
MeAE - Median Absolute Error	1.216609	1.212842	1.094747	1.465526
D2 - D2 Absolute Error Score	0.683573	0.671338	0.149081	0.656816
Pinball - Mean Pinball Loss	1.095934	1.168928	0.646519	1.218445