Returns the Bayesian and Akaike information criteria of a regime-(ii) fit,
together with the integrated completed likelihood (ICL). All three are
computed from the empirical log-likelihood of the samples used to fit
the model and are reported on the same scale (smaller is better). All
three are NA for regimes that have no empirical likelihood ("moment",
"kld").
Arguments
- fit
A gmm_fit object.
Details
The ICL of Biernacki, Celeux and Govaert (2000) adds to the BIC twice the
entropy of the fitted classification,
\(\mathrm{ICL} = \mathrm{BIC} + 2 E_N\), where
\(E_N = -\sum_{i,k} \gamma_{ik} \log \gamma_{ik} \ge 0\) is the entropy of
the responsibilities \(\gamma_{ik}\). It therefore penalises mixtures whose
components overlap (uncertain assignments) and favours well-separated
clustering solutions over merely best-fitting ones. Because
\(E_N \ge 0\), the ICL is never smaller than the BIC. The two coincide
whenever every assignment is certain (each \(\gamma_{ik}\) is 0 or 1, with
the convention \(0 \log 0 = 0\)), in particular for a single component
(\(K = 1\)), where every responsibility is one. The
classification entropy itself is returned as classification_entropy.
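As an illustration of the definition above, the following minimal sketch computes the classification entropy of a small responsibility matrix and adds twice that entropy to a BIC value. The helper classification_entropy_sketch(), the matrix gamma and the BIC value are made up for illustration and are not part of the package; the sketch assumes the convention \(0 \log 0 = 0\).

```r
# Hypothetical helper (not part of the package): entropy of an n x K
# responsibility matrix, using the convention 0 * log(0) = 0.
classification_entropy_sketch <- function(gamma) {
  terms <- ifelse(gamma > 0, gamma * log(gamma), 0)
  -sum(terms)
}

# Made-up responsibilities for three observations and two components.
gamma <- rbind(
  c(1.0, 0.0),  # certain assignment: contributes 0
  c(0.5, 0.5),  # maximally uncertain: contributes log(2)
  c(0.9, 0.1)   # mostly certain: small positive contribution
)

e_n <- classification_entropy_sketch(gamma)
bic <- 100            # made-up BIC on the smaller-is-better scale
icl <- bic + 2 * e_n  # ICL = BIC + 2 * E_N, so icl >= bic

# With a single component every responsibility is one, so E_N = 0
# and the ICL equals the BIC.
e_one <- classification_entropy_sketch(matrix(1, nrow = 3, ncol = 1))
```

Only the uncertain rows of gamma contribute to e_n, which is why the ICL penalty grows with component overlap and vanishes for hard assignments.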
References
Biernacki, C., Celeux, G. and Govaert, G. (2000) Assessing a mixture model for clustering with the integrated completed likelihood. IEEE Transactions on Pattern Analysis and Machine Intelligence 22(7), 719–725. doi:10.1109/34.865189
Examples
x <- matrix(stats::rnorm(200), ncol = 2)
tgt <- gmm_target_from_samples(x)
fit <- fit_proxymix(tgt, N = 2L, regime = "sample", max_iter = 25L)
bic_aic(fit)
#> $bic
#> [1] 637.4143
#>
#> $aic
#> [1] 608.7574
#>
#> $icl
#> [1] 718.8163
#>
#> $classification_entropy
#> [1] 40.70102
#>
#> $n_params
#> [1] 11
#>
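The printed values can be checked against the identity given in Details. The sketch below copies the rounded numbers from the output above into a plain list, so it does not call the package. The second check assumes the textbook penalties (n_params * log(n) for the BIC and 2 * n_params for the AIC) with n = 100 rows in x; this page does not state those penalties, so treat that part as an assumption.

```r
# Values copied from the printed output above (rounded as displayed).
res <- list(bic = 637.4143, aic = 608.7574, icl = 718.8163,
            classification_entropy = 40.70102, n_params = 11)
n <- 100  # rows of x: 200 draws arranged in 2 columns

# ICL = BIC + 2 * E_N, up to the rounding of the printed values.
icl_ok <- isTRUE(all.equal(res$icl,
                           res$bic + 2 * res$classification_entropy,
                           tolerance = 1e-5))

# Assumed textbook penalties: BIC - AIC = n_params * (log(n) - 2).
penalty_ok <- isTRUE(all.equal(res$bic - res$aic,
                               res$n_params * (log(n) - 2),
                               tolerance = 1e-5))
```

Both checks hold for the values shown here. Because the example draws x without a fixed seed, a fresh run will print different numbers, but the first identity should still hold.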