This is a test notebook for trying out SageMath features.

For binary classification, if there are $$n$$ independent component models and each model is right with probability $$p$$, then the success rate of a majority voting ensemble follows from the binomial distribution.
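Written out, with $$M$$ counting the models that are right, the sum being described looks like this (the symbol $$P_{\text{ens}}$$ is introduced here only for convenience):

$$
P_{\text{ens}}(n, p) = \sum_{M = \lfloor n/2 \rfloor + 1}^{n} \binom{n}{M} \, p^{M} (1 - p)^{n - M}
$$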

For a simple case of $$n = 21$$ and $$p = 0.5$$, this means that the ensemble success is given by:

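var("N M p")  # assumed setup, not shown in the original cell
binom_pmf(p) = binomial(N, M) * p^M * (1 - p)^(N - M)  # assumed definition, inferred from the symbolic output in section 2
n = 21; c = 2
sum(binom_pmf(0.5).subs(N == n), M, floor(n/c) + 1, n)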

0.5
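As a cross-check outside Sage, the same sum can be sketched in plain Python. The helper name `ensemble_success` is made up for this illustration, and the binomial term is assumed to match what `binom_pmf` computes:

```python
from math import comb


def ensemble_success(n, p, c=2):
    """Probability that more than n/c of n independent models are right.

    Hypothetical helper: sums the binomial pmf from floor(n/c) + 1 to n.
    """
    start = n // c + 1
    return sum(comb(n, m) * p**m * (1 - p) ** (n - m) for m in range(start, n + 1))


# Same setting as the Sage cell: n = 21 models, p = 0.5, two classes.
result = ensemble_success(21, 0.5)
print(result)
```

The printed value should agree with the Sage result of 0.5 up to floating-point error.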



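Note that $$M$$ is summed from 11 to 21, the range where the correct models form a majority. At $$p = 0.5$$ with odd $$n$$ the binomial is symmetric, so exactly half of the probability mass falls in this range. A plot of ensemble success against $$p$$ follows: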

curve = plot(sum(binom_pmf(p).subs(N == n), M, floor(n/c) + 1, n), (p, 0.001, 0.999))
curve + parametric_plot((1/c, x), (x, 0, 1), linestyle="--", color="green")
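The shape of this curve can also be read off a few sample points. A rough sketch in plain Python rather than Sage (the helper `majority_success` is a name invented here, not part of the notebook):

```python
from math import comb


def majority_success(n, p):
    """Chance that a strict majority of n independent models is right."""
    return sum(comb(n, m) * p**m * (1 - p) ** (n - m) for m in range(n // 2 + 1, n + 1))


n = 21
for p in (0.3, 0.4, 0.5, 0.6, 0.7):
    # Below p = 0.5 the ensemble is worse than a single model; above it, better.
    print(f"p = {p:.1f}  ensemble = {majority_success(n, p):.4f}")
```

For odd $$n$$ the values at $$p$$ and $$1 - p$$ add up to 1, which is why the curve passes through 0.5 exactly at $$p = 0.5$$.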


## 1 Multiclass

Now assume we have $$c$$ classes. The par (random guess) accuracy of a model is now $$1/c$$. The question is whether a per-model success rate (call it $$p$$, as before) greater than $$1/c$$ (or some other value) results in an ensemble that performs better than any single model.

For a simple case of 3 classes, the par accuracy is $$1/3$$, and for the ensemble to be right we need at least $$\lfloor n/3 \rfloor + 1$$ models to be right. With every model at par accuracy, the success rate for the ensemble is:

c = 3
float(sum(binom_pmf(1/c).subs(N == n), M, floor(n/c) + 1, n))

0.3992381153824027



Okay, so it does better than the par value of $$1/3$$.

For 7 classes:

c = 7
float(sum(binom_pmf(1/c).subs(N == n), M, floor(n/c) + 1, n))

0.3523243319568171
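The two multiclass numbers above can be reproduced without Sage. A minimal sketch in plain Python, assuming the same threshold rule ($$\lfloor n/c \rfloor + 1$$ correct models) and independent models; `threshold_success` is a hypothetical name:

```python
from math import comb


def threshold_success(n, p, c):
    """P(at least floor(n/c) + 1 of n independent models are right)."""
    k = n // c + 1
    return sum(comb(n, m) * p**m * (1 - p) ** (n - m) for m in range(k, n + 1))


n = 21
for c in (3, 7):
    # Each model sits exactly at the par accuracy 1/c.
    print(c, threshold_success(n, 1 / c, c))
```

Both values should match the Sage outputs to within floating-point error, and both exceed their par accuracy $$1/c$$.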



A plot of $$p$$ vs ensemble success for 7 classes follows:

curve = plot(sum(binom_pmf(p).subs(N == n), M, floor(n/c) + 1, n), (p, 0.001, 0.999))
curve + parametric_plot((1/c, x), (x, 0, 1), linestyle="--", color="green")


## 2 Critical p

So you don't need to be above the $$1/c$$ threshold in accuracy to gain from a voting ensemble: even a worse-than-random classifier can gain here. In general, any $$p$$ that makes the following inequality true will be good:

sum(binom_pmf(p), M, floor(N/c + 1), N) > 1/c

sum(p^M*(-p + 1)^(-M + N)*binomial(N, M), M, floor(1/7*N) + 1, N) > (1/7)
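For fixed $$N$$ and $$c$$ the left-hand side increases with $$p$$, so the critical value can be located numerically. A sketch using bisection in plain Python; the names `threshold_success` and `critical_p` are invented for this example, and $$N = 21$$, $$c = 7$$ are simply the values used earlier:

```python
from math import comb


def threshold_success(n, p, c):
    """P(at least floor(n/c) + 1 of n independent models are right)."""
    k = n // c + 1
    return sum(comb(n, m) * p**m * (1 - p) ** (n - m) for m in range(k, n + 1))


def critical_p(n, c, tol=1e-10):
    """Bisect for the p at which the ensemble success equals 1/c."""
    lo, hi = 0.0, 1.0  # success is 0 at p = 0 and 1 at p = 1
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if threshold_success(n, mid, c) > 1 / c:
            hi = mid  # already above par, so the crossing lies to the left
        else:
            lo = mid
    return (lo + hi) / 2


p_star = critical_p(21, 7)
print(p_star, p_star < 1 / 7)
```

Any $$p$$ above the returned value satisfies the inequality for these particular $$N$$ and $$c$$.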



A slight increase in $$p$$ around this critical value helps a lot (as seen from the curves).
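One way to put a number on that sensitivity is a finite-difference slope of the ensemble success with respect to $$p$$. A hedged sketch in plain Python, evaluated at the par accuracy $$p = 1/7$$ (near, not at, the critical value) for $$N = 21$$, $$c = 7$$; the helper name and the step size `h` are choices made for this illustration:

```python
from math import comb


def threshold_success(n, p, c):
    """P(at least floor(n/c) + 1 of n independent models are right)."""
    k = n // c + 1
    return sum(comb(n, m) * p**m * (1 - p) ** (n - m) for m in range(k, n + 1))


n, c, h = 21, 7, 1e-4  # h is an arbitrary small step (assumption)
p0 = 1 / c
# Central difference: change in ensemble success per unit change in p.
slope = (threshold_success(n, p0 + h, c) - threshold_success(n, p0 - h, c)) / (2 * h)
print(slope)
```

A slope above 1 means the ensemble's success rate rises faster than that of a single model in this region.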