Fondazione Bruno Kessler - Technologies of Vision

contains material from

Template Matching Techniques in Computer Vision: Theory and Practice

Roberto Brunelli © 2009 John Wiley & Sons, Ltd

A binary classification task can be considered as a binary hypothesis testing problem where one of the
two competing hypotheses H_{0} and H_{1} must hold. The two basic probabilities characterizing the
Neyman-Pearson approach to testing are the false alarm error probability P_{F} and the
detection probability P_{D}. The former is the probability of returning H_{1} when the true world
state is described by H_{0} and is also known as the probability of a type I error (or false
acceptance rate, FAR). The latter, also known as the power of the test, gives the probability with
which the classifier returns H_{1} when the true world state is H_{1}. Neyman-Pearson
classification, which maximizes P_{D} under a specified bound on P_{F}, results in a simple
thresholding operation on the likelihood ratio value Λ(x) of a pattern x, with the threshold ν fixed
by the allowed false alarm probability under hypothesis H_{0}:

Λ(x) = p(x|H_{1}) / p(x|H_{0}) > ν  ⇒  accept H_{1}   (3.1)
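As a minimal illustration of the thresholding rule (not from the book), consider two unit-variance Gaussian hypotheses; the means (0 and 2) and the threshold ν = 1 are arbitrary choices for the sketch. Thresholding the likelihood ratio at ν = 1 reduces, for these densities, to thresholding x at the midpoint of the two means, so the empirical error rates can be checked against the normal distribution function:

```r
# Sketch (assumed setup): H0 ~ N(0,1), H1 ~ N(2,1), threshold nu = 1
llr <- function(x, mu0 = 0, mu1 = 2) dnorm(x, mu1) / dnorm(x, mu0)

set.seed(1)
n  <- 100000
x0 <- rnorm(n, 0)           # patterns generated under H0
x1 <- rnorm(n, 2)           # patterns generated under H1

nu <- 1                     # threshold on the likelihood ratio
pF <- mean(llr(x0) > nu)    # empirical false alarm probability P_F
pD <- mean(llr(x1) > nu)    # empirical detection probability P_D

# For nu = 1 the decision boundary is x = 1 (midpoint of the means),
# so in theory P_F = 1 - pnorm(1) and P_D = 1 - pnorm(1 - 2)
c(pF, 1 - pnorm(1), pD, pnorm(1))
```

Raising ν trades detections for fewer false alarms, tracing out the ROC curve discussed next.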

The relation between the two probabilities, specifically P_{D} as a function of P_{F }, is usually
represented with a receiver operating characteristic (ROC) curve that is extensively discussed in
Appendix TM:C:

P_{D} = P_{D}(P_{F})   (3.2)

The quantity 1 - P_{D} represents the false negative error probability and is also known as the false
rejection rate (FRR). The ROC curve is often reported as

FRR = 1 - P_{D} = FRR(FAR)   (3.3)

While the ROC curve provides detailed information on the trade-off between the two types of errors, classification systems are often concisely characterized by means of the equal error rate (EER), the intersection of the ROC curve with the diagonal

P_{F} = 1 - P_{D}   (3.4)
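For the equal-covariance Gaussian model treated below (where, in the parametrization of the code that follows, P_{F} = 1 - pnorm(z) and P_{D} = 1 - pnorm(z - σ)), the EER condition P_{F} = 1 - P_{D} can be solved explicitly: by the symmetry of the normal density the crossing occurs at z = σ/2, giving EER = 1 - pnorm(σ/2). A small check of this claim (our addition, not from the book):

```r
# EER for the Gaussian model: closed form vs. numerical root finding
eer <- function(sigma) 1 - pnorm(sigma / 2)

# the EER condition P_F - (1 - P_D) = 0 as a function of the threshold z
f <- function(z, sigma) (1 - pnorm(z)) - pnorm(z - sigma)

z.star <- uniroot(f, c(-10, 10), sigma = 2)$root  # crossing point, sigma = 2
c(z.star, eer(2))  # z.star should equal sigma/2 = 1; EER(2) = 1 - pnorm(1)
```

Larger separations σ push the crossing further out and the EER toward zero, as expected.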

When the data distribution under the two competing hypotheses is Gaussian with the same covariance matrix (and different means) the probabilities considered above can be computed in closed form (see Section TM:3.3) and the multidimensional case does not present significant differences from the one-dimensional one. The key parameter, fixing the maximum achievable performance, is the separation of the distribution means relative to the distribution standard deviation. With reference to Equations TM:3.50-58, we can pass from the parameter ν to

z = ν/σ + σ/2   (3.5)

in order to compute P_{D} = P_{D}(P_{F}), exploiting the fact that the Q-function is simply the complement to
1 of the distribution function (pnorm):

z <- function(nu, s) (nu/s + s/2)
# generate a sequence of thresholds
nus <- seq(-10, 10, by=0.1)
# transform them to z (with sigma = 3) ...
zs <- z(nus, 3)
tm.dev("figures/normalRoc")
plot(1-pnorm(zs), 1-pnorm(zs-3), type="l", lty=1,
     xlab="False alarm rate", ylab="Detection rate")
zs <- z(nus, 2)
lines(1-pnorm(zs), 1-pnorm(zs-2), type="l", lty=2)
zs <- z(nus, 1)
lines(1-pnorm(zs), 1-pnorm(zs-1), type="l", lty=3)
grid()
legend(0.6, 0.6, c("sigma=3", "sigma=2", "sigma=1"), lty=c(1,2,3))
dev.off()
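The same curves can also be evaluated non-parametrically in z: since P_{F} = 1 - pnorm(z), inverting gives z = qnorm(1 - P_{F}), and hence P_{D} = 1 - pnorm(qnorm(1 - P_{F}) - σ). The following sketch (our addition) checks that this direct form agrees with the parametric form plotted above:

```r
# Direct closed form of the Gaussian ROC: P_D as a function of P_F
roc <- function(pF, sigma) 1 - pnorm(qnorm(1 - pF) - sigma)

# spot check against the parametric form used in the plot, sigma = 3
zv    <- seq(-3, 6, by=0.5)
sigma <- 3
pF    <- 1 - pnorm(zv)
max(abs(roc(pF, sigma) - (1 - pnorm(zv - sigma))))  # numerically zero
```

The direct form is convenient when a detection rate must be reported at a prescribed false alarm rate, without sweeping the threshold.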
