# Summary statistics for accuracy and error

Error matrices are easy to construct for raster datasets because the image containing the ground validation samples can be overlain onto the predictive habitat image, provided the two images have the same pixel size and format so that a pixel-on-pixel comparison is possible. Most image processing software or GIS packages will calculate the matrix and standard accuracy measures.

The error matrix will be an N x N matrix, where N is the number of classes. The row headings are the ground validation classes and the column headings the predictive map classes, for the area where the two images overlap. The data in the matrix are the number of pixels of each ground validation class which fall in each predictive map class. The diagonal shows correspondence (correctly classified pixels); the off-diagonal values indicate error. Errors of omission, where a habitat class was present at the location of a particular pixel but not predicted, can be read along the rows (less the diagonal cell). Errors of commission, where a habitat class was predicted to be present when in fact it was not, can be read down the columns (again, less the diagonal cell).
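The pixel-on-pixel construction described above can be sketched in a few lines of NumPy. The arrays and class labels here are illustrative, not from the worked example: two small co-registered rasters with classes coded 1 to 3.

```python
import numpy as np

# Hypothetical co-registered rasters on the same pixel grid,
# with habitat classes coded 1..3.
ground = np.array([[1, 1, 2],
                   [2, 3, 3],
                   [1, 2, 3]])
predicted = np.array([[1, 2, 2],
                      [2, 3, 1],
                      [1, 2, 3]])

n_classes = 3
# Rows = ground validation classes, columns = predictive map classes.
matrix = np.zeros((n_classes, n_classes), dtype=int)
for g, p in zip(ground.ravel(), predicted.ravel()):
    matrix[g - 1, p - 1] += 1

print(matrix)
# Diagonal cells are correctly classified pixels; off-diagonal
# cells in each row are omission errors, in each column commission errors.
```

In practice the two rasters would come from the classified image and the rasterised validation samples, masked to their area of overlap.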

From this basic matrix a number of summary statistics for accuracy and error can be derived:

- **Overall percentage correct:** [(sum of diagonal cells)/(total cells in overlap)] x 100.
- **Omission error** (for any class or group of classes): pixels in the rows minus the appropriate diagonal cell for the class or group of classes.
- **Producer's accuracy** (for any habitat class): number of pixels of a class correctly predicted/total number of pixels of that class known to exist in the ground validation image.
- **Commission error:** pixels in the columns minus the appropriate diagonal cell for the class or group of classes.
- **Consumer's accuracy:** number of pixels correctly predicting a habitat/total number of pixels of that class predicted in the classified image.
- **Average accuracy:** sum of the producer's accuracies for each class/number of classes.
- **Kappa** (and other similar statistics): a statistic that adjusts overall accuracy to account for chance agreement (used in preference to percentage correct).
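The statistics above can all be derived directly from the matrix's diagonal, row totals, and column totals. A minimal sketch, using an invented error matrix rather than the worked example, with kappa computed from the chance agreement expected from the row and column totals:

```python
import numpy as np

# Hypothetical error matrix: rows = ground validation classes,
# columns = predictive map classes.
m = np.array([[30,  5,  2],
              [ 4, 25,  6],
              [ 3,  2, 23]], dtype=float)

total = m.sum()
diag = np.diag(m)

overall = diag.sum() / total            # overall proportion correct
producers = diag / m.sum(axis=1)        # producer's accuracy per class
consumers = diag / m.sum(axis=0)        # consumer's accuracy per class
average = producers.mean()              # average accuracy

# Kappa: overall agreement adjusted for the agreement expected
# by chance from the row and column totals.
chance = (m.sum(axis=1) * m.sum(axis=0)).sum() / total**2
kappa = (overall - chance) / (1 - chance)

print(f"overall = {overall:.2%}, kappa = {kappa:.2f}")
```

Omission and commission errors per class follow the same pattern: the row total minus the diagonal cell, and the column total minus the diagonal cell, respectively.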

In the example above, the predictions that are verified by the ground validation data lie on the diagonal (pink cells). To find errors of omission, read along the rows, skipping the diagonal cell: for example, the yellow row highlights pixels that should have been classed as *Sabellaria* but were predicted to be one of the other habitats. To find errors of commission, read down the columns, again skipping the diagonal cell: for example, the blue column highlights pixels that were classed as *Sabellaria* but were found on the ground to be one of the other habitats. In both cases the error is given as a proportion. In this example, the percentage correct is 71% and the Kappa index is 0.68 (where 1 is a perfect match and 0 is agreement no better than chance). Note that in this case the error matrix also indicates that *Sabellaria* reefs and non-reef habitats are the most likely to be confused (read along the yellow row). This might be expected, given the lack of a distinctive difference between these two habitats.