Massachusetts General Hospital, Center for Morphometric Analysis
Internet Brain Segmentation Repository

Comparison Results

Average Overlap Metric

The overlap metric is a method for comparing two segmentations that is stricter than comparing volumes, since two segmentations can have identical volumes without labeling the same voxels. For a given voxel class, it is defined as the number of voxels assigned to that class in both segmentations divided by the number of voxels assigned to that class in either segmentation. This is the same as the Tanimoto coefficient (see Pattern Classification and Scene Analysis by Duda and Hart, 1973, p. 216).

This metric approaches 1.0 as the two segmentations become more similar and is 0.0 when they share no voxels assigned to the class.
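
The definition above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used to produce the results on this page: the function name `overlap`, the integer label codes, and the flattened-list representation of a volume are all hypothetical, and the score of 1.0 for a class that is absent from both segmentations is a convention chosen here.

```python
from typing import Sequence


def overlap(seg_a: Sequence[int], seg_b: Sequence[int], label: int) -> float:
    """Overlap (Tanimoto) score for one class label between two segmentations.

    seg_a and seg_b are flattened label volumes with one entry per voxel.
    """
    if len(seg_a) != len(seg_b):
        raise ValueError("segmentations must have the same number of voxels")
    both = 0    # voxels given `label` by both segmentations
    either = 0  # voxels given `label` by at least one segmentation
    for a, b in zip(seg_a, seg_b):
        in_a = a == label
        in_b = b == label
        if in_a and in_b:
            both += 1
        if in_a or in_b:
            either += 1
    # Assumed convention: a class absent from both volumes scores 1.0.
    return both / either if either else 1.0


# Hypothetical label codes, not the IBSR file encoding.
BACKGROUND, CSF, GRAY, WHITE = 0, 1, 2, 3

# Two toy 8-voxel "volumes" standing in for a manual and an automated result.
manual = [0, 1, 2, 2, 2, 3, 3, 0]
auto = [0, 2, 2, 2, 3, 3, 3, 0]

print(overlap(manual, auto, GRAY))   # 2 shared gray voxels out of 4 in either
print(overlap(manual, auto, WHITE))  # 2 shared white voxels out of 3 in either
```

In this toy case both segmentations contain three gray voxels, so a volume comparison would report perfect agreement, while the overlap score for gray is only 0.5.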

The following results are from work done by Jagath C. Rajapakse and are partially based on the method described in: Rajapakse JC and Kruggel F, Segmentation of MR Images with Intensity Inhomogeneities, Image and Vision Computing, 1998, In press. The data sets used were the 20 normal subjects (brain-only MR data files), which are available, along with the manual segmentations, from this IBSR.

     Average Overlap between manually-guided segmentations 
            and various methods for 20 brain scans

        gray   white      method
        -----  -----  ------------------------------------
        0.564  0.567  adaptive MAP
        0.558  0.562  biased MAP
        0.473  0.567  fuzzy c-means
        0.550  0.554  Maximum A Posteriori Probability (MAP)
        0.535  0.551  Maximum-Likelihood
        0.477  0.571  tree-structure k-means

        0.876  0.832  Manual (4 brains averaged over 2 experts)

More Details
Also available are the average overlap numbers for the background, CSF, gray and white regions separately for each method and each scan.

Overlap of Gray Voxels for Each Brain Scan

Overlap of White Voxels for Each Brain Scan

The graphs above show the overlap scores for each of the 20 brains. Scores have been multiplied by 1000. The brain scans have been roughly ordered by how difficult they are to segment. The line labeled "expert" is the average overlap between two expert operators who segmented the same four brain scans for a study using different data. This was included to give a sense of the overlap level that has been found to be acceptable for volumetric studies.

Discussion

The 20 coronal brain scans used to generate these results were chosen because they have been used in published volumetric studies in the past and because they have various levels of difficulty. The worst ones have low contrast and relatively large intensity gradients. More recently acquired (i.e. better quality) data should result in far better overlap scores for the automated methods.

