The Journal of the Geological Society of Japan
Online ISSN : 1349-9963
Print ISSN : 0016-7630
ISSN-L : 0016-7630
Review
Note on zero and missing values in compositional data
Hiroyoshi Arai, Tohru Ohta
JOURNAL FREE ACCESS

2006 Volume 112 Issue 7 Pages 439-451

Abstract

In geology, compositional data such as petrochemical compositions, faunal compositions and modal compositions of sandstones are common. Data of this type are subject to an awkward mathematical problem known as the constant-sum constraint. To resolve this problem, logratio and simplicial analyses have been developed over the last two decades. However, zero and missing values are common in practical compositional data and are troublesome for logratio or simplicial analysis, because neither the logarithm nor the geometric mean can accommodate zeros. In this context, many authors have suggested nonparametric methods for replacing zero and missing values. We review these nonparametric methods, namely additive replacement and multiplicative replacement, together with their merits and limitations, after describing the types and nature of zeros: rounded zeros, which stem from the detection limit of an instrument, and essential (or true) zeros, which denote genuine absence. Zero replacement, however, may turn data vectors into outliers and lead to erroneous conclusions. For this reason, we also review how to assess outliers: by atypicality indices of data vectors and by confidence regions of a population. To disseminate statistically rigorous replacement and outlier detection, we developed computer programs for the open-source statistical environment R, which replace zeros in a given data set and calculate atypicality indices.
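
As an illustration of the zero problem and of multiplicative replacement mentioned in the abstract, the following is a minimal Python sketch. It is not the authors' R program. The function names, the example composition and the replacement value `delta = 0.005` are all assumptions chosen for illustration; in practice the replacement value would be tied to the detection limit of the instrument. The sketch assumes the commonly cited form of multiplicative replacement: each zero becomes `delta`, and each nonzero part is multiplied by one common factor so that the constant sum is preserved.

```python
import math


def multiplicative_replace(x, delta, total=1.0):
    """Replace each zero part of composition x by delta (hypothetical value).

    Nonzero parts are multiplied by a common factor so that the parts
    still sum to `total`; ratios among nonzero parts are unchanged.
    """
    n_zeros = sum(1 for v in x if v == 0)
    shrink = 1.0 - n_zeros * delta / total
    return [delta if v == 0 else v * shrink for v in x]


def clr(x):
    """Centred logratio transform: log of each part minus the mean log.

    Raises ValueError if any part is zero, because log(0) is undefined.
    """
    logs = [math.log(v) for v in x]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]


# Hypothetical four-part composition with one rounded zero.
composition = [0.60, 0.30, 0.10, 0.0]

try:
    clr(composition)
    clr_failed_on_zero = False
except ValueError:
    # The logarithm cannot take the zero part.
    clr_failed_on_zero = True

replaced = multiplicative_replace(composition, delta=0.005)
clr_values = clr(replaced)
```

Because every nonzero part is scaled by the same factor, the ratio between the first two parts remains 2 after replacement, and the transformed vector can then be passed to a logratio analysis. How sensitive later conclusions are to the choice of `delta` is exactly the outlier question the abstract raises.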

© 2006 by The Geological Society of Japan