Analysis of soil compositional data for site-specific nutrient management
L. Parent
Université Laval

Compositional data are strictly positive data closed to the unit or scale of measurement. They are relative to each other within a constrained, interactive sample space and show negative correlations arising from resonance between components (if one increases, others must decrease because of closure). Plant tissue composition (the “ionome”) was first diagnosed compositionally in 1992. There are several examples of compositional data in soil science: soil textural composition (sand, silt, clay), organic matter composition (refractory and labile carbon forms), phosphorus and potassium fractions of contrasting availabilities, and the cationic make-up of the cation exchange capacity, among others. Agronomic models are most often run on raw proportions or concentrations, which leads to numerical biases such as confidence intervals that overlap the limits of the sample space, falling below zero or above 100%, which is absurd. Indeed, there are D-1 degrees of freedom in a D-part composition because one component can be computed as the difference between the whole and the sum of the other components. A dual ratio reduces two values to a single one, but the mean of a dual ratio can hardly be back-transformed to the original variables (e.g., concentrations or proportions). The challenge is to reduce D parts to D-1 orthogonal variables without losing any information on the composition. Compositional data analysis, introduced by John Aitchison in 1986 and updated by J.J. Egozcue in 2003, provides tools to avoid these biases by projecting raw data into a Euclidean space of orthogonal variables. Orthogonality is a special case of linear independence in which vectors are at perfectly right angles to one another, thus providing variables that are additive in the Euclidean space. A log ratio between two subsets of components is a log contrast that scans the real space (±∞), as required to run statistical analyses.
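The closure effect described above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the three parts are drawn from independent lognormal distributions (a hypothetical choice, not data from the study) and then closed to a sum of 1. Because the closed parts always sum to a constant, each row of their covariance matrix sums to zero, so every part must covary negatively with at least one other part.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible


def close(parts):
    """Close a vector of positive parts so that it sums to 1."""
    total = sum(parts)
    return [p / total for p in parts]


def covariance(a, b):
    """Sample covariance between two equally long sequences."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (len(a) - 1)


# Hypothetical raw data: 500 samples of 3 independent positive parts.
raw = [[random.lognormvariate(0.0, 0.5) for _ in range(3)] for _ in range(500)]
closed = [close(row) for row in raw]

# Covariance matrix of the closed parts.
columns = list(zip(*closed))
cov = [[covariance(columns[i], columns[j]) for j in range(3)] for i in range(3)]

# Each row sums to (numerically) zero: the variance of a part is offset
# by negative covariance with the other parts.
row_sums = [sum(row) for row in cov]
off_diagonal_sums = [sum(cov[i][j] for j in range(3) if j != i) for i in range(3)]

# D-1 degrees of freedom: the last part is recoverable by difference.
recovered_last = [1.0 - row[0] - row[1] for row in closed]
```

The raw parts are independent by construction, so any negative covariance among the closed parts is produced by closure alone.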
Components are arranged into orthonormal balances following a sequential binary partition (SBP) between non-overlapping subsets of components, then computed as isometric log ratios. The SBP can represent interpretable balances between two meaningful subsets of components. However, any arrangement of balances leads to the same multivariate distances between compositions if the sole objective of the model is to compare compositions. Balances can be expressed as [subset B | subset A], with the geometric mean across the components of subset A in the numerator and the geometric mean across the components of subset B in the denominator, the log ratio being scaled by a normalization coefficient. For modeling continuous variables, carbon, sand, silt and clay can be contrasted as follows: [sand, silt, clay | C] to represent the relationship of carbon with soil texture, [clay | sand, silt] to contrast finer and coarser particles, and [silt | sand] to contrast the two coarser particle classes. The basic cation saturation ratio concept, formerly expressed as base saturation and dual cationic ratios, could be expressed as: [Ca, Mg, exchangeable acidity | K] to assess K requirement, [exchangeable acidity | Ca, Mg] to assess lime requirement, and [Mg | Ca] to select the right source of lime. Plant tissue composition can be broken up into balances used as continuous variables to account for the specific nutrient profiles of cultivars. Examples will be presented.
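The balance computation for the carbon and texture example can be sketched as follows. This is a minimal illustration, not the author's implementation: it assumes the usual normalization coefficient sqrt(r·s/(r+s)) for a balance with r parts in the numerator and s parts in the denominator, and the two soil compositions, the function names, and the alternative partition `ALT_SBP` are hypothetical and serve only to show the mechanics.

```python
import math


def geometric_mean(values):
    """Geometric mean of strictly positive values."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def balance(comp, numerator, denominator):
    """Isometric log ratio between two non-overlapping subsets of parts.

    Assumed normalization: sqrt(r * s / (r + s)), with r and s the number
    of parts in the numerator and denominator subsets.
    """
    r, s = len(numerator), len(denominator)
    g_num = geometric_mean([comp[p] for p in numerator])
    g_den = geometric_mean([comp[p] for p in denominator])
    return math.sqrt(r * s / (r + s)) * math.log(g_num / g_den)


# SBP from the text, written as (numerator, denominator) pairs:
# [sand, silt, clay | C], [clay | sand, silt], [silt | sand].
SBP = [
    (("C",), ("sand", "silt", "clay")),
    (("sand", "silt"), ("clay",)),
    (("sand",), ("silt",)),
]

# A different, hypothetical partition of the same four parts.
ALT_SBP = [
    (("sand",), ("silt", "clay", "C")),
    (("silt",), ("clay", "C")),
    (("clay",), ("C",)),
]


def ilr(comp, sbp=SBP):
    """D-1 balances of a D-part composition for a given SBP."""
    return [balance(comp, num, den) for num, den in sbp]


def euclidean(u, v):
    """Euclidean distance between two equally long vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


# Hypothetical soils, parts in percent (made-up values for illustration).
soil_a = {"sand": 40.0, "silt": 35.0, "clay": 22.0, "C": 3.0}
soil_b = {"sand": 20.0, "silt": 30.0, "clay": 45.0, "C": 5.0}

balances_a = ilr(soil_a)
distance_sbp = euclidean(ilr(soil_a), ilr(soil_b))
distance_alt = euclidean(ilr(soil_a, ALT_SBP), ilr(soil_b, ALT_SBP))

# Balances do not depend on the unit of measurement (percent vs. fraction).
soil_a_fraction = {part: value / 100.0 for part, value in soil_a.items()}
balances_a_fraction = ilr(soil_a_fraction)
```

In this sketch `distance_sbp` and `distance_alt` agree to within floating-point error, which reflects the point made above that the choice of partition affects how the balances are interpreted but not the distance between compositions.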

Keywords: Balances, isometric log ratios, soil and plant compositional data, negative correlations, Euclidean space