
Boletín de Estadística e Investigación Operativa

Vol. 32, No. 1, March 2016, pp. 5-29

Estadística

What are compositional data and how should they be analyzed?

Juan José Egozcue

Departamento de Ingeniería Civil y Ambiental

Universidad Politécnica de Cataluña

juan.jose.egozcue@upc.edu

Vera Pawlowsky-Glahn

Departamento de Informática, Matemática Aplicada y Estadística

Universidad de Girona

vera.pawlowsky@udg.edu

Abstract

Compositions describe parts of a whole which carry relative information. Compositional data appear in all fields of science and their analysis requires paying attention to the appropriate sample space. The log-ratio approach proposes the simplex, endowed with the Aitchison geometry, as an appropriate sample space. The main characteristics of the Aitchison geometry are presented, which open the door to compositional statistical analysis. The main consequence is that compositions can be represented in Cartesian coordinates by using the so-called isometric log-ratio transformation. Standard statistical techniques can be used on these coordinates. Employment-unemployment data for the period 2008-2015, distributed by activity sectors across Comunidades Autónomas in Spain, provide an example to demonstrate the exploratory capabilities of three specific tools of compositional data analysis: the variation matrix, the compositional biplot, and the dendrogram. An exploratory regression on time is also presented.

Keywords: Compositional data analysis, Aitchison geometry, simplex, variation matrix, compositional biplot, balance dendrogram, ilr, clr

AMS Subject classifications: 62-07, 62-02

© 2016 SEIO