Title |
A discrete perspective on nonlinear dimension reduction |
Author(s) |
Christina Bogner, Baltasar Trancón y Widemann, Holger Lange |
Conference |
EGU General Assembly 2013
|
Media type |
Article
|
Language |
English
|
Digital document |
PDF |
Published |
In: GRA - Volume 15 (2013) |
Record number |
250077388
|
Abstract |
Environmental data sets are often large and high-dimensional and thus difficult to visualize
and analyze. In hydrology, for example, we often deal with time series from long-term
physical and chemical monitoring of stream water, groundwater or soils. These data can be
seen as a multivariate characteristic of chemico-physical properties of water or soil and are
used to infer processes in ecosystems.
Despite their high dimensionality, ecological data are often assumed to have a simple
underlying intrinsic structure. That is, they can be summarized in fewer dimensions
without a serious loss of information. Therefore, dimensionality reduction techniques
are often the first step in data analysis. They are used to visualize data as well as
to uncover the intrinsic (low-dimensional) structure.
As an example application, we use high-dimensional hydrochemical data at the headwater
catchment level (ion concentrations from first-order streams). We investigate Isometric
Feature Mapping (Isomap), a popular method for non-linear dimension reduction. Here, the
topology of the data set is approximated by constructing a local neighbourhood graph.
However, the assumption of smoothness underlying this approximation is difficult to justify
for many environmental data sets, and issues of measurement errors and sampling gaps
render Isomap analyses questionable. Thus, we extend our methodology with an analogous, but
more robust, discrete (non-smooth) transformation leading to a set of binary data. For such
binary data, a plethora of data-mining techniques exists, in particular unsupervised and
semi-supervised machine learning algorithms. These can be employed to automate or support
classification and feature detection tasks, taking the non-linear structure of available data into
account. First results of this newly developed analysis method will be presented. |
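
The neighbourhood-graph step that Isomap relies on can be illustrated with a minimal sketch. This is not the authors' implementation: the data are synthetic points on a half circle, the function names `knn_graph` and `geodesic_distances` are hypothetical, and the final embedding step of Isomap (classical multidimensional scaling on the geodesic distances) is omitted. The discrete transformation proposed in the abstract is not shown, since the abstract does not specify it.

```python
import math

def knn_graph(points, k):
    """Symmetric k-nearest-neighbour graph as {i: {j: euclidean_distance}}."""
    n = len(points)
    graph = {i: {} for i in range(n)}
    for i in range(n):
        # Distances from point i to every other point, nearest first.
        dists = sorted(
            (math.dist(points[i], points[j]), j) for j in range(n) if j != i
        )
        for d, j in dists[:k]:
            graph[i][j] = d
            graph[j][i] = d  # keep the graph undirected
    return graph

def geodesic_distances(graph):
    """All-pairs shortest path lengths along graph edges (Floyd-Warshall)."""
    n = len(graph)
    dist = [[math.inf] * n for _ in range(n)]
    for i in range(n):
        dist[i][i] = 0.0
        for j, d in graph[i].items():
            dist[i][j] = d
    for m in range(n):
        for i in range(n):
            for j in range(n):
                via = dist[i][m] + dist[m][j]
                if via < dist[i][j]:
                    dist[i][j] = via
    return dist

# Synthetic data (an assumption for illustration): 11 points on a unit half circle.
points = [(math.cos(math.pi * t / 10), math.sin(math.pi * t / 10)) for t in range(11)]
graph = knn_graph(points, k=2)
geo = geodesic_distances(graph)

straight = math.dist(points[0], points[10])  # Euclidean distance between the end points
along_graph = geo[0][10]                     # distance measured along the graph
```

Here `along_graph` is longer than `straight` because the path follows the curve rather than cutting across it. The sketch also hints at the fragility noted in the abstract: with sampling gaps or noisy measurements, the chosen `k` can leave the graph disconnected or create shortcut edges, which distorts the geodesic distances.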