R-package aquap2 – Multivariate Data Analysis Tools for R including Aquaphotomics Methods

Bernhard Pollner (bernhard.pollner@mac.com)

Zoltan Kovacs (kovacs@correltech.com)

Introduction

We have been developing this software in the R programming environment since 2013 to provide an easy-to-use tool for analyzing NIRS data from an aquaphotomics perspective. Development is a volunteer undertaking, so new functionality may sometimes arrive slowly; although the package is still in beta, it is already powerful, flexible and versatile. Besides special functionality not available elsewhere (e.g. aquagram calculations), its main advantages over commercial software are that it can dramatically shorten analysis time and that highly repetitive tasks become completely scriptable, i.e. automated.

The software can assist you in the following main areas:

  • Experiment design and randomization of the sample list
  • Data import from various file formats (e.g. *.DA, *.txt, *.xls, *.xlsx, *.pir; further formats can be added via a custom import function)
  • Sample names, grouping variables and further information on constituents can be automatically fused with the spectral data
  • Data from a logger monitoring the environmental parameters of the experiment (e.g. temperature, humidity) can also be automatically fused with the spectral data
  • Data analysis and visualization:
    • Grouping / splitting / slicing of data with encapsulated, i.e. stable, color-coding of samples / groups (e.g. the same analysis flow repeated on separate parts of the data)
    • Various data pre-treatments (e.g. smoothing, SNV, MSC, EMSC, detrend, derivatives [with different methods], averaging, resampling, artificial noise loading)
    • Implemented analysis methods include PCA, PLSR, SIMCA and different versions of the aquagram calculation (in the pipeline: PLS-DA, ANN, SVM, ICA)
    • Very flexible data visualization, from raw spectra and peak-marked vectors to score plots with statistically meaningful ellipses
    • Different cross-validation and independent prediction options to support model optimization
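
To illustrate one of the pre-treatments listed above, here is a minimal sketch of the standard normal variate (SNV) transformation in base R. It does not use any aquap2 function; the helper name `snv`, the wavelength range and the simulated spectra are assumptions made purely for this example.

```r
# SNV: center and scale every spectrum (matrix row) by its own mean and
# standard deviation. Hypothetical helper, not part of aquap2.
snv <- function(spectra) {
  spectra <- as.matrix(spectra)
  row_means <- rowMeans(spectra)
  row_sds <- apply(spectra, 1, sd)
  # A vector of length nrow(spectra) is recycled down the columns,
  # so each row is paired with its own mean and sd.
  (spectra - row_means) / row_sds
}

set.seed(1)
wavelengths <- seq(1300, 1600, by = 2)  # nm, assumed range for illustration
spectra <- matrix(rnorm(6 * length(wavelengths), mean = 1, sd = 0.1),
                  nrow = 6)             # 6 simulated spectra
spectra_snv <- snv(spectra)
```

After the transformation each row of `spectra_snv` has mean 0 and standard deviation 1, which removes additive and multiplicative baseline differences between individual spectra.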

Typical Workflow

After the initial setup (which only has to be done once), the first step is to generate the folder structure needed for an experiment. You can then design the experiment using the entries in the experiment's metadata and have the package generate a randomized sample list to use during your measurements.
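
The idea of a randomized sample list can be sketched in base R as follows. This is not the aquap2 implementation; the variable names (`group`, `time`, `replicate`) and their levels are hypothetical and only stand in for the entries you would define in the metadata.

```r
set.seed(42)  # fixed seed so the random order is reproducible

# Full factorial design from hypothetical class variables
design <- expand.grid(
  group = c("control", "treated"),
  time = c("T0", "T1", "T2"),
  replicate = 1:3,
  stringsAsFactors = FALSE
)

# Shuffle the rows to obtain a random measurement order
sample_list <- design[sample(nrow(design)), ]
sample_list$measurement_order <- seq_len(nrow(sample_list))
rownames(sample_list) <- NULL

head(sample_list)
```

Every combination of the design variables appears exactly once in `sample_list`, only the order is randomized.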

After data acquisition you can import the data either from one of the supported formats or by plugging in your own import function to read your specific data format. (If you have a temperature and relative humidity log file, its data can be aligned automatically to the timestamps in your dataset.)
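
Aligning logger data to the timestamps of the spectra can be pictured as a nearest-timestamp lookup. The following base R sketch only illustrates that idea under assumed column names (`time`, `temp_c`, `rel_hum`) and made-up values; it is not the routine used inside aquap2.

```r
# Hypothetical acquisition times of three spectra
scan_time <- as.POSIXct(c("2016-11-26 10:00:05",
                          "2016-11-26 10:02:40",
                          "2016-11-26 10:07:10"), tz = "UTC")

# Hypothetical logger file: one reading per minute
logger <- data.frame(
  time = as.POSIXct("2016-11-26 10:00:00", tz = "UTC") + seq(0, 600, by = 60),
  temp_c = seq(22.0, 23.0, by = 0.1),
  rel_hum = seq(40, 50, by = 1)
)

# For every scan, find the index of the logger reading closest in time
nearest <- vapply(seq_along(scan_time), function(i) {
  which.min(abs(as.numeric(logger$time) - as.numeric(scan_time[i])))
}, integer(1))

aligned <- data.frame(scan_time = scan_time,
                      logger[nearest, c("temp_c", "rel_hum")],
                      row.names = NULL)
aligned
```

Each scan is paired with the single logger reading closest to it in time; a real workflow might additionally reject matches that lie too far apart.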

Once your data is imported into the standard dataset of aquap2, you can use the analysis procedure anProc.r to split up and group the data (by different wavelength ranges and/or by different sample groups), calculate various models and then use the implemented methods and tools to analyze your data.
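
What "splitting by wavelength range and by sample group, then calculating a model" means can be sketched with base R alone. The example below does not use anProc.r or any aquap2 function; the simulated data, the group labels and the chosen range of 1400 to 1500 nm are assumptions for illustration.

```r
set.seed(7)
wavelengths <- seq(1300, 1600, by = 10)   # nm, assumed
groups <- rep(c("A", "B"), each = 5)      # hypothetical sample groups
spectra <- matrix(rnorm(10 * length(wavelengths)), nrow = 10,
                  dimnames = list(NULL, paste0("X", wavelengths)))

# Split 1: keep only the wavelengths inside the range of interest
in_range <- wavelengths >= 1400 & wavelengths <= 1500

# Split 2: separate the remaining data by sample group
by_group <- split(as.data.frame(spectra[, in_range]), groups)

# Calculate one PCA model per group
models <- lapply(by_group, function(d) prcomp(d, center = TRUE, scale. = FALSE))

# Proportion of variance explained per component, for each group
lapply(models, function(m) summary(m)$importance["Proportion of Variance", ])
```

The same pattern (subset, split, apply a model to every part) is what becomes tedious by hand and is worth scripting when many ranges and groups are involved.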

Installation procedure for R-package “aquap2”

Step 1.

# download and install the latest version of R on your system.

# https://www.r-project.org

Step 2.

# download and install the latest version of RStudio on your system.

# https://www.rstudio.com/products/rstudio/download/

Step 3.

# go to RStudio and run the following code:

install.packages(c("devtools", "iterators"))

library(devtools)

install_github(repo="bpollner/aquap2", ref="latestPublic",  build_vignettes=FALSE, force=TRUE)

# now you should have the package “aquap2” installed on your system

# you can check this by

library(aquap2)

# now you can proceed to set up the package “aquap2” as described in its help pages

# good luck – and have fun! And, please do not forget to cite us! 🙂

citation("aquap2")

# Note: you have to have an internet connection, as you will download several packages.

# Note: if you encounter an error, please READ the error message and try to react accordingly.
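
If you would rather check the installation without attaching the package, the following small sketch uses only base R functions (`requireNamespace`, `packageVersion`). The helper name `is_installed` is made up for this example.

```r
# Returns TRUE if the package can be loaded, FALSE otherwise,
# without attaching it to the search path.
is_installed <- function(pkg) {
  requireNamespace(pkg, quietly = TRUE)
}

if (is_installed("aquap2")) {
  message("aquap2 version ", as.character(packageVersion("aquap2")),
          " is installed.")
} else {
  message("aquap2 was not found; please repeat Step 3 and read any error messages.")
}
```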

Feedback and remarks

We sincerely hope to welcome you to the growing community of aquap2 users. With your feedback, the power and applicability of our package can be further improved so that our work becomes useful.

Our development of the R package has been presented at the following events:

  • Understanding Water in Biology 2nd International Symposium, Kobe University, Faculty of Agriculture, 26-29 November 2016, Kobe, Japan
    • Pollner, B., Kovacs, Z. (2016): Dedicated Aquaphotomics-Software R-Package “aquap2” General Introduction and Workshop
  • AQUAPHOTOMICS: UNDERSTANDING WATER in the BIOLOGICAL WORLD at The 5th Kobe University Brussels European Centre Symposium Innovation, Environment and Globalization – Latest EU-Japan Research Collaboration, 14 October 2014
    • Pollner, B., Kovacs, Z. (2014): Demonstrations of new tools for spectral data analysis & NIRS of Waters