Development of the software tool Sample Size for Arbitrary Distributions and exemplarily applying it for calculating minimum numbers of moss samples used as accumulation indicators for atmospheric deposition

Abstract

Background

Do we measure enough to calculate statistically valid characteristic values from random sample measurements, or do we measure too much—without any further increase in knowledge? This question is actually one of the key issues of every empirical measurement design, but is rarely investigated in environmental monitoring.

Results

In this study, the methodology used in designing the German Moss Survey 2015 network to determine statistically valid minimum sample numbers (MSN) for calculating the arithmetic mean in compliance with specified accuracy requirements was further developed for data that are neither normally nor lognormally distributed. The core element of the procedure for estimating the MSN without any assumption about the distribution of the data is an iterative Monte Carlo simulation. The principle is to use reference data (values measured in the Moss Surveys preceding that of 2015) to determine, for a series of MSN candidate values, what accuracy each candidate would achieve, and then to calculate the MSN that meets the specified accuracy requirement from a quadratic function relating the MSN candidates to their accuracy. The program Sample Size for Arbitrary Distributions (SSAD) was developed in the open programming language R for calculating the MSN.

Conclusions

The SSAD procedure closes a gap in the existing methodology for calculating statistically valid minimum sample numbers.

Background

With the exception of 2010, Germany participated in the European Moss Survey, conducted every 5 years between 1990 and 2015 [4]. The methodological basis for all national contributions to this international environmental monitoring programme is a guideline which, on the initiative of the authors of this article, recommends statistically justified minimum sample numbers (MSN = number of moss sampling sites) so that certain error tolerances are met when calculating the arithmetic mean for different spatial categories (e.g., administrative units, ecological spatial classes) [1]. The calculation formula of the International Cooperative Programme on Effects of Air Pollution on Natural Vegetation and Crops [1] (“method A”) assumes that the standard deviations are well known from many measurements and that the element concentrations in the mosses are normally distributed. For the frequent case that the substance concentrations in the mosses are lognormally distributed (41% of the German data set in 2005), Wosniok ([11], cited in [8]) extended the MSN methodology (“method B”) on the basis of the formula proposed by Cox for the confidence interval of the mean of lognormally distributed data (mentioned as “personal communication” in Land [2], cited in [5]). The purpose of this study is to further develop the methodology for calculating minimum sample numbers for data that are neither normally nor lognormally distributed (“method C”). To verify the methodology, the newly developed method is applied to data from the Moss Survey 2015 and compared with MSN calculations based on the previous methodology.

Method development

Theory

The core element of the procedure for estimating the MSN without any assumption about the distribution of the data (method C) is an iterative Monte Carlo simulation [7]. The principle is to use reference data (previously measured values) to determine, for a series of MSN candidate values, what accuracy each candidate would achieve, and then to calculate the minimum MSN that meets the specified accuracy requirement from a quadratic function relating the MSN candidates to their accuracy. The accuracy criteria are the same as those underlying the formula for normally distributed data (method A) and the extension based on the formula of Cox (method B) for lognormally distributed data [3, 9].

The starting point for determining MSN candidate values for method C is the number \(n_{1}\) obtained under the assumption of a normal distribution according to method A [1]. If \(n_{1} < 20\), \(n_{1} = 20\) is set to stabilize the further procedure numerically. In addition to \(n_{1}\), further candidate values \(n_{i}, i = 2, \ldots ,I\), for the sought MSN are chosen in the range \(\left( {\sqrt {n_{1} } , 2 \cdot n_{1} } \right)\), above and below the first estimate. The typical number of candidate values is \(I = 11\).
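The choice of candidate values can be sketched as follows. This is an illustrative Python sketch, not code from the SSAD R script: the function name `candidate_sizes` is hypothetical, and the geometric spacing of the candidates is an assumption, since the text only specifies the range \(\left( \sqrt{n_{1}}, 2 \cdot n_{1} \right)\) and the typical count \(I = 11\).

```python
import math


def candidate_sizes(n1, count=11, floor=20):
    """Return sorted candidate sample sizes around the first estimate n1.

    Assumption: candidates are spaced geometrically between sqrt(n1) and
    2 * n1; the source gives only the range and the number of candidates.
    """
    n1 = max(int(n1), floor)            # stabilisation step: n1 is at least 20
    lo, hi = math.sqrt(n1), 2.0 * n1
    ratio = (hi / lo) ** (1.0 / (count - 2))
    sizes = {n1}                        # the first estimate is always a candidate
    for k in range(count - 1):
        value = round(lo * ratio ** k)
        # keep integer candidates inside [sqrt(n1), 2 * n1]
        value = min(max(value, math.ceil(lo)), math.floor(hi))
        sizes.add(value)
    return sorted(sizes)                # duplicates after rounding are merged


if __name__ == "__main__":
    print(candidate_sizes(20))
```

Because rounded values can coincide, the function may return fewer than `count` candidates for small \(n_{1}\).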

For each candidate value \(n_{i}\), Monte Carlo simulation (stochastic simulation) is then used to determine how accurately a mean value from the distribution of the available reference data would be estimated on the basis of a sample of size \(n_{i}\). The maximum difference between the mean value and the limits of the 95% confidence interval serves as the measure of accuracy. For the simulation, the density of the available data is calculated once as a kernel density estimate, and from this the estimate of the associated distribution function is derived. From this distribution, random Monte Carlo samples \(x_{m} , m = 1, \ldots , n_{i}\), of size \(n_{i}\) are then drawn (inversion method). For each sample \(j\), its arithmetic mean \(\overline{x}_{j}\) is determined. Each \(L\) successive mean values form a block \(B_{b} = \left\{ {\overline{x}_{{\left( {b - 1} \right)L + 1}} ,\overline{x}_{{\left( {b - 1} \right)L + 2}} , \ldots , \overline{x}_{bL} } \right\}\) of averages. For each block, the absolute local quality criterion

$$\delta_{{b,{\text{abs}}}} = \max \left( {\left| {\overline{x}_{\text{ref}} - Q_{2.5} \left( {B_{b} } \right)} \right|,\left| {Q_{97.5} \left( {B_{b} } \right) - \overline{x}_{\text{ref}} } \right|} \right)$$
(1)

and the relative local criterion

$$\delta_{{b,{\text{rel}}}} = \frac{{ 100 \, \delta_{{b,{\text{abs}}}} }}{{\overline{x}_{\text{ref}} }}$$
(2)

are calculated. Here, \(\overline{x}_{\text{ref}}\) is the arithmetic mean of the available reference data, \(b\) the index of the current block and \(Q_{p} \left( {B_{b} } \right)\) the p% quantile of those \(\overline{x}_{j}\) which belong to block \(B_{b}\). With a block size of L = 120, \(Q_{2.5} \left( {B_{b} } \right)\) and \(Q_{97.5} \left( {B_{b} } \right)\) are obtained by counting in the ascending sorted sequence of the \(\overline{x}_{j}\) in block \(B_{b}\). Both criteria are called local because they refer to only one block. The sought global quality criterion \(\Delta_{\text{rel}} \left( {n_{i} ,N} \right)\) for a sample of size \(n_{i}\) is calculated by averaging the local contributions over all \(N\) drawn blocks:

$$\Delta_{\text{rel}} \left( {n_{i} ,N} \right) = \frac{1}{N}\mathop \sum \limits_{b} \delta_{{b,{\text{rel}}}} \left( {n_{i} } \right).$$
(3)
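The simulation for a single candidate value and a fixed number of blocks can be sketched as follows. This is an illustrative Python sketch using only the standard library, not the SSAD R code. All function names are hypothetical. The Gaussian kernel, the bandwidth rule, the grid size and the rank convention used for the block quantiles are assumptions, because the text does not specify them.

```python
import bisect
import math
import random
import statistics


def kde_cdf(data, grid_points=256):
    """Tabulate the CDF of a Gaussian kernel density estimate of `data`."""
    h = 1.06 * statistics.stdev(data) * len(data) ** -0.2  # bandwidth (assumption)
    lo, hi = min(data) - 4 * h, max(data) + 4 * h
    xs = [lo + (hi - lo) * k / (grid_points - 1) for k in range(grid_points)]
    scale = h * math.sqrt(2.0)
    cdf = [sum(0.5 * (1.0 + math.erf((x - d) / scale)) for d in data) / len(data)
           for x in xs]
    return xs, cdf


def draw(xs, cdf, rng):
    """Inversion method: push a uniform random number through the inverse CDF."""
    u = rng.random()
    k = bisect.bisect_left(cdf, u)
    if k == 0:
        return xs[0]
    if k >= len(xs):
        return xs[-1]
    t = (u - cdf[k - 1]) / (cdf[k] - cdf[k - 1])   # linear interpolation
    return xs[k - 1] + t * (xs[k] - xs[k - 1])


def delta_rel_block(means, x_ref):
    """Relative local criterion, Eqs. (1) and (2), for one block of means."""
    s = sorted(means)
    size = len(s)
    lo_rank = (25 * size + 999) // 1000    # ceil(0.025 * size), rank convention assumed
    hi_rank = (975 * size + 999) // 1000   # ceil(0.975 * size)
    q_lo, q_hi = s[lo_rank - 1], s[hi_rank - 1]
    d_abs = max(abs(x_ref - q_lo), abs(q_hi - x_ref))
    return 100.0 * d_abs / x_ref


def delta_rel_global(data, n_i, n_blocks, block_size=120, seed=0):
    """Global criterion, Eq. (3): average of the local criteria over all blocks."""
    rng = random.Random(seed)
    xs, cdf = kde_cdf(data)
    x_ref = statistics.fmean(data)
    deltas = []
    for _ in range(n_blocks):
        means = [sum(draw(xs, cdf, rng) for _ in range(n_i)) / n_i
                 for _ in range(block_size)]
        deltas.append(delta_rel_block(means, x_ref))
    return statistics.fmean(deltas)
```

With reference data in a list `ref`, a call such as `delta_rel_global(ref, n_i=30, n_blocks=10)` returns the relative accuracy in percent for samples of size 30.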

With growing \(N\), the sequence of the \(\Delta_{\text{rel}} \left( {n_{i} ,N} \right)\) converges to the true value \(\Delta_{\text{rel}} \left( {n_{i} } \right)\). The practical decision whether enough blocks have been considered, i.e., whether the current \(\Delta_{\text{rel}} \left( {n_{i} ,N} \right)\) is close enough to \(\Delta_{\text{rel}} \left( {n_{i} } \right)\), is based on the behavior of the last 5 calculated values of \(\Delta_{\text{rel}} \left( {n_{i} ,N} \right)\). If all of the last 4 calculated values, i.e., \(\Delta_{\text{rel}} \left( {n_{i} ,N - 3} \right), \Delta_{\text{rel}} \left( {n_{i} ,N - 2} \right),\Delta_{\text{rel}} \left( {n_{i} ,N - 1} \right), \Delta_{\text{rel}} \left( {n_{i} ,N} \right),\) differ by less than the specified convergence criterion \(\varepsilon\) (typical: \(\varepsilon = 0.1\)) from \(\Delta_{\text{rel}} \left( {n_{i} ,N - 4} \right)\), then \(\Delta_{\text{rel}} \left( {n_{i} ,N} \right)\) is regarded as a sufficiently accurate determination of the precision \(\Delta_{\text{rel}} \left( {n_{i} } \right)\), and the simulation procedure for the current candidate value \(n_{i}\) is terminated.
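The stopping rule can be sketched in a few lines. This is an illustrative Python sketch with hypothetical names; `next_block_delta` stands for any function that returns the local criterion \(\delta_{b,\text{rel}}\) of a newly simulated block, and the cap `max_blocks` is an added safeguard that the text does not mention.

```python
def has_converged(history, eps=0.1):
    """True if each of the last 4 values differs by less than eps from the 5th-last."""
    if len(history) < 5:
        return False
    anchor = history[-5]                  # Delta_rel(n_i, N - 4)
    return all(abs(value - anchor) < eps for value in history[-4:])


def run_until_converged(next_block_delta, eps=0.1, max_blocks=10000):
    """Add blocks until the running mean of the local criteria has stabilised."""
    total = 0.0
    history = []                          # running values Delta_rel(n_i, N)
    for n_blocks in range(1, max_blocks + 1):
        total += next_block_delta()
        history.append(total / n_blocks)
        if has_converged(history, eps):
            return history[-1], n_blocks
    raise RuntimeError("no convergence within max_blocks blocks")
```

At least five blocks are always simulated, because the rule compares four values with a fifth.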

After passing through the loop described above over all candidate values \(n_{i}\), \(I\) support points for describing the relationship between \(n_{i}\) and \(\Delta_{\text{rel}} \left( {n_{i} } \right)\) are available. Previous experience has shown that this relationship can be approximated well by the relationship

$$\ln \left( {\Delta_{\text{rel}} \left( {n_{i} } \right)} \right) = \beta_{0} + \beta_{1} n_{i} + \beta_{2} n_{i}^{2} .$$
(4)

A quadratic parabola can always be considered a local second-order Taylor expansion of the unknown true relationship between \(\ln \left( {\Delta_{\text{rel}} \left( {n_{i} } \right)} \right)\) and \(n_{i}\). The error of a Taylor expansion in the approximated range could be calculated theoretically if the true relation were formally known. For the present data, the quality of a quadratic approximation was checked with simulated test data (normally and lognormally distributed data and mixtures of these) and with concentration data from the 2005 moss survey. A quadratic parabola will presumably provide a sufficient approximation to the simulated data in many cases; however, we advise future users of the SSAD approach to check the validity of this approximation for the reference data at hand. This can be done easily with the SSAD R script described in the “Implementation” section, which provides the relevant figure. If needed, Eq. (4) can be replaced by a higher-order polynomial or a spline function.
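A numerical version of this check can be sketched as follows: fit Eq. (4) by least squares and inspect the largest residual on the log scale. This is an illustrative Python sketch with hypothetical function names, not the SSAD R code; rescaling \(n_{i}\) before solving the normal equations is a choice made here for numerical stability only.

```python
import math


def solve_linear(A, r):
    """Solve A x = r by Gaussian elimination with partial pivoting."""
    n = len(r)
    M = [row[:] + [ri] for row, ri in zip(A, r)]
    for col in range(n):
        piv = max(range(col, n), key=lambda i: abs(M[i][col]))
        M[col], M[piv] = M[piv], M[col]
        for i in range(col + 1, n):
            f = M[i][col] / M[col][col]
            for j in range(col, n + 1):
                M[i][j] -= f * M[col][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x


def fit_log_quadratic(ns, deltas):
    """Least-squares estimates (b0, b1, b2) for ln(delta) = b0 + b1*n + b2*n^2."""
    s = float(max(ns))
    u = [n / s for n in ns]                       # rescaled sample sizes
    y = [math.log(d) for d in deltas]
    S = [sum(ui ** k for ui in u) for k in range(5)]
    A = [[S[i + j] for j in range(3)] for i in range(3)]
    r = [sum(yi * ui ** i for ui, yi in zip(u, y)) for i in range(3)]
    c0, c1, c2 = solve_linear(A, r)
    return c0, c1 / s, c2 / s ** 2                # back to the original scale


def max_log_residual(ns, deltas, coeffs):
    """Largest absolute deviation between ln(delta) and the fitted parabola."""
    b0, b1, b2 = coeffs
    return max(abs(math.log(d) - (b0 + b1 * n + b2 * n * n))
               for n, d in zip(ns, deltas))
```

A large value of `max_log_residual` relative to the spread of \(\ln \Delta_{\text{rel}}\) would suggest trying a higher-order polynomial or a spline instead.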

The \(\beta\)-coefficients in Eq. (4) can be determined by linear regression. From the estimated coefficients \(\hat{\beta }_{i}\) of Eq. (4) and the accuracy requirement \(\Delta_{\text{rel,goal}}\), the MSN is obtained by solving the quadratic equation

$$\ln \left( { \Delta_{\text{rel, goal}} } \right) = \hat{\beta }_{0} + \hat{\beta }_{1} {\text{MSN}} + \hat{\beta }_{2} ({\text{MSN}})^{2} ,$$
(5)

with

$$a = \frac{{\hat{\beta }_{1} }}{{\hat{\beta }_{2} }}, \quad b = \frac{{\hat{\beta }_{0} - \ln \Delta_{\text{rel, goal}} }}{{\hat{\beta }_{2} }}, \quad d = \sqrt {\frac{{a^{2} }}{4} - b}$$
(6)
$${\text{MSN}}_{1,2} = - \frac{a}{2} \pm d.$$
(7)

Equation (5) formally has two solutions of which only \({\text{MSN}}_{1}\) in Eq. (7) is a solution for the problem at hand.
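The solution step can be sketched as follows. Dividing Eq. (5) by \(\hat{\beta}_{2}\) gives \({\text{MSN}}^{2} + a \cdot {\text{MSN}} + b = 0\) with \(a = \hat{\beta}_{1} / \hat{\beta}_{2}\) and \(b = (\hat{\beta}_{0} - \ln \Delta_{\text{rel, goal}}) / \hat{\beta}_{2}\), which is what the illustrative Python sketch below implements. The function name `msn_roots` is hypothetical. The sketch returns both formal roots and leaves the choice of the admissible one to the caller.

```python
import math


def msn_roots(b0, b1, b2, delta_goal):
    """Return both roots of ln(delta_goal) = b0 + b1*MSN + b2*MSN^2."""
    a = b1 / b2
    b = (b0 - math.log(delta_goal)) / b2
    disc = a * a / 4.0 - b
    if disc < 0:
        # the fitted parabola never reaches the requested accuracy
        raise ValueError("accuracy goal is not attained by the fitted curve")
    d = math.sqrt(disc)
    return -a / 2.0 + d, -a / 2.0 - d     # roots with "+" and "-" in Eq. (7)
```

In practice the result would be rounded up to the next integer, since a sample number must be whole; that rounding is not part of the sketch.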

The SSAD approach relies on the estimated distribution function of the reference data. The accuracy of this estimate, and consequently of any value derived from it, depends among other things on the size of the reference data set. For a typical lognormal distribution with 2.5% quantile = 10 and 97.5% quantile = 50, a sample size of n = 25 allows the mean on the linear scale to be estimated with a precision of 20% (half-width of the confidence interval). A reference sample should not be less precise than this; hence the minimum requirement of n = 25 for the SSAD approach.
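The order of magnitude of this figure can be checked with a rough calculation. The illustrative Python sketch below recovers the lognormal parameter \(\sigma\) from the two quantiles, converts it to a coefficient of variation on the linear scale and applies a normal approximation to the confidence interval of the mean. The text does not state which interval formula was used, so the normal approximation is an assumption, and the function name is hypothetical. Under this assumption the half-width for n = 25 comes out at about 17%, of the same order as the 20% quoted above.

```python
import math


def lognormal_halfwidth_pct(q_lo, q_hi, n, z=1.959964):
    """Approximate relative half-width (%) of the 95% CI for a lognormal mean.

    q_lo and q_hi are the 2.5% and 97.5% quantiles of the distribution.
    Normal approximation for the sample mean (assumption, not from the source).
    """
    sigma = (math.log(q_hi) - math.log(q_lo)) / (2.0 * z)   # SD on the log scale
    cv = math.sqrt(math.exp(sigma ** 2) - 1.0)              # SD / mean, linear scale
    return 100.0 * z * cv / math.sqrt(n)


if __name__ == "__main__":
    print(round(lognormal_halfwidth_pct(10.0, 50.0, 25), 1))
```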

Implementation

The program Sample Size for Arbitrary Distributions (SSAD) was developed in the open programming language R [6] for the practical determination of the MSN. The SSAD program requires a file with reference data as well as further inputs to control the calculation. Reference data must be available as a column in a csv file. Such files can be created with any editor, and most software products for data storage or evaluation can export data in this format. The first line of the file must contain unique variable names, even if there is only one variable. It is recommended to use variable names of at most 32 characters, consisting only of letters, numbers, “.” and “_”. Names are case-sensitive. Since the csv format allows different separators between fields and does not specify the decimal character, both the column separator and the decimal character must be given to the SSAD program, as described below. Entries for controlling the calculation are made directly in SSAD_V9.R at correspondingly commented places. These are:

  • A description of the current analysis to identify the results (free text),

  • the path and file name of the file containing the reference data,

  • the character used to separate fields in the file (typically: semicolon),

  • the decimal character used (typically: period or comma),

  • the accuracy requirement (\(\Delta_{\text{rel, goal}}\) in the previous description). The accuracy requirement refers to the size of the 95% confidence interval with which the arithmetic mean of future samples is to be determined. The (relative) accuracy is calculated as a percentage according to Eqs. (1) and (2). Alternatively, the accuracy can be specified as an absolute value instead of a relative percentage of the mean value.

Further control options, which were not required for this application, are described in the program. Executing SSAD_V9.R creates a table containing the input parameters and the calculated MSN. This table appears on the R console and is also saved as a text file in the txt/ directory under the specified evaluation name. Images with corresponding names, which document the course of the simulation and calculation, are stored in the Fig/ directory.
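What the separator and decimal-character settings mean for the reference file can be illustrated with a short reader. This is a Python sketch using only the standard library; it does not reproduce the interface of SSAD_V9.R, and the function name, the parameter names and the example column names are hypothetical.

```python
import csv
import io


def read_reference_column(text, column, sep=";", dec=","):
    """Read one named column of numbers from csv text.

    sep is the field separator and dec the decimal character of the file.
    The first line must hold the variable names; names are case-sensitive.
    """
    reader = csv.DictReader(io.StringIO(text), delimiter=sep)
    values = []
    for row in reader:
        cell = (row[column] or "").strip()
        if cell:                                   # skip empty cells
            values.append(float(cell.replace(dec, ".")))
    return values


if __name__ == "__main__":
    example = "Cd;Pb\n0,25;3,1\n0,31;2,8\n"        # made-up values for illustration
    print(read_reference_column(example, "Cd"))
```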

Calculation and method verification

To verify the SSAD method, the MSN was calculated using the data of the Moss Survey 2015, comprising the concentrations of 12 heavy metals (Al, As, Cd, Cr, Cu, Fe, Hg, Ni, Pb, Sb, V, Zn) and nitrogen [3] measured at 400 moss sampling sites in Germany. On the one hand, the MSN was calculated for Germany as a whole with the SSAD method (method C; see Note 1) and compared with the result of the calculation formula of the Moss Manual (method A) for identical data, i.e., data from the Moss Survey 2015. On the other hand, element-specific MSN for different spatial categories (Federal Republic of Germany, federal states, ecological spatial classes) were calculated with the data from the Moss Survey 2015 using the SSAD method and compared with the results of the monitoring network planning based on the data from the Moss Survey 2005 (Tables 1 and 2). In each case, a uniform error tolerance (tol) of 0.2 (= 20%) and a significance level of α = 0.05 were used to ensure comparability with earlier surveys [1]. For validation, the SSAD procedure was applied to all subsamples regardless of the form of their distribution. Only for sample sizes < 25 (method C needs sample sizes ≥ 25) was SSAD supplemented by the formula of the manual of the European Moss Survey [1] (method A) for normally distributed variables, or by the extension proposed by Wosniok ([11], cited in [8]) for differently distributed variables (method B). As in Nickel and Schröder [3] and Schröder et al. [9], the results of the MSN calculations for the four elements covered by the Convention on Long-Range Transboundary Air Pollution (CLRTAP), i.e., Cd, Hg, Pb and N, were mapped to compare the spatial distributions of deviations from the minimum sample numbers.

Table 1 Element-specific minimum sample numbers (MSN) and actual sample sizes (n) for ecoregions of the Ecological Land Classification (ELCE40) [10] in Germany, calculated using Moss Survey 2005 data [3, 9]
Table 2 Element-specific minimum sample numbers (MSN) and actual sample sizes (n) calculated for the German federal states using Moss Survey 2005 data [3, 9]

Results

In the nationwide data set of the Moss Survey 2015, the substance concentrations are all neither normally nor lognormally distributed (α < 0.05), with the exception of Zn (lognormally distributed). The element-specific MSN calculated on this basis for Germany range from 8 to 92 for the Manual formula (method A) and from 10 to 117 for the SSAD method (method C) (Table 3). The results of the SSAD procedure are thus on average 38% higher than those of the Manual formula. In the extreme cases, the MSN of the SSAD procedure is 267% above the MSN of the Manual formula for Cu and 68% below it for Fe. The number of samples needed so that the empirically determined mean of every element differs by no more than 20% from the true mean with 95% confidence is given by the maximum MSN: 92 (As) for the Manual formula and 117 (Cd) for the SSAD procedure.

Table 3 Element-specific minimum sample numbers (MSN) and actual sample sizes (n) for 12 elements in Germany, calculated using two methods applied to data of the Moss Survey 2015

Of the total of 312 ELCE-specific and state-specific partial data sets (12 heavy metals and nitrogen) investigated in the Moss Survey 2015 monitoring network, 35% follow the normal distribution and 37% the lognormal distribution; the remainder are distributed differently. Using the SSAD method for the 144 ELCE-specific partial data sets, the proportion of ecological land classes in Germany where the minimum number of samples is met lies between 12% (Al) and 53% (N), depending on the element (average 27%) (Table 4). For the 168 state-specific data sets, the proportions vary between 21% (Al) and 100% (Cu, N) and average 58% (Table 5). To fully guarantee the MSN for all 13 elements, the German moss monitoring network would have to be expanded to 906 sites with regard to the ELCE and to 701 sites with regard to the federal states of Germany. When methods A and B are applied, the proportions of ELCE classes in Germany where the minimum number of samples is reached lie between 6% (Al, Cr) and 47% (Hg, N), with an average of 33% (Table 1). In the federal states of Germany, the percentages vary between 7% (Cr) and 100% (Hg, N) and average 62% (Table 2). On the basis of this MSN calculation, the German moss monitoring network would have to comprise 1335 sites in relation to the ELCE40 [10] and 1283 sites in relation to the federal states to fully guarantee the MSN.

Table 4 Element-specific minimum sample numbers (MSN) and actual sample sizes (n) for ecoregions of the Ecological Land Classification (ELCE40) [10] using Moss Survey 2015 data
Table 5 Element-specific minimum sample numbers (MSN) and actual sample sizes (n) calculated for Germany’s federal states using Moss Survey 2015 data

For 72 of the 144 ELCE40-specific partial data sets and 96 of the 168 state-specific partial data sets, the sample size is above n = 25. Comparing the sample numbers calculated with the two method combinations (only data sets with n > 25), the MSN calculated with methods A, B and C from the 2015 data (Table 4) deviate from those calculated with methods A and B from the 2005 data (Table 1) by between −321 and 176 in the ELCE40-related analysis (mean: 4.8; standard deviation: 58.8). For the federal states (only data sets with n ≥ 25), comparing the MSN for 2015 (Table 5) with those for 2005 (Table 2) reveals deviations between −402 and 149 (mean: 4.9; standard deviation: 56.6). The largest deviations, with markedly lower MSN estimates when the SSAD method (method C) is applied to the 2015 data, occur for Cr in the ecoregion F2_6 (−321) and in Saxony (−402).

Figure 1 shows the spatial distribution of MSN statistics as calculated with SSAD for different spatial categories (ELCE40, federal states) using the Moss Survey 2015 data for the four CLRTAP elements. Compared with the MSN calculations for the same monitoring network using the values measured in 2005 and without the SSAD method (Figs. 2, 3, 4, 5, 6, 7 and 8), the area shares of the spatial units meeting the MSN are smaller. In the calculations with the extended methodology (Fig. 1), the area shares are on average 44% lower for Cd, 23% lower for Hg, 18% lower for Pb and 14% lower for N than for the MSN calculated with the method combination A and B (Figs. 2, 3, 4, 5, 6, 7 and 8; Table 6).

Fig. 1 Comparisons of the sample sizes in the 2015 Moss Survey network with the minimum sample numbers determined for Cd, Hg, Pb and N (spatial reference levels: ELCE40 [10], BL = federal states)

Fig. 2 Comparison of two sample sizes in the 2005 monitoring network with the MSN determined for Cd (spatial reference level: ELCE40 [10])

Fig. 3 Comparison of two sample sizes in the 2005 monitoring network with the MSN determined for Hg (spatial reference level: ELCE40 [10])

Fig. 4 Comparison of two sample sizes in the 2005 monitoring network with the MSN determined for N (spatial reference level: ELCE40 [10])

Fig. 5 Comparison of two sample sizes in the 2005 monitoring network with the MSN determined for Pb (spatial reference level: ELCE40 [10])

Fig. 6 Comparison of two sample sizes in the 2005 monitoring network with the MSN determined for Cd (spatial reference level: federal states)

Fig. 7 Comparison of two sample sizes in the 2005 monitoring network with the MSN determined for Hg and N (spatial reference level: federal states)

Fig. 8 Comparison of two sample sizes in the 2005 monitoring network with the MSN determined for Pb (spatial reference level: federal states)

Table 6 Area percentages of the ecological land classes (ELCE40 [10]) and federal states (BL) matching the MSN, calculated for the Moss Survey network 2015 (n = 400) with data collected in 2005 (previous methodology) and 2015 (extended methodology)

Discussion and conclusions

The SSAD methodology should be regarded as a tool for the spatial design of monitoring networks. A comparison of the MSN calculations using different reference data (here: 2005, 2015) clearly shows that MSN estimates can be transferred to future measurements only to a limited extent, depending on the available data. Uncertainties arise not only from different statistical distributions of the reference data, but also from the MSN methods chosen because of these differences. Even with the same data, the results of the three procedures sometimes differ greatly from each other. In addition, the SSAD method, being stochastic, produces different results even with identical data, although the extent of these differences can be reduced by tightening the accuracy requirement in the application of the method. The entire package of methods is particularly suitable for locating conspicuous non-compliance with minimum sample numbers for different spatial categories (e.g., administrative units, ecological spatial classes) and for correcting it in monitoring network planning within the individual participating states and across states. For a more precise quantification of the differences between the three partial methods, these would have to be applied to reference data from the same survey (2015) in addition to the data from the 2005 and 2015 moss surveys used here. In addition, mean deviations between different simulation runs with identical data should be quantified.

Because the three partial methods deviate strongly from each other even with identical data, the SSAD procedure cannot be recommended as the only procedure. For normally distributed reference data, it is recommended to continue estimating minimum sample numbers with the calculation formula of the moss manual [2; 8] (method A). For lognormally distributed reference data, the MSN formula according to Wosniok ([11], cited in [8]) is recommended (method B). Both methods A and B are based on parametric statistics, which generally allow more accurate and precise estimates than the SSAD method (method C), provided that the assumptions about the statistical distribution of the data are correct. The MSN formula according to Wosniok ([11], cited in [8]) is also recommended for all non-normally distributed data series with n < 25, since the lognormal distribution is the more frequent case compared with the normal distribution and the SSAD procedure requires sample sizes of n ≥ 25. For data with n ≥ 25 that are neither normally nor lognormally distributed, the SSAD procedure is recommended, since it does not impose any preconditions on the distribution of the data.
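These recommendations amount to a simple decision rule, sketched below in Python for illustration. The function name is hypothetical, the outcome of the distribution tests is taken as given, and checking normality before lognormality when both tests pass is an assumption not stated in the text.

```python
def choose_method(n, is_normal, is_lognormal):
    """Return 'A', 'B' or 'C' following the recommendations above.

    n            -- size of the reference sample
    is_normal    -- True if the data pass a test for normality
    is_lognormal -- True if the data pass a test for lognormality
    """
    if is_normal:
        return "A"        # formula of the moss manual
    if is_lognormal:
        return "B"        # extension according to Wosniok
    if n < 25:
        return "B"        # SSAD needs n >= 25; lognormal is the more frequent case
    return "C"            # SSAD, no distributional assumption


if __name__ == "__main__":
    print(choose_method(40, is_normal=False, is_lognormal=False))
```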

Conclusions

The SSAD procedure developed for the European Moss Survey closes a gap in the previous methodology for calculating MSN with regard to compliance with certain error tolerances in the calculation of the arithmetic mean of, e.g., element concentrations in mosses measured across Germany. Thus, for the first time, a method package is available which does not impose any preconditions on the distribution of the data for the MSN calculation. The procedure is directly transferable to the planning of many other environmental monitoring networks.

Availability of data and materials

The software tool Sample Size for Arbitrary Distributions (SSAD) developed has been made freely accessible for future applications via the research data repository ZENODO® [12].

Notes

  1. R version 3.4.1 and SSAD version 9 were used to calculate the MSN [10].

Abbreviations

Al:

aluminium

As:

arsenic

B_1:

western and northern Scandinavia, northwest Russia

B_2:

The Alps, Iceland, northwest Russia

C_0:

The Alps, Iceland, western and northern Scandinavia, Kola Peninsula, northwest Russia, Caucasus

BB:

Brandenburg

BE:

Berlin

BL:

Federal state of Germany

BW:

Baden-Wuerttemberg

BY:

Bavaria

Cd:

cadmium

Cr:

chromium

Cu:

copper

CSV:

comma separated value

D_13:

The Alps, dispersed small areas in eastern and southeast Europe

D_14:

Baltic States, Belarus, western Russia

ELCE:

ecological land classes of Europe

F1_1:

Poland, northwest Ukraine

F1_2:

Ireland, Great Britain, western and central Europe

F2_6:

Central Europe, eastern and southeast Europe

F3_1:

Germany, northwest Poland, Czech Republic, northern Austria, Slovenia, the Balkans

F3_2:

Western Europe (including northern Spain, France, Benelux countries, western Germany), Denmark

F4_1:

Southeast Great Britain, southeast Denmark, northeast Germany, northwest Poland

F4_2:

Western/central and southern Europe (including southern Great Britain, eastern France, southern Belgium, Luxembourg, the Alps, Italy), eastern and southeast Europe (including the Carpathian Mountains, the Balkans)

Fe:

iron

G1_0:

Italy, southeast Europe

HE:

Hesse

Hg:

mercury

HH:

Hamburg

ICP:

International Cooperative Programme

M:

method

MSN:

minimum sample numbers

MV:

Mecklenburg Western-Pomerania

N:

nitrogen

n :

sample size

NI:

Lower Saxony

Ni:

nickel

NW:

North Rhine-Westphalia

Pb:

lead

RP:

Rhineland Palatinate

S_0:

Northern parts of Europe (including parts of Iceland, Ireland, Great Britain, Scandinavia, northwest Russia, the Baltic states and Belarus)

Sb:

antimony

SSAD:

Sample Size for Arbitrary Distributions

SH:

Schleswig-Holstein

SL:

Saarland

SN:

Saxony

ST:

Saxony-Anhalt

TH:

Thuringia

U_1:

dispersed small areas within a stripe reaching from Ireland via central Europe and the Byelorussian–Ukrainian borderline to Russia

U_2:

dispersed small areas in southern Europe reaching from the Iberian Peninsula via southeast Europe including, e.g., the Balkans, the Carpathians, Greece and northern Turkey to southwest Russia

V:

vanadium

Zn:

zinc

References

  1. ICP Vegetation (International Cooperative Programme on Effects of Air Pollution on Natural Vegetation and Crops) (2014) Monitoring of atmospheric deposition of heavy metals, nitrogen and POPs in Europe using bryophytes. Monitoring manual 2015 survey. United Nations Economic Commission for Europe Convention on Long-Range Transboundary Air Pollution. ICP Vegetation Moss Survey Coordination Centre, Dubna, Russian Federation, and Programme Coordination Centre. Bangor, Wales, UK. https://icpvegetation.ceh.ac.uk/sites/default/files/MossmonitoringMANUAL-2015-17.07.14.pdf. Accessed 04 Apr 2019

  2. Land CE (1971) Confidence intervals for linear functions of the normal mean and variance. Ann Math Stat 42:1187–1205

  3. Nickel S, Schröder W (2017) Umstrukturierung des deutschen Moos-Monitoring-Messnetzes für eine regionalisierende Abschätzung atmosphärischer Deposition in terrestrische Ökosysteme. In: Schröder W, Fränzle O, Müller F (Hg) Handbuch der Umweltwissenschaften. Grundlagen und Anwendungen der Ökosystemforschung. 24. Erg.Lfg., Kap. VI-1.8:1–48

  4. Nickel S, Schröder W (2018) Schwermetall- und Stickstoffkonzentrationen in Moosen deutscher Waldgebiete zwischen 1990 und 2015 – Ein Bund-Länder-Vergleich. Gefahrstoffe - Reinhaltung der Luft, vol 3. Springer, VDI, Berlin, pp 1–14

  5. Olsson U (2005) Confidence intervals for the mean of a log-normal distribution. J Stat Educ 13(1). http://www.amstat.org/publications/jse/v13n1/olsson.html. Accessed 04 Apr 2019

  6. R Core Team (2019) R: a language and environment for statistical computing. Vienna: R Foundation for Statistical Computing. https://www.R-project.org/. Accessed 04 Apr 2019

  7. Rubinstein RY, Kroese DP (2017) Simulation and the Monte Carlo method, 3rd edn. Wiley, New York

  8. Schröder W, Nickel S, Schönrock S, Meyer M, Wosniok W, Harmens H, Frontasyeva MV, Alber R, Aleksiayenak J, Barandovski L, Danielsson H, de Temmermann L, Fernández Escribano A, Godzik B, Jeran Z, Pihl Karlsson G, Lazo P, Leblond S, Lindroos A-J, Liiv S, Magnússon SH, Mankovska B, Martínez-Abaigar J, Piispanen J, Poikolainen J, Popescu IV, Qarri F, Santamaria JM, Skudnik M, Špirić Z, Stafilov T, Steinnes E, Stihi C, Thöni L, Uggerud HT, Zechmeister HG (2016) Spatially valid data of atmospheric deposition of heavy metals and nitrogen derived by moss surveys for pollution risk assessments of ecosystems. Environ Sci Pollut Res 23:10457–10476

  9. Schröder W, Nickel S, Völksen B, Dreyer A, Wosniok W (2019) Nutzung von Bioindikationsmethoden zur Bestimmung und Regionalisierung von Schadstoffeinträgen für eine Abschätzung des atmosphärischen Beitrags zu aktuellen Belastungen von Ökosystemen. Ressortforschungsplan des Bundesministeriums für Umwelt, Naturschutz, Bau und nukleare Sicherheit. Forschungskennzahl 3715 63 212 0. Dessau: 1–188 Bericht, 1–288 Anhänge

  10. Schröder W, Schmidt G, Hornsmann I (2006) Landschaftsökologische Raumgliederung Deutschlands. In: Handbuch der Umweltwissenschaften. Grundlagen und Anwendungen der Ökosystemforschung. Landsberg am Lech, München, Zürich, Kap V-1.9, ErgLfg 17:-100, 2006

  11. Wosniok W (2015) Fallzahlen für das Moosmonitoring - Ergänzungsvorschläge für das Monitoring manual 2015 survey (ICP Vegetation 2014). Arbeitspapier vom 04.09.2015, Universität Bremen, Bremen

  12. Wosniok W, Nickel S, Schröder W (2019) R Software tool for calculating Minimum Sample Sizes for Arbitrary Distributions (SSAD), link to scientific software (Version v1). Zenodo. https://doi.org/10.5281/zenodo.2583010

Acknowledgements

We would like to thank the German Environment Agency (Dessau-Roßlau, Germany) for financial support and professional advice.

Funding

German Environment Agency, Dessau-Roßlau, Germany (Grant no. 3715 63 212 0).

Author information

Werner Wosniok developed the methodology and wrote the R script. Winfried Schröder drafted the article and headed the computations executed by Stefan Nickel. All authors read and approved the final manuscript.

Correspondence to Stefan Nickel.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Cite this article

Wosniok, W., Nickel, S. & Schröder, W. Development of the software tool Sample Size for Arbitrary Distributions and exemplarily applying it for calculating minimum numbers of moss samples used as accumulation indicators for atmospheric deposition. Environ Sci Eur 32, 9 (2020). https://doi.org/10.1186/s12302-020-0290-1

Keywords

  • Atmospheric deposition
  • German Moss Survey
  • Minimum sample size
  • Monte Carlo method
  • Scientific software
  • Spatial sampling design