A comparison of log Kow (n-octanol–water partition coefficient) values for non-ionic, anionic, cationic and amphoteric surfactants determined using predictions and experimental methods

Surfactants are widely used across the globe in both industrial and consumer products. The n-octanol/water partition ratio or coefficient (log Kow) and n-octanol/water distribution coefficient (log D) are key parameters in environmental risk assessment of chemicals as they are often used to estimate the environmental fate and bioavailability and thus exposure and toxicity of a compound. Determining log Kow data for surfactants is a technical challenge due to their amphiphilic properties. Currently, several experimental OECD methods (e.g. slow-stirring, HPLC, solubility ratio) and QSPR models are available for log Kow/D measurement or prediction. However, there are concerns that these methods have not been fully validated for surfactants and may not be applicable due to the specific phase behaviour of surfactants. The current methods were evaluated for the four surfactant classes (non-ionic, anionic, cationic and amphoteric). The solubility ratio approach, based on comparative n-octanol and water solubility measurements, did not generate robust or accurate data. The HPLC method generates consistently higher log Kow values than the slow-stirring method for non-ionics, but this positive bias could be removed using reference surfactants with log Kow values determined using the slow-stirring method. The slow-stirring method is the most widely applicable experimental method for generating log Kow/D data for all the surface-active test compounds. Generally, QSPR-predicted log Kow/D values do not correlate well with experimental values, apart from the group of non-ionic surfactants. Relatively large differences in predicted log Kow/D values were observed when comparing various QSPR models, and these were most noticeable for the ionised surfactants.
A weight of evidence approach is considered appropriate for non-ionic surfactants using experimental and model predictions. However, it is more difficult to apply this approach to ionisable surfactants. Recommendations are made for the preferred existing QSPR predictive methods for determination of log Kow/D values for the surfactant classes. Investigation of newer alternative experimental log Kow methods, as well as more biologically relevant and methodologically defensible alternative methods for describing partitioning of surfactants, is recommended.


Background
Surfactants are widely used across the globe in both industrial and consumer products, so properties which determine their distribution in the environment are of particular importance. The n-octanol/water partition ratio or partition coefficient (log K ow ) and n-octanol/water distribution coefficient (log D) are key parameters in environmental risk assessment of chemicals as they are often used to estimate the environmental fate and bioavailability and thus exposure and toxicity of a compound. The partition coefficient (log K ow ) is a constant for the molecule in its neutral form. The distribution coefficient (log D) takes into account all neutral and charged forms of the molecule. In the pH region where the molecule is predominantly unionised, log D = log K ow . Log D values at pH 7 are considered more relevant for understanding environmental fate and bioavailability of ionisable compounds with a low or high pK a , compared to log K ow values generated at a pH unrepresentative of typical aquatic environmental conditions. Due to their amphiphilic properties, surfactants form aggregates in solution and tend to accumulate at the interface of hydrophobic and hydrophilic phases. Surfactants can even emulsify the n-octanol-water system, making the measurement of log K ow a technical challenge. For this reason, the traditional 'shake-flask' method (OECD 107 Test Guideline) [1] is no longer considered appropriate for log K ow determination of surfactants.
Currently, several existing experimental methods of the Organisation for Economic Co-operation and Development (OECD) and Quantitative Structure-Property Relationship (QSPR) models are available for log K ow measurement or prediction. The experimental methods include: the 'slow-stirring' method (OECD 123) [2], the high-performance liquid chromatography (HPLC) method (OECD 117) [3], and a solubility ratio method (referred to in OECD 107) [1] which uses the ratio of the chemical solubility in n-octanol and in water. All these methods are listed in the EU Technical Guidance Document (TGD) [4] and have been used for regulatory notification purposes by different lead registrants in REACH Phases 1 and 2 (i.e. chemicals manufactured or imported in Europe > 1000 and > 100 tonnes per annum, respectively). However, there are concerns that these methods have not been fully validated for surfactants and may not be applicable due to the specific phase behaviour of surfactants. This is complicated by the fact that aqueous 'solubility' is not properly defined for surfactants and is also difficult to measure. Surfactants dissolve not only as single molecules (mono-molecular solution), but at higher concentrations also form different types of soluble aggregates, e.g. spherical micelles or vesicles (depending on their chemical structure, concentration and temperature). The maximum mono-molecular solubility of a surfactant is defined as the critical micelle concentration (CMC). However, the CMC is not a good descriptor of water solubility, as micelles themselves are also a perfectly water-soluble state of surfactants [5]. A working approach for surfactants might be the comparison of measured solubilities in n-octanol and water. However, it is then prudent to take the CMC in water as the solubility limit, in order to avoid the artefact of unrealistically low log K ow values [4].
The Environmental Risk Assessment of Surfactants Management (ERASM) 'Hydrophobicity of Surfactants' Task Force was established in 2011 with the objective of evaluating the most appropriate log K ow /D method for surfactants. The Task Force coordinated a laboratory study at the Fraunhofer IME Institute in Schmallenberg, Germany to measure log K ow /D values using three different recognised experimental methods side-by-side for a set of 12 surfactants from the four main surfactant categories (non-ionics, anionics, cationics and amphoterics). This study was conducted consistently in one experienced laboratory, with the aim of reducing uncertainties and identifying whether any of the existing methods predominates over the others in providing consistency and reliability of results across all surfactant classes. In addition, the Task Force applied several QSPR methods and predicted log K ow property data for the same set of test compounds for comparison with the experimental data generated by Fraunhofer IME.
Keywords: Surfactants, Log K ow /D, Experimental methods, Predictive (QSPR) models

C12-16 alkyldimethyl betaine (C12-16ADB), (3-lauramidopropyl) dimethylbetaine (C12AAP), C12 alkyldimethyl, N-oxide (C12DAO) and 2 reference compounds [atrazine (ATR) and pentachlorophenol (PCP)] used in this study are listed in Table 1 (full structures are shown elsewhere [6]). It is well known that commercially used surfactants are generally mixtures of homologues (e.g. a distribution of alkyl chain lengths). To reduce complexity in the data interpretation, high purity single chain length test items were obtained either from commercial sources or were synthesized and provided by the ERASM Task Force member companies. The activity and purity information were confirmed either by Certificates of Analysis (CoA) documents or from analytical data shared by the suppliers. The exception was C12-16 alkyl dimethyl betaine (approximately 70% C12, 20% C14, 10% C16). Additional information on the source, purity and appearance of all test and reference substances is detailed in Additional file 1: Section S1.

Log K ow determination approaches
Full details of the experimental test conditions for all the log K ow methods plus supporting analytical methodology are provided in Additional file 2: Section S2.

HPLC method
OECD Test Guideline 117 was followed, with adjustments made to the mobile phase to accommodate high log K ow values (> 6 may be expected when analysing the more hydrophobic surfactants). This HPLC method is currently only validated for neutral compounds, and subsequent work by Eadsforth et al. has validated the chemical domain of applicability of the method to neutral non-ionic surfactants [3,7]. However, the method has not been validated for the other ionisable surfactant classes (anionics, cationics and amphoterics). Log K ow values of three alcohol ethoxylates, C8EO4, C12EO4 and C12EO8, together with ATR as a neutral reference compound, were determined by the HPLC method. A calibration graph was generated to facilitate the determination of log K ow (Additional file 2:

Slow-stirring method
The slow-stirring method (OECD Test Guideline 123) was followed which minimises turbulence and thereby enhances the exchange between n-octanol and water without microdroplets being formed. The water phase (lower phase) is sampled from a stopcock at the bottom of the vessel, whereas the n-octanol phase (upper phase) is sampled using a microsyringe, taking care not to disturb the boundary layer. This method has been successfully applied to the determination of log K ow values of highly hydrophobic compounds up to 8.2 [8]. For surfactants, the method should operate below the CMC to ensure no micelles are present during the equilibration study.
Water, n-octanol and the test compound are equilibrated in a thermostated stirred reactor at constant temperature. Exchange between the phases is improved by carefully controlled stirring (150 rpm) which limits turbulence, thereby enhancing the exchange between n-octanol and water and thus increasing the accuracy of the determination of the K ow value. In practice, for each test compound log K ow values were generated at a range of volume ratios of n-octanol and water (i.e. 0.5:1, 1:1 and 2:1) for each of two (normally 48 h and a longer period, either 148 h or 168 h) or more stir periods. In this experiment, the majority of the test compounds (# 1-6, 10-12) were added to the water phase, while the cationics and reference compounds (# 7-9, 13, 14) were added to the n-octanol phase. It is considered that either application mode should give the correct log K ow value. The main justifications for adding cationics in n-octanol are that (a) they were soluble in this solvent and (b) this application mode would reduce any losses resulting from their strong adsorption to glass surfaces. Further studies were carried out for test compounds # 2, 3 and 12 using both improved, more sensitive, analytical methods and longer stir periods (48 h, 168 h, 240 h and 336 h) to ensure equilibration had been reached. These three compounds were applied in both the water and n-octanol phases. For these three compounds, reasonably consistent data, as demonstrated by the mean and standard errors (Additional file 2: Table S2), were generated for each test compound at time points 168 h, 240 h and 336 h and for both phases, so a mean value was calculated from these time points and under both dosing methods. For the other test compounds (# 1 and 4-11), reasonably consistent data, as demonstrated by the mean and standard error (Additional file 2: Table S2), were generated for each test compound at two or more time points and from these data a mean value has been calculated.
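The averaging described above can be illustrated with a minimal sketch. It assumes that each sampling event yields one concentration in the n-octanol phase and one in the water phase (same units), that K ow is their ratio, and that the reported value is the mean of the per-time-point log K ow values with a standard error. The function names and all concentrations are hypothetical and are not data from this study.

```python
import math
from statistics import mean, stdev


def log_kow(c_octanol, c_water):
    """Return log10 of the n-octanol/water concentration ratio (same units)."""
    if c_octanol <= 0 or c_water <= 0:
        raise ValueError("concentrations must be positive")
    return math.log10(c_octanol / c_water)


def summarise(values):
    """Return the mean and the standard error of the mean of replicate values."""
    n = len(values)
    sem = stdev(values) / math.sqrt(n) if n > 1 else float("nan")
    return mean(values), sem


# Hypothetical (C_octanol, C_water) pairs in mg/L, keyed by stir period in hours.
samples = {168: (1500.0, 1.0), 240: (1600.0, 1.0), 336: (1550.0, 1.0)}

# One log Kow per time point, then a single mean with its standard error.
per_time = {t: log_kow(c_oct, c_wat) for t, (c_oct, c_wat) in samples.items()}
mean_log_kow, sem_log_kow = summarise(list(per_time.values()))
print(f"mean log Kow = {mean_log_kow:.2f} (SE {sem_log_kow:.2f})")
```

A small standard error across the later time points is the kind of evidence that would support treating the system as equilibrated; the acceptance criterion itself is not specified here.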
In this study, the slow-stirring method should be taken as the benchmark for comparison with all other methods as it is the most consistently applicable method across all the surfactant classes and provides a complete dataset.

Solubility ratio method
The solubility ratio method (referred to in OECD 107) is based on the log of the ratio of the n-octanol solubility and the water solubility, determined experimentally. However, as the water solubility of surfactants is neither properly defined nor easy to measure, it is recommended in the EU TGD [4] to take the CMC in water as a working approach for determining the water solubility of a surfactant.

Determination of the solubility in n-octanol
The solubilities of the test compounds in n-octanol were determined by adapting the procedure described in OECD Test Guideline 105 (water solubility) [9]. Solubility determinations for each test compound were carried out at three stir times (24 h, 48 h and 144 h) and a mean value was calculated from these time points.

Determination of critical micelle concentrations (CMC)
The standard definition as given in OECD 105 ('the water solubility of a compound is the saturation mass concentration of the compound in water at a given temperature') does not apply to surfactants. At low concentrations, there may be true homogeneous solutions, whereas at higher concentrations lyotropic phase separation can occur [10]. The creation and characterisation of 'saturated' solutions are usually not possible; one exception is anionic surfactants below the Krafft point [5]. Therefore, the term 'water solubility' is not easy to define nor determine for surface-active compounds. As explained previously, the CMC, for which there are defined methods, was used as a 'surrogate' for water solubility for the 12 test surfactants.
In this study, CMC determinations were performed by two methods. The first approach was by adding the surface-active compound step by step to a buffered aqueous solution (pH 7) at 25 °C and measuring the surface tension of the solution by the ring method (OECD 115) [11]. The determination of the CMC values by this method was performed by IMETER ® /MSB Augsburg, Germany (http://www.imeter.de). For the determination of accurate surface tension values, calibration factors were applied as described in OECD 115. Several algorithms are available for the correction of systematic deviations. Appropriate calculations [12] were used. In addition, a calibration factor was applied to adjust the system by measurement of a reference liquid such as water.
$$K_{ow} = \frac{C_{n\text{-octanol}}}{C_{water}} \qquad (1)$$
The second approach involved using Solid-Phase Micro-extraction (SPME). SPME fibres coated with polyacrylate have been shown to be applicable for the measurement of freely dissolved concentrations of non-ionic, anionic, and cationic surfactants [6, 13-15].
The ratio of the n-octanol solubility with the two water solubilities (i.e. CMC values) determined by the different methods was taken to generate log K ow /D of surfactants. Literature values for CMC were also available and similarly compared.
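As a minimal sketch of the solubility ratio calculation, the snippet below takes the log of the n-octanol solubility divided by the CMC, with the CMC standing in for water solubility. It assumes the n-octanol solubility is available in g/L and the CMC in mol/L, so the CMC is first converted with the molar mass. The function name, parameter names and all numerical inputs are hypothetical.

```python
import math


def solubility_ratio_log_kow(s_octanol_g_per_l, cmc_mol_per_l, molar_mass_g_per_mol):
    """Estimate log Kow/D as log10(S_octanol / CMC), with both terms in g/L."""
    cmc_g_per_l = cmc_mol_per_l * molar_mass_g_per_mol
    if s_octanol_g_per_l <= 0 or cmc_g_per_l <= 0:
        raise ValueError("solubility and CMC must be positive")
    return math.log10(s_octanol_g_per_l / cmc_g_per_l)


# Hypothetical surfactant: 100 g/L in n-octanol, molar mass 300 g/mol,
# with CMC values from two different determination methods.
cmc_by_method = {"surface tension": 1.0e-3, "SPME": 2.0e-3}  # mol/L
for method, cmc in cmc_by_method.items():
    estimate = solubility_ratio_log_kow(100.0, cmc, 300.0)
    print(f"{method}: log Kow/D = {estimate:.2f}")
```

Because the estimate depends on the log of the CMC, a factor of 10 disagreement between CMC methods shifts the result by a full log unit. The calculation is undefined for compounds that are fully miscible in n-octanol.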

Testing strategy
As the current methods were not appropriate for all the surfactant classes, the following approach was devised for the test (and reference) compounds shown in Table 1.
• Log K ow values for all compounds (# 1-14) at the selected pH values were determined using both the slow-stirring (OECD 123) and solubility ratio estimation methods.
• Log K ow values for the three non-ionics and the neutral reference (ATR) (i.e. # 1-3, 13) were determined using the HPLC method (OECD 117).
All test and reference compounds were tested at pH 7; in addition, the C12 carboxylate was also tested at pH 2 and pH 9. Standard aqueous buffer solutions of pH 2, 7 and 9 were prepared. Saturated aqueous and n-octanol phases for log K ow slow-stirring studies were prepared by stirring overnight at 150 rpm and 25 °C in the following proportions.

Log K ow predictive methods
The number of publicly available QSPR methods for calculating log K ow values has increased significantly over the last few years and there are now multiple methods published and/or commercially available as software. Few reviews are available which make a side-by-side comparison of log K ow predictive methods. A review of ten commonly used commercial software packages was conducted by Dearden [16] and a further review of Mannhold et al. [17] considered a larger selection of both substructure and property-based methods. None of the publicly available methods or commercially available software packages have been developed to specifically accommodate prediction of surfactants nor considered as part of these reviews. As part of this review, we focussed on those methods which were considered in these previous reviews, but which have been used commonly in calculating log K ow of surfactants for regulatory submissions due to either reasons of availability, a clear understanding of the underlying method and/or history of use. These are CLOGP version 5.0 [18], KOWWIN [19], Pipeline Pilot [20], ACD Labs [21] and SPARC [22]. Most methods for calculating log K ow values assume a neutral state of the compound. For most of these methods, the exact algorithm is confidential or not published which makes it difficult to determine the accuracy or applicability of the method to surfactants. There are a few QSPRs, however, for which the background calculations are easier to understand and for this reason make themselves more appealing for use with surfactants since the result can be investigated and modified to account for charge. Such methods include those of Meylan and Howard [23] as incorporated into the KOWWIN software and Hansch and Leo (H&L) [24] which forms the basis of the CLOGP method. 
The H&L prediction method has been applied successfully to a number of surfactant classes when combined with modification factors which have been developed to specifically address the difficulties in calculating log K ow values for surfactants [25-29].
In addition to the above methods, we have also included some additional methods to predict log K ow /D values for the test and reference compounds: Molinspiration [30], Crippen Fragmentation in Chemdraw [31], Viswanadhan's Fragmentation in Chemdraw [32] and Broto in Chemdraw [33]. These were again selected for ease of availability for practical application to regulatory submissions. Each individual software package/model will produce different predicted log K ow values depending on the approach used. These commercially available programmes are generally designed for the prediction of log K ow values for neutral organic compounds. Significant differences in predicted values can arise as a result of the way in which 'charged' moieties are handled, not only between the programmes but also within the programmes, depending on how the Simplified Molecular Input Line Entry System (SMILES) notation is entered. The format of the SMILES notation has been found to be of particular importance when using KOWWIN and CLOGP (see Table 1 for SMILES used in model calculation).
To demonstrate some of the potential inaccuracies, log K ow predictions for different structural SMILES notations were run for each test compound (# 1-12). In addition, the average predicted log K ow values computed by different QSPRs were also calculated to compare with experimental measurements. In this case, only the ionised forms have been included for anionic, cationic, and amphoteric compounds since they are all fully ionised under test conditions. A prediction for neutral C12 carboxylate was also made for comparison with the experimental value log K ow determined under conditions in which the compound was fully protonated. For non-ionic compounds, only the neutral forms were used to calculate the averages. Further details for the methods and how they have been applied to the test compounds in this study are provided in Additional file 3: Section S3.
In order to enable some comparison and judgement to be made as to the predictivity of the different QSPR methods, coefficients of determination (R 2 ) were calculated between predicted values for each method for each surfactant class and observed values determined using the most appropriate method. However, R 2 is not sufficient by itself to enable comparison of such data since it only reflects the relative pattern of differences between observed and predicted values; it can still give acceptable values when the error is of constant magnitude, even when this magnitude is very high [34]. Mean Absolute Error (MAE) values were therefore also calculated to provide a better indication of the magnitude of the differences between predicted values and observed values for each method and for each surfactant class. An additional threshold approach based on MAE was conducted following a modified method of Roy et al. [34] to help further discriminate between predictive methods. The details of this approach are presented in Additional file 4: Section S4.
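The limitation of R 2 noted above can be shown with a short sketch. It assumes R 2 is the squared Pearson correlation between observed and predicted values (equivalent to the coefficient of determination of a simple linear regression), and that MAE is the mean of the absolute differences. The data are hypothetical and chosen so that every prediction is exactly 2 log units too high.

```python
from statistics import mean


def r_squared(observed, predicted):
    """Squared Pearson correlation between observed and predicted values."""
    mx, my = mean(observed), mean(predicted)
    sxy = sum((x - mx) * (y - my) for x, y in zip(observed, predicted))
    sxx = sum((x - mx) ** 2 for x in observed)
    syy = sum((y - my) ** 2 for y in predicted)
    return sxy * sxy / (sxx * syy)


def mean_absolute_error(observed, predicted):
    """Mean of the absolute observed-minus-predicted differences."""
    return mean([abs(x - y) for x, y in zip(observed, predicted)])


# Hypothetical log Kow values: predictions carry a constant +2 log unit offset.
observed = [2.0, 3.0, 4.0]
predicted = [4.0, 5.0, 6.0]

print(r_squared(observed, predicted))            # perfect correlation
print(mean_absolute_error(observed, predicted))  # large systematic error
```

In this example the correlation is perfect while every prediction is wrong by two orders of magnitude in K ow, which is why the two statistics are read together.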

Log K ow versus log D
Measured values have been corrected to account for ionisation where appropriate for relevant comparisons between values. A derivation of the Henderson-Hasselbalch equation [35] was used to achieve this:
$$\log D_{acids} = \log P + \log\left[\frac{1}{1 + 10^{\,pH - pK_a}}\right] \qquad (2)$$
$$\log D_{bases} = \log P + \log\left[\frac{1}{1 + 10^{\,pK_a - pH}}\right] \qquad (3)$$
The % ionisation at any given pH can also be estimated from:
$$\%\ \text{ionised} = \frac{100}{1 + 10^{\,pK_a - pH}} \qquad (4)$$
where log P (also referred to as log K) refers to the partition coefficient of the unionised compound in an aqueous-organic phase system. For this study, we thus consider the organic phase as n-octanol. The % ionised was calculated for each compound under test conditions as described in Additional file 2: Section S2, Table S5.

pK a calculation
Inherent to the ability to correct for ionisation are the pH of the system and the pK a of the compound. Literature values for pK a were selected where available. The remaining pK a values for the test compounds were calculated by ACD Labs from Chemsketch [21], Chemicalize [36] and Pipeline Pilot [20] (software available from ChemAxon and Accelrys, respectively). Other tools [37] are available which have been more widely assessed for variability of results. However, these three tools are easily accessible and provide readily available values for users.
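Equations 2-4 can be sketched in a few lines, together with the rearrangement of Eq. 2 that converts a measured log D for an acid into the log P of its neutral form. The function names are hypothetical and the pK a and log P inputs are illustrative values only, not values from this study. Note that Eq. 4 as written has the form that applies to an acid.

```python
import math


def log_d_acid(log_p, pka, ph):
    """Eq. 2: log D of a monoprotic acid from the log P of its neutral form."""
    return log_p + math.log10(1.0 / (1.0 + 10.0 ** (ph - pka)))


def log_d_base(log_p, pka, ph):
    """Eq. 3: log D of a monoprotic base from the log P of its neutral form."""
    return log_p + math.log10(1.0 / (1.0 + 10.0 ** (pka - ph)))


def percent_ionised(pka, ph):
    """Eq. 4 as written: 100 / (1 + 10^(pKa - pH))."""
    return 100.0 / (1.0 + 10.0 ** (pka - ph))


def neutral_log_p_from_log_d_acid(log_d, pka, ph):
    """Eq. 2 rearranged: recover log P of the neutral acid from a measured log D."""
    return log_d + math.log10(1.0 + 10.0 ** (ph - pka))


# Hypothetical acid with log P = 4.5 and pKa = 4.8, evaluated at pH 2, 7 and 9.
for ph in (2.0, 7.0, 9.0):
    d = log_d_acid(4.5, 4.8, ph)
    print(f"pH {ph}: log D = {d:.2f}, ionised = {percent_ionised(4.8, ph):.1f}%")
```

The back-calculated log P is only as reliable as the pK a used: an error of one unit in pK a shifts the corrected value by roughly one log unit when the compound is almost fully ionised.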

Experimental log K ow values
Details of test and reference substances are shown in Table 1. Calculations of % ionisation under test conditions suggested that all ionisable test compounds except for C12 carboxylate and C12DAO should be in a 100% ionised state at pH 7 based on predicted pK a values (Additional file 2: Section S2, Table S5). Calculations for C12DAO suggest that this compound should be 99% ionised at pH 7. Calculations for C12 carboxylate suggested that this compound should be 100% protonated at pH 2 and 100% ionised at pH 9. Therefore, all measured log K ow values at pH 7 should be considered as log D values except for C12DAO, C12 carboxylate and non-ionic surfactants.
All the experimentally measured log K ow values, based on the slow-stirring, HPLC and solubility ratio methods, are reported in Table 2. Additionally, for several ionisable compounds, log D values at pH 7 are extrapolated to log K ow for the neutral species equivalent for comparison purposes and listed in brackets in Table 2.
Two reference compounds (ATR and PCP) were included in this study to check consistency of results with previously recorded log K ow /D values (Table 2). Both the HPLC and slow-stirring methods generated values close to those reported in the literature for ATR, though the solubility ratio method generated a higher value (approx. 0.5 log unit). For PCP, when values were corrected for ionisation using Eq. 2, the slow-stirring method provided a value for the neutral species which was consistent with the literature value when this was also corrected for ionisation and reported as the neutral species (< 0.4 log unit difference between the 5.55 value reported in this study and the corrected 5.87 value reported from the literature). The small observed difference is likely due to the inaccuracy of the predicted pK a value. The solubility ratio CMC values determined for the 12 surfactants and 2 reference compounds using two different methods (surface tension [11] and SPME [6]) are compared against literature values in Table 3. For all non-ionic compounds (alcohol ethoxylates and atrazine), the determined CMC or solubility values were in reasonable agreement with values sourced from the literature. For the remaining compounds, there is some variability between the literature values and those measured by both methods, ranging from a factor of 2 (PCP) to over 100 (C18BAC), highlighting the variable nature of the measurements. These measurements will be influenced by experimental conditions (e.g. pH, equilibrium time, etc.).
Experimental log K ow values generated for surfactant compounds (Table 2) are highly varied for the different methods. Although there is a reasonable correlation (R 2 = 0.8639) between log K ow values for non-ionic surfactants generated by HPLC and slow-stirring methods, HPLC derived log K ow values are consistently higher than those generated by the slow-stirring method (Fig. 1). Similar slow-stirring values for C8EO4 (2.68) and C12EO8 (4.25) [40] were found by other researchers. No comparison could be made between log K ow values generated using the HPLC and solubility ratio methods since there are insufficient solubility ratio data for non-ionic surfactants.
Cationic surfactants demonstrate good correlation (R 2 = 1) between log K ow values generated using the slow-stirring and solubility ratio methods (Fig. 2). This correlation should be taken with some caution given the size of the dataset, and the slope of the regression line of 1.2 and y intercept of 1.4 indicate systematic over-estimation by the solubility ratio method. However, the values determined using the slow-stirring method seem lower than would be expected, particularly given the size of the longer alkyl chain molecules. The differences in log K ow values between the slow-stirring and solubility ratio methods perhaps reflect the added complexity of analysing cationic compounds, which are known to strongly adsorb to surfaces such as glassware. For both anionic and amphoteric surfactants, there is little correlation between log K ow values generated using the slow-stirring and solubility ratio methods (Fig. 2) as seen in the R 2 values, although available data suggest that the solubility ratio approach may underestimate log K ow /D values compared to the slow-stirring method. No correlation could be made between log K ow values generated using the slow-stirring and solubility ratio methods for non-ionics, since two out of the three test compounds were totally miscible in n-octanol, so a value for their n-octanol solubility could not be provided. There is reasonable consistency between the solubility ratio log K ow values (Table 2) as demonstrated by the mean and standard deviations of CMC values for the majority of compounds (Table 3). However, where observed differences occur (e.g. C12AAPB, C12DAO and C18BAC), it suggests difficulties in measuring solubility for these compounds. When calculating log K ow from log D (using Eq. 2) for C12 carboxylate, a predicted value consistent with 4.49 measured at pH 2 (under fully protonated conditions) would be expected.
However, the predicted values of 5.23 and 5.53 (for values measured at pH 7 and pH 9, respectively) do not correspond exactly, suggesting either problems with the experimental method or in the calculated pK a value, or both.
When determining solubility in n-octanol, data for some test compounds at different time points are reasonably consistent, whereas others are less so. In addition, some compounds (2 non-ionics and the amine oxide) were infinitely soluble (fully miscible) in n-octanol. In conclusion, it was not possible to produce reliable solubility data for all test compounds in both n-octanol and water. Even where it has been possible to get realistic solubility data in this study, the correlation between log K ow values using the solubility ratio method and other approaches is generally low as observed from the R 2 values. C12EO8 is the only surfactant with comparable values generated using both the HPLC and solubility ratio methods and these show between 2.02 and 3.31 log units difference between values generated by both methods. When comparing with slow-stirring log K ow values, the datasets generated using both methods show good correlation for cationic compounds (R 2 = 1 and a slope of 1.2) but either no correlation (for anionics, R 2 = 0.0004) or too few data to make any firm conclusions on the remaining two surfactant categories (Fig. 2). Despite good correlation observed with the cationics, the solubility ratio method cannot be applied to all surfactants when solubility cannot be determined in either or both of the solvent phases. Given that the EU TGD also recommends treating the method with caution for reasons of poor correlation typically observed between octanol solubility and K ow [4], the solubility ratio method is not recommended as a robust or accurate method for the determination of log K ow values for the four classes of surfactants assessed in this study.

Predicted log K ow values
Predicted log K ow values for the twelve surfactants and two reference chemicals are given in Table 4. It can be concluded that QSPR predictions for the ionised reference PCP show good agreement between all the software packages, though less for the neutral reference ATR. The situation for the surfactants is, perhaps not surprisingly, more complex.
All QSPR predicted log K ow /D values have been compared with the log K ow data from the slow-stirring experiments. Several stir times were evaluated for each test substance during the slow-stirring study to ensure that the log K ow values were generated at optimum stirring times (i.e. when the analytical data confirmed that there was equilibrium between the n-octanol and water phases). A comparison of QSPR predicted log K ow /D values with experimental slow-stirring log K ow /D values is provided in Table 5. Broad comparisons of the mean predicted values across all methods compared with mean experimental values derived from values generated in this study [HPLC, slow-stirring and solubility ratio (based on CMC values derived using the surface tension method)] are presented by surfactant class in Fig. 3. These comparisons provide an indication of which class of surfactants is best predicted using the QSPR methods. Non-ionic surfactants with an R 2 = 0.980 demonstrate the highest correlation between experimental and predictive methods and although the regression slope is approximately 1, the intercept demonstrates a systematic difference between predicted and experimental values. Anionics have lower correlation with R 2 = 0.698 whereas cationics can be considered to have no correlation with an R 2 of 0.251. The negative slope of the regression line for amphoterics suggests a complete inability of the predictive methods to calculate representative log K ow /D values for these structures. A more detailed analysis of each surfactant class was conducted to identify and discriminate predictivity of individual QSPR methods.
All the software programmes used were able to predict a log K ow value for neutral (non-ionic) surfactants. This class of surfactants posed no issue with regard to SMILES notation and there are no reasons to discount any individual values. CLOGP [18], modified Hansch and Leo (H&L) [24], ALOGP [41] and the Broto atomic fragment [33] all demonstrate R 2 values of > 0.98 for correlation between predicted and observed values (Additional file 4: Section S4, Table S7). R 2 values for all methods are above the threshold for acceptability as defined in ECHA guidance [42]. However, when considering MAE values as a better indicator of absolute predictivity, Broto, CLOGP, Molinspiration, ALOGP and modified Hansch and Leo have the lowest values (0.06, 0.18, 0.21, 0.33 and 0.43, respectively) indicating that these are the best ranked of the considered QSPR prediction methods for predictivity (Additional file 4: Section S4, Table S8). Whilst MAE values provide only a ranking of predictivity between methods, when considering the threshold approach (Additional file 4: Section S4, Table S9) CLOGP, Molinspiration, ALOGP and Broto all classify as good methods and would, therefore, be the most recommended for predicting log K ow of non-ionic surfactants based on the small dataset considered.
For anionic surfactants, SPARC [22], Crippen Fragmentation [31] and Viswanadhan's Fragmentation [32] are all unable to generate a prediction due to their inability to handle charged compounds. All the remaining programmes are able to generate predicted log K ow /D values, although ACD Labs [21] requires removal of the counter ion from the SMILES notation and KOWWIN [19] will always 'force' the structure to its neutral form, either by adding an 'H' atom or bonding the counter ion to the negative charge, when it has been included in the SMILES notation. This can lead to significantly different predicted values of log K ow for what is apparently the 'neutral' form (see test compounds #4 and #5; Table 4). At neutral pH, most anionics will exist in their ionised form; therefore, it is recommended that the SMILES notation reflect this (i.e. do not include the counter ion). The remaining predictive methods appear able to discount the counter ion. Calculation of log K ow /D for the majority of anionic surfactants, e.g. alkylbenzene sulphonates and alkyl sulphates, using the H&L approach with a variety of surfactant-specific modifications has been widely researched and validated. When compared to this approach, CLOGP appears to give consistently higher log K ow values for the sulphate-containing surfactants. This is due to the less negative value that CLOGP uses for the sulphate fragment (−2.17 cf. −5.87 in the H&L method). KOWWIN and Broto both scored highly when considering R 2 alone, with values of 0.999 for both methods (Additional file 4: Table S7). Whilst ALOGP predictions also appear consistently high for selected compounds in this class, ALOGP ranked by far the best on the MAE measure of predictivity, with an MAE value of 0.16 (Additional file 4: Table S8), followed by Molinspiration, KOWWIN and H&L with modifications (with MAE values of 0.46, 0.83 and 1.06, respectively).
Under the threshold approach, only ALOGP scores as a moderate predictor, whereas all other methods score as poor/bad (Additional file 4: Table S9). ALOGP is therefore consistently the best method for predicting log K ow /D for anionic surfactants based on this small dataset. Molinspiration, KOWWIN and H&L with modifications would be the next recommended methods for anionics based on MAE scores (Table 5 and Additional file 4: Table S8).
For cationic surfactants, SPARC, Crippen Fragmentation, Broto and Viswanadhan's Fragmentation are all unable to generate a prediction due to their inability to handle charged compounds or missing fragment values for N+. As for anionic surfactants, ALOGP predictions appear consistently high for the compounds in the cationic class of surfactants. Care should be taken when entering the SMILES notation for quaternary nitrogen in both CLOGP and KOWWIN since significantly different  Table S9). However, based on MAE and R 2 measures, preference is given to the ACD Labs and CLOGP methods, provided that the CLOGP values are derived using the [N+] with counter ion SMILES notation.
For amphoterics, there are considerable uncertainties surrounding the appropriate approach to be taken where N+ is present in conjunction with other polar groups. SPARC, Crippen Fragmentation, Broto and Viswanadhan's Fragmentation are all unable to predict log K ow for amphoterics due to their inability to handle charged compounds or missing fragment values for N+. The same is true for the standard H&L method since there is no published value for an N+ fragment (without an associated halide ion). As with cationic surfactants, it is recommended that the quaternary nitrogen is entered in the SMILES string as [N+] for amphoterics to avoid miscalculation. None of Molinspiration, ALOGP or ACD Labs is able to calculate a value in the absence of the '+' charge. KOWWIN will protonate any negatively charged groups and treat the N+ as a pentavalent nitrogen. When working with sulphobetaines, it is suggested [44] that when using KOWWIN the [Na+] should be included in the SMILES notation to avoid protonation of the N+ which leads to an underestimation of log K ow . The value of the Na+ can then be subtracted. This approach was validated against K IAM values taken from experiments using immobilised artificial membranes (IAM).
Using the same approach here, subtraction of the Na+ value appears to prevent over-estimation of the log K ow for the carboxybetaines and brings the values closer to those predicted by CLOGP and Molinspiration, although the differences in predictions using these methods are still large (Table 4). Overall when comparing methods for predictivity, no method stands out and all methods score as poor/bad based on the MAE threshold approach (Additional file 4: Table S9). ALOGP generates the best MAE value compared to the other methods (MAE value of 2.25, Additional file 4: Table S8) but also the lowest R 2 value of 0.529 (Additional file 4: Table S7).

Discussion
It should be borne in mind that, for simplicity, the experimental data generated in this study involved the deliberate use of single chain constituents. In reality, commercial surfactants are often complex mixtures containing several components with a range of different water solubilities and hence n-octanol/water partition coefficient values. Evaluation of the experimental methods investigated in this study for application to multi-component surfactant products still needs to be undertaken.
Log K ow /D data calculated using the HPLC method, slow-stirring method, solubility ratio approach or predictive software are generally not in agreement when assessing the 12 test compounds, though non-ionic log K ow values were rather more consistent than those of the other three classes of surfactants. Of the experimental techniques, the slow-stirring method is considered to be the most widely applicable method for generating log K ow data for all the surface-active test compounds, provided it can be demonstrated that the 'surfactant'-'water'-'n-octanol' system was allowed to reach equilibrium. This is supported by good agreement in slow-stirring log K ow data for C8EO4 and C12EO8 generated in the current study and earlier work [40]. It is possible by minimising micelle formation, emulsification and adsorption effects [45,46] to obtain reasonably reliable log K ow values for surface-active molecules using a slow-stirring method. Corrections to apparent log K ow data can be made if the concentration in the aqueous phase at equilibrium is above the CMC. The main limitation of the slow-stirring method identified in the current study is that it requires sensitive analytical methods (e.g. liquid chromatography coupled with mass spectrometry; LC-MS) for analysis of the water phase for the more hydrophobic test compounds.
Predicted log K ow /D values do not show a great degree of correlation with experimental values, with the exception of slow-stirring derived log K ow /D values for non-ionics. It is recognised that conclusions drawn from this study are based on a relatively small dataset and so further studies would be recommended to confirm findings. However, the lack of agreement between methods is not restricted to surfactants. It has been shown that log K ow values derived by different methods for a range of organics were not comparable [47]. It has been advised [48] that log K ow data for organics derived from software packages should be used cautiously as they cannot always cope with complex and/or ionisable compounds. A more recent study [49] used a combination of molecular dynamics simulations and the quantum chemical conductor-like screening model for realistic solvents (COSMO-RS). A weight of evidence (WoE) approach using experimental and predicted values is reasonable for non-ionic surfactants, given the greater degree of correlation and lower incidence of prediction errors between slow-stirring log K ow /D values and log K ow /D predictions using various methods. Figure 3 also demonstrates the good correlation achieved when taking this approach for non-ionics. However, a WoE or averaging approach is difficult to justify for the other classes of surfactants given that the correlations as determined by R 2 are lower and the incidence of prediction errors as determined by MAE scores is higher (Additional file 4: Tables S7-S9). Figure 3 also demonstrates the reduced correlation for anionics when taking this approach and the lack of correlation when considering cationics and amphoterics. Recommendations of currently available prediction models are provided for those methods which seem to provide the most robust predictions for surfactants at pH 7 (Table 6).
In dealing with complex multi-component surfactant products, the recommended approach is to calculate a weighted average from the predictions of each individual chain length.
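A minimal sketch of such a weighted average is given below. It assumes that the weights are the fractions of each chain length in the product and that the averaging is carried out directly on the log K ow values; the text does not specify either choice, and the function name, chain lengths and values are hypothetical.

```python
def weighted_average_log_kow(components):
    """Weighted mean of log Kow over (fraction, log_kow) pairs.

    Fractions are normalised, so they need not sum exactly to 1.
    Assumption: averaging is done on the log values themselves.
    """
    components = list(components)
    total_fraction = sum(fraction for fraction, _ in components)
    if total_fraction <= 0:
        raise ValueError("fractions must sum to a positive value")
    weighted_sum = sum(fraction * log_kow for fraction, log_kow in components)
    return weighted_sum / total_fraction

# Hypothetical three-component product: (fraction, predicted log Kow).
product = [
    (0.5, 4.0),  # e.g. C12 homologue
    (0.3, 5.0),  # e.g. C14 homologue
    (0.2, 6.0),  # e.g. C16 homologue
]

average = weighted_average_log_kow(product)
print(f"weighted average log Kow = {average:.2f}")
```

Averaging the K ow values before taking the logarithm would weight the more hydrophobic homologues more heavily, so the basis of the average should be stated alongside the result.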
Given the intrinsic difficulties with phase separation, emulsification, limits of detection, ionisation state in the environment and lack of a clear definition of solubility for surfactants, all current experimental methods have limitations for determining accurate log K ow values. Therefore, it is recommended [50] that promising alternative experimental log K ow methods and alternative methods to log K ow , which may be more biologically relevant, should be evaluated and validated for surfactants.
The alternative experimental log K ow methods which have the potential for overcoming some of the experimental difficulties associated with current methods with surfactants include:
• pH metric (potentiometric) method for ionisable compounds [48,51,52].
• Proton nuclear magnetic resonance (H-NMR): a recent study has demonstrated how H-NMR spectra can be used as a predictive method to determine log K ow values [53].
• Centrifugal partition chromatography (CPC), also known as counter-current chromatography (CCC) [54,55].
It is beyond the scope of this study to assess these methods. These relatively unused approaches require evaluation against existing methods for application to all compounds including surfactants.

Conclusions
All current experimental methods have limitations for determining accurate log K ow values given the intrinsic difficulties with phase separation, emulsification, limits of detection, lack of defined solubility, etc. Given these limitations, on the basis of the current study, the slow-stirring method is the preferred of the currently available experimental methods for generating experimental log K ow /D data for all the surface-active test compounds, provided (a) sufficient time has been allowed to ensure equilibration of the test substance and the n-octanol and water phases, (b) a low stir rate is used to minimise any emulsion formation and (c) care is taken to sample the aqueous and n-octanol phases to minimise any contamination from the n-octanol/water interface. For the experimental methods outlined above, it is important that log K ow /D data are generated for test compounds in both their neutral and fully ionised forms. Where the pK a approximates to the environmental pH (range 5-9), it is recommended that log K ow /D is measured both under conditions in which the surfactant is fully neutral and under conditions in which it is fully ionised (i.e. two values should be determined, one at high pH and one at low pH). If the pK a is < 5 or > 9 then testing at pH 7 is recommended to represent relevant environmental conditions. Measured values can be corrected using a derivation of the Henderson-Hasselbalch equation (Eq. 2) for any ionisation state to generate a log K ow or log D under relevant environmental conditions. Thus, for any determination of partitioning the pK a and the pH of the test system should be reported.
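The pH-dependent correction and the testing rule described above can be sketched as follows. Eq. 2 is not reproduced in this section, so the sketch assumes the commonly used Henderson-Hasselbalch form for a monoprotic compound in which only the neutral species partitions into n-octanol; the exact form of Eq. 2 may differ. The function names and example values are hypothetical.

```python
import math

def log_d_from_log_kow(log_kow, pka, ph, acid=True):
    """Estimate log D at a given pH from the neutral-form log Kow.

    Assumption: monoprotic compound, ionised form does not partition.
    Acid: log D = log Kow - log10(1 + 10**(pH - pKa))
    Base: log D = log Kow - log10(1 + 10**(pKa - pH))
    """
    exponent = (ph - pka) if acid else (pka - ph)
    return log_kow - math.log10(1.0 + 10.0 ** exponent)

def recommended_test_conditions(pka):
    """Testing rule from the text: pKa within 5-9 calls for two measurements."""
    if 5.0 <= pka <= 9.0:
        return "measure at both low and high pH (fully neutral and fully ionised)"
    return "measure at pH 7"

# Hypothetical acid with log Kow = 3.0 and pKa = 7.0, evaluated at pH 7.
# Half of the compound is ionised, so log D is lower by log10(2).
log_d = log_d_from_log_kow(3.0, pka=7.0, ph=7.0, acid=True)
print(f"log D = {log_d:.3f}")
print(recommended_test_conditions(7.0))
print(recommended_test_conditions(2.0))
```

Because the result depends on both inputs, the sketch also shows why the pK a and the pH of the test system need to be reported with any measured value.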
Although there is a reasonable correlation between log K ow values for non-ionics generated by the slow-stirring and HPLC methods, it is apparent from this work that HPLC generates consistently higher log K ow values. As with other indirect methods, HPLC suffers from the lack of reference surfactants with accurately determined log K ow values. If slow-stirring derived log K ow values for non-ionic reference standards were developed further and applied in an OECD 117 HPLC method, this positive bias would be removed, making the HPLC approach a more rapid and attractive approach to determining log