Unravelling the chemical exposome in cohort studies: routes explored and steps to become comprehensive

Environmental factors contribute to the risk for adverse health outcomes against a background of genetic predisposition. Among these factors, chemical exposures may substantially contribute to disease risk and adverse outcomes. In fact, epidemiological cohort studies have established associations between exposure to individual chemicals and adverse health effects. Yet, in daily life individuals are exposed to complex mixtures in varying compositions. To capture the totality of environmental exposures the concept of the exposome has been developed. Here, we undertake an overview of major exposome projects, which pioneered the field of exposomics and explored the links between chemical exposure and health outcomes using cohort studies. We seek to reflect their achievements with regard to (i) capturing a comprehensive picture of the environmental chemical exposome, (ii) aggregating internal exposures using chemical and bioanalytical means of detection, and (iii) identifying associations that provide novel options for risk assessment and intervention. Various complementary approaches can be distinguished in addressing relevant exposure routes and it emerges that individual exposure histories may not easily be grouped. The number of chemicals for which human exposure can be detected is substantial and highlights the reality of mixture exposures. Yet, to a large extent it depends on targeted chemical analysis with the specific challenges to capture all relevant exposure routes and assess the chemical concentrations occurring in humans. The currently used approaches imply prior knowledge or hypotheses about relevant exposures. Typically, the number of chemicals considered in exposome projects is counted in dozens, in contrast to the several thousands of chemicals for which occurrence has been reported in human serum and urine. Furthermore, health outcomes are often still compared to single chemicals only. Moreover, explicit consideration of mixture effects and the interrelations between different outcomes to support causal relationships and identify risk drivers in complex mixtures remain underdeveloped and call for specifically designed exposome-cohort studies.

environmental changes [54,57]. There is an extensive number of environmental factors that may be relevant. To approximate this multitude, the holistic concept of the exposome was proposed by Wild [108] and popularized and refined by Rappaport [81]. Many scientists have followed up on the idea and suggested further specifications for specific sensitive time periods (e.g. pregnancy, perinatal period) [17,82,95], selected tissues (e.g. teeth, placenta) [5,56], focus on specific methodologies [22,63] or prevalent diseases [83].
The exposome concept aspires to identify chemical exposure, among other environmental factors, relevant for adverse health effects, thus complementing the contribution of lifestyle factors and genomic susceptibility to human disease development [108]. Most exposome definitions acknowledge that typically only chemicals that enter the organism can interfere with cellular or organ functions and provoke adverse effects. Hence, the exposome can be described as the totality of internal human exposure with regard to exogenous chemicals, their biotransformation products, and endogenous chemicals sensitive to various environmental exposures and potentially involved in signaling pathways [30]. This internal chemical environment is highly dynamic and the exposome strives to consider the totality of exposures of an individual over the entire life course from conception until death [81,85,108]. It is obvious that the conceptual claim to capture the totality of internal exposure may have practical limitations. For instance, short-lived or reactive chemicals may not be detected, and the exposome assessment can be biased by snapshot samples and/or analytical restriction to available body fluids such as saliva, blood or urine. The endeavor also challenges chemical analytics [78], which need to consider matrix effects for internal exposure, the distribution of chemicals between tissues, and high transformation rates. Moreover, the assessment has to account for sequential exposures and mixture effects [30]. Nevertheless, taking a perspective on the internal exposure seems useful in order to advance from mere associations towards establishing causal links between exposure and effect. Understanding the relation of external to internal exposures, therefore, is a central aspect of an exposome assessment. It also allows existing concepts of biomarkers of exposure to be revisited.
The exposome assessment could represent a critical entity to broaden our understanding of the contribution of environmental factors in the etiology of diseases; it could help to advance the nature-versus-nurture debate. To this end, health data obtained from human cohort studies need to be evaluated in close linkage to environmental data. Here, especially the integration of various high-resolution cohorts with in-depth phenotyping is aspirational to ensure a sufficient sample size and statistical power. At best, this includes data to approximate a lifetime exposome with all its vulnerable phases, beginning as early as conception, throughout the developmental phases of childhood and adolescence to adulthood and into old age [85]. However, such endeavors are accompanied by challenges such as the sharing and harmonization of data, but also legal and ethical considerations for the use of sensitive human data. Viewed in conjunction with genomic analyses and as part of an overall exposome approach, such data could push us towards the understanding of disease development as well as the prevention thereof. For instance, as many environmental factors are typically subject to regulatory policies, this perspective could ultimately reveal novel points of action for prevention and treatment of civilization diseases [81].
As of now, there is no consensus on how to assess the exposome. From a practical point of view, this is due to the multitude of environmental factors, variation in individual behaviors determining exposures, and the novelty of the exposome concept. Therefore, this review aims to characterize major current projects and their approaches to implement the exposome concept with a focus on environmental chemicals.
Major knowledge gaps were identified with regard to the relation between environmental chemical monitoring (external exposure) and human biomonitoring (internal exposure) [24]. In line with this argument, the European Union (EU) funded several projects under the EU Framework Programme 7 (HELIX, EXPOsOMICS, HEALS) in order to advance specific approaches to capture exposome data and link it to health outcomes gained from cohort studies [24]. As the exposome might vary substantially between geographic regions, this review focuses on the analysis of these major European exposome projects with special emphasis on chemical exposure. We compared the approaches of the European exposome projects to the EU human biomonitoring initiative HBM4EU (https://www.hbm4eu.eu/) and a literature review to summarize current knowledge of associations between human exposures to chemicals and health outcomes. Additionally, this review analyses the chemical and bioanalytical methods used for exposure detection, the aggregation of internal exposures, and novelties related to the association between exposome and health outcome. With regard to the latter, we elaborate on selected aspects of the cohort studies that were included in the exposome projects.

Aims and framing of current European exposome studies
As part of this review, projects funded by the European Union under EU Framework Programme FP7 and EU Horizon 2020 were considered in detail. These include the projects HELIX, HEALS, EXPOsOMICS, and HBM4EU [18,92,98,104]. All of these projects related their research to existing infrastructures and data available in different European cohorts with the aim of comparing health outcomes and exposure information [6]. Moreover, they all invested dedicated efforts to generate cohort-related biosamples and exposome data.
The Human Early Life Exposure (HELIX) project, which targeted the characterization of the early-life (pregnancy and childhood) exposome of European populations, combined six European birth cohort studies [104]. Research included the characterization and aggregation of external exposures, their integration with internal exposures, and their association with major child health outcomes [104]. Further, it comprised omics data and health outcomes as well as attempts to simplify complex exposures into patterns [104].
The major objective of the Health and Environment-wide Associations based on Large Population Surveys (HEALS) consortium was to advance the methodology and analysis of the human exposome employing advanced statistical tools [92]. The project used data acquired from several current European epidemiological studies including mother/infant pairs, children, and adults [93] to characterize human exposures in conjunction with disease mechanisms and health outcomes [92]. HEALS efforts were built upon human biomonitoring samples, the assessment of exposure biomarkers, and various omics techniques [92].
The focus of the EXPOsOMICS project lay on the development of assessment strategies to characterize the mixture exposures to environmental pollutants of consented priority and to approximate the individual-level exposome [98]. To this end, the project utilized data of 12 cohorts including three experimental studies, five mother-child cohorts, four adult cohorts, and subsamples with personal exposure monitoring [98]. Personal and population-level measurements were combined with various omics technologies to characterize biological samples in depth [98].
Finally, the European Human Biomonitoring Initiative (HBM4EU) is funded by the recent EU Horizon 2020 program. The main objective of HBM4EU lies in the coordination and advancement of human biomonitoring efforts across Europe with the ultimate goal to support policy making [14]. HBM4EU is a joint effort of more than 119 institutional partners charged with human biomonitoring tasks and linked third parties, mostly research entities, in a total of 30 countries. The project strives to close knowledge gaps with regard to protocols for detecting and harmonized methods for assessing internal chemical exposures and respective health consequences [14]. Therefore, HBM4EU also plans to investigate health effects in relation to human biomonitoring data with the use of existing cohort studies and biobanks [14].
In summary, the ultimate goal of all projects is to advance the characterization of complex human exposures from environmental and other sources and the association with health outcomes. However, the pursued objectives and chosen approaches are distinctly different between the endeavors. Although exposome and human biomonitoring projects both ultimately aim at controlling environmental risks for human health, they build on different foundations, concepts, and study designs. Exposome projects aim at the totality of environmental exposures and relate dozens to thousands of more or less precisely defined signals of exposure to health endpoints in various cohort studies, ranging from observational studies to interventions, from cross-sectional to longitudinal approaches, and from birth cohorts to cohorts of advanced age. Human biomonitoring projects, in contrast, aggregate knowledge to identify comparably few substances of particular concern, derive levels of tolerable exposure to these substances, and precisely quantify these using reliable and representative marker compounds.

Exposure analysis strategies
Major decisions on which factors to include in an exposome study comprise the selection of the domains considered, namely the chemical (e.g. pharmaceuticals, food additives, contaminants), physical (e.g. built environment, noise, green space), and social environment (e.g. neighborhood, socio-economic status, infrastructure), as well as the biosamples to be used for analysis (e.g. urine, blood, or external proxies). They are central to what we could denominate as the conceptual analysis framework (Fig. 1). Ideally, the framework matches the problem formulation in order to adequately characterize risks. Strategies for the selected exposure analysis may be considered in this context. Technical aspects are decisive in many situations with respect to whether or not an environmental stressor can actually be traced and quantified at the required spatio-temporal scale. Thus, while the exposome concept intends to assess the time- and component-aggregated internal exposure that human bodies experience, many compounds may either not be traceable or are eliminated fast despite eliciting longer-lasting responses. Therefore, exposome research strives, in a complementary manner, (i) to describe external exposure situations more comprehensively in order to identify potentially unacknowledged stressors, (ii) to relate external to internal exposure in order to strengthen the plausibility of associations between external exposures and observed health outcomes, and (iii) to explore options of omics technologies to provide novel, untargeted biomarker detection tools.
The HELIX project, in its goal to provide a global view, developed a multistep approach to characterize various co-exposures in early life including outdoor exposures, individual exposures, and an integration step for external and internal exposure [102]. As their most complex exposure perspective, they defined exposure groups comprising items such as atmospheric pollutants, surrounding natural spaces, traffic, water disinfection by-products, tobacco smoking, lifestyle or socio-economic capital [95] testing whether grouping at this level would already allow suitable simplification of the chemical exposome characterization.
To investigate the notion that the socioeconomic status of individuals could be a major predictor of environmental exposure, a comparative study using data from nine European urban areas attempted to identify recurring exposure patterns through co-correlations of the above exposure groups with socioeconomic determinants for urban exposomes during pregnancy [82]. The exposure groups, meteorological factors, air pollutants, traffic indicators, density of the built environment and others were found to correlate at moderate to weak degrees, while distance to green space was inversely correlated. Yet, urban exposures of individuals were not found to be associated with socio-economic descriptors (SED), such as education or income of individuals, in this case study of urban populations. Instead, social patterning was shown to be of considerable heterogeneity between the different cities [82] and correlation between SED and the exposome was specific for a local area. However, using exposure biomarker data from blood or urine for 41 chemical contaminants from six European birth cohorts, it could be demonstrated that specific associations between SED and level of internal contaminant exposure could be discerned [64]. For example, higher socio-economic position was found to be associated with certain per- and polyfluoroalkyl substances (PFAS), mercury, arsenic, phenols, and organophosphorus pesticides, while lower socio-economic position was seen to be associated with cadmium exposure during pregnancy and increased lead and phthalate exposure during childhood [64]. The increased cadmium exposure in pregnant women could in part be explained by smoking habits.
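The co-correlation analysis of exposure variables described above can be illustrated with a minimal sketch. It is not the method of the cited study: the function names (`pearson`, `correlation_table`), the variable names, and all numerical values are hypothetical and serve only to show how pairwise correlations between exposure indicators, including an inverse one, might be tabulated.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return cov / sqrt(var_x * var_y)


def correlation_table(exposures):
    """Pairwise correlations for a dict mapping exposure name -> values."""
    names = sorted(exposures)
    return {
        (a, b): pearson(exposures[a], exposures[b])
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }


# Hypothetical values for five residential addresses (illustration only).
toy = {
    "no2_ug_m3": [18.0, 25.0, 31.0, 40.0, 46.0],
    "traffic_density": [2.0, 3.5, 3.0, 6.0, 7.5],
    "green_space_distance_m": [900.0, 650.0, 700.0, 300.0, 150.0],
}

for pair, r in correlation_table(toy).items():
    print(pair, round(r, 2))
```

In this toy data set the two traffic-related indicators correlate positively, whereas the distance to green space correlates negatively with both. A real analysis would additionally have to handle missing values, non-linear relations, and differences between cities.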
The HELIX project characterized the exposome for 87 and 122 diverse exposure variables for mothers during pregnancy and their children during development (6-11 years of age), respectively. It was based on six European birth cohort studies, comprising 1300 mother-child pairs [95]. Among them, about 60 different chemicals were determined in blood or urine samples [43] (see Table 1). On the basis of this data, it was possible to differentiate a pregnancy exposome from the exposome for childhood. At the same time, for some exposure groups such as atmospheric pollutants, surrounding green spaces, meteorology and built environment, correlations between pregnancy and childhood exposomes were found, probably reflecting a stable neighborhood for the time domain considered. Furthermore, networks with larger clusters of exposure variables could be described and were interpreted in relation to the predefined exposure groups. Yet, when conducting principal component analysis (PCA) for the pregnancy and the childhood exposome, it required 65 of the 87 (75%) and 90 of the 122 (74%) exposure variables to explain 95% of the variance in the original data pairs [95]. Thus, the authors conclude from their collaborative efforts, 'the early lifetime exposome to be high dimensional in terms of having little redundancy' [95].
Figure caption (fragment): Internal body burdens are analyzed in individual or population-based biosamples, such as blood. Typical analytical techniques are indicated in the white egg shapes. All approaches cover different aspects of a comprehensive exposome with regard to aggregation level, spatio-temporal characterization or indication of biological impact
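The PCA criterion mentioned above, the number of components required to explain 95% of the variance, can be made concrete with a small sketch. It assumes that the per-component variances (the eigenvalues of a PCA) are already available; the function name `components_needed` and both example spectra are hypothetical and are not taken from the HELIX data.

```python
def components_needed(component_variances, threshold=0.95):
    """Smallest number of leading components whose cumulative share of the
    total variance reaches the threshold."""
    if not 0 < threshold <= 1:
        raise ValueError("threshold must be in (0, 1]")
    ordered = sorted(component_variances, reverse=True)
    total = sum(ordered)
    if total <= 0:
        raise ValueError("total variance must be positive")
    cumulative = 0.0
    for count, variance in enumerate(ordered, start=1):
        cumulative += variance
        if cumulative / total >= threshold:
            return count
    return len(ordered)


# Hypothetical spectrum with strong redundancy: few components dominate.
redundant = [60.0, 25.0, 12.0, 1.0, 1.0, 1.0]
print(components_needed(redundant))  # 3

# Hypothetical spectrum without any redundancy: 87 equally important components.
flat = [1.0] * 87
print(components_needed(flat))  # 83
```

The more redundant the exposure variables are, the fewer components are needed. A count close to the number of original variables, as reported for the pregnancy and childhood exposomes, therefore points to little redundancy.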
Based on reviewing available approaches for assessing individual exposomes with external measures [58,97] and internal biomarkers of exposures [92], the HEALS consortium employed a series of analytical, bioanalytical, and computational tools to advance the performance of environment/exposome-wide association studies (EWAS) and tested them with population samples. Biomarkers of exposure are understood in this context according to earlier definitions of e.g. NRC (1987) as indicators signaling exposures from within biological systems or samples.
Regarding the use of modern sensor technologies for chemical exposome assessment, two commentary papers from the consortium suggest that personal location devices, such as smartphone tracking combined with temporal-spatial pollution mapping, promise unique opportunities to gain estimates of an individual's external exposures to air pollutants such as particulate matter (PM2.5) or nitrogen dioxide (NO2) [58,97]. Furthermore, the project undertook to categorize various types of stressors and reflect the availability of specific biomarkers of exposure and their readiness for use in exposome assessment [92].
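How location tracking and pollution mapping might be combined into an individual exposure estimate can be sketched with a simple time-weighted average. This is an illustrative assumption, not the procedure of the cited commentaries: the function name `time_weighted_exposure`, the microenvironment labels, and the PM2.5 values are hypothetical.

```python
def time_weighted_exposure(track, concentration_map):
    """Time-weighted average concentration over a location track.

    track: list of (microenvironment, minutes) tuples, e.g. derived from
        smartphone tracking.
    concentration_map: microenvironment -> pollutant concentration
        (e.g. PM2.5 in ug/m3) taken from a pollution map.
    """
    total_minutes = sum(minutes for _, minutes in track)
    if total_minutes <= 0:
        raise ValueError("track must cover a positive duration")
    weighted = sum(concentration_map[place] * minutes for place, minutes in track)
    return weighted / total_minutes


# Hypothetical one-day track (1440 minutes) and PM2.5 levels.
track = [("home", 840), ("commute", 60), ("office", 480), ("commute", 60)]
pm25 = {"home": 8.0, "commute": 35.0, "office": 12.0}

print(round(time_weighted_exposure(track, pm25), 2))
```

In this example the two short commutes raise the daily average noticeably above the home and office levels, which is the kind of microenvironment effect that modelled residential averages cannot resolve.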
The defined stressor categories address different aspects, namely chemical composition (groups of organic and inorganic compounds), physico-chemical properties (persistent and volatile), intentional use such as pharmaceuticals and lifestyle, DNA-damaging agents, and the context or media of unintentional exposure such as occupational environments, air pollution, and food or water contamination. While these categories are certainly not mutually exclusive but overlap (e.g. a substance can be both volatile and persistent), they clearly relate individual stressors to management options. For example, exposure to volatile organics may be reduced by ventilation, which would not work equally effectively for persistent organics; water contamination can be treated in drinking water treatment plants, while food contamination must be avoided at the source. Smoking and lifestyle factors can efficiently be dealt with individually.
Sixty-four individual stressors were specified and considered for the different stressor categories, many of them chemical entities. Others address chemical groups such as dioxin-like polychlorobiphenyls (PCB) and antibiotics, or poorly defined complex mixtures such as diesel exhaust, bio-aerosols or disinfection by-products. The compilation of specific biomarkers of exposure typically comprises chemical compounds of anthropogenic origin or their transformation products detectable in biosample matrices, such as blood, blood serum, blood plasma, urine or breast milk. In total, 135 biomarkers of exposure were considered and reviewed with regard to the availability of reference and exposure limit values. Yet, for 12 out of the 64 individual stressors considered, no biomarker of exposure could be retrieved from the literature [92].
Various specific contributions in the field of chemical exposure-human effect investigations, typically based on existing cohort studies or existing biosamples have been provided throughout the HEALS project [44]. The planned integrated use of advanced tools within a European Exposure and Health Survey (EXHES) and the results from the application of the environment-wide association approach to EXHES data have not yet been reported.
The overall ambition of the EXPOsOMICS project was to study the opportunities of novel exposure analysis approaches for strengthening the plausibility between environmental exposures and health outcomes. The project focused on air and water contamination and their health effects during critical periods of life [98]. In particular, the deconstruction of the complex air and water pollution mixtures was pursued as a leading idea.
For personalized detection of human exposure to air pollution, several routes were followed. The composition of particulate matter was differentiated with regard to improved size classification, elemental composition, and organic and polycyclic aromatic hydrocarbon (PAH) load using a novel mobile monitoring design [32]. A portable monitor was coupled with a smartphone app integrating geo-location and other information to provide more accurate individual exposure estimates and to allow analysis of the influence of microenvironments, in contrast to the current standard practice of modelled average exposure levels [26]. Personalized air particle samplers apparently still struggle with methodological and sensitivity issues; however, various developments are under way [31] that may render them widely useful in the near future.
To foster the idea that molecules from within the body could indicate past exposure histories that may help in predicting future disease risks [99], the EXPOsOMICS consortium substantially contributed to the Exposome-Explorer [66,67] developed at the International Agency for Research on Cancer (IARC). The Exposome-Explorer provides an internet-accessible database that summarizes information on biomarkers of exposure related to environmental risk factors for diseases. More than 800 biomarkers have been collated to date, half of which are directly related to exposure to chemical pollutants, while others reflect e.g. dietary and metabolic components. They are listed in conjunction with metadata characterizing the biomarker structure, methodological information, concentrations in biosamples, correlations with exposures, use in cohort studies, and associations with cancer occurrence.
The development of unbiased exposure biomarkers was sought by employing omics techniques such as adductomics, epigenomics, transcriptomics, proteomics or metabolomics. It is a strategy similar to untargeted chemical analytics to account better for the diversity of pollutants. The omics techniques strive for comprehensive snapshot detection of the respective class of biomolecules, and by contrasting different situations they were employed in EXPOsOMICS to obtain fingerprints of specific exposure situations [99]. From the project, success is reported for the task of disentangling contributions from different components of air pollution using metabolome and transcriptome profiles [31]. By contrast, the attempt to discriminate transcriptional and microRNA change patterns for different disinfection by-product (DBP) exposures was not successful, even though DBPs may consist of over 700 components. Both types of findings, however, provided additional information by employing a combination of top-down (observing complex exposure) and bottom-up (studying component-defined mixture exposures) approaches. This 'meet in the middle' approach [99] supported the plausibility between estimated past exposures and observable health effects.
Finally, the perspective of the ongoing HBM4EU project is to provide aggregated indicators of human body burden of chemicals. These summary measures are thought to provide a basis to study and understand variations in time, between countries, sex, age, or socioeconomic status and thus provide scope for monitoring of future management activities [18]. The selection of chemicals for which monitoring and research activities are to be carried out was based on expert knowledge and on the criteria of relevance of a potential indicator for policy, society and health, as well as on consideration of the available biomarker data. The experts were drawn from international, European and national institutions charged with chemical risk assessment and stakeholder consultations. The compound prioritization followed a multi-step procedure, which started with an online survey to nominate substances for further research, included a stakeholder workshop, and involved several rounds of shortlisting and discussions with stakeholders, national hubs, and the EU Policy Board in order to obtain the final list. Thus, a transparent strategy was chosen for the prioritization of compounds, which relied not solely on scientific evidence but also on societal relevance at the European level. With regard to biomonitoring, the activities then focus on the provision and ring-testing of analytical methods for campaigns within or following the project. The list of what is called HBM4EU priority substances comprises altogether 18 entries, among them single chemicals or groups of chemicals with related structures, usage pattern or origin. Prioritization, objectives and policy-related questions for each chemical (group) are detailed in two scoping documents from 2016 and 2017-2018 [69,86].

Chemicals considered in the EU exposome projects
To provide a structured list of all chemicals considered across the different projects, compounds were grouped into broad stressor categories in line with the suggestions made by HEALS [92]. The list of all chemicals for which dedicated analytical efforts were performed within the considered exposome-oriented projects is collated in Table 1. Persistent organic pollutants (POPs), many of them legacy chemicals, and metals were typically analyzed in human blood samples. Other organic contaminants and current-use pesticides are less persistent and, therefore, their metabolites are often quantified in urine samples. Also, volatile organic chemicals cannot be captured in their original form, but they may be detected as urinary metabolites.
The four projects described show different priorities regarding the consideration of, and analytical efforts devoted to, different chemical stressor categories. In HELIX, HEALS, and HBM4EU the analyses comprised potentially toxic elements such as lead, mercury, arsenic, and cadmium, various persistent organic pollutants including, e.g., brominated flame retardants, polychlorinated biphenyls, and pesticides, as well as other organic contaminants, such as bisphenols, parabens, and phthalates. HEALS and HELIX furthermore looked at pollution sources such as smoking, air pollution and water contamination. Moreover, HEALS and HBM4EU put emphasis on volatile organic compounds and food contamination.
HBM4EU additionally includes research on substances of emerging concern [18,39,98]. Finally, HELIX is the only project that explicitly added selected pharmaceuticals.
Thus, the analyses comprised a range of different chemicals including candidates with suspected health effects (e.g. hormonally active substances, carcinogens), chemicals related to air quality and pollution, and, finally, also lifestyle-related exposures (e.g. smoking) as well as nutrition-derived residue and contamination-related exposures.
The chemical selection process can be seen in relation to the project-specific focus. In order to achieve a holistic exposure approach, the HELIX project included the analysis of multiple chemicals instead of 'one-exposure-one-health-outcome' perspectives [43]. To meet this ambition, HELIX looked into a total of 59 environmental chemicals comprising 14 essential elements, 15 persistent, and 20 non-persistent environmental chemicals, among them pesticides, bisphenols, and phthalates [43].
Table 1 List of chemicals analyzed in the EU projects HELIX, HEALS, and EXPOsOMICS and prioritized chemical compounds of HBM4EU
The HEALS project focused on substances with known particular relevance to human health in Europe [92]. The selection of chemicals was based on an expert-driven process, which included, among others, national and EU level policy stakeholders but also scientific partners. This resulted in a list of 98 analytes, which can be attributed to nine of the stressor categories ranging from elements via organic pollutants to food and water contamination.
The EXPOsOMICS project focused on contaminations of air and water with the aim to provide data on a broad spectrum of chemicals and chemical mixtures [98]. In particular, for air pollution, EXPOsOMICS analyzed particulate matter and ultrafine particles, while for water contamination they focused on disinfection by-products [98]. Here, mixtures were acknowledged with the aim to identify risk drivers.
In HBM4EU, several international and European human health risk assessment schemes were evaluated (e.g. WHO, FAO) with the aim to implement a prioritization strategy [14,84]. The first group of priority substances was described in 2016 [84] and was complemented by a second list in 2018 [69,84]. The priority groups were clustered according to their toxic potency, modes of action regarding different cellular systems, common target organs, or common phenomenological effects [29]. Also, adverse outcome pathways (AOPs) such as fatty acid composition changes in liver, decreased anogenital distance and cranio-facial malformation were used for grouping [14]. The concept of AOPs aims at linking external exposures to respective cellular concentrations and biochemical mechanisms that ultimately lead to responses at the level of organs and organisms, and finally also at population levels [30].
The array of substances, which have been collated from the described exposome projects, will subsequently be compared against other biomonitoring efforts as well as against literature-based knowledge about chemicals that have been reported in connection with elucidation of health effects.

Association of exposome and health outcomes
One of the biggest challenges of the exposome approach lies in establishing robust relationships between multidimensional environmental data and health outcomes acknowledging the "extremely complex multistage development process" [102].
To cover health-relevant data throughout the entire life course, it is crucial to include cohort studies that cover the entire age range, that are representative of the general population (including all sexes, socio-economic strata, and ethnic groups), and that follow, at best, a prospective, longitudinal design [85,109]. The cohorts included in the European exposome projects cover several of these ambitions, yet are still too few to allow generalizations (for a detailed overview of study characteristics, please see Additional file 1: Table S1). The study designs included cross-sectional surveys, longitudinal cohort studies, and interventional trials with an age range from 0 to > 80 years and with sample sizes from 30 to 14,000 participants [50,112]. While HELIX focused on childhood with the inclusion of birth cohorts [104], HEALS, EXPOsOMICS, and HBM4EU cover the entire life span with a stronger focus on adulthood [93,94,98]. The gender ratio across the majority of studies is balanced, with exceptions that focus on females [62,94]. Another relevant aspect is the respective time of enrollment, which starts as early as the 1990s (e.g. ALSPAC, [36]) and is still ongoing in other studies (e.g. ENVIRonAGE, [49]). This is of major relevance as the overall exposure of a child born in the 1990s will clearly differ from that of a child born nowadays. To name only a few buzzwords indicating those changes: smartphones, non-smoker protection laws, exhaust emission standards, a changing climate, and the coronavirus pandemic.

Table 1 (continued)
The data in this table are based on the following sources: HELIX: [43,60], HEALS: [92], EXPOsOMICS: [98], HBM4EU: [69,84].

An expansion of the endeavor to link the (chemical) exposome to health outcomes is highly desirable. However, the inclusion of more cohorts into one overall project poses several challenges. These include not only methodological issues, such as the harmonization, inventory, and cataloguing of data, but also legal and ethical challenges with respect to personal data and the respective data protection laws. Furthermore, it poses a series of statistical issues. Among others, analyses have to cope with the large overall number of exposure variables, missing data, the correlational structure within exposure and health data, interaction or mixture effects, and biases in multi-center studies. Not surprisingly, therefore, all exposome projects considered in this review strove to systematically summarize, evaluate, and advance statistical methods for building exposome-health associations [7,85,87,97]. While significant progress has been made in variable selection and the identification of pairwise interactions between exposures, the untangling of relevant and confounding exposures and the analysis of exposome data in longitudinal studies remain challenging [21,55,85].
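To illustrate the multiple-testing issue that arises from a large number of exposure variables, the following minimal sketch applies the Benjamini-Hochberg step-up procedure to a set of p-values. It is offered as a generic illustration only: the reviewed projects are not stated to have used this particular procedure, the function name `benjamini_hochberg` and the exposure names are hypothetical, and the p-values are invented. The procedure is known to control the false discovery rate for independent or positively dependent tests, which correlated exposures may not fully satisfy.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return one boolean per p-value: True if the hypothesis is rejected
    while controlling the false discovery rate at level alpha."""
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k (1-based) with p_(k) <= (k / m) * alpha.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            cutoff_rank = rank
    # Reject every hypothesis ranked at or below the cutoff.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            rejected[idx] = True
    return rejected


# Hypothetical p-values from five single-exposure models (not real data).
p_values = {
    "exposure_a": 0.001,
    "exposure_b": 0.008,
    "exposure_c": 0.039,
    "exposure_d": 0.041,
    "exposure_e": 0.600,
}
flags = benjamini_hochberg(list(p_values.values()), alpha=0.05)
selected = [name for name, keep in zip(p_values, flags) if keep]
print(selected)
```

Because the procedure is step-up, a single small p-value at a high rank can carry all lower-ranked hypotheses with it, which is why the cutoff is determined over the whole sorted list before any hypothesis is flagged.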
Throughout the described exposome projects, a large amount of data has been collected. Exposures related to lifestyle factors include e.g. maternal stress [73], tobacco smoking [40,72], or occupational exposure to chemicals [11] but also air pollution. Regarding chemicals, HEALS reported on the analysis of 98 chemicals [92], HELIX included 59 [43,60], and EXPOsOMICS seven substances [98]. In HBM4EU 230 compounds are considered as potentially relevant for human biomonitoring [18,84]. However, stating the exact number of analyzed chemicals is ambiguous as the projects sometimes refer not only to single entities (e.g. lead, selenium, mercury, volatile organic compounds, and various organic pollutants, see Table 1) but also to substance groups (e.g. benzophenones, trihalomethanes), which potentially contain multiple compounds.
On top of these already diverse data, many cohort studies additionally complement their analyses with biomarkers gained from biospecimens (e.g. blood, urine) such as enzyme levels and activities, hematological markers [52], hormone levels, metabolite levels (metabolome) or DNA adducts (adductome, [80]) such as DNA methylation levels [107]. Metabolome and transcriptome analyses, for example, were performed in HELIX, HEALS, and EXPOsOMICS, while epigenomics and proteomics were done in HELIX and EXPOsOMICS only [92,98,104]; the projects HEALS and EXPOsOMICS furthermore completed adductomics [92,98]. As biomarkers are interpreted in different directions, it is not straightforward to summarize their added value. Biomarkers may be seen to aggregate exposures to different substances that hit the same target. This would support a comprehensive exposome detection. They can, however, also be employed to support causality between exposure and outcome by looking at them as effect indicators for downstream biological responses, or they can be used to explain observable variation within populations through different susceptibilities of subgroups.
In practice, the exposome projects, which were considered in this review, pursued different approaches to address external and internal exposures in their studies and relate these to a variety of health outcomes. For an overview of the addressed health outcomes see Table 2 and for characteristics of the used cohort studies refer to Additional file 1: Table S1.
HELIX undertook a variety of outcome assessments, ranging from one-exposure-one-outcome assessments to elaborate benefit-harm scenarios, in order to cover different complexities of exposure scenarios [104]. These results were documented in various publications on blood pressure during pregnancy and in children [105,106], birth weight [68], fetal growth [2], lung function in children [3], and childhood obesity [103]. For this, HELIX used data of six European birth cohort studies with a total of 32,000 mother-child pairs with initial recruitments ranging from 1999 to 2007 and a representative gender ratio [104]. Additionally, two subsamples to assess biomarkers in mother-child pairs (N = 1,200) and a nested repeat-sampling panel study (N = 150) have been established [104]. Overall, HELIX helped to demonstrate the feasibility of a harmonized, large-scale exposome-wide association study and advanced standardized exposure assessment, especially in the urban context [46]. This includes, among others, methodological developments, such as more efficient statistical approaches, the generation of a complete molecular profile data set, and improved measurement of a variety of exposures.

Table 2 Health outcomes assessed in the EU exposome projects
Please refer to Additional file 1.

More specifically, HELIX increased the overall reliability of findings as they based their analyses on the assessment and statistical testing of a large number of exposures. This allowed for discerning confounding co-exposures and accounting for multiple testing [46]. HELIX identified air pollution (in relation to infant mortality) and secondhand smoke (in relation to asthma) as the largest negative contributions with regard to child health. Furthermore, they confirmed potential hazards from factors such as dampness, formaldehyde, and ozone and their associations with childhood asthma and respiratory symptoms, as well as lead in association with mild mental retardation [46]. Moreover, with the comprehensive generation of molecular profile data sets (including urinary and serum metabolomics, plasma proteomics, blood cell DNA methylation, transcriptomics, and microRNA data) for a subgroup of 874 children, HELIX was able to conduct an Exposure Wide Association Study (ExWAS), which identified several clusters by association. These were used to identify exposure sources, e.g. fish/seafood as a source of polyunsaturated fatty acids and also of metal contamination [46].
HEALS has, so far, published work on multiple health outcomes including (gestational) weight gain [62,100], birth weight [11], psychomotor development [70,71,74], fine motor skills [76], neurodevelopment [72,73,89], and allergy and asthma [40]. To that end, HEALS used data of at least 17 different cohort studies, which involve birth cohort studies, national registries, screening trials, and adults, with sample sizes from 186 to > 130,000 and enrolment from as early as 1992 up to 2011. Information on the gender ratio could be verified from the publications for only a few cohorts.
Major achievements of HEALS include the establishment of large, harmonized exposure and health databases and the advancement of EWAS methodologies, including the linkage with omics data, data mining techniques, and machine learning [45]. Further milestones comprise reliability testing and validation of personal and remote sensors for individual exposure in five European countries [45]. In detail, HEALS applied a life-course approach that is based on existing data and characterized the external exposures of about 550 individuals. This demonstrated, as proof of principle, the applicability of the exposome methodology to existing studies [45]. As a first step in establishing individual 'life-long multi-stressor exposure profiles', this might improve preventive strategies or assist policy-makers [45].
Up to now, EXPOsOMICS has yielded several publications relating outcomes such as arterial blood pressure [37], cardio- and cerebrovascular diseases and events [34,91], natural-cause mortality [9], cardiovascular mortality [10], nonmalignant respiratory mortality [25], and lung cancer [77] to exposures such as environmental pollutants and disinfection byproducts. To achieve this, EXPOsOMICS used three different types of cohorts, namely experimental short-term studies, mother-child cohorts, and adult long-term studies with sample sizes from 30 (TAPAS2 study) to > 500,000 participants (EPIC CVD) in the original cohorts (see Additional file 1: Table S1 for details; [98]). However, subsamples with only limited information on their selection criteria were included in EXPOsOMICS, which could raise concerns about statistical pitfalls. In a subsample of 205 participants, personal exposure monitoring was performed to assess air pollution, movement, and biological measurements (e.g. blood; [98]). The contributions of EXPOsOMICS to exposure science comprise, e.g., harmonized exposure assessment for air pollution variables and methodological contributions to personal exposure monitoring. At the very least this resulted in a reduction of uncertainty regarding exposure assessment at the individual level [31]. Additionally, several statistical challenges related to exposome studies were addressed and resulted in a statistical toolkit that includes solutions for, e.g., multiple testing, the interaction of exposures, and analysis techniques for multivariate data [31]. In detail, EXPOsOMICS improved the credibility of the association between air pollution and asthma onset, which previously could have been underestimated [31]. Furthermore, with the help of the meet-in-the-middle approach (see "Exposure analysis strategies" section), they identified oxidative stress and subsequent inflammatory responses as potential key events in the respective adverse outcome pathways. Additional metabolomic analyses further supported mechanistic understanding and thus strengthened the evidence base on the crucial role of ultrafine particles in relation to adverse cardiovascular health outcomes [31]. Concerning the exposure to disinfection byproducts (DBPs) related to swimming in chlorinated pool water, EXPOsOMICS added evidence for 'toxicity at real life levels' [31]. The combined analysis of metabolomic, transcriptional, and microRNA changes linked the exposure to DBPs to bladder cancer, thus supporting a previously established association. Moreover, as a novel finding they also linked DBP exposure to the prevalence of colorectal cancer [31].
HBM4EU, which is the most recent of the described projects, has, so far, resulted in one publication that addressed breast cancer in a sample of 585 females (age 28-85 years), who were enrolled between 2007 and 2011 [94]. A further review and the project's overall ambition promise that further work will follow [1]. Ongoing and planned work strives to link human exposure to general health status by expanding chemical analyses to biospecimens available from established health cohort studies [96] and to address pesticide mixture exposure and associated health effects [101].
Overall, a wide heterogeneity remains with regard to the assessed external and internal exposures and their association with various health outcomes. These associations can relate health outcomes to single exposure variables, to an aggregated number of variables, or to a more comprehensive exposome measure. In order to comprehend the overall progress derived from these pioneering exposome studies, all the different variables can be summarized into four broader exposure categories for which exposure-health associations have been reported. We found it useful to separate the exposure categories lifestyle, air pollution, integrative exposome groups, and defined chemicals. Out of the 25 publications that were identified for the EU projects in relation to health outcomes, most established associations considered only a single or a few exposures, while five HELIX publications explicitly covered exposome-based groupings, namely the pregnancy exposome [2], the early-life exposome [3,103,106], and the urban exposome [68]. HELIX has so far published on health outcomes and their relations to integrative exposome groupings and to individual chemicals, while HEALS addressed defined chemicals and single lifestyle factors (as simple correlations and not as part of a specific, more comprehensive exposome concept). EXPOsOMICS explicitly focused on the relation of air pollution as a complex exposure to health outcomes, and HBM4EU aimed at the effects of individual chemicals while viewing them as proxies for complex exposures.
In sum, the results already reported are highly promising, yet also quite diverse in terms of study design, sample size, and novelty of insights. However, as first prime examples of how a deeper understanding of the chemical exposome and related health effects can be obtained, they represent a big step forward for exposure and health science. The projects were built on major collaborative efforts of various disciplines and PIs, creative ideas, and extensive EU third-party funding. They provide ample learning opportunities for future endeavors. It would be very valuable if the current exposome cohorts would in addition report insights into the major challenges they had to solve, be it on legal and ethical aspects, the design of studies and assessment methodology, or the harmonization and sharing of data.
Besides resolving the remaining conceptual challenges, a European strategy would be helpful that supports collaboration with stakeholders including regulatory and health authorities in order to implement major findings [27,93]. Acknowledging the need for additional coordinated efforts, the EU selected nine follow-up HORIZON 2020 Exposome projects for funding beginning January 2020. This initiative aims "to decipher the life-long impact of external and internal exposures on human health" (https://www.humanexposome.eu) under more specific settings and bundles the efforts in a joint cluster. With such joint efforts of science and policy, we could finally get closer to an understanding of health risks due to multiple environmental factors, in order to minimize the burden of disease, derive more effective preventive strategies, and inform future policies.

Complementing the exposome assessment

The exposome in relation to the chemical universe
An individual may be exposed to a vast number of different substances. An upper bound for this number is set by the chemical universe that summarizes all known and unknown chemicals (Fig. 2a). The Chemical Abstracts Service (CAS) Registry, as the most complete database of known chemicals, currently lists more than 163 million unique organic and inorganic chemical substances [4]. This CAS universe contains chemicals of anthropogenic and non-anthropogenic origin. However, a considerable fraction of the non-anthropogenic substances may not have been characterized so far. From an exposome perspective, many substances in the chemical universe are not relevant since they are not used or merchandised in sufficient amounts, never released to the environment, or are not of sufficient stability. A better proxy for the number of anthropogenic chemicals that are relevant for exposure considerations may be provided by the list of chemicals currently available on the market. According to the KEMI market list, which is compiled from regulatory databases, more than 30 thousand substances are expected to be available on the EU market [35]. The exposure-relevant chemical universe may however be considerably larger, due to non-anthropogenic substances and due to the products of biotic and abiotic transformation and degradation of chemicals.
The set of chemicals that comprises the chemical exposomes of individuals, i.e. the exposome's chemical universe, is as yet unknown (Fig. 2b). Furthermore, it harbors not only chemicals on the market and their transformation products but also endogenous molecules whose levels may change in response to environmental exposure or other stress stimuli. However, several datasets have been compiled that may serve as proxies for subsets of this universe. Stephen Rappaport and colleagues compiled a set of 1561 small molecules found in human blood from the Human Metabolome Database [111] and the U.S. National Health and Nutrition Examination Survey (NHANES) [19,79]. In a more comprehensive approach, based on text mining of PubMed abstracts and PMC full texts, Dinesh Barupal and Oliver Fiehn compiled a set of approximately 50 thousand chemicals, which they provide in a blood exposome database [8]. For urine, a similarly comprehensive analysis is missing. To obtain such an estimate, the Human Metabolome Database (HMDB) was queried using "Advanced search" with "Origin does not match endogenous" AND "Biofluid matches Urine" AND "Origin is present" to retrieve non-endogenous substances found in urine. This revealed 1455 unique substances. Other proxies, like the dermal exposome obtained using skin wipes or passive samplers, have been used repeatedly in exposome studies and may provide important contributions to the exposome universe (e.g. [42]). However, these initiatives on dermal exposure so far lack a broader chemical characterization. In contrast to the list of substances included in the exposome universe, the number of chemicals included in human biomonitoring projects is minute (Fig. 2c). NHANES, to our knowledge the most comprehensive monitoring program, investigated 319 substances in blood and/or urine, while HBM4EU has prioritized about 230 individual compounds, for which further information should be gathered in the project or in future human biomonitoring activities. Within HBM4EU, steps are being undertaken to increase the number of analyzed compounds based on the development and harmonization of screening approaches [75] and the compilation of a database of suspect chemicals potentially relevant for human biomonitoring, comprising 66,000 parent compounds and more than 300,000 in silico predicted metabolites [61]. Subsets of this database will be used to screen samples from cohort studies for these emerging contaminants to obtain a more comprehensive picture of chemical exposure and to prioritize compounds for further development of targeted HBM methods.
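The logic of the HMDB query described above can be sketched as a simple filter over already downloaded records. This is only an illustration of the three conditions combined with AND; it does not call any HMDB interface. The record layout (keys `id`, `origin`, `biofluids`), the function name, and the toy entries are assumptions made for this example, as is the reading that a record carrying any endogenous origin is excluded.

```python
def non_endogenous_in_urine(records):
    """Return the sorted unique ids of records that have an origin annotation,
    are not annotated as endogenous, and are reported in urine."""
    selected = set()
    for rec in records:
        origins = {o.strip().lower() for o in rec.get("origin", [])}
        biofluids = {b.strip().lower() for b in rec.get("biofluids", [])}
        if not origins:
            continue  # "Origin is present" fails
        if "endogenous" in origins:
            continue  # "Origin does not match endogenous" fails (assumed reading)
        if "urine" not in biofluids:
            continue  # "Biofluid matches Urine" fails
        selected.add(rec["id"])
    return sorted(selected)


# Toy records with hypothetical ids and annotations (not real HMDB entries).
records = [
    {"id": "X1", "origin": ["Food"], "biofluids": ["Urine", "Blood"]},
    {"id": "X2", "origin": ["Endogenous"], "biofluids": ["Urine"]},
    {"id": "X3", "origin": [], "biofluids": ["Urine"]},
    {"id": "X4", "origin": ["Drug"], "biofluids": ["Blood"]},
    {"id": "X5", "origin": ["Endogenous", "Food"], "biofluids": ["Urine"]},
    {"id": "X1", "origin": ["Food"], "biofluids": ["Urine"]},  # duplicate entry
]
print(non_endogenous_in_urine(records))
```

Collecting ids in a set mirrors the count of unique substances reported in the text; how substances with both endogenous and exogenous origins are treated would change the resulting number and should be stated with any such estimate.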

Complementing exposome characterization using in vitro assays
A potentially complementary approach to more comprehensively assess internal exposure would use in vitro bioassays. Here, chemicals in the extract from biosamples are quantified through the detection of mixture effects for selected, typically health-relevant biological responses [38]. This approach has not been used in the exposome studies referred to above.
Early work often targeted dioxin-like POPs, which act via activation of the aryl hydrocarbon receptor (AhR) [23]. There are many reporter gene assays available for the AhR and some, such as the AhR-CALUX, have specifically been adapted for blood analysis [65]. Environmental and blood levels of POPs could be associated with AhR activity [59]. If no acid clean-up is performed on blood extracts, naturally occurring AhR agonists cause the majority of the effect in AhR-CALUX, and the effects depend on the diet [20]. These assays have been widely used for the analysis of tissues [110]. AhR activity in plasma has also been associated with diverse adverse outcomes.
Many environmental pollutants may cause endocrine disruption. Estrogenic activity was detected in human serum as early as 1997 [90]. These early studies focused on persistent chemicals, so blood samples were solvent-extracted and processed using a sulfuric acid clean-up that digested not only the lipids but also non-persistent chemicals. More recent studies tried to identify xeno-estrogenic chemicals in serum with bioassays while differentiating them from endogenous hormones [12]. This method was applied to demonstrate estrogenic effects in serum from pregnant women, which were found to be associated with levels of perfluorinated compounds in these samples [13]. Estrogenic activity and AhR activity were also often associated with high levels of other chemicals detected in biomonitoring [38]. Activation of the androgen receptor has been used for doping tests in urine samples to detect steroid abuse [47]. The thyroid hormone system is yet another important nuclear receptor system affected by environmental pollutants [16], and binding to transthyretin has been used to identify activity in extracts from organism samples, here polar bear plasma [88].
More recently, the peroxisome proliferator-activated receptor (PPAR) has become of interest because its activation is related to lipid metabolism [41], and chemicals that activate PPAR may be considered obesogens [48]. Recently, a serum PPAR activity assay has been developed that quantifies total PPAR ligand activity in serum [28]. Additional nuclear receptors may be involved in metabolic disruption and many of these are accessible via in vitro testing [51].
With effect-directed analysis, the mixture in an extract can be separated and stepwise bioactive chemicals and mixtures can be identified. This approach is quite popular in water and sediment quality assessment [15] but has also been used to identify estrogenic chemicals in adipose tissue [33] or thyroid activity in plasma of polar bears [88].
High-throughput screening of chemicals in large numbers using mechanistic in vitro cellular bioassays has transformed chemical risk assessment in the USA [53], and all these tools have potential for application in biomonitoring to complement chemical analysis. Bioanalysis, as an advantage, captures the entire mixture of bioactive compounds and not only selected target analytes. As for chemical analysis, one of the biggest challenges remains the sample preparation and extraction. The goal is to be as comprehensive in the extraction as possible to ensure that fewer pollutants are overlooked. Less selective extraction, however, leads to coextraction of endogenous compounds and a lipid matrix that is typically present in much higher concentrations than the micropollutants [79]. Clean-up methods are continuously improved and alternative methods such as polymer-based passive sampling are explored for this purpose. Ideally, the same sample preparation is used for chemical analysis and bioassays to allow a direct comparison of the mixture composition and the mixture effects.
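One way such a direct comparison of mixture composition and mixture effect is commonly framed is through bioanalytical equivalent concentrations (BEQ): the effect expected from the quantified chemicals, assuming concentration addition, is set against the effect measured in the bioassay of the same extract. The sketch below illustrates this bookkeeping only; it is not taken from the reviewed projects, all function names, chemical names, and numbers are hypothetical, and it assumes that concentrations and the measured BEQ are expressed in the same units relative to the same reference compound.

```python
def beq_from_chemistry(concentrations, relative_potencies):
    """Sum concentration times relative potency over all quantified chemicals.

    Chemicals without a known relative potency are skipped, so the result
    is a lower bound on the chemically explained activity."""
    return sum(
        conc * relative_potencies[name]
        for name, conc in concentrations.items()
        if name in relative_potencies
    )


def fraction_explained(beq_chem, beq_bio):
    """Share of the bioassay-derived BEQ that the quantified chemicals explain."""
    if beq_bio <= 0:
        raise ValueError("bioassay-derived BEQ must be positive")
    return beq_chem / beq_bio


# Hypothetical extract: two quantified chemicals plus one without potency data.
concentrations = {"chem_a": 2.0, "chem_b": 10.0, "chem_c": 5.0}
relative_potencies = {"chem_a": 1.0, "chem_b": 0.1}  # relative to a reference compound
beq_chem = beq_from_chemistry(concentrations, relative_potencies)
beq_bio = 12.0  # hypothetical value measured in the bioassay of the same extract
print(beq_chem, fraction_explained(beq_chem, beq_bio))
```

A fraction well below one would point to bioactive compounds that the targeted analysis did not cover, which is the situation in which effect-directed analysis or non-target screening becomes informative.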

Conclusions and summary
To elucidate the role of a changing environment for the prevalence of diseases, the concept of the exposome was proposed. It strives to systematically account for the potentially large number of environmental factors that might impact human disease development. Four large collaborative EU projects are pioneering as proof-of-principle studies in order to identify environment-related health risks based on the exposome. They explored diverse means of describing individual exposure to a multitude of environmental stressors and in particular to a diversity of chemical substances of various origins. Furthermore, they elaborated means of associating multivariate exposure variables with health outcomes observed in various existing population-based European cohort studies.
In particular, the achieved progress comprised:
• Novel evidence that humans are simultaneously exposed to different environmental stressors including complex mixtures of chemicals;
• Novel evidence on health outcomes attributable to environmental stressors where humans are exposed as part of complex mixtures;
• Methodological advances to observe multiple exposures resolved in space and time, and link external and internal measurements on the level of individuals;
• Acknowledgement that chemical mixture exposure varies strongly in composition, and, therefore, assessment of individual exposomes seems adequate to identify and address relevant burdens of disease outcomes;
• Novel insights to identify risk drivers in complex mixtures and grouping of substances that jointly contribute to health risks by associating exposome and adverse health outcomes.
While the described EU projects have analyzed about 100 different substances or groups of chemicals, this provides neither a comprehensive picture of suspected exposures nor a common rationale for representative exposome proxy measures. Not surprisingly, substantial variation is found for different health outcomes, with previously suspected associations between chemical exposure and adverse health effects being the most convincing. Clearly, the means to address mixture effects are unique.

Research agenda
As the feasibility of the exposome concept has now been demonstrated at least for the part of chemical mixture exposures, future research could focus on more specific questions and tailor the observational efforts to characterize human exposure accordingly. Strategically, while more specificity in research perspective is due, collaborative efforts are at the same time timely to establish communication among the various follow-up projects at European and international level. A unique opportunity can be anticipated for the European human exposome network (https://www.humanexposome.eu/) that strives to cross-link the research projects in the exposome field that began in 2019. This network should strive to enable the pooling of cohorts for the improved statistical data basis needed to associate multiple mixtures and their potential interactions with adverse outcomes of lesser frequency.
Objectives for future research should, more specifically, comprise:
• To advance the systematics of chemical exposome assessment, i.e. tiered strategies to capture relevant exposure patterns are needed. These should employ research data methods to utilize different sources of available information to define target analytics for precise quantification of defined exposures and complement these with untargeted exposure screens to qualitatively detect patterns of emerging concern;
• For more aggregated exposome detection, complementary bioanalytical tools, such as in vitro assays and omics approaches, should be adopted and brought to high-throughput format. This would also help to link different components of the exposome like the anthropogenic pollutants and other stressors;
• For identification of risk drivers in exposome assessment, complementary chemical non-target screening methods in conjunction with effect-directed analysis and combined effect models need to be developed;
• To make progress with causality in environment-health relationships, dedicated efforts are needed to combine cohort studies with exposome assessment, e.g. allowing longitudinal and cross-sectional analyses to capture the dynamics of individual exposomes, as well as to adequately identify vulnerable subgroups;
• To obtain the relevant sample sizes and thereby sufficient statistical power, pooling of cohort data across European cohorts is inevitable; this, however, remains a major challenge as study participants need to consent to the respective use of their data, the data need to be harmonized to become fully usable, and legal hurdles of data sharing between countries have to be overcome; in sum, such efforts call for more collaborative research as well as anticipatory study designs and consideration of adequate participants' consent in order to address legal and ethical constraints;
• A structural objective to guarantee future and advanced data analysis is to accommodate high-end means that allow for open access within the research community. In particular, it will be vital to define standards that enable coupling of diverse cohort studies to allow detection of small effect sizes for complex exposures of varying composition.
Additional file 1: Table S1. Characteristics of the used cohort studies including study design, sample size, age, time of enrolment, and gender ratio.