
Effect of geographical parameters on PM10 pollution in European landscapes: a machine learning algorithm-based analysis

Abstract

Background

PM10, comprising particles with diameters of 10 µm or less, has been identified as a significant environmental pollutant associated with adverse health outcomes in European cities. Understanding the temporal variation of the relationship between PM10 and geographical parameters is crucial for sustainable land use planning and air quality management in European landscapes. This study utilizes Conditional Inference Forest modeling and partial correlation to examine the impact of geographical factors on monthly average concentrations of PM10 in European suburban and urban landscapes during heating and cooling periods. The investigation focuses on two circular buffer zones (radii of 1000 m and 3000 m) surrounding 1216 European air quality monitoring stations.

Results

Results reveal the importance of, and significant correlations between, various geographical variables (soil texture, land use, transportation network, and meteorological variables) and PM10 concentrations on a continental scale. In suburban landscapes, soil texture, temperature, roads, and rail density play pivotal roles, while meteorological variables, particularly monthly average temperature and wind speed, dominate in urban landscapes. Urban sites exhibit higher R-squared values during both cooling (0.41) and heating periods (0.61) compared to suburban sites (cooling period R-squared: 0.39; heating period R-squared: 0.51), indicating better predictive performance, likely attributable to the less heterogeneous land use patterns surrounding urban PM10 monitoring sites.

Conclusion

The study underscores the importance of investigating spatial and temporal dynamics of geographical factors for accurate PM10 air quality prediction models in European urban and suburban landscapes. These findings provide valuable insights for policymakers, urban planners, and environmental scientists, guiding efforts toward sustainable and healthier urban environments.

Introduction

Air quality is a critical concern with direct implications for public health, particularly in urban and suburban areas where various anthropogenic activities contribute to the dispersion of particulate matter (PM) [82]. Particulate matter is considered one of the most serious hazards for human health, the environment, and the climate on a global scale [28]. Despite considerable improvements in recent decades in many industrialized countries, PM pollution still causes thousands of premature deaths and increases in various pathologies each year in Europe [36]. PM10, comprising particles with diameters of 10 µm or less, has been identified as a significant environmental pollutant associated with adverse health outcomes [46, 47, 74]. In Europe, air pollution significantly impacts human health (e.g., reducing life expectancy) and the economy (e.g., increasing medical costs and reducing productivity).

Various factors influence the spatiotemporal variability of PM10 dynamics, including land use structures [57, 66], soil physical and chemical characteristics [60, 93], meteorological variables [11, 18, 32, 38, 45, 55, 97, 102], and the spatial characteristics of road and railway networks [17, 50, 84]. Additionally, soil physical characteristics, such as moisture and texture, affect PM10 dispersion and deposition patterns [60, 63, 89]. Climatological variables, including temperature, wind speed, and atmospheric stability, are crucial in dispersing and transforming PM10 in the atmosphere [3, 24].

Our previous study [83] examined the connection between different land use categories from the Urban Atlas Land Cover/Land Use 2018 [26] and the monthly average PM10 concentrations reported by the European Environmental Agency (EEA), finding significant seasonal variations between heating (cold seasons) and cooling (warm seasons) periods [49]. We hypothesized that the relationship between PM10 and geographical parameters varies significantly between these periods. Understanding the temporal variations in PM10 concentrations during both heating and cooling periods is critical for developing effective air quality management strategies, particularly in Europe’s dynamic urban and suburban landscapes [51, 67, 83]. Previous studies have primarily investigated correlations between influential factors, such as land use structures, climatological variables, or soil properties, and either PM10 or PM2.5 individually, typically on a local or regional scale [37, 39, 44, 51, 62, 73, 80, 86, 92, 98, 100].

Given the complex relationship between air pollution and geographical variables [5], comprehensive analyses and advanced modeling techniques are required [4, 13, 31, 37, 73, 78, 81, 88]. Machine learning algorithms, leveraging artificial intelligence to identify patterns in air pollution data, have shown promise for PM10 forecasting [1, 4, 21, 37].

This study applied a conditional inference random forest (CRF) model to estimate monthly average PM10 concentrations across European countries. Our objectives are to (1) estimate monthly average PM10 concentrations based on geographical parameters, (2) determine the importance of each independent geographical variable in predicting PM10 concentrations during the cooling and heating periods, and (3) examine the scale dependence of these variables' effects on PM10 levels using two buffer zone radii (1000 m and 3000 m) around air quality stations across European cities. This research is essential to understand the effects of geographical factors on the level of PM10 pollution. It opens possibilities for predicting PM10 levels and offers a foundation for assessing particulate matter exposure, particularly in developing countries and regions with limited PM10 monitoring stations. The findings will be crucial for future modeling efforts, enabling more accurate PM10 predictions and informing air quality management strategies that directly impact the health of billions of inhabitants in urban and suburban areas.

Materials and methods

Data

PM10 concentrations

The monthly average dataset of PM10 concentrations (in µg/m3) in 2018 is from the European AQ Portal (2020), including 1216 European AQ monitoring stations (1039 stations in urban areas and 177 stations in suburban areas) (Fig. 1). This AQ e-Reporting system was established by the European Commission and is run by the European Environmental Agency [29].

Fig. 1 Distribution of AQ monitoring stations in the study area (urban, suburban)

The stations are classified according to the old Exchange of Information Decision (EoI) 97/101/EC and the current Implementing Provisions on Reporting (IPR) 2011/850/EC, regarding the type of area (urban, suburban or rural) and also depending on the influence of the immediate surroundings (traffic, industrial or background). This classification is based mainly on spatial considerations like the degree of built-up areas, population distribution, or the influence of near sources [7, 33].

Land use/cover data

Urban areas in the Urban Atlas are typically characterized by higher-density built-up areas, such as residential, commercial, and industrial zones, along with infrastructure like roads and railways. These areas are often marked by a high degree of impervious surfaces and minimal green spaces. Suburban areas, on the other hand, are characterized by lower-density residential zones, mixed land use with more open spaces, and often include suburban agricultural lands. These areas have a mixture of built-up surfaces and more natural or semi-natural spaces, providing a transition between urban cores and rural areas.

The composition of the landscape structure (area of different land use categories) around the PM10 monitoring stations was calculated using the European Union Copernicus Land Monitoring Service’s LULC 2018 Urban Atlas database. The minimum mapping unit of this polygon-based LULC dataset is 0.25 hectares. It was published by the EEA within the framework of the Copernicus program on April 16, 2020 and revised on July 16, 2021 [26]. The aggregated Urban Atlas land use categories were employed in this study based on the findings of previous investigations [83] (Table 1).

Table 1 Main (aggregated) land cover categories based on the 2018 Urban Atlas LULC map [26], which are significantly correlated with the monthly average concentrations of PM10 [83]

Transportation network and soil data

The GEOFABRIK (OpenStreetMap database) road network dataset was utilized in our study (“Geofabrik”, 2022 [35]). Additionally, the rail network dataset for European territories was downloaded from the Euro-Global Map, published on July 27, 2018, and updated on January 14, 2019. The United States Department of Agriculture (USDA) soil textural class dataset, based on LUCAS topsoil data, was also utilized as qualitative data for European topsoil physical properties. This dataset includes soil texture categories such as clay, silty clay, silty clay-loam, sandy clay, sandy clay-loam, clay-loam, silt, silt-loam, loam, sand, loamy sand, and sandy loam [72], and was published by the European Soil Data Centre (ESDAC), European Commission and Joint Research Center in 2015 (Fig. 2).

Fig. 2 Map of soil textural classes [6]

Meteorological data

Monthly data for 2018 were obtained in NetCDF format from the Climate Data Store (CDS), which serves as the foundational infrastructure for the Copernicus Climate Change Service (C3S), to analyze meteorological conditions: monthly averaged 10 m wind speed (m s−1), mean sea level pressure (Pa), average temperature, and total precipitation (m) (Table 2). According to the ERA5 data documentation, the dataset was evaluated for usability and reliability by the Evaluation and Quality Control (EQC) function of C3S [40]. Figure 3 illustrates all the European-scale digital datasets utilized in this study.

Table 2 PC results between PM10 concentration and selected variables using Conditional Inference Forest analysis in suburban landscapes during the cooling period; A) 1000 m buffer zones, B) 3000 m buffer zones

Fig. 3 Flowchart of the datasets used for analysis. MA monthly average

GIS analysis

Based on our previous studies [83, 84], we selected the following variables as effective factors: the area of each aggregated Urban Atlas land cover category (km2) that has shown a significant correlation with the monthly average concentration of PM10, road and railway density (km/km2), distance from the nearest road and railway network (m), soil texture, and the monthly averages of meteorological variables, namely wind speed (m s−1), mean sea level pressure (Pa), total precipitation (mm), and mean temperature (°C). The selected independent variables were calculated or extracted within the 1000 m and 3000 m buffer zones, based on previous results on scale sensitivity [83]. ArcMap 10.6.1, QGIS 3.22, and ArcGIS Pro 2.8 were used to perform spatial analysis and mapping.
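
The density variables above are ratios of network length to buffer area. The following Python sketch illustrates that arithmetic for the two buffer radii. It assumes that the total road or rail length inside each buffer has already been obtained in the GIS step; the function names and the 12.5 km example length are hypothetical and are not taken from the study.

```python
import math


def buffer_area_km2(radius_m: float) -> float:
    """Area (km^2) of a circular buffer with the given radius in metres."""
    radius_km = radius_m / 1000.0
    return math.pi * radius_km ** 2


def line_density(total_length_km: float, radius_m: float) -> float:
    """Network density (km per km^2) inside a circular buffer.

    total_length_km is assumed to be the summed length of all road or
    rail segments clipped to the buffer (computed elsewhere, e.g. in a GIS).
    """
    return total_length_km / buffer_area_km2(radius_m)


if __name__ == "__main__":
    # Hypothetical example: 12.5 km of roads found inside each buffer.
    for radius in (1000, 3000):
        area = buffer_area_km2(radius)
        density = line_density(12.5, radius)
        print(f"{radius} m buffer: area = {area:.2f} km2, density = {density:.3f} km/km2")
```

Because the 3000 m buffer covers nine times the area of the 1000 m buffer, the same network length yields a density nine times lower, which is one reason the two scales can rank variables differently.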

Data preparation

As part of the data-cleaning procedures, missing values, no-data values, and outliers were identified and removed from all dependent and independent variables. Additionally, LULC classes with fewer than 25 occurrences within the specified buffer zones were excluded [12]. As demonstrated in our previous study [83], the relationship between PM10 concentrations and land use structures varies significantly between cooling and heating periods. In the context of heating degree days (HDDs) and cooling degree days (CDDs), the "base temperature" is a threshold outside temperature used to determine when a building requires heating or cooling. Base temperatures may be defined for a particular building as a function of the temperature that the building is heated to, or they may be defined for a country or region. Heating threshold temperatures range from 10 to 19 °C among European countries [56]. To have a reliable dataset, we excluded AQ stations in countries with different temperature thresholds (e.g., extremely cold), such as Denmark, Norway, Finland, Switzerland, and Sweden [85]. In total, this left 148 AQ stations in suburban landscapes and 970 AQ stations in urban landscapes. According to [48, 95, 103], the indoor temperature set point (ST) is often 26 °C in summer and 18 °C in winter under local building design codes and indoor thermal comfort standards in China. According to [68], the base indoor temperature in Europe is also 18 °C. Furthermore, the World Health Organization (WHO) suggests that 18 °C is a “safe and well-balanced indoor temperature to protect the health of general populations during cold seasons” [94]. The United States system states that the base temperature is 18.3 °C [2, 8, 69]. Therefore, we selected 18.3 °C as the European monthly average temperature threshold and divided the datasets between the heating period (monthly mean temperature below 18.3 °C) and the cooling period (monthly mean temperature above 18.3 °C).
For the last step, we split the final dataset according to the type of landscape (urban and suburban) to better understand and explain the results in different landscape structures.
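
The two splitting steps described in this section (by period, using the 18.3 °C threshold, and by landscape type) can be illustrated with a minimal Python sketch. The record layout, field names, and sample values are hypothetical. The text does not say how a month with a mean of exactly 18.3 °C is handled, so assigning it to the heating period here is an assumption.

```python
from collections import defaultdict

BASE_TEMP_C = 18.3  # monthly mean temperature threshold stated in the text


def period(monthly_mean_temp_c: float) -> str:
    """Label a station-month as 'cooling' (above threshold) or 'heating'.

    Assumption: a value exactly equal to the threshold counts as 'heating'.
    """
    return "cooling" if monthly_mean_temp_c > BASE_TEMP_C else "heating"


def split_records(records):
    """Group station-month records by (landscape type, period)."""
    groups = defaultdict(list)
    for rec in records:
        groups[(rec["landscape"], period(rec["temp_c"]))].append(rec)
    return dict(groups)


# Hypothetical station-month records (values are illustrative only).
records = [
    {"station": "A", "landscape": "urban", "month": 1, "temp_c": 2.4, "pm10": 31.0},
    {"station": "A", "landscape": "urban", "month": 7, "temp_c": 22.1, "pm10": 18.5},
    {"station": "B", "landscape": "suburban", "month": 1, "temp_c": 4.0, "pm10": 27.2},
    {"station": "B", "landscape": "suburban", "month": 8, "temp_c": 19.0, "pm10": 16.9},
]

if __name__ == "__main__":
    for key, recs in sorted(split_records(records).items()):
        print(key, len(recs))
```

Each of the four resulting groups would then be modeled separately, once per buffer radius.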

Statistical modeling

This study used a conditional inference regression random forest model (CRF), a robust ensemble learning technique recognized for its predictive capabilities and resistance to overfitting. The CRF model is suitable for analyzing the impact of both quantitative independent variables and qualitative variables, such as soil texture categories, on monthly average PM10 values. CRFs are effective when predictors are highly intercorrelated or when models have numeric response variables, and they can handle relationships between response variables and predictors at any measurement scale [61].

Cforest is an implementation of the random forest and bagging ensemble algorithms utilizing conditional inference trees as base learners. While measuring variable importance in random forests is common for variable selection, it becomes unreliable when predictor variables differ in scale or category number. Cforest offers options not available in traditional random forests, such as fitting forests to censored, multivariate, and ordered responses [42]. Additionally, when predictors (geographical factors) vary in measurement scale or category number, variable selection and importance computation in random forests are biased towards variables with many potential cut points. In contrast, cforest uses unbiased trees and an appropriate resampling scheme by default [43, 87].

Both random forest and cforest use random subsets of input data for recursive partitioning to develop multiple classification or regression trees, but they differ in determining predictor variable importance. Random forest does not accurately account for correlation among predictors, leading to overestimating the significance of highly correlated predictors. Cforest addresses this issue with a conditional permutation-importance measure. Moreover, a random forest requires normally distributed continuous dependent variables [58], which was not the case in this study. Our dependent variable, monthly average PM10 concentration, was continuous but not normally distributed, and we had soil texture as a categorical predictor.

Conditional inference trees (CITs) and conditional random forests (CRFs) allow researchers to model relationships between numeric or categorical response variables (PM10) and various predictors (geographical factors), especially when parametric methods are problematic due to complex interactions, non-linearity, and correlated predictors [61]. Given these considerations, we chose the cforest algorithm within the "party" package in R to determine the most important predictor variables [23].

Model evaluation was conducted using a combination of train-test splitting and cross-validation through the following steps: (1) split the dataset into 70% training and 30% testing data; (2) perform a grid search on the training set using cross-validation to select the optimal 'ntrees' and 'mtry' parameters; (3) train the model using k-fold cross-validation and identify the best model; (4) retrain the final model on the entire training dataset using the best hyperparameters [78]; (5) use the final model to make predictions on the test dataset; and (6) evaluate the model's performance on the test set using the RMSE, MAE, and R2 metrics. The root mean square error (RMSE) measures the average magnitude of errors between predicted and observed values [16] and is commonly used in regression-based machine learning models to evaluate predictive performance [52]. The mean absolute error (MAE) shows the distance of the predicted values from the observed values [77], and the coefficient of determination (R-squared or R2) gives the proportion of variance in the dependent variable that can be explained by the independent variables [14]. As a final step, to determine the parameters related to PM10 air pollution, we analyzed each variable's importance as a percentage, indicating its contribution to the model’s predictive accuracy. Conditional inference forests provided variable importance scores, where the sign (positive/negative) indicates whether a variable's presence improves or degrades model efficiency [23]. Partial correlation analysis was performed in RStudio using the Spearman method to assess nonparametric correlations. This analysis explored relationships between PM10 and variables with an importance of 5% or higher in PM10 prediction accuracy. The "pcor" function calculates pairwise partial correlations of each variable pair, providing p-values and statistics [54]. The flowchart of our research process is shown in Fig. 4.
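
As an illustration of the three test-set metrics named above, the following Python sketch computes RMSE, MAE, and R2 from paired observed and predicted values. It is not the authors' R code. R2 is assumed here to be defined as 1 − SS_res/SS_tot; other definitions exist (for example, the squared correlation between observed and predicted values), and the text does not state which one was used. The sample values are hypothetical.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def mae(observed, predicted):
    """Mean absolute error between observed and predicted values."""
    n = len(observed)
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / n


def r_squared(observed, predicted):
    """Coefficient of determination, assumed to be 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Hypothetical monthly PM10 values (ug/m3) for a held-out test set.
    obs = [10.0, 20.0, 30.0, 40.0]
    pred = [12.0, 18.0, 33.0, 39.0]
    print(f"RMSE = {rmse(obs, pred):.3f}")
    print(f"MAE  = {mae(obs, pred):.3f}")
    print(f"R2   = {r_squared(obs, pred):.3f}")
```

RMSE and MAE are in the units of the response (µg/m3), so they can be compared across buffer sizes for the same subset, whereas R2 is unitless and is the more natural metric for comparing urban and suburban models.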

Fig. 4 Flowchart of the research process
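
To make the Spearman partial correlation step from the statistical modeling section concrete, the sketch below computes a first-order partial rank correlation between two variables while controlling for a single third variable. This is a simplified three-variable case written in plain Python. It is not the "pcor" function used in the study, which controls for all remaining variables at once and also reports p-values. All function names and sample data are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def partial_spearman(x, y, z):
    """Spearman correlation of x and y after removing the effect of z."""
    r_xy, r_xz, r_yz = spearman(x, y), spearman(x, z), spearman(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


if __name__ == "__main__":
    # Hypothetical values: PM10, temperature, and wind speed as the control.
    pm10 = [31.0, 27.5, 22.0, 18.4, 16.9]
    temp = [2.4, 6.1, 11.3, 17.8, 21.0]
    wind = [2.1, 3.4, 2.9, 3.8, 3.1]
    print(round(partial_spearman(pm10, temp, wind), 3))
```

Working on ranks rather than raw values is what makes the measure suitable for a response that is not normally distributed, as is the case for monthly average PM10 here.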

Results

Impact of geographical variables on PM10 concentration in suburban landscapes

The variable importance analysis for suburban landscapes during the cooling period covers two scenarios, buffer zones of 1000 m and 3000 m, and expresses the importance of each variable as a percentage, indicating its contribution to the model’s predictive accuracy (Fig. 5). Within a 1000 m radius, soil texture emerges as the most influential variable, with an importance of 24.79%. Temperature follows at 14.45%, then roads land use area at 14.1% and road density at 11.36%. Within a 3000 m radius, soil texture is again the dominant variable (35.62%). Railway density (in m/km2) ranks second (14.5%).

Fig. 5 Importance of independent variables based on the conditional inference regression random forest model in suburban landscapes. Cooling period (1000 m and 3000 m buffer zones), and heating period (1000 m and 3000 m buffer zones)

Within the 3000 m buffer zones, roads land use area (9.23%) ranks third in importance. Atmospheric conditions, represented by the monthly average wind speed (8.58%) and temperature (8.23%), are the fourth and fifth most influential variables, respectively. Urban parks have the least impact on PM10 predictions, contributing only 1.9% (Fig. 7). Table 3 presents the performance statistics of the Conditional Inference Forest (CRF) models for the monthly average concentration of PM10 at 148 AQ stations in suburban European landscapes. For suburban 1000 m, the RMSE is 4.83, the R-squared is 0.36, and the MAE is 3.48, reflecting a reasonable but not exceptionally accurate predictive performance. In comparison, the suburban 3000 m model has a slightly higher RMSE (5.01) and MAE (3.63), but also a slightly higher R-squared (0.39).

Table 3 Partial correlation results between PM10 concentration and selected variables using Conditional Inference Forest analysis in suburban landscapes during the heating period; C) 1000 m buffer zones, D) 3000 m buffer zones

The variable importance analysis conducted for suburban landscapes under two different scenarios during the heating period within 1000 m and 3000 m buffer zones is presented in Fig. 5. Within a 1000 m radius, temperature emerges as the most influential factor, with an importance of 17.06%. This is followed by total precipitation at 12.24%. The area of built-up land use is third in importance at 10.57%. Conversely, water cover has the least importance (0.79%) in affecting PM10 levels during the heating period (Fig. 7).

Our analysis of geographical variables affecting monthly average PM10 concentrations during the heating period in suburban landscapes within a 3000 m radius reveals the key influencing factors (Fig. 8). Temperature is the most significant variable, contributing 15.83% to PM10 levels. Total precipitation follows at 13.09%, and soil texture is the third most important factor at 10.56%. In contrast, railway density (0.99%) and the industrial unit land use category (0.96%) have relatively minor impacts on PM10 levels during the heating period in suburban areas within a 3000 m radius.

The model's performance evaluation for suburban landscapes within 1000 m and 3000 m during the heating period is summarized in Table 4. The R2 values indicate a moderately good fit of the models to the observed data, with an R2 of 0.52 for suburban 1000 m and an R2 of 0.51 for suburban 3000 m.

Table 4 Conditional Inference Forest performance statistics for PM10 AQ monitoring sites in European urban landscapes during the cooling period

During the cooling period in suburban landscapes, the 1000 m buffer zones show a significant positive relationship between PM10 concentration and both temperature and built-up areas, suggesting that higher temperatures and more developed areas are associated with higher PM10 levels. In the 3000 m buffer zones, a significant positive relationship is found between PM10 concentration and rail density, roads, and temperature (Table 5). In contrast, a significant negative correlation between PM10 concentration and wind speed indicates that higher wind speeds correspond to lower PM10 levels.

Table 5 Conditional Inference Forest performance statistics for PM10 AQ monitoring sites in European urban landscapes during the heating period

The partial correlation (PC) analysis of the 1000 m buffer zones in suburban landscapes during the heating period shows a significant negative correlation between PM10 concentration and both temperature and total precipitation, suggesting that higher temperatures and greater precipitation levels are associated with lower PM10 concentrations. In contrast, a positive correlation exists between PM10 concentration and the extent of built-up areas, wind speed, mining, dumping and construction sites, and rail density. Similar results are observed in the 3000 m buffer zones, but with notable differences: a significant negative correlation is found between PM10 concentration and both the urban park area and the distance to railways within the 3000 m radius surrounding the PM10 monitoring stations (Table 6).

Table 6 Partial correlation results between PM10 concentration and selected variables using Conditional Inference Forest analysis in urban landscapes during the cooling period

Impact of geographical variables on PM10 concentration in urban landscapes

Our analysis of variable importance during the cooling period in urban areas within a 1000 m radius reveals that soil texture is the most influential variable, accounting for 20.75% of the importance. Roads are the second most significant at 11.77%, followed closely by temperature at 10.26%. When extending the analysis to a 3000 m radius, forests emerge as the most influential variable, with substantial importance of 15.44%, followed by soil texture at 12.83%. Other notable variables include vacant land (8.25%), 10 m wind speed (7.49%), and temperature (7.04%), highlighting the impact of open spaces and meteorological conditions on air quality. The least impactful variables are mining, dumping, and construction sites (3.9%) and road density (3.02%) (Fig. 9).

For the cooling period in the 1000 m urban buffer zones, the model validation results show an RMSE of 4.84, an R-squared value of 0.36, and an MAE of 3.58, indicating average to low accuracy in predicting PM10 concentrations. In urban landscapes with 3000 m buffer zones, a slightly lower RMSE of 4.63 suggests improved predictive accuracy in this larger buffer zone. The R-squared value of 0.41 indicates a moderate fit of the model to the observed data, while a reduced MAE of 3.35 signifies a decrease in the average absolute errors compared to the 1000 m buffer zones (Table 7).

Table 7 Partial correlation results between PM10 concentration and selected variables using Conditional Inference Forest analysis in urban landscapes during the heating period

During the heating period in urban areas within a 1000 m radius, temperature is the most influential factor, with an importance of 26.33%. Wind speed follows closely, contributing 20.43%, and total precipitation accounts for 12.76%. The least influential variables are water cover (0.89%) and mining, dumping, and construction sites (0.86%), indicating their limited impact on PM10 concentrations during the heating period in these urban landscapes (Fig. 6).

Fig. 6 Importance of variables based on the conditional inference regression random forest model in urban landscapes. Cooling period (1000 m and 3000 m buffer zones) and heating period (1000 m and 3000 m buffer zones)

During the heating period in urban areas within a 3000 m buffer zone, temperature is the most significant factor (24.02%), followed by wind speed (21.28%) and total precipitation (12.79%). In contrast, surface water, mining, dumping and construction sites, and railway density are less influential, with each contributing less than 1% (Fig. 10).

The model validation outcomes for the heating period within the 1000 m buffer zones indicate an RMSE of 6.83, an R-squared value of 0.57, and an MAE of 4.92. Transitioning to urban 3000 m, the model shows improved performance with an RMSE of 6.64, an R-squared value of 0.61, and an MAE of 4.7 (Table 8).

The partial correlation analysis for the 1000 m buffer zones during the cooling period shows significant positive correlations between PM10 concentration and both road land use area and temperature, while forest areas and total precipitation are significantly negatively correlated with PM10 (Table 9). In the 3000 m buffer zones, PM10 concentration is significantly negatively correlated with forests, wind speed, and total precipitation, but positively correlated with temperature and vacant lands.

During the heating period in urban landscapes, PM10 concentration is significantly negatively correlated with temperature, wind speed, and total precipitation. Similar patterns are observed in the 3000 m buffer zones, where PM10 concentration also shows significant negative correlations with temperature, wind speed, and total precipitation (Table 10). Additionally, there is a significant positive correlation between PM10 concentration and arable land.

Discussion

Temperature and total precipitation are the most important factors during the heating period in suburban landscapes' buffer zones, reflecting the crucial role of weather conditions in determining PM10 concentrations. Our results for the heating period confirm those of [11]: a decline in temperature increases PM10 in the air around built-up urban areas because heating intensity increases. Furthermore, colder temperatures and stable air conditions exacerbate particle pollution, as shown by [30, 70, 96]. The dust-fixing capacity of precipitation is also well supported by previous studies [65, 76].

The negative correlation between PM10 concentration and total precipitation suggests that precipitation may help to clean the atmosphere by scavenging PM10 particles from the air [76]. Higher temperatures, stronger winds, and increased precipitation are associated with lower PM10 concentrations, probably due to improved dispersion and removal of pollutants. In the winter months, by contrast, pollutants from natural and anthropogenic sources are trapped in the boundary layer by frequent temperature inversions [59]; lower average wind speeds, lower temperatures, and a lack of precipitation reduce surface vertical mixing, resulting in limited dilution and dispersion [11, 22]. However, according to [91], the negative correlations between PM10 and both wind speed and precipitation become weaker during warm seasons, probably due to secondary aerosol formation and enhanced soil dust resuspension. In summer, PM10 production is mainly related to secondary inorganic production [15]. On the contrary, [10] found that temperature had no statistically significant effect on PM10 concentrations, while wind speed was negatively correlated with PM10, similar to the studies [11, 25, 65]. However, this result contradicts the findings of [76], who found that annual PM10 emissions decreased from 1958 to 2018 and that this trend was significantly associated with the decrease in wind speed.

Based on our results, the correlation between temperature and PM10 concentration follows the same seasonal pattern in both urban and suburban areas. During the warmer months (cooling period), the correlation is significantly positive: an increase in temperature is associated with increased PM10 pollution. On the contrary, during the colder months (heating period), the correlation is negative, with decreasing temperatures associated with higher PM10 levels. These temporal changes are observed because, in the warm months, higher temperatures can enhance PM10 formation, while in the cold months, increased heating activities in both urban and suburban areas lead to more PM10 emissions.

The variable importance results show that roads are the main source of PM10 pollution in summer (cooling period), while built-up areas are the main cause of accelerated PM10 pollution, particularly in winter (heating period) [83]. However, there is a negative correlation between road density and monthly average PM10 concentration in suburban landscapes during both cooling and heating periods, consistent with our previous study [84], and we did not find a significant correlation between road density and PM10 values in urban landscapes. Several factors might explain this. For instance, in urban areas where PM10 sources are well known, such as industrial zones or built-up areas near air quality (AQ) immission measurement points, secondary winds generated by vehicles can help reduce PM10 concentrations. This is supported by the cleaning effect of wind speed observed in our current study. Furthermore, in locations with significant industrial pollutants, the influence of road network density on PM10 concentration could be overshadowed. PM10 levels could be more affected by the types and sizes of industrial estates or built-up areas within a 3000 m radius around the AQ point than by the density of the road network [9, 84].

On the other hand, suburban landscapes typically have a lower road network density than urban areas. This difference in transport infrastructure can lead to lower PM10 concentrations from traffic sources in suburban areas, especially in areas served by European public transport systems. In addition, urban landscapes experience the heat island effect, where higher temperatures due to dense buildings and less vegetation can enhance PM10 formation [34, 67]. Because of the urban heat island, road networks often act as ventilation corridors that channel cool winds from rural areas into the city center.

The negative correlation found between PM10 concentration and forests underscores the potential role of green spaces in mitigating air pollution by acting as filters that absorb and trap particulate matter. Therefore, designing a configuration of planted spaces that facilitates ventilation may be a more effective approach to achieving the PM purification function of vegetation [19, 20]. The high cleaning effect of forest land use, particularly during the vegetation period (which roughly coincides with the cooling season), has been reported by other studies [19, 41, 99]. In contrast, the positive correlation between PM10 concentration and vacant land suggests that areas with more vacant land could exhibit higher PM10 concentrations [76, 83].

The positive correlation between PM10 concentration and arable land during this period can also be attributed to agricultural activities, such as emissions from neighboring agricultural land (e.g., burning crop straw in fields after harvest), tilling and harvesting, which generate PM10 emissions and leave fields without vegetation cover after harvesting and in the winter [75, 79]. Farming practices should therefore be environmentally friendly and sustainable for the landscape [53]. Additionally, the percentages of the soil texture classes (sand, silt and clay) are important factors that affect soil erosion and dust emission [27, 71], particularly during the warm seasons. This aligns with our findings, which identified soil texture as the most significant factor in predicting PM10 levels in both suburban and urban areas during the cooling period. A rise in temperature increases PM10 concentrations because the topsoil dries out during the drier summer and, without enough soil moisture to bind them, soil grains are moved more easily by the wind [76].

A limitation of our study is the exclusion of AQ stations that are not covered by the Urban Atlas 2018 land cover map, along with the lack of data on traffic density. Additionally, the models reach only moderate precision, with R-squared values between 0.36 and 0.61, although this is comparable to some related prior studies [64, 90, 101]. Nonetheless, these values highlight the need for improvements in future research. As a next step, we will analyze landscape metrics around AQ stations within the target buffer zones and evaluate the impact of landscape patterns on PM10 quality. Following this, we will develop a highly accurate PM10 prediction model incorporating only the geographical variables from our current study and the landscape metrics that demonstrate a significant correlation with PM10 concentrations.
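For readers who want to compute a goodness-of-fit figure of this kind, a common definition of R-squared is one minus the ratio of the residual sum of squares to the total sum of squares. The sketch below assumes that definition; the text does not state exactly how its R-squared values were computed, and the observed and predicted values are made up for illustration.

```python
import numpy as np

def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    ss_res = np.sum((observed - predicted) ** 2)        # unexplained variation
    ss_tot = np.sum((observed - observed.mean()) ** 2)  # total variation
    return float(1.0 - ss_res / ss_tot)

# Hypothetical monthly PM10 values (µg/m³) and model predictions
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 37.0]

r2 = r_squared(observed, predicted)
print(f"R-squared: {r2:.3f}")
```

Under this definition, a value of 0.61 means the model accounts for 61% of the variance in the observed concentrations, which is why values in the 0.36 to 0.61 range are described as moderate.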

Conclusions

According to our findings, geographical variables (soil, climatic, etc.) affect PM10 concentrations with fundamentally different weights, and often with different signs (e.g., monthly average temperature and road density), during and outside the heating season. Based on our results, PM10 concentrations can be predicted once the heating and non-heating periods are separated and different input geographical variables are considered for each period. Our research makes an essential contribution to the estimation of PM10 pollution in areas where soil, climatic and land use data are available, but the network of PM10 immission monitoring stations is very sparse. This research provides valuable information to policymakers, urban planners, and environmental scientists for formulating targeted strategies to predict and mitigate the impact of PM10 pollution.

Data availability

Data are provided within the manuscript or supplementary information files.

References

  1. Abbey DE, Nishino N, McDonnell WF, Burchette RJ, Knutsen SF, Beeson WL, Yang JX (1999) Long-term inhalable particles and other air pollutants related to mortality in nonsmokers. Am J Respir Crit Care Med 159:373–382. https://doi.org/10.1164/ajrccm.159.2.9806020

  2. Al-Hadhrami LM (2013) Comprehensive review of cooling and heating degree days characteristics over Kingdom of Saudi Arabia. Renew Sustain Energy Rev 27:305–314. https://doi.org/10.1016/j.rser.2013.04.034

  3. Alizadeh-Choobari O, Bidokhti AA, Ghafarian P, Najafi MS (2016) Temporal and spatial variations of particulate matter and gaseous pollutants in the urban area of Tehran. Atmos Environ 141:443–453. https://doi.org/10.1016/j.atmosenv.2016.07.003

  4. Analitis A, Barratt B, Green D, Beddows A, Samoli E, Schwartz J, Katsouyanni K (2020) Prediction of PM2.5 concentrations at the locations of monitoring sites measuring PM10 and NOx, using generalized additive models and machine learning methods: a case study in London. Atmos Environ 240:117757. https://doi.org/10.1016/j.atmosenv.2020.117757

  5. Bai L, Wang J, Ma X, Lu H (2018) Air pollution forecasts: an overview. Int J Environ Res Public Health 15:1–44. https://doi.org/10.3390/ijerph15040780

  6. Ballabio C, Panagos P, Montanarella L (2016) Mapping topsoil physical properties at European scale using the LUCAS database. Geoderma 261:110–123. https://doi.org/10.1016/j.geoderma.2015.07.006

  7. Barrero MA, Orza JAG, Cabello M, Cantón L (2015) Categorisation of air quality monitoring stations by evaluation of PM10 variability. Sci Total Environ 524–525:225–236. https://doi.org/10.1016/j.scitotenv.2015.03.138

  8. Baumert K, Selman M. Heating and cooling degree days. WRI World Resour Inst. 1–12.

  9. Belis CA, Karagulian F, Larsen BR, Hopke PK (2013) Critical review and meta-analysis of ambient particulate matter source apportionment using receptor models in Europe. Atmos Environ 69:94–108. https://doi.org/10.1016/j.atmosenv.2012.11.009

  10. Birim NG, Turhan C, Atalay AS, Gokcen Akkurt G (2023) The influence of meteorological parameters on PM10: a statistical analysis of an urban and rural environment in Izmir/Türkiye. Atmosphere (Basel). https://doi.org/10.3390/atmos14030421

  11. Bodor Z, Bodor K, Keresztesi Á, Szép R (2020) Major air pollutants seasonal variation analysis and long-range transport of PM10 in an urban environment with specific climate condition in Transylvania (Romania). Environ Sci Pollut Res 27:38181–38199. https://doi.org/10.1007/s11356-020-09838-2

  12. Bonett DG, Wright TA (2000) Sample size requirements for estimating Pearson, Kendall and Spearman correlations. Psychometrika 65:23–28. https://doi.org/10.1007/BF02294183

  13. Bui DH, Mucsi L (2022) Predicting the future land-use change and evaluating the change in landscape pattern in Binh Duong province, Vietnam. Hungarian Geogr Bull. 71:349–364

  14. Cameron AC, Windmeijer FAG (1997) An R-squared measure of goodness of fit for some common nonlinear regression models. J Econom 77:329–342. https://doi.org/10.1016/s0304-4076(96)01818-0

  15. Carnevale C, Pisoni E, Volta M (2010) A non-linear analysis to detect the origin of PM10 concentrations in Northern Italy. Sci Total Environ 409:182–191. https://doi.org/10.1016/j.scitotenv.2010.09.038

  16. Chai T, Draxler RR (2014) Root mean square error (RMSE) or mean absolute error (MAE)?—Arguments against avoiding RMSE in the literature. Geosci Model Dev 7:1247–1250. https://doi.org/10.5194/gmd-7-1247-2014

  17. Chang X, Huang X, Jiang X, Xiao R. Impacts of transportation networks on the landscape patterns—a case study of Shanghai. 2022; 1–13.

  18. Cheewinsiriwat P, Duangyiwa C, Sukitpaneenit M, Stettler MEJ (2022) Influence of land use and meteorological factors on PM2.5 and PM10 concentrations in Bangkok, Thailand. Sustain. https://doi.org/10.3390/su14095367

  19. Chen L, Liu C, Zou R, Yang M, Zhang Z (2016) Experimental examination of effectiveness of vegetation as bio-filter of particulate matters in the urban environment. Environ Pollut 208:198–208. https://doi.org/10.1016/j.envpol.2015.09.006

  20. Chen X, Pei T, Zhou Z, Teng M, He L, Luo M, Liu X (2015) Efficiency differences of roadside greenbelts with three configurations in removing coarse particles (PM10): a street scale investigation in Wuhan, China. Urban For Urban Green 14:354–360. https://doi.org/10.1016/j.ufug.2015.02.013

  21. Choubin B, Abdolshahnejad M, Moradi E, Querol X, Mosavi A, Shamshirband S, Ghamisi P (2020) Spatial hazard assessment of the PM10 using machine learning models in Barcelona, Spain. Sci Total Environ 701:134474. https://doi.org/10.1016/j.scitotenv.2019.134474

  22. Cichowicz R, Wielgosi G, Fetter W (2020) Effect of wind speed on the level of particulate matter PM10 concentration in atmospheric air during winter season in vicinity of large combustion plant. J Atmos Chem 77:35–48

  23. Das A, Abdel-Aty M, Pande A (2009) Using conditional inference forests to identify the factors affecting crash severity on arterial corridors. J Safety Res 40:317–327. https://doi.org/10.1016/j.jsr.2009.05.003

  24. Diapouli E, Manousakas M, Vratolis S, Vasilatou V, Maggos T, Saraga D, Grigoratos T, Argyropoulos G, Voutsa D, Samara C, Eleftheriadis K (2017) Evolution of air pollution source contributions over one decade, derived by PM10 and PM2.5 source apportionment in two metropolitan urban areas in Greece. Atmos Environ 164:416–430. https://doi.org/10.1016/j.atmosenv.2017.06.016

  25. Dung NA, Son DH, Hanh NTD, Tri DQ (2019) Effect of Meteorological Factors on PM10 Concentration in Hanoi, Vietnam. J Geosci Environ Prot 07:138–150. https://doi.org/10.4236/gep.2019.711010

  26. EEA. 2021. Urban Atlas 2018 [WWW Document]. Eur. Environ. Agency. https://doi.org/10.2909/fb4dffa1-6ceb-4cc0-8372-1ed354c285e6.

  27. Enayatizamir N, Landi A, Ghafari H, Mokfi M (2022) Wind erodibility and dust (PM10) emission control in two different soil textures using microbial inoculation and sugarcane bagasse application. Arab J Geosci. https://doi.org/10.1007/s12517-022-10363-4

  28. Engelbrecht JP, Derbyshire E (2010) Airborne mineral dust. Elements 6:241–246. https://doi.org/10.2113/gselements.6.4.241

  29. European Environment Agency (2019) Sources and emissions of air pollutants in Europe. Air Qual Eur 2021:1–15

  30. Fan S, Zhang M, Li Y, Li K, Dong L (2021) Impacts of composition and canopy characteristics of plant communities on microclimate and airborne particles in Beijing, China. Sustain. https://doi.org/10.3390/su13094791

  31. Farmonov N, Amankulova K, Khan SN, Abdurakhimova M, Szatmári J, Khabiba T, Makhliyo R, Khodicha M, Mucsi L (2023) Effectiveness of machine learning and deep learning models at county-level soybean yield forecasting. Hungarian Geogr Bull. 72:383–398. https://doi.org/10.15201/hungeobull.72.4.4

  32. Ferenczi Z, Imre K, Lakatos M, Molnár Á, Bozó L, Homolya E, Gelencsér A (2021) Long-term characterization of urban PM10 in Hungary. Aerosol Air Qual Res. https://doi.org/10.4209/AAQR.210048

  33. Flemming J, Stern R, Yamartino RJ (2005) A new air quality regime classification scheme for O3, NO2, SO2 and PM10 observations sites. Atmos Environ 39:6121–6129. https://doi.org/10.1016/j.atmosenv.2005.06.039

  34. Gál T, Skarbit N, Unger J (2016) Urban heat island patterns and their dynamics based on an urban climate measurement network. Hungarian Geogr Bull. 65:105–116. https://doi.org/10.15201/hungeobull.65.2.2

  35. Geofabrik [WWW Document], 2022. https://download.geofabrik.de/europe.html.

  36. Gozzi F, Della VG, Marcelli A, Lucci F. 2017. 201-JMES-2584-Gozzi 8, 1901–1909.

  37. Grange SK, Carslaw DC, Lewis AC, Boleti E, Hueglin C (2018) Random forest meteorological normalisation models for Swiss PM10 trend analysis. Atmos Chem Phys 18:6223–6239. https://doi.org/10.5194/acp-18-6223-2018

  38. György V (2012) Spatio-temporal distribution of dust storms–a global coverage using NASA TOMS aerosol measurements. Hungarian Geogr Bull 61:275–298

  39. Halim NDA, Latif MT, Mohamed AF, Maulud KNA, Idrus S, Azhari A, Othman M, Sofwan NM (2020) Spatial assessment of land use impact on air quality in mega urban regions, Malaysia. Sustain Cities Soc 63:102436. https://doi.org/10.1016/j.scs.2020.102436

  40. Hersbach H, Bell B, Berrisford P, Biavati G, Horányi A, Muñoz Sabater J, Nicolas J, Peubey C, Radu R, Rozum I, Schepers D, Simmons A, Soci C, Dee D, Thépaut J-N. 2023. ERA5 monthly averaged data on single levels from 1940 to present [WWW Document]. Copernicus Clim Chang Serv Clim Data Store. https://doi.org/10.24381/cds.f17050d7

  41. Hofman J, Bartholomeus H, Janssen S, Calders K, Wuyts K, Van Wittenberghe S, Samson R (2016) Influence of tree crown characteristics on the local PM10 distribution inside an urban street canyon in Antwerp (Belgium): a model and experimental approach. Urban For Urban Green 20:265–276. https://doi.org/10.1016/j.ufug.2016.09.013

  42. Hothorn T, Bühlmann P, Dudoit S, Molinaro A, Van Der Laan MJ (2006) Survival ensembles. Biostatistics 7:355–373. https://doi.org/10.1093/biostatistics/kxj011

  43. Hothorn T, Hornik K, Zeileis A (2006) Unbiased recursive partitioning: a conditional inference framework. J Comput Graph Stat 15:651–674. https://doi.org/10.1198/106186006X133933

  44. Hu H, Chen Q, Qian Q, Lin C, Chen Y, Tian W (2021) Impacts of traffic and street characteristics on the exposure of cycling commuters to PM2.5 and PM10 in urban street environments. Build Environ 188:107476. https://doi.org/10.1016/j.buildenv.2020.107476

  45. Ivanovski M, Alatič K, Urbancl D, Simonič M, Goričanec D, Vončina R (2023) Assessment of air pollution in different areas (urban, suburban, and rural) in Slovenia from 2017 to 2021. Atmosphere (Basel). https://doi.org/10.3390/atmos14030578

  46. Jaafari S, Shabani AA, Moeinaddini M, Danehkar A, Sakieh Y (2020) Applying landscape metrics and structural equation modeling to predict the effect of urban green space on air pollution and respiratory mortality in Tehran. Environ Monit Assess. https://doi.org/10.1007/s10661-020-08377-0

  47. Janssen NAH, Fischer P, Marra M, Ameling C, Cassee FR (2013) Short-term effects of PM2.5, PM10 and PM2.5-10 on daily mortality in the Netherlands. Sci Total Environ 463–464:20–26. https://doi.org/10.1016/j.scitotenv.2013.05.062

  48. Jin Z, Zheng Y, Zhang Y (2023) A novel method for building air conditioning energy saving potential pre-estimation based on thermodynamic perfection index for space cooling. J Asian Archit Build Eng 22:2348–2364. https://doi.org/10.1080/13467581.2022.2109645

  49. Joksić J, Radenković M, Cvetković A, Matić-Besarabić S, Jovašević-Stojanović M, Bartonova A, Yttri KE (2010) Variations of PM10 mass concentrations and correlations with other pollutants in Belgrade urban area. Chem Ind Chem Eng Q 16:251–258. https://doi.org/10.2298/CICEQ090910041J

  50. Jung MC, Park J, Kim S (2019) Spatial relationships between urban structures and air pollution in Korea. Sustain 11:1–17. https://doi.org/10.3390/su11020476

  51. Kaleta D, Kozielska B (2023) Spatial and temporal volatility of PM2.5, PM10 and PM10-bound B[a]P concentrations and assessment of the exposure of the population of Silesia in 2018–2021. Int J Environ Res Public Health. https://doi.org/10.3390/ijerph20010138

  52. Karunasingha DSK (2022) Root mean square error or mean absolute error? Use their ratio as well. Inf Sci (Ny) 585:609–629. https://doi.org/10.1016/j.ins.2021.11.036

  53. Kertész Á, Madarász B, Csepinszky B, Benke S (2010) The role of conservation agriculture in landscape protection. Hungarian Geogr Bull 59:167–180

  54. Kim S (2015) ppcor: an R package for a fast calculation to semi-partial correlation coefficients. Commun Stat Appl Methods 22:665–674

  55. Kirešová S, Guzan M (2022) Determining the correlation between particulate matter PM10 and meteorological factors. Eng 3:343–363. https://doi.org/10.3390/eng3030025

  56. Kozarcanin S, Andresen GB, Staffell I (2019) Estimating country-specific space heating threshold temperatures from national gas and electricity consumption data. Energy Build 199:368–380. https://doi.org/10.1016/j.enbuild.2019.07.013

  57. Ku CA (2020) Exploring the spatial and temporal relationship between air quality and urban land-use patterns based on an integrated method. Sustain. https://doi.org/10.3390/su12072964

  58. Lan B, Haaland P, Krishnamurthy A, Peden DB, Schmitt PL, Sharma P, Sinha M, Xu H, Fecho K (2021) Open application of statistical and machine learning models to explore the impact of environmental exposures on health and disease : an asthma use case. IJERPH. https://doi.org/10.3390/ijerph182111398

  59. Largeron Y, Staquet C (2016) Persistent inversion dynamics and wintertime PM10 air pollution in Alpine valleys. Atmos Environ 135:92–108. https://doi.org/10.1016/j.atmosenv.2016.03.045

  60. Lenschow P, Abraham HJ, Kutzner K, Lutz M, Preuß JD, Reichenbächer W (2001) Some ideas about the sources of PM10. Atmos Environ 35:23–33. https://doi.org/10.1016/s1352-2310(01)00122-4

  61. Levshina N (2020) Conditional inference trees and random forests. In: Paquot M, Gries ST (eds) A practical handbook of corpus linguistics. Springer International Publishing, Cham, pp 611–643. https://doi.org/10.1007/978-3-030-46216-1_25

  62. Li C, Zhang K, Dai Z, Ma Z, Liu X (2020) Investigation of the impact of land-use distribution on PM2.5 in Weifang: seasonal variations. Int J Environ Res Public Health 17:1–20. https://doi.org/10.3390/ijerph17145135

  63. Li X, Ding C, Liao J, Du L, Sun Q, Yang J, Yang Y, Zhang D, Tang J, Liu N (2017) Microbial reduction of uranium (VI) by Bacillus sp. dwc-2: a macroscopic and spectroscopic study. J Environ Sci (China) 53:9–15. https://doi.org/10.1016/j.jes.2016.01.030

  64. Liu Y, Franklin M, Kahn R, Koutrakis P (2007) Using aerosol optical thickness to predict ground-level PM2.5 concentrations in the St. Louis area: a comparison between MISR and MODIS. Remote Sens Environ 107:33–44. https://doi.org/10.1016/j.rse.2006.05.022

  65. Łowicki D (2019) Landscape pattern as an indicator of urban air pollution of particulate matter in Poland. Ecol Indic 97:17–24. https://doi.org/10.1016/j.ecolind.2018.09.050

  66. Lu D, Mao W, Yang D, Zhao J, Xu J (2018) Effects of land use and landscape pattern on PM2.5 in Yangtze River Delta, China. Atmos Pollut Res 9:705–713. https://doi.org/10.1016/j.apr.2018.01.012

  67. Meng X, Wu Y, Pan Z, Wang H, Yin G, Zhao H (2019) Seasonal characteristics and particle-size distributions of particulate air pollutants in Urumqi. Int J Environ Res Public Health. https://doi.org/10.3390/ijerph16030396

  68. Moreci E, Ciulla G, Lo Brano V (2016) Annual heating energy requirements of office buildings in a European climate. Sustain Cities Soc 20:81–95. https://doi.org/10.1016/j.scs.2015.10.005

  69. Moustris KP, Zacharia PT, Larissi IK, Nastos PT, Paliatsos AG. Cooling and heating degree-days calculation for representative locations within the greater Athens Area, Greece. Proc. 12th Int. Conf. Environ. Sci. Technol. 2011. 8–10.

  70. Ng E, Chen L, Wang Y, Yuan C (2012) A study on the cooling effects of greening in a high-density city: an experience from Hong Kong. Build Environ 47:256–271. https://doi.org/10.1016/j.buildenv.2011.07.014

  71. Padoan E, Maffia J, Balsari P, Ajmone-Marsan F, Dinuccio E (2021) Soil PM10 emission potential under specific mechanical stress and particles characteristics. Sci Total Environ. https://doi.org/10.1016/j.scitotenv.2021.146468

  72. Panagos P, Van Liedekerke M, Borrelli P, Köninger J, Ballabio C, Orgiazzi A, Lugato E, Liakos L, Hervas J, Jones A, Montanarella L (2022) European soil data centre 2.0: soil data and knowledge in support of the EU policies. Eur J Soil Sci 73:1–18. https://doi.org/10.1111/ejss.13315

  73. Park S, Shin M, Im J, Song CK, Choi M, Kim J, Lee S, Park R, Kim J, Lee DW, Kim SK (2019) Estimation of ground-level particulate matter concentrations through the synergistic use of satellite observations and process-based models over South Korea. Atmos Chem Phys 19:1097–1113. https://doi.org/10.5194/acp-19-1097-2019

  74. Pascal M, Falq G, Wagner V, Chatignoux E, Corso M, Blanchard M, Host S, Pascal L, Larrieu S (2014) Short-term impacts of particulate matter (PM10, PM10-2.5, PM2.5) on mortality in nine French cities. Atmos Environ 95:175–184. https://doi.org/10.1016/j.atmosenv.2014.06.030

  75. Péterfalvi N, Keller B, Magyar M (2018) PM10 emission from crop production and agricultural soils. Agrokem es Talajt 67:143–159. https://doi.org/10.1556/0088.2018.67.1.10

  76. Pi H, Webb NP, Lei J, Li S (2022) Soil loss and PM10 emissions from agricultural fields in the Junggar Basin over the past six decades. J Soil Water Conserv 77:113–125. https://doi.org/10.2489/jswc.2022.00018

  77. Willmott CJ, Matsuura K (2005) Advantages of the mean absolute error (MAE) over the root mean square error (RMSE) in assessing average model performance. Clim Res 30:79–82

  78. Schmeller D, Pirisi G (2023) Green capital East of the Leitha? The chances and disadvantages of major cities in the Pannonian Basin to win the European Green Capital Award. Hungarian Geogr Bull. 72:287–309. https://doi.org/10.15201/hungeobull.72.3.5

  79. Scotto F, Bacco D, Lasagni S, Trentini A, Poluzzi V, Vecchi R (2021) A multi-year source apportionment of PM2.5 at multiple sites in the southern Po Valley (Italy). Atmos Pollut Res. https://doi.org/10.1016/j.apr.2021.101192

  80. Sgrigna G, Relvas H, Miranda AI, Calfapietra C (2022) Particulate matter in an urban-industrial environment: comparing data of dispersion modeling with tree leaves deposition. Sustain. https://doi.org/10.3390/su14020793

  81. Shahraiyni HT, Sodoudi S (2016) Statistical modeling approaches for PM10 prediction in urban areas; a review of 21st-century studies. Atmosphere (Basel) 7:10–13. https://doi.org/10.3390/atmos7020015

  82. Sicard P, Khaniabadi YO, Perez S, Gualtieri M, De Marco A (2019) Effect of O3, PM10 and PM2.5 on cardiovascular and respiratory diseases in cities of France, Iran and Italy. Environ Sci Pollut Res 26:32645–32665. https://doi.org/10.1007/s11356-019-06445-8

  83. Sohrab S, Csikós N, Szilassi P (2023) Effects of land use patterns on PM10 concentrations in urban and suburban areas. A European scale analysis. Atmos Pollut Res. https://doi.org/10.1016/j.apr.2023.101942

  84. Sohrab S, Csikós N, Szilassi P (2022) Connection between the spatial characteristics of the road and railway networks and the air pollution (PM10) in urban-rural fringe zones. Sustain. https://doi.org/10.3390/su141610103

  85. Spinoni J, Vogt JV, Barbosa P, Dosio A, McCormick N, Bigano A, Füssel HM (2018) Changes of heating and cooling degree-days in Europe from 1981 to 2100. Int J Climatol 38:e191–e208. https://doi.org/10.1002/joc.5362

  86. Stafoggia M, Schwartz J, Badaloni C, Bellander T, Alessandrini E, Cattani G, Donato FD, Gaeta A, Leone G, Lyapustin A, Sorek-Hamer M, Hoogh KD, Di Q, Forastiere F, Kloog I (2020) Estimation of daily PM10 concentrations in Italy (2006–2012) using finely resolved satellite data, land use variables and meteorology. Environ Int 99:234–244. https://doi.org/10.1016/j.envint.2016.11.024

  87. Strobl C, Boulesteix A, Kneib T, Augustin T, Zeileis A (2008) Conditional variable importance for random forests. BMC Bioinform 11:1–11. https://doi.org/10.1186/1471-2105-9-307

  88. Suárez Sánchez A, García Nieto PJ, Riesgo Fernández P, del Coz Díaz JJ, Iglesias-Rodríguez FJ (2011) Application of an SVM-based regression model to the air quality study at local scale in the Avilés urban area (Spain). Math Comput Model 54:1453–1466. https://doi.org/10.1016/j.mcm.2011.04.017

  89. Thorpe A, Harrison RM (2008) Sources and properties of non-exhaust particulate matter from road traffic: a review. Sci Total Environ 400:270–282. https://doi.org/10.1016/j.scitotenv.2008.06.007

  90. van Donkelaar A, Martin RV, Brauer M, Kahn R, Levy R, Verduzco C, Villeneuve PJ (2010) Global estimates of ambient fine particulate matter concentrations from satellite-based aerosol optical depth: development and application. Environ Health Perspect 118:847–855. https://doi.org/10.1289/ehp.0901623

  91. Vardoulakis S, Kassomenos P (2008) Sources and factors affecting PM10 levels in two European cities: implications for local air quality management. Atmos Environ 42:3949–3963. https://doi.org/10.1016/j.atmosenv.2006.12.021

  92. Varga G, Rostási Á, Meiramova A, Dagsson-Waldhauserová P, Gresina F (2023) Increasing frequency and changing nature of Saharan dust storm events in the Carpathian Basin (2019–2023)—the new normal? Hungarian Geogr Bull. 72:319–337. https://doi.org/10.15201/hungeobull.72.4.1

  93. Viana M, Kuhlbusch TAJ, Querol X, Alastuey A, Harrison RM, Hopke PK, Winiwarter W, Vallius M, Szidat S, Prévôt ASH, Hueglin C, Bloemen H, Wåhlin P, Vecchi R, Miranda AI, Kasper-Giebl A, Maenhaut W, Hitzenberger R (2008) Source apportionment of particulate matter in Europe: a review of methods and results. J Aerosol Sci 39:827–849. https://doi.org/10.1016/j.jaerosci.2008.05.007

  94. World Health Organization, 2018. WHO Housing and health guidelines.

  95. Xiong J, Chen L, Zhang Y (2023) Building energy saving for indoor cooling and heating: mechanism and comparison on temperature difference. Sustain. https://doi.org/10.3390/su151411241

  96. Xu G, Jiao L, Zhang B, Zhao S, Yuan M, Gu Y, Liu J, Tang X (2017) Spatial and temporal variability of the PM2.5/PM10 ratio in Wuhan, Central China. Aerosol Air Qual Res 17:741–751. https://doi.org/10.4209/aaqr.2016.09.0406

  97. Xu G, Jiao L, Zhao S, Yuan M, Li X, Han Y, Zhang B, Dong T (2016) Examining the impacts of land use on air quality from a spatio-temporal perspective in Wuhan, China. Atmosphere (Basel) 7:1–18. https://doi.org/10.3390/atmos7050062

  98. Yan C, Wang L, Zhang Q (2021) Study on coupled relationship between urban air quality and land use in Lanzhou, China. Sustain. https://doi.org/10.3390/su13147724

  99. Yli-Pelkonen V, Setälä H, Viippola V (2017) Urban forests near roads do not reduce gaseous air pollutant concentrations but have an impact on particles levels. Landsc Urban Plan 158:39–47. https://doi.org/10.1016/j.landurbplan.2016.09.014

  100. Zahari MAZ, Majid MR, Ho CS, Kurata G, Nadhirah N, Irina SZ (2016) Relationship between land use composition and PM10 concentrations in Iskandar Malaysia. Clean Technol Environ Policy 18:2429–2439. https://doi.org/10.1007/s10098-016-1263-3

  101. Zang Z, Wang W, You W, Li Y, Ye F, Wang C (2017) Estimating ground-level PM2.5 concentrations in Beijing, China using aerosol optical depth and parameters of the temperature inversion layer. Sci Total Environ 575:1219–1227. https://doi.org/10.1016/j.scitotenv.2016.09.186

  102. Zhang M, Chen S, Zhang X, Guo S, Wang Y, Zhao F, Chen J, Qi P, Lu F, Chen M, Bilal M (2023) Characters of particulate matter and their relationship with meteorological factors during winter Nanyang 2021–2022. Atmosphere (Basel) 14:137. https://doi.org/10.3390/atmos14010137

  103. Zheng Y, Si P, Zhang Y, Shi L, Huang C, Huang D, Jin Z (2023) Study on the effect of radiant insulation panel in cavity on the thermal performance of broken-bridge aluminum window frame. Buildings. https://doi.org/10.3390/buildings13010058

Acknowledgements

We would like to thank the anonymous reviewers for their thoughtful comments and efforts towards improving our manuscript.

Funding

Open access funding provided by University of Szeged. University of Szeged Open Access Fund, Grant number: 6977.

Author information

Contributions

S.S. was responsible for data curation, formal analysis, methodology, visualization, and writing the original draft; N.C. was responsible for methodology and visualization; P.S. was mainly responsible for conceptualization and methodology. All authors reviewed the manuscript.

Corresponding author

Correspondence to Nándor Csikós.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

See Tables 8, 9, 10 and Figs. 7, 8, 9, 10.

Table 8 The main meteorological variables used in this study were downloaded from the Climate Data Store, published on April 18, 2019

Fig. 7 Conditional inference forest results for suburban landscapes during the cooling period: A 1000 m distance from AQ station points and B 3000 m distance from AQ station points. Grey-shaded variables have importance < 5%

Table 9 Conditional inference forest performance statistics for PM10 AQ monitoring sites in European suburban landscapes during the cooling period
Table 10 Conditional inference forest performance statistics for PM10 AQ monitoring sites in European suburban landscapes during the heating period

Fig. 8 Conditional inference forest results for suburban landscapes during the heating period: A 1000 m distance from AQ station points and B 3000 m distance from AQ station points. Grey-shaded variables have importance < 5%

Fig. 9 Conditional inference forest results for urban landscapes during the cooling period: A 1000 m distance from AQ station points and B 3000 m distance from AQ station points. Grey-shaded variables have importance < 5%

Fig. 10 Conditional inference forest results for urban landscapes during the heating period: A 1000 m distance from AQ station points and B 3000 m distance from AQ station points. Grey-shaded variables have importance < 5%

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Sohrab, S., Csikós, N. & Szilassi, P. Effect of geographical parameters on PM10 pollution in European landscapes: a machine learning algorithm-based analysis. Environ Sci Eur 36, 152 (2024). https://doi.org/10.1186/s12302-024-00972-z

  • DOI: https://doi.org/10.1186/s12302-024-00972-z

Keywords