### Lag Variables in Air Pollution Modeling Based on Traffic Flow and Meteorological Factors

#### Abstract

To refine research on how environmental factors affect the concentration of air pollutants, we present a mathematical model that takes past values of the factors (explanatory variables) into account when modeling the current pollutant concentration. We conducted numerical analyses based on hourly data from meteorological, traffic and air quality monitoring stations in Wrocław (Poland, Central Europe) from 2015–2017. To determine the optimal delay of each explanatory variable, we used a multi-objective optimization (MO) model. For the concentration of nitrogen oxides, delayed values of traffic flow, wind speed and sunshine duration turned out to be more important than the current ones. We then built two random forest models: an actual model using the current values of the explanatory variables, and a lag model using delayed variables determined by the MO method. Taking into account variables with an optimal delay (lag model) increases model accuracy for NO₂ from R² = 0.51 to 0.56 and for NOₓ from R² = 0.46 to 0.52. We conclude that in modeling pollutant concentrations, the possibility that delayed variables have a greater influence should always be considered, because it can significantly increase the accuracy of the model and indicate additional relationships or dependencies.
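
The idea of a lagged explanatory variable can be illustrated with a minimal sketch. It is not the multi-objective optimization used in the paper: as a simplified stand-in, it scans candidate delays and picks the one with the largest absolute Pearson correlation between the delayed variable and the current concentration. The data are synthetic, and the names `lag_correlation`, `best_lag`, `TRUE_LAG` and `N_HOURS` are hypothetical, introduced only for this illustration.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def lag_correlation(explanatory, target, lag):
    """Correlate target[t] with explanatory[t - lag] (lag in hours, lag >= 0)."""
    if lag == 0:
        return pearson(explanatory, target)
    # Drop the last `lag` explanatory values and the first `lag` targets
    # so that index i pairs explanatory[i] with target[i + lag].
    return pearson(explanatory[:-lag], target[lag:])


def best_lag(explanatory, target, max_lag):
    """Return the lag in 0..max_lag with the largest absolute correlation."""
    return max(
        range(max_lag + 1),
        key=lambda k: abs(lag_correlation(explanatory, target, k)),
    )


# Synthetic hourly data (an assumption, not the Wrocław measurements):
# the "pollutant" responds to "traffic" from TRUE_LAG hours earlier, plus noise.
random.seed(0)
TRUE_LAG = 3
N_HOURS = 500
traffic = [random.gauss(0.0, 1.0) for _ in range(N_HOURS)]
pollutant = [
    (traffic[t - TRUE_LAG] if t >= TRUE_LAG else 0.0) + random.gauss(0.0, 0.5)
    for t in range(N_HOURS)
]

chosen = best_lag(traffic, pollutant, max_lag=12)
print("selected lag (hours):", chosen)
```

In a workflow like the one the abstract describes, the delay selected for each variable would then define the inputs of the lag model, whose accuracy is compared with that of a model built on current values only.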
2020
air pollution; nitrogen oxides; random forest; lag variables; multi-objective optimization; traffic flow; meteorological conditions
Files in this item:
File: ismo2020.pdf (open access)
Description: Publisher's full text
Type: Full text (publisher's version)
License: Creative Commons
Size: 226.61 kB


Use this identifier to cite or link to this document: https://hdl.handle.net/11392/2420900