A COMPARATIVE ANALYSIS OF POACEAE POLLEN SEASONS IN LUBLIN (POLAND)

The aim of the present study was to compare the dynamics of grass pollen seasons and to assess whether grouping seasons by their pollen characteristics and grouping years by similar weather conditions would produce the same groups. Spearman's correlation test between pollen counts and weather parameters during the pollen season showed the strongest correlations with temperature (positive) and air humidity (negative). The pollen seasons varied greatly in terms of air humidity, rainfall, and cloud cover, whereas temperature variations were small; the seasons in 2004 (very cold) and 2010 (very warm) were exceptions. Cluster analysis distinguished three groups of seasons. Grouping the seasons by different criteria produced different groups of pollen seasons. No strong direct relationship was found between the mean values of the seasonal meteorological factors analysed and the groups of seasons. PCA can be used for quick and easy interpretation of the weather characteristics of a particular season and for comparing it with other seasons.


INTRODUCTION
Grass pollen allergens are the most frequent cause of pollen allergy in Poland (Obtułowicz et al., 1990; Rapiejko and Weryszko-Chmielewska, 1998). Therefore, research on grass pollen monitoring is of great importance. People with pollen allergy symptoms show individually varying sensitivity to grass pollen. According to different authors, the threshold value for grass pollen, i.e. the minimum number of pollen grains that induces allergy symptoms in all allergic people, is 50 P × m⁻³ (Davies and Smith, 1973; Rapiejko and Weryszko-Chmielewska, 1998; Rapiejko et al. 2004). The presence of airborne pollen is related to the timing of flowering of plants. The atmospheric pollen season is long due to the occurrence of various grass species with different flowering times (Kasprzyk and Walanus, 2010). The patterns of daily airborne pollen concentrations and the main season parameters are characterized by high variability.
The characteristics of the main parameters of grass pollen seasons (start, end, duration, peak value, peak date, seasonal pollen index (SPI)) and predictive models for the season parameters have been presented earlier (Piotrowska, 2012). The present study analysed the dynamics of pollen seasons in relation to meteorological factors. The similarity between pollen seasons was also determined using cluster analysis. An attempt was made, using clustering methods, to determine whether any dominant types of pollen seasons occur in Lublin. The grouping of pollen seasons has also been used by other authors (Smith et al. 2009; Dąbrowska-Zapart, 2010; Malkiewicz and Klaczak, 2011). Principal component analysis (PCA) was employed to compare weather conditions during the pollen seasons (Garcia-Mozo et al. 2008; González Parrado et al. 2009; Lighthart et al. 2009; Piotrowska and Kubik-Komar, 2012). In the further part of the study, it was checked whether the groups of pollen seasons corresponded to the groups derived on the basis of meteorological conditions.
This study analysed whether the groups identified by k-means clustering were close to each other in the scatter plot of mean factor scores in the PC1 and PC2 coordinate system. Such an arrangement would suggest the influence of seasonal weather factors on the grouping of seasons in terms of pollen concentration.

MATERIAL AND METHODS
This aeropalynological study was carried out in Lublin in the period 2001-2010. Average daily pollen concentrations were measured with a Hirst-type trap (Lanzoni VPPS 2000). The pollen sampler was located on the roof of a building of the University of Life Sciences at an altitude of 197 m and at a height of 18 m above ground level. The standard methods recommended by the International Association for Aerobiology were used for analysis (Mandrioli et al. 1998). The 98% method was applied to determine the atmospheric pollen season (Emberlin et al. 1994; Spieksma and Nikkels, 1998). According to this method, the pollen season runs from the day on which 1% of the total pollen is registered to the day on which 99% of the total pollen is noted. Daily average grass pollen counts are expressed as pollen grains/m³ (P × m⁻³). The study identified periods of high pollen concentration of more than 50 P/m³.
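The 98% method described above can be sketched in a few lines. This is a minimal illustration only, assuming that the 1% and 99% limits refer to the cumulative sum of daily counts over the year; the function name `season_limits` and the toy series are hypothetical and are not taken from the study.

```python
from itertools import accumulate


def season_limits(daily_counts, lower=0.01, upper=0.99):
    """Return (start_index, end_index) of the pollen season.

    start: first day on which the cumulative sum reaches `lower`
           (1%) of the annual total;
    end:   first day on which it reaches `upper` (99%).
    Indices are 0-based positions in `daily_counts`.
    """
    total = sum(daily_counts)
    if total <= 0:
        raise ValueError("no pollen recorded")
    cumulative = list(accumulate(daily_counts))
    start = next(i for i, c in enumerate(cumulative) if c >= lower * total)
    end = next(i for i, c in enumerate(cumulative) if c >= upper * total)
    return start, end


# Hypothetical daily counts (P/m3), not data from the study.
counts = [0, 3, 0, 7, 40, 100, 40, 7, 0, 3, 0]
start, end = season_limits(counts)
duration = end - start + 1  # season length in days, both limits included
```

The same function can be called with other fractions (for example 0.025 and 0.975) to obtain the intermediate stages of the cumulative curve.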
Meteorological data were used to determine the effect of weather on the patterns of grass pollen seasons. They were obtained from the Meteorological Observatory of the Meteorology and Climatology Department of the Maria Curie-Skłodowska University in Lublin, located about 1.5 km from the pollen sampling site. The following meteorological data were used for the analysis: mean, minimum and maximum air temperature, relative air humidity, rainfall, cloud cover, and wind speed. The results of the principal component analysis are presented in the form of a table of factor loadings showing correlations between weather parameters and the obtained factors (PC1, PC2) as well as in the form of a scatter plot of seasonal mean factor scores in the PC1-PC2 coordinate system. The factor loadings were obtained after VARIMAX rotation, which maximizes the sum of the variances of the squared loadings (Ferguson and Takane, 1989). The statistical dependence between pollen concentration and meteorological factors was determined by Spearman's test.
The similarity between pollen seasons was determined on the basis of the results of cluster analysis. Data clustering was performed using hierarchical cluster analysis and the k-means method. Different ways of clustering were analysed in the hierarchical classification method, but the results obtained by Ward's method proved to be the closest to the k-means clustering results, since in both these methods a season is classified into a particular cluster by minimizing the within-cluster variance and maximizing the between-cluster variance (Gozdowski et al. 2008). Different characteristics of the pollen season were used for this classification. Pollen seasons were grouped in terms of the following parameters: 1) daily pollen concentration patterns, 2) season parameters (start, end, duration, peak value, peak date, seasonal pollen index (SPI), average seasonal pollen concentration), and 3) rate of increase in pollen concentration. Where the season parameters were the criterion of clustering, the data were standardized before analysis in order to avoid the effect of the differences in measurement units between the parameters on the values of Euclidean distances. The rate of increase in pollen concentration was determined on the basis of the stages when the cumulative sum of pollen grains attained the values of 1%, 2.5%, 5%, 25%, 50%, 75%, 95%, 97.5%, 99% of the total annual pollen concentration.
All statistical calculations were performed using STATISTICA ver. 8.
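The reason for standardizing the season parameters before computing Euclidean distances can be illustrated with a small sketch. It assumes z-scores based on the population standard deviation (the study does not state which variant STATISTICA applied), and the three "seasons" with two parameters (duration in days, SPI) are invented for illustration.

```python
from math import sqrt
from statistics import mean, pstdev


def standardize(rows):
    """Z-score each column of a list of equal-length rows.

    Assumes every column has non-zero variance.
    """
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    sds = [pstdev(col) for col in columns]
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in rows]


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical seasons: [duration in days, SPI]; not data from the study.
seasons = [[100, 4000], [110, 4100], [100, 6000]]

# Raw distances are dominated by SPI because of its larger numeric scale.
raw_01 = euclidean(seasons[0], seasons[1])
raw_02 = euclidean(seasons[0], seasons[2])

# After standardization both parameters contribute on a comparable scale.
z = standardize(seasons)
std_01 = euclidean(z[0], z[1])
std_02 = euclidean(z[0], z[2])
```

In the raw data the third season looks about twenty times farther from the first than the second does, only because SPI is expressed in larger numbers; after standardization the two distances are of similar size.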

RESULTS
The grass pollen seasons in Lublin during the period 2001-2010 were characterized by high variation. These differences related to both concentration levels and timing of occurrence of airborne pollen. Meteorological conditions recorded during the grass pollen seasons were analysed using principal component analysis. Before performing the principal component analysis, the data were standardized in order to eliminate any differences in the parameters under investigation resulting from the differences in measurement units. Furthermore, the variable representing mean temperature was excluded from the analysis due to its high correlation with maximum temperature (r=0.97). Two principal components were obtained: PC1 and PC2 (Tab. 1). PC1 was most correlated with minimum and maximum temperature, while PC2 was most correlated with air humidity, cloud cover, and rainfall. The latter factor was termed inclement weather. During the grass pollen season, i.e. on average from the middle of May to the beginning of September, variable weather conditions were recorded in the years 2001-2010 (Fig. 1). The mean values of the factor scores in Figure 1 indicate that the 2010 season was characterized by the highest temperature, while the season in 2004 was by far the coldest. The mean factor scores for the years 2007 and 2010 are located in the first quadrant of this scatter plot, which means that these seasons were warm and wet, while the location of the 2003, 2005 and 2008 seasons below the PC1 axis and close to the PC2 axis in the scatter plot shows that these seasons were rather sunny with temperature remaining at an average level. The high values of PC2 for the 2001 season suggest that this was a season with a predominance of cloudy and wet days. The analysis of the meteorological data shows that the highest rainfall total was in 2001 compared to the other study years.
Spearman's test was employed to determine the degree of dependence between pollen concentration and meteorological conditions, and the data from the entire 10-year study were analysed. The correlation coefficients for the atmospheric pollen season were statistically significant in the case of all the meteorological factors in question (Table 2). Nevertheless, they had a low value and therefore correlations in different periods of occurrence of airborne pollen were also examined. The following periods were taken into consideration: the period of high pollen concentrations (> 50 P/m³) and the pre-peak period. The highest degree of correlation was obtained for the pre-peak period in the case of temperature. The analysis of Spearman's correlation coefficients shows that the grass pollen concentration was most affected by mean and maximum air temperature (positive correlation) and air humidity (negative correlation). A negative correlation, but at a lower level, was found for cloud cover and rainfall. The lowest correlation coefficient was recorded for wind speed.
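For readers who wish to reproduce this kind of analysis, Spearman's coefficient is the Pearson correlation computed on the ranks of the two variables. The sketch below is a plain illustration with tied values given average ranks; the helper names and the toy values for temperature, humidity and pollen are hypothetical, and no significance test is included.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / sqrt(vx * vy)


# Hypothetical daily values, not data from the study.
temperature = [12, 15, 18, 21, 24]   # deg C
humidity = [90, 85, 70, 60, 55]      # %
pollen = [5, 20, 18, 60, 150]        # P/m3

rho_temp = spearman(temperature, pollen)   # positive association
rho_hum = spearman(humidity, pollen)       # negative association
```

In practice a statistical package would normally be used instead, since it also reports the significance level of each coefficient.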
The agglomeration method was applied to cluster pollen seasons on the basis of daily pollen concentrations; the results of this analysis are shown in graphic form (dendrogram) in Figure 2. Three clusters of pollen seasons that are characterized by a similar pattern can be distinguished in this diagram.
Three types of pollen seasons were distinguished on the basis of the above results derived using k-means clustering (Fig. 3). The clusters obtained by the agglomeration and k-means methods covered the same years. Four years were included in type A: 2005, 2008, 2009, and 2010. They were characterized by high seasonal pollen indexes (SPI). In addition, a long period of very high pollen concentrations was noted in phase II of the season. This period lasted about two weeks, and during this time similar pollen concentrations occurred, rarely with a short-term decline. Type B included the years 2002, 2003, and 2007. During these seasons, high pollen concentrations, exceeding 50 P/m³, were recorded from the second half of May, while the maximum values occurred at the turn of June and July. The curves representing the pattern of type B pollen seasons had several peak values, with no single clear peak; their characteristic feature was a sudden decrease in pollen concentration after a short period of high concentration. For this reason, the curve has several peaks separated by low values. High grass pollen concentrations (>50 P/m³) ended earliest in the case of type B seasons, namely between 9 and 11 July. Type C comprises grass pollen seasons from the years 2001, 2004, and 2006. Contrary to the previous type, in type C one clear peak can be distinguished, which was recorded on 8 or 9 July. A very high airborne pollen concentration remained for several days and subsequently declined quite quickly; much lower pollen concentrations were recorded on the following days.
The pollen seasons of types A and B are more similar to each other (a smaller Euclidean distance) than to those of cluster C (Table 3). Clusters B and C are the most distant. Similar results can be observed in the dendrogram (Fig. 2), where the Euclidean distances can be seen on the cluster distance axis. A single clear peak occurring at a similar time in the seasons from cluster C was most probably the feature that most distinguished these groups.
The data presented in Figure 3 show that two periods (phases) with a high pollen concentration (>50 P/m³), separated from each other by several days of low concentration, mostly occurred in the grass pollen season in Lublin. On average, the first period started at the end of May and ended on different days of June, whereas the second period started on average on 19 June and ended on 18 July (Table 4). The highest variation was observed in the start dates of the first period of high pollen concentration, while the lowest variation was noted for the end of the second period (Table 4). The first phase of the pollen season was characterized by much lower pollen concentrations than the second phase. The maximum pollen concentrations during the first phase of the season were in the range of 94-219 P × m⁻³ and they occurred between 28 May and 22 June. During this time, the highest concentrations were 2-4 times higher than the threshold value at which all allergic people have allergy symptoms. The maximum pollen concentrations during the second phase were much higher, on average 3 times higher than during the first phase. The seasonal maximum averaged 459 P × m⁻³ (from 276 P × m⁻³ to 643 P × m⁻³). The curves of the grass pollen seasons were right-skewed. The coefficient of skewness was in the range from 1.85 (2007) to 3.92 (2003). On average, the pre-peak period lasted 46 days, while the post-peak period lasted 56 days.
The further results of cluster analysis are presented only in the form of dendrograms. The similarity of seasons was analysed in terms of pollen season parameters and rate of increase in airborne pollen concentration. The results of this analysis show that three main groups were identified that include different seasons depending on the criterion used (Figs 4, 5). In terms of pollen season parameters, three clusters were distinguished: 2008 and 2010 (A); 2003, 2004, 2006, and 2009 (B); as well as 2001, 2002, 2005, and 2007 (C) (Fig. 4). In order to determine which of the season parameters had the greatest effect on identifying the clusters included in Fig. 4, mean values for these parameters were calculated in the clusters (Tab. 5). It can be clearly seen that group A differs markedly from the other groups in terms of mean values of the following parameters: SPI, average pollen concentration in season, seasonal maximum, and season duration. These differences therefore confirm the large cluster distance in Figure 4 between group A and the other groups.
In terms of the rate of increase in concentration, the years 2002 and 2007, 2005 and 2006 as well as 2008 and 2009 were the most similar (Fig. 5). When the rate of increase in pollen concentration was used as the criterion, the years 2008 and 2009 as well as 2002 and 2007 were in the same groups as in the case when k-means clustering was applied (Fig. 5). Only two pollen seasons (2002 and 2007) were similar irrespective of the criterion used.
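The identification of periods of high concentration can be sketched as a search for runs of days above the 50 P/m³ threshold. The study does not state how short declines within a phase were handled, so the `min_gap` parameter below (the number of low days required to separate two periods) is an assumption introduced for illustration, as are the function name and the toy series.

```python
def high_periods(daily_counts, threshold=50, min_gap=1):
    """Return (first_day, last_day) index pairs of high-concentration periods.

    A day is "high" when its count exceeds `threshold`. Two high days
    belong to the same period when fewer than `min_gap` low days lie
    between them; with min_gap=1 only consecutive high days are joined.
    """
    periods = []
    for day, value in enumerate(daily_counts):
        if value <= threshold:
            continue
        if periods and day - periods[-1][1] - 1 < min_gap:
            periods[-1][1] = day          # extend the current period
        else:
            periods.append([day, day])    # open a new period
    return [tuple(p) for p in periods]


# Hypothetical daily counts (P/m3), not data from the study.
counts = [10, 60, 120, 40, 70, 20, 15, 30, 200, 450, 90, 10]

strict = high_periods(counts)               # every run counted separately
merged = high_periods(counts, min_gap=2)    # single low days are bridged

# Ratio of the maximum in the second period to that in the first.
(first_a, first_b), (second_a, second_b) = merged
peak_ratio = max(counts[second_a:second_b + 1]) / max(counts[first_a:first_b + 1])
```

With `min_gap=2` the toy series yields two periods, which mirrors the two-phase pattern described above; the appropriate gap length for real data would have to be chosen by the analyst.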

DISCUSSION AND CONCLUSIONS
Many authors stress the need to carry out ongoing pollen monitoring in different monitoring centres due to high variability in the timing and severity of pollen seasons (Smith et al. 2009; Myszkowska, 2010; Sabariego et al. 2011). Research conducted at 13 European sites has found geographic location, mainly latitude, to be the most important factor affecting the start date of the grass pollen season in different regions (Smith et al. 2009). This is primarily associated with the accumulation of temperatures from the start of the growing season, since plants require a definite dose of thermal energy. The grass pollen season is usually determined by the 98% method (Emberlin et al. 1994; Galan et al. 1995; Spieksma and Nikkels, 1998; Puc, 2011). Clot (1998) proposes that the start of the grass pollen season should be defined on the basis of cumulative temperature. According to this author's method, the second rainless day when the cumulative mean daily temperature from 1 March reaches 500 °C should be defined as the start of the grass pollen season. In comparing the start dates of the grass pollen season determined by the 98% method and by Clot's method (1998), Piotrowska (2006) obtained similar results.
The research conducted in Lublin shows that grass pollen reaches very high airborne concentrations. In the years 2001-2010, high grass pollen concentrations (>50 P × m⁻³) were recorded in the period from 15 May to 30 July. Weather conditions are among the main factors influencing pollen season patterns (Myszkowska, 2010; Puc, 2011; Sabariego et al. 2011). In Lublin, mean and maximum temperature as well as air humidity were among the most important meteorological parameters affecting airborne grass pollen concentrations. A high positive correlation was found between grass pollen concentration and mean and maximum air temperature, as well as a negative correlation with air humidity. Similar results were obtained by Kasprzyk and Walanus (2010), who found that on days when maximum and mean temperature rise, the pollen concentration also increases, whereas the pollen concentration decreases when there is an increase in humidity. Thanks to meteorological observations, it is possible to forecast the most important features of the pollen season (Emberlin and Adams-Groom, 2004; Stach et al. 2008; Piotrowska, 2012). Earlier research has found that the mean minimum temperature of March and cloud cover in the first ten-day period of May were the best meteorological factors to predict the start of the grass pollen season in Lublin. Rainfall in May was the most important factor determining season duration (Piotrowska, 2012).
Meteorological factors in relation to pollen counts are analysed using different statistical methods. González Parrado et al. (2009) propose principal component analysis (PCA) as an alternative statistical method for interpreting airborne pollen counts. These authors consider that the use of this type of analysis, in combination with other methods, is necessary to record differences in pollen concentration accurately. On the basis of PCA, they found periods of high pollen concentration to be characterized by high maximum temperature and low rainfall levels (González Parrado et al. 2009). During the study carried out in Lublin, an attempt was also made to use a similar method, but the inclusion of the meteorological and pollen data combined did not give a satisfactory result. Hence, in the present paper PCA is used to classify pollen seasons in terms of weather. Including the data from all seasons in one plot makes it possible to create weather characteristics of a given pollen season that are quick and easy to interpret and to compare such a season with the others. The PCA for Lublin shows that the greatest variation in the period 2001-2010 related to inclement weather. In the case of temperature, most seasons concentrated around the mean value. The PCA classification results were compared with the groups of pollen seasons that were identified by using cluster analysis. In terms of temperature, the years 2004 (very cold) and 2010 (very warm) clearly stood out; although in cluster analysis these years were classified into particular groups, in all the dendrograms they joined these groups at the very end (large cluster distance), which indicates that they differed more than the other pollen seasons. In the other cases, no direct relationship was found between the groups distinguished on the basis of PCA and cluster analysis. Including meteorological data from the pollen season, and additionally data from before the season, in the analysis did not bring the expected results. According to Smith et al. (2009), apart from local meteorological conditions, the pollen season is also affected by large-scale patterns of climate variability, for example the NAO (North Atlantic Oscillation). Moreover, the grass pollen concentration, and consequently the pattern of pollen seasons, can also be influenced by human activity, e.g. the timing of hay cutting (Kasprzyk and Walanus, 2010).
Two phases of high pollen concentration were distinguished in the grass pollen season pattern in Lublin. Similar dynamics of grass pollen seasons have been observed in Rzeszów, where the reason for the characteristic pattern of grass pollen seasons was probably the timing of flowering of various grass species and dates of hay cutting (Kasprzyk and Walanus, 2010). No dominant pollen season types were found based on the cluster analysis performed for Lublin. A relatively short data series can be one of the reasons. However, on the basis of an 8-year study and using k-means clustering, Malkiewicz and Klaczak (2011) identified 3 types of seasons, one of which included 4 years and may prove to be the dominant one. All the pollen seasons in Lublin were right-skewed, but the difference between the length of the pre-peak period and post-peak period was not as large as in other cities in Poland. In Kraków the average number of days in the pre-peak period was half that in the post-peak period (Myszkowska, 2010). The post-peak period was found to be much longer than the pre-peak period also in Wrocław (Malkiewicz and Klaczak, 2011).

Fig. 1. Comparison of the weather conditions during the pollen seasons in Lublin in 2001-2010.

Fig. 2. Comparison of the dynamics of grass pollen seasons in Lublin in 2001-2010.

Fig. 4. Groups of pollen seasons identified on the basis of the season parameters in Lublin in 2001-2010.