Formalized Hierarchically Nested Expert System for Classification of Mesic and Wet Grasslands in Poland

The goal of this study was to propose a hierarchically nested classification system comprising four principal levels of the Braun-Blanquet system for Polish grasslands of the class Molinio-Arrhenatheretea. Using the Cocktail method, we defined consistent criteria for delimitation of the class, three orders, nine alliances, and 45 associations. Formal definitions were prepared using the summed cover and presence/absence information of species groups and individual dominant species. We created an expert system with a set of assignment rules that unambiguously classify relevés to a single unit at the given abstraction level of the Braun-Blanquet system in such a way that a relevé matched by the definition of a focal vegetation unit must be matched by definitions of all superior units. Of 11,535 relevés classified to Molinio-Arrhenatheretea, 36% were recognized at the association level, and 57% and 85% at the alliance and order level, respectively. All relevés were assigned unambiguously, meaning that a single relevé could not be assigned to more than one unit within the same hierarchical level (no overlap between vegetation units). This study is the first proposal of a hierarchically nested classification system that classifies grassland vegetation at different syntaxonomical levels unequivocally. It is important to create definitions for different syntaxonomical levels because the majority of vegetation patches do not fit to the associations, but can only be assigned to high-rank units such as alliance, order, or class.


Introduction
Seminatural grasslands of the class Molinio-Arrhenatheretea are one of the major components of the agricultural landscape in Poland (Szymura & Szymura, 2019). Although grasslands occur quite commonly across the country, their classification is still unclear and differs among regions. Only a few vegetation surveys have been devoted to the classification of grasslands on the country-wide scale (Matuszkiewicz, 1984; Nowiński, 1967). There is an increasing need for a syntaxonomical revision of this type of vegetation in Poland for two reasons. First, a number of regional surveys with substantial changes to the syntaxonomy of grasslands have been introduced in recent years. Second, the progress in classification methods and the current availability of vegetation-plot databases provide an opportunity to tackle the classification across wide environmental and geographical ranges. The lack of a comprehensive classification of the Molinio-Arrhenatheretea class in the national survey was probably caused by the lack of a suitable digital vegetation-plot database. This obstacle has been overcome by the development of the national vegetation-plot database (Kącki & Śliwiński, 2012) and a grassland-focused database for the Polish Carpathians (Korzeniak, 2016).
One of the main challenges of contemporary phytosociology is to make classification systems as consistent as possible within a broad study region of interest (De Cáceres & Wiser, 2012; Dengler et al., 2013); however, this is difficult to achieve because of the wide range of classification approaches. Considering that most vegetation surveys in Europe have been carried out in accordance with the traditional Braun-Blanquet approach, the relatively standardized data collection protocols in European vegetation surveys should enhance consistency (Mucina et al., 2000). A notable contribution to vegetation classification is the development of vegetation-plot databases in which millions of records are stored (Dengler et al., 2011). A key point is, however, to use a classification method that ensures a stable, repeatable, and consistent classification outcome. One of the methods that meet these criteria is the Cocktail method, which utilizes the explicit definition of a given vegetation unit corresponding to the concept of that unit (Bruelheide, 1997, 2000). It is a supervised classification method that uses formal rules to match relevés to predefined vegetation units, and it has proved particularly useful for classifying vegetation in countries with a long history of phytosociological studies, where vegetation units had already been defined. Formal definitions of vegetation units can be gathered into an expert system (ES), that is, a set of formal definitions used for automatic allocation of relevés to syntaxa (Noble, 1987). With this approach, several nationwide formalized classifications of grassland vegetation were introduced in recent national classification systems (Chytrý et al., 2007; Janišová et al., 2007).
Since the association is the basic unit in syntaxonomy (Braun-Blanquet, 1964), formal definitions of vegetation units were mostly created for syntaxa of this rank at a regional or national scale (Boublík, 2010; Dítě et al., 2007; Šilc & Čarni, 2007). Along with its proposed extensions (Bruelheide, 2016; Kočí et al., 2003; Landucci et al., 2015), the Cocktail method has been widely used to create formalized classifications of various vegetation types at the alliance level (Douda et al., 2016; Marcenò et al., 2018; Willner et al., 2017). However, a single-level classification system is not coherent with the logic of the hierarchical system of the traditional Braun-Blanquet approach, which classifies vegetation at different levels of abstraction, namely association, alliance, order, and class, depending on the presence and abundance of diagnostic species (Braun-Blanquet, 1964). It needs to be highlighted that the multilevel syntaxonomical system requires that each vegetation unit of a given rank has its own diagnostic species. The philosophy of the Braun-Blanquet approach has been partly met by the hierarchical expert system representing a two-level, formalized classification of rocky Pannonian grasslands and dealpine Sesleria-dominated grasslands in Slovakia (Janišová & Dúbravková, 2010). This approach was also successfully used to classify Molinia grasslands in Poland (Swacha et al., 2016) and the Hyrcanian forest vegetation in Northern Iran (Gholizadeh et al., 2020). A two-level hierarchically nested system was also produced using fuzzy clustering (Wiser & De Cáceres, 2018). However, the two-level classification system needs to be developed into a full hierarchical system because a large number of relevés could not be classified even at the alliance level (Janišová & Dúbravková, 2010). A recently developed classification system for the European marsh vegetation demonstrated the possibility of creating a multilevel ES.
In the present paper, we introduce a hierarchical ES that includes all the basic hierarchical levels of syntaxonomy, that is, associations, alliances, orders, and classes. This complete and hierarchically nested classification system was introduced and tested on the temperate grasslands of the Molinio-Arrhenatheretea class. The novelty of this system is that the definitions of lower-ranked units are formulated in such a way that they match those of inclusive higher-ranked units. Therefore, similar to syntaxa, definitions are also nested. Moreover, definitions are mutually exclusive, thereby reducing the possibility of ambiguities in the results.

Material and Methods
Our data set comprised 89,181 relevés from the Polish Vegetation Database (Kącki & Śliwiński, 2012) and data from the Grasslands in the Polish Carpathians database (Korzeniak, 2016). Species recognized only at the genus level were deleted. Species records from multiple layers were merged and classified into four categories: (i) tree layer, which included trees recorded in the tree layer; (ii) shrub layer, which included shrubs and trees recorded in the shrub layer; (iii) herb layer, which comprised vascular plants recorded in the herb layer (including seedlings and juveniles of woody species); and (iv) moss-lichen layer, which included nonvascular plants and lichens. Relevés with a plot area outside the range of 10 to 100 m² were excluded. The whole data set comprised a total of 2,268 vascular plant species. The term "species" was used in this study for simplicity, including taxa that were merged into aggregates (agg.) and sensu lato (s. l.) that were treated in a broader sense than originally defined or accepted following Kącki et al. (2013).
Formal definitions of vegetation units were created using the Cocktail method (Bruelheide, 1997, 2000). They were created for each principal level of syntaxonomical hierarchy in accordance with the Braun-Blanquet approach (Braun-Blanquet, 1964; Westhoff & van der Maarel, 1980). We used the top-down approach, where the formal definition of a class was first created, followed by defining the order(s), alliance(s), and association(s), in that order. The successive build-up of definitions was done under the condition that a relevé matched by the definition of a low-rank unit in the syntaxonomical hierarchy had to be simultaneously matched by the definition(s) of the higher-rank unit(s). The proposed classification system is referred to as a complete hierarchically nested classification system, because it classifies relevés at each principal level of the Braun-Blanquet system. Formal definitions of higher vegetation units (i.e., class, order, and alliance) were created using a combination of total cover groups (TCGs) of diagnostic species only (Landucci et al., 2015; Willner, 2011), while definitions of associations were created using a combination of TCGs and sociological species groups (SSGs) proposed by Bruelheide (1997) and dominance thresholds for species proposed by Kočí et al. (2003).
Sociologically related species were grouped into syntaxonomical TCGs of different hierarchical levels, except for the association (Appendix S1). Each syntaxonomical TCG is named after the syntaxon it represents: "TCG" followed by the name of the class, the order, or the alliance. A syntaxonomical TCG contains an a priori defined list of diagnostic species (character and differential species) for a given syntaxon. Syntaxonomical TCGs were created under the condition that a single species cannot be listed in more than one syntaxonomical TCG at the same hierarchical level. If a species, for instance, was listed as diagnostic for a given class, it could not be assigned to any other class. However, we allowed species to be shared between at most two vegetation classes, and only if one of the classes represented nonforest vegetation and the other forest vegetation. We compiled diagnostic species into syntaxonomical TCGs using a top-down approach. This means that diagnostic species were first compiled for the class, and then for the order(s) and alliance(s). Species listed in a syntaxonomical TCG at a lower level (i.e., alliance) had to be listed in the TCGs of all superior units, but not vice versa. We compiled diagnostic species for the class and its subordinate units, including orders and alliances, based on expert knowledge and broad phytosociological literature surveys (e.g., Matuszkiewicz, 1984; Oberdorfer, 1977, 1978, 1983). Functionally related species were grouped in functional TCGs. A functional TCG can be any meaningful group of related species, e.g., TCG tree layer comprising trees occurring in the tree layer or TCG neophytes comprising neophytes occurring in Poland. A single species was allowed to be listed in multiple functional TCGs, since, for example, a tree species can simultaneously be a neophyte.
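The two consistency conditions on syntaxonomical TCGs described above (no species shared between TCGs of the same level, and lower-level lists contained in the lists of superior units) can be checked mechanically. The following is a minimal sketch, not the procedure used in the study: the species names (sp_a, sp_b, ...) are placeholders, the group contents are hypothetical rather than taken from Appendix S1, and the exception for species shared between a forest and a nonforest class is not modelled.

```python
from collections import defaultdict


def find_shared_species(tcgs):
    """Return species listed in more than one TCG of the same level.

    tcgs maps a TCG name to a set of species names.
    """
    owners = defaultdict(list)
    for group_name, species in tcgs.items():
        for sp in species:
            owners[sp].append(group_name)
    return {sp: sorted(names) for sp, names in owners.items() if len(names) > 1}


def is_nested(lower, higher):
    """True if every species of the lower-level TCG is also in the higher-level TCG."""
    return set(lower) <= set(higher)


# Hypothetical, heavily shortened species lists (placeholders only).
class_level = {
    "TCG Molinio-Arrhenatheretea": {"sp_a", "sp_b", "sp_c", "sp_d"},
    "TCG Phragmito-Magnocaricetea": {"sp_m", "sp_n"},
}
order_tcg = {"sp_a", "sp_b", "sp_c"}   # a hypothetical order within the class
alliance_tcg = {"sp_a", "sp_b"}        # a hypothetical alliance within that order

conflicts = find_shared_species(class_level)
nested_ok = is_nested(alliance_tcg, order_tcg) and is_nested(
    order_tcg, class_level["TCG Molinio-Arrhenatheretea"]
)
print(conflicts, nested_ok)
```

An empty conflict dictionary together with a true nestedness check means the hypothetical lists satisfy both conditions.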
Syntaxonomical and functional TCGs are considered present in a relevé if the summed cover of species from the group exceeds a specified cover threshold (Landucci et al., 2015). SSG is a group of co-occurring species with similar habitat preferences. The SSGs were created based on interspecific associations. The SSGs were acquired from the national classification of vegetation to higher syntaxonomical units (Kącki et al., 2013).
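As an illustration of the presence rule for TCGs, the sketch below sums the percentage covers of group members in a relevé and compares the sum with a threshold. It assumes that a relevé is stored as a mapping from species name to percentage cover, that covers are simply added, and that the comparison is "greater than or equal to"; the function names, species names, and values are hypothetical, and the JUICE program may combine covers differently.

```python
def summed_cover(releve, group):
    """Sum the percentage covers of all species of the group found in the relevé."""
    return sum(cover for species, cover in releve.items() if species in group)


def tcg_present(releve, group, threshold):
    """A TCG counts as present when its summed cover reaches the threshold."""
    return summed_cover(releve, group) >= threshold


# Hypothetical relevé and group (placeholder species names).
releve = {"sp_a": 20.0, "sp_b": 15.0, "sp_x": 40.0}
group = {"sp_a", "sp_b", "sp_c"}

print(summed_cover(releve, group))      # sp_a + sp_b
print(tcg_present(releve, group, 35))
```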
A baseline for the classification of relevés to the class Molinio-Arrhenatheretea was the original assessment of the authors in the respective data set. In order to establish a threshold for a given TCG in the cluster representing the high-rank syntaxonomical unit, we computed the summed cover of species belonging to this group for each relevé assigned to the cluster. For example, the threshold for TCG Molinio-Arrhenatheretea was set at ≥35%, which was the minimum summed cover of species from this group found among the relevés originally classified by their authors to the class. Once the formula of the class Molinio-Arrhenatheretea was created, it was used to extract data for further classification at the level of orders, alliances, and associations. In order to recognize diversity within the class, we used TWINSPAN (Hill, 1979) and modified TWINSPAN (Roleček et al., 2009). Based on the interpretation of the results (clusters) derived from clustering algorithms, we successively created formal definitions for syntaxonomical units.
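The way a cover threshold follows from author-assigned relevés can be sketched as taking the minimum summed cover of the group over those relevés. This is only an illustration under simplifying assumptions: relevés are mappings from species name to percentage cover, covers are simply added, and all names and values are hypothetical.

```python
def derive_threshold(releves, group):
    """Minimum summed cover of the group over relevés assigned to the unit."""
    return min(
        sum(cover for species, cover in releve.items() if species in group)
        for releve in releves
    )


# Hypothetical relevés that their authors assigned to the same unit.
group = {"sp_a", "sp_b", "sp_c"}
assigned = [
    {"sp_a": 30.0, "sp_b": 25.0, "sp_x": 10.0},   # group cover 55
    {"sp_a": 20.0, "sp_c": 15.0},                 # group cover 35
    {"sp_b": 60.0, "sp_y": 5.0},                  # group cover 60
]

threshold = derive_threshold(assigned, group)
print(threshold)
```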
The formal definition of a given vegetation unit was composed of TCGs and SSGs combined by the logical operators AND, OR, and NOT (Bruelheide, 1997). Each formal definition was composed of a positive part and a negative part, separated by the logical operator NOT. An example of nestedness of formal definitions is presented in Appendix S2. The positive part of the definition sets the required cover threshold of a given TCG(s) and, in the case of associations, also the SSGs that should be present in the relevé. The required summed percentage cover of species from the TCG was set by the operator greater than or equal to (GE). The SSG was considered to be present in a relevé if more than half of the members of the group were present in the relevé.
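The structure of such a definition (a positive part, the operator NOT, and a negative part) can be sketched as follows. This is a simplified illustration, not the formal language of the ES: the positive part is treated as a conjunction (AND) of conditions and the negative part as a disjunction (OR), and all groups, species names, and thresholds are hypothetical.

```python
def summed_cover(releve, group):
    """Sum the percentage covers of the group's species in the relevé."""
    return sum(cover for species, cover in releve.items() if species in group)


def tcg_ge(group, threshold):
    """Condition: summed cover of the TCG is greater than or equal to the threshold (GE)."""
    return lambda releve: summed_cover(releve, group) >= threshold


def ssg_present(members):
    """Condition: more than half of the SSG members occur in the relevé."""
    def check(releve):
        hits = sum(1 for species in members if releve.get(species, 0) > 0)
        return hits > len(members) / 2
    return check


def definition(positive, negative):
    """Build 'positive NOT negative': all positive conditions hold, no negative one does."""
    return lambda releve: (
        all(cond(releve) for cond in positive)
        and not any(cond(releve) for cond in negative)
    )


# Hypothetical groups and thresholds (placeholders only).
TCG_X = {"sp_a", "sp_b", "sp_c"}
TCG_Y = {"sp_m", "sp_n"}
SSG_1 = {"sp_a", "sp_d", "sp_e"}

rule = definition(
    positive=[tcg_ge(TCG_X, 35), ssg_present(SSG_1)],
    negative=[tcg_ge(TCG_Y, 25)],
)

releve_1 = {"sp_a": 30, "sp_b": 10, "sp_d": 5, "sp_m": 3}
releve_2 = {"sp_a": 30, "sp_b": 10, "sp_d": 5, "sp_m": 20, "sp_n": 10}
releve_3 = {"sp_a": 40, "sp_b": 10}

print(rule(releve_1), rule(releve_2), rule(releve_3))
```

In this sketch the first relevé satisfies the definition, the second is rejected by the negative part, and the third fails because only one of the three SSG members is present.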
The negative part of the definition is composed of TCGs or SSGs that must be absent from the relevé. The definition of a vegetation unit at a given syntaxonomical level was created using hierarchically equivalent syntaxonomical TCGs. This means that to create a formal definition of a class, only TCGs at the class level (e.g., TCG Molinio-Arrhenatheretea, TCG Phragmito-Magnocaricetea, TCG Scheuchzerio-Caricetea) were used in the positive and negative parts of the definition. The entire definition of the class was used as an obligatory part of the definition of the nearest subordinate units in the hierarchy, that is, at the level of order. Analogously, the entire definition of the order (including the definition of the class) was used to create definitions of the next lower-ranked syntaxonomical level, that is, the alliance. This procedure was repeated until the lowest-ranked units in the hierarchy, the associations, were created. This conservative method of constructing formal definitions ensured that relevés matched by the definitions of low-rank syntaxonomical units were also matched by the definitions of the superior units. Definitions of vegetation units were gathered into a computer system called an expert system (ES) (Appendix S1). The ES classified relevés in such a way that a relevé assigned to an association was also matched by the definition(s) of all superior syntaxonomical levels, without exception. The syntaxonomical level to which a relevé was assigned depended on the presence and cover of diagnostic species in that relevé. A relevé containing diagnostic species only for the Molinio-Arrhenatheretea class, but not for the lower-rank units, could only be classified at the class level. The ES can be run in the JUICE program (Tichý, 2002). For a detailed description of the structure and formal language of the ES functions used in the JUICE program, see Tichý et al. (2019).
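The nesting of definitions can be illustrated with a small sketch in which each unit carries its own condition plus a reference to its superior unit, so that a unit matches only if all superior units match as well. This is not the ES itself or the JUICE formal language: the unit names follow the example named in Appendix S2, the class threshold of 35% is taken from the text, and the remaining thresholds, the keys of the relevé record, and the association condition are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional

RANKS = ["class", "order", "alliance", "association"]


@dataclass
class Unit:
    name: str
    rank: str
    own_rule: Callable[[dict], bool]
    parent: Optional["Unit"] = None

    def matches(self, releve):
        """A unit matches only if every superior unit matches too (nestedness)."""
        if self.parent is not None and not self.parent.matches(releve):
            return False
        return self.own_rule(releve)


def classify(releve, units):
    """Return the matched unit names per rank and the lowest rank reached."""
    matched = {}
    for unit in units:
        if unit.matches(releve):
            matched.setdefault(unit.rank, []).append(unit.name)
    lowest = next((rank for rank in reversed(RANKS) if rank in matched), None)
    return matched, lowest


# A relevé is represented here by precomputed summed covers of hypothetical TCGs.
cls = Unit("Molinio-Arrhenatheretea", "class", lambda r: r.get("class_tcg", 0) >= 35)
order = Unit("Arrhenatheretalia", "order", lambda r: r.get("order_tcg", 0) >= 20, cls)
alliance = Unit("Arrhenatherion", "alliance", lambda r: r.get("alliance_tcg", 0) >= 10, order)
assoc = Unit("Pastinaco-Arrhenatheretum", "association",
             lambda r: r.get("assoc_ssg", False), alliance)
units = [cls, order, alliance, assoc]

full = {"class_tcg": 60, "order_tcg": 30, "alliance_tcg": 15, "assoc_ssg": True}
partial = {"class_tcg": 60, "order_tcg": 30, "alliance_tcg": 5, "assoc_ssg": True}

print(classify(full, units)[1])     # lowest rank reached by the first relevé
print(classify(partial, units)[1])  # lowest rank reached by the second relevé
```

The second relevé has the association's own condition satisfied but fails the alliance condition, so it is placed at the order level only; this mirrors the rule that a lower-ranked unit cannot be matched without its superior units.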
We assigned all relevés in the data set to vegetation units using the ES introduced above, and then described the syntaxa using the lists of diagnostic, constant, and dominant species as well as the synoptic table (Appendix S3, Appendix S4). Constancy and fidelity values for species were computed based on a geographically stratified data set in order to mitigate the effects of oversampling of some regions relative to others (Knollová et al., 2005). A maximum of three relevés found in a stratum of ca. 2 km² and assigned to the same syntaxon were randomly selected. Groups containing fewer than 10 relevés were not subjected to geographical stratification. As a measure of fidelity, we used the phi coefficient (Chytrý et al., 2002). Each species with phi ≥ 0.20 was considered diagnostic. The significance of species occurrence patterns was tested by Fisher's exact test after virtual standardization of groups to equal size (Tichý & Chytrý, 2006). Diagnostic species were determined by analyzing the hierarchically equivalent syntaxa. In other words, to determine the diagnostic value of species for associations, we analyzed only groups comprising relevés that matched the definition of associations. This scheme was applied for each hierarchical level, including associations, alliances, and orders, except the class, whose diagnostic species were determined by analyzing relevés that matched the definition of the class Molinio-Arrhenatheretea and the set of relevés representing nonforest vegetation in the Polish Vegetation Database, that is, relevés without a shrub or tree layer.
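For orientation, the phi coefficient of association between a species and a group of relevés can be computed from four counts. The sketch below uses the commonly cited form of the coefficient and does not apply the virtual standardization of group sizes used in the study; the counts in the example are hypothetical.

```python
import math


def phi_coefficient(n_total, n_group, n_species, n_species_in_group):
    """Phi coefficient of association between a species and a relevé group.

    n_total            -- number of relevés in the data set (N)
    n_group            -- number of relevés in the target group (Np)
    n_species          -- number of relevés containing the species (n)
    n_species_in_group -- number of relevés in the group containing the species (np)
    """
    denominator = math.sqrt(
        n_species * n_group * (n_total - n_species) * (n_total - n_group)
    )
    if denominator == 0:
        return 0.0  # undefined case (e.g., species in all or no relevés)
    return (n_total * n_species_in_group - n_species * n_group) / denominator


# Hypothetical counts: 100 relevés, 20 in the group, species in 30 relevés,
# 12 of which belong to the group.
phi = phi_coefficient(100, 20, 30, 12)
is_diagnostic = phi >= 0.20   # threshold used in the text
print(round(phi, 3), is_diagnostic)
```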

Results
The ES consisted of formal definitions for different hierarchical levels that were created using 32 syntaxonomical TCGs, eight functional TCGs, and 35 SSGs (Appendix S1). Additionally, we used cover thresholds for 27 dominant species. The Molinio-Arrhenatheretea class was divided into three orders comprising nine alliances and 45 associations. A total of 11,535 relevés were assigned to the class Molinio-Arrhenatheretea by applying the formal definition of the class to the geographically stratified data set. The Molinio-Arrhenatheretea data set included a total of 1,226 vascular plant species. The proportion of relevés from the Molinio-Arrhenatheretea data set classified into subordinate units decreased with descending position in the hierarchical system, that is, from order to association (Table 1). A total of 36% of the data set classified to the Molinio-Arrhenatheretea class was recognized at the association level. For the alliance and order level, the proportion of classified relevés was 57% and 85% of the Molinio-Arrhenatheretea data set, respectively. The number of relevés representing individual associations constituted a small portion of the data classified to their superior units. Delimited vegetation units at each level of the hierarchical system were characterized by the composition of diagnostic, constant, and dominant species (Appendix S3, Appendix S4). Distinct patterns of geographical distribution were clearer at the association level than at the higher-rank vegetation units, that is, alliance or order (Figure 1, Appendix S3).

Methodological Approach
We presented a hierarchically nested classification system for Polish grasslands. The nested classification was achieved by creating formal definitions for vegetation units with the assumption that the formal definition of a unit includes definitions of the superior units. The logical formulas were gathered in the ES and used for classification of the initial data set. Using this protocol, we attempted to reflect the classification of meadow vegetation described in Central and Eastern Europe (Botta-Dukát et al., 2005;Chytrý et al., 2007;Ellmauer, 1993;Janišová et al., 2007;Kuzemko, 2016;Matuszkiewicz, 2006;Rodríguez-Rojo et al., 2017). The major advantage of the proposed ES for automatic vegetation classification is that it meets the theoretical fundamentals of vegetation classification of the Braun-Blanquet approach because of its hierarchical character and nestedness of vegetation units (Westhoff & van der Maarel, 1980). The association is the basic entity in the system. In the proposed multilevel ES, associations are, however, unavoidably affected by how higher units are defined. In syntaxonomy, there is a continuous problem of assigning the association to the right high-ranked unit. Anthoxantho-Agrostietum is an example of an association with ambiguous position in syntaxonomy because it has been assigned by various authors to either Cynosurion or Arrhenatherion (Chytrý et al., 2007;Uhliarová et al., 2014). The proposed system is nevertheless easy to modify by replacing the part of the definition corresponding to the higher-ranked unit while maintaining the focal part, i.e., SSGs and dominance threshold.
Our ES applies to the area for which it was defined. Polish grassland vegetation is, however, strongly related to other Central European grasslands. Therefore, the proposed protocol for automatic classification might be used outside Poland after considering the adjustments to species groups due to changing interspecific associations in phytosociological data sets along the geographical gradient (Kuželová & Chytrý, 2004). We expect these modifications to be slight, as many of the species we considered diagnostic have a wide geographic distribution and well-defined diagnostic value, at least in Central Europe (Oberdorfer, 1978).
The proposed ES allows for unambiguous classification of vegetation-plot data at each level of the syntaxonomical hierarchy and thus for better discrimination of plant communities. The majority of the previously proposed ES for classification of various types of vegetation enabled the assignment of relevés at one level only, most often at the association or alliance level. There are only a few examples of ES that were created for classification of at least two hierarchical levels (Gholizadeh et al., 2020; Janišová & Dúbravková, 2010; Landucci et al., 2020; Swacha et al., 2016). An ES that includes definitions of syntaxa representing all principal hierarchical levels is needed because a relatively large portion of vegetation plots cannot be classified at the association level. It was reported that approximately 50% of relevés could not be assigned at the association level in the vegetation survey of the Czech Republic. In the hierarchically nested classification, approximately 50% of the relevés matched by the alliance definition could not be assigned to any association (Janišová & Dúbravková, 2010; Swacha et al., 2016). In our study, the proportion of relevés classified at the association level was markedly lower than the proportion of relevés classified at the level of higher-rank units, the alliance and order. The results showed that 36% of the data set matching the definition of the class Molinio-Arrhenatheretea were classified at the association level, 57% were classified at the alliance level, and 85% were classified at the order level. On the one hand, the results reflect the structure of the data set and all components of the ES, that is, groups of species and formal definitions. On the other hand, this indicates that plant communities corresponding to the association are rather rare.
The proportion of relevés classified at the association level in relation to higher-rank syntaxa (Table 1) was strongly determined by the concept of association adopted in this study. Although the association is the basic syntaxonomical unit, it can be defined as either a narrow or broad unit. In our study, associations were defined by Cocktail definitions as narrow units and should therefore be considered, to a large extent, examples of well-defined reference units. The results of our and other studies suggest that most of the vegetation stands documented with relevés correspond to high-ranked abstraction levels due to the lack of diagnostic species for associations, the high level of stochasticity in species composition, and the vulnerability of species composition to the intensity, type, or lack of human activity (Swacha et al., 2018). The introduced ES assigns a relevé with an insufficient representation of the diagnostic species of an association to the corresponding alliance. If a relevé lacks the required representation of diagnostic species for an alliance, it will be assigned to either the order or the class.

Syntaxonomical Remarks
Using the Cocktail method, we recognized within the Molinio-Arrhenatheretea class three orders, nine alliances, and 45 associations (Table 1). The proposed classification system generally complies with the recent European hierarchical system of syntaxonomical units (Mucina et al., 2016). However, we did not recognize several units. One of them is the order Poo alpinae-Trisetetalia, which includes high-altitude mesic hay meadows and pastures in the European mountains. The main reason for not recognizing this order is that the high mountain meadow vegetation has too few diagnostic species to separate it from the closely related Arrhenatheretalia order. We also found it impossible to delimit the order Filipendulo ulmariae-Lotetalia uliginosi comprising tall-herb wet meadows. The Veronico longifoliae-Lysimachion vulgaris and Mentho longifoliae-Juncion inflexi alliances could not be delimited, and both were included in the alliance Filipendulion ulmariae. In Mucina et al. (2016), these tall-herb meadow fringes were considered three separate alliances. We were unable to follow this concept due to the high floristic similarity of these units and the overlap of diagnostic species. The classification of tall-herb vegetation is unclear in Europe. Veronico-Lysimachion and Filipendulion were included in the alliance Calthion and order Molinietalia in vegetation surveys of the Czech Republic and Slovakia (Chytrý et al., 2007; Janišová et al., 2007). We excluded from the class Molinio-Arrhenatheretea plant communities dominated by the genus Petasites, which are classified to the alliance Filipendulo-Petasition because they are usually found along ditches, forest edges, and eventually on abandoned wet grasslands. Although these communities are floristically related to wet meadows, they do not represent man-managed grassland ecosystems. This vegetation is suggested to be classified to the Mulgedio-Aconitetea or Epilobietea angustifolii class (Mucina et al., 2016).
We did not recognize two alliances of the Arrhenatheretalia order reported by Mucina et al. (2016), namely Phyteumato-Trisetion representing mesic mown meadows in the submontane and montane regions of Central Europe and Alchemillo-Ranunculion repentis representing communities of trampled, low grasslands in montane belts. These two alliances were excluded due to a lack of diagnostic species. Our data set did not contain sufficient data to recognize the alliance Anthrisco-Arrhenatherion (Rodríguez-Rojo et al., 2017). However, we found communities related to Tanaceto-Arrhenatheretum, which we included in the Arrhenatherion alliance. Thus, in this study, we presented the traditional concept of the order Arrhenatheretalia with four alliances, which is in accordance with Dierschke (1999). There are many discrepancies between the proposed classification system and previous vegetation surveys of the Molinio-Arrhenatheretea class in Poland, particularly in relation to the number of associations (Matuszkiewicz, 1984, 2006). Our classification system, on the other hand, strongly agrees with more recent vegetation surveys in neighboring countries (Chytrý et al., 2007; Janišová et al., 2007; Rozbrojová et al., 2010), with the exception of Triseto-Polygonion and Poion alpinae. The differences between former systems published in Poland and the one presented here concern mainly associations. For example, the diversity of the Triseto-Polygonion alliance was poorly recognized in Poland. In our study, we delimited six associations, while only two associations, Phyteumo orbicularis-Trifolietum and Meo-Festucetum, were reported previously (Matuszkiewicz, 2006). Gladiolo-Agrostietum was formerly classified to the alliance Arrhenatherion, but the high share of mountain species places this association within the Triseto-Polygonion. In contrast, we have not found data representing Phyteumo orbicularis-Trifolietum reported from the Tatra Mountains (Balcerkiewicz, 1978).
We distinguished the subalpine alliance Poion alpinae with one association, Alchemilletum pastoralis, which was reported from the Tatra Mountains at the beginning of the twentieth century (Szafer et al., 1927). This association was recently reported in the Western and Eastern Carpathians in Poland (Hegedüšová Vantarová, 2014). The alliance Calthion is the most diverse within the class Molinio-Arrhenatheretea (Kucharski & Michalska-Hejduk, 1994; Matuszkiewicz, 2006; Trąba & Wolański, 2012). We reported 12 associations, with the most substantial changes detected in mountain wet meadows in the Sudetes (Krahulec et al., 1996; Kwiatkowski, 2011). The alliance Filipendulion is well recognized in Poland (Kucharski, 1999). We found six associations, of which Phragmiti-Euphorbietum is reported for the first time in Poland. The alluvial meadows with a single association were described by Załuski (1995). We identified two more associations, Lathyro-Gratioletum and Poo-Alopecuretum, both classified to the alliance Deschampsion cespitosae (Mucina et al., 2016). Temporarily flooded and heavily grazed nutrient-rich pastures are poorly represented in the data set; thus, the classification of this vegetation should be considered preliminary. We delimited five associations in the Potentillion anserinae alliance.
This paper proposes a classification system for the Molinio-Arrhenatheretea class. The system is limited by the insufficient representation of data from some vegetation types and regions of Poland. The proposed hierarchically nested system for classification is important from a practical point of view because many syntaxa correspond to protected habitat types, such as those recognized in the European Habitat Directive.

Conclusions
Our study shows that using formal definitions of vegetation units, it is possible to create a hierarchically nested system of vegetation classification and unequivocally classify grassland communities at different syntaxonomical levels. The ES uses the summed cover and presence/absence information of species groups, defined according to habitat preference and functional characteristics, and that of individual species. It is particularly important to create definitions for different syntaxonomical levels because the majority of vegetation patches do not match the association, but can only be assigned to the high-rank units.

Supporting Material
The following supporting material is available for this article: • Appendix S1. Expert system for automatic classification of mesic and wet grasslands. • Appendix S2. Example of hierarchical nestedness of the association Pastinaco-Arrhenatheretum. • Appendix S3. Numeric and textual description of delimited vegetation units, and maps of the distribution of relevés representing syntaxonomical units. • Appendix S4. Synoptic table.