The segetal flora of Italy: an occurrence dataset from relevés in winter cereals and allied crop types

Abstract
The segetal flora of winter crops consists mostly of native or archaeophyte annual species that are often strong specialists of their habitats. Threatened by the intensification of agriculture, segetal flora is particularly valuable from the perspectives of biodiversity conservation and evolution. Moreover, it contributes to maintaining biodiversity in agroecosystems and provides several ecosystem services. The dataset described here was set up to provide the first inventory of the segetal flora of Italian winter cereal crops and allied crop types, the latter including flax and autumn-sown legumes. It includes 24,676 georeferenced occurrence records deriving from 1,240 floristic and phytosociological relevés. The data were collected across most of the Italian territory, over a period spanning from 1946 to 2018.


Introduction
The concept of "weed" is very subjective, as any plant that interferes with human activities can be considered as such, implying the existence of agricultural weeds, environmental weeds, ruderal weeds and many others. Weeds of arable land are almost exclusively annual and are called "agrestals" or "segetals" (Holzner 1982). For decades, they have been negatively affected by the intensification of agriculture all over Europe (Storkey et al. 2012; Richner et al. 2015; Janssen et al. 2016; Pannacci et al. 2017; Woźniak 2020). In recent years, many studies have highlighted the ecological and agronomic benefits of these species in agricultural systems, including the provision of ecosystem services, such as support to biodiversity, storage of crop genetic resources, pest regulation and soil protection (Hammer et al. 1997; Marshall et al. 2003; Storkey and Neve 2018). At middle and high latitudes, segetal plants can be divided into two main groups according to their phenology, the latter depending on the crops they colonise: species of winter crops, like wheat, and species of summer crops, like maize. In Europe, winter-annual crops host mostly native or archaeophyte segetal species, which are often strong specialists of these habitats (Lososová et al. 2004; Abbate et al. 2013; Nowak et al. 2015; Latini et al. 2020). Several anecophytes are present amongst them: "homeless weeds" without a natural habitat, which recently evolved under the pressure of agriculture and developed biological and ecological features similar to those of crop species (Zohary 1962; McElroy 2014). For all these reasons, the segetal flora of winter cereal crops has a particular value from the perspectives of biodiversity conservation and evolution.
The dataset presented here is available in GBIF (Fanfarillo et al. 2020a) and includes 24,676 records. Of these, 2,878 were newly acquired through field sampling and 21,798 were retrieved from the literature. The dataset was set up to define the first inventory of the segetal flora of Italian winter cereal crops and allied crop types (from here on simply "segetal flora"), i.e. flax and autumn-sown legumes (Fanfarillo et al. 2020b). It is the first contribution by the Laboratory of Systematic Botany and Floristics, Department of Environmental Biology, Sapienza University of Rome to GBIF, which approved it as a data editor in March 2020 (responsible person: Mauro Iberite; technical contact: Marta Latini).
In light of the above, this paper describes and presents this recently released dataset and provides information on its usefulness and possible future applications.

Project title
Plant biodiversity in traditional agroecosystems of Italy: a floristic and ecological multi-scale analysis based on geodatabases.

Project data
The disappearance of traditional agroecosystems and the consequent biodiversity loss due to changes in agriculture are receiving increasing attention in Europe. Databases on the taxonomy, distribution, ecology and functional traits of plants are of crucial importance for conservation actions. The need to improve monitoring and reporting activities by improving the quality of biodiversity data is also underlined by the European Biodiversity Strategy. This project aimed to carry out a comprehensive analysis of the plant diversity of the traditional agroecosystems of Italy, knowledge of which is currently lacking, by means of the collection, digitisation and processing of original and archival data. The proposed actions concerned: the preparation of thematic databases on segetal flora and vegetation, including the features of plant species and communities; the analysis of data at different spatial and temporal scales; the production of thematic maps on plant diversity and its related topics; the development of new methods to estimate the nature value of agroecosystems; the detection of bio-indicator plant species for floristic richness, agricultural intensity and environmental quality. Special attention was given to winter arable plants and communities, currently at high risk of disappearance in Europe. The results provide a basis for future research, with special regard to the definition of conservation strategies for plant diversity in European rural areas.

Methods
The occurrence data were retrieved through extensive literature searches and intensive field sampling, the latter being carried out across most of Italy to fill the knowledge gaps in some geographic areas. Literature data were selected using a habitat-based criterion: only the records for taxa unambiguously reported to grow in winter cereals, flax and autumn-sown legumes were collected. Consequently, all the records with no information or only generic information on the growing habitat (e.g. "fields" or "cultivated land") were excluded. Likewise, records of taxa identified only to genus or a higher rank, doubtful identifications, nomenclatural ambiguities and crop species were not considered. The bibliographic source of each record is available from the authors upon request.
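The habitat-based selection described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the field names (`habitat`, `rank`, `doubtful`, `is_crop`), the habitat labels and the list of excluded ranks are hypothetical and do not reproduce the authors' actual workflow.

```python
# Hypothetical labels and field names, chosen for illustration only.
ACCEPTED_HABITATS = {"winter cereals", "flax", "autumn-sown legumes"}
EXCLUDED_RANKS = {"genus", "family", "order", "class"}  # genus or higher


def keep_record(record):
    """Return True if a literature record passes the habitat-based criterion."""
    habitat = (record.get("habitat") or "").strip().lower()
    if habitat not in ACCEPTED_HABITATS:
        return False  # missing or generic habitat, e.g. "fields"
    if record.get("rank") in EXCLUDED_RANKS:
        return False  # taxon not identified below genus level
    if record.get("doubtful") or record.get("is_crop"):
        return False  # doubtful identification or crop species
    return True


# Made-up example records, not taken from the dataset.
records = [
    {"taxon": "Papaver rhoeas", "habitat": "Winter cereals", "rank": "species"},
    {"taxon": "Papaver rhoeas", "habitat": "fields", "rank": "species"},
    {"taxon": "Vicia", "habitat": "flax", "rank": "genus"},
    {"taxon": "Triticum aestivum", "habitat": "winter cereals",
     "rank": "species", "is_crop": True},
]
kept = [r for r in records if keep_record(r)]
print([r["taxon"] for r in kept])
```

Matching habitats against an explicit allow-list, rather than excluding known generic terms, means that any unforeseen vague wording is rejected by default.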
All the occurrence data were georeferenced. Geographic coordinates (decimal latitude and decimal longitude), geodetic datum and a coordinate uncertainty value were attributed to each record. The geographic coordinates were attributed manually, based on the descriptions of the relevé location provided in the original source. If coordinates were already available but expressed in a different datum, they were converted to the WGS84 geodetic datum. The uncertainty of geographic positions was estimated according to the 9-degree scale defined by Murphey et al. (2004) and then converted into metres, as requested by GBIF (1, 100, 500, 1000, 5000, 10,000, 50,000 m or accordingly higher, if only the administrative region/country were given for data, following the same method used in Küzmič et al. 2020). Georeferencing historical data was often challenging due to vague information on the collection place or to the report of non-localisable toponyms. In these cases, the records were georeferenced as accurately as possible on a wider scale (e.g. the "comune" when the reported locality within the "comune" could not be identified).
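As a minimal sketch of how the georeferencing attributes of one record might be assembled and checked, the Python function below uses the Darwin Core term names `decimalLatitude`, `decimalLongitude`, `geodeticDatum` and `coordinateUncertaintyInMeters`. The function name, the validation rules and the example coordinates are assumptions made for illustration; the correspondence between the degrees of the Murphey et al. (2004) scale and the values in metres is not reproduced here.

```python
# Uncertainty values in metres listed in the text; larger values are
# allowed when only the region or country is known.
LISTED_UNCERTAINTIES_M = (1, 100, 500, 1000, 5000, 10000, 50000)


def georeferenced_fields(lat, lon, uncertainty_m):
    """Build the georeferencing fields of one record (hypothetical helper)."""
    if not (-90 <= lat <= 90 and -180 <= lon <= 180):
        raise ValueError("coordinates out of range")
    if uncertainty_m not in LISTED_UNCERTAINTIES_M and uncertainty_m <= 50000:
        raise ValueError("uncertainty is not one of the listed values")
    return {
        "decimalLatitude": lat,
        "decimalLongitude": lon,
        "geodeticDatum": "WGS84",
        "coordinateUncertaintyInMeters": uncertainty_m,
    }


# Made-up example: a relevé located only to within about 5 km.
fields = georeferenced_fields(41.9, 12.5, 5000)
print(fields)
```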

Taxonomic coverage
Most of the records belong to the class Magnoliopsida (20,307 records; 82% of the total), followed by Liliopsida (4,208 records; 17%) and Polypodiopsida (117 records; 0.5%). Although this classification is outdated in the light of the most recent results summarised by the APG (Stevens 2001 onwards), the GBIF Backbone Taxonomy requires that it be followed (GBIF Secretariat 2019).
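The shares reported above can be re-derived from the record counts with a short Python check. The counts are those given in the text; the variable names are illustrative, and the remainder is computed on the assumption that every record is assigned to exactly one class.

```python
TOTAL_RECORDS = 24676
class_counts = {
    "Magnoliopsida": 20307,
    "Liliopsida": 4208,
    "Polypodiopsida": 117,
}

# Percentage of the total contributed by each class.
shares = {name: 100 * n / TOTAL_RECORDS for name, n in class_counts.items()}

# Records not accounted for by the three classes named in the text.
other = TOTAL_RECORDS - sum(class_counts.values())

for name, pct in shares.items():
    print(f"{name}: {pct:.1f}%")
print(f"Records in other classes: {other}")
```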

Temporal coverage
The dataset includes species occurrences recorded from 1946 to 2018 (Fig. 2). Most of the records were collected in the 1970s, 1990s and 2010s. The date of collection is available for 84% of the data (20,703 records). As expected, the dataset is highly seasonal: most of the occurrences were recorded in spring and early summer. In decreasing order, the months with the most occurrences are June, May, July and April (Fig. 3).
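A monthly ranking of this kind can be obtained by tallying the month of each dated record. The sketch below assumes that collection dates are stored as full ISO 8601 strings (`YYYY-MM-DD`) and that undated records carry `None`; the function name and the example dates are made up and are not taken from the dataset.

```python
from collections import Counter
from datetime import date


def month_ranking(event_dates):
    """Return month numbers ordered from most to least frequent."""
    counts = Counter(
        date.fromisoformat(d).month for d in event_dates if d
    )
    return [month for month, _ in counts.most_common()]


# Made-up dates; None marks a record without a collection date.
event_dates = [
    "2015-06-10", "2015-06-12", "1978-05-20", None,
    "1994-06-01", "1994-05-03", "2016-07-02",
]
ranking = month_ranking(event_dates)
dated_share = 100 * sum(1 for d in event_dates if d) / len(event_dates)
print(ranking, f"{dated_share:.0f}% of records dated")
```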

Interest and use of the dataset
The "Segetal flora of Italy" dataset was the basis for the definition of the first inventory of the segetal flora of Italian winter cereal crops and allied crop types (Fanfarillo et al. 2020b). The latter is one of the first of its kind for European countries, following the French one (Aboucaya et al. 2000; Cambecédes et al. 2012). Part of the stored data was used to highlight the influence of geo-environmental factors and the patterns of co-occurrence of rare and threatened arable species in winter arable plant communities of mainland Italy (Fanfarillo et al. 2020c, d). Moreover, another subset of the data contributed, in the form of vegetation plots, to the establishment of the European Weed Vegetation Database (Küzmič et al. 2020). Besides GBIF, the occurrences stored in the "Segetal flora of Italy" dataset will also be stored in other important biodiversity data repositories, such as the Italian Wikiplantbase #Italia (… onwards; Dipartimento di Biologia, Università di Pisa 2020).