Waarnemingen.be – Plant occurrences in Flanders and the Brussels Capital Region, Belgium

Abstract
Waarnemingen.be - Plant occurrences in Flanders and the Brussels Capital Region, Belgium is a species occurrence dataset published by Natuurpunt. The dataset contains almost 1.2 million plant occurrences of 1,222 native vascular plant species, mostly recorded by volunteers (citizen scientists), mainly since 2008. The occurrences are derived from the database http://www.waarnemingen.be, hosted by Stichting Natuurinformatie and managed by the nature conservation NGO Natuurpunt. Together with the datasets Florabank1 (Van Landuyt and Brosens 2017) and the Belgian IFBL (Instituut voor Floristiek van België en Luxemburg) Flora Checklists (Van Landuyt and Noé 2015), the dataset represents the most complete overview of indigenous plants in Flanders and the Brussels Capital Region.

General description
Purpose: Plants have a long history of being recorded by both amateur and professional botanists, and volunteer data from amateur botanists have always been an important source of plant distribution data. The atlas of Flanders and the Brussels Capital Region (Van Landuyt et al. 2006) was based on the teamwork of many volunteer botanists, NGOs, scientific institutes and governmental organisations. Since Natuurpunt, the largest nature conservation NGO in Flanders, Belgium, launched the web portal www.waarnemingen.be in 2008, the number of plant observations in Flanders and the Brussels Capital Region has risen sharply. Besides IFBL mapping and project-related observations, the database is easy to use for occasional observations and can be used for monitoring (wildlife) areas. Old notebooks and reports were also screened and stored in the database (Steeman et al. 2012). A team of specialized validators motivates inexperienced observers and validates observations. Here we publish these records at an IFBL (Instituut voor Floristiek van België en Luxemburg) grid cell resolution of 4 × 4 km².
General taxonomic coverage description: The dataset contains 1,222 native vascular plant (Plantae) species (as well as a number of subspecies, varieties, forms, hybrids and multispecies) recorded in Flanders and the Brussels Capital Region. This includes angiosperms (flowering plants), gymnosperms, ferns and allies, but not algae, mosses and lichens. If the observer remarked that a specific individual of a native plant was introduced by man, this is recorded in the field establishmentMeans. The number of records (observations) per plant species is shown in Fig. 1 and the 10 most frequently recorded species are shown in Table 1.

Spatial coverage
General spatial coverage: Flanders and the Brussels Capital Region (Fig. 2). These regions are situated in the north of Belgium and cover an area of 13,522 km² and 162 km² respectively (13,684 km² in total or 45% of the Belgian territory). Flanders is largely covered by agricultural land (51%), urban areas (30%) and woodland (10%) while the Brussels Capital Region mainly consists of urban areas (73%), woodland (12%) and other green areas (10%) (Vriens et al. 2011). All occurrence data are generalized to IFBL grid cells of 4 × 4 km² (Fig. 3), with the grid codes indicated in the field verbatimCoordinates. The WGS84 centroids of these grid cells are calculated in decimalLatitude/Longitude with a coordinateUncertaintyInMeters of 2,828 meters (using Wieczorek et al. 2004).
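The stated coordinateUncertaintyInMeters matches the distance from the centre of a 4 km square to one of its corners, i.e. half the diagonal. The following Python check of that arithmetic is a minimal sketch, assuming this is how the value was obtained (the source only cites Wieczorek et al. 2004 for the method; the function name is illustrative and not part of the dataset).

```python
import math


def centroid_uncertainty_m(cell_side_m: float) -> float:
    """Distance from the centre of a square cell to a corner (half the diagonal)."""
    half_side = cell_side_m / 2
    return math.hypot(half_side, half_side)


# Assumption: an IFBL grid cell is treated as a 4 km x 4 km square.
uncertainty = centroid_uncertainty_m(4000)
print(int(uncertainty))  # whole metres, truncated
```

Truncating the result to whole metres gives the 2,828 m reported for the dataset.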
Coordinates: 50°40'48"N and 51°30'36"N Latitude; 2°32'24"E and 5°55'12"E Longitude. Fig. 4 shows the number of plant observations and the number of plant species per IFBL grid cell. Fig. 5 shows the frequency distribution of plant species per number of IFBL grid cells. The 10 most widespread recorded plant species are shown in Table 2.
Temporal coverage: June 30, 1855 - December 31, 2016. The majority of records were collected since the launch of www.waarnemingen.be in 2008 (Fig. 6).
[Fig. 6 caption, partially recovered: decades are labelled by their final year (1910 = 1901-1910, etc.). Note the difference in y-axis scales between the left and right figures and the strong increase in smartphone registration of records since the launch of an app (ObsMapp for Android) in 2012.]
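Summaries of this kind (records and species per grid cell, and grid cells per species) can be derived directly from the published fields. The Python sketch below assumes each record is available as a mapping with the Darwin Core fields verbatimCoordinates (the IFBL grid code) and scientificName; the sample rows and grid codes are made up for illustration and are not taken from the dataset.

```python
from collections import defaultdict

# Hypothetical sample rows; the grid codes are made up for illustration.
records = [
    {"verbatimCoordinates": "C3-11", "scientificName": "Urtica dioica"},
    {"verbatimCoordinates": "C3-11", "scientificName": "Urtica dioica"},
    {"verbatimCoordinates": "C3-11", "scientificName": "Poa annua"},
    {"verbatimCoordinates": "D4-22", "scientificName": "Urtica dioica"},
]


def summarise_by_cell(rows):
    """Return {grid_code: (number_of_records, number_of_species)}."""
    counts = defaultdict(int)
    species = defaultdict(set)
    for row in rows:
        cell = row["verbatimCoordinates"]
        counts[cell] += 1
        species[cell].add(row["scientificName"])
    return {cell: (counts[cell], len(species[cell])) for cell in counts}


def cells_per_species(rows):
    """Return {species: number_of_distinct_grid_cells}."""
    cells = defaultdict(set)
    for row in rows:
        cells[row["scientificName"]].add(row["verbatimCoordinates"])
    return {name: len(codes) for name, codes in cells.items()}


print(summarise_by_cell(records))
print(cells_per_species(records))
```

Sorting the second result by value in descending order would give a ranking of the most widespread species.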

Sampling description: Most observations (species, date, location, observer) were recorded by volunteers (citizen scientists). The dataset also includes historical records and datasets imported into waarnemingen.be. The large majority of records (95%) are casual observations (presence-only records); the remaining 5% were registered as part of a species checklist. This is recorded in the field samplingProtocol. The frequency distribution of the number of observers per number of records or species is shown in Fig. 7.
Quality control description: Recorded data are verified by a group of botanical experts (including professional botanists), based on collected specimens, the observer's species knowledge, added photographs and known species lists of locations. The validation procedure of www.waarnemingen.be is interactive: validators can ask observers for additional information, after which the validator manually adds a validation status. Manual validation focuses on rare species, species reported outside their known range and observations accompanied by pictures. Records that are not manually validated are checked by an automated validation procedure that takes into account the number of manually validated observations of a species within a specified date and distance range. 12% of the plant records in this dataset are supported by photographs in www.waarnemingen.be. The validation status is indicated in the field identificationVerificationStatus, the link to the original record in the field references.
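The automated check is described only in general terms, so the following Python sketch is one possible reading of it, not the procedure used by www.waarnemingen.be. It counts manually validated records of the same species that fall within a distance and date window around a candidate record. The thresholds (max_km, max_days, min_support), the record layout and the use of absolute date differences are all assumptions made for illustration.

```python
import math
from datetime import date


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two WGS84 points."""
    radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


def count_supporting(candidate, validated, max_km, max_days):
    """Count validated records of the same species close in space and time."""
    support = 0
    for rec in validated:
        if rec["species"] != candidate["species"]:
            continue
        near_in_time = abs((rec["date"] - candidate["date"]).days) <= max_days
        near_in_space = haversine_km(candidate["lat"], candidate["lon"],
                                     rec["lat"], rec["lon"]) <= max_km
        if near_in_time and near_in_space:
            support += 1
    return support


def auto_validate(candidate, validated, max_km=10.0, max_days=30, min_support=2):
    """Hypothetical rule: accept when enough validated records support the candidate."""
    return count_supporting(candidate, validated, max_km, max_days) >= min_support


# Made-up example records.
validated = [
    {"species": "Anemone nemorosa", "lat": 51.02, "lon": 4.01, "date": date(2016, 5, 1)},
    {"species": "Anemone nemorosa", "lat": 51.05, "lon": 3.98, "date": date(2016, 4, 20)},
    {"species": "Anemone nemorosa", "lat": 50.50, "lon": 5.50, "date": date(2016, 5, 5)},
    {"species": "Poa annua", "lat": 51.00, "lon": 4.00, "date": date(2016, 5, 9)},
]
candidate = {"species": "Anemone nemorosa", "lat": 51.00, "lon": 4.00, "date": date(2016, 5, 10)}
print(auto_validate(candidate, validated))
```

In this example two validated records of the same species lie within 10 km and 30 days of the candidate, so the hypothetical rule accepts it. The third record is too far away and the fourth is a different species.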

Discussion
Since 2010, the number of plant observations registered annually has been larger than the total number of records available in www.waarnemingen.be before 2008. Observations are currently mainly presence-only records (95%). Presence is certain, but the absence of data can have multiple reasons: an IFBL grid cell was not visited, the species was not present or not seen, or the species was present but not registered in the database. For this reason, since the end of 2016, www.waarnemingen.be has focused more on lists and transect registration. During field work, the route can be tracked via the mobile app ObsMapp. At the end of the excursion, observers can indicate different types of lists, depending on whether (1) the records are opportunistically collected presence-only data (some records of some of the species encountered), (2) all individuals of selected species were registered, (3) all species were recorded or (4) all individuals of all species were recorded (more useful for animals than plants). This additional information allows observation effort to be accounted for better than is currently the case.
The most frequently and most widely observed plant in www.waarnemingen.be is Urtica dioica, which was also the most widespread plant in Van Landuyt et al. (2006). The other plants in the top 10 of most frequently recorded species show that there is bias in the data collected by the plant observers of waarnemingen.be. Species like Poa annua or Sagina procumbens should be seen much more often than striking species like Cardamine pratensis, Filipendula ulmaria and Anemone nemorosa. This might be explained by the observers' lack of interest in very common species (Mair and Ruete 2016). Furthermore, spatial biases are expected since the data are collected opportunistically without a mandatory sampling protocol (Geldmann et al. 2016). Sampling bias related to variation in recorder activity has been grouped into four main categories by Isaac et al. (2014): 1) uneven recording intensity over time, 2) uneven spatial coverage, 3) uneven sampling effort per visit and 4) uneven detectability. We aim to understand these biases better by stimulating the use of species lists rather than the collection of presence-only data.
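The four list types can be represented as a small enumeration, which makes explicit which of them describe a complete species list. The Python sketch below is illustrative only: the member names and the helper function are hypothetical and do not reflect how www.waarnemingen.be or ObsMapp store this information.

```python
from enum import Enum


class ListType(Enum):
    """The four list types described in the text (names are hypothetical)."""
    OPPORTUNISTIC = 1                 # some records of some species encountered
    ALL_INDIVIDUALS_OF_SELECTED = 2   # all individuals of selected species
    ALL_SPECIES = 3                   # all species recorded
    ALL_INDIVIDUALS_ALL_SPECIES = 4   # all individuals of all species


def covers_all_species(list_type: ListType) -> bool:
    """True when the observer stated that every species encountered was recorded."""
    return list_type in (ListType.ALL_SPECIES, ListType.ALL_INDIVIDUALS_ALL_SPECIES)


for list_type in ListType:
    print(list_type.name, covers_all_species(list_type))
```

Under this reading, only types (3) and (4) would allow a species missing from a list to be interpreted as not detected during that visit, rather than simply not registered.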