Similarity analysis between species of the genus Quercus L. (Fagaceae) in southern Italy based on the fractal dimension

Abstract The fractal dimension (FD) is calculated for seven species of the genus Quercus L. in the Calabria region (southern Italy), five of which have a marcescent-deciduous and two a sclerophyllous character. The fractal analysis applied to the leaves reveals different FD values for the two groups. The differences between the means and medians are very small within the marcescent-deciduous group and very large between the two groups, which highlights the distance between the two groups in terms of similarity. Specifically, Q. crenata, which is hybridogenic in origin and whose parental species are Q. cerris and Q. suber, is more closely related to Q. cerris than to Q. suber, as also expressed in the molecular analysis. We consider that, in combination with other morphological, physiological and genetic parameters, the fractal dimension is a useful tool for studying similarities amongst species.


Introduction
Quercus L. is an important genus containing several species of trees dominating different forest communities. The ecological and economic role of Quercus spp. is well known (Quinto-Canas et al. 2010, Vila-Viçosa et al. 2015, Spampinato et al. 2016, Vessella et al. 2017). Some species (such as cork oak) are particularly useful for carbon sequestration and as raw materials for a post-carbon city (Spampinato et al. 2019).
Leaf morphology has been studied throughout the history of botany, using leaf shape, edge, vein arrangement, hairiness and other features as important characters in systematics (Coutinho 1939, Amaral Franco 1990). Species have been described by means of the analysis of the size and shape of several leaf characters and using biometric studies. Morphometry and the leaf vascular system have traditionally been key aspects for establishing the description and biometrics of the species; in morphometry, the leaf shape and edge and the arrangement of the veins are all common systematic characters used to characterise different species. For a correct determination of each species and their hybrids, their taxonomic characters must be observed with specific instruments, e.g. powerful microscopes capable of highlighting micromorphometric characters (Vila-Viçosa et al. 2014).
Numerous authors have noted the comparative inaccuracy of early descriptive and biometric studies (Mouton 1970, 1976, Hickey and Wolfe 1975, Hickey 1979). Classic descriptive methods do not establish clear differences between pure individuals and their hybrids, so molecular studies are proposed for pure and hybrid strains (Conte et al. 2007, Curtu et al. 2007, Coutinho et al. 2014). More precise biometric studies subsequently emerged that allowed a more meticulous representation of the leaf detail or the other parts of the plants (e.g. Cano et al. 2017). Biometrics thus came into its own for pinpointing the differences between species and taxonomic groups.
In their study of several Quercus species, Camarero et al. (2003) and Fortini et al. (2015) analysed the leaf morphology for pure and hybridogenic populations and observed the variability of their morphological characters. These phenotypical characters must be precisely quantified to establish the differences between pure species and their hybrids, which can be recognised through fractal analysis.
We calculated the fractal dimension by the box-counting method integrated in the ImageJ software (Abramoff et al. 2004), as it makes it possible to assess the fractal dimension of structures that are not totally self-similar. To resolve the controversy regarding certain species/subspecies in the genus Quercus, a discriminant analysis is required that can clearly differentiate the species/subspecies and the degree of relationship between them. The fractal dimension, which has not so far been widely applied in botany, although somewhat more so in medicine, was used for this purpose (Esteban et al. 2007, Lopes and Beltrouni 2009). The main aim of this work is to establish an analysis of similarity of leaf shape amongst seven species in the genus Quercus from Italy and corroborate our previous studies (Musarella et al. 2013), in which we proposed an FD < 1.6 for sclerophyllous Quercus and an FD > 1.6 for deciduous and marcescent Quercus.

Data collection
In this work, we analysed 7 species living in Calabria using 275 tree samples belonging to Quercus robur subsp. brutia, Q. cerris, Q. congesta, Q. crenata, Q. ilex subsp. ilex, Q. suber and Q. virgiliana. Orientation largely determines the amount of light the leaves receive for photosynthesis and their size can thus be affected by this greater or lesser exposure to light. For this reason, samples were taken from the four cardinal points on each tree to examine the possible influence of orientation on leaf development. A total of 1,099 leaves were analysed: 120 from Q. robur subsp. brutia, 120 from Q. cerris, 154 from Q. congesta, 147 from Q. crenata, 240 from Q. ilex subsp. ilex, 139 from Q. suber and 179 from Q. virgiliana. All the leaves were colour-scanned with a resolution of 1200 dpi and 24-bit colour. After scanning, each leaf image was converted into an 8-bit greyscale image and segmented by selecting the grey levels between 111 and 126. We opened this image with the ImageJ programme in order to determine its fractal dimension (FD).
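The greyscale conversion and segmentation steps can be illustrated with a minimal sketch. This is not the ImageJ procedure itself: the function names are hypothetical, the image is represented as nested lists of RGB tuples, the greyscale value is assumed to be the unweighted mean of the three channels (the conversion formula is not stated here), and the bounds 111 and 126 are assumed to be inclusive.

```python
def to_greyscale(rgb_image):
    """Convert rows of (r, g, b) tuples (0-255 each) into 8-bit grey values."""
    # Assumption: unweighted channel mean; the text does not state the formula.
    return [[(r + g + b) // 3 for (r, g, b) in row] for row in rgb_image]


def segment(grey_image, low=111, high=126):
    """Return a binary mask: 1 where low <= grey <= high (inclusive), else 0."""
    return [[1 if low <= g <= high else 0 for g in row] for row in grey_image]


# Tiny hypothetical 2x3 image, for illustration only (not real leaf data).
rgb = [
    [(120, 120, 120), (255, 255, 255), (111, 111, 111)],
    [(10, 20, 30), (126, 126, 126), (130, 125, 123)],
]
mask = segment(to_greyscale(rgb))
print(mask)
```

The resulting mask of ones and zeros is the kind of binary image on which a box-counting estimate can then be computed.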

The fractal dimension (FD)
Fractal geometry is the most suitable method for characterising the complexity of the vascular system or other mathematically similar structures such as stream drainage networks, vascular networks in chicken embryos or the distribution of the vascular system of a leaf (Horton 1945, Vigo et al. 1998). De Araujo Mariath et al. (2010) developed a method using digital images of leaves to determine the fractal dimensions of the leaf vascular system in three species of Relbunium (Endl.) Hook. F. (Rubiaceae), with the aim of quantifying and determining its complexity so it could be used as a taxonomic character. Recently, Cuzzocrea et al. (2017) described an algorithm to estimate the parameters of Iterated Function System (IFS) fractal models, using IFS to model speech and electroencephalographic signals and to compare the results.
All man-made objects can be described in simple shapes using Euclidean geometry. However, natural objects have irregular forms that cannot always be represented using this method (Glenny et al. 1985).
Due to the recentness of the discovery and its wide range of applications, there is still no universal definition of what actually constitutes a fractal. They are thus described according to their common properties: specifically, they must have the same appearance at any scale of observation, meaning that a fractal object can be broken down into parts, each of which is identical to the whole object (self-affinity or self-similarity); they must have a fractional and not a whole dimension (fractal dimension); and finally the relationship between two of their variables must be a power law (where the exponent is its fractal dimension, Mandelbrot 1983). Topological and Euclidean dimensions cannot be applied to highly irregular objects such as coastlines. Mandelbrot (1967) published a widely referenced work where he proved that it was impossible to give an exact value of the length of the coast, as this measurement depended on the unit of scale used. Thus in the case of irregular curves, a small FD of close to 1 signifies a low level of complexity, whereas values close to 2 indicate a very high level of irregularity. When an object is totally self-similar, such as the mathematical fractal known by the name of the Koch curve (Figure 1), the dimension used is known as the self-similarity dimension.
A unit segment can be divided, for example, into three pieces similar to the original, each with a length of 1/3. In general, where $N(h)$ is the number of pieces with a length $h$, it follows that $N(h) \cdot h^{1} = 1$. If we now look at a square with a unit side, we can break it down into $9 = 3^{2}$ smaller squares with a side of 1/3; that is to say $N(h) \cdot h^{2} = 1$. Finally, in the case of a cube, it is easy to see that the following is true: $N(h) \cdot h^{3} = 1$. That is, the exponent of $h$ coincides with the topological and Euclidean dimension of the straight line (1), the square (2) and the cube (3). By extrapolation from this concept, if the object is completely self-similar, there is a relationship between the scale factor $h$ and the number of pieces $N(h)$ into which the object can be divided, which is given by $N(h) = (1/h)^{D}$; that is to say $D = \log N(h) / \log(1/h)$. Thus the fractal dimension of the Koch curve is $D = \log 4 / \log 3 \approx 1.26$, a number that is very similar to the FD of the English coastline.
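As a quick numerical check of the relationship between the number of pieces and the scale factor, the following sketch evaluates the self-similarity dimension for the segment, the square, the cube and the Koch curve. The function name is hypothetical; the Koch curve is taken, as is standard, to consist of 4 copies of itself scaled by 1/3.

```python
import math


def similarity_dimension(pieces, scale):
    """Return D = log N(h) / log(1/h) for N(h) = pieces copies of size h = scale."""
    return math.log(pieces) / math.log(1.0 / scale)


examples = [
    ("segment", 3, 1 / 3),
    ("square", 9, 1 / 3),
    ("cube", 27, 1 / 3),
    ("Koch curve", 4, 1 / 3),
]
for name, pieces, scale in examples:
    print(f"{name}: D = {similarity_dimension(pieces, scale):.4f}")
```

The first three cases recover the integer dimensions 1, 2 and 3, whereas the Koch curve gives a fractional value between 1 and 2.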
However, natural objects like leaves are not perfect fractals, as they are not totally self-similar but are said to be statistically similar. In this case, the value of their fractal dimension is known by the name of Hausdorff-Besicovitch and is: $D = \lim_{h \to 0} \log N(h) / \log(1/h)$. The calculation of this limit is somewhat complicated and requires the use of different algorithms such as dilation methods, the perimeter method, Grassberger and Procaccia's correlation dimension and the box-counting method. This last is the most widely used as it is very simple to implement with computer technology and highly accurate (Glenny et al. 1985, Jian Li et al. 2009).
To find the fractal dimension of a digital image using the box-counting method (Mandelbrot 1983), the image must be transformed into black (the leaf) and white (the background). A grid is then superimposed on the image and the number of grid squares that the leaf intersects is counted. The image is covered with a grid of squares initially with side 2 and subsequently with squares with side 3, 4, 6, 8, 12, 16 and 32 (in Table 1: C2, C3, C4, C6, C8, C12, C16 and C32). For each side h, the logarithm of the number of intersected squares N(h) is plotted against the logarithm of the inverse of the side. The dimension of the object coincides with the slope of the regression line defined by the point cluster (log(1/h), log(N(h))) produced when the value of the side of the grid square is changed.
The graphic representation of the regression line and the point cluster shows two very clearly differentiated parts. The minimum and maximum box size is therefore very important when applying this method. In fact, the approximation error must be reduced by restricting the box sizes to the range over which the points are most nearly linear.
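The box-counting procedure can be sketched in a few lines of plain Python. This is an illustrative reimplementation under stated assumptions, not the ImageJ code: the function names are hypothetical, the binary image is a nested list with 1 for the leaf and 0 for the background, the grid is anchored at the top-left corner, and the mask is assumed to contain at least one foreground pixel.

```python
import math


def count_boxes(mask, size):
    """Count grid squares of side `size` that contain at least one foreground pixel."""
    rows, cols = len(mask), len(mask[0])
    occupied = 0
    for top in range(0, rows, size):
        for left in range(0, cols, size):
            if any(
                mask[r][c]
                for r in range(top, min(top + size, rows))
                for c in range(left, min(left + size, cols))
            ):
                occupied += 1
    return occupied


def box_counting_dimension(mask, sizes=(2, 3, 4, 6, 8, 12, 16, 32)):
    """Slope of the least-squares line through the points (log(1/h), log N(h))."""
    xs = [math.log(1.0 / h) for h in sizes]
    ys = [math.log(count_boxes(mask, h)) for h in sizes]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Sanity checks on hypothetical test images (not leaf data):
# a completely filled 96x96 square and a single straight line of pixels.
filled = [[1] * 96 for _ in range(96)]
line = [[1] * 96] + [[0] * 96 for _ in range(95)]
print(round(box_counting_dimension(filled), 3))
print(round(box_counting_dimension(line), 3))
```

The side of 96 pixels is chosen only because every box size in the list divides it exactly, so the filled square and the line recover the Euclidean dimensions 2 and 1; a real leaf image would give an intermediate value.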

Calculating the fractal dimension (FD)
The FD was calculated by the box-counting method (Esteban et al. 2007) using the free software ImageJ version 1.47 (http://imagej.com). The digital image of the leaf in RGB colour was converted into an 8-bit greyscale image (Figure 2b) where each pixel was represented with a greyscale from 1 to 256. In order to select the most important information, the image was subsequently segmented to produce a greyscale between 111 and 126 and then converted into binary so the leaf takes the value 1 and the rest the value 0 (Figure 2c). The box-counting algorithm was then applied to this black-and-white image of the venation network of the leaf to calculate the FD with box sizes (h) ranging from 2 to 32. Specifically, the image is covered with a grid of squares initially with side 2 and subsequently with squares with sides 3, 4, 6, 8, 12, 16 and 32 (in the image C2, C3, C4, C6, C8, C12, C16 and C32). Table 1 shows the number of boxes occupied (N(h)) for each box size.
Once the points (log(1/h), log(N(h))) were represented, we calculated the regression line (Figure 3) whose slope corresponds to the value of the fractal dimension; in our case, FD = 1.9298, standard error = 0.0044, p-value = $1.01384 \times 10^{-14}$. As can be seen in the graph, the fit is fairly good as the points are very close to the resulting regression line.
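For readers who wish to see how a slope and its standard error arise from such a point cluster, a minimal ordinary least-squares sketch follows. The function name is hypothetical and the counts used are synthetic (an exactly filled 96x96 image), NOT the values of Table 1, so the output does not reproduce the FD reported above.

```python
import math


def slope_with_standard_error(sizes, counts):
    """Fit log N(h) = a + D * log(1/h); return (D, standard error of D)."""
    xs = [math.log(1.0 / h) for h in sizes]
    ys = [math.log(n) for n in counts]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual variance uses n - 2 degrees of freedom (two fitted parameters).
    residual_ss = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    std_error = math.sqrt(residual_ss / (n - 2) / sxx)
    return slope, std_error


sizes = [2, 3, 4, 6, 8, 12, 16, 32]
# Synthetic counts for illustration only: N(h) = (96 / h)^2.
counts = [(96 // h) ** 2 for h in sizes]
fd, se = slope_with_standard_error(sizes, counts)
print(f"FD = {fd:.4f}, standard error = {se:.2e}")
```

With real box counts the points scatter slightly around the line, which is what produces a non-negligible standard error.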
For the statistical treatment, the mean FDs were obtained for each species and an analysis of variance was undertaken to test for significant differences amongst the means. First, the Shapiro-Wilk normality test and the difference between the mean, median and kurtosis indicate that our data do not follow a normal distribution (Table 2), meaning non-parametric methods must be used. To determine whether orientation affects the leaf morphological character, we applied a non-parametric Kruskal-Wallis test which, based on the medians, compares the leaves from the same population and from the four orientations. We also applied the standardised kurtosis coefficient to determine whether the data depart significantly from normality. In the case of significant differences in the analysis of variance, we applied the LSD (Least Significant Difference) multiple comparison test.
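The Kruskal-Wallis comparison of the four orientations can be illustrated with a small rank-based sketch. The function name and the FD values below are hypothetical and serve only to show the mechanics; the statistic H is computed with the usual tie correction and compared with 7.815, the 5% critical value of the chi-square distribution with 3 degrees of freedom (four groups minus one).

```python
def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic (with tie correction) for a list of samples."""
    pooled = sorted(value for group in groups for value in group)
    n_total = len(pooled)

    # Assign each distinct value the mean of the ranks it occupies.
    rank_of = {}
    tie_term = 0
    i = 0
    while i < n_total:
        j = i
        while j < n_total and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2.0  # mean of ranks i+1 .. j
        tie_term += (j - i) ** 3 - (j - i)
        i = j

    rank_sum_term = sum(
        sum(rank_of[v] for v in group) ** 2 / len(group) for group in groups
    )
    h = 12.0 / (n_total * (n_total + 1)) * rank_sum_term - 3 * (n_total + 1)
    # The correction is zero only if every observation is identical.
    correction = 1.0 - tie_term / float(n_total ** 3 - n_total)
    return h / correction


# Hypothetical FD values for one species, grouped by orientation.
north = [1.61, 1.63, 1.60, 1.62]
east = [1.66, 1.68, 1.65, 1.67]
south = [1.70, 1.72, 1.69, 1.71]
west = [1.64, 1.73, 1.59, 1.74]
h_stat = kruskal_wallis_h([north, east, south, west])
CHI2_CRITICAL_DF3 = 7.815  # chi-square, 3 degrees of freedom, alpha = 0.05
print(h_stat, h_stat > CHI2_CRITICAL_DF3)
```

A value of H above the critical value would indicate that at least one orientation differs from the others for that species.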
In the hypothetical case that the difference between the fractal values (means and medians) for two species is zero or has a quotient of one, the degree of relationship between the two species is 100%: $Df_A - Df_B = 0$; $Df_A / Df_B = 1$, and species A and B are equal. Thus the lower the fractal difference or the nearer the fractal quotient is to 1, the greater the similarity between the species.
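The two similarity measures can be computed directly from the mean FD values reported in the Results of this paper. The sketch below is only an illustration: the function names are hypothetical, and the quotient is taken with the larger value in the numerator, an assumption made so that it is always at least 1.

```python
# Mean FD values as reported in the Results section of this paper.
MEAN_FD = {
    "Q. suber": 0.932,
    "Q. ilex subsp. ilex": 1.363,
    "Q. robur subsp. brutia": 1.613,
    "Q. cerris": 1.677,
    "Q. crenata": 1.868,
    "Q. congesta": 1.881,
    "Q. virgiliana": 1.914,
}


def fractal_difference(species_a, species_b):
    """Absolute difference between the mean FD values of two species."""
    return abs(MEAN_FD[species_a] - MEAN_FD[species_b])


def fractal_quotient(species_a, species_b):
    """Quotient of the mean FD values, larger over smaller (assumed convention)."""
    fd_a, fd_b = MEAN_FD[species_a], MEAN_FD[species_b]
    return max(fd_a, fd_b) / min(fd_a, fd_b)


for parent in ("Q. cerris", "Q. suber"):
    diff = fractal_difference("Q. crenata", parent)
    quot = fractal_quotient("Q. crenata", parent)
    print(f"Q. crenata vs {parent}: difference = {diff:.3f}, quotient = {quot:.2f}")
```

A difference near 0 and a quotient near 1 indicate high similarity, so the hybrid Q. crenata lies much closer to Q. cerris than to Q. suber on both measures.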

Results
The analysis of the FD values for each orientation and for each species shows that for Q. robur subsp. brutia, Q. cerris, Q. congesta and Q. virgiliana, the orientation influences the values of FD, as there are significant differences for these species (Table 3).
These species correspond to deciduous or marcescent species, whereas the perennial species Q. ilex subsp. ilex, Q. suber and Q. crenata do not show significant differences in the values of FD for the different levels of orientation. An analysis of the average FD values for each species indicates that there are significant differences between the different levels of species under study (Table 4). Subsequently, the Conover-Iman test of multiple comparisons between all pairs shows the pairs of species between which there are significant differences (Table 5).
As can be seen in Table 5, there are significant differences in the values of FD for most pairs of species. The only pairs for which the differences are not significant are Q. robur subsp. brutia-Q. cerris and Q. crenata-Q. congesta. The fractal dimension alone is therefore sufficient to characterise and separate the species Q. ilex subsp. ilex, Q. suber and Q. virgiliana, whereas the fractal dimension of the vascular network of the leaves, calculated by the methodology described, does not on its own distinguish Q. robur subsp. brutia from Q. cerris or Q. congesta from Q. crenata.
The analysis of the medians of the seven groups (Figure 4) shows that the lowest values of FD correspond to the sclerophyllous Quercus species Q. ilex subsp. ilex and Q. suber, whose values are below 1.6, as also occurs in the case of the means. However, the marcescent Quercus have a median FD of > 1.6; the mean FD values of Q. suber and Q. ilex subsp. ilex are 0.932 and 1.363, respectively, whereas it is 1.613 for the marcescent Q. robur subsp. brutia; 1.677 for Q. cerris; 1.881 for Q. congesta; 1.868 for Q. crenata; and 1.914 for Q. virgiliana.
In the multiple comparison analysis (Figure 5) of means and medians, the most significant differences in the two cases are between the sclerophyllous and marcescent Quercus, where these differences (means) are 0.982 for Q. virgiliana-Q. suber and *0.984 in the case of the medians; however, the differences between the marcescent Quercus are minimal, with *0.015 for Q. congesta-Q. crenata and *0.188 between Q. cerris-Q. crenata. As the value for Q. crenata-Q. suber is *0.939, it is evident that Q. crenata is more closely related to Q. cerris than to Q. suber (Figure 5).
In the case of both mean and median values, it is confirmed that the value of the fractal dimension (FD) is less than 1.6 in the case of sclerophyllous Quercus and greater for marcescent and deciduous Quercus (Figure 4).
The differences between average FD values for marcescent and deciduous Quercus species are very low (Table 6). These low differences between average FD values are due to the close similarity between these species. However, there are significant differences in the FD between marcescent and sclerophyllous Quercus as they are very distant from each other in evolutionary terms: Q. virgiliana-Q. ilex subsp. ilex 0.551; Q. virgiliana-Q. suber 0.982 (Figure 5).

Discussion
There is a widespread consensus that complex objects with the same features can be included in the category of fractals. Self-similarity is one of the characteristics of fractal objects, meaning that when these images are broken down into smaller pieces, each one is identical to the whole. The fractional dimension is another of its features.
In the hypothetical case that the difference between the fractal values of two species is zero, or their quotient is one, the degree of relationship between the two species is 100%: $Df_A - Df_B = 0$; $Df_A / Df_B = 1$, and species A and B are equal. This occurs when the fractal values are the same and means that the same or similar characters have been measured. Thus the smaller the fractal difference or the closer the fractal quotient is to 1, the greater the similarity between the species; if the value of this quotient is far from 1, as occurs for $Df_{vi} / Df_{su} > 2$, the species (here Q. virgiliana and Q. suber) are very distant from each other.
Conte et al. (2007) point out the hybridogenic origin of Q. crenata, and the molecular analysis reveals a closer genetic similarity between Q. crenata and Q. cerris than between Q. crenata and Q. suber. The FD of Q. crenata is 1.868; for Q. cerris it is 1.677; and for Q. suber it is 0.932. Here $Df_{Qce} - Df_{Qsu} = 0.745$ and $Df_{Qce} / Df_{Qsu} = 1.8$, pointing to a large phenotypical (genetic) difference between the parental species. More similarity can be seen between Q. crenata and Q. cerris than between Q. crenata and Q. suber: $Df_{Qcr} - Df_{Qce} = 0.191$ and $Df_{Qcr} / Df_{Qce} = 1.1$, so these two have a high degree of similarity, whereas $Df_{Qcr} - Df_{Qsu} = 0.936$ and $Df_{Qcr} / Df_{Qsu} > 2$, indicating substantial phenotypical differences between the hybrid and this parental species. Coutinho et al. (2014, 2015) report a high degree of polymorphism in the genus Quercus and use the molecular analysis of ribosomal DNA with restriction enzymes to confirm the taxonomic classifications and establish the phylogeny between Quercus species. Their results show that the group known as cerris contains Q. crenata and its parental species Q. cerris, whereas it excludes the parental species Q. suber; Q. crenata is closer to Q. cerris with a similarity of 96% compared to a 66% similarity between Q. suber and the previous species. Our fractal analysis corroborates the results of Conte et al. (2007) and Coutinho et al. (2015). Curtu et al. (2007) studied four oak species, including Q. robur and Q. cerris, and the intermediate or hybridogenic forms, using morphological leaf characters and genetic markers to classify the hybridisation. In our case, the intermediate or hybrid form corresponds to Q. crenata, which has its origins in the parental species Q. cerris and Q. suber. Here the intermediate form Q. crenata has a fractal value close to Q. cerris and very far from Q. suber.
Finally, the orientation has no influence on the fractal dimension between either the same species or between the different species. This means that the shape of the distribution of the leaf vascular network is not affected by possible changes in orientation, thus discounting the effects of environmental variables such as amount of light, temperature, humidity etc., associated with orientation. This evidence is important in Quercus species, as in other cases, these environmental variables can influence seed germination and the capacity of some plant species to adapt to extreme environments (Signorino et al. 2011, Panuccio et al. 2018); in some cases, the survival or disappearance of a species in an environment may depend on it.

Conclusions
We confirm that the application of fractal analysis identifies the phenotypical differences between species and can be used as a method to establish their degree of relationship; this is supported by molecular analysis by various authors. In this work we can affirm that sclerophyllous Quercus species have a fractal dimension of < 1.6 and marcescent and deciduous Quercus species have FD > 1.6; and that Q. crenata, a hybrid of Q. suber and Q. cerris, has a greater similarity to Q. cerris than to Q. suber. The small differences between the mean and median FD values of the marcescent-deciduous Quercus species suggest a high degree of similarity amongst these five species. Based on their FD, marcescent Quercus species (semideciduous) are more closely related to deciduous than to sclerophyllous Quercus species, whereas the sclerophyllous Q. ilex subsp. ilex and Q. suber show substantial morphological differences with the marcescent and deciduous Quercus species, as evidenced by fractal analysis. These two species have followed different evolutionary paths from the others, as is to be expected, as the centre of origin of sclerophyllous Quercus species is Mediterranean, whereas deciduous Quercus species have a temperate origin and marcescent Quercus species come from the boundary between the Temperate and Mediterranean bioclimates (Amaral Franco 1990, Sánchez de Dios et al. 2009).