Perspectives on the Grey Zone of Species Delimitation With Notes on Invertebrates in the Marine Environment
Bulletin of the Society of Systematic Biologists

Defining species boundaries, or delimiting species, is a complex and often difficult task. Indeed, when such studies incorporate approaches that consider evolutionary mechanisms, there is much to be learned about species diversity and how the processes that play critical roles in speciation can impact species delineation. In 2021, a virtual workshop on species delimitation was held at the Smithsonian Institution National Museum of Natural History to train natural history scientists and taxonomists on the appropriate analytical tools that can be used to help delimit species when using molecular data. This perspective highlights some of the main themes discussed during that workshop while detailing three processes that can challenge any species delimitation study. Specifically, we discuss incomplete lineage sorting, gene flow, and population structure when delimiting species boundaries using molecular data. We highlight empirical studies and methodological approaches that have successfully met these challenges under various scenarios. Finally, we provide recommendations and considerations for undertaking species delimitation studies in a variety of taxa. To this end, we recommend that taxonomists fully embrace process-based species delimitation, which can provide important insights into speciation in their study systems. For those developing analytical approaches, we hope they consider incorporating less well-known taxa, such as marine invertebrates, into method testing. Marine invertebrates encompass many dark taxa across the tree of life yet represent the majority of animal phyla, many of which are vulnerable to extinction due to global ocean change. Thus, advancing species delimitation to address taxonomic revisions in these organisms will support conservation decisions on keystone ecosystems. 
Furthermore, the diversity of their life history strategies, the lack of obvious barriers to gene flow in the ocean environment, and their occurrence in isolated habitat patches can better inform our knowledge of speciation and the evolutionary processes that play a role in generating diversity in nature.


Introduction
With many species on Earth left to be formally described (Appeltans et al., 2012; Mora et al., 2011; Scheffers et al., 2012) and global change contributing to unprecedented rates of biodiversity loss (Bellard et al., 2012; Johnson et al., 2017; Pimm et al., 2014), we must increase the pace of species discovery, delimitation, and formal taxonomic description. These tasks, however, are not trivial, as they take expertise and often a considerable amount of time and effort. With the general undervaluing of taxonomy, there is little incentive within the scientific community, particularly in the next generation of scientists, for delimiting and describing species (Drew, 2011; Hamilton et al., 2021; Sluys, 2013). Yet, many agree that species are fundamental units of biology and delimiting them is essential for various scientific fields of study, including ecology, conservation biology, and developmental biology (Bortolus, 2008; Mace, 2004; Wiens, 2007). When molecular species delimitation is combined with advanced analytical tools, such as the multispecies coalescent and demographic models, there is much to be learned, not only about particular taxonomic groups, but about speciation and the processes that have generated the diverse forms of life on Earth.
Taxonomy, and the task of delimiting and describing species, has historically been disassociated from investigations of the processes that generate species (Smith & Carstens, 2020). However, to improve taxonomic studies, we must embrace the complexity and intricacies of speciation and recognize that speciation is on a continuum. Thus, a process-based approach for delimiting species, while identifying and incorporating evolutionary processes that play a role in divergence, can improve our ability to detect species boundaries and the evolutionary processes that generate or obscure them. Studies incorporating process-based species delimitation have increased in recent years, particularly for terrestrial vertebrates (e.g., birds, bats, amphibians, reptiles; Battey & Klicka, 2017; Burbrink et al., 2021; Dufresnes et al., 2020; Morales et al., 2016) and, to some extent, terrestrial invertebrates (e.g., snails, beetles, spiders; Chou et al., 2021; Razkin et al., 2020; Salgado-Roa et al., 2021). However, process-based species delimitation has yet to be fully embraced by taxonomists and evolutionary biologists studying animal phyla in the marine environment.
Marine invertebrates encompass most animal phyla, yet they have been largely underrepresented in species delimitation studies using multi-locus molecular data and advanced analytical methods (Fontaneto et al., 2015). It is possible that this is due to the paucity of biological information on marine invertebrates in general, the challenges of collecting them across a broad geographical and bathymetric (i.e., depth) space, and the fact that so many of them await formal taxonomic description (Appeltans et al., 2012). However, placing marine invertebrates into a process-based framework could provide important insights into processes that have generated life on Earth, particularly as the ocean has relatively few obvious barriers to dispersal, which raises the question of how species are generated in the apparent presence of gene flow (Palumbi, 1994). In addition, marine invertebrates have a variety of life-history strategies (e.g., timing and age to first reproduction, fecundity), reproductive modes (e.g., brooding, broadcast spawning, internal fertilization), motility forms (e.g., sessile, mobile, to highly migratory), and generation times that could lend new insights and challenges for the analytical tools that have been developed for species delimitation. Thus, we need to harness the power of process-based species delimitation to advance the scope and pace of species delimitation and description of marine invertebrates while improving our understanding of speciation in the oceans.
In this perspective, we highlight several themes learned in a recent (2021) workshop on species delimitation held virtually by the Smithsonian Institution National Museum of Natural History (NMNH). This workshop brought together a group of experts to teach species delimitation theory and methods to natural history scientists aiming to delimit species across a wide variety of taxa, including marine invertebrates. Thus, our perspective emphasizes using process-based methods to delimit species boundaries with multi-locus molecular data. Here, we also discuss three processes that pose challenges for delineating species boundaries, but can also elucidate where species are in the grey zone of delimitation. We highlight relevant case studies that can guide future work while showcasing marine invertebrates and conclude with recommendations.

The grey zone of species delimitation
Species constitute taxonomic hypotheses that rely on the criteria used to delineate their boundaries (Gaston & Mound, 1993). Thus, in any species delimitation study, it is crucial to state the underlying species concept (e.g., phylogenetic, unified, biological; de Queiroz, 2007) and divergence criteria. Stating the concept enables adequate comparison with other studies and helps establish guidelines within the taxonomic system of focus.
Within such a species-hypothesis testing scenario, it is advisable to use both discovery and validation approaches (following Carstens et al., 2013). These approaches differ inherently by whether or not the samples are partitioned a priori into taxonomic categories (Carstens et al., 2013; Ence & Carstens, 2010). Discovery approaches identify putative groups without prior assignment, and validation approaches rely on the existence of primary species hypotheses that can be further tested. In many empirical systems, particularly marine invertebrates, it is often not possible to conduct validation approaches without first identifying putative taxonomic groups or primary species hypotheses using discovery methods (e.g., machine learning, discriminant analysis of principal components, and allele-sharing methods). Therefore, it is imperative to use an integrative framework to discover putative species boundaries, formulate primary species hypotheses, and then validate species based on all available lines of evidence (e.g., Puillandre et al., 2012).
For those undertaking a species delimitation study, however, it is equally important to recognize that diversification is not static; rather, it occurs across a continuum (Stankowski & Ravinet, 2021). Several processes can drive divergence along this continuum in either direction (i.e., toward or away from strong divergence; Nosil et al., 2009). Within the intermediate area of the speciation continuum, hereafter referred to as the "grey zone," it can be difficult to delineate species (de Queiroz, 1998, 2007). Still, delimiting species within a process-based framework can enable us to determine the mechanisms involved in generating or obscuring divergence while allowing us to place more confidence in species boundaries within this grey zone (Smith & Carstens, 2020).
Here, we focus on three processes that can occur at various points along the speciation continuum and can challenge the ability to discern species boundaries in the grey zone: incomplete lineage sorting, gene flow and hybridization, and population differentiation (Fig. 1). These processes can also violate assumptions of several existing species delimitation methods and, therefore, preclude the use of certain approaches to delimit species. However, some validation and discovery approaches can account for them while delineating species boundaries. When considering these processes, species can be elucidated in parallel with their evolutionary history, and thus their place within the grey zone. To this end, we also provide relevant case studies that can guide future work in species delimitation studies, and we highlight the progress made to date in process-based species delimitation of marine invertebrates.

Figure 1. Accounting for the processes that can confound discovery and validation approaches, particularly between closely related taxa, is necessary to delineate species boundaries. Figure adapted from de Queiroz (1998, 2005, 2007); instead of highlighting the time at which different species criteria are met within the "grey area of lineage diversification," it displays three challenging processes that may often confound species delineation in what we have termed the "grey zone of species delimitation." The solid grey line follows a gene genealogy (from left to right) through species and population divergence, displaying incomplete lineage sorting (yellow stars), interspecific gene flow (blue dashed arrows), and population genetic structure (green circles). Note that any of these processes can happen at any time in a species' history.

Incomplete lineage sorting
Owing to the complexity of speciation histories, incomplete lineage sorting (ILS) or deep coalescence has been acknowledged as one of the primary causes of incongruence between gene trees and species trees (Doyle, 1992; Edwards, 2009; Knowles & Kubatko, 2010; Maddison, 1997; Pamilo & Nei, 1988; Slowinski & Page, 1999). Due to stochastic processes, including genetic drift, certain genes can fail to coalesce because of the retention of ancestral polymorphisms. As such, a lack of monophyly is often observed in single-locus gene trees. ILS is particularly (but not exclusively) evident between recently diverged lineages with shallow divergence times and large ancestral population sizes (Degnan & Rosenberg, 2009; Edwards, 2009; Rosenberg, 2003).
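The dependence of ILS on divergence time and ancestral population size can be made concrete with the standard three-taxon result from coalescent theory: for a rooted species tree ((A,B),C) whose internal branch lasts t generations in a diploid ancestral population of effective size Ne, the probability that a gene tree disagrees with the species tree is (2/3)·exp(−t/(2Ne)). The short Python sketch below is illustrative only; the function name and the example values are our own assumptions, and the formula assumes a constant ancestral population size, no gene flow, and no recombination within loci.

```python
import math


def discordance_probability(t_generations, ne):
    """Probability that a three-taxon gene tree conflicts with the species tree.

    Assumes a rooted species tree ((A,B),C), a diploid ancestral population of
    constant effective size `ne`, and an internal branch of `t_generations`.
    """
    if t_generations < 0 or ne <= 0:
        raise ValueError("t_generations must be >= 0 and ne must be > 0")
    # Internal branch length in coalescent units (2*Ne generations for diploids).
    branch_coalescent_units = t_generations / (2.0 * ne)
    # The two lineages fail to coalesce on the internal branch with probability
    # exp(-T); two of the three equally likely resolutions are then discordant.
    return (2.0 / 3.0) * math.exp(-branch_coalescent_units)


if __name__ == "__main__":
    # Hypothetical values chosen only to show the trend.
    for t, ne in [(1_000, 100_000), (100_000, 100_000), (1_000_000, 100_000)]:
        print(t, ne, round(discordance_probability(t, ne), 4))
```

Short internal branches or large ancestral populations push the value toward its maximum of 2/3, which is why single-locus trees are unreliable for recently diverged lineages.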
Species delimitation derived from tree-based approaches can thus be confounded by ILS, particularly when phylogenies are based on one or a few loci (Fig. 1). Consequently, the resulting phylogenies may not mirror the corresponding speciation events in the lineages (Funk & Omland, 2003; Hudson & Coyne, 2002; Kapli et al., 2020; Naciri & Linder, 2015). A number of methods have been developed to overcome these issues by considering the processes that generate and potentially impact the phylogenetic signal (Carstens & Knowles, 2007; Maddison & Knowles, 2006; Yang & Rannala, 2010). In this context, the multispecies coalescent (MSC) model, in conjunction with the availability of multilocus data sets and computational resources, has emerged as a noteworthy framework for estimating phylogenies, population sizes, and divergence times while accounting for unresolved lineage sorting (Mirarab et al., 2014; Rannala & Yang, 2003).
Coalescent-based species delimitation methods use probabilistic approaches on multi-locus datasets to help identify independently evolving lineages that each represent a species (see Fujita et al., 2012). This approach negates the requirement of reciprocal monophyly or fixed differences, while allowing for gene tree discordance, to delineate species boundaries. However, prior to MSC model implementation, it is important to test its fit to the data (Reid et al., 2014). Poor fit of the MSC model would suggest violation of its core assumptions, and therefore, failure to account for important processes that may be present in the study system (e.g., gene flow; Morales et al., 2016). Thus, implementing tools such as the R package P2C2M (Posterior Predictive Checks of Coalescent Models) can be useful, particularly when applying phylogenetic and species tree estimation methods based on the MSC model (Gruenstaeudl et al., 2016).
The use of Bayesian Phylogenetics and Phylogeography (BPP; Rannala & Yang, 2013; Yang & Rannala, 2010) has become commonplace, as it can jointly estimate species trees and delimit species based on the MSC model, although criticisms of this method include its tendency to delimit population structure, not necessarily species (Sukumaran & Knowles, 2017). Nevertheless, BPP has helped to elucidate species boundaries in challenging empirical systems, particularly when used in an integrative framework, for example by relying on both morphological and molecular evidence and using different analytical approaches to test species hypotheses in a group of iguanian lizards. Still, it can remain challenging to differentiate tree discordance from ILS versus other processes, such as gene flow (see further discussion below; Funk & Omland, 2003; Holder et al., 2001; Sang & Zhong, 2000).

Gene flow and hybridization
The accumulating evidence of diversification and the maintenance of lineages in parallel with gene exchange highlights the significance of this evolutionary process even between well-separated species (for further discussion see Coyne & Orr, 2004; Hey, 2006; Nosil, 2008; Petit & Excoffier, 2009; Pinho & Hey, 2010; Sousa & Hey, 2013). In the context of species delimitation, owing to the widespread use of the biological species concept (BSC), gene flow is often central to the question of species boundaries (i.e., are species boundaries porous?; Arias et al., 2016; Harrison & Larson, 2014). However, several species delimitation methods are either based on the assumption that no recent gene flow has occurred between lineages, do not consider the occurrence of this process explicitly, or disregard it overall (e.g., some MSC, model-based, tree-based, or distance-based approaches; Eckert & Carstens, 2008; Leaché et al., 2014). Therefore, the incidence of genetic exchange in empirical systems violates the premises of frequently employed species delimitation approaches (Smith & Carstens, 2020). Nevertheless, it has become clear that gene flow should not be ignored in species delimitation, and in fact, might be a more important process in speciation in some empirical systems than currently realized (e.g., Hobbs et al., 2022; Taylor & Larson, 2019).
Drawing species boundaries in gene flow scenarios becomes particularly complex when divergent lineages are recognized despite clear signatures of recent gene exchange between them (Jackson et al., 2017; Roux et al., 2016a). Consequently, in complex divergence scenarios such as those that involve taxa with gene flow, we suggest that the assumptions made for each simulation or empirical system be stated so that the operational criteria to delineate species are not obscured. For example, if there is gene flow, the lineages should exhibit diversification in other aspects (e.g., ecological niches, phenotype) to show they are on different evolutionary paths before being delineated as distinct species (e.g., Chan et al., 2020).
The challenges related to confidently assessing gene flow from complex empirical data sets are largely responsible for the lack of integration of gene flow into species delimitation approaches. Although new sequencing technologies have enabled genome-scale studies of gene flow, many factors, including ILS, can still mislead the estimation of gene flow (see Adams et al., 2019; Hibbins & Hahn, 2022). However, coalescent-based modeling approaches can assess the contribution of hybridization to the observed tree incongruence while accounting for ILS (Kubatko, 2009; Meng & Kubatko, 2009). For instance, genetic distances between discordant branches within a phylogeny are expected to display different distributions under ILS and hybridization (Holder et al., 2001). As species sequences have been diverging since speciation, the minimum genetic distance between them under ILS is expected to be constrained by divergence time compared with that of introgressed sequences (Joly et al., 2009). Such information can then be used to statistically differentiate deep coalescence from hybridization based on tree topologies and branch lengths (Joly, 2012). Phylogenetic network approaches can also help by modeling processes contributing to tree heterogeneity, such as ILS and gene flow (e.g., Holland et al., 2008; Solís-Lemus & Ané, 2016; Than et al., 2008; Yu et al., 2014; C. Zhang et al., 2018). Although further simulations are needed to understand the effect of model violations (Blair & Ané, 2020), phylogenetic network approaches have been effectively implemented in terrestrial empirical systems (e.g., rattlesnakes and frogs; Blair et al., 2018; Chan et al., 2021). For example, Chan et al. (2021) identified extensively admixed populations of Philippine puddle frogs for which the number of species would otherwise have been overestimated by widely used tree- and distance-based species delimitation approaches.
Moreover, the genomic data revolution has facilitated DNA-based demographic inference (i.e., population history models, parameters, and plausible speciation scenarios) for a broader range of organisms in the tree of life (Boitard et al., 2016). Although this can shed light on scenarios of divergence with gene flow, the integration of demographic histories into species delimitation lags far behind for many taxa. Life histories, evolution, and demographic heterogeneity are unknown for a variety of taxa (e.g., Hellberg, 2009; King & McFarlane, 2003; Trochet et al., 2014). Due to this paucity of relevant ecological and evolutionary information, specifying a subset of candidate demographic models can prove challenging for such non-model taxa (Fonseca et al., 2021). However, "borrowing" information from better-known, closely related taxa with some degree of overlap in ecological or evolutionary traits might prove useful as prior information. This approach has been proposed in demographic inference performed for the conservation of data-deficient species (Kindsvater et al., 2018).
In this context, the amalgamation of basic knowledge about life histories, the increasing availability of genomic resources, and the implementation of novel approaches that allow for comparisons of competing models (e.g., delimitR and convolutional neural networks; Fonseca et al., 2021; Smith & Carstens, 2020) hold promise to infer demographic histories in parallel with species delimitation for taxa in complex speciation scenarios (e.g., divergence with gene flow) or for which no prior information is available. For instance, we are enthusiastic about the development of machine-learning-based methods to perform model selection, which may help circumvent some limitations for non-model taxa (Blischak et al., 2021; Fonseca et al., 2021; Pudlo et al., 2016; Smith et al., 2017; Smith & Carstens, 2020). Overall, the intersection between the increasing availability of genomic resources and the development of novel frameworks to estimate demographic parameters has provided alternative means to get a better picture of species boundaries in data-limited taxa (see Gutenkunst et al., 2009; Prasad et al., 2022).

Population genetic structure
Distinguishing population-level structuring from lineage divergence at the species level is crucial to performing accurate species delimitation (Derkarabetian, Benavides, et al., 2019). On the one hand, speciation is not necessarily preceded by population subdivision; on the other, population structuring does not inevitably lead to speciation (Huang, 2020). For example, polyploidy can lead to speciation without population subdivision (see Van de Peer et al., 2017), and structured populations may not persist or diverge long enough to become species (e.g., Singhal et al., 2018). Certainly, given the complexity and continuum of the speciation process (Nosil et al., 2009), several assumptions made to simplify the parameter space explored by currently available species delimitation approaches can be violated under different biological scenarios (Carstens et al., 2013). One main controversy regarding the use of the MSC model in species delimitation stems from its inability to distinguish population-level structure from species boundaries, owing to the underlying assumption that samples are drawn from panmictic (i.e., randomly mating) populations (Leaché et al., 2019; Sukumaran & Knowles, 2017). Thus, drawing samples from lineages with population-level subdivisions will most likely result in populations incorrectly delimited as distinct species (e.g., Chambers & Hillis, 2020; Hedin, 2015). It is important to remember that the accuracy of species boundaries inferred by MSC model-based approaches relies on the model fit to the empirical data (Barley et al., 2018); a poor fit indicates that core assumptions of the MSC model are violated and that important processes present in the system may be unaccounted for (e.g., Morales et al., 2016).
Population structuring can also influence demographic inference by affecting the signal of population size changes (Chikhi et al., 2018; Mazet et al., 2016; Orozco-Terwengel, 2016), thus likely biasing the incorporation of more complex speciation scenarios into species delimitation. In addition, clusters of a panmictic population can appear structured just because of isolation by distance (IBD) effects (i.e., genetic differentiation increasing with geographic separation; Bradburd et al., 2018). Therefore, it is not enough to infer population genetic structure; we need to account for IBD as a probable cause of the observed patterns, particularly when delimiting species in complex evolutionary scenarios or with sparse geographical sampling (e.g., Mason et al., 2020). In widespread taxa, the IBD bias can become particularly relevant, as false species boundaries could emerge just from uneven sampling along the species' distribution range (Barley et al., 2018; Chambers & Hillis, 2020; del Pedraza-Marrón et al., 2019). In other words, a pattern of highly differentiated populations and subsequent delimitation of (inaccurate) species boundaries could emerge based on sampling design alone.
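As a simple first-pass check for IBD, one can ask whether pairwise genetic distance increases with pairwise geographic distance. The sketch below implements a basic Mantel permutation test in plain Python. It is an illustration under our own assumptions (the function names, the toy matrices, and the choice of a Mantel test are ours) and is not the approach of the studies cited above; Mantel tests have well-known limitations under spatial autocorrelation, so a significant result should prompt more explicit spatial modeling rather than serve as a final answer.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def upper_triangle(matrix, order):
    """Off-diagonal upper-triangle entries after reordering rows and columns."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def mantel_test(genetic, geographic, permutations=999, seed=1):
    """One-sided Mantel test: is genetic distance positively correlated with
    geographic distance? Returns (observed r, permutation p-value)."""
    identity = list(range(len(genetic)))
    geo = upper_triangle(geographic, identity)
    observed = pearson(upper_triangle(genetic, identity), geo)
    rng = random.Random(seed)
    hits = 0
    for _ in range(permutations):
        order = identity[:]
        rng.shuffle(order)  # permute sampling sites in the genetic matrix only
        if pearson(upper_triangle(genetic, order), geo) >= observed:
            hits += 1
    # Add one to numerator and denominator so the observed value is counted.
    return observed, (hits + 1) / (permutations + 1)


if __name__ == "__main__":
    # Hypothetical transect: site positions in km and toy genetic distances.
    sites = [0.0, 10.0, 25.0, 40.0, 70.0, 100.0]
    geographic = [[abs(a - b) for b in sites] for a in sites]
    genetic = [[0.002 * d for d in row] for row in geographic]
    print(mantel_test(genetic, geographic))
```

If differentiation between putative species is no greater than expected from geographic distance alone, apparent clusters may reflect gaps in sampling rather than lineage divergence.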
Several avenues can be pursued to circumvent population structure and sampling bias issues in species delimitation. When limited knowledge about the species assignment within a data set is available (e.g., suspected cryptic lineages; C. Li et al., 2020), programs that incorporate speciation as an extended process in which lineage splitting and completion are separate, rather than instantaneous events, can be used as an alternative to the MSC model for delineating species boundaries (e.g., DELINEATE; Sukumaran et al., 2021). For empirical systems where preliminary information or additional lines of evidence are available, widespread geographic sampling concentrated at suspected population contact zones could help differentiate populations from species (Leaché et al., 2019; Marshall et al., 2021). Additionally, reference-based taxonomy can serve as a framework to determine whether or not the genetic divergence among populations reflects species-level differentiation in non-model taxa (Galtier, 2019; Leaché et al., 2021). For example, criteria applied to differentiate data-rich taxa with uncontroversial species boundaries and with life histories and ecological traits similar to those of non-model taxa could be used as a reference to establish species-boundary thresholds in other taxa (Leaché et al., 2021; Tobias et al., 2010).

Additional Considerations for Future Work
Thus far, we have focused on the three main processes that can present challenges in species delimitation studies. However, there are other elements that are also relevant within an integrative species delimitation framework and will need consideration in future studies. Among them, there were three issues, both long-standing and emerging, that were highlighted during the NMNH workshop: the potential impact of ghost lineages in molecular species delimitation, the importance of including phenotypic data in species delimitation studies to account for other evolutionary forces, and the bias frequently introduced by obscure taxonomy.
Ghost lineages are genetic components of extinct, unknown, or unsampled lineages, which arise from ancient horizontal transfer events or hybridization and that remain in extant species (see Hibbins & Hahn, 2022; Tricou et al., 2022). The role that ghost lineages play in the evolution of extant species was deemed inconsequential or restricted to polyploid species, until recently (Luo et al., 2017; Taylor & Larson, 2019; D. Zhang et al., 2019). Thanks to genomic advances and data availability for a broader range of non-model taxa (e.g., GIGA Community of Scientists, 2014; Liew et al., 2016; ReFuGe 2020 Consortium, 2015), evidence of the confounding effect of ghost lineage remnants can be seen in genomes of extant species (D. Zhang et al., 2019). These ghost lineages can cause incorrect hybridization assessments, particularly in certain methods including ABBA-BABA, D-statistics, STRUCTURE, and ADMIXTURE, leading to wrong assessments of the species involved in introgression events and even the significance of hybridization itself (Hibbins & Hahn, 2022; Lawson et al., 2018; Tricou et al., 2022). These erroneous results could potentially lead to wrong estimations of species boundaries, particularly if one concludes ongoing hybridization between lineages. Therefore, comprehensive sampling of the phylogenetic breadth of the focal taxonomic group and adequate genomic-scale data will help to take uncertainty into account during hybridization tests. In addition, applying complementary tests to multiple scenarios and using information from all available genomes will be key to resolving issues of ghost lineages in future species delimitation studies (Hibbins & Hahn, 2022; Naciri & Linder, 2015; Tricou et al., 2022).
Pre-dating the advent of molecular-based technologies, traditional taxonomy and species delimitation were based on phenotypic distinctiveness, primarily using morphological characters (MacLeod, 2002; Saraswati & Srinivasan, 2016), which were later often found to be at odds with molecular species boundaries (Wheeler, 2005). Presently, considerable effort is needed in developing and validating new tools for capturing phenotypic complexity and improving screening for informative characters (Cadena et al., 2018; Giribet, 2010; Schlick-Steiner et al., 2007). Due to the increasing availability of genome-level data and the subsequent disparity between phenotypic and molecular information, a large proportion of species delimitation studies rely exclusively on genomics (Cadena & Zapata, 2021). Consequently, additional lines of evidence arising from the phenotype (e.g., morphology, behavior, ecology) are frequently overlooked, which could lead to inaccurate inference of species boundaries by failing to account for other evolutionary forces driving biodiversity (Cadena et al., 2018; Cadena & Zapata, 2021; Solís-Lemus et al., 2015; Sukumaran et al., 2021). For many taxa, there are logistical difficulties in examining the array of phenotypic and phenological characteristics of these organisms in their natural habitats (Ficetola et al., 2019; Knowlton, 1993). Fortunately, forthcoming studies can benefit from novel technological advances that have increased our capabilities to collect and survey in difficult-to-sample ecosystems (Aucone et al., 2023; Costa et al., 2020; Danovaro et al., 2014; Mammola et al., 2021) and the revolutionary progress in analyses to document the multiple dimensions of phenotypes (e.g., Kramer et al., 2021; Radford et al., 2014; Ramírez-Portilla, Bieger, et al., 2022; Ziegler et al., 2010). Overall, adding phenotypic data to species delimitation studies may also improve our ability to discern species boundaries along the speciation continuum (Cadena et al., 2018).
Taxonomic inflation, or the artificial increase in the number of taxa for reasons other than genuine species discovery, has remained a confounding issue in species delimitation (Agapow et al., 2004; Dubois, 2008; Isaac et al., 2004; Padial & De la Riva, 2006; Zachos, 2015). For instance, conflicting species concepts have resulted in unwarranted taxonomic descriptions, such as when "subspecies" are spuriously inflated to species level or when new taxa are incorrectly described (Dubois, 2008). Owing to its diagnosability, the widespread and exclusive use of the Phylogenetic Species Concept (e.g., the presence of reciprocally monophyletic groups) has often led to the over-splitting of species, particularly without thoughtful incorporation of all necessary information (Mace, 2004; Zachos, 2013, 2015; but see Agapow & Sluys, 2005; Padial & De la Riva, 2006). In a similar trend, the indiscriminate implementation of the MSC model disregarding its core assumptions (e.g., no recent gene flow has occurred between lineages) has led to over-splitting populations into distinct species as described above (Chambers & Hillis, 2020; Sukumaran & Knowles, 2017). Conveniently, the growing use of complementary lines of evidence to solve the multidimensional puzzle of species boundaries allows the collection of robust evidence either supporting or challenging existing taxonomic hypotheses within an integrative framework (Dayrat, 2005; Haszprunar, 2011; Pante, Schoelinck, et al., 2015; Winker, 2009). Hence, future studies first need to acknowledge that species are taxonomic hypotheses and therefore require thorough testing using all available information to avoid the false consensus effect (i.e., "we give more value to the knowledge that we think is known and accepted," B. Carstens, pers. comm. 2021) and the derived confirmation bias (i.e., the predisposition to interpret new evidence as confirmation of previously accepted hypotheses).

Marine Invertebrates
Marine invertebrate diversity is grossly underestimated (Appeltans et al., 2012); therefore, we do not have a thorough understanding of their distributions throughout the world's oceans, nor are we able to fully realize their degree of susceptibility to global ocean change. Yet, many anthropogenic impacts (e.g., warming, ocean acidification, deoxygenation, and resource extraction) clearly threaten marine invertebrate diversity, particularly important foundation species such as corals (Carpenter et al., 2008; T. P. Hughes et al., 2017; Pandolfi et al., 2003). It is clear that we need to advance our capacity to delimit and describe new marine invertebrate taxa to better document their diversity and distributions, but it will take a global effort throughout the next century.
Although the three processes outlined above affect numerous taxa across terrestrial and marine realms, we believe that they are particularly important to consider when delimiting species of marine invertebrates. Therefore, we discuss these issues in the context of delimiting marine invertebrate species and showcase a few empirical studies that have effectively addressed these issues. Notably, only a handful of studies to date have adequately addressed these concerns using multi-locus molecular data within a process-based species delimitation framework.
However, the relative contribution of incomplete lineage sorting versus hybridization ought to be gauged when adequate data are available (e.g., Weber et al., 2019). This is particularly necessary for benthic marine invertebrates that reproduce mainly by broadcast-spawning gametes into the water column (Alino & Coll, 1989; Baird et al., 2009; Crean & Marshall, 2008; Crimaldi & Zimmer, 2014; Yund, 2000). An overlap in the timing of gamete release between sympatric species, together with interspecific gametic compatibility observed in experimental crosses, has given rise to the notion that gene tree incongruence is likely caused by hybridization (Gardner, 1997; van Oppen et al., 2001; Veron, 1995; Willis et al., 2006). Quattrini et al. (2019) tested the likelihood of introgressive hybridization in a diverse genus of soft coral using ABBA-BABA and D-statistics (Fig. 2). They found that at least 15% of species were likely hybrids and concluded that introgressive hybridization was an important factor in speciation of the genus. But in a recent literature review of hybridization in coral reef ecosystems (Hobbs et al., 2022), evidence for hybrid lineages was found in only five species of stony corals, leading in most cases to a decrease in lineage diversity. Similarly, a recent study by Ramirez-Portilla et al. (2022), who robustly combined experimental crosses with genomic and morphological data, found no evidence for hybridization in several closely related stony corals. In summary, whether or not hybridization is a common evolutionary process among benthic invertebrate lineages remains to be determined. Regardless, we now have the available analytical tools and the ability to obtain sufficient genomic data to delimit species under divergence-with-gene-flow scenarios while improving our understanding of the degree to which hybridization generates or obscures lineage divergence of marine invertebrate taxa.
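To make the logic of the ABBA-BABA test concrete, the following is a minimal sketch of Patterson's D computed from site patterns in a four-taxon test (((P1, P2), P3), outgroup), with a delete-one-block jackknife Z-score. It is an illustration only and not the pipeline used in the studies cited above: the function names, the 0/1 allele coding (0 = ancestral, 1 = derived, one sampled allele per taxon), the number of blocks, and the toy data are all assumptions made for this example.

```python
from math import sqrt


def count_patterns(sites):
    """Count ABBA and BABA site patterns.

    Each site is a tuple (p1, p2, p3, outgroup) of alleles coded
    0 = ancestral, 1 = derived (hypothetical coding for this sketch).
    """
    abba = baba = 0
    for p1, p2, p3, out in sites:
        if out != 0 or p3 != 1:
            continue  # informative sites: P3 derived, outgroup ancestral
        if (p1, p2) == (0, 1):
            abba += 1  # P2 and P3 share the derived allele
        elif (p1, p2) == (1, 0):
            baba += 1  # P1 and P3 share the derived allele
    return abba, baba


def d_statistic(abba, baba):
    """Patterson's D = (ABBA - BABA) / (ABBA + BABA)."""
    total = abba + baba
    if total == 0:
        raise ValueError("no ABBA or BABA sites")
    return (abba - baba) / total


def jackknife_z(sites, n_blocks=10):
    """Z-score of D from a delete-one-block jackknife over contiguous blocks."""
    n = len(sites)
    bounds = [i * n // n_blocks for i in range(n_blocks + 1)]
    leave_one_out = []
    for start, end in zip(bounds, bounds[1:]):
        kept = sites[:start] + sites[end:]  # drop one contiguous block
        leave_one_out.append(d_statistic(*count_patterns(kept)))
    mean = sum(leave_one_out) / n_blocks
    var = (n_blocks - 1) / n_blocks * sum((d - mean) ** 2 for d in leave_one_out)
    if var == 0:
        raise ValueError("jackknife variance is zero")
    return d_statistic(*count_patterns(sites)) / sqrt(var)


# Toy data (made up): 30 ABBA sites, 10 BABA sites, 20 uninformative BBAA sites.
toy_sites = [(0, 1, 1, 0)] * 30 + [(1, 0, 1, 0)] * 10 + [(1, 1, 0, 0)] * 20
abba, baba = count_patterns(toy_sites)
d = d_statistic(abba, baba)
z = jackknife_z(toy_sites)
print(f"ABBA={abba} BABA={baba} D={d:.2f} Z={z:.2f}")
```

Under incomplete lineage sorting alone, ABBA and BABA patterns are expected in equal numbers, so D should be close to zero; a significant excess of ABBA (D > 0) suggests gene flow between P2 and P3, and an excess of BABA (D < 0) suggests gene flow between P1 and P3. With real data, blocks should be large enough to account for linkage among neighbouring sites, and the significance threshold is a choice the analyst must justify.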
Assessing species delimitation with gene flow by incorporating demographic modeling can provide even more important insights; however, this can be challenging in many marine invertebrate taxa because the demographic information (e.g., divergence times, population sizes) and genomic-scale data that are often used in these approaches are unavailable or difficult to obtain. Thankfully, we are beginning to see an overall increase in the genomic data and resources available for some marine invertebrates (e.g., genomes, transcriptomes, bait sets for target capture; Cooke et al., 2020; Cowman et al., 2020; Erickson et al., 2021; Mao et al., 2018; Quattrini et al., 2017; Reitzel et al., 2013; Wolfe et al., 2019), which, at the very least, yield the input data needed to perform demographic inference. Notably, a few recent studies focusing on corals incorporated demographic models to successfully demonstrate complex gene exchange scenarios among lineages in shallow and deep habitats (Prada & Hellberg, 2021; Prata et al., 2022; Rippe et al., 2021). For example, Prada and Hellberg (2021) eloquently showed that lineages in shallow and deep habitats diverged ~800k years ago with periods of low, but symmetrical, gene flow followed by a long episode of isolation. Following an isolation period of ~100k years, asymmetrical gene flow occurred from shallow to deep lineages. This exemplary study highlights how gene flow, differential selection, and population isolation can all be acting to shape a lineage's divergence history across a strong environmental gradient of depth. Due to the sensitivity displayed and information obtained by demographic frameworks to detect gene flow, we encourage future studies to incorporate demographic models to help elucidate species boundaries and speciation scenarios in marine empirical systems.
Under the Unified Species Concept (de Queiroz, 2007), a species is considered a separately evolving metapopulation, or one that comprises several sub-populations connected via gene flow. We believe that this species concept fits marine invertebrate populations well, as ocean currents can move larvae of sessile or low-migratory species (traits common to a majority of marine invertebrates) across large distances (10s to 100s of km, see Baco et al., 2016; Kinlan & Gaines, 2003), where they can settle in new and/or discrete habitat patches. Yet, the full extent of population structure within a species in the marine environment can be difficult to determine, and is thus likely not thoroughly considered in species delimitation studies (Pante et al., 2015b). In addition, while marine habitats may be connected via ocean currents, sampling in these environments can be highly patchy (e.g., Pante, Puillandre, et al., 2015; Pante, Schoelinck, et al., 2015). Indeed, sampling bias is rampant even in well-studied regions and biodiverse marine hotspots such as the Indo-Pacific (Keyse et al., 2014). The extensive variation in geographic sampling effort (Crandall et al., 2019; A. C. Hughes et al., 2021), particularly if focused on either end of a species' distributional range, could result in supposedly strong population structure that is, in fact, shaped via IBD. Indeed, such strong population structure could lead to incorrectly assigning subpopulations of one species to multiple species. With a worldwide distribution of samples combined with genomic-scale data, Glon et al. (2021) simultaneously illuminated cryptic species and geographic structure within a genus of sea anemone, highlighting the importance of thoroughly sampling across geographic space to resolve species boundaries within a morphologically challenging taxonomic group of marine invertebrates.
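The effect of sampling only the ends of a range can be illustrated with a small sketch of a Mantel-type permutation test for isolation by distance, which correlates pairwise geographic and genetic distances. Everything here is an assumption for illustration: the function names, the site positions along a hypothetical 1,000 km coastline, and the simulated genetic distances, which are generated as a strict linear function of geographic distance so that a single continuous cline is true by construction.

```python
import random
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mantel(geo, gen, n_perm=999, seed=0):
    """One-sided Mantel test: correlation of two distance matrices,
    with significance from joint row/column permutations of `gen`."""
    n = len(geo)
    pairs = list(combinations(range(n), 2))
    x = [geo[i][j] for i, j in pairs]
    r_obs = pearson(x, [gen[i][j] for i, j in pairs])
    rng = random.Random(seed)
    idx = list(range(n))
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(idx)  # relabel sampling sites at random
        y = [gen[idx[i]][idx[j]] for i, j in pairs]
        if pearson(x, y) >= r_obs - 1e-12:
            hits += 1
    return r_obs, (hits + 1) / (n_perm + 1)


# Hypothetical sampling: three sites at each end of a 1,000 km range.
positions_km = [0, 10, 20, 980, 990, 1000]
geo = [[abs(a - b) for b in positions_km] for a in positions_km]
# Simulated genetic distance rises linearly with geographic distance (pure IBD).
gen = [[1e-4 * d for d in row] for row in geo]

r, p = mantel(geo, gen)
within = [gen[i][j] for i, j in combinations(range(6), 2) if (i < 3) == (j < 3)]
between = [gen[i][j] for i in range(3) for j in range(3, 6)]
print(f"Mantel r = {r:.3f}, p = {p:.3f}")
print(f"mean distance within ends = {sum(within) / len(within):.4f}, "
      f"between ends = {sum(between) / len(between):.4f}")
```

In this toy case, genetic distances between the two ends are far larger than those within either end, which a clustering analysis could read as two distinct groups, even though the data were generated from one continuous cline. Testing for isolation by distance, and filling the geographic gap with additional samples where possible, helps distinguish such a pattern from a genuine species boundary.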
Even when accounting for sampling biases, the heterogeneity of structuring patterns can be strong in marine organisms within the same biogeographic region or between closely related species (e.g., Ayre & Hughes, 2000; Crandall et al., 2019; Severance & Karl, 2006). This level of population heterogeneity calls for careful examination of preconceived ideas about marine populations being "open" and highly connected (Cerca et al., 2018; Cowen et al., 2000; Kinlan & Gaines, 2003; Palumbi, 2003; Paulay & Meyer, 2002). Current evidence from multiple studies points towards a more complex marine dispersal scenario where other major factors, such as environmental conditions, can also be at play in shaping population differentiation (Galindo et al., 2010; Selkoe & Toonen, 2011; Weersing & Toonen, 2009). Therefore, in addition to the geographic structuring based on degree of gene flow in populations, there could be a significant effect of the oceanic environment on the evolution of resident taxa (Cowen et al., 2007; Dawson & Hamner, 2008; Denny, 1993; Hare et al., 2005; Liggins et al., 2013; Strathmann, 1990; Vermeij & Grosberg, 2010). Complex gene-flow patterns might be further influenced by differential selection to environmental conditions, particularly across depth (Prada & Hellberg, 2013, 2021; Titus, Blischak, et al., 2019). Species might be connected across hundreds to thousands of kilometers within similar depths, but differentiate strongly across different depths at sites only a few kilometers apart (Galaska et al., 2021; Johnston et al., 2022; Quattrini et al., 2022a). Therefore, it is critical to take into account sampling bias not only across a species' geographic range but also across its bathymetric range. Many cryptic species of sessile invertebrates, in particular, might be living just a few meters deeper than their sister taxa in nearby habitats (Johnston et al., 2022; Knowlton et al., 1992; Prada & Hellberg, 2013, 2021).
The diversity of population connectivity patterns from different empirical systems (e.g., Prata et al., 2022; Serrano et al., 2014; Severance & Karl, 2006; Warner et al., 2015) illustrates the complexity of the marine population seascape and the variety of selective pressures on species' traits, both of which can challenge our ability to successfully delimit species in the marine environment. Recent advances in seascape genomics (i.e., studies that use topographic, environmental, and oceanographic conditions as statistical predictors of population genomic patterns; Bongaerts et al., 2021; Galindo et al., 2010; Grummer et al., 2019; Riginos et al., 2016; Selmoni et al., 2020) could help disentangle population connectivity within the oceanic realm (Galaska et al., 2021; Liggins et al., 2019; Riginos & Liggins, 2013) and guide discovery and validation of species boundaries. Integrating genomics with data from geographic information systems (GIS) could help in understanding the significance of spatial scales for assessing the drivers of geographic genetic variation and population structure in nonmodel species for which sampling is sparse (Dalongeville et al., 2018; Riginos et al., 2016, 2019). Within a species delimitation framework, we anticipate that such assessments will expand the information available for empirical systems and therefore guide primary species hypothesis testing (i.e., validation) by incorporating better-informed data and robust proxies.

Concluding Remarks
In this perspective, we highlighted major themes discussed during the species delimitation workshop at the Smithsonian NMNH and detailed a few intrinsic features that can challenge any species delimitation study. It is clear that unambiguous delimitation of species boundaries is far from trivial and may be hindered by the continuous nature of speciation itself. Therefore, defining species criteria, using an integrative taxonomic framework with appropriate analyses, and even knowing your organism's natural history are all critical components for successfully delimiting boundaries among species. In addition, we can gain much more from process-based species delimitation studies than just knowledge of species boundaries. These studies can help illuminate the speciation process in a particular empirical system. Indeed, one of the most important insights gained from our species delimitation workshop is to try and delimit species while also inferring their history (B. Carstens, pers. comm. 2021).
It is essential to recognize, and try to account for, the processes outlined above (ILS, gene flow, and population genetic structure) when delimiting species. Genomic-scale studies help to circumvent dependence only on single-locus markers, thus helping to remove erroneous results due to ILS. In addition, it is important to consider population structure and gene flow, particularly in marine invertebrates that broadcast spawn their gametes, which are subsequently transported via ocean currents. These combined factors create the potential for high levels of gene flow between species and high degrees of population connectivity recognized only through isolation by distance (Palumbi, 1994). To this end, it is critical to also consider sampling bias in both terrestrial and marine systems. Due to the Racovitzan impediment (i.e., the constraints to survey and study species from inaccessible environments; sensu Ficetola et al., 2019), we recognize that it is often not possible to sample across a species' range, particularly in habitats with limited access (e.g., isolated reefs, deep-sea habitats, mountain tops). Therefore, we suggest stating the potential for this bias in study systems and possibly remedying the issue by leveraging museum samples. Recent work has shown the promise of incorporating DNA from historically preserved specimens into genomic studies (Derkarabetian, Castillo, et al., 2019; Tsai et al., 2020; Untiedt et al., 2021), enabling us to unlock the full utility of museums' collections.
When species boundaries have been successfully established in focal taxonomic groups, the scientific community should follow with formal species descriptions and taxonomic treatments (Bonito et al., 2021; Pante, Schoelinck, et al., 2015). Taxonomic revisions and descriptions can be done within larger species delimitation frameworks or follow in separate publications for targeted journals. We recognize that many early-career researchers might avoid spending too much time on taxonomic-focused publications as these are often undervalued within the scientific community at large. However, the scientific field must recognize the value in taxonomy and the need for the next generation of scientists to have taxonomic expertise, particularly as we face worldwide biodiversity loss. By integrating process-based species delimitation into taxonomic studies, publications that simultaneously advance knowledge of speciation while formally describing species can be published in higher-impact journals (see Arrigoni et al., 2020; Esquerré et al., 2019; Venkatraman et al., 2018 as examples). Furthermore, by citing taxonomic papers more often, we increase the impact factors of the journals they are published in and boost citation numbers, both of which are particularly important for early-career researchers.
Finally, taxonomists need to embrace the genomic era and the novel methods that can be used to understand species boundaries and the speciation processes that shape them. To this end, collaboration is key to integrating taxonomic descriptions within a process-based species delimitation framework (Bonito et al., 2021; Pante, Schoelinck, et al., 2015). For instance, process-based species delimitation calls for researchers' expertise in their study system to propose reasonable sets of models and parameters to examine, such as divergence times, gene flow, migration, and likely species tree topologies (Smith & Carstens, 2020). In light of the evolutionary processes and groupings elucidated within the grey zone by process-based approaches, it is up to taxonomists to determine if the status as robust species hypotheses is warranted and under which particular species concept. By incorporating such process-based species hypotheses into online, open access databases (Costello et al., 2015), we can move beyond national borders to help encourage international collaborations.
We hope our perspective highlights the need for more emphasis on process-based species delimitation of marine invertebrates. This emphasis should include both empirical studies of species delimitation, along with formal taxonomic treatment, and studies on the application of new analytical approaches to different marine invertebrate groups. Due to the contrasting properties between sea and land ecosystems, having a deeper understanding of these organisms will continue to shed light on speciation processes, including ecological speciation, gene flow and hybridization, and reproductive isolation (Strathmann, 1990). Otherwise, how can we truly understand speciation without focusing on these overlooked animal groups living in the largest, and arguably most important, biome on Earth? We urge those who study species delimitation methods and speciation theory to consider including taxa such as marine invertebrates in their research and those who study marine invertebrates to use the advanced methods derived in terrestrial systems. In combination, we can then significantly advance knowledge on speciation and species boundaries within our world's oceans.

Figure 1. Common processes within the grey zone of species delimitation.
For example, Sánchez et al. (2021) inferred robust species limits

Figure 2. Assessing the role of hybridization while delimiting species boundaries in a cryptic marine invertebrate species complex. Species hypotheses of sympatric clades from the octocoral genus Sinularia (=Sclerophytum; McFadden et al., 2022) were evaluated by Quattrini et al. (2019) using a combination of approaches based on loci obtained through restriction-site associated DNA sequencing (RADseq). This figure, adapted from Quattrini et al. (2019), shows the results for a clade containing four morphospecies and conflicting hypotheses for molecular operational taxonomic units (MOTUs, 0.3% genetic distance threshold) using two different molecular barcodes (mtMutS = 6 MOTUs, 28S rDNA = 4 MOTUs). (a) Maximum likelihood phylogeny for Sinularia clade 4 with 100% bootstrap node support (200 replicates) unless indicated otherwise. The five supported molecular species hypotheses in the RADseq phylogeny are shaded, and dark solid symbols indicate those supported by either morphology (triangles) or MOTUs delineated using the mtMutS (circles) or the 28S rDNA (squares) barcodes. (b) Plot of a discriminant analysis of principal components (DAPC) color-coded according to species matching those from the phylogeny. (c) ABBA-BABA test results for admixture, where test numbers are displayed above the corresponding bar for each 4-taxon test (((p1, p2), p3), p4) using S. humilis in grey bars as outgroup. The colored bars indicate the taxa included in each test, and individuals are listed at the right side, shaded according to species in the phylogeny. Each test consisted of assessing whether the "black bars" (P3) shared more derived SNPs with the "blue bars" (P1) relative to the "orange bars" (P2). Tests with black stars on top correspond to those with significant D-statistics (alpha = 3), in this case signaling gene flow particularly between P2 and P3 (ABBA, black and orange bars). (d) Barplots depicting the probability of individual membership to each cluster obtained for the suggested K values (K = 4 for clade 4 and K = 2 for the S. tumulosa species group). A picture of the S. tumulosa morphospecies is depicted above its corresponding barplot, which separates it into two different molecular species. Picture and identification credit to Leen van Ofwegen's collection. For more details, refer to the original publication.
Perspectives on the Grey Zone of Species Delimitation With Notes on Invertebrates in the Marine Environment Bulletin of the Society of Systematic Biologists
