Phylogenomics of Fresh and Formalin Specimens Resolves the Systematics of Old World Mud Snakes (Serpentes: Homalopsidae) and Expands Biogeographic Inference

Our knowledge of the biodiversity of Asia and Australasia continues to expand with more focused studies on systematics of various groups and their biogeography. Historically, fluctuating sea levels and cyclic connection and separation of now-disjunct landmasses have been invoked to explain the accumulation of biodiversity via species pump mechanisms. However, recent research has shown that geological shifts of the mainland and species dispersal events may be better explanations of the biodiversity in these regions. We investigate these processes using the poorly studied and geographically widespread Mud Snakes (Serpentes: Homalopsidae) using a target capture approach of ~4,800 nuclear loci from fresh tissues and supplemental mitochondrial data from formalin tissues from museum specimens. We use these datasets to reconstruct the first resolved phylogeny of the group, identify their biogeographic origins, and test hypotheses regarding the roles of sea-level change and habitat selection on their diversification. Divergence dating and ancestral range estimation yielded support for an Oligocene origin and diversification from mainland Southeast Asia and Sundaland in the rear-fanged group ~20 million years ago, followed by eastward and westward dispersal. GeoHiSSE models indicate that niche expansion of ancestral, rear-fanged lineages into aquatic environments did not impact their diversification rates. Our results highlight that Pleistocene sea-level changes and habitat specificity did not primarily lead to the extant species richness of Homalopsidae and that, alternatively, geological shifts in mainland Southeast Asia may have been a major driver of diversity in this group. We also emphasize the importance of using fresh and degraded tissues, and both nuclear and mitochondrial DNA, for filling knowledge gaps in poorly known but highly diverse and conceptually important groups. Here, Homalopsidae represents a non-traditional but effective model study system for understanding transitions between terrestrial, marine, and freshwater environments


Introduction Geological Paradigms of Southeast Asia and Australasia
The increasing use of phylogenomic approaches has resolved the evolutionary histories of many organismal groups, resulting in a better understanding of their diversification, biogeography, and trait evolution (Blair et al., 2018;Hallas et al., 2022;Streicher & Ruane, 2018).Recurrent systematic studies in 'model regions' might find that the same geological processes are responsible for the diversification of disparate taxonomic groups, providing insight into the effects of broad geological processes on speciation (Brown et al., 2013).South and Southeast Asia, New Guinea, and northern Australia are excellent examples of areas with complex geological histories that have influenced the diversification of many groups (e.g., Brown et al., 2013;de Bruyn et al., 2014); these regions have undergone significant geological shifts, including tectonic uplift, river catchment events, and Pleistocene sea-level fluctua-tions, the latter of which have cyclically joined and disconnected landmasses that are isolated today (Brown et al., 2013;Hall, 2009;Hutchison, 1989;Rainboth, 1996;Workman, 1977).The sea-level fluctuations of this region are often cited as a driver of speciation and population structure in invertebrates and vertebrates (Li & Li, 2018;Wu et al., 2022;Wüster et al., 2005).However, recent geological studies that better identify the dates of land bridges (Husson et al., 2020;Sarr et al., 2019) suggest that current hypotheses must be revisited to test if newly identified processes in other taxa also affect diversification and dispersal scenarios (Garg et al., 2022;Sholihah et al., 2021).

Focal System
Caenophidian snakes ('advanced snakes') of the family Homalopsidae, known as Old World Mud Snakes, are a morphologically and ecologically diverse group distributed throughout Asia and Australasia (excluding New Zealand; Gyi, 1970;Murphy, 2007).With 58 species in 29 genera, Homalopsidae is split into two major lineages: the fangless group (3 genera; 10 species) and rear-fanged group (26 genera; 48 species).The fangless homalopsids, only found on Sumatra, the Moluccas, and the Bird's Head Peninsula of New Guinea are poorly known but include aquatic and a few terrestrial, possibly vermivorous, species (Murphy, 2012;Murphy, Mumpuni, et al., 2012).The rear-fanged group is represented by mildy venomous, aquatic snakes that display a variety of natural history traits, morphologies, feeding behaviors, diets, and microhabitat preferences (Brooks et al., 2009;Fabre et al., 2016;Jayne et al., 2018;Murphy, 2007).Rear-fanged homalopsids are widely distributed across South and Southeast Asia, New Guinea, and northern Australia (Murphy & Voris, 2014).The distribution and habitat use of homalopsids makes them an ideal system to test hypotheses regarding Asian biogeography.However, a comprehensive understanding of the evolutionary history of this group is precluded by the lack of a well-resolved phylogeny.
Previous phylogenetic studies of Homalopsidae have been limited to two loci and less than half of the known species in the group (22 species, Alfaro et al., 2008;34 species, Bernstein et al., 2021), leaving many relationships uncertain.Additionally, 22 species (9 genera) are each known from only one or few specimens and lack tissue preserved specifically for DNA extraction (Burrell et al., 2015;Card et al., 2021;Simmons, 2014).Fortunately, within the last decade a variety of methods to obtain DNA from these intractable (also called 'formalin-fixed,' 'museum,' or 'historic'; we use 'degraded' for the remainder of the text) specimens have been developed (Hykin et al., 2015;O'Connell et al., 2021;Ruane & Austin, 2017;Totoiu et al., 2020), providing new opportunities for uncovering hidden diversity, resolving phylogenetic placement of rare species, and filling gaps in speciation and extinction hypotheses (Roycroft et al., 2021;Ruane, 2021).These advances, and the increased ease of acquiring genomic data, provide an opportunity to resolve the evolutionary relationships and timing of diversification in a family-wide gap in the phylogeny of snakes.

Objectives
Increased genomic and taxon sampling will allow for the testing of hypotheses regarding homalopsid evolution.The estimated divergence times of the crown group have ranged from ~20-55 Ma using single-and multi-locus approaches.Bernstein et al. (2021) hypothesized that Pleistocene sealevel fluctuations were responsible for the diversification of many homalopsids, but low information in non-mitochondrial sequences and many missing taxa resulted in ambiguous and possibly overestimated divergence times for genera.Phylogenetic uncertainty has also limited inference of the biogeographic origins and drivers of diversity within Homalopsidae.More than half of homalopsid diversity is known from mainland Southeast Asia, so it has been hypothesized to be the biogeographic area of origin of this family (Murphy, 2007).While sea-level fluctuations are often thought to be responsible for diversification in this biogeographic region, the environment can also be a contributing factor.Unlike the fangless group (3 genera, 10 species), half of which are terrestrial, the more speciose rear-fanged homalopsids (26 genera, 48 species) inhabit a variety of aquatic habitats.This difference in species richness between the two subgroups of Homalopsidae might indicate that the use of different aquatic environments has facilitated the diversification of the rear-fanged group.
To investigate hypotheses regarding homalopsid evolution, as well as the impacts of broader biogeographic paradigms in Southeast Asian vertebrate evolution (e.g., sealevel fluctuations, potential environmental influences on diversification), we use a target capture probe set to sequence thousands of loci (Faircloth et al., 2012;Lemmon et al., 2012;Singhal et al., 2017).We use these data, along with supplemental mitochondrial (mt)DNA, to test the hypotheses that i) diversification dates of homalopsids will predate Pleistocene sea-level changes, ii) homalopsids originated in mainland Southeast Asia, and iii) that living in a variety of aquatic environments (freshwater and brackishwater) led to increased diversification rates of the speciose rear-fanged homalopsid clade.We provide a resolved phylogeny of Homalopsidae, phylogenetically place five genera that have never been included in any evolutionary study to date, and discuss Oligocene divergence and range expansion of homalopsids that predate sea-level change in the Pleistocene.

Methods
We used both fresh tissues (high quality tissues with easily sequencable DNA) and formalin-preserved tissues or those that were taken from whole specimens fixed in ethanol (low quality, degraded DNA) from natural history museums and field collecting efforts.We targeted ~5,200 nuclear loci to generate a species tree based on genomic data for biogeographic and diversification rate analyses.Due to difficulty in capturing nuclear loci for the degraded tissues (see Results), we repeated our analyses on a cytochrome b (cytb) tree generated from mitochondrial bycatch of degraded samples and sequence data from Bernstein et al. (2021).
Phylogenomics of Fresh and Formalin Specimens Resolves the Systematics of Old World Mud Snakes (Serpentes: Homalopsidae)...

Taxon Sampling and Data Generation
We obtained tissue from 157 homalopsids, consisting of 47 (~81%) species and 27 (~93%) genera (see Supplementary Appendix S1).Of these, 41 tissues (20 species) were from degraded samples.These samples from preserved specimens include members of the fangless and rear-fanged homalopsids and include 14 species and 12 genera that have never been included in any phylogenetic analysis.Because a majority of molecular studies on this group have used mtDNA, we include published cyt-b sequences from NCBI (Genbank) to supplement our taxon and locus sampling for the degraded samples (Alencar et al., 2016;Alfaro et al., 2008;Bernstein et al., 2021;Colli et al., 2002;Karns et al., 2010;Kumar et al., 2012;Murphy, Mumpuni, et al., 2012;Murphy, Voris, et al., 2012;Ukuwela et al., 2017;Wüster et al., 2002;Supplementary Appendix S1).Additionally, we obtained a sample of a rear-fanged homalopsid, Homalophis doriae, which has not been included in any phylogenetic study, after genomic sequencing.To include this taxon in our phylogeny, we used the DNA extraction and PCR protocols of Bernstein et al. (2021) to sequence cyt-b and add it to our cyt-b dataset containg fresh and degraded samples.To further maximize our taxon sampling, we downloaded the genome of Myanophis thanlyinensis (Köhler et al., 2021) to extract nuclear loci using the BLAST (megaBLAST) function in Geneious v11.1.5and include them in our pipelines and analyses.We incorporated two vipers (Bothrops moojeni, Bothrops pauloensis), a colubrid (Chironius exoletus), a dipsadid (Philodryas olfersii), and one elapid (Micrurus brasiliensis) from Singhal et al. (2017) as outgroups.
Total genomic DNA from fresh tissues of muscle and liver was extracted using Qiagen® DNeasy blood and tissue kit protocols.Tissues from museum specimens (liver and muscle) were extracted using the protocol of Ruane and Austin (2017).This method uses a heated alkali buffer solution and modifications to the Qiagen® kit protocol to increase the DNA yield from intractable specimens.Two of our museum samples were bone (see Supplementary Appendix S1).For these, 122 mg (sample 1) and 14 mg (sample 2) of bone were frozen with liquid nitrogen and pulverized with a mortar and pestle.Using protocols from Allentoft et al. (2018;2015), the bone powder was then incubated for 24 h at 45 °C in a ~5 mL digestion buffer containing 4.7 mL 0.5 M EDTA buffer, 50 μL of proteinase-K (0.14-0.22 mg/ mL, Roche, Basel, Switzerland), 250 μL 10% N-Lauryl-Sarcosyl, and 50 μL TE buffer (100×).An additional 35 μL of proteinase-K was added at 23 h.Then, 400 μL of AL buffer and 400 μL of 100% ethanol per 650 μL of lysis solution was added to the mixture.Finally, 200-μL batches of the mixture were used starting at step 7 of the Qiagen® DNeasy® Blood & Tissue Kit protocol for compact animal bone.All DNA extractions were performed on surfaces that were sterilized with bleach and with UV-sterilized equipment and filter pipette tips; the 41 samples with degraded DNA were extracted in a separate lab space.We used a Qubit 3 fluorometer (high sensitivity; Thermo Fisher Scientific: Invitrogen) to quantify all extractions.
Genomic DNA was sent to Daicel Arbor Biosciences (Ann Arbor, Michigan).Fresh samples were sonicated and size selected following a protocol to produce an average insert length of approximately 500 nucleotides (nt).Up to 200 ng of sonicated and size-selected DNA from the fresh samples was used for input in a library preparation method optimized for targeted capture using the Squamate Conserved Loci (SqCL) v2 probe set (Singhal et al., 2017).This probe set targets 5,462 nuclear loci consisting of ultraconserved elements (UCEs), anchored hybrid enrichment loci (AHEs), and nuclear protein coding genes (NPCGs) commonly used in Squamate phylogenetic studies (e.g., BDNF, CMOS, RAG2).Unique dual-index combinations were added to each sample via 6 cycles of PCR amplification.For the degraded samples, a single-stranded library preparation chemistry appropriate for short and degraded fragments was applied to the samples in a cleanroom setting using up to 5 ng of DNA as input.Unique dual-index combinations were added to each sample via 12 cycles of PCR amplification.
The degraded and fresh indexed libraries were quantified with both a spectrofluorimetric assay and a quantitative PCR assay.To prepare for capture, libraries were pooled in equimolar ratios for capture (7-to 10-plex captures for fresh, 3-to 5-plex captures for degraded) and dried down to 7 μL by vacuum centrifugation.Captures were performed following the myBaits v5 protocol with an overnight hybridization.Hybridization and washes were performed at 60 °C for the degraded and 65 °C for the fresh samples.For each sample, half of the volume of beads in the elution buffer were amplified for 10 cycles.For captures that did not yield sufficient mass, the second half of bead volume was amplified for 14 cycles.Final capture pools were quantified again with both a spectrofluorimetric assay and a quantitative PCR assay and were also visualized on an Agilent Tapestation 4200.One sequencing pool was made from the captures composed of fresh samples.A second sequencing pool was made from the captures composed of degraded samples.A third sequencing pool was prepared with unenriched libraries from the degraded samples, combined in equimolar ratios.Due to the presence of residual dimer in the degraded samples (both enriched and unenriched), a gel excision was required to remove the dimer.The two degraded pools were quantified and visualized a second time.Because there was residual dimer, the unenriched pool was reamplified for 6 cycles, and the gel excision and quantification were repeated.A final sequencing pool for the degraded samples was prepared by combining the enriched (85%) and unenriched (15%) pools.Samples were sequenced on the Illumina NovaSeq 6000 platform on partial S4 PE150 lanes.

Bioinformatics and Phylogenomic Analyses
To trim adapters and barcodes from raw reads, we used illumiprocessor (Del Fabbro et al., 2013;Faircloth, 2011;Lohse et al., 2012) with default settings.Reads were then assembled with SPAdes (Bankevich et al., 2012) and processed for phylogenomic analysis using the Phyluce v1.7.1 pipeline (Faircloth, 2016).Due to computational constraints and Phylogenomics of Fresh and Formalin Specimens Resolves the Systematics of Old World Mud Snakes (Serpentes: Homalopsidae)... Bulletin of the Society of Systematic Biology some samples having higher numbers of reads, we subsampled 3.5 million reads from each pair of reads (up to 7 million reads total) using seqtk (https://github.com/lh3/seqtk).Alignments of homologous nucleotide sites for each locus were edge-trimmed using Gblocks, and data matrices were created for each locus that contained at least 75% of the taxa in the dataset.Due to varying degrees of missing data in historic specimens with degraded DNA (see Results), we ran the Phyluce pipeline separately for fresh and degraded samples.To obtain individual loci from degraded specimen raw reads, and confirm that obtained loci were not artefactual sequences, we created pseudoreference genomes in Geneious using loci from the fresh homalopsid sample that recovered the highest number of complete loci during data assembly.Using a pseudoreference provides a computationally less-demanding and more efficient way to identify loci obtained from degraded DNA in museum specimens (Bernstein & Ruane, 2022).We then mapped contigs obtained from Phyluce to this reference.Using loci greater than 200 bp, we manually created DNA alignments using the Geneious Alignment option under default parameters in Geneious R11.To get a better understanding of DNA alignment quality between those with and those without degraded samples, we calculated parsimony informative sites using the ape and ips packages in R (Heibl, 2008;Paradis & Schliep, 2019) and locus lengths in Geneious.To extract mitochondrial bycatch, we mapped raw reads from our museum samples to the Myanophis thanlyinensis homalopsid mitochondrial genome (Köhler et al., 2021).
We generated three trees to use as input for downsteam analyses: 1) a 'nuclear species tree' consisting of targeted loci from fresh tissues only, 2) a mitochondrial 'cyt-b tree' from both fresh and degraded tissues, and 3), a 'concatened nuclear tree' of homalopsids from only fresh tissues using the targeted nuclear loci to assess the monophyly of species and determine divergence dates at the population level.For the concatenated nuclear tree, we concatenated the alignments for each locus and ran a maximum likelihood tree search using IQ-TREE v1.6.12 (Nguyen et al., 2014), searching for the best nucleotide model for each dataset using ModelFinder (Kalyaanamoorthy et al., 2017), and assessed branch support with 1,000 ultrafast bootstrap (UFB) iterations (Hoang et al., 2017).Nodes with UFB ≥ 95 are considered strongly supported relationships (Hoang et al., 2017).For the nuclear species tree, we ran a coalescent tree search for species tree analysis in ASTRAL-III (Zhang et al., 2018).We generated individual nuclear trees for each locus using IQ-TREE with the same parameters used for the concatenated nuclear tree.These trees were used as input for ASTRAL-III, which was run using default parameters.Relationships were considered supported if Bayesian Posterior Probabilities (Bpp) were ≥ 0.95.
Our approaches were able to obtain some nuclear loci for degraded specimens (see Results), but the level of overlap with fresh specimens and the short alignment lengths preclude their use in the concatenated nuclear and species tree approaches.Cyt-b has been used in many studies on homalopsids (Alfaro et al., 2008;Bernstein et al., 2021;Quah et al., 2018), and, while it is informative enough to produce consistent, supported relationships for some genera, many taxa are poorly supported and have 'unstable' phylogenetic positions.Thus, to identify the phylogenetic position of the specimens with degraded DNA, we reconstructed a cyt-b tree using RAxML-NG v 1.1.0(Kozlov et al., 2019) with one tip per species, using the nuclear species tree as a backbone constraint on the topology.We consider bootstraps ≥ 70 to be supported.Tree reconstruction was run with 1,000 bootstrap iterations and using a GTR+G model of nucleotide evolution.

Divergence Dating, Biogeographic, and Modeling Analyses
For comparative reasons, we perform divergence dating seperately on three trees: 1) the nuclear species tree, which contains lower taxonomic sampling but greater locus sampling, 2) the cyt-b tree (one tip per species), consisting of a single locus but a higher taxonomic sampling than the nuclear species tree, and 3) the concatenated nuclear tree (containing more than one tip per species).Additionally, ancestral range estimation and hidden state speciation and extinction (HiSSE) model analyses were run on the nuclear species tree and cyt-b tree to compare potential differences in results due to greater taxonomic sampling in the cyt-b tree.
The estimation of divergence dates can become computationally demanding when many loci are included, so we used treePL v1.0 (Smith & O'Meara, 2012) to estimate the divergence times of our concatenated nuclear, nuclear species tree, and cyt-b trees.We iteratively ran the analysis until convergence, using optimal parameters obtained for the run and the 'thorough' and 'prime' commands, respectively.To ensure that this method obtained consistent results, we performed this process five times.We determined the best smoothing parameter, which affects the penalty for rate variation across the tree, by using random subsample and replicate cross-validation (RSRCV).The RSRCV approach randomly samples multiple terminal nodes with replacement and calculates the rates and dates of the tree with the terminal nodes removed; the average error is then sampled over the nodes (Smith & O'Meara, 2012).The smoothing parameter with the lowest error (= 0.1) was chosen to run the analysis.No known homalopsid fossils exist, thus we rely on one fossil and two secondary calibrations of our outgroup taxa.Using the fossils provided in Head et al. (2016), we used Coluber cadurci to calibrate the divergence of Colubridae+Elapoidea, with a minimum threshold of 30.9 million years (Myr) on the (Micrurus brasiliensis,(Chironius exoletus, Philodryas olfersii)) node in our trees.As fossils often only represent minimum ages for calibrations, we used custom R scripts with the ape (Paradis & Schliep, 2019) and phytools (Revell, 2012) packages to extract the upper bound of the 95% confidence interval from the Colubridae+Elapoidea node in Burbrink et al. (2020), making a constraint of 30.9-46.75 Myr.Using this R script and the same phylogeny from Burbrink et al. ( 2020), we also obtained the 95% confidence intervals to create lower and upper bounds on two additional nodes in the tree: i) Colubroidea+Viperidae (common ancestor of all outgroup Phylogenomics of Fresh and Formalin Specimens Resolves the Systematics of Old World Mud Snakes (Serpentes: Homalopsidae)...
Using two of these time-calibrated trees (nuclear species and cyt-b trees), we estimated the ancestral ranges and dispersal history of Homalopsidae using the R package Bio-GeoBEARS (Matzke, 2013a).We used our ingroup (Homalopsidae) from our time-calibrated cyt-b tree and nuclear species tree (see below) as input, pruning outgroups from the tree using ape (Paradis & Schliep, 2019).Although we do not have 100% of homalopsid diversity in the cyt-b tree (see Results), the species that are not included occur in the same ecoregions that are used as input for BioGeoBEARS, thus it is likely we are not missing any important range transitions/states.Additionally, several studies have used BioGeoBEARS to obtain biologically meaningful results, even when species-level sampling has ranged from ~54-69% (Peterson et al., 2022;Schools et al., 2022) and when genus-level sampling was ~85% (Fric et al., 2022).
We coded areas using eight distinct geographic regions (Fig. 1), based on the geographic distributions of homalopsids (Murphy & Voris, 2014) and the geological history of these regions, particularly during periods of Pleistocene land bridge (Hall, 2009;Voris, 2000): Indochina (I), Sundaland (N), Philippines (P), Micronesia (M), Wallacea (W), South Asia (S), Australia (A), and New Guinea (G).The Malay Peninsula has been repeatedly separated and connected from the Greater Sunda Islands during Pleistocene sea-level fluctuations (Hall, 2009;Voris, 2000), thus we group the Greater Sunda Islands and Malay Peninsula as one distinct region (Sundaland).Similar to Weinell et al. (2020), we treated the Isthmus of Kra (5°-13° N), a zone of species turnover on the Thai-Malay Peninsula between Indochina and Sundaland, as a boundary.This region is where many faunal ranges of Indochina and Sundaland reach their southern-and northernmost distributions, respectively (de Bruyn et al., 2004;Hughes et al., 2003).Other regions were defined based on geographic distributions of species and endemism as well as geographic changes in topography (e.g., India separated from Indochina near the Arkan Mountains; Wallacea as oceanic islands separated from Sundaland and Australasia; New Guinea and Australia repeatedly separated and connected during the Pleistocene).Additionally, while parts of East Asia (e.g., coast of China and Taiwan) are not traditionally considered part of Indochina, we include them here given the continuous range of some homalopsids in Indochina.We created different dispersal scenarios within regions and between adjacent regions that have current contiguous landscapes or regions that were connected during the Pleistocene (e.g., Sundaland, Indochina to Sundaland, Australasia [Australia and New Guinea]); this created a total of 13 dispersal scenarios.We did not allow connection between Borneo and the Philippines as it is not known to what extent flora and fauna have dispersed into the Philippines via Palawan or the Sulu Archipelago (Brown et al., 2013), and two homalopsid species are currently documented to inhabit the Philippines.
We estimated ancestral ranges by testing six models: Dispersal-Extinction-Cladogeneses (DEC; Ree & Smith, 2008); DIVALIKE, which is a likelihood version of the parsimony model Dispersal-Vicariance (DIVA; Ronquist, 1997); and the BAYAREALIKE model, a likelihood version of the BAYAREA model (Landis et al., 2013).These models vary in the range evolution processes that can occur during cladogenesis (Matzke, 2013a(Matzke, , 2014)).The DEC model assumes that daughter lineages will inherit the ancestral area state if the most recent common ancestor (MRCA) is limited to a small range (single area), or, if the MRCA is widely distributed, the daughter lineage will inherit a range that is within the MRCA's ancestral area (Ree & Smith, 2008).The DIVA model assumes that speciation is dependent on vicariance events and does not make assumptions of relationships between areas (Ronquist, 1997).Lastly, the BAYAREA model assumes no range evolution during cladogenetic events, so the daughter lineages inherit the ancestral range of the MRCA (Landis et al., 2013;Matzke, 2013b).We also computed the likelihoods of these models with the '+J' jump dispersal parameter included to allow for founder-event speciation: DEC+J, DIVALIKE+J, BAYAREALIKE+J (Matzke, 2013b(Matzke, , 2014)), for a total of six models.It has been shown that statistical problems can arise when using the DEC and DEC+J models and that '+J' parameterizes the mode, but not the rate, of speciation, leading to inaccurate biogeographic inference (Ree & Sanmartín, 2018); there are also concerns that BioGeoBEARS does not take extinct lineages into account.However, simulations in Matzke (2014) show that BioGeoBEARS inference is not badly biased if extinction is random (an assumption we have made) and also that d and j parameters are identifiable (Klaus & Matzke, 2020;Matzke, 2014).Additionally, the statistical validity of using models implemented in BioGeoBEARS has been supported when compared to ClaSSE models (Matzke, 2022).We statistically compared the fit of all models under different dispersal scenarios using the Akaike Information Criterion (AIC; Akaike, 1974) and the AIC corrected for small sample sizes (AICc; Burnham & Anderson, 2004a, 2004b), considering model schemes with the lowest AIC and AICc scores to be the best fit models.Likelihood ratio tests were calculated to determine if likelihoods for models with and without the +J parameter were statistically different from each other.We also ran the ancestral range estimation on the nuclear species tree with only fresh samples; while we focus on the results of our cyt-b tree, the results for the ancestral range estimation of the nuclear species tree can be found in Supplementary Text S1.
Before performing HiSSE model analyses that assume the ancestral rear-fanged homalopsid was aquatic, we pefomed Ancestral Character Estimation (ace) in the R package ape (Paradis & Schliep, 2019) to determine the ancestral state of this clade.Because the input tree data is discrete, the estimation of character transitions were calculated using the Maximum Likelihood method (Pagel, 1994).For this analysis, we used our nuclear species tree as an input phylogeny and scored habitat states using three schemes: i) terrestrial or aquatic, ii) terrestrial or freshwater (FW) or brackish water (BW), or iii) terrestrial or FW or Phylogenomics of Fresh and Formalin Specimens Resolves the Systematics of Old World Mud Snakes (Serpentes: Homalopsidae)... BW or unknown habitat.Natural history information was obtained from previously published literature (Köhler et al., 2021;Murphy, 2007;Quah et al., 2017).The third scheme includes 5 homalopsids that are known to be aquatic, but their microhabitat preference is unknown; thus, they are excluded from all analyses except one (see below).A total of 36 representative taxa (31 homalopsids and 5 outgroups) were labelled as being terrestrial (N T = 6), freshwater (N FW = 11), brackish-water (N BW = 14), or unknown (N U = 5).The analysis was coded to allow for an 'all-rates-different' (ARD) mode of character evolution: the conversion in habit from a FW to BW, and vice-versa, was assumed to occur at a different rate than the transition from terrestrial to aquatic and back.The 'equal-rates' (ER) model was also tested, but the results were poorly supported (likely due to changes between FW and BW being more likely to occur than between the states of terrestrial and aquatic); so ER results were not included.The analysis was run for every possible combination of states (called 'cases,' here and in the R code): 1) Terrestrial vs. FW vs. BW, 2) Terrestrial vs. Aquatic, 3) Terrestrial vs. FW vs. BW with unknown taxa removed, 4) Terrestrial vs. FW vs. BW with unknowns and outgroups removed, and 5) Terrestrial vs. Aquatic with unknowns and outgroups removed.Outgroups were included (or removed) to verify that their inclusion would not bias the analysis via additional terrestrial character weightage (all outgroups are terrestrial).All combinations were run multiple times to check for congruency between results.The R code for ancestral state reconstruction can be found in Supplementary Data D1-D7.

Bulletin of the Society of Systematic Biology
To determine if the partitioning of rear-fanged homalopsids into aquatic systems is correlated with their diversification rates, we used a Geographic Hidden-State Speciation and Extinction (GeoHiSSE) framework using the rearfanged group of our nuclear species and cyt-b trees.This model estimates speciation, extinction, and transition rates for two geographic areas while including unobserved character states ('hidden states') to incorporate rate heterogeneity that is independent of geography (Caetano et al., 2018).We scored homalopsids as existing in one of three states that correspond to their aquatic habitat preference: 0 = both freshwater and brackish, 1 = freshwater, and 2 = brackish.Natural history information was obtained from previously published literature (Köhler et al., 2021;Murphy, 2007;Quah et al., 2017).Because all rear-fanged homalopsids are aquatic, and our ancestral state reconstructions indicate that the ancestral lineages of the rearfanged group were aquatic (see Results; Supplementary Fig. Phylogenomics of Fresh and Formalin Specimens Resolves the Systematics of Old World Mud Snakes (Serpentes: Homalopsidae)...

Bulletin of the Society of Systematic Biology
S6), we are confident in restricting our GeoHiSSE analysis to only aquatic states.Sampling fractions were estimated to be 0.692, 0.529, and 1 (nuclear species tree) and .870,.529,and 1 (cyt-b tree) based on our sampling divided by the total number of homalopsids inhabiting freshwater, brackish, and both types of habitats, respectively.While some homalopsids have been found in both brackish and fresh waters, we only scored homalopsids with state 0 if they are often found in both types of aquatic habitat.We fitted 15 models (Table 1), which vary in the models being character (state) dependent or independent, the inclusion or absence of hidden states, and the number of transition rates.Models include both GeoSSE/GeoHiSSE-like models in which habitats evolve according to dispersal and extinction as well as MuSSE/MuHiSSE-like models in which habitat shifts evolve according to transition rates among each state.We also fitted a classic Birth-Death model (Birth-Death 1, no hidden states or state-dependence) that includes dispersal parameters only and no range-dependent diversification as a null model.We fitted a second Birth-Death model (Birth-Death 2) that differs from Birth-Death 1 in that the model allows for transitions out of states based on transition rates, not just the extinction in one area.Following the GeoHiSSE vignette, we set the extinction fraction, "eps," equal across all states for all models (including range-dependent and hidden state models).Full details and parametrization of all ancestral range estimations, ancestral state reconstructions and HiSSE models can be found in the associated R code (see Supplementary Data D8-D12).

Phylogenomic Analyses
The concatenated nuclear tree (fresh samples only) is topologically identical to the species tree with strong support at most nodes (Supplementary Figs.S1-S3).All species are monophyletic in the concatenated nuclear tree except Cerberus schneiderii, Homalopsis buccata, and Hypsiscopus plumbea.Cerberus microlepis and C. dunsoni are nested within C. schneiderii, Homalopsis semizonata is embedded in H. buccata, and H. plumbea is paraphyletic with respect to H. matannensis.
The homalopsid species tree with only fresh specimens recovers a monophyletic Homalopsidae with strong support (Bpp=1) at all nodes except the divergence between Enhydris enhydris and E. longicauda+E.innominata (Fig. 2A).The fangless genus Brachyorrhos is sister to the rear-fanged clade, which comprises all other homalopsids.The rearfanged homalopsids consist of two subclades (subclade I and II; Fig 2B).The species tree is broadly consistent with the cyt-b (constrained by the genomic tree topology) tree regarding the fangless/rear-fanged split and the two subclades of rear-fanged taxa.The cyt-b tree recovered a poorly supported fangless clade (UFB=66), sister to the strongly supported rear-fanged clade (UFB=96), with all genera as monophyletic and most nodes strongly supported (Fig. 2B; Supplementary Figure S4).For the fangless taxa, the poorly known New Guinea endemic Calamophis diverges from Brachyorrhos.The rear-fanged South Asian homalopsids Ferania sieboldii, Mintonophis pakistanicus, and Dieurostus dussumieri are the closest related group to all other homalopsids in Subclade I with strong support (Fig. 2B).Miralia alternans is strongly supported as sister to Myrrrophis.Enhydris jagorii is minimally divergent from E. innominata and E. longicauda.Additionally, Homalophis doriae is recovered as part of a clade consisting of the Sundaic taxa Raclitia indica and Phytolopsis punctata.

Divergence Dating, Biogeographic, and Modeling Analyses
Our divergence dating, ancestral range, and GeoHiSSE results of the nuclear species and cyt-b tree are broadly congruent; therefore, we focus our discussion on the cyt-b tree for all subsequent analyses as it contains a greater level of taxonomic sampling.We mention the results for the nuclear species tree parenthetically, with full details of the analyses using the nuclear species tree in Supplementary Text S1.Divergence time estimation supports an Oligocene origin for crown homalopsids, ~26.4 (27.7) million years ago (Ma; Fig. 2B).Subclades I and II split ~21.8 (15.3)Ma, with both of these clades diverging around 21.2 (13.4) Ma and 19.8 (14.7)Ma, respectively.Most intergeneric-level splits occurred throughout the Miocene between 11 and 18 (7-15) Ma and most interspecific divergences ~200 thousand years ago (Ka) to 5 Ma (~300 Ka-3 Ma).
The best fit biogeographic model to our data was the DEC+J model.Although the ancestral range of crown homalopsids remains unresolved, this model suggests an In-dochina+South Asian origin of the rear-fanged clade (Fig. 2B; Tables 4-5; Supplementary Fig. S5).At 21.2 Ma, sub-Phylogenomics of Fresh and Formalin Specimens Resolves the Systematics of Old World Mud Snakes (Serpentes: Homalopsidae)...

Bulletin of the Society of Systematic Biology
Table 1.List of models that were fitted in the GeoHiSSE analysis using the cyt-b tree as input.States differed in their inclusion of hidden states, if they were character dependent (CD) vs. character-independent (CID), and the number of transition rates and turnover parameters.The model with the highest AIC weight is the best fitted model (bolded).Models are given in order of highest to lowest AIC weight for the cyt-b tree.Model information for the nuclear species tree can be found in Supplementary Text S1.

Model
Hidden states  All ancestral state reconstructions support the ancestor of Homalopsidae as being terrestrial and the ancestor of the rear-fanged clade as aquatic, likely inhabiting brackish watersystems (Supplementary Fig. S6).The best supported model in the GeoHiSSE analysis, using AIC weights, was the character independent GeoHiSSE 1 model (model 3), in which diversification rates are constant across all taxa (AICweight = 0.372; Table 1).The model-averaged ancestral state reconstruction in the GeoHiSSE analysis suggests the ancestral rear-fanged homalopsid likely inhabited both freshwater and brackish aquatic systems.Reconstructions support transitions to brackish water (Bitia, Fordonia, Gerarda, Cantoria) and freshwater (Enhydris) states but also instances of reversals, inhabiting both states (MRCAs of Hypsiscopus, Myrrophis, and Cerberus; Fig. 3).These instances of reversals are also seen with the brackish and freshwater states in several lineages (Fig. 3).The GeoHiSSE analysis using the nuclear species tree as input found the best supported model was the Birth-Death 1 model (model 1), in which diversification rates are constant across all taxa (AICweight = 0.338; Supplementary Text S1).

Discussion
Genomic data has expanded our understanding of evolutionary relationships (e.g., Hime et al., 2021;McFadden et al., 2021), phylogenetic incongruence and gene conflict (Myers et al., 2021;Singhal et al., 2021), and evolutionaryecological dynamics (Brennan et al., 2021).In this study, we resolve the phylogeny of a morphologically and ecologically diverse family of snakes.Using thousands of loci from modern tissue samples and mtDNA from degraded specimens, we expand our knowledge of the biogeographic history and evolutionary origins of homalopsids and discuss current gaps in our knowledge of the group and areas of future study.

Evolutionary Relationships of Homalopsidae
The phylogenetic relationships here are broadly congruent with previously published homalopsid studies using multilocus Sanger datasets (Alfaro et al., 2008;Bernstein et al., 2021).Homalopsidae consists of two major lineages: the fangless and the rear-fanged groups, the latter divided into two subclades.In our study, we estimate younger divergence dates, which were consistent across independent treePL runs.Other squamate studies have used genomic data with penalized likelihood methods (e.g., Burbrink et al., 2020;Deepak et al., 2022), and our results yield dates that are broadly consistent with prior estimates.The importance of using genomic data is also exemplified in our concatenated analyses.Populations of Cerberus have long been classified as distinct, endemic species based on morphology, but our molecular data suggest that these may represent founder populations in the Philippines (C.microlepis) and Palau (Micronesia; C. dunsoni).We recover both nominal taxa embedded within C. schneiderii (Supplementary Figs.S2,S3).This founder scenario is also exemplified by a population of C. schneiderii on Timor Island that shows similar levels and patterns of divergence to C. microlepis and C. dunsoni.We note that genome-wide data support the non-monophyly of C. schneiderii with respect to C. microlepis and C. dunsoni and the morphologically distinct Homalopsis buccata and H. semizonata (Murphy, Mumpuni, et al., 2012).

Biogeographic Origins and Evolution of Homalopsidae
Our crown age estimate of homalopsids (26 Ma) points to an Oligocene origin, rather than the Eocene as previously suggested (Bernstein et al., 2021).The rear-fanged homalopsids are inferred to have diverged during the Oligocene in Indochina 21.8 Ma.Many studies focus on shallow timescale patterns and the importance of Pleistocene sealevel fluctuations that have led to species-and populationlevel divergences (de Bruyn et al., 2014;Hall, 2009;How & Kitchener, 2003;Lohman et al., 2011;Maryanto et al., 2021;Voris, 2000).However, our data suggest that older geological events may have facilitated homalopsid diversification in Indochina and Sundaland, supporting our hypothesis that divergence dates predate Pleistocene sea-level fluctations.Major riverways in mainland Southeast Asia began shifting southward ~17 Ma, close to several divergences in our tree (Fig. 2B; or closer to the divergence of the entire rear-fanged clade in the nuclear species tree).These geological changes affected the flow of these waterways downwards towards parts of the then-contiguous Sundaland (Clark et al., 2004;Hall, 2013;Hennig et al., 2018;Hennig-Breitfeld et al., 2019), especially Borneo (Breitfeld et al., 2020), and may have been an important driver of speciation in homalopsid snakes.
The rear-fanged group likely has ancestral origins in South Asia+Indochina (cyt-b; Fig. 2B) or Sundaland (nuclear species tree; Supplementary Text S1).Our best supported model for ancestral range estimation, DEC+J, allows either vicariance or subset (partial sympatry) speciation if Phylogenomics of Fresh and Formalin Specimens Resolves the Systematics of Old World Mud Snakes (Serpentes: Homalopsidae)...  one of the daughter lineages has a range defined by a single area, as well as jump dispersal (founder events), which seems likely given their tolerance of a variety of aquatic habitats, including marine.Additionally, vicariance, which is part of the DEC+J model, has been supported in many narrow-scale homalopsid studies (Bernstein et al., 2021;Lukoschek et al., 2011), and the region's changing river systems have led to both corridors and barriers to migration in Indochina and Sundaland (Salles et al., 2021).Rivers have often been thought of as barriers to some terrestrial vertebrate groups (e.g., draconin lizards; Klabacka et al., 2020) but can also act as migration corridors for semiaquatic fauna like hylid frogs (Fonseca et al., 2021).With Southeast Asia's mosaic of geological events, it would not be surprising that the tectonically-induced changes in river currents and morphology have turned rivers into barriers to migration, even in aquatic groups (Kurata et al., 2022).

Bulletin of the Society of Systematic Biology
With their different levels of salinity tolerance (Dunson & Dunson, 1979;Kumar et al., 2012) and sensitivity to elevational gradients (Karns et al., 2005), vicariance or subset sympatry is a likely diversification scenario for many homalopsids.We also note that the +J parameter for founder events is reflected in the diversity of this group, such as endemic taxa like Cerberus microlepis and C. dunsoni as well as the Timor population of C. schneiderii.It is possible that the salt glands in C. schneiderii have facilitated their dispersal capabilities, leading to them being the most widespread homalopsid genus and species and one of the most abundant and widespread reptiles in Southeast Asia (Murphy, 2007).
Our findings from the ancestral state reconstructions and GeoHiSSE analysis suggest that aquatic habitat specificity did not influence diversification rates, thus rejecting our hypothesis.However, these results do suggest that rearfanged homalopsids had aquatic ancestors that lived in both brackish and freshwater environments.Thus, it is possible that the physical shifting of these aquatic pathways led to the formation of paleodrainage systems and influenced the diversification and distribution of lineages.These geological processes in Indochina have been associated Phylogenomics of Fresh and Formalin Specimens Resolves the Systematics of Old World Mud Snakes (Serpentes: Homalopsidae)... with species diversity in fish (Alshari et al., 2021;de Bruyn et al., 2013), lizards (Klabacka et al., 2020), and snakes (including homalopsids; Lukoschek et al., 2011;Voris et al., 2012).Whereas many of these cases represent more recent divergences, our results indicate that geologically-influenced diversification may have also taken place at much older timescales.We also recover shallow, species-level, geographic-based divergences in Indochina and Sundaland (e.g., Enhydris, Homalopsis, Myrrophis; ~200 Ka-5 Ma in the cyt-b tree; ~300 Ka-3 Ma in the nuclear species tree).Recent geological research (Salles et al., 2021) has shown that in the last 500 Ka, changes in landscape dynamics may have played a larger role in species dispersal within and across Southeast Asia, and biogeographic scenarios involving Pleistocene sea-level changes may need to be revisited due to a new paradigm showing continuous subaerial presence of Sundaland until 400 Ka (Husson et al., 2020;Sarr et al., 2019).Although widespread studies on aquatic snakes are limited, these geological findings are supported by phylogeographic studies on fish, suggesting that sea-level fluctuations are not solely responsible for population structure within the Indochinese region in some groups even over the last 5 Ma (Sholihah et al., 2021), which is congruent with our date estimates.

Bulletin of the Society of Systematic Biology
Some taxa in our study (e.g., Myron, Pseudoferania) require further sampling to better understand the dispersal of Indochinese/Sundaic lineages into Wallacea and Australasia.Although lineages in New Guinea and Australia likely have dispersed across the Torres Strait land bridge (between southern New Guinea and northern Australia) in the Plio-Pleistocene (Jones & Torgersen, 1988;Torgersen et al., 1985), our dates, even at the population level (1.7-4.9Ma; Fig. 2B; Supplementary Fig. S3), predate these land bridges that occurred between ~130-10 Ka (Reeves et al., 2008;Sloss et al., 2018).Some population-level studies on elapid snakes and cardiid bivalves have found genetic structure correlated with the Torres Strait land bridge but at dates older than 130 Ka (Keyse et al., 2018;Wüster et al., 2005).More thorough sampling is needed to reassess divergence times and determine whether one or multiple dispersals occurred between New Guinea and Australia in Myron and Pseudoferania (Supplementary Fig. S3) and determine if these dispersals coincide with the Torres Strait land bridge, as has been seen in frogs (Oliver et al., 2021).Our ancestral range estimation shows a long distance, over-water, jump dispersal from Indochina to Australasia at 11 Ma (Pseudoferania and Myron; Fig. 2B), a pattern also seen in Hylaranine frogs (Chan et al., 2020).This would be possible only through the intermediate region of Wallacea, likely through land bridges or stepping-stone dispersal as more land and ephemeral islands emerged in Wallacea (Lohman et al., 2011).Lydekker's Line (separating Wallacea and Australa-Phylogenomics of Fresh and Formalin Specimens Resolves the Systematics of Old World Mud Snakes (Serpentes: Homalopsidae)...

Bulletin of the Society of Systematic Biology
sia; Lydekker, 1896) and Wallace's Line (separating Southeast Asia's Greater Sunda Islands and Bali from Wallacea; Mayr, 1944;Wallace, 1860) are well known faunal boundaries which are crossed by many of the younger lineages in our tree (e.g., Hypsiscopus spp., Cerberus schneiderii).Because of the aquatic nature and salinity tolerance of many Mud Snakes, it is not surprising to find that some taxa have crossed Wallace's Line.Although not tolerant to high salinity, this pattern has also been seen in some pythonids and elapids (Esquerré et al., 2020;Lee et al., 2016).Additional sampling will help determine the origin of C. schneiderii in Australia and on Timor.This origin is inconclusive given our data, and no tissues for DNA analysis exist for populations in eastern Indonesia (the Moluccas).The Timor clade subtends branches from the Greater Sundas, and it is likely that the Timor population was established by colonization of intermediate landmasses in Wallacea, such as the Lesser Sundas.Our results suggest that Australia is likely not within the ancestral range of Cerberus schneiderii populations (Fig. 2B), which, in our concatenated analysis, includes C. dunsoni and C. microlepis (Supplementary Figs.S2,S3).This pattern may indicate that Australia-to-Wallacea dispersals are inhibited.The Indonesian Throughflow are strong ocean currents that trace Wallace's line, flowing eastward over the Lesser Sunda Islands that wrap around the islands of Timor, Damar, and Babar.Inhibited dispersal due to ocean currents has been suggested for other reptiles (Karin et al., 2020), and future studies will require more samples from eastern Indonesia, New Guinea, and Australia to test if these currents influence the movement of homalopsids.

Museum Genomics of Homalopsidae
The inclusion of degraded samples from natural history collections has been useful for determining the phylogenetic placement (Delling et al., 2023) and biogeography (Garg et al., 2022) of poorly known or extinct lineages.In this study, we phylogenetically place five genera and six species for the first time, each of which are known from only a few voucher specimens, and use these results to formulate hypotheses regarding homalopsid evolution.Using multilocus data, Bernstein et al. (2021) estimated divergence dates of Brachyorrhos of ~1.5 Ma (95% HPD: 180 Kyr-2.58Ma).The grouping of Calamophis and a monophyletic Brachyorrhos as a clade likely indicate recent diversification events.Brachyorrhos and Calamophis consist of five and four species, respectively, with Brachyorrhos endemic to the Moluccan and Aru islands of eastern Indonesia and Calamophis restricted to the Bird's Head peninsula of New Guinea.Most of the Molucca and Aru islands have only been emergent for ~5 Ma (Hall, 2009).Our results indicate that the Brachyorrhos crown group is 5.8 Ma (Supplementary Fig. S4).Within this timeframe, New Guinea's Tamrau region, from which Calamophis ruuddelangi is uniquely known, underwent a collision with an island arc in the late Miocene-Pliocene (Webb et al., 2019).Although more data will be needed to determine more accurate divergence times, our results allow us to formulate a hypothesis of how fangless homalopsids may have diversified.Future studies can test scenarios of dispersal of ancestral lineages from New Guinea into the Molucca and Aru islands to determine if this led to the evolution of the Brachyorrhos clade.
Additionally, the placement of the South Asian homalopsids Dieurostus, Mintonophis, and Ferania (by subclade I, 18 Ma; Supplementary Fig. S4) supports the hypothesis that homalopsids may have diversified in mainland Southeast Asia and then dispersed eastward and westward (Murphy, 2007); the recovery of these three genera as a clade, and the respective long branches in the tree, may indicate long-term isolation due to the heterogeneous topography of South Asia and considerable distances between the distributions of these lineages (i.e., Indus River of Pakistan [Mintonophis]; Kerala, India [Dieurostus]; northwest India, Bangladesh and Nepal [Ferania]).Interestingly, the relationships of the South Asian lineages differ from previous hypotheses based on morphology (Murphy, 2007) in that the two species from India are not each other's closest relatives, which may be evidence of dispersal into Pakistan when the Ganges and Punjab Rivers (the latter of which is now connected to the Indus River) drainage systems were connected until tectonic shifts separated these aquatic systems (Clift & Blusztajn, 2005).Finally, multiple dispersals to Borneo and Sumatra from Indochina/Sundaland are evident from the placement of Miralia from Sumatra (sister to Myrrophis [Indochina]) and Homalophis from Borneo (sister to Phytoplopsis [Peninsular Malaysia]).Our results emphasize that the use of nuclear and mtDNA from museum specimens can help to test previous, morphology-based hypotheses, as seen by our placement of M. alternans and F. sieboldii which were hypothesized to be closely related to species now recognized as distantly related (Gyi, 1970).
Despite our increased sampling, we could not reject or support our hypothesis on the geographic origins of Homalopsidae as the range estimation for the ancestral node is unresolved in both the cyt-b and nuclear species trees (Supplementary Fig. S5).This is likely due to the overlapping ranges of fangless and rear-fanged homalopsids, with fangless homalopsids in eastern Indonesia and the rear-fanged group in South Asia, all of Southeast Asia (including eastern Indonesia), New Guinea, and Australia.All recent estimated dates of divergence between the fangless homalopsids of eastern Indonesia and the widespread rear-fanged group are ~20-55 Ma (Alfaro et al., 2008;Bernstein et al., 2021;Harrington & Reeder, 2017;Zaher et al., 2019).Based on geological evidence (Hall, 2009), these age estimates, including our own, for the crown group of Homalopsidae are too old for there to be any possible dispersals from Indochina/Sundaland to the Moluccan islands (i.e., no subaerial land/islands at that time), the latter of which had not formed until the last few million years (Hall, 2009(Hall, , 2013)).However, an additional genus of fangless homalopsids, Karnsophis, only known from the holotype, is from Sumatra.This individual represents the only specimen of fangless homalopsid known from Sundaland, which bridges the distributional gap of fangless homalopsids (Karnsophis, Calamophis, Brachyorrhos) between Indochina/Sundaland and the Moluccas and New Guinea.While molecular data has yet to confirm the familial identity of Karnsophis as a Phylogenomics of Fresh and Formalin Specimens Resolves the Systematics of Old World Mud Snakes (Serpentes: Homalopsidae)...

Bulletin of the Society of Systematic Biology
homalopsid, the systematic affinities of this taxon stands as a 'missing puzzle piece' that will be required to understand how homalopsids diversified between the two distant regions of Sundaland and Australasia.If Karnsophis is in fact a homalopsid, we speculate that extinction events have taken place along the branch spanning these taxa, and, thus, our divergence dates may be overestimated.Bernstein et al. (2021) hypothesized there may be unsampled extinctions in the homalopsid tree as shown by a 35 Ma diversification gap in their species tree.We recover a ~13 Ma gap of no diversification in the cyt-b tree (~ 5 Ma in the nuclear species tree) between the fangless and rear-fanged homalopsids, potentially indicating unsampled extinction events (Ricklefs, 2007).Testing for extinction in future studies will be an interesting investigation into homalopsid evolutionary history but is not possible in our study due to the absence of any known homalopsid fossils (Rabosky, 2010).Since no fossils exist, we had to rely on one fossil and two secondary calibrations for our divergence time estimations.We acknowledge that caution must be taken when using secondary calibrations, which could lead to erroneous date estimation (Schenk, 2016).Thus, we rely on our rigorous approach with high levels of molecular and taxonomic sampling to pave the way for future work.
While our use of historical museum samples here was limited to mitochondrial data, their inclusion has expanded our biogeographic inferences to encompass South Asia.This region contains some of the most poorly known and undersampled species and is one of the biggest gaps to fill in homalopsid evolutionary and biogeographic history.We emphasize that, whenever possible, studies should utilize any molecular data that can be leveraged from museum samples.We encourage researchers to consider attempting DNA extractions from different tissue types (e.g., bone) and spiking libraries with non-enriched DNA to increase the chances of bycatching mtDNA.Including museum samples can be a costly risk if minimal data is retrieved, but the methods used here ultimately recovered useful molecular loci, which also may be incorporated into future studies.With recent works advancing the uses and successes of museomics (Bernstein & Ruane, 2022;Nunes et al., 2022;Roycroft et al., 2022), it is important to continue to attempt inclusion of taxa known only from intractable specimens to better expand our knowledge of the evolutionary processes that generate diversity.

Conclusions
In this study, we used dense sampling of all genera and species of homalopsid snakes for which fresh tissues exist and bolstered this sampling by including previously intractable museum specimens in our phylogenomic framework.We provide the most comprehensive phylogeny thus far for homalopsids, raising the genus/species coverage from previous studies (62%/62%) to 83%/74%.Including degraded samples and use of biogeographic range estimation models allow us to fill in several gaps in homalopsid biogeography and evolution, and we articulate several hypotheses to be tested regarding poorly known species in fu-ture phylogenetic analyses.Focal studies on the fangless homalopsids will shed light on the species richness and biogeography of homalopsids in eastern Indonesia.With further molecular sampling from museum specimens and denser specimen sampling schemes for population-level studies, we may better understand geological processes that have contributed to species dispersal throughout Indochina and Sundaland and dispersal modes into Wallacea and Australasia.
Calderón Acevedo for help in running ancestral range estimation analysis, Siavash Mirarab for help with ASTRAL code, Marcelo Gehara for helpful discussions on phylogenetic dating, and Robert Hall for his helpful advice and discussions on paleogeography and geology.We thank Nitin Saxena for help with the ancestral state reconstructions.Finally, we thank the Biodiversity Management Bureau, of the Philippine Department of the Environment and Natural Resources, for logistical support and for overseeing drafts of project Memoranda of Agreement as well as issuing research, collecting, transport, and export permits necessary for this and related studies (fieldwork supported in part by NSF DEB 0743491, 1654388, and 0804115 to RMB and C.D. Siler).We also thank the KU Institutional Animal Care and Use Committee for approving animal handling protocols (KU IACUC AUS 185-05 to RMB).

Figure 1 .
Figure 1.The eight bioregions used in the ancestral range estimation in BioGeoBEARS.Regions of interest are colored in blue.Inset maps in panels e) and f) show the island groups that form Wallacea and the Republic of Palau (the Micronesia bioregion), respectively.

Figure 2 .
Figure 2. a) Nuclear species tree of Homalopsidae using ASTRAL-III, with fangless and rear-fanged groups boxed in green and blue, respectively.Green nodes denote strong support and green branches represent topology that matches that cyt-b tree in panel B. b) Cyt-b tree of Homalopsidae, with degraded sample tips in purple; fangless and rear-fanged

Figure 3 .
Figure 3. Model-averaged GeoHiSSE results using subclade II from the time-calibrated cyt-b tree in Fig. 2B.Branch outlines are colored by net diversification rates (hotter = faster) and branches colored by reconstructed ancestral habitat states.Squares at tips denote the currently known habitat preference of the respective taxon.Rear-fanged clade (blue) from the nuclear species tree used for the GeoHiSSE analysis is shown in the upper right.

Table 2 .
Comparison of fresh and formalin datasets post-bioinformatics processing (out of 4,860 locus alignments).The number of samples in alignments (Al samples ), alignment lengths, number of alignments ≥250 bp (Al 250 ), and number of parsimony informative sites (PIS) are given for each dataset of N samples.All parenthetical numbers represent averages. of Fresh and Formalin Specimens Resolves the Systematics of Old World Mud Snakes (Serpentes: Homalopsidae)...
expanding their ranges ~17 Ma.The clade containing Bitia, Fordonia, Gerarda, and Cantoria had likely underwent range expansions westward into South Asia and south/eastward into Sundaland and Wallacea.A similar pattern is also seen amongst Erpeton, Pseudoferania, and Myron (17.6 Ma), with the latter two diversifying into Australasia ~11 Ma.Sev-6-10 Ma.The nuclear species tree supported the BAYAREALIKE+J (statistically identical to DEC and DEC+J; see Supplementary Text S1) as the best model and a similar biogeographic history, with the exception that the rear-fanged group's ancestral range originated in Sundaland and subsequently dispered east and west.PhylogenomicsBulletin of the Society of Systematic Biology

Table 3 .
Statistics on loci mapped to the pseudoreference genome in Geneious.The number of unique DNA sequences (N sequences ), number of unique specimens (N specimens ), and the range, average, and median locus lengths across UCEs, AHEs, and NPCGs are given.

Table 4 .
Likelihood ratio tests of the alternative and null models for BioGeoBEARS using the cyt-b tree.The log likelihoods of the alternative (LnL alt ) and null (LnL null ) models are given, with corresponding p-values (α = 0.05).

Table 5 .
Model comparison for reconstruction of ancestral range estimations in BioGeoBEARS using the cyt-b tree.Log likelihoods (LnL), number of parameters, dispersal rate (range expansion; d), and extinction rate (range contraction; e) are given for each model (model in bold = model of best fit based on AIC and AICc).Model are listed in order of best fit.