Two hundred and five newly assembled mitogenomes provide mixed evidence for rivers as drivers of speciation for Amazonian primates

Abstract Mitochondrial DNA remains a cornerstone for molecular ecology, especially for study species from which high‐quality tissue samples cannot be easily obtained. Methods using mitochondrial markers are usually reliant on reference databases, but these are often incomplete. Furthermore, available mitochondrial genomes often lack crucial metadata, such as sampling location, limiting their utility for many analyses. Here, we assembled 205 new mitochondrial genomes for platyrrhine primates, most from the Amazon and with known sampling locations. We present a dated mitogenomic phylogeny based on these samples along with additional published platyrrhine mitogenomes, and use this to assess support for the long‐standing riverine barrier hypothesis (RBH), which proposes that river formation was a major driver of speciation in Amazonian primates. Along the Amazon, Negro, and Madeira rivers, we found mixed support for the RBH. While we identified divergences that coincide with a river barrier, only some occur synchronously and also overlap with the proposed dates of river formation. The most compelling evidence is for the Amazon river potentially driving speciation within bearded saki monkeys (Chiropotes spp.) and within the smallest extant platyrrhines, the marmosets and tamarins. However, we also found that even large rivers do not appear to be barriers for some primates, including howler monkeys (Alouatta spp.), uakaris (Cacajao spp.), sakis (Pithecia spp.), and robust capuchins (Sapajus spp.). Our results support a more nuanced, clade‐specific effect of riverine barriers and suggest that other evolutionary mechanisms, besides the RBH and allopatric speciation, may have played an important role in the diversification of platyrrhines.

However, many of these methods are reliant on databases from which sequences can be integrated and against which results can be compared, and which are often incomplete (Curry et al., 2018).
For example, for platyrrhine primates (a group including all monkeys found in Central and South America), only 32 mitochondrial genome assemblies are available in RefSeq, even though over 200 species have been described, and complete platyrrhine mitogenomes are only available for 76 individuals in GenBank overall. Additionally, the majority of these mitogenomes contain little or no metadata, such as sampling locality, limiting their utility for many analyses, including population genetic studies that rely on spatial data (Deichmann et al., 2017; Strohm et al., 2016; Tahsin et al., 2016).
Hypotheses about how landscape features have shaped the distribution and richness of species can be investigated with molecular data that include sampling localities, and mitochondrial DNA is a fast-evolving marker (Brown et al., 1979), which can shed light on evolutionary relationships within young radiations more quickly than nuclear DNA. As such, mitogenomic data sets may be especially useful for assessing biogeographic and phylogeographic questions.
Primates found within the Amazon are disproportionately speciose for the geographic area they occupy (Fordham et al., 2020), and Alfred Russel Wallace noted that the distributions of many Amazonian primates appear to be limited by boundaries formed by the Amazon, Madeira, and Negro rivers (Wallace, 1852). This idea, now known as the riverine barrier hypothesis (RBH), is a long-standing paradigm used to explain the extraordinary species richness of not just primates (Ayres & Clutton-Brock, 1992; Boubli et al., 2015), but also other mammals (Patton et al., 1994), birds (Cracraft, 1985; Hayes & Sewlal, 2004; Pomara et al., 2014), amphibians and reptiles (de Fraga & de Carvalho, 2021; Godinho & da Silva, 2018; Ortiz et al., 2018), and butterflies (Hall & Harvey, 2002). The RBH proposes that the rivers of the Amazon river basin acted as drivers of speciation when their formation divided existing species' ranges and formed barriers to continued gene flow, leading to allopatric speciation. As an extension of the RBH, the Amazon has been divided into proposed areas of endemism: interfluvial regions which are suggested to harbour unique species assemblages, and which have been used as units for conservation planning (da Silva et al., 2005). However, the RBH and proposed areas of endemism are not without controversy. Criticisms include the limits of interspecific phylogenetic comparative methods, and the fact that many studies are based on very few taxa or single gene markers (Losos & Glor, 2003; Santorelli et al., 2018). In addition, some large-scale studies have found little or only species-specific support for the RBH (Dambros et al., 2020; Gascon et al., 2000; Kopuchian et al., 2020; Naka & Brumfield, 2018; Santorelli et al., 2018; Smith et al., 2014).
Here, we assemble more than 200 new mitochondrial genomes for Amazonian primates, with locality information (Figure 1), combine these with other Amazonian primate mitogenomes currently available, and use this data set to produce a dated phylogeny ("timetree"), which we use to assess support for the RBH. Specifically, we explore support for rivers as engines of speciation by first identifying divergences in the mitochondrial phylogeny where members of the neighbouring clades are found only on opposite sides of the major river boundaries proposed by Wallace (Amazon, Negro, and Madeira rivers; Wallace, 1852), followed by assessing synchrony of congruent divergences occurring for the same river and comparing these dates to current geological evidence for the timing of river formation. We consider divergences to be congruent with the RBH if divergences meet both conditions, namely that (1) sister taxa are found only on opposite sides of a river and that (2)

| Sample extraction, sequencing, and mitochondrial genome assembly
Sample extraction and sequencing for samples AC_t1 and AGC_m1 (see Table S1) were previously described (Torosin, Argibay, et al., 2020a; Torosin, Webster, et al., 2020b). Details on genomic sequence generation for the remaining samples are provided in Kuderna et al. (2022). Briefly, genomic DNA was extracted, libraries were prepared using standard Illumina protocols, and libraries were sequenced to ~30× coverage on an Illumina NovaSeq6000 (150 bp paired-end reads). Reads were trimmed to remove any sequencing adapters or primers with cutadapt version 2.10 (Martin, 2011) and then subsampled to 3.5 million read pairs with reformat.sh from the bbtools suite v38.86 (Bushnell, 2014). We used mitofinder version 1.4 (Allio et al., 2020) to assemble and annotate mitochondrial genomes from the trimmed and subsampled Illumina short reads, using metaspades (Nurk et al., 2017) for the assembly step and mitfi (Jühling et al., 2012) for the tRNA annotation step. If multiple mitochondrial contigs were identified, we ran mitofinder a second time, setting the minimum contig size to 10,000 and the maximum number of contigs to 1, in order to force selection and annotation of only the single best contig. For each sample, we used the complete mitochondrial genome from a closely related species available in NCBI's RefSeq database as the reference genome in mitofinder (Supporting Information). All mitochondrial genomes were compared to the NCBI reference database via blast searches to confirm correct taxon identity and to check for completeness.
Following Hassanin et al. (2021), we retained only the 12 protein-coding genes on the forward ("heavy") strand and the 12S and 16S rRNAs for the downstream analyses, and manually removed the other regions, while visually ensuring the integrity of the alignment.
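The locus selection described above can be illustrated with a minimal sketch. It assumes annotations are held as a simple mapping from gene name to sequence; the function name `retain_loci` and the gene labels are illustrative assumptions and need not match the labels written by mitofinder. In mammalian mitogenomes, ND6 is the one protein-coding gene encoded on the opposite strand, which is why it is absent from the retained set. The sequences in the usage line are toy strings, not data from this study.

```python
# Hypothetical gene labels (assumption): real annotation files may use other
# spellings, e.g. "COI" instead of "COX1" or "rrnS" instead of "12S".
HEAVY_STRAND_PCGS = (
    "ND1", "ND2", "COX1", "COX2", "ATP8", "ATP6",
    "COX3", "ND3", "ND4L", "ND4", "ND5", "CYTB",
)
RRNAS = ("12S", "16S")
RETAINED = set(HEAVY_STRAND_PCGS) | set(RRNAS)


def retain_loci(annotations):
    """Split annotated regions into retained loci and dropped region names.

    annotations: dict mapping a region name to its sequence string.
    Returns (kept, dropped), where kept is a dict and dropped a sorted list.
    """
    kept = {name: seq for name, seq in annotations.items() if name.upper() in RETAINED}
    dropped = sorted(name for name in annotations if name.upper() not in RETAINED)
    return kept, dropped


# Toy usage: ND6 and a tRNA are dropped, CYTB and 12S are kept.
kept, dropped = retain_loci(
    {"ND6": "ATGA", "CYTB": "ATGC", "12S": "GGCA", "tRNA-Phe": "GTTT"}
)
```

A filter of this kind only automates the selection of regions; it does not replace the visual check of alignment integrity described above.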

| Phylogenetic analysis
We used beast 2.6.3 (Bouckaert et al., 2019) for simultaneous phylogeny estimation and divergence dating. As input, we used the trimmed alignment of the 12 forward ("heavy") strand protein-coding genes and rRNAs described above, partitioned by codon position for the protein-coding genes and stems and loops for the rRNAs. We linked clock and tree models for all partitions, setting the clock model to relaxed log normal. Instead of setting an a priori substitution model for each partition, we used the bModelTest module (Bouckaert & Drummond, 2017)
We used the same alignment and five partitions as for the beast2 analysis, assigning the GTR + G model to all partitions, while allowing independent model parameters, and used 25 random and 25 parsimony-based starting trees.
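The five-partition scheme (three codon positions plus rRNA stems and loops) can be sketched as follows. This is an illustration only, not the procedure used to build the actual input files: it assumes an in-frame coding alignment and a per-column structure mask for the rRNAs in which "S" marks stem columns and "L" marks loop columns. The mask format and all function names are hypothetical.

```python
def split_codon_positions(aligned_cds):
    """Return the first, second, and third codon-position columns of an
    in-frame aligned coding sequence as three strings."""
    if len(aligned_cds) % 3 != 0:
        raise ValueError("coding alignment length must be a multiple of 3")
    return tuple(aligned_cds[offset::3] for offset in range(3))


def split_stems_loops(aligned_rrna, structure_mask):
    """Split rRNA columns into stems and loops using a hypothetical mask.

    Columns whose mask character is neither 'S' nor 'L' are ignored.
    """
    if len(aligned_rrna) != len(structure_mask):
        raise ValueError("sequence and mask must have the same length")
    pairs = list(zip(aligned_rrna, structure_mask))
    stems = "".join(base for base, mark in pairs if mark == "S")
    loops = "".join(base for base, mark in pairs if mark == "L")
    return stems, loops


def five_partitions(aligned_cds, aligned_rrna, structure_mask):
    """Assemble the five partitions for one taxon as a dict of strings."""
    pos1, pos2, pos3 = split_codon_positions(aligned_cds)
    stems, loops = split_stems_loops(aligned_rrna, structure_mask)
    return {
        "codon1": pos1,
        "codon2": pos2,
        "codon3": pos3,
        "rRNA_stems": stems,
        "rRNA_loops": loops,
    }
```

Applied to every taxon in the alignment, this yields five character sets that can each receive their own substitution model parameters while sharing clock and tree models.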

| Lineage delimitation and assessment of riverine barriers
In order to determine whether speciation in Amazonian primates has been facilitated by riverine barriers, we first used multi-rate Poisson Tree Processes (mPTP; Kapli et al., 2017) to identify major evolutionary lineages in our sample, rather than relying on existing species identifications or the identification of clades by eye. We did this because species limits within the platyrrhines are not always well-resolved and/or are controversial (Fordham et al., 2020; Quintela et al., 2020; Zachos et al., 2013), and, in some cases, are based on the presence of river boundaries, even if it has not always been established definitively whether the river forms a species barrier. To avoid issues of circularity based on potential river-guided species boundaries, we thus sought to delimit lineages in a way that is agnostic to the species assignment of our samples (see Everson et al., 2020 for a similar approach). Within mPTP, we implemented both the multi-lambda and single-lambda approaches, which provided a more and less conservative approach to lineage delimitation, respectively (Kapli et al., 2017). We used the maximum likelihood tree generated with raxml, removed outgroups with --outgroup_crop and determined minimum branch lengths with --minbr prior to the run.
For samples that had locality data available, phylogenetic relationships and results of delimitation with mPTP were projected onto sample localities with the phytools package (Revell, 2012) in r v4.1.0 (R Core Team, 2019). For any divergences between major lineages (as identified by mPTP) that are congruent with having occurred across a river boundary, we extracted all age estimates for the divergence of the relevant node from the posterior beast2 trees, to determine whether divergences across the same river occurred synchronously and coincided with published geological estimates for the timing of river formation.
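The comparison of node ages across a river can be sketched in a few lines. The example below is one possible operationalisation, not necessarily the one used here: it assumes that "synchronous" means the highest posterior density (HPD) intervals of all relevant nodes share a common window, and that "coinciding" means this window overlaps a published interval for river formation. All function names are hypothetical, and extraction of the ages from the posterior trees is assumed to have been done already.

```python
import math


def hpd_interval(samples, mass=0.95):
    """Narrowest interval containing at least `mass` of the posterior samples."""
    if not samples:
        raise ValueError("no posterior samples supplied")
    ordered = sorted(samples)
    n = len(ordered)
    # Number of samples the interval must contain; the small epsilon guards
    # against floating-point overshoot in mass * n.
    k = max(1, min(n, math.ceil(mass * n - 1e-9)))
    start = min(range(n - k + 1), key=lambda i: ordered[i + k - 1] - ordered[i])
    return ordered[start], ordered[start + k - 1]


def shared_window(intervals):
    """Intersection of all intervals, or None if they have no common overlap."""
    low = max(lo for lo, _ in intervals)
    high = min(hi for _, hi in intervals)
    return (low, high) if low <= high else None


def assess_river(node_ages, river_window, mass=0.95):
    """node_ages: dict of node label -> list of posterior ages (Ma).
    river_window: (youngest, oldest) estimate for river formation (Ma)."""
    hpds = {node: hpd_interval(ages, mass) for node, ages in node_ages.items()}
    common = shared_window(list(hpds.values()))
    synchronous = common is not None
    coincides = (
        synchronous
        and common[0] <= river_window[1]
        and river_window[0] <= common[1]
    )
    return {"hpd": hpds, "synchronous": synchronous, "coincides_with_river": coincides}
```

With a realistic posterior sample of several thousand trees per node, the default 95% mass is the usual choice; the tolerance for what counts as overlap is a judgment call that should be stated alongside any result.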

| Mitochondrial genome assembly
We successfully assembled complete mitochondrial genomes from

| Lineage delimitation and assessment of riverine barriers
Lineage delimitation with mPTP identified 101 distinct lineages when using a single rate of lambda (Figure 4), and 52 lineages when using the multi-rate setting (Figure S1). We identified 13 out of a total of 64 divergences within Amazonian platyrrhines that are congruent with having occurred across a riverine barrier, meaning that members of the respective sister clades/taxa were identified as distinct lineages by mPTP and are only found on opposite sides of a river (marked with node symbols in Figure 4, Figure S1).
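The spatial criterion stated above (sister lineages found only on opposite sides of a river) can be written as a simple check. This sketch assumes that each sampled individual has already been assigned to a river bank, for example from its sampling coordinates; the bank labels and the function name are hypothetical.

```python
def congruent_with_barrier(clade_a_banks, clade_b_banks):
    """Return True if every sample of one sister clade lies on one bank and
    every sample of the other sister clade lies on the opposite bank.

    Each argument is an iterable of bank labels (e.g. "north" / "south"),
    one label per sampled individual with known locality.
    """
    banks_a, banks_b = set(clade_a_banks), set(clade_b_banks)
    if not banks_a or not banks_b:
        return False  # no locality data for at least one clade
    # Each clade must be confined to a single bank, and the banks must differ.
    return len(banks_a) == 1 and len(banks_b) == 1 and banks_a != banks_b
```

A single individual sampled on the "wrong" bank makes the divergence incongruent under this strict definition, which is why sampling density near the river matters for interpretation.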
When using the single-rate setting, the majority of divergences that were congruent with the RBH were found for the Amazon river, including within Saimiri, Cebus, Cheracebus, Ateles, Chiropotes,  and Cacajao (0.56-0.9 Ma) overlapped, but the timing of the diver-

| DISCUSSION
We assembled 205 new mitochondrial genomes for platyrrhine primates, most sampled from the Amazon region, and used them to assess support for the long-standing riverine barrier hypothesis (RBH), which proposes that river formation was a major driver of speciation in Amazonian primates. Along the Amazon, Negro, and Madeira rivers, we found mixed evidence for the RBH, which we discuss in detail below. With the mitochondrial assemblies presented here, we have tripled the number of available mitogenomes for platyrrhines in GenBank and quadrupled the number of platyrrhine mitogenomes in RefSeq, and we provide an updated dated mitogenomic phylogeny of South American primates.
We utilized the novel mitogenomes presented here to assess support for the RBH, as originally proposed by Wallace (1852) (Campbell, 2010; Figueiredo et al., 2009), we identified four divergences in the platyrrhine tree that are congruent with the RBH and also align temporally with this date, within the genera Saguinus, Leontocebus, Cebuella, and Chiropotes. We cautiously interpret divergences across the Amazon within these genera as being congruent with the RBH. However, we note that, despite this being by far the largest mitogenomic survey of the platyrrhines to date, some of the sample sizes are small, especially for callitrichids, and that many samples were collected some distance (~150-200 km) from the banks of the Amazon river, making it difficult to reject an alternative explanation of isolation by distance (Dambros et al., 2017).

Furthermore, the relevant divergences within Callitrichidae are only supported when using the single-rate mPTP model, not the more conservative multi-rate model (Figure S1). That said, additional support for the Amazon as a species barrier for pygmy marmosets (Cebuella spp.) has recently been provided (Boubli et al., 2021; Porter et al., 2021). Notably, many divergences across the Amazon did not occur synchronously and/or postdate the known minimum time of
FIGURE 4 Phylogenetic relationships of platyrrhine subclades mapped onto Brazilian sampling locations. Colours indicate mPTP lineage delimitation based on the single-rate method (see Figure S1 for multirate). Node symbols denote clades whose lineage distributions are congruent with separation by a riverine barrier, including the Amazon (diamonds), Rio Negro (
We identified three divergences that are congruent with the Rio Negro being a barrier, within Cebus, Cheracebus, and Cacajao, as suggested previously. However, for both Cacajao
Our results suggest that the Madeira river may form a barrier for titi monkeys (Plecturocebus spp.), as has been suggested previously (Byrne et al., 2016; Hoyos et al., 2016; Santorelli et al., 2018).
However, the same river does not appear to present a barrier to squirrel monkeys (Saimiri spp.). Because only a single divergence congruent with the RBH was identified for the Madeira, we cannot assess synchrony here. However, the age of the Madeira river may date to the Miocene (Ruokolainen et al., 2019; Tagliacollo et al., 2015), in which case the divergence within Plecturocebus postdates river formation by several million years, and thus the river is unlikely to have acted as a vicariant agent. It has also been suggested that the bed of the Madeira has moved (Ruokolainen et al., 2019; Tagliacollo et al., 2015), complicating the ability to detect evidence for or against the RBH.
It is important to note that rivers can coincide with species barriers without having been vicariant agents (Naka & Pil, 2020), and that inferring evidence for or against vicariance from present-day species ranges is based on the assumption that these ranges have not changed (Losos & Glor, 2003), which may not be the case (Graham et al., 1996). That said, we find evidence of evolutionarily distinct lineages in close geographic proximity within the same interfluve. This, along with our finding that even large Amazonian rivers do not appear to be barriers for several platyrrhine lineages, underscores the importance of other evolutionary mechanisms, beyond the RBH and allopatric speciation, for diversification within platyrrhines. While comparatively less attention has been paid to the role that mechanisms of sympatric speciation have played in shaping Amazonian primate diversity, speciation via sexual selection, ecological factors, and biotic interactions (Boughman, 2001; Dieckmann & Doebeli, 1999; Doebeli & Dieckmann, 2000; Gutiérrez et al., 2014; Maan & Seehausen, 2011; Rice & Salt, 1990) are important directions for future research. Our results are in line with several recent publications that find little or mixed evidence for the hypothesis that Amazonian rivers have been drivers of speciation, with many supporting a more nuanced, species-specific effect of riverine barriers, rather than a global rule of rivers as vicariant agents (Kopuchian et al., 2020; Naka & Pil, 2020; Oliveira et al., 2017; Voss et al., 2019). Interestingly, at least some platyrrhines have been observed to be competent swimmers (Barnett et al., 2012; Benchimol & Venticinque, 2014; Gonzalez-Socoloske & Snarr, 2010; Lynch Alfaro et al., 2015; Nunes, 2014), and floating islands and meandering rivers may offer another means for monkeys to cross large rivers (Ali et al., 2021; Ayres & Clutton-Brock, 1992; Gascon et al., 2000).
The distribution of Amazonian primates may be shaped by features beyond rivers, including moisture gradients (Silva et al., 2019), geological formations and soil properties (Ruokolainen et al., 2019), and vegetation patterns (Higgins et al., 2011), offering many avenues for future research directions.
Taxonomic revisions are outside of the scope of this study, and should not be based on mitochondrial (or even nuclear) data alone (Zachos et al., 2013); however, our mitogenomic phylogeny suggests that some species boundaries may need to be reassessed, as a handful of species were found to be paraphyletic, in particular within Alouatta. While some of these patterns may be due to incomplete lineage sorting, introgression, or hybridization, taxonomic errors are another common cause of such patterns (McKay & Zink, 2010).
Mitochondrial phylogenies, like the one presented here, may be an important tool for uncovering such inconsistencies, especially in taxa for which species limits are not completely resolved, or for which ranges have been assumed to correspond to interfluvial regions or areas of endemism, which assumes a priori that rivers form dispersal barriers.
The newly assembled mitogenomes, along with their metadata, will be a valuable resource for conservation genetics and genomics, facilitating more accurate identification of sample identities and/or provenance. Novel methods to extract nuclear data and even whole genomes from low-quality or noninvasively collected samples are available (Burrell et al., 2015; Chiou & Bergey, 2018; Fontsere et al., 2021; Orkin et al., 2021); however, the costs associated with these methods, as well as their downstream computational requirements, remain prohibitive for many researchers, especially in primate host countries. While local capacity building should be a focus for genomicists working in the Global South (de Vries et al., 2015; Hetu et al., 2019; Rodríguez et al., 2005; Şekercioğlu, 2012), these efforts will take time, and until high-throughput methods become more accessible, mitogenomics will continue to be a pillar of conservation genomics (Pomerantz et al., 2018; Watsa et al., 2020).
Importantly, the novel mitogenomes assembled here have been made publicly available on GenBank along with important metadata, including sampling locations and voucher specimens, improving their utility and value for future analyses.

| CONCLUSIONS
Mitochondrial genomics remains a pillar of phylogenetics and conservation research. The 205 newly assembled mitogenomes for Amazonian primates presented here dramatically increase the number of available platyrrhine mitogenomes and, because they include known sampling locations, are of additional value to future research.
Using these novel mitogenomes, we find mixed support for the long-standing riverine barrier hypothesis (RBH), supporting a more nuanced, clade-specific effect of riverine barriers. This suggests that other evolutionary mechanisms, beyond the RBH and allopatric speciation, may also play key roles in explaining the extraordinary species diversity found in Amazonian primates.

ACKNOWLEDGEMENTS
We thank Rob Voss at the American Museum of Natural History for helpful discussions. This work used JASMIN, the UK collaborative data analysis facility. Further computational resources were provided by WestGrid (www.westgrid.ca) and Compute Canada Calcul Canada (www.computecanada.ca).

CONFLICT OF INTEREST
Lukas F. K. Kuderna is currently an employee of Illumina Inc.

DATA AVAILABILITY STATEMENT
Data used or generated for this article have been made available in the European Nucleotide Archive's SRA (BioProject PRJEB49549) and NCBI's GenBank (accession numbers OM328861-OM329065;