Coming up short: Identifying substrate and geographic biases in fungal sequence databases
Khomich, Maryia; Cox, Filipa; Andrew, Carrie Joy; Andersen, Tom; Kauserud, Håvard; Davey, Marie Louise
Journal article, Peer reviewed
Accepted version
Date
2018-09-22
Abstract
Insufficient reference database coverage is a widely recognized limitation of molecular ecology approaches which are reliant on database matches for assignment of function or identity. Here, we use data from 65 amplicon high-throughput sequencing (HTS) datasets targeting the internal transcribed spacer (ITS) region of fungal rDNA to identify substrates and geographic areas whose underrepresentation in the available reference databases could have meaningful impact on our ability to draw ecological conclusions. A total of 14 different substrates were investigated. Database representation was particularly poor for the fungal communities found in aquatic (freshwater and marine) and soil ecosystems. Aquatic ecosystems are identified as priority targets for the recovery of novel fungal lineages. A subset of the data representing soil samples with global distribution were used to identify geographic locations and terrestrial biomes with poor database representation. Database coverage was especially poor in tropical, subtropical, and Antarctic latitudes, and the Amazon, Southeast Asia, Australasia, and the Indian subcontinent are identified as priority areas for improving database coverage in fungi.