Building a High-Resolution, Specimen-Based Picture of Life: Possibilities and Challenges
Matasci, Naim; Boyle, Bradley; Lu, Zhenyuan; Hopkins, Nicole; Piel, William; Raygoza Garay, Juan Antonio; McKay, Sheldon; Narro, Martha; Enquist, Brian.
Correcting and Standardizing Taxonomic Names with the Taxonomic Name Resolution Service.
Millions of valuable biological specimens and measurements are hosted in countless collections across the world and, thanks to important digitization efforts, this trove of data is now becoming accessible to researchers. Together with the dramatic acceleration in the production of molecular data, this provides a tremendous opportunity for innovative synthetic research. However, redundant, erroneous and ambiguous taxon names caused by misspellings and annotations, as well as lexical variants and errors introduced by the digitization process itself, can severely limit the ability to merge and integrate these diverse datasets. The presence of synonyms further complicates the task of identifying duplicate taxa within and across datasets. Individual researchers often have to perform this scrubbing task manually, at significant cost in time and at the risk of introducing additional errors. Given the ever-increasing size of available datasets, this task will soon become infeasible. The Taxonomic Name Resolution Service (TNRS) is an online tool to correct and standardize taxonomic names. Given a list of taxon names at the rank of family or below, the TNRS returns the closest matching name, with corrected spelling and formatting, completed author names, and the updated taxonomic classification. Custom code on top of existing name parsing and matching algorithms uses taxonomically informed decision rules to interpret and rank results, enabling the TNRS to select the single most likely match to the submitted name from a database compiled from different sources. Users can choose which sources to include and rank them in order of preference. If a taxon has one or more synonyms, the currently accepted name according to the chosen source is returned, but the user can still access alternative interpretations and select a different name. Cases in which the acceptance status is ambiguous or where only the genus is matched are flagged for user review.
Importantly, the original source of the match is also returned, including a web link. The TNRS greatly simplifies the integration of heterogeneous datasets, thus expanding the potential for synthetic research. Furthermore, it ensures that botanists have access to the most up-to-date nomenclature and that taxonomic names are spelled correctly. While we have first focused on plants, the architecture of the TNRS provides a general foundation for handling names governed by other nomenclatural codes, such as those for animals, fungi, and microbes.
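The resolution steps described in the abstract (fuzzy matching of a submitted name, user-ranked sources, mapping synonyms to an accepted name, and flagging genus-only matches) can be illustrated with a minimal Python sketch. This is not the TNRS implementation or its API: the `REFERENCE` table, the source labels `source_a` and `source_b`, the `resolve` function, and the 0.8 similarity cutoff are all hypothetical, and the standard-library `difflib` matcher stands in for the name parsing and matching algorithms the service actually uses.

```python
from difflib import get_close_matches

# Hypothetical toy reference data (an assumption for illustration only):
# for each source, map every known name to the name that source accepts.
REFERENCE = {
    "source_a": {
        "Quercus alba": "Quercus alba",
        "Acer saccharum": "Acer saccharum",
        "Acer saccharophorum": "Acer saccharum",  # treated as a synonym here
    },
    "source_b": {
        "Quercus alba": "Quercus alba",
        "Pinus ponderosa": "Pinus ponderosa",
    },
}


def resolve(name, sources, cutoff=0.8):
    """Resolve one submitted name, trying sources in the user's preferred order."""
    cleaned = " ".join(name.split())  # normalize stray whitespace
    result = {"submitted": name, "matched": None, "accepted": None,
              "source": None, "flag": "no_match"}
    if not cleaned:
        return result

    # Step 1: fuzzy match on the full name, first preferred source that hits.
    for source in sources:
        names = REFERENCE[source]
        hits = get_close_matches(cleaned, list(names), n=1, cutoff=cutoff)
        if hits:
            result.update(matched=hits[0], accepted=names[hits[0]],
                          source=source, flag=None)
            return result

    # Step 2: fall back to the genus alone and flag the result for review.
    genus = cleaned.split()[0]
    for source in sources:
        genera = sorted({n.split()[0] for n in REFERENCE[source]})
        hits = get_close_matches(genus, genera, n=1, cutoff=cutoff)
        if hits:
            result.update(matched=hits[0], source=source, flag="genus_only")
            return result

    return result


if __name__ == "__main__":
    for submitted in ["Quercus albaa", "Acer saccharophorum", "Quercus xyzzyensis"]:
        print(resolve(submitted, ["source_a", "source_b"]))
```

In this sketch the first preferred source with a sufficiently close match wins, which is a simplification of the ranking rules described above; a fuller version would score candidates across all selected sources and keep the alternatives available for user review.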
Taxonomic Name Resolution Service
1 - University of Arizona, iPlant Collaborative, 1657 East Helen St, Tucson, AZ, 85721, USA
2 - University of Arizona, Ecology and Evolutionary Biology, P.O. Box 210088, Tucson, AZ, 85721, USA
3 - Cold Spring Harbor Laboratory, 1 Bungtown Road, Cold Spring Harbor, NY, 11724-2202, USA
4 - University of Arizona, iPlant Collaborative, 1657 E Helen St, Tucson, AZ, 85721, USA
5 - Yale University, Peabody Museum, 170 Whitney Avenue, New Haven, CT, 06511, USA
6 - Cold Spring Harbor Laboratory, 1 Bungtown Road, Cold Spring Harbor, NY, 11724-2202, USA
7 - University of Arizona, Bio5 Institute, P.O. Box 210240, Tucson, AZ, 85721-0240, USA
8 - University of Arizona, BioSciences West, Tucson, AZ, 85719, USA
Presentation Type: Symposium or Colloquium Presentation
Location: Franklin A/Hyatt
Date: Wednesday, July 11th, 2012
Time: 5:00 PM