Building a High-Resolution, Specimen-Based Picture of Life: Possibilities and Challenges
Gries, Corinna, Gilbert, Edward, Nash III, Thomas H.
Digitizing 'all' North American lichen and bryophyte specimens: 2.3 million specimens, 65 institutions, 1 year later.
The recently funded NSF-Advancing Digitization of Biological Collections (ADBC) Thematic Collections Network (TCN) project aims to digitize ca. 2.3 million North American lichen and bryophyte specimens from over 60 collections, representing well over 90% of the remaining North American specimens from Canada, the United States and Mexico. Online availability of nearly the entire North American bryophyte and lichen collections will greatly accelerate knowledge and evaluation of the biodiversity of these organisms by fostering collaborations between professionals and the general public. Lichens and bryophytes share important traits, which make them some of the most sensitive indicators of environmental change. The specific goal of this project is to provide high-quality data to address how species distributions change with regard to major environmental events across time and space. Large-scale distribution mapping will support management decisions through identification of biodiversity hotspots, areas of most imminent environmental change, and areas of greatest human impact.
To achieve this goal, each digitizing institution is developing efficient workflows to capture images of specimen labels, which are then uploaded to a centralized server for further processing. Since specimens do not always contain annotation labels indicating the most recent identification and filing location, skeletal metadata files are generated at the time of imaging. These metadata are processed with the images to seed specimen records with minimal data.
Label images are integrated into the national lichen and bryophyte portals, which are powered by the Symbiota software package. Amassing specimen records in Symbiota makes it possible to identify potentially extensive duplication of specimens in different collections as well as published exsiccatae. Duplicate records may then be imported without re-entry.
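The idea of finding duplicate specimens across collections can be illustrated with a minimal sketch. This is not Symbiota's actual matching logic, which the abstract does not describe. The record fields (`collector`, `number`, `date`), the matching rule, and the sample records are all assumptions made up for illustration: records are grouped when their normalized collector name, collector number, and collection date agree.

```python
import re
from collections import defaultdict


def duplicate_key(record):
    """Build a normalized key from collector, collector number, and date.

    Hypothetical rule: strip everything but letters from the collector
    name and all whitespace from the collector number, ignoring case.
    """
    collector = re.sub(r"[^a-z]", "", record["collector"].lower())
    number = re.sub(r"\s+", "", record["number"]).lower()
    return (collector, number, record["date"])


def find_duplicate_clusters(records):
    """Return lists of records that share the same normalized key."""
    clusters = defaultdict(list)
    for record in records:
        clusters[duplicate_key(record)].append(record)
    # Only groups with more than one record are candidate duplicates.
    return [group for group in clusters.values() if len(group) > 1]


# Invented sample records from three hypothetical herbaria.
records = [
    {"id": "A-1", "herbarium": "HERB-A", "collector": "J. Doe",
     "number": "1024", "date": "1975-06-14"},
    {"id": "B-7", "herbarium": "HERB-B", "collector": "J Doe",
     "number": " 1024", "date": "1975-06-14"},
    {"id": "C-3", "herbarium": "HERB-C", "collector": "J. Doe",
     "number": "1025", "date": "1975-06-15"},
]

clusters = find_duplicate_clusters(records)
for group in clusters:
    print([r["id"] for r in group])
```

A rule this simple misses variants such as "Doe, J." versus "J. Doe", so a real workflow would likely need fuzzier name matching and human review of each candidate cluster before data are copied between records.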
Optimization of Optical Character Recognition (OCR) and Natural Language Processing (NLP) is currently being implemented in Symbiota. Digital images of specimen labels will undergo an OCR processing step and then be passed to NLP routines customized for different label formats. Following these automation steps, the digitized information will be available online for review, adjustment, and keystroking if necessary. This last step of human intervention in the digitization process can be accessed by anyone interested in helping to advance this digital resource; therefore, starting in the fall of 2012, we plan to develop a vibrant volunteer community by giving back extensively in the form of local and online seminars, local field events, introductions to specimen determinations, etc.
Symbiota home page
Consortium of North American Bryophyte Herbaria
Consortium of North American Lichen Herbaria
LBCC project home page
1 - University of Wisconsin Madison, Center for Limnology, 680 North Park Street, Madison, WI, 53706, USA
2 - 2831 E 18th St., Tucson, AZ, 85716, USA
3 - University of Wisconsin Madison, Botany, 430 Lincoln Dr., Madison, WI, 53706, USA
Presentation Type: Symposium or Colloquium Presentation
Location: Franklin A/Hyatt
Date: Wednesday, July 11th, 2012
Time: 2:15 PM