Data Management Procedures
Introduction | Data Sets Used in Data Management | Procedures | Roles | Instruction Guide for Completion of Metadata Fields

Introduction

This document describes the data management procedures to be used at the La Selva Field Station. The procedures presented here are based on the Database Management Policy established by the La Selva Advisory Committee in 1992.

The Data Management Policy is available on the OTS website within the La Selva webpage (http://www.ots.ac.cr/en/laselva/datamanagement). Researchers can also obtain the procedures by contacting the La Selva Scientific Director by email.

Data Sets Used in Data Management

The following data sets are maintained to control the data generated at La Selva and to provide users with general information about the researchers, projects and data.

Data set of researchers:
Contains general information about the researchers working at La Selva. For each researcher it must include at least the following items:

  • Biographical data
  • Research interests
  • Institutional affiliation
  • Addresses for contact.

Data set of projects:
Contains information about the research projects approved to be carried out at La Selva. For each project it must include at least the following items:

  • Title of the project
  • Topic of the project
  • Date of approval
  • Expected length
  • Researchers
  • Funding sources
  • Other institutions involved in the project
  • Additional comments.

It should also indicate whether or not the project must submit complete data sets to be collected by the Station.

Data set of visits by researchers:
Contains information about a particular visit of a researcher to La Selva. It must include the following items:

  • The project that the researcher is working on
  • The dates of the visit
  • List of data sets generated on that visit
  • Whether or not metadata about those data sets have been submitted.

Metadata data set:
This is the metadata database. It contains the description of the data sets generated at La Selva: their topics, structure, availability and contacts. For a full description of the content of this collection, refer to the Instruction Guide for Completion of Metadata Fields (http://www.ots.ac.cr/rdmcnfs/guiamd.html).

Data Set Repository:
Contains the data sets submitted by researchers. It must specify access restrictions. The repository must be able to store different types of data sets.
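
The project and visit records described above could be modelled roughly as follows. This is only a sketch under stated assumptions: the class names, field names and sample values are hypothetical and are not taken from the La Selva databases, and only a subset of the required items is shown.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Project:
    # Hypothetical subset of the items listed for the data set of projects.
    title: str
    topic: str
    date_of_approval: date
    researchers: list[str]
    # True when the project must submit complete data sets to the Station.
    must_submit_data_sets: bool = False


@dataclass
class Visit:
    project_title: str
    start: date
    end: date
    # Maps each data set generated on the visit to whether its metadata
    # have been submitted.
    data_sets: dict[str, bool] = field(default_factory=dict)


def data_sets_missing_metadata(visit: Visit) -> list[str]:
    """Return the data sets from a visit whose metadata are still pending."""
    return sorted(name for name, has_metadata in visit.data_sets.items()
                  if not has_metadata)


# Illustrative values only.
visit = Visit(
    project_title="Example project",
    start=date(2012, 1, 9),
    end=date(2012, 2, 3),
    data_sets={"leaf_litter": True, "soil_cores": False},
)
print(data_sets_missing_metadata(visit))  # ['soil_cores']
```

Keeping the metadata flag next to each data set name lets the visit record answer both questions it is required to answer: which data sets were generated, and which of them still lack metadata.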

Procedures

Proposal approval

The Database Management Policy establishes that a condition of working or continuing to work at La Selva is to provide a research proposal. As part of the research proposal, researchers must acknowledge the policies and procedures of the station.

During the proposal approval process the La Selva Scientific Director will determine whether or not fully documented data sets will be requested from the project.

The Data Manager is responsible for entering the information about approved projects into the data set of projects and the data set of researchers.

Arrival of a researcher

Researchers must review and refine the description of their projects when they begin work at the station. The initial description of the projects is obtained during the proposal approval phase.

The project description may be used as the starting point to create metadata on datasets generated by that project. The dataset description, however, is expected to be more specific about its content. The Data Manager will assist researchers in the description of data sets.

Departure of a researcher

The Database Management Policy of La Selva establishes that researchers must submit a report at the end of each stay. As part of this report, the researchers must include metadata to describe the data sets generated during their stay. They also must review and update the metadata of other datasets that may have changed during their visit.

The following is a list of the information that has to be provided when departing:

  • Title of Project
  • Principal and associated researchers and addresses
  • Dates of research
  • Keywords for the project
  • Scientific names
  • Taxa
  • Common names
  • Scientific field
  • Nutrients measured
  • Number and identity of specimens taken, and site(s) where deposited
  • Location of project, including La Selva grid system coordinates for research conducted within La Selva boundaries.
  • Trail use
  • Markers or plots (for ongoing projects)
  • Brief summary of results

New fields:

  • List of data sets collected
  • Which of the data sets have metadata

Archival of Data Sets Requested

The Database Management Policy of La Selva establishes that fully documented data sets must be submitted to the station if requested by OTS. The requested data sets are determined during the process of proposal approval of the research projects.

The La Selva Scientific Director and the researchers involved will determine when the data sets requested by OTS will be archived. The nature of the research will be considered when establishing the submission date. However, most of the data should be archived six months after the end of the stay that generated the data.
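
As an illustration of the six-month guideline, a default archival date could be computed as below. This is a minimal sketch: it assumes "six months" means six calendar months after the last day of the stay, and the function names are hypothetical. The actual submission date is agreed between the Scientific Director and the researchers.

```python
import calendar
from datetime import date


def add_months(d: date, months: int) -> date:
    """Add calendar months, clamping the day to the target month's length."""
    month_index = d.month - 1 + months
    year = d.year + month_index // 12
    month = month_index % 12 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def default_archive_due(end_of_stay: date) -> date:
    """Default due date: six calendar months after the end of the stay."""
    return add_months(end_of_stay, 6)


print(default_archive_due(date(2012, 8, 31)))  # 2013-02-28
```

Clamping the day avoids producing an invalid date when the stay ends on the 29th, 30th or 31st and the target month is shorter.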

Whenever a researcher submits a dataset for archiving, the researcher must provide the following:

  • Complete documentation of the datasets. This documentation should follow the general guidelines mentioned above and the guidelines specified in the Instruction Guide for the Completion of Metadata Fields and the Detail Guide for Metadata of Submitted Data Sets (http://www.ots.ac.cr/rdmcnfs/guialarga.html).
  • The data must be submitted using an acceptable medium like:
    • 3 1/2" floppy disks
    • Zip-drive disks or Jaz-drive disks
    • CD-ROM
    • ftp address
    • e-mail attachments.

As technology changes, the La Selva Scientific Director may add other media to the list above. The Information Assistant will help researchers to create versions of their data sets on one of the acceptable media.

  • The data must be submitted in one of the following formats:
    • comma-separated ASCII file (CSV)
    • Microsoft Excel
    • dBase III (extension dbf)

Other formats may become acceptable as technology changes. Changes in the acceptable formats are left to the discretion of the La Selva Scientific Director.
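
A submission could be screened against the format list with a simple check on the file extension. The sketch below is an assumption-laden example: only the dbf extension is named in this document, the csv, xls and xlsx extensions are assumed, and the function name is hypothetical.

```python
from pathlib import Path
from typing import Optional

ACCEPTED_EXTENSIONS = {
    ".csv": "comma-separated ASCII file",  # assumed extension
    ".xls": "Microsoft Excel",             # assumed extension
    ".xlsx": "Microsoft Excel",            # assumed extension
    ".dbf": "dBase III",
}


def accepted_format(filename: str) -> Optional[str]:
    """Return the accepted format name for a file, or None if not accepted."""
    return ACCEPTED_EXTENSIONS.get(Path(filename).suffix.lower())


print(accepted_format("plots.DBF"))   # dBase III
print(accepted_format("notes.docx"))  # None
```

An extension check only screens file names. It does not confirm that the file contents are valid for that format.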

  • The researcher may choose to restrict access to the data for a five-year period, with the possibility of renewal, on appeal to the station Scientific Director, in the case of ongoing research. In that case, the data will be safeguarded from release according to the general Database Management Policy.
  • Support personnel in charge of quality control will check the data consistency of the datasets: values out of range, missing values, invalid codes, etc. (not necessarily the consistency of the meaning of the data). Quality control staff will then query the researcher about possible necessary changes.
  • Whenever possible, other researchers will be contacted to peer-review the metadata and determine whether the submitted metadata are sufficient for others to use the dataset.
  • Whenever possible, a researcher will be named custodian of the data set. It is the responsibility of the custodian to know the semantics of the information stored in the data set. Upon retirement of the researcher from active research, the custodianship of the data will be assigned to the La Selva Scientific Director.
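
The consistency checks mentioned above (values out of range, missing values, invalid codes) could be sketched for a CSV submission as follows. The column names, allowed range and valid codes are hypothetical examples; in practice they would come from the metadata of each data set.

```python
import csv
import io


def check_rows(rows, ranges, codes):
    """Return (row_number, column, problem) tuples for consistency issues.

    ranges maps a numeric column to its (minimum, maximum) allowed values.
    codes maps a coded column to the set of valid codes.
    """
    issues = []
    for number, row in enumerate(rows, start=1):
        for column, value in row.items():
            text = (value or "").strip()
            if not text:
                issues.append((number, column, "missing value"))
            elif column in ranges:
                low, high = ranges[column]
                try:
                    number_value = float(text)
                except ValueError:
                    issues.append((number, column, "not a number"))
                    continue
                if not low <= number_value <= high:
                    issues.append((number, column, "value out of range"))
            elif column in codes and text not in codes[column]:
                issues.append((number, column, "invalid code"))
    return issues


# Hypothetical sample file: plot identifier, habitat code, stem diameter.
sample = io.StringIO(
    "plot,habitat,dbh_cm\n"
    "A1,PF,12.5\n"
    "A2,XX,\n"
    "A3,SF,950\n"
)
issues = check_rows(
    csv.DictReader(sample),
    ranges={"dbh_cm": (0, 500)},
    codes={"habitat": {"PF", "SF"}},
)
for issue in issues:
    print(issue)
# (2, 'habitat', 'invalid code')
# (2, 'dbh_cm', 'missing value')
# (3, 'dbh_cm', 'value out of range')
```

Row numbers count data rows, not the header line. The function only reports problems, which matches the procedure: any correction is left to the researcher after being queried by quality control staff.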

Distribution of Data Sets

Datasets not restricted by their researchers will be incorporated into La Selva's on-line system of search and distribution. The Data Manager is responsible for ensuring that no restricted data set is distributed without the consent of the researchers who generated it.

Metadata for all datasets are always incorporated into La Selva's on-line system of search and distribution.

A reasonable amount of time before the restriction on dataset distribution expires, the researcher will be notified so that the restriction can be renewed.
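
The distribution and notification rules can be illustrated with a short sketch. Several points are assumptions and not part of the policy: the restriction is taken to start on a recorded date (for example the archival date), the data are treated as distributable once the five-year period lapses without renewal, and "a reasonable amount of time" is set to 90 days. All names are hypothetical.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

RESTRICTION_YEARS = 5               # five-year period stated in the procedures
NOTICE_PERIOD = timedelta(days=90)  # assumed value for "a reasonable amount of time"


@dataclass
class ArchivedDataSet:
    name: str
    restricted_from: Optional[date] = None  # None means no restriction

    def restriction_ends(self) -> Optional[date]:
        if self.restricted_from is None:
            return None
        start = self.restricted_from
        try:
            return start.replace(year=start.year + RESTRICTION_YEARS)
        except ValueError:
            # 29 February has no counterpart in a non-leap target year.
            return start.replace(year=start.year + RESTRICTION_YEARS, day=28)

    def is_distributable(self, today: date) -> bool:
        """Data may be distributed if unrestricted or the restriction lapsed."""
        end = self.restriction_ends()
        return end is None or today >= end

    def needs_renewal_notice(self, today: date) -> bool:
        """True during the notice window before the restriction expires."""
        end = self.restriction_ends()
        return end is not None and end - NOTICE_PERIOD <= today < end


data_set = ArchivedDataSet("soil_cores", restricted_from=date(2010, 3, 1))
print(data_set.is_distributable(date(2015, 1, 15)))      # False
print(data_set.needs_renewal_notice(date(2015, 1, 15)))  # True
```

The sketch covers the data only. Metadata are always published, so they need no equivalent check.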

Roles

The following is a description of the different roles that have been identified when dealing with data management. They do not necessarily correspond to different people; for example, a staff member may be the Data Manager and the Custodian of a given data set.

  • Data Manager: person in charge of managing the storage of the data and metadata.
  • Custodian: specialist who knows the meaning of the data stored in a given data set.
  • Information Assistant: person in charge of helping (advising) researchers in the design of their data sets and in data set format conversions.
  • Quality controller: person who reviews the quality of the items included in a data set. The following aspects should be part of the quality control: values out of range, missing values, invalid codes, statistical analysis of the data.
  • Researcher: specialist working on a research project that includes gathering data and analyzing the data obtained.
  • User: person (not necessarily a researcher) interested in data collected at La Selva. They probably will use the data in ways that are different from how it was originally intended.
Last Updated (07/03/12)
 
Organization for Tropical Studies