Herbarium Databasing Questionnaire

Summary of responses

Tim Rich 13 May 2003

Databasing botanical collections was raised as a major issue at the first meeting of the Botanical Collections Managers Group, and it was proposed that a questionnaire should be circulated to establish what is already happening and where the BCMG could help. The questionnaire was circulated in Spring 2003.

Thank you all for your responses, which I have tried to summarise below. It is clear that documentation takes place in different institutions according to their priorities and collections. Obviously each is free to make such decisions, but there are several key areas where BCMG might be involved:

  1. Software/websites etc. Especially for the smaller herbaria, much of the software etc needed for databasing and disseminating information is already available and does not need to be developed anew. The BCMG can facilitate communication for those wanting to start documentation with those who have already done similar projects.

  2. There is clearly a huge task of documenting all our collections across the range (only 7% done to date) for which funding is required; I am not sure to what extent raising this funding is the role of the BCMG, but it is clearly an issue.

  3. Advice on documentation is both needed and available. Can we consider running workshops, e.g. on reading labels, interpreting data etc.?

  4. If we are going to develop one central portal for digital herbarium information, it could practicably be built from existing databases which have written data standards and can export into Excel. It would be simpler in the long run if new projects adopted simple data standards directly compatible with existing datasets. Most databases already contain a lot of valuable information fields such as locality, collector, dates etc. and do not need to be adapted.

    Perhaps a good project to start with might be UK or Irish BAP species, as many are already documenting them anyway and the data have obvious immediate application (and are attractive for possible funding…). This would also go towards the Global Plant Conservation Strategy Target 16: Establishing and strengthening plant conservation communication networks.

  5. As identifications are not consistently checked prior to documentation, this has huge implications for the quality of the information on the web. One advantage of uncertainty in databased records is that users should go back to the original source to check them before using them (which generates enquiries, corrects identifications etc.), though I suspect this very rarely happens and the data will be used and abused anyway.

Summary of responses

Herbaria responding: BIRM, BON, CGE, DBN, E, GLAM, GGO, GL, IMI, K, LDS, LIV, NMW (vascular), NNHS, Oldham, OXF/FHO/FHOw, WSY (17). (SKarley personal herbarium excluded)


Type of collections to be databased

It was clear that the whole range of collections was being databased across the herbaria, obviously adapted to which collections were held (e.g. IMI only has cryptogams so is only documenting them). Some herbaria were only planning to document some of their collections, others all (including living and DNA collections).


Geographical Coverage

Most herbaria were documenting on a world basis, reflecting the scope of their collections; some were documenting on a British Isles basis only.


Approximate number of specimens to be databased/number already databased.

The total number of herbarium specimens in the collections reported was 13.1 million (dominated by 8 million from Kew!), of which 870,000 had already been documented to one degree or another (7%). On average 40% of the collection in each institution had already been documented, though there is a very clear and understandable inverse exponential relationship between size of collection and % documented!!!
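The gap between the overall 7% and the 40% average per institution follows from how the two figures are calculated: the first pools all specimens, while the second gives every herbarium equal weight regardless of size. A minimal sketch of the arithmetic in Python is given here. The two totals are those reported above; the three per-herbarium figures are hypothetical, chosen only to illustrate the effect, and are not survey data.

```python
# Totals reported in the survey summary.
total_specimens = 13_100_000
total_documented = 870_000

pooled_pct = 100 * total_documented / total_specimens
print(f"Pooled across all herbaria: {pooled_pct:.1f}%")

# HYPOTHETICAL herbaria (not survey data): (specimens held, specimens documented)
herbaria = {
    "large": (8_000_000, 240_000),
    "medium": (500_000, 150_000),
    "small": (20_000, 18_000),
}


def pooled_percentage(collections):
    """Percentage documented when all specimens are pooled together."""
    held = sum(h for h, _ in collections.values())
    done = sum(d for _, d in collections.values())
    return 100 * done / held


def mean_percentage(collections):
    """Unweighted mean of each herbarium's own percentage documented."""
    pcts = [100 * d / h for h, d in collections.values()]
    return sum(pcts) / len(pcts)


print(f"Hypothetical pooled: {pooled_percentage(herbaria):.1f}%")
print(f"Hypothetical mean of percentages: {mean_percentage(herbaria):.1f}%")
```

In the hypothetical case the one large, little-documented herbarium dominates the pooled figure, while the small, nearly complete one pulls the unweighted mean upwards, which is the same pattern as in the survey totals.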


Are you currently databasing or have you already databased your collections?

All respondents are currently databasing, other than CGE, which is proposing to. A few smaller herbaria were close to completing documentation.


What software are you using to database the collections?

The responses showed a large range of different database/documentation software, with Excel/Access being the most frequent. The databases reported were (x = number of herbaria using each):

  • Recorder 3.4 x1

  • MODES+ (migrating to TMS) x1

  • Excel/Access x7

  • BG-base x2

  • Pandora x1

  • Dbase4 x1

  • Gallery Systems x1

  • Micromusé x1

  • BRAHMS x1

  • R-Base Personal x1

  • In-house software x2


Estimated date of completion (if ever!)

Very few gave clear dates; the smaller herbaria were able to be more precise.

Questions 10 and 14: Can the software download into Excel spreadsheets, and do you have written data entry standards?

All reported 'yes' or 'indirectly' for downloads into Excel, and all but two for written data entry standards.

Assuming that one wanted to, this might allow one central database to be compiled relatively easily from Excel, provided that, say, different date styles can be put into one uniform system.
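As an illustration of what putting different date styles into one uniform system might involve, here is a minimal Python sketch that converts a few date styles to the ISO form YYYY-MM-DD. The list of styles and the function name `to_iso_date` are assumptions made for the example; a real project would list the styles actually found in each herbarium's export, and would need a policy for partial dates such as a year alone.

```python
from datetime import datetime

# ASSUMED input styles, for illustration only. Month names are matched in
# English, which is Python's default unless the program changes its locale.
KNOWN_STYLES = [
    "%d/%m/%Y",  # 13/05/2003 (day first, assumed British convention)
    "%d.%m.%Y",  # 13.05.2003
    "%d %B %Y",  # 13 May 2003 (full month name)
    "%d %b %Y",  # 3 Aug 1921 (abbreviated month name)
    "%Y-%m-%d",  # 2003-05-13 (already in the target form)
]


def to_iso_date(text):
    """Return the date as YYYY-MM-DD, or None if no known style matches."""
    cleaned = text.strip()
    for style in KNOWN_STYLES:
        try:
            return datetime.strptime(cleaned, style).date().isoformat()
        except ValueError:
            continue  # try the next style
    return None  # flag for manual checking rather than guess


# Hypothetical label dates as they might appear in different exports.
for raw in ["13/05/2003", " 2 June 1887 ", "3 Aug 1921", "Summer 1902"]:
    print(repr(raw), "->", to_iso_date(raw))
```

Returning `None` for anything unrecognised, rather than guessing, keeps doubtful dates visible so that they can be checked against the label.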


What information are you databasing?

Most were documenting a fairly full list of information from the labels; clearly, the more information databased, the greater the use in the long run but the slower the input.

  • Latin name x17

  • Vernacular name x10

  • Locality x17

  • Vice-county x13

  • Habitat x13

  • Grid reference x13

  • Date x16

  • Collectors x16

  • Additional collectors' notes x9

  • Determiner x14

  • Accession numbers x15

  • Legal status x10

  • Specimen conservation information x4

  • Type status x10

  • Bibliography x3

  • Loan status x3

  • Host x1

  • Conservation status x3

Local government/authority (x0!) was asked about because such information is often used when searching for data for local BAPs; it is time-consuming to sort out (vice-county is a proxy).


How is the information published/to be made available?

It seems almost everybody is disseminating information electronically or on paper to those who want it, and websites are clearly going to be the favoured option soon:

  • Website (already or very nearly) x8

  • Published catalogues x3

  • Print-outs/electronic files to those who request it x13


Are identifications checked before specimens are databased?

The responses show that the quality of identification information on databased records may be very variable, and this has consequences when the records are circulated:

  • Always x1 (GL British flora project!)

  • Usually x6

  • Sometimes x6

  • Never x2

Questions 15 and 18: Average number of specimens databased per day, if known, and do you have staff dedicated to databasing?

Six gave figures: 30, 50, 50, 75, 125 and 150 specimens per day; the rest did not know.

The two main factors determining input rates are the software and the number of items of information to document. Also important is the skill of the input staff; 3 had dedicated input staff, and 11 either do it as part of other jobs or rely on volunteers.

This has a huge implication for the financial resources required. I suspect that most decide what they want to document, and then get on with it!
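To give a rough sense of scale, the reported input rates can be combined with the totals given earlier (13.1 million specimens, 870,000 already documented). The Python sketch below is illustrative only: it assumes that the six reported rates are representative of all herbaria, which they may well not be, and the figure of 220 working days per year is my own assumption rather than anything from the responses.

```python
from statistics import mean, median

# Figures taken from the survey summary.
total_specimens = 13_100_000
already_documented = 870_000
reported_rates = [30, 50, 50, 75, 125, 150]  # specimens per person per day

# ASSUMPTION (not from the survey): working days in one person-year.
WORKING_DAYS_PER_YEAR = 220

remaining = total_specimens - already_documented


def person_days(specimens, rate_per_day):
    """Person-days needed to database `specimens` at a constant daily rate."""
    return specimens / rate_per_day


for label, rate in [("median", median(reported_rates)),
                    ("mean", mean(reported_rates))]:
    days = person_days(remaining, rate)
    years = days / WORKING_DAYS_PER_YEAR
    print(f"At the {label} rate of {rate} per day: "
          f"{days:,.0f} person-days, about {years:,.0f} person-years")
```

Whichever rate is used, the answer comes out in hundreds of person-years, which is why the choice of what to document matters so much.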


% staff time databasing?

On average, 15% of staff time was spent documenting.


How do you prioritise which specimens are databased?

The larger herbaria tended to give a long list reflecting multiple usage. For others, priorities were determined by funding, which is often more easily obtained for specific projects!

  • Determined by funding x9

  • New incoming material x8

  • Specific projects x8

  • Loans x6

  • Types x6

  • Rare and Threatened Species x5

  • Collections where databasing helps with management x4

  • Important Collections x3

  • Historic Collections x3

  • Critical Groups x2

  • Local Collections x1


Are you able to offer advice to anyone who needs it?

Nine felt able to offer advice of one sort or another, so there is help out there!


What help/advice would you like if available?

The need for advice clearly revolves around funding and then websites, and then the more practical aspects of reading labels etc.

  • External funding opportunities x8

  • Website design x8

  • Choosing software x3

  • Data entry standards x4

  • Data verification procedures x4

  • General advice x2

  • Identification verification x3

  • Reading labels x2

  • Tracing localities x2 (see Pankhurst, R. J. (1981). A guide to finding the localities of British plant records. Watsonia 13: 221-223.)

  • What to database x1

  • Prioritisation x1 (importance can be ranked using Rich, T. C. G. (1998). Criteria for evaluating the importance of herbarium collections. The Biology Curator 13: 2-4.)