Utility of QR codes in biological collections

Abstract
The popularity of QR codes for encoding information such as URIs has increased exponentially in step with the technological advances and availability of smartphones, digital tablets, and other electronic devices. We propose using QR codes on specimens in biological collections to facilitate linking vouchers’ electronic information with their associated collections. QR codes can efficiently provide such links for connecting collections, photographs, maps, ecosystem notes, citations, and even GenBank sequences. QR codes have numerous advantages over barcodes, including their small size, superior security mechanisms, increased complexity and quantity of information, and low implementation cost. The scope of this paper is to initiate an academic discussion about using QR codes on specimens in biological collections.


Introduction
There are 55,000 museums in 202 countries containing a variety of collections from art to zebras (De Gruyter 2012). More than 7,000 biological collections worldwide (CBoL et al. 2013) preserve 1.2-2.1 billion specimens (Ariño 2010), although only ~405 million have been digitized (GBIF 2013). With such a large number of specimens, it is critical that the information be made available for research. However, the best way to link specimens to databases and other materials is still under discussion. Some options include unique specimen identifiers (USIs), globally unique identifiers (GUIDs), and life science identifiers (LSIDs).
Currently, one of the most frequently used methods is barcoding. This method was implemented in biological collections in the 1990s at INBio and the Smithsonian Institution (Janzen 1992, Thompson 1994). Barcodes are one-dimensional optical representations, where widths and spacing of parallel lines are translated primarily into numeric data. Electronic devices decode the information (usually a 13-digit number and a few letters), which is linked to a database. The idea for modern barcodes arose in 1948 in response to industry's need for a system to quickly capture product data at supermarkets during the check-out process (Brown 1997). In 1949, Bernard Silver and Norman J. Woodland, both from Drexel University, filed a patent application with the prototypes of barcodes, called "Classifying Apparatus and Method", which was issued in 1952 (Brown 1997, Woodland et al. 1952). By 1966 the first barcodes started to be used commercially (Brown 1997); in 1973 George J. Laurer proposed a standardized barcode system called Uniform Product Code (UPC), and in 1974 the first UPC scanner was installed in a Marsh supermarket in Troy, Ohio (Brown 1997). A 10-pack of Wrigley's Juicy Fruit chewing gum, today preserved at the National Museum of American History, was the first product in history to have a barcode on it. The use of barcodes in commercial settings then increased rapidly.
Soon various sectors demanded smaller codes, capable of storing more information and more character types. Matrix codes, also called two-dimensional (2D) codes, were the ideal solution. Among them, QR codes (short for Quick Response codes) have increased in popularity during the last few years. They were invented in 1994 by Denso Wave Incorporated, a Toyota subsidiary, which has chosen not to exercise its patent rights (Denso Wave Incorporated 2013).
QR codes have nine standard features (Denso Wave Incorporated 2013):

1. Capacity to handle different types of data: numeric and alphabetic characters, Kanji, Hiragana, Katakana, symbols, binary and control codes.

2. Large capacity: up to 7,089 numeric and 4,296 alphanumeric characters can be encoded (hundreds of times more than in a barcode).

3. Small printout size: the same information can be encoded in a QR code one-tenth the size of a barcode.

4. High speed scan: omni-directionally readable, with position detection patterns circumventing the negative effects of background interference.

7. Compartmentalization: QR codes can be divided into multiple data areas (as many as 16), allowing smaller printouts.

8. Flexible representation: shapes and colors of modules can be changed, even allowing for artistic representations (QR code Art).

9. Readability: QR codes can be read by any smartphone, tablet or laptop with a camera, using freely available software.
A few remarkable uses of QR codes go beyond the codes themselves. For instance, websites linked in QR codes can be displayed in the user's preferred language. This was first implemented by QRpedia (http://qrpedia.org) in 2011, to deliver a Wikipedia article in the user's language using just one QR code. This is how it works:

1. The QR code has to encode a URL containing the domain name "qrwp.org" and the path (final part) to the title of a Wikipedia article. For example, if the Wikipedia article's URL in Spanish is http://es.wikipedia.org/wiki/Asteraceae, the QR code should encode the URL http://es.qrwp.org/Asteraceae.

2. When the device navigates to the URL contained in the code, its language setting (e.g. English) is sent to the QRpedia web server as well.

3. The QRpedia server then uses Wikipedia's API to look for a version of the article in the language specified (e.g. English), and if it finds one, returns it in a mobile-friendly format. If none is found, the QRpedia server offers a choice of the available languages, or a Google translation. In the example above the resolved URL would be http://en.m.wikipedia.org/wiki/Asteraceae.
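The language selection in steps 2 and 3 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not QRpedia's actual code: the function names are hypothetical, the device's language setting is assumed to arrive as an HTTP Accept-Language header, and a plain dictionary (`available_titles`) stands in for the lookup that the real server performs through Wikipedia's API.

```python
from urllib.parse import quote


def parse_accept_language(header):
    """Return primary language codes from an Accept-Language header, best first."""
    weighted = []
    for position, part in enumerate(header.split(",")):
        part = part.strip()
        if not part:
            continue
        tag, _, params = part.partition(";")
        quality = 1.0  # a missing q-value means full preference
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0
        # Keep only the primary subtag, e.g. "en-US" -> "en".
        language = tag.strip().split("-")[0].lower()
        weighted.append((-quality, position, language))
    ordered = []
    for _, _, language in sorted(weighted):
        if language and language != "*" and language not in ordered:
            ordered.append(language)
    return ordered


def resolve_article(accept_language, available_titles):
    """Return the mobile Wikipedia URL for the best available language, or None.

    available_titles maps a language code to the article title in that
    language (a stand-in for the API lookup; an assumption of this sketch).
    """
    for language in parse_accept_language(accept_language):
        title = available_titles.get(language)
        if title is not None:
            path = quote(title.replace(" ", "_"))
            return f"http://{language}.m.wikipedia.org/wiki/{path}"
    return None  # caller would offer a list of languages or a translation
```

For a device set to English, `resolve_article("en-US,en;q=0.9", {"es": "Asteraceae", "en": "Asteraceae"})` yields the English mobile URL given in the example above; when no preferred language is available the function returns `None`, which corresponds to the fallback described in step 3.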
Another relevant characteristic is the possibility of providing statistics (e.g. with Google Analytics) about the use of each code: how many times a code has been scanned, the location of the individuals who scanned it, the user's interest (i.e. where the code was located), the profits associated with a particular code when transactions are generated, etc. For the statistics to be accurate, a unique URL has to be encoded in each QR code, so that the only way to visit the website is by scanning that code (e.g. statistics for Fig. 1D can be checked in the website provided in Fig. 1E).
Applications of QR codes range from commercial tracking, transport and entertainment ticketing, visa and passport information, delivery of Wikipedia articles (QRpedia), and song downloads, to encrypted government data (Denso Wave Incorporated 2013). QR codes are also being implemented in libraries in different ways (Ashford 2010). Information that is typically encoded includes vCard contacts, Uniform Resource Identifiers (URI), e-mail addresses, map directions and text.
Despite the fast expansion and popularization of QR codes, they have not been openly incorporated into natural history collections. Their use as USIs or to replace labels in small specimens (e.g. insects) has been briefly suggested in previous works (Blagoderov et al. 2012, Mantle et al. 2012, Schuh 2012), without giving details about advantages or implementation. Currently botanical gardens, a few zoos and various museums are using QR codes to link, for instance, specimens in exhibits to Wikipedia articles.
Some of the reasons for their slow adoption in natural history collections are:

1. Lack of general knowledge about the potential and implementation of QR codes: according to eMarketer (2013), 23-36% of adults (25 to 34 years old) in the USA and Europe have scanned a QR code. People usually associate QR codes with URLs, but that is only a limited use.

2. Concerns about the permanence of this technology: collection managers are often unwilling to invest time and resources in a new technology if they are unsure about its long-term permanence. This is part of the reason there was a lag time of 26 years between the original commercial use of barcodes and their use in natural history collections. There is also a concern that, with technology changing so fast, the long-term permanence of QR codes is difficult to predict.

3. Security concerns (i.e. malicious links or websites): since QR codes cannot be read by the human eye, curators are afraid of linking their devices to undesirable websites.

4. Implementation costs: as discussed below, implementing a new technology can be expensive. According to Green (2010), implementing a barcode tracking solution in a medium-sized organization costs on average €40,000 (including software licenses, barcode and wireless mobile computer equipment and professional services).
We intend to show the advantages of using QR codes to hard-link specimens to digital resources associated with them. In addition to hundreds of millions of iPods (Apple Inc.) and tablets, there are more than one billion smartphones in use, projected to double by 2017 according to Sui (2013). Virtually any curator or visitor could use one of these devices to quickly access digital information related to the specimen. QR codes have numerous potential uses in natural history collections: 1) to provide metadata (e.g., rights of use, proposed citations, projects or particular collections, references, collector contact information); 2) to provide supplementary specimen or species information (e.g., chromosome counts, additional field notes, ecosystem details); 3) to link to digital resources (e.g., photographs, maps, videos, audio recordings, supplementary information); and 4) to provide this information in multiple languages (see examples in Fig. 1). Even though multiple methods have been used in the past to do all this, QR codes provide a unique opportunity to use a personal device (e.g., a smartphone) to perform these tasks in a fast, simple, and graphical way. With QR codes the user would not have to write the collection identifier in a notebook, find a computer, and search in a database; they would only need to point their device at the code to obtain the links for photographs, maps, etc. When the Internet is accessible through the telephone network (the usual situation with smartphones and tablets), Wi-Fi is not even necessary. Because traditional barcodes require special scanners plugged into computers with access to the database, the information is accessible only to staff. QR codes, however, can contain links that allow any user to access all the information, and some applications allow users to scan multiple QR codes and save the results in numerous formats, which would speed up data gathering.
Finally, QR codes could even be used to back up digital information from specimens.

General considerations and recommendations
Projects that implement QR codes on natural history specimens require clear goals. For instance, if Uniform Resource Locators (URLs) are going to be encoded, the long-term permanence of the URLs must be guaranteed. An alternative for small institutions is the creation of Persistent Uniform Resource Locators (PURLs) that point to other URLs (OCLC 2013). Various PURL resolvers such as OCLC (2013) are available for free on the Web. Digital object identifiers (DOIs), commonly used for scholarly materials (journal articles, books, etc.), are a familiar example of such persistent identifiers. The creation of PURLs is simple, and batches of PURLs can be created via API or with Java, Perl or Python code available online (e.g. Arase 2009, Kaszuba 2013, Perl Training Australia 2013). The resulting URLs should be thumb-interactive, purpose-driven and easy to read using smartphones or any other mobile scanning device.
QR codes follow strict standards (Fig. 2). The minimum unit of information is called a module. The number of modules affects size and amount of information, ranging from version 1 (21 × 21 modules) to version 40 (177 × 177 modules). The minimum size of a module is usually established depending on the printer and reader capabilities. The symbol area requires a 4-module wide margin or "quiet (clean) zone" around it (Fig. 2).
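The relation between version and module count follows a simple rule: each version adds four modules per side to the 21 of version 1. A minimal Python sketch (the function names are ours, chosen for illustration) makes the range quoted above easy to check and adds the 4-module quiet zone on each side:

```python
def modules_per_side(version):
    """Number of modules along one side of a QR symbol (versions 1 to 40)."""
    if not 1 <= version <= 40:
        raise ValueError("QR code versions range from 1 to 40")
    return 17 + 4 * version


def modules_with_quiet_zone(version, quiet_zone=4):
    """Modules per side including the quiet zone on both margins."""
    return modules_per_side(version) + 2 * quiet_zone
```

For example, `modules_per_side(1)` gives 21 and `modules_per_side(40)` gives 177, matching the limits of the standard, while a version 1 symbol needs 29 modules per side once the quiet zone is included.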
Code size and printer resolution are important to generate readable QR codes. The minimum size of the QR code modules depends on printer head density. Higher density improves print quality, diminishing the effect of paper width and quality, feed speed fluctuations, axis distortion, blurring, etc. The minimum module size in a laser printer at 600 dpi (24 dot/mm), assuming a 4-dot/module configuration, is 0.17 mm. A thermal printer at 200 dpi (8 dot/mm) with a 4-dot/module configuration prints modules with a minimum side of 0.5 mm. Therefore, on such a thermal printer, a version 1 QR code (21 × 21 modules) with 30% error correction (level H) should have a minimum printed size of 1.05 cm per side (21 × 0.5 mm), leaving a quiet zone of 2 mm (4 modules) around the margins.
Another factor to consider is the scanner resolution. Standard scanners and smartphone cameras have a resolution of 0.25 mm or less. Roughly 90% of these devices read QR codes of 26 × 26 mm, and the latest phones (2011 or newer) usually have macro capabilities allowing them to scan QR codes of 10 × 10 mm.
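The arithmetic behind these figures can be reproduced with a short Python sketch. The function names are hypothetical, and the printer density is passed in dots per millimetre using the rounded values quoted above (24 and 8 dot/mm); an exact conversion from dpi would give marginally different numbers.

```python
def module_size_mm(dots_per_mm, dots_per_module=4):
    """Smallest printable module side, in millimetres."""
    return dots_per_module / dots_per_mm


def symbol_size_mm(modules, dots_per_mm, dots_per_module=4, quiet_zone_modules=4):
    """Return (symbol side, quiet zone width) in millimetres."""
    module = module_size_mm(dots_per_mm, dots_per_module)
    return modules * module, quiet_zone_modules * module


# Version 1 symbol (21 modules per side) on a thermal printer at 8 dot/mm.
side_mm, quiet_mm = symbol_size_mm(21, 8)
```

With these inputs `side_mm` is 10.5 (that is, 1.05 cm) and `quiet_mm` is 2.0, the values given in the text; `module_size_mm(24)` rounds to 0.17 mm for the laser printer.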
To determine minimum size, we tested the readability of QR codes with 568, 406, 219 and 111 alphanumeric characters (including spaces), with the four levels of error correction (Fig. 3). All the codes were printed on rough-textured archival paper. Codes were scanned with an iPhone 4® and an iPad 2® using the free version of the software Qrafter (http://keremerkan.net/downloads/). Readable QR codes at level H required a minimum size ranging from 1.27 cm (0.5 in) for 111 characters to 2.79 cm (1.1 in) for 568 characters. The minimum size required when using level L ranged from 1.02 cm (0.4 in) for 111 characters to 2.03 cm (0.8 in) for 568 characters. We recommend printing QR codes at least 2 mm bigger than the minimum readable size (e.g. 3 cm/side for codes encoding ~600 characters), and preferably using level H when coding information on specimens of natural history collections. When space is limited (e.g., small labels), codes with a lower error correction level could be used, or even mini-QR codes.
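As a rough illustration of the 2 mm recommendation, the following Python sketch applies the margin to the minimum readable sizes reported above. The dictionary only restates the measured end points for levels H and L; the names are ours, and sizes for other character counts would have to be tested with the printer and paper at hand rather than computed.

```python
# Minimum readable side (cm) from the test above: (level, characters) -> cm.
MIN_READABLE_CM = {
    ("H", 111): 1.27,
    ("H", 568): 2.79,
    ("L", 111): 1.02,
    ("L", 568): 2.03,
}


def recommended_size_cm(min_readable_cm, margin_mm=2.0):
    """Minimum readable side plus a safety margin, in centimetres."""
    return round(min_readable_cm + margin_mm / 10.0, 2)


recommended = {key: recommended_size_cm(size) for key, size in MIN_READABLE_CM.items()}
```

For a level H code with 568 characters this gives 2.99 cm, in line with the ~3 cm per side suggested for codes of about 600 characters.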
The cost associated with the implementation of QR codes in biological collections needs to be considered carefully. Programming code for creating QR codes is freely available, and there are numerous QR code generation packages (including an API in Google), most of them free (see Appendix 1). With some free applications it is possible to generate thousands of QR codes at no cost. However, generating hundreds of thousands or millions of codes would require the purchase of specialized software (prices varying from $5 to $1,000) or the development of an application using the available programming code. At those scales, even if the price of production is small, the investment in time and resources can be significant.
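Whatever generator is chosen, bulk production starts from a list of unique payloads, one per specimen. The Python sketch below shows one possible way to prepare such a list with the standard library only; the resolver address `purl.example.org`, the collection code and the zero-padded numbering scheme are all hypothetical placeholders, not an existing service or a recommended standard.

```python
from urllib.parse import quote

# Hypothetical persistent-URL resolver; replace with the institution's own.
BASE_URL = "http://purl.example.org/specimen/"


def specimen_urls(collection_code, first_number, count, width=8):
    """Yield one unique URL per specimen, e.g. .../specimen/US00000001."""
    for number in range(first_number, first_number + count):
        catalog_id = f"{collection_code}{number:0{width}d}"
        yield BASE_URL + quote(catalog_id)


# Payloads for the first three specimens of a collection coded "US".
batch = list(specimen_urls("US", 1, 3))
```

Each string in `batch` can then be passed to any of the QR code generators mentioned above. Because every URL is distinct, scans of individual codes can also be counted separately.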
General considerations and recommendations are summarized in Table 1.

Table 1 (Topic; Attribute: Considerations and recommendations)

Content
Amount of information: The more text included, the larger the size. A QR code encoding ~600 characters will require a minimum size of ~3 cm/side.
Type of information: QR codes can be used to provide metadata, label information, supplementary information, and links to digital resources.
Language of content: To deliver content in different languages, a web server such as QRpedia has to identify the language of the scanning device.
Statistics: URLs encoded in QR codes have to be unique to produce accurate statistics of readability.
Long-term permanence of URLs: Always use permanent URLs (PURLs).

Design
Size: Size is affected by amount of text, error correction level, paper quality, and printer and scanner resolution. We recommend printing QR codes at least 2 mm bigger than the minimum readable size.
Error correction: An error correction of 30% (level H) is recommended. This, however, increases the size of the QR code.
Paper quality: Consider long-term preservation when choosing paper quality. Archival paper has a rough texture, which slightly distorts shapes, affecting the minimum size required for readability. Printing QR codes on labels of new specimens is an inexpensive option. Another, more expensive option is using the same materials currently used for barcodes.
Cleanness: QR codes require a "quiet" area around them and prefer black/dark print on a white/clear background.
Printer resolution: Prefer high-quality printers. Test the minimum readable size of QR codes with your printer.

Bulk generation
Generation: Various resources can be used to produce thousands of QR codes for free. The production of batches of hundreds of thousands of QR codes requires adapting available programming code, or purchasing expensive software.
Cost: The implementation of QR codes in large collections with millions of specimens can be extremely expensive. In those cases we suggest implementing them primarily for the new accessions.

Scanning
Scanner resolution: Most scanning devices (90%) read QR codes of 26 × 26 mm or bigger. For scanning smaller codes macro capabilities are often required.
Scanning speed: The scanning speed is inversely related to the amount of information in the codes.
Scanning tips: The device should be kept parallel to the code and as close as possible while still allowing it to focus. Edges of the code must be visible. It requires a few minutes of practice.

Security: QR codes could be used to direct the device to websites with malicious code. We recommend: 1) collection managers should check QR codes and links from unverified sources before making them available; 2) users should only scan QR codes that have been approved by collection managers; 3) users should deactivate the automatic website launch option in their scanning software so they have the option of declining before it opens.

Possible utility
QR codes provide a bridge between the digital and the physical world by delivering content to the most used electronic devices. The critical question is, what kind of information would a curator or visiting researcher find useful? QR codes can deliver plain text information combined with multiple links to online content. Plain text information could include institutional information, the label information, rights of use, or additional information not printed on the label, such as the agencies and name of the project funding the collecting expeditions. Links to online resources can cover a broad spectrum, from information associated with the specimen itself to data related to the taxonomic entity, locality, research project, and funding agencies (e.g., projects funded using national and international resources could display that information).

Final remarks
Are QR codes going to be outdated soon? Are they going to be replaced by another 2D code? Similar questions were asked in the '60s concerning barcodes, and five decades later we are still using them. While there is no immediate answer to these questions, it must be pointed out that QR codes have rapidly inundated vast sectors of [...]. Small institutions with tight budgets can still implement this technology without having to spend excessive amounts of money. If there is no Wi-Fi, users can still connect to the Internet through their phone networks. If there are no phone networks available, devices can be used to record the information contained in QR codes as text (e.g. label information from large batches of specimens).
One concern that has been expressed about QR codes is the potential presence of malicious code either in the codes themselves or on the websites they link to. Most of the scanning applications now have URL safety check services to detect malicious content. Recommendations for countering this problem are mentioned in Table 1.
Scanning QR codes using smartphones, iPods and digital tablets has become a common practice, and there are more than 25 applications for doing this, most of them free of charge (see Appendix 2). QR codes are quickly penetrating the mainstream and represent an opportunity to facilitate access to specimen information. We already have a critical mass of the population with devices capable of using this technology, and in the near future people may be pointing their phones and tablets at QR codes on natural history specimens to obtain more information.