The program encodes text strings passed either on the command line (with -b) or retrieved from standard input. The text representation is interpreted according to the following rules. When auto-detection of the encoding is enabled (i.e., no explicit encoding type is specified), the encoding types are scanned to find one that can digest the text string. The following list of supported types is sorted in the same order the library uses when auto-detecting a suitable encoding for a string.
The EAN frontend is similar to UPC; it accepts strings of
digits, 12 or 7 characters long. Strings of 13 or 8 characters
are accepted if the provided checksum digit is correct.
I expect most users to feed input without a
checksum, though. The add-2 and add-5 extensions are accepted for both
the EAN-13 and the EAN-8 encodings.
The following are examples of valid input strings:
“123456789012” (EAN-13), “1234567890128” (EAN-13 with
checksum), “1234567” (EAN-8), “12345670 12345” (EAN-8
with checksum and add-5), “123456789012 12” (EAN-13 with add-2),
“123456789012 12345” (EAN-13 with add-5).
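The checksum digit mentioned above can be reproduced with a few lines of code. The following is a minimal sketch of the standard EAN check-digit rule, not the library's own code; the function name ean_check_digit is hypothetical.

```python
def ean_check_digit(data):
    """Return the check digit for 12 (EAN-13) or 7 (EAN-8) data digits."""
    if not (data.isascii() and data.isdigit()) or len(data) not in (7, 12):
        raise ValueError("expected 7 or 12 digits")
    # Weights alternate 3, 1, 3, ... starting from the rightmost data digit.
    total = sum(int(ch) * (3 if i % 2 == 0 else 1)
                for i, ch in enumerate(reversed(data)))
    return (10 - total % 10) % 10


print(ean_check_digit("123456789012"))  # data digits of the EAN-13 example
print(ean_check_digit("1234567"))       # data digits of the EAN-8 example
```

The two calls correspond to the examples “1234567890128” and “12345670” listed above.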
The UPC frontend accepts only strings made up of digits (and, if a supplemental encoding is used, a blank to separate it). It accepts strings of 11 or 12 digits (UPC-A) and 6 or 7 or 8 digits (UPC-E).
The 12th digit of UPC-A is the checksum and is added by the
library if not specified in the input; if it is specified, it
must be the right checksum or the code is rejected as invalid.
For UPC-E, 6 digits are considered to be the middle part of the
code, a leading 0 is assumed and the checksum is added;
7 digits are either considered the initial part (leading digit
0 or 1, checksum missing) or the final part (checksum specified,
leading 0 assumed); 8 digits are considered to be the complete code,
with leading 0 or 1 and checksum.
For both UPC-A and UPC-E, a trailing string of 2 digits or 5 digits
is accepted as well. Therefore, the following are examples
of valid strings that can be encoded as UPC:
“01234567890” (UPC-A), “012345678905” (UPC-A with checksum),
“012345” (UPC-E), “01234567890 12” (UPC-A, add-2),
“01234567890 12345” (UPC-A, add-5) and “0123456 12”
(UPC-E, add-2).
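The length rules above can be summarized as a small classifier. This is only an illustrative sketch of the rules as described in this section; classify_upc is a hypothetical helper, not a library function, and it does not verify checksums.

```python
def classify_upc(text):
    """Return (symbology, add-on) for a UPC input string, or raise ValueError."""
    # A blank separates the main code from the optional supplemental digits.
    main, _, addon = text.partition(" ")
    if not (main.isascii() and main.isdigit()):
        raise ValueError("UPC input must be made up of digits")
    if addon and not (addon.isascii() and addon.isdigit() and len(addon) in (2, 5)):
        raise ValueError("the add-on must be 2 or 5 digits")
    if len(main) in (11, 12):
        kind = "UPC-A"   # 12th digit, if present, is the checksum
    elif len(main) in (6, 7, 8):
        kind = "UPC-E"   # 6: middle part, 7: initial or final part, 8: complete
    else:
        raise ValueError("unsupported number of digits")
    return kind, ("add-%d" % len(addon) if addon else None)


print(classify_upc("01234567890 12"))
print(classify_upc("0123456 12"))
```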
Please note that when setting BARCODE_ANY to auto-detect
the encoding to be used, 12-digit strings and 7-digit strings
will always be identified as EAN. This is because I expect most
users to provide input without a checksum. If you need to
specify UPC-with-checksum as input you must explicitly set
BARCODE_UPC as a flag or use -e upc on the command line.
ISBN numbers are encoded as EAN-13 symbols, with an optional
add-5 trailer. The ISBN frontend of the library accepts real
ISBN numbers and deals with any hyphen and, if present, the
ISBN checksum character before encoding data. Valid
representations for ISBN strings are for example:
“1-56592-292-1”, “3-89721-122-X” and “3-89721-122-X 06900”.
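To illustrate what "dealing with hyphens and the ISBN checksum character" amounts to, here is a hedged sketch of the conversion from a 10-character ISBN to EAN-13 digits. It assumes the usual "Bookland" convention of prefixing 978, which is not stated in this section; isbn10_to_ean13 is a hypothetical name and the code is not taken from the library.

```python
def isbn10_to_ean13(isbn):
    """Convert an ISBN-10 string (optionally followed by ' ' and an add-5) to EAN-13."""
    main, _, addon = isbn.partition(" ")
    digits = main.replace("-", "")
    if len(digits) == 10:
        digits = digits[:9]          # drop the ISBN check character (digit or X)
    if len(digits) != 9 or not (digits.isascii() and digits.isdigit()):
        raise ValueError("not a valid ISBN-10 string")
    data = "978" + digits            # assumption: Bookland prefix
    # EAN-13 check digit: weights 1, 3, 1, 3, ... from the leftmost digit.
    total = sum(int(ch) * (1 if i % 2 == 0 else 3) for i, ch in enumerate(data))
    ean = data + str((10 - total % 10) % 10)
    return ean + (" " + addon if addon else "")


print(isbn10_to_ean13("1-56592-292-1"))
print(isbn10_to_ean13("3-89721-122-X 06900"))
```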
This encoding can represent all of the printing ASCII characters, from the space (32) to DEL (127). The checksum digit is mandatory in this encoding.
The “C” variation of Code-128 uses Code-128 symbols to represent two digits at a time (Code-128 is made up of 104 symbols whose interpretation is controlled by the start symbol being used). Code 128-C is thus the most compact way to represent any even number of digits. The encoder refuses to deal with an odd number of digits because the caller is expected to provide proper padding to an even number of digits. (Since Code-128 includes control symbols to switch charset, it is theoretically possible to represent the odd digit as a Code 128-A or 128-B symbol, but this tool doesn’t currently implement this option).
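The pairing of digits can be sketched as follows. This is an illustration of the idea only (each pair of digits becomes one symbol value from 0 to 99), with a hypothetical function name; it mirrors the refusal of odd-length input described above.

```python
def code128c_values(digits):
    """Split an even-length digit string into Code 128-C symbol values (0..99)."""
    if not (digits.isascii() and digits.isdigit()):
        raise ValueError("Code 128-C can only represent digits")
    if len(digits) % 2:
        # The caller is expected to pad to an even number of digits.
        raise ValueError("odd number of digits")
    return [int(digits[i:i + 2]) for i in range(0, len(digits), 2)]


print(code128c_values("001234"))
```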
Code-128 output represented symbol-by-symbol in the input
string. To work around some of the problems outlined below in
specifying code128 symbols, this pseudo-encoding allows the
user to specify a list of code128 symbols separated by
spaces. Each symbol is represented by a number in the range
0-105. The list should include the leading character. The
checksum and the stop character are automatically added by the
library. Most likely this pseudo-encoding will be used with
BARCODE_NO_ASCII and some external program to supply the
printed text.
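For reference, the checksum that gets appended to such a raw list can be computed as shown below. The sketch follows the standard Code 128 rule (start symbol plus position-weighted data symbols, modulo 103), which comes from the Code 128 specification rather than from this section; the function name is hypothetical.

```python
def code128_raw_checksum(raw):
    """Compute the Code 128 check symbol for a space-separated list of symbol values."""
    values = [int(tok) for tok in raw.split()]
    if not values or any(not 0 <= v <= 105 for v in values):
        raise ValueError("expected symbol values in the range 0-105")
    start, data = values[0], values[1:]
    # The start symbol has weight 1; data symbols are weighted 1, 2, 3, ...
    return (start + sum(pos * v for pos, v in enumerate(data, start=1))) % 103


# 105 is the "Start C" symbol, followed by the digit pairs 12 and 34.
print(code128_raw_checksum("105 12 34"))
```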
The code-39 standard can encode uppercase letters, digits, the blank space, plus, minus, dot, star, dollar, slash, percent. Any string that is only composed of such characters is accepted by the code-39 encoder. To avoid losing information, the encoder refuses to encode mixed-case strings (a lowercase string is nonetheless accepted as a shortcut, but is encoded as uppercase).
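The acceptance rule for code-39 input can be sketched like this. It is an illustration of the behavior described above, written with hypothetical names; the character set is the one listed in the paragraph.

```python
CODE39_ALPHABET = set("ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789 +-.*$/%")


def code39_normalize(text):
    """Return the uppercase string to encode, or None if the input is rejected."""
    if not text.isascii():
        return None
    has_lower = any(c.islower() for c in text)
    has_upper = any(c.isupper() for c in text)
    if has_lower and has_upper:
        return None                  # mixed case would lose information
    upper = text.upper()             # all-lowercase input is a shortcut
    return upper if all(c in CODE39_ALPHABET for c in upper) else None


print(code39_normalize("abc-123"))
print(code39_normalize("Abc"))
```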
This encoding can only represent an even number of digits
(odd digits are represented by bars, and even digits by the
interleaving spaces). The name stresses the fact that two
of the five items (bars or spaces) allocated to each symbol
are wide, while the rest are narrow. The checksum digit is
optional (can be disabled via BARCODE_NO_CHECKSUM).
Since the number of digits, including the checksum, must be even,
a leading zero is inserted in the string being encoded if needed
(this is specifically stated in the specs I have access to).
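The padding rule can be written down in a few lines. The sketch below only decides whether a leading zero is needed; it does not compute the checksum digit itself, and i25_pad is a hypothetical helper rather than part of the library.

```python
def i25_pad(digits, with_checksum=True):
    """Prepend a zero if the final symbol count (including checksum) would be odd."""
    if not (digits.isascii() and digits.isdigit()):
        raise ValueError("interleaved 2 of 5 can only represent digits")
    final_len = len(digits) + (1 if with_checksum else 0)
    return "0" + digits if final_len % 2 else digits


print(i25_pad("1234"))                       # checksum will be added later
print(i25_pad("123", with_checksum=False))   # as with BARCODE_NO_CHECKSUM
```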
Automatic selection between alphabet A, B and C of the Code-128 standard. This encoding can represent all ASCII symbols, from 0 (NUL) to 127 (DEL), as well as four special symbols, named F1, F2, F3, F4. The set of symbols available in this encoding is not easily represented as input to the barcode library, so the following convention is used. In the input string, which is a C-language null-terminated string, the NUL char is represented by the value 128 (0x80, 0200) and the F1-F4 characters are represented by the values 193-196 (0xc1-0xc4, 0301-0304). The values have been chosen to ease their representation as escape sequences.
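As an illustration of this convention, the following sketch builds an input byte string from a list of character codes, mapping NUL to 0x80 and leaving the F1-F4 values as they are. The constant and function names are hypothetical; only the numeric values come from the paragraph above.

```python
F1, F2, F3, F4 = 0xC1, 0xC2, 0xC3, 0xC4   # 193-196, as described above


def code128_input(items):
    """Build the byte string to pass as input from ASCII codes and F1-F4 values."""
    out = bytearray()
    for v in items:
        if v == 0:
            out.append(0x80)          # NUL cannot appear in a C string
        elif 1 <= v <= 127 or v in (F1, F2, F3, F4):
            out.append(v)
        else:
            raise ValueError("value %r cannot be represented" % (v,))
    return bytes(out)


print(code128_input([F1, ord("A"), 0, ord("B")]))
```

The resulting byte string contains no zero byte, so it can be handed over as a null-terminated string.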
Since the shell doesn’t seem to interpret escape sequences on the
command line, the "-b" option cannot be easily used to designate
the strings to be encoded. As a workaround you can resort
to the command echo, either within back-ticks or used
separately to create a file that is then fed to the standard-input
of barcode, assuming your echo command processes escape
sequences. The newline character is especially tough to encode
(but not impossible unless you use a csh variant).
These problems only apply to the command-line tool; the use of library functions doesn’t give any problem. If needed, you can use the “code 128 raw” pseudo-encoding to represent code128 symbols by their numerical value. This encoding is used late in the auto-selection mechanism because (almost) any input string can be represented using code128.
Codabar can encode the ten digits and a few special symbols
(minus, plus, dollar, colon, bar, dot). The characters
“A”, “B”, “C” and “D” are used to
represent four different start/stop characters. The input
string to the barcode library can include the start and stop
characters or not include them (in which case “A” is
used as start and “B” as stop). Start and stop
characters in the input string can be either all lowercase or
all uppercase and are always printed as uppercase.
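The handling of start and stop characters can be sketched as follows. This is an illustration under two assumptions that are not stated above: that "bar" refers to the slash character "/" of the Codabar standard, and that start/stop characters are considered present only when both the first and the last character are letters from A to D. The names are hypothetical.

```python
CODABAR_DATA = set("0123456789-+$:/.")   # assumption: "bar" means "/"
CODABAR_STARTSTOP = set("ABCD")


def codabar_normalize(text):
    """Return the string with uppercase start/stop characters, or None if rejected."""
    first, last = text[:1], text[-1:]
    if (len(text) >= 2 and first.upper() in CODABAR_STARTSTOP
            and last.upper() in CODABAR_STARTSTOP):
        if first.isupper() != last.isupper():
            return None               # start/stop must share the same case
        start, body, stop = first.upper(), text[1:-1], last.upper()
    else:
        start, body, stop = "A", text, "B"   # defaults when none are given
    if not all(c in CODABAR_DATA for c in body):
        return None
    return start + body + stop


print(codabar_normalize("1234"))
print(codabar_normalize("c12-34d"))
```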
Plessey barcodes can encode all the hexadecimal digits. Alphabetic digits in the input string must either be all lowercase or all uppercase. The output text is always uppercase.
MSI can only encode the decimal digits. While the standard specifies either one or two check digits, the current implementation in this library only generates one check digit.
The code-93 standard can natively encode 48 different characters, including uppercase letters, digits, the blank space, plus, minus, dot, star, dollar, slash, percent, as well as five special characters: a start/stop delimiter and four "shift characters" used for extended encoding. Using this "extended encoding" method, any standard 7-bit ASCII character can be encoded, but a character that is not natively supported takes up two symbols in the barcode. The encoder here fully implements the code 93 encoding standard. Any characters natively supported (A-Z, 0-9, ".+-/$&%") will be encoded as such; for any other characters (such as lowercase letters, brackets, parentheses, etc.), the encoder will revert to extended encoding. As a note, the option to exclude the checksum will eliminate the two modulo-47 checksums (called C and K) from the barcode, but this will probably make it unreadable by nearly all scanning systems. These checksums are specified to be used at the firmware level, and their absence will be interpreted as an invalid barcode.
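The two modulo-47 checksums can be computed as in the sketch below. The symbol values and the weighting scheme (weights 1 to 20 for C and 1 to 15 for K, counted from the right) are taken from the Code 93 specification, not from this section, and the code covers natively supported characters only; the names are hypothetical and this is not the library's implementation.

```python
# Symbol values 0..42 of the natively supported characters (standard Code 93 table).
CODE93_CHARS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ-. $/+%"


def _weighted_mod47(values, max_weight):
    """Sum values weighted 1, 2, ... from the right, wrapping after max_weight."""
    total = 0
    for i, v in enumerate(reversed(values)):
        total += v * (i % max_weight + 1)
    return total % 47


def code93_checksums(text):
    """Return the (C, K) check symbol values for a natively encodable string."""
    values = [CODE93_CHARS.index(ch) for ch in text]   # ValueError if unsupported
    c = _weighted_mod47(values, 20)
    k = _weighted_mod47(values + [c], 15)              # K also covers C
    return c, k


c, k = code93_checksums("TEST93")
print(c, k)
```

The results are symbol values; values from 0 to 42 can be mapped back to a character through CODE93_CHARS, while 43 to 46 correspond to the shift characters.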