QRPh Part 2 - The EMV Roots

Ingo Noka

2025-01-21 (Last Modified: 2025-03-04)

Banks and their regulators have been conditioned to believe that EMV is the global standard for card payments. It is therefore not surprising that all domestic QR payment standards, including QRPh, are based on the EMV QR specifications. For this reason, I cover the design of the EMV specification first, before turning to the QRPh specification in one of the next posts.

The large payment schemes such as MasterCard™ and Visa™ have always been firm believers in smart card technology. That belief was reinforced with the introduction of contactless payments, which opened the world of mobile phones, smartwatches and other devices to EMV.

However, while the payment schemes and their member organizations spent millions (maybe billions) of dollars on expensive EMV terminals and smart cards, new players dismissed EMV as too cumbersome, too expensive and too tightly controlled by the traditional payment schemes. Unencumbered by legacy systems and blessed with the "ignorance" of the newcomer, e-wallet providers in the emerging markets of Asia implemented QR code-driven systems instead^[1]. They simply did not know that such a thing could not be done.

This movement gained critical momentum due to the adoption of QR payments by the ever so pragmatic Chinese payment systems such as Alipay, WeChat and UnionPay. Seeing the breakneck speed at which these systems proliferated in China, other countries, especially in the developing world, decided to copy the Chinese success formula; sometimes in cooperation with Chinese e-wallets, sometimes DIY-style.

QR code payments were much cheaper than EMV and also much less complex, especially for static merchant presented QR codes. Enrolling a merchant was as simple as printing a QR code on a cardboard standee and providing a low-cost phone to the merchant for receiving payment confirmations. In fact, many enterprising merchants used their personal account to receive payments, without even asking the e-wallet provider.

Alerted by the tremendous growth of QR code payments, somebody at EMVCo must have lost their nerve. Or maybe EMVCo was pressured by the big players in the EMV world to come up with a standard that would allow the payment scheme members to benefit from the new QR technology. And thus, I think in 2017, the EMV QR Code Payment Standard was created.

EMVCo published two QR Code specifications. One for QR codes that are presented by the customer to the merchant and another one for QR codes that are presented by the merchant to the customer.

The customer presented QR code specification is geared towards traditional payment cards and must have been designed by people who come from the world of magnetic stripe and EMV smart cards.

The merchant presented QR code is different, since there is no real equivalent in the traditional world of card payments. The merchant presented QR code data format shows such pragmatism and a lack of complexity that I suspect this is maybe the first standard in the EMVCo world that was not driven by the traditional players.

Since this is a series of articles about QRPh, I will only cover the merchant-presented QR specification.

Both specifications focus on the data format and the meaning of some data included in a QR code. The elements that would be necessary for a truly interoperable system, such as messaging protocols, routing standards and so forth, are undefined or missing. The standards already contain everything that would be necessary for the existing global payment schemes, but also provide enough flexibility for new players to define their own data elements.

Merchant Presented

The merchant presented QR code standard defines the data that a merchant would include in a QR code that is presented to the customer for scanning. In this mode, the merchant has no information about the customer or the customer account. This is very different from the traditional card payments which always start with the customer handing over their account data, so that the merchant can request an authorization and settle funds later (pull the money from the customer account).

In the merchant presented QR transaction flow, the customer will "instruct" their e-wallet or bank to push the transaction amount to the merchant account. This does not preclude the option that the sending QR payment provider only sends the customer account information to the receiving QR payment provider for a later settlement.

"TLV" Encoding

All data elements in the QR data are encoded in three parts: the tag, the length and the value.

Tag	The tag defines the meaning, format, encoding and data type of the data object^[2] value. None of the information about the meaning, format, encoding or data type is encoded in the tag itself. Instead, this information needs to be obtained from the respective data standard.
Length	The length tells us how many bytes or characters^[3] are in the value.
Value	The value contains primitive data such as a number or a printable string or, in the case of a template data object, it contains other data objects.

It is unfortunate that EMVCo does not use well-established TLV message encoding conventions such as the ones in the ASN.1 standard^[4].

EMVCo, instead, chose to encode tag and length in two numerical characters each. In the case of the tag number, this encoding does not carry any information about the type of the data object’s value. It is just an identifier.

The inefficiency becomes clear when you consider that the specification allocates four bytes for the tag and the length, neither of which can be bigger than 99! The number 99 could be encoded in seven bits, and seven bits have room for up to 128 (incl. the zero) different numbers.

Apart from the obvious inefficiency of the encoding, this means that there is no way to sensibly parse the network-proprietary data objects. For instance, there is no way to work out from the data alone whether a data object is a template (i.e., whether it contains other TLV elements) or whether it is a primitive data object. Equally, it is impossible to work out whether the data is a number or a printable string or something else.
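To make the encoding concrete, here is a minimal sketch of a parser for a single level of this TLV format. It assumes only what is described above: two decimal digits for the tag, two decimal digits for the length, and a length that counts characters. The function name `parse_tlv`, the sample data and the tag numbers in it are made up for illustration.

```python
from typing import Iterator, Tuple


def parse_tlv(data: str) -> Iterator[Tuple[str, str]]:
    """Yield (tag, value) pairs from one level of EMV-QR-style TLV data."""
    pos = 0
    while pos < len(data):
        header = data[pos:pos + 4]
        # Tag and length are two decimal digits each.
        if len(header) < 4 or not (header.isascii() and header.isdigit()):
            raise ValueError(f"bad tag/length header at offset {pos}: {header!r}")
        tag, length = header[:2], int(header[2:])
        # The length counts characters, so slicing the str is sufficient here.
        value = data[pos + 4:pos + 4 + length]
        if len(value) != length:
            raise ValueError(f"tag {tag}: value shorter than declared length {length}")
        yield tag, value
        pos += 4 + length


# Invented sample: three primitive data objects with illustrative tag numbers.
sample = "000201" "5909Sari Sari" "6006Manila"
objects = list(parse_tlv(sample))
print(objects)
```

Note what the parser cannot do: nothing in the data tells it whether a value is a template that should be parsed recursively or a primitive value. That knowledge has to come from the specification that defines the tag.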

Primitive Data Object Types

There are three data types for the values of data objects: numeric, alphanumeric-special, and string. The encoding in all cases is UTF-8. Numeric is a subset of alphanumeric-special, which is a subset of string.

The numeric data type represents an integer consisting of one or multiple decimal digits (0 to 9). It is encoded by assigning each character its one-byte code point according to UTF-8. The encoded length in bytes is therefore always the number of digits, including leading zeros. In the case of tag and length, the integer must always contain exactly two decimal digits, padded with a '0' character on the left if necessary.

The alphanumeric-special data type consists of one or multiple printable ASCII characters (see the list in table 36 of (1)). As such, the encoding is done by assigning each character its one-byte UTF-8 code (between 0x20 and 0x7e).

The string data type consists of characters that can each be encoded with UTF-8 into one, two, three or four bytes. All characters must be precomposed; that means characters with diacritical marks are always encoded as a single UTF-8 encoded character^[5].

The catch now is that the length within the TLV structure provides the number of characters, not the number of bytes.

Using the number of characters has some implications. The length of the encoded data (the number of bytes of the raw QR data) is different from the length provided in the TLV data object. This problem becomes evident in section 4.1. of (2) where they are trying to limit the length of the "payload" (i.e., the data represented by the QR code) to 512 alphanumeric characters:

Firstly, "alphanumeric character" is nowhere defined as a type. Is it possible that they intend to limit the number of bytes in the encoded data of the QR code to 512?
Secondly, there is a requirement to reduce the number of "alphanumeric characters" in the payload if some characters have to be UTF-8 encoded into more than one byte. Again, something that seems to suggest that "alphanumeric character" really means byte.
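The gap between the two counts is easy to demonstrate. The following sketch encodes one data object whose value contains two precomposed characters and compares the number of characters with the number of UTF-8 bytes. The helper `tlv_encode`, the tag number and the value are assumptions chosen for illustration only.

```python
def tlv_encode(tag: str, value: str) -> str:
    """Encode one data object; the length field counts characters, not bytes."""
    if len(value) > 99:
        raise ValueError("value too long for a two-digit length")
    return f"{tag}{len(value):02d}{value}"


# Invented value with precomposed n-tilde (U+00F1) and e-acute (U+00E9).
name = "Se\u00f1or Caf\u00e9"
obj = tlv_encode("59", name)

char_count = len(obj)                  # characters in the data object
byte_count = len(obj.encode("utf-8"))  # bytes in the raw QR data

print(obj, char_count, byte_count)
```

The length field says 10, the data object is 14 characters long, but it occupies 16 bytes in the QR code. A limit expressed in "characters" and a limit expressed in bytes are therefore two different limits as soon as a single non-ASCII character appears.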

Even within the well-defined tags of the EMV specification, there are design decisions that can make battle-hardened computer engineers cry.

For instance, Table 3.6 of (2) shows that tags 02 to 51 are merchant account information with data type ans. Table 4.1 then reveals that only tags 02 to 25 denote primitive data objects (reserved for various international networks), while tags 26 to 51 are templates that follow a certain structure. The inclusion of network-specific tags is already bad enough, but who came up with this design? All networks, whether old or new, should have been treated the same.

Templates

The value of a template data object is itself a list of data objects, which also follow the TLV format and encoding standard.

The EMV standard contains two classes of templates. There are pre-defined templates, i.e., templates for which the EMV standard defines which simple data objects may be included in the value of the template.

The standard also defines a few templates that can be used for the inclusion of data objects that are network-specific and that are not part of the EMV specification. These templates always have a tag 00 which contains a unique identifier which provides the context for the other data objects in the template. The identifier is similar to namespaces in programming languages. The meaning, value format and length requirements for the data objects in the network-specific templates are defined by the network that "owns" the namespace (i.e., the identifier).
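The namespace idea can be sketched in a few lines. The code below builds a template under tag 26 for a hypothetical network identified by `com.example.pay` and then reads it back, treating the value of tag 00 as the namespace for the remaining data objects. All names, tags 01 and 02 and their values are invented; a real network would publish its own definitions.

```python
from typing import Dict, List, Tuple


def encode(tag: str, value: str) -> str:
    """Encode one data object with a two-digit tag and a two-digit character count."""
    return f"{tag}{len(value):02d}{value}"


def parse_tlv(data: str) -> List[Tuple[str, str]]:
    """Split one level of TLV data into (tag, value) pairs."""
    objects, pos = [], 0
    while pos < len(data):
        tag, length = data[pos:pos + 2], int(data[pos + 2:pos + 4])
        value = data[pos + 4:pos + 4 + length]
        if len(value) != length:
            raise ValueError(f"tag {tag}: truncated value")
        objects.append((tag, value))
        pos += 4 + length
    return objects


def parse_template(value: str) -> Tuple[str, Dict[str, str]]:
    """Return (namespace, {tag: value}) for a network-specific template."""
    objects = parse_tlv(value)
    if not objects or objects[0][0] != "00":
        raise ValueError("network-specific template must start with tag 00")
    return objects[0][1], dict(objects[1:])


# Hypothetical network "com.example.pay" with two invented data objects.
inner = encode("00", "com.example.pay") + encode("01", "ACCT12345") + encode("02", "BRANCH7")
root = encode("26", inner)

(tag, value), = parse_tlv(root)
namespace, fields = parse_template(value)
print(tag, namespace, fields)
```

The parser can extract tags 01 and 02, but what they mean is only known to whoever controls the namespace.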

Usually it does not make sense to speak of the data type of the value for a template, because the value of a template contains TLV encoded data objects which define the type of their value. The EMV specification, however, routinely provides a value type for templates. I can only assume that this type is meant as an upper boundary for the value types of the included data objects. For example, according to table 6.3 of (2), the value type for tags 02 to 51 is alphanumeric-special. The value of tags 26 to 51 is always a template, which contains proprietary data objects. The value type of those proprietary data objects can therefore never be string, but it can be numeric or alphanumeric-special.

The Structure of the QR code data

At the top level, the QR code data consists of a concatenated list of encoded data objects; some are primitive data objects, and some are templates. The only restriction is that the primitive data object with tag 00 must be the first object in the list and the primitive data object with tag 63 must be the last. The value of tag 00 has a length of 2, is numeric and denotes the version of the specification. The value of tag 63 has a length of 4, is alphanumeric-special (really only 0 to 9 and A to F, i.e., hexadecimal digits), and denotes a checksum.
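As an illustration of this envelope, the sketch below appends and verifies the checksum. The text above only says "checksum"; the concrete algorithm used here (CRC-16 with polynomial 0x1021 and initial value 0xFFFF, computed over everything up to and including the tag and length "6304") is my understanding of the EMV specification and should be treated as an assumption to verify against (2). The function names and the sample body are invented.

```python
def crc16_ccitt_false(data: bytes) -> int:
    """CRC-16 with polynomial 0x1021 and initial value 0xFFFF (assumed EMV QR variant)."""
    crc = 0xFFFF
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def append_crc(body: str) -> str:
    """Append tag 63 with a 4-character hexadecimal checksum to the body."""
    payload = body + "6304"  # assumption: tag and length of the CRC are covered
    return payload + f"{crc16_ccitt_false(payload.encode('utf-8')):04X}"


def check_envelope(payload: str) -> bool:
    """Check that tag 00 (length 02) comes first, tag 63 (length 04) last, and the CRC matches."""
    if not payload.startswith("0002") or payload[-8:-4] != "6304":
        return False
    expected = f"{crc16_ccitt_false(payload[:-4].encode('utf-8')):04X}"
    return payload[-4:].upper() == expected


# Invented body: version object plus one made-up primitive data object.
payload = append_crc("000201" "5802PH")
tampered = payload.replace("PH", "PX")
print(payload, check_envelope(payload), check_envelope(tampered))
```

This is a shortcut that only looks at the first and the last few characters. A complete check would walk all data objects, since a value somewhere in the middle could happen to end in "6304" followed by four characters.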

Ignoring some overlaps, the defined data objects in the specification can be grouped as follows (see Table 1):

Transaction-specific data,
Merchant-specific data,
Customer-specific data, and
Meta data.

Table 1. Data Object Groups
Group	Data Objects
Meta Data	Payload Format Indicator; Tip or Convenience Indicator; CRC; Additional Data / Additional Consumer Data Request
Transaction Data	Point of Initiation Method; Transaction Currency; Transaction Amount; Value of Convenience Fee - Fixed; Value of Convenience Fee - Percentage; Additional Data / Bill Number; Additional Data / Reference Label; Additional Data / Terminal Label; Additional Data / Purpose of Transaction; Additional Data / Merchant Channel
Merchant Data	Merchant Account Information; Merchant Category Code; Country Code; Merchant Name; Merchant City; Postal Code; Additional Data / Language Template; Additional Data / Store Label; Additional Data / Merchant Tax ID; Merchant Information—Language / Language Preference; Merchant Information—Language / Merchant Name; Merchant Information—Language / Merchant City
Customer Data	Additional Data / Mobile Number; Additional Data / Loyalty Number; Additional Data / Customer Label

Transaction Processing Concepts

Transaction Routing

The information that the sending QR provider needs to work out where to send the money is contained in the merchant account information (table 6.3 of (2)). The format is different depending on whether the receiving QR provider can be reached via one of the EMVCo member networks (Visa, Mastercard, JCB, American Express, China UnionPay, and Discover) or via another network.

The EMV QR standard does not provide any information about the data, the data format and the meaning of the data in the merchant account information. Therefore, a critical component that would make this standard truly open and interoperable is missing.

However, I have to assume that there will be at least two pieces of data to make the routing possible: an identifier which tells me which network or QR provider to connect to, and an account identifier for the receiving QR provider to credit the transaction amount to the correct merchant account.

The QR data may contain multiple different merchant account information data objects, so that the customer can choose which network to use. For example, the QR data may include merchant account information for the Visa and MasterCard networks as well as for a domestic payment network. Once the network has been chosen, the QR payment application of the customer can then still offer a choice of different funding sources such as Debit vs. Credit cards, different e-wallets and so forth.

Transaction Matching

I find it impossible to imagine a system in which there is no unique transaction identifier. The authors of the EMV QR standard do not seem to suffer from my lack of imagination, though. There is only one pre-defined data object called "reference label", which is in the additional data template, which in itself is optional. This reference label can even be defined by the customer and seems only useful for logging, receipts and statements.

The EMV QR code standard probably assumes that the Transaction Identifier is proprietary and will be provided by the respective network in one of the proprietary templates or data objects. From the QR code data I have analyzed so far, it seems likely that most networks will choose to include the transaction reference in the merchant account information. This is not ideal as it breaks the logic of the merchant account information data objects.

Once the transaction reference is available, it can be used to ensure that funds are not sent to the wrong account and QR codes are not used multiple times. It is also useful for exception handling, reconciliation, settlement and dispute resolution.

Customer provided data

The EMV QR code provides a mechanism for the merchant or the receiving QR service provider to ask the customer to include additional data in the message that will eventually be sent to the receiving QR provider. Usually this data will be entered by the customer on the mobile phone or added by the customer’s QR application.

Specifically, the receiving party can ask the customer to provide an address, a mobile number or an e-mail address. For this, the QR code data will contain an Additional Consumer Data Request data object in the Additional Data template. This data object will contain one, two or three characters: A if an address is requested, M for the mobile number and E for the email address.
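A wallet application could interpret this value roughly as follows. This is a sketch based only on the description above; the function name is invented, and rejecting repeated or unknown characters is my assumption about sensible behaviour rather than something taken from the specification.

```python
from typing import List

REQUEST_FLAGS = {"A": "address", "M": "mobile number", "E": "email address"}


def requested_consumer_data(value: str) -> List[str]:
    """Translate an Additional Consumer Data Request value into the requested items."""
    if not 1 <= len(value) <= 3:
        raise ValueError("expected one, two or three characters")
    if len(set(value)) != len(value):
        raise ValueError("a flag appears more than once")  # assumption: no repeats
    unknown = set(value) - set(REQUEST_FLAGS)
    if unknown:
        raise ValueError(f"unknown flags: {sorted(unknown)}")
    return [REQUEST_FLAGS[flag] for flag in value]


print(requested_consumer_data("ME"))
```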

Convenience Fee

The standard allows the merchant to include a tip or convenience fee as well as to ask the customer to add a tip. If the merchant includes a tip or convenience fee, it can be provided as a fixed amount or a percentage of the transaction amount.

In the case of the merchant providing the amount or percentage for the tip, it is already included in the QR code data. In the case of the customer providing the information, the amount or percentage would have to be included in the transaction message to the sending and receiving QR providers by some other means that is outside the QR standard. Processing of customer provided tips is therefore proprietary and unlikely to be implemented in interoperable systems. If dynamic EMV codes are used, the customer could be asked to enter or choose the tip on the acceptance device.

It is important, however, to ensure that the transaction amount data-object contains the amount to be paid by the customer without the additional tip or convenience fees.
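The arithmetic that follows from this rule is simple, as the sketch below shows: the transaction amount stays untouched and the fee is added on top, either as a fixed value or as a percentage. The function and parameter names are hypothetical, and rounding to two decimal places with half-up rounding is an assumption; the correct precision depends on the currency and on the rules of the respective network.

```python
from decimal import Decimal, ROUND_HALF_UP
from typing import Optional


def amount_due(amount: str,
               fixed_fee: Optional[str] = None,
               percentage: Optional[str] = None) -> Decimal:
    """Total to be paid: the transaction amount plus a merchant-defined fee."""
    if fixed_fee is not None and percentage is not None:
        raise ValueError("a fee is either fixed or a percentage, not both")
    base = Decimal(amount)  # transaction amount without tip or fee
    if fixed_fee is not None:
        fee = Decimal(fixed_fee)
    elif percentage is not None:
        fee = base * Decimal(percentage) / Decimal(100)
    else:
        fee = Decimal(0)
    # Assumption: two decimal places, rounded half up.
    return (base + fee).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


print(amount_due("250.00", fixed_fee="10.00"))
print(amount_due("250.00", percentage="2.5"))
```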

Extendability

The standard defines a mechanism for proprietary extensions. The extensions consist of templates that can be filled with data objects defined by a third party. As each party can define the meaning and format of the tags that are included in the template, there needs to be a mechanism to identify the party that controls format and meaning. This is what the reserved tag 00 is for. A TLV object with tag '00' must be the first in the template, and its value must contain a unique identifier, which determines the meaning and format of the data objects with tags 01 to 99.

The unique identifier can be an Application Identifier as used in Smart Cards, a UUID or a reversed domain name. So far, I have only encountered reversed domain names. Only identifiers that have been assigned by the respective name registry should be used, but in my experience this rule will probably not be followed, especially for the reversed domain name.
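Telling the three variants apart from the data alone is guesswork, which the following heuristic sketch tries to show. It assumes that an AID is written as 10 to 32 hexadecimal digits (5 to 16 bytes under ISO 7816-4) and that a UUID may appear with or without hyphens; both are assumptions about the textual form, and the function name and sample identifiers are invented.

```python
import re

HEX = re.compile(r"[0-9A-Fa-f]+")
HYPHENATED_UUID = re.compile(
    r"[0-9A-Fa-f]{8}-[0-9A-Fa-f]{4}-[0-9A-Fa-f]{4}-[0-9A-Fa-f]{4}-[0-9A-Fa-f]{12}"
)
DOMAIN = re.compile(r"[A-Za-z0-9-]+(\.[A-Za-z0-9-]+)+")


def classify_identifier(value: str) -> str:
    """Guess which kind of unique identifier a tag 00 value holds."""
    if HEX.fullmatch(value):
        if len(value) == 32:
            return "uuid-or-aid"  # 16 bytes: cannot be told apart from the data
        if 10 <= len(value) <= 30 and len(value) % 2 == 0:
            return "aid"
    if HYPHENATED_UUID.fullmatch(value):
        return "uuid"
    if DOMAIN.fullmatch(value):
        return "reverse-domain"
    return "unknown"


# All sample identifiers are invented.
for sample in ("F0010203040506", "com.example.pay",
               "550e8400-e29b-41d4-a716-446655440000",
               "550e8400e29b41d4a716446655440000"):
    print(sample, "->", classify_identifier(sample))
```

None of this says anything about whether the identifier was actually registered by, or belongs to, the party using it.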

Authentication

The EMV QR standard does not contain anything that would support authentication of the QR data. Anybody with a computer can create EMV QR codes that are indistinguishable from QR codes that have been created by genuine participants in the QR payment systems.

Data Integrity

Since there is no data authentication, the only possible check the receiving QR application can perform is to verify the checksum and the formatting of the data. Formatting checks could include:

Check only pre-defined tag numbers at the root and in templates that are defined by the standard.
Check whether the length component of the TLV structure is correct^[6]
Check whether the value of data objects only contains up to the maximum number of characters as defined in the standard
Check the correct data format in the value element of the TLV encoding
Check the correct internal data format where it is defined, e.g., the use of decimal point in amount
Check the position of the 00 tag as the first data object at the root and in templates.
Check whether one of the RFU tag numbers has been used.
Check whether multiple data objects with the same tag are included at the same level (root or within the value of a template).
Validate the checksum at the end of the data
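A few of these checks can be sketched as follows for the root level: well-formed tag and length fields, tag 00 first, tag 63 last, and no duplicate tags. The function names and sample payloads are invented, the checksum value in the samples is a placeholder that is not verified here, and checks that need the tables of the standard (maximum lengths, value formats, RFU tags) are left out.

```python
from typing import List, Tuple


def parse_tlv(data: str) -> List[Tuple[str, str]]:
    """Split one level of TLV data into (tag, value) pairs, raising on malformed input."""
    objects, pos = [], 0
    while pos < len(data):
        header = data[pos:pos + 4]
        if len(header) < 4 or not (header.isascii() and header.isdigit()):
            raise ValueError(f"bad tag/length header at offset {pos}: {header!r}")
        length = int(header[2:])
        value = data[pos + 4:pos + 4 + length]
        if len(value) != length:
            raise ValueError(f"tag {header[:2]}: value shorter than declared length")
        objects.append((header[:2], value))
        pos += 4 + length
    return objects


def check_root(payload: str) -> List[str]:
    """Return a list of formatting problems found at the root level."""
    try:
        tags = [tag for tag, _ in parse_tlv(payload)]
    except ValueError as exc:
        return [f"malformed TLV: {exc}"]
    problems = []
    if not tags or tags[0] != "00":
        problems.append("tag 00 is not the first data object")
    if not tags or tags[-1] != "63":
        problems.append("tag 63 is not the last data object")
    duplicates = sorted({tag for tag in tags if tags.count(tag) > 1})
    if duplicates:
        problems.append(f"duplicate tags: {', '.join(duplicates)}")
    return problems


# "ABCD" is a placeholder checksum; this sketch does not validate it.
print(check_root("000201" "5802PH" "6304ABCD"))
print(check_root("000301" "5802PH" "6304ABCD"))  # wrong length in the first object
```

The second call shows how a single wrong length derails everything after it: the parser swallows one character too many and then tries to read "802P" as tag and length of the next data object.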

I have no illusions that payment applications will actually do this and preserve the flexibility of the standard, limited as it is. Instead, as it happens with EMV, the "industry" will gravitate towards a substandard that puts a defined set of data objects in the QR code in a particular order. This will further complicate the introduction of international QR acceptance.

Issues

While the merchant account information template provides a means to offer multiple networks for the customer to choose from, the other proprietary templates have to be network-specific. For instance, in a QR code that contains merchant account information for two domestic e-wallets, there may be two "Additional Data Field Templates" as each wallet may define their own additional proprietary data. Of course, this may not matter as much, as the backend protocol messages will likely contain the entire QR code data, so that the receiving network can pick the data they need. The size limitation of the QR code data could be an issue though.
A tip can either be entered by the customer on their phone or included in the QR code as fixed amount or percentage by the merchant. When the customer enters the amount, it is up to the backend messaging standard to define how this information is conveyed to the backends. I can pretty much guarantee that the fact that the tip amount can be in two different places will lead to confusion and mistakes.
There is no guarantee in the standard that the globally unique identifier will indeed be unique. For instance, if the unique identifier is an AID, it must follow ISO 7816-4, but there is no mandate that this AID was actually registered by the responsible registration authority. The reverse domain name variant is especially susceptible to abuse as there is no mandate to own the domain name, and even if the domain name is owned, who will ensure that subdomains remain unique? The UUID variant will likely never be used as it is not intuitive and does not provide easy clues as to who the unique identifier actually belongs to.
The field "loyalty number" has to come from the customer, which means that the terminal has to read a loyalty card or request the number in some other way and generate the QR code with this value included. If the backend system generates the QR code, the terminal will have to send the loyalty number to the backend system when requesting a new QR code. This breaks the logic of the merchant-presented QR code payment where only merchant information flows in one direction.
The "Merchant Account Information" template only allows alphanumeric-special data, but the network-specific data can contain string values.
The values for tags 02 to 25 at the root level contain merchant account information for a fixed number of payment networks. But there is no information whether the values for these tags are templates or primitive data objects. Since the value data type for these tags is ans, we can only assume that the values cannot contain templates that in turn contain primitive data objects with values that are of type string. It is more likely that the sending participant is supposed to include the data in their transaction message to the respective network (Visa, for example) without interpreting it any further.
The data for fields with ID '01' to '08' in the Additional Data Field Template (ID "62") can be provided by the terminal or by the customer. There is one small technical issue and one fundamental problem associated with that.
- When the terminal wants the customer to enter the data, the field would have a length of 3 and a value of *** . It is not obvious to me why they wasted three characters for this. I would have said just add the tag ID and a length of zero, or a length of 1 with just one asterisk.
- The fundamental problem with the merchant-presented QR code specification is that the designers did not seem entirely clear what the data in the QR code is for. Is the data supposed to convey just enough data for the sending QR provider to know where to send the money? Should the data also include information about the customer, which needs to be transferred to the receiving QR provider? How much of the data has meaning for the sending QR provider to be useful for risk management? For instance, the sending QR participant does not need the "terminal label" in template 62 for a money transfer, and the receiving QR participant already has the information and does not need to send this information all the way through the fund transfer network. The same is true for the additional consumer data. The sending QR participant already has this information, and the receiving QR participant should not send personal consumer data through the fund transfer network; if they want it, they should ask the customer at the checkout for the info and process it in their own systems.

References

(1) EMVCo, “EMV - Integrated Circuit Card Specifications for Payment Systems, Book 4, Cardholder, Attendant, and Acquirer Interface Requirements,” EMVCo, LLC, Technical Standard, Nov. 2011. [Online]. Available: https://mvallim.github.io/emv-qrcode/docs/EMV_v4.3_Book_4_Other_Interfaces_20120607062305603.pdf.

(2) EMVCo, “EMV QR Code Specification for Payment Systems (EMV QRCPS) - Merchant-Presented Mode,” EMVCo, LLC, Technical Standard, Nov. 2020.

1. The real driver was the realtime account to account transfer of funds in the backend and the use of personal mobile devices which made mutual authentication, cardholder verification and the distinction between settlement and authorization mostly obsolete.

2. Data objects are TLV constructs where the value can either be a primitive type or a template. A template contains other data objects, primitive types do not.

3. EMVCo counts characters, not bytes.

4. ASN.1 provides efficient encoding mechanisms for the tag number, the type of the value, and the length part and so forth.

5. The other option would have been to use the UTF-8 encoding of the base character followed by the UTF-8 encoding of the combining diacritical mark. However, that would then raise the question whether to add two to the length for each character with a diacritical mark or just one.

6. This may prove difficult. An application would have problems deciding whether a character erroneously included in the value of a data object due to a wrong length should in fact be the first character of the tag of the next data object. Length errors will likely lead to follow-on errors such as incorrect tag numbers.