Decode European and Domestic Corona QR Codes using PHP

For some countries it has been a few months already, for others just a few weeks, but scanning your QR code to access a building or event became the norm before we knew it. I find it remarkable, to say the least, how fast everyone got used to it.

As with many new technologies, I'm always curious what makes them tick. The same goes for these QR codes: how are they encoded and what information is inside?

Different kinds of CovidCerts: Greenpass vs Domestic

Apart from the well-known European Greenpass, which has been standardised by the European Union, some countries have their own Domestic Greenpass. These domestic passes can differ per country in both format and contents. The only common denominator is that they're a QR code.

European Greenpass: EU Digital Covid Certificate

This kind of QR code was the first to become widely available. Its standardised format ensures that many countries know how to generate and decode it. Its main use is (international) travel, mainly by plane but not exclusively so.

Domestic Dutch Greenpass

The Netherlands has its own standard for generating a QR code, totally different from the European standard. The fact that it's built using Privacy by Design is a very pleasant surprise: as you can see below, the actual certificate contains very little information. Since September 25th, 2021, this Dutch QR code is required to enter restaurants, pubs, clubs, cultural events, cinemas, theatres and stadiums.

Other Greenpasses

Our ultimate goal is to support decoding all types of greenpasses out there. Since we have no samples to investigate for other countries, we're currently hitting a dead end. If you have any greenpass (sample) data from your country, please share it with us and we'll try to implement the decoding. Of course we'll keep your data safe and delete it as soon as we can. If you like PHP, please feel free to add a decoder to the GitHub repository.

Technical working: Let's see what's inside

The QR codes used are just regular QR codes, nothing proprietary here. Therefore any QR reader will happily read the code and display the resulting output: an alphanumeric string of data, a mess at first glance. How this string carries the actual information differs per country. All certificates seem to start with their own unique 3-character identifier, followed by a colon and the encoded data itself. The contents are retrieved by peeling back multiple layers of encoding, which differ per country. The exact methods are described per QR standard below:
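To illustrate that outermost layer, here is a minimal sketch that splits a scanned string into its identifier and payload. The helper name splitQrString() and the lookup table are our own illustration and are not taken from the repository. The sample string is the truncated example used further down, so its payload is not decodable.

```php
<?php
// Prefixes of the two formats discussed in this article.
const KNOWN_FORMATS = [
    'HC1' => 'European Digital Covid Certificate',
    'NL2' => 'Dutch domestic certificate',
];

/**
 * Hypothetical helper: split "PREFIX:payload" at the first colon.
 * The payload may itself contain colons, so only the first one counts.
 */
function splitQrString(string $qr): array
{
    $pos = strpos($qr, ':');
    if ($pos === false) {
        throw new InvalidArgumentException('No "prefix:" found in QR data');
    }
    $prefix = substr($qr, 0, $pos);

    return [
        'prefix'  => $prefix,
        'format'  => KNOWN_FORMATS[$prefix] ?? 'unknown',
        'payload' => substr($qr, $pos + 1),
    ];
}

$parts = splitQrString('HC1:0D15EA5E...');
echo $parts['prefix'], ' => ', $parts['format'], PHP_EOL;
```

Splitting on the first colon only matters because the colon is also a valid character inside Base45 data.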

European Greenpass

The data string inside the QR is always prefixed with HC1:, followed by the Base45-encoded data itself. Decoding that yields a zlib-compressed CBOR stream. This CBOR part contains headers, the actual information and a signature. Decoding the actual information results in a string of standardised JSON data.

QR-Code ==> 'HC1:0D15EA5E...' ==> Base45 ==> Zlib ==> CBOR ==> JSON data + signature
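The first two layers of this chain can be peeled off with plain PHP and the bundled zlib extension. Below is a minimal sketch, assuming standard Base45 as described in RFC 9285. The names base45Decode() and unwrapHc1() are hypothetical and not the repository's API. The CBOR step is left out because it needs a separate library.

```php
<?php
/**
 * Decode standard Base45 (RFC 9285): every 3 characters hold 2 bytes,
 * a trailing group of 2 characters holds 1 byte.
 */
function base45Decode(string $input): string
{
    $alphabet = '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:';
    $output = '';

    for ($i = 0, $len = strlen($input); $i < $len; $i += 3) {
        $chunk = substr($input, $i, 3);
        $size = strlen($chunk);
        if ($size === 1) {
            throw new InvalidArgumentException('Dangling Base45 character');
        }

        // Characters are little-endian digits: c + d*45 + e*45*45.
        $value = 0;
        for ($j = $size - 1; $j >= 0; $j--) {
            $digit = strpos($alphabet, $chunk[$j]);
            if ($digit === false) {
                throw new InvalidArgumentException('Invalid Base45 character');
            }
            $value = $value * 45 + $digit;
        }

        if ($value > ($size === 3 ? 0xFFFF : 0xFF)) {
            throw new InvalidArgumentException('Base45 group out of range');
        }
        $output .= $size === 3
            ? chr($value >> 8) . chr($value & 0xFF)
            : chr($value);
    }

    return $output;
}

/** Hypothetical helper: strip "HC1:", Base45-decode, then inflate. */
function unwrapHc1(string $qr): string
{
    if (strncmp($qr, 'HC1:', 4) !== 0) {
        throw new InvalidArgumentException('Not an HC1 certificate');
    }
    $cbor = zlib_decode(base45Decode(substr($qr, 4)));
    if ($cbor === false) {
        throw new RuntimeException('zlib decompression failed');
    }

    return $cbor; // raw CBOR bytes, still to be parsed by a CBOR library
}

echo base45Decode('BB8'), PHP_EOL; // prints "AB" (test vector from RFC 9285)
```

What unwrapHc1() returns is still binary: the headers, the payload and the signature only become readable after the CBOR layer has been parsed.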

Using the code from our GitHub repository, we can easily decode a sample QR and see what's inside:

Reading EHN DCC: European eHealth Network - Digital Covid Certificate
QR Code Issuer: AT
QR Code Expiry: Wednesday, 23-Jun-2021 16:29:57 CEST
QR Code Generated: Monday, 21-Jun-2021 16:29:57 CEST

Vaccination Group:
Dose Number: 1
Marketing Authorization Holder: ORG-100030215
vaccine or prophylaxis: 1119349007
ISO8601 complete date: Date of Vaccination: 2021-02-18
Country of Vaccination: AT
Unique Certificate Identifier: UVCI: URN:UVCI:01:AT:10807843F94AEE0EE5093FBC254BD813#B
vaccine medicinal product: EU/1/20/1528
Certificate Issuer: Ministry of Health, Austria
Total Series of Doses: 2
disease or agent targeted: 840539006
Surname(s), forename(s):
Standardised surname: MUSTERFRAU

As you can see, the data is quite elaborate. It contains a person's full name and birth date. The precise vaccination details are included as well, down to the supplier and the date. The signature is stored separately and is used to verify the certificate's origin.

Domestic Dutch Greenpass

Here, the data string inside the QR is always prefixed with NL2:, followed by a non-standard Base45 flavour, because its internal structure requires more space. The result is an ASN.1 stream containing a signed Idemix document. Unlike JSON, ASN.1 carries no field names, so the format assumes the reader uses the Idemix definitions to find out what every field contains.

QR-Code ==> 'NL2:F00D8A8E...' ==> NL-Base45 ==> ASN1 ==> Annotate using Idemix

Using the code from our GitHub repository, we can easily decode a sample QR and see what's inside:

Reading EHN DCC: European eHealth Network - Digital Covid Certificate
QR Code Issuer: VWS-CC-2
QR Code Version: 02
QR Code Valid from: Wednesday, 28-Jul-2021 12:00:00 CEST
QR Code Valid until: Thursday, 29-Jul-2021 13:00:00 CEST

- version: 02
- issuer: VWS-CC-2
- isSpecimen: 1
- isPaperProof: 1
- validFrom: 1627466400
- validForHours: 25
- firstNameInitial: B
- lastNameInitial: B
- birthDay: 31
- birthMonth: 7

The resulting data is very brief: the first and last names are reduced to their first letter, and only the day and month of birth are disclosed. There's no information about the exact vaccine; it just says for how long the QR is valid. The version and issuer fields tell the reader which certificates to use to verify the signature. Again, the signature verifies the certificate's origin.
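The "Valid from" and "Valid until" lines in the output above follow directly from the validFrom and validForHours fields. Here is a small sketch of that arithmetic, assuming validFrom is a Unix timestamp in seconds and that the dates are displayed in the Europe/Amsterdam timezone. The function name validityWindow() is our own.

```php
<?php
/**
 * Hypothetical helper: turn validFrom/validForHours into readable dates.
 */
function validityWindow(int $validFrom, int $validForHours, string $zone = 'Europe/Amsterdam'): array
{
    $tz = new DateTimeZone($zone);
    $from = (new DateTimeImmutable('@' . $validFrom))->setTimezone($tz);
    // Add the validity period in seconds to stay independent of DST rules.
    $until = $from->setTimestamp($validFrom + $validForHours * 3600);

    $format = 'l, d-M-Y H:i:s T';

    return [
        'from'  => $from->format($format),
        'until' => $until->format($format),
    ];
}

$window = validityWindow(1627466400, 25);
echo 'Valid from:  ', $window['from'], PHP_EOL;  // Wednesday, 28-Jul-2021 12:00:00 CEST
echo 'Valid until: ', $window['until'], PHP_EOL; // Thursday, 29-Jul-2021 13:00:00 CEST
```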

Certificate Signing: Why you shouldn't be able to generate your own certificates

Because we know the internal structure of the QR codes, we could in theory generate our own certificates quite easily. Too bad they won't actually validate, because there wouldn't be a valid signature inside. Here's a simplified view of how that works:

The Generating, Signing, Decrypting and Verification process

  1. Using a QR-generator app, your phone generates a certificate based on your information.
    The result is sent to signing servers, which depend on the state you're in.
  2. The signing servers generate a signature using the currently active private key.
    The result (optionally hashed) is sent back to the requesting device, in our case your phone.
  3. Your phone joins the resulting signature with the certificate data and encodes it into a QR image.
    The resulting QR image is stored on your device, to ensure the QR is available when there's no internet.
  4. Using a QR-scanner app on the receiving party's phone, your QR code is scanned and decoded.
    The signature is checked using the currently active public key.
  5. If the signature can't be verified using the public key, the QR-scanner app denies the certificate.
    Otherwise, the QR-scanner shows your green checkmark.

This process requires internet access for the signing part, but not for the verification part. Of course this is mainly to avoid issues when many people want to get access to a bigger event.
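The signing and checking steps can be tried out with PHP's OpenSSL extension. The sketch below is not how the real signing servers work: it assumes an ECDSA P-256 key generated on the spot and a made-up JSON payload, and the helper names signCertificate() and isGenuine() are hypothetical. It only shows why a self-made or altered certificate fails the check.

```php
<?php
/** Hypothetical helper: sign the certificate data with the private key. */
function signCertificate(string $payload, $privateKey): string
{
    if (!openssl_sign($payload, $signature, $privateKey, OPENSSL_ALGO_SHA256)) {
        throw new RuntimeException('Signing failed');
    }

    return $signature;
}

/** Hypothetical helper: check the signature with the public key only. */
function isGenuine(string $payload, string $signature, string $publicKeyPem): bool
{
    // openssl_verify() returns 1 (valid), 0 (invalid) or -1 (error).
    return openssl_verify($payload, $signature, $publicKeyPem, OPENSSL_ALGO_SHA256) === 1;
}

// Assumption: a throwaway key pair stands in for the issuer's keys.
$privateKey = openssl_pkey_new([
    'private_key_type' => OPENSSL_KEYTYPE_EC,
    'curve_name'       => 'prime256v1',
]);
if ($privateKey === false) {
    throw new RuntimeException('Could not generate a key pair');
}
$publicKeyPem = openssl_pkey_get_details($privateKey)['key'];

// Made-up payload, loosely based on the Dutch fields shown earlier.
$payload = json_encode(['firstNameInitial' => 'B', 'lastNameInitial' => 'B']);
$signature = signCertificate($payload, $privateKey);

var_dump(isGenuine($payload, $signature, $publicKeyPem));        // genuine certificate
var_dump(isGenuine($payload . ' ', $signature, $publicKeyPem));  // altered certificate
```

Note that the scanner side only ever needs $publicKeyPem, which is why verification can work offline.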

A possible Achilles' heel

The signing process uses asymmetric cryptography, where both a private key and a public key exist. The public key is derived from the private key and stored in a certificate with some metadata. This certificate has to be signed by one or more parent certificates, forming a certificate chain that's trusted by the scanner app. A list of trusted certificates is hardcoded in the scanner apps, to ensure only known certificates are used.

The private key is able to create signatures, but the public key is only able to check them. Therefore, the signature can be generated only by the holder of the private key and can be checked by anyone with access to the public key. These public keys can simply be retrieved, even if they differ per state.

Here's the problem with that: the private key is simply a string of bytes stored in a text file. Once that key gets leaked (by someone copying the key file or by someone hacking into a state's servers), anybody will be able to generate valid certificates. Of course that only lasts until the leak gets noticed and the associated certificate is revoked; from then on the private key is worthless.
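The effect of revoking a key can be sketched with a simple trusted-key list. This is only an illustration under our own assumptions: the array layout, the key identifier 'issuer-key-1' and the function names are hypothetical, and the real scanner apps may organise their trusted certificates differently.

```php
<?php
/** Hypothetical helper: accept a signature only if its key is still trusted. */
function verifyAgainstTrustList(string $payload, string $signature, string $keyId, array $trustList): bool
{
    if (!isset($trustList[$keyId])) {
        return false; // unknown or revoked key
    }

    return openssl_verify($payload, $signature, $trustList[$keyId], OPENSSL_ALGO_SHA256) === 1;
}

// Assumption: a throwaway ECDSA key pair plays the role of the (leaked) issuer key.
$leakedKey = openssl_pkey_new([
    'private_key_type' => OPENSSL_KEYTYPE_EC,
    'curve_name'       => 'prime256v1',
]);
if ($leakedKey === false) {
    throw new RuntimeException('Could not generate a key pair');
}

$trustList = [
    'issuer-key-1' => openssl_pkey_get_details($leakedKey)['key'],
];

// Whoever holds the leaked private key can sign anything they like.
$forged = 'made-up certificate data';
openssl_sign($forged, $forgedSignature, $leakedKey, OPENSSL_ALGO_SHA256);

$beforeRevocation = verifyAgainstTrustList($forged, $forgedSignature, 'issuer-key-1', $trustList);

// Revocation: the key disappears from the list the scanner trusts.
unset($trustList['issuer-key-1']);
$afterRevocation = verifyAgainstTrustList($forged, $forgedSignature, 'issuer-key-1', $trustList);

var_dump($beforeRevocation, $afterRevocation);
```

The catch is that revocation also invalidates every honest certificate signed with that key, and scanners working offline only notice once their trusted list is updated.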

Assuming a state's infrastructure is awesomely secured, this will probably never happen.

Conclusion & Privacy Concerns

As much as I was happily surprised by the minimal amount of data in the Dutch certificate, I was shocked by the vast amount of data in the European pass. This difference probably comes from the combination that 1) the scanning part of the process has to work offline and 2) every country has its own vaccination requirements. Where one country might allow access if you're both recovered and jabbed once, another country might require you to be jabbed twice anyhow.

Nonetheless, the amount of information in the European greenpass must be reduced by a lot! Since the cryptographic signing ensures the certificate can be trusted, a minimal amount of information — like the Dutch QR — must be sufficient. Adding to that, even though the Dutch QR contains very little information, it remains PII. In most cases, this QR is used to access small groups, restaurants and such and could therefore be quite uniquely traced back to one person.

What bothers me the most is that everybody is able to install these scanner apps on their devices. Up until recently Dutch people were installing the Danish app, which doesn't bode well for the technical knowledge of these people, even though they are now empowered to deny people access based on what their phone says. Who says their phones aren't rigged? Any hacked phone will be able to store and/or forward the scanned information to anybody. It's not unthinkable; we know there's money to be made in big data.

Function Creep: Vaccine information used to generate Greenpasses

For obvious reasons our health institutions were creating a database on who received what vaccine. Now, the same information is used to create the greenpasses. This is a perfect example of function creep: data from system A is used to feed system B. The data was never intended for this use, yet no law, not even the GDPR, stops this from happening.

Who says this (d)evolution won't happen to the use cases too? It could be used to deny access to any place, at any time, for any reason yet to be specified. Because the QR codes are dynamic by design, there's ample room for expansion. They could fit a few dozen booster shots, different diseases or even totally different information.

Also, we don't know what kind of databases states are building with the information we're sending to get our certificates signed. Even though the code of both the generator and the scanner apps has been made public, the actual code for the signing servers has not. Assuming states are collecting any data, it's simply a matter of time before this data leaks. So let's hope they're not storing anything.

What's next?

For now, we want to reverse engineer more domestic certificates. To grow the list we need your help, so please feel free to send us your domestic (test) certificate. Meanwhile, we'll have a look at the generator and scanner apps to try and understand the data they send and receive. It's said the scanner app doesn't forward any information and we'd like to know for sure.

Finally, we think it's important to keep a close eye on this technology.

We wouldn't want this technology to turn into medical apartheid, would we?

Sources