1 00:00:00,000 --> 00:00:19,480 *36c3 preroll music* 2 00:00:19,480 --> 00:00:25,090 Herald: The next talk is on how to break PDFs, breaking the encryption and the 3 00:00:25,090 --> 00:00:32,910 signatures, by Fabian Ising and Vladislav Mladenov. Their talk was accepted at CCS 4 00:00:32,910 --> 00:00:37,750 this year in London and they had that in November. It comes from research that 5 00:00:37,750 --> 00:00:43,660 basically produced two different kinds of papers and it has been... people worldwide 6 00:00:43,660 --> 00:00:47,540 have been interested in what has been going on. Please give them a great round 7 00:00:47,540 --> 00:00:51,758 of applause and welcome them to the stage. 8 00:00:51,758 --> 00:00:59,150 *Applause* 9 00:00:59,150 --> 00:01:11,590 Vladi: So can you hear me? Yeah. Perfect. OK. Now you can see the slides. My name is 10 00:01:11,590 --> 00:01:15,220 Vladislav Mladenov, or just Vladi if you have some questions for me, and this is 11 00:01:15,220 --> 00:01:20,670 Fabian. And we are allowed today to talk about how to break PDF security or, more 12 00:01:20,670 --> 00:01:28,230 specifically, about how to break the cryptographic operations in PDF files. We 13 00:01:28,230 --> 00:01:36,590 are a large team from the University of Bochum, Münster and Hackmanit GmbH. So as 14 00:01:36,590 --> 00:01:46,159 I mentioned: We will talk about cryptography and PDF files. Does it work? 15 00:01:46,159 --> 00:01:57,720 Fabian: All right. OK. Let's try that again. Okay. 16 00:01:57,720 --> 00:02:02,070 Vladi: Perfect. This talk will consist of two parts. The first part is about 17 00:02:02,070 --> 00:02:07,829 digitally signed PDF files and how can we recognize such files? If we open them we 18 00:02:07,829 --> 00:02:16,230 see the information that the file was signed and all verification 19 00:02:16,230 --> 00:02:20,690 procedures were valid.
And more information regarding the signature 20 00:02:20,690 --> 00:02:27,220 validation panel and information about who signed this file. This is the first part 21 00:02:27,220 --> 00:02:35,660 of the talk and I will present this topic. And the second part is regarding 22 00:02:35,660 --> 00:02:41,280 encrypted PDF files and how can we recognize such files? If you try to open such 23 00:02:41,280 --> 00:02:47,080 files, the first thing you see is the password prompt. And after entering the 24 00:02:47,080 --> 00:02:51,800 correct password, the file is decrypted and you can read the content within this 25 00:02:51,800 --> 00:02:57,720 file. If you open it with Adobe, additional information regarding whether this 26 00:02:57,720 --> 00:03:04,420 file is secured or not is displayed as well. And this is the second part of 27 00:03:04,420 --> 00:03:11,650 our talk, and Fabian will talk about how we can break the PDF encryption. So before we 28 00:03:11,650 --> 00:03:19,450 start with the attacks on signatures or encryption, we first need some basics. And 29 00:03:19,450 --> 00:03:22,700 after six slides, you will be experts regarding PDF files and you will 30 00:03:22,700 --> 00:03:28,820 understand everything about it. But maybe it's a little bit boring, so be patient: 31 00:03:28,820 --> 00:03:34,830 there are only 6 slides. So the first is quite easy. PDF files are... the first 32 00:03:34,830 --> 00:03:42,250 specification was in 1993 and almost from the beginning PDF cryptographic operations 33 00:03:42,250 --> 00:03:48,920 like signatures and encryption were already there. The latest version is PDF 2.0 and it 34 00:03:48,920 --> 00:03:57,610 was released in 2017. And according to Adobe 1.6 billion files are on the web and 35 00:03:57,610 --> 00:04:06,140 perhaps more are exchanged beyond the web. So basically PDF files are everywhere.
And 36 00:04:06,140 --> 00:04:11,790 that's the reason why we consider this topic and tried to find or to analyze the 37 00:04:11,790 --> 00:04:19,730 security of the features. If we have some very simple file and we open it with Adobe 38 00:04:19,730 --> 00:04:25,390 Reader, the first thing we see is, of course, the content. "Hello, world!" in 39 00:04:25,390 --> 00:04:32,060 this case, and additional information regarding the focused page and how many 40 00:04:32,060 --> 00:04:39,630 pages this document has. But what would happen if we don't use a PDF viewer and 41 00:04:39,630 --> 00:04:48,210 just use some text editor? We use the Notepad++ to open and later manipulate the 42 00:04:48,210 --> 00:04:56,400 files. So I will zoom this thing... this file. And the first thing we see is that 43 00:04:56,400 --> 00:05:04,500 we can read it. Perhaps it's quite, quite funny. And but we can still extract some 44 00:05:04,500 --> 00:05:10,910 information of this file. For example, some information regarding the pages. And 45 00:05:10,910 --> 00:05:19,740 here you can see the information that the PDF file consists of one page. But more 46 00:05:19,740 --> 00:05:27,350 interesting is that we can see the content of the file itself. So the lessons 47 00:05:27,350 --> 00:05:34,960 we learned is that we can use a simple text editor to view and edit PDF files. 48 00:05:34,960 --> 00:05:43,900 And for our attacks, we used only this text editor. So let's go to the details. 49 00:05:43,900 --> 00:05:51,560 How PDF files are structured and how they are processed. PDF files consist of 4 50 00:05:51,560 --> 00:05:59,170 parts: header, body and body is the most important part of the PDF files. The body 51 00:05:59,170 --> 00:06:03,820 contains the entire information presented to the user. And 2 other sections: Xref 52 00:06:03,820 --> 00:06:11,490 section and trailer. 
A very important thing about processing PDF files is that 53 00:06:11,490 --> 00:06:18,020 they're processed not from the top to the bottom, but from the bottom to the top. So 54 00:06:18,020 --> 00:06:23,700 the first thing that the PDF viewer analyzes or processes is the trailer. So 55 00:06:23,700 --> 00:06:28,981 let's start doing that. What information is stored in this trailer? Basically, there 56 00:06:28,981 --> 00:06:35,540 are two very important pieces of information. On the one hand, there is the information: 57 00:06:35,540 --> 00:06:41,410 what is the root element of this PDF? So which is the first object which will be 58 00:06:41,410 --> 00:06:47,860 processed? And the second important piece of information is where the Xref section 59 00:06:47,860 --> 00:06:54,000 starts. It's just a byte offset pointing to the position of the Xref section within 60 00:06:54,000 --> 00:07:00,201 the PDF file. So this pointer, as mentioned before, points to the Xref 61 00:07:00,201 --> 00:07:05,710 section. But what is the Xref section about? The Xref section is a catalog 62 00:07:05,710 --> 00:07:11,180 pointing to or holding the information where the objects defined in the body are 63 00:07:11,180 --> 00:07:18,741 contained, that is, the byte positions of these objects. So how can we read this weird Xref 64 00:07:18,741 --> 00:07:25,540 section? The first information we extract is that the first object, which is defined 65 00:07:25,540 --> 00:07:34,610 here, is the object with ID 0 and we have 5 further elements or objects which are 66 00:07:34,610 --> 00:07:41,090 defined. So the first object is here. The first entry is the byte position within 67 00:07:41,090 --> 00:07:46,610 the file. The second is its generation number. And the last character indicates whether 68 00:07:46,610 --> 00:07:53,200 this object is in use or not.
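The way such an Xref section is read can be sketched in a few lines of Python. This is only an illustration, not how any particular viewer does it: the sample table below is hypothetical (the offsets 58 and 115 are invented for the example), and `parse_xref` is a helper name made up here.

```python
# Hypothetical classic Xref section for a file with objects 0..4.
# The offsets 58 and 115 are invented for illustration only.
XREF = b"""xref
0 5
0000000000 65535 f
0000000009 00000 n
0000000058 00000 n
0000000115 00000 n
0000000184 00000 n
"""


def parse_xref(section: bytes) -> dict[int, tuple[int, int, bool]]:
    """Map object ID -> (byte offset, generation number, in_use)."""
    lines = [ln.strip() for ln in section.decode("ascii").splitlines() if ln.strip()]
    if lines[0] != "xref":
        raise ValueError("not a classic Xref section")
    table = {}
    i = 1
    while i < len(lines):
        # Subsection header: ID of the first object and number of entries.
        first_id, count = map(int, lines[i].split())
        for n in range(count):
            offset, generation, flag = lines[i + 1 + n].split()
            # 'n' marks an object in use, 'f' marks a free (unused) entry.
            table[first_id + n] = (int(offset), int(generation), flag == "n")
        i += 1 + count
    return table


table = parse_xref(XREF)
print(table[4])  # offset, generation and in-use flag of object 4
```

The object ID is never written in an entry; it follows from counting entries starting at the first ID in the subsection header.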
So reading it, reading this Xref section, we 69 00:07:53,200 --> 00:08:00,590 extract the information that the object with ID 0 is at byte position 0 and is not 70 00:08:00,590 --> 00:08:08,650 in use. So the object with ID 1 is at the position 9 and so on and so forth. So for 71 00:08:08,650 --> 00:08:18,370 the object with ID 4 and the object number comes from counting it: 0 1, 2, 3 and 4. 72 00:08:18,370 --> 00:08:29,430 So the object with ID 4 can be found at the offset 184 and it's in use. In other 73 00:08:29,430 --> 00:08:35,449 words, the PDF viewer knows where each object will be found and can properly 74 00:08:35,449 --> 00:08:42,329 display it and process it. Now we come to the most important part: the body, and I 75 00:08:42,329 --> 00:08:48,810 mentioned it that in the body the entire content which is presented to the user is 76 00:08:48,810 --> 00:08:58,220 contained. So let's see. Object 4 0 is this one and as you can see, it contains 77 00:08:58,220 --> 00:09:04,870 the word "Hello World". The other objects are a reference, too. So each pointer 78 00:09:04,870 --> 00:09:10,119 points exactly to the starting position of each of the objects. And how can we read 79 00:09:10,119 --> 00:09:15,910 this object? You see, we have an object starting with the ID number, then the 80 00:09:15,910 --> 00:09:24,999 generation number and the word "obj". So you now know where the object starts 81 00:09:24,999 --> 00:09:32,259 and when it ends. Now how can we process this body? As I mentioned before in the 82 00:09:32,259 --> 00:09:40,970 trailer, there was a reference regarding the root element and this element was with 83 00:09:40,970 --> 00:09:48,769 ID 1 and generation number 0. So, we now we start reading the document here and we 84 00:09:48,769 --> 00:09:55,910 have a catalog and a reference to some pages. Pages is just a description of all 85 00:09:55,910 --> 00:10:02,889 the pages contained within the file. 
And what we can see here is that we have this 86 00:10:02,889 --> 00:10:09,779 number, Count 1, so we have only one page, and a reference to the page object which 87 00:10:09,779 --> 00:10:15,170 contains the entire information and description of the page. If we have 88 00:10:15,170 --> 00:10:22,230 multiple pages, then we will have multiple elements here. Then we have one page. 89 00:10:22,230 --> 00:10:29,850 And here we have the contents, which is a reference to the string we already saw. 90 00:10:29,850 --> 00:10:35,139 Perfect. If you understand this then you know everything or almost everything about 91 00:10:35,139 --> 00:10:39,360 PDF files. Now you can just use your editor and open such files and analyze 92 00:10:39,360 --> 00:10:50,310 them. Then we need one feature... I forgot the last part. The simplest one. The 93 00:10:50,310 --> 00:10:56,129 header. It is just one line stating which version is used. For example, in our 94 00:10:56,129 --> 00:11:04,779 case, 1.4. For the latest version, 2.0 would be stated here. Now, we need this 95 00:11:04,779 --> 00:11:13,699 one feature called "Incremental Update". And I call this feature - do you know this 96 00:11:13,699 --> 00:11:19,629 feature of highlighting something in the PDF file or putting in some sticky notes? 97 00:11:19,629 --> 00:11:24,119 Technically, it's called "incremental update." I just call it reviewing master 98 00:11:24,119 --> 00:11:30,680 and bachelor theses of my students because this is exactly the procedure I follow. I 99 00:11:30,680 --> 00:11:38,100 just read the text and highlight something and store the information I added to it. 100 00:11:38,100 --> 00:11:46,970 Technically, by putting in such a sticky note, this additional information is appended 101 00:11:46,970 --> 00:11:53,160 after the end of the file.
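At the byte level, appending such an update could look roughly like the following Python sketch. It is heavily simplified and rests on assumptions: the file is assembled by hypothetical helpers (`serialize`, `xref_and_trailer`, `build_base`, `append_update`, `resolve`), object 4 is a bare string rather than a real content stream, the free entry for object 0 is left out, and the result is not guaranteed to render in a viewer. What it does show is that the original bytes stay untouched and that a reader starting from the last Xref section sees the newest definition of an object first.

```python
import re


def serialize(objs: dict[int, bytes], start: int) -> tuple[bytes, dict[int, int]]:
    """Serialize objects starting at byte position `start`; return bytes and offsets."""
    out, offsets = b"", {}
    for oid, body in objs.items():
        offsets[oid] = start + len(out)
        out += b"%d 0 obj\n%s\nendobj\n" % (oid, body)
    return out, offsets


def xref_and_trailer(offsets: dict[int, int], xref_pos: int, size: int, prev=None) -> bytes:
    """Emit one Xref subsection per object, then the trailer and startxref."""
    out = b"xref\n"
    for oid, off in sorted(offsets.items()):
        out += b"%d 1\n%010d 00000 n \n" % (oid, off)
    prev_entry = b" /Prev %d" % prev if prev is not None else b""
    out += b"trailer\n<< /Size %d /Root 1 0 R%s >>\n" % (size, prev_entry)
    out += b"startxref\n%d\n%%%%EOF\n" % xref_pos
    return out


def build_base() -> bytes:
    """Assemble a tiny, simplified one-page document."""
    header = b"%PDF-1.4\n"
    body, offsets = serialize({
        1: b"<< /Type /Catalog /Pages 2 0 R >>",
        2: b"<< /Type /Pages /Kids [3 0 R] /Count 1 >>",
        3: b"<< /Type /Page /Parent 2 0 R /Contents 4 0 R >>",
        4: b"(Hello World)",  # simplified: a real page uses a content stream
    }, len(header))
    return header + body + xref_and_trailer(offsets, len(header) + len(body), 5)


def append_update(pdf: bytes, objs: dict[int, bytes], size: int = 5) -> bytes:
    """Append new object definitions plus a new Xref section and trailer."""
    prev = int(re.findall(rb"startxref\s+(\d+)", pdf)[-1])
    body, offsets = serialize(objs, len(pdf))
    return pdf + body + xref_and_trailer(offsets, len(pdf) + len(body), size, prev)


def resolve(pdf: bytes, oid: int) -> bytes:
    """Find an object by walking the Xref sections, newest first, via /Prev."""
    pos = int(re.findall(rb"startxref\s+(\d+)", pdf)[-1])
    while pos is not None:
        section = pdf[pos:pdf.index(b"startxref", pos)]
        table_part, trailer_part = section.split(b"trailer", 1)
        lines = table_part.split(b"\n")[1:]  # skip the 'xref' keyword
        i = 0
        while i < len(lines) and lines[i].strip():
            first, count = map(int, lines[i].split())
            for n in range(count):
                if first + n == oid:
                    off = int(lines[i + 1 + n].split()[0])
                    return pdf[off:pdf.index(b"endobj", off)]
            i += 1 + count
        m = re.search(rb"/Prev (\d+)", trailer_part)
        pos = int(m.group(1)) if m else None
    raise KeyError(oid)


base = build_base()
updated = append_update(base, {4: b"(Goodbye World)"})
print(resolve(base, 4))
print(resolve(updated, 4))
```

Objects that the update does not mention, such as the catalog, are still found through the `/Prev` link to the older Xref section.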
So we have a body update which contains exactly the 102 00:11:53,160 --> 00:12:01,369 information additionally of the new objects and of course, new Xref section 103 00:12:01,369 --> 00:12:15,610 and a new trailer pointing to this new object. Okay, we are done. Considering 104 00:12:15,610 --> 00:12:23,860 incremental update, we saw that it is used mainly for sticky notes or highlighting. 105 00:12:23,860 --> 00:12:29,679 But we observed something which is very important because an incremental update we 106 00:12:29,679 --> 00:12:36,930 can redefine existing objects, for example, we can redefine the object with 107 00:12:36,930 --> 00:12:45,730 ID 4 and put new content. So we replace in this manner the word "Hello World" with 108 00:12:45,730 --> 00:12:51,699 another sentence and of course the Xref section and the trailer point to this new 109 00:12:51,699 --> 00:13:00,100 object. So this is very important. With incremental update we are not stuck to 110 00:13:00,100 --> 00:13:06,220 only adding some highlighting or notes. We can redefine already existing content and 111 00:13:06,220 --> 00:13:14,399 perhaps we need this for the attacks we will present. So let's talk about PDF 112 00:13:14,399 --> 00:13:23,339 signatures. First, we need a difference between electronic signature and digital 113 00:13:23,339 --> 00:13:28,699 signature. Electronic signature. From a technical point of view, it's just an 114 00:13:28,699 --> 00:13:36,369 image. I just wrote it on my PC and put it into the file. There is no cryptographic 115 00:13:36,369 --> 00:13:40,890 protection. It could be me lying on the beach doing something. From cryptographic 116 00:13:40,890 --> 00:13:45,509 point of view is the same. It does not provide any security, any cryptographic 117 00:13:45,509 --> 00:13:52,739 security. 
What we will talk about here is about digitally signed files, so if you 118 00:13:52,739 --> 00:14:00,290 open such files, you have the additional information regarding the validation about 119 00:14:00,290 --> 00:14:08,309 the signatures and who signed this PDF file. So as I mentioned before, this talk 120 00:14:08,309 --> 00:14:16,689 will concentrate only on these digitally signed PDF files. How? What kind of 121 00:14:16,689 --> 00:14:22,879 process is behind digitally signing PDF files? Imagine we have this abstract 122 00:14:22,879 --> 00:14:28,639 overview of a PDF document. We have the header, body, Xref section and trailer. We 123 00:14:28,639 --> 00:14:35,480 want to sign it. What happens is that we take this PDF file and via incremental 124 00:14:35,480 --> 00:14:41,899 update we put additional information regarding that. There is a new catalog and 125 00:14:41,899 --> 00:14:46,379 more important, a new signature object containing the signature value and 126 00:14:46,379 --> 00:14:52,100 information about who signed this PDF file. And of course, there is an Xref 127 00:14:52,100 --> 00:14:58,970 section and trailer. And relevant for you: The entire file is now protected by the 128 00:14:58,970 --> 00:15:06,860 PDF signature. So manipulations within this area should not be possible, right? 129 00:15:06,860 --> 00:15:15,879 Yeah, let's talk about this: why it's not possible and how can we break it? First, 130 00:15:15,879 --> 00:15:21,370 we need an attack scenario. What we want to achieve as an attacker. We assumed in 131 00:15:21,370 --> 00:15:27,839 our research that the attacker possesses this signed PDF file. This could be an old 132 00:15:27,839 --> 00:15:35,989 contract, receipt or, in our case, a bill from Amazon. And if we open this file, the 133 00:15:35,989 --> 00:15:41,440 signature is valid. So everything is green. No warnings are thrown and 134 00:15:41,440 --> 00:15:48,329 everything is fine. 
What we tried to do is to take this file, manipulate it somehow 135 00:15:48,329 --> 00:15:56,319 and then send it to the victim. And now the victim expects to receive a digitally 136 00:15:56,319 --> 00:16:01,779 signed PDF file, so just stripping the digital signature is a very trivial 137 00:16:01,779 --> 00:16:07,600 scenario and we did not consider it because it's trivial. We considered that 138 00:16:07,600 --> 00:16:13,240 the victim expects to see that there is a signature and it is valid. So no warnings 139 00:16:13,240 --> 00:16:20,420 are thrown and the entire left side is exactly the same as in the normal 140 00:16:20,420 --> 00:16:28,109 behavior. But on the other side, the content was exchanged, so we manipulated 141 00:16:28,109 --> 00:16:33,790 the receipt and exchanged it with other content. The question is now: how can we 142 00:16:33,790 --> 00:16:41,079 do it on a technical level? And we came up with three attacks: incremental saving 143 00:16:41,079 --> 00:16:45,929 attacks, signature wrapping and universal signature forgery. And I will now 144 00:16:45,929 --> 00:16:51,209 introduce the techniques and how these attacks work. The first attack is 145 00:16:51,209 --> 00:16:56,839 the incremental saving attack. So I mentioned before that via incremental 146 00:16:56,839 --> 00:17:06,439 saving or via incremental updates, we can add and remove and even redefine already 147 00:17:06,439 --> 00:17:14,650 existing objects and the signature still stays valid. Why is this happening? 148 00:17:14,650 --> 00:17:21,110 Consider now again our case. We have some header, body, Xref table and trailer and 149 00:17:21,110 --> 00:17:27,559 the file is now signed and the signature protects only the signed area. So what 150 00:17:27,559 --> 00:17:32,600 would happen if I put a sticky note or some highlighting? An incremental update 151 00:17:32,600 --> 00:17:39,169 happens.
If I open this file, usually this happens: We have the information that this 152 00:17:39,169 --> 00:17:45,799 signature is valid, when it was signed and so on and so forth. So our first idea was 153 00:17:45,799 --> 00:17:53,250 to just put new body updates, redefine already existing content and with a Xref 154 00:17:53,250 --> 00:17:59,419 table and trailer we point to the new content. This is quite trivial because 155 00:17:59,419 --> 00:18:04,820 it's a legitimate feature in PDF files, so we didn't expect to be quite successful 156 00:18:04,820 --> 00:18:11,760 and we were not so successful. But the first idea: we applied this attack, we 157 00:18:11,760 --> 00:18:22,080 opened it and we got this message. So it's kind of a weird message because an 158 00:18:22,080 --> 00:18:27,970 experienced user sees valid, but the document has been updated and you should 159 00:18:27,970 --> 00:18:33,580 know what does this exactly mean. But we did not consider this attack as successful 160 00:18:33,580 --> 00:18:41,110 because the warning is not the same or the status of the signature validation is not 161 00:18:41,110 --> 00:18:50,909 the same. So what we did is to evaluate this first against this trivial case, 162 00:18:50,909 --> 00:18:56,860 against older viewers we have, and Libre office, for example, was vulnerable 163 00:18:56,860 --> 00:19:01,769 against this trivial attack. This was the only viewer which was vulnerable against 164 00:19:01,769 --> 00:19:07,440 this trivial variation. But then we asked ourselves: Okay, the other viewers are 165 00:19:07,440 --> 00:19:14,250 quite secure. But how do they detect these incremental updates? And from developer 166 00:19:14,250 --> 00:19:22,410 point of view, the laziest thing we can do is just to check if another Xref table and 167 00:19:22,410 --> 00:19:28,330 trailer were added after the signature was applied. So we just put our body updates 168 00:19:28,330 --> 00:19:37,450 but just deleted the other two parts. 
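Why such a lazy check can be fooled is easy to see in a small Python sketch. Both functions are hypothetical illustrations, not code from any viewer: `lazy_update_check` mimics the "is there another Xref table or trailer after the signed bytes?" idea, and `covers_whole_file` is one possible stricter rule, assumed here for contrast.

```python
def lazy_update_check(pdf: bytes, signed_end: int) -> bool:
    """Hypothetical lazy detector: report an update only if an Xref table
    or trailer keyword shows up after the signed bytes."""
    tail = pdf[signed_end:]
    return b"xref" in tail or b"trailer" in tail


def covers_whole_file(pdf: bytes, signed_end: int) -> bool:
    """Assumed stricter rule: nothing but whitespace may follow the signed bytes."""
    return pdf[signed_end:].strip() == b""


# Placeholder for a signed document; the content is irrelevant for the sketch.
signed = b"%PDF-1.4\n...signed content...\n%%EOF\n"

# Body update only: the new Xref table and trailer are left out on purpose.
body_only_update = b"4 0 obj\n(Manipulated content)\nendobj\n"
manipulated = signed + body_only_update

print(lazy_update_check(manipulated, len(signed)))   # lazy check sees no update
print(covers_whole_file(manipulated, len(signed)))   # stricter rule notices extra bytes
```

The lazy check only becomes a problem together with an error-tolerant parser that still picks up the dangling body update.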
This is not a standard compliant PDF file. It's 169 00:19:37,450 --> 00:19:44,789 broken. But our hope was that the PDF viewer fixes this kind of stuff for us and 170 00:19:44,789 --> 00:19:51,210 that these viewers are error-tolerant. And we were quite successful because the 171 00:19:51,210 --> 00:19:56,320 verification logic just checked: Is there an Xref table and trailer after the 172 00:19:56,320 --> 00:20:01,580 signature was applied? No? Okay. Everything's fine. The signature is valid. 173 00:20:01,580 --> 00:20:05,450 No warning was thrown. But then the application logic saw that incremental 174 00:20:05,450 --> 00:20:13,580 updates were applied and fixed this for us and processed these body updates and no 175 00:20:13,580 --> 00:20:21,159 warning was thrown. Some of the viewers required to have a trailer. I don't know 176 00:20:21,159 --> 00:20:25,350 why - it was a Black box testing. So we just removed the Xref table, but the 177 00:20:25,350 --> 00:20:32,030 trailer was there and we were able to break further PDF viewers. The most 178 00:20:32,030 --> 00:20:38,490 complex variation of the attack was the following: We had the PDF viewers checked 179 00:20:38,490 --> 00:20:47,330 if every incremental update contains a signature object. But they did not check 180 00:20:47,330 --> 00:20:53,200 if this signature is covered by the incremental update. So we just copy-pasted 181 00:20:53,200 --> 00:21:01,290 the signature which was provided here and we just forced the PDF viewer to validate 182 00:21:01,290 --> 00:21:10,100 this signed content twice - and still our body updates were processed and for 183 00:21:10,100 --> 00:21:18,669 example, Foxit or Master PDF were vulnerable against this type of attack. So 184 00:21:18,669 --> 00:21:24,909 the evaluation of our attack: We considered as part of our evaluation 22 185 00:21:24,909 --> 00:21:31,050 different viewers - among others, Adobe with different versions, Foxit, and so on. 
186 00:21:31,050 --> 00:21:41,140 And as you can see 11 of 22 were vulnerable against incremental saving. So 187 00:21:41,140 --> 00:21:47,160 50 percent, and we were quite surprised because we saw that the developers saw 188 00:21:47,160 --> 00:21:51,639 that incremental updates could be dangerous regarding the signature 189 00:21:51,639 --> 00:22:01,070 validation. But we were still able to bypass their considerations. We had - a 190 00:22:01,070 --> 00:22:07,769 full signature bypass means that there is no possibility for the victim to detect 191 00:22:07,769 --> 00:22:14,269 the attack. A limited signature bypass means that the victim, if the victim 192 00:22:14,269 --> 00:22:23,470 clicks on one - at least one - additional window and explicitly wants to validate 193 00:22:23,470 --> 00:22:31,520 the signature, then the viewer was vulnerable. But the most important thing 194 00:22:31,520 --> 00:22:38,080 is by opening the file, there was a status message that the signature validation and 195 00:22:38,080 --> 00:22:44,289 all signatures are valid. So this was the first layer and the viewers were 196 00:22:44,289 --> 00:22:51,390 vulnerable against this. So let's talk about the second attack class. We called 197 00:22:51,390 --> 00:22:57,970 it "signature wrapping attack" and this is the most complex attack of the 3 classes. 198 00:22:57,970 --> 00:23:04,580 And now we have to go a little bit into the details of how PDF signatures are 199 00:23:04,580 --> 00:23:10,450 made. So imagine now we have a PDF file. We have some header and the original 200 00:23:10,450 --> 00:23:15,549 document. The original document contains the header, the body, the Xref section and 201 00:23:15,549 --> 00:23:21,919 so on and so forth. And we want to sign this document. Technically, again, an 202 00:23:21,919 --> 00:23:28,700 incremental update is provided and we have a new catalog here. 
We have some other 203 00:23:28,700 --> 00:23:35,159 objects, for example, certificates and so on and the signature objects. And we will 204 00:23:35,159 --> 00:23:38,720 now concentrate on this signature object because it's essential for the attack we 205 00:23:38,720 --> 00:23:45,399 want to to carry out. And the signature object contains a lot of information, but 206 00:23:45,399 --> 00:23:51,460 we want for this attacks only two elements are relevant: The contents and the byte 207 00:23:51,460 --> 00:23:57,940 range. The contents contains the signature value. It's a PKCS7 container containing 208 00:23:57,940 --> 00:24:05,710 the signature value and the certificates used to validate the signature and the 209 00:24:05,710 --> 00:24:11,299 bytes range. The byte range contains four different values and what how these values 210 00:24:11,299 --> 00:24:23,090 are being used. The first two, A and B define the first signed area. And this is 211 00:24:23,090 --> 00:24:29,159 here from the beginning of the document until the start of the signature value. 212 00:24:29,159 --> 00:24:35,370 Why we need this? Because the signature value is part of the signed area. So we need 213 00:24:35,370 --> 00:24:42,780 to exclude the signature value from the document computation. And this is how the 214 00:24:42,780 --> 00:24:49,179 bytes range is used. The first part is from the beginning of the document until 215 00:24:49,179 --> 00:24:54,629 the signed the signature value starts and after the signature ends until the end of 216 00:24:54,629 --> 00:25:04,759 the file is the second area specified by the two digits C and D. So, now we have 217 00:25:04,759 --> 00:25:13,500 everything protected besides the signature value itself. What we wanted to try is to 218 00:25:13,500 --> 00:25:21,889 create additional space for our attacks. So our idea was to move the second signed 219 00:25:21,889 --> 00:25:30,350 area. And how can we do it? 
So basically we can do it by just defining another byte 220 00:25:30,350 --> 00:25:40,240 range. And as you can see here, the byte range points from area A to B. So in this 221 00:25:40,240 --> 00:25:46,889 area we didn't make any manipulation, right? It was not modified at 222 00:25:46,889 --> 00:25:53,309 all. So it's still valid. And the second part, the new C value and the next D 223 00:25:53,309 --> 00:26:00,169 bytes, we didn't change anything here, right? So basically, we didn't change 224 00:26:00,169 --> 00:26:06,750 anything in the signed area. And the signature is still valid. But what we 225 00:26:06,750 --> 00:26:13,980 created was a space for some malicious objects; sometimes we needed some padding 226 00:26:13,980 --> 00:26:20,960 and a new Xref section pointing to these malicious objects. The important thing was 227 00:26:20,960 --> 00:26:27,559 that the position of this malicious Xref section is defined by the trailer. And 228 00:26:27,559 --> 00:26:32,799 since we cannot modify this trailer, this position is fixed. So this is the only 229 00:26:32,799 --> 00:26:42,880 limitation of the attack, but it works like a charm. And the question is now: How 230 00:26:42,880 --> 00:26:49,730 many PDF viewers were vulnerable against this attack? And as you can see, this is 231 00:26:49,730 --> 00:26:58,169 the signature wrapping column. 17 out of 22 applications were vulnerable against 232 00:26:58,169 --> 00:27:06,000 this attack. This was quite an expected result: because the attack was complex, we 233 00:27:06,000 --> 00:27:14,789 saw that many developers were not aware of this threat and that's the reason 234 00:27:14,789 --> 00:27:22,600 why so many vulnerabilities were there. Now to the last class of attacks, 235 00:27:22,600 --> 00:27:28,580 universal signature forgery. And we called it universal signature forgery, but I 236 00:27:28,580 --> 00:27:33,879 prefer to use another definition for these attacks.
I call them stupid 237 00:27:33,879 --> 00:27:40,909 implementation flaws. We are coming from the PenTesting area and I know a lot of 238 00:27:40,909 --> 00:27:49,880 you are PenTesters, too. And, many of you have experience, quite interesting 239 00:27:49,880 --> 00:27:58,460 experience with zero bytes, null values or some kind of weird values. And this is 240 00:27:58,460 --> 00:28:06,309 what we tried in this kind of attacks. Just tried to do some stupid values or 241 00:28:06,309 --> 00:28:13,100 remove references and see what happen. Considering the signature, there are two 242 00:28:13,100 --> 00:28:18,389 different important elements: The contents containing the signature value and the 243 00:28:18,389 --> 00:28:25,220 byte range pointing to what is exactly signed. So, what would happen if we remove 244 00:28:25,220 --> 00:28:30,679 the contents? Our hope was that the information regarding the signature is 245 00:28:30,679 --> 00:28:37,779 still shown by the viewer as valid without validating any signature because it was 246 00:28:37,779 --> 00:28:45,169 not possible. And by just removing the signature value is quite obvious idea. And 247 00:28:45,169 --> 00:28:48,899 we were not successful with this kind of attack. But let's proceed with another 248 00:28:48,899 --> 00:28:57,090 values like for example, contents without any value or contents like equals NULL or 249 00:28:57,090 --> 00:29:04,710 zero bytes. And considering this last version, we had two viewers which were 250 00:29:04,710 --> 00:29:15,049 vulnerable against this attack. And another, another case is, for example, by 251 00:29:15,049 --> 00:29:19,929 removing the byte range. By removing this byte range we have some signature value, 252 00:29:19,929 --> 00:29:29,590 but we don't know what is exactly signed. 
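The kind of sanity check that such odd values probe can be sketched in Python. This is a hypothetical illustration: the signature object is assumed to be already parsed into a plain dictionary with the keys `Contents` (bytes) and `ByteRange` (list of integers), and `signature_fields_sane` is a name invented here. It only decides whether verification can be attempted at all; it verifies nothing cryptographically.

```python
def signature_fields_sane(sig: dict, file_len: int) -> bool:
    """Reject signature objects whose Contents or ByteRange cannot be verified.

    `sig` is a hypothetical, already parsed signature dictionary.
    """
    contents = sig.get("Contents")
    byte_range = sig.get("ByteRange")

    # Missing, null, empty or all-zero signature value: nothing to verify.
    if not isinstance(contents, bytes) or not any(contents):
        return False

    # Missing, null or malformed byte range: unknown what was signed.
    if not isinstance(byte_range, (list, tuple)) or len(byte_range) != 4:
        return False
    if not all(isinstance(v, int) and v >= 0 for v in byte_range):
        return False

    # The two areas must be ordered and must lie inside the file.
    a, b, c, d = byte_range
    return a == 0 and a + b <= c and c + d <= file_len


good = {"Contents": b"\x30\x82\x01\x00", "ByteRange": [0, 100, 200, 50]}
print(signature_fields_sane(good, 250))
print(signature_fields_sane({"ByteRange": [0, 100, 200, 50]}, 250))
```

A viewer that fails such a check would have to report the signature as invalid rather than fall back to showing it as valid.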
So, we tried this attack and of course, 253 00:29:29,590 --> 00:29:38,390 byte range without any value or NULL bytes or byte range with 254 00:29:38,390 --> 00:29:46,169 negative numbers. And usually this last one crashed a lot of viewers. But the 255 00:29:46,169 --> 00:29:51,800 most interesting is that Adobe made this mistake: by just removing the byte range, 256 00:29:51,800 --> 00:29:56,990 we were able to bypass the entire security. We didn't expect this behavior, 257 00:29:56,990 --> 00:30:00,950 but it was a stupid implementation flaw, allowing us to do anything in this 258 00:30:00,950 --> 00:30:08,190 document and all the exploits we show in our presentations were made on Adobe with 259 00:30:08,190 --> 00:30:14,909 this attack. So let's see what the results of this attack were. As you can see, 260 00:30:14,909 --> 00:30:21,110 only 4 of 22 viewers were vulnerable against this attack and only Adobe 261 00:30:21,110 --> 00:30:26,280 without limitation; for the others, there was a limitation because if you click on the 262 00:30:26,280 --> 00:30:32,760 signature validation, then a warning was thrown. It was very easy for Adobe to fix. 263 00:30:32,760 --> 00:30:37,540 And as you can see, Adobe didn't make any mistake regarding incremental 264 00:30:37,540 --> 00:30:40,820 saving or signature wrapping, but regarding universal signature forgery 265 00:30:40,820 --> 00:30:48,169 they were vulnerable against this attack. And this was the hope of our approach. In 266 00:30:48,169 --> 00:30:56,029 summary, we were able to break 21 of 22 PDF viewers. The only 267 00:30:56,029 --> 00:31:00,850 *Applause* Thanks. 268 00:31:00,850 --> 00:31:08,149 *Applause* The only secure PDF viewer is Adobe 9, 269 00:31:08,149 --> 00:31:12,860 which is deprecated and has remote code execution.
The only 270 00:31:12,860 --> 00:31:18,039 *Laughter* The only users allowed to use it, or who are 271 00:31:18,039 --> 00:31:25,450 using it, are Linux users, because this is the last version available for Linux and 272 00:31:25,450 --> 00:31:31,779 that's the reason why we considered it. So, I'm done with the talk about PDF 273 00:31:31,779 --> 00:31:36,644 signatures and now Fabian can talk about PDF encryption. Thank you. 274 00:31:36,644 --> 00:31:42,540 Fabian: Yes *Applause* 275 00:31:42,540 --> 00:31:46,759 OK, now that we have dealt with the signatures, let's talk about another 276 00:31:46,759 --> 00:31:52,759 cryptographic aspect in PDFs. And that is encryption. And some of you might remember 277 00:31:52,759 --> 00:31:58,481 our PDFex vulnerability from earlier this year. It's, of course, an attack with a 278 00:31:58,481 --> 00:32:03,720 logo and it presents two novel techniques targeting PDF encryption that 279 00:32:03,720 --> 00:32:08,029 have never been applied to PDF encryption before. So one of them is the so-called 280 00:32:08,029 --> 00:32:12,549 direct exfiltration where we break the cryptography without even touching the 281 00:32:12,549 --> 00:32:18,840 cryptography. So no ciphertext manipulation here. The second one is the so-called 282 00:32:18,840 --> 00:32:24,690 malleability gadgets. And those are actually targeted modifications of the 283 00:32:24,690 --> 00:32:31,240 ciphertext of the document. But first, let's take a step back and again take in 284 00:32:31,240 --> 00:32:39,519 some keywords. So PDF uses AES. OK. Well, AES is good. Nothing can go wrong, 285 00:32:39,519 --> 00:32:44,220 right? So let's go home. Encryption is fine. Well, of course, we didn't stop 286 00:32:44,220 --> 00:32:52,160 here, but took a closer look. So they use CBC mode of operation, so cipher block 287 00:32:52,160 --> 00:32:58,309 chaining. And, what's more important is that they don't use any integrity 288 00:32:58,309 --> 00:33:04,120 protection.
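Why CBC without integrity protection is a problem can be shown with a small, self-contained Python sketch. Loud caveat: the block cipher below is a toy Feistel construction built from SHA-256 so that the example runs with the standard library only. It is not AES and must not be used for real encryption. The property being shown belongs to the CBC mode itself: an attacker who knows a plaintext block can turn it into a chosen one by XORing the difference into the preceding block (here the IV), without knowing the key. The message, the key and all function names are made up for the example.

```python
import hashlib

BLOCK = 16  # block size in bytes, same as AES


def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def _round(key: bytes, i: int, half: bytes) -> bytes:
    # Round function of the toy Feistel cipher (NOT AES).
    return hashlib.sha256(key + bytes([i]) + half).digest()[:BLOCK // 2]


def toy_encrypt_block(key: bytes, block: bytes) -> bytes:
    left, right = block[:8], block[8:]
    for i in range(4):
        left, right = right, _xor(left, _round(key, i, right))
    return left + right


def toy_decrypt_block(key: bytes, block: bytes) -> bytes:
    left, right = block[:8], block[8:]
    for i in reversed(range(4)):
        left, right = _xor(right, _round(key, i, left)), left
    return left + right


def cbc_encrypt(key: bytes, iv: bytes, plaintext: bytes) -> bytes:
    if len(plaintext) % BLOCK:
        raise ValueError("plaintext must be a multiple of the block size")
    out, prev = [], iv
    for i in range(0, len(plaintext), BLOCK):
        prev = toy_encrypt_block(key, _xor(plaintext[i:i + BLOCK], prev))
        out.append(prev)
    return b"".join(out)


def cbc_decrypt(key: bytes, iv: bytes, ciphertext: bytes) -> bytes:
    out, prev = [], iv
    for i in range(0, len(ciphertext), BLOCK):
        block = ciphertext[i:i + BLOCK]
        out.append(_xor(toy_decrypt_block(key, block), prev))
        prev = block
    return b"".join(out)


key = b"k" * 16               # known only to the legitimate parties
iv = bytes(range(16))
known = b"pay 0100 EUR now"   # plaintext block the attacker happens to know
target = b"pay 9999 EUR now"  # plaintext block the attacker wants instead

ciphertext = cbc_encrypt(key, iv, known)

# The attacker never touches the key: only the IV is modified.
forged_iv = _xor(iv, _xor(known, target))

print(cbc_decrypt(key, iv, ciphertext))
print(cbc_decrypt(key, forged_iv, ciphertext))
```

With an authentication tag over IV and ciphertext, the receiver could reject the forged input before decrypting it.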
So it's AES-CBC without integrity protection. And you might remember the 289 00:33:04,120 --> 00:33:08,909 scenario from the attacks against encrypted e-mail, so against OpenPGP and 290 00:33:08,909 --> 00:33:15,999 S/MIME, it's basically the same problem. But first, who actually uses PDF 291 00:33:15,999 --> 00:33:20,940 encryption? You might ask. For one, we found that some local banks in Germany use 292 00:33:20,940 --> 00:33:26,030 encrypted PDFs as a drop-in replacement for S/MIME or OpenPGP because their 293 00:33:26,030 --> 00:33:34,899 customers might not want to deal with the setup of encrypted e-mail. 294 00:33:34,899 --> 00:33:39,740 The second one was some drop-in plugins for encrypted e-mail as well. So there are some 295 00:33:39,740 --> 00:33:44,570 companies out there that produce products that you can put into your Outlook and you 296 00:33:44,570 --> 00:33:51,330 can use encrypted PDF files instead of encrypted e-mail. We also found that some 297 00:33:51,330 --> 00:33:57,919 scanners and medical devices were able to send encrypted PDF files via e-mail. So 298 00:33:57,919 --> 00:34:02,990 you can set a password on that machine and they will send the encrypted PDF via 299 00:34:02,990 --> 00:34:10,369 e-mail and you have to put in the password some other way. And lastly, we 300 00:34:10,369 --> 00:34:14,639 found that some governmental organizations use encrypted PDF documents, for example, 301 00:34:14,639 --> 00:34:20,409 the US Department of Justice allows for sending in some claims via 302 00:34:20,409 --> 00:34:25,280 encrypted PDFs. And I have exactly no idea how they get the password, but at 303 00:34:25,280 --> 00:34:30,850 least they allow it. So as we are from academia, let's take a step back and look 304 00:34:30,850 --> 00:34:36,860 at our attacker model. So we've got Alice and Bob. Alice wants to send a document to 305 00:34:36,860 --> 00:34:42,120 Bob.
And she wants to send it over an unencrypted channel or a channel she 306 00:34:42,120 --> 00:34:48,610 doesn't trust. So of course, she decides to encrypt it. The second scenario is, they 307 00:34:48,610 --> 00:34:53,020 want to upload it to a shared storage. For example, Dropbox or any other shared 308 00:34:53,020 --> 00:34:57,190 storage. And of course, they don't trust the storage. So, again, they use end-to- 309 00:34:57,190 --> 00:35:05,120 end encryption. So let's assume that this shared storage is indeed dangerous or 310 00:35:05,120 --> 00:35:11,420 malicious. So, Alice will, of course, again upload the encrypted document; the 311 00:35:11,420 --> 00:35:17,490 attacker, in this case, will perform some targeted modification of that, and will 312 00:35:17,490 --> 00:35:22,290 send the modified document back to Bob, who will happily put in the password 313 00:35:22,290 --> 00:35:26,800 because from his point of view, it's indistinguishable from the original 314 00:35:26,800 --> 00:35:32,880 document, and the original plaintext will be leaked back to the attacker, breaking 315 00:35:32,880 --> 00:35:39,730 the confidentiality. So let's take a look at the first attack and how we did that. 316 00:35:39,730 --> 00:35:43,410 That's the direct exfiltration, so breaking the cryptography without touching 317 00:35:43,410 --> 00:35:51,360 any cryptography, as I like to say. But first, PDF encryption in a nutshell. 318 00:35:51,360 --> 00:35:54,570 So you have seen the structure of the PDF document. There is a header 319 00:35:54,570 --> 00:35:59,990 with a version number. There's a body where all the interesting objects live. So 320 00:35:59,990 --> 00:36:06,820 there is our confidential content that we want to actually, well, to actually 321 00:36:06,820 --> 00:36:14,740 exfiltrate as an attacker. And finally, there is the Xref table and the trailer. So 322 00:36:14,740 --> 00:36:19,730 what changes if we decide to encrypt this document?
Well, actually, not a whole lot. 323 00:36:19,730 --> 00:36:24,080 So instead of confidential data, of course, there's now some encrypted 324 00:36:24,080 --> 00:36:29,010 ciphertext. Okay. And the rest pretty much remains the same. The only thing that is 325 00:36:29,010 --> 00:36:36,960 added is a new value in the trailer that tells us how to decrypt this data again. 326 00:36:36,960 --> 00:36:43,560 So pretty much all of the structure is left unencrypted. And we thought about: 327 00:36:43,560 --> 00:36:50,120 Why is this? And we took a look at the standard. So, this is an excerpt from the 328 00:36:50,120 --> 00:36:55,940 PDF specification and I've highlighted the interesting parts for you. Encryption is 329 00:36:55,940 --> 00:37:00,690 only applied to strings and streams. Well, those are the values that actually can 330 00:37:00,690 --> 00:37:07,640 contain any text in the document, and all other objects are not encrypted. And that 331 00:37:07,640 --> 00:37:12,270 is because, well, they want to allow random access to the whole document. So no 332 00:37:12,270 --> 00:37:17,600 parsing the whole document before actually showing page 16 of the encrypted document. 333 00:37:17,600 --> 00:37:24,560 Well, that seems kind of reasonable. But that also means that the whole 334 00:37:24,560 --> 00:37:27,970 document structure is unencrypted and only the streams and strings are 335 00:37:27,970 --> 00:37:31,380 encrypted. This reveals a lot of information to an attacker that he or she 336 00:37:31,380 --> 00:37:36,420 probably shouldn't have. That's for one the number and size of pages, that's the 337 00:37:36,420 --> 00:37:42,610 number and size of objects in the document, and that's also including any links, so 338 00:37:42,610 --> 00:37:48,120 any hyperlinks in the document that are actually there. So, that's a lot of 339 00:37:48,120 --> 00:37:55,260 information an attacker probably shouldn't have.
So, next we thought maybe we can do 340 00:37:55,260 --> 00:38:01,270 some more stuff. Can we add our own unencrypted content? And we took a look at 341 00:38:01,270 --> 00:38:05,910 the standard again and found that there are so- called crypt filters, which provide finer- 342 00:38:05,910 --> 00:38:10,750 granularity control of the encryption. This basically means as an attacker, I can 343 00:38:10,750 --> 00:38:15,920 change a document to say, hey, only strings in this document are encrypted and 344 00:38:15,920 --> 00:38:21,340 streams are unencrypted. That's what the identity filter is for. I have no idea why 345 00:38:21,340 --> 00:38:27,190 they decided to add that to a document format, but it's there. So that means 346 00:38:27,190 --> 00:38:31,570 there is support for partial encryption, and that means attacker content can be mixed 347 00:38:31,570 --> 00:38:36,880 with actual encrypted content. And we found 18 different techniques to do that 348 00:38:36,880 --> 00:38:42,290 in different readers. So there are a lot of ways to do that in the different readers. 349 00:38:42,290 --> 00:38:48,150 So let's have a look at a demo. So we have this document, this encrypted document, we 350 00:38:48,150 --> 00:38:54,170 put in our password and get our secret message. We now open it again in a text 351 00:38:54,170 --> 00:39:00,140 editor. We see, in object 4 0 down here, there's the actual ciphertext of the 352 00:39:00,140 --> 00:39:06,110 object, so of the message, and we see it's AES encrypted, with a 32 byte key, so it's 353 00:39:06,110 --> 00:39:15,670 AES-256. OK. Now we decide to add a new object that contains, well, plaintext. 354 00:39:15,670 --> 00:39:22,220 And, well, we simply add that to the contents array of this document. So, we 355 00:39:22,220 --> 00:39:28,241 say "Display this on the first page", save the document. We open it, and we'll put in 356 00:39:28,241 --> 00:39:38,300 our password and, oh well, this is indeed awkward. OK.
So, now, we have broken the 357 00:39:38,300 --> 00:39:44,160 integrity of an encrypted document. Well, you might think maybe they didn't want any 358 00:39:44,160 --> 00:39:49,190 integrity in the encrypted files. Maybe that's the use case people have, I don't 359 00:39:49,190 --> 00:39:55,060 know. But we thought, maybe we can somehow exfiltrate the plaintext this way. So 360 00:39:55,060 --> 00:40:00,040 again, we took a step back, and looked at the PDF specification. And the first thing 361 00:40:00,040 --> 00:40:06,080 we found were so-called submit-form actions. And that's basically the same as 362 00:40:06,080 --> 00:40:10,550 a form on a website. You can put in data. You might have seen this in a contract, in 363 00:40:10,550 --> 00:40:14,740 a PDF contract, where you can put in your name, and your address, and so on, and so 364 00:40:14,740 --> 00:40:23,330 on, and the data that is put in there is saved in strings and streams. And 365 00:40:23,330 --> 00:40:27,760 now remember that is everything that is encrypted in a document. And, of course, 366 00:40:27,760 --> 00:40:32,101 you can also send that back to an attacker, or well, to a legitimate use 367 00:40:32,101 --> 00:40:37,890 case, of course, via clicking a button, but clicking buttons is pretty lame. So we 368 00:40:37,890 --> 00:40:42,120 again looked at the standard and found the so-called open action. And that is an 369 00:40:42,120 --> 00:40:47,190 action, for example, submitting a form, that can be performed upon opening a 370 00:40:47,190 --> 00:40:54,980 document. So how might this look? This is how a PDF form looks, already with the 371 00:40:54,980 --> 00:41:01,390 attack applied. So, we've got a URL here that is unencrypted, because all strings 372 00:41:01,390 --> 00:41:07,400 in this document are unencrypted, and we've got the value object 2 0, where the 373 00:41:07,400 --> 00:41:13,335 actual encrypted data lives. So, that is the value of the form fields.
And what 374 00:41:13,335 --> 00:41:17,120 will happen on the attacker side as soon as this document is opened? Well, we'll 375 00:41:17,120 --> 00:41:24,540 get a post request with a confidential content. Let's have a demo. Again, we have 376 00:41:24,540 --> 00:41:30,620 this document. We put in our password. It's the original document you have 377 00:41:30,620 --> 00:41:36,160 already seen. We reopen it in a text viewer, or a text editor, again see it's 378 00:41:36,160 --> 00:41:44,160 encrypted, and we decide to change all strings to the identity filter. So, no 379 00:41:44,160 --> 00:41:49,480 encryption is applied to strings from now on. And then we add a whole blob of 380 00:41:49,480 --> 00:41:55,940 information for the open action, and for the form. So this will be op- this will be 381 00:41:55,940 --> 00:42:00,350 performed, as soon as the document is opened. There is a URL, p.df, and the 382 00:42:00,350 --> 00:42:07,540 value is the encrypted object 4 0. We start an HTTP server on the domain we 383 00:42:07,540 --> 00:42:12,970 specified, we open the document, put in the password again, and as soon as we open 384 00:42:12,970 --> 00:42:17,770 the document Adobe will helpfully show us a warning, but they will already click the 385 00:42:17,770 --> 00:42:22,170 button for remembering that for the future. And if you accept that, you will 386 00:42:22,170 --> 00:42:29,390 see your secret message on the attacker server. And that is pretty bad already. 387 00:42:29,390 --> 00:42:36,480 OK. The same works for hyperlinks, so, of course, there are links in PDF documents, 388 00:42:36,480 --> 00:42:43,600 and as on the Web, we can define a base URL for hyperlinks. So we can say all URLs 389 00:42:43,600 --> 00:42:49,940 from this document start with http://p.df. And of course we can define any object as 390 00:42:49,940 --> 00:42:57,260 a URL. 
So any object we prepared this way can be sent as a URL, and that will, of 391 00:42:57,260 --> 00:43:01,180 course, trigger a GET request upon opening the document again, if you defined an open 392 00:43:01,180 --> 00:43:08,750 action for the same object. So again, pretty bad and breaks confidentiality. And 393 00:43:08,750 --> 00:43:16,380 of course, everybody loves JavaScript in PDF files, and that works as well. Okay. 394 00:43:16,380 --> 00:43:21,350 Let's talk about ciphertext attacks, so actual cryptographic attacks, no more not 395 00:43:21,350 --> 00:43:29,190 touching the crypto. So you might remember the efail attacks on OpenPGP and S/MIME, 396 00:43:29,190 --> 00:43:34,160 and those had basically three prerequisites. 1: Well, ciphertext 397 00:43:34,160 --> 00:43:38,690 malleability, so it's called malleability gadgets. That's why we need ciphertext 398 00:43:38,690 --> 00:43:43,850 malleability, and we've got no integrity protection, that's a plus. Then we need 399 00:43:43,850 --> 00:43:48,680 some known plaintext for actual targeted modifications. And we need an exfiltration 400 00:43:48,680 --> 00:43:53,070 channel to send the data back to an attacker. Well, exfiltration channels are 401 00:43:53,070 --> 00:43:59,730 already dealt with as we have hyperlinks and forms. So we can already check that. 402 00:43:59,730 --> 00:44:05,800 Nice. Let's talk about ciphertext malleability, or what we call gadgets. So, 403 00:44:05,800 --> 00:44:10,180 some of you might remember this from crypto 101, or whatever lecture you ever 404 00:44:10,180 --> 00:44:15,290 had on cryptography. This is the decryption function of CBC, so cipher 405 00:44:15,290 --> 00:44:24,030 block chaining. And it's basically, you've got your ciphertext up here, and your 406 00:44:24,030 --> 00:44:29,730 plaintext down here. 
And it works by simply decrypting a block of ciphertext, 407 00:44:29,730 --> 00:44:35,850 XORing the previous block of ciphertext onto that, and you'll get the plaintext. 408 00:44:35,850 --> 00:44:41,070 So what happens, if you decide to change a single bit in the ciphertext, for example, 409 00:44:41,070 --> 00:44:47,530 the first bit of the initialization vector? Well, that same bit will flip in 410 00:44:47,530 --> 00:44:53,110 the actual plaintext. Wait a second. What happens, if you happen to know a whole 411 00:44:53,110 --> 00:45:00,150 plaintext block? Well, we can XOR that onto the first block, and basically get 412 00:45:00,150 --> 00:45:05,890 all zeros, or what we call a gadget, or a blank sheet of paper, because we can write 413 00:45:05,890 --> 00:45:14,130 on that by taking a chosen plaintext and XORing that onto this result. And this 414 00:45:14,130 --> 00:45:18,740 way we can, for example, construct URLs in the actual ciphertext, or in the actual 415 00:45:18,740 --> 00:45:24,420 resulting plaintext. What we can also do with these gadgets is move 416 00:45:24,420 --> 00:45:28,580 them somewhere else in the document, or clone them, so we can have multiple 417 00:45:28,580 --> 00:45:34,150 gadgets, at multiple places in the ciphertext. But remember, if you do that, 418 00:45:34,150 --> 00:45:37,800 there's always the avalanche effect of CBC, so you will have some random bytes in 419 00:45:37,800 --> 00:45:45,590 here, but the URL still remains in place. Okay. That's ciphertext malleability done. 420 00:45:45,590 --> 00:45:50,610 As I've said we need some plaintext. We need to have some known plaintext. And as 421 00:45:50,610 --> 00:45:54,460 the PDF standard has been pretty helpful up until now, in breaking PDF encryption, 422 00:45:54,460 --> 00:46:02,071 let's take a look again. And what we found here: Permissions.
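The gadget construction described above (XOR a known plaintext block out of the preceding ciphertext block or IV, then XOR a chosen plaintext in) can be sketched in Python. This is a minimal illustration, not the authors' tooling: it assumes a toy 16-byte Feistel cipher built from SHA-256 as a stand-in for AES, because the malleability comes from the CBC mode and not from the block cipher. All function names here are hypothetical.

```python
import hashlib

BLOCK = 16  # same block size as AES


def xor(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def _round(key: bytes, i: int, half: bytes) -> bytes:
    # Round function of the toy Feistel cipher (stand-in for AES, NOT secure).
    return hashlib.sha256(key + bytes([i]) + half).digest()[: BLOCK // 2]


def toy_encrypt_block(key: bytes, block: bytes) -> bytes:
    left, right = block[:8], block[8:]
    for i in range(4):
        left, right = right, xor(left, _round(key, i, right))
    return left + right


def toy_decrypt_block(key: bytes, block: bytes) -> bytes:
    left, right = block[:8], block[8:]
    for i in reversed(range(4)):
        left, right = xor(right, _round(key, i, left)), left
    return left + right


def cbc_encrypt(key: bytes, iv: bytes, plaintext: bytes) -> bytes:
    out, prev = b"", iv
    for i in range(0, len(plaintext), BLOCK):
        prev = toy_encrypt_block(key, xor(plaintext[i:i + BLOCK], prev))
        out += prev
    return out


def cbc_decrypt(key: bytes, iv: bytes, ciphertext: bytes) -> bytes:
    out, prev = b"", iv
    for i in range(0, len(ciphertext), BLOCK):
        block = ciphertext[i:i + BLOCK]
        out += xor(toy_decrypt_block(key, block), prev)
        prev = block
    return out


def make_gadget(iv: bytes, c0: bytes, known: bytes, chosen: bytes):
    """Attacker side: no key needed, only one known plaintext block.

    XORing `known` onto the IV yields the all-zero "blank sheet";
    XORing `chosen` on top writes the attacker's text onto it.
    """
    return xor(xor(iv, known), chosen), c0
```

With a victim key `k`, an IV `iv`, and a first block whose plaintext `known` the attacker knows, `make_gadget(iv, c0, known, chosen)` returns a new IV such that `cbc_decrypt(k, new_iv, c0)` yields `chosen`. The attacker never touches the key.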
So PDF documents can 423 00:46:02,071 --> 00:46:08,040 have different permissions for the author, and the user of the document. This 424 00:46:08,040 --> 00:46:11,020 basically means the author can edit the document and the users might not be able 425 00:46:11,020 --> 00:46:16,060 to do that. And of course, people started to tamper 426 00:46:16,060 --> 00:46:20,220 with that value, if it was left unencrypted, so in the newest version, it 427 00:46:20,220 --> 00:46:27,310 was decided this should be encrypted as a 16 byte value. So we've got 16 bytes. How 428 00:46:27,310 --> 00:46:30,890 do they look? Well, at first, we need room for extension. We need lots of 429 00:46:30,890 --> 00:46:36,100 permissions. Then we put 4 bytes of the actual permission value - That is also in 430 00:46:36,100 --> 00:46:42,270 unencrypted form in the document. Then we need one byte for encrypted metadata, and for 431 00:46:42,270 --> 00:46:46,840 some reason we need some acronym, "adb", I'll leave it to you to figure out what 432 00:46:46,840 --> 00:46:52,700 that stands for. And finally, we've got four random bytes, because we have to fill 433 00:46:52,700 --> 00:47:00,260 up 16 bytes, and we have run out of ideas. Okay. We take all of that, encrypt it, and 434 00:47:00,260 --> 00:47:05,980 oh well, we know a lot of that, and that is basically known plaintext by design. 435 00:47:05,980 --> 00:47:12,940 Which is bad. Let's look at how this looks in a document. So, you see the perms 436 00:47:12,940 --> 00:47:16,410 value, I've marked it down here. That is the actual extended value I've shown you 437 00:47:16,410 --> 00:47:22,750 on the last slide. And above that you'll see the unencrypted value that's inside 438 00:47:22,750 --> 00:47:28,030 this perms value, so the minus 4 in this case, it's basically a bit field.
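The known part of that 16-byte value can be reconstructed from the unencrypted permission number alone. The sketch below assumes the byte layout of the PDF 2.0 specification (ISO 32000-2) as commonly described: four little-endian permission bytes, four 0xFF extension bytes, one `T`/`F` byte for the metadata flag, then `adb`, then four random bytes. The byte order and the function name are assumptions for illustration, so check them against the specification before relying on them.

```python
import struct


def perms_known_plaintext(p: int, encrypt_metadata: bool = True) -> bytes:
    """Return the 12 predictable bytes of the 16-byte Perms plaintext.

    Layout assumed here (see lead-in): P (4 bytes, little-endian),
    0xFFFFFFFF extension, 'T' or 'F', then the literal "adb".
    The final 4 bytes are random and therefore not known.
    """
    permission = struct.pack("<i", p)      # e.g. -4 -> fc ff ff ff
    extension = b"\xff" * 4                # "room for extension"
    metadata = b"T" if encrypt_metadata else b"F"
    return permission + extension + metadata + b"adb"


known = perms_known_plaintext(-4)
print(known.hex(), len(known), "of 16 bytes known")
```

Under these assumptions an attacker knows 12 of the 16 plaintext bytes without the password, which is the known plaintext the gadget attack needs.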
On the 439 00:47:28,030 --> 00:47:33,610 right side you see the actual encrypted contents, and helpfully, all of this is 440 00:47:33,610 --> 00:47:37,750 encrypted under the same document-wide key in the newest version of the 441 00:47:37,750 --> 00:47:43,510 specification. And that means we can reuse this plaintext anywhere in the 442 00:47:43,510 --> 00:47:48,930 document we want, and we can reuse this to build gadgets. To sum that last point 443 00:47:48,930 --> 00:47:53,190 up for you: Adobe decided to add permissions to the PDF format, and people 444 00:47:53,190 --> 00:47:56,950 thought of tampering with them. So they decided to encrypt these permissions to 445 00:47:56,950 --> 00:48:06,360 prevent tampering, and now known plaintext is available to attackers. All right. So 446 00:48:06,360 --> 00:48:14,330 that's basically all of the prerequisites done, and let's again have a demo. So, we 447 00:48:14,330 --> 00:48:20,180 again open this document, put in our password, well, as soon as Chrome decides 448 00:48:20,180 --> 00:48:26,740 to open this document, we put in our password. It's the same as before. Now, 449 00:48:26,740 --> 00:48:31,630 I've prepared a script for you, because I really can't do this live, and it 450 00:48:31,630 --> 00:48:35,400 basically does what I've told you. It's getting a blank gadget from the perms 451 00:48:35,400 --> 00:48:39,670 value. It's generating a URL from that. It's generating a field name, so that it 452 00:48:39,670 --> 00:48:45,410 will look nice on the server side, we regenerate this document and put a form in 453 00:48:45,410 --> 00:48:50,080 there. We start a web server, open this modified document, put in the password 454 00:48:50,080 --> 00:48:55,540 again and oh well, Chrome doesn't even ask. So as soon as this document is opened 455 00:48:55,540 --> 00:48:59,160 in Chrome and the password is put in, we'll get our secret message delivered to 456 00:48:59,160 --> 00:49:07,080 the attacker.
*Applause* 457 00:49:07,080 --> 00:49:13,510 So we took a look at 27 viewers and found all of them vulnerable to at least one of 458 00:49:13,510 --> 00:49:18,390 our attacks. So some of them work with no user interaction as we have seen in 459 00:49:18,390 --> 00:49:22,730 Chrome. Some work with user interaction in specific cases, as you've seen with Adobe 460 00:49:22,730 --> 00:49:30,660 with a warning, but generally all of these were attackable in one way or the other. 461 00:49:30,660 --> 00:49:35,670 So what can be done about all of this? Well, you might think signatures might 462 00:49:35,670 --> 00:49:40,250 help. That's usually the first point people bring up: "A signature on the 463 00:49:40,250 --> 00:49:46,550 encrypted file will help." Well, no, not really. Why is that? Well, for one, a 464 00:49:46,550 --> 00:49:50,332 broken signature does not prevent opening the document. So we'll still be able to 465 00:49:50,332 --> 00:49:54,360 exfiltrate as soon as a password is put in. Signatures can be stripped because 466 00:49:54,360 --> 00:49:57,700 they're not encrypted. And as you have seen before, they can also be forged in 467 00:49:57,700 --> 00:50:02,960 most viewers. Signatures are not the answer. Closing exfiltration channels is 468 00:50:02,960 --> 00:50:08,360 also not the answer because for one, it's hard to do. And how would you even find 469 00:50:08,360 --> 00:50:14,690 all exfiltrations channels in an 800 pages standard? And I mean, we have barely 470 00:50:14,690 --> 00:50:18,430 scratched the surface of exfiltration channels. And should we really remove 471 00:50:18,430 --> 00:50:24,290 forms and hyperlinks from documents? And should we remove JavaScript? OK, maybe we 472 00:50:24,290 --> 00:50:28,700 should. And finally, if you have to do that, please ask the user before 473 00:50:28,700 --> 00:50:34,300 connecting to a web server. So let's look at some vendor reactions. 
Apple decided to 474 00:50:34,300 --> 00:50:38,680 do exactly what I've told you: to add a dialog to warn the user and even show the 475 00:50:38,680 --> 00:50:44,460 whole URL with the encrypted plaintext. And Google decided to stop trying to fix 476 00:50:44,460 --> 00:50:49,830 the unfixable in Chrome. They fixed the automatic exfiltration, but there's really 477 00:50:49,830 --> 00:50:54,290 nothing they can do about the standard. So this is a problem that has to be fixed in 478 00:50:54,290 --> 00:51:00,230 the standard. And that is basically that. For mitigating wrapping attacks, we have 479 00:51:00,230 --> 00:51:04,110 to deprecate partial encryption and disallow access from unencrypted to 480 00:51:04,110 --> 00:51:08,450 encrypted objects. And against the gadget attacks, we have to use authenticated 481 00:51:08,450 --> 00:51:16,221 encryption like AES-GCM. OK. And Adobe has told us that they were escalating this to 482 00:51:16,221 --> 00:51:19,980 the ISO working group that's now responsible for the PDF standard and this 483 00:51:19,980 --> 00:51:24,710 will be taken up in the next revision. So that's a win in my book. 484 00:51:24,710 --> 00:51:30,950 *Applause* 485 00:51:30,950 --> 00:51:36,330 Herald: Thank you so much, guys. That was really awesome. Please queue up by the 486 00:51:36,330 --> 00:51:41,290 microphones if you have any questions, we still have some time left for Q and A. But 487 00:51:41,290 --> 00:51:45,180 I think your research is really, really interesting because it opens my mind to, 488 00:51:45,180 --> 00:51:51,490 like, how this could actually be misused in practice. Like, and I don't 489 00:51:51,490 --> 00:51:54,760 know, like, what's your take? I guess since you've been working so much with 490 00:51:54,760 --> 00:51:59,020 this, you must have some kind of idea as to what devious things you could come up 491 00:51:59,020 --> 00:52:02,680 with.
Fabian: I mean, it's still an attacker 492 00:52:02,680 --> 00:52:08,080 scenario that requires a lot of resources and a very motivated attacker. So this 493 00:52:08,080 --> 00:52:13,680 might not be very important to the normal user. Let's be real here. So most of us 494 00:52:13,680 --> 00:52:19,100 are not targeted by the NSA, I guess. So you need an active attacker, an active man 495 00:52:19,100 --> 00:52:21,070 in the middle to actually perform these attacks. 496 00:52:21,070 --> 00:52:25,800 Herald: Great. Thank you. And then I think we have a question from microphone number 497 00:52:25,800 --> 00:52:28,850 four, please. Microphone 4: Yes. You said that the 498 00:52:28,850 --> 00:52:32,700 next standard might have a fix. Do you know a time frame on how long it 499 00:52:32,700 --> 00:52:41,450 takes to build such a standard? Fabian: Well, no, we don't really know. We 500 00:52:41,450 --> 00:52:44,640 have talked with Adobe and they told us they will show the next version of the 501 00:52:44,640 --> 00:52:48,950 standard to us before actually releasing that, but we have no time frame at all 502 00:52:48,950 --> 00:52:51,950 from them. Microphone 4: OK. Thank you. 503 00:52:51,950 --> 00:52:57,400 Herald: Thank you. Microphone number five, please. 504 00:52:57,400 --> 00:53:02,300 Microphone 5: Thank you for a very interesting talk. You showed in the first 505 00:53:02,300 --> 00:53:09,140 part that the signature has like these four numbers with the byte range. And why 506 00:53:09,140 --> 00:53:15,580 is this, like four numbers, not part of a signature? Is there a technical reason for 507 00:53:15,580 --> 00:53:18,480 that? Because the byte offset is predictable. 508 00:53:18,480 --> 00:53:24,470 Vladi: It is! The byte range is protected by the signature. But we just defined a 509 00:53:24,470 --> 00:53:31,710 second one and just moved the signed one to be validated later. So there are two 510 00:53:31,710 --> 00:53:37,530 byte ranges.
But only the first one, the manipulated one, will be processed. 511 00:53:37,530 --> 00:53:42,580 Microphone 5: Thank you. Herald: Thank you so much. Microphone 512 00:53:42,580 --> 00:53:47,940 number four, please. Microphone 4: Oh, this is way too high for 513 00:53:47,940 --> 00:53:53,870 me. OK. I have an answer and a question for you. You mentioned during the talk 514 00:53:53,870 --> 00:53:58,690 that you weren't sure how the Department of Justice distributes the passwords 515 00:53:58,690 --> 00:54:07,940 for encrypting PDFs. The answer is: in plain text, in a separate email or as the 516 00:54:07,940 --> 00:54:14,300 password of the week, which is distributed through various means. That is also what 517 00:54:14,300 --> 00:54:20,370 the Department of Homeland Security does, and the military is somewhat less stupid. 518 00:54:20,370 --> 00:54:27,030 As a question: I have roughly half a terabyte of sensitive PDFs that I would 519 00:54:27,030 --> 00:54:36,910 like to scan for your attack and also for redaction failures. Do you know of any 520 00:54:36,910 --> 00:54:45,560 fast, feasible ways to scan documents for the presence of this kind of attack? 521 00:54:45,560 --> 00:54:51,970 Fabian: I don't know of any tools, but I mean, scanning for the gadget attacks is 522 00:54:51,970 --> 00:54:58,390 actually possible if you try to do some entropy detection. So, because you reuse 523 00:54:58,390 --> 00:55:01,870 ciphertext, you will have less entropy in your ciphertext, but that's pretty hard to 524 00:55:01,870 --> 00:55:07,350 do. Direct exfiltration should probably be detectable by scanning simply for words 525 00:55:07,350 --> 00:55:12,300 like "identity". Well, beyond that, there are the 18 different techniques that we provided in 526 00:55:12,300 --> 00:55:15,980 the paper. But I don't know of any tools to do that automatically. 527 00:55:15,980 --> 00:55:21,560 Microphone 4: Thank you. Herald: Great. Thank you.
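The keyword scan suggested in that answer could look roughly like the Python sketch below. It is a crude heuristic under stated assumptions, not a detector from the paper: the token list and the function name are chosen here for illustration, legitimate PDFs also use these names, and an attacker can hide them (for example with escaped name characters or inside compressed object streams), so a hit only marks a file for manual review and a miss proves nothing.

```python
# PDF name tokens related to the talk's direct-exfiltration building blocks.
# The selection is an assumption made for this sketch.
SUSPICIOUS_TOKENS = (
    b"/Identity",    # identity crypt filter (partial encryption)
    b"/SubmitForm",  # submit-form action
    b"/OpenAction",  # action performed when the document is opened
    b"/URI",         # hyperlinks / base URL
    b"/JavaScript",  # JavaScript actions
)


def scan_pdf_bytes(data: bytes) -> dict:
    """Count occurrences of each suspicious token in raw PDF bytes."""
    hits = {}
    for token in SUSPICIOUS_TOKENS:
        count = data.count(token)
        if count:
            hits[token.decode("ascii")] = count
    return hits


sample = b"<< /StrF /Identity /OpenAction 5 0 R >>"
print(scan_pdf_bytes(sample))
```

For a large archive, the same function can be applied to the bytes of each file (read with `open(path, "rb")`), and files that combine `/Identity` with an action such as `/SubmitForm` or `/OpenAction` can be prioritized for inspection.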
And microphone 528 00:55:21,560 --> 00:55:24,200 number two, please. Microphone 2: Thank you for your very interesting 529 00:55:24,200 --> 00:55:30,220 presentation. I have one suggestion and one question. For the mitigation scheme: 530 00:55:30,220 --> 00:55:33,810 you could simply run your PDF reader in a virtual machine that is firewalled away, 531 00:55:33,810 --> 00:55:38,660 so your firewall won't let anything go out. But for the signature 532 00:55:38,660 --> 00:55:43,020 forgeries, I had an idea. I'm not sure if this is actually a stupid idea, but did 533 00:55:43,020 --> 00:55:47,440 you consider faking the certificate? Because presumably the signature is 534 00:55:47,440 --> 00:55:52,250 protected by the signer's certificate. You make up your own, signing with that. Does 535 00:55:52,250 --> 00:55:57,670 it catch it and how? Vladi: We considered it but not in this 536 00:55:57,670 --> 00:56:04,900 paper. We assume that the certificate and the entire chain of trust for this path is 537 00:56:04,900 --> 00:56:11,750 totally secure. It was just an assumption to concentrate only on the attacks we 538 00:56:11,750 --> 00:56:19,600 already found. So, perhaps there will be further research provided by us in the 539 00:56:19,600 --> 00:56:22,810 next months and years. Herald: We might just hear more from you 540 00:56:22,810 --> 00:56:27,890 in the future. Thank you so much. And now questions from the Internet, please. 541 00:56:27,890 --> 00:56:34,800 Signal Angel: I have two questions to the first part of your talk from the Internet. 542 00:56:34,800 --> 00:56:40,540 The first one is you mentioned a few reactions, but can you give a bit more 543 00:56:40,540 --> 00:56:46,510 detail about your experience with vendors while reporting these issues? 544 00:56:46,510 --> 00:56:58,480 Vladi: Yeah. We, ...
when we first started, we asked the CERT team from BSI, 545 00:56:58,480 --> 00:57:04,790 CERT-Bund, to help us because there were a lot of affected vendors and we were not 546 00:57:04,790 --> 00:57:13,580 able to provide the support in a feasible way. So they supported us the entire way. 547 00:57:13,580 --> 00:57:19,880 We first created a report containing the exact description of the 548 00:57:19,880 --> 00:57:26,190 vulnerabilities and all exploits. Then, we distributed it to the BSI and they 549 00:57:26,190 --> 00:57:32,540 contacted the vendors and just proxied the communication, and there was a lot of 550 00:57:32,540 --> 00:57:36,680 communication. So I'm not aware of the entire communication, but only about the 551 00:57:36,680 --> 00:57:45,930 technical stuff where we were asked to just retest the fix and so on. So there 552 00:57:45,930 --> 00:57:52,810 was some reaction from Adobe, Foxit, and a lot of viewers reacted to our attacks and 553 00:57:52,810 --> 00:57:58,210 contacted us, but not everybody. Herald: Thank you so much. Unfortunately, 554 00:57:58,210 --> 00:58:01,670 that's the only time that we have available for questions today. I think you 555 00:58:01,670 --> 00:58:06,080 guys might stay around for a couple of minutes, just if someone has any more 556 00:58:06,080 --> 00:58:10,930 questions. Fabian Ising and Vladislav Mladenov, thank you so much. 557 00:58:10,930 --> 00:58:13,040 It was very interesting. Please give them a great round of applause. 558 00:58:13,040 --> 00:58:14,793 Vladi: Thank you. *Applause* 559 00:58:14,793 --> 00:58:20,299 *36c3 postroll music* 560 00:58:20,299 --> 00:58:43,000 subtitles created by c3subtitles.de in the year 2019. Join, and help us!