*34c3 intro*

Herald: The next talk is called "End-to-end formal ISA verification of RISC-V processors with riscv-formal". I have no idea what this means, but I'm very excited to understand what it means, and Clifford promised he's going to make sure everyone will. Clifford has been very well known in the open-source and free-software... free-source... oh my gosh... free-and-open-source community, and he's especially known for the project IceStorm. Please help me welcome Clifford!

*applause*

*inaudible*

Clifford: RISC-V is an open instruction set architecture. It's an open ISA, so it's not a processor but a processor specification -- an ISA specification -- that is free to use for everyone. If you happen to have implemented your own processor at one point in time, you might know that it's actually much easier to implement a processor than to implement all the tools that you need to compile programs for your processor. And if you use something like RISC-V, then you can reuse the tools that are already out there, so that's a great benefit. However, for this endeavor we need processors that are actually really compatible with each other: processors that implement the RISC-V ISA correctly. With many other ISAs, we start with one processor and say "Oh, that's the processor", and later on we figure out there was a bug, and what people sometimes do is just change the specification so that the specification fits the hardware they actually have. We can't do something like that with RISC-V, because there are many implementations out there, all being developed in parallel to fit the same specification, so we want some kind of means to make sure that all these processors actually agree about what the ISA specification is.

So, what's formal verification? Formal verification is a super-broad term. In the context of this talk, I'm talking about hardware model checking. More specifically, I'm talking about checking of so-called "safety properties". We have some hardware design and we have an initial state, and we would like to know if this hardware design can reach a bad state from the initial state. That's, formally, the problem that we are trying to solve here. And there are two ways to do that, two different categories of proofs: bounded and unbounded proofs. With bounded proofs, we only prove that it's impossible to reach a bad state within a certain number of cycles.
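To make the "bad state" question concrete, here is a tiny toy example -- my own generic illustration, not something from riscv-formal or the talk's slides. The design is a rotating one-hot state machine, the bad states are all non-one-hot encodings, and the assertion is the safety property the model checker tries to violate:

    // Toy illustration (not from riscv-formal): a rotating one-hot state
    // machine. The safety property says the state register is always
    // one-hot; any other encoding is a "bad state".
    module safety_example (
        input clk,
        input resetn,
        input go
    );
        reg [2:0] state = 3'b001;

        always @(posedge clk) begin
            if (!resetn)
                state <= 3'b001;
            else if (go)
                state <= {state[1:0], state[2]};  // rotate to the next one-hot state
        end

    `ifdef FORMAL
        // Safety property: a bounded proof shows this assertion cannot fail
        // within the first N cycles after reset; an unbounded proof shows it
        // can never fail at all.
        always @(posedge clk)
            assert (state == 3'b001 || state == 3'b010 || state == 3'b100);
    `endif
    endmodule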
So, we give a maximum bound for the length of a counterexample. With unbounded proofs, we prove that a bad state can actually never be reached. Unbounded proofs are, of course, better -- if you can make an unbounded proof -- but in many cases that is very hard to achieve, whereas bounded proofs are something we can do, so I'm talking about bounded proofs here for the most part.

So, what's end-to-end formal verification? Because that's also in my title. Historically, when you formally verify something like a processor, you break the processor down into many small components, then you write properties for each component and prove for each component individually that it adheres to its properties, and then you make a more abstract proof that if you put a system together from components that have these properties, then the system will have the properties that you want. With end-to-end verification, we treat the processor as one huge black box and just ask the question: "Does this one huge thing fit our specification? Does it have the properties that we want?" That has a couple of advantages. It's much, much easier this way to take one specification and port it from one processor to another, because we don't care about how the processor is built internally. And it's much easier to take the specification that we have and match it against other specifications of the ISA, because we have a specification that says what overall behavior we expect from our processor. But the big disadvantage, of course, is that it's computationally much more expensive to do end-to-end formal verification. Doing this end-to-end verification of a processor against an ISA specification is something that historically was always viewed as the textbook example of things you can't do with formal methods, but fortunately the solvers have become much better in the last couple of years, and now, if we use the right tricks, we can do stuff like that with the solvers we have nowadays.

So, that's riscv-formal. riscv-formal is a framework that allows us to do end-to-end formal verification of RISC-V processors against a formal version of the ISA specification. riscv-formal is not a formally verified processor. Instead, if you happen to have a RISC-V processor, you can use riscv-formal to prove that your processor conforms to the ISA specification. For the most part, this is using bounded methods.
Theoretically, you could do unbounded proofs with riscv-formal, but that's not the main use case. So, it's good for what we call "bug hunting", because maybe there is a counterexample that would show that the processor can diverge from the desired behavior within 1000 or 5000 cycles, but usually, when you have something like a processor and you can't reach a bad state within these fairly short bounds, you have high confidence that your processor actually implements the ISA correctly.

So, if you have a processor and you would like to integrate it with riscv-formal, you need to do two things: you need to add a special trace port to your processor, called the RVFI trace port -- the RISC-V Formal Interface trace port -- and you have to configure riscv-formal so that riscv-formal understands the attributes of your processor. For example, RISC-V is available in a 32-bit and a 64-bit version, so you have to tell riscv-formal whether you want to verify a 32-bit or a 64-bit processor. RISC-V is a modular ISA, so there are a couple of extensions, and you have to tell riscv-formal which extensions your processor actually implements.

And then there are a couple of other things that are transparent to a userland process, like whether unaligned loads and stores are supported by the hardware natively. The RISC-V spec only says that when you do an unaligned load or store, a userspace program can expect this load or store to succeed, but it might take a long time, because there might be a machine-mode interrupt handler that emulates the unaligned load or store by doing aligned loads and stores. But if we do this formal verification of the processor, the riscv-formal framework must be aware: what is the expected behavior for your core? Should it trap when it sees an unaligned load or store, or should it just perform the load or store unaligned?

So, what does this interface look like that you need to implement in your processor if you would like to use riscv-formal? This is the current version of the riscv-formal interface. Right now, there is no support for floating-point instructions and there is no support for CSRs, but this is on the to-do list, so the interface will grow larger and larger as we add these additional features -- but all these additional features will be optional.
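As a side note on that configuration step: in the riscv-formal repository, much of this per-core setup is expressed as Verilog defines that the core wrapper and the checker modules read. The macro names below are written from memory and are meant purely as an illustration of the idea -- treat them as hypothetical and check the repository for the authoritative names and semantics.

    // Illustrative per-core configuration sketch (hypothetical macro names --
    // verify against the actual riscv-formal repository before relying on them).
    `define RISCV_FORMAL              // build the core with its RVFI trace port enabled
    `define RISCV_FORMAL_NRET 1       // number of instructions retired per cycle (channels)
    `define RISCV_FORMAL_XLEN 32      // 32-bit (RV32) vs. 64-bit (RV64) registers
    `define RISCV_FORMAL_ILEN 32      // width of the instruction word carried on the RVFI
    `define RISCV_FORMAL_COMPRESSED   // the core implements the "C" (compressed) extension
    `define RISCV_FORMAL_ALIGNED_MEM  // how the core handles unaligned loads/stores
                                      // (trap vs. native support -- see the repo docs)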
One of the reasons these parts are optional is that some people might implement just small microcontrollers that don't have floating-point cores, or that don't have support for the privileged specification, so they don't have CSRs.

Through this interface, whenever the core retires an instruction, it documents which instruction it retired. So it tells us: "This is the instruction that I retired; this was the program counter where I found the instruction; this is the program counter for the next instruction; these are the registers that I read and these are the values that I observed in the register file; this is the register that I wrote and this is the value that I wrote to the register file" -- all that stuff. So, in short, what we document through the riscv-formal interface is the part of the processor state that is observed by an instruction and the change to the processor state that is performed by an instruction -- like changes to the register file or changes to the program counter.

And of course, most processors are actually superscalar, even those processors that claim to be non-superscalar. In-order pipelines can usually do things like retire memory load instructions out of order, in parallel with another instruction that does not write a register, things like that. So even with processors we usually don't think of as superscalar, we need the capability to retire more than one instruction each cycle, and this can be done with the "NRET" parameter. All the ports simply become five times wider if NRET is 5.

Okay, so now we have a processor that implements this interface. What is the verification strategy that riscv-formal follows in order to formally verify that our processor is actually correct? There is not one big proof that we run. Instead, there is a large number of very small proofs. This is the most important trick when it comes to this, and there are two categories of proofs. One category is what I call the "instruction checks". We have one of those proofs for each instruction in the ISA specification and each of the channels in the riscv-formal interface. So, this is easily a couple of hundred proofs right there, because you easily have 100 instructions, and if you have two channels, you already have 200 proofs that you have to run.
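Before going into what those checks do, here is a sketch of what the trace port described above boils down to, written out for a 32-bit core with NRET = 1. The signal names follow the RVFI documentation in the riscv-formal repository as I remember it, so treat the exact names and widths as illustrative rather than authoritative; for NRET greater than 1, each signal is simply repeated per retirement channel, i.e. becomes NRET times wider.

    // Sketch of a core's RVFI trace port (XLEN = 32, NRET = 1).
    // These are outputs only -- the core merely reports what it retired.
    module cpu_with_rvfi (
        input         clk,
        input         resetn,
        // ... the core's normal memory interface, interrupt lines, etc. ...

        output        rvfi_valid,      // pulses when an instruction retires
        output [63:0] rvfi_order,      // running index of retired instructions
        output [31:0] rvfi_insn,       // the retired instruction word
        output        rvfi_trap,       // instruction caused a trap
        output        rvfi_halt,       // instruction halted the core
        output        rvfi_intr,       // first instruction of a trap/interrupt handler
        output [ 4:0] rvfi_rs1_addr,   // source registers that were read ...
        output [ 4:0] rvfi_rs2_addr,
        output [31:0] rvfi_rs1_rdata,  // ... and the values observed in the register file
        output [31:0] rvfi_rs2_rdata,
        output [ 4:0] rvfi_rd_addr,    // destination register that was written ...
        output [31:0] rvfi_rd_wdata,   // ... and the value written to it
        output [31:0] rvfi_pc_rdata,   // program counter of this instruction
        output [31:0] rvfi_pc_wdata,   // program counter of the next instruction
        output [31:0] rvfi_mem_addr,   // memory access performed by the instruction,
        output [ 3:0] rvfi_mem_rmask,  // with per-byte read/write masks ...
        output [ 3:0] rvfi_mem_wmask,
        output [31:0] rvfi_mem_rdata,  // ... and the data that was read/written
        output [31:0] rvfi_mem_wdata
    );
        // Body omitted: a real core drives these from its retirement stage.
    endmodule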
What these instruction checks do is reset the processor -- or start it in a symbolic state, if you would like to run an unbounded proof -- let the processor run for a certain number of cycles, and then assume that in the last cycle the processor retires a certain instruction. So, if the check checks whether the "add" instruction works correctly, it assumes that the instruction retired in the last cycle of this bounded check is an "add" instruction, and then it looks at all the signals on the riscv-formal interface to make sure that this is consistent with an "add" instruction. It checks whether the instruction has been decoded correctly; it checks whether the register value we write to the register file is actually the sum of the values we read from the register file -- all that kind of stuff.

But of course, if you just have these instruction checks, there is still a certain verification gap, because the core might lie to us. The core might say "Oh, I write this value to the register file", but then not actually write the value to the register file. So we have a separate set of proofs that do not look at the entire riscv-formal interface in one cycle, but look at only a small fraction of the riscv-formal interface, over a span of cycles. For example, there is one check that says: "If I write a register and then later read that register, I had better read back the value that I wrote to the register file." These I call "consistency checks".

So -- that's, I think, what I said already -- for each instruction, riscv-formal has an instruction model that looks like this. These are two slides. The first slide is just the interface: we have a couple of signals from the riscv-formal interface that we read, like the instruction that we are executing, the program counter where we found this instruction, and the register values we read, and then we have a couple of signals that are generated by our specification, that are outputs of this specification module -- which registers should we read? Which register should we write? What value should we write to that register? Stuff like that. So, that's the interface; it's the same for all the instructions. And then we have a body that looks more like this. For all the instructions, the body just decodes the instruction and checks whether this is actually the instruction the check is for -- so in this case, an "add immediate" instruction.
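For illustration, here is roughly what such an instruction model can look like when condensed into one small module. This is my own sketch for "add immediate", loosely modeled on the instruction models in riscv-formal; the real modules in the repository are more complete (illegal encodings, XLEN parameterization, compressed instructions, memory and trap outputs) and their exact module and port names may differ.

    // Simplified instruction model ("spec") for ADDI -- illustration only.
    // Inputs come from the RVFI; outputs describe what a correct core
    // should have done. Note: no assertions in here.
    module insn_addi_model (
        input  [31:0] rvfi_insn,       // retired instruction word
        input  [31:0] rvfi_pc_rdata,   // PC of this instruction
        input  [31:0] rvfi_rs1_rdata,  // value read from rs1

        output        spec_valid,      // high if this really is an ADDI
        output [ 4:0] spec_rs1_addr,   // register the instruction should read
        output [ 4:0] spec_rd_addr,    // register the instruction should write
        output [31:0] spec_rd_wdata,   // value it should write there
        output [31:0] spec_pc_wdata    // PC of the next instruction
    );
        // Decode the I-type fields of the instruction.
        wire [11:0] imm    = rvfi_insn[31:20];
        wire [ 4:0] rs1    = rvfi_insn[19:15];
        wire [ 2:0] funct3 = rvfi_insn[14:12];
        wire [ 4:0] rd     = rvfi_insn[11:7];
        wire [ 6:0] opcode = rvfi_insn[6:0];

        // ADDI is OP-IMM (opcode 0010011) with funct3 = 000.
        assign spec_valid    = (opcode == 7'b0010011) && (funct3 == 3'b000);
        assign spec_rs1_addr = rs1;
        assign spec_rd_addr  = rd;

        // rd = rs1 + sign-extended immediate; writes to x0 have no effect.
        assign spec_rd_wdata = (rd == 5'd0) ? 32'd0
                             : rvfi_rs1_rdata + {{20{imm[11]}}, imm};

        // The next PC is this PC plus 4 (this instruction never branches).
        assign spec_pc_wdata = rvfi_pc_rdata + 32'd4;
    endmodule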
And then you have things like the line near the bottom of the slide, above the "default assignments". "assign spec_pc_wdata", for example, says: "Okay, the next PC must be 4 bytes after the PC of this instruction; we must increment the program counter by a value of 4 when we execute this instruction." Things like that. You might notice there is no assert here -- there are no assertions -- because this is just the model of the behavior we would expect. There is a wrapper that instantiates this model, instantiates the core, and builds the proof, and that's where the assertions are. The main reason why we don't have assertions here, but instead output the desired behavior, is that this way I can also generate monitor cores that run alongside your core and check in simulation, or in emulation on an FPGA, whether your core is doing the right thing. That can be very, very helpful if you have a situation where you run your core for maybe days and then see some observable behavior that's not right, but there are maybe thousands, even millions, of cycles between the point where you can observe that something is wrong and the point where the processor actually started diverging from what the specification says. If you can use a monitor core like that, it's much easier to find bugs like this.

Okay, so some examples of those consistency checks. The list is actually not complete, and it varies a little bit from processor to processor which consistency checks we can actually run on the processor we are looking at. There is a check for the program counters: an instruction says "this is the program counter for this instruction and this is the program counter for the next instruction", and then we can look at the next instruction and see whether the program counter of that instruction actually matches the next-program-counter value of the previous instruction -- they must link together like that. But the core might retire instructions out of order, so it might be that we see the first instruction first and the second instruction later, but it's also possible that we see the second instruction first and the first instruction later, and because of that there are two different checks: one for a pair of instructions in the non-reversed order and one for a pair of instructions in the reversed order. There is one check that checks whether register value reads and writes are consistent.
There is one check that sees whether the processor is alive: if I give the processor certain fairness constraints -- for example, that the memory will always answer memory reads within a certain number of cycles, things like that -- then I can use this to prove that the processor will not just suddenly freeze. This is very important, and it also proves that the processor is not skipping instruction indices, which matters because some of the other checks actually depend on the processor behaving in this way. And so forth. So, there are a couple of these consistency checks, and it's a nice exercise to sit down in a group of people, go through the list of consistency checks, and see which set of them is actually meaningful, or which set of them still leaves an interesting verification gap so that we need to add checks for this or that processor.

Okay, so what kind of bugs can it find? That's a super-hard question, because it's really hard to give a complete list. It can definitely find incorrect single-threaded instruction semantics: if you just implement an instruction incorrectly in your core, this will find it, no question about it. It can find a lot of bugs in things like bypassing and forwarding and pipeline interlocks -- things where you reorder stuff in a way you shouldn't reorder it -- freezes, if you have this liveness check, and some bugs related to memory interfaces and load/store consistency, although whether that is a feasible proof depends on things like the size of your cache lines. Bugs that we can't find yet with riscv-formal are things that are not yet covered by the riscv-formal interface, like the floating-point stuff or CSRs -- but this is all on the to-do list, it is being actively worked on, and a year from now this stuff will be included -- and anything related to concurrency between multiple harts. So far, my excuse for that was that the RISC-V memory model is not completely specified yet, so I would not actually know what to check exactly, but right now the RISC-V memory model is in the process of being finalized, so I won't have this excuse for much, much longer.

So, the processors currently supported are PicoRV32, which is my own processor, then RISC-V Rocket, which is probably the most famous RISC-V implementation, and VexRiscv. There are also a couple of others, but they are not part of the open-source release of riscv-formal.
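Coming back to the register read/write consistency check mentioned above for a moment, here is a rough sketch of the idea behind such a check -- again my own illustration, not the actual riscv-formal code. The check picks an arbitrary register, remembers the last value the core claims to have written to it over the RVFI, and asserts that any later read of that register observes exactly that value.

    // Rough sketch of a register read/write consistency observer
    // (illustration only; the real checks also handle out-of-order
    // retirement and the rs2 read port analogously).
    module reg_consistency_sketch (
        input         clk,
        input         resetn,
        input         rvfi_valid,
        input  [ 4:0] rvfi_rs1_addr,
        input  [31:0] rvfi_rs1_rdata,
        input  [ 4:0] rvfi_rd_addr,
        input  [31:0] rvfi_rd_wdata
    );
        // Arbitrary but fixed register index, chosen freely by the solver
        // ($anyconst is a Yosys formal extension).
        wire [4:0] watched = $anyconst;

        reg        seen_write = 0;
        reg [31:0] last_value;

        always @(posedge clk) begin
            if (!resetn) begin
                seen_write <= 0;
            end else if (rvfi_valid) begin
                // If the watched register is read after we saw a write to it,
                // the observed value must be the value that was written
                // (x0 is skipped here, since it is hardwired to zero).
                if (seen_write && rvfi_rs1_addr == watched && watched != 0)
                    assert (rvfi_rs1_rdata == last_value);

                // Record writes to the watched register.
                if (rvfi_rd_addr == watched && watched != 0) begin
                    seen_write <= 1;
                    last_value <= rvfi_rd_wdata;
                end
            end
        end
    endmodule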
So, if you would like to add support for your RISC-V processor to riscv-formal, just check out the riscv-formal repository, look at the "cores" directory, see which of the supported cores is closest to the code that you actually have, and then just copy that directory and make a couple of small modifications.

So, I have a few minutes left to talk about things like cut-points and blackboxes and other abstractions. The title of this slide could just be "abstractions", because cut-points and blackboxes are just abstractions. The idea behind an abstraction in formal methods is that I swap out part of my design for a different circuit that is less constrained: it includes the behavior of the original circuit, but it might do other stuff as well. The textbook example would be: I have a design with a counter, and usually the counter would just increment in steps of 1, but now I create an abstraction that can skip numbers and will just increment in strictly increasing steps. This, of course, includes the behavior of the original design, so if I can prove a property with this abstraction in place instead of the increment-by-1 counter, then I have proven an even stronger property, and that includes the same property for the original design.

And this idea of abstractions works very well with riscv-formal. The main reason why we do abstractions is that they lead to easier proofs. For example, consider an instruction checker that just checks whether the core implements the "add" instruction correctly. For this checker, we don't actually need a working register file. We could replace the register file with something that just ignores all writes, and whenever we read something from it, it returns an arbitrary value. That would still include the behavior of a core with a functional register file, but because the instruction checker does not care about consistency between register file writes and register file reads, we can still prove that the instruction is implemented correctly, and therefore we get an easier proof.
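A minimal sketch of what such an abstracted register file could look like -- again my own illustration, with the arbitrary read values modeled as unconstrained symbolic signals using the $anyseq idiom from the Yosys formal flow. A real abstraction would be wired into the core in place of the original register file:

    // Abstract register file: ignores writes, returns arbitrary values on
    // reads. This over-approximates any real register file, so properties
    // proven with it in place also hold for the original design.
    module regfile_abstract (
        input         clk,
        input         we,
        input  [ 4:0] waddr,
        input  [31:0] wdata,
        input  [ 4:0] raddr1,
        input  [ 4:0] raddr2,
        output [31:0] rdata1,
        output [31:0] rdata2
    );
        // Unconstrained symbolic values, free to change every cycle
        // ($anyseq is a Yosys formal extension).
        wire [31:0] arb1 = $anyseq;
        wire [31:0] arb2 = $anyseq;

        // x0 is kept hardwired to zero; all other reads are arbitrary.
        assign rdata1 = (raddr1 == 5'd0) ? 32'd0 : arb1;
        assign rdata2 = (raddr2 == 5'd0) ? 32'd0 : arb2;

        // Writes are simply dropped -- the checker that uses this
        // abstraction does not rely on read/write consistency.
    endmodule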
Of course, we can't use this abstraction for all the proofs, because there are other proofs that actually check whether the register file works as I would expect it to work. But if you go through the list of proofs, all of which we run independently, you will see that for each of them it's possible to abstract away a large portion of your processor and therefore get an easier proof.

Depending on what kind of solver you use, some solvers are actually very capable of finding these kinds of abstractions themselves. In those cases, it doesn't really help to add the abstractions manually, but just realizing that the potential for these abstractions is there is very useful when guiding your decisions on how to split up a large verification problem into smaller verification problems, because you would like to split up the problem in such a way that the solver is always capable of finding useful abstractions that actually lead to easier circuits to prove.

With a bounded check, we also have the question of what bound to use. Of course, larger bounds are better, but larger bounds also yield something that is harder to compute, and if you use a small bound, then you have a proof that runs very, very quickly, but maybe you're not very confident that it has actually proven something that's relevant for you. So, I propose two solutions for this. The first solution is that you can use the same solvers to find traces that cover certain events: you could write a list and say "I would like to see one memory read and one memory write and at least one ALU instruction executed", and things like that, and then you can ask the solver "What is the shortest trace that actually satisfies all this stuff?" And when that's a trace of, say, 25 cycles, then you know: "Okay, when I look at a proof that's 25 cycles deep, I know at least these are the cases that are going to be covered." But more important, I think, is this: usually, when you have a processor, you have already found bugs, and it's a good idea to not just fix the bugs and forget about them, but to preserve some way of re-introducing the bugs, just to see if your testing framework works.
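On the first of those two points -- using the solver to find covering traces -- the same formal tools that prove assertions can also search for the shortest trace reaching a cover goal. A minimal sketch of the idea (my own illustration; the event flags are hypothetical signals you would derive from the core's RVFI outputs):

    // Sketch of a cover goal: ask the solver for the shortest trace in
    // which a memory read, a memory write and an ALU instruction have all
    // been observed at least once. The *_seen inputs are hypothetical.
    module cover_sketch (
        input clk,
        input resetn,
        input mem_read_seen,
        input mem_write_seen,
        input alu_insn_seen
    );
        reg r_read = 0, r_write = 0, r_alu = 0;

        always @(posedge clk) begin
            if (!resetn) begin
                r_read <= 0; r_write <= 0; r_alu <= 0;
            end else begin
                if (mem_read_seen)  r_read  <= 1;   // remember that each event
                if (mem_write_seen) r_write <= 1;   // has happened at least once
                if (alu_insn_seen)  r_alu   <= 1;
            end
        end

    `ifdef FORMAL
        // The solver searches for the shortest trace that reaches this
        // point; its length tells you what a bounded proof of that depth
        // will at least have exercised.
        always @(posedge clk)
            cover (r_read && r_write && r_alu);
    `endif
    endmodule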
So, if you already have a couple of bugs and you know "it took me a week to find that one, and it took me a month to find that one", the best thing is to just add those bugs to your design again and see what bounds are necessary for riscv-formal to actually discover them. Then you have some degree of confidence that other, similar bugs would also have been found with the same bounds.

So, results: I have found bugs in pretty much every implementation I looked at; I found bugs in all three processors; we found bugs in Spike, which is the official implementation of RISC-V in C -- I found a way to formally verify my specification against Spike, and in some cases where I found a difference between my specification and Spike, it turned out to actually be a bug in the English-language specification. So, because of that, I have also found bugs in the English-language specification with riscv-formal.

Future work: multipliers are already supported, floating point is still on the to-do list, 64-bit is half done. We would like to add support for CSRs. We would like to add support for more cores, but that is something I would like to do slowly, because adding more cores also means we have less flexibility in changing things. And better integration with non-free tools: right now all of this runs with open-source tools that I also happen to have written -- so, I wrote those tools, but some people actually don't want to use my open-source tools; they would like to use commercial tools -- and it's on my to-do list to have better integration with those tools. Maybe -- I don't get licenses for those tools, so we will see how that works.

That's it. Do we still have time for questions?

*applause*

Clifford: So, I'd say we start with questions at 1.

Herald: I'm sorry; here we go. We have two questions; we have time for two questions, and we're going to start with microphone number 1, please.

M1: Hello, thanks for your talk and for your work. First question: you talked about the riscv-formal interface.

Clifford: Yes.

M1: So, does the vendor ship the final processor with this interface available?

Clifford: Oh, that's a great question, thank you. This interface has only output ports, and actually, when you implement this interface, you should not add anything to your design that's not needed to generate those output ports. So, what you can do is take the version of your core with that interface and the version of the core without that interface, then, in your synthesis script, just remove those output ports, and then run a formal equivalence check between that version and the version that you actually deploy on your ASIC.

M1: Thanks; and one short question: when people say formal verification, others usually think "Oh, if it is verified, it is excellent, absolutely excellent." Do you plan to say that it will find all the errata for the processor?

Clifford: Well, it depends on what kind of proof you run. Most of the work I do is with bounded proofs, and there you only get a certain degree of confidence, because you only see bugs that can occur within a certain number of cycles from reset. But if you want, you can also run a complete proof, where you start with a symbolic state instead of a reset state, and then you can make sure that you actually check the entire reachable state space -- but that's a very, very hard thing to do, so that's not a weekend project. Just adding the riscv-formal interface and running some bounded proofs is probably a weekend project if you already have your RISC-V processor.

M1: Thank you.

Herald: Thank you. We actually do not have time for any more questions, but there will be time after the talk to ask your questions... maybe?

Clifford: Yeah. So, maybe you can find me at the open FPGA assembly, which is part of the hardware hacking area.

Herald: Super. A very great job to put that much information into 30 minutes. Please help me thank Cliff for his wonderful talk.

*applause*

*34c3 outro*

subtitles created by c3subtitles.de in the year 2018. Join, and help us!