1 00:00:00,000 --> 00:00:09,550 *34c3 preroll music* 2 00:00:15,565 --> 00:00:18,230 Herald: ...and I will let Katherine take the stage now. 3 00:00:18,589 --> 00:00:21,430 Katharine Jarmul, kjam: Awesome! Well, thank you so much for the introduction and 4 00:00:21,430 --> 00:00:25,310 thank you so much for being here, taking your time. I know that Congress is really 5 00:00:25,310 --> 00:00:29,800 exciting, so I really appreciate you spending some time with me today. It's my 6 00:00:29,800 --> 00:00:34,470 first ever Congress, so I'm also really excited and I want to meet new people. So 7 00:00:34,470 --> 00:00:39,930 if you wanna come say hi to me later, I'm somewhat friendly, so we can maybe be 8 00:00:39,930 --> 00:00:44,680 friends later. Today what we're going to talk about is deep learning blind spots or 9 00:00:44,680 --> 00:00:49,890 how to fool "artificial intelligence". I like to put "artificial intelligence" in 10 00:00:49,890 --> 00:00:55,270 quotes, because.. yeah, we'll talk about that, but I think it should be in quotes. 11 00:00:55,270 --> 00:00:59,570 And today we're going to talk a little bit about deep learning, how it works and how 12 00:00:59,570 --> 00:01:07,640 you can maybe fool it. So I ask us: Is AI becoming more intelligent? 13 00:01:07,640 --> 00:01:11,078 And I ask this because when I open a browser and, of course, often it's Chrome 14 00:01:11,078 --> 00:01:16,979 and Google is already prompting me for what I should look at 15 00:01:16,979 --> 00:01:20,260 and it knows that I work with machine learning, right? 16 00:01:20,260 --> 00:01:23,830 And these are the headlines that I see every day: 17 00:01:23,830 --> 00:01:29,399 "Are Computers Already Smarter Than Humans?" 18 00:01:29,399 --> 00:01:32,289 If so, I think we could just pack up and go home, right? 19 00:01:32,289 --> 00:01:36,140 Like, we fixed computers, right? If a computer is smarter than me, 20 00:01:36,140 --> 00:01:39,780 then I already fixed it, we can go home, there's no need to talk about computers 21 00:01:39,780 --> 00:01:47,750 anymore, let's just move on with life. But that's not true, right? We know, because 22 00:01:47,750 --> 00:01:51,010 we work with computers and we know how stupid computers are sometimes. They're 23 00:01:51,010 --> 00:01:55,890 pretty bad. Computers do only what we tell them to do, generally, so I don't think a 24 00:01:55,890 --> 00:02:01,090 computer can think and be smarter than me. So with the same types of headlines that 25 00:02:01,090 --> 00:02:11,690 you see this, then you also see this: And yeah, so Apple recently released their 26 00:02:11,690 --> 00:02:17,500 face ID and this unlocks your phone with your face and it seems like a great idea, 27 00:02:17,500 --> 00:02:22,451 right? You have a unique face, you have a face, nobody else can take your face. But 28 00:02:22,451 --> 00:02:28,300 unfortunately what we find out about computers is that they're awful sometimes, 29 00:02:28,300 --> 00:02:32,480 and for these women.. for this Chinese woman that owned an iPhone, 30 00:02:32,480 --> 00:02:35,960 her coworker was able to unlock her phone. 31 00:02:35,964 --> 00:02:39,320 And I think Hendrick and Karin talked about, if you were here for the 32 00:02:39,320 --> 00:02:41,590 last talk ("Beeinflussung durch künstliche Intelligenz"). 
We have a lot of problems 33 00:02:41,590 --> 00:02:46,379 in machine learning and one of them is stereotypes and prejudice that are within 34 00:02:46,379 --> 00:02:52,340 our training data or within our minds that leak into our models. And perhaps they 35 00:02:52,340 --> 00:02:57,739 didn't do adequate training data on determining different features of Chinese 36 00:02:57,739 --> 00:03:03,160 folks. And perhaps it's other problems with their model or their training data or 37 00:03:03,160 --> 00:03:07,500 whatever they're trying to do. But they clearly have some issues, right? So when 38 00:03:07,500 --> 00:03:12,050 somebody asked me: "Is AI gonna take over the world and is there a super robot 39 00:03:12,050 --> 00:03:17,300 that's gonna come and be my new, you know, leader or so to speak?" I tell them we 40 00:03:17,300 --> 00:03:21,710 can't even figure out the stuff that we already have in production. So if we can't 41 00:03:21,710 --> 00:03:25,690 even figure out the stuff we already have in production, I'm a little bit less 42 00:03:25,690 --> 00:03:33,209 worried of the super robot coming to kill me. That said, unfortunately the powers 43 00:03:33,209 --> 00:03:38,190 that be, the powers that be a lot of times they believe in this and they believe 44 00:03:38,190 --> 00:03:44,540 strongly in "artificial intelligence" and machine learning. They're collecting data 45 00:03:44,540 --> 00:03:50,800 every day about you and me and everyone else. And they're gonna use this data to 46 00:03:50,800 --> 00:03:56,349 build even better models. This is because the revolution that we're seeing now in 47 00:03:56,349 --> 00:04:02,080 machine learning has really not much to do with new algorithms or architectures. It 48 00:04:02,080 --> 00:04:09,630 has a lot more to do with heavy compute and with massive, massive data sets. And 49 00:04:09,630 --> 00:04:15,740 the more that we have training data of petabytes per 24 hours or even less, the 50 00:04:15,740 --> 00:04:22,690 more we're able to essentially fix up the parts that don't work so well. The 51 00:04:22,690 --> 00:04:25,979 companies that we see here are companies that are investing heavily in machine 52 00:04:25,979 --> 00:04:30,979 learning and AI. Part of how they're investing heavily is, they're collecting 53 00:04:30,979 --> 00:04:37,999 more and more data about you and me and everyone else. Google and Facebook, more 54 00:04:37,999 --> 00:04:42,789 than 1 billion active users. I was surprised to know that in Germany the 55 00:04:42,789 --> 00:04:48,159 desktop search traffic for Google is higher than most of the rest of the world. 56 00:04:48,159 --> 00:04:53,259 And for Baidu they're growing with the speed that broadband is available. And so, 57 00:04:53,259 --> 00:04:56,970 what we see is, these people are collecting this data and they also are 58 00:04:56,970 --> 00:05:02,779 using new technologies like GPUs and TPUs in new ways to parallelize workflows 59 00:05:02,779 --> 00:05:09,449 and with this they're able to mess up less, right? They're still messing up, but 60 00:05:09,449 --> 00:05:14,960 they mess up slightly less. And they're not going to get uninterested in this 61 00:05:14,960 --> 00:05:20,550 topic, so we need to kind of start to prepare how we respond to this type of 62 00:05:20,550 --> 00:05:25,860 behavior. 
One of the things that has been a big area of research, actually also for 63 00:05:25,860 --> 00:05:30,080 a lot of these companies, is what we'll talk about today and that's adversarial 64 00:05:30,080 --> 00:05:36,800 machine learning. But the first thing that we'll start with is what is behind what we 65 00:05:36,800 --> 00:05:44,009 call AI. So most of the time when you think of AI or something like Siri and so 66 00:05:44,009 --> 00:05:48,979 forth, you are actually potentially talking about an old-school rule-based 67 00:05:48,979 --> 00:05:53,930 system. This is a rule, like you say a particular thing and then Siri is like: 68 00:05:53,930 --> 00:05:58,129 "Yes, I know how to respond to this". And we even hard program these types of things 69 00:05:58,129 --> 00:06:02,880 in, right? That is one version of AI, is essentially: It's been pre-programmed to 70 00:06:02,880 --> 00:06:08,839 do and understand certain things. Another form that usually, for example for the 71 00:06:08,839 --> 00:06:12,619 people that are trying to build AI robots and the people that are trying to build 72 00:06:12,619 --> 00:06:17,110 what we call "general AI", so this is something that can maybe learn like a 73 00:06:17,110 --> 00:06:20,190 human, they'll use reinforcement learning. 74 00:06:20,190 --> 00:06:22,200 I don't specialize in reinforcement learning. 75 00:06:22,200 --> 00:06:26,401 But what it does is it essentially tries to reward you for 76 00:06:26,401 --> 00:06:32,429 behaviour that you're expected to do. So if you complete a task, you get a a 77 00:06:32,429 --> 00:06:36,099 cookie. You complete two other tasks, you get two or three more cookies depending on 78 00:06:36,099 --> 00:06:41,759 how important the task is. And this will help you learn how to behave to get more 79 00:06:41,759 --> 00:06:45,990 points and it's used a lot in robots and gaming and so forth. And I'm not really 80 00:06:45,990 --> 00:06:49,340 going to talk about that today because most of that is still not really something 81 00:06:49,340 --> 00:06:54,880 that you or I interact with. Well, what I am gonna talk about today is neural 82 00:06:54,880 --> 00:06:59,680 networks, or as some people like to call them "deep learning", right? So deep 83 00:06:59,680 --> 00:07:04,119 learning 1: The neural network versus deep learning battle awhile ago. So here's an 84 00:07:04,119 --> 00:07:09,949 example neural network: we have an input layer and that's where we essentially make 85 00:07:09,949 --> 00:07:14,550 a quantitative version of whatever our data is. So we need to make it into 86 00:07:14,550 --> 00:07:19,890 numbers. Then we have a hidden layer and we might have multiple hidden layers. And 87 00:07:19,890 --> 00:07:23,759 depending on how deep our network is, or a network inside a network, right, which is 88 00:07:23,759 --> 00:07:28,179 possible. We might have very much different layers there and they may even 89 00:07:28,179 --> 00:07:33,539 act in cyclical ways. And then that's where all the weights and the variables 90 00:07:33,539 --> 00:07:39,259 and the learning happens. So that has.. holds a lot of information and data that 91 00:07:39,259 --> 00:07:43,979 we eventually want to train there. And finally we have an output layer. 
And 92 00:07:43,979 --> 00:07:47,529 depending on the network and what we're trying to do the output layer can vary 93 00:07:47,529 --> 00:07:51,539 between something that looks like the input, like for example if we want to 94 00:07:51,539 --> 00:07:55,719 machine translate, then I want the output to look like the input, right, I want it 95 00:07:55,719 --> 00:07:59,909 to just be in a different language, or the output could be a different class. It can 96 00:07:59,909 --> 00:08:05,749 be, you know, this is a car or this is a train and so forth. So it really depends 97 00:08:05,749 --> 00:08:10,610 what you're trying to solve, but the output layer gives us the answer. And how 98 00:08:10,610 --> 00:08:17,159 we train this is, we use backpropagation. Backpropagation is nothing new and neither 99 00:08:17,159 --> 00:08:21,139 is one of the most popular methods to do so, which is called stochastic gradient 100 00:08:21,139 --> 00:08:26,459 descent. What we do when we go through that part of the training, is we go from 101 00:08:26,459 --> 00:08:29,759 the output layer and we go backwards through the network. That's why it's 102 00:08:29,759 --> 00:08:34,828 called backpropagation, right? And as we go backwards through the network, in the 103 00:08:34,828 --> 00:08:39,139 most simple way, we upvote and downvote what's working and what's not working. So 104 00:08:39,139 --> 00:08:42,729 we say: "oh you got it right, you get a little bit more importance", or "you got 105 00:08:42,729 --> 00:08:46,040 it wrong, you get a little bit less importance". And eventually we hope 106 00:08:46,040 --> 00:08:50,481 over time, that they essentially correct each other's errors enough that we get a 107 00:08:50,481 --> 00:08:57,550 right answer. So that's a very general overview of how it works and the cool 108 00:08:57,550 --> 00:09:02,720 thing is: Because it works that way, we can fool it. And people have been 109 00:09:02,720 --> 00:09:08,269 researching ways to fool it for quite some time. So I give you a brief overview of 110 00:09:08,269 --> 00:09:13,290 the history of this field, so we can kind of know where we're working from and maybe 111 00:09:13,290 --> 00:09:19,220 hopefully then where we're going to. In 2005 was one of the first most important 112 00:09:19,220 --> 00:09:24,740 papers to approach adversarial learning and it was written by a series of 113 00:09:24,740 --> 00:09:29,630 researchers and they wanted to see, if they could act as an informed attacker and 114 00:09:29,630 --> 00:09:34,440 attack a linear classifier. So this is just a spam filter and they're like can I 115 00:09:34,440 --> 00:09:37,850 send spam to my friend? I don't know why they would want to do this, but: "Can I 116 00:09:37,850 --> 00:09:43,209 send spam to my friend, if I tried testing out a few ideas?" And what they were able 117 00:09:43,209 --> 00:09:47,639 to show is: Yes, rather than just, you know, trial and error which anybody can do 118 00:09:47,639 --> 00:09:52,120 or a brute force attack of just like send a thousand emails and see what happens, 119 00:09:52,120 --> 00:09:56,370 they were able to craft a few algorithms that they could use to try and find 120 00:09:56,370 --> 00:10:03,240 important words to change, to make it go through the spam filter. In 2007 NIPS, 121 00:10:03,240 --> 00:10:08,019 which is a very popular machine learning conference, had one of their first all-day 122 00:10:08,019 --> 00:10:12,930 workshops on computer security. 
And when they did so, they had a bunch of different 123 00:10:12,930 --> 00:10:16,780 people that were working on machine learning in computer security: From 124 00:10:16,780 --> 00:10:21,430 malware detection, to network intrusion detection, to of course spam. And they 125 00:10:21,430 --> 00:10:25,190 also had a few talks on this type of adversarial learning. So how do you act as 126 00:10:25,190 --> 00:10:29,980 an adversary to your own model? And then how do you learn how to counter that 127 00:10:29,980 --> 00:10:35,650 adversary? In 2013 there was a really great paper that got a lot of people's 128 00:10:35,650 --> 00:10:40,001 attention called "Poisoning Attacks against Support Vector Machines". Now 129 00:10:40,001 --> 00:10:45,290 support vector machines are essentially usually a linear classifier and we use 130 00:10:45,290 --> 00:10:50,121 them a lot to say, "this is a member of this class, that, or another", when we 131 00:10:50,121 --> 00:10:54,940 pertain to text. So I have a text and I want to know what the text is about or I 132 00:10:54,940 --> 00:10:58,610 want to know if it's a positive or negative sentiment, a lot of times I'll 133 00:10:58,610 --> 00:11:05,160 use a support vector machine. We call them SVM's as well. Battista Biggio was the 134 00:11:05,160 --> 00:11:08,319 main researcher and he has actually written quite a lot about these poisoning 135 00:11:08,319 --> 00:11:15,569 attacks and he poisoned the training data. So for a lot of these systems, sometimes 136 00:11:15,569 --> 00:11:20,820 they have active learning. This means, you or I, when we classify our emails as spam, 137 00:11:20,820 --> 00:11:26,290 we're helping train the network. So he poisoned the training data and was able to 138 00:11:26,290 --> 00:11:32,360 show that by poisoning it in a particular way, that he was able to then send spam 139 00:11:32,360 --> 00:11:37,810 email because he knew what words were then benign, essentially. He went on to study a 140 00:11:37,810 --> 00:11:43,220 few other things about biometric data if you're interested in biometrics. But then 141 00:11:43,220 --> 00:11:49,329 in 2014 Christian Szegedy, Ian Goodfellow, and a few other main researchers at Google 142 00:11:49,329 --> 00:11:55,350 Brain released "Intriguing Properties of Neural Networks." That really became the 143 00:11:55,350 --> 00:12:00,040 explosion of what we're seeing today in adversarial learning. And what they were 144 00:12:00,040 --> 00:12:04,629 able to do, is they were able to say "We believe there's linear properties of these 145 00:12:04,629 --> 00:12:08,790 neural networks, even if they're not necessarily linear networks. 146 00:12:08,790 --> 00:12:15,560 And we believe we can exploit them to fool them". And they first introduced then the 147 00:12:15,560 --> 00:12:23,189 fast gradient sign method, which we'll talk about later today. So how does it 148 00:12:23,189 --> 00:12:28,830 work? First I want us to get a little bit of an intuition around how this works. 149 00:12:28,830 --> 00:12:35,310 Here's a graphic of gradient descent. And in gradient descent we have this vertical 150 00:12:35,310 --> 00:12:40,339 axis is our cost function. And what we're trying to do is: We're trying to minimize 151 00:12:40,339 --> 00:12:47,400 cost, we want to minimize the error. And so when we start out, we just chose random 152 00:12:47,400 --> 00:12:51,790 weights and variables, so all of our hidden layers, they just have maybe random 153 00:12:51,790 --> 00:12:57,339 weights or random distribution. 
And then we want to get to a place where the 154 00:12:57,339 --> 00:13:01,740 weights have meaning, right? We want our network to know something, even if it's 155 00:13:01,740 --> 00:13:08,740 just a mathematical pattern, right? So we start in the high area of the graph, or 156 00:13:08,740 --> 00:13:13,819 the reddish area, and that's where we started, we have high error there. And 157 00:13:13,819 --> 00:13:21,209 then we try to get to the lowest area of the graph, or here the dark blue that is 158 00:13:21,209 --> 00:13:26,889 right about here. But sometimes what happens: As we learn, as we go through 159 00:13:26,889 --> 00:13:33,300 epochs and training, we're moving slowly down and hopefully we're optimizing. But 160 00:13:33,300 --> 00:13:37,370 what we might end up in instead of this global minimum, we might end up in the 161 00:13:37,370 --> 00:13:43,800 local minimum which is the other trail. And that's fine, because it's still zero 162 00:13:43,800 --> 00:13:49,889 error, right? So we're still probably going to be able to succeed, but we might 163 00:13:49,889 --> 00:13:56,139 not get the best answer all the time. What adversarial tries to do in the most basic 164 00:13:56,139 --> 00:14:01,980 of ways, it essentially tries to push the error rate back up the hill for as many 165 00:14:01,980 --> 00:14:07,709 units as it can. So it essentially tries to increase the error slowly through 166 00:14:07,709 --> 00:14:14,600 perturbations. And by disrupting, let's say, the weakest links like the one that 167 00:14:14,600 --> 00:14:19,060 did not find the global minimum but instead found a local minimum, we can 168 00:14:19,060 --> 00:14:23,069 hopefully fool the network, because we're finding those weak spots and we're 169 00:14:23,069 --> 00:14:25,629 capitalizing on them, essentially. 170 00:14:31,252 --> 00:14:34,140 So what does an adversarial example actually look like? 171 00:14:34,140 --> 00:14:37,430 You may have already seen this because it's very popular on the 172 00:14:37,430 --> 00:14:45,221 Twittersphere and a few other places, but this was a series of researches at MIT. It 173 00:14:45,221 --> 00:14:51,059 was debated whether you could do adverse.. adversarial learning in the real world. A 174 00:14:51,059 --> 00:14:57,339 lot of the research has just been a still image. And what they were able to show: 175 00:14:57,339 --> 00:15:03,079 They created a 3D-printed turtle. I mean it looks like a turtle to you as well, 176 00:15:03,079 --> 00:15:09,910 correct? And this 3D-printed turtle by the Inception Network, which is a very popular 177 00:15:09,910 --> 00:15:16,790 computer vision network, is a rifle and it is a rifle in every angle that you can 178 00:15:16,790 --> 00:15:21,959 see. And the way they were able to do this and, I don't know the next time it goes 179 00:15:21,959 --> 00:15:25,910 around you can see perhaps, and it's a little bit easier on the video which I'll 180 00:15:25,910 --> 00:15:29,790 have posted, I'll share at the end, you can see perhaps that there's a slight 181 00:15:29,790 --> 00:15:35,529 discoloration of the shell. They messed with the texture. By messing with this 182 00:15:35,529 --> 00:15:39,910 texture and the colors they were able to fool the neural network, they were able to 183 00:15:39,910 --> 00:15:45,259 activate different neurons that were not supposed to be activated. Units, I should 184 00:15:45,259 --> 00:15:51,129 say. 
So what we see here is, yeah, it can be done in the real world, and when I saw 185 00:15:51,129 --> 00:15:56,339 this I started getting really excited. Because, video surveillance is a real 186 00:15:56,339 --> 00:16:02,529 thing, right? So if we can start fooling 3D objects, we can perhaps start fooling 187 00:16:02,529 --> 00:16:08,040 other things in the real world that we would like to fool. 188 00:16:08,040 --> 00:16:12,440 *applause* 189 00:16:12,440 --> 00:16:19,149 kjam: So why do adversarial examples exist? We're going to talk a little bit 190 00:16:19,149 --> 00:16:23,879 about some things that are approximations of what's actually happening, so please 191 00:16:23,879 --> 00:16:27,610 forgive me for not being always exact, but I would rather us all have a general 192 00:16:27,610 --> 00:16:33,660 understanding of what's happening. Across the top row we have an input layer and 193 00:16:33,660 --> 00:16:39,480 these images to the left, we can see, are the source images and this source image is 194 00:16:39,480 --> 00:16:43,380 like a piece of farming equipment or something. And on the right we have our 195 00:16:43,380 --> 00:16:48,800 guide image. This is what we're trying to get the network to see we want it to 196 00:16:48,800 --> 00:16:55,070 missclassify this farm equipment as a pink bird. So what these researchers did is 197 00:16:55,070 --> 00:16:59,019 they targeted different layers of the network. And they said: "Okay, we're going 198 00:16:59,019 --> 00:17:02,410 to use this method to target this particular layer and we'll see what 199 00:17:02,410 --> 00:17:07,569 happens". And so as they targeted these different layers you can see what's 200 00:17:07,569 --> 00:17:12,109 happening on the internal visualization. Now neural networks can't see, right? 201 00:17:12,109 --> 00:17:17,939 They're looking at matrices of numbers but what we can do is we can use those 202 00:17:17,939 --> 00:17:26,559 internal values to try and see with our human eyes what they are learning. And we 203 00:17:26,559 --> 00:17:31,370 can see here clearly inside the network, we no longer see the farming equipment, 204 00:17:31,370 --> 00:17:39,550 right? We see a pink bird. And this is not visible to our human eyes. Now if you 205 00:17:39,550 --> 00:17:43,570 really study and if you enlarge the image you can start to see okay there's a little 206 00:17:43,570 --> 00:17:48,190 bit of pink here or greens, I don't know what's happening, but we can still see it 207 00:17:48,190 --> 00:17:56,510 in the neural network we have tricked. Now people don't exactly know yet why these 208 00:17:56,510 --> 00:18:03,159 blind spots exist. So it's still an area of active research exactly why we can fool 209 00:18:03,159 --> 00:18:09,429 neural networks so easily. There are some prominent researchers that believe that 210 00:18:09,429 --> 00:18:14,450 neural networks are essentially very linear and that we can use this simple 211 00:18:14,450 --> 00:18:20,840 linearity to misclassify to jump into another area. But there are others that 212 00:18:20,840 --> 00:18:24,820 believe that there's these pockets or blind spots and that we can then find 213 00:18:24,820 --> 00:18:28,500 these blind spots where these neurons really are the weakest links and they 214 00:18:28,500 --> 00:18:33,160 maybe even haven't learned anything and if we change their activation then we can 215 00:18:33,160 --> 00:18:37,580 fool the network easily. 
So this is still an area of active research and let's say 216 00:18:37,580 --> 00:18:44,320 you're looking for your thesis, this would be a pretty neat thing to work on. So 217 00:18:44,320 --> 00:18:49,399 we'll get into just a brief overview of some of the math behind the most popular 218 00:18:49,399 --> 00:18:55,571 methods. First we have the fast gradient sign method and that is was used in the 219 00:18:55,571 --> 00:18:59,950 initial paper and now there's been many iterations on it. And what we do is we 220 00:18:59,950 --> 00:19:05,120 have our same cost function, so this is the same way that we're trying to train 221 00:19:05,120 --> 00:19:13,110 our network and it's trying to learn. And we take the gradient sign of that and if 222 00:19:13,110 --> 00:19:16,330 you can think, it's okay, if you're not used to doing vector calculus, and 223 00:19:16,330 --> 00:19:20,250 especially not without a pen and paper in front of you, but what you think we're 224 00:19:20,250 --> 00:19:24,140 doing is we're essentially trying to calculate some approximation of a 225 00:19:24,140 --> 00:19:29,700 derivative of the function. And this can kind of tell us, where is it going. And if 226 00:19:29,700 --> 00:19:37,299 we know where it's going, we can maybe anticipate that and change. And then to 227 00:19:37,299 --> 00:19:41,480 create the adversarial images, we then take the original input plus a small 228 00:19:41,480 --> 00:19:48,770 number epsilon times that gradient's sign. For the Jacobian Saliency Map, this is a 229 00:19:48,770 --> 00:19:55,010 newer method and it's a little bit more effective, but it takes a little bit more 230 00:19:55,010 --> 00:20:02,250 compute. This Jacobian Saliency Map uses a Jacobian matrix and if you remember also, 231 00:20:02,250 --> 00:20:07,649 and it's okay if you don't, a Jacobian matrix will look at the full derivative of 232 00:20:07,649 --> 00:20:12,049 a function, so you take the full derivative of a cost function 233 00:20:12,049 --> 00:20:18,269 at that vector, and it gives you a matrix that is a pointwise approximation, 234 00:20:18,269 --> 00:20:22,550 if the function is differentiable at that input vector. Don't 235 00:20:22,550 --> 00:20:28,320 worry you can review this later too. But the Jacobian matrix then we use to create 236 00:20:28,320 --> 00:20:33,059 this saliency map the same way where we're essentially trying some sort of linear 237 00:20:33,059 --> 00:20:38,830 approximation, or pointwise approximation, and we then want to find two pixels that 238 00:20:38,830 --> 00:20:43,860 we can perturb that cause the most disruption. And then we continue to the 239 00:20:43,860 --> 00:20:48,970 next. Unfortunately this is currently a O(n²) problem, but there's a few people 240 00:20:48,970 --> 00:20:53,910 that are trying to essentially find ways that we can approximate this and make it 241 00:20:53,910 --> 00:21:01,320 faster. So maybe now you want to fool a network too and I hope you do, because 242 00:21:01,320 --> 00:21:06,580 that's what we're going to talk about. First you need to pick a problem or a 243 00:21:06,580 --> 00:21:13,460 network type you may already know. But you may want to investigate what perhaps is 244 00:21:13,460 --> 00:21:19,019 this company using, what perhaps is this method using and do a little bit of 245 00:21:19,019 --> 00:21:23,730 research, because that's going to help you. 
Then you want to research state-of- 246 00:21:23,730 --> 00:21:28,610 the-art methods and this is like a typical research statement that you have a new 247 00:21:28,610 --> 00:21:32,360 state-of-the-art method, but the good news is is that the state-of-the-art two to 248 00:21:32,360 --> 00:21:38,179 three years ago is most likely in production or in systems today. So once 249 00:21:38,179 --> 00:21:44,480 they find ways to speed it up, some approximation of that is deployed. And a 250 00:21:44,480 --> 00:21:48,279 lot of times these are then publicly available models, so a lot of times, if 251 00:21:48,279 --> 00:21:51,480 you're already working with the deep learning framework they'll come 252 00:21:51,480 --> 00:21:56,450 prepackaged with a few of the different popular models, so you can even use that. 253 00:21:56,450 --> 00:22:00,691 If you're already building neural networks of course you can build your own. An 254 00:22:00,691 --> 00:22:05,510 optional step, but one that might be recommended, is to fine-tune your model 255 00:22:05,510 --> 00:22:10,750 and what this means is to essentially take a new training data set, maybe data that 256 00:22:10,750 --> 00:22:15,490 you think this company is using or that you think this network is using, and 257 00:22:15,490 --> 00:22:19,300 you're going to remove the last few layers of the neural network and you're going to 258 00:22:19,300 --> 00:22:24,809 retrain it. So you essentially are nicely piggybacking on the work of the pre 259 00:22:24,809 --> 00:22:30,650 trained model and you're using the final layers to create finesse. This essentially 260 00:22:30,650 --> 00:22:37,169 makes your model better at the task that you have for it. Finally then you use a 261 00:22:37,169 --> 00:22:40,260 library, and we'll go through a few of them, but some of the ones that I have 262 00:22:40,260 --> 00:22:46,450 used myself is cleverhans, DeepFool and deep-pwning, and these all come with nice 263 00:22:46,450 --> 00:22:51,580 built-in features for you to use for let's say the fast gradient sign method, the 264 00:22:51,580 --> 00:22:56,740 Jacobian saliency map and a few other methods that are available. Finally it's 265 00:22:56,740 --> 00:23:01,550 not going to always work so depending on your source and your target, you won't 266 00:23:01,550 --> 00:23:05,840 always necessarily find a match. What researchers have shown is it's a lot 267 00:23:05,840 --> 00:23:10,950 easier to fool a network that a cat is a dog than it is to fool in networks that a 268 00:23:10,950 --> 00:23:16,030 cat is an airplane. And this is just like we can make these intuitive, so you might 269 00:23:16,030 --> 00:23:21,830 want to pick an input that's not super dissimilar from where you want to go, but 270 00:23:21,830 --> 00:23:28,260 is dissimilar enough. And you want to test it locally and then finally test the one 271 00:23:28,260 --> 00:23:38,149 for the highest misclassification rates on the target network. And you might say 272 00:23:38,149 --> 00:23:44,230 Katharine, or you can call me kjam, that's okay. 
You might say: "I don't know what 273 00:23:44,230 --> 00:23:50,049 the person is using", "I don't know what the company is using" and I will say "it's 274 00:23:50,049 --> 00:23:56,750 okay", because what's been proven: You can attack a blackbox model, you do not have 275 00:23:56,750 --> 00:24:01,950 to know what they're using, you do not have to know exactly how it works, you 276 00:24:01,950 --> 00:24:06,760 don't even have to know their training data, because what you can do is if it 277 00:24:06,760 --> 00:24:12,710 has.. okay, addendum it has to have some API you can interface with. But if it has 278 00:24:12,710 --> 00:24:18,130 an API you can interface with or even any API you can interact with, that uses the 279 00:24:18,130 --> 00:24:24,840 same type of learning, you can collect training data by querying the API. And 280 00:24:24,840 --> 00:24:28,700 then you're training your local model on that data that you're collecting. So 281 00:24:28,700 --> 00:24:32,890 you're collecting the data, you're training your local model, and as your 282 00:24:32,890 --> 00:24:37,299 local model gets more accurate and more similar to the deployed black box that you 283 00:24:37,299 --> 00:24:43,409 don't know how it works, you are then still able to fool it. And what this paper 284 00:24:43,409 --> 00:24:49,730 proved, Nicolas Papanov and a few other great researchers, is that with usually 285 00:24:49,730 --> 00:24:56,527 less than six thousand queries they were able to fool the network between 84% and 97% certainty 286 00:24:59,301 --> 00:25:03,419 And what the same group of researchers also studied is the ability 287 00:25:03,419 --> 00:25:09,241 to transfer the ability to fool one network into another network and they 288 00:25:09,241 --> 00:25:14,910 called that transfer ability. So I can take a certain type of network and I can 289 00:25:14,910 --> 00:25:19,320 use adversarial examples against this network to fool a different type of 290 00:25:19,320 --> 00:25:26,269 machine learning technique. Here we have their matrix, their heat map, that shows 291 00:25:26,269 --> 00:25:32,730 us exactly what they were able to fool. So we have across the left-hand side here the 292 00:25:32,730 --> 00:25:37,740 source machine learning technique, we have deep learning, logistic regression, SVM's 293 00:25:37,740 --> 00:25:43,380 like we talked about, decision trees and K-nearest-neighbors. And across the bottom 294 00:25:43,380 --> 00:25:47,340 we have the target machine learning, so what were they targeting. They created the 295 00:25:47,340 --> 00:25:51,470 adversaries with the left hand side and they targeted across the bottom. We 296 00:25:51,470 --> 00:25:56,700 finally have an ensemble model at the end. And what they were able to show is like, 297 00:25:56,700 --> 00:26:03,130 for example, SVM's and decision trees are quite easy to fool, but logistic 298 00:26:03,130 --> 00:26:08,480 regression a little bit less so, but still strong, for deep learning and K-nearest- 299 00:26:08,480 --> 00:26:13,460 neighbors, if you train a deep learning model or a K-nearest-neighbor model, then 300 00:26:13,460 --> 00:26:18,179 that performs fairly well against itself. And so what they're able to show is that 301 00:26:18,179 --> 00:26:23,320 you don't necessarily need to know the target machine and you don't even have to 302 00:26:23,320 --> 00:26:28,050 get it right, even if you do know, you can use a different type of machine learning 303 00:26:28,050 --> 00:26:30,437 technique to target the network. 
304 00:26:34,314 --> 00:26:39,204 So we'll look at six lines of Python here and in 305 00:26:39,204 --> 00:26:44,559 these six lines of Python I'm using the cleverhans library and in six lines of 306 00:26:44,559 --> 00:26:52,419 Python I can both generate my adversarial input and I can even predict on it. So if 307 00:26:52,419 --> 00:27:02,350 you don't code Python, it's pretty easy to learn and pick up. And for example here we 308 00:27:02,350 --> 00:27:06,830 have Keras and Keras is a very popular deep learning library in Python, it 309 00:27:06,830 --> 00:27:12,070 usually works with a theano or a tensorflow backend and we can just wrap 310 00:27:12,070 --> 00:27:19,250 our model, pass it to the fast gradient method, class and then set up some 311 00:27:19,250 --> 00:27:24,630 parameters, so here's our epsilon and a few extra parameters, this is to tune our 312 00:27:24,630 --> 00:27:30,860 adversary, and finally we can generate our adversarial examples and then predict on 313 00:27:30,860 --> 00:27:39,865 them. So in a very small amount of Python we're able to target and trick a network. 314 00:27:40,710 --> 00:27:45,791 If you're already using tensorflow or Keras, it already works with those libraries. 315 00:27:48,828 --> 00:27:52,610 Deep-pwning is one of the first libraries that I heard about in this space 316 00:27:52,610 --> 00:27:58,200 and it was presented at Def Con in 2016 and what it comes with is a bunch of 317 00:27:58,200 --> 00:28:03,320 tensorflow built-in code. It even comes with a way that you can train the model 318 00:28:03,320 --> 00:28:06,730 yourself, so it has a few different models, a few different convolutional 319 00:28:06,730 --> 00:28:12,130 neural networks and these are predominantly used in computer vision. 320 00:28:12,130 --> 00:28:18,090 It also however has a semantic model and I normally work in NLP and I was pretty 321 00:28:18,090 --> 00:28:24,240 excited to try it out. What it comes built with is the Rotten Tomatoes sentiment, so 322 00:28:24,240 --> 00:28:29,900 this is Rotten Tomatoes movie reviews that try to learn is it positive or negative. 323 00:28:30,470 --> 00:28:35,269 So the original text that I input in, when I was generating my adversarial networks 324 00:28:35,269 --> 00:28:41,500 was "more trifle than triumph", which is a real review and the adversarial text that 325 00:28:41,500 --> 00:28:46,080 it gave me was "jonah refreshing haunting leaky" 326 00:28:49,470 --> 00:28:52,660 ...Yeah.. so I was able to fool my network 327 00:28:52,660 --> 00:28:57,559 but I lost any type of meaning and this is really the problem when we think 328 00:28:57,559 --> 00:29:03,539 about how we apply adversarial learning to different tasks is, it's easy for an image 329 00:29:03,539 --> 00:29:08,960 if we make a few changes for it to retain its image, right? It's many, many pixels, 330 00:29:08,960 --> 00:29:14,139 but when we start going into language, if we change one word and then another word 331 00:29:14,139 --> 00:29:18,950 and another word or maybe we changed all of the words, we no longer understand as 332 00:29:18,950 --> 00:29:23,120 humans. And I would say this is garbage in, garbage out, this is not actual 333 00:29:23,120 --> 00:29:28,759 adversarial learning. So we have a long way to go when it comes to language tasks 334 00:29:28,759 --> 00:29:32,740 and being able to do adversarial learning and there is some research in this, but 335 00:29:32,740 --> 00:29:37,279 it's not really advanced yet. 
So hopefully this is something that we can continue to 336 00:29:37,279 --> 00:29:42,429 work on and advance further and if so we need to support a few different types of 337 00:29:42,429 --> 00:29:47,426 networks that are more common in NLP than they are in computer vision. 338 00:29:50,331 --> 00:29:54,759 There's some other notable open-source libraries that are available to you and I'll cover just a 339 00:29:54,759 --> 00:29:59,610 few here. There's a "Vanderbilt computational economics research lab" that 340 00:29:59,610 --> 00:30:03,679 has adlib and this allows you to do poisoning attacks. So if you want to 341 00:30:03,679 --> 00:30:09,429 target training data and poison it, then you can do so with that and use scikit- 342 00:30:09,429 --> 00:30:16,590 learn. DeepFool allows you to do the fast gradient sign method, but it tries to do 343 00:30:16,590 --> 00:30:21,590 smaller perturbations, it tries to be less detectable to us humans. 344 00:30:23,171 --> 00:30:28,284 It's based on Theano, which is another library that I believe uses Lua as well as Python. 345 00:30:29,669 --> 00:30:34,049 "FoolBox" is kind of neat because I only heard about it last week, but it collects 346 00:30:34,049 --> 00:30:39,309 a bunch of different techniques all in one library and you could use it with one 347 00:30:39,309 --> 00:30:43,160 interface. So if you want to experiment with a few different ones at once, I would 348 00:30:43,160 --> 00:30:47,460 recommend taking a look at that and finally for something that we'll talk 349 00:30:47,460 --> 00:30:53,600 about briefly in a short period of time we have "Evolving AI Lab", which release a 350 00:30:53,600 --> 00:30:59,710 fooling library and this fooling library is able to generate images that you or I 351 00:30:59,710 --> 00:31:04,573 can't tell what it is, but that the neural network is convinced it is something. 352 00:31:05,298 --> 00:31:09,940 So this we'll talk about maybe some applications of this in a moment, but they 353 00:31:09,940 --> 00:31:13,559 also open sourced all of their code and they're researchers, who open sourced 354 00:31:13,559 --> 00:31:19,649 their code, which is always very exciting. As you may have known from some of the 355 00:31:19,649 --> 00:31:25,500 research I already cited, most of the studies and the research in this area has 356 00:31:25,500 --> 00:31:29,830 been on malicious attacks. So there's very few people trying to figure out how to do 357 00:31:29,830 --> 00:31:33,769 this for what I would call benevolent purposes. Most of them are trying to act 358 00:31:33,769 --> 00:31:39,539 as an adversary in the traditional computer security sense. They're perhaps 359 00:31:39,539 --> 00:31:43,889 studying spam filters and how spammers can get by them. They're perhaps looking at 360 00:31:43,889 --> 00:31:48,669 network intrusion or botnet-attacks and so forth. They're perhaps looking at self- 361 00:31:48,669 --> 00:31:53,390 driving cars so and I know that was referenced earlier as well at Henrick and 362 00:31:53,390 --> 00:31:57,889 Karen's talk, they're perhaps trying to make a yield sign look like a stop sign or 363 00:31:57,889 --> 00:32:02,760 a stop sign look like a yield sign or a speed limit, and so forth, and scarily 364 00:32:02,760 --> 00:32:07,669 they are quite successful at this. Or perhaps they're looking at data poisoning, 365 00:32:07,669 --> 00:32:12,441 so how do we poison the model so we render it useless? In a particular context, so we 366 00:32:12,441 --> 00:32:17,990 can utilize that. 
And finally for malware. So what a few researchers were able to 367 00:32:17,990 --> 00:32:22,669 show is, by just changing a few things in the malware they were able to upload their 368 00:32:22,669 --> 00:32:26,270 malware to Google Mail and send it to someone and this was still fully 369 00:32:26,270 --> 00:32:31,580 functional malware. In that same sense there's the malGAN project, which uses a 370 00:32:31,580 --> 00:32:38,549 generative adversarial network to create malware that works, I guess. So there's a 371 00:32:38,549 --> 00:32:43,326 lot of research of these kind of malicious attacks within adversarial learning. 372 00:32:44,984 --> 00:32:51,929 But what I wonder is how might we use this for good. And I put "good" in quotation marks, 373 00:32:51,929 --> 00:32:56,179 because we all have different ethical and moral systems we use. And what you may 374 00:32:56,179 --> 00:33:00,289 decide is ethical for you might be different. But I think as a community, 375 00:33:00,289 --> 00:33:05,450 especially at a conference like this, hopefully we can converge on some ethical 376 00:33:05,450 --> 00:33:10,183 privacy concerned version of using these networks. 377 00:33:13,237 --> 00:33:20,990 So I've composed a few ideas and I hope that this is just a starting list of a longer conversation. 378 00:33:22,889 --> 00:33:30,010 One idea is that we can perhaps use this type of adversarial learning to fool surveillance. 379 00:33:30,830 --> 00:33:36,470 As surveillance affects you and I it even disproportionately affects people that 380 00:33:36,470 --> 00:33:41,870 most likely can't be here. So whether or not we're personally affected, we can care 381 00:33:41,870 --> 00:33:46,419 about the many lives that are affected by this type of surveillance. And we can try 382 00:33:46,419 --> 00:33:49,667 and build ways to fool surveillance systems. 383 00:33:50,937 --> 00:33:52,120 Stenography: 384 00:33:52,120 --> 00:33:55,223 So we could potentially, in a world where more and more people 385 00:33:55,223 --> 00:33:58,780 have less of a private way of sending messages to one another 386 00:33:58,780 --> 00:34:03,080 We can perhaps use adversarial learning to send private messages. 387 00:34:03,830 --> 00:34:08,310 Adware fooling: So again, where I might have quite a lot of 388 00:34:08,310 --> 00:34:13,859 privilege and I don't actually see ads that are predatory on me as much, there is 389 00:34:13,859 --> 00:34:19,449 a lot of people in the world that face predatory advertising. And so how can we 390 00:34:19,449 --> 00:34:23,604 help those problems by developing adversarial techniques? 391 00:34:24,638 --> 00:34:26,520 Poisoning your own private data: 392 00:34:27,386 --> 00:34:30,600 This depends on whether you actually need to use the service and 393 00:34:30,600 --> 00:34:34,590 whether you like how the service is helping you with the machine learning, but 394 00:34:34,590 --> 00:34:40,110 if you don't care or if you need to essentially have a burn box of your data. 395 00:34:40,110 --> 00:34:45,760 Then potentially you could poison your own private data. Finally, I want us to use it 396 00:34:45,760 --> 00:34:51,139 to investigate deployed models. So even if we don't actually need a use for 397 00:34:51,139 --> 00:34:56,010 fooling this particular network, the more we know about what's deployed and how we 398 00:34:56,010 --> 00:35:00,350 can fool it, the more we're able to keep up with this technology as it continues to 399 00:35:00,350 --> 00:35:04,630 evolve. 
So the more that we're practicing, the more that we're ready for whatever 400 00:35:04,630 --> 00:35:09,800 might happen next. And finally I really want to hear your ideas as well. So I'll 401 00:35:09,800 --> 00:35:13,940 be here throughout the whole Congress and of course you can share during the Q&A 402 00:35:13,940 --> 00:35:17,073 time. If you have great ideas, I really want to hear them. 403 00:35:20,635 --> 00:35:26,085 So I decided to play around a little bit with some of my ideas. 404 00:35:26,810 --> 00:35:32,720 And I was convinced perhaps that I could make Facebook think I was a cat. 405 00:35:33,305 --> 00:35:36,499 This is my goal. Can Facebook think I'm a cat? 406 00:35:37,816 --> 00:35:40,704 Because nobody really likes Facebook. I mean let's be honest, right? 407 00:35:41,549 --> 00:35:44,166 But I have to be on it because my mom messages me there 408 00:35:44,166 --> 00:35:46,020 and she doesn't use the email anymore. 409 00:35:46,020 --> 00:35:47,890 So I'm on Facebook. Anyways. 410 00:35:48,479 --> 00:35:55,151 So I used a pre-trained Inception model and Keras and I fine-tuned the layers. 411 00:35:55,151 --> 00:35:57,190 And I'm not a computer vision person really. But it 412 00:35:57,190 --> 00:36:01,770 took me like a day of figuring out how computer vision people transfer their data 413 00:36:01,770 --> 00:36:06,350 into something I can put inside of a network figure that out and I was able to 414 00:36:06,350 --> 00:36:12,040 quickly train a model and the model could only distinguish between people and cats. 415 00:36:12,040 --> 00:36:15,140 That's all the model knew how to do. I give it a picture it says it's a person or 416 00:36:15,140 --> 00:36:19,630 it's a cat. I actually didn't try just giving it an image of something else, it 417 00:36:19,630 --> 00:36:25,380 would probably guess it's a person or a cat maybe, 50/50, who knows. What I did 418 00:36:25,380 --> 00:36:31,930 was, I used an image of myself and eventually I had my fast gradient sign 419 00:36:31,930 --> 00:36:37,700 method, I used cleverhans, and I was able to slowly increase the epsilon and so the 420 00:36:37,700 --> 00:36:44,100 epsilon as it's low, you and I can't see the perturbations, but also the network 421 00:36:44,100 --> 00:36:48,920 can't see the perturbations. So we need to increase it, and of course as we increase 422 00:36:48,920 --> 00:36:53,300 it, when we're using a technique like FGSM, we are also increasing the noise 423 00:36:53,300 --> 00:37:00,830 that we see. And when I got 2.21 epsilon and I kept uploading it to Facebook and 424 00:37:00,830 --> 00:37:02,350 Facebook kept saying: "Yeah, do you want to tag yourself?" and I'm like: 425 00:37:02,370 --> 00:37:04,222 "no Idon't, I'm just testing". 426 00:37:05,123 --> 00:37:11,379 Finally I got deployed to an epsilon and Facebook no longer knew I was a face 427 00:37:11,379 --> 00:37:15,323 So I was just a book, I was a cat book, maybe. 428 00:37:15,340 --> 00:37:19,590 *applause* 429 00:37:21,311 --> 00:37:24,740 kjam: So, unfortunately, as we see, I didn't actually become a cat, because that 430 00:37:24,740 --> 00:37:30,630 would be pretty neat. But I was able to fool it. I spoke with the computer visions 431 00:37:30,630 --> 00:37:34,760 specialists that I know and she actually works in this and I was like: "What 432 00:37:34,760 --> 00:37:39,020 methods do you think Facebook was using? Did I really fool the neural network or 433 00:37:39,020 --> 00:37:43,140 what did I do?" 
And she's convinced most likely that they're actually using a 434 00:37:43,140 --> 00:37:47,580 statistical method called Viola-Jones, which takes a look at the statistical 435 00:37:47,580 --> 00:37:53,280 distribution of your face and tries to guess if there's really a face there. But 436 00:37:53,280 --> 00:37:58,800 what I was able to show: transferability. That is, I can use my neural network even 437 00:37:58,800 --> 00:38:05,380 to fool this statistical model, so now I have a very noisy but happy photo on FB 438 00:38:08,548 --> 00:38:14,140 Another use case potentially is adversarial stenography and I was really 439 00:38:14,140 --> 00:38:18,590 excited reading this paper. What this paper covered and they actually released 440 00:38:18,590 --> 00:38:22,860 the library, as I mentioned. They study the ability of a neural network to be 441 00:38:22,860 --> 00:38:26,309 convinced that something's there that's not actually there. 442 00:38:27,149 --> 00:38:30,177 And what they used, they used the MNIST training set. 443 00:38:30,240 --> 00:38:33,420 I'm sorry, if that's like a trigger word 444 00:38:33,420 --> 00:38:38,410 if you've used MNIST a million times, then I'm sorry for this, but what they use is 445 00:38:38,410 --> 00:38:43,290 MNIST, which is zero through nine of digits, and what they were able to show 446 00:38:43,290 --> 00:38:48,790 using evolutionary networks is they were able to generate things that to us look 447 00:38:48,790 --> 00:38:53,280 maybe like art and they actually used it on the CIFAR data set too, which has 448 00:38:53,280 --> 00:38:57,320 colors, and it was quite beautiful. Some of what they created in fact they showed 449 00:38:57,320 --> 00:39:04,340 in a gallery. And what the network sees here is the digits across the top. They 450 00:39:04,340 --> 00:39:12,170 see that digit, they are more than 99% convinced that that digit is there and 451 00:39:12,170 --> 00:39:15,476 what we see is pretty patterns or just noise. 452 00:39:16,778 --> 00:39:19,698 When I was reading this paper I was thinking, 453 00:39:19,698 --> 00:39:23,620 how can we use this to send messages to each other that nobody else 454 00:39:23,620 --> 00:39:28,511 will know is there? I'm just sending really nice.., I'm an artist and this is 455 00:39:28,511 --> 00:39:35,200 my art and I'm sharing it with my friend. And in a world where I'm afraid to go home 456 00:39:35,200 --> 00:39:42,360 because there's a crazy person in charge and I'm afraid that they might look at my 457 00:39:42,360 --> 00:39:47,040 phone, in my computer, and a million other things and I just want to make sure that 458 00:39:47,040 --> 00:39:51,650 my friend has my pin number or this or that or whatever. I see a use case for my 459 00:39:51,650 --> 00:39:56,120 life, but again I leave a fairly privileged life, there are other people 460 00:39:56,120 --> 00:40:01,690 where their actual life and livelihood and security might depend on using a technique 461 00:40:01,690 --> 00:40:06,150 like this. And I think we could use adversarial learning to create a new form 462 00:40:06,150 --> 00:40:07,359 of stenography. 
463 00:40:11,289 --> 00:40:17,070 Finally I cannot impress enough that the more information we have 464 00:40:17,070 --> 00:40:20,620 about the systems that we interact with every day, that our machine learning 465 00:40:20,620 --> 00:40:24,850 systems, that our AI systems, or whatever you want to call it, that our deep 466 00:40:24,850 --> 00:40:29,701 networks, the more information we have, the better we can fight them, right. We 467 00:40:29,701 --> 00:40:33,920 don't need perfect knowledge, but the more knowledge that we have, the better an 468 00:40:33,920 --> 00:40:41,360 adversary we can be. I thankfully now live in Germany and if you are also a European 469 00:40:41,360 --> 00:40:46,770 resident: We have GDPR, which is the general data protection regulation and it 470 00:40:46,770 --> 00:40:55,650 goes into effect in May of 2018. We can use gdpr to make requests about our data, 471 00:40:55,650 --> 00:41:00,450 we can use GDPR to make requests about machine learning systems that we interact 472 00:41:00,450 --> 00:41:07,840 with, this is a right that we have. And in recital 71 of the GDPR it states: "The 473 00:41:07,840 --> 00:41:12,550 data subject should have the right to not be subject to a decision, which may 474 00:41:12,550 --> 00:41:17,730 include a measure, evaluating personal aspects relating to him or her which is 475 00:41:17,730 --> 00:41:22,880 based solely on automated processing and which produces legal effects concerning 476 00:41:22,880 --> 00:41:28,010 him or her or similarly significantly affects him or her, such as automatic 477 00:41:28,010 --> 00:41:33,620 refusal of an online credit application or e-recruiting practices without any human 478 00:41:33,620 --> 00:41:39,270 intervention." And I'm not a lawyer and I don't know how this will be implemented 479 00:41:39,270 --> 00:41:43,990 and it's a recital, so we don't even know, if it will be in force the same way, but 480 00:41:43,990 --> 00:41:50,720 the good news is: Pieces of this same sentiment are in the actual amendments and 481 00:41:50,720 --> 00:41:55,580 if they're in the amendments, then we can legally use them. And what it also says 482 00:41:55,580 --> 00:41:59,920 is, we can ask companies to port our data other places, we can ask companies to 483 00:41:59,920 --> 00:42:03,890 delete our data, we can ask for information about how our data is 484 00:42:03,890 --> 00:42:09,010 processed, we can ask for information about what different automated decisions 485 00:42:09,010 --> 00:42:15,750 are being made, and the more we all here ask for that data, the more we can also 486 00:42:15,750 --> 00:42:20,530 share that same information with people worldwide. Because the systems that we 487 00:42:20,530 --> 00:42:25,091 interact with, they're not special to us, they're the same types of systems that are 488 00:42:25,091 --> 00:42:30,610 being deployed everywhere in the world. So we can help our fellow humans outside of 489 00:42:30,610 --> 00:42:36,400 Europe by being good caretakers and using our rights to make more information 490 00:42:36,400 --> 00:42:41,960 available to the entire world and to use this information, to find ways to use 491 00:42:41,960 --> 00:42:46,242 adversarial learning to fool these types of systems. 492 00:42:47,512 --> 00:42:56,500 *applause* 493 00:42:56,662 --> 00:43:03,360 So how else might we be able to harness this for good? 
I cannot focus enough on 494 00:43:03,360 --> 00:43:08,260 GDPR and our right to collect more information about the information they're 495 00:43:08,260 --> 00:43:14,110 already collecting about us and everyone else. So use it, let's find ways to share 496 00:43:14,110 --> 00:43:17,740 the information we gain from it. So I don't want it to just be that one person 497 00:43:17,740 --> 00:43:21,020 requests it and they learn something. Se have to find ways to share this 498 00:43:21,020 --> 00:43:28,080 information with one another. Test low- tech ways. I'm so excited about the maker 499 00:43:28,080 --> 00:43:32,850 space here and maker culture and other low-tech or human-crafted ways to fool 500 00:43:32,850 --> 00:43:37,890 networks. We can use adversarial learning perhaps to get good ideas on how to fool 501 00:43:37,890 --> 00:43:43,350 networks, to get lower tech ways. What if I painted red pixels all over my face? 502 00:43:43,350 --> 00:43:48,600 Would I still be recognized? Would I not? Let's experiment with things that we learn 503 00:43:48,600 --> 00:43:53,570 from adversarial learning and try to find other lower-tech solutions to the same problem 504 00:43:55,428 --> 00:43:59,930 Finally. or nearly finally, we need to increase the research beyond just 505 00:43:59,930 --> 00:44:04,010 computer vision. Quite a lot of adversarial learning has been only in 506 00:44:04,010 --> 00:44:08,220 computer vision and while I think that's important and it's also been very 507 00:44:08,220 --> 00:44:12,030 practical, because we can start to see how we can fool something, we need to figure 508 00:44:12,030 --> 00:44:15,920 out natural language processing, we need to figure out other ways that machine 509 00:44:15,920 --> 00:44:19,933 learning systems are being used, and we need to come up with clever ways to fool them. 510 00:44:21,797 --> 00:44:26,000 Finally, spread the word! So I don't want the conversation to end here, I don't 511 00:44:26,000 --> 00:44:30,950 want the conversation to end at Congress, I want you to go back to your hacker 512 00:44:30,950 --> 00:44:36,530 collective, your local CCC, the people that you talk with, your co-workers and I 513 00:44:36,530 --> 00:44:41,340 want you to spread the word. I want you to do workshops on adversarial learning, I 514 00:44:41,340 --> 00:44:47,930 want more people to not treat this AI as something mystical and powerful, because 515 00:44:47,930 --> 00:44:52,340 unfortunately it is powerful, but it's not mystical! So we need to demystify this 516 00:44:52,340 --> 00:44:57,040 space, we need to experiment, we need to hack on it and we need to find ways to 517 00:44:57,040 --> 00:45:02,310 play with it and spread the word to other people. Finally, I really want to hear 518 00:45:02,310 --> 00:45:10,480 your other ideas and before I leave today have to say a little bit about why I 519 00:45:10,480 --> 00:45:15,820 decided to join the resiliency track this year. I read about the resiliency track 520 00:45:15,820 --> 00:45:21,910 and I was really excited. It spoke to me. And I said I want to live in a world 521 00:45:21,910 --> 00:45:27,230 where, even if there's an entire burning trash fire around me, I know that there 522 00:45:27,230 --> 00:45:32,010 are other people that I care about, that I can count on, that I can work with to try 523 00:45:32,010 --> 00:45:37,840 and at least protect portions of our world. To try and protect ourselves, to 524 00:45:37,840 --> 00:45:43,940 try and protect people that do not have as much privilege. 
So, what I want to be a 525 00:45:43,940 --> 00:45:49,240 part of, is something that can use maybe the skills I have and the skills you have 526 00:45:49,240 --> 00:45:56,590 to do something with that. And your data is a big source of value for everyone. 527 00:45:56,590 --> 00:46:02,820 Any free service you use, they are selling your data. OK, I don't know that for a 528 00:46:02,820 --> 00:46:08,420 fact, but it is very certain, I feel very certain about the fact that they're most 529 00:46:08,420 --> 00:46:12,560 likely selling your data. And if they're selling your data, they might also be 530 00:46:12,560 --> 00:46:17,730 buying your data. And there is a whole market, that's legal, that's freely 531 00:46:17,730 --> 00:46:22,670 available, to buy and sell your data. And they make money off of that, and they mine 532 00:46:22,670 --> 00:46:28,910 more information, and make more money off of that, and so forth. So, I will read a 533 00:46:28,910 --> 00:46:35,410 little bit of my opinions that I put forth on this. Determine who you share your data 534 00:46:35,410 --> 00:46:41,910 with and for what reasons. GDPR and data portability give us European residents 535 00:46:41,910 --> 00:46:44,410 stronger rights than most of the world. 536 00:46:44,920 --> 00:46:47,940 Let's use them. Let's choose privacy- 537 00:46:47,940 --> 00:46:52,800 concerned, ethical data companies over corporations that are entirely built on 538 00:46:52,800 --> 00:46:58,260 selling ads. Let's build start-ups, organizations, open-source tools and 539 00:46:58,260 --> 00:47:05,691 systems that we can be truly proud of. And let's port our data to those. 540 00:47:05,910 --> 00:47:15,310 *Applause* 541 00:47:15,409 --> 00:47:18,940 Herald: Amazing. We have, we have time for a few questions. 542 00:47:18,940 --> 00:47:21,860 K.J.: I'm not done yet, sorry, it's fine. Herald: I'm so sorry. 543 00:47:21,860 --> 00:47:24,750 K.J.: *Laughs* It's cool. No big deal. 544 00:47:24,750 --> 00:47:31,520 So, machine learning. Closing remarks, a brief round-up. The first closing remark is 545 00:47:31,520 --> 00:47:35,250 that machine learning is not very intelligent. I think artificial 546 00:47:35,250 --> 00:47:39,330 intelligence is a misnomer in a lot of ways, but this doesn't mean that people 547 00:47:39,330 --> 00:47:43,830 are going to stop using it. In fact there are very smart, powerful, and rich 548 00:47:43,830 --> 00:47:49,850 people that are investing more than ever in it. So it's not going anywhere. And 549 00:47:49,850 --> 00:47:53,620 it's going to be something that potentially becomes more dangerous over 550 00:47:53,620 --> 00:47:58,570 time. Because as we hand over more of these to these systems, they could 551 00:47:58,570 --> 00:48:04,240 potentially control more and more of our lives. We can use, however, adversarial 552 00:48:04,240 --> 00:48:09,320 machine learning techniques to find ways to fool "black box" networks. So we can 553 00:48:09,320 --> 00:48:14,400 use these and we know we don't have to have perfect knowledge. However, 554 00:48:14,400 --> 00:48:18,930 information is powerful. And the more information that we do have, the more we're 555 00:48:18,930 --> 00:48:25,860 able to become a good GDPR-based adversary. So please use GDPR and let's 556 00:48:25,860 --> 00:48:31,230 discuss ways where we can share information.
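On fooling "black box" networks without perfect knowledge: one common way this is realized, not spelled out in the closing remarks and offered here only as an illustrative assumption, is transferability. You craft an adversarial example against a local substitute model you fully control, and it will often fool a remote model trained on similar data. A minimal fast-gradient-sign sketch in PyTorch, using a placeholder input:

```python
# Sketch: one-step fast gradient sign method (FGSM) against a local model.
# Adversarial examples crafted this way often transfer to other models
# trained on similar data, which is what makes black-box attacks possible.
import torch
import torch.nn.functional as F
from torchvision import models

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model.eval()

def fgsm(x, label, eps=0.03):
    # Take a single step along the sign of the input gradient.
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), label)
    loss.backward()
    return (x + eps * x.grad.sign()).clamp(0.0, 1.0).detach()

# Placeholder input; in practice use a real, preprocessed image and fold
# the model's normalization into the attack.
x = torch.rand(1, 3, 224, 224)
y = model(x).argmax(dim=1)
x_adv = fgsm(x, y)
print("label before:", y.item(), "label after:", model(x_adv).argmax(dim=1).item())
```

The single gradient-sign step works because, as the Q&A below notes, these networks behave almost linearly around a given input; eps and the choice of substitute model are tuning knobs, not fixed recipes.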
Finally, please support open- 557 00:48:31,230 --> 00:48:35,590 source tools and research in this space, because we need to keep up with where the 558 00:48:35,590 --> 00:48:41,790 state of the art is. So we need to keep ourselves moving and open in that way. And 559 00:48:41,790 --> 00:48:46,670 please, support ethical data companies. Or start one. If you come to me and you say 560 00:48:46,670 --> 00:48:50,240 "Katharine, I'm going to charge you this much money, but I will never sell your 561 00:48:50,240 --> 00:48:56,520 data. And I will never buy your data." I would much rather you handle my data. So I 562 00:48:56,520 --> 00:49:03,390 want us, especially those within the EU, to start a new economy around trust, and 563 00:49:03,390 --> 00:49:12,740 privacy, and ethical data use. *Applause* 564 00:49:12,740 --> 00:49:15,830 Thank you very much. Thank you. 565 00:49:15,830 --> 00:49:18,050 Herald: OK. We still have time for a few questions. 566 00:49:18,050 --> 00:49:20,390 K.J.: No, no, no. No worries, no worries. Herald: Less than the last time I walked 567 00:49:20,390 --> 00:49:23,870 up here, but we do. K.J.: Yeah, now I'm really done. 568 00:49:23,870 --> 00:49:27,730 Herald: Come up to one of the mics in the front section and raise your hand. Can we 569 00:49:27,730 --> 00:49:31,584 take a question from mic one? Question: Thank you very much for the very 570 00:49:31,584 --> 00:49:37,860 interesting talk. One impression that I got during the talk was, with the 571 00:49:37,860 --> 00:49:42,420 adversarial learning approach aren't we just doing pen testing and Quality 572 00:49:42,420 --> 00:49:47,920 Assurance for the AI companies? They're just going to build better machines. 573 00:49:47,920 --> 00:49:52,910 Answer: That's a very good question and of course most of this research right now is 574 00:49:52,910 --> 00:49:56,780 coming from those companies, because they're worried about this. What, however, 575 00:49:56,780 --> 00:50:02,290 they've shown is, they don't really have a good way to fool, to learn how to fool 576 00:50:02,290 --> 00:50:08,710 this. Most likely they will need to use a different type of network, eventually. So 577 00:50:08,710 --> 00:50:13,440 probably, whether it's the blind spots or the linearity of these networks, they are 578 00:50:13,440 --> 00:50:18,000 easy to fool and they will have to come up with a different method for generating 579 00:50:18,000 --> 00:50:24,520 something that is robust enough to not be tricked. So, to some degree yes, it's a 580 00:50:24,520 --> 00:50:28,520 cat-and-mouse game, right. But that's why I want the research and the open source to 581 00:50:28,520 --> 00:50:33,410 continue as well. And I would be highly suspicious if they all of a sudden figure out 582 00:50:33,410 --> 00:50:38,170 a way to make a neural network which has proven linear relationships, that we can 583 00:50:38,170 --> 00:50:42,560 exploit, nonlinear. And if so, it's usually a different type of network that's 584 00:50:42,560 --> 00:50:47,430 a lot more expensive to train and that doesn't actually generalize well. So we're 585 00:50:47,430 --> 00:50:51,280 going to really hit them in a way where they're going to have to be more specific, 586 00:50:51,280 --> 00:50:59,620 try harder, and I would rather do that than just kind of give up. 587 00:50:59,620 --> 00:51:02,560 Herald: Next one. Mic 2. 588 00:51:02,560 --> 00:51:07,840 Q: Hello. Thank you for the nice talk.
I wanted to ask, have you ever tried looking 589 00:51:07,840 --> 00:51:14,720 at it from the other direction? Like, just trying to feed the companies falsely 590 00:51:14,720 --> 00:51:21,560 classified data. And just do it with such massive amounts of data that they 591 00:51:21,560 --> 00:51:25,380 learn from it at a certain point. A: Yes, those are poisoning attacks. So 592 00:51:25,380 --> 00:51:30,020 when we talk about poisoning attacks, we are essentially feeding bad training data and 593 00:51:30,020 --> 00:51:35,120 we're trying to get them to learn bad things. Or I wouldn't say bad things, but 594 00:51:35,120 --> 00:51:37,540 we're trying to get them to learn false information. 595 00:51:37,540 --> 00:51:42,781 And that already happens by accident all the time. So I think, if 596 00:51:42,781 --> 00:51:46,491 we share information and they have a publicly available API, where they're 597 00:51:46,491 --> 00:51:49,970 actually actively learning from our information, then yes, I would say 598 00:51:49,970 --> 00:51:55,180 poisoning is a great way to attack. And we can also share information of maybe how 599 00:51:55,180 --> 00:51:58,360 that works. So especially I would be intrigued if we 600 00:51:58,360 --> 00:52:02,330 can do poisoning for adware and malicious ad targeting. 601 00:52:02,330 --> 00:52:07,300 Mic 2: OK, thank you. Herald: One more question from the 602 00:52:07,300 --> 00:52:12,300 internet and then we run out of time. K.J.: Oh no, sorry. 603 00:52:12,300 --> 00:52:14,290 Herald: So you can find Katharine after. Signal-Angel: Thank you. One question from 604 00:52:14,290 --> 00:52:18,210 the internet. What exactly can I do to harden my model against adversarial 605 00:52:18,210 --> 00:52:21,210 samples? K.J.: Sorry? 606 00:52:21,210 --> 00:52:27,080 Signal-Angel: What exactly can I do to harden my model against adversarial samples? 607 00:52:27,080 --> 00:52:33,340 K.J.: Not much. What they have shown is that if you train on a mixture of real 608 00:52:33,340 --> 00:52:39,300 training data and adversarial data it's a little bit harder to fool, but that just 609 00:52:39,300 --> 00:52:44,720 means that you have to try more iterations of adversarial input. So right now, the 610 00:52:44,720 --> 00:52:51,520 recommendation is to train on a mixture of adversarial and real training data and to 611 00:52:51,520 --> 00:52:56,330 continue to do that over time. And I would argue that you need to maybe do data 612 00:52:56,330 --> 00:53:00,400 validation on input. And if you do data validation on input maybe you can 613 00:53:00,400 --> 00:53:05,100 recognize abnormalities. But that's because I come mainly from the production 614 00:53:05,100 --> 00:53:09,220 side, not the theoretical side, and I think maybe you should just test things, and if they 615 00:53:09,220 --> 00:53:15,210 look weird you should maybe not take them into the system. 616 00:53:15,210 --> 00:53:19,340 Herald: And that's all for the questions. I wish we had more time but we just don't. 617 00:53:19,340 --> 00:53:21,660 Please give it up for Katharine Jarmul! 618 00:53:21,660 --> 00:53:26,200 *Applause* 619 00:53:26,200 --> 00:53:31,050 *34c3 postroll music* 620 00:53:31,050 --> 00:53:47,950 subtitles created by c3subtitles.de in the year 2019. Join, and help us!
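To make the poisoning answer in the Q&A above a bit more concrete, here is a minimal sketch of the simplest variant, label flipping, on a synthetic dataset with scikit-learn. Everything in it (the dataset, the model, the flip fractions) is an illustrative stand-in, not something from the talk; poisoning a live service would mean pushing such mislabeled examples through whatever feedback channel the system actually learns from.

```python
# Sketch: label-flipping poisoning on a synthetic dataset.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

def accuracy_with_flipped_labels(flip_fraction):
    # Flip the labels of a random subset of the training data ("poison" it),
    # retrain, and measure accuracy on clean test data.
    y_poisoned = y_tr.copy()
    n_flip = int(flip_fraction * len(y_poisoned))
    idx = np.random.default_rng(0).choice(len(y_poisoned), n_flip, replace=False)
    y_poisoned[idx] = 1 - y_poisoned[idx]
    clf = LogisticRegression(max_iter=1000).fit(X_tr, y_poisoned)
    return clf.score(X_te, y_te)

for frac in (0.0, 0.1, 0.3, 0.45):
    print(f"{frac:.0%} of training labels flipped -> test accuracy "
          f"{accuracy_with_flipped_labels(frac):.2f}")
```

Typically the test accuracy degrades gradually as the flipped fraction grows, which is why poisoning in practice tends to need either volume or carefully chosen points.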
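The hardening answer, training on a mixture of real and adversarial data, looks roughly like the sketch below. The tiny model, the synthetic data, the 50/50 loss weighting and the eps value are arbitrary illustration choices to keep the loop runnable, not recommendations from the talk.

```python
# Sketch: adversarial training, i.e. training on a mixture of clean and
# adversarially perturbed batches. Model, data, eps and weighting are toy
# stand-ins just to make the loop runnable.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import DataLoader, TensorDataset

model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 64), nn.ReLU(), nn.Linear(64, 10))
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
data = TensorDataset(torch.rand(512, 1, 28, 28), torch.randint(0, 10, (512,)))
train_loader = DataLoader(data, batch_size=64, shuffle=True)

def fgsm(x, y, eps=0.1):
    # Craft a one-step adversarial counterpart of the batch.
    x = x.clone().detach().requires_grad_(True)
    F.cross_entropy(model(x), y).backward()
    return (x + eps * x.grad.sign()).clamp(0.0, 1.0).detach()

for x, y in train_loader:
    x_adv = fgsm(x, y)
    optimizer.zero_grad()  # clear gradients left over from crafting x_adv
    loss = 0.5 * F.cross_entropy(model(x), y) + 0.5 * F.cross_entropy(model(x_adv), y)
    loss.backward()
    optimizer.step()
```

The simple input validation she also suggests can be layered on top, for example range or distribution checks that flag inputs far from anything seen during training before they ever reach the model.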