SDN, Network Virtualisation and Beyond

April 7, 2014

Martin Casado, CTO, Networking, VMware

Very technical companies like Google and Facebook and Amazon and Azure and Tencent and Baidu and Yahoo! with some of the most technical expertise on the face of the planet came up with their own architecture. And if you look at what that trend looks like, basically, what they did is they said, I’m going to move functionality that has traditionally been in the network, and I’m going to move it to software.
So things like security, things like fault isolation, things like billing, things like visibility and debugging, instead of being traditionally put in hardware in the network, they were moved into software. And there’s a lot of good reasons for doing this. If it’s in software, you can evolve it more quickly. If it’s in software, you’ve got more context, because you’re closer to the application.
Arguably, if it’s in software, you can scale it better, so there’s been this massive trend. And then if you look at these data centres, which again are the most successful data centres on the planet, they’re awesome in pretty much every vector and direction you can look at. I mean, if you look at the CapEx, people throw around numbers, but it seems to be about a fifth the cost to build a data centre like this.
They’re by far the most scalable data centres on the planet. Operational overheads are always the best in this case. So you’re like, wow, this is this great way to build a data centre and they’re awesome on every vector, so why doesn’t everybody do this?
So I’ll tell you why. The reason why everybody doesn’t do this is because you can only do this if you can rewrite your application. If I’m Google, I have the Google application, and I can put security in there, and I can put load balancing in there, and I can put billing in there. If I’m Amazon, I control the application, I can do the same thing. So if I control my application, I can build the most awesome data centre on the planet.

So the question is, how do you then build a data centre that has the same type of properties, right, but you don’t control the application? You’re the IT for Goldman Sachs, you’re a large enterprise, you’re a hospital — listen, we all want to have awesome data centres, but you don’t necessarily have these characteristics. And this is kind of where network virtualisation comes in, which is the work that we did at Nicira, and now that we’re continuing to do at VMware. So I want to actually describe it very quickly.
So what is network virtualisation? If you have a data centre, you have a physical network in the data centre. So that physical network could be anything. It could be Cisco infrastructure, it could be an IP network. It could be IP over InfiniBand, some physical infrastructure that provides connectivity.
So connected to that, of course, you have your servers, and then on your servers, let’s assume that you’re running virtualisation. So the idea is, just like these large data centres have pulled functionality away from the physical network and moved it into the application on the edge, the idea of network virtualisation is to use the position at the edge to create what you can think of as a network hypervisor. So all the functionality they’re pulling out of the physical network, you’re moving it to the edge, and then you’re exposing what looks like a physical network but is really a virtual abstraction.
So now the idea is I can deploy any application. I can attach it to one of these virtual abstractions, and the application thinks it’s running on a physical network, but these abstractions have the operational model of a VM. You can create them dynamically. You can grow them or shrink them or move them around or do whatever you want. So now, from a high level, you have the same types of characteristics of the Googles and the Facebooks and the Yahoo!s, which is you have functionality that’s written in software, that provides all of the operations. You can use any type of hardware that you want, but you’re gluing it to these applications in a way that they know about.
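As an illustration of the abstraction described above, here is a minimal Python sketch. It is a toy model under stated assumptions, not Nicira's or VMware's implementation: `NetworkHypervisor`, `VirtualNetwork`, and the VM and host names are hypothetical, chosen only for this example. It shows the operational model from the talk: virtual networks are created, populated, moved and deleted purely in software at the edge, while the physical network is never reconfigured.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class VirtualNetwork:
    """A network that exists only as state at the edge (hypothetical model)."""
    name: str
    members: set[str] = field(default_factory=set)  # names of attached VMs


class NetworkHypervisor:
    """Toy edge layer: tracks virtual networks independently of the physical fabric."""

    def __init__(self) -> None:
        self.networks: dict[str, VirtualNetwork] = {}
        self.placement: dict[str, str] = {}  # VM name -> physical host

    def create_network(self, name: str) -> VirtualNetwork:
        # Creating a network is a software operation; nothing physical changes.
        net = VirtualNetwork(name)
        self.networks[name] = net
        return net

    def attach(self, vm: str, network: str, host: str) -> None:
        # The VM joins a virtual network and is recorded on a physical host.
        self.networks[network].members.add(vm)
        self.placement[vm] = host

    def move(self, vm: str, new_host: str) -> None:
        # Moving a VM changes its physical placement, not its virtual network.
        self.placement[vm] = new_host

    def delete_network(self, name: str) -> None:
        # Tearing down the network also forgets where its VMs were placed.
        for vm in self.networks.pop(name).members:
            self.placement.pop(vm, None)


hv = NetworkHypervisor()
hv.create_network("payroll")
hv.attach("web-1", "payroll", host="rack1-server3")
hv.move("web-1", "rack7-server2")  # membership of "payroll" is unchanged
```

The design point is that membership and placement are kept in separate tables, so the application's view of its network survives any change to where it physically runs.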
So when we first started doing this stuff, most people thought we were crazy. We were working with the real early adopters, and there was a lot of joint partnerships.
We worked with some of the largest clouds in the world. We worked with some of the largest telcos in the world, the largest financials in the world.
But it’s been very interesting to see this evolve. And this tends to happen with virtualisation in general, which is if you think about compute virtualisation, like what VMware does, so with compute virtualisation, when you started to bring it out, people just viewed it as something that will allow them to consolidate two servers into one server, right? And it’s just a very simple tool that allowed for a very simple value proposition, but over time, virtualisation, it tends to be that kind of proverbial indirection layer in computer science. Once it’s there, you can leverage it to do great things.
And so with compute virtualisation, you started with this very simple value proposition of server consolidation, but then over time this grew to be cloud, right? You have things like vMotion, you have things like full data centre provisioning. And so the same thing has kind of started to happen with network virtualisation. You start with this very simple use case. I’ll talk about this now. And the very simple use case was provisioning.
So at a macro level, you can say, okay, listen, you’ve got a data centre. It’s expensive, it’s hard to operate. If you go to this new model, everything’s better, but the reality is, it’s very difficult to consume new technology, so you normally have to point to some very simple “if you put this in, your life is better because of X.” And the initial use case that people adopted network virtualisation on was provisioning time. It goes something like this.
If you’re going to deploy a new application, spinning up a VM takes 30 seconds, but configuring the network takes two months. There’s a huge mismatch here, so if I reduce the time it takes to provision the network to zero, you’re happier. That was the initial value proposition of network virtualisation. That was it.
I’m going to reduce the time it takes to provision the network to zero and I’m going to remove a hurdle to do something for the business, whether that’s onboard a new customer or onboard a new employee or deploy a new application or whatever. And so over the last three years, we’ve kind of seen this be adopted. It started being adopted, and then the service providers, and then the cloud guys, and then test and dev environments, we’ve seen a lot of traction in the financials. And over the last year, we’ve actually kind of seen this grow out.
Now, I think we announced 31 customers in the last few months, and three of the top five financials and beverage companies and conservative Midwest manufacturing companies. This is really starting to catch on, and what I find very interesting as a technologist is as you adopt these kind of primitive platforms like virtualisation, to see how it captures the imagination and to move into new types of use cases.
And as the market matures, and it actually is maturing. Early adopter sales are very difficult, because you’ve got to take the technology, and you’ve got to sell it into a particular company, and you’ve got to educate them, and it takes years, and it’s a very technical type of discussion, but as markets mature, they can consume technologies much easier, right?
So, for example, last quarter, I think I found two customers that adopted network virtualisation that I’d never spoken to. Nobody on my team had ever spoken to them. And then, when I talked to the sales guy that sold it to them, he didn’t know what network virtualisation was. It’s like the first example where you actually have a pull or a draw that’s coming from the field, and you ask why that is.
Well, it’s because they’re starting to understand this stuff. All of the big companies are talking about it now, there’s general education, and so a lot of times I think people view SDN and network virtualisation as this existential threat, this thing that’s coming and whatever. But largely it’s here. We’ve got the use cases, we’ve got the proof points. And so what’s been interesting to me is to watch the evolution of the use of this. You’re starting off in provisioning, you’re starting off with a simple use case, but more and more, it’s become security, actually, that’s driving a lot of sales of this.
And actually, I didn’t anticipate this early on, and I want to dedicate the last half of my talk to exactly this use case. And so I would say about 40% of the actual adopters that are paying money for SDN network virtualisation are doing it as a security use case. And there’s kind of two driving kind of sub-use cases. The first one is micro-segmentation, which is basically I have a data centre. Right now, data centres have tons of shared state and tons of shared services and a huge attack surface.
So I’m not sure if you guys know, I used to work for the intelligence agencies. So before I went to Stanford, I actually did computer security. I did kind of operations, where I would actually break into things. And let me tell you, a data centre has almost no controls in it at all. Like, 80% of our spend is on the perimeter, and that’s a Maginot Line. So if I can pay somebody off or I can put on a black mask and I can break into the building and I can install some code on a server or I can remotely exploit a server, if I get in the data centre, I’m done. That’s because that’s where all the data is, and there’s almost no controls within the data centre.
Why? It’s very difficult to control a terabit worth of bandwidth. That’s why we build boxes and we put them on the perimeter. So what’s the state of data centre networking today? We’ve got this Maginot Line of middle boxes that we put on the perimeter, and we’ve got where all the valuable stuff is kind of — attackers have unfettered access to.
So the more we can develop technologies within the data centre to add controls, to do things like micro-segmentation and limiting the attack surface, the better position that we are in protecting the data centre and the assets within it. And this has become, I think, the driving use case going forward. And as things like SDN and network virtualisation cross the chasm, I think it’s security that’s going to do it.
So just to pencil this out very quickly, and I want to make sure that I stay on time here. So to pencil this out very quickly, the idea is as follows. So let’s say Martin is in his previous role and I’m attacking a data centre. So what do I do? Let’s say I pay somebody off within the data centre to deploy some code on some server, right? So that code is on some server.
So, now, if I scan from that server, what can I see? Everything. I can see the physical network, which in a physical network will have 50 versions of IOS, which is many tens of millions of lines of code. I’ve got shared DHCP. I’ve got shared DNS. I’ve got shared AD. I can see every one of the other servers.
Now, who knows what server I might have compromised. It could have been some test dev server. It could have been something that was plugged in to support a legacy app that’s been running for a long time, and if I compromise it and the millions of lines of code that are running on the end host, I have unfettered access to everything that I want.
So what do you want to do? What you want to do is you want to enforce what’s called the principle of least privilege, which is I want to take any application that’s running on the data centre and I only want to give it access to exactly what it needs to get the job done and nothing else.
It’s pretty silly that if I compromise an application I can see the physical infrastructure. There’s no reason for me to see that. It’s pretty silly that I have to share all of these components, so that if I’m able to smack one of these components — so if I compromise the server and I go ahead and I smack DNS or AD, then I have access to everything within the data centre or shared storage.
So the idea is you use network virtualisation as a primitive, as building blocks to build micro-segments. And if I put something within one of those virtual networks, or within one of those segments, the only thing that it can see are also in those segments. So, for example, for every application I can create a virtual network. I can give it its own security services. I can give it its own L4 through 7 services, and if it gets compromised, the attack gets localised to just that.
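The difference between the flat data centre and the micro-segmented one can be sketched in a few lines of Python. This is a simplified illustration, not a description of any product: the segment names, the endpoint names, and the assumption that every application gets its own copy of a shared service such as DNS are all hypothetical. The function answers the attacker's question from the scan example above: from a compromised endpoint, what else can I see?

```python
from __future__ import annotations


def visible_from(compromised: str, segments: dict[str, set[str]]) -> set[str]:
    """Return every other endpoint that shares a segment with the compromised one."""
    seen: set[str] = set()
    for members in segments.values():
        if compromised in members:
            seen |= members
    seen.discard(compromised)
    return seen


# Flat data centre: one segment holds everything, including shared services.
flat = {"datacentre": {"test-dev", "payroll-db", "web", "dns", "ad"}}

# Micro-segmented: one virtual network per application, each with its own services.
segmented = {
    "test-dev-net": {"test-dev", "test-dev-dns"},
    "payroll-net": {"payroll-db", "payroll-dns"},
    "web-net": {"web", "web-dns"},
}

print(sorted(visible_from("test-dev", flat)))       # ['ad', 'dns', 'payroll-db', 'web']
print(sorted(visible_from("test-dev", segmented)))  # ['test-dev-dns']
```

In the flat case a compromised test server sees every other endpoint and every shared service; in the segmented case the attack is localised to the one virtual network it landed in.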
So this is kind of driving a lot of the adoption of network virtualisation, which is cool. Again, as a technologist, you come up with these core architectures and you come up with these core products, and then it starts getting driven into areas that you hadn’t really anticipated.
And so now, I’ve been spending the last six months actually looking at the security problem, and so I’m going to take the last portion of this talk to say where I think that security is going kind of from a vision perspective. So I’m going to tee this up. Oops. That was like half of my slides.
The good news is I actually have these memorised, so I can talk about them. You guys have the rest of my slides there, or do you want me to just keep talking? What’s that? That’s right, they’ve been isolated.
So let me just go ahead and move on. So like I told you, I was in security 10 years ago. I took a hiatus. Guido and I were at Stanford together, good friends, did a bunch of great stuff, focused on networking, and then I come back to security. And the funny thing is like almost nothing seems to have changed in 10 years, as far as I can tell.

We’ve been looking at the trends of security. So what are they? Well, security spend is outpacing IT spend, right? And the only thing — great, cool. No worries. So the only thing that seems to be outpacing security spend is security losses. It’s like we’re losing this battle, we can’t spend our way out of the battle. And to me, this is opportunity, and there’s something fundamentally architecturally wrong.
It was just like with SDN. For SDN, you’re like, you’ve got computers you can program to do cool stuff and you’ve got networks that you do almost nothing with and operations is getting worse over time. So you’ve got this trend that if I take the slider bar out to the future, I’m like, wow, we’re going to spend all of our time on the network. That’s opportunity for an architectural shift.
I think we’re at the exact same place with security, which is like, if you look at all the trends, you take the slider bar out to the future, 100% of our money is going to be in security. It’s the quickest growing, both on losses and on spend.
So we’ve been developing this concept called the Goldilocks zone, which is a corollary to network virtualisation, but taking advantage of the hypervisor. So what is the Goldilocks zone? The Goldilocks zone is a term that was created by NASA, I think it was planetary scientists, in the 1970s, and it describes the perfect distance away from the sun for a planet to be able to sustain life, so not too hot and not too cold.
So I think that in the modern data centre, one thing that’s missing is a horizontal security layer that provides both context and isolation to do security, so I’m going to describe this by describing the lack of it. So today, when we do security in the data centre, there’s this basic trade-off between context and isolation.
So if I take a security control, like whatever it is, a firewall or some agent, and then I put it in the application, it’s got all of this great context. It knows the users, it knows data, it knows files. But you don’t have any isolation. You don’t trust the application. You don’t trust the endpoint, so putting a security control there is kind of like taking the on-off switch to an alarm system and putting it on the outside of a house. It doesn’t make any sense.
On the other hand, and this is to the bottom, we say, okay, well, maybe I’ll put the security control in the infrastructure. So what I’m going to do is I’m going to put ACLs or whatever on switches and routers. And there you actually have good isolation. If I’m able to break into a server, I haven’t broken into the router, necessarily. The attack surface is much smaller, but the problem is, even though I have isolation, I don’t have any context. I don’t really know users really, I don’t know applications. I don’t have access to local file systems.
So I’m doing this fundamental trade-off between I know everything, but I don’t have any real security, or I know nothing and I’m pretty isolated. And so the question we’ve been asking is, can you build a Goldilocks layer that goes ubiquitously throughout the data centre that provides both context and isolation? And so, given that the majority of workloads are virtualised — certainly the majority of enterprise workloads are virtualised, 40m VMs are out there, just under VMware alone. Some 70% to 80% of enterprise workloads are arguably virtualised.
If you could use the hypervisors — the hypervisor is in a separate trust domain. If you could use the hypervisor to both peer into the application to pull out meaningful context, like users and applications and what things are doing but also protect that visibility and provide protection and enforcement, you kind of have this optimal place, where you have both this visibility and context and the isolation.
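The trade-off between context and isolation can be made concrete with a small Python sketch. Everything here is an assumption made for illustration: the `Flow` fields, the port number, and the user and process names are hypothetical, and no real hypervisor API is being described. The sketch compares a control in the infrastructure, which is isolated from the guest but sees only addresses and ports, with a control that has the same isolation plus context pulled out of the guest.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Flow:
    src_ip: str
    dst_port: int
    # Context that only something able to look into the guest can supply.
    user: Optional[str] = None
    process: Optional[str] = None


def network_acl(flow: Flow) -> bool:
    """Infrastructure control: isolated, but decides on ports alone."""
    return flow.dst_port == 5432


def hypervisor_policy(flow: Flow) -> bool:
    """Hypothetical Goldilocks control: isolated, and aware of user and process."""
    return (
        flow.dst_port == 5432
        and flow.user == "payroll-svc"
        and flow.process == "payroll-app"
    )


# Two flows from the same server to the same database port.
legit = Flow("10.0.0.5", 5432, user="payroll-svc", process="payroll-app")
rogue = Flow("10.0.0.5", 5432, user="root", process="nc")

print(network_acl(legit), network_acl(rogue))              # True True
print(hypervisor_policy(legit), hypervisor_policy(rogue))  # True False
```

The ACL cannot tell the two flows apart because both come from the same address and go to the same port. The context-aware rule can, and because it is evaluated outside the guest, compromising the server does not switch it off.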
And so this is kind of a major area that I’m looking into, because again, given the state of the security industry and if things go the way we are, we’re going to be spending all our time and money on it, we do need something that will change the architecture and the way we view it. And I do posit, and I’ll stand behind this going forward, that what we’re missing is a horizontal layer where we can provide meaningful security.
And so if I can build that out and we can build that out as a platform, new security services can snap on top of this to do things like, for example, next-generation firewalling with deep visibility in the end host, or maybe network access control that actually understands things like objects and people or meaningful policy or vulnerability assessment, where you’re actually looking in and saying, there’s this vulnerable piece of code, so I’m going to immediately remediate this.
I think that this actually cuts across many areas of security, and every time now I go through a new vertical in security — so security always seems to be like a litany of stuff. It’s kind of these different verticals that are loosely coupled. But if you look at data centre security, whether it’s end host security, whether it’s network access control, vulnerability assessment, whether it’s IDS or IPS, all of them would be affected by something like this. All of them need better isolation and all of them need more context.
So if we can build out this layer or this Goldilocks zone, I think we can actually move security in very much the same way that we have moved networking over the past seven years. I mean, I dedicated my life to SDN, and I think that we have the same type of opportunity here.
And so I’m going to leave you with this. Compute changes — the model of computing changes very rarely, right? Mainframe, client server, and from client server we’re going to cloud. And I think we’re seeing shifts happen in the network architecture, and I think that’s great and I think this is happening, but I think this is like the one time, sort of the once in a wave opportunity, as we’re redefining these new architectures, to actually build security in as a primitive, as a fundamental primitive. So we have a root of trust. So you have a horizontal security layer that you can build rich systems on top of.
