Syntax Error Nullification and Coding Analytics Workgroup

Cookie Expiration

Day 1 at the OWASP conference in Irvine. Lots of good people here, and tons of good conversations. Talking with Jeremiah from Whitehat and Sid Stamm from Mozilla reminded me that I wanted to talk about cookie expiration. I’m only talking for myself here, and not the average user – but I really dislike the concept of persistent cookies. If I wanted something to persist, I wouldn’t use sandboxes, and violently and regularly clean my cookies by hand. Yet still – cookies persist way too long. Realistically there are two types of attacks that involve the persistence of cookies. The first is a drive-by opportunistic exploit – let’s say you’re on a porn site and it forces your browser to visit MySpace or Facebook and because you’re probably logged in, boom, you’re compromised via CSRF or clickjacking or whatever. The second is where the attacker knows you’re logged in because they’re attacking you through the very platform that they intend to compromise (likejacking is a good example).

Although we can’t do much about the second case, the first case comes down in large part to cookie expiration. Why should a browser hold onto a cookie just because the site told it to? If I’m not actively sending requests to the site in question there’s a good chance I don’t want my browser to send cookies after X amount of time. In my case, X is probably an hour or two max (considering I take lunches). Maybe some people would argue that they don’t want to be hassled by typing their webmail password in more than once per day. Okay, fine, but the point is the magic number probably isn’t once every two weeks, or once a month or once every 20 years, for most security people (I’d hope). So perhaps we need to consider a default mechanism for timing cookies out when they’re not actively being sent to the server, regardless of what the server wants. Incidentally, Sid thinks this would make a good addon. Takers?
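To make the idle timeout idea concrete, here is a minimal sketch of a cookie store that refuses to send a cookie once it has gone unused for longer than some limit, no matter how long the site asked for it to live. The class name IdleCookieJar, the maxIdleMs parameter and the webmail.example domain are all made up for illustration; this is not how any browser or existing addon works.

```javascript
// Hypothetical sketch: IdleCookieJar and maxIdleMs are illustrative names,
// not part of any browser or addon API.
class IdleCookieJar {
  constructor(maxIdleMs) {
    this.maxIdleMs = maxIdleMs;
    this.cookies = new Map(); // "domain|name" -> cookie record
  }

  // Store a cookie. serverExpiresAt is the lifetime the site asked for.
  set(domain, name, value, serverExpiresAt, now) {
    this.cookies.set(`${domain}|${name}`, {
      domain, name, value, serverExpiresAt, lastSentAt: now,
    });
  }

  // Return the cookies to attach to a request for `domain` at time `now`.
  // Cookies idle for longer than maxIdleMs are dropped, whatever the server wanted.
  cookiesFor(domain, now) {
    const sendable = [];
    for (const [key, c] of this.cookies) {
      if (c.domain !== domain) continue;
      const idleTooLong = now - c.lastSentAt > this.maxIdleMs;
      const serverExpired = now >= c.serverExpiresAt;
      if (idleTooLong || serverExpired) {
        this.cookies.delete(key);
        continue;
      }
      c.lastSentAt = now; // actively used, so the idle clock restarts
      sendable.push(`${c.name}=${c.value}`);
    }
    return sendable;
  }
}

const HOUR = 60 * 60 * 1000;
const jar = new IdleCookieJar(2 * HOUR); // X = two hours
jar.set('webmail.example', 'session', 'abc123', 14 * 24 * HOUR, 0); // site asked for two weeks

console.log(jar.cookiesFor('webmail.example', 1 * HOUR)); // [ 'session=abc123' ]
console.log(jar.cookiesFor('webmail.example', 5 * HOUR)); // [] (idle for four hours)
```

The timestamps are passed in rather than read from the clock, which keeps the sketch easy to reason about; a real implementation would also have to decide what counts as "actively sending" (background requests, for example, probably should not reset the timer).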

The Effect of Snakeoil Security

I’ve talked about this a few times over the years during various presentations but I wanted to document it here as well. It’s a concept that I’ve been wrestling with for 7+ years and I don’t think I’ve made any headway in convincing anyone, beyond a few head nods. Bad security isn’t just bad because it allows you to be exploited. It’s also a long term cost center. But more interestingly, even the most worthless security tools can be proven to “work” if you look at the numbers. Here’s how.

Let’s say hypothetically that you have only two banks in the entire world: banka.com and bankb.com. Let’s say a Snakeoil salesman goes up to banka.com and convinces banka.com to try their product. Banka.com is thinking that they are seeing increased fraud (as is the whole industry), and they’re willing to try anything for a few months. Worst case they can always get rid of it if it doesn’t do anything. So they implement Snakeoil into their site. The bad guy takes one look at the Snakeoil and shrugs. Is it worth bothering to figure out how banka.com security works and potentially having to modify their code? Nah, why not just focus on bankb.com, double up the fraud, and continue doing the exact same thing they were doing before?

Suddenly banka.com is free of fraud. Snakeoil works, they find! They happily let the Snakeoil salesman use them as a case study. So our Snakeoil salesman goes across the street to bankb.com. Strangely, bankb.com has seen a twofold increase in fraud over the last few months (all of banka.com’s fraud plus their own), and they’re desperate to do something about it. Snakeoil salesman is happy to show them how much banka.com has decreased their fraud just by buying their shoddy product. Bankb.com is desperate so they say fine and hand over the cash.

Suddenly the bad guy is presented with a problem. He’s got to find a way around this whole Snakeoil software or he’ll be out of business. So he invests a few hours, finds an easy way around it and voila. Back in business. So the bad guy diversifies his fraud across both banks again. Banka.com sees fraud climb back to its old level, which can’t be correlated to anything having to do with the Snakeoil product. Bankb.com sees their fraud drop immediately after having installed the Snakeoil, therefore “proving” for a second time that it works, if you just look at the numbers.
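The numbers in that story are easy to play out. The sketch below assumes a fixed 200 units of fraud per month in the whole two-bank world, split evenly across whichever banks the attacker bothers to target; the figure of 200 and the function name fraudPerBank are invented for illustration.

```javascript
// Spread a fixed amount of fraud evenly over the banks the attacker targets.
function fraudPerBank(totalFraud, allBanks, targetedBanks) {
  const result = {};
  for (const bank of allBanks) {
    result[bank] = targetedBanks.includes(bank)
      ? totalFraud / targetedBanks.length
      : 0;
  }
  return result;
}

const banks = ['banka.com', 'bankb.com'];
const TOTAL_FRAUD = 200; // assumed constant: the attacker never goes away

const phases = [
  { label: 'before Snakeoil', targets: banks },
  { label: 'banka.com installs Snakeoil', targets: ['bankb.com'] },
  { label: 'bankb.com installs it too, attacker works around it', targets: banks },
];

const timeline = phases.map((p) => ({
  label: p.label,
  fraud: fraudPerBank(TOTAL_FRAUD, banks, p.targets),
}));

for (const step of timeline) {
  console.log(step.label, step.fraud);
}
// before Snakeoil { 'banka.com': 100, 'bankb.com': 100 }
// banka.com installs Snakeoil { 'banka.com': 0, 'bankb.com': 200 }
// bankb.com installs it too, attacker works around it { 'banka.com': 100, 'bankb.com': 100 }
```

Each bank sees its own fraud fall right after its own purchase (100 to 0, then 200 to 100), yet the total never moves from 200.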

Meanwhile what has happened? Are the users safer? No, and in fact, in some cases it may even make the users less safe (incidentally, we did finally manage to stop AcuTrust, as the company is completely gone now). Has this stopped the attacker? Only long enough to work around it. What’s the net effect? The two banks are now spending money on a product that does nothing but they are now convinced that it is saving them from huge amounts of fraud. They have the numbers to back it up – although the numbers are only half the story. Now there’s less money to spend on real security measures. Of course, if you look at it from either bank’s perspective the product did save them and they’ll vehemently disagree that the product doesn’t work, but it also created the problem that it solved in the case of bankb.com (double the fraud).

This goes back to the bear in the woods analogy that I personally hate. The story goes that you don’t have to run faster than the bear, you just have to run faster than the guy next to you. While that’s a funny story, that only works if there are two people and you only encounter one bear. In a true ecosystem you have many many people in the same business, and you have many attackers. If you leave your competitor(s) out to dry that may seem good for you in the short term, but in reality you’re feeding your attacker(s). Ultimately you are allowing the attacker ecosystem to thrive by not reducing the total amount of fraud globally. Yes, this means if you really care about fixing your own problem you have to help your competitors. Think about the bear analogy again. If you feed the guy next to you to the bear, now the bear is satiated. That’s great for a while, and you’re safe. But when the bear is hungry again, guess who he’s going after? You’re much better off working together to kill or scare off the bear in that analogy.

Of course if you’re a short-timer CSO who just wants to have a quick win, guess which option you’ll be going for? Jeremiah had a good insight about why better security is rarely implemented and/or sweeping security changes are rare inside big companies. CSOs are typically only around for a few years. They want to go in, make a big win, and get out before anything big breaks or they get hacked into. After a few years they can no longer blame their predecessor either. They have no incentive to make things right, or go for huge wins. Those wins come with too much risk, and they don’t want their name attached to a fiasco. No, they’re better off doing little to nothing, with a few minor wins that they can put on their resume. It’s a little disheartening, but you can probably tell which CSOs are which by how long they’ve stayed put and by the scale of what they’ve accomplished.

Browser Detection Autopwn, etc…

I often find myself thinking about egyp7’s DefCon speech last year. He was talking about browser autopwn, which was a relatively new concept at that time being built into Metasploit. Pretty cool technology, and with only one minor mishap he was able to demonstrate it on stage with impressive results. That’s all well and fine, and you should check it out, but one thing stuck out from the presentation more than the technology itself.

Version 3.2 includes exploit modules for recent Microsoft flaws, such as MS08-041, MS08-053, MS08-059, MS08-067, MS08-068, and many more.

The module format has been changed in version 3.2. The new format removes the previous naming and location restrictions and paves the way to an improved module loading and caching backend. For users, this means being able to copy a module into nearly any subdirectory and immediately use it without edits.

The Byakugan WinDBG extension developed by Pusscat has been integrated with this release, enabling exploit developers to quickly exploit new vulnerabilities using the best Win32 debugger available today.

The Context-Map payload encoding system developed by I)ruid is now enabled in this release, allowing for any chunk of known process memory to be used as an encoding key for Windows payloads.

The Incognito token manipulation toolkit, written by Luke Jennings, has been integrated as a Meterpreter module. This allows an attacker to gain new privileges through token hopping. The most common use is to hijack domain admin credentials once remote system access is obtained.

The PcapRub, Scruby, and Packetfu libraries have all been linked into the Metasploit source tree, allowing easy packet injection and capture.

The METASM pure-Ruby assembler, written by Yoann Guillot and Julien Tinnes, has gone through a series of updates. The latest version has been integrated with Metasploit and now supports MIPS assembly and the ability to compile C code.

The Windows payload stagers have been updated to support targets with NX CPU support. These stagers now allocate a read/write/exec segment of memory for all payload downloads and execution.

Executables which have been generated by msfpayload or msfencode now support NX CPUs. The generated executable is now smaller and more reliable, opening the door to a wider range of uses.

The psexec and smb_relay modules now use an executable template that acts like a real Windows service, improving the reliability and cleanup requirements of these modules.

The Reflective DLL Injection technique pioneered by Stephen Fewer of Harmony Security has been integrated into the framework. The new payloads use the “reflectivedllinjection” stager prefix and share the same binaries as the older DLL injection method.

Client-side browser exploits now benefit from a set of new javascript obfuscation techniques developed by Egypt. This improvement leads to a greater degree of anti-virus bypass for client-side exploits.

Metasploit contains dozens of exploit modules for web browsers and third-party plugins. The new browser_autopwn module ties many of these together with advanced fingerprinting techniques to deliver more shells than most pen-testers know what to do with.

This release includes a set of man-in-the-middle, authentication relay, and authentication capture modules. These modules can be integrated with a fake proxy (WPAD), a malicious access point (Karmetasploit), or basic network traffic interception to gain access to client machines. These modules tie together browser_autopwn, SMB relaying, and HTTP credential and form capturing to pillage data from client systems.

Nearly all Metasploit modules now support IPv6 transports. IPv6 stagers exist for the Windows and Linux platforms, opening the door for penetration testing of pure IPv6 networks. The VNCInject and Meterpreter payloads have been extensively tested over IPv6 sockets.

Efrain Torres’s WMAP project has been merged into Metasploit. WMAP is a general purpose web application scanning framework that can be automated through integration with an attack proxy (ratproxy) or be accessed as individual auxiliary modules.

Egypt’s new PHP payloads provide complete bind, reverse, and findsock support for PHP web application exploits. If you are sick of C99 and R57 and looking to gain a “real” shell from one of the hundreds of RFI flaws listed on milw0rm, the new PHP payloads work great against multiple

By doing variable detection he could find out everything down to the individual patch level of the device in most cases. Of course a bad guy can mess with these variables and lie, which egyp7 admitted to. But, wisely, he said something to the effect that if you find a browser that is lying about its user agent, you have probably found yourself a browser hacker, and you don’t want to try owning his browser anyway. Once you find yourself in this condition, bail. The idea mirrors a lot of the type of stuff I wrote about in Detecting Malice. By identifying the signature of browsers and how people navigate sites you can know a lot about your potential adversary. Either for good or, in the case of autopwn, evil. Growing this signature database over time could be very useful as attention on browser exploitation increases and the need for understanding user traffic and intent grows.
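The "is this browser lying about its user agent" check can be sketched as a comparison between what the User-Agent string claims and what the client-side environment actually looks like. Everything below is an illustrative assumption rather than how browser_autopwn does it: the marker names (hasActiveXObject and friends) stand in for whatever variables a detection script would report back, and a real signature database would be far larger.

```javascript
// Hypothetical signature table: a User-Agent token and one client-side
// marker we would expect to see if the claim is true.
const EXPECTED_MARKERS = {
  msie: { token: /MSIE/, marker: 'hasActiveXObject' },
  opera: { token: /Opera/, marker: 'hasWindowOpera' },
  chrome: { token: /Chrome/, marker: 'hasWindowChrome' },
};

// claimedUserAgent: the User-Agent header as sent.
// observed: booleans reported by a (hypothetical) client-side probe.
function checkUserAgent(claimedUserAgent, observed) {
  for (const [family, rule] of Object.entries(EXPECTED_MARKERS)) {
    if (rule.token.test(claimedUserAgent)) {
      return { family, consistent: observed[rule.marker] === true };
    }
  }
  return { family: 'unknown', consistent: null }; // no signature to compare against
}

const verdict = checkUserAgent(
  'Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)',
  { hasActiveXObject: false, hasWindowOpera: true }
);
console.log(verdict); // { family: 'msie', consistent: false }
```

A result of consistent: false is the "bail" condition for the attacker, and a useful flag for a defender profiling traffic.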

The Perils of Speeding up the Browser

17 posts left until the end…

A year or so ago I went to go visit the Intel guys at their internal conference that they throw (similar to Microsoft’s Bluehat). I honestly had no idea what to tell a bunch of hardware guys. What correlation does chip manufacturing really have with browsers or webapps? Well, virtualization and malware certainly, but what else? It got me thinking… one of the things they are in direct control over is how fast operating systems (and subsequently browsers) work. I talked it over with id before going out there. Faster is better, right?

I’ve got mixed feelings about fast vs slow browsers. When something is slow, you can actually detect that something strange is going on. It’s also easier to stop it from misbehaving if an attack takes a while. When it’s fast, it’s much harder to notice that your computer had to chug for a while to do something complex and much less likely that a user can intervene. There have been a number of exploits out there that have really been proof of concept only. They’re deemed not practical because they take too long, or hang the browser temporarily while they’re being executed. If the speed barrier is removed, then suddenly those old proofs of concept (think res:// timing attacks and so on) are actually much easier to perform. So while I think innovation and performance improvement is a good thing overall, it does come with some unintended consequences.

Browser Differences, Minutia Et Al…

18 posts left…

I got an email last night from someone asking me to do a breakdown of which browser is better: Internet Explorer, Firefox, Opera, Safari or Chrome. First of all, there’s already a pretty good reference that Michal Zalewski put together. Like anything this comprehensive, since it hasn’t been edited for about half a year it’s already out of date in a few ways, but it’s a great place to get started for those who want to get familiar with the internal differences between various browsers. No need to re-invent the wheel, go read it. Now, that’s the purely technical side, but there is one thing that’s wildly missing from most documents that talk about browser security.

Browser security often turns into a religious war amongst technologists, instead of thinking about it pragmatically. What are the real motives of the companies that are developing the browsers? In most cases they care primarily about market share because market share makes them money (through search engine agreements, and so on). So now you have to think about yourself and your needs. What kind of user are you? I tend to be a very security conscious person, and if you’re reading this you probably are too. I’m willing to severely degrade my usability for an increase in security, whereas most users are not. So the browser I will tend towards is one that offers me the flexibility to make those decisions for myself while still giving me enough usability to be able to do anything I need to do, when I decide to. This is why Firefox has been my personal browser of choice for years – but don’t be confused and think it’s because I think Firefox is more secure out of the box. Firefox has just as many flaws as other browsers, by default.

While security people’s needs are important, if you look at the number of people who are security folks compared to the rest of the world, we are insignificant as a percentage. That means that it is not in the browser company’s interest to focus on appeasing security people. Sure, it’s nice to have a browser that is secure, but that’s not ever going to drive the volume of users necessary to make the real revenue for their organizations – or at least that’s what the market seems to be proving. Plus most of the major browsers above tout themselves as being more secure than their competitors – so normal consumers don’t know who to believe. As such, while I think all the major browsers mentioned above have their pros and cons, none of them are designed with security first. They’re designed with a different set of users in mind (which includes security people, but it also includes our grandmas, and tweens and cousin Cletus), and that puts browser design choices somewhat at odds with security, because what does Cletus care or know about security? So that’s where plugins, addons, sandboxes, VMs, etc… come into play. It’s like wearing a condom around your browser, if you like. It gives us the ability to use the same underlying product while still protecting ourselves as much as possible.

I honestly think most browsers can be made to be very secure, if you’re willing to sacrifice all usability – not completely secure, no doubt, but far more secure than any of the major browsers above ship by default. So, it’s a little hard for me to play favorites. They each have their own security mess to clean up, so currently there is no good solution, and I don’t recommend any browsers to anyone (although you people still on IE6 really should upgrade already). The work involved in really securing your browser simply isn’t worth explaining to most people. In fact, “which browser do you use” is my least favorite question, because it’s not as simple as a single word. Boutique browsers, while interesting, don’t often have the support behind them to make them useful for a lot of the more common applications (lacking vast plugin support, etc…) although of anyone, they actually could align themselves nicely with the needs of security people. So, while I think browser security is often about minutia, we need to fully grasp the market forces at work before getting completely fed up by a constant string of functionality that only makes it less secure, instead of expecting dramatic security improvements. Or we need to pick something more obscure and assume the risks involved with a product that is not tried and true. It’s not an easy problem for us or the browser companies – I don’t envy their situation.

Throttling Traffic Using CSS + Chunked Encoding

19 posts left…

So Pyloris doesn’t work particularly well for port exhaustion on the server, but what if we can exhaust the connections on the client to better meter out traffic? That would make it easier for a MITM to see each individual request if it worked. So I started down a rather complicated path of using a mess load of link tags on an HTTP website trying to affect the connections on the HTTPS version of the same domain. No joy. It turns out that the limits placed on one port don’t affect what happens on another (at least in Firefox). So while I can exhaust all the connections to a domain over a single port I can’t do anything using HTTPS – or so it seemed (unless I was willing to muddy the water further by sending a bunch of requests that I knew were a certain size to the HTTPS site – which just seemed more painful than helpful).

Then, based on some earlier research I stormed into id’s office and I started bitching that there was no point in trying to stop port exhaustion if they were going to allow tons of connections, just over multiple sockets anyway. As the words came out of my mouth I realized I had come up with the answer – a ton of webservers. I guessed that there must be some upper bound of outbound connections and it’s probably at or less than 130. You should have seen id’s face as I asked him to set up 130 connections / 6 connections per socket ≈ 22 web-servers for me. Hahah… I thought he’d kill me.

It turns out it’s nowhere near 130 open connections. Firefox sets a rather arbitrary 30 connection limit. So if you can create 5 open web-servers and exhaust 30 connections and only free up one long enough to allow the victim to download one request at a time, I think we’re in business. Makes sense in theory. The problem is that it’s REALLLLY slow. I mean… painful. In my testing it seemed more like the server was broken entirely from the victim’s perspective. But eventually… and in some cases I mean minutes later – it would load. I’m sure that the attack could be optimized to work based on the fact that no more packets are being sent when the image gets downloaded or whatever… which would signal the program to free up a connection. This is opposed to my crapola time based solution combined with chunked encoding to force the connection to stay open without downloading anything that I came up with for testing. So I bet this attack could work if someone put some tender loving care into it, but it was kind of a huge waste of time for me personally – and for poor id.
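The arithmetic and the chunked-encoding trick can be sketched in a few lines. The connection limits are the figures from the text; the function names are made up. The chunk framing itself (hex length, CRLF, data, CRLF, and a zero-length chunk to finish) is standard HTTP/1.1 chunked transfer encoding, so a server that keeps sending tiny chunks and never sends the terminator keeps the connection occupied.

```javascript
// How many web-servers are needed to tie up every connection the browser
// will open, given a global limit and a per-server limit?
function serversNeeded(totalConnectionLimit, perServerLimit) {
  return Math.ceil(totalConnectionLimit / perServerLimit);
}

console.log(serversNeeded(130, 6)); // 22 (the original guess)
console.log(serversNeeded(30, 6));  // 5  (the observed Firefox limit)

// Frame one piece of an HTTP/1.1 chunked response body.
function encodeChunk(data) {
  const size = Buffer.byteLength(data, 'utf8').toString(16);
  return `${size}\r\n${data}\r\n`;
}

// The zero-length chunk that ends the body. Withholding it is what keeps
// the connection open; sending it is what frees the connection up.
const LAST_CHUNK = '0\r\n\r\n';

console.log(JSON.stringify(encodeChunk('hello'))); // "5\r\nhello\r\n"
```

In this sketch, freeing up one connection at a time amounts to choosing which held-open response finally gets its LAST_CHUNK.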

Incidentally, for those who have never seen or met id, and would like to know a little about the other side of webappsec that I don’t talk about much here (the configuration, operating system and network), your chance is nearing. There’s a rumor that he’ll be speaking at Lascon in October. He’ll be talking on how he’s managed to secure ha.ckers.org for all these years despite how much of a target I’ve made it. :) Should be fun.

Pyloris and Metering Traffic

20 posts left…

Pyloris is a Python version of Slowloris, and since it is written in Python its SSL version is thread safe. So what better way to lock up an SSL/TLS Apache install (given that Apache still hasn’t fixed their DoS)? Well, one of the big problems attackers have when trying to decipher SSL/TLS traffic is the fact that browsers not only send a lot of requests down a single connection but they also use a bunch of open connections over separate sockets. What if we could use pyloris to exhaust all but one open socket?

Well it turns out that while this sorta works, there are a lot of issues with the concept. Firstly, it requires Apache. Secondly the server can’t be using a load balancer (assuming the load balancer isn’t using Apache itself). Thirdly it requires that there are no other users on the system or there will be a seriously annoying user experience for the poor victim who can’t reach the site that the man in the middle is trying to decipher traffic from. Alas… So while this didn’t work particularly well in my testing, I’m certain with more thinking someone can figure out a way to do this.

XSHM Mark 2

If you’re familiar with XSHM this is going to look awfully similar (but better). When a script creates a new popup (or tab) it retains control over where to send it at a later date. I talked about this concept before. But let’s see what else can be done. What if the attacker uses the history.length property to calculate how many pages a user has visited after they left the tab for wherever they landed? The attacker could do something like this:

a.location = 'data:text/html;utf-8,<script>alert(history.length);history.go(-1);<\/script>';

By setting either a recursive setTimeout or using some manual polling mechanism, the attacker can (in this case) cause a popup which monitors how many pages they’ve gone through. Normally it wouldn’t cause a popup; the attacker would redirect to another domain that they had access to, which would do the same history.length check. Voila. The user only sees a brief white flash and then the same page they were just on – as if nothing happened. They’d probably just think their browser is messing up again. This could be helpful in a number of esoteric situations where the number of pages visited may change, or you may want to force them through several flows (and back and forth again) all with a single mouse click – giving you authority to popup in the first place. The best part is that this will follow them while they surf for as long as both windows stay open.
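The bookkeeping behind the polling is just a running difference of history.length samples. The sketch below keeps that part separate from the browser so it can be read (and run) on its own: readLength is a stand-in for whatever reports history.length back from the other window, and makeHistoryWatcher is a made-up name, not anything from the original proof of concept.

```javascript
// Build a poller that reports how many new history entries have appeared
// since the previous poll. readLength is a caller-supplied function that
// returns the current history.length of the window being watched.
function makeHistoryWatcher(readLength) {
  let last = readLength();
  return function poll() {
    const now = readLength();
    const delta = Math.max(0, now - last); // ignore shrinkage
    last = now;
    return delta;
  };
}

// Stand-in for a real window: pretend the user starts with 3 entries.
let fakeHistoryLength = 3;
const poll = makeHistoryWatcher(() => fakeHistoryLength);

fakeHistoryLength = 5;  // the user clicked through two more pages
console.log(poll());    // 2
console.log(poll());    // 0 (nothing new since the last poll)
```

In the scenario described above, each call to poll would be driven by the recursive setTimeout.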

Converted Impact

Impacts of IPv4 to IPv6 Conversions

Background: The Internet is growing. The problem is that the Internet was based on a concept called IP (Internet Protocol), IPv4 (version 4) to be exact. When you see an IP address like 12.34.56.78 you are seeing a single point on the routable Internet. The problem is that there is a finite number of addresses. There are approximately 2^32 IPv4 addresses to begin with – if you start subtracting 127.0.0.0/8 (loopback), 10.0.0.0/8, 172.16.0.0/12 and 192.168.0.0/16 (non-routable) and all the IPs used for routing packets across the Internet, that number starts looking a lot smaller. Now add on the fact that countries are coming online quickly and buying up huge blocks of IP space, IP based mobile devices are approaching 100% penetration and there has been a run on static IPs for home users, and it is easy to realize that while the Internet is growing, the total number of non-allocated IPs is shrinking at an increasing rate.
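As a quick worked example of that subtraction, the sketch below counts the addresses in the four blocks named above and removes them from the 2^32 total. It covers only those four blocks; other reserved ranges and the addresses consumed by routing infrastructure are not modeled.

```javascript
// Number of addresses in an IPv4 block with the given prefix length.
function cidrSize(prefixLength) {
  return 2 ** (32 - prefixLength);
}

const totalAddresses = cidrSize(0); // 2^32 = 4294967296

const excludedBlocks = [
  ['127.0.0.0/8', 8],     // loopback
  ['10.0.0.0/8', 8],      // non-routable
  ['172.16.0.0/12', 12],  // non-routable
  ['192.168.0.0/16', 16], // non-routable
];

const excludedTotal = excludedBlocks.reduce(
  (sum, [, prefix]) => sum + cidrSize(prefix),
  0
);

console.log(excludedTotal);                  // 34668544
console.log(totalAddresses - excludedTotal); // 4260298752
```

Those four blocks alone remove about 34.7 million addresses, which is under one percent of the total.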

IPv6 brings a number of positive security benefits along with the problems associated with the implementation. Some of these include decreased susceptibility to man in the middle attacks and larger environment surface area, making worm propagation less efficient. While these benefits are small, they should absolutely be embraced within the economic cost breakdown of implementation.

IPv6 has been hailed as the remedy for many of the internet's current limitations ever since the mid-1990s. Nevertheless, even today in 2008 only a small fraction of the traffic on the internet is actually flowing over IPv6. One of the reasons many websites are only available over IPv4 is interoperability concerns. In this article we'll describe how ZXTM 5.0 lets you hook up your company's web presence to tomorrow's internet without causing any nuisance for today's customers accessing your pages over IPv4.

IPv4 and the world's problems

The most urgent reason for pushing IPv6 out of its perceived niche in research and academia is the simple fact that the internet's growth will stop as soon as 2010 if we stick to IPv4 only. The reason for this has to do with the number of available addresses. In order to be reachable over IPv4 a computer must have been assigned a unique 32 bit number. This means that at most about four billion different machines can be connected to the internet at any given instant in time (one of the clever mechanisms to work around that limitation is network address translation - NAT - which allows many computers to 'share' the same IP address).

Four billion is, however, only the theoretical upper limit; the practical limit is much lower. Due to the lax distribution policies in the early days of the internet, large parts of the IPv4 address space are lost forever. There are companies and institutions in the United States that have more addresses at their disposal than whole countries. The Halliburton Company, for instance, has the subnet 34.0.0.0/8, corresponding to 2^24 or well over 16 million addresses. A country like Senegal, on the other hand, has to make do with a paltry 67866 IPv4 numbers. Yes, that's sixty-seven thousand eight hundred and sixty-six IPv4 addresses for more than eleven million people (for more details, please consult AfriNIC's database).

This discrepancy of course has to do with the fact that the internet originated in the US and that back in those days nobody could have foreseen the global spread of what was then an arcane technology. However, apart from limiting IT growth in those parts of the world where it is most needed, this imbalance amounts to a gross injustice and it is easy to foresee a kind of "black market" for internet numbers in which those who have IP addresses in excess take advantage of those who came late to the table where addresses were being served. The following pictures give you an idea of the situation:

The Problem With IPv4: According to IANA and the RIRs it is expected that the available non-allocated IP space for IPv4 will become exhausted somewhere between 2010 and 2012. While this issue was predicted in the 1980s and IPv6 was first proposed in 1996, it still has not been widely embraced by the Internet community, even though the date of exhaustion is looming close on the horizon. The issue thus far has been considered mostly a networking problem with networking solutions; however, this paper will discuss many of the other issues as they relate to web based applications and the overall security ramifications of the migration between IPv4 and IPv6 as it comes ever closer. It is no longer optional: if you and your company have not yet considered IPv6, you are already falling behind.

Economic Impacts Of IPv4 to IPv6 Conversion: A number of interesting business issues may arise from the switch between IPv4 and IPv6. It is important to realize that while IPv4 spaces may be allocated, that does not necessarily mean that they are used for anything. The first interesting possibility is that there may be a huge increase in cost for IPv4 address spaces as ISPs may decide that their apparent value is higher with less and less IP space available. This issue could lead to IPv4 barons who hoard and eventually resell or rent IP space in smaller blocks at a premium. Additionally it is plausible that legislation may be instituted to fine companies for squatting on IPs that are allocated yet unused, to prompt them to return those IPv4 addresses to the global pool of unallocated addresses. This would have the effect of further punishing businesses who are unable to make the switch to IPv6 for whatever reasons.

Short Term Issues With Conversion: A number of services advertise themselves as quick solutions to IPv6 to IPv4 conversions. One such company is 6gate, which essentially proxies all the inbound connections to your website for you. This would allow 6gate to “see” any connections made to the host in question. While that may be fine for non-sensitive transactions, it also creates a great place to perform man in the middle attacks or sniffing attacks, similar to the ones used against Tor.

Similarly there are a number of IPv6 to IPv4 tunnel brokers that aim to make IPv6 to IPv6 communication possible over IPv4 enabled networks (since IPv4 cannot speak IPv6). These tunnel brokers are technically capable of reading all of the traffic routed over them. While an attack is unlikely, these aggregation points are highly visible targets within networks, and must be guarded as such.

Configuring tunnels usually requires cooperation of the two parties to set up the correct tunnel end-points. Tunnel Brokers (TB) as described in RFC 3053 can help people collect the necessary information to set up the tunnels. A TB can be viewed as an IPv6 ISP offering connectivity through IPv6-in-IPv4 tunnels. Current implementations are web-based tools that allow interactive setup of a tunnel. By requesting a tunnel, the tunnel client gets assigned IPv6 addresses out of the address space of the tunnel provider.

The TB fits well for small isolated IPv6 sites, and especially isolated IPv6 hosts on the IPv4 Internet, that want to easily connect to an existing IPv6 network. The TB usually is an IPv4-webserver where the user connects to register and activate tunnels. The TB manages tunnel creation, modification and deletion on behalf of the user. For scalability reasons the tunnel broker can share the load of network side tunnel end-points among several Tunnel Servers (TS). It sends configuration orders to the relevant TS whenever a tunnel has to be created, modified or deleted. The TB may also register the user's IPv6 address and name in the DNS.

IPv6 Networking Performance Impacts: Although not often addressed, there are performance impacts associated with moving from IPv4 to IPv6. At least one of the major networking hardware manufacturers has estimated that the operational efficiency of their devices drops 4:1 for IPv6 enabled devices. That means that for certain types of hardware either a four times increase in networking equipment or costly upgrades of existing equipment will be required. If additional equipment is required this could include supplemental equipment from uninterruptible power supply manufacturers, HVAC manufacturers and remote management device vendors (e.g. Cyclades and Arula). This will require large scale migrations for networking staff, increases in physical space requirements within data centers and increased manufacturing needs for networking companies. This includes companies like Cisco, Juniper, Nortel and Alcatel-Lucent amongst others – each of whom stands to see substantial short term revenue growth from the switch and a sustained smaller long term growth based on the estimated 4:1 reduced efficiency of IPv6. Ultimately, the budget for the change must be accurately estimated for all the associated components for a seamless global transition.

IPv6 and Security Tools: Many tools that help identify vulnerable applications within networks work directly with IPv4 addresses, so that they can reach hosts that either have no name associated with them or use something like NetBIOS instead of DNS to identify their location on the network. Many security tools have no IPv6 functionality, meaning that IPv6 enabled networks can often pass network security audits simply because the tools were not designed to operate in IPv6 enabled networks. Further, many security tools do parsing based on certain regular expressions. Here are two examples:

An example IPv4 URL: https://12.23.45.67:443/

An example IPv6 URL: https://[2001:0db8:85a3:08d3:1319:8a2e:0370:7344]:443/

Security tools that pick apart hostnames often rely on a regular expression to do so. Many existing tools fail when faced with a URL of this kind, for various reasons. First, the routable address is no longer written in dotted decimal, which in and of itself is not a strict requirement even of IPv4 (an example of this is the Dword form often used by phishers: http://1113982867/).
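The Dword form mentioned above can be unpacked with plain shell arithmetic, which shows that the same host can be written without any dots at all. This is only a small sketch; the function name dword_to_dotted is made up for illustration.

```shell
#!/bin/sh
# Convert a 32-bit "Dword" host (as in http://1113982867/) to dotted-quad form.
# dword_to_dotted is an illustrative helper name, not part of any tool.
dword_to_dotted() {
    n=$1
    # Peel off one byte at a time, most significant first.
    printf '%d.%d.%d.%d\n' \
        $(( ($n >> 24) & 255 )) \
        $(( ($n >> 16) & 255 )) \
        $(( ($n >> 8) & 255 )) \
        $(( $n & 255 ))
}

dword_to_dotted 1113982867   # prints 66.102.7.147
```

A tool that only recognizes the dotted form would treat these two spellings as different hosts.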

Another difference is the size of the string, which is often bounded by a database column size. Databases often log the IP address rather than the host name, which normally requires at most 15 characters (12 digits plus three periods), while an IPv6 address can be 39 characters. The last change is in URL structure, which can now include square brackets. The square brackets can fool logging that attempts to look at referring URLs. Nefarious users can force otherwise legitimate requests through a system, which may give erroneous results depending on how the logging is built.
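The length difference is easy to check. The sketch below measures the longest dotted-quad IPv4 string and the fully written IPv6 address from the example URL, then simulates a 15-character log column; store_in_15_chars is a hypothetical stand-in for whatever storage layer does the truncating.

```shell
#!/bin/sh
# Longest dotted-quad IPv4 string, and the full eight-group IPv6 example.
ipv4_max="255.255.255.255"
ipv6_full="2001:0db8:85a3:08d3:1319:8a2e:0370:7344"

echo "IPv4 length: ${#ipv4_max}"
echo "IPv6 length: ${#ipv6_full}"

# Simulate storing a value in a fixed-width 15-character column
# (hypothetical helper; real databases may reject or truncate instead).
store_in_15_chars() {
    printf '%.15s\n' "$1"
}

store_in_15_chars "$ipv6_full"   # only the first 15 characters survive
```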

An example of what someone might see when looking at an IPv6 DNS entry is:

$ dig www.ipv6.ac.uk |grep AAAA
ns0.ecs.soton.ac.uk. 1782 IN AAAA 2001:630:d0:f102::53a
ns1.ecs.soton.ac.uk. 1782 IN AAAA 2001:630:d0:f110::53b
ns2.ecs.soton.ac.uk. 1782 IN AAAA 2001:630:d0:f102::53b

Note: An example of a valid IPv6 URL that may fool logging which looks for the first occurrence of /\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}/ to base security decisions upon: http://[2001:0db8:85a3:08d3:1319:8a2e:0370:7344]/a.aspx?12.34.56.78/

As a side note, a variation of this URL structure was used to deny service to a particular piece of client-side application code during testing. It will become necessary to perform this sort of regression testing against both IPv4 and IPv6 going forward to ensure the stability of any application that interacts with the web.
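The logging pitfall from the note above can be reproduced in a few lines of shell. The sketch compares a naive "first dotted quad anywhere in the URL" match with a bracket-aware host extraction. Both function names are invented for this example, and url_host only covers the simple URL shapes shown in this section (no userinfo part).

```shell
#!/bin/sh
url='http://[2001:0db8:85a3:08d3:1319:8a2e:0370:7344]/a.aspx?12.34.56.78/'

# Naive approach: the first thing that looks like a dotted quad, anywhere.
naive_ip() {
    printf '%s\n' "$1" \
        | grep -oE '[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}' \
        | head -n 1
}

# Bracket-aware approach: strip the scheme, cut at the first / ? or #,
# then take either the bracketed IPv6 literal or everything before the port.
url_host() {
    authority=$(printf '%s\n' "$1" \
        | sed -e 's|^[a-zA-Z][a-zA-Z0-9+.-]*://||' -e 's|[/?#].*$||')
    case $authority in
        \[*\]*) printf '%s\n' "$authority" | sed -e 's|^\[\([^]]*\)\].*$|\1|' ;;
        *)      printf '%s\n' "${authority%%:*}" ;;
    esac
}

naive_ip "$url"   # picks up the decoy in the query string
url_host "$url"   # yields the address actually being contacted
```

The naive match reports 12.34.56.78, an address that never appears in the host part of the URL, which is exactly the kind of erroneous log entry described above.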

Future: There are many locations in many different types of applications that may use IP as hostname verification or for logging. Every instance should be identified and dealt with as soon as possible to ensure a smooth transition to an IPv6 environment. While IPv4 is not going away, the replacement version 6 will ultimately require further engineering thought. This will include security and networking tools (many of which are already getting attention or have already been ported to work in IPv6 environments), web applications, logging and of course the networking that ties it all together. While IPv6 is not widely understood amongst the majority of Internet professionals, it will quickly become an important aspect of all future network and Internet application development.

Thanks: Special thanks to James Flom for helping me with this paper in a number of aspects.

PXE dust: scalable diskless booting

If you’re a veteran system administrator, you might remember an era of extremely expensive hard disk storage, when any serious network would have a beefy central file server (probably accessed using the Network File System, NFS) that formed the lifeblood of its operations. It was a well-loved feature as early as Linux kernel 2.0 that you could actually boot your machine with a root filesystem in NFS and have no local disk at all. Hardware costs went down, similar machines could share large parts of their system binaries, upgrades could be done without touching anything but the central server—sysadmins loved this.

But that was then. Diskless booting these days seems a lot less common, even though the technology still exists. You hear about supercomputer clusters using it, but not the “typical” IT department. What happened?

Part of it, I’m sure, is that hard disks became speedier and cheaper more quickly than consumer network technology gained performance. With local disks, it’s still difficult to roll out updates to a hundred or a thousand computers simultaneously, but many groups don’t start with a hundred or a thousand computers, and multicast system re-imaging software like Norton Ghost prevents the hassle from being unbearable enough to force a switch.

More important, though, is that after a few years of real innovation, the de facto standard in network booting has been stagnant for over a decade. Back in 1993, when the fastest Ethernet anyone could use transferred a little over a megabyte of data per second and IDE hard drives didn’t go much faster, network card manufacturers were already including boot ROMs on their expansion cards, each following its own proprietary protocol for loading and executing a bootstrap program. A first effort at standardization, Jamie Honan’s “Net Boot Image Proposal”, was informally published that year, and soon enough two open-source projects, Etherboot (1995) and Netboot (1996), were providing generic ROM images with pluggable driver support. (Full disclosure: I’m an Etherboot Project developer.) They took care of downloading and executing a boot file, but that file would have no way of going back to the network for more data unless it had a network card driver built in. These tools thus became rather popular for booting Linux, and largely useless for booting simpler system management utilities that couldn’t afford the maintenance cost of their own network stack and drivers.

Around this time, Intel was looking at diskless booting from a more commercial point of view: it made management easier, consolidated resources, and avoided leaving sysadmins at the mercy of users who broke their systems thinking themselves experts. They published a specification for the Preboot Execution Environment (PXE), as part of a larger initiative called Wired for Management. Network cards started replacing their proprietary boot ROMs with PXE, and things looked pretty good; the venerable SYSLINUX bootloader grew a PXELINUX variant for PXE-booting Linux, and a number of enterprise system management utilities became available in PXE-bootable form.

But, for whatever reason, the standard hasn’t been updated since 1999. It still operates in terms of the ancient x86 real mode, only supports UDP and a “slow, simple, and stupid” file transfer protocol called TFTP, and officially limits boot program size to 32kB. For modern-day applications, this is less than ideal.

Luckily for us, the Etherboot Project still exists, and Etherboot’s successor gPXE has been picking up where Intel left off, and supports a number of more modern protocols. Between that, excellent support in recent Linux kernels for both accessing and serving SAN disks with high performance, and the flexibility gained by booting with an initial ramdisk, diskless booting is making a big comeback. It’s not even very hard to set up: read on!

The general idea

PXE booting flowchart

While it can get a lot more complicated to support boot menus and proprietary operating systems, the basic netbooting process these days is pretty straightforward. The PXE firmware (usually burned into ROM on the network card) performs a DHCP request, just like most networked computers do to get an IP address. The DHCP server has been configured to provide additional “options” in its reply, specifying where to find boot files. All PXE stacks support booting from a file (a network bootstrap program, NBP); PXELINUX is the NBP most commonly used for booting Linux. The NBP can call back to the PXE stack to ask it to download more files using TFTP.

Alternatively, some PXE stacks (including gPXE) support booting from a networked disk, accessed using a SAN protocol like ATA over Ethernet or iSCSI. Since it’s running in real mode, the firmware can hook the interrupt table to cause other boot-time programs (like the GRUB bootloader) to see an extra hard disk attached to the system; unbeknownst to these programs, requests to read sectors from that hard disk are actually being routed over the network.

If you want to boot a real-mode operating system like DOS, you can stop there; DOS never looks beyond the hooked interrupt, so it never has to know about the network at all. We’re interested in booting Linux, though, and Linux has to manage the network card itself. When the kernel is booted, the PXE firmware becomes inaccessible, so it falls to the initial ramdisk (initrd or initramfs) to establish its own connection to the boot server so it can mount the root filesystem.

Setting up the client

We’re going to walk through setting up network booting for a group of similar Ubuntu Lucid systems using disk images served over iSCSI. (The instructions should work with Karmic as well.) iSCSI runs on top of a TCP/IP stack, so it’ll work fine within your current network infrastructure. Even over 100Mbps Ethernet, it’s not significantly slower than a local disk boot, and certainly faster than a live CD. Rebooting may be obsolete, but short bootup times are still good to have!

You’ll want to start by installing one Ubuntu system and configuring it how you’ll want all of your diskless clients to be configured. There’s room for individual configuration (like setting unique hostnames and maybe passwords) later on, but the more you can do once here, the less you’ll have to do or script for all however-many clients you have. When they’re booted, the clients will find your networked drive just like a real hard drive; it’ll show up as /dev/sda, in /proc/scsi/scsi, and so forth, so you can pretty much configure them just like any local machine.

No matter what site-specific configuration choices you make, there are some steps you’ll need to perform to make the image iSCSI bootable. First, you’ll need to install the iSCSI initiator, which makes the connection to the server housing the boot disk image:

client# aptitude install open-iscsi

That connection will need to occur during the earliest stages of bootup, in the scripts on the initial ramdisk. open-iscsi can automatically update the initramfs to find and mount the iSCSI device, but it assumes you’ll be setting a bunch of parameters in a configuration file to point it in the right place. It’s quite cumbersome to set this up separately for every node, so I have prepared a patch that will make the initramfs automatically set these values based on the “boot firmware table” created by the iSCSI boot firmware from the information provided by your DHCP server. You should apply it now with

client# wget http://etherboot.org/share/oremanj/ubuntu-iscsi-ibft.patch
client# patch -p0 -i ubuntu-iscsi-ibft.patch
patching file /usr/share/initramfs-tools/hooks/iscsi
patching file /usr/share/initramfs-tools/scripts/local-top/iscsi

Next, tell the setup scripts you want boot-time iSCSI and regenerate the ramdisk to include your modifications:

client# touch /etc/iscsi/iscsi.initramfs
client# update-initramfs -u

Finally, make sure the clients will get an IP address at boot time, so they can get to their root filesystem:

client# vi /etc/default/grub
[find the GRUB_CMDLINE_LINUX line and add ip=dhcp to it; e.g. GRUB_CMDLINE_LINUX="" becomes GRUB_CMDLINE_LINUX="ip=dhcp"]
client# update-grub
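If you’d rather script that edit than make it by hand, something like the following filter could work. It’s a sketch, not part of the original setup: add_ip_dhcp is a made-up name, and it reads the file on standard input and writes the result to standard output so you can inspect the result before replacing /etc/default/grub and running update-grub.

```shell
#!/bin/sh
# Append ip=dhcp to GRUB_CMDLINE_LINUX, leaving every other line alone.
# add_ip_dhcp is an illustrative helper; it filters stdin to stdout.
add_ip_dhcp() {
    sed -e '/^GRUB_CMDLINE_LINUX=".*ip=dhcp/b' \
        -e 's/^GRUB_CMDLINE_LINUX="\(..*\)"$/GRUB_CMDLINE_LINUX="\1 ip=dhcp"/' \
        -e 's/^GRUB_CMDLINE_LINUX=""$/GRUB_CMDLINE_LINUX="ip=dhcp"/'
}

# Try it on a sample line rather than the real file:
printf 'GRUB_CMDLINE_LINUX=""\n' | add_ip_dhcp
```

The first sed expression skips lines that already mention ip=dhcp, so running the filter twice should not add the option a second time. GRUB_CMDLINE_LINUX_DEFAULT is deliberately left untouched.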

Setting up the storage

Let’s assume you’ve set up the prototype client as above, and you now have an image of its hard disk in a file somewhere. Because the disk-level utilities we’ll be using don’t know how to deal with files, it’s necessary to create a loop device to bridge the two:

server# losetup -v -f /path/to/ubuntu-image
Loop device is /dev/loop0

If you get different output, or if the client disk image you created is already on a “real” block device (e.g. using LVM), replace /dev/loop0 with that device in the below examples.

You may be familiar with the Linux device mapper, probably as the backend behind LVM, but there’s a lot more it can do. In particular, it gives us very easy copy-on-write (COW) semantics: you can create multiple overlays on a shared image, such that writes to the overlay get stored separately from the underlying image, reads of areas you’ve written come from the overlay, and reads of areas you’ve not modified come from the underlying image, all transparently. The shared image is not modified, and the overlays are only as large as is necessary to store each one’s changes. Let’s suppose you’ve got some space in /cow that you want to use for the overlay images; then you can create five of them named /cow/overlay-1.img through /cow/overlay-5.img with

server# for i in $(seq 1 5); do
> dd if=/dev/zero of=/cow/overlay-$i.img bs=512 count=1 seek=10M
> done

(10M blocks * 512 bytes per block = 5GB per overlay; this represents the amount of new data that can be written “on top of” the base image.)
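If you want a different overlay size, the arithmetic is simple enough to do in the shell. A quick sketch (overlay_bytes is just an illustrative name, and it assumes your shell does 64-bit arithmetic, as bash and dash do on current systems):

```shell
#!/bin/sh
# Nominal capacity of an overlay created with
#   dd if=/dev/zero of=... bs=<block_size> count=1 seek=<blocks>
# dd's "M" suffix means 1024*1024, so seek=10M is 10485760 blocks.
# (The file itself is one block longer, because dd writes a single
# block after seeking.)
overlay_bytes() {
    blocks=$1
    block_size=$2
    echo $(( $blocks * $block_size ))
}

bytes=$(overlay_bytes 10485760 512)
echo "$bytes bytes = $(( $bytes / 1024 / 1024 / 1024 ))GB"
```

Because dd only writes the final block, the overlay files are sparse: they take up almost no real disk space until clients start writing.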

Now for the fun part. The aforementioned snapshot functionality is provided by the dm-snapshot module; it’s a standard part of the Linux device mapper, but you might not have it loaded if you’ve not used the snapshotting feature before. Rectify that if necessary:

server# modprobe dm-snapshot

and set up the copy-on-write like this:

server# for i in $(seq 1 5); do
> loopdev=$(losetup -f)
> losetup $loopdev /cow/overlay-$i.img
> echo "0 $(blockdev --getsize /dev/loop0) snapshot /dev/loop0 $loopdev p 8" | dmsetup create image-$i
> done

This sequence of commands assigns a loopback device to each COW overlay file (to make it look like a normal block device) and creates a bunch of /dev/mapper/image-n devices from which each client will boot. The 8 in the above line is the “chunk size” in 512-byte blocks, i.e. the size of the modified regions that the overlay device will record. A large chunk size wastes more disk space if you’re only modifying a byte here and there, but may increase performance by lowering the overhead of the COW setup. p makes the overlays “persistent”; i.e., all relevant state is stored in the image itself, so it can survive a reboot.
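To make the pieces of that table line easier to see, here’s a small sketch that assembles it from named parts and converts a chunk size to bytes. The helper names are invented, and the 20971520-sector origin size is a made-up example; on a real server the size comes from blockdev as shown above.

```shell
#!/bin/sh
# Build the one-line device-mapper table for a persistent snapshot:
#   <start> <length> snapshot <origin> <COW device> p <chunk size>
# Lengths and chunk sizes are counted in 512-byte sectors.
# snapshot_table and chunk_bytes are illustrative helper names.
snapshot_table() {
    sectors=$1; origin=$2; cow=$3; chunk=$4
    printf '0 %s snapshot %s %s p %s\n' "$sectors" "$origin" "$cow" "$chunk"
}

chunk_bytes() {
    echo $(( $1 * 512 ))
}

# Example with a made-up 10GB origin (20971520 sectors):
snapshot_table 20971520 /dev/loop0 /dev/loop1 8
chunk_bytes 8
```

With a chunk size of 8, each modified region is tracked in 4096-byte units, so changing a single byte costs 4kB of overlay space.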

You can tear down the overlays with dmsetup remove:

server# for i in $(seq 1 5); do
> dmsetup remove image-$i
> done

It’s generally not safe to modify the base image when there are overlays on top of it. However, if you script the changes (hostname and such) that you need to make in the overlays, it should be pretty easy to just blow away the COW files and regenerate everything when you need to do an upgrade.

The loopback device and dmsetup configuration need to be performed again after every reboot, but you can reuse the /cow/overlay-n.img files.

Setting up the server for iSCSI

We now have five client images set up to boot over iSCSI; currently they’re all passing reads through to the single prototype client image, but when the clients start writing to their disks they won’t interfere with each other. All that remains is to set up the iSCSI server and actually boot the clients.

The iSCSI server we’ll be using is called ietd, the iSCSI Enterprise Target daemon; there are several others available, but ietd is simple and mature—perfect for our purposes. Install it:

server# aptitude install iscsitarget

Next, we need to tell ietd where it can find our disk images. The relevant configuration file is /etc/ietd.conf; edit it and add lines like the following:

Target iqn.2008-01.com.ksplice.servertest:client-1
        Lun 0 Path=/dev/mapper/image-1,Type=blockio
Target iqn.2008-01.com.ksplice.servertest:client-2
        Lun 0 Path=/dev/mapper/image-2,Type=blockio
…

Each Target line names an image that can be mounted over iSCSI, using a hierarchical naming scheme called the “iSCSI Qualified Name” or IQN. In the example above, the com.ksplice.servertest should be replaced with the reverse DNS name of your organization’s domain, and 2008-01 with a year and month as of which that name validly referred to you. The part after the colon determines the specific resource being named; in this case these are the per-client drives (client-1 and so on). None of this is required (your clients will quite happily boot images introduced as Target worfle-blarg), but it’s customary, and useful in managing large setups. The Lun 0 line specifies a backing store to use for the first SCSI logical unit number of the exported device. (Multi-LUN configurations are outside the scope of this post.)
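Typing these stanzas by hand gets old quickly, so you may prefer to generate them. Here’s one way to sketch it, numbered to match the image-1 through image-5 devices created earlier. The function names are made up, and IQN_PREFIX just reuses the example name from this post; substitute your own.

```shell
#!/bin/sh
# Print ietd.conf stanzas for clients 1..N.
# IQN_PREFIX is the example name used in this post, not a required value.
IQN_PREFIX="iqn.2008-01.com.ksplice.servertest"

# One Target/Lun pair for client number $1 (illustrative helper).
iet_stanza() {
    printf 'Target %s:client-%s\n' "$IQN_PREFIX" "$1"
    printf '        Lun 0 Path=/dev/mapper/image-%s,Type=blockio\n' "$1"
}

# Stanzas for clients 1 through $1.
iet_config() {
    i=1
    while [ "$i" -le "$1" ]; do
        iet_stanza "$i"
        i=$(( $i + 1 ))
    done
}

iet_config 5
```

You could review the output and then append it to the configuration file yourself.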

Edit /etc/default/iscsitarget and change the one line in that file to:

ISCSITARGET_ENABLE=true

You can then start ietd with

server# /etc/init.d/iscsitarget start

To test that it’s working, you can install open-iscsi and ask the server what images it’s serving up:

server# aptitude install open-iscsi
server# iscsiadm -m discovery -p localhost -t sendtargets
[::1]:3260,1 iqn.2008-01.com.ksplice.servertest:client-1
[::1]:3260,1 iqn.2008-01.com.ksplice.servertest:client-2
…

Setting up DHCP

The only piece that remains is somehow communicating to your clients what they’ll be booting from; if they’re diskless, they don’t have any way to read that information locally. Luckily, you probably already have a DHCP server set up in your organization, and as we mentioned before, it can hand out boot information just as easily as it can hand out IP addresses. You need to have it supply the root-path option (number 17); detailed instructions for ISC dhcpd, the most popular DHCP server, are below.

In order to make sure each client gets the right disk, you’ll need to know their MAC addresses; for this demo’s sake, we’ll assume the addresses are 52:54:00:00:00:0n where n is the client number (1 through 5). Then the lines you’ll need to add to /etc/dhcpd.conf, inside the subnet block corresponding to your network, look like this:

host client-1 {
        hardware ethernet 52:54:00:00:00:01;
        option root-path "iscsi:192.168.1.90::::iqn.2008-01.com.ksplice.servertest:client-1";
}
host client-2 {
        hardware ethernet 52:54:00:00:00:02;
        option root-path "iscsi:192.168.1.90::::iqn.2008-01.com.ksplice.servertest:client-2";
}
…

Replace 192.168.1.90 with the IP address of your iSCSI server. The syntax of the root-path option is actually iscsi:server:protocol:port:lun:iqn, but the middle three fields can be left blank because the defaults (TCP, port 3260, LUN 0) are exactly what we want.
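As with the target list, the host blocks are regular enough to generate. The sketch below builds the root-path string from its fields and prints one host block per client. The helper names are invented, the server address and IQN prefix are the example values from this post, and the MAC addresses follow the demo’s 52:54:00:00:00:0n pattern.

```shell
#!/bin/sh
# Example values from this post; replace with your own.
ISCSI_SERVER="192.168.1.90"
IQN_PREFIX="iqn.2008-01.com.ksplice.servertest"

# iscsi:<server>:<protocol>:<port>:<lun>:<iqn>
# usage: root_path server iqn [protocol] [port] [lun]
# Omitted fields are left empty so the defaults apply.
root_path() {
    printf 'iscsi:%s:%s:%s:%s:%s\n' "$1" "${3:-}" "${4:-}" "${5:-}" "$2"
}

# One dhcpd.conf host block for client number $1 (illustrative helper).
dhcp_host() {
    n=$1
    printf 'host client-%s {\n' "$n"
    printf '        hardware ethernet 52:54:00:00:00:%02x;\n' "$n"
    printf '        option root-path "%s";\n' \
        "$(root_path "$ISCSI_SERVER" "$IQN_PREFIX:client-$n")"
    printf '}\n'
}

for n in 1 2 3 4 5; do
    dhcp_host "$n"
done
```

If you ever do need a non-default port or LUN, pass the extra arguments to root_path and the corresponding fields are filled in instead of left blank.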

Booting the clients

If your clients are equipped with particularly high-end, “server” network cards, you can likely boot them now and everything will Just Work. Most network cards, though, don’t contain an iSCSI initiator; they only know how to boot files downloaded using TFTP. To bridge the gap, we’ll be using gPXE.

gPXE is a very flexible open-source boot firmware that implements the PXE standard as well as a number of extensions: you can download files over HTTP, use symbolic DNS names instead of IP addresses, and (most importantly for our purposes) boot off a SAN disk served over iSCSI. You can burn gPXE into your network card, replacing the less-capable PXE firmware, but that’s likely more hassle than you’d like to go to. You can start it from a CD or USB key, which is great for testing. For long-term use you probably want to set up PXE chainloading; the basic idea is to configure the DHCP server to hand out your root-path when it gets a DHCP request with user class “gPXE”, and the gPXE firmware (in PXE netboot program format) when it gets a request without that user class (coming from your network card’s simple PXE firmware).
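For the chainloading setup, the DHCP host blocks need a conditional on the user class. Here’s a rough sketch of a generator for such blocks, under a few assumptions: NBP_FILE is whatever gPXE image you’ve placed on your TFTP server (undionly.kpxe is one of the images gPXE can be built as), the TFTP server address is configured elsewhere in dhcpd.conf, and chainload_host is a made-up helper name. Treat the output as a starting point to check against your dhcpd’s documentation rather than a tested configuration.

```shell
#!/bin/sh
# Example values from this post; NBP_FILE is an assumption about what
# you have placed on your TFTP server.
ISCSI_SERVER="192.168.1.90"
IQN_PREFIX="iqn.2008-01.com.ksplice.servertest"
NBP_FILE="undionly.kpxe"

# Print a dhcpd.conf host block for client number $1 (1 through 9)
# that chainloads gPXE and then boots from iSCSI.
chainload_host() {
    n=$1
    cat <<EOF
host client-$n {
        hardware ethernet 52:54:00:00:00:0$n;
        if exists user-class and option user-class = "gPXE" {
                # Second request: gPXE is running, so give it the disk.
                option root-path "iscsi:$ISCSI_SERVER::::$IQN_PREFIX:client-$n";
        } else {
                # First request: plain PXE firmware, so give it gPXE.
                filename "$NBP_FILE";
        }
}
EOF
}

chainload_host 1
```

The card’s own PXE firmware makes the first request and downloads gPXE over TFTP; gPXE then repeats the DHCP request with its user class set and receives the root-path instead.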

For now, let’s go the easy-testing route and start gPXE from a CD. Download this 600kB ISO image, burn it to a CD, and boot one of your client machines using it. It will automatically perform DHCP and boot, yielding output something like the below:

gPXE 1.0.0+ -- Open Source Boot Firmware -- http://etherboot.org
Features: AoE HTTP iSCSI DNS TFTP bzImage COMBOOT ELF Multiboot NBI PXE PXEXT
net0: 52:54:00:00:00:01 on PCI00:03.0 (open)
  [Link:up, TX:0 TXE:0 RX:0 RXE:0]
DHCP (net0 52:54:00:00:00:01).... ok
net0: 192.168.1.110/255.255.255.0 gw 192.168.1.54
Booting from root path "iscsi:192.168.1.90::::iqn.2008-01.com.ksplice.servertest:client-1"
Registered as BIOS drive 0x80
Booting from BIOS drive 0x80

after which, thanks to the client setup performed earlier, the boot will proceed just like from a local hard disk. You can eject the CD as soon as you see the gPXE banner; it’s just being used as an oversized ROM chip here.

You’ll probably want to boot each client in turn and, at a minimum, set its hostname to something unique. It’s also possible to script this on the server side by using kpartx on the /dev/mapper/image-n devices, mounting each client’s root partition, and modifying the configuration files therein.
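The server-side approach might look something like the sketch below. The only part that actually runs here is set_client_hostname, a made-up helper that writes /etc/hostname inside an already-mounted root filesystem. The kpartx and mount steps are shown as comments because they need root, and because the partition mapping name and mount point are assumptions you should check on your own system. Only do this while the client in question is powered off.

```shell
#!/bin/sh
# Write a client's hostname into an already-mounted root filesystem.
# set_client_hostname is an illustrative helper; root_dir is wherever
# the client's root partition has been mounted on the server.
set_client_hostname() {
    root_dir=$1
    name=$2
    printf '%s\n' "$name" > "$root_dir/etc/hostname"
}

# On the server, this would be wrapped roughly as follows (as root; the
# mapping name image-1p1 and the mount point are assumptions):
#   kpartx -a /dev/mapper/image-1
#   mount /dev/mapper/image-1p1 /mnt/client-1
#   set_client_hostname /mnt/client-1 client-1
#   umount /mnt/client-1
#   kpartx -d /dev/mapper/image-1
```

Other per-client tweaks can be added to the same helper once the basic loop works.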

That’s it: if you’ve followed these instructions, you now have a basic but complete architecture for network booting a bunch of similar clients. You’ve set up servers to handle iSCSI and DHCP, set up one prototype client from which client disks can be automatically generated, and can easily scale to hosting many more clients just by increasing the number 5 to something larger. (You’d probably want to switch to using LVM logical volumes instead of file-backed loopback devices for performance reasons, though.) The number of clients you can quickly provision is limited only by the capacity of your network. And the next time one of your users decides their computer is an excellent place to stick refrigerator magnets, they won’t be creating any additional headaches for you.