Wednesday, August 6, 2008

I Love My PCI

PCI as in PCI-DSS as in the Payment Card Industry Data Security Standard

We met with a QSA on Monday. Don't ask me what QSA stands for - they're the official PCI auditors. The killer statement from the meeting was that every network device we manage which forwards a packet with payment card data in it - even if that data is encrypted - is within scope for PCI compliance. My understanding is that this means that requirements like regular password rotation, quarterly config reviews, and file integrity monitoring apply to all our network equipment. We run a very secure network, but security != compliance, so we will end up spending a lot of time dotting our I's and crossing our T's. And a lot more time showing auditors that we dotted and crossed !

Tuesday, July 29, 2008

iPhone + Streaming Radio

Okay, this is not really about networking or IU, but I thought it was pretty cool so I figured I'd share it with all of you (which hopefully includes a few more people than I've already told this to in person). *AND* it did involve 1 piece of network equipment owned by IU, so....

Like many people, I'm amazed by many of the 3rd party applications for the iPhone. I was very busy preparing for the Joint Techs workshop last week, so I didn't have much time to "play" with all the new applications for my iPhone. I did, however, download the AOL Radio application a couple of days before leaving for Lincoln. It worked fairly well and I quickly thought it would be quite cool if I could use it in my car while driving ! I'm too cheap to pay for satellite radio, so the idea of being able to listen to radio stations from all over the country in my car caught my eye !

Of course, the first thing I thought was *DOH* - what about that darn GSM interference ? All that buzzing and popping coming through the radio from the streaming audio over the EDGE network wouldn't do. Luckily, I've been testing a Linksys Mobile Broadband router with a Sprint EV-DO card. So I could plug this into the power outlet in my trunk and connect my iPhone to it via Wifi. Note: with the iPhone 2.0 release, you can put the iPhone in "airplane mode" - shutting down the cellular radio - and then enable the Wifi radio :) Problem #1 solved ! BTW- I've been told that HSDPA (AT&T's 3G technology) does not have the same interference issues, but alas I don't have one to test with :-(

The next problem was that Sprint doesn't have 3G in Bloomington yet. So how well would this work over the "slow-as-molasses" 1xRTT network ?

Before I left for the airport, I tossed the Linksys into my trunk (not literally) and plugged it into the power outlet. I dropped (again not literally) my iPhone into the dock in my car and headed out. Shortly after I passed the Bloomington bypass on highway 37, I fired up AOL Radio to see what would happen. The station started, but the audio was in and out, stopping and starting --- unusable :-( I turned it off and went back to listening to a podcast. When I reached Martinsville - safely within Sprint's EV-DO coverage - I tried it again -- tuning into the Jack FM station in Chicago. This time it worked fairly well. Every few minutes there would be a short audio drop as it rebuffered, but all-in-all it worked reasonably well.

While I was in Lincoln, I had some free time to play with my iPhone. I downloaded a bunch of 3rd party apps including Pandora. For those of you who haven't used Pandora, it's a personal radio station application. You pick an artist and it selects songs from that artist and other similar artists. You can give songs a thumbs up or thumbs down and it supposedly adjusts to your tastes.

While in Lincoln, I used Pandora over the EDGE network from my hotel room and walking around town. I was amazed by how well it worked over the EDGE network. Excellent sound quality and almost no rebuffering. I couldn't wait to try it out on the drive home from the airport.

So, last Thursday night while driving home from the airport I tried it out. Amazing ! The quality over both EV-DO and 1xRTT networks was excellent ! Presumably it would be just as good using the cellular radio internal to the iPhone - assuming there wasn't a GSM interference issue. I've been using it for the past several days and have been amazed at how well it works - even down by my house in the southern part of the county where there are definitely some dead spots !

If I ran a satellite radio company, I'd definitely be paying attention to this. It seems to me the major cost for the satellite radio companies is transport - i.e. getting the signal from the head-end to the users. The reason people want satellite radio is the large selection of content that is available anywhere - not just within your local broadcast area. Exchanging satellite transport for IP transport (either over wired or wireless networks) could drastically reduce their costs and increase their availability - i.e. you can get an IP-based connection in places you can't easily get satellite - like in basements !

Wednesday, July 23, 2008

Internet2 Joint Techs

I'm at the Internet2 Joint Techs Workshop in Lincoln, Nebraska this week. The primary reason I'm attending is actually for 2 events that were "tacked-on" to the main workshop: the MPLS Hands-On Workshop on Sunday and the Netguru meeting today and tomorrow.

The MPLS workshop was a 1 day workshop meant to educate campus network engineers about MPLS and its application on campus networks. The morning was spent on presentations and the afternoon on hands-on configuration of MPLS in a lab setting. This was the first MPLS workshop and it went extremely well. There were 22 people in attendance. I was an instructor for the workshop and gave about a 1 hour talk on the control-plane for MPLS VPNs. I plan to reuse the material to provide some MPLS instruction for the networking staff at IU.

The second event I'm attending is the Netguru meeting. Netguru is a small group of network architects from universities around the country. As you might imagine, campus network architects often have lots of challenging problems they're trying to solve and find it very helpful to discuss these with other people who are facing the same challenges. I think it's typical for these folks to have 1 or 2 network architect friends that they discuss issues with on a fairly regular basis. A few years ago I shared a cab ride to an airport with David Richardson and Mark Pepin. David and I got together to discuss networking issues on a fairly regular basis - whenever we were in the same city (David worked at the Univ. of Washington before leaving to work for Amazon). We somehow started talking about how network architects share information and Mark Pepin brought up the idea of starting a small group (10-15 people) of network architects that would meet in conjunction with the I2 Joint Techs workshop to discuss issues of the day. Thus Netguru was born ! We have a full agenda for this afternoon, dinner tonight and all day tomorrow. I've missed the last 2 meetings, so I'm looking forward to the discussions today and tomorrow.

Thursday, July 17, 2008

Catching up (again)...

Well, it's been 3 weeks since my last post, but I assure you we have not been sitting around twiddling our thumbs ! Here's a summary of what's been going on...

The wireless and core upgrade projects are moving along smoothly. About 1,000 of the 1,200 APs in Bloomington have been replaced. We're also starting to complete some of the dorms in Bloomington - so some of the dorm rooms will have wireless by the start of the fall semester. At IUPUI, we're not quite as far along as in Bloomington, but we will have completed wireless upgrades in all the on-campus buildings by the time the UITS change freeze goes into effect on August 18th.

We're finishing up the preparations for adding the "IU Guest" SSID to all the APs. This is the SSID that guests who have been given Network Access Accounts will use to access the network, and it will allow us to shut down our old web portal authentication system. That system has a scaling limitation related to the number of MAC addresses on the wireless network, and we've been putting band-aids in place for 2 years to get it to scale to the number of wireless users we have. The "IU Guest" SSID will use the web-portal authentication built into the HP WESM modules - these do not have the same scaling limitations.

With these projects moving along smoothly, Jason and I have shifted our attention to the *next* set of projects. Here's a bit about what we've been up to...

We spent a day at IU-Northwest talking with them about the major network upgrade they're planning. During the next 12 months they'll be upgrading all their wiring to Cat6e, consolidating IDFs, improving their outside fiber plant, upgrading all their switches to HP5400's, and deploying over 150 new 802.11n APs.

Jason spent a day at IU-Kokomo helping them set up their new HP wireless gear and discussing their future use of HP's Identity Driven Management product. IU-Kokomo undertook a major upgrade of their network earlier this year, replacing all their switches with HP 5400's, and as part of that they purchased HP's Identity Driven Management system. I could devote a whole post just to this (and probably will eventually), but essentially this is a policy engine that lets you decide when and where users can connect to your network and what type of network service they get - which is done by placing them on different VLANs or applying ACLs to their connection. We've been interested in getting our feet wet with a system like this for some time and Kokomo has agreed to be a guinea pig of sorts :) Thanks Chris !
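
I haven't dug into IDM's actual configuration yet, so take this as a conceptual illustration rather than a description of how the product works: a minimal Python sketch of the kind of policy lookup such a system performs. The groups, locations, VLAN numbers and ACL names are all made up.

```python
# Hypothetical sketch of an identity-driven access policy: given who a user
# is, where they're connecting from, and when, return the VLAN and ACL they
# should get. Attribute names and values are illustrative, not IDM's schema.
from dataclasses import dataclass
from datetime import datetime

@dataclass
class AccessRequest:
    username: str
    group: str          # e.g. "faculty", "student", "guest"
    location: str       # e.g. the switch or building the user connected through
    when: datetime

# Ordered policy rules: first match wins.
POLICY = [
    # (predicate, vlan, acl)
    (lambda r: r.group == "guest", 40, "guest-internet-only"),
    (lambda r: r.group == "student" and r.location.startswith("dorm"), 30, "resnet"),
    (lambda r: r.group == "faculty" and 7 <= r.when.hour < 19, 10, "full-access"),
]

DEFAULT = (99, "quarantine")    # anything that matches no rule

def authorize(req: AccessRequest) -> tuple:
    for predicate, vlan, acl in POLICY:
        if predicate(req):
            return vlan, acl
    return DEFAULT

if __name__ == "__main__":
    req = AccessRequest("jdoe", "student", "dorm-sw1", datetime.now())
    vlan, acl = authorize(req)
    print(f"{req.username}: place on VLAN {vlan} with ACL '{acl}'")
```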

We had our yearly retreat with the IT Security Office - now called the University Information Security Office. This is something we've been doing for a few years now. A couple people from ITSO and a couple people from Networks get together off-campus and spend several hours thinking strategically about improving security - instead of the tactical thinking we usually do. Tom Zeller hosted the event again - Tom has a large screened-in porch in the woods and we were able to watch some wildlife in addition to discussing security !

We met with the University Place Conference Center staff at IUPUI to discuss their unique wireless and guest access needs. They have web-portal authentication on both their wireless network and their wired network. The new web-portal system on the HP WESMs only works for wireless users, so when we upgrade wireless in the hotel and conference center, we'll have to do a bit of a one-off for them.

I've been very busy preparing for the upcoming MPLS Workshop at the Internet2 Joint Techs workshop in Lincoln, Nebraska. MPLS VPNs are becoming a hot-button topic for campuses as they struggle to meet the divergent networking needs of their different constituents - from the business side of the university, to student housing, to researchers. In fact, we're planning to roll out MPLS VPNs this fall, so when I was asked to be an instructor for this workshop, I figured it would be a great opportunity to sharpen my skills on MPLS VPNs *AND* I could reuse the materials I develop to provide training for all the UITS networking staff who will need to learn how to support MPLS VPNs ! As part of this process, I put together a small MPLS test lab with 3 routers and, when I return, will use this to start preparing for our MPLS VPN deployment.
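
For readers who haven't been exposed to MPLS VPNs, the control plane boils down to each VRF exporting its routes tagged with route targets and importing only the targets it cares about. Here's a toy Python model of that import/export logic - purely a teaching aid, with made-up VRF names, prefixes and route targets arranged as a generic "shared services" pattern, not our planned design.

```python
# Toy model of the MPLS VPN control plane: each VRF exports its routes tagged
# with route targets (RTs); every VRF imports only routes carrying an RT on
# its import list. VRF names, prefixes and RTs here are made up.

VRFS = {
    # name:          (export RTs,     import RTs)
    "research":      ({"65000:100"},  {"65000:100", "65000:900"}),
    "housing":       ({"65000:200"},  {"65000:200", "65000:900"}),
    "shared-svcs":   ({"65000:900"},  {"65000:100", "65000:200", "65000:900"}),
}

# Routes each VRF originates locally.
LOCAL_ROUTES = {
    "research":    ["10.10.0.0/16"],
    "housing":     ["10.20.0.0/16"],
    "shared-svcs": ["10.250.0.0/24"],   # e.g. DNS/NTP reachable by everyone
}

def build_vpn_tables():
    """Return the routing table each VRF ends up with after RT filtering."""
    # Step 1: every VRF advertises its local routes tagged with its export RTs
    # (in real life this happens via MP-BGP VPNv4 between the PE routers).
    advertised = []
    for vrf, (export_rts, _) in VRFS.items():
        for prefix in LOCAL_ROUTES[vrf]:
            advertised.append((prefix, export_rts))

    # Step 2: each VRF accepts any advertised route sharing an RT with its imports.
    tables = {}
    for vrf, (_, import_rts) in VRFS.items():
        tables[vrf] = sorted(p for p, rts in advertised if rts & import_rts)
    return tables

if __name__ == "__main__":
    for vrf, routes in build_vpn_tables().items():
        print(f"{vrf:12} -> {routes}")
```

Running it shows the research and housing VRFs each see only their own routes plus the shared-services prefix, while shared-svcs sees everything - which is the whole point of route-target filtering.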

We've also continued to develop our plans for networking in the new data center. I'll share more about that once I get past the Joint Techs workshop in Lincoln !

Friday, June 27, 2008

Data Center - Take 2

Shortly after my last post - actually it was the next morning in the shower, which is where I do my best thinking - I was hit by the thought, "doesn't CX4 reach 15 meters instead of 10?". So when I got to work that morning, I looked it up and, sure enough, 10GBASE-CX4 has a 15 meter (49 foot) reach. And that, my friends, makes all the difference !

A 15 meter reach makes it possible to use CX4 as the uplink between all the TOR switches in a 26 cabinet row and a distribution switch located in the middle of said row. This reduces the cost of each TOR switch 10GbE uplink by several thousand dollars.
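
To convince myself the geometry works, here's the back-of-the-envelope math as a small script. The rack width and slack figures are assumptions on my part, not measurements of the new room.

```python
# Back-of-the-envelope check: can a 15 m CX4 run reach from the farthest
# top-of-rack switch in a 26-cabinet row to a distribution switch located
# mid-row? Rack width and slack values are assumptions, not measurements
# of the actual room.

CX4_REACH_M = 15.0        # 10GBASE-CX4 reach
RACK_WIDTH_M = 0.6        # assume ~24" cabinets
RACKS_PER_SIDE = 13       # 26-cabinet row, distribution switch in the middle
VERTICAL_SLACK_M = 3.0    # assumed up/down the racks plus under-floor routing

def worst_case_run() -> float:
    # The farthest cabinet is ~13 cabinet-widths of horizontal run from mid-row.
    return RACKS_PER_SIDE * RACK_WIDTH_M + VERTICAL_SLACK_M

if __name__ == "__main__":
    run = worst_case_run()
    verdict = "fits" if run <= CX4_REACH_M else "does NOT fit"
    print(f"Worst-case run ~{run:.1f} m vs {CX4_REACH_M:.0f} m CX4 reach -> {verdict}")
    # With these assumptions: 13 * 0.6 + 3.0 = 10.8 m - under 15 m, but over
    # the 10 m reach of SFP+ Direct Attach mentioned below.
```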

The result is that, for a server rack with 48 GbE ports, it's substantially less expensive to deploy a TOR switch with 2 10GbE (CX4) uplinks to a distribution switch than to deploy two 24-port patch panels with preterminated Cat6e cables running back to the distribution switch. For a server rack with 24 GbE ports, it's a wash in terms of cost - the TOR switch option being a few percent more expensive. This also means that the cost of a 10G server connection is significantly lower than I originally calculated.
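
If you want to reproduce the comparison, its structure looks roughly like this. Every dollar figure below is a made-up placeholder for illustration - plug in your own quotes before drawing any conclusions.

```python
# Rough shape of the per-rack comparison for 48 GbE ports: a top-of-rack
# switch with two CX4 10GbE uplinks vs. two 24-port patch panels cabled back
# to a zone/distribution switch. Every price is a made-up placeholder --
# substitute real quotes before drawing conclusions.

def tor_cost(edge_switch=3000, cx4_port=800, cx4_cable=100, uplinks=2):
    """48-port GbE TOR switch + CX4 uplinks (a port on each end) + cables."""
    return edge_switch + uplinks * (2 * cx4_port + cx4_cable)

def patch_panel_cost(panel=300, panels=2, preterm_bundle=1200,
                     dist_port=120, ports=48):
    """Two 24-port panels + preterminated Cat6e bundles + 48 zone-switch ports."""
    return panels * (panel + preterm_bundle) + ports * dist_port

if __name__ == "__main__":
    print(f"TOR model:         ${tor_cost():,}")
    print(f"Patch-panel model: ${patch_panel_cost():,}")
```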

The only remaining issue is that, in the new data center, the plan was to distribute power down the middle aisle (13 racks on each side of the aisle) and out to each rack, but to distribute the fiber from the outsides of the rows in. One thing that makes the TOR model less expensive is that you only need 1 distribution switch per 26 racks (13 + 13) whereas with the patch panel model you'd need multiple distribution switches on each side of the aisle (2 or 3 switches per 1/2 row or 13 racks). But having only 1 distribution switch per row means that there would be CX4 cables and fiber crossing over the power cables running down the middle aisle. We have 36" raised floors though, so hopefully there's plenty of vertical space for separating the power cables and network cables.

The other consideration is that it appears to me vendors will be converging on SFP+ as a standard 10G pluggable form-factor - moving away from XENPAK, XFP and X2. If this happens, SFP+ Direct Attach will become the prevalent 10G copper technology, and that, I believe, only has a 10 meter reach. That would lead us back to placing a distribution switch on each side of the aisle (1 per 13 racks instead of 1 per 26 racks) - which will raise the overall cost slightly.

Tuesday, June 24, 2008

Networking the Data Center

As if we had nothing else to do this year, we're busy planning for our new data center in Bloomington that will come on-line in the spring of 2009. I spent the better part of the afternoon yesterday working through a rough cost analysis of the various options to distribute network connectivity into the server racks, so I thought I'd share some of that with all of you. I'll start with a little history lesson :)

The machine rooms originally had a "home-run" model of networking. All the switches were located in one area and individual ethernet cables were "home-run" from servers directly to the switches. If you're ever in the Wrubel machine room, just pick up a floor tile in the older section to see why this model doesn't scale well ;-)

When the IT building @ IUPUI was built, we moved to a "zone" model. There's a rack in each area or "zone" of the room dedicated to network equipment. From each zone rack, cables are run into each server rack with patch panels on each end. All the zone switches had GbE uplinks to a distribution switch. We originally planned for a 24-port patch panel in every other server rack - which seemed like enough way back when - but we've definitely outgrown this ! So, when we started upgrading the Wrubel machine room to the zone model, we planned for 24-ports in every server rack. 24-ports of GbE is still sufficient for many racks, but the higher-density racks are starting to have 48-ports and sometimes 60 or more ports. This is starting to cause some issues !!

But first, why so many ports per rack ? Well, it's not outrageous to consider 30 1RU servers in a 44 RU rack. Most servers come with dual GbE ports built-in and admins want to use one port for their public interface and the second for their private network for backups and such. That's 60 GbE ports in a rack. - OR - In a large VMware environment, each physical server may have 6 or 8 GbE NICs in it: 2 for VMkernel, 2 for console, 2 for public network and maybe 2 more for private network (again backups, front-end to back-end server communications, etc). 8 NICs per physical server, 6 or 8 physical servers per rack and you have 48 to 64 GbE ports per rack.

So, why doesn't the zone model work ? In a nutshell, it's cable management and too much rack space consumed by patch panels. If you figure 12 server racks per "zone" and 48-ports per rack, you end up with 576 Cat6e cables coming into the zone rack. If you use patch panels, even with 48-port 1.5RU patch panels, you consume 18 RU just with patch panels. An HP5412 switch, which is a pretty dense switch, can support 264 GbE ports in 7 RU (assuming you use 1 of the 12 slots for 10G uplinks). So you'll need 2 HP5412s (14 total RU) PLUS an HP5406 (4 RU) to support all those ports. 18 + 14 + 4 = 36 - that's a pretty full rack - and you still need space to run 576 cables between the patch panels and the switches. If you don't use patch panels, you have 576 individual cables coming into the rack to manage. Neither option is very attractive !
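
Here's that rack-space arithmetic as a quick sanity check, using the patch panel and chassis numbers quoted above.

```python
# Sanity check on the zone-rack arithmetic above: 12 server racks at 48 ports
# each, terminated on 48-port 1.5 RU patch panels and switched by HP 5412
# (264 GbE ports in 7 RU) and HP 5406 (120 GbE ports in 4 RU) chassis, each
# keeping one slot free for 10G uplinks.

RACKS_PER_ZONE = 12
PORTS_PER_RACK = 48

total_ports = RACKS_PER_ZONE * PORTS_PER_RACK        # cables into the zone rack

panel_ru = (total_ports / 48) * 1.5                  # 48-port, 1.5 RU panels

# Fill 5412s first; the remainder (48 ports here) fits comfortably in a 5406.
full_5412s, remainder = divmod(total_ports, 264)
switch_ru = full_5412s * 7 + (4 if remainder else 0)

print(f"Cables into the zone rack: {total_ports}")
print(f"Patch panels:              {panel_ru:.0f} RU")
print(f"Switches:                  {switch_ru} RU ({full_5412s} x 5412 + 1 x 5406)")
print(f"Total:                     {panel_ru + switch_ru:.0f} RU of a ~45 RU rack")
```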

Also, if you manage a large VMware environment, with 6 or 8 ethernet connections into each physical server, 10GbE starts looking like an attractive option (at least until you get the bill ;-). Can you collapse the 8 GbE connections into 2 10GbE connections ? The first thing that pops out when you look at this is that the cost to run 10GbE connections across the data center on fiber between servers and switches is simply prohibitive ! 10GBASE-SR optics are usually a couple grand (even at edu discounts), so the cost of a single 10GbE connection over multimode fiber is upwards of $4,000 *just* for the optics - not including the cost of the switch port or the NIC !

For both these reasons (high-density 1G and 10G) a top-of-rack (TOR) switch model starts looking quite attractive. The result is a 3-layer switching model with TOR switches in each rack uplinked to some number of distribution switches that are uplinked to a pair of core switches.

The first downside that pops out is that you have some amount of oversubscription on the TOR switch uplink. With a 48-port GbE switch in a rack, you may have 1 or 2 10GbE uplinks, for roughly a 5:1 or 2.5:1 oversubscription rate. With a 6-port 10GbE TOR switch with 1 or 2 10GbE uplinks, you have a 6:1 or 3:1 ratio. By comparison, with a "zone" model, you have full line-rate between all the connections on a single zone switch, although the oversubscription rate on the zone switch uplink is likely to be much higher (10:1 or 20:1). Also, the TOR switch uplinks are a large fraction of the cost (especially with 10G uplinks), so there's a natural tendency to want to skimp on uplink capacity. For example, you can save a LOT of money by using 4 bonded 1G uplinks (or 2 bundles of 4) instead of 1 or 2 10G uplinks.
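
Here's a little helper that makes the trade-off explicit; the scenarios are the ones discussed in this post.

```python
# Oversubscription = total edge bandwidth sold to servers / uplink bandwidth.
# Scenarios below are the ones discussed in this post.

def oversubscription(edge_ports: int, edge_gbps: float,
                     uplink_ports: int, uplink_gbps: float) -> float:
    return (edge_ports * edge_gbps) / (uplink_ports * uplink_gbps)

scenarios = [
    ("48x1G TOR, 1x10G uplink",   48,  1, 1, 10),
    ("48x1G TOR, 2x10G uplinks",  48,  1, 2, 10),
    ("48x1G TOR, 4x1G bonded",    48,  1, 4,  1),
    ("6x10G TOR, 1x10G uplink",    6, 10, 1, 10),
    ("6x10G TOR, 2x10G uplinks",   6, 10, 2, 10),
    ("264x1G zone switch, 2x10G", 264, 1, 2, 10),
]

for name, ep, eg, up, ug in scenarios:
    print(f"{name:28} {oversubscription(ep, eg, up, ug):>5.1f} : 1")
```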

My conclusion so far is that, if you want to connect servers at 10GbE, you *absolutely* want to go with a TOR switch model. If you need to deliver 48 or more GbE ports per rack, you probably want to go with a TOR model - even though it's a little more expensive - because it avoids a cable management nightmare. If you only need 24-ports (or less) per rack, the "zone" model still probably makes the most sense.

Wednesday, June 18, 2008

Distributing Control

One thing I've done quite a bit of since taking on the network architect role last summer is meet with LSPs to discuss their networking needs. Just yesterday we met with the Center for Genomics and Bioinformatics, this morning we're meeting with the Computer Science department, and Friday with University College @ IUPUI. What I've learned is that there are many excellent LSPs and that they know their local environment better than we ever will.

As the network becomes more complex with firewalls, IPSs, MPLS VPNs and such, I think we (UITS) need to find ways to provide LSPs with more direct access to effect changes to their network configurations and with direct access to information about their network. For example, if an LSP knows they need port 443 open in the firewall for their server, what benefit does it add to have them fill out a form, which opens a ticket, which is assigned to an engineer, who changes the firewall config, updates the ticket and emails the LSP to let them know it's completed ?

Okay, it sounds easy enough to just give LSPs access to directly edit their firewall rules (as one example) - why not just do this ?

First, you have to know which LSPs are responsible for which firewall rules. To do that you first need to know who the "official" LSPs are, but then you also need to know which IP addresses they're "officially" responsible for. It turns out this is a pretty challenging endeavor. I've been told we now have a database of "authoritative" LSPs, populated by having an official contact from the department (e.g. the dean) designate who their LSPs are. But then you need to associate LSPs with IP addresses - and doing this by subnet isn't sufficient since there can be multiple departments on a subnet. The DHCP MAC registration database has a field for LSP, but that only works for DHCP addresses and is an optional user-entered field.
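
To make the problem concrete, here's a sketch of the authorization check a self-service firewall tool would need to make before accepting a change. The LSP names and address blocks are invented (RFC 5737 documentation prefixes), and of course the hard part is keeping the underlying mapping accurate.

```python
# Sketch of the authorization check a self-service firewall tool needs: before
# accepting a rule change, verify the requesting LSP is on record as
# responsible for the destination address. LSP names and prefixes are invented
# (RFC 5737 documentation ranges) for illustration.
import ipaddress

# LSP -> the networks that LSP is authoritative for. Note this is finer-grained
# than a subnet, since several departments can share one subnet.
LSP_SCOPE = {
    "lsp-genomics": [ipaddress.ip_network("192.0.2.0/26")],
    "lsp-cs":       [ipaddress.ip_network("192.0.2.64/26"),
                     ipaddress.ip_network("198.51.100.0/24")],
}

def may_edit_rule(lsp: str, dest_ip: str) -> bool:
    """True if this LSP is officially responsible for dest_ip."""
    addr = ipaddress.ip_address(dest_ip)
    return any(addr in net for net in LSP_SCOPE.get(lsp, []))

if __name__ == "__main__":
    # e.g. an LSP asks to open tcp/443 to one of "their" servers...
    print(may_edit_rule("lsp-genomics", "192.0.2.25"))    # True
    print(may_edit_rule("lsp-genomics", "192.0.2.100"))   # False - not their space
```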

Second, you have to have a UI into the firewall configuration that has an authentication/authorization step that utilizes the LSP-to-IP information. None of the commercial firewall management products I've seen address this need, so it would require custom development. The firewall vendors are all addressing this with the "virtual firewall" feature. This would give each department their own "virtual firewall" which they could control. This sounds all well and good, but there are some caveats.... There are limitations to the number of virtual firewalls you can create. If you have a relatively small number of large departments, this is fine, but a very large number of small departments might be an issue. Also, one advantage of a centrally managed solution is the ability to implement minimum across-the-board security standards. None of the virtual firewall solutions I've seen provide the ability for a central administrator to set base rules for security policy that the virtual firewall admins cannot override.

Third, it is possible to screw things up and, in some rare cases, one person's screw up could affect the entire system. High-end, ASIC-based firewalls are complex beasts and you should really know a bit about what you're doing before you go messing around with them. So would you require LSPs to go through training (internal, vendor, SANS, ?) before having access to configure their virtual firewall ? Would they have to pass some kind of a test ?

I don't think any of these hurdles are show-stoppers, but it will take some time to work through the issues and come up with a good solution. And this is just one example (firewalls) of many. Oh, and people have to actually buy-in to the whole idea of distributing control !