
Showing posts with label enterprise architecture. Show all posts

Saturday, April 13, 2013

Pi and Writing your own SAML Engine

Writing your own SAML Engine, like Pi, is not rational.  Unlike Pi, it can seem rational.  I promise you, writing your own SAML engine is not a rational act.  I have coached and counseled dozens of third-party developers, to the point of even getting in their face, and they don't quite believe me.  Hopefully you'll take this to heart, and avoid some pain.

I've been implementing SAML since 2002.  In fact, I've been doing commercial SAML as long as anyone has.  I was tech lead on the team that was the first to deploy SAML in the real world for real transactions.  We implemented the very first SAML interface to a third party in August 2002, and then did the first 3-way handshake (chained assertion) a few minutes later.  Our commercial system, doing financial transactions via SAML federation, went live 6-Jan-2003.  Since those early days of SAML 1.0, I've implemented dozens of SAML 1.0, 1.1 and 2.0 systems, in roles that have included identity architect, certification engineer, build/run team, solution architect and consultant.  Undoubtedly, there are folks with more SAML experience than I have, but I've got quite a bit, and several implementations that have created scars.

Now that you know my bona fides, let me tell you why creating your own SAML engine is not a rational act... even though it may appear to be a great idea.

Coding a SAML engine is one of the great examples of a bear trap laid before developers, called Just a Matter of Programming (JAMP).  It seems pretty simple, doesn't it?  This is just some WS-Security Web Services with a canonical XML construct, easy-peasy.  Add a dash of auth and a quick crypto lib call, and done!  Yeah, no.  I have not quite seen professional developers break down and cry, but I have seen highly-lauded developers with impressive resumes founder on the rocks of SAML, even though they had all the right skills on paper.  Developers with 10 years' experience doing hundreds of custom Web Services implementations have failed to deliver.  In fact, I can confidently say that I have NEVER seen a home-grown Browser Artifact SAML implementation delivered less than 3 months overdue, and Browser Post Profile SAML is even worse, because getting mutually authenticating digital certificates seems to be a barrier that crack programmers slam against unexpectedly.  The scary thing is that we're not talking about script kiddies, but highly qualified developers that I've seen fail repeatedly -- kinda like watching a NASCAR driver fail to parallel park when given 100 tries.  It's not an expected outcome.
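To give a sense of scale: even the easiest slice of the work, validating an assertion's time window and audience, takes real care, and it's a tiny fraction of a full engine.  Here's a minimal sketch of just that slice; the element names follow the SAML 2.0 assertion namespace, but everything else (function name, error strings) is invented for illustration.  Note what's *missing*: XML canonicalization, XML-DSig signature verification, replay detection, and the binding/profile logic -- the parts that actually sink projects.

```python
# Sketch: only the time-window and audience checks on a SAML 2.0
# <Conditions> element. A real engine must also do canonicalization,
# signature verification, replay detection, and profile handling.
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

NS = {"saml": "urn:oasis:names:tc:SAML:2.0:assertion"}
TIME_FMT = "%Y-%m-%dT%H:%M:%SZ"

def check_conditions(assertion_xml, expected_audience, now):
    """Return a list of validation errors (empty list == checks passed)."""
    errors = []
    root = ET.fromstring(assertion_xml)
    cond = root.find("saml:Conditions", NS)
    if cond is None:
        return ["missing <Conditions>"]
    not_before = datetime.strptime(
        cond.get("NotBefore"), TIME_FMT).replace(tzinfo=timezone.utc)
    not_on_or_after = datetime.strptime(
        cond.get("NotOnOrAfter"), TIME_FMT).replace(tzinfo=timezone.utc)
    if now < not_before:
        errors.append("assertion not yet valid")
    if now >= not_on_or_after:
        errors.append("assertion expired")
    audiences = [a.text for a in cond.findall(
        "saml:AudienceRestriction/saml:Audience", NS)]
    if expected_audience not in audiences:
        errors.append("audience mismatch")
    return errors
```

Multiply this by every element, binding, clock-skew rule, and error path in the spec, and the JAMP trap becomes visible.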

Perhaps you've had differing experience.  If so, you've been lucky.  I have yet to discuss homegrown SAML engine development with an experienced federation colleague and have anyone take the position that writing your own SAML engine is a good idea.

I'll admit that it's been 14 years since I was a hard-core client-server developer.  I cannot speak authoritatively and explicitly to the exact reason why this is hard, but I can point out that a SAML engine is the intersection of interactive web services, session management, access management, cryptography and web access management engines.  Add in load-balancers, firewalls, VLANs, application firewalls and packet rewrites along the way, and it's a devil's brew.  I've also noticed that the incredible complexity that has been added within .Net and Java means that developers increasingly do not have a strong systems perspective.  Most developers seem to focus entirely within software, so may not readily understand network communications and protocols.  Whatever the actual cause, I can tell you that it's a big barrier.

Homegrown SAML engines also have enormous quality problems.  Since the sunny-day positive-test scenario takes weeks to develop and get working, you can bet there is a lot of buggy code that will be discovered as unexpected conditions result in un-trapped exceptions.  In one notable case I saw a decade ago, whenever a load balancing error resulted in a timeout, an un-trapped exception in the code caused the last session to be provided to the new user -- session hijacking was the result.
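A hypothetical reconstruction of that failure mode, with all names invented for illustration: the handler caches the last resolved session in shared mutable state, and a swallowed timeout leaves the previous user's session in place for the next request.

```python
# Anti-pattern sketch: shared mutable state plus a swallowed exception.
# This is an illustration of the bug class, not the actual vendor code.
_current_session = None  # dangerous module-level state

def authenticate(lookup, request_id):
    """BUG: if lookup() times out, _current_session still holds the
    PREVIOUS user's session, which is then handed to the new user."""
    global _current_session
    try:
        _current_session = lookup(request_id)
    except TimeoutError:
        pass  # exception effectively un-trapped; stale state survives
    return _current_session

def authenticate_fixed(lookup, request_id):
    # Safer pattern: no shared state, and failures propagate so the
    # caller can translate them into an error response, never a session.
    return lookup(request_id)
```

The positive-test path works perfectly in both versions; only a load-balancer timeout in production reveals the hijack.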

Without a commercial SAML engine, who will provide documentation, support, training, maintenance, patching, upgrades and troubleshooting?  With a commercial SAML engine, changing the SAML exchange is a simple configuration change, while home-grown SAML means custom code changes.  Hopefully, the hot-shot developer who decided to write a custom engine is still available, and can figure out their code... but, frequently, they've already moved on.

Now that ADFS 2.0 provides for full SAML 2.0 interoperability, there is no longer any excuse.  It's been bad practice for a decade, it's time to kill this practice once and for all.  If your developers want to play with SAML, let them do so at home.  Meanwhile, buy a commercial SAML engine.  You'll be glad you did.

The next time someone suggests coding their own SAML engine, tell them it's as rational as writing your own Microsoft Excel, since you don't want to pay $150 for a spreadsheet.  Sure, Excel is just some C++ code, so anyone could write a replacement Excel, but it's not a rational act!

Wednesday, March 13, 2013

Technical Debt vs. Managed Technical Debt


I was invited by Rafal Los (@Wh1t3Rabbit) to publish as guest blogger to his HP blog... reposted here for posterity.
----
If you’ve been paying attention to the Enterprise Architecture space, the notion of measuring, managing and avoiding technical debt has come into the forefront in the past 5 years.  Broadly attributed to Ward Cunningham, technical debt is a concept spawned from the adjacent concept of design debt.  The idea of technical debt is that decisions made throughout the SDLC leave the delivered product short of the ideal, and that this is deficit spending you'll have to pay for eventually.

There might be a compelling business reason to host a public application with web/app/database on the same physical server, implement deprecated function calls, or to use MS-SQL 2008 for your database because you're constrained in some fashion; this is all cruft that will likely result in future efforts to correct the architectural misstep.  In essence, you're willing to saddle the organization, business unit, and that application with "technical debt" because of an imperative like time-to-market, budget shortfall, or architectural constraints imposed from legacy investments (i.e. - creating more technical debt because of existing technical debt).

You've likely gritted your teeth several times in your career when you run into problems of technical debt -- Visual Basic codebase, NT servers still in use, use of telnet, users with desktop admin privileges, use of custom cryptography, all of these are examples of intractable technical debt in a large enterprise.

We all have technical debt -- but is it managed?  In my experience across many large organizations, while people get the concept of technical debt after a fashion, the governance of it is exceptionally difficult, because it means having teeth in your governance program sufficient to make very hard business decisions.  To transition from notional concepts of technical debt to actual _managed_ technical debt, you have to have accountability for the technical debt, a way of measuring and reporting on it, and a way of managing it into future budget cycles.  From discussions with peers, few organizations have a mature enough Enterprise Architecture and portfolio management practice to be able to manage that in a suitably complex environment.  When your network is so large that no single person can know it all, or no single person has visited all your facilities, then managing technical debt becomes a difficult problem.  In most organizations of that size, your portfolio management and EA teams would be happy if they could just know all the applications (and cloud apps) installed by shadow IT.

Measuring "relative evil" as technical debt when an application is implemented, warts and all, as a means of assessing progress (or regression) is a great way to drive visibility through metrics.  However, when it comes to actually putting the wheels on tech debt governance and driving it down the road, that's another thing entirely.  That requires a level of sophistication in portfolio management that most organizations are not ready to achieve, IMHO.  However, in organizations that can pull it off, there are some hard dollars to be found in repairs and maintenance (R&M), rework, outages, costs passed along to future projects, brittle architecture, and lack of first-mover advantage.  In other words, it's worth the trip.

As architects, it's our job to articulate those issues to the ones making the decisions to add technical debt, and make it clear that it's not cheaper, not by a long shot -- in my experience, the total cost is probably 10x what the project right in front of them is "saving" by adding the technical debt.  I've also advised several times to go with the tech debt to gain first-mover advantage in a new market, because it's the right thing for the business, tech debt be hanged.

This brings us to the brink of security debt, which is a very useful term if you can make it stick.  Once people latch onto the notion of technical debt, you have fertile ground for making the leap to security debt, as a way of managing and measuring the costs associated with deviation from policy, standards and suitably good practice.  However, notice I referred both to the brink of security debt, and leaping.  Be careful where you take that, because if your organization cannot effectively measure and manage technical debt, they likely aren't ready for security debt.  However, a good quantitative analysis method (I'm a fan of Jack Jones' FAIR model) combined with metrics and collaboration with audit can likely create a solid picture during the SDLC of the actual probabilistic losses associated with security debt being added to the organization, and that can be a powerful tool.
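To make the quantitative idea concrete, here is a minimal Monte Carlo sketch in the spirit of FAIR-style analysis: estimate annualized loss exposure for one item of security debt from a loss-event-frequency range and a per-event loss-magnitude range.  The uniform distributions and all figures below are invented placeholders, not calibrated estimates, and real FAIR work uses calibrated ranges and richer distributions.

```python
# Simplified FAIR-flavored Monte Carlo: annualized loss exposure from
# (loss event frequency) x (loss magnitude). Uniform distributions and
# the example ranges are placeholders for illustration only.
import random

def simulate_annual_loss(freq_low, freq_high, loss_low, loss_high,
                         trials=100_000, seed=42):
    rng = random.Random(seed)  # seeded for repeatability
    total = 0.0
    for _ in range(trials):
        events = rng.uniform(freq_low, freq_high)        # events per year
        loss_per_event = rng.uniform(loss_low, loss_high)  # $ per event
        total += events * loss_per_event
    return total / trials  # mean annualized loss exposure

# Hypothetical debt item: 0.1-0.5 loss events/year, $50k-$500k per event.
exposure = simulate_annual_loss(0.1, 0.5, 50_000, 500_000)
```

A number like that, presented during the SDLC alongside the project's claimed "savings", is what turns security debt from a slogan into a decision input.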

As Dan Geer has said many times, in Information Security, the future belongs to the quants.

---------------