IVOA

Review of Shibboleth
Version 0.1

IVOA WG Internal Draft 2005 May 09

Working Group:
http://www.ivoa.net/twiki/bin/view/IVOA/IvoaGridAndWebServices
Author(s):
Guy Rixon

Status of this Document

This is a Note. The first release of this document was 2005-05-09. A list of current IVOA Recommendations and other technical documents can be found at http://www.ivoa.net/Documents/.

Acknowledgments

The ideas in this note arose from discussions in IVOA meetings, in AstroGrid and in the VOTech-DS3 component of EuroVO. I thank my colleagues there for all their ideas and advice. In particular, I thank Reagan Moore for first pointing out that Shibboleth was useful and Andy Lawrence for insisting that it still be considered when I'd discovered the problems and was becoming dismissive.

The idea of using Shibboleth services with PKI authentication was taken from Von Welch's "GridShib" project, as presented at the UK e-Science meeting on security [e-Science], although the details have been worked out separately and the eventual solution may not be the same.

1. Introduction

Shibboleth [Shibboleth] is an on-line security system developed by the Internet 2 project. It provides authentication functions for HTTP services and also serves information useful for authorization decisions. At base, Shibboleth is a single-sign-on (SSO) facility for a grid of HTTP services. The authors of Shibboleth are particularly concerned with the case where the user agent is a web browser and the HTTP service is a conventional web server.

Shibboleth attempts to deal with huge numbers of users by 'federating administration of users'. This means that users are registered, and their accounts managed, at the users' home institutions. These registration details are looked up at run-time by the 'payload' services needing to authenticate the users; Shibboleth is the set of middleware that allows the payload services to perform this authentication.

The problem addressed by Shibboleth is very close to that faced by the IVO in controlling access to resources. Shibboleth is a possible solution, both as a design for an SSO system and as a reference implementation. This paper examines the Shibboleth design, looking at how it might fit into the IVOA architecture. At the time of writing, I have not evaluated the Shibboleth implementation; that could be done if there is sufficient interest in adopting the design.

2. Shibboleth architecture and protocols

2.1 Components

Shibboleth adds a number of services to the basic, unsecured system. At each site with web resources needing to be secured, there is a Shibboleth Attribute Requester (SHAR) and a Shibboleth Indexical Reference Establisher (SHIRE). At each site where users are registered, there is an Attribute Authority (AA), a Handle Service (HS) and a local authentication system (by which users sign on). There is also, typically, a Where Are You From (WAYF) service. The WAYF is chosen by the service provider, so, in principle, each payload service could run its own WAYF. However, a WAYF is typically provided by the virtual organization in which the services are federated.

The SHIRE, WAYF and HS are used to establish a handle for a user. This handle is an anonymized, unique reference to the user that is understood by the other services at the user's point of registration.

The SHAR and AA are used to authenticate the user's use of a handle and to provide 'attributes' to the payload service that allow it to make an authorization decision.

2.2 Protocols

The security-check process is in three parts.

  1. The user's home site is determined. This is the site where the user is registered and where the user can sign on using a local password.
  2. A 'handle' is established for the user's software agent. This is a unique, opaque name that the system uses to identify the agent. It does not identify the user.
  3. The attributes for the user are obtained and an authorization decision is made at the payload service site. The attribute server is able to map from the handle to the user identity and thus find the user's attributes. Attributes are things like group membership.

2.2.1 Determining the user's home site

The SHIRE intercepts the attempted access to the web resource and initiates the search for a handle. It typically delegates this to a WAYF service. The delegation is done by sending a redirect response to the user agent. The URL to which the agent is redirected contains two parameters: the URL that the agent was originally trying to reach and a service endpoint in the SHIRE to which a handle may be sent.
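The redirect issued by the SHIRE can be sketched as follows. This is a minimal illustration in Python, not code from the reference implementation; the query-parameter names `target` and `shire`, the function name and all of the URLs are assumptions made for the example.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_wayf_redirect(wayf_url, target_url, shire_callback_url):
    """Return the URL to which the SHIRE redirects the user agent.

    The parameter names 'target' and 'shire' are assumptions for this sketch.
    """
    query = urlencode({"target": target_url, "shire": shire_callback_url})
    return f"{wayf_url}?{query}"


# Hypothetical endpoints, for illustration only.
redirect = build_wayf_redirect(
    "https://wayf.example.org/WAYF",
    "https://data.example.org/images/siap?POS=180,0&SIZE=0.5",
    "https://data.example.org/shire/callback",
)

# The WAYF can recover both values from the query string.
params = parse_qs(urlsplit(redirect).query)
print(redirect)
```

Note that the original URL, including its own query string, must be percent-encoded so that it survives as a single parameter of the redirect.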

The WAYF 'interacts with [the user] to find out his origin site'. The Shibboleth documents explicitly decline to state how this interaction is carried out. The protocol specification says:

'A WAYF is free to interact with the principal's user agent in any manner it deems appropriate to determine the identity provider to which to relay the authentication request. This includes, but is not limited to, presenting lists, a search interface, heuristics based on client characteristics, etc. A WAYF service SHOULD provide some means for the user agent to cache the user's selection, perhaps using HTTP cookies, but SHOULD also provide a reasonable means for the user to change the selection in the future.'

Given that Shibboleth is intended for use with web browsers, this means the WAYF sends the user agent a form for the user to fill in. Having accepted the form, the WAYF sets one or more cookies to retain the information for future authentications. This is the only method that can work with current web-browsers. The details of the form and the cookies are not defined and can vary between virtual organizations.

2.2.2 Getting the handle

The WAYF redirects the user agent to the HS, passing on in the URL the two parameters that it received from the SHIRE: the target URL and the call-back URL on the SHIRE. The HS derives a handle for the user. The means by which the HS identifies the user is not specified exactly:

'...the principal is identified by the identity provider by some means outside the scope of this specification. This may require a new act of authentication or it may reuse an existing authenticated session.'

Given that the user agent is typically a web browser, 'reuse an existing session' implies that the HS looks for an HTTP cookie set when the user originally logged on to the system.

The HS encodes the handle in a SAMLResponse (an XML structure defined by the Security Assertion Mark-up Language). The HS signs this structure digitally.

The HS then returns an HTML form to the user agent and the agent is assumed to display it to the user. The form tells the user what security information is being shared with the SHIRE; the SAMLResponse carrying that information is present as a base-64-encoded, hidden parameter of the form. Submitting the form sends an HTTP-post message to the callback endpoint specified by the SHIRE when it started the authentication process.

There is provision for automating this stage:

'Furthermore [the HS] MAY include in the response sufficient client-side scripting to cause the form to be submitted automatically without intervention by the user...'

'Client-side scripting' presumably means Javascript or ECMAscript.
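The form returned by the HS, with and without the optional automatic submission, might be generated along the lines of the following sketch. The field names `SAMLResponse` and `TARGET`, the function name, the placeholder (unsigned) response and the URLs are all assumptions for illustration; a real HS would embed a digitally signed SAMLResponse.

```python
import base64
from html import escape

# Placeholder only: a real SAMLResponse is a signed SAML structure.
SAMPLE_RESPONSE = "<Response>placeholder, unsigned</Response>"


def build_handle_form(saml_response_xml, shire_callback_url, target_url,
                      auto_submit=False):
    """Return an HTML page whose form posts the handle to the SHIRE."""
    # The SAMLResponse travels as a base-64-encoded hidden parameter.
    encoded = base64.b64encode(saml_response_xml.encode("utf-8")).decode("ascii")
    # Optional client-side scripting to submit without user intervention.
    onload = ' onload="document.forms[0].submit()"' if auto_submit else ""
    action = escape(shire_callback_url, quote=True)
    target = escape(target_url, quote=True)
    return (
        f"<html><body{onload}>\n"
        f'<form method="post" action="{action}">\n'
        f'<input type="hidden" name="SAMLResponse" value="{encoded}"/>\n'
        f'<input type="hidden" name="TARGET" value="{target}"/>\n'
        '<input type="submit" value="Continue"/>\n'
        "</form>\n</body></html>"
    )


form = build_handle_form(
    SAMPLE_RESPONSE,
    "https://data.example.org/shire/callback",
    "https://data.example.org/images/siap?POS=180,0&SIZE=0.5",
)
print(form)
```

An agent that is not a browser would have to parse this page, extract the hidden fields and perform the HTTP post itself.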

The SHIRE is required to validate the signature on the SAMLResponse. However, the details of how it does this are not fully specified by Shibboleth:

'The verification key is assumed to be obtainable through unspecified means (e.g. in a certificate passed along with the [SAMLResponse]; also unspecified is how the association between that key and the HS is to be validated by the SHIRE...'

If the SHIRE accepts the handle, it passes it to the SHAR by unspecified means:

'Shibboleth doesn't specify the interaction between the SHIRE and the SHAR components. In many, perhaps most, cases, the SHIRE and SHAR will be elements of a common implementation module within an HTTP server...'

2.2.3 Getting the attributes

Once the SHAR has the handle, it can ask for SAML 'attributes' relating to that handle. It sends a SAML request to the Attribute Authority and receives back a SAML response. The SAML specification gives schemata for this request and response and the Shibboleth specification defines which parts of SAML must be used.

The SHARs and AAs in a given virtual organization may use any protocol, but the Shibboleth specification requires that both support SOAP 1.1 over HTTPS. This is presumably the protocol supported by the reference implementation of Shibboleth.
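As a rough illustration of the kind of message the SHAR sends, the following sketch wraps a SAML attribute query in a SOAP 1.1 envelope. The element and attribute names follow my reading of SAML 1.1 and have not been validated against the schemata; the Shibboleth profile may require further attributes (for example on the name identifier). The handle, resource URL and function name are hypothetical.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from uuid import uuid4

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
SAMLP_NS = "urn:oasis:names:tc:SAML:1.0:protocol"
SAML_NS = "urn:oasis:names:tc:SAML:1.0:assertion"


def build_attribute_query(handle, resource_url):
    """Return a SOAP 1.1 envelope (as text) carrying a SAML attribute query."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    request = ET.SubElement(body, f"{{{SAMLP_NS}}}Request", {
        "RequestID": f"_{uuid4().hex}",  # identifier must not start with a digit
        "MajorVersion": "1",
        "MinorVersion": "1",
        "IssueInstant": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
    })
    query = ET.SubElement(request, f"{{{SAMLP_NS}}}AttributeQuery",
                          {"Resource": resource_url})
    subject = ET.SubElement(query, f"{{{SAML_NS}}}Subject")
    # The subject of the query is the opaque handle, not the user's name.
    name_id = ET.SubElement(subject, f"{{{SAML_NS}}}NameIdentifier")
    name_id.text = handle
    return ET.tostring(envelope, encoding="unicode")


message = build_attribute_query("opaque-handle-1234",
                                "https://data.example.org/images/siap")
print(message)
```

The SHAR would post this text to the AA over HTTPS and parse the SAML response from the returned SOAP body.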

3. Advantages of Shibboleth

Shibboleth has some features considered important for the IVO.

  1. It provides SSO authentication across a virtual organization.
  2. It disperses the user management to the origin sites such that the payload sites don't have to do it.
  3. It supports group-based and role-based authorization via its attribute service.

Shibboleth also comes with a working reference implementation. It can be added to a simple web-server without writing any code and, apparently, without changing any of the content on that web server. Thus, Shibboleth security can easily be applied to static files on a web server and to CGI services on such a server. It would be quite straightforward, for example, to apply it to the SIAP service ivo://uk.ac.cam.ast/INT-WFS/images/siap-atlas, which is a CGI programme running on an Apache web-server. However, for the reasons listed in the following section, it would be harder to apply Shibboleth to a service made of Java servlets and very hard to apply Shibboleth to any SOAP service.

Registering users and recording their attributes is unwelcome work. If a site with IVO users has a working Shibboleth system, then it is attractive for the IVO to reuse that system. In fact, the most valuable resource to reuse is the information in the system: we want to minimize the number of security-support services run by IVO sites. Several American universities have Shibboleth. Most (all?) universities in Switzerland have it. In the UK, JISC is proposing Shibboleth as standard infrastructure for universities. Shibboleth is emerging as a standard infrastructure for internet security; it remains to be seen whether it can be the standard infrastructure for all services.

4. Assumptions and limitations of the Shibboleth design

Shibboleth is designed to control access to web pages by users with web browsers. The primary use-case in the Shibboleth specification is 'Joe surfs the web'. This focus severely restricts the use of other user agents. Shibboleth assumes the following points.

  1. The user is present and can answer questions during access to the web-resource.
  2. The user agent can display an HTML form to the user and get a response.
  3. The user agent can be redirected to a different URL.
  4. The web resource is requested using the HTTP protocol.
  5. The user agent accepts HTTP cookies.
  6. Cookies carrying the authentication record are valid throughout the virtual organization.
  7. The authors of the HS and SHIRE agree on the digital-signature method.
  8. The authors of the SHIRE and SHAR agree on a communications mechanism.
  9. The authors of the AA and SHAR agree on a communications mechanism.
  10. The user's original request to access a web resource is entirely specified by the URL of the resource.

I am not sure about point 10; the architecture documents imply it but do not state it.

Points 1 and 2 and, to a lesser extent, points 3 and 5 make it difficult to use a Shibboleth-protected service from a user-agent that is not a web browser. Even the common command-line tool wget will not cope with a Shibboleth-protected web-site. It is not feasible in the general case to write a user agent that can cope autonomously with the arbitrary, unspecified form sent by a WAYF; however, it might be possible to automate this process for a given, known WAYF.

Points 7, 8 and 9 make it difficult to build a Shibboleth system out of parts from different authors. So much protocol is left unspecified that I assume that Shibboleth works only because it is a single, reference implementation. An alternative implementation of some part could be made by reverse-engineering the protocol from the reference implementation (the source-code is open), but this could prove fragile.

Point 4 makes it difficult to use Shibboleth to protect resources on an FTP server. It is perhaps possible to map every FTP URL to an equivalent HTTP URL, and to redirect from the HTTP server to the FTP server after the security check; but this makes the work of setting up web resources more complex and fragile, and the underlying FTP URLs are then only obscure, not truly secure.

The most-limiting point is number 10. If I understand the architecture correctly, the body of a request for a web resource is lost in the process of authentication; only the URL, with any embedded parameters, survives. This means that SOAP messages sent over HTTP are destroyed by Shibboleth, unless the SOAP envelope is encoded and embedded in the URL, which is a perverse way to use SOAP.
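The concern can be shown with a toy model. This sketch encodes my understanding of the architecture as stated above, not verified behaviour of the reference implementation; the class, the function names, the parameter names and the URLs are all invented for the illustration.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import urlencode, urlsplit, parse_qs


@dataclass
class Request:
    method: str
    url: str
    body: Optional[bytes] = None


def redirect_for_authentication(original, wayf_url, shire_callback_url):
    """Model the SHIRE's redirect: only the URL of the request is recorded."""
    query = urlencode({"target": original.url, "shire": shire_callback_url})
    return f"{wayf_url}?{query}"


def resume_after_authentication(redirect_url):
    """Model the return to the target: all that can be rebuilt is the URL."""
    target = parse_qs(urlsplit(redirect_url).query)["target"][0]
    return Request(method="GET", url=target, body=None)


# A SOAP call is an HTTP post whose meaning is entirely in the body.
soap_call = Request(
    method="POST",
    url="https://data.example.org/services/Query",
    body=b"<soap:Envelope>...</soap:Envelope>",
)
redirect = redirect_for_authentication(
    soap_call, "https://wayf.example.org/WAYF",
    "https://data.example.org/shire/callback")
resumed = resume_after_authentication(redirect)
print(resumed)
```

In this model the resumed request reaches the right endpoint but carries no SOAP envelope, so the service has nothing to execute.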

If Shibboleth does preserve the message body, then it is possible to write a SOAP client that deals with Shibboleth. However, not all toolkits for generating SOAP stubs will handle the redirections; Apache Axis, for example, will not.

The use of the handle system limits the strength of the authentication of agents to payload services. Handles are effectively secrets; any agent that learns of a valid handle can use it to authenticate to a payload service. If the user agent connects to the payload site using unencrypted HTTP rather than HTTPS, then handles are being transmitted in clear text, so can be read and copied. This can be mitigated by considering the handles to be valid only for a short time (say 10 minutes) and then requiring the user to reauthenticate to the system. This works for casual, interactive browsing but is a problem for automated or scripted use of the system.
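The time-limited handle can be sketched as below. This is a minimal illustration of the mitigation, assuming a 10-minute lifetime as in the text; the class and its methods are hypothetical and say nothing about how the Shibboleth implementation stores handles. The clock is injectable so that expiry can be shown without waiting.

```python
import secrets
import time

HANDLE_LIFETIME_SECONDS = 10 * 60  # the '10 minutes' suggested in the text


class HandleRegistry:
    """Issues opaque handles and rejects them once they have expired."""

    def __init__(self, lifetime=HANDLE_LIFETIME_SECONDS, clock=time.time):
        self._lifetime = lifetime
        self._clock = clock
        self._expiry = {}  # handle -> time at which it stops being valid

    def issue(self):
        handle = secrets.token_urlsafe(32)  # unguessable, carries no identity
        self._expiry[handle] = self._clock() + self._lifetime
        return handle

    def is_valid(self, handle):
        expires_at = self._expiry.get(handle)
        if expires_at is None:
            return False
        if self._clock() >= expires_at:
            del self._expiry[handle]  # forget expired handles
            return False
        return True


# Demonstration with a fake clock measured in seconds.
now = [0.0]
registry = HandleRegistry(clock=lambda: now[0])
handle = registry.issue()
valid_at_issue = registry.is_valid(handle)
now[0] = 9 * 60
valid_after_9_min = registry.is_valid(handle)
now[0] = 10 * 60
valid_after_10_min = registry.is_valid(handle)
```

A script that runs for longer than the lifetime would need a fresh handle part-way through, which is exactly the difficulty for unattended use.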

5. Possible use of SAML in an IVO system

I consider that the Shibboleth Handle Service and Attribute Authority service are both usable as they are (relatively) simple SAML services. The use of redirections to obtain a handle is not suitable for the IVO as it does not work well for services that are not CGI programmes and for clients that are not interactive web-browsers. The use of handles as secrets is too weak to be a general basis for the IVO; it may be secure enough for some cases but we should not impose the weakness on services that need to be more secure.

The value in using Shibboleth parts lies more in reusing existing databases of users than in reusing software. If we use Shibboleth at all, then we certainly have to rewrite parts of it to cover web services.

Therefore, the most fruitful path seems to be to use Shibboleth installations as a source of handles and attributes, but to replace the redirection 'dance' by which they are used. The following approach seems plausible:

  1. There are IVOA-defined services set up as a facade for the Shibboleth installations at the user sites.
  2. Agents contact a service in the facade to get a handle before sending requests to controlled payload-services. This removes the need for the redirections.
  3. The facade handle-service delegates the creation of a handle to the Shibboleth handle-service. When it, the facade service, has a handle, it associates that handle with a public-private key-pair by issuing a short-lived identity certificate in which the handle is the subject. It then returns the handle and the certificate to the requesting agent.
  4. Agents authenticate their current handles to payload services using the certificates.
  5. Payload services look up attributes in the origin site to get user identities and group memberships. The payload services should be able to talk directly to the Shibboleth attribute authorities, specifying the authenticated handles.
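The data flow of steps 2 to 4 can be sketched as follows. This models only who passes what to whom: no cryptography is performed, and a real system would issue a signed X.509 certificate and require the agent to prove possession of the private key. The class names, the one-hour lifetime and the choice to have the agent supply its own public key are assumptions of the sketch, not part of the proposal.

```python
import secrets
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class HandleCertificate:
    subject: str         # the handle, not the user's name
    public_key: bytes    # opaque here; no real key material is checked
    not_after: datetime  # end of the short validity period


class StubShibbolethHandleService:
    """Stand-in for the Shibboleth HS at the user's home site."""

    def new_handle(self, user_id):
        return secrets.token_urlsafe(16)


class FacadeHandleService:
    """IVOA-defined facade: obtains a handle and binds it to a key."""

    def __init__(self, shibboleth_hs, lifetime=timedelta(hours=1)):
        self._hs = shibboleth_hs
        self._lifetime = lifetime

    def sign_on(self, user_id, agent_public_key, now):
        handle = self._hs.new_handle(user_id)  # delegated to Shibboleth
        certificate = HandleCertificate(handle, agent_public_key,
                                        now + self._lifetime)
        return handle, certificate


def payload_accepts(certificate, presented_handle, now):
    """Payload-side check: handle matches the certificate and is in date."""
    return certificate.subject == presented_handle and now < certificate.not_after


issued_at = datetime(2005, 5, 9, 12, 0, tzinfo=timezone.utc)
facade = FacadeHandleService(StubShibbolethHandleService())
handle, cert = facade.sign_on("some-user", b"agent-public-key", issued_at)
```

Because the handle is bound to a key, a copied handle alone would not be enough to authenticate; the payload service would then use the authenticated handle when querying the attribute authority.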

The implication is that we keep the Shibboleth installations at user sites, add some parts at those sites and replace the Shibboleth parts intended for payload sites.

This model is discussed in more detail in the proposed SSO architecture for the IVO [SSO architecture].

References

[e-Science] UK e-Science core programme, Town meeting on Security for e-Science, approaches and interoperability http://www.nesc.ac.uk/events/townmeeting0405/

[SAML] SAML technical committee of OASIS, SAML v1.1 information, http://www.oasis-open.org/committees/tc_home.php?wg_abbrev=security#samlv11

[Shibboleth] Internet 2 project, Shibboleth web-site, http://shibboleth.internet2.edu/.

[SSO architecture] Guy Rixon, IVOA SSO profile: architecture, forthcoming for Kyoto meeting in May 2005.