Lustre GSSAPI/Kerberos Repair

From OpenSFS Wiki

== Motivation ==
One of our goals under [[Contract_SFS-DEV-002]] is to provide lightweight, shared-key authentication for sites that may have trouble finding the resources or political will to maintain their own Kerberos infrastructure. The natural choice, we thought, was to plug these features in through the existing GSSAPI abstraction, thereby disturbing as little other code as possible, including the existing Kerberos support.


We quickly ran into (and fixed) build issues with the GSSAPI and Kerberos code. We also corrected some inappropriate direct dependencies on Kerberos (places where the GSSAPI abstraction wasn't used). Conversations with others at the LAD conference last month, though, confirmed our nagging fear that more issues are lurking under the surface. I want to find and fix these bugs so we can move on with our GSSAPI-dependent implementation. I suspect others here want working Kerberos support.

I've written a "null" (pass-through) GSSAPI mechanism whose sole purpose is to help us test the GSSAPI code paths. I'm currently working on getting Lustre up and running with GSSAPI and keyring support enabled, but using only the null mechanism, which passes traffic through with no authentication or encryption.

Perhaps now is a good time for others on the list to share what experiences they've had with this code and where they've run into issues. Then maybe we can identify bugs, file them, and get to work on fixing them.

== Patches ==
So far, a few Kerberos/GSS patches have been submitted to the Lustre Gerrit repository:
* http://review.whamcloud.com/7960 "LU-4113 gss: uncatched error in gss_svc_upcall"
* http://review.whamcloud.com/7913 "LU-4085 build: gss/krb5 is disabled"
* http://review.whamcloud.com/7770 "LU-4012 gss: upcall fails due to removed calls"


Each of these review requests is essentially on its own Git branch, so it is easy to cherry-pick any of them into a local repository (each review page provides a "cherry-pick" URL).
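As a rough sketch of what such a cherry-pick looks like from the command line: Gerrit normally publishes each patch set under a ref of the form refs/changes/<last two digits of the change number>/<change number>/<patch set>. The helper names below (gerrit_ref, cherry_pick_change), the LUSTRE_GERRIT_URL variable, and the patch set number 1 are illustrative assumptions, not something taken from the review pages; the authoritative URL and ref are the ones shown on each review page.

```shell
#!/bin/sh
# Build the Gerrit ref for a given change and patch set.
# Assumption: the server uses Gerrit's usual layout,
#   refs/changes/<last two digits of change>/<change>/<patch set>
gerrit_ref() {
    change=$1
    patchset=$2
    printf 'refs/changes/%02d/%d/%d\n' "$((change % 100))" "$change" "$patchset"
}

# Fetch one patch set and apply it on top of the current branch.
#   $1 = repository URL (copy it from the review page)
#   $2 = change number, $3 = patch set number
cherry_pick_change() {
    git fetch "$1" "$(gerrit_ref "$2" "$3")" && git cherry-pick FETCH_HEAD
}

# Example (hypothetical URL variable and patch set number):
#   cherry_pick_change "$LUSTRE_GERRIT_URL" 7960 1
```

Keeping the ref construction separate from the fetch makes it easy to check the ref against the one on the review page before touching the working tree.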


=== Submitting/inspecting patches ===
If you are interested in contributing to Lustre Kerberos, please see:
* https://wiki.hpdd.intel.com/display/PUB/Submitting+Changes


Note that while the majority of the Lustre code is tested for every patch that is submitted (8h or so per patch), the Kerberos code is NOT currently part of the automated test environment. It would be great if people with an understanding of Kerberos could look at the test scripts lustre/tests/sanity-sec.sh and lustre/tests/sanity-gss.sh to see what needs to be done to get them working. Having those tests working would help keep the Lustre Kerberos code from breaking in the future.

Cheers, Andreas

Revision as of 09:26, 21 October 2013
