Lustre GSSAPI/Kerberos Repair: Difference between revisions

From OpenSFS Wiki
Revision as of 12:41, 10 March 2015

Motivation

One of the goals of Contract_SFS-DEV-002 is to provide lightweight, shared-key authentication for sites that may have trouble finding the resources or political will to maintain their own Kerberos infrastructure. The natural choice, we thought, was to plug in these features through the existing GSSAPI abstraction, thereby interfering as little as possible with other code, including the existing Kerberos support.

We quickly ran into (and fixed) build issues with the GSSAPI and Kerberos code. We also corrected some inappropriate direct dependencies on Kerberos (where the GSSAPI abstraction wasn't used). In talking with others at the LAD conference last month, though, we confirmed nagging fears of more issues lurking under the surface. I want to find and fix these bugs so we can move on with our GSSAPI-dependent implementation. I suspect others here want working Kerberos support.

I've written a "null" (pass-through) GSSAPI mechanism whose sole purpose is to help us test the GSSAPI code paths. I'm currently working on getting Lustre up and running with GSSAPI and keyring support enabled, but using only the null mechanism, so that traffic passes through with no authentication or encryption.

Perhaps now is a good time for others on the list to share what experiences they've had with this code and where they've run into issues. Then maybe we can identify bugs, file them, and get to work on fixing them.

Patches

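So far, a few Kerberos/GSS patches have been submitted to the Lustre Gerrit repository:

* http://review.whamcloud.com/7960 "LU-4113 gss: uncatched error in gss_svc_upcall"
* http://review.whamcloud.com/7913 "LU-4085 build: gss/krb5 is disabled"
* [https://jira.hpdd.intel.com/browse/LU-4012 LU-4012] gss: upcall fails due to removed calls
* [https://jira.hpdd.intel.com/browse/LU-3778 LU-3778] GSS doesn't know about proxy subsystems
* [https://jira.hpdd.intel.com/browse/LU-6020 LU-6020] Bugfixes for GSS/Kerberos

Each Gerrit review request is essentially on its own Git branch, and it is easy to cherry-pick one into a repository (see each review page to get a "cherry-pick" URL).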
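
For the Gerrit reviews, a patch can be pulled into a local tree by fetching the change's ref and cherry-picking it. Gerrit publishes each patch set under a ref of the form refs/changes/<last two digits of the change number>/<change number>/<patch set>. The following is a minimal sketch that builds that ref and the matching Git commands. The function names, the repository URL, and the patch set number are assumptions made for illustration; the authoritative "cherry-pick" command is the one shown on each review page.

```python
def gerrit_change_ref(change, patch_set):
    """Return the Git ref under which Gerrit publishes one patch set of a change."""
    if change <= 0 or patch_set <= 0:
        raise ValueError("change and patch set numbers must be positive")
    # Gerrit shards change refs by the last two digits of the change number.
    shard = change % 100
    return "refs/changes/{:02d}/{}/{}".format(shard, change, patch_set)


def cherry_pick_commands(remote_url, change, patch_set):
    """Return the shell commands that fetch one patch set and cherry-pick it."""
    ref = gerrit_change_ref(change, patch_set)
    return [
        "git fetch {} {}".format(remote_url, ref),
        "git cherry-pick FETCH_HEAD",
    ]


if __name__ == "__main__":
    # Assumed values: the repository URL and patch set 1 are placeholders;
    # check the review page for the real ones.
    url = "http://review.whamcloud.com/fs/lustre-release"
    for command in cherry_pick_commands(url, 7960, 1):
        print(command)
```

The commands are only printed, not executed, so they can be reviewed before being run inside a Lustre checkout.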

Submitting/inspecting patches

If you are interested in contributing to Lustre Kerberos, please see:

Note that while the majority of the Lustre code is tested for every patch that is submitted (8 hours or so per patch), the Kerberos code is NOT currently covered by the automated test environment. It would be great if people with an understanding of Kerberos could look at the test scripts lustre/tests/sanity-sec.sh and lustre/tests/sanity-gss.sh to see what needs to be done to get them working. Having those tests working would help protect the Lustre Kerberos code from breakage in the future.