Contract SFS-DEV-002

= Summary =

This project involves two improvements to Lustre's security features:
 * A mapping feature to allow clusters with different UID/GID sets to use a single common Lustre filesystem
 * A shared-key authentication and encryption scheme based on Lustre's GSS mechanism, as a simpler alternative to Kerberos.

= Resources =

Documentation
(Sample HLD: )


 * [[File:OpenSFS_Software_Contract_6-5-12_article1.pdf]]
 * [[File:UID_GID_Scope_Statement_v2.pdf]]
 * [[File:UID-GID_Solution_Architecture.pdf]]
 * [[File:UID-GID_HLD.docx]]
 * [[File:Shared_keys_scope_v2.pdf]]
 * [[File:Shared_keys_architecture.pdf]]
 * [[File:Shared_keys_HLD.docx]]

Jira

 * Shared-key tracker [LU-3289]
 * [Security search filter]

Mailing List

 * Mailing list administration
 * List archives

= Meetings =

2013-07-09
Attending: Josh, Andy, Steve, Andreas, Ken, Nathan

Meeting Minutes:
 * LU-3527 patches in Gerrit
 * discussion on the right size/scope of patches:
 * should be big enough to contain an entire "thought" (no dangling, related lines outside of the patch)
 * must be small enough to be comprehensible in a single sitting (< 1k LOC)
 * Test list -- Josh had sent a list to the listserv; Nathan wiki'ed it here: IUDEV test list


 * Closed actions:
 * AI: Andy to add checks for gss libs before enable-gss (LU-3490) -- done
 * AI: Justin to develop list of required GSS/Kerberos libraries for builders -- in LU-3488
 * AI: Andy will start a "OpenSFS test cluster HowTo" on the OpenSFS Wiki for future generations -- removing from tracking, although Nathan would love to see it.

Actions:
 * AI: Ken to provide simple Kerberos server setup recommendations and potentially unit tests
 * AI: Josh to fix up Nathan's interpretation of the IUDEV test list
 * AI: Andy to add shared-key test list to IUDEV test list

Milestones In Progress:
 * Shared Keys CODE
 * UID/GID Mapping CODE

Next Meeting:
 * 12:00pm PST July 16 2013
 * Intercall (866) 203-7023
 * Conference code: 5093670258

2013-07-02
Attending: Ken, Nathan, Josh, Steve, Andreas

Meeting Minutes:
 * Josh has uploaded patches to Gerrit in the master branch of lustre-dev
 * Tracked in LU-3527
 * Inspectors: Fan Yong, Andreas, Ken
 * Trouble pushing to private branches, but pushing to master works with no ill effects.

Actions:
 * AI: Andy to add checks for gss libs before enable-gss (LU-3490)
 * AI: Ken to provide simple Kerberos server setup recommendations and potentially unit tests
 * AI: Justin to develop list of required GSS/Kerberos libraries for builders
 * AI: Andy will start a "OpenSFS test cluster HowTo" on the OpenSFS Wiki for future generations.
 * AI: Andy and Josh to determine test list to ensure feature coverage

Milestones In Progress:
 * Shared Keys CODE
 * UID/GID Mapping CODE

Next Meeting:
 * 12:00pm PST July 09 2013
 * Intercall (866) 203-7023
 * Conference code: 5093670258

2013-06-25
Attending: steve, josh, andy, nathan, andreas, alex, ken, ned, john

Meeting Minutes:
 * There remain Gerrit and Git problems; these are slowly being worked on.
 * UID mapping phase 1 (management interface, node map structures) has been uploaded to Jira (IU-3)
 * Reviewer volunteers: Andreas (or Fan Yong), Ken
 * Needs to be moved to a LU- ticket for visibility.
 * phase 3 (UID mapping) up next, working on unit tests
 * phase 2 (MGS pushing/syncing node map) will be worked on after phase 3.
 * Andy has provided a patch to enable gss by default - LU-3490
 * LU-3288 is probably a prerequisite to landing this
 * Kerberos tests do pass given a properly authenticated Kerberos setup
 * without gss-null landing, currently no gss tests will pass
 * we need to change the patch such that gss is only enabled if the libraries are found

Actions:
 * AI: Andy to add checks for gss libs before enable-gss (LU-3490)
 * AI: Ken to provide simple Kerberos server setup recommendations and potentially unit tests
 * AI: Justin to develop list of required GSS/Kerberos libraries for builders
 * AI: Andy will start a "OpenSFS test cluster HowTo" on the OpenSFS Wiki for future generations.
 * AI: Andy and Josh to determine test list to ensure feature coverage

Milestones In Progress:
 * Shared Keys CODE
 * UID/GID Mapping CODE

Next Meeting:
 * 12:00pm PST July 02 2013
 * Intercall (866) 203-7023
 * Conference code: 5093670258

2013-06-11
Attending: Josh, Andy, Andreas

Meeting Minutes:
 * Josh updated UID map to use index objects
 * this is proving faster than the llog operations
 * update test scripts to work with the new index code
 * Andy working on gssrpcd
 * adding ability to select encryption mechanism
 * Justin Miller to help maintain nodes for development build/test of code
 * Justin is collecting a list of required GSS/Kerberos libraries for builders
 * Discussed how we can begin adding this code to autotest
 * configure/build needs to autodetect GSS/Kerberos libraries for Gerrit builds
 * presumably just enabling this does not impact performance?
 * initially select tests via "Test-Parameters: testlist=", later add tests to default test list (must be able to pass w/o GSS enabled)
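The "Test-Parameters:" selection mentioned above can be illustrated with a small sketch. The option names (fortestonly, testlist, envdefinitions) are the ones recorded in the 2013-05-28 minutes on this page; the helper name build_test_parameters and the exact space-separated layout are assumptions for illustration, not a statement of what autotest accepts.

```python
def build_test_parameters(testlist, env=None, fortestonly=False):
    """Return one 'Test-Parameters:' line for a commit message (hypothetical helper)."""
    parts = []
    if fortestonly:
        # Marks the patch as submitted for testing only.
        parts.append("fortestonly")
    # Test scripts to run, comma-separated.
    parts.append("testlist=" + ",".join(testlist))
    if env:
        # Environment settings, emitted as NAME=value pairs.
        settings = ",".join(f"{name}={value}" for name, value in env.items())
        parts.append("envdefinitions=" + settings)
    return "Test-Parameters: " + " ".join(parts)


line = build_test_parameters(
    ["sanity-sec", "sanity-gss"],
    env={"GSS_PIPEFS": "true"},
    fortestonly=True,
)
print(line)
```

Generating the line from a list keeps the set of security tests in one place while they are still opt-in, before they move to the default test list.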

Actions:
 * AI: Justin to develop list of required GSS/Kerberos libraries for builders
 * AI: Andy will start a "OpenSFS test cluster HowTo" on the OpenSFS Wiki for future generations.
 * AI: Andy and Josh to determine test list to ensure feature coverage

Milestones In Progress:
 * Shared Keys CODE
 * UID/GID Mapping CODE

Next Meeting:
 * 12:00pm PST June 18 2013
 * Intercall (866) 203-7023
 * Conference code: 5093670258

2013-06-04
Attending: steve, josh, andy, nathan, andreas, alex, ken hornstein

Meeting Minutes:
 * lustre gss-utils has some kerberos dependencies that Andy needs to disentangle
 * Closed Actions:
 * AI: nathan Ask Ken Hornstein about possible involvement
 * KenH has graciously volunteered to be the Lustre Kerberos maintainer! And is joining our weekly meetings and mail list.
 * AI: nathan Ask Eric Mei about sptlrpc questions
 * Eric answered back to the list, and has joined the mail list as well.
 * Andreas filed ET-1342 for installing security packages on Intel test machines

Actions:
 * AI: Andy will start a "OpenSFS test cluster HowTo" on the OpenSFS Wiki for future generations.
 * AI: Andy and Josh to determine test list to ensure feature coverage

Milestones In Progress:
 * Shared Keys CODE
 * UID/GID Mapping CODE

Next Meeting:
 * 12:00pm PST June 18 2013
 * Intercall (866) 203-7023
 * Conference code: 5093670258

2013-05-28
Attending: josh, andy, nathan, andreas, alex

Meeting Minutes:
 * [LU-3288] discussion - who will do this work?
 * Is there a kerberos-interested maintainer? Kerb users: PSC, Fermi, UofFL, NRL
 * Note the separation of the build options also implies a separation of #ifdef macros inside of Lustre.
 * Noted that there are 3 "null" mechanisms: gss-null != sptlrpc-null (doesn't use gss) != krb5 "plain"
 * Questions on existing sptlrpc:
 * How is non-krb security level required - mount option
 * Can a single rpc be not encrypted after connection negotiation?
 * Existing Lustre security tests eventually need to be separated:
 * sanity-gss should become sanity-krb5
 * sanity-krb5 should add tests for krb5 "plain" mechanism
 * sanity-gss should eventually use the gss-null mechanism that IU is developing
 * sanity-sptlrpc should be written to test sptlrpc "null" in the absence of GSS.
 * Closed actions
 * Andreas to provide instructions for autotest
 * tune testing for a particular patch. This allows specifying a patch is being submitted for testing (i.e. fortestonly), a list of test scripts to run (e.g. testlist=sanity-sec,sanity-gss), and setting environment variables (e.g. envdefinitions=GSS_PIPEFS=true), etc.
 * Some (sparse) documentation on how to run the test scripts:
 * Some older (and partly out of date) information on the specific testscripts are available at:

Actions:
 * AI: Andy will start a "OpenSFS test cluster HowTo" on the OpenSFS Wiki for future generations.
 * AI: Andy and Josh to determine test list to ensure feature coverage
 * AI: Andreas file bug for installing security packages on intel test clusters
 * AI: nathan Ask Ken Hornstein about possible involvement
 * AI: nathan Ask Eric Mei about sptlrpc questions

Milestones In Progress:
 * Shared Keys CODE
 * UID/GID Mapping CODE

Next Meeting:
 * 12:00pm PST June 04 2013
 * Intercall (866) 203-7023
 * Conference code: 5093670258

2013-05-21
Attending: simms, josh, andy, nathan, andreas, alex

Meeting Minutes:
 * LU-3288 filed to separate gss from krb5 requirement
 * Closed actions
 * AI: Nathan to pursue Xyratex kerb patches [LU-634]
 * Described here
 * AI: Andreas to help Josh resolve git push problems
 * Andreas added a "kerberos" label to the security Jira tickets at Intel - everyone please include this label on future tickets.

Actions:


 * AI: Andy will start a "OpenSFS test cluster HowTo" on the OpenSFS Wiki for future generations.
 * AI: Andy and Josh to determine test list to ensure feature coverage

Milestones In Progress:
 * Shared Keys CODE
 * UID/GID Mapping CODE

Next Meeting:
 * 12:00pm PST May 28 2013
 * Intercall (866) 203-7023
 * Conference code: 5093670258

2013-05-14
Attending: simms, josh, andy, nathan, andreas

Meeting Minutes:
 * Josh writing unit tests, debugging; official proc file name shall be "nodemap"
 * Andy working on separating Lustre GSS build from Kerberos requirements
 * J&A both working on how to use OpenSFS test cluster with help from Justin and Chris
 * AI: Andy will start a "OpenSFS test cluster HowTo" on the OpenSFS Wiki for future generations.
 * Andreas added a "kerberos" label to the security Jira tickets at Intel - everyone please include this label on future tickets.

Actions:
 * AI: Nathan to pursue Xyratex kerb patches [LU-634] - in progress
 * AI: Andreas to help Josh resolve git push problems - in progress

Milestones In Progress:
 * Shared Keys CODE
 * UID/GID Mapping CODE

Next Meeting:
 * 12:00pm PST May 21 2013
 * Intercall (866) 203-7023
 * Conference code: 5093670258

2013-05-07
Attending: simms, josh, andreas, alex, nathan, andy

Meeting Minutes:
 * Josh pushed config patches to Gerrit and posted usage text
 * proc file format should use YAML
 * please clarify the directory business in the usage text - Andreas asked for man page format
 * Andy working on the GSS API code for shared keys and null mechanism
 * Lustre sanity-gss should be split into 3: sanity-gss (NULL), sanity-krb, and sanity-sharedkey
 * Andy filed bug [LU-3288] to remove krb requirement from --enable-gss build switch
 * Andreas filed [LU-3289] top-level shared-key tracker
 * Nathan added a [Security search filter] at the Intel Jira


 * Closed actions
 * Andy to file a new ticket with Autoconf fix for libgssapi rename to libgssglue [LU-3137]
 * Simms to contact Peter Jones to help find reviewers
 * Simms to locate test cluster: OpenSFS cluster to be used
 * Nathan to identify kerb patches - xyratex vs. intel / stilbor - patches were located, but the utility/purpose is unclear. Nathan to pursue.
 * Andy and Andreas to file a Jira about ptlrpc replay handling problems [LU-3290]

Actions:
 * AI: Nathan to pursue Xyratex kerb patches - in progress
 * AI: Andreas to help Josh resolve git push problems - in progress

Milestones In Progress:
 * Shared Keys CODE
 * UID/GID Mapping CODE

Next Meeting:
 * 12:00pm PST May 14 2013
 * Intercall (866) 203-7023
 * Conference code: 5093670258

2013-04-30
Attending: ned, simms, josh, andreas, alex, nathan

Meeting Minutes:
 * Josh - 2500 lines of code for config: management, lctl, proc files
 * proc file display of nid ranges and uid maps
 * nid ranges are specified as start/end; there is no apparent need for a "skip" functionality (e.g. even/odd ranges)
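A start/end NID range with no "skip" support reduces to an inclusive interval check. The sketch below is only an illustration of that idea: it assumes IPv4-style NIDs written as address@network, and the function names parse_nid and nid_in_range are hypothetical, not taken from the nodemap code.

```python
import ipaddress


def parse_nid(nid):
    """Split 'addr@net' into (IPv4Address, network name)."""
    addr, _, net = nid.partition("@")
    return ipaddress.IPv4Address(addr), net


def nid_in_range(nid, start, end):
    """True if nid lies between start and end (inclusive) on the same network."""
    addr, net = parse_nid(nid)
    lo, lo_net = parse_nid(start)
    hi, hi_net = parse_nid(end)
    if not (net == lo_net == hi_net):
        # A range only covers NIDs on its own network type.
        return False
    return lo <= addr <= hi


print(nid_in_range("10.0.0.5@tcp", "10.0.0.1@tcp", "10.0.0.9@tcp"))
print(nid_in_range("10.0.0.10@tcp", "10.0.0.1@tcp", "10.0.0.9@tcp"))
```

Even/odd or strided ranges would need an extra step parameter; with plain start/end pairs, such a layout can still be expressed as several small ranges.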

Actions:
 * AI: Andy and Andreas to file a Jira about ptlrpc replay handling problems
 * AI: Andy to file a new ticket with Autoconf fix for libgssapi rename to libgssglue [LU-3137]
 * AI: Nathan to identify kerb patches - xyratex vs. intel / stilbor
 * AI: Simms to contact Chris Gearing to help find reviewers
 * AI: Andreas to help Josh resolve git push problems
 * AI: Simms to locate Kerb-friendly test cluster

Milestones In Progress:
 * Shared Keys CODE
 * UID/GID Mapping CODE

Next Meeting:
 * 12:00pm PST May 07 2013
 * Intercall (866) 203-7023
 * Conference code: 5093670258

2013-04-23
Attending: simms, andy, josh, andreas, john, nathan

Meeting Minutes:
 * Josh - working on config
 * Andy - working on test cases, build checks
 * Andreas - established repo at Intel

Actions:
 * AI: Nathan to identify kerb patches - xyratex vs. intel / stilbor
 * AI: Simms to contact Chris Gearing to help find reviewers

Milestones In Progress:
 * Shared Keys CODE
 * UID/GID Mapping CODE

Next Meeting:
 * 12:00pm PST Apr 30 2013
 * Intercall (866) 203-7023
 * Conference code: 5093670258

2013-04-09
Attending: Josh, John, Andrew, Steve, Nathan, Alex

Meeting Minutes:
 * Josh - working on config
 * Andy - pushed fixes for Kerberos, working on test cases

Actions:
 * AI: Andreas to establish a git repo/branch hosted at Intel

Milestones In Progress:
 * Shared Keys CODE
 * UID/GID Mapping CODE

Next Meeting:
 * skipping week of LUG; next meeting 12:00pm PST Apr 23 2013
 * Intercall (866) 203-7023
 * Conference code: 5093670258

2013-04-02
Attending: nathan, josh, andrew, steve, cory, alex, john

Meeting Minutes:
 * Github access available to PAC members - send keys to Josh
 * Andy trying to land fixes for current Lustre Kerberos LU-2392, LU-2384
 * Discussion on replay attacks - replay handling is included in ptlrpc, so we shouldn't need a fix specific to the shared-key code.
 * Andreas points out problems with current ptlrpc, but Nathan and Andy's feeling is that this should not be part of the IU contract work. But we should file a Jira describing the problem.

Actions:
 * Simms to find reviewers for LU-2392 and LU-2384
 * Andy to test and land LU-2392 and LU-2384
 * Andy and Andreas to file a Jira about ptlrpc replay handling problems.

Milestones In Progress:
 * Shared Keys CODE
 * UID/GID Mapping CODE

Next Meeting:
 * 12:00pm PST Apr 09 2013
 * Intercall (866) 203-7023
 * Conference code: 5093670258

2013-03-26
Attending: Josh, Andy, Nathan, Ned, Steve, Alex

Meeting Minutes:
 * Github access available to PAC members - send keys to Andy
 * Andy has a fix for LU-2392 that he will attach to that ticket.
 * Andy needs reviews for above. Simms will ask PJones.
 * Andy wanted some direction for how to implement tests - Nathan pointed at sanity-sec.sh and sanity-gss.sh
 * Josh update: finishing up part 1 (map setup): module is complete, proc interface for maps, adding lctl writing config to mgs log
 * wants to work on local identity mapping (part 3) before map shipping (part 2)
 * Andy update: working on build and tests, has implemented null GSS flavor but not tested yet.

Actions:
 * Simms to find reviewers for Andy's version of LU-2392
 * Andy to post his fix for LU-2392 to that ticket
 * Andy to file a new ticket with Autoconf fix for libgssapi rename to libgssglue

Milestones In Progress:
 * Shared Keys CODE
 * UID/GID Mapping CODE

Next Meeting:
 * 12:00pm PST Apr 02 2013 unless otherwise cancelled
 * Intercall (866) 203-7023
 * Conference code: 5093670258

2013-02-19
Attending: Andreas, Alex, Josh, Nathan, Ned, Steve

Meeting Minutes:
 * Shared Key HLD accepted
 * UID/GID Mapping HLD accepted
 * Coding phase should start now. We don't expect any useful results by next week, so we will cancel next week's meeting.
 * Nathan added latest versions of HLDs to wiki page.

Actions:
 * Josh and Andy to begin coding
 * Josh will send out link to GitHub repository
 * Josh will send email early next week with a status update, at which point we can plan for the next meeting

Milestones Completed:
 * Shared Keys HLD APPROVED 2013-02-19
 * UID/GID Mapping HLD APPROVED 2013-02-19

Next Meeting:
 * No meeting 2013-02-26, next meeting pending code progress.
 * 12:00pm PST Mar ?? 2013
 * Intercall (866) 203-7023
 * Conference code: 5093670258

2013-01-29
Attending: Andreas, Andy, Alex, Josh, Nathan, Ned

Meeting Minutes:
 * Key scope: sets of keys are defined per cluster (not per-client)
 * These keys are used to generate session keys for Auth and Encrypt
 * Root squash - various ideas
 * EAs on directories describe which clusters are allowed
 * Squash per-cluster roots to distinct users, use ACLs to provide per-cluster root-like permissions
 * Use bind-mounting to limit the visibility of the fs to a subtree
 * suggestion to add root fid/path to cluster definition for future use
 * Current plan: root is not treated specially - per-cluster roots may be mapped to the actual fs root user, or not.
 * Object (OSS) security against untrusted client - out of scope
 * MGS primacy
 * "MGS up before other servers" may be a requirement for the mapping or shared key features
 * but this requirement must be relaxed if the uid/shared key feature has not been enabled

Actions:


 * Nathan to send HLD example template (done)
 * Nathan to propose OpenSFS contract doc templates
 * Andy/Josh update HLD with detail

Milestones Under Review:
 * UID HLD
 * Shared Keys HLD

Next Meeting:
 * I will be travelling for the next two meetings (Feb 5, 12). Can someone else host the meeting?
 * 12:00pm PST Feb 5 2013
 * Intercall (866) 203-7023
 * Conference code: 5093670258

2013-01-22
Attending: Nathan, Josh, Andrew, Steve

Meeting Minutes:
 * Comments on Shared Keys HLD
 * 1) independence of auth and encrypt keys
 * 2) encrypt-then-MAC
 * 3) HLD should address multiple simultaneous keys
 * 4) interaction between shared keys and mappings
 * 5) original assumption was key-per-client; key-per-cluster seems to make more sense for a few reasons (large-cluster manageability, shared-root clients). A hash of the keys could be added to a cluster definition. A "null" cluster could be defined for a single-cluster environment.
 * Ended meeting early; we need more meeting attendees to discuss these issues.
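Comments 1) and 2) above can be made concrete with a short sketch. It is not taken from the HLD: the use of HMAC-SHA256, the labels b"auth" and b"encrypt", and the function names are all assumptions chosen for illustration. The ciphertext is treated as opaque bytes, because the cipher itself is outside the scope of the sketch.

```python
import hashlib
import hmac

TAG_LEN = 32  # length of an HMAC-SHA256 tag in bytes


def derive_key(shared_key, label):
    """Derive a purpose-specific key from the shared key (assumed scheme)."""
    return hmac.new(shared_key, label, hashlib.sha256).digest()


def seal(mac_key, ciphertext):
    """Encrypt-then-MAC: the tag is computed over the ciphertext, not the plaintext."""
    tag = hmac.new(mac_key, ciphertext, hashlib.sha256).digest()
    return ciphertext + tag


def open_sealed(mac_key, sealed):
    """Verify the tag first; hand back the ciphertext only if it matches."""
    ciphertext, tag = sealed[:-TAG_LEN], sealed[-TAG_LEN:]
    expected = hmac.new(mac_key, ciphertext, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("MAC check failed")
    return ciphertext


shared = b"example cluster key"            # placeholder, not a real key
auth_key = derive_key(shared, b"auth")     # used only for the MAC
enc_key = derive_key(shared, b"encrypt")   # would be used only by the cipher
blob = seal(auth_key, b"opaque ciphertext")
print(open_sealed(auth_key, blob))
```

Deriving the two keys with different labels keeps them independent of each other, and checking the tag before any decryption means tampered messages are rejected without touching the cipher.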

Actions:
 * Review Security HLD to provide timely feedback.

Milestones Under Review:
 * UID HLD
 * Shared Keys HLD

Next Meeting:
 * 12:00pm PST Jan 29 2013
 * Intercall (866) 203-7023
 * Conference code: 5093670258

2013-01-15
Attending: Nathan, Ned, Josh, Andrew, Steve, Alex, Andreas, John

Meeting Minutes:
 * UID/GID HLD Review
 * Comments by Nathan, Andreas, Ned returned via Word doc
 * Define/update cluster definition via complete file vs. incrementally
 * Josh: file-based cluster def changes require walking the export tree
 * Done rarely, probably ok
 * There may be security implications at the transition when redefining cluster defs
 * When a NID is removed from a def it should use the default mapping


 * Define/update UID/GID mappings via complete file vs. incrementally
 * incremental uid/gid mapping in order to prevent fs access blocking during replacement.
 * Andreas suggested atomically swap in new mapping once received/set up.
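The atomic-swap suggestion can be modelled in a few lines. This is a user-space sketch of the idea only, not the nodemap implementation: the class name UidMap, the squash_uid fallback for unmapped users, and the reliance on a single reference assignment are assumptions made for illustration.

```python
class UidMap:
    """Client-UID to filesystem-UID map that is replaced as a whole."""

    def __init__(self, mapping, squash_uid):
        self._mapping = dict(mapping)
        self._squash_uid = squash_uid  # UID used for unknown users

    def lookup(self, client_uid):
        # Take one reference to the current table, so a lookup sees either
        # the old map or the new one, never a half-updated mix.
        mapping = self._mapping
        return mapping.get(client_uid, self._squash_uid)

    def replace(self, new_mapping):
        # Build the new table completely before publishing it.
        staged = dict(new_mapping)
        # Publishing is a single reference assignment.
        self._mapping = staged


uid_map = UidMap({1000: 5000}, squash_uid=99)
print(uid_map.lookup(1000), uid_map.lookup(1234))
uid_map.replace({1000: 6000, 1234: 6001})
print(uid_map.lookup(1000), uid_map.lookup(1234))
```

Because lookups keep working against the old table until the new one is published, access to the filesystem does not have to block while a replacement mapping is being received and set up.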


 * Behaviour during setup and recovery
 * Don't use default mapping while waiting for definitions; FS should block access to all files until mappings and cluster defs have been set up.
 * Need a clear signal when an update is finished/complete.
 * Servers currently cache the MGS Lustre config locally
 * May be undesirable for OSD
 * Perhaps this behaviour should be changed: stop caching, require MGS for server startup.


 * Shared Key HLD distributed
 * Comments should be returned quickly for HLD revision next week.

Actions:
 * Review Shared Key HLD to provide timely feedback.

Milestones Under Review:
 * UID HLD
 * Shared Keys HLD

Milestones Completed:
 * Shared Key Scope Statement APPROVED 2013-01-15

Next Meeting:
 * 12:00pm PST Jan 22 2013
 * Intercall (866) 203-7023
 * Conference code: 5093670258

2013-01-08
Attending: Nathan, Ned, Josh, Andrew, Steve, Dave, Alex

Meeting Minutes:
 * Clarifying current documents:
 * Latest Shared Keys doc: arch doc. HLD expected this week.
 * Latest UID-GID doc: HLD.
 * We need reviewers for both HLDs.
 * UID-GID:
 * Nathan has already sent comments
 * Ned volunteers
 * I'd like to volunteer Andreas in absentia
 * Shared Keys:
 * Not out yet; any eager volunteers?
 * Document types: I think the consensus going forward is Google Docs for easier collaboration/feedback.

Actions:
 * Andrew to deliver HLD by the end of this week (hopefully)
 * Reviewers to provide timely feedback.

Milestones Under Review:
 * UID HLD
 * Shared Keys Solution Arch

Milestones Completed:
 * UID/GID Scope Statement APPROVED 2013-01-08

Next Meeting:
 * 12:00pm PST Jan 15 2013
 * Intercall (866) 203-7023
 * Conference code: 5093670258

2012-12-11
Attending: Nathan, Andreas, Josh, Simms, Cory, Alex, Andrew

Meeting Minutes:
 * Josh and Andrew updated the arch docs with improved use cases, test plan, and acceptance criteria
 * Several PAC members commented on the updates
 * Alex noted we neglected to address previous discussions on allowing multiple simultaneous keys:
 * should we allow key updates on a live system, or connect-time only?
 * is there any upper limit on total keys?
 * should keys be restricted to particular nid range?

Actions:
 * PAC members review docs for final approval by next week.
 * The above multiple-key use case should be added to the arch doc.
 * In the meantime HLD design can begin

Milestones Under Review:
 * UID Solution Arch
 * Shared Key Solution Arch

Next Meeting:
 * 12:00pm PST Dec 18 2012
 * Intercall (866) 203-7023
 * Conference code: 5093670258

No meetings on Dec 25 or Jan 1.

2012-12-04
Attending: Nathan, Ned, Josh, Simms, Carrier

Meeting Minutes:
 * Solution Architecture document review. More detail requested in
 * Practical use case (UID)
 * Specific functional requirements (shared key)
 * Detailed, specific acceptance criteria (e.g. "Any single user on up to 100(?) separate clusters has Unix UID/GID-controlled access to his files on shared Lustre file system.", "Unknown users can be squashed to a particular UID." etc.)

Actions:
 * Josh and Andrew to revise Solution Architecture docs with more detail.

Milestones Under Review:
 * UID Solution Arch
 * Shared Keys Solution Arch

Next Meeting:
 * 12:00pm PST Dec 11 2012
 * Intercall (866) 203-7023
 * Conference code: 5093670258

2012-11-20
Attending: Nathan, Alex, Andreas, Steve, Cory

Meeting Minutes:
 * Simms requested approval of the two scope statements as presented in email 2012-11-10. No objections were raised, and the scope statements were approved.

Actions:
 * Simms et al. will begin work on the Solution Architecture.

Milestones Under Review:
 * none

Milestones Completed:
 * UID/GID Scope Statement APPROVED 2012-11-20
 * Shared Key Scope Statement APPROVED 2012-11-20

Next Meeting:
 * 12:00pm PST Nov 27 2012
 * Intercall (866) 203-7023
 * Conference code: 5093670258