LWG Minutes 2015-05-06

Attendance

 * Chris Morrone (LLNL)
 * Andreas Dilger (Intel)
 * Patrick Farrell (Cray)
 * Cory Spitz (Cray)
 * Sebastien Buisson (Bull/Atos)
 * Colin Faber (Seagate)
 * Paul Sathis (Intel)
 * Brad Settlemyer (LANL)
 * Justin Miller (Cray)
 * Richard Henwood (Intel)
 * Chris Horn (Cray)

Agenda

 * Bring-your-own-agenda-topic

Topics decided at the beginning of the call were:


 * Fate of upstream kernel client (Andreas)
 * Side branch status (Cory)
 * Testing topic from LWG meeting at LUG (Justin)
 * 2.8 scope missing on wiki (Cory)

Fate of upstream kernel client
Andreas - New developments are not very good. Andreas does not remember exactly what transpired, but Greg Kroah-Hartman has been reminded that there is an out-of-kernel Lustre tree. Greg KH is not happy that dozens of patches that fix up whitespace/whatever in the upstream kernel have not made it back into Lustre's master branch.

Basically, Greg has said that if not everything is cleaned up by 2.4 rc 1, in two months, he is going to drop Lustre entirely.

Greg stated something along the lines that any competent person would take one week to fix the upstream issues. James and others responded that, sure, the whitespace alone might take only a week, but there is much more to do.

Andreas expressed the opinion that it is a pretty major deal if we get booted out of the kernel. Oleg tried to push some patches upstream to deal with it, but it has not been enough to satisfy the upstream kernel guys.

Do we think Lustre in upstream is in the long-term interest of Lustre users?

Patrick asked what the benefits of an upstream client are.

Andreas said that one benefit is that we would not be perpetually behind in supporting new kernels. All new kernels would have Lustre support. James is working on 3.18, but it is up to 3.30? and 4.2 is out now. The second benefit is that, once Lustre is in the upstream kernel, vendors are more likely to ship it to all of their clients. That would help with broader adoption.

Right now it is harder and more work because we have to do multiple lines of development. We were hoping to get the client upstream and then feed patches to the kernel regularly, but the upstream kernel guys demanded no new features until the cleanup is done, which makes that a larger development burden.

If Lustre is dropped out of staging at this point, it will be 10 times harder to get Lustre into Linux again in the future.

Cray, DDN, Seagate need to chip in effort to make this happen. Probably won't happen with current level of effort.

Go back to respective organizations and decide if the benefits are worth the effort.

We have little time to act.

Chris played devil's advocate: The upstreaming effort is a large drain on development resources, and the more tangible benefits are years in the future. We are struggling to get enough manpower together to meet our development needs without that additional effort, so maybe the benefits do not outweigh the costs.

Cory: Thinks Cray can add a little, but it seems like there is too much to do in too little time. Perhaps it is better to continue the cleanup work we are doing and get it in master, since that work is good whether it lands upstream in Linux or not, but perhaps we should not worry as much about getting it in the upstream kernel.

Side branch (topic branch) status
Are we on track to use any side branches?

Andreas - One issue with gerrit is that we can push a merge patch, but gerrit's cherry-pick mode does not allow landing a merge patch. There are some technical hurdles there.

The other option is to push a series of patches, but then the final landing does not get additional comment annotations.

Intel would like to land SFS-DEV-005 remaining work as a topic branch.

Chris asked what the short-term workarounds could be.

Andreas said that we could temporarily change gerrit to merge-mode each time we want to land a large topic branch. Topic branches are not yet common, so it might not be too much of a problem to do that in the short term.

Testing
Testing was an LWG meeting topic for LUG that we did not have time to talk about. Justin wanted to know what that discussion was going to be about.

Chris didn't have a strong personal interest in leading that topic. Testing is a frequent topic that arises as something the community should collaborate on, but not much is actually being done.

There was a somewhat related side conversation between Andreas and Chris about Seagate's CPPCheck automation for Lustre testing. Chris pointed out that it is taking some time for a response to Seagate's request for access to gerrit events, with little explanation of the delay. Andreas explained that there are some concerns about exposure to non-public information that need to be investigated, and that manpower turnover has caused delays. Chris suggested that putting that information in the ticket, and giving an estimate of when the new people can look into the ticket, would buy Intel more patience and good will.

Back on the main topic of Lustre testing, it was expressed that lots of people seem interested in testing, but there is not much coordination of effort yet.

Lustre 2.8 Scoping
Cory pointed out that the scope section is blank. Chris suggested that our new way of doing features (features land when done, releases not tied to a specific set of features) may be partly to blame for that. Cory suggested explaining that on the 2.8 wiki page, and Chris agreed.

That led into a discussion of the current status of projects on the wiki.lustre.org/Projects page. Chris asked if those things are still on target. Some points of contact were not on the call, but two were:


 * Multiple modify RPCs - believed on target to finish this month
 * NRS delay - implementation is done, but not yet pushed for review. Author waiting for dependent NRS cleanup patches to land. It was suggested that the implementation work be pushed now as dependent patches rather than waiting. Author

The testing section is also empty on the Lustre 2.8 page. Chris wanted to know what we are going to do this time around. There was talk during the previous two (maybe more) releases about having a written release testing plan, but we have done very little there. Chris suggested that we will not make any progress in that area unless someone commits to running that effort. Chris put Cory on the spot and asked if he could do it, since Cory often is the one (at least by Chris' memory) who reminds us of the need for a test plan. Chris thought Cory's interest and experience made him a good candidate to lead that effort. Cory agreed to consider it and see if he had the time to lead that.

Cory - Would like to see the community way more involved in the test plan, but is not sure if we will be able to do that this time around.

Cory did agree to talk to Peter Jones and learn about the current release testing. That might be a starting point for the test plan effort: write down what Intel does as part of the community releases now, before we try to go down the harder path of distributing and coordinating work amongst multiple organizations.

Action Items

 * All organizations to see if they have the desire and manpower to apply to Lustre in the upstream kernel. Only one month until Lustre is removed from the kernel and the point becomes moot unless we take action now.
 * Let James know we discussed upstream kernel topic - (Chris)
 * Cory to talk to Peter Jones to learn about current state of release testing as a starting point for improving the written test plan.
 * Chris Horn to update target completion date for NRS Delay on wiki.lustre.org/Projects.