Test framework requirements

From OpenSFS Wiki
Revision as of 11:18, 16 December 2012 by Roman grigoryev (talk | contribs)

This page is intended as the collection point for people to record their requirements for a new test-framework environment. The new test framework environment is intended to support Lustre today as well as into the future with exascale systems, so the entries on this page should encapsulate everything that will be needed to test Lustre for the foreseeable exascale future. The current plan is to design the best possible new test framework environment and then assess whether a transition from the current framework is possible, but we should not hinder the new framework environment in order to make the transition possible. Please add your requirements to the table below, including child pages where your suggestion is a document or sizeable entry. We want to capture all the valuable input people can offer, from ideas to language requirements, high-level architectural ideas, and references to other test methods, languages and environments that might be useful.

Email | What and Why | Requirement/Idea/Thought
[email protected] Make tests scalable

Historically, test-framework tests tend to address a single client against a single OSS and a single OST; extra effort is required to create scaling tests. This behaviour is the inverse of what is required: a simple test should scale to 100,000+ clients against 1,000+ servers with no effort on behalf of the test writer.

The framework environment must make scalable tests the natural way to write tests. The shortest and simplest test should be scalable, with writing a singular, non-scalable test being the more time-consuming option.
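
Purely as an illustrative sketch (no such API exists today; the names env, clients and run_all are invented), a scalable-by-default test might look like the following, with the framework rather than the test author deciding how many clients and servers take part:

    # Hypothetical sketch: the framework fans the same test body out across
    # however many clients the deployment provides, whether 2 or 100,000+.
    def test_parallel_create(env):
        # env.clients is the complete set of clients provisioned by the framework
        results = env.clients.run_all(
            lambda client: client.create_file("/mnt/lustre/" + client.name))
        assert all(r.ok for r in results)

The point of the sketch is that the test contains no loop over nodes and no node count; scaling the run is entirely a framework and deployment concern.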
[email protected] Be able to create safe live-system test sets. Tests need to be organised on a one-test-per-file basis. This will lead to easier-to-read code but, more importantly, enable a set of tests to be created that are guaranteed to be safe for a live system. The problem with running tests by filter is that people make mistakes; if it is possible to assemble an install of only 'safe' tests then we remove the live ammunition from the situation (see the layout sketch below).
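
For illustration only (all file and directory names below are invented), a one-test-per-file layout makes it possible to assemble a guaranteed-safe subset by shipping only the safe files, rather than relying on runtime ONLY/EXCEPT filtering:

    tests/
        sanity/
            test_create_unlink.sh      # file-system-safe
            test_stripe_layout.sh      # file-system-safe
            test_reformat_ost.sh       # destructive, never included in the safe set
        sets/
            live-safe/                 # links only to file-system-safe test files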
[email protected] Remove all test relationships

A significant issue with fully exploiting the test-framework is that tests have interdependencies, with one test being required to run before another if that later test is to succeed.

A knock-on effect is that setup/teardown of test requirements is often done by the first or last test in a sequence.

The requirement is that each test can be run successfully without requiring any other test to execute. This has the following knock-on effects:

• Setup/teardown must be separated entirely from the tests themselves, with the setup called only once before the first test requiring the provided functionality and the teardown called only once after the last test requiring it.

• A general mechanism should be created that provides a safe way for sections containing a number of separate tests to be bounded by a setup/teardown pair. The 'trick' here is that the setup/teardown must only occur if at least one of the requiring tests is actually run (see the sketch below).
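
A minimal sketch of one possible mechanism (entirely hypothetical; the Fixture class and the fixtures attribute are invented for illustration): setup runs once if any selected test needs the fixture, teardown runs once afterwards, and neither runs if no selected test needs it. A fuller implementation would also delay each setup until just before the first requiring test.

    # Hypothetical fixture registry for setup/teardown shared by several tests.
    class Fixture:
        def __init__(self, setup, teardown):
            self.setup, self.teardown = setup, teardown

    def run_selected(tests, selected_names):
        selected = [t for t in tests if t.__name__ in selected_names]
        # Collect only the fixtures actually needed by the selected tests.
        needed = {f for t in selected for f in getattr(t, "fixtures", [])}
        for f in needed:
            f.setup()          # each needed fixture is set up exactly once
        try:
            for t in selected:
                t()
        finally:
            for f in needed:
                f.teardown()   # and torn down exactly once, even on failure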

REQ-1: Identify all dependent tests and create a change request ticket for each test or set of tests. This work may already have been carried out by Xyratex, in which case the requirement is to create a change request ticket for the tests already found.

REQ-2: Create a general mechanism that enables setup/teardown code to be separated from the tests and called in an efficient way. Submit this mechanism for review before implementing it on each set of dependent tests.

REQ-3: Implement the changes described by each ticket from REQ-1; where applicable, apply the mechanism created in REQ-2 to deal with each setup/teardown requirement. Submit each change for review and landing, and during the landing process ensure that extra testing is carried out to validate that the modified tests operate correctly.

[email protected] Add Permissions/Run-Filter Capability

The ability to run tests based on a class of requirement would allow targeted testing to take place more readily than today's manual selection process using ONLY or EXCEPT.

As well as allowing directed testing of specific functionality, this should allow filtering by requirements such as file-system-safe, failover-safe and so on.

To make this possible, the run_test function must be updated to apply an AND/OR style mask.

This mask could be a bitwise, keyword or key/value pair match. It should not be a trigger for functionality (i.e. "if bit x is set then format the file system before the test"); it should simply be a way of preventing a test from running.

Suitable and likely masks/filters are:

• File system safe. This would indicate that the test can be run on a production file system.

• Failover-unsafe. By default all tests should be failover-safe; this flag allows a test to be explicitly marked as unsafe.

• Metadata (see spec) key:value present

• Metadata (see spec) key:value not present

The method used should be extensible so that additional flags can be added with minimal overhead. In the above example, the first two are attributes that might well be hardcoded with the test, while the last two are attributes that would come from a managed resource appending additional information to the test.

The filter must be applied not by the caller of the test but by the test itself, or by the framework calling the test, so that overriding an attribute such as file-system-safe requires the actual test to be changed.

As each test will need to be modified, a bulk tool for changing all tests to the default state is probably worth developing.
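
As a sketch only (the attrs attribute and the mask format are invented; this is not the current run_test), a key:value mask applied by the framework itself, rather than by whoever invokes the test, might look like:

    # Hypothetical mask check: a test runs only if every key:value in the mask
    # matches the attributes the test itself declares, so nothing runs on a
    # live file system unless the test explicitly declares fs_safe: True.
    def matches(test, mask):
        attrs = getattr(test, "attrs", {})
        return all(attrs.get(key) == value for key, value in mask.items())

    def run_test(test, mask):
        if not matches(test, mask):
            print("SKIP %s (filtered by mask %s)" % (test.__name__, mask))
            return
        test()

    # Example: allow only file-system-safe, failover-safe tests.
    # run_test(some_test, {"fs_safe": True, "failover_safe": True})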

REQ-1: Design, implement and have reviewed the mechanism for ensuring that the mask/filter is applied correctly for each test, in a failsafe manner that allows, for example, the use of the test framework on a live file system.

REQ-2: Develop a bulk change tool that allows all tests to be set by default to the failsafe mode. This may require no changes or may require a standard piece of code to be inserted into each test procedure.

REQ-3: Implement the changes described by REQ-1; where applicable, apply the bulk change tool created in REQ-2 to each test.

REQ-4: Update the obvious candidates for the major masks, such as file-system-safe and failover-unsafe; it may be necessary for the majority of tests to be updated ad hoc moving forwards.

[email protected] Metadata Storage & Format

To increase the knowledge of the individual tests we need to be able to store structured information about each test; this information is the test metadata.

This data needs to be stored as part of the source code, so that it travels with the source as it continues its evolution.

This metadata will record attributes of the test that are completely independent of the hardware that the test runs on.

Good metadata to store includes:

• Areas of coverage:

o lnet, ldiskfs, zfs, recovery, health.

• Code coverage, if we can work out a method:

o Difficult to see how to do this succinctly.

• File-system-safe

Bad metadata to store includes:

• Execution time

o This will be very machine-specific.

• ?

What is required is a human readable and writeable format for storing this data.

This integration of the data with the source will make the data an integral and valued asset which will then be treated with the care and attention that it deserves.

The data should be stored in the source code as headers to the functions, in a way that can be read and written by third-party applications. This method means that a small library can be created for applications using the data, making the data storage medium invisible to the user.

In future, for example with a new test framework, the storage method might be something quite different.

An example of this methodology is doxygen, which uses a formatted header to hold metadata about a function. A useful approach that should be examined is to use doxygen formatting, with something like the \verbatim or \xmlonly section being used to store the test metadata in YAML or XML. If this approach were chosen, doxygen could be used to create a test manual, whilst the YAML/XML in the \verbatim or \xmlonly section could be used to extract test metadata to enable targeted testing.

The chosen approach needs to be prototyped to enable people to review the likely look and feel of the commented section.

A possible header might look like this:

[Need an example here]
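
Until a real example is written, the following non-authoritative sketch shows roughly what the doxygen-plus-YAML approach described above could look like; the \brief, \verbatim and \endverbatim commands are real doxygen, but every YAML field name is illustrative only:

    /**
     * \brief sanity test: create and unlink a striped file
     *
     * \verbatim
     * test-metadata:
     *   coverage: [ldiskfs, zfs, recovery]
     *   fs_safe: true
     *   failover_safe: true
     * \endverbatim
     */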

One issue that needs to be resolved is how this data is managed from a git/code review perspective. Each change will cause a git update unless some batching/branching takes place, and if the tools are automatic, do we need a full review process? These issues are not without complication and need to be addressed before automation is possible. This review/git issue means that the data must be long-lived and likely to change only when the test itself changes; the data cannot and should not contain transient data like test results.

REQ-1: Design and document a header format that allows the flexible storage of test metadata. If possible, include the ability to make use of a tool like doxygen.

REQ-2: Design and document the requirements for a library to allow automated read/write access to this data.

[email protected] Metadata Storage Access Library

A general-purpose library is required that can be used to read/write the metadata stored as part of the source code. This library should be callable from bash and other major languages such as Python, Perl, Ruby and C/C++; this might mean a simple language layer is required to enable each caller, but the general design must allow varied callers.

The library should implement, at a minimum, the ability to read, write and query for tests with specific attributes.

The library needs to offer the following capabilities:

Read/write metadata encapsulated in the source files. As the data is going to be arbitrary, some form of (semi-)arbitrary data interface is going to be required. A sensible halfway house might be to allow key:value pairs to be read from the source, with the value potentially being an XML/YAML dataset in its own right.

A simple query mechanism will also be required so that a list of tests with matching key:values can be retrieved in a single call.

Referring back to the Metadata Storage Format, the library is going to need to be able to cache writes to the metadata for extended periods of time so that updates can be batched and submitted as a single review. This change/review/merge process is going to need considerable thought, documentation and sign-off before it can be implemented.
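
A sketch of what the API surface might look like (all names are placeholders; nothing here is defined by this page), restricted to the read, write and query operations listed above:

    # Hypothetical metadata access library interface.
    class TestMetadataStore:
        def read(self, test_name):
            """Return the key:value metadata embedded in the test's source header."""

        def write(self, test_name, updates):
            """Stage metadata updates; writes are cached and later submitted as a
            single batched change for review rather than committed piecemeal."""

        def query(self, **criteria):
            """Return the names of tests whose metadata matches every key:value
            given, e.g. query(fs_safe=True, coverage="recovery")."""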

REQ-1: Design, document and have reviewed an API that enables read, write and query operations.

REQ-2: Investigate a widely used language that allows this functionality to be made available as a library to a broad range of languages.

REQ-3: Investigate, document and have reviewed a caching and update process for the changes made to the metadata, i.e. the process that occurs in the case of a write.

REQ-4: Implement and test the metadata library and include as part of the Lustre source tree.

[email protected] Clearly expressible tests & well-specified system under test. That is, in looking at the test, it should be pretty obvious what the test is doing. This is similar to Nathan's "Add test metadata" section. The bullet "well-specified system under test" is tied to the previous "introspection" bullet, but I think it is easy to miss gathering a fairly detailed description of what the system under test is, making comparisons of similar systems more difficult. It would be more comprehensive than just the Lustre-specific information.
[email protected] A survey of existing open source test frameworks should be done. It is possible that there is a framework that provides 80% of the requirements, making the time to implement the solution much shorter. See the frameworks comparison: Exists_test_framework_evaluation
[email protected] Distributed infrastructure should be evaluated. I think this issue should be a primary consideration, as it will affect everything else that is done with the test framework. I may have the wrong perception, but it seemed as though the use of ssh was almost a given; this should be on the table for evaluation. As well as having a good test framework, it would be beneficial to have a good interactive tool for curiosity-driven research and analysis. ipython might fit that role. Its distributed infrastructure is based on zeromq, so that might be a good distributed layer to consider for the test framework too. For the project I'm working on, I'm looking at robotframework for the base of the test infrastructure and zeromq as the glue for distributed work. To this I would need to add a component that is workload-management aware, e.g. PBS, to put a load on the system from a Cray or a cluster, and a driver to start multiple instances of robotframework, using different tests, in parallel.
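
Purely as a sketch of the kind of thing being suggested (the node name, port and message format are invented), a zeromq request/reply pair could replace per-command ssh when dispatching work to a remote agent:

    import zmq

    # Driver side: ask a remote agent to run one named test and wait for its result.
    ctx = zmq.Context()
    sock = ctx.socket(zmq.REQ)
    sock.connect("tcp://client-node-01:5555")   # hypothetical agent address
    sock.send_json({"cmd": "run_test", "name": "sanity_1"})
    print(sock.recv_json())                     # e.g. {"status": "PASS"}

    # The agent on each node would bind a zmq.REP socket on the same port,
    # recv_json() the request, run the test locally, and send_json() the result.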
[email protected] Communications Framework. Without doubt the new framework environment will use an implementation of MPI, Corba, Parallel Python, ICE, etc. Any implementation of the framework environment must isolate the tests from that communication layer. This 100% disconnect of the tests from the communication layer is so that the communication layer can be replaced as need be.
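
A minimal sketch of the isolation being asked for (the Transport class and method names are invented): tests and the framework only ever talk to an abstract transport, and MPI, zeromq, ssh or anything else sits behind one replaceable implementation:

    import abc
    import subprocess

    class Transport(abc.ABC):
        """Abstract communication layer: the only interface tests ever see."""
        @abc.abstractmethod
        def run_on(self, node, command):
            """Execute a command on a remote node and return its output."""

    class SSHTransport(Transport):
        # One concrete implementation; swapping in an MPI- or zeromq-based
        # Transport later requires no changes to any test.
        def run_on(self, node, command):
            return subprocess.run(["ssh", node, command],
                                  capture_output=True, text=True).stdout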
[email protected] Language. The language must be object-oriented so that all functionality is expressed in an object.action or object.value style of notation. The language used to develop the framework need not be the same as that of the tests within the framework, although it is probably easier if the two are the same.

If a language can be found that allows development within an environment such as Eclipse or Textmate, with tab completion of object members, this would make the framework environment much more accessible to many. Careful choice of language will be important: because the framework will be a big parallel application in its own right, we will need to ensure that the language allows scaling. Memory and process issues (threads, process pinning) are easier to handle and investigate with C and C++; however, a dynamic language like Ruby or Python is more likely to allow the creation of the easy-to-write and easy-to-read tests that are required. Language choice would also depend on the availability of suitable parallel frameworks like MPI, Corba or Parallel Python.
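
To illustrate the object.action / object.value notation being asked for (every object and method here is hypothetical), a test in such a style might read:

    # Hypothetical object.action style: each step's intent is readable directly
    # from the object it acts on and the action performed.
    def test_failover_during_write(cluster):
        client = cluster.clients[0]
        target = cluster.osts[0]

        client.mount()
        client.write_file("/mnt/lustre/f0", size_mb=64)
        target.failover()                       # fail the OST mid-workload
        assert client.read_file("/mnt/lustre/f0").ok
        assert cluster.health.is_clean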

[email protected] What & Why How