Experimental Area
Documentation for experimental branches, spikes, WIPs, etc. You can also add documentation related to your merge requests or patches here.
The docs should be moved to the proper place in the OTS wiki once the feature has been integrated into the master branch.
This area is relevant for OTS developers only! Information under this area does not apply to master or released OTS versions.
0.1
0.8.2
How to monitor OTS?
We have monitor DTOs, probably arranged in some kind of hierarchy of monitor DTO objects.
- Testrun events for all testrun states (flashing, bootup, testing, test result processing...):
  - Test package execution event (start/stop) (minimum requirement for [FEA] 9036)
  - Flashing event (start/stop) (nth try)
  - Bootup event (nth try)
  - Testsuite/set/case/step event? (What level of detail do we want to monitor?) Remember that logging still exists...
  - Testrun overall result
- Worker events:
  - Worker state (start/stop/busy (testrun id)/free) (What about queue info?)
  - Heartbeat?
- Failure event DTO? (Flashing failed, device failure...)
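One possible shape for such a hierarchy, as a rough sketch only. All class and field names below are made up for illustration and are not taken from the OTS code base; the `sender` field is assumed to identify the emitting worker instance.

```python
import time
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Phase(Enum):
    START = "start"
    STOP = "stop"


@dataclass(frozen=True)
class MonitorEvent:
    """Hypothetical base class for all monitor DTOs."""
    sender: str       # worker instance that emitted the event (assumed format)
    timestamp: float  # seconds since the epoch


@dataclass(frozen=True)
class TestrunEvent(MonitorEvent):
    """Base class for events that belong to one testrun."""
    testrun_id: str


@dataclass(frozen=True)
class FlashingEvent(TestrunEvent):
    phase: Phase
    attempt: int = 1  # nth try


@dataclass(frozen=True)
class TestPackageEvent(TestrunEvent):
    package: str
    phase: Phase


@dataclass(frozen=True)
class WorkerStateEvent(MonitorEvent):
    """Worker events are not necessarily tied to a testrun."""
    state: str                        # "start", "stop", "busy" or "free"
    testrun_id: Optional[str] = None  # set only when the worker is busy


@dataclass(frozen=True)
class FailureEvent(MonitorEvent):
    reason: str                       # e.g. "flashing failed"
    testrun_id: Optional[str] = None


# Example: second flashing attempt of testrun 42 starts on worker "host1:2".
event = FlashingEvent(sender="host1:2", timestamp=time.time(),
                      testrun_id="42", phase=Phase.START, attempt=2)
idle = WorkerStateEvent(sender="host1:2", timestamp=time.time(), state="free")
```

With a layout like this a listener could filter with `isinstance(event, TestrunEvent)` to pick out testrun related events, while worker events stay outside that branch of the hierarchy.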
What about statistics monitoring? What do we need?
- Number of testruns?
- Number of workers?
- Active testruns?
- PASS/FAIL/ERROR ratios?
- Error statistics?
  - Most common errors
  - Some kind of grouping of errors and statistics for them (user fault vs catastrophic OTS failure vs device/image error vs...)
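A small sketch of how the ratio and error statistics could be computed once results and errors have been collected. The function names and the error category strings are assumptions for illustration only.

```python
from collections import Counter, defaultdict


def result_ratios(results):
    """Return the share of each result value (e.g. PASS/FAIL/ERROR)."""
    counts = Counter(results)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {result: count / total for result, count in counts.items()}


def most_common_errors(errors, n=3):
    """Group (category, message) pairs by category and return the
    n most common messages of each category with their counts."""
    by_category = defaultdict(Counter)
    for category, message in errors:
        by_category[category][message] += 1
    return {category: counter.most_common(n)
            for category, counter in by_category.items()}


ratios = result_ratios(["PASS", "PASS", "FAIL", "ERROR"])
top = most_common_errors([
    ("user", "invalid image url"),      # hypothetical category and message
    ("user", "invalid image url"),
    ("device", "flashing failed"),
])
```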
Notes:
- One host can have multiple worker instances running. DTOs need to carry information about the worker instance, not just the hostname.
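As an illustration of the note above, a sender identifier could combine the hostname with an instance number. This is only a sketch; `WorkerId` and the `host:instance` format are assumptions, not existing OTS conventions.

```python
import socket
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkerId:
    """Hypothetical identifier for one worker instance on a host."""
    hostname: str
    instance: int

    def __str__(self):
        return f"{self.hostname}:{self.instance}"


def local_worker_id(instance):
    """Build the identifier for a worker instance running on this host."""
    return WorkerId(socket.gethostname(), instance)


first = WorkerId("host1", 1)
second = WorkerId("host1", 2)
```

Two instances on the same host then compare as different senders even though the hostname is the same.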
Who receives monitor DTOs?
The current implementation uses the same testrun response queue and process for monitor DTOs. This is simple to implement for testrun-related events but very limited for generic monitoring of the whole system. For example, most worker events cannot be sent to a testrun queue because they are not related to a specific testrun, and we don't know whether any testrun response queues even exist at that moment.
Having a separate permanent queue and a separate listener process for monitoring would also allow monitoring outside a testrun.
Pros:
- Can be deployed as a completely separate component
  - even on a different machine
- Maintenance, updates etc. do not affect test execution in any way
Cons:
- One more process/daemon to maintain
  - Might cause errors.
  - Not running => lots of messages pile up in the queue
- Dependencies
  - What if, for example, the distribution model relies on the monitor component to provide up-to-date data?
- How to provide testrun-specific monitor info to Publishers?
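The separate queue and listener idea can be sketched with the Python standard library. Here `queue.Queue` and a thread are only stand-ins for a permanent message broker queue and a separate listener daemon; the event dictionaries and their keys are made up for illustration.

```python
import queue
import threading

# Stand-in for a permanent monitor queue that exists independently of testruns.
monitor_queue = queue.Queue()
_STOP = object()  # sentinel used only to end this demo cleanly


def listen(q, handle):
    """Consume monitor events until the stop sentinel arrives."""
    while True:
        event = q.get()
        if event is _STOP:
            break
        handle(event)


# Events can be published even when no testrun and no listener exist.
monitor_queue.put({"type": "worker_state", "sender": "host1:2", "state": "free"})
monitor_queue.put({"type": "flashing", "sender": "host1:2",
                   "testrun_id": "42", "phase": "start"})

# While the listener is down, messages simply pile up in the queue.
backlog = monitor_queue.qsize()

received = []
listener = threading.Thread(target=listen, args=(monitor_queue, received.append))
listener.start()
monitor_queue.put(_STOP)
listener.join()
```

The backlog check shows the "not running" con in miniature: nothing is lost, but the queue keeps growing until a listener comes back and drains it.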
TODO
- The current server-side implementation does not send monitor DTOs to Publishers in real time. They are sent only after the testrun is done.
Bug 9036 - [FEA] Test package distribution based on history (last execution time of a package)
Proposal:
- Data provided by monitor DTOs
- Data stored in a DB
  - A separate DB for this plugin only, or a dependency on a more common monitor DB?
- A custom distribution model that reads data from the DB and creates tasks based on that
Notes:
- We cannot limit this to test package level. Implementation of set level distribution is already ongoing.
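A rough sketch of the proposal, under several assumptions: "last execution time" is read as the duration of the most recent run, the history lives in a plugin-specific SQLite DB, and tasks are balanced greedily by that duration. The table layout, function names and default duration are all hypothetical. The `package` column could just as well hold a set name if distribution moves to set level.

```python
import sqlite3


def create_history_db():
    """Create an in-memory DB for start/stop times of executed packages."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE package_run (package TEXT, started REAL, stopped REAL)")
    return db


def record_run(db, package, started, stopped):
    """Store one execution, e.g. from start/stop monitor events."""
    db.execute("INSERT INTO package_run VALUES (?, ?, ?)",
               (package, started, stopped))


def last_durations(db, packages, default=60.0):
    """Return the duration of the most recent run of each package.
    Packages without history get an assumed default duration."""
    durations = {}
    for package in packages:
        row = db.execute(
            "SELECT stopped - started FROM package_run "
            "WHERE package = ? ORDER BY stopped DESC LIMIT 1",
            (package,)).fetchone()
        durations[package] = row[0] if row else default
    return durations


def distribute(durations, task_count):
    """Greedy balancing: longest package first, always into the task
    with the smallest total duration so far."""
    tasks = [[] for _ in range(task_count)]
    loads = [0.0] * task_count
    ordered = sorted(durations.items(), key=lambda item: (-item[1], item[0]))
    for package, duration in ordered:
        index = loads.index(min(loads))
        tasks[index].append(package)
        loads[index] += duration
    return tasks


db = create_history_db()
record_run(db, "pkg-a", 0, 10)     # older run, superseded by the next one
record_run(db, "pkg-a", 100, 200)
record_run(db, "pkg-b", 0, 60)
record_run(db, "pkg-c", 0, 50)
tasks = distribute(last_durations(db, ["pkg-a", "pkg-b", "pkg-c"]), 2)
```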