| Line 11: | Line 11: | ||
* Virtual Machine: CPU E8400(x1 core), RAM 512M, openSUSE 11.2 | * Virtual Machine: CPU E8400(x1 core), RAM 512M, openSUSE 11.2 | ||
* BOSS packages taken from OBS(https://build.opensuse.org/project/show?project=Maemo%3AMeeGo-Infra%3ABOSS) | * BOSS packages taken from OBS(https://build.opensuse.org/project/show?project=Maemo%3AMeeGo-Infra%3ABOSS) | ||
| + | * BOSS configuration: FS storage, single engine, one or two workers, AMQP participants. All components run on the same machine | ||
=== Test Scripts === | === Test Scripts === | ||
| Line 26: | Line 27: | ||
[[File:load_10k.PNG]] | [[File:load_10k.PNG]] | ||
[[File:load_20k.PNG]] | [[File:load_20k.PNG]] | ||
| + | |||
| + | From the above results, we can make several observations: | ||
| + | * Rates decrease as more workflows are launched at once; that is, performance drops as more requests arrive. | ||
| + | * CPU/DISK load stays high during testing. | ||
| + | * Memory is still held after the workflows finish. This looks like a "memory leak", but it may not be one, considering the "warm cache (or GC)" behavior in languages like Ruby/Python. | ||
| + | * Even with multiple workers, performance cannot be boosted if they all run on one machine. This is as expected considering the CPU/DISK load. | ||
| + | * Some workflows were lost in some test cases. This needs further investigation. | ||
| + | * The engine was observed to crash several times. This needs further investigation. | ||
| + | |||
| + | |||
| + | |||
* Test Case Set 2 | * Test Case Set 2 | ||
| - | Here are the "response time" testing results. The method is to launch 1k workflows each time after the previous 1k workflows have finished. By observing the duration of each batch of 1k workflows, we can see the response time trend. The following are the results on VM and HW | + | Here are the "response time" testing results. The method is to repeatedly launch 1k workflows, each batch after the previous 1k workflows have finished. By observing the duration of each batch of 1k workflows, we can see the response time trend. The following are the results on VM and HW
| + | |||
[[File:Response_1k.PNG]] | [[File:Response_1k.PNG]] | ||
| - | === Issues Found === | + | From the above results, response time increases as more and more workflows are handled; that is, performance (or rate) decreases. BOSS could therefore have very low capacity after running for a long time. This is a potential issue and needs further investigation.
| - | * Observed | + | |
| + | === Issues Found Summarized === | ||
| + | * Observed loss of several workflows (19995 of 20000 workflows; 9999 of 10000 workflows) | ||
* Engine crashed in some cases: | * Engine crashed in some cases: | ||
** Case for 100k workflows launched on HW: finished around 60k workflows then engine crashed | ** Case for 100k workflows launched on HW: finished around 60k workflows then engine crashed | ||
** Case for 30k workflows launched on Virtual Machine(1 CPU, 512M): finished around 20k workflows then engine crashed | ** Case for 30k workflows launched on Virtual Machine(1 CPU, 512M): finished around 20k workflows then engine crashed | ||
* Possible memory leak: memory used by the engine stayed allocated even after all workflows finished | * Possible memory leak: memory used by the engine stayed allocated even after all workflows finished | ||
| + | * Response time keeps increasing as more and more workflows are launched | ||
=== TODO === | === TODO === | ||
| - | * | + | * Distributed workers |
| - | * | + | * Other storage |
| - | + | * Think about issues found | |
| - | + | ||
| - | * | + | |
| - | + | ||
This page describes how BOSS performance testing is done.
We created a test suite (http://gitorious.org/boss-performance-test) that includes:
In this case set, we test the rates for launching workflows at different scales (N ranging from several hundred to several thousand). We also compare runs with different numbers of workers, and the CPU/MEM/DISK load is recorded.
* Raw data and graph
* CPU/MEM/DISK load for 10k and 20k cases
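As a rough illustration of how the rate for each workflow scale could be computed, here is a minimal Python sketch. It is not taken from the test suite: `run_batch` is a hypothetical callable that launches N workflows and returns once all of them have finished, and the injectable `clock` parameter is an assumption added to make the function easy to check.

```python
import time
from typing import Callable, Dict, Iterable


def measure_rates(run_batch: Callable[[int], None],
                  scales: Iterable[int],
                  clock: Callable[[], float] = time.perf_counter) -> Dict[int, float]:
    """Return the rate (workflows per second) measured for each scale N."""
    rates = {}
    for n in scales:
        start = clock()
        run_batch(n)  # hypothetical: launch n workflows, block until all finish
        elapsed = clock() - start
        # Guard against a zero reading from a coarse or fake clock.
        rates[n] = n / elapsed if elapsed > 0 else float("inf")
    return rates
```

Calling `measure_rates(run_batch, [500, 1000, 5000])` once per worker configuration would give one rate curve per configuration to compare.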
From the above results, we can make several observations:
* Rates decrease as more workflows are launched at once; that is, performance drops as more requests arrive.
* CPU/DISK load stays high during testing.
* Memory is still held after the workflows finish. This looks like a "memory leak", but it may not be one, considering the "warm cache (or GC)" behavior in languages like Ruby/Python.
* Even with multiple workers, performance cannot be boosted if they all run on one machine. This is as expected considering the CPU/DISK load.
* Some workflows were lost in some test cases. This needs further investigation.
* The engine was observed to crash several times. This needs further investigation.
Here are the "response time" testing results. The method is to repeatedly launch 1k workflows, each batch after the previous 1k workflows have finished. By observing the duration of each batch of 1k workflows, we can see the response time trend. The following are the results on VM and HW
From the above results, response time increases as more and more workflows are handled; that is, performance (or rate) decreases. BOSS could therefore have very low capacity after running for a long time. This is a potential issue and needs further investigation.
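The batch-by-batch procedure can be sketched in Python as follows. This is only an illustration under stated assumptions, not the actual test script: `run_batch` is a hypothetical callable that launches the given number of workflows and blocks until they finish, and the least-squares slope is just one simple way to quantify the trend.

```python
import time
from typing import Callable, List


def batch_durations(run_batch: Callable[[int], None],
                    batch_size: int = 1000,
                    rounds: int = 10,
                    clock: Callable[[], float] = time.perf_counter) -> List[float]:
    """Run `rounds` batches one after another and return each batch's duration."""
    durations = []
    for _ in range(rounds):
        start = clock()
        run_batch(batch_size)  # hypothetical: blocks until the batch is done
        durations.append(clock() - start)
    return durations


def trend_slope(durations: List[float]) -> float:
    """Least-squares slope of duration versus batch index (seconds per batch)."""
    n = len(durations)
    if n < 2:
        raise ValueError("need at least two batches to estimate a trend")
    x_mean = (n - 1) / 2
    y_mean = sum(durations) / n
    num = sum((i - x_mean) * (d - y_mean) for i, d in enumerate(durations))
    den = sum((i - x_mean) ** 2 for i in range(n))
    return num / den
```

A clearly positive slope means each batch of 1k workflows takes longer than the previous one, which matches the increasing response time described above.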