Release Infrastructure/BOSS/Performance/Results

From MeeGo wiki
(Difference between revisions)
Line 5: Line 5:
=== Test Method ===
=== Test Method ===
-
* The way to get "Rate" is straightforward. For each test case, just send specified number of workflows(such 3000, 20000...) to BOSS then record the start and end time. Then it's easy to get the "Rate" for that test case.
+
* The way to get "Rate" is straightforward. In each test case, sending specified number of workflows(such 3000, 20000...) to BOSS then record the start and end time. Then it's easy to get the "Rate" for that test case.
-
=== Test Suit ===
+
=== Test Environment ===
 +
* Hardware: CPU E5520(x16 cores), RAM 16G, openSUSE 11.2
 +
* Virtual Machine: CPU E8400(x1 core), RAM 512M, openSUSE 11.2
 +
* BOSS packages taken from OBS(https://build.opensuse.org/project/show?project=Maemo%3AMeeGo-Infra%3ABOSS)
 +
 
 +
=== Test Scripts ===
We created the test suit(http://gitorious.org/boss-performance-test) including:   
We created the test suit(http://gitorious.org/boss-performance-test) including:   
* A client to send multiple workflows to BOSS(or to say "launch workflows") at once time.
* A client to send multiple workflows to BOSS(or to say "launch workflows") at once time.
Line 13: Line 18:
* An utility to analyze the time data. Here we select the earliest start time and latest end time of all workflows to calculate the whole duration.
* An utility to analyze the time data. Here we select the earliest start time and latest end time of all workflows to calculate the whole duration.
-
=== Test Results ===
+
=== Test Cases ===
-
* Here we executed some cases on our marchine(CPU E5520(x16 cores), RAM 16G) and on a virtual machine(CPU E8400(x1 core), RAM 512M) for different workflow scales.
+
* Test Case Set 1
-
** 1 worker on HW
+
In this case set, we test the rates for launching different workflow scales(for N=[several hundreds, ..., several thousands]). Here we also compared the situation of different number of workers. And the CPU/MEM/DISK load is record.
-
** 1 worker and 2 workers on VM
+
* Raw data and graph
[[File:boss_performance_test_0920.PNG]]
[[File:boss_performance_test_0920.PNG]]
-
** CPU/MEM/DISK load for 10k and 20k launching
+
* CPU/MEM/DISK load for 10k and 20k cases
[[File:load_10k.PNG]]
[[File:load_10k.PNG]]
[[File:load_20k.PNG]]
[[File:load_20k.PNG]]
-
* Here we got the "reponse time" testing results. The way is to launch 1k workflows each time after previous 1k workflows finished. Observing the durations for executing each 1k workflows we can get the response time trend. Following are results on VM and HW   
+
* Test Case Set 2
 +
Here we got the "reponse time" testing results. The way is to launch 1k workflows each time after previous 1k workflows finished. Observing the durations for executing each 1k workflows we can get the response time trend. Following are results on VM and HW   
[[File:Response_1k.PNG]]
[[File:Response_1k.PNG]]
Line 32: Line 38:
=== TODO ===
=== TODO ===
-
* finish test on virtual machine
 
-
* test such case: "after you launch 50000 processes, how long does the next take to process? Is it as quick as the first or is it as slow as the 50000'th"
 
-
* involve mutilple participant
 
-
* use atop or other tool to track CPU, memory, disk load... info
 
-
* track duration between client send requests and engine start to deal first one
 
* think about why we lost workflows in some cases
* think about why we lost workflows in some cases
* different storage
* different storage
-
* multiple workers
 
=== Some Thoughts ===
=== Some Thoughts ===
-
* Storage maybe the bottleneck....need more testing
+
* Using more workers can boost performance?
-
* Using more workers can boost performance? maybe, but verify above thought firstly
+
* Using multiple engines can boost performance?
-
* Is possible to hack scheduler in Ruote::Worker to make things better?
+

Revision as of 11:01, 30 September 2010

This page describes how BOSS performance is tested.

Rate

  • "Rate" measures the throughput of BOSS: how many workflows it can handle per second. Measuring the "Rate" at different workflow request scales shows how BOSS copes with different service levels.

Test Method

  • Getting the "Rate" is straightforward. In each test case, we send a specified number of workflows (such as 3000 or 20000) to BOSS and record the start and end times. The "Rate" for that test case is the number of workflows divided by the elapsed time.
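
The calculation can be sketched in a few lines of Python. This is only an illustration: the function name `workflow_rate` and the timestamps are made up for the example and are not taken from the test suite or from the measured results.

```python
from datetime import datetime


def workflow_rate(count, start, end):
    """Return the number of workflows handled per second between two timestamps."""
    elapsed = (end - start).total_seconds()
    if elapsed <= 0:
        raise ValueError("end time must be later than start time")
    return count / elapsed


# Hypothetical numbers for illustration only (not measured results):
# 3000 workflows sent, with 5 minutes between the recorded start and end times.
start = datetime(2010, 9, 20, 10, 0, 0)
end = datetime(2010, 9, 20, 10, 5, 0)
rate = workflow_rate(3000, start, end)
print(rate)  # 10.0 workflows per second
```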

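Test Environment

  • Hardware: CPU E5520 (x16 cores), RAM 16G, openSUSE 11.2
  • Virtual Machine: CPU E8400 (x1 core), RAM 512M, openSUSE 11.2
  • BOSS packages taken from OBS (https://build.opensuse.org/project/show?project=Maemo%3AMeeGo-Infra%3ABOSS)
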
Test Scripts

We created a test suite (http://gitorious.org/boss-performance-test) that includes:

  • A client that sends multiple workflows to BOSS (that is, "launches workflows") at one time.
  • A logger plug-in for the engine, which records interesting internal engine messages. These are later used to calculate the durations of workflows, participants, etc.
  • A utility to analyze the time data. We take the earliest start time and the latest end time across all workflows to calculate the overall duration.
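
The analysis step might look roughly like the following. This is a minimal sketch, not the actual utility: the record layout `(workflow_id, start_seconds, end_seconds)` and the name `overall_duration` are assumptions made for the example, and the sample values are invented.

```python
def overall_duration(records):
    """Return the time from the earliest workflow start to the latest workflow end.

    records: iterable of (workflow_id, start_seconds, end_seconds) tuples.
    This layout is assumed for illustration; the real log format may differ.
    """
    records = list(records)
    if not records:
        raise ValueError("no workflow records")
    earliest_start = min(start for _, start, _ in records)
    latest_end = max(end for _, _, end in records)
    return latest_end - earliest_start


# Invented sample data: three workflows that overlap in time.
records = [("wf-1", 0.0, 4.0), ("wf-2", 0.5, 6.5), ("wf-3", 1.0, 5.0)]
duration = overall_duration(records)
print(duration)  # 6.5
```

Taking the minimum and maximum over all workflows, rather than summing per-workflow durations, matters because workflows run concurrently and their time spans overlap.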

Test Cases

  • Test Case Set 1

In this case set, we measure the rate when launching workflows at different scales (N ranging from several hundred to several thousand). We also compare runs with different numbers of workers, and record the CPU/MEM/DISK load.

  • Raw data and graph

Boss performance test 0920.PNG

  • CPU/MEM/DISK load for the 10k and 20k cases

Load 10k.PNG Load 20k.PNG

  • Test Case Set 2

This case set measures "response time". We launch 1k workflows at a time, starting each batch only after the previous 1k workflows have finished. The duration of each 1k batch shows the response time trend. The following are the results on VM and HW:

Response 1k.PNG
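
The batch-by-batch procedure can be sketched as below. This is a hedged illustration only: `measure_batches` and `fake_run_batch` are hypothetical names, and `fake_run_batch` is a stand-in for a real client call that launches the workflows and blocks until all of them have finished.

```python
import time


def measure_batches(run_batch, batches, batch_size=1000, clock=time.monotonic):
    """Run batches one after another and return the duration of each one.

    run_batch(batch_size) must block until every workflow in the batch has
    finished, so that the next batch starts only after the previous one is done.
    """
    durations = []
    for _ in range(batches):
        started = clock()
        run_batch(batch_size)
        durations.append(clock() - started)
    return durations


def fake_run_batch(size):
    # Stand-in for launching `size` workflows on BOSS and waiting for them.
    time.sleep(0.001)


durations = measure_batches(fake_run_batch, batches=3)
for index, seconds in enumerate(durations, start=1):
    print(f"batch {index}: {seconds:.3f}s")
```

If the durations grow from one batch to the next, the engine is slowing down as more workflows pass through it.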

Issues Found

  • Several workflows were lost (19995 for 20000 workflows; 9999 for 10000 workflows)
  • The engine crashed in some cases:
    • 100k workflows launched on HW: the engine crashed after finishing around 60k workflows
    • 30k workflows launched on the Virtual Machine (1 CPU, 512M): the engine crashed after finishing around 20k workflows
  • Possible memory leak: the memory used by the engine was not released even after all workflows had finished

TODO

  • Investigate why workflows were lost in some cases
  • Test with different storage

Some Thoughts

  • Can using more workers boost performance?
  • Can using multiple engines boost performance?