provisioning, script upload, etc. If an exclusive label was specified, the
benchmark will wait until no other benchmarks with the same label are in
progress. In this case, its status is reported as "wait_exclusive".

## Graphs

When the "running" phase begins, you will see charts. There are two built-in
groups, "System load" and "MZBench internals"; the rest of the charts are
worker-specific.

In the case of HTTP, the charts show response-kind counters and latencies:

All counters in MZBench are supplied with derivative charts; a derivative
shows how quickly the counter value changes. In the example above, the left
chart shows the total number of responses (the red line is `http_ok`, code 200),
and the right chart shows the derivatives.
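
As a rough illustration of what a derivative chart plots, here is a minimal
Python sketch (not MZBench's internal code) that turns cumulative counter
samples into per-second rates:

```python
# Minimal sketch: compute a derivative series from cumulative counter samples,
# i.e. how quickly the counter value is changing.
def derivative(samples):
    """samples: list of (timestamp_seconds, counter_value) pairs."""
    rates = []
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        rates.append((t1, (v1 - v0) / (t1 - t0)))  # change per second
    return rates

# Example: an http_ok counter sampled every 10 seconds.
print(derivative([(0, 0), (10, 250), (20, 520)]))  # -> [(10, 25.0), (20, 27.0)]
```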

### System load

MZBench monitors its worker nodes so the user can be sure that there is no
overload in the benchmarking cluster at the system level.

LA (load average), CPU, RAM, and per-interface traffic metrics are gathered
for each node.

### MZBench internals

These metrics are used to diagnose behavior at the level of the Erlang
application.

Complete list of internals:

* Pool-wise worker status counters: started, ended, failed. You can see a linear start scenario in the screenshot above.
* Errors and blocked workers: the number of errors during the bench run. Errors are divided into two groups: system and user. User errors are usually worker-specific.
* Number of log records written and dropped. Under high load, the system might be unable to write every log record; in this case logs are dropped.
* Metric merging time: the amount of time taken to merge counters and histograms. Normally it is a few milliseconds, but under high load it can be more. If the merge time becomes larger than the metrics gathering period, the system is unhealthy (see the sketch after this list).
* Mailbox messages: node-wise numbers of messages queued for delivery in the Erlang VM. This metric is commonly used when monitoring Erlang systems; if you want to know more about it, please refer to the [Erlang documentation](http://erlang.org/doc/getting_started/conc_prog.html#id69544).
* Erlang processes: node-wise number of Erlang processes.
* System metrics report interval: how often metrics are merged, 10 seconds by default.
* Actual time diff with director: for some applications it is important to check the time difference across the cluster. This metric monitors the node-wise difference in clocks.
* Time offset at node: this value is evaluated by MZBench and can be subtracted from a timestamp to make the time difference smaller than the one provided by the operating system.
* Director ping time: node-wise metric showing how quickly a packet gets from a worker node to the director.
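
To make the merge-time rule of thumb concrete, here is a small Python sketch
of the health check described for the "Metric merging time" item; the numbers
are made up for illustration, and this is not an MZBench API:

```python
# Hypothetical values for illustration; MZBench shows both on the dashboard.
metric_merging_time_ms = 1500   # time taken to merge counters and histograms
report_interval_ms = 10 * 1000  # "System metrics report interval", 10 s by default

# The system is considered unhealthy when merging takes longer than the
# gathering period, because merges can no longer keep up with incoming data.
if metric_merging_time_ms > report_interval_ms:
    print("unhealthy: metric merging cannot keep up with the report interval")
else:
    print("ok: merging finishes within the report interval")
```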

## Scenario

While the benchmark is running, it is possible to adjust environment variables
or execute some code on a given percentage of workers.

## Logs

Logs are available all the time.

## Reports

Metric values can be downloaded for post-processing.
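
As one possible post-processing step, the sketch below assumes the downloaded
metric data is a CSV file with `timestamp,value` rows (the actual export
format and the file name `http_ok.csv` are assumptions) and computes a simple
summary:

```python
# Sketch only: assumes a downloaded metric file "http_ok.csv" containing
# "timestamp,value" rows; adjust the parsing to the actual export format.
import csv

values = []
with open("http_ok.csv") as f:
    for timestamp, value in csv.reader(f):
        values.append(float(value))

print("samples:", len(values))
print("min/max:", min(values), max(values))
print("mean:", sum(values) / len(values))
```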

## Finals

After execution is done, MZBench evaluates the final percentiles. These values
can be used to check whether the execution was successful, as well as for
comparison.
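
For example, a success check against the final latency percentiles might look
like the following sketch; the samples, the metric, and the 500 ms threshold
are made up for illustration:

```python
import math

def percentile(samples, p):
    """Nearest-rank percentile of a list of samples."""
    ordered = sorted(samples)
    rank = math.ceil(p / 100.0 * len(ordered))
    return ordered[max(rank - 1, 0)]

# Made-up final latency samples in milliseconds.
latencies_ms = [12, 15, 18, 22, 35, 48, 95, 120, 180, 240]
p95 = percentile(latencies_ms, 95)

# Treat the run as successful only if the 95th percentile stays under 500 ms.
print("95th percentile:", p95, "ms ->", "PASS" if p95 < 500 else "FAIL")
```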

# Comparison

If you need to compare a set of benchmarks, you could use "dashboards"