GitHub Repository: torvalds/linux
Path: blob/master/tools/perf/Documentation/callchain-overhead-calculation.txt

Overhead calculation
--------------------
The CPU overhead can be shown in two columns as 'Children' and 'Self'
when perf collects callchains (and corresponding 'Wall' columns for
wall-clock overhead). The 'self' overhead is simply calculated by
adding all period values of the entry - usually a function (symbol).
This is the value that perf shows traditionally, and the sum of all
the 'self' overhead values should be 100%.

The 'children' overhead is calculated by adding the period values of
the child functions to those of the entry itself, so that it can show
the total overhead of the higher level functions even if they don't
directly execute much. 'Children' here means functions that are called
from another (parent) function.

It might be confusing that the sum of all the 'children' overhead
values exceeds 100% since each of them is already an accumulation of
'self' overhead of its child functions. But with this enabled, users
can find which function has the most overhead even if samples are
spread over the children.

Consider the following example; there are three functions like below.

-----------------------
void foo(void) {
    /* do something */
}

void bar(void) {
    /* do something */
    foo();
}

int main(void) {
    bar();
    return 0;
}
-----------------------

In this case 'foo' is a child of 'bar', and 'bar' is an immediate
child of 'main' so 'foo' also is a child of 'main'. In other words,
'main' is a parent of 'foo' and 'bar', and 'bar' is a parent of 'foo'.

Suppose all samples are recorded in 'foo' and 'bar' only. When they
are recorded with callchains, the usual (self-overhead-only) output of
perf report will show something like below:

----------------------------------
Overhead  Symbol
........  .....................
  60.00%  foo
          |
          --- foo
              bar
              main
              __libc_start_main

  40.00%  bar
          |
          --- bar
              main
              __libc_start_main
----------------------------------

When the --children option is enabled, the 'self' overhead values of
child functions (i.e. 'foo' and 'bar') are added to the parents to
calculate the 'children' overhead. In this case the report could be
displayed as:

-------------------------------------------
Children      Self  Symbol
........  ........  ....................
 100.00%     0.00%  __libc_start_main
          |
          --- __libc_start_main

 100.00%     0.00%  main
          |
          --- main
              __libc_start_main

 100.00%    40.00%  bar
          |
          --- bar
              main
              __libc_start_main

  60.00%    60.00%  foo
          |
          --- foo
              bar
              main
              __libc_start_main
-------------------------------------------

In the above output, the 'self' overhead of 'foo' (60%) was added to
the 'children' overhead of 'bar', 'main' and '\_\_libc_start_main'.
Likewise, the 'self' overhead of 'bar' (40%) was added to the
'children' overhead of 'main' and '\_\_libc_start_main'.

So '\_\_libc_start_main' and 'main' are shown first since they have
the same (100%) 'children' overhead (even though they have zero 'self'
overhead) and they are the parents of 'foo' and 'bar'.
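
As a rough illustration of the accounting described above, the
following C sketch replays the two call chains from the example,
weighted 60 and 40. It is not perf's implementation: the names
'struct sample', 'struct entry', 'find_entry', 'seen_before',
'account', 'example' and 'print_report' are hypothetical, and counting
a function only once per chain (so that a recursive chain would not be
counted twice) is an assumption of this sketch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_DEPTH 8
#define MAX_SYMS  16

/* One sample: chain[0] is the sampled function, later entries are its
 * callers. This sketch assumes depth >= 1. */
struct sample {
	unsigned long period;
	int depth;
	const char *chain[MAX_DEPTH];
};

/* Accumulated periods for one symbol (hypothetical layout). */
struct entry {
	const char *name;
	unsigned long self;
	unsigned long children;
};

static struct entry entries[MAX_SYMS];
static int nr_entries;

/* Look up a symbol by name, creating a zeroed entry on first use. */
struct entry *find_entry(const char *name)
{
	for (int i = 0; i < nr_entries; i++)
		if (strcmp(entries[i].name, name) == 0)
			return &entries[i];

	assert(nr_entries < MAX_SYMS);
	entries[nr_entries].name = name;
	return &entries[nr_entries++];
}

/* Return 1 if chain[pos] already occurred earlier in the same chain. */
int seen_before(const struct sample *s, int pos)
{
	for (int i = 0; i < pos; i++)
		if (strcmp(s->chain[i], s->chain[pos]) == 0)
			return 1;
	return 0;
}

/* Accumulate 'self' and 'children' periods; returns the total period. */
unsigned long account(const struct sample *samples, int nr)
{
	unsigned long total = 0;

	for (int i = 0; i < nr; i++) {
		const struct sample *s = &samples[i];

		total += s->period;
		/* 'self': only the sampled function gets the period. */
		find_entry(s->chain[0])->self += s->period;

		/* 'children': every distinct function in the chain,
		 * including the sampled one, gets the period. */
		for (int j = 0; j < s->depth; j++)
			if (!seen_before(s, j))
				find_entry(s->chain[j])->children += s->period;
	}
	return total;
}

/* The two call chains from the example, weighted 60 and 40. */
static const struct sample example[] = {
	{ 60, 4, { "foo", "bar", "main", "__libc_start_main" } },
	{ 40, 3, { "bar", "main", "__libc_start_main" } },
};

/* Print both columns as percentages of the total (unsorted). */
void print_report(unsigned long total)
{
	printf("%8s  %8s  %s\n", "Children", "Self", "Symbol");
	for (int i = 0; i < nr_entries; i++)
		printf("%7.2f%%  %7.2f%%  %s\n",
		       100.0 * entries[i].children / total,
		       100.0 * entries[i].self / total,
		       entries[i].name);
}
```

Calling account(example, 2) and passing its return value to
print_report() from a main() function gives 60% 'children' and 60%
'self' for 'foo', 100% and 40% for 'bar', and 100% and 0% for both
'main' and '\_\_libc_start_main'. The 'self' column sums to 100% while
the 'children' column sums to 360%. Unlike perf, the sketch prints the
entries in the order they were first seen rather than sorted.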

Since v3.16 the 'children' overhead is shown by default and the output
is sorted by its values. The 'children' overhead can be disabled by
specifying the --no-children option on the command line or by adding
'report.children = false' or 'top.children = false' in the perf config
file.
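
For illustration, the two config settings named above could be written
as the fragment below. The section layout follows the usual perf
config file syntax; where the file lives (commonly ~/.perfconfig) is
an assumption about the local setup.

```
[report]
	children = false

[top]
	children = false
```

Either setting only changes the default, so it may be simpler to pass
--no-children on the command line when the self-only view is wanted
for a single run.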