Path: blob/master/Documentation/cgroups/resource_counter.txt
                The Resource Counter

The resource counter, declared in include/linux/res_counter.h,
is intended to facilitate resource management by controllers
by providing common accounting infrastructure.

This infrastructure consists of the res_counter structure and the
routines that work with it.



1. Crucial parts of the res_counter structure

 a. unsigned long long usage

        The usage value shows the amount of a resource that is consumed
        by a group at a given time. The units of measurement should be
        determined by the controller that uses this counter. E.g. it can
        be bytes, items or any other unit the controller operates on.

 b. unsigned long long max_usage

        The maximal value of the usage over time.

        This value is useful when gathering statistical information about
        a particular group, as it shows the actual resource requirements
        of that group, not just a usage snapshot.

 c. unsigned long long limit

        The maximal amount of the resource that the group is allowed to
        consume. If the group requests more resources, so that the usage
        value would exceed the limit, the resource allocation is rejected
        (see the next section).

 d. unsigned long long failcnt

        The failcnt stands for "failures counter". This is the number of
        resource allocation attempts that failed.

 e. spinlock_t lock

        Protects changes of the above values.



2. Basic accounting routines

 a. void res_counter_init(struct res_counter *rc,
                                struct res_counter *rc_parent)

        Initializes the resource counter. As usual, it should be the first
        routine called for a new counter.

        The rc_parent argument can be used to define a hierarchical
        child -> parent relationship directly in the res_counter structure.
        Pass NULL to define no relationship.

 b. int res_counter_charge(struct res_counter *rc, unsigned long val,
                                struct res_counter **limit_fail_at)

        When a resource is about to be allocated it has to be accounted
        with the appropriate resource counter (the controller should
        determine which one to use on its own). This operation is called
        "charging".

        It is not very important which operation - resource allocation
        or charging - is performed first, but
          * if the allocation is performed first, this may create a
            temporary resource over-usage until the resource counter is
            charged;
          * if the charging is performed first, then it should be uncharged
            on the error path (if one is taken).

        If the charging fails and a hierarchical dependency exists, the
        limit_fail_at parameter is set to the particular res_counter element
        where the charging failed.

 c. int res_counter_charge_locked
                        (struct res_counter *rc, unsigned long val)

        The same as res_counter_charge(), but it does not acquire or release
        the res_counter->lock internally (it must be called with
        res_counter->lock held).

 d. void res_counter_uncharge[_locked]
                        (struct res_counter *rc, unsigned long val)

        When a resource is released (freed) it should be de-accounted
        from the resource counter it was accounted to. This is called
        "uncharging".

        The _locked routines assume that the res_counter->lock is already
        held by the caller.

 2.1 Other accounting routines

    There are more routines that may help you with common needs, like
    checking whether the limit is reached or resetting the max_usage
    value. They are all declared in include/linux/res_counter.h.



3. Analyzing the resource counter registrations

 a. If the failcnt value constantly grows, this means that the counter's
    limit is too tight. Either the group is misbehaving and consumes too
    many resources, or the configuration is not suitable for the group
    and the limit should be increased.

 b. The max_usage value can be used to quickly tune the group. One may
    set the limits to maximal values and either load the group with
    a common pattern or leave it running for a while. After this the
    max_usage value shows the amount of the resource the group would
    require during its common activity.

    Setting the limit a bit above this value gives a pretty good
    configuration that works in most cases.

 c. If the max_usage is much less than the limit, but the failcnt value
    is growing, then the group tries to allocate a big chunk of the
    resource at once.

 d. If the max_usage is much less than the limit, but the failcnt value
    is 0, then this group has been given a higher limit than it requires.
    It is better to lower the limit a bit, leaving more of the resource
    for other groups.



4. Communication with the control groups subsystem (cgroups)

All the resource controllers that are using cgroups and resource counters
should provide files (in the cgroup filesystem) to work with the resource
counter fields. They are recommended to adhere to the following rules:

 a. File names

        Field name      File name
        ---------------------------------------------------
        usage           usage_in_<unit_of_measurement>
        max_usage       max_usage_in_<unit_of_measurement>
        limit           limit_in_<unit_of_measurement>
        failcnt         failcnt
        lock            no file :)

 b. Reading from a file should show the corresponding field value in the
    appropriate format.

 c. Writing to a file

        Field           Expected behavior
        ----------------------------------
        usage           prohibited
        max_usage       reset to usage
        limit           set the limit
        failcnt         reset to zero



5. Usage example

 a. Declare a task group (take a look at the cgroups subsystem for this)
    and embed a res_counter in it

        struct my_group {
                struct res_counter res;

                <other fields>
        };

 b. Put hooks in the resource allocation/release paths

        int alloc_something(...)
        {
                struct res_counter *fail_at;

                if (res_counter_charge(res_counter_ptr, amount, &fail_at) < 0)
                        return -ENOMEM;

                <allocate the resource and return to the caller>
        }

        void release_something(...)
        {
                res_counter_uncharge(res_counter_ptr, amount);

                <release the resource>
        }

    In order to keep the usage value self-consistent, both the
    "res_counter_ptr" and the "amount" in release_something() should be
    the same as they were in alloc_something() when the resource being
    released was allocated.

 c. Provide a way to read and set the res_counter values (the cgroups
    subsystem can help with this).

 d. Compile and run :)
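
The charging rules described in sections 1 and 2 can be illustrated with a
small userspace model. This is only a sketch and not the kernel
implementation: the model_* names are hypothetical, the spinlock is left
out, and the limit is passed to the init routine for brevity (the
documented res_counter_init() takes no limit argument). It assumes that a
charge is applied to the counter and to each of its ancestors, and that a
failed charge is rolled back on the counters that were already charged.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace model of the documented fields; the lock is omitted. */
struct model_counter {
	unsigned long long usage;
	unsigned long long max_usage;
	unsigned long long limit;
	unsigned long long failcnt;
	struct model_counter *parent;	/* NULL means no parent */
};

/* Hypothetical init helper; takes the limit directly (an assumption). */
static void model_init(struct model_counter *rc, struct model_counter *parent,
		       unsigned long long limit)
{
	rc->usage = 0;
	rc->max_usage = 0;
	rc->limit = limit;
	rc->failcnt = 0;
	rc->parent = parent;
}

/* Charge a single counter; reject the charge if it would exceed the limit. */
static int model_charge_one(struct model_counter *rc, unsigned long val)
{
	if (rc->usage > rc->limit || val > rc->limit - rc->usage) {
		rc->failcnt++;
		return -1;
	}
	rc->usage += val;
	if (rc->usage > rc->max_usage)
		rc->max_usage = rc->usage;
	return 0;
}

/* Uncharge a single counter, never letting usage drop below zero. */
static void model_uncharge_one(struct model_counter *rc, unsigned long val)
{
	if (val > rc->usage)
		val = rc->usage;
	rc->usage -= val;
}

/*
 * Charge rc and all of its ancestors. On failure, undo the charges made
 * so far and report the failing counter through limit_fail_at.
 */
static int model_charge(struct model_counter *rc, unsigned long val,
			struct model_counter **limit_fail_at)
{
	struct model_counter *c, *u;

	*limit_fail_at = NULL;
	for (c = rc; c != NULL; c = c->parent) {
		if (model_charge_one(c, val) < 0) {
			*limit_fail_at = c;
			for (u = rc; u != c; u = u->parent)
				model_uncharge_one(u, val);
			return -1;
		}
	}
	return 0;
}

/* Uncharge rc and all of its ancestors. */
static void model_uncharge(struct model_counter *rc, unsigned long val)
{
	for (; rc != NULL; rc = rc->parent)
		model_uncharge_one(rc, val);
}

/* Walks through one success, one failure at the parent, one uncharge. */
static int model_self_check(void)
{
	struct model_counter parent, child;
	struct model_counter *fail_at;

	model_init(&parent, NULL, 100);
	model_init(&child, &parent, 200);

	assert(model_charge(&child, 60, &fail_at) == 0);
	assert(child.usage == 60 && parent.usage == 60);

	/* 60 + 60 fits the child's limit but not the parent's. */
	assert(model_charge(&child, 60, &fail_at) < 0);
	assert(fail_at == &parent && parent.failcnt == 1);
	assert(child.usage == 60);	/* the child's charge was rolled back */

	model_uncharge(&child, 60);
	assert(child.usage == 0 && parent.usage == 0);
	assert(parent.max_usage == 60);
	return 0;
}
```

In this model the second charge fails at the parent, so failcnt grows on
the parent only, which matches the way limit_fail_at is described above:
it identifies the element of the hierarchy whose limit was hit.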