mbind — Set memory policy for a memory range
#include <numaif.h>
int mbind(void *start, unsigned long len, int policy,
          unsigned long *nodemask, unsigned long maxnode,
          unsigned flags);
cc ... -lnuma
mbind() sets the NUMA memory policy for the memory range starting
at start and continuing for len bytes. The memory of a NUMA machine
is divided into multiple nodes. The memory policy defines on which
node memory is allocated. mbind() only has an effect on new
allocations; if the pages inside the range have already been touched
before the policy is set, then the policy has no effect.
Available policies are MPOL_DEFAULT, MPOL_BIND, MPOL_INTERLEAVE, and
MPOL_PREFERRED. All policies except MPOL_DEFAULT require the caller
to specify the nodes to which the policy applies in the nodemask
parameter. nodemask is a bitmask of nodes containing up to maxnode
bits. The actual number of bytes transferred via this argument is
rounded up to the next multiple of sizeof(unsigned long), but the
kernel will only use bits up to maxnode. A NULL argument means an
empty set of nodes.
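
As an illustration of the nodemask layout described above, the following sketch stores node bits in an array of unsigned long. The helper names nodemask_set() and nodemask_isset() and the 128-bit capacity are hypothetical, chosen only for this example; they are not part of numaif.h.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical capacity for this example: room for 128 node bits. */
#define EXAMPLE_MAX_NODES 128
#define EXAMPLE_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define EXAMPLE_MASK_LONGS \
    ((EXAMPLE_MAX_NODES + EXAMPLE_BITS_PER_LONG - 1) / EXAMPLE_BITS_PER_LONG)

/* Set the bit for `node` in a nodemask stored as an array of unsigned long. */
static void nodemask_set(unsigned long *mask, unsigned node)
{
    assert(node < EXAMPLE_MAX_NODES);
    mask[node / EXAMPLE_BITS_PER_LONG] |= 1UL << (node % EXAMPLE_BITS_PER_LONG);
}

/* Return 1 if the bit for `node` is set in the mask, 0 otherwise. */
static int nodemask_isset(const unsigned long *mask, unsigned node)
{
    assert(node < EXAMPLE_MAX_NODES);
    return (int)((mask[node / EXAMPLE_BITS_PER_LONG]
                  >> (node % EXAMPLE_BITS_PER_LONG)) & 1UL);
}
```

A caller would declare `unsigned long mask[EXAMPLE_MASK_LONGS]`, zero it, set the wanted node bits, and then pass the array as nodemask together with a suitable maxnode.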
The MPOL_DEFAULT policy is the default and means to use the
underlying process policy (which can be modified with
set_mempolicy(2)). Unless the process policy has been changed, this
means to allocate memory on the node of the CPU that triggered the
allocation. nodemask should be specified as NULL.
The MPOL_BIND policy is a strict policy that restricts memory
allocation to the nodes specified in nodemask. There won't be
allocations on other nodes.
MPOL_INTERLEAVE interleaves allocations across the nodes specified in
nodemask. This optimizes for bandwidth instead of latency. To be
effective, the memory area should be fairly large, at least 1 MB.
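
The following is a minimal sketch of applying MPOL_INTERLEAVE to a fresh anonymous mapping before any page is touched. It assumes a machine with nodes 0 and 1. It invokes the system call through syscall(2) with SYS_mbind and takes the constants from <linux/mempolicy.h> so that it builds without libnuma; that is a choice made for this example, not something this page prescribes. The function name map_interleaved() is hypothetical.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <errno.h>
#include <limits.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <linux/mempolicy.h>   /* MPOL_INTERLEAVE */

/*
 * Map `len` bytes of anonymous memory and request interleaving across
 * nodes 0 and 1 (assumed to exist) before any page is touched.
 * Returns the mapping, or NULL with errno set.
 */
static void *map_interleaved(size_t len)
{
    unsigned long nodemask = (1UL << 0) | (1UL << 1);
    unsigned long maxnode = sizeof(nodemask) * CHAR_BIT;
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    if (p == MAP_FAILED)
        return NULL;

    /* Same arguments, in the same order, as mbind() in the synopsis. */
    if (syscall(SYS_mbind, p, (unsigned long)len, (long)MPOL_INTERLEAVE,
                &nodemask, maxnode, 0UL) == -1) {
        int saved = errno;      /* keep the mbind error across munmap */
        munmap(p, len);
        errno = saved;
        return NULL;
    }
    return p;
}
```

Setting the policy before the first access matters here because, as noted earlier, pages that have already been touched are not affected.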
MPOL_PREFERRED sets the preferred node for allocation. The kernel
will try to allocate on this node first and fall back to other nodes
if the preferred node is low on free memory. Only the first node in
the nodemask is used. If no node is set in the mask, then the memory
is allocated on the node of the CPU that triggered the allocation.
If MPOL_MF_STRICT is passed in flags and policy is not MPOL_DEFAULT,
then the call will fail with the error EIO if the existing pages in
the mapping don't follow the policy. In 2.6.16 or later the kernel
will also try to move pages to the requested node with this flag.
If MPOL_MF_MOVE is passed in flags, then an attempt will be made to
move all the pages in the mapping so that they follow the policy.
Pages that are shared with other processes are not moved. If
MPOL_MF_STRICT is also specified, then the call will fail with the
error EIO if some pages could not be moved.
If MPOL_MF_MOVE_ALL is passed in flags, then all pages in the mapping
will be moved regardless of whether other processes use the pages.
The calling process must be privileged (CAP_SYS_NICE) to use this
flag. If MPOL_MF_STRICT is also specified, then the call will fail
with the error EIO if some pages could not be moved.
On success, mbind() returns 0; on error, -1 is returned and errno is
set to indicate the error.
EFAULT
There was an unmapped hole in the specified memory range or a passed
pointer was not valid.
EINVAL
An invalid value was specified for flags or policy; or start + len
was less than start; or policy was MPOL_DEFAULT and nodemask pointed
to a non-empty set; or policy was MPOL_BIND or MPOL_INTERLEAVE and
nodemask pointed to an empty set.
ENOMEM
System out of memory.
EIO
MPOL_MF_STRICT was specified and an existing page was already on a
node that does not follow the policy.
NUMA policy is not supported on file mappings.
MPOL_MF_STRICT is ignored on huge page mappings right now.
It is unfortunate that the same policy value, MPOL_DEFAULT, has
different effects for mbind(2) and set_mempolicy(2). To select
"allocation on the node of the CPU that triggered the allocation"
(like set_mempolicy(2) MPOL_DEFAULT) when calling mbind(), specify a
policy of MPOL_PREFERRED with an empty nodemask.
The mbind(), get_mempolicy(2), and set_mempolicy(2) system calls were
added to the Linux kernel with version 2.6.7. They are only available
on kernels compiled with CONFIG_NUMA.
Support for huge page policy was added with 2.6.16. For interleave policy to be effective on huge page mappings the policied memory needs to be tens of megabytes or larger.
MPOL_MF_MOVE and MPOL_MF_MOVE_ALL are only available on Linux 2.6.16
and later.
These system calls should not be used directly. Instead, the
higher-level interface provided by the numa(3) functions in the
numactl package is recommended. The numactl package is available at
ftp://ftp.suse.com/pub/people/ak/numa/.
You can link with -lnuma to get system call definitions. libnuma is
available in the numactl package. This package also has the numaif.h
header.
numa(3), numactl(8), set_mempolicy(2), get_mempolicy(2), mmap(2)