IPS

IPS: Inter-process Communication.

POSIX: a standard to implement. System that implement POSIX is called unix-like system.

GNU: a project launched by Richard Stallman to encourage free software. It gave birth to a bunch of tools while trying to create an OS

Unix: closed source operating system that implements POSIX.

BSD: operating system from unix that is less closed source

Linux: it just a kernel developed by GNU, not a full operating system. But it can be combined with other GNU tools to form an operating system.

runtimes: a library that implements a specific programming language

This Unix genealogy diagram clearly shows the history of Unix, BSD, GNU and Linux (from klzzwxh:0001)

This Unix genealogy diagram clearly shows the history of Unix, BSD, GNU and Linux (from Wikimedia)

Memory Mapped File (mmap)

The intention for memory mapped file is to read a large file into memory quickly. User can only use some of the virtual memory to map to a portion of the file.

If several processes use the same file mapping to create mapped regions of a file, each process' views contain identical copies of the file on disk. POSIX and Windows mapped file management is very similar, and highly portable.

Shared Memory (shmget)

Shared memory shmget is an old way to achieve memory sharing between processes. It is harder to use but less restrictive than mmap.

Historically, there were two competing unix system: System V and BSD. sysV's IPC includes: shared memory, semaphores, and message queues. shmget syscall is the old sysV shared memory model (but today most unix system still have that and it is likely that it will never go away). On the other hand mmap and shm_open is the new POSIX way to do shared memory and is easier to use.

shmget requires ftok to remove while shared memory created by mmap can be removed using rm.

(Old) POSIX Shared Memory

POSIX defines a shared memory object as "An object that represents memory that can be mapped concurrently into the address space of more than one process."

In Windows, shared memory is implemented as a special case (non-persistent version) of memory-mapped file, where the file mapping object accesses memory backed by the system paging file.

The only viable alternative is to use memory mapped files in Windows simulating shared memory, but avoiding file-memory synchronization as much as possible. This method is prone to program crash.

In POSIX, shared memory can be permanent since it must have lifetime of kernel or a filesystem in its standard. It can be implemented as mapped files of a permanent file system. The shared memory destruction happens with an explicit call to unlink()

Many other named synchronization primitives (like named mutexes or semaphores) suffer the same lifetime portability problem.

New Memory Mapping (mmap)

Here is a man page for POSIX memory mapping.

If the OS supports /dev/shm/, then shm_open is equivalent to opening a file in /dev/shm/. It is a temporary file storage filesystem. /dev/shm/ is not persistent and will be cleared after umount. The size of /dev/shm is limited by excess RAM on the system.

Here is a video introduction on how mmap is implemented and can be used.

Here is how to make shared memory persistent in python

/dev/shm

Many Unix distributions enable and use tmpfs by default for the /tmp branch of the file system or for shared memory. This can be observed with df. Other Linux distributions (e.g. Debian) do not have a tmpfs mounted on /tmp by default; in this case, files under /tmp will be stored in the same file system as / and can grow as required.

/dev/shm support is completely optional within the kernel config file.

Therefore you should not rely on /dev/shm being present.

┌───────────┬──────────────┬────────────────┐
│ /dev/shm  │ always tmpfs │ Linux specific │
├───────────┼──────────────┼────────────────┤
│ /tmp      │ can be tmpfs │ FHS 1.0        │
├───────────┼──────────────┼────────────────┤
│ /var/tmp  │ never tmpfs  │ FHS 1.0        │
└───────────┴──────────────┴────────────────┘

According to stackoverflow: tmpfs performance is deceptive. You will find workloads that are faster on tmpfs, and this is not because RAM is faster than disk: All filesystems are cached in RAM – the page cache! Rather, it is a sign that the workload is doing something that defeats the page cache. And of the worse things a process can do in this regard is syncing to disk way more often than necessary.

Correct Way to do Memory Sharing

Use of mmap:

  1. File Memory Mapping, there are two types of memory mapping. File Memory Mapping and Anonymous Mapping. Use mmap system call to map a file into memory. Usually use open() a file and pass the file descriptor into mmap() to map.
  2. Shared memory: mainly tells you that shared memory is the fastest way to communicate between processes, and then:

  3. Posix interface, POSIX interface for shared memory, usage: shm_open(), mmap()

  4. System V interface, using methods shmget(), shmmat()...

So mmap() can be used for both File Memory Mapping and Shared Memory. Whether you open with open() or shm_open() is the difference.

Stackoverflow: If you open() and mmap() a regular file, data will end up in that file. If you just need to share a memory region, without the need to persist the data, which incurs extra I/O overhead, use shm_open().

Such a memory region would also allow you to store other kinds of objects such as mutexes or semaphores, which you can't store in a mmap()'ed regular file on most systems.

But if you are using Linux, open("/dev/shm/myshm", "w") is about the same as shm_open(). shm_open() takes care to locate /dev/shm if it's mounted at a non-standard location.

Note that POSIX Standard Says: The interpretation of slash characters other than the leading slash character in name is implementation-defined.

Therefore there should not be a directory under /dev/shm.

Table of Content