Written Assignment 3

Question 1

Caller-saved registers (%rax, %rdi, %rsi, %rdx, %rcx, %r8, %r9, %r10, and %r11) may be freely overwritten by the callee, which has no responsibility to restore them to their original state before returning. If the caller still needs a value held in one of them after the call, the caller must save it (for example, on the stack) before the call. Caller-saved registers hold the return value from the callee (%rax), the first six arguments passed to the callee (%rdi, %rsi, %rdx, %rcx, %r8, %r9, in that order), or temporary values (%r10, %r11).

Callee-saved registers (%rbx, %r12, %r13, %r14, %r15, and %rbp) are generally used for values that must survive across function calls. If a callee wants to use any of them, it must first save the old value to the stack and restore it before returning, so that the caller sees the original value. Historically, %rbp was used as the frame pointer: it marks the base of the current stack frame, so the frame can be de-allocated by restoring %rsp from %rbp before the function returns.
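The distinction above can be seen from the C side with a small sketch. The function names square and add_squares are hypothetical, and the register choices described in the comments are what a compiler would typically do on x86-64, not something the C language guarantees. The noinline attribute is a GCC/Clang extension, used here only so that the calls are not optimized away.

```c
#include <assert.h>

/* Hypothetical helper. noinline (a GCC/Clang extension) keeps this a
   real call so the calling convention actually applies. */
__attribute__((noinline)) long square(long v) {
  /* v arrives in %rdi; the result is returned in %rax. */
  return v * v;
}

long add_squares(long x, long y) {
  /* y arrives in %rsi, which is caller-saved, so square() may
     overwrite it. The compiler must keep y somewhere safe across
     the call, typically in a callee-saved register or on the stack. */
  long first = square(x);

  /* first was returned in %rax, which the next call will overwrite,
     so it also has to be preserved across the second call. If a
     callee-saved register such as %rbx is used for this, add_squares
     must itself save and restore the old %rbx for its own caller. */
  long second = square(y);

  return first + second;
}
```

Compiling with something like gcc -O1 -S and reading the generated assembly should show the push and pop of whichever callee-saved register the compiler picked.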

Question 2

struct a {
  char c[3];
  size_t y;
  int z;
};

On x86-64, char c[3] takes up 3 * 1 = 3 bytes and must be aligned to 1 byte. size_t takes up 8 bytes and must be aligned to 8 bytes. int takes up 4 bytes and must be aligned to 4 bytes.

Members are laid out in memory in their declared order. char c[3] takes up the first 3 bytes (offsets 0 to 2). size_t y must start at a multiple of 8, so 5 bytes of padding are inserted before it, and y takes up 8 bytes (offsets 8 to 15). int z follows size_t y immediately without padding, since offset 16 is already a multiple of 4, and takes up 4 bytes (offsets 16 to 19). Since struct a must be aligned to max(1, 8, 4) = 8 bytes, its size must be a multiple of 8, so 4 bytes of padding are added at the end. A total of 3 + 5 + 8 + 4 + 4 = 24 bytes will be allocated for this struct.
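The layout above can be checked with offsetof and sizeof. This is a minimal sketch, assuming a typical x86-64 LP64 platform where size_t is 8 bytes and int is 4 bytes; the helper name print_layout_a is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Original member order from the question. */
struct a {
  char c[3];
  size_t y;
  int z;
};

/* Hypothetical helper: prints each member's offset, then the total
   size and alignment of the struct. */
void print_layout_a(void) {
  printf("c at offset %zu\n", offsetof(struct a, c));
  printf("y at offset %zu\n", offsetof(struct a, y));
  printf("z at offset %zu\n", offsetof(struct a, z));
  printf("size %zu, alignment %zu\n",
         sizeof(struct a), (size_t)_Alignof(struct a));
}
```

Under the stated assumptions, calling print_layout_a() from main should report offsets 0, 8, and 16, a size of 24, and an alignment of 8.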

To save space, we can reorder the members in increasing order of alignment:

struct a {
  char c[3];
  int z;
  size_t y;
};

char c[3] takes up 3 bytes (offsets 0 to 2). int z must start at a multiple of 4, so there is 1 byte of padding before it, and z takes up 4 bytes (offsets 4 to 7). size_t y follows int z immediately without padding, since offset 8 is already a multiple of 8, and takes up 8 bytes (offsets 8 to 15). struct a must still be aligned to max(1, 4, 8) = 8 bytes, and the struct is already 3 + 1 + 4 + 8 = 16 bytes, which is a multiple of 8, so there is no need to add any bytes at the end for struct alignment.

A total of 24 - 16 = 8 bytes are saved.
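The comparison can be sketched in C by declaring both orderings side by side. The names a_original, a_reordered, and bytes_saved are hypothetical, introduced only so both versions can live in one file; the sizes in the comments assume a typical x86-64 LP64 platform.

```c
#include <assert.h>
#include <stddef.h>

/* Declared order from the question: 3 + 5 (pad) + 8 + 4 + 4 (pad). */
struct a_original {
  char c[3];
  size_t y;
  int z;
};

/* Members sorted by increasing alignment: 3 + 1 (pad) + 4 + 8. */
struct a_reordered {
  char c[3];
  int z;
  size_t y;
};

/* Hypothetical helper: number of bytes saved per struct instance. */
size_t bytes_saved(void) {
  return sizeof(struct a_original) - sizeof(struct a_reordered);
}
```

The saving applies to every instance, so an array of 1000 such structs would shrink by 8000 bytes under the same assumptions.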

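Padding exists to guarantee alignment. Without it, members would be placed back to back, and a member such as size_t y could start at an address that is not a multiple of its alignment. A misaligned value can span two cache lines, which makes reads and writes less efficient. With padding, every member is aligned and memory access can be more efficient.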
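To see what the layout looks like with no padding at all, the struct can be packed. This is only a sketch: __attribute__((packed)) is a GCC/Clang extension rather than standard C, the name a_packed is hypothetical, and the numbers assume x86-64 LP64.

```c
#include <assert.h>
#include <stddef.h>

/* Same members as the original struct, but the compiler is told to
   insert no padding (GCC/Clang extension, not standard C). */
struct __attribute__((packed)) a_packed {
  char c[3];
  size_t y; /* starts at offset 3, which is not a multiple of 8 */
  int z;    /* starts at offset 11, which is not a multiple of 4 */
};

/* Hypothetical helper: size of the packed struct, 3 + 8 + 4 bytes. */
size_t packed_size(void) {
  return sizeof(struct a_packed);
}
```

The packed version takes 15 bytes instead of 24, but y and z are now misaligned, which is the trade-off described above.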
