Byte Introduction

Understand process stack and heap

Skills:

OS Concepts

Objective

Understand process stack and heap by using procfs.

Background

The process stack and heap are foundational concepts in computer science. Whenever a process runs, its memory is organized into several segments, and the heap and the stack are two of them.

Figure: Layout of a process in memory

Heap

The heap area commonly begins at the end of the .bss and .data segments and grows to larger addresses from there. The heap area is managed by malloc, calloc, realloc, and free, which may use the brk and sbrk system calls to adjust its size (note that the use of brk/sbrk and a single "heap area" is not required to fulfill the contract of malloc/calloc/realloc/free; they may also be implemented using mmap/munmap to reserve/unreserve potentially non-contiguous regions of virtual memory into the process' virtual address space). The heap area is shared by all threads, shared libraries, and dynamically loaded modules in a process.

Stack

The stack area contains the program stack, a LIFO structure, typically located in the higher parts of memory. A "stack pointer" register tracks the top of the stack; it is adjusted each time a value is "pushed" onto the stack. The set of values pushed for one function call is termed a "stack frame". A stack frame consists at minimum of a return address. Automatic variables are also allocated on the stack.

The stack area traditionally adjoined the heap area and they grew towards each other; when the stack pointer met the heap pointer, free memory was exhausted. With large address spaces and virtual memory techniques they tend to be placed more freely, but they still typically grow in a converging direction. On the standard PC x86 architecture the stack grows toward address zero, meaning that more recent items, deeper in the call chain, are at numerically lower addresses and closer to the heap.

Source: wikipedia.org

Primary goals

  1. Get comfortable looking at memory segments from /proc/[pid]/maps.

  2. Understand how process stack and heap grow.

Getting Started

  • You need access to a Linux machine with sudo access.

  • Have the g++ compiler installed so that you can build and run simple C++ programs.


g++ SampleProgram.cc -o SampleProgram

./SampleProgram

  • Understand how to run a background process and get its process id.

image alt text

When you run a program with '&' at the end, it runs as a background job and the shell prints its process id. In the above case, 226285 is the process id.

  • You will have to kill the processes you put in the background from time to time. Otherwise they keep running and consuming resources. If you run ps in the same terminal, you will see the list of processes started from that terminal. You can then kill a process using either the kill or the pkill command.

What is procfs?

Procfs is a special filesystem in Linux that presents information about processes and other system information in a hierarchical file-like structure, providing a more convenient and standardized method for dynamically accessing process data held in the kernel. It is mounted at /proc on your Linux machine.

You can look at the reference material for more information. To start with, get comfortable with the following command, which prints the memory layout of the process that reads the file. Here, self refers to the cat process itself.


cat /proc/self/maps

Similarly you can get the memory layout of any process by using


cat /proc/<pid>/maps

Get started with code

Start with a simple C++ program to understand how a process's memory is typically laid out. Run the following program in the background and note its process id.


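// File: StackAndHeap.cc
#include <chrono>
#include <iostream>
#include <thread>

using namespace std;

int main() {
  cout << endl << "Let's Learn by Doing!" << endl;

  // A local (automatic) variable lives on the stack.
  int stack_variable = 0;
  cout << "Address of stack variable: " << &stack_variable << endl;

  // Memory obtained with new lives on the heap.
  int *ptr_heap = new int;
  cout << "Address of heap: " << ptr_heap << endl;

  // Keep the process alive so that you can examine it through procfs.
  // Sleeping avoids burning CPU in an empty infinite loop.
  while (true) {
    this_thread::sleep_for(chrono::seconds(1));
  }
}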

You should see output similar to this.

image alt text

Memory layout of the process

Get the memory layout of the process from procfs and understand different sections of it.


cat /proc/<pid>/maps

Stack allocations

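Modify the given program to allocate multiple stack variables.


int a;

char b = 'C';

float c[100];

Print their addresses and check whether they fall within the address range of the [stack] section of your process. For the char variable, cast the address to void* before printing (for example static_cast<void*>(&b)); otherwise cout treats a char* as a C string and prints characters instead of an address.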

In the following example, the printed address 0x7ffd734b4cdc lies within the stack address range 7ffd73495000-7ffd734b6000 reported by /proc/<pid>/maps.

image alt text

image alt text

Heap allocations

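Modify the given program to allocate memory on the heap.


int *ptr_heap1 = new int;

char *ptr_heap2 = new char[500];

Print the addresses the pointers point to (not the addresses of the pointer variables themselves), and you will see that they fall within the [heap] section. As with the stack example, cast the char pointer to void* before printing it so that cout prints an address rather than a string.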

Curious Cats

  • What if you declare local variables in a different function? Where do they end up: on the stack or on the heap?

  • Which segment do your global variables go into?

  • Do you notice any patterns when you make consecutive allocations on the stack or on the heap?

  • Your heap segment's range grows as you allocate more and more memory on the heap. Can you find out the minimum size by which the range can increase? Tip: Page size.

In which direction are your stack and heap growing?

Does this question seem easy now? Given what you now know about memory allocations, you should be able to tell in which direction the process stack and heap are growing.

Tip

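Just make multiple allocations on the stack and on the heap and see whether the addresses of later allocations are increasing or decreasing. For the stack, compare local variables across nested function calls rather than within a single function, because the compiler is free to reorder variables inside one stack frame. There you go!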

Newfound Superpowers

  • Practical knowledge of process stack and heap

  • Ability to peek into process internals using the proc filesystem

Now you can

  • Demonstrate, using real addresses, when and how a process uses the stack and the heap: malloc/new versus local variables.

  • Look into memory addresses to figure out where they reside - stack, heap or any other segment of the process memory.