Byte Introduction

Understand the details of one of the most common errors you may encounter when using the process heap: out of memory (OOM).

Skills:

OS Concepts

Objective

Understand one of the most common errors that you may encounter when using the Process Heap to allocate memory - OOM.

Background/Recap:

Process stack and heap are foundational concepts in computer science. Whenever a process runs, its memory is organized into a bunch of segments with heap and stack being two of them.

image alt text

Layout of a process in memory

OOM errors and segfaults are two of the most common errors that developers run into when dealing with the heap. Understanding when and why they happen will save you a lot of debugging. In this module, you will get into the details of OOM.


OOM

Out of memory (OOM) is an undesired state of computer operation where no additional memory can be allocated for use, either by programs or the operating system. Usually, this state results in incorrect functioning of the program.


Segfault

C/C++ Programmer’s favourite error :)

A segmentation fault or access violation is a fault, or failure condition, raised by hardware with memory protection. It notifies an operating system (OS) that the software has attempted to access a restricted area of memory (a memory access violation). Usually, this causes abnormal termination of the process (a program crash) or a core dump.

Primary goals

  1. Understand what causes OOM and what doesn’t cause OOM

  2. Understand OOM in 32-bit systems

Getting Started

  • You need access to a Linux machine with sudo access.

  • You need the g++ compiler to build and run simple C++ programs.


 g++ SampleProgram.cc -o SampleProgram

./SampleProgram

  • Understand how to run a process in the background and get its process id.

image alt text

When you run a program with ‘&’ in the end, it runs as a background job and prints the process id. In the above case, 226285 is the process id.

  • You may have to periodically kill these processes you put in the background. Otherwise your system may become slow. If you run ps in the same terminal, you will be able to see the list of all processes. You can then kill the process either using pkill or kill commands.

image alt text

image alt text

Out of Memory

When do you think you will get an OOM error for a user process:

  • Recursively call functions infinitely?

  • Keep allocating memory in heap without freeing?

Let’s try them out!

Infinite Function Recursion


// File: OOMRecursiveFunctionCall.cc


#include <iostream>


using namespace std;


void infinite_function(int i) {

  cout << "Infinite Function getting called: " << i << endl; 

  infinite_function(++i);

}


int main() {

  cout << endl << "Let's Learn by Doing!" << endl;


  infinite_function(1);

}

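Did you get a segfault instead of OOM? Each call to infinite_function adds a frame to the stack, not the heap, so what runs out here is the stack, which has its own size limit. Feel free to research further on why exceeding that limit shows up as a segfault rather than an OOM error.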

image alt text

Heap allocation without free

Try the program to see what happens - would you expect a segmentation fault or OOM?


// File: HeapFunOOM-AllocationWithoutFree.cc


#include <iostream>


using namespace std;


int main() {

  cout << endl << "Let's Learn by Doing!" << endl;


  int *ptr_heap = NULL;  // sizeof(int) = 4

  // Allocate roughly 4GB of integers.

  int count_to_allocate = 1000000000;

  unsigned long total_allocation = 0;

  

  char user_choice = 'Y';  // Initialized so the loop condition never reads an indeterminate value.


  while (user_choice != 'N') {

    // TODO(1): Allocate without handling allocation failure.

    ptr_heap = new int[count_to_allocate];

    total_allocation += count_to_allocate * sizeof(int);


    // TODO(2): Uncomment code to allocate by handling failure.

    // You might have to comment above code.

    // try {

    //   ptr_heap = new int[count_to_allocate];

    //   total_allocation += count_to_allocate * sizeof(int);

    // } catch (const bad_alloc& e) {      

    //   cout << "Allocation failed: " << e.what() << endl;

    // }


    cout << "Allocated 4GB at " << ptr_heap

	 << "; Total allocation : " << total_allocation/1000000000 << " GB"

	 << "; Allocate more heap space? Y/N" << endl;


    // TODO: Uncomment to stop/observe heap growth of the program in procfs.

    // cin >> user_choice;

  }

}

You will see something like this

image alt text

Your program did hit an OOM condition: new threw std::bad_alloc, and since the exception was not handled, the C++ runtime called std::terminate and the process was aborted. You can ensure your program doesn’t crash in this case by handling std::bad_alloc with a try/catch clause.

Try that out by commenting/uncommenting appropriate code in the program!

Newfound Superpowers

  • Practical knowledge of the difference between OOM and Segmentation fault

  • How to (or not to?) cause them :)

Now you can

  • Show that OOM is a recoverable condition. You can keep retrying heap allocation if you know some other thread in your process may free memory.

  • Also show that, in contrast, infinite recursive function calls will crash your process with a segfault from which recovery is not possible.

Curious Cats

  • Can you make your program oscillate between OOM and proper running state? Maintain the allocations in an array of pointers int *ptr_heap[] and try freeing memory whenever your process is under memory pressure.

} catch (const bad_alloc& e) {      

  cout << "Allocation failed: " << e.what() << endl;

  // Add logic to free existing memory allocations.

  // Now does your program alternate between OOM and proper running state?

}

2 minute overview of Virtual Addresses

To quickly set context, the addresses you see in the output of cat /proc/<pid>/maps are all virtual addresses of the process. If you list this for multiple running processes in your system, all of them will show similar ranges. Virtual addresses are not the same as physical RAM addresses.

image alt text

When you list this information, you will see that multiple processes may have overlapping address ranges. In other words, one process writing to 7f3ae0000000-7f3ae002e000 normally has no impact on a different process using the same address range, because each process's virtual addresses are mapped to physical memory separately.

If you are not completely clear yet, spend a bit more time reading about virtual memory before coming back to this task.

OOM in 32-bit systems

On 64-bit systems, heap allocations are unlikely to hit any address limit because the virtual address range is huge. You will usually run out of system resources before running out of virtual addresses. On 32-bit systems, this is not true.

You can use the -m32 flag to compile the same program in 32-bit mode. Try running the program then, watch how your heap size grows, and find out when it throws an OOM error. You may find that your process will never be able to allocate 4GB!

Try changing the code to allocate 400MB at a time and see what happens.

image alt text

  • Are you able to see that it fails somewhere around 3 GB or a little more?

  • Did you notice the virtual addresses printed? Do you find any difference in the number of digits in the address?

  • What’s the maximum address that can be represented with 32-bits? Do you see the relationship when your process hits OOM vs not?

Tip

On 32-bit machines, addresses are only 32 bits wide

Each hex digit (e.g. 0xf = 1111) -> 4 bits

So 8 digits are needed to represent a 32-bit address, e.g. 0x26a7c010

For 64-bit systems, addresses are 64 bits wide

0x7fdc81399010 -> Check whether it has 16 digits. Why or why not?