Memory leak when using OpenMP

The test case below runs out of memory on 32-bit machines (throwing std::bad_alloc) in the loop following the "post MT section" message when OpenMP is used. If the OpenMP #pragmas are commented out, the code runs to completion. It appears that memory allocated in parallel threads is not freed correctly, so the later loop runs out of memory.

Is there something wrong with the memory allocation and deletion code below, or is this a bug in gcc v4.2.2 or OpenMP? I also tried gcc v4.3 and got the same failure.

#include <iostream>
#include <new>
#include <vector>

int main(int argc, char** argv)
{
    std::cout << "start " << std::endl;

    {
            std::vector<std::vector<int*> > nts(100);
            #pragma omp parallel
            {
                    #pragma omp for
                    for(int begin = 0; begin < int(nts.size()); ++begin) {
                            for(int i = 0; i < 1000000; ++i) {
                                    nts[begin].push_back(new int(5));
                            }
                    }
            }

            std::cout << "  pre delete " << std::endl;
            for(int begin = 0; begin < int(nts.size()); ++begin) {
                    for(int j = 0; j < nts[begin].size(); ++j) {
                            delete nts[begin][j];
                    }
            }
    }
    std::cout << "post MT section" << std::endl;
    {
            std::vector<std::vector<int*> > nts(100);
            int begin, i;
            try {
              for(begin = 0; begin < int(nts.size()); ++begin) {
                    for(i = 0; i < 2000000; ++i) {
                            nts[begin].push_back(new int(5));
                    }
              }
            } catch (std::bad_alloc &e) {
                    std::cout << e.what() << std::endl;
                    std::cout << "begin: " << begin << " i: " << i << std::endl;
                    throw;
            }
            std::cout << "pre delete 1" << std::endl;

            for(int begin = 0; begin < int(nts.size()); ++begin) {
                    for(int j = 0; j < nts[begin].size(); ++j) {
                            delete nts[begin][j];
                    }
            }
    }

    std::cout << "end of prog" << std::endl;

    char c;
    std::cin >> c;

    return 0;
}

Answers


Changing the first OpenMP loop from 1000000 to 2000000 iterations causes the same error. This indicates that the out-of-memory problem is related to the OpenMP stack limit.

Try setting the stack limit to unlimited in bash with

ulimit -s unlimited

You can also set the OpenMP environment variable OMP_STACKSIZE to 100 MB or more.

UPDATE 1: I changed the first loop to

{
    std::vector<std::vector<int*> > nts(100);
    #pragma omp for schedule(static) ordered
    for(int begin = 0; begin < int(nts.size()); ++begin) {
        for(int i = 0; i < 2000000; ++i) {
            nts[begin].push_back(new int(5));
        }
    }

    std::cout << "  pre delete " << std::endl;
    for(int begin = 0; begin < int(nts.size()); ++begin) {
        for(int j = 0; j < nts[begin].size(); ++j) {
            delete nts[begin][j];
        }
    }
}

Then I get a memory error at i=1574803 on the main thread.

UPDATE 2: If you are using the Intel compiler, you can add the following to the top of your code, and it will solve the problem (provided you have enough memory for the extra overhead).

std::cout << "Previous stack size " << kmp_get_stacksize_s() << std::endl;
kmp_set_stacksize_s(1000000000);
std::cout << "Now stack size " << kmp_get_stacksize_s() << std::endl;

UPDATE 3: For completeness, as another member mentioned, if you are performing numerical computation, it is best to preallocate everything in a single new float[1000000] instead of using OpenMP to do 1000000 separate allocations. This applies to allocating objects as well.


I found this issue reported elsewhere without OpenMP, using just pthreads. The extra memory consumption when multi-threaded appears to be typical behavior for the standard memory allocator. Switching to the Hoard allocator makes the extra memory consumption go away.

