Five Practical Memory Principles for Coding Programs

Your programs are more reliable and often faster when you code with careful memory techniques. Those advantages come at a low cost, too: the ideas behind the five tips presented below are simple enough to understand on a first reading. Start to practice them today, and you'll soon see pay-offs in the applications or libraries you write.

It feels strange to explain "memory hygiene" in 2008, for many of the most important aspects of the subject were already documented thirty years ago, when decks of punched cards were first yielding to mini- and micro-computers. I still see many of the same mistakes made as were common in 1980.

These stubborn errors are particularly unfortunate, and not only because we understand memory techniques better now and the pertinent tools are more readily available: the value of good coding is arguably higher today, too. Are you responsible for a game server that needs to be 99.99% available for months on end, or a factory automation system where unscheduled down-time costs $3500/hour? These are typical situations for the applications I see--and hazardous ones if your program accidentally leaks memory.

It's Easy for Leaks to Form

Let's look first at a concrete example of one of the fifteen or so ways memory use can go wrong. The code which follows only outlines the most basic form of what can go wrong; it makes no provision for wchar_t, graphics contexts, or concurrency models. Even without those complexities, the example is a fair reflection of what occurs "in the wild":

Suppose you're responsible for a C-coded function that sanitizes user input, something like


    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    char *get_safe_request(FILE *stream)
    {
        char *ptr;
            /* Always remember to include space for the terminator. */
        char tmpbuffer[BUFFER_SIZE + 1];

            /* Don't worry about stream-related error-handling for now. */
        (void) fgets(tmpbuffer, BUFFER_SIZE, stream);

            /* Handle any hazards found in the string, that is,
               character sequences (semi-colons, exclamation
               marks, ...) that might confuse up-stream processors.
               Assume this sanitization can only shorten the string. */
        if ...

            /* The caller owns the returned copy and must free() it. */
        if ((ptr = malloc(1 + strlen(tmpbuffer))) != NULL)
            return strcpy(ptr, tmpbuffer);
        else
            return NULL;
    }

This is correct code, just what you should write for the library. Perhaps you've carefully documented that any invoking segment is responsible for memory clean-up, that is, that calling code should look like


    ptr = get_safe_request(my_stream);
    scan(ptr);
    free(ptr);

Maybe you've even done tests, and verified in a couple of ways that execution appears clean: memory usage seems bounded, and no leaks turned up. Suppose, though, that when the code goes into production, it also activates an alternative channel that enables a monitor:


    while (1) {
        scan(get_safe_request(irc_channel));
    }

Perhaps the person who installed the monitor is accustomed to working in a garbage-collecting environment, and doesn't understand resource clean-up; maybe the code was intended as a quick hack retained without full consideration of the consequences. In any case, despite everyone on the team writing reasonable code, and even testing the application in several different ways, there's now a memory leak. The program as a whole performs malloc()s without corresponding free()s. As time goes on, the program fills up more and more of working memory. Eventually, filling memory with leaked data will effectively disable at least your program, and perhaps the whole operating system, as "thrashing" of swap space fully occupies the host. However trivial or obvious this example seems in isolation, it's typical of countless real-life incidents I encounter. Among all the memory faults a program can exhibit, recurring leaks seem to be the hardest to diagnose and cure.
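Here is a minimal sketch of how the monitor loop could hold on to each pointer long enough to release it. It is only an illustration under stated assumptions: the value of BUFFER_SIZE, the body of scan(), and the name monitor() are mine rather than part of the code above; get_safe_request() is reduced to a stand-in without sanitization; and the loop stops at end-of-input instead of running forever, so that it can be exercised in a test.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define BUFFER_SIZE 256   /* assumed value; the article leaves it undefined */

/* Stand-in for the library function: returns a malloc()ed copy of one
   input line, or NULL at end-of-input or on allocation failure. */
static char *get_safe_request(FILE *stream)
{
    char tmpbuffer[BUFFER_SIZE + 1];
    char *ptr;

    if (fgets(tmpbuffer, sizeof tmpbuffer, stream) == NULL)
        return NULL;
    ptr = malloc(strlen(tmpbuffer) + 1);
    return ptr ? strcpy(ptr, tmpbuffer) : NULL;
}

/* Hypothetical consumer; here it only measures the request. */
static size_t scan(const char *request)
{
    return strlen(request);
}

/* Monitor loop that owns each request for exactly one iteration.
   Returns the number of requests handled. */
static unsigned long monitor(FILE *channel)
{
    unsigned long handled = 0;
    char *request;

    while ((request = get_safe_request(channel)) != NULL) {
        (void) scan(request);
        free(request);        /* the call the leaky loop omits */
        handled++;
    }
    return handled;
}
```

The only real change from the leaky loop is that the result of get_safe_request() lands in a named variable instead of being passed straight to scan(), which is what makes the free() possible.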

What's the remedy? There are only three possibilities:

  • Don't leak memory
  • Make sure you're leaking so little memory as not to impact your application
  • Set up "insurance" for your program so that, when it does leak too much memory, the insurance restores the program to a healthy state

To understand how to put any of these into practice, it's important to be clear on a few of the fundamentals of memory use.

Memory's Role

This article is chiefly concerned with long-lasting applications. Think of these in terms of a service model--a Web server, for example. A Web server idles quietly until it receives a request for a page from a client. The server prepares the page, responds to the client, then returns to wait for the next request.

The crucial point for us is that, to maintain its own health, the server forgets everything specific to a particular request after it's done. While constructing the page, the Web server must have all sorts of things in memory: the URL of the request, any pertinent cookies, transient data for its server-side scripting engines, the time of day, any details the client has passed along, and so on. All these are in the memory of the Web server process, at least temporarily.

Once the request has been fulfilled, and any related reports logged, all that memory must be released, though, so the cycle can start again. If not--if, for example, the server keeps information about who requested a particular page, and at what time--the memory space of the Web server process will gradually fill with more and more data pieces. Eventually it'll fill physical memory or otherwise overflow system limits, and unhappiness results.
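One way to make "forget everything after each request" hard to get wrong is to gather all per-request allocations in one structure with a single release function. The sketch below is an assumption-laden illustration, not how any particular Web server does it: struct request_ctx, its fields, dup_string(), and handle_request() are all hypothetical names.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical per-request state: everything allocated while serving
   one request hangs off this structure. */
struct request_ctx {
    char *url;
    char *cookies;
    char *scratch;    /* transient working space */
};

/* Portable string duplicate; returns NULL on allocation failure. */
static char *dup_string(const char *s)
{
    size_t n = strlen(s) + 1;
    char *copy = malloc(n);

    return copy ? memcpy(copy, s, n) : NULL;
}

/* Release every allocation tied to the request; safe on a partly
   filled context because free(NULL) does nothing. */
static void request_ctx_release(struct request_ctx *ctx)
{
    free(ctx->url);
    free(ctx->cookies);
    free(ctx->scratch);
    ctx->url = ctx->cookies = ctx->scratch = NULL;
}

/* Returns 0 on success, -1 on allocation failure. Either way, no
   per-request memory survives the call. */
static int handle_request(const char *url, const char *cookies)
{
    struct request_ctx ctx = { NULL, NULL, NULL };
    int rc = -1;

    ctx.url = dup_string(url);
    ctx.cookies = dup_string(cookies);
    ctx.scratch = malloc(4096);
    if (ctx.url && ctx.cookies && ctx.scratch) {
        /* ... build and send the page here ... */
        rc = 0;
    }
    request_ctx_release(&ctx);   /* single exit path for all cleanup */
    return rc;
}
```

Because both the success and the failure path funnel through request_ctx_release(), adding a new per-request field means touching exactly one cleanup routine.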

How can you know what's going on in the memory of one of your programs? Current Linux distributions offer a rich array of tools for monitoring different aspects of memory use, including top or htop or gps, free, treeps, vmstat, and more. One quick way to begin study of memory use is with a command-line pipeline such as ps h -eo size,pid,user,ucmd | sort -nr; this prints a simple table that might look on your system like


    ...
    214712 31928 root     syslog-ng
    129672 13496 www-data apache2
    129528 13506 www-data apache2
    129204 13486 www-data apache2
    128328 13494 www-data apache2
    127352 13490 www-data apache2
     25788 30171 claird   sort
      8732 16350 claird   Xvnc
      5884 30477 nobody   openvpn
      2744 13477 root     apache2
      1768 16355 claird   xterm
    ...

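This table lists processes in descending order of memory use; reading across, the columns are memory size, process ID, owner, and command name. What matters for the purpose of this article is how the first column changes over time: a process whose memory usage grows without bound is leaking.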

Full understanding of memory usage involves subtle distinctions between address space, resident set size, shared memory spaces, and other operating-system concepts beyond the scope of this article. For our purpose, it's enough to know that a memory leak shows up in tables such as the one above.
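A process can also take a rough reading of itself. The following sketch is Linux-specific and makes a few assumptions worth stating: it relies on /proc/self/statm, whose second field is the resident set size in pages, and the function name resident_kib() is my own. Note that it reports resident memory, which is not the same quantity as the size figure ps prints.

```c
#include <stdio.h>
#include <unistd.h>

/* Returns this process's resident set size in KiB, or -1 when the
   figure cannot be read (for example, on a system without /proc). */
static long resident_kib(void)
{
    long total_pages, resident_pages;
    long page_bytes = sysconf(_SC_PAGESIZE);
    FILE *statm;

    if (page_bytes <= 0)
        return -1;
    statm = fopen("/proc/self/statm", "r");
    if (statm == NULL)
        return -1;
    /* First field: total program size; second: resident pages. */
    if (fscanf(statm, "%ld %ld", &total_pages, &resident_pages) != 2) {
        fclose(statm);
        return -1;
    }
    fclose(statm);
    return resident_pages * (page_bytes / 1024);
}
```

Logging this number every few minutes from inside a long-lived service gives the same kind of trend line as repeated ps runs, without needing an external observer.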

Don't Leak

Several related articles describe the most effective techniques for making sure your programs don't leak. My conclusion: for the best results in correct use of memory, it's important to combine both

  • "Manual" review
  • Automated testing

There are wonderful memory-checking tools and libraries available both as proprietary and open-source products. I use several.

It's unrealistic to expect tools to solve memory problems on their own, though; you'll be far better off when you develop a few "bench-checking" techniques of your own:

  • Document the resource consequences of each function or procedure definition;
  • Have peers review your source code, which can be a great way to spot memory problems; and
  • Practice "deep reading" of code you've already written, to understand what your program truly does.
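As an illustration of the first item, here is what documenting resource consequences might look like for a small helper. The function join_path() is hypothetical, invented only to carry the comment; the point is the "Resources" paragraph, which tells a reviewer who owns the result without reading the body.

```c
#include <stdlib.h>
#include <string.h>

/*
 * join_path: build "dir/name" in freshly allocated storage.
 *
 * Resources: returns a malloc()ed string that the CALLER owns and
 * must free(). Returns NULL, with nothing to free, if allocation
 * fails. Neither argument is retained or modified.
 */
static char *join_path(const char *dir, const char *name)
{
    size_t dir_len = strlen(dir);
    size_t name_len = strlen(name);
    char *path = malloc(dir_len + 1 + name_len + 1);

    if (path == NULL)
        return NULL;
    memcpy(path, dir, dir_len);
    path[dir_len] = '/';
    /* Copy name_len + 1 bytes so the terminator comes along. */
    memcpy(path + dir_len + 1, name, name_len + 1);
    return path;
}
```

With a comment like this in place, a call whose result is never stored in a variable stands out in peer review as a probable leak.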

These general approaches apply across a broad range of development environments. At the same time, specific languages and libraries have a lot to offer. It's hard to write correct C source code free of any memory defects; programming in such higher-level languages (HLLs) as Python, Erlang, or Scheme eliminates whole categories of memory faults. I strongly advocate use of HLLs in many situations, just for the benefits in reliability I see.

However, even the comparison between C and HLLs in regard to memory usage is more complex than on first appearance. Good garbage-collecting libraries are now available for C, so that it, too, has the potential to be a memory-managing language. At the same time, moving to an HLL does not solve all memory leaks, despite the claims of naive enthusiasts. Most HLLs are subject to cyclic references, orphaned resource handles, or other miscodings that result in memory leaks.

Leak Slowly

Even with all these precautions--thoughtful use of a high-level language, well-reviewed source code, and regular memory checking--leak-free programming is occasionally an unattainable goal. You might be dependent, for instance, on a proprietary third-party library that turns out to leak. What then?

Make measurements, and calculate the impact. If leakage is only 23 bytes per invocation, as I observed in one situation, and the function is called no more than five times per second, then total losses per day are under ten megabytes (23 bytes per invocation times 5 invocations per second times 86400 seconds per day comes to 9,936,000 bytes). In some situations--with limited hardware, or for a truly long-lived process--that sort of loss would be disastrous. In others, it's inconsequential.
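The arithmetic is simple enough to keep in a pair of helper functions, so that you can rerun it whenever a measurement changes. This is only a sketch; the function names and the idea of a fixed memory "budget" are my own framing.

```c
#define SECONDS_PER_DAY 86400.0

/* Bytes lost per day for a leak of bytes_per_call on each call, at a
   sustained rate of calls_per_second. */
static double leak_bytes_per_day(double bytes_per_call,
                                 double calls_per_second)
{
    return bytes_per_call * calls_per_second * SECONDS_PER_DAY;
}

/* Days until a steady leak of bytes_per_day consumes budget_bytes. */
static double days_until_exhausted(double budget_bytes,
                                   double bytes_per_day)
{
    return budget_bytes / bytes_per_day;
}
```

For the figures above, leak_bytes_per_day(23, 5) comes to 9,936,000 bytes; at that rate a budget of 1 GiB (1,073,741,824 bytes) would last roughly 108 days, which may or may not be longer than your process needs to live.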

How do you make such measurements for yourself? Commands such as ps measure a process "from the outside", as already illustrated above. For even more precision, special-purpose libraries that replace malloc() illuminate exactly how a particular program uses the memory heap. Tools including valgrind and Electric Fence work by substituting their own instrumented allocation routines for the standard ones; you can operate at a lower level by linking in a debugging library such as dmalloc.
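To give a feel for what a malloc() replacement records, here is a deliberately tiny sketch of the bookkeeping. It is a toy of my own, not the interface of dmalloc or any other real library: the names counted_malloc(), counted_free(), and counted_outstanding() are hypothetical, it counts blocks rather than bytes, and it is not thread-safe.

```c
#include <stdlib.h>

static long live_blocks;   /* allocations made but not yet freed */

/* Allocate like malloc(), counting each successful allocation. */
static void *counted_malloc(size_t size)
{
    void *block = malloc(size);

    if (block != NULL)
        live_blocks++;
    return block;
}

/* Free like free(), counting each non-NULL release. */
static void counted_free(void *block)
{
    if (block != NULL) {
        live_blocks--;
        free(block);
    }
}

/* Number of blocks currently outstanding; a value that keeps
   climbing between requests suggests a leak. */
static long counted_outstanding(void)
{
    return live_blocks;
}
```

Real debugging allocators go much further, typically recording where each block was requested so that a leak can be traced to a source line, but the principle of pairing every allocation with a release is the same.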

Arm Yourself Against Leakage

In an extreme case, discretion beats valor, and it's acceptable to let a process leak, but be prepared to replace it if leakage grows too large. Twice in my own programming, I've encountered complicated special-purpose legacy Web servers that appeared to leak memory mysteriously. Rather than fully analyze their behavior, I created fast restart sequences and trustworthy memory introspection for them. It was easier to add new code that tracked when memory usage exceeded a configurable threshold, then shut down the process and started a new one. A related expedient, still in production at one of my sites five years after its installation as a temporary fix, is to have cron restart a process every hour. Users perceive availability of over 99% (under half a minute of restart time hourly), and memory usage remains bounded.
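The threshold-and-restart idea can be sketched with nothing more than fork() and waitpid(). What follows is an illustration under assumptions of my own, not the code from those legacy servers: the exit code EXIT_RECYCLE, the function names, and the two demonstration workers (which only pretend to measure their memory) are all hypothetical.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define EXIT_RECYCLE 75   /* arbitrary code meaning "please restart me" */

typedef int (*worker_fn)(void);

/* Worker-side check, to be made between requests. */
static int over_budget(long current_kib, long limit_kib)
{
    return current_kib > limit_kib;
}

/* Run worker in a child process, starting a fresh child each time one
   exits with EXIT_RECYCLE, up to max_restarts times. Returns the
   number of restarts performed, or -1 if fork() or waitpid() fails. */
static int supervise(worker_fn worker, int max_restarts)
{
    int restarts = 0;

    for (;;) {
        int status;
        pid_t pid = fork();

        if (pid < 0)
            return -1;
        if (pid == 0)
            _exit(worker());   /* child: leaked memory dies with it */
        if (waitpid(pid, &status, 0) < 0)
            return -1;
        if (!WIFEXITED(status) || WEXITSTATUS(status) != EXIT_RECYCLE)
            return restarts;   /* normal exit or crash: stop here */
        if (restarts == max_restarts)
            return restarts;
        restarts++;
    }
}

/* Demonstration workers with made-up memory readings. */
static int leaky_worker(void)
{
    return over_budget(2048, 1024) ? EXIT_RECYCLE : EXIT_SUCCESS;
}

static int healthy_worker(void)
{
    return over_budget(512, 1024) ? EXIT_RECYCLE : EXIT_SUCCESS;
}
```

A call such as supervise(leaky_worker, 3) restarts the worker three times and then gives up, while supervise(healthy_worker, 3) returns after the first clean exit. The attraction of the design is that the operating system reclaims everything the child leaked when it exits, so the supervisor itself can stay small enough to audit by hand.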

Summary

Memory leaks are an old problem, already well-described decades ago. As mundane as they seem, and as irrelevant as hardware advances sometimes appear to make them, memory leaks remain important, especially in light of:

  • Very long-lasting services
  • Constrained hardware, especially in a world where most processors operate inside telephones, automobiles, or other "embedded" systems, rather than in conventional computers
  • The ways common caching architectures, copy-on-write semantics, new devices such as solid-state drives and small form-factor computers, and even trends in encryption and "green" computing put a premium on thrifty memory use.

If you're among the majority of programmers for whom memory leaks matter, learn to

  • Code and document carefully to eliminate errors
  • Review source
  • Use special-purpose tools such as valgrind to spot memory faults
  • Measure the actual memory use of your programs
  • Prepare your programs to restart themselves when necessary

Managers and end-users are unlikely to thank you for your leak-free programming. Without good memory practices, though, your users are likely to spend their time focused on mysterious slowdowns or internal errors and not appreciate all the functionality you've delivered them. Clean up memory, and let your applications' real value shine through.

Copyright © 2008 Linux Foundation. All rights reserved.
LSB is a trademark of the Linux Foundation. Linux is a registered trademark of Linus Torvalds