Forcing the garbage collector

I’m currently working on a system that is being designed to minimize problems with garbage collection. In the past I would have said that the garbage collection system is not the concern of application developers. I stand by that for most cases but this is a somewhat unusual application in that it combines a large number of operations per second with a very large number of objects in play (the “working set”).

I have written a small program to expose similar behaviour and it is below in full (short and complete). The basic operation is:
* create a large array of objects which hold some data
* select an object at random from the array and replace it with a new object; repeat this as fast as possible
* monitor progress on another, very light, thread that polls every second or so and reports the number of operations completed since the last poll.

The idea is that creating a large number of new objects gives a very high memory allocation velocity, which means there is a lot of garbage to clean up. The large working set means that the garbage collector has a lot of objects to traverse, and the random nature of the deletions means that the memory becomes fragmented and requires extensive compaction to create contiguous blocks of memory for allocating new objects.

Each of the objects in the array is about 4kB (= 10*100*sizeof(int)) and I started to observe the effects of garbage collection at about 20,000 objects on my dual-core desktop machine.
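As a rough check of that arithmetic, here is a minimal sketch that works out the payload sizes. It counts only the int payload and ignores object headers, array overhead and the Guid and DateTime fields, so real figures will be somewhat higher. The class and method names are made up for illustration.

```csharp
using System;

static class WorkingSetEstimate
{
    // Payload bytes per ComplexObject: 10 SimpleObjects, each holding 100 ints.
    public static long BytesPerObject()
    {
        return 10L * 100L * sizeof(int);
    }

    // Approximate payload of a whole working set of the given size.
    public static long WorkingSetBytes(int objectCount)
    {
        return objectCount * BytesPerObject();
    }

    static void Main()
    {
        Console.WriteLine(BytesPerObject());        // 4000
        Console.WriteLine(WorkingSetBytes(20000));  // 80000000
    }
}
```

So the 20,000-object threshold corresponds to roughly 80 MB of int payload alone.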

To run the program:

  1. save it (as, say, c:\temp\Garbage.cs)
  2. open the Visual Studio command prompt
  3. change to the c:\temp folder
  4. compile it by running csc c:\temp\garbage.cs
  5. run the program, specifying the number of objects in the working set as an argument (in this case 1000): Garbage.exe 1000

First, let’s look at what happens when the working set is small. If I run with 100 objects I see something like this:

C:\Temp>garbage.exe 100
starting main..
Filling the working set with 100 objects
starting monitor thread..
starting worker thread..
press enter to kill it
the count this iteration is:8,         ticks :0
the count this iteration is:17489,     ticks were:2127710960
the count this iteration is:17480,     ticks were:2127296384
the count this iteration is:17178,     ticks were:2127757912
the count this iteration is:17488,     ticks were:2127710808
the count this iteration is:17339,     ticks were:2127596432
the count this iteration is:17083,     ticks were:2132044856
the count this iteration is:17323,     ticks were:2127488824
the count this iteration is:17412,     ticks were:2127741720
the count this iteration is:17537,     ticks were:2127549544

So the program polls regularly and records the number of updates to the data as the “count this iteration”, along with the ticks elapsed since the last poll. If I run this for a while I get this kind of distribution of counts:

count        # of polls

15900        2
15950        0
16000        2
16050        1
16100        0
16150        7
16200        20
16250        27
16300        42
16350        49
16400        27
16450        11
16500        1
16550        0
16600        0

standard dev = 123
std/average = 0.8%

This looks “normal”: we don’t see any spikes in memory, and the garbage collector is keeping up with no pressure on memory or CPU. This is a typical app and we don’t need to worry about garbage collection. This behaviour actually persists, on my machine, up to about 20,000 objects (= 20,000 * 4 kB = 80 MB). Above that, things start to happen.

When I run with 100,000 on my machine, I get something like this:

C:\Temp>garbage.exe 100000
starting main..
Filling the working set with 100000 objects
starting monitor thread..
starting worker thread..
the count this iteration is:1,           ticks were:0
press enter to kill it
the count this iteration is:12498,     ticks were:2127792200
the count this iteration is:12502,     ticks were:2127633784
the count this iteration is:12666,     ticks were:2127541688
the count this iteration is:7239,      ticks were:2127698056

the count this iteration is:12174,     ticks were:2127482736
the count this iteration is:8522,      ticks were:3823133528

the count this iteration is:12649,     ticks were:2127596272
the count this iteration is:12781,     ticks were:2186759112
the count this iteration is:12856,     ticks were:2127987848
the count this iteration is:12435,     ticks were:2127509440
the count this iteration is:6981,      ticks were:3197386008

the count this iteration is:12589,     ticks were:2148094192

We notice two things immediately:
1) the number of operations per poll is no longer distributed around an average; there are “bad polls” when far fewer operations complete
2) during those “bad polls” the monitoring thread is affected as well. The poll takes 50% or so longer to complete, so these bad polls are very bad: the throughput of operations on the data has dropped by as much as 75%

The distribution of operations per poll looked like this for me (your mileage may vary):

count        # of polls

1000        3
2000        6
3000        2
4000        4
5000        2
6000        0
7000        0
8000        1
9000        0
10000        0
11000        6
12000        133
13000        43
14000        1

average = 10975
standard dev = 2715
std/ave = 25%

The big things to notice are:
1) the operations per poll has dropped, even on good polls. Some drop seems reasonable, but the decrease is quite large
2) the distribution goes nearly down to 0; the operation counts ranged from 13681 all the way down to just 261! This “fat tail” is the result of the “bad polls”. This is what sometimes happens to stock prices, where a single big loss wipes out years of modest gains, and it is why you shouldn’t always trust salespeople who tell you that the “average” gain on an investment is 10%: if it loses 50% of its value once every 10 years, you won’t do so well.
3) nearly 10% of the polls were “bad”; that is, they had less than 50% of the average count
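The summary figures used here (average, standard deviation, their ratio, and the share of “bad” polls) can be computed from the raw counts along these lines. This is a sketch rather than the code used to produce the numbers above: the helper names are invented, it uses the population standard deviation, and the data in Main is just the dozen counts from the sample console output, not the full run.

```csharp
using System;
using System.Linq;

static class PollStats
{
    public static double Mean(int[] counts)
    {
        return counts.Average();
    }

    // Population standard deviation of the per-poll counts.
    public static double StdDev(int[] counts)
    {
        double mean = counts.Average();
        double sumOfSquares = counts.Sum(c => (c - mean) * (c - mean));
        return Math.Sqrt(sumOfSquares / counts.Length);
    }

    // Fraction of polls whose count is below half of the average.
    public static double BadPollFraction(int[] counts)
    {
        double mean = counts.Average();
        return counts.Count(c => c < 0.5 * mean) / (double)counts.Length;
    }

    static void Main()
    {
        // Counts copied from the 100,000-object console output shown earlier.
        int[] counts = { 12498, 12502, 12666, 7239, 12174, 8522,
                         12649, 12781, 12856, 12435, 6981, 12589 };

        double mean = Mean(counts);
        double std = StdDev(counts);
        Console.WriteLine("average = {0:F0}", mean);
        Console.WriteLine("standard dev = {0:F0}", std);
        Console.WriteLine("std/ave = {0:P0}", std / mean);
        Console.WriteLine("bad polls = {0:P0}", BadPollFraction(counts));
    }
}
```

With only a handful of polls the percentages will be noisy; the distribution above was built from a much longer run.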

A picture is worth 1k words so let’s look at task manager:


The saw-tooth pattern of repeated garbage collections is quite clear. We don’t see a peak in CPU activity, as we are already running at high CPU, but my guess is that the CPU has switched from doing the real work to doing the garbage collection work!
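One way to test that guess, instead of inferring it from Task Manager, is to ask the runtime how many collections each generation has had. The sketch below uses GC.CollectionCount and GC.MaxGeneration, which are standard .NET APIs; the GcCounters class and the allocation loop are illustrative assumptions and not part of the original program.

```csharp
using System;

static class GcCounters
{
    // Snapshot of how many collections each generation has had so far.
    public static int[] Snapshot()
    {
        var counts = new int[GC.MaxGeneration + 1];
        for (int gen = 0; gen <= GC.MaxGeneration; gen++)
        {
            counts[gen] = GC.CollectionCount(gen);
        }
        return counts;
    }

    // Collections per generation that happened between two snapshots.
    public static int[] Delta(int[] before, int[] after)
    {
        var delta = new int[before.Length];
        for (int i = 0; i < before.Length; i++)
        {
            delta[i] = after[i] - before[i];
        }
        return delta;
    }

    static void Main()
    {
        int[] before = Snapshot();

        // Create some short-lived garbage so there is something to collect.
        for (int i = 0; i < 100000; i++)
        {
            var junk = new int[1000];
            junk[0] = i;
        }

        int[] delta = Delta(before, Snapshot());
        for (int gen = 0; gen < delta.Length; gen++)
        {
            Console.WriteLine("gen {0} collections: {1}", gen, delta[gen]);
        }
    }
}
```

Taking a snapshot in the monitor thread on every poll and printing the delta next to the operation count would show whether the bad polls line up with generation 2 collections.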

So, the question is: how do we avoid this?

The answer is mostly that you don’t need to, provided you aren’t trying to do 10,000 operations per second on 10,000 objects of reasonably large size (I don’t regularly create data objects with 100 integers inside).

If you do have to do that, then the best thing to focus on is object lifetime. If you need to create objects, make sure that they go out of scope quickly so that they can be collected in the generation 0 garbage collection, which is much faster. If you just need to store some data, don’t create new objects; instead modify the data stored in an object that you already have, so that the object lifetime becomes effectively infinite and it won’t be collected at all. Finally, be careful how you pass objects around: if references to them get stuck somewhere, for instance in a collection, you might be extending lifetimes that you didn’t intend to.

This kind of non-reference-passing, controlled-object-lifetime code is tough to write and maintain, so don’t do it if you don’t need it. The other alternatives, such as specialized hardware like Azul for Java or buying in an in-memory enterprise database product (sometimes called an Enterprise Data Fabric or EDF), are both even more expensive.
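To make the “modify the object you already have” advice concrete, here is a minimal sketch of a working set whose slots are overwritten in place instead of being replaced. ReusableRecord and ReuseDemo are hypothetical names, not part of the program in this post, and the numbers in Main are arbitrary.

```csharp
using System;

// Hypothetical data object: its buffer is allocated once and then
// overwritten, so the object itself never becomes garbage.
class ReusableRecord
{
    private readonly int[] _values = new int[100];

    public int Version { get; private set; }

    public int this[int index]
    {
        get { return _values[index]; }
    }

    // Replace the stored data without allocating anything new.
    public void Overwrite(int seed)
    {
        for (int i = 0; i < _values.Length; i++)
        {
            _values[i] = seed + i;
        }
        Version++;
    }
}

static class ReuseDemo
{
    public static ReusableRecord[] CreateWorkingSet(int size)
    {
        var data = new ReusableRecord[size];
        for (int i = 0; i < data.Length; i++)
        {
            data[i] = new ReusableRecord();
        }
        return data;
    }

    static void Main()
    {
        ReusableRecord[] data = CreateWorkingSet(1000);
        var random = new Random();
        int gen0Before = GC.CollectionCount(0);

        // Same random-update pattern as the test program, but each update
        // mutates an existing object instead of abandoning it.
        for (int op = 0; op < 1000000; op++)
        {
            data[random.Next(data.Length)].Overwrite(op);
        }

        Console.WriteLine("gen 0 collections during updates: {0}",
            GC.CollectionCount(0) - gen0Before);
    }
}
```

The trade-off is the one described above: every caller now sees the same mutable object, so any code that held on to a reference will see its data change underneath it.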

code follows:

using System;
using System.Diagnostics;
using System.Threading;

namespace GarbageCollectConsole
{
    class Program
    {
        /// <summary>
        /// the large working set that will be the garbage collector's
        /// main problem; as we update it, it has to work hard to keep it clean.
        /// </summary>
        private static ComplexObject[] _data;

        private static int counter = 0;
        private static int lastCount;

        /// <summary>
        /// better than using DateTime.Now.Ticks
        /// </summary>
        private static readonly Stopwatch clock = new Stopwatch();

        private volatile static bool stopNow = false;

        /// <summary>
        ///  the entry point to the program
        /// </summary>
        /// <param name="args"></param>
        static void Main(string[] args)
        {
            Console.WriteLine("starting main..");

            int arraySize = GetArraySize(args);
            Console.WriteLine("Filling the working set with {0} objects", arraySize);
            _data = new ComplexObject[arraySize];
            InitialArrayFill();

            var mon = new Thread(MonitorCounter);
            Console.WriteLine("starting monitor thread..");
            mon.Start();

            var worker = new Thread(AllocateObjects);
            Console.WriteLine("starting worker thread..");
            worker.Start();

            Console.WriteLine("press enter to kill it");
            Console.ReadLine();
            stopNow = true;
            Console.WriteLine("press enter to exit main...");
            Console.ReadLine();

            Console.WriteLine("about to exit main...");
        }

        /// <summary>
        /// parse the command line to get the size of the working set, start with
        /// at least 10,000. With no argument the size defaults to 100,000.
        /// </summary>
        /// <param name="args"></param>
        /// <returns></returns>
        private static int GetArraySize(string[] args)
        {
            if (args.Length < 1)
            {
                return 100000;
            }
            return int.Parse(args[0]);
        }

        /// <summary>
        /// start another thread that will put messages to the console
        /// this is so slow that it won't really affect things on the
        /// worker thread
        /// </summary>
        static void MonitorCounter()
        {
            while (!stopNow)
            {
                // atomic read of the counter that the worker thread increments
                int localcounter = Interlocked.CompareExchange(ref counter, 0, 0);

                // ticks since the previous poll; 0 on the first pass because
                // the clock has not been started yet
                long ticks = clock.ElapsedTicks;
                clock.Reset();
                clock.Start();

                Console.WriteLine("the count this iteration is:{0},\t\t ticks were:{1}",
                    (localcounter - lastCount),
                    ticks);

                lastCount = localcounter;

                // poll every second or so
                Thread.Sleep(1000);
            }
        }

        static void InitialArrayFill()
        {
            for (int i = 0; i < _data.Length; i++)
            {
                _data[i] = new ComplexObject();
            }
        }

        /// <summary>
        /// randomly abandon a referenced object. One of the objects
        /// that is referenced from the _data array is replaced. As the _data
        /// array is the only reference to the object, it is available for
        /// garbage collection. As this loop only hits about 10,000 per second
        /// (on my machine) most of the objects in the array generally live
        /// for a long time, thus they get into generation 2.
        /// </summary>
        static void AllocateObjects()
        {
            var indexGenerator = new Random();

            while (!stopNow)
            {
                Interlocked.Increment(ref counter); //safe to do as this is the only thread updating
                _data[NextIndex(indexGenerator)] = new ComplexObject();
            }
        }

        /// <summary>
        /// use random, as we don't want to just go through the array as it will
        /// end up being contiguous in memory again. We want the heap memory to get
        /// as fragmented as possible
        /// </summary>
        /// <param name="indexGenerator"></param>
        /// <returns></returns>
        static private int NextIndex(Random indexGenerator)
        {
            // NextDouble() is always less than 1.0, so the index stays in range
            return (int)(indexGenerator.NextDouble() * _data.Length);
        }
    }

    /// <summary>
    /// a simple object, I had to include the _data array
    /// to add some memory pressure. Without it the memory allocation
    /// velocity was so small that the memory was cleared without even
    /// noticing a CPU or memory spike
    /// </summary>
    class SimpleObject
    {
        public int Number { get; set; }
        public readonly Guid Foo;
        public readonly DateTime stamp = DateTime.Now;
        public readonly int[] _data = new int[100];

        public SimpleObject()
        {
            // Next() rather than (int)NextDouble(), which always truncates to 0
            Number = new Random().Next();
            Foo = Guid.NewGuid();
            for (int index = 0; index < _data.Length; index++)
            {
                _data[index] = index;
            }
        }
    }

    /// <summary>
    /// create a deeper object graph, may not make a big difference
    /// </summary>
    class ComplexObject
    {
        public readonly SimpleObject[] _data = new SimpleObject[10];

        public ComplexObject()
        {
            for (int index = 0; index < _data.Length; index++)
            {
                _data[index] = new SimpleObject();
            }
        }
    }
}

