2007/08/22

Memory leaks

Let's consider memory leaks in the document parser.

They are usually easy to detect: the memory behavior goes wrong, and much more memory is used than should be. This usually shows up fast, for at least two reasons: time and code coverage. Unlike some software that only reveals its leaks after a few days of uptime, a parser shows them within minutes, because it processes a large number of documents in a short span of time. Leaks are also easy to see because of code coverage: if anything can trigger a memory leak, it will.

When these memory leaks happen, they are either obvious or really hard to track. Either way, they consume time.

What could be a solution to this problem, then?

The algorithm/methodology I've used, and which has worked quite well so far, I will call the "door algorithm": when you open a door, you close it after yourself.
With memory, it's the same reflex: Allocate/Free. Be a good boy, or a good girl. Also, "standardize" where memory gets allocated and freed: allocate as much as possible in the "constructor", free it all in the "destructor" (quotes applying if you are not using an object-oriented language).
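To make it concrete, here is a minimal C sketch of the pattern; parser_t and its fields are hypothetical names for illustration, not taken from any particular parser:

#include <stdlib.h>

/* Hypothetical parser context: everything it owns lives here. */
typedef struct {
    char  *buffer;      /* raw document bytes */
    char **tokens;      /* token table        */
    size_t max_tokens;
} parser_t;

/* "Constructor": every allocation happens here, and only here. */
parser_t *parser_new(size_t buffer_size, size_t max_tokens)
{
    parser_t *p = malloc(sizeof *p);
    if (!p)
        return NULL;
    p->buffer     = malloc(buffer_size);
    p->tokens     = calloc(max_tokens, sizeof *p->tokens);
    p->max_tokens = max_tokens;
    if (!p->buffer || !p->tokens) {  /* partial failure: close the door anyway */
        free(p->buffer);
        free(p->tokens);
        free(p);
        return NULL;
    }
    return p;
}

/* "Destructor": every free() mirrors one allocation above. */
void parser_free(parser_t *p)
{
    if (!p)
        return;
    free(p->buffer);
    free(p->tokens);
    free(p);
}

The point is the symmetry: for every malloc in parser_new there is a mirror free in parser_free, so a leak can only hide in the code between the two doors, which shrinks the search space enormously.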

Wouldn't it be more efficient and less error-prone to rely only on tools such as valgrind?
Well, the two methods, I think, must walk hand in hand: such a tool does a good job of tracking leaks, but it can report false positives and miss real leaks too. We talked about Murphy's law before.
And, above all, I think these tools must be used as a parachute, not as the only programming strategy (if only because of the debugging time at stake).
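As a parachute it is still well worth pulling, of course. A typical invocation (the binary and document names here are made up) would be:

valgrind --leak-check=full ./parser document.xml

valgrind then lists each leaked block at exit together with the call stack that allocated it, which pairs nicely with the door discipline: the report tells you which door was left open.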
