I have a bulk import process that needs to run nightly. Currently there are around 4,200 "rows" to process; each "row" can actually touch many tables, so the term is not entirely accurate. The problem is that the script poops out after ~200 "rows" with memory limit errors. Increasing the memory limit is doable, but I am not interested in that as a solution right now. First, I recorded some numbers to benchmark the script's memory consumption. The first number is the "row", peak is the script's peak memory allocation at that point, and mem is the delta in memory allocation since the previous row:

0 peak: 41,103,408 mem: 29,884,416
1 peak: 42,440,264 mem: 1,310,720
2 peak: 43,613,848 mem: 1,310,720
3 peak: 43,893,960 mem: 262,144
4 peak: 44,223,040 mem: 262,144
5 peak: 44,896,296 mem: 786,432
6 peak: 45,671,560 mem: 786,432
7 peak: 45,865,952 mem: 0
8 peak: 46,418,272 mem: 786,432
9 peak: 47,917,888 mem: 1,310,720
10 peak: 48,566,312 mem: 786,432
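These numbers come from a simple harness along these lines (a sketch; importRow() is a hypothetical stand-in for the real per-row import logic, and I'm assuming the rows are already loaded into $rows):

$lastUsage = memory_get_usage(true);
foreach ($rows as $i => $row)
{
  importRow($row); // hypothetical stand-in for the real import logic

  // Passing true asks PHP for the memory actually reserved from the
  // system, not just the portion currently in use.
  $usage = memory_get_usage(true);
  printf("%d peak: %s mem: %s\n",
    $i,
    number_format(memory_get_peak_usage(true)),
    number_format($usage - $lastUsage));
  $lastUsage = $usage;
}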

After the initialization phase, memory allocation increases fairly steadily, and it is not long before the 128M memory limit is reached. This is unacceptable: I know the delta for many of these "rows" should be much closer to 0, since nothing is actually imported for the majority of them. My first solution was to disable logging:

sfConfig::set('sf_logging_enabled', FALSE);

The initial memory allocation decreased, but the running per-row deltas remained higher than expected. Second, I inserted a ton of unset() calls in the various functions, roughly as sketched after the numbers below. This dropped my deltas a little:

0 peak: 30,611,472 mem: 20,971,520
1 peak: 31,969,760 mem: 1,310,720
2 peak: 32,180,112 mem: 262,144
3 peak: 32,333,216 mem: 262,144
4 peak: 32,525,144 mem: 262,144
5 peak: 32,656,616 mem: 0
6 peak: 32,863,736 mem: 262,144
7 peak: 33,103,264 mem: 262,144
8 peak: 33,455,544 mem: 262,144
9 peak: 33,754,288 mem: 262,144
10 peak: 33,984,976 mem: 262,144
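The unset() calls themselves were nothing clever, just dropping large intermediate structures as soon as a function was finished with them. A sketch, with hypothetical helper names:

function importRow($row)
{
  $data = parseRow($row);         // hypothetical helper
  $records = buildRecords($data); // hypothetical helper

  // Release the parsed payload now instead of letting it linger
  // until the function returns.
  unset($data);

  saveRecords($records); // hypothetical helper
  unset($records);
}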

But allocation still killed the import before it could complete. Browsing through various sites, I discovered that Propel has a hard time cleaning up circular references, which means PHP cannot garbage-collect that memory. To combat this, Propel 1.3 offers a static method, disableInstancePooling(), which let me override Propel's desire to keep instances around. Adding:

Propel::disableInstancePooling();

to the beginning of the import gave me these results:

0 peak: 26,569,632 mem: 17,301,504
1 peak: 28,582,152 mem: 2,097,152
2 peak: 30,455,352 mem: 1,835,008
3 peak: 30,536,176 mem: 0
4 peak: 30,536,176 mem: 0
5 peak: 31,517,088 mem: 1,048,576
6 peak: 31,534,152 mem: 0
7 peak: 31,552,120 mem: 0
8 peak: 31,589,632 mem: 0
9 peak: 31,695,504 mem: 0
10 peak: 31,695,504 mem: 0

Now new memory was allocated only when the import was actually doing something of significance. In fact, watching the import proceed with the deltas displayed, I could see memory usage decrease at times, prolonging the life of the script by more than an order of magnitude. Whereas before the script processed ~200 "rows", it now handles the whole batch (4,237 "rows" currently) in one go. As the number of importable "rows" grows, I know I won't be butting up against memory limits for some time.
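For completeness, the top of the import now amounts to these three changes together (a sketch; importRow() again stands in for the real logic, and Propel::enableInstancePooling() restores the default behavior for anything that runs afterward in the same process):

// Disabling logging cut the baseline memory allocation.
sfConfig::set('sf_logging_enabled', FALSE);

// Stop Propel from keeping a reference to every hydrated object.
Propel::disableInstancePooling();

foreach ($rows as $row)
{
  importRow($row); // hypothetical stand-in for the real import logic
}

// Restore instance pooling for any code that follows the import.
Propel::enableInstancePooling();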