In our case, the issue wasn't kernel process creation; it was the CPU and I/O overhead of service start-up. At some point, the system gets dominated by context-switching, and throughput suffers. For example, you certainly wouldn't want the box to go into swap because of start-up allocation spikes.