
> First, cite sources for specific cases where a mainstream malloc() is "a lot slower" than a specific GC'd allocation in a mainstream HLL. This rings more truthy than true to me.

> Second, fixing malloc slowness is among the easiest and fastest optimizations you can make in a C program (in most cases, a pool and freelist will get you 90% of the way there), and no GC'd allocator is faster than pool and arena allocation (which is usually just a couple ALU operations in both the alloc and free cases).

In a semispace or generational GC, allocating memory is just bumping a pointer.
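
For reference, the pool/freelist pattern the quoted comment describes looks roughly like this. This is just a minimal sketch assuming fixed-size blocks carved from a single pre-allocated slab; the names are illustrative, not any particular library's API:

    #include <stddef.h>

    /* Minimal fixed-size pool with an intrusive freelist:
     * alloc and free are each a handful of loads, stores and a branch. */
    typedef union Block {
        union Block *next;          /* valid only while the block is free */
        char payload[64];           /* block size; must be >= sizeof(void *) */
    } Block;

    typedef struct Pool {
        Block *free_head;
    } Pool;

    void pool_init(Pool *p, Block *slab, size_t nblocks) {  /* nblocks >= 1 */
        for (size_t i = 0; i + 1 < nblocks; i++)
            slab[i].next = &slab[i + 1];
        slab[nblocks - 1].next = NULL;
        p->free_head = slab;
    }

    void *pool_alloc(Pool *p) {
        Block *b = p->free_head;
        if (b == NULL) return NULL;     /* pool exhausted */
        p->free_head = b->next;         /* pop the freelist head */
        return b;
    }

    void pool_free(Pool *p, void *ptr) {
        Block *b = ptr;
        b->next = p->free_head;         /* push back onto the freelist */
        p->free_head = b;
    }

Both the alloc and free fast paths here are a couple of memory operations and a branch, which is the baseline the GC discussion below is being compared against.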



You're not counting the cost of deallocation.


Not an easy thing to count globally. Your GC cost (the tracing work, at least) depends on the number of objects you're *not* deallocating; the rest are implicitly destroyed.


You can't decouple the cost of malloc() from the cost of free(); if malloc has to do any more work than simply bumping a pointer (or, in the general case, grabbing a mutex and then bumping a pointer), it's a concession to free() (or to managing fragmentation, a side effect of free).


Semispace (and generational) collectors do decouple the cost of allocation from the cost of collection by only considering the set of live objects. They also compact (defragment) as a side effect of collection.


In the case of a minor collection, you don't do a full mark-sweep. You only look at a small fraction of the live objects. Generally you only need to trace objects in the youngest generation which are reachable from GC roots. Since you expect most young generation objects to be eligible for collection, this is extremely fast.

It is unusual for a lot of references to exist from older generations into the youngest generation, but when they do, there are data structures (remembered sets or card tables, maintained by a write barrier) for detecting and resolving these references very efficiently.
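
A rough sketch of the card-table variant (the layout and names here are illustrative, not HotSpot's actual code):

    #include <stdint.h>

    #define CARD_SHIFT 9                  /* 512-byte cards, a common choice */

    static uint8_t *card_table;           /* one dirty byte per card; set up at VM startup */
    static uintptr_t heap_base;           /* likewise set up at VM startup */

    /* Run after every store of a reference into an object field. */
    static inline void write_barrier(void *field_addr) {
        card_table[((uintptr_t)field_addr - heap_base) >> CARD_SHIFT] = 1;
    }

A minor collection then scans only the dirty cards for old-to-young references and treats them as extra roots; clean cards are skipped entirely.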


In the collectors I mentioned, deallocation is a matter of moving the live objects to the other semispace, and considering the "current" semispace to be garbage.

If you're allocating a lot of short-lived objects, then explicitly keeping the live objects is much more efficient than explicitly culling the dead objects.
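
As a sketch of what "keeping the live objects" means in practice, here is a toy Cheney-style evacuation step; this is illustrative only, with an invented object layout, not a real collector:

    #include <stddef.h>
    #include <string.h>

    typedef struct Obj {
        size_t size;              /* total object size in bytes          */
        struct Obj *forward;      /* non-NULL once the object is copied  */
        /* reference fields and other payload follow the header */
    } Obj;

    static char *to_space_alloc;  /* bump pointer into the to-space; set up before collection */

    /* Evacuate one object into the to-space, or return its new address
     * if it has already been moved. */
    static Obj *evacuate(Obj *o) {
        if (o == NULL) return NULL;
        if (o->forward) return o->forward;
        Obj *dst = (Obj *)to_space_alloc;
        memcpy(dst, o, o->size);
        to_space_alloc += o->size;
        o->forward = dst;
        return dst;
    }

The collector evacuates every root, then walks the objects it just copied and evacuates whatever they reference. Dead objects are never visited at all: when the spaces are flipped, the old from-space is simply reused as empty allocation space, and no per-object free() ever runs.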


> just bumping a pointer

If only. You have to check for out-of-memory, and throw an exception if so. (This check can sometimes be optimized away, though.) You also have to mark the size of the allocated block, so the garbage collector knows how many bytes to copy when it evacuates the object.

Also, garbage collectors usually allocate from a shared memory pool, so every allocation also involves some mutex operations.

Allocation is bumping a pointer in theory only.


> You have to check for out-of-memory, and throw an exception if so.

You have to check whether there is room for the allocation in the current space. That's two native instructions: a compare and a conditional branch. If the check fails, you don't throw an exception; you trigger a minor collection.
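
Sketched in C (illustrative; not the exact code any particular VM emits):

    #include <stddef.h>

    static char *alloc_top;   /* next free byte in the current allocation buffer */
    static char *alloc_end;   /* end of that buffer */

    void *slow_path_allocate(size_t size);   /* hypothetical: runs a minor GC, then retries */

    static inline void *fast_allocate(size_t size) {
        if ((size_t)(alloc_end - alloc_top) < size)   /* the compare + conditional branch */
            return slow_path_allocate(size);
        void *obj = alloc_top;
        alloc_top += size;                            /* bump the pointer */
        return obj;
    }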

> You also have to mark the size of the allocated block

Yes and no. When an object is allocated, its header records type information (a class pointer) from which the size can be derived. I don't think the Java VM explicitly stores the size of the allocation anywhere; it doesn't need to, since the same information is available just by looking at the object itself.
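
Roughly the idea, with hypothetical names (not HotSpot's actual header layout):

    #include <stddef.h>

    typedef struct Class {
        size_t instance_size;    /* fixed for ordinary (non-array) instances */
    } Class;

    typedef struct ObjectHeader {
        Class *klass;            /* every object starts with a pointer to its class */
    } ObjectHeader;

    /* The collector recovers an object's size from its class; an array
     * would additionally read a length field stored in its own header. */
    static size_t object_size(const ObjectHeader *o) {
        return o->klass->instance_size;
    }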

> Also, garbage collectors usually allocate from a shared memory pool, so every allocation also involves some mutex operations.

In the Java VM this problem is addressed by dividing the Eden space into chunks (thread-local allocation buffers, or TLABs), each of which is private to a single thread.
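
Schematically (names hypothetical, not HotSpot's code): each thread bumps a pointer inside its own buffer with no synchronization at all, and only refilling that buffer from the shared Eden space needs a lock or atomic operation:

    #include <stddef.h>
    #include <pthread.h>

    typedef struct { char *top; char *end; } TLAB;

    static __thread TLAB tlab;                  /* one buffer per thread */
    static char *eden_top, *eden_end;           /* shared young-generation space */
    static pthread_mutex_t eden_lock = PTHREAD_MUTEX_INITIALIZER;

    #define TLAB_SIZE (256 * 1024)

    /* Slow path: carve a fresh chunk out of Eden under a lock (or a CAS). */
    static int tlab_refill(void) {
        pthread_mutex_lock(&eden_lock);
        if ((size_t)(eden_end - eden_top) < TLAB_SIZE) {
            pthread_mutex_unlock(&eden_lock);
            return 0;                           /* Eden full: a minor GC would run here */
        }
        tlab.top = eden_top;
        tlab.end = eden_top + TLAB_SIZE;
        eden_top += TLAB_SIZE;
        pthread_mutex_unlock(&eden_lock);
        return 1;
    }

    /* Fast path: no mutex, just a bump within the thread's private buffer.
     * Assumes size <= TLAB_SIZE; larger objects would take a separate path. */
    static void *tlab_alloc(size_t size) {
        if ((size_t)(tlab.end - tlab.top) < size && !tlab_refill())
            return NULL;                        /* caller would wait for / trigger GC */
        void *obj = tlab.top;
        tlab.top += size;
        return obj;
    }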

> Allocation is bumping a pointer in theory only.

An object allocation in Java is around 10 assembly instructions. More than just 'bumping a pointer' but not a lot more.



