Cool. How did you deal with the case where a block gets freed out of order? For ...

srcmap · on Jan 7, 2019

The blocks were actually fix in pre-allocated blocks.

When a thread, process, kernel thread "allocated" a block. It actually acquired and owned the "index" via atomic inc from the free list.

When it free the block, it put the block's index back to the "free" list.

The "free" list is actually a array. Thus the list index can be allocated and free with atomic inc, dec. Thus it can do SMP safe alloc, free operations without any locks and guarantee the correctness.

This was actually a IPC, DIPC library to support TCP session redundancy in a network router product. Other programmers call this library from user, kernel and TCP stack, etc.

Since the blocks and index are all in share memory, there are huge potential for share memory corruptions. In debug mode, there are memory space before and after each block, every allocate, free are checked for ownership, double free, and border space corruption. There are regression tests to cover all aspect of API calls that were run 24/7 check for every commit. When not compile with debug mode, the operations were much faster but there were still quite a bit of assertions check in place.

There were a debug trace system that track all system activities (including Linux, QNX, vxwork context switches.) The log is actually in special region of memory that preserved over warm boot. If the system was completely hang, in 99% of cases, I can recover the memory to do post mordem analysis and know exactly the last user, kernel space threads that were running before the system crash. It was more useful for vxWork, less useful for Linux as majority of code were in user space and didn't take down the system.

The regression test coverage code and the debug mode validations were 80% of code. The actually implementations were very small.

okl · on Jan 6, 2019

Freeing is usually easy (for free() and memory pool) because the pointer can be used to compute the index into the bool array or bitfield. Allocating is more difficult since you have to search. You can use a linear search that starts at the last block that was allocated (for memory pools that's usually good enough). You can also use a bloom filter to accelerate the search. Example:

256 blocks of memory, you use 8 words of 32 bits to to indicate used/free status. 1 means free, 0 means occupied. To quickly find a free block you compare the words with zero.