More specifically, a pointer on x86-64 (and most other 64-bit architectures) is 8-byte aligned, so the compiler pads a struct holding an 8-byte pointer and a 2-byte short out to 16 bytes: 10 bytes of data plus 6 bytes of padding.
Alternatively, keeping them in separate arrays removes the alignment overhead, and each pair of values will consume 10 bytes. That's a substantial savings when you're trying to fit into something like a 32KB L1 data cache.
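To make the size difference concrete, here is a minimal sketch of the two layouts. The names (`struct entry`, `struct entries_soa`, `N_ENTRIES`) are made up for illustration, and the sizes in the comments assume a typical x86-64 ABI with 8-byte pointers:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical array-of-structs entry: a pointer plus a 2-byte length. */
struct entry {
    const char *ptr;  /* 8 bytes, 8-byte aligned on x86-64 (assumed ABI) */
    uint16_t    len;  /* 2 bytes, followed by 6 bytes of tail padding */
};

enum { N_ENTRIES = 1024 };  /* arbitrary count, for illustration only */

/* Structure-of-arrays alternative: each field lives in its own array,
   so no padding is needed between elements. */
struct entries_soa {
    const char *ptr[N_ENTRIES];
    uint16_t    len[N_ENTRIES];
};

/* Bytes consumed per entry when entries are stored as an array of structs. */
size_t aos_bytes_per_entry(void)
{
    return sizeof(struct entry);
}

/* Bytes consumed per entry when the fields are split into separate arrays. */
size_t soa_bytes_per_entry(void)
{
    return sizeof(struct entries_soa) / N_ENTRIES;
}
```

Under those assumptions the first function returns 16 and the second returns 10, so a 32KB cache holds roughly 3,200 split entries versus 2,048 padded structs.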
Another technique that can help you fit as much useful data as possible into L1/L2 cache is to store offsets rather than raw pointers. Using a 2-byte unsigned short for an offset rather than an 8-byte pointer would save another 6 bytes of cache per entry, assuming you're happy limiting your URL size to 64 kilobytes. The extra address arithmetic should be essentially free on a modern pipeline.
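The offset idea might look something like the sketch below. `struct span16`, `span_make`, and `span_ptr` are hypothetical names, and the code assumes every offset is taken relative to the start of a single URL buffer no larger than 64 kilobytes:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 4-byte entry: offset and length relative to a base buffer,
   instead of an 8-byte pointer plus a 2-byte length. */
struct span16 {
    uint16_t off;  /* byte offset from the start of the URL */
    uint16_t len;  /* length of the piece in bytes */
};

/* Record the piece [start, start + len) of the buffer at `base`.
   Returns 0 on success, -1 if the offset or length needs more than 16 bits. */
int span_make(const char *base, const char *start, size_t len,
              struct span16 *out)
{
    size_t off = (size_t)(start - base);
    if (off > UINT16_MAX || len > UINT16_MAX)
        return -1;
    out->off = (uint16_t)off;
    out->len = (uint16_t)len;
    return 0;
}

/* Turn an offset back into a pointer: one add, done at the point of use. */
const char *span_ptr(const char *base, struct span16 s)
{
    return base + s.off;
}
```

The base pointer is kept once, outside the array, so each entry costs 4 bytes rather than 10, and the explicit range check is where the 64-kilobyte limit gets enforced.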