Notably x86-64 canonical form uses the full 64 bit address when referring to mem...

Dylan16807 · on March 5, 2023

I'd say it does use just the first 48 bits, but in a sign-extended way. You could sign-extend to any size you want, even beyond 64 bits.

And it forces the program to do the sign extension, so it'll be forwards-compatible with chips that use more bits. But once those bits are verified they get immediately discarded.

zamadatix · on March 5, 2023

The method would scale to any number of bits but the actual registers and addresses passed to the MMU really are 64 bit in real processors, as in you can actually set it to some other non-sign extended 48 bit value with your program today and you’ll just get an MMU exception when you go to use it. After all the CPU doesn’t know something in a register is to be treated as a memory address until after the value is already loaded into the register.

The advantage of 48 bit addressing in x86-64 is fewer levels of page lookups (speed up), not abbreviated addresses.

Dylan16807 · on March 5, 2023

> as in you can actually set it to some other non-sign extended 48 bit value with your program today and you’ll just get an MMU exception when you go to use it

You get an error because you didn't sign extend. That doesn't tell you anything concrete about the address size.

> After all the CPU doesn’t know something in a register is to be treated as a memory address until after the value is already loaded into the register.

> The advantage of 48 bit addressing in x86-64 is fewer levels of page lookups (speed up), not abbreviated addresses.

If you're taking advantage of the address size in your program, you do only load 48 bits into your register from memory. Usually this is inside of a larger struct, or you've partitioned a 64 bit word into address plus flags.

One method that gets significant use is NaN boxing for dynamically typed values. Double precision floats are stored as-is, while other types of value including multiple types of pointer are squeezed into the 53 unused bits of a NaN.
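A rough sketch of the NaN boxing idea described above, in Python for readability. The tag layout here (`TAG_POINTER` in bit 48, a 48 bit payload) is an assumption made for illustration, not the scheme of any particular runtime, and it assumes ordinary arithmetic never produces a NaN with that tag bit set.

```python
import struct

QNAN = 0x7FF8_0000_0000_0000          # exponent all ones plus the quiet bit
TAG_POINTER = 0x0001_0000_0000_0000   # hypothetical type tag stored in bit 48
PAYLOAD_MASK = (1 << 48) - 1          # low 48 bits hold the address
POINTER_PATTERN = QNAN | TAG_POINTER


def box_double(x: float) -> int:
    """Store a double as-is: its raw 64-bit pattern is the boxed value."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]


def unbox_double(bits: int) -> float:
    return struct.unpack("<d", struct.pack("<Q", bits))[0]


def box_pointer(addr48: int) -> int:
    """Squeeze a 48-bit address into the payload of a quiet NaN."""
    if not 0 <= addr48 <= PAYLOAD_MASK:
        raise ValueError("address does not fit in 48 bits")
    return POINTER_PATTERN | addr48


def is_pointer(bits: int) -> bool:
    return (bits & POINTER_PATTERN) == POINTER_PATTERN


def unbox_pointer(bits: int) -> int:
    """Return the raw 48-bit payload; sign extension is a separate step."""
    return bits & PAYLOAD_MASK
```

Only 48 bits of the address ever sit in the boxed word; the upper 16 bits are rebuilt by sign extension before the value is used as a pointer.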

zamadatix · on March 5, 2023

You can do “mov rax, [0xFFFF…FF]” or “mov eax, 0x1234…FF” and in each case you have loaded either a 64 bit or 32 bit value into the register (depending on which memory mode you’re using). Saying it came from a struct or larger abstract type or has leading 0s does not change the actual hardware and make the register itself 48 bits.

Dylan16807 · on March 5, 2023

You could use a 16 bit load and a 32 bit load. I'm not sure what's faster.

If I make a struct that has a 32 bit field followed by a 48 bit field, I might use a 64 bit load as part of getting the latter. But that doesn't make the field larger. It's an optimization because it's safe to overshoot.

Edit: When I wrote this comment it was replying to a request for a 48 bit load instruction.
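The struct case above (a 32 bit field followed by a 48 bit field, read with a 64 bit load that overshoots) might look like this. The record layout and the name `read_addr_field` are hypothetical; the two trailing padding bytes are what make the overshoot safe in this sketch.

```python
import struct

# Hypothetical packed record: 4-byte id, 6-byte address, 2 bytes of padding.
ADDR_OFFSET = 4
ADDR_MASK = (1 << 48) - 1


def read_addr_field(record: bytes) -> int:
    """Read the 48-bit field with a single 64-bit little-endian load."""
    # The load covers 8 bytes starting at the field, so it picks up
    # 2 bytes that belong to whatever follows the field.
    (word,) = struct.unpack_from("<Q", record, ADDR_OFFSET)
    # Masking drops the 16 overshoot bits; the field itself is still 48 bits.
    return word & ADDR_MASK
```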

zamadatix · on March 5, 2023

You can’t do a 32 bit load followed by a 16 bit load, the memory address is in one register so the only option is a 64 bit value be stored there and you need to know what the first 16 bits are to see if the last 48 bits refers to the high or low portion. It’s trivial to check it’s passed all 64 since things like “0xFF0F…” gives an exception but adding “0x00F0…” to the same register and doing the load/call does not.

It’s not a matter of what wizardry you can do in the high level languages to act like it’s a true 48 bits for the sake of speed/size in your data structures; it’s a matter of what happens when the rubber hits the road and the CPU loads the address. One thing you can really do on x86 CPUs is run them in 64 bit mode (i.e. get the newer instructions and registers) but only use 32 bit addressing. This does actually go faster due to halving the address sizes; some examples can be found in the Linux kernel as the x32 ABI.

Dylan16807 · on March 5, 2023

> You can’t do a 32 bit load followed by a 16 bit load, the memory address is in one register so the only option is a 64 bit value be stored there

Putting the bits together into one register and doing the sign extension is a matter of arithmetic. It doesn't touch memory any more. You very much can do a 32 bit load and a 16 bit load and no other loads.

> you need to know what the first 16 bits are to see if the last 48 bits refers to the high or low portion

No you don't. That's not how the sign extending works. The top seventeen bits of the 64-bit value are all the same.

> what happens when the rubber hits the road and the CPU loads the address

The CPU will fetch an entire cache line, 64 bytes on typical x86-64 chips, but you only need to pull 6 bytes out of L1.

Then you can sign extend the upper bits and use your address as normal, to load any number of bytes from elsewhere in memory (maybe only one byte if you're feeling feisty).

Also you added the part about the register itself not being 48 bits to your previous comment after I replied, but I've never claimed the register was 48 bits. Just that it's effectively containing a 48 bit value. The reason I keep hammering on about sign extension is that you could put the same memory addresses into an 80 bit register, but it wouldn't make them 80 bit addresses. You're just using a register of arbitrary size to store a 48 bit signed number.
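For concreteness, the "32 bit load plus 16 bit load, then sign extend" sequence described above can be modelled like this. It is a sketch in Python rather than assembly, and the function names are made up for illustration.

```python
def sign_extend_48(raw48: int) -> int:
    """Sign-extend a 48-bit value to 64 bits (returned as an unsigned int)."""
    raw48 &= (1 << 48) - 1
    if raw48 & (1 << 47):          # bit 47 is the sign bit
        raw48 |= 0xFFFF << 48      # copy it into bits 48..63
    return raw48


def load_address_6_bytes(buf: bytes, offset: int) -> int:
    """Read a 32-bit piece and a 16-bit piece (little-endian), then combine."""
    low32 = int.from_bytes(buf[offset:offset + 4], "little")
    high16 = int.from_bytes(buf[offset + 4:offset + 6], "little")
    # Everything past this point is register arithmetic, not memory traffic.
    return sign_extend_48((high16 << 32) | low32)
```

In assembly the same effect comes from shifting the combined value left by 16 and then doing an arithmetic shift right by 16, which yields 0x0000 or 0xFFFF in the top bits depending on bit 47.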

zamadatix · on March 5, 2023

> Putting the bits together into one register and doing the sign extension is a matter of arithmetic. It doesn't touch memory any more. You very much can do a 32 bit load and a 16 bit load and no other loads.

There is no 48 bit register nor can you make a 48 bit register via a 32 and 16 bit load, you'd still have a 64 bit register you wrote 48 bits into it's just you've assumed you're going to be using only the lower half of the mapping so 0x0000... is a valid sign-extension. If you were trying to access the other half of addressable memory that doesn't work though.

> No you don't. That's not how the sign extending works. The top seventeen bits of the 64-bit value are all the same.

The bits are the same if you want it to work but it's not the same thing as the bits being extended for you. You still actually have to load the sign extended bits, i.e. if you put 0x0000F22... into the register and tell it to load the memory address it doesn't turn the memory address into 0xFFFFF22... for you. Your compiler or OS might be doing that on your behalf but the CPU does not. You will get an exception from the MMU if the memory address is not already sign extended when referenced. It may appear to work that way if you only ever work in user space where 0x0000... happens to extend the lower half for you if you don't set it, but only half the memory is mapped in a way that holds true and it's still passing those 0's to the MMU (as again is evident if you set one of the bits to something else then try to call the address).

I think the rest follows with the above. It's not a matter of theory: just create an assembly program that tries to load rax set to 0xFF0F...F20000 and compare it to one that tries to load rax set to 0xFFFF..F20000. One works, the other doesn't. The only explanation is the full value is being sent. If only 48 bits were sent to the MMU and it sign extended automatically, 0xFF0F wouldn't create an exception. It's a 64 bit address where only 48 bits worth are legal, that's very different than a 48 bit address.
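The check being argued about here, whether a 64 bit value is in canonical form for 48 bit addressing, can be written down as a small predicate. This models the rule only, not where in the hardware it is enforced. The full addresses in the usage note are hypothetical values chosen to match the prefixes in the comment, since the thread elides the middle digits.

```python
def is_canonical_48(addr: int) -> bool:
    """True if bits 47..63 of a 64-bit value are all equal."""
    top17 = (addr >> 47) & 0x1FFFF
    return top17 == 0 or top17 == 0x1FFFF
```

With this rule, `is_canonical_48(0xFFFFFFFFFFF20000)` holds while `is_canonical_48(0xFF0FFFFFFFF20000)` does not, which matches the "one works, the other doesn't" behaviour described above.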

Dylan16807 · on March 5, 2023

> There is no 48 bit register nor can you make a 48 bit register via a 32 and 16 bit load, you'd still have a 64 bit register you wrote 48 bits into it's just you've assumed you're going to be using only the lower half of the mapping so 0x0000... is a valid sign-extension. If you were trying to access the other half of addressable memory that doesn't work though.

You need to rearrange the bits into place. If you mov/shift the correct way it'll sign extend with 0x0000 or 0xffff as appropriate.

> it's not the same thing as the bits being extended for you

Yes you have to tell the CPU to sign extend. I wasn't trying to say otherwise. But you don't have to load those bits from anywhere. You only have to load 48 bits from memory.

> The only explanation is the full value is being sent. If only 48 bits were sent to the MMU

It's still possible the verification happens pre-MMU, but either way as soon as those bits are verified they get discarded. Only 48 bits are used for any functionality.

> and it sign extended automatically

Well again I wasn't saying that. The program needs to ask for the sign extension.

zamadatix · on March 5, 2023

In your high level logic you can treat memory addresses as 17 bits and transform them into 64 bit memory addresses on actual load, but that doesn't mean the architecture uses 17 bit memory addresses; it just means you're generating the addresses on the fly.

> It's still possible the verification happens pre-MMU

It's also possible there is a teapot in orbit between Earth and Mars but as far as actual x86 CPUs they don't deal with MARs (memory address registers) which would have that kind of logic, which is probably a good thing given all the different memory modes multiplied by the number of registers you can use. There are AGUs in newer x86 CPUs (well at least Intel, I assume in AMD) which offload common memory offset calculations but they take and return the standard register load size.

That only 48 bits are used for any functionality is a fair description though, assuming we're still excluding newer 57 bit CPUs. The main thing I was taking issue with was your initial statement "With 48 bits as a hard limit of what fits into a register" not how many bits of the register or memory address are functionally meaningful.

Dylan16807 · on March 5, 2023

> The main thing I was taking issue with was your initial statement "With 48 bits as a hard limit of what fits into a register" not how many bits of the register or memory address are functionally meaningful.

Oh, Oh!

That was a hypothetical talking about 48 bit CPUs that grew out of 12 bit CPUs.

That's why I said 384 tera-octets and something we would stick with.

zamadatix · on March 6, 2023

Makes sense :) thanks