I still don't get it. A context switch should save the entire x87 FPU register stack (eight registers, 80 bits each) and restore it when your thread resumes. The context switcher doesn't even know whether the registers hold floats, doubles or extended-precision values, so it has to save all the bits anyway.
Since each x87 FPU register is 80 bits wide on x86, values get expanded when they are loaded regardless of whether you use floats or doubles, because even a double is only 64 bits.
Do you have a link where that issue is described in a bit more detail? I feel like something is missing here.