On Stdio... - GNU/Linux
pranomostro
jkl: Kinda. It still just reads bytes, not code units. I should have written 'It first reads the byte 0x61, then it reads the byte 0x62, and then the byte 0x63 (which are 'a', 'b' and 'c', at least in ASCII).'
z3bra
Is there any character encoding that would treat 0x61 as something other than 'a'?
jkl
In EBCDIC, it's '/'.


(edited for correctness)
pranomostro
In UTF-16, 'a' is the two bytes 0x00 0x61 (big-endian), usually preceded by the byte order mark U+FEFF at the start of the file.

Poor Windows people.
jkl
In UTF-8, most characters are two bytes.

Poor Unicode people.
pranomostro
But not all of them, and UTF-16 has no ASCII backwards compatibility.
jkl
Emojis are UTF-16. I fail to see how this is related to Windows.
venam
This argument is going on forever...
It now deserves its name in history: The Unicode war.
OP has got the basic idea, I don't think there's a need to argue more about this.
It may deserve its own thread if it piques anyone's interest.
pranomostro
I think we are just arguing about two different things. I wanted to say that

1. Windows uses UTF-16 almost exclusively
2. UTF-16 has 2 bytes per code point (4 for code points outside the Basic Multilingual Plane, which are stored as surrogate pairs)
3. And no ASCII backwards compatibility.
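
For anyone who wants to check the byte-level claims in this thread, here is a minimal Python sketch. The codec names (`utf-16-be`, `utf-16-le`, `cp037`) are Python's; `cp037` is only one of several EBCDIC code pages and is picked here as an assumption, and the emoji U+1F600 is just an arbitrary example of a code point outside the Basic Multilingual Plane.

```python
# Encode the same character under several encodings and look at the raw bytes.
a = "a"

ascii_bytes = a.encode("ascii")        # one byte: 0x61
utf8_bytes = a.encode("utf-8")         # identical to ASCII for this character
utf16_be = a.encode("utf-16-be")       # two bytes, big-endian, no BOM
utf16_le = a.encode("utf-16-le")       # two bytes, little-endian, no BOM
utf16_bom = a.encode("utf-16")         # BOM first, then 'a' in native byte order

# A code point outside the Basic Multilingual Plane (arbitrary example).
emoji = "\U0001F600"
emoji_utf16 = emoji.encode("utf-16-be")  # surrogate pair, four bytes
emoji_utf8 = emoji.encode("utf-8")       # four bytes as well

# The byte 0x61 interpreted as EBCDIC (code page 037, chosen as an assumption).
ebcdic_char = bytes([0x61]).decode("cp037")

for label, data in [
    ("ascii", ascii_bytes),
    ("utf-8", utf8_bytes),
    ("utf-16-be", utf16_be),
    ("utf-16-le", utf16_le),
    ("utf-16 with BOM", utf16_bom),
    ("emoji utf-16-be", emoji_utf16),
    ("emoji utf-8", emoji_utf8),
]:
    print(f"{label:16} {data.hex(' ')}")

print("0x61 in cp037:", repr(ebcdic_char))
```

The UTF-16 output for 'a' contains a 0x00 byte, which is why a program that reads plain bytes and expects ASCII cannot handle UTF-16 text as-is, whereas the UTF-8 output is byte-for-byte the same as ASCII.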

jkl was making fun of me (understandably) for being snarky.

Well. OP, congrats! You initiated the first tiny flamewar on these forums. Now we can stop.
jkl
Just when it starts to be interesting... ;o)



