Talk:Code page

Code page vs Encoding

After reading the article, I still do not understand how a code page and an encoding differ, as the article claims they do. --Voidvector (talk) 13:49, 31 August 2008 (UTC)


Reply: think of a codepage as a numbered list of characters, and an encoding as the way those numbers are stored as bytes.
For instance, the Unicode character set has a trademark symbol at position 8482 (2122 hex). So the codepage simply says: 8482 -> TM.
If this is encoded as UTF-32, it is a 32-bit word with value 8482. If it is encoded as UTF-16LE, it is two bytes with values 34 and 33 (22 and 21 hex).

8-bit codepages don't have different encodings: a byte is a byte. So if a codepage has a TM at position 153 for instance, that means the encoding is the value 153 for that character. So the encoding matches the codepage listing byte for byte.
Pim 2 (talk) 15:00, 22 May 2009 (UTC)
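
The byte values quoted in the reply above can be checked with a short Python sketch. It relies only on Python's built-in codecs. The reply does not name a specific 8-bit codepage, so the use of Windows-1252 (`cp1252`) as the one with the trademark sign at position 153 is an assumption made for illustration.

```python
# Code point versus stored bytes for the trademark sign.
tm = "\u2122"  # TRADE MARK SIGN

code_point = ord(tm)                  # position in the Unicode character set
utf32_bytes = tm.encode("utf-32-le")  # one 32-bit word, little-endian byte order
utf16_bytes = tm.encode("utf-16-le")  # one 16-bit unit, little-endian byte order
cp1252_bytes = tm.encode("cp1252")    # assumed example of an 8-bit codepage

print(code_point, hex(code_point))
print(list(utf32_bytes))
print(list(utf16_bytes))
print(list(cp1252_bytes))
```

The same character yields three different byte sequences, which is the sense in which the reply separates the character's position from the way it is stored.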


I totally agree with Voidvector: the difference between a code page and a character encoding is still not clear, even with Pim 2's explanation. A "character encoding" is defined as any number of pairs { character + code }, which is at odds with Pim 2's statement that "the encoding matches the codepage listing byte for byte". So code page = character encoding; only the name is different. Sandrarossi (talk) 10:17, 6 August 2009 (UTC)

I have to disagree with Pim. A code page doesn't just specify a character set; it specifies how this set is encoded as well. For example, code page 932 doesn't just specify the JIS character set, but also how it is encoded with single bytes, lead bytes and trail bytes. For a more modern example, different encodings of Unicode have been assigned code page numbers. And old code pages can be retroactively considered to be encodings of subsets of Unicode as well. -- Preceding unsigned comment added by 82.139.82.82 (talk) 15:13, 21 November 2015 (UTC)
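
As a rough illustration of the code page 932 point above, the following Python sketch encodes three characters with Python's built-in `cp932` codec and reports how many bytes each one takes. The sample characters are chosen for illustration and are not taken from the comment.

```python
# Encode single-byte and double-byte characters with code page 932.
samples = ["A", "\uff71", "\u3042"]  # Latin A, half-width katakana A, hiragana A

encoded = {ch: ch.encode("cp932") for ch in samples}
for ch, raw in encoded.items():
    # A two-byte result consists of a lead byte followed by a trail byte.
    print(f"U+{ord(ch):04X} -> {raw.hex()} ({len(raw)} byte(s))")
```

The first two characters are stored as single bytes, while the hiragana character is stored as a lead byte plus a trail byte, so the code page number implies a storage scheme as well as a character list.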
I added a link to the relevant section of Character encoding from the intro. -- Beland (talk) 05:35, 25 July 2020 (UTC)

MIK

MIK is almost certainly Code Page 879. ISO 8859-11 is almost certainly Code Page 873.

Code page 854

The Spanish code page 854 is not from IBM, but what was the code page layout? IBM's code page 854 was probably DOS Latin 4, continuing the sequence created by code pages 852, 853, and 855. Alexlatham96 (talk) 20:23, 12 May 2020 (UTC)

There was a DOS codepage for Spanish/Catalan that added À Á È Í Ï Ò Ó Ú Ŀ ŀ; it was supported by Wyse terminals, but I don't know the exact layout. 178.49.152.92 (talk) 06:41, 10 June 2023 (UTC)

Notability of individual articles

Given the decision at Wikipedia:Articles for deletion/Code page 875 to move nearly all articles on EBCDIC code pages to Wikibooks, are there other articles linked from this page that should be moved as well? -- Beland (talk) 17:05, 20 July 2020 (UTC)

Conflict between Microsoft and IBM codepage 1200

In the Microsoft part, it says:

1200 – UTF-16LE Unicode (little-endian)

1201 – UTF-16BE Unicode (big-endian)

In the IBM part, it says:

1200 – UTF-16BE Unicode (big-endian) with PUA

1201 – UTF-16BE Unicode (big-endian)

1202 – UTF-16LE Unicode (little-endian) with IBM PUA

1203 – UTF-16LE Unicode (little-endian)

These definitions contradict each other: 1200 is little-endian in the Microsoft list but big-endian in the IBM list, even though both lists give 1201 as big-endian.

So, what is this mess? 77.159.196.124 (talk) 13:41, 29 August 2022 (UTC)
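
To make the practical effect of the mismatch concrete, here is a hedged Python sketch. The two dictionaries simply transcribe the lists quoted above into Python codec names; they are hypothetical lookup tables for illustration, Python does not define these code page numbers itself, and the IBM PUA variants are treated as plain UTF-16 here.

```python
# Code page numbers as quoted in the two lists above, mapped to Python codec names.
# Illustrative tables only; the "with PUA" distinction is ignored in this sketch.
MICROSOFT = {1200: "utf-16-le", 1201: "utf-16-be"}
IBM = {1200: "utf-16-be", 1201: "utf-16-be", 1202: "utf-16-le", 1203: "utf-16-le"}

raw = bytes([0x22, 0x21])  # two bytes labelled only as "code page 1200"

as_microsoft = raw.decode(MICROSOFT[1200])  # read as little-endian
as_ibm = raw.decode(IBM[1200])              # read as big-endian

print(f"Microsoft 1200: U+{ord(as_microsoft):04X}")
print(f"IBM 1200:       U+{ord(as_ibm):04X}")
```

Under these assumptions the same two bytes decode to different code points depending on whose definition of 1200 is applied.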

IBM PUA mapping

Where can I find the full IBM PUA mapping? For example, code page 1056 has many PUA characters. Alexlatham96 (talk) 03:17, 1 May 2023 (UTC)

"page numbers in the IBM standard character set manual"

Until somebody can come up with a specific reference to this manual, this should be regarded as apocryphal. Note discussion at https://retrocomputing.stackexchange.com/questions/14780/is-the-ibm-standard-character-set-manual-around MarkMLl (talk) 20:59, 31 December 2023 (UTC)