Talk:8b/10b encoding


Lowercase 'b' vs. Uppercase 'B'

Should this be "8b/10b" instead of "8B/10B" since little b means "bit"?--SFoskett 14:47, Sep 2, 2004 (UTC)

That's a good question. The 10GE FC draft spec refers to it as 8B/10B, but the PCI Express spec refers to it as 8b/10b. However this article gets named, we need a redirect from the other, IMO. MatthewWilcox 14:54, 24 Nov 2004 (UTC)

Finally renamed as most of the external links appear to use the lower-case variant. --Sladen 05:56, 27 August 2007 (UTC)

I have also studied a 4b/5b encoding. Is 8b/10b the same thing, applied to a byte (8 bits)? I can't find 4b/5b on the Line code page. Maybe it is also known by some other name.--Luc4 Fri Sep 23 12:14:02 2005 (CEST)

I'm not especially familiar with 4b/5b encoding, but from what I gather here it appears to be somewhat similar but simpler. Aluvus 17:27, 16 June 2006 (UTC)

4b/5b is listed on that page... just as 4b5b. And yes, 4b/5b is similar but simpler. Mrand 18:26, 16 June 2006 (UTC)

8b/10b coding table

For eventual cut & paste into main article. (Go ahead and delete this section from talk if/when that's done.)

The 8 input bits of the 8B/10B code are conventionally identified by upper-case letters A through H, with A the low-order bit and H the high-order bit. Thus, in standard binary notation, the byte is HGFEDCBA. The output is 10 bits, abcdei fghj, where a is transmitted first.
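
A minimal sketch of the bit naming described above, in Python. The function name `symbol_name` is made up for illustration and does not come from any specification; it only splits a byte HGFEDCBA into its EDCBA and HGF fields and formats the usual D.x.y name.

```python
def symbol_name(byte: int) -> str:
    """Return the D.x.y name of a data byte written as HGFEDCBA."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("expected a value in 0..255")
    x = byte & 0x1F          # EDCBA: low five bits, input to the 5b/6b code
    y = (byte >> 5) & 0x07   # HGF: high three bits, input to the 3b/4b code
    return f"D.{x:02d}.{y}"


print(symbol_name(0x31))  # D.17.1
print(symbol_name(0xBC))  # D.28.5
```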

Coding is done in alternating 5-bit and 3-bit sub-blocks. As coding proceeds, the encoder maintains a "running disparity" (RD), the difference between the number of 1 bits and 0 bits transmitted. At the end of each sub-block, this disparity is ±1, and can be represented by a single state bit. The input and the running disparity bit are used to select the coded bits as follows:

5b/6b code
input	EDCBA	abcdei (RD = −1)	abcdei (RD = +1)	input	EDCBA	abcdei (RD = −1)	abcdei (RD = +1)
D.00	00000	100111	011000	D.16	10000	011011	100100
D.01	00001	011101	100010	D.17	10001	100011
D.02	00010	101101	010010	D.18	10010	010011
D.03	00011	110001		D.19	10011	110010
D.04	00100	110101	001010	D.20	10100	001011
D.05	00101	101001		D.21	10101	101010
D.06	00110	011001		D.22	10110	011010
D.07	00111	111000	000111	D.23	10111	111010	000101
D.08	01000	111001	000110	D.24	11000	110011	001100
D.09	01001	100101		D.25	11001	100110
D.10	01010	010101		D.26	11010	010110
D.11	01011	110100		D.27	11011	110110	001001
D.12	01100	001101		D.28	11100	001110
D.13	01101	101100		D.29	11101	101110	010001
D.14	01110	011100		D.30	11110	011110	100001
D.15	01111	010111	101000	D.31	11111	101011	010100
				K.28		001111	110000

3b/4b code
input	HGF	fghj (RD = −1)	fghj (RD = +1)
D.x.0	000	1011	0100
D.x.1	001	1001
D.x.2	010	0101
D.x.3	011	1100	0011
D.x.4	100	1101	0010
D.x.5	101	1010
D.x.6	110	0110
D.x.P7	111	1110	0001
D.x.A7	111	0111	1000

Note that, while in most cases, the codes that depend on RD change it by ±2, the encodings of D.07 and D.x.3 do not change it.

There are two encodings for D.x.7, "primary" and "alternate". The alternate code is used whenever using the primary code would result in bits eifgh being all the same, namely after D.17, D.18 and D.20 when RD=−1, and after D.11, D.13 and D.14 when RD=+1. This ensures that five consecutive identical bits never appear within a normal output symbol.
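
The selection rules above can be sketched as a small Python encoder for data bytes only (no K symbols). The table strings are transcribed from the 5b/6b and 3b/4b tables on this page; the names `CODE_5B6B`, `CODE_3B4B`, `ALT_7`, `pick`, `next_rd` and `encode_byte` are hypothetical helpers, and the primary/alternate choice is implemented directly as the "eifgh all the same" test rather than as a list of symbols. Treat it as an illustration of the tables, not a reference implementation.

```python
# Index = EDCBA value. "x/y" means x for RD = -1 and y for RD = +1;
# a single entry is used for both RD values.
CODE_5B6B = """
100111/011000 011101/100010 101101/010010 110001
110101/001010 101001 011001 111000/000111
111001/000110 100101 010101 110100
001101 101100 011100 010111/101000
011011/100100 100011 010011 110010
001011 101010 011010 111010/000101
110011/001100 100110 010110 110110/001001
001110 101110/010001 011110/100001 101011/010100
""".split()

# Index = HGF value; index 7 holds the primary code D.x.P7.
CODE_3B4B = "1011/0100 1001 0101 1100/0011 1101/0010 1010 0110 1110/0001".split()
ALT_7 = "0111/1000"  # D.x.A7


def pick(entry: str, rd: int) -> str:
    """Select the RD = -1 or RD = +1 variant of a table entry."""
    variants = entry.split("/")
    return variants[0] if rd < 0 or len(variants) == 1 else variants[1]


def next_rd(rd: int, bits: str) -> int:
    """RD is unchanged by a balanced sub-block and flipped by any other."""
    return rd if bits.count("1") == bits.count("0") else -rd


def encode_byte(byte: int, rd: int) -> tuple[str, int]:
    """Encode one data byte; returns (abcdeifghj, RD after the symbol)."""
    x, y = byte & 0x1F, (byte >> 5) & 0x07
    six = pick(CODE_5B6B[x], rd)
    rd = next_rd(rd, six)
    four = pick(CODE_3B4B[y], rd)
    # Use the alternate code if e, i, f, g, h would otherwise all be equal.
    if y == 7 and len(set(six[4:] + four[:3])) == 1:
        four = pick(ALT_7, rd)
    return six + four, next_rd(rd, four)
```

For example, `encode_byte(0x31, -1)` gives `('1000111001', -1)`, which is D.17.1, and `encode_byte(0xF2, -1)` (D.18.7) selects the alternate code and gives `('0100110111', 1)`.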

Using an additional 6B output value, called K.28, and/or the alternate D.x.A7 output in contexts where it would not be otherwise required, an additional 12 non-data "control symbols" can be formed. Some of these contain a "comma sequence" abcdeifg = 00111110 or 11000001, which never appears anywhere else in the bit stream, and can be used to establish the byte boundaries in the data stream.

Note that the comma sequence includes two leading identical bits, because five consecutive identical bits can appear as part of normal output when straddling two symbols. In particular, this happens when D.x.A7 is followed by a symbol that begins with two identical bits, such as D.03, D.11, D.12, D.19, D.20, D.28 or K.28. However, D.x.A7 is always preceded by a non-identical bit, so the patterns formed are ifghjabc = 10111110 or 01000001.

The control symbols are used by higher-level protocols to indicate boundaries or gaps (idle time) in the data stream. The names are assigned because their encoding is a slight variant of the corresponding D.x.y codes.

Control symbols
input	abcdei fghj (RD = −1)	abcdei fghj (RD = +1)
K.28.0	001111 0100	110000 1011
K.28.1*	001111 1001	110000 0110
K.28.2	001111 0101	110000 1010
K.28.3	001111 0011	110000 1100
K.28.4	001111 0010	110000 1101
K.28.5*	001111 1010	110000 0101
K.28.6	001111 0110	110000 1001
K.28.7*	001111 1000	110000 0111
K.23.7	111010 1000	000101 0111
K.27.7	110110 1000	001001 0111
K.29.7	101110 1000	010001 0111
K.30.7	011110 1000	100001 0111

(* K.28.1, K.28.5, and K.28.7 are comma symbols, containing the comma sequence abcdeifg = 00111110 or 11000001. K.28.7 must not appear after another K.28.7, or it would form a second, false comma sequence.) --The preceding unsigned comment was added by 192.35.100.1 (talk • contribs).
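
A short Python check of the comma claims, using only the RD = −1 column of the control symbol table above. The names `CONTROL_RD_MINUS`, `COMMAS` and `complement` are invented for this sketch.

```python
# abcdeifghj for RD = -1, copied from the control symbol table above.
CONTROL_RD_MINUS = {
    "K.28.0": "0011110100", "K.28.1": "0011111001", "K.28.2": "0011110101",
    "K.28.3": "0011110011", "K.28.4": "0011110010", "K.28.5": "0011111010",
    "K.28.6": "0011110110", "K.28.7": "0011111000", "K.23.7": "1110101000",
    "K.27.7": "1101101000", "K.29.7": "1011101000", "K.30.7": "0111101000",
}
COMMAS = ("00111110", "11000001")  # abcdeifg patterns named in the text


def complement(bits: str) -> str:
    """Invert every bit; in the table this maps the RD = -1 column to RD = +1."""
    return bits.translate(str.maketrans("01", "10"))


# Symbols whose first eight bits abcdeifg form a comma sequence.
comma_symbols = [name for name, code in CONTROL_RD_MINUS.items()
                 if code[:8] in COMMAS]
print(comma_symbols)  # ['K.28.1', 'K.28.5', 'K.28.7']

# K.28.7 is balanced, so a second K.28.7 is again sent with RD = -1.
# The pair then contains the other comma pattern at a misaligned offset.
pair = CONTROL_RD_MINUS["K.28.7"] * 2
print(pair.find("11000001"))  # 5
```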

Wow, you put a lot of work into that. However, I think it may be a little excessive for an encyclopedic article. It is a great reference, though. Maybe link to it from the main article and put it in Wikisource or something like that? -- RevRagnarok 12:04, 17 August 2006 (UTC)

I think they are absolutely appropriate for an encyclopedia! Cburnett 03:29, 19 December 2006 (UTC)

I think it's worth noting that, as of 2023-04-25, the table in the article is WRONG. The table here is correct, though. Until whichever joker broke the main page finds this. Aldenrw (talk) 18:46, 25 April 2023 (UTC)

Running Disparity (RD)

At least one part of this more detailed explanation should be included in the article: the discussion of Running Disparity. This abbreviation (RD) is given in the article, but it is never explained (note: RD is not yet defined in the above discussion either, though it is pretty obvious what it refers to). Also, I liked the following external links; should they be added? http://www.xilinx.com/ipcenter/catalog/logicore/docs/encode_8b10b.pdf, http://www.xilinx.com/ipcenter/catalog/logicore/docs/decode_8b10b.pdf DaraParsavand 18:39, 9 January 2007 (UTC)

I strongly agree about adding an explanation of Running Disparity (RD). Not only is the word used in the article but also the abbreviation in the table, and it is never explained. Actually, I came here (to the discussion page) in the hope of finding just this. IMHO the article is already quite technical, and adding this information seems appropriate and gives a good indication of "how" 8B/10B "works". MJost 13:22, 10 January 2007 (UTC)

I've added something, as the Running Disparity (and DC-freeness) is the chief reason that brought me to 8b/10b encoding in the first place. The explanation needs some work to point out that only −ve/0/+ve disparity needs to be stored, as even though the slew can be −2/−1/0/+1/+2, the direction of the next code will always take it towards zero. Sladen 14:12, 19 August 2007 (UTC)

Merge from Fibre Channel 8B/10B encoding

I propose a merge, since Fibre Channel 8B/10B encoding contains the same information (maybe in friendlier form). Nothing specific to Fibre Channel, really. --Kubanczyk 14:58, 22 August 2007 (UTC)

Agreed. They are the same. --Mrand 02:03, 27 August 2007 (UTC)

However, the use of the control characters differs from implementation to implementation. Fibre Channel uses K28.5 at the beginning of every Ordered Set, while the other control chars are unused. Maybe that should be rolled into the main article?

Either Fibre Channel uses a different scheme that I don't understand ("even/odd disparity") or it is plain wrong. If the latter, I think it could be deleted. I did change the link in the "§fiber channel" page to point here. --Mschnell 14:05, 9 October 2007 (UTC)

I do not think it is a good idea to merge the Fibre Channel article with this page. 8b/10b was initially invented for Fibre Channel, but I believe that it's used more generally now for serial communication. It is used in many other applications (listed in the article). It's a good idea to keep the link between the two articles, though. --Rahul --Preceding unsigned comment added by 63.119.227.6 (talk) 18:49, 16 July 2008 (UTC)

IMHO, Disparity not zero after two Blocks

The article says "This means that there are just as many "1"s as "0"s in a string of two symbols". I don't see how this is possible, e.g. when encoding

0x00, 0x31, 0x31, 0x31, 0x31, 0x31, ...

0x00 introduces a disparity and 0x31 (=D17.1 => 1000111001) can't change it.

So the "relative disparity" is guaranteed to reach zero only with an infinite number of characters.

In fact the running disparity never is 0. It starts with either -1 or +1 and with any 6 or 4 bit block it gets incremented by either of -2, 0, or +2, the increment's sign being opposite to the value's.

A similar (but supposedly correct) version of the statement would be "This means that the difference between "1"s and "0"s in a string of at least 10 bits is no more than 2". --Mschnell 07:35, 20 September 2007 (UTC)
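
A quick way to check examples like this one is to sum ones minus zeros over whole symbols. The sketch below hard-codes three symbols taken from the coding table earlier on this page, all for a starting RD of −1; the helper name `running_sums` is made up. With those table entries, 0x00 (D.00.0) itself comes out balanced, so 0x20 (D.00.1) is used here as an assumed substitute to illustrate the same point: once an unbalanced symbol has been sent, a run of balanced symbols such as 0x31 never cancels it.

```python
def running_sums(symbols: list[str]) -> list[int]:
    """Cumulative (ones - zeros) after each 10-bit symbol, starting from 0."""
    total, sums = 0, []
    for symbol in symbols:
        total += symbol.count("1") - symbol.count("0")
        sums.append(total)
    return sums


D00_0 = "1001110100"  # 0x00 at RD = -1: 100111 then 0100
D00_1 = "1001111001"  # 0x20 at RD = -1: 100111 then 1001
D17_1 = "1000111001"  # 0x31: 100011 then 1001, the same for either RD

print(running_sums([D00_0] + [D17_1] * 4))  # [0, 0, 0, 0, 0]
print(running_sums([D00_1] + [D17_1] * 4))  # [2, 2, 2, 2, 2]
```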

Question : consider the following sequence D14.A7, D13.5, D20.2 (Initial RD = +1)

code	Initial RD	6 bit code `abcdei`	mid RD	4 bit code `fghj`	Final RD
D14.A7	+1	`011100`	+1	`1000`	-1
D13.5	-1	`101100`	-1	`1010`	-1
D20.2	-1	`001011`	-1	`0101`	-1

transmit sequence a first	`011100 1000 101100 1010 001011 0101`
bits 25:6	`....00 1000 101100 1010 0010.. ....`

This arbitrary 20 bit substring has 7 ones and 13 zeroes.

IMHO this is not consistent with the statement in the first paragraph of the article : "This means that the difference between the counts of ones and zeros in a string of at least 20 bits is no more than two"

The statement appears true for 20 bits ending on a symbol boundary (as it would be for any sequence ending on an abcdei or fghj boundary with an integral number of symbols). With that interpretation, 20 bits would have no special meaning. As written, the statement can reasonably be interpreted to include the counterexample with unaligned bits. I have not seen this property stated for 8b10b elsewhere and would appreciate an expert restating this (or else explaining my misunderstanding).

(Note: newbie both to Wikipedia edits and the details of 8b10b encoding ) --kmrbrierley 07:56, 16 November 2015 (UTC)

IMHO, Table "Effect of Running Disparity" hard to understand and not necessary

1) In table "5B/6B code" the input RD is noted as "RD=-2" and "RD=+2". IMHO RD is always either +1 or -1, so here this should be the same as in table "3B/4B code": "RD=-1" and "RD=+1". I think it would be best to just say "RD=-" and "RD=+" in both tables.

Moreover it could be helpful to add the Disparity effect (-2, 0 or +2) with any result code entry in both tables:

3b/4b code
input	HGF	fghj (RD = -1)	fghj (RD = +1)
D.x.0	000	1011 (Dis=+2)	0100 (Dis=-2)
D.x.1	001	1001 (Dis= 0)
D.x.2	010	0101 (Dis= 0)
D.x.3	011	1100 (Dis= 0)	0011 (Dis= 0)
D.x.4	100	1101 (Dis=+2)	0010 (Dis=-2)
D.x.5	101	1010 (Dis= 0)
D.x.6	110	0110 (Dis= 0)
D.x.P7	111	1110 (Dis=+2)	0001 (Dis=-2)
D.x.A7	111	0111 (Dis=+2)	1000 (Dis=-2)

2) As the disparity is handled equally when starting a 6 bit block and when starting a 4 bit block, you don't need to handle the disparity of complete 10 bit blocks at all.

So the table "Effect of Running Disparity" could be replaced by

Rules for Running Disparity
Previous RD	RD of 6 or 4 Bit Code	Next RD
-	-	Error
-	0	-
-	+	+
+	-	-
+	0	+
+	+	Error

In the text you could state that for each byte to be coded the rule is applied to the 5B/6B part first and to the 3B/4B part afterwards with the RD resulting then to be the carry to the next byte operation. --Mschnell 15:26, 19 September 2007 (UTC)
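
The proposed rule table translates almost line for line into code. This is only a sketch of the proposal; `next_rd` and `check_symbol` are hypothetical names, RD is represented as −1 or +1, and a `ValueError` stands in for the "Error" rows.

```python
def next_rd(prev_rd: int, code: str) -> int:
    """Apply the rule table to one 6-bit or 4-bit code; RD is -1 or +1."""
    d = code.count("1") - code.count("0")
    if d == 0:
        return prev_rd                   # balanced code: RD unchanged
    if (d > 0) == (prev_rd > 0):
        raise ValueError(f"disparity error: RD {prev_rd:+d} followed by {code}")
    return -prev_rd                      # unbalanced code flips RD


def check_symbol(prev_rd: int, symbol: str) -> int:
    """Check a 10-bit symbol abcdeifghj: 5b/6b part first, then 3b/4b part."""
    mid_rd = next_rd(prev_rd, symbol[:6])
    return next_rd(mid_rd, symbol[6:])
```

For instance, `check_symbol(-1, "1001110100")` returns −1, while `check_symbol(-1, "1101101110")` raises, because both sub-blocks carry more ones than zeros.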

IMHO, "five identical bits must not appear in normal code" is not true

When generating e.g. D.18.7, D.19.7 (i.e. D.18.A7, D.19.P7) after RD=-1, the bit sequence will be 010011 0111 110010 0001. Here we do have five identical bits. But we don't have one of the dedicated commas (with two zero bits before the five ones, which is not possible with D.x.A7) --Mschnell 16:51, 19 September 2007 (UTC)
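
The run of five can be confirmed mechanically. The sketch below uses the bit sequence quoted in the comment; `longest_run` is an illustrative helper built on `itertools.groupby`, and the two comma patterns are the abcdeifg values given earlier on this page.

```python
from itertools import groupby


def longest_run(bits: str) -> int:
    """Length of the longest run of identical bits in a '0'/'1' string."""
    return max((len(list(group)) for _, group in groupby(bits)), default=0)


stream = "010011" "0111" "110010" "0001"   # D.18.A7 then D.19.P7, RD = -1

print(longest_run(stream))                 # 5
print("00111110" in stream, "11000001" in stream)  # False False
```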

I agree, as I understand it, and furthermore the assertion that the running disparity never exceeds the range -2 .. +2 is untrue since several of the zero-disparity codes start 00 or 11 (the former giving -3 disparity if -1 initially, the latter +3 if +1) MarkTillotson (talk) 15:08, 13 September 2013 (UTC)
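
To see the bit-by-bit excursion described here, one can track the disparity after every bit instead of after every sub-block. This is a sketch with an invented helper, `excursion`; the two codes are the balanced entries for D.12 and D.19 in the 5b/6b table on this page.

```python
def excursion(bits: str, start_rd: int) -> tuple[int, int]:
    """Lowest and highest bit-level disparity reached while sending bits."""
    level = lowest = highest = start_rd
    for bit in bits:
        level += 1 if bit == "1" else -1
        lowest, highest = min(lowest, level), max(highest, level)
    return lowest, highest


print(excursion("001101", -1))  # D.12 from RD = -1: (-3, -1)
print(excursion("110010", +1))  # D.19 from RD = +1: (1, 3)
```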

For a lay person such as myself, this statement is a bit confusing: "This means that the difference between the count of ones and zeros in a string of at least 20 bits is no more than two, and that there are not more than five ones or zeros in a row."

Without knowing any of the technical details regarding the veracity of the statement, it's difficult to propose an alternative, but perhaps something like: "the difference between the count of ones as compared with the count of zeros" or "the difference between the number of ones and the number of zeros" would be better. Also, separating the last bit into its own sentence may be worthwhile. 108.222.212.63 (talk) 15:58, 10 June 2014 (UTC)

Hm, the only thing that to me seems wrong with "the difference between the count of ones and zeros" is the singular form of "count". Will get it corrected, and it should be a bit better, if you agree. -- Dsimic (talk | contribs) 15:04, 13 June 2014 (UTC)

IMHO, a separate table for K-symbols is not necessary

We had better extend the 3b/4b table appropriately (4 result columns: D-, D+, K-, K+). --Mschnell 15:31, 24 September 2007 (UTC)

After I got no response here, I did most of the changes I suggested. Sorry for forgetting to log in first :( --Mschnell 09:04, 25 September 2007 (UTC)

Ideas for article improvement

Howdy all, the page is looking considerably better than it did not too long ago... good job everyone! Looking over it, here are a few thoughts I had:

Any useful content (or simpler phrasing) in the Fibre Channel 8B/10B encoding article should be merged in and that article set to redirect here
The "How it works" section should probably be moved down below the "Technologies that use 8b/10b" so that the encoding tables can be made to be a subsection of "how it works".
While the concept described in the sentence about the Ethernet transformer is correct, the overall statement looks technically wrong since 1000Base-T doesn't use 8b/10b.
Lastly, it seems like the intro could be re-written to be somewhat easier for laymen to understand.

Anyone interested, please feel free to be bold and do any (or all) of these! --Mrand 14:12, 26 September 2007 (UTC)

While the first issue is beyond my knowledge and the last is hard for me to do as I'm not a native English speaker, I'll take a look at the two others tomorrow. --Mschnell 20:32, 27 September 2007 (UTC)

Differences with TMDS

According to the author the difference between IBM 8b10b and TMDS is subtle. But in my opinion the differences are significant:

TMDS selects between 3 different encoding schemes depending on audio/video/control data content in a particular time. There are also differences in encoding between TMDS data channels.
IBM 8b10b uses 5b6b and 3b4b subblocks while TMDS has a dedicated algorithm during the video data period.
TMDS has special coding for q_out[8] and q_out[9].
IBM 8b10b has much tighter control over the DC balance, while TMDS introduces the concept of a counter that tracks the disparity.

In my eyes the only thing in common is the translation from 8 bits to 10 bits.

What do you think about changing this? --Preceding unsigned comment added by 83.83.112.2 (talk) 21:30, 12 July 2008 (UTC)

It's a wiki! You are clearly better informed than the information presently available in the article. Please, dive in and update the article; or add web addresses for any references you have regarding the differences and somebody can help you or try to update the article themselves based on those new references. Once again, many thanks for taking the time to get involved! --Sladen (talk) 21:58, 12 July 2008 (UTC)

The paragraph on the subject at the moment is not very good. What about "Note that while most uses of the term 8b/10b refer to this particular code, this name is sufficiently generic that it is sometimes applied to other, incompatible codes that also expand 8 bits to 10 bits. One such example is Transition Minimized Differential Signaling, which is not related to the 8b/10b code described in this article."? Larry Doolittle (talk) 22:31, 22 June 2010 (UTC)

Running Disparity confusion

I'm by no means an expert on 8b/10b, but after doing some digging on the following statement it seems to be reversing cause and effect:

Obviously, if the six- or four-bit code has equal numbers of ones and zeros, there is no choice to make, as the disparity would be unchanged, considering the following exceptions. RD is positive at the end of the six-bit sub-block if the six-bit sub-block is 000111, and RD is positive at the end of the four-bit sub-block if the four-bit sub-block is 0011. RD is negative at the end of the six-bit sub-block if the six-bit sub-block is 111000, and RD is negative at the end of the four-bit sub-block if the four-bit sub-block is 1100.

To me, this reads as "Regardless of what value RD has, it will be positive after using 000111 or 0011 and negative after 111000 or 1100". If I'm not mistaken, the desired statement is "Even though the disparity of 000111 is 0, if RD is positive use 000111 or 0011, if it's negative use 111000 or 1100, and in both cases leave RD unchanged." I've gone back through the edits and it seems this general wording has been in place since 2012, so while I'm pretty sure my understanding is correct, I'm not quite 100%. I'm going to go ahead and change the wording of the article, but I'm leaving this comment here so somebody who knows better than I do is more likely to double check it.

Rezurok (talk) 22:11, 27 September 2016 (UTC)

External links modified

Hello fellow Wikipedians,

I have just modified one external link on 8b/10b encoding. Please take a moment to review my edit. If you have any questions, or need the bot to ignore the links, or the page altogether, please visit this simple FaQ for additional information. I made the following changes:

Replaced archive link https://web.archive.org/web/*/http://www.knowledgetransfer.net/dictionary/Storage/en/8b10b_encoding.htm with https://web.archive.org/web/20140608160037/http://www.knowledgetransfer.net:80/dictionary/Storage/en/8b10b_encoding.htm on http://www.knowledgetransfer.net/dictionary/Storage/en/8b10b_encoding.htm


Cheers. --InternetArchiveBot 14:08, 30 September 2016 (UTC)

Counting in binary

I would like to know how to count in binary. How do numbers look in binary, as opposed to letters? Also, how do you translate pictures into binary? -- Preceding unsigned comment added by 174.131.36.41 (talk) 18:36, 14 December 2017 (UTC)

Error in 3b/4b table

As of 2018 June 20, I think there's an error in the 3b/4b table. Unless I'm misunderstanding something, all of the RD+ and RD- columns are inverted. Here's a snapshot from Table 36-2 of the IEEE 802.3 standard. I will make an edit to flip them in a few minutes. Just wanted to make a note here so people know why I'm doing this. -- Preceding unsigned comment added by 130.221.224.7 (talk) 22:39, 20 June 2018 (UTC)

The 3b/4b table appears to be wrong now. The reference link from June 20, 2018 shows an image where all of the 5b/6b entries change the disparity, so the opposite RD column is used for the 3b/4b values. The table on this talk page appears to be the correct one. With the table on the main page, it appears that disparity will always be driven away from zero instead of switching (more zeros are selected when negative instead of more ones.) -- Preceding unsigned comment added by 2600:1:B00F:B3FF:A86B:7D0D:8EAF:8713 (talk) 12:37, 23 September 2018 (UTC)

The columns were not flipped. Here is the original patent for the 8b/10b code: https://patents.google.com/patent/US4486739A/en -- Preceding unsigned comment added by 84.187.55.14 (talk) 12:11, 18 November 2018 (UTC)

I have to at least agree with the original poster that the columns seem to be flipped. If they are not, then it is definitely confusing. Calculate something simple like D27.7 with either positive or negative RD. You will get RD=-1 = 1101101110 or RD=+1 = 0010010001, which is obviously wrong. It will be corrected if you flip the columns. It could be an error in the RD calculation: I assumed it is calculated once per 10 encoded bits, which would give you the error above. If it were calculated after the first 6 bits, RD would be flipped when encoding the last 4 bits, which would give you the correct results. But there is no indication that it calculates RD over the 6 bits. -- Preceding unsigned comment added by 207.102.152.135 (talk) 03:35, 6 October 2019 (UTC)
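
The two readings discussed in this comment can be compared directly. The sketch below uses the D.27 and D.x.P7 entries from the coding table near the top of this talk page, which may differ from whatever the article showed at the time; `encode_d27_7` and its `per_sub_block` flag are invented for the comparison.

```python
SIX_D27 = {-1: "110110", +1: "001001"}   # D.27, 5b/6b table on this page
FOUR_P7 = {-1: "1110", +1: "0001"}       # D.x.P7, 3b/4b table on this page


def disparity(bits: str) -> int:
    return bits.count("1") - bits.count("0")


def encode_d27_7(rd: int, per_sub_block: bool) -> str:
    """Encode D.27.7, updating RD after the 6-bit part only if requested."""
    six = SIX_D27[rd]
    if per_sub_block and disparity(six) != 0:
        rd = -rd                          # RD flips before the 4-bit part
    return six + FOUR_P7[rd]


for rd in (-1, +1):
    wrong = encode_d27_7(rd, per_sub_block=False)
    right = encode_d27_7(rd, per_sub_block=True)
    print(rd, wrong, disparity(wrong), right, disparity(right))
```

With RD updated once per 10 bits the results are the unbalanced 1101101110 and 0010010001 quoted above; with RD updated after the 6-bit sub-block they become 1101100001 and 0010011110, each with five ones and five zeros.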

Name dropping

The article mentions, at the start, the two IBM inventors of the 8b/10b code. It then goes on to mention, twice, two other implementations and inventors. Self-promotion? -- Preceding unsigned comment added by 47.185.113.137 (talk) 15:11, 23 August 2019 (UTC)

Comma symbols or comma codes or comma sequences?

The page defines and refers to "comma symbols" in section "How_it_works_for_the_IBM_code". It refers to "comma codes" in sections "Encoding tables" and "3b/4b code (fghj)". It also refers to "comma sequences" in section "3b/4b code (fghj)". In section "Control symbols" it refers to them as "comma symbols" but also has one reference to "comma sequences", although this is for part of a symbol. I think, with the exception of the last one, all references should be to "comma symbols". More generally, "code" is used where "symbol" should be used: for example, I would argue that "K.23.7 code" should read "K.23.7 symbol" and "Code" in table "Control symbols" should read "Symbol". I'm happy to do the changes if no-one objects. Marios.agathangelou (talk) 16:57, 25 May 2020 (UTC)

3b/4b code (fghj) edits

I think there is a mistake in the 3b/4b code (fghj) table for the data.

Input		RD = −1	RD = +1
Code	HGF	f g h j
D.x.4	100	0010	1101
D.x.P7 †	111	1110	0001
D.x.A7 †	111	1000	0111

should be changed to

Input		RD = −1	RD = +1
Code	HGF	f g h j
D.x.4	100	1101	0010
D.x.P7 †	111	1110	0001
D.x.A7 †	111	0111	1000