Talk:Byte

From Citizendium
Revision as of 09:57, 13 April 2007 by imported>Greg Woodhouse (percentages and raw numbers)
Jump to navigation Jump to search


Article Checklist for "Byte"
Workgroup category or categories Computers Workgroup [Editors asked to check categories]
Article status Developed article: complete or nearly so
Underlinked article? Yes
Basic cleanup done? Yes
Checklist last edited by Joshua David Williams 21:57, 12 April 2007 (CDT); Eric M Gearhart 16:52, 6 April 2007 (CDT)

To learn how to fill out this checklist, please see CZ:The Article Checklist.





missing on purpose?

Hi, I miss info about the small and big-endian. It should IMHO be part of the byte story. Robert Tito |  Talk  20:54, 6 April 2007 (CDT)

I did not mention it because, quite honestly, that's largely outside my scope of knowledge. If you're knowledgeable in that area, we would appreciate a contribution to the article :) --Joshua David Williams 21:03, 6 April 2007 (CDT)
Edit - I did not realize you're an editor when I wrote that. If you're busy, I could find another user to help (Eric may be able to). In answer to your question, no, it was not excluded purposely - that is, to not include it at all. --Joshua David Williams 21:06, 6 April 2007 (CDT)

what is it

Big and small-endian refer to the 'sign'-bit, in big-endian it is at the end of the byte, in small at the begin (or the other way around - I still look that up). It is used in diverse protocols to discrimninate them from others. The best known IPX/PX versus TCP/IP. It took quite some problemsolving for cisco to let these two networks communicate without problem (it gave rise to their iOS version 13 and above - created when I was on the phone with them.) Signs of bytes are of importance for the variables needed to transfer specific information. Robert Tito |  Talk  21:43, 6 April 2007 (CDT)

Should this be a separate article that deserves a mention on Byte? Remember we don't want to overwhelm the average person with too much info stuffed into the Byte article --Eric M Gearhart 04:07, 7 April 2007 (CDT)

I think it is more relevant than all the prefixes as it IS info within a Byte. Robert Tito |  Talk  09:01, 7 April 2007 (CDT)

See this jargon? That's exactly what we want to avoid. I can't make heads or tails of it. Could someone please explain this in layman's terms? --Joshua David Williams 09:47, 7 April 2007 (CDT)

big and small endians are only the way to tell the machine what the sign of the byte is: signed (+) or unsigned. Signed means only positive values are allowed. unsigned means the whole range of number space can be used. If your address space allows for file sizes up to 4 GB and you use an unsigned int to address it you CAN access that space. Using a signed variable allows you to address only 2 GB. Endian types only state where that sign is stored in the byte: the low bit or the high bit. nothing more nothing less. Some compilers use predominantly big others small endian variable. windows and unix in general use the two different styles. Robert Tito |  Talk 

I don't think I'd put it that way. First of all, the sign bit is a function of how integer values are encoded, not of bytes themselves. Big endian means most significant bit first, and little endian means least significant bit first. We write numbers in big endian form because 24 is 20 + 4, and the most significant digit comes first. Of common architectures, the i386 (including the Pentium etc.) is little endian, and virtually everything else is big endian. Oh, and you might want mention the connection to "Gulliver's Travels".Greg Woodhouse 22:40, 12 April 2007 (CDT)
I think I'm finally starting to understand this concept clearly. I'm going to re-write the endianness section of the article to make it a bit clearer, especially of what the "most significant byte" is. --Joshua David Williams 22:49, 12 April 2007 (CDT)

OK I will try and work in a one-liner on Byte, something like an "also worth mentioning is whether a Byte is big-endian or little-endian" and a link to an Endianness article.. maybe with Big endian and Little endian redirecting to it.

And yea holy crap the Wikipedia article looks more like "Look at me I can write terse technical articles" rather than striving to be reachable to the masses.

To clarify on Rovert's signed versus unsigned example: You would use an "unsigned" variable for a file system, because you're only going to deal with positive numbers. You would use a "signed" (meaning has positive and negative) address space when talking about a number that can be from -2 to positive 2 (for example).

In very very simple terms, "big endian" means you're placing importance on the leftmost numbers first. "Little endian" means you're placing importance on the rightmost numbers first.

For example: Networks generally use big-endian order; the historical reason is that this allowed routing while a telephone number was being composed.

757-421-2233 is big endian, because first comes the area code (Virginia), then 421 is the prefix (Norfolk), and then the last four numbers actually get you to the specific house.

That's the type of explanation we need in the Endianness article in my opinion --Eric M Gearhart 10:43, 7 April 2007 (CDT)

Not totally true but nice as metaphor. Robert Tito |  Talk  11:29, 7 April 2007 (CDT)

bigger better?

Both LaCie and Iomega have single disk-enclosures out with disks of 1 TB below US$500. The density of the data however is that high these disks cannot be used without solid error-correction. Bigger is not always better, at most easier. Robert Tito |  Talk  09:36, 7 April 2007 (CDT)

What else should be added?

I did a bit of research on the topic of endianness and added a section for it. If anything I said is inaccurate, please correct it. Also, what else should we add? --Joshua David Williams 19:12, 12 April 2007 (CDT)

Hexer image

Should Image:Hexer.png be in this article? I'm not sure since it shows the data in hexadecimal format instead of binary. --Joshua David Williams 19:19, 12 April 2007 (CDT)

Well bytes can be represented in Hex or binary (or octal or decimal or...). I'd say that the caption of the image should reflect that "these values represent bytes in Hexadecimal." --Eric M Gearhart 20:02, 12 April 2007 (CDT)

Integers

Is this the place to discuss how signed values are encoded (i.e., one's complement vs. two's complement)? Greg Woodhouse 22:42, 12 April 2007 (CDT)

kibibyte?

I'd like to hear what other editors have to say, but kibibyte sounds like a neologism that never really gained acceptance. Certainly, I've never heard it used. A Google search did turn up an interesting page though, [1]. Apparently, there actually was a proposal circulated some years ago, but I don't know how far it went. As a general rule, powers of 2 are used for disk storage. For example a typical block size on modern filesystems is 4K, mean 4096 bytes, not 4000. On the other hand, data rates are always expressed in powers of 10. The 10 in 10base-T means 10 megabits per second, and the nominal data rate ofr 100base-T is 100 megabits per second. Greg Woodhouse 23:18, 12 April 2007 (CDT)

Okay, here you go

1541-2002

IEEE Trial-Use Standard for Prefixes for Binary Multiples

Status: Active
Publication Date: 2003
Page(s): 0_1- 4
E-ISBN: 0-7381-3386-8
ISSN: 
ISBN: 0-7381-3385-X
Year: 2003 
Sponsored by: 
   SCC14

OPAC Link: http://ieeexplore.ieee.org/servlet/opac?punumber=8450

Calling this terminology "standard" overstates things, IMO. Greg Woodhouse 23:32, 12 April 2007 (CDT)

See this page as well. --Joshua David Williams 23:34, 12 April 2007 (CDT)

Yes, I saw that, too. IEC might publish a standard, but the IEEE approach is much more, well, realistic. Truth be told, I can't even find the IEC document, so I'm not sure of its status, but I think IEC is just spelling out the meaning of some new words, should you choose to use them. At best, I think this terminology can be called experimental. Greg Woodhouse 23:49, 12 April 2007 (CDT)

So how should we deal with it in this article then? --Joshua David Williams 10:09, 13 April 2007 (CDT)
I've heard it quite a bit (hehe) in the last few months, but never before that. I wouldn't call it a standard now, but it's definitely worth mentioning because the differences are going to be very large. From what i've seen, only the "1337" are using "KiB". Andrew Swinehart 10:22, 13 April 2007 (CDT)

Differences

In a recent edit, Phillip Stewart changed the percentage of difference between a yottabyte and a yobibyte from 1.209% to 17.281%. Is this correct, a mistake, or vandalism? I used the formula (2^80)/(10^24) to calculate my number. --Joshua David Williams 00:00, 13 April 2007 (CDT)

I've reverted Phillip's version for the sake of consistency. I believe that he was incorrect. If not, please post a message regarding this. This is important information that we must know - and agree upon - when writing an article. --Joshua David Williams 00:25, 13 April 2007 (CDT)

I don't think so. See for yourself

1 - (pow(10, 24)/ (pow(2, 80)
= 0.17281938745
 
0.17281938745 * 100
= 17.281938745

so 17.2819% is right. Greg Woodhouse 00:29, 13 April 2007 (CDT)

Ah, I see my mistake now. I'll fix the table, but we'll need to check these things very carefully afterwards. --Joshua David Williams 00:32, 13 April 2007 (CDT)
I think we should use the raw numbers and not percents. IMO, they're much easier to understand and calculate. (2^10)-(10^3), (2^20)-(10^6), etc. Thoughts? Andrew Swinehart 10:38, 13 April 2007 (CDT)

I'd stick with percentages, as the point is that the differences can be substantial. (Did you see the footnote about the disk manufacturer that tried to use powers of 10 and the subsequent law suit?) Of course, if there's room, raw numbers might be a good thing to include, too. Greg Woodhouse