Bitwise

Hexadecimal

Since we're no longer working with text-based information, we use Hexadecimal. Hexadecimal means the numbers have a radix of 16, which means each digit has 16 values. Decimal is the usual convention where digits range from 0 to 9. Binary is 0 to 1. Hexadecimal counts like so: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, F, 10, 11, 12, and so on.

Hexadecimal is used and is so wonderful because each hex digit describes a particular 4-bit pattern, so 2 hex digits describe a full byte in a very concise manner! Furthermore, this is why most programmers write hexadecimal numbers in code as some even number, usually just 2 digits. Hexadecimal numbers are also typed as 0x## because in most languages if it starts as a character it's a variable, so '0x' is needed to have the compiler/interpreter know this is a number, and specifically a Hexadecimal number.

Here's a sample conversion table from a few Hexadecimal values to binary and decimal:

	
Hex
		
Binary
	
Decimal
0x00 0000 0000 0
0x01 0000 0001 1
0x02 0000 0010 2
... ... ...
0x0E 0000 1110 14
0x0F 0000 1111 15
0x10 0001 0000 16

| and ^ operators

These are logical 'or' and 'exclusive or' operations.

| Is the or operation and it takes two numbers/bytes and look at each bit in each number one at a time. If there's a 1 in either bit, a 1 is placed in the return bit. If they are both 0, a 0 is placed.

^ Is exclusive or. It also takes two numbers and looks at each bit one at a time. However, it only places a 1 in the return value if only one byte has a 1 there, it puts a '0' there if both bits are 0 or both bits are 1.

Here's an example of both:

byte a = 00101100;
byte b = 11110000;
byte bitwiseOr = a | b;
00101100
11110000
--------
11111100

byte bitwiseXOr = a ^ b;
00101100
11110000
--------
11011100

Because of the way 'or' works, you can use it to merge two bytes into one complete number:

short a = 10010111 00000000; // Decimal 38656
byte b = 11001001; // 201
short merged = a | b;
10010111 00000000
00000000 11001001
-----------------
10010111 11001001

Do note that because of the way Java assumes every number to be signed, you do need to go out of your way to make bytes unsigned before performing this operation. Check out the signed section for more info.

& operator

This is a logical 'and' operation. It takes two numbers/bytes, looks at each bit one at a time. If there's a 1 in both bits, a 1 is placed in the return number.

byte a = 01101100;
byte b = 00001111;
byte bitwiseAnd = a & b;
01101100
00001111
--------
00001100

Because of the way 'and' works it's used as a "bit mask". You can use it to query whether a specific bit in a byte is a 1 or a 0. For example, if you wanted to know the 4th bit is 1 or 0, you would write this:

byte a = 11011011;
byte mask = 0x08; // 0000 1000 == 00001000
boolean isOn = (a & b) != 0;
00001000
11011011
--------
00001000
true

You can't compare the result to '1', because the returning number is 00001000, or 8 in decimal. You could check to see if the value is equal to the mask, but checking to see if it's not 0 will work in every situation.

This function is really useful for saving memory since you can essentially have 8 boolean values in a single byte. However, don't stress about tiny memory-saving methods like this if you don't absolutely need to save every single byte of memory. Most computers have copious amounts anyway.

Bitshift

A bitshift is when you move all of the bits in a byte left or right some number of spaces or digits. It's the strangest sounding operation, but very useful for reading byte data!

It uses << to shift bits left and >> to shift right. The operator takes two numbers, the one on the left is the number to shift, and the other number being the number of times to shift the bits. When bits get shifted, 0s or 1s are added to the byte depending on if the number is signed or not. There's also the operator >>> which is moves bits right, but places 0s even if the number is signed.

byte a = 00111100;
byte b = a << 2;
original:    00111100
shift once:  01111000
shift twice: 11110000 // this is the value returned.
// Left shift operator adds 0s to the right of the byte everytime.

byte c = a >> 4;
original:   00111100
shift once  00011110
shift twice 00001111
shift 3     00000111
shift 4     00000011 // This is the value returned.
// Right shift operator adds 0s to the left of the byte if it's unsigned, 1 if it's signed.
// Notice that the '1' bits are discarded as they run out of room.

// This is equal to 195 unsigned, but since this is java, which assumes everything is signed, it's equal to -195.
byte signed = 1100 0011;
byte d = signed >> 3;
original: 11000011
shift 1:  11100001
shift 2:  11110000
shift 3:  11111000
// '1's are added because the original number was signed.

byte e = signed >>> 3;
original: 11000011
shift 1:  01100001
shift 2:  00110000
shift 3:  00011000
// '0's are added because of the unsigned operator.

In a right shift, bits are discareded as they go beyond the 0th bit. In a left shift though, the byte is "upgraded" to use 2 or more bytes of memory and the bits are kept unless explcitily discarded through a type-cast, like byte result = (byte) (a << 7);. This is covered more appropriately in the signed section.

Reading Bytes

Finally something useful in this whole page!

Thanks to java's huge native library, a super handy class for reading byte data is readily available to us!

DataInputStream has a bunch of nice functions for reading byte-data from any input stream:

// Of course, InputStream is an abstract Class with subclasses like FileInputStream.
DataInputStream(InputStream in)
DataInputStream(new FileInputStream(String path)) // Throws java.io.FileNotFoundException
DataInputStream(new FileInputStream(File file))   // Throws java.io.FileNotFoundException

/****************************
**  All methods of DataInputStream throw both
**  java.io.IOException and java.io.EOFException
*****************************
byte readByte()		// 1 Byte
short readShort()	// 2 Bytes
char readChar()		// 2 Bytes
int readInt()		// 4 Bytes
long readLong()		// 8 Bytes

// These methods return values that are unsigned.
int readUnsignedByte()	// 1 Byte
int readUnsignedShort()	// 2 Bytes

float readFloat()	// 4 Bytes
double readDouble()	// 8 Bytes

boolean readBoolean() // 1 Byte

void readFully(byte[] b)
// Places bytes from this stream directly into the array until the array is full.
// If the stream runs out of bytes, an EOFException is thrown.

void readFully(byte[] b, int offset, int length)
// Places bytes from this stream directly into the array, starting at the index indicated
// by 'offset', reading at most 'length' bytes. If (offset + length) is greater than the
// length of the array, IndexOutOfBoundsException is thrown.

int skipByte(int n)
//Ignores 'n' byte from the input stream.

Signed Bytes

Since computers cannot physically attach a negative sign to bytes, there has to be a workaround to make computers use negative values. Generally this is done by making the last bit in a byte a 1, indicating the other bits make up a negative number. This allows bytes to represent numbers from -128 to 127 instead of 0 to 255.

This is through a fancy maneuver called a "Two's Complement".

A two's complement is actually pretty simple. To write some negative number, take its positive value, subtract 1, then invert all of the bits (1s to 0s and 0s to 1s). And to figure out the value of a negative byte just reverse the process: invert all of the bits and then add 1!

// If we want to write -3...
00000011	// Step 1: write positive 3
00000010	// Step 2: Subtract 1
11111101	// Step 3: Invert bits!

// Now, what's the value of 11111111?
00000000	// Step 1: Invert all of the bits
00000001	// Step 2: Add 1
// It's negative 1!

Now, what happens when we try to "upgrade" a negative byte to use 2 or even 4 bytes of memory? When we upgrade a positive byte, we just add bytes full of 0s to the left, thus retaining the same value. If we do this with a negative number, its value suddenly changes to positive though:

byte a = 0x44; // 01000100
int upgrade = (int) a; // 00000000 00000000 00000000 01000100
// Notice it still equals 0x44, even with all of the newly added bytes.

byte b = -12; // 11110100
int upgrade = (int) b; // 00000000 00000000 00000000 11110100 (What you might think happens)
/* This is a problem. Now the value is 244! And Java accounts for this.
* When java notices a negative number and assigns it more bytes of memory, it instead puts bytes full of 1s instead!
* This will actually keep the same value for a negative number:
*

int upgrade = (int) b; // 11111111 11111111 11111111 11110100 (What actually happens)
// Just to prove this is the same value, let's figure it out:
11110100 // Original, -12
11111111 11111111 11111111 11110100 // Original now using 4 bytes of memory
00000000 00000000 00000000 00001011 // Step 1: Inverted bits
00000000 00000000 00000000 00001100 // Step 2: Add 1
//The value is 12! Therfore, the value of the original bytes are -12!

So, when a '1' is detected in the leftmost bit, java assumes the number is negative because java doesn't have strictly unsigned primitives (meaning every int, short, byte, or long can represent positive and negative numbers).

Now, why is this more of a problem than useful when dealing with byte data? Often, we write byte data one byte at a time, even if we're writing an integer that uses 4 bytes of memory. This means we read them one byte at a time and merge them back together using bitshift operators and the bitwise or operator:

int bigNumber = 0xAB18C39F;
// 10101011 00011000 11000011 10011111
// we would write out to a file 4 separate bytes:
0xAB
0x18
0xC3
0x9F

// Now, let's read them in one byte at a time:
byte first = 0xAB; // 10101011
byte second = 0x18; // 00011000
byte third = 0xC3; // 11000011
byte fourth = 0x9F; // 10011111

int merged = (first << 24) | (second << 16) | (third << 8) | (fourth);

/*
It's important to note than during a bitshift operation, java upgrades each byte to an int
(4 bytes of memory) before shifting.
So this happens:
*
// Intermediary 'upgrade'
11111111 11111111 11111111 10101011
00000000 00000000 00000000 00011000
11111111 11111111 11111111 11000011
11111111 11111111 11111111 10011111
// Performing the bitshift, and then 'or'.
10101011 00000000 00000000 00000000
00000000 00011000 00000000 00000000
11111111 11111111 11000011 00000000 
11111111 11111111 11111111 10011111
-----------------------------------
11111111 11111111 11111111 10011111 = merged;

/*
Java sees 'first', 'third', and 'fourth' as a negative number, so it adds bytes full of 1s to left when the byte gets upgraded.
Therefore, when the 'or' operation kicks in, it effectively ignores a lot of information since there are bytes full of 1s.
How do we fix this?! We perform an 'and' operation before we do a bit shift.
The 'and' operator also upgrades bytes to integers, but if we do it right we can make it drop all of the extra bytes full of 1s:
*

int merged = ((first & 0xFF) << 24) | ((second & 0xFF) << 16) | ((third & 0xFF) << 8) | (fourth & 0xFF);

// The 'and' operation:
  11111111 11111111 11111111 10101011 0xAB
  00000000 00000000 00000000 11111111 0xFF
& -----------------------------------
  00000000 00000000 00000000 10101011

  00000000 00000000 00000000 00011000 0x18
  00000000 00000000 00000000 11111111 0xFF
& -----------------------------------
  00000000 00000000 00000000 00011000
  
  11111111 11111111 11111111 11000011 0xC3
  00000000 00000000 00000000 11111111 0xFF
& -----------------------------------
  00000000 00000000 00000000 11000011

  11111111 11111111 11111111 10011111 0x9F
  00000000 00000000 00000000 11111111 0xFF
& -----------------------------------
  00000000 00000000 00000000 10011111

// The bitshift and 'or' commands now:
10101011 00000000 00000000 00000000
00000000 00011000 00000000 00000000
00000000 00000000 11000011 00000000 
00000000 00000000 00000000 10011111
-----------------------------------
10101011 00011000 11000011 10011111 = merged;

You might wonder why 0xFF doesn't have a bunch of 1s to the left of it, since its leftmost bit is a 1! It's because 0xFF is equivalent to writing 255, which is an integer using 4 bytes of memory. 255 isn't negative, so a bunch of 0s are filled in the other 3 bytes. Java assumes all numbers are integers, unless casted to a different type.

We did it!

TL;DR

// bitwise 'and'
numberA & numberB
0xF0 & 0xFF = 0xF0

// bitwise 'or'
numberA | numberB
0xF0 | 0x0F = 0xFF

// bitwise 'exclusive or'
numberA ^ numberB
0x2C ^ 0xF0 = 0xDC
00101100
11110000
--------
11011100

// bitshift Left
numberToShift << shiftAmount

// bitshift Right
numberToShift >> shiftAmount

// unsigned bitshift Right
numberToShift >>> shiftAmount