Sunday, March 27, 2011

Bit operators in Java

Today I was understanding how eventually operators >>, >>> and << work in Java.

First of all, Java uses two's complement integer representation. This representation is very fast because allows to operate with negative and positive numbers in exactly the same way.

Let's consider Java's primitive type byte. It is actually represented by single byte having 8 bits correspondingly. This type is signed and may hold values in range [-128, 127]. (Note that the only unsigned numeric type in Java is char which is used to hold numeric values of UTF-16 symbols, thus it is always positive).

Why so? The answer is in binary representation of two's complement integers:
127: 01111111
-128: 10000000

Two's complement: binary to decimal conversion

Decimal numbers are calculated from their binary representation in two's complement notation as following:
If the leftmost bit is 0 -- number is positive. So, bits 1-7 are taken and converted to decimal form as usual by formula

bit7*2^6 + bit6*2^5 + bit5*2^4+ bit3*2^2+ bit2*2^1+ bit1*2^0

If leftmost bit is 1 -- it indicates that number is negative, so the following process applies:
  1. Inverse bits [1-7]
  2. Calculate decimal number as usual by above formula
  3. Add 1 to the obtained number
  4. Assign "-" to the obtained number

Example: 11111110
The leftmost bit is 1 -- this is negative number
  1. Inverse bits [1-7]: 0000001
  2. Calculate decimal number: 1
  3. Add 1 to obtained number: 2
  4. Assign sign "-" to obtained number: -2
So, 11111110 represents -2 in two's complement notation.

Corner cases

Now let's understand corner cases using byte type as an example:
Example 1: 0-1=-1
Binary (two's complement): 00000000-00000001=11111111
So, 11111111 represents -1 in binary (two's complement single byte type)? Don't believe me, let's verify:

Converting 11111111 to decimal. Leftmost bit is negative -- number is negative
  1. Inverse: 0000000 (7 bits)
  2. Convert to decimal: 0
  3. Add 1: 0+1=1
  4. Assign "-" sign: -1

Example 2: 00000001+11111111=00000000
That's natural, 11111111=-1 => 1-1=0

Example 3:
The next case is more interesting: 01111111+00000001=10000000

This expression is totally correct in binary world. But what about decimal interpretation in two's complement notation? As you can see, 01111111 is positive, 00000001 is also positive, however, their sum is negative since 10000000 is negative number. This is the main "magic" of two's complement integers since one may become confused of this code written in Java:

byte b = (byte) (127 + 1);
System.out.println(b);

Output is: -128

Note casting to byte is important since all operations are performed in int space by default, thus the result without casting will be just 128.

If you understand the logic above -- this is not a surprise for you, just convert 10000000 to decimal number using provided algorithm.

Generalizing this information:

Integer.MAX_VALUE + 1 == Integer.MIN_VALUE
Integer.MIN_VALUE - 1 == Integer.MAX_VALUE


Shift operators

Understanding two's complement it is easy to understand bit operators in Java as well. In order to do that you just need to understand how the number looks in binary system, perform shift and convert it back to decimal. This is exactly what is going on when you write something like

System.out.println((byte) (64 << 1));
Output: -128

Really:
  • 64 = 01000000
  • 01000000 << 1 = 10000000
  • 10000000 = -128
Right shift operator or "<<" just shifts all bits to the right and fills rightmost bits with zero's.

There are two left-shift operators: ">>" and ">>>". Let's see their difference:

Difference between ">>" and ">>>" operators

Operator ">>" is called "signed" left shift, which means it preserves the sign of the number upon shifting.
Example: Integer.MIN_VALUE >> 2

Integer.MIN_VALUE = -2147483648
-2147483648 : 10000000000000000000000000000000
Integer.MIN_VALUE >> 2
-536870912 : 11100000000000000000000000000000

That's said, leftmost bits before and after shift are always the same when using ">>"

For example: 5 >> 1 = 2
5 : 00000000000000000000000000000101
2 : 00000000000000000000000000000010

Operator ">>>" or "unsigned" right shift always sets leftmost bit to zero
Example: Integer.MIN_VALUE >>> 2
-2147483648 : 10000000000000000000000000000000
536870912 : 00100000000000000000000000000000

Bit operators performance

Bit operators works more than 10 times faster than usual / and * operators.
Consider the following code
Example:

long start = System.currentTimeMillis();
  for (int i = 0; i < 100000000; i++) {
   int p = i / 2;
   p = i * 2;
  }
  System.out.println(System.currentTimeMillis() - start);
  start = System.currentTimeMillis();
  for (int i = 0; i < 100000000; i++) {
   int p = i >> 1;
   p = i << 2;
  }
  System.out.println(System.currentTimeMillis() - start);
Output is similar to this:
5089
300
That's it. As a bonus I provide a piece of code which represents Java values of type int to full binary form.
public void printNumber(int b) {
  String binaryRepresentation = Integer.toBinaryString(b);
  StringBuilder leadingZeros = new StringBuilder();
  for (int i = 0; i < Integer.numberOfLeadingZeros(b); i++) {
   leadingZeros.append("0");
  }
  System.out.println(String.format("%-12d: %s%s", b, leadingZeros, binaryRepresentation));
  
 }
}

No comments:

Post a Comment