Reference Types (Java in a Nutshell)

2.10. Reference Types

Now that we have discussed the syntax for working with objects and arrays, we can return to the issue of why classes and array types are known as reference types. As we saw in Table 2-2, all the Java primitive types have well-defined standard sizes, so all primitive values can be stored in a fixed amount of memory (between one and eight bytes, depending on the type). But classes and array types are composite types; objects and arrays contain other values, so they do not have a standard size, and they often require quite a bit more memory than eight bytes. For this reason, Java does not manipulate objects and arrays directly. Instead, it manipulates references to objects and arrays. Because Java handles objects and arrays by reference, classes and array types are known as reference types. In contrast, Java handles values of the primitive types directly, or by value.

A reference to an object or an array is simply some fixed-size value that refers to the object or array in some way.[4] When you assign an object or array to a variable, you are actually setting the variable to hold a reference to that object or array. Similarly, when you pass an object or array to a method, what really happens is that the method is given a reference to the object or array through which it can manipulate the object or array.

[4] Typically, a reference is the memory address at which the object or array is stored. However, since Java references are opaque and cannot be manipulated in any way, this is an implementation detail.

C and C++ programmers should note that Java does not support the & address-of operator or the * and −> dereference operators. In Java, primitive types are always handled exclusively by value, and objects and arrays are always handled exclusively by reference. Furthermore, unlike pointers in C and C++, references in Java are entirely opaque: they cannot be converted to or from integers, and they cannot be incremented or decremented.

Although references are an important part of how Java works, Java programs cannot manipulate references in any way. Despite this, there are significant differences between the behavior of primitive types and reference types in two important areas: the way values are copied and the way they are compared for equality.

2.10.1. Copying Objects and Arrays

Consider the following code that manipulate a primitive int value:

int x = 42;
int y = x;

After these lines execute, the variable y contains a copy of the value held in the variable x. Inside the Java VM, there are two independent copies of the 32-bit integer 42.

Now think about what happens if we run the same basic code but use a reference type instead of a primitive type:

Point p = new Point(1.0, 2.0);
Point q = p;

After this code runs, the variable q holds a copy of the reference held in the variable p. There is still only one copy of the Point object in the VM, but there are now two copies of the reference to that object. This has some important implications. Suppose the two previous lines of code are followed by this code:

System.out.println(p.x);  // Print out the X coordinate of p: 1.0
q.x = 13.0;               // Now change the X coordinate of q
System.out.println(p.x);  // Print out p.x again; this time it is 13.0

Since the variables p and q hold references to the same object, either variable can be used to make changes to the object, and those changes are visible through the other variable as well.

This behavior is not specific to objects; the same thing happens with arrays, as illustrated by the following code:

char[] greet = { 'h','e','l','l','o' };  // greet holds an array reference
char[] cuss = greet;                     // cuss holds the same reference
cuss[4] = '!';                           // Use reference to change an element
System.out.println(greet);               // Prints "hell!"

A similar difference in behavior between primitive types and reference types occurs when arguments are passed to methods. Consider the following method:

void changePrimitive(int x) {
  while(x > 0)
    System.out.println(x--);
}

When this method is invoked, the method is given a copy of the argument used to invoke the method in the parameter x. The code in the method uses x as a loop counter and decrements it to zero. Since x is a primitive type, the method has its own private copy of this value, so this is a perfectly reasonable thing to do.

On the other hand, consider what happens if we modify the method so that the parameter is a reference type:

void changeReference(Point p) {
  while(p.x > 0)
    System.out.println(p.x--);
}

When this method is invoked, it is passed a private copy of a reference to a Point object and can use this reference to change the Point object. Consider the following:

Point q = new Point(3.0, 4.5);  // A point with an X coordinate of 3
changeReference(q);             // Prints 3,2,1 and modifies the Point
System.out.println(q.x);        // The X coordinate of q is now 0!

When the changeReference() method is invoked, it is passed a copy of the reference held in variable q. Now both the variable q and the method parameter p hold references to the same object. The method can use its reference to change the contents of the object. Note, however, that it cannot change the contents of the variable q. In other words, the method can change the Point object beyond recognition, but it cannot change the fact that the variable q refers to that object.

The title of this section is "Copying Objects and Arrays," but, so far, we've only seen copies of references to objects and arrays, not copies of the objects and arrays themselves. To make an actual copy of an object or an array, you must use the special clone() method (inherited by all objects from java.lang.Object):

Point p = new Point(1,2);    // p refers to one object
Point q = (Point) p.clone(); // q refers to a copy of that object
q.y = 42;                    // Modify the copied object, but not the original

int[] data = {1,2,3,4,5};           // An array
int[] copy = (int[]) data.clone();  // A copy of the array

Note that a cast is necessary to coerce the return value of the clone() method to the correct type. The reason for this will become clear later in this chapter. There are a couple of points you should be aware of when using clone(). First, not all objects can be cloned. Java only allows an object to be cloned if the object's class has explicitly declared itself to be cloneable by implementing the Cloneable interface. (We haven't discussed interfaces or how they are implemented yet; that is covered in Chapter 3, "Object-Oriented Programming in Java".) The definition of Point that we showed earlier does not actually implement this interface, so our Point type, as implemented, is not cloneable. Note, however, that arrays are always cloneable. If you call the clone() method for a non-cloneable object, it throws a CloneNotSupportedException, so when you use the clone() method, you may want to use it within a try block to catch this exception.

The second thing you need to understand about clone() is that, by default, it is implemented to create a shallow copy of an object or array. The copied object or array contains copies of all the primitive values and references in the original object or array. In other words, any references in the object or array are copied, not cloned; clone() does not recursively make copies of the objects or arrays referred to by those references. A class may need to override this shallow copy behavior by defining its own version of the clone() method that explicitly performs a deeper copy where needed. To understand the shallow copy behavior of clone(), consider cloning a two-dimensional array of arrays:

int[][] data = {{1,2,3}, {4,5}};       // An array of 2 references
int[][] copy = (int[][]) data.clone(); // Copy the 2 refs to a new array
copy[0][0] = 99;                       // This changes data[0][0] too!
copy[1] = new int[] {7,8,9};           // This does not change data[1]

If you want to make a deep copy of this multidimensional array, you have to copy each dimension explicitly:

int[][] data = {{1,2,3}, {4,5}};       // An array of 2 references
int[][] copy = new int[data.length][]; // A new array to hold copied arrays
for(int i = 0; i < data.length; i++)
  copy[i] = (int[]) data[i].clone();

2.10.2. Comparing Objects and Arrays

We've seen that primitive types and reference types differ significantly in the way they are assigned to variables, passed to methods, and copied. The types also differ in the way they are compared for equality. When used with primitive values, the equality operator (= =) simply tests whether two values are identical (i.e., whether they have exactly the same bits). With reference types, however, = = compares references, not actual objects or arrays. In other words, = = tests whether two references refer to the same object or array; it does not test whether two objects or arrays have the same content. For example:

String letter = "o";
String s = "hello";                      // These two String objects
String t = "hell" + letter;              // contain exactly the same text. 
if (s == t) System.out.println("equal"); // But they are not equal!

byte[] a = { 1, 2, 3 };                  // An array. 
byte[] b = (byte[]) a.clone();           // A copy with identical content. 
if (a == b) System.out.println("equal"); // But they are not equal!

When working with reference types, there are two kinds of equality: equality of reference and equality of object. It is important to distinguish between these two kinds of equality. One way to do this is to use the word "equals" when talking about equality of references and the word "equivalent" when talking about two distinct object or arrays that have the same contents. Unfortunately, the designers of Java didn't use this nomenclature, as the method for testing whether one object is equivalent to another is named equals(). To test two objects for equivalence, pass one of them to the equals() method of the other:

String letter = "o";
String s = "hello";                      // These two String objects
String t = "hell" + letter;              // contain exactly the same text. 
if (s.equals(t))                         // And the equals() method 
  System.out.println("equivalent");      // tells us so.

All objects inherit an equals() method (from Object, but the default implementation simply uses = = to test for equality of references, not equivalence of content. A class that wants to allow objects to be compared for equivalence can define its own version of the equals() method. Our Point class does not do this, but the String class does, as indicated by the code above. You can call the equals() method on an array, but it is the same as using the = = operator, because arrays always inherit the default equals() method that compares references rather than array content. Starting in Java 1.2, you can compare arrays for equivalence with the convenience method java.util.Arrays.equals(). Prior to Java 1.2, however, you must loop through the elements of the arrays and compare them yourself.

2.10.3. The null Reference

We've seen the null keyword in our discussions of objects and arrays. Now that we have described references, it is worth revisiting null to point out that it is a special value that is a reference to nothing, or an absence of a reference. The default value for all reference types is null. The null value is unique in that it can be assigned to a variable of any reference type whatsoever.

2.10.4. Terminology: Pass by Value

I've said that Java handles arrays and objects "by reference." Don't confuse this with the phrase "pass by reference."[5] "Pass by reference" is a term used to describe the method-calling conventions of some programming languages. In a pass-by-reference language, values--even primitive values--are not passed directly to methods. Instead, methods are always passed references to values. Thus, if the method modifies its parameters, those modifications are visible when the method returns, even for primitive types.

[5]Unfortunately, previous editions of this book may have contributed to the confusion!

Java does not do this; it is a "pass by value" language. However, when a reference type is involved, the value that is passed is a reference. But this is not the same as pass-by-reference. If Java were a pass-by-reference language, when a reference type was passed to a method, it would be passed as a reference to the reference.

2.10.5. Memory Allocation and Garbage Collection

As we've already noted, objects and arrays are composite values that can contain a number of other values and may require a substantial amount of memory. When you use the new keyword to create a new object or array or use an object or array literal in your program, Java automatically creates the object for you, allocating whatever amount of memory is necessary. You don't need to do anything to make this happen.

In addition, Java also automatically reclaims that memory for reuse when it is no longer needed. It does this through a process called garbage collection. An object is considered garbage when there are no longer any references to it stored in any variables, the fields of any objects, or the elements of any arrays. For example:

Point p = new Point(1,2);           // Create an object
double d = p.distanceFromOrigin();  // Use it for something
p = new Point(2,3);                 // Create a new object

After the Java interpreter executes the third line, a reference to the new Point object has replaced the reference to the first one. There are now no remaining references to the first object, so it is garbage. At some point, the garbage collector will discover this and reclaim the memory used by the object.

C programmers, who are used to using malloc() and free() to manage memory, and C++ programmers, who are used to explicitly deleting their objects with delete, may find it a little hard to relinquish control and trust the garbage collector. Even though it seems like magic, it really works! There is a slight performance penalty due to the use of garbage collection, and Java programs may sometimes slow down noticeably while the garbage collector is actively reclaiming memory. However, having garbage collection built into the language dramatically reduces the occurrence of memory leaks and related bugs and almost always improves programmer productivity.

2.10.6. Reference Type Conversions

When we discussed primitive types earlier in this chapter, we saw that values of certain types can be converted to values of other types. Widening conversions are performed automatically by the Java interpreter, as necessary. Narrowing conversions, however, can result in lost data, so the interpreter does not perform them unless explicitly directed to do so with a cast.

Java does not allow any kind of conversion from primitive types to reference types or vice versa. Java does allow widening and narrowing conversions among certain reference types, however. As we've seen, there are an infinite number of potential reference types. In order to understand the conversions that can be performed among these types, you need to understand that the types form a hierarchy, usually called the class hierarchy.

Every Java class extends some other class, known as its superclass. A class inherits the fields and methods of its superclass and then defines its own additional fields and methods. There is a special class named Object that serves as the root of the class hierarchy in Java. It does not extend any class, but all other Java classes extend Object or some other class that has Object as one of its ancestors. The Object class defines a number of special methods that are inherited (or overridden) by all classes. These include the toString(), clone(), and equals() methods described earlier.

The predefined String class and the Point class we defined earlier in this chapter both extend Object. Thus, we can say that all String objects are also Object objects. We can also say that all Point objects are Object objects. The opposite is not true, however. We cannot say that every Object is a String because, as we've just seen, some Object objects are Point objects.

With this simple understanding of the class hierarchy, we can return to the rules of reference type conversion:

An object cannot be converted to an unrelated type. The Java compiler does not allow you to convert a String to a Point, for example, even if you use a cast operator.
An object can be converted to the type of a superclass. This is a widening conversion, so no cast is required. For example, a String value can be assigned to a variable of type Object or passed to a method where an Object parameter is expected. Note that no conversion is actually performed; the object is simply treated as if it were an instance of the superclass.
An object can be converted to the type of a subclass, but this is a narrowing conversion and requires a cast. The Java compiler provisionally allows this kind of conversion, but the Java interpreter checks at runtime to make sure it is valid. Only cast an object to the type of a subclass if you are sure, based on the logic of your program, that the object is actually an instance of the subclass. If it is not, the interpreter throws a ClassCastException. For example, if we assign a String object to a variable of type Object, we can later cast the value of that variable back to type String:
```
Object o = "string";    // Widening conversion from String to Object
// Later in the program... 
String s = (String) o;  // Narrowing conversion from Object to String
```
All array types are distinct, so an array of one type cannot be converted to an array of another type, even if the individual elements could be converted. For example, although a byte can be widened to an int, a byte[] cannot be converted to an int[], even with an explicit cast.
Arrays do not have a type hierarchy, but all arrays are considered instances of Object, so any array can be converted to an Object value through a widening conversion. A narrowing conversion with a cast can convert such an object value back to an array. For example:
```
Object o = new int[] {1,2,3};  // Widening conversion from array to Object
// Later in the program... 
int[] a = (int[]) o;           // Narrowing conversion back to array type
```