A normal (non-polymorphic) method's address is determined at compile time, and the bytecode instruction to invoke it can call the method directly. This is sometimes called early binding (or confusingly, static binding) because a method name is bound to a memory address at compile time. This is efficient but it isn't always convenient! Sometimes it isn't clear what the type of some variable should be until we run the program, because it might depend on user input, random numbers, or other external data such as from a file. (Alright, in Java we should technically say “the type of the object some reference variable refers to”, but that's too long.)
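The difference between the two kinds of binding can be seen in a small sketch. The classes `Base`, `Derived`, and `BindingDemo` below are hypothetical names made up for this illustration. A `static` method is one kind of non-polymorphic method in Java, so the compiler picks it from the declared type of the reference, while an ordinary instance method is picked from the actual object at run time.

```java
// Hypothetical classes for illustration only; they are not part of the demo discussed in the text.
class BindingDemo {
    static String earlyBound() {
        Base ref = new Derived();
        // Static method: chosen at compile time from the declared type of ref (Base).
        return ref.kind();
    }

    static String lateBound() {
        Base ref = new Derived();
        // Instance method: chosen at run time from the class of the object (Derived).
        return ref.name();
    }

    public static void main(String[] args) {
        System.out.println(earlyBound()); // prints Base.kind
        System.out.println(lateBound());  // prints Derived.name
    }
}

class Base {
    static String kind() { return "Base.kind"; }
    String name() { return "Base.name"; }
}

class Derived extends Base {
    static String kind() { return "Derived.kind"; } // hides Base.kind, does not override it
    @Override
    String name() { return "Derived.name"; }        // overrides Base.name
}
```

Both methods use the same reference type and the same object; only the kind of method differs.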
Consider this code:

    // Written 12/2007 by Wayne Pollock, Tampa Florida USA,
    // From an idea posted in comp.lang.java.programmer on 12/23/07
    // by Michael Jung ("Re: Polymorphism in Java SE?")

    import java.text.NumberFormat;

    public class PolymorphismDemo
    {
        public static void main ( String [] args ) throws Exception
        {
            if ( args.length == 0 )
            {
                System.out.println( "Usage: java PolymorphismDemo <some-number>" );
                return;
            }
            Number num = NumberFormat.getInstance().parse( args[0] );

            System.out.println( "The number " + num.toString()
                + " has class " + num.getClass() );
        }
    }

The type (or class) of the `Number` object created by the call to `parse` (on line 16) depends on whether the argument is an integer (in which case it's a `Long` object) or a floating-point number (in which case it's a `Double` object). At compile time there is no way to know which `toString` method to use! It depends on which type of object the variable `num` refers to: either `Long.toString` or `Double.toString`. And in this program there's no way to know that until run time:

    C:\Temp>javac PolymorphismDemo.java

    C:\Temp>java PolymorphismDemo 123
    The number 123 has class class java.lang.Long

    C:\Temp>java PolymorphismDemo 123.456
    The number 123.456 has class class java.lang.Double

So what can the compiler do? When compiling the `main` method it can't bind the method name `toString` (used in the `println` call on line 18) to an address. Instead the compiler defers the binding until run time, by using bytecode that will look up the address of the correct `toString` method. This is why polymorphism is also known as late binding, delayed binding, or dynamic binding. (In some languages, such as C++, polymorphic methods are known as virtual methods or virtual functions.)
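Here is a minimal sketch of the same idea that runs without command-line arguments. The class name `LateBindingSketch` and the method `describe` are hypothetical, and `Locale.US` is an assumption added here so that "." is always treated as the decimal separator. The single `num.toString()` call ends up running different code depending on the object that `parse` returns.

```java
import java.text.NumberFormat;
import java.text.ParseException;
import java.util.Locale;

class LateBindingSketch {
    static String describe(String text) {
        try {
            // Assumption: Locale.US, so the result does not depend on the machine's default locale.
            Number num = NumberFormat.getInstance(Locale.US).parse(text);
            // One call site, two possible targets: Long.toString or Double.toString.
            return num.toString() + " -> " + num.getClass().getName();
        } catch (ParseException e) {
            throw new IllegalArgumentException("Not a number: " + text, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(describe("123"));     // prints 123 -> java.lang.Long
        System.out.println(describe("123.456")); // prints 123.456 -> java.lang.Double
    }
}
```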
Without polymorphism this program would be more complicated; you'd have to write some custom method that converts the `args[0]` `String` to a number and also somehow returns a type name. You would then need a `switch` statement or an `if` chain to test the type name and call the right method yourself (i.e., either `Integer.toString(x)` or `Double.toString(x)`). This sort of code is very ugly, easily broken, and hard to maintain or enhance. That is the main reason why Java supports polymorphic methods.
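To make the contrast concrete, here is a rough sketch of what that hand-written dispatch might look like. Everything in it is hypothetical: the class `ManualDispatchSketch`, the helper `typeNameOf`, and the type names "long" and "double" are invented for this illustration. It uses `Long` rather than `Integer` to match what the demo program produces.

```java
class ManualDispatchSketch {
    // Hypothetical helper: guesses a type name from the text itself.
    static String typeNameOf(String text) {
        return text.matches("[+-]?\\d+") ? "long" : "double";
    }

    // The if chain: test the type name, then call the right method by hand.
    static String format(String text) {
        String typeName = typeNameOf(text);
        if (typeName.equals("long")) {
            long x = Long.parseLong(text);
            return Long.toString(x);
        } else if (typeName.equals("double")) {
            double x = Double.parseDouble(text);
            return Double.toString(x);
        } else {
            // Every new numeric type would need another branch here.
            throw new IllegalArgumentException("Unknown type name: " + typeName);
        }
    }

    public static void main(String[] args) {
        System.out.println(format("123"));     // prints 123
        System.out.println(format("123.456")); // prints 123.456
    }
}
```

Supporting another numeric type means editing both the helper and the `if` chain, which is exactly the maintenance burden the polymorphic version avoids.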
The addresses of an object's polymorphic methods are stored in a method table associated with the object. When invoking some polymorphic method at run time, the method name is looked up in this table to get the address. A method table contains the names and addresses of the object's dynamically bound (polymorphic) methods. The method table is the same for all objects belonging to the same class, so it is stored in the `Class` object (for the object's type, here `Long` or `Double`). (In other languages method tables are called vtables.) Method tables are not part of the language but might be used in some implementations. (Different JVM vendors are free to implement polymorphism any way they please as long as the end result is the same.)
The Sun JVM mixes method table entries into a class's “constant pool”, which can be seen using the command “`javap -verbose foo`”. (Since all objects of the same class will have the same method table, the JVM may keep it elsewhere.)
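The method table itself is not exposed through the Java API, so it can't be printed directly. What can be observed is the sharing that makes a per-class table possible: every object of a class refers to one and the same `Class` object. The class name `SharedClassSketch` and its method are hypothetical names for this small check.

```java
class SharedClassSketch {
    // True when both objects refer to the very same Class object.
    static boolean sameClassObject(Object a, Object b) {
        return a.getClass() == b.getClass();
    }

    public static void main(String[] args) {
        // Two different Long objects share one Class object.
        System.out.println(sameClassObject(Long.valueOf(1), Long.valueOf(2)));     // prints true
        // A Long and a Double have different Class objects.
        System.out.println(sameClassObject(Long.valueOf(1), Double.valueOf(1.0))); // prints false
    }
}
```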
It is illustrative to see how the system constructs a method table for some class such as `Integer`. Initially the method table is empty. Then the method table is filled with the polymorphic methods in the most distant ancestor class, usually the `Object` class:
Method Name | Address | Comment |
---|---|---|
Object.toString | 111 | Object.toString method address |
... | ... | 10 Additional methods |
This list is added to (and existing entries modified) by the polymorphic methods in the next most-distant ancestor class, here the `Number` class. If you look at the API (JavaDocs) you'll find the `Number` class doesn't override any methods but does add six new ones to the table. So the `toString` entry in the method table is left unmodified:
Method Name | Address | Comment |
---|---|---|
Object.toString | 111 | Object.toString method address (unchanged) |
Number.intValue | 222 | Number.intValue method address |
... | ... | 15 Additional methods |
This continues until the method tables of all parent classes have been merged. Finally the method table is updated again with the `Integer` class's polymorphic methods. Now `toString` is overridden:
Method Name | Address | Comment |
---|---|---|
Object.toString | 333 | Integer.toString method address |
Number.intValue | 444 | Integer.intValue method address |
Number.longValue | 555 | Integer.longValue method address |
... | ... | Additional methods |
The "internal" method names in the table include the name of the class that first declared the method. Overriding a method merely changes the address in the method table slot but doesn't change the name. This is why the `javap` output shows the polymorphic method name as `Object.toString` and not merely `toString` or `Integer.toString`.
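The table-building steps described above can be simulated in a few lines. This is only a toy model, not how any real JVM stores its tables: the class `MethodTableSketch` and its methods are hypothetical, a `LinkedHashMap` stands in for the table, and the "addresses" are the made-up numbers 111, 222, 333, and 444 from the tables in the text.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class MethodTableSketch {
    // Copy the parent's table, then apply this class's own methods:
    // an existing slot name gets a new address, a new slot name is appended.
    static Map<String, Integer> extend(Map<String, Integer> parent, String[] slots, int[] addresses) {
        Map<String, Integer> table = new LinkedHashMap<String, Integer>(parent);
        for (int i = 0; i < slots.length; i++) {
            table.put(slots[i], addresses[i]);
        }
        return table;
    }

    static Map<String, Integer> objectTable() {
        return extend(new LinkedHashMap<String, Integer>(),
                new String[] { "Object.toString" }, new int[] { 111 });
    }

    static Map<String, Integer> numberTable() {
        // Number adds intValue and leaves the toString slot alone.
        return extend(objectTable(), new String[] { "Number.intValue" }, new int[] { 222 });
    }

    static Map<String, Integer> integerTable() {
        // Integer overrides both: slot names stay the same, addresses change.
        return extend(numberTable(),
                new String[] { "Object.toString", "Number.intValue" }, new int[] { 333, 444 });
    }

    public static void main(String[] args) {
        System.out.println(objectTable());  // prints {Object.toString=111}
        System.out.println(numberTable());  // prints {Object.toString=111, Number.intValue=222}
        System.out.println(integerTable()); // prints {Object.toString=333, Number.intValue=444}
    }
}
```

Note that the slot name `Object.toString` never changes, even after `Integer` replaces the address stored in it.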
Take a look at the bytecode for this method call (use “`javap -verbose PolymorphismDemo`”), which reads:
40: invokevirtual #11; //Method java/lang/Object.toString:()Ljava/lang/String;
The “`#11`” refers to a method name in the method table (as mentioned above, this is an index into the “constant pool”), in this case the method named `Object.toString` that returns a `String`. The `invokevirtual` bytecode instruction causes the JVM to treat the value at `#11` not as an address (as it would be for early binding), but as the name of a method to look up in the method table for the object the method is invoked on (here, the one `num` refers to).
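A similar lookup-by-name can be tried from ordinary Java code using reflection. This is an analogy rather than a view into the JVM's internals. The class `DynamicLookupSketch` and its method `callToString` are hypothetical names. The sketch asks for the method under the same name the bytecode uses, `Object.toString`, and `Method.invoke` then applies dynamic method lookup, so the code that runs depends on the class of the target object.

```java
import java.lang.reflect.Method;

class DynamicLookupSketch {
    static String callToString(Object target) {
        try {
            // Look the method up under the name the bytecode refers to: Object.toString.
            Method slot = Object.class.getMethod("toString");
            // invoke() dispatches on the run-time class of target, like invokevirtual does.
            return (String) slot.invoke(target);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Reflective call failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(callToString(Long.valueOf(123)));       // prints 123
        System.out.println(callToString(Double.valueOf(123.456))); // prints 123.456
    }
}
```

If the call were bound to `Object`'s own implementation, the output would be a class name followed by a hash code rather than the number.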
And that's how it works!