From source file to execution

Although all the programs you will be writing for this module will be in Java, computers have at their "heart" a microprocessor that cannot understand, and so cannot execute, statements in the Java programming language. Each computer's microprocessor can only understand and execute instructions in its own specific, very low level machine language. "Low level" means that machine code instructions relate to tasks close to the physical design of the microprocessor -- moving data to and from places in memory, storing a value in a temporary memory location (a register) and performing simple computations (arithmetic and logic) on data in these temporary locations. Java is a high level language, allowing human programmers to ignore the low level details of memory and simple logic. High level languages let humans program in terms of more abstract concepts -- object oriented programming languages let humans program in terms of objects, with the benefits that object technology brings through features such as encapsulation and inheritance.

This section summarises how a microprocessor can execute programs that were originally written in a programming language, such as Java, that is not specific to any microprocessor. There are a number of stages in this program translation process. These concepts are important, since Java has been designed to be a much more portable programming language than many before it. This portability comes with associated costs in efficiency and in the complexity of language translation. However, the great interest and investment in Java by academia and industry demonstrate that for many the benefits outweigh these costs.

Microprocessors and their machine code

Computers work on electronic signals. We find it convenient to represent the highs and lows of these signals as zeroes and ones -- the digits of the binary number system. We go further by discussing sequences of binary digits, bits, as representing the contents of memory, registers, disk sectors, etc. These binary sequences can be interpreted as a whole plethora of things: instructions to a processor or controller, packets on a network, or data such as numbers, strings, true and false values, etc. Consequently we often (incorrectly) talk of machine instructions (machine code) as binary numbers. They are structured sequences of bits, but not necessarily numbers.

This kind of misuse of language is common in computing, where you are often switching between different levels of abstraction. You even hear people say "It's all bits in the end." Of course, this is not true, and physicists might challenge even the perhaps more accurate statement that "It's all electrons in the end."
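The point that one sequence of bits can be read in several ways can be tried out directly in Java. The following is a minimal sketch, not part of the original notes: the class name BitPatternDemo and the chosen bit pattern are made up for illustration. It reads the same 32 bits as a whole number, as a floating point number, and as four separate bytes.

```java
import java.nio.ByteBuffer;

// Hypothetical demo class: one 32-bit pattern, three interpretations.
final class BitPatternDemo {

    // Read the 32 bits as an IEEE 754 single precision number.
    static float asFloat(int bits) {
        return Float.intBitsToFloat(bits);
    }

    // Split the 32 bits into four bytes, most significant byte first.
    static byte[] asBytes(int bits) {
        return ByteBuffer.allocate(4).putInt(bits).array();
    }

    public static void main(String[] args) {
        int bits = 0x42280000; // illustrative pattern, written in hexadecimal

        System.out.println(Integer.toBinaryString(bits)); // the raw binary digits
        System.out.println(bits);                          // as a whole number: 1109917696
        System.out.println(asFloat(bits));                 // as a float: 42.0
        System.out.println((char) asBytes(bits)[0]);       // first byte as a character: B
    }
}
```

Nothing in the bits themselves says which reading is "right"; that depends entirely on the software or hardware doing the interpreting.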

The machine code for a particular microprocessor is called its "native machine code". Thus we could talk about Pentium II native machine code, PowerPC native machine code, Athlon native machine code, and so on.

A need for higher level languages and virtual machines

Software is designed and written by different groups of people for different purposes. For hardware designers and "systems" programmers there is a need to think in bits or hardware instructions -- i.e. these people need to think about and write software that is "close" to the particular computer they are working on. Such people may need to program computers in "low" or "machine" level languages. However, for most analysts and designers and (higher level) programmers, it is not useful to think in such machine-level concepts, but to think in terms of higher level abstractions. A plethora of plausible abstractions have been proposed, usually expressed in a programming language, graphical notation or mathematical notation. 

Traditionally many programming language development environments were implemented in a way that closely matched specific hardware. This led to problems moving software from one hardware or operating system platform to another. However, a compiler technology that assumes an idealised processor has alleviated this problem. First used for languages like Pascal and Smalltalk, this idea of a ‘virtual’ machine has become popular with Java, and platform independence for a popular language has enhanced its popularity and usefulness.

From Java source files to microprocessor specific machine codes

The statements we write in high level languages such as Java are called "source code". When a source code statement has been translated into some other (usually lower level) computer language, the result is called "object code". 

JDK Java Translation requirement:

A source file containing a Java program has to have the extension ".java"

One stage may produce code for an idealised, so-called virtual, machine. In that case there is a two stage translation process: from high level language to virtual machine code, and then from virtual machine code to a native (real world microprocessor) machine code.

Translation from one computer language to another can take one of two forms: interpretation or compilation.

    Interpreting software means that, in essence, each instruction in turn is translated to a sequence of real (not virtual) machine instructions and executed by the processor.
    Compilation means that the whole piece of software (source code file) is translated into machine instructions first and stored. A stored translated program is called an object code file. If a piece of software has been compiled into a native machine code, then the object code instructions can be executed. If compilation has been into a virtual machine code, then this compiled object file needs to be translated again (again via either interpretation or compilation) to be executed on a real microprocessor.

The first way, interpretation, is more flexible, and results in smaller storage requirements for the software expressed in virtual machine code. The second way, compilation, results in faster execution, but much larger storage requirements.
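The difference between the two forms of translation can be sketched with a toy language. This is only an illustration under stated assumptions: the instruction names "ADD" and "MUL", the class TranslationDemo and its methods are all invented here, and Java lambdas stand in for the "machine instructions" a real translator would produce.

```java
import java.util.function.IntUnaryOperator;

// Hypothetical toy translator for programs such as {"ADD 3", "MUL 2"}.
final class TranslationDemo {

    // Interpretation: translate and execute each instruction in turn,
    // repeating the translation work every time the program is run.
    static int interpret(String[] program, int input) {
        int value = input;
        for (String instruction : program) {
            value = translate(instruction).applyAsInt(value);
        }
        return value;
    }

    // Compilation: translate the whole program once and keep the result,
    // so later runs skip the translation step entirely.
    static IntUnaryOperator compile(String[] program) {
        IntUnaryOperator compiled = IntUnaryOperator.identity();
        for (String instruction : program) {
            compiled = compiled.andThen(translate(instruction));
        }
        return compiled;
    }

    // Translate one toy instruction into an executable operation.
    private static IntUnaryOperator translate(String instruction) {
        String[] parts = instruction.split(" ");
        int operand = Integer.parseInt(parts[1]);
        switch (parts[0]) {
            case "ADD": return x -> x + operand;
            case "MUL": return x -> x * operand;
            default: throw new IllegalArgumentException("Unknown instruction: " + instruction);
        }
    }

    public static void main(String[] args) {
        String[] program = {"ADD 3", "MUL 2"};

        System.out.println(interpret(program, 4));  // (4 + 3) * 2 = 14

        IntUnaryOperator stored = compile(program); // translated once, kept for reuse
        System.out.println(stored.applyAsInt(4));   // 14
        System.out.println(stored.applyAsInt(10));  // (10 + 3) * 2 = 26
    }
}
```

Both routes give the same answers. The compiled form pays the translation cost once but has to keep the translated program around, which mirrors the speed versus storage trade-off described above.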

In almost all cases where translation is from a high level language to a virtual machine language the translation process is by compilation -- i.e. from a high level source file one compiles to a virtual machine code object file. Once this virtual machine code file has been prepared, it may be interpreted, or compiled into a native machine language.

Java has been designed to be first compiled and then interpreted. A typical Java compiler produces Java bytecode: instructions for an idealised computer called the Java Virtual Machine (JVM). Bytecode object files can be loaded into a real computer for interpretation by a piece of software that essentially pretends to be the idealised computer and so, confusingly, is also called the Java Virtual Machine (JVM). The Java Virtual Machine interpreter translates one instruction at a time from bytecode into the native machine code for the computer it is running on.

JDK Java Translation requirement:

An object file containing a Java program compiled into JVM bytecode is automatically given the extension ".class"
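A ".class" file is not text but a structured sequence of bytes. One small way to see this is to read the first four bytes of a class file, which in the standard class file format are always the value 0xCAFEBABE. The sketch below is an illustration only: the class name ClassFileMagic and its method are invented here, and it assumes the class file of "java.lang.Object" can be opened as a resource, which is the case on common JDKs.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical helper: peeks at the start of a compiled class file.
final class ClassFileMagic {

    // Returns the first four bytes of the .class file of a top-level class.
    static int readMagic(Class<?> type) {
        String resource = type.getSimpleName() + ".class";
        try (InputStream in = type.getResourceAsStream(resource)) {
            if (in == null) {
                throw new IOException("Class file not found: " + resource);
            }
            return new DataInputStream(in).readInt(); // four bytes, most significant first
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Expected to print CAFEBABE for any valid class file.
        System.out.printf("%08X%n", readMagic(Object.class));
    }
}
```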

Of course, in order to run, the JVM interpreter must itself be a native machine code program. So in order to run a Java program on, say, a PC one needs a set of compiled Java bytecode object files, and a JVM interpreter for the PC's microprocessor. In order to run the same compiled Java bytecode object files on a Macintosh, one needs another JVM interpreter written in the Macintosh's native machine code.
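That the bytecode stays the same while the JVM underneath it changes can be observed from inside a program. The following is a minimal sketch using the standard system properties "java.vm.name", "os.name" and "os.arch"; the class name HostInfo is made up for illustration. The same compiled file should report different values when run on different machines.

```java
// Hypothetical demo: the same bytecode reports whichever platform it runs on.
final class HostInfo {

    // Describes the JVM and the real machine that is hosting this program.
    static String describeHost() {
        return System.getProperty("java.vm.name")
                + " on " + System.getProperty("os.name")
                + " / " + System.getProperty("os.arch");
    }

    public static void main(String[] args) {
        System.out.println(describeHost());
    }
}
```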

A popular feature of Java is that a certain kind of Java program, called an "applet", can be run on computers with modern web browsers. This is possible because such modern web browsers, like Internet Explorer 4 and Netscape Communicator 4.5, include their own Java Virtual Machine interpreter. So the web browser can interpret (and so execute) compiled Java bytecode files.

It is not necessary to use a browser to execute all Java programs -- a second kind of Java program, called an "application" is not designed to be run within a browser. To run a Java application one uses a "stand alone" Java Virtual Machine interpreter. In this module we use the JVM interpreter provided in Sun's Java Development Kit (JDK) called "java.exe".
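As an illustration of an application, here is a minimal sketch. The class name HelloApp and its message are invented for this example; the commands in the comments are the usual JDK compiler and interpreter commands, assuming the JDK tools are on the command path.

```java
// Save this text in a source file called HelloApp.java, then:
//   javac HelloApp.java   compiles it to the bytecode file HelloApp.class
//   java HelloApp         starts the JVM interpreter, which runs main
class HelloApp {

    // Kept separate from main so the message can be reused elsewhere.
    static String greeting() {
        return "Hello from the JVM";
    }

    // An application starts executing at its main method.
    public static void main(String[] args) {
        System.out.println(greeting());
    }
}
```

Note that the interpreter is given the class name ("HelloApp"), not the file name ("HelloApp.class").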

In fact, it is possible to run Java applet programs outside of a web browser using a second interpreter from Sun, called "appletviewer.exe", which interprets Java bytecode files in the same way that web browsers are designed to.

The real world

Although not untrue, the above is a slightly simplistic view of what happens in the real world. Almost always, the translated instructions (in the virtual machine code) that result directly from what the programmer wrote are packaged with a run-time environment. A run-time environment is code written by the developers of the language implementation, needed to match the target hardware and operating system.

With the exception of very trivial software systems, you will almost always be working with 2 or more Java classes. Each class needs to be compiled separately (from its ".java" file into a corresponding bytecode ".class" file) and then the run-time interpreter is used to repeatedly choose the appropriate bytecode instruction to translate and execute (from the appropriate bytecode file).

Life story of a Java program

The life story of one possible Java program is illustrated in the figure below.

 

The following steps are illustrated in the figure above:

    a Java source file has been written and saved into the file "MyClass.java"
    a Java compiler has then been used to compile this source file into JVM bytecode instructions stored in a file called "MyClass.class"
    in order to execute the program, the bytecode "MyClass.class" file, along with 2 other required pre-compiled files, is provided to the JVM interpreter

      this JVM interpreter iterates, choosing the next bytecode instruction to be executed, translating that instruction into the native machine code, and executing it (then choosing the next bytecode instruction and so on until the program terminates).

In the figure above "XXX"s indicate bytecode instructions, and the sequence of binary digits "001001…" represents a native machine code instruction that has just been translated for execution by the run-time interpreter.
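The loop described in the last step can be sketched as a toy interpreter. This is a simplified illustration, not how any real JVM is written: the class ToyInterpreter and the four opcodes PUSH, ADD, MUL and HALT are invented here, and each instruction is carried out directly in Java rather than being turned into native machine code.

```java
// Hypothetical toy virtual machine with a small operand stack.
final class ToyInterpreter {

    // Invented opcodes; real JVM bytecode has many more instructions.
    static final int PUSH = 0, ADD = 1, MUL = 2, HALT = 3;

    // Runs the program and returns the value on top of the stack at HALT.
    static int run(int[] code) {
        int[] stack = new int[16]; // small fixed-size operand stack
        int sp = 0;                // index of the next free stack slot
        int pc = 0;                // index of the next instruction to execute

        while (true) {
            int opcode = code[pc++];             // choose the next instruction
            switch (opcode) {                    // work out what it means
                case PUSH:
                    stack[sp++] = code[pc++];    // the operand follows the opcode
                    break;
                case ADD: {
                    int right = stack[--sp];
                    int left = stack[--sp];
                    stack[sp++] = left + right;
                    break;
                }
                case MUL: {
                    int right = stack[--sp];
                    int left = stack[--sp];
                    stack[sp++] = left * right;
                    break;
                }
                case HALT:
                    return stack[sp - 1];        // the program terminates
                default:
                    throw new IllegalStateException("Unknown opcode: " + opcode);
            }
        }
    }

    public static void main(String[] args) {
        // Computes (2 + 3) * 4.
        int[] program = {PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL, HALT};
        System.out.println(run(program)); // 20
    }
}
```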

A note on computers with Java processors

In fact, research is underway to develop a new family of microprocessors whose native machine language is actually Java bytecode. However, such computers are not yet available for consumer or business use, and for a very long time most people will want to execute Java programs on computers whose microprocessors' native machine code is not JVM bytecode.

A note on "native code" compilers

Some companies (for example Symantec) are now beginning to release Java software development environments that will compile Java programs into native machine code object files. This means that, potentially, software developers can create source files that can be both interpreted on any computer with a Java Virtual Machine, and also run efficiently on computers for which there exist native code compilers.

However, the introduction of native code compilers also introduces possible problems of portability, since it is very tempting for a software developer to introduce special Java libraries designed to take advantage of a particular computer's features. Thus one can write Java programs in, say, Microsoft's J++ environment that make very efficient use of Windows Java classes, but result in Java programs that cannot be compiled into "pure" Java Virtual Machine bytecode. 

Native code compilers and machine specific Java libraries resulting in non-standard bytecode are both examples of directions developers are going in order to overcome some of the efficiency and speed limitations of interpreted "pure" Java programs.

From source file to execution: Summary

    A compiler converts a whole file from one language into a machine code

      either into a native or virtual machine code

    A native machine code file can be executed
    A virtual machine code file needs to be further translated
    An interpreter translates and executes one statement at a time
    The end user does not need a compiler to run a bytecode Java program
    The end user does need a run-time interpreter to interpret a Java bytecode virtual machine code file

RITSEC - Global Campus
Copyright ©1999 RITSEC - Middlesex University. All rights reserved.
webmaster@globalcampus.com.eg