Discussion Topic 2 – compiler 'intelligence'
In the previous discussion, we saw that the compiler rejected the line System.out.Println()
because there is no operation in System.out called `Println()'. The compiler prints a
thoroughly unhelpful error message, because it doesn't understand the programmer's intentions.
In some cases it would be nice if the compiler could recognize common errors and correct them
automatically. Particularly useful would be the correction of misspelled identifiers, and of
missing semicolons and brackets.
How easy do you think this would be? What disadvantages would it have?
Notes for tutors
It is often stated that compilers cannot correct
errors like writing `Println' instead of `println', but this is untrue. It is not difficult
to design a procedure that would work most of the time. For example, my Prolog compiler
(Prolog is a programming language based on logic) does this quite well. The equivalent of
`println' in Prolog is `write'. If I say
write(hello).
or
vrite(hello).
the compiler says:
correct to: write(hello) ?
and if I say `y' (for `yes') the compiler makes the change. This is called the `do what I
mean' strategy.
The problem with this is that it does not always
get it right. Sometimes if I misspell an identifier it suggests an unsuitable alternative,
or no alternative at all. So the compiler still has to ask the programmer what to do; it
can't make corrections automatically. In the long term very little time is saved by this
facility, but it is quite useful at times.
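One way a `do what I mean' suggestion could be produced is to compare the misspelled name with every known name and offer the closest one. The sketch below is only an illustration of that idea, not how any particular Prolog or Java compiler does it: the class name `DoWhatIMean', the use of Levenshtein edit distance, and the cut-off `maxDistance' are all assumptions made for the example.

```java
import java.util.List;
import java.util.Optional;

class DoWhatIMean {
    // Levenshtein edit distance: the number of single-character insertions,
    // deletions and substitutions needed to turn a into b.
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            int[] cur = new int[b.length() + 1];
            cur[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                cur[j] = Math.min(Math.min(cur[j - 1] + 1, prev[j] + 1),
                                  prev[j - 1] + cost);
            }
            prev = cur;
        }
        return prev[b.length()];
    }

    // Returns the known name closest to what was typed, or empty if
    // nothing is within maxDistance (an assumed, arbitrary cut-off).
    static Optional<String> suggest(String typed, List<String> known, int maxDistance) {
        String best = null;
        int bestDistance = maxDistance + 1;
        for (String name : known) {
            int d = editDistance(typed, name);
            if (d < bestDistance) {
                best = name;
                bestDistance = d;
            }
        }
        return Optional.ofNullable(best);
    }

    public static void main(String[] args) {
        List<String> known = List.of("println", "print", "printf");
        System.out.println(suggest("Println", known, 2)); // prints Optional[println]
        System.out.println(suggest("xyzzy", known, 2));   // prints Optional.empty
    }
}
```

The second call shows the weakness described above: when nothing is close enough, no alternative can be offered, and when several names are equally close the choice is arbitrary. That is why the result is only a suggestion for the programmer to confirm, not an automatic correction.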
The problem of detecting missing braces and semicolons
is a more difficult one. I don't know of any programming language whose compiler is very
helpful here. Consider the example below.
line 1: class Test {
line 2: void test()
line 3: {
line 4: for (int i = 0; i < 10; i++)
line 5: {
line 6: System.out.println("test");
line 7: // missing brace here!
line 8: }
line 9: }
In this
simple example, a reader can spot quite easily where a brace has been missed out. But the
compiler `thinks' like this:
line 1: `class' statement. Fine. There's an opening
brace, so this definition of a class has several lines. I'll expect a closing brace later.
line 2: this is the definition of an operation called
`test'. There might be a brace next.
line 3: yes there is. So I've seen two opening braces,
and no closing braces so far.
line 4: `for' statement. Is it followed by a brace?
line 5: oh yes. So this is the start of a compound
statement. So far I've seen three opening braces and no closing braces.
line 6: this line is inside the compound statement.
line 7: (the compiler doesn't do anything, as I've
forgotten to put the brace in)
line 8: here's a closing brace. Good: that's the end of the
`for' statement. Now I'm expecting two closing braces.
line 9: here's another closing brace. That's the end of
the `test()' operation. Now I'm expecting one closing brace…
Hmmm. Now I've got to the end of the program, and I'm
still expecting a closing brace. I'd better report an error.
So the compiler reports an error right at the end of
the program, and not where the error really was. This is inevitable, because one brace is
the same as another to the compiler.
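The compiler's reasoning above amounts to little more than counting braces. A minimal sketch of that counting is given here; the class name `BraceCounter' and the wording of its messages are invented for the illustration, and it makes the simplifying assumption that no braces appear inside string literals or comments. A real compiler works on tokens and grammar rules, not on raw characters.

```java
class BraceCounter {
    // Counts opening and closing braces line by line. Returns a message
    // for the first problem a plain counter can notice, or null if the
    // braces balance.
    static String check(String[] lines) {
        int depth = 0; // number of opening braces not yet closed
        for (int i = 0; i < lines.length; i++) {
            for (char c : lines[i].toCharArray()) {
                if (c == '{') {
                    depth++;
                } else if (c == '}') {
                    if (depth == 0) {
                        return "line " + (i + 1) + ": unexpected '}'";
                    }
                    depth--;
                }
            }
        }
        return depth == 0
                ? null
                : "end of input: still expecting " + depth + " closing brace(s)";
    }

    public static void main(String[] args) {
        String[] program = {
            "class Test {",
            "void test()",
            "{",
            "for (int i = 0; i < 10; i++)",
            "{",
            "System.out.println(\"test\");",
            "// missing brace here!",
            "}",
            "}"
        };
        // prints: end of input: still expecting 1 closing brace(s)
        System.out.println(check(program));
    }
}
```

The counter knows only how many braces are outstanding, not which construct each one belongs to, so the earliest point at which it can complain is the end of the input.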
If the program were longer, the problem would be even
worse:
line 1: class Test {
line 2: void test()
line 3: {
line 4: for (int i = 0; i < 10; i++)
line 5: {
line 6: System.out.println("test");
line 7: // missing brace here!
line 8: }
line 9: }
line 10: class Test2 {
line 11: // more statements here…
When the compiler gets to line 10, it thinks it's still
compiling the definition of class `Test'. Now it's confronted with another `class'
statement. So it thinks the programmer is trying to define one class inside another. The
stage is now set for an error message about `inner classes' which will be completely
incomprehensible.
This problem could be overcome by designing the language differently. For example, suppose
we did not use braces to indicate compound statements, but words like `begin_method' and
`begin_class'. In this case the example above might look a bit like this:
line 1: class Test begin_class
line 2: void test()
line 3: begin_method
line 4: for (int i = 0; i < 10; i++)
line 5: begin_for_loop
line 6: System.out.println("test");
line 7: // missing `end_for_loop' here!
line 8: end_method
line 9: end_class
line 10: class Test2 begin_class
line 11: // more statements here…
Here the compiler would see `end_method' on line 8, and check what the corresponding `begin_...' was: `begin_for_loop'. The compiler knows that the
programmer can't end a for loop with `end_method'; only `begin_method' goes with `end_method'. It can then report an error like this:
`end_method' found on line 8,
while expecting `end_for_loop'. Perhaps you have missed out an `end_for_loop' between
lines 6 and 8?
In this case the compiler's error
message can be much more helpful. It still can’t locate the error exactly, but it's
much closer.
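The check described above can be sketched with a stack that remembers which kind of block is currently open. This is only an illustration for the imaginary `begin_...'/`end_...' language: the class name `BlockMatcher' and the message wording are assumptions, and the sketch simply splits each line into words after discarding anything following `//'.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class BlockMatcher {
    // Matches begin_X / end_X keywords. Returns the first error message,
    // or null if every block is closed by the right keyword.
    static String check(String[] lines) {
        Deque<String> openKinds = new ArrayDeque<>();  // kinds of block still open
        Deque<Integer> openLines = new ArrayDeque<>(); // line where each was opened
        for (int i = 0; i < lines.length; i++) {
            String line = lines[i];
            int comment = line.indexOf("//");
            if (comment >= 0) {
                line = line.substring(0, comment); // ignore comments
            }
            for (String word : line.trim().split("\\s+")) {
                if (word.startsWith("begin_")) {
                    openKinds.push(word.substring("begin_".length()));
                    openLines.push(i + 1);
                } else if (word.startsWith("end_")) {
                    String kind = word.substring("end_".length());
                    if (openKinds.isEmpty()) {
                        return "`" + word + "' found on line " + (i + 1)
                                + ", but no block is open";
                    }
                    if (!openKinds.peek().equals(kind)) {
                        return "`" + word + "' found on line " + (i + 1)
                                + ", while expecting `end_" + openKinds.peek()
                                + "' (block opened on line " + openLines.peek() + ")";
                    }
                    openKinds.pop();
                    openLines.pop();
                }
            }
        }
        return openKinds.isEmpty()
                ? null
                : "end of input, while expecting `end_" + openKinds.peek() + "'";
    }

    public static void main(String[] args) {
        String[] program = {
            "class Test begin_class",
            "void test()",
            "begin_method",
            "for (int i = 0; i < 10; i++)",
            "begin_for_loop",
            "System.out.println(\"test\");",
            "// missing `end_for_loop' here!",
            "end_method",
            "end_class"
        };
        System.out.println(check(program));
    }
}
```

Because each closing keyword names the kind of block it closes, the mismatch shows up on line 8, as soon as `end_method' meets an open `begin_for_loop', and does not have to wait until the end of the program. The sketch can also report where the unclosed block was opened (line 5), which narrows the search further.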

RITSEC - Global Campus
Copyright ©1999 RITSEC - Middlesex University. All rights reserved.