Date: Sat, 7 Feb 1998 04:34:28 GMT
To: java-security@web1.javasoft.com
From: David Hopwood <hopwood@zetnet.co.uk>
Subject: Possible verifier bug
-----BEGIN PGP SIGNED MESSAGE-----
[cross-posted to c.l.j.security from c.l.j.machine, and cc:d to
java-security@java.sun.com]
In message <34DA27AB.5C149C62@jivetech.com>
Paul Haahr <haahr@jivetech.com> wrote:
> Should this method ever return false?
>     public static boolean tautology(boolean x) {
>         boolean[] array = new boolean[1];
>         array[0] = x;
>         return array[0] == x;
>     }
> As far as I can tell, it is impossible to create a Java program where it
> returns false. Normally, I'd argue that means that a compiler should be
> able to eliminate the array creation (the array can never be shared) and
> just generate code to return true.
Yep, that's a valid optimisation.
> On the other hand, this call does return false in every implementation
> I've tried it on:
> tautology(FunkyBoolean.of(256))
> What is FunkyBoolean? It passes verification. No Java source exists
> for it, but javap prints this disassembly:
>     public synchronized class FunkyBoolean extends java.lang.Object
>         /* ACC_SUPER bit set */
>     {
>         public static boolean of(int);
>         public FunkyBoolean();
>     }
>
>     Method boolean of(int)
>        0 iload_0
>        1 ireturn
>
>     Method FunkyBoolean()
>        0 aload_0
>        1 invokespecial #3 <Method java.lang.Object()>
>        4 return
The 'of' method shouldn't pass verification. The top element of the stack
at instruction 1 can be any int. The ireturn instruction requires the top
stack element to have a type that is assignable to the return type of the
method, as given by parsing its signature string. Since int is not
assignable to boolean, the ireturn should be invalid.
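To make the check concrete, here is a minimal sketch of the test I have in
mind for ireturn. It is only an illustration: ReturnCheck, VType and
ireturnAllowed are hypothetical names, not taken from any real verifier, and
it assumes that "assignable" means "every value of the source type is in
range for the target type", which is the rule suggested later in this
message.

```java
// Illustrative sketch only; ReturnCheck and VType are hypothetical names,
// not part of any real verifier.
final class ReturnCheck {
    /** Integral types as the verifier distinguishes them, with their value ranges. */
    enum VType {
        BOOLEAN(0, 1),
        BYTE(-128, 127),
        SHORT(-32768, 32767),
        CHAR(0, 65535),
        INT(Integer.MIN_VALUE, Integer.MAX_VALUE);

        final int min;
        final int max;

        VType(int min, int max) {
            this.min = min;
            this.max = max;
        }

        /** Assumed rule: assignable when every value of this type fits in target. */
        boolean assignableTo(VType target) {
            return target.min <= min && max <= target.max;
        }
    }

    /** Maps the return descriptor character from the method signature to a type. */
    static VType fromDescriptor(char descriptor) {
        switch (descriptor) {
            case 'Z': return VType.BOOLEAN;
            case 'B': return VType.BYTE;
            case 'S': return VType.SHORT;
            case 'C': return VType.CHAR;
            case 'I': return VType.INT;
            default:
                throw new IllegalArgumentException(
                    "not an integral return type: " + descriptor);
        }
    }

    /** Would ireturn be accepted, given the type on top of the stack? */
    static boolean ireturnAllowed(VType stackTop, char returnDescriptor) {
        return stackTop.assignableTo(fromDescriptor(returnDescriptor));
    }
}
```

For FunkyBoolean.of, iload_0 leaves an int on the stack and the return
descriptor is 'Z', so ireturnAllowed(VType.INT, 'Z') is false under this
rule and the method would be rejected.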
> Obvious generalizations yield out-of-range bytes, shorts, and chars.
> I'm pretty sure that the verifiers I've tried here are correct. I can
> find no rule* anywhere in the VM specification that says a method which
> returns boolean (signature 'Z') must return only 0 or 1.
That's an omission, then. According to the language spec, boolean values
are either true or false, and the results of &&, ||, etc. are defined for
all possible inputs. If a class that passes verification can cause the
language semantics to be violated (other than by using native code), it's
a bug.
(Ideally it shouldn't be necessary to refer to the language spec for this,
but the VM spec doesn't document verification particularly well.)
From a security point of view, it's definitely a Bad Thing that this can
happen. Suppose someone had proven the correctness of 'tautology' on the
basis of the Java language semantics. An attacker can try to pass it an
invalid boolean (or in general, pass any out-of-range value into a method
of a more trusted class), and get it to do something that it had been
proven not to be able to do, which could easily be something insecure.
> And verifying
> such things would certainly add some difficulty, perhaps approaching a
> halting problem issue.
It's a bit tricky, but certainly possible. The verifier already treats
boolean, byte, short, char and int as separate types, and there are
separate instructions for converting values between them. (That's what
makes me think that this was intended to be checked.)
The only real problem is with constants, since iconst_1 is used to push
either true, (byte)1, (short)1, (char)1 or (int)1, for example. However,
that can be fixed by treating each constant value, for the purpose of
verification, as belonging to the type with the smallest range containing
that value:
    Range                       Type
    -2147483648..2147483647     int
    0..65535                    char
    -32768..32767               short
    -128..127                   byte
    0..1                        boolean
and allowing each of these types to be used wherever a wider type is
expected (although strictly speaking there is no conversion from, say,
boolean to int in Java, it cannot do any harm for the verifier to allow
that as an implicit conversion).
If all the types were signed, that would be sufficient, and merging
these types during the flow analysis would be straightforward.
Unfortunately, in order to treat char correctly, two extra ranges are
needed:

    0..127                      "positive byte"
    0..32767                    "positive short"

since otherwise it wouldn't be possible to handle the fact that some
short values are assignable to char, and some are not (and similarly
for byte).
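As a rough illustration of the constant rule, the sketch below picks the
narrowest of the seven ranges that contains a given constant. The names
(ConstantTypes, VType, typeOfConstant) are made up for this example. It
relies on the ranges being declared narrowest-first, so the first range that
contains the value is the one wanted.

```java
// Illustrative sketch only; ConstantTypes and VType are hypothetical names.
final class ConstantTypes {
    /** The seven ranges, declared so that narrower ranges come first. */
    enum VType {
        BOOLEAN(0, 1),
        POS_BYTE(0, 127),       // "positive byte"
        BYTE(-128, 127),
        POS_SHORT(0, 32767),    // "positive short"
        SHORT(-32768, 32767),
        CHAR(0, 65535),
        INT(Integer.MIN_VALUE, Integer.MAX_VALUE);

        final int min;
        final int max;

        VType(int min, int max) {
            this.min = min;
            this.max = max;
        }

        boolean contains(int value) {
            return min <= value && value <= max;
        }
    }

    /** Returns the narrowest range containing the constant. */
    static VType typeOfConstant(int value) {
        for (VType t : VType.values()) {
            if (t.contains(value)) {
                return t;
            }
        }
        throw new AssertionError("unreachable: INT contains every int");
    }
}
```

So iconst_1 would be typed as boolean, a pushed 100 as "positive byte", -1 as
byte, and 40000 as char. Each of these can then be used wherever a wider type
is expected.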
The rule for merging these types would be something like the following
(where +short and +byte denote the "positive short" and "positive byte"
ranges):
        | int     char    short   +short  byte    +byte   boolean
--------+--------------------------------------------------------
int     | int     int     int     int     int     int     int
char    | int     char    int     char    int     char    char
short   | int     int     short   short   short   short   short
+short  | int     char    short   +short  short   +short  +short
byte    | int     int     short   short   byte    byte    byte
+byte   | int     char    short   +short  byte    +byte   +byte
boolean | int     char    short   +short  byte    +byte   boolean
If anyone can think of a simpler way than this that is guaranteed to
pass all valid code and reject invalid code, I'd like to hear about it.
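One way to read the merge table is that each entry is the narrowest of the
seven ranges that contains both inputs. The sketch below computes the merge
that way instead of storing the table. MergeRule, VType and merge are
hypothetical names for this example, and the approach assumes the ranges are
declared narrowest-first, so the first range covering both inputs is the
least one.

```java
// Illustrative sketch only; MergeRule and VType are hypothetical names.
final class MergeRule {
    /** The seven ranges, declared so that narrower ranges come first. */
    enum VType {
        BOOLEAN(0, 1),
        POS_BYTE(0, 127),       // "+byte" in the table
        BYTE(-128, 127),
        POS_SHORT(0, 32767),    // "+short" in the table
        SHORT(-32768, 32767),
        CHAR(0, 65535),
        INT(Integer.MIN_VALUE, Integer.MAX_VALUE);

        final int min;
        final int max;

        VType(int min, int max) {
            this.min = min;
            this.max = max;
        }
    }

    /** Merges two types at a control-flow join: the narrowest range covering both. */
    static VType merge(VType a, VType b) {
        int lo = Math.min(a.min, b.min);
        int hi = Math.max(a.max, b.max);
        for (VType t : VType.values()) {
            if (t.min <= lo && hi <= t.max) {
                return t;
            }
        }
        throw new AssertionError("unreachable: INT contains every range");
    }
}
```

For example, merge(CHAR, SHORT) gives INT, since no narrower range holds both
negative values and values above 32767, while merge(CHAR, POS_SHORT) stays
CHAR.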
> (* Well, actually, there is such a rule, but it appears to be an error.
> Notably, on page 272 of the printed VM spec, the description of ireturn
> says ``The returning method must have return type byte, short, char, or
> int.'' I believe boolean should have been included in that list.)
Yes, it should.
> So, the question I raise is, is an implementation allowed to assume that
> a value returned from a boolean method, passed as a boolean parameter,
> or loaded from a boolean field or array is 0 or 1?
IMHO yes.
Note that native methods can use an invalid representation for a boolean,
byte, short, or char, but I think it is reasonable to treat any native code
that does that as causing undefined behaviour. A warning should probably be
put in the JNI spec that it is important not to do this.
> My opinion is that no such assumption can be made, given the VM
> definition.
> Is this really a problem? I don't think so, though compiler writers may
> need to be a little careful. (Forgive me for bringing this up if it's
> already covered in conformance suites.)
> Paul
- --
David Hopwood <hopwood@zetnet.co.uk>
PGP public key: http://www.users.zetnet.co.uk/hopwood/public.asc
Key fingerprint = 71 8E A6 23 0E D3 4C E5 0F 69 8C D4 FA 66 15 01
Key type/length = RSA 2048-bit (always check this as well as the fingerprint)
-----BEGIN PGP SIGNATURE-----
Version: 2.6.3i
Charset: noconv
iQEVAwUBNNvYPjkCAxeYt5gVAQHsCgf+KfUFbwDeKzSNxWPv9MDbZjbBgwAHuEIq
QLtsmuK2uNGYdPN3S9Ifqqj6Rg194uKIA4vRQGX1fb03cUPSVk1j5yj7HfujisHJ
qEZux7c1hChtbeIYOIDyD8r43wdZxXgudwtHRIu6ZNQGawc6d5Jcx86Qj2fWJk3M
JgkFkWnOIdvX9024nTLeh3rOoOUlSq+T41U6SBy61FItd4pJHMZ7RrUX19jgAl3l
hrx5atqLNtbdHOnDQGoedeLMMMqbzIs+p92j7gDg9SelNFbG9fsKhxfFfLmpLxlj
jzdrQibzlm+TeOwoQLf+ODxbxqlGezdRj824KJgm61HAanPq+xzsFQ==
=sZyM
-----END PGP SIGNATURE-----