Monday, July 4, 2011

Variant types in OCaml suck

So, here's the first of the promised long, boring, technical rants.

My MSc thesis is a compiler for shaders written in the RenderMan Shading Language for SaarCOR hardware. Writing a compiler is a pretty straightforward task - gather some tests, write a hardware emulator, then a parser, a middle end, a code generator, then play with the components until they produce something satisfactory. Not much bloggable material there. The first issue, however, the choice of programming language, was a bit interesting. The options I considered were:

  • Java + possibly some high-level JVM-targeted scripting language
  • OCaml, used very conservatively
  • OCaml, used with all the fancy stuff like variant types
  • Other functional language, like Common Lisp or Scheme
  • Some real high-level language like Ruby, Perl or Python
OCaml's (and SML's) static typing is usually way too annoying. The reason I decided to try OCaml anyway was all the fancy things that it has and the older MLs didn't, the stuff which supposedly makes coding much easier, like objects, variant types, camlp4 and so on. That was kinda the last chance I was giving it: I was not quite satisfied with OCaml, but not yet willing to abandon it.

And the way the fancy features are implemented really sucks.
Today, let's talk about the suckiness of variant types.

Imagine there are two types, flt_temp = [`FLT_TEMP of int] (floating point temporaries) and vec_temp = [`VEC_TEMP of int] (vector temporaries). There is also a type any_temp = [flt_temp|vec_temp] for temporaries of either kind. So in contexts where only float temporaries are allowed (like division), flt_temp is used; in contexts where only vector temporaries are allowed (like dot products), vec_temp; and in contexts where any temporary is ok (like a function argument), any_temp. So far so good, and a dumb wrapper like F of flt_temp|V of vec_temp wasn't needed to implement any_temp.

let any_temp_to_string : [< any_temp] -> string
= function
|`FLT_TEMP(x) -> Printf.sprintf "t%d:float" x
|`VEC_TEMP(x) -> Printf.sprintf "t%d:vector" x

Notice the <. It means that any subtype of any_temp can be an argument of this function. So it's possible to run any_temp_to_string on any_temp, flt_temp or vec_temp values, which is a really nice improvement compared to the old MLs. However, at this point you may be a little bit confused - why is < needed? Isn't a value of type flt_temp/vec_temp also an any_temp?
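To make that concrete, here is a minimal self-contained sketch. The three type definitions are my reconstruction of the types described above, and the values f, v and a are made up for illustration:

```ocaml
(* Reconstructed type definitions, as described in the text. *)
type flt_temp = [ `FLT_TEMP of int ]
type vec_temp = [ `VEC_TEMP of int ]
type any_temp = [ flt_temp | vec_temp ]

(* [< any_temp ] means "any variant type with at most these tags". *)
let any_temp_to_string : [< any_temp ] -> string = function
  | `FLT_TEMP x -> Printf.sprintf "t%d:float" x
  | `VEC_TEMP x -> Printf.sprintf "t%d:vector" x

(* Hypothetical sample values, one of each type. *)
let f : flt_temp = `FLT_TEMP 1
let v : vec_temp = `VEC_TEMP 2
let a : any_temp = `VEC_TEMP 3

let () =
  (* All three calls typecheck without any coercion. *)
  print_endline (any_temp_to_string f);
  print_endline (any_temp_to_string v);
  print_endline (any_temp_to_string a)
```

With a plain any_temp -> string annotation instead, the calls on f and v would be rejected.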

Now that's where we get to the suckiness part, because it's not, and it needs an explicit typecast. Unfortunately the typecast often needs to be repeated many times, because OCaml will report a type error without even looking at our annotations.

For example, one might have thought that the following would work:
let extract_used_temps : any_comp -> any_temp list = function
|`DIVISION(a, b) -> [a; b]
|`DOT_PRODUCT(a, b) -> [a; b]

But it won't. The "correct" rewrite is (extremely ugly and verbose):
let extract_used_temps : any_comp -> any_temp list = function
|`DIVISION(a, b) -> [(a :> any_temp); (b :> any_temp)]
|`DOT_PRODUCT(a, b) -> [(a :> any_temp); (b :> any_temp)]
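For anyone who wants to try this, here is a compilable sketch. The definition of any_comp is a guess (it is not shown in this post), and the helper up is a hypothetical name introduced here; it only moves the coercion into one place, it doesn't make it go away:

```ocaml
type flt_temp = [ `FLT_TEMP of int ]
type vec_temp = [ `VEC_TEMP of int ]
type any_temp = [ flt_temp | vec_temp ]

(* Assumed shape of any_comp: division takes floats, dot product takes vectors. *)
type any_comp =
  [ `DIVISION of flt_temp * flt_temp
  | `DOT_PRODUCT of vec_temp * vec_temp ]

(* Hypothetical helper. Its inferred type is [< any_temp ] -> any_temp,
   so it accepts flt_temp and vec_temp alike. *)
let up t = (t :> any_temp)

let extract_used_temps : any_comp -> any_temp list = function
  | `DIVISION (a, b) -> [ up a; up b ]
  | `DOT_PRODUCT (a, b) -> [ up a; up b ]
```

It is shorter to type, but every place where a flt_temp or vec_temp flows into an any_temp list still needs an explicit call.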

A related issue is that polymorphic functions of type 'a->whatever get instantiated to any_temp->whatever, not [< any_temp]->whatever. So Hashtbl.find ht_of_any_temps some_flt_temp is going to fail to typecheck unless we use (some_flt_temp :> any_temp).
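A small sketch of the Hashtbl case. The table contents and the wrapper find_temp are made up for illustration; only Hashtbl.create, Hashtbl.add and Hashtbl.find come from the standard library:

```ocaml
type flt_temp = [ `FLT_TEMP of int ]
type vec_temp = [ `VEC_TEMP of int ]
type any_temp = [ flt_temp | vec_temp ]

(* Hypothetical table from temporaries to register names. *)
let ht_of_any_temps : (any_temp, string) Hashtbl.t = Hashtbl.create 16

let () =
  Hashtbl.add ht_of_any_temps (`FLT_TEMP 1) "r1";
  Hashtbl.add ht_of_any_temps (`VEC_TEMP 2) "r2"

let some_flt_temp : flt_temp = `FLT_TEMP 1

(* Writing Hashtbl.find ht_of_any_temps some_flt_temp here is a type error,
   because the key type is fixed to any_temp. The coercion is required. *)
let reg = Hashtbl.find ht_of_any_temps (some_flt_temp :> any_temp)

(* Hypothetical wrapper that hides the coercion; its key argument
   has type [< any_temp ]. *)
let find_temp ht t = Hashtbl.find ht (t :> any_temp)
let reg' = find_temp ht_of_any_temps some_flt_temp
```

A wrapper like find_temp has to be written separately for every polymorphic function you want to use this way, which is its own kind of bloat.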

Sprinkling the "let's screw the type system" function Obj.magic (an identity function of type 'a->'b) here and there limits the typecast bloat a bit, but I've found that it helps in maybe 10% of cases, and of course then, if you make a typo, you end up with random segfaults (not even a nice runtime exception like in a dynamically typed language). In the other 90% of cases there isn't really a way to use Obj.magic, mostly because it's applied too late, only after OCaml finds what it thinks is a type error.

Now some statistics - the compiler itself at the moment has 4239 lines, and that includes 175 uses of :> and 32 uses of Obj.magic. It's so ugly :-/

So in the end the type system is mostly something to fight with, not something that helps, just as was so often the case in plain OCaml/SML. So for the next project, I'll try something with fewer types. And I'll probably give up on OCaml.

Oh, and I also had some problems with OCaml's object system, maybe I'll blog about it sometime later. ^_^
