Article 9833 of comp.graphics.rendering.renderman: Path: news.lava.net!news.Hawaii.Edu!logbridge.uoregon.edu!news.maxwell.syr.edu!news.mel.connect.com.au!news.unimelb.edu.au!cs.mu.OZ.AU!bromage From: bromage@cs.mu.oz.au (Andrew Bromage) Newsgroups: comp.graphics.rendering.renderman Subject: Re: Quest for Larry (or anyone else) regarding BMRT Interpreter.... Date: 9 Jul 1999 06:05:50 GMT Organization: Computer Science, The University of Melbourne Lines: 141 Message-ID: <7m43fu$n8j$1@mulga.cs.mu.OZ.AU> References: <931208767.6390.0.nnrp-04.d4e417a3@news.demon.co.uk> <931388725.21530.0.nnrp-03.d4e417a3@news.demon.co.uk> <7m0q98$qrb$1@sherman.pixar.com> <3784662E.DDCB6C52@zorro.ruca.ua.ac.be> NNTP-Posting-Host: mundook.cs.mu.oz.au Xref: news.lava.net comp.graphics.rendering.renderman:9833 G'day all. Stefan Wils writes: >Just a question : is a stack-based VM really efficient for interpreting ? >Since stack-based or 0-address machines generally have more instructions >per program, and if the execution of each instruction (like mulPP or pushP) >would be considered as *about* equally slow/fast, it might be reasonable >to create an assembler set in which the number of instruction per program >is kept to a minimum. >I had the idea of a RISC-like machine with an "unlimited" number of >registers. IMO, your idea is a good one. More thoughts follow this disclaimer. Disclaimer: My academic background is in compiler implementation (graphics being more of a hobby), so if I go into jargon here, my apologies. (BTW, yes, I'm half-heartedly working on a shading language compiler plus virtual machine in my copious(!) spare time, and I'm seriously considering a VM like this as the target language.) In addition, this is only my thinking on the matter to date, and I reserve the right to change my mind at whim. :-) The two most common interpreted VM formats are stack machines and register transfer machines. (BTW, "RISC" is a misnomer here, since you can buy processors which use a RISC stack machine architecture.) Most people use a stack machine because it's very easy to implement. Using a stack machine not only makes implementing a simple working compiler a hell of a lot easier (though it can make certain kinds of optimisation harder), it also solves a number of run-time issues such as: 1. Temporary storage 2. Saving temporary storage locations over function/procedure calls 3. Where to place arguments for function/procedure calls and any return values 4. Operations with different numbers of arguments Issue (1) is straightforward to deal with if you don't have an upper limit on the number of "registers". (BTW, you don't need an infinite number of registers. You just need to know how many registers are needed in advance so the interpreter can allocate sufficient storage in one go; the compiler can work out this information easily enough.) Since the RenderMan(tm) shading language doesn't really have functions or procedures (they are instantiated at compile time and recursion is disallowed), (2) and (3) are non-issues. Issue (4) is dealt with below. >The compiler would keep a register pool, to see which registers >are free. If they are all currently in use, a new register would be created. Reasonably good register allocation schemes can be almost as simple as that. If you really want to limit the number of registers (and it's a good idea even in an interpreter because you want to minimise cache misses) but are happy with a reasonable attempt, you can reorder the code in as good an ordering as possible, then use registers just as if they were stack elements (i.e. allocate a new register where in a stack machine you'd perform a "push" operation, and deallocate the register where in the stack machine you'd perform a "pop" operation). Anyway, this is all covered in any standard compiler text, such as the dragon book[1] or the tiger book[2]. >Then something like (this might or might not be pure SLC :-) > > pushF r[0] > pushF r[1] > mulFF > popF r[2] >Could be replaced by: > mulFF r[2], r[1], r[0] Issue (4) is a problem here if you insist on VM instructions which have a fixed number of arguments each. Consider how you'd implement spline() or printf(), for example. They can have arbitrarily many arguments coming from all sorts of different sources (such as compile- time constants, shader parameters, computed values; this is important because if the parameters to spline() were all compile-time constants, you could get away with just pointing to a table of static values). In a stack machine, it's easy to implement since you can just push as many arguments as you have. In a register transfer language, there are several solutions I can think of: - Generate a call to a function to handle it. This seems like an unnecessary complication to add to a language which doesn't need function calls for any other reason. - Implement some kind of aggregate structure (e.g. a mutable array-like type) and pass that as one argument. Again, it seems a bit wasteful to add it to a language which otherwise doesn't need it. - Have some opcodes with variable numbers of arguments. Since we're not even pretending to want to generate assembly language, this seems the most acceptable solution. >When the shader is then loaded, it could be parsed: > - each instruction being replaced by the address of the executing > function, to minimize runtime overhead This is called "threaded code", and it's a very common compilation scheme. It's also a way to build a (fairly) cheap JIT compiler: simply generate code at interpretation-time to do the function calls. > - each instruction parameter being replaced by an index Even better, replace it by a pointer to the register contents. > - oneexecution environment is prepared for all instances You'd need to be able to have more than one instance of a given shader active in an implementation which uses ray tracing because the trace() function could invoke a shader of the same type, and execution has to be resumed when trace() returns. It would also apply if you wanted to implement an extension to the shading language so that shaders can invoke other shaders. (IMO, a sorely missing feature.) >Of course, this is just an idea; I've never tried this or anything else :-) >Anybody have an opinion on this design ??? I think it's a good one, so long as you realise that stack machines solve a lot of small niggly problems which using a register transfer design need to be addressed. As always: IMHO. Contents may settle in transit. YMMV. Cheers, Andrew Bromage [1] "Compilers -- Principles, Techniques and Tools" by Aho, Sethi and Ullman, Addison-Wesley, 1988 [2] "Modern Compiler Implementation in {ML,Java,C}" by A.W. Appel, Cambridge University Press, 1998