August 7, 2017 at 6:05 pm
patrickmcginnis59 10839 - Monday, August 7, 2017 1:30 PM
some problems where enormous performance gains can be had by interpreting instead of compiling to machine level
What are some examples of those problems? In my simple mind, interpretation is invariably slower, so it'd be good to broaden my horizons on that one!
What sometimes happens is smaller code size resulting in better code locality: the interpreted instructions do much more than a single machine instruction (if the right instruction set has been chosen for efficiency, which is a crucial part of the design), so the code size for the application is much smaller with that coding than when translated right down to machine code. The interpreter is itself much smaller than the app. So with reasonable caching there is a big reduction in cache misses for executable or interpretable stuff when compared to cache misses for instructions in the fully compiled version - a given amount of work incurs fewer cache misses.
On top of the reduced cache misses, the app code is so much smaller that a significantly greater portion of it can remain in main storage, so for a given amount of work there is less pulling of code from discs or drums or whatever you have paged it out to, and you may even reduce the storage requirement so much that you get less data page swapping for a given amount of work too.
Of course the paging benefits for data (as opposed to for code - you can probably just lock the interpreter and the app in main store) will be pretty small if there is fairly random access to a great mass of data, so the big gains tend to happen for apps with lots of code and without massive random data access. Even there you need to understand how the system functions, which bits you can lock into main store, and how much code size reduction you get from using your interpreter, and do a thorough study of your particular app before you can tell whether interpretation or compilation will deliver better performance for your workload (and for a given app, the answer will depend on the hardware - so perhaps you will want to re-evaluate each time you go for a hardware upgrade that changes some significant factors).
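To make the density point concrete, here is a rough sketch of the sort of thing I mean - a toy stack-machine interpreter in C (the opcodes and names are invented purely for illustration, not taken from any real system). Each one-byte opcode drives several machine instructions inside the interpreter's dispatch loop, which is where the code-size reduction for the application comes from.

/* Toy stack-machine interpreter - illustrative sketch only. */
#include <stdio.h>

enum { OP_PUSH, OP_ADD, OP_MUL, OP_PRINT, OP_HALT };

static void run(const unsigned char *code)
{
    long stack[64];
    int sp = 0;                                          /* stack pointer */
    for (const unsigned char *pc = code; ; ) {
        switch (*pc++) {
        case OP_PUSH:  stack[sp++] = *pc++;             break; /* next byte is the operand */
        case OP_ADD:   sp--; stack[sp-1] += stack[sp];  break;
        case OP_MUL:   sp--; stack[sp-1] *= stack[sp];  break;
        case OP_PRINT: printf("%ld\n", stack[sp-1]);    break;
        case OP_HALT:  return;
        }
    }
}

int main(void)
{
    /* (2 + 3) * 4 expressed in ten bytes of "application code" */
    const unsigned char prog[] = {
        OP_PUSH, 2, OP_PUSH, 3, OP_ADD, OP_PUSH, 4, OP_MUL, OP_PRINT, OP_HALT
    };
    run(prog);
    return 0;
}

That ten-byte program would translate into considerably more than ten bytes of machine code if fully compiled, and that ratio - multiplied across a large application - is what buys the cache and paging benefits described above.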
One thing a lot of people forget (or perhaps never knew) is that a lot of the computer designs that ever actually got built didn't execute their "machine code" directly - they used microprogramming to interpret their machine code, as first suggested by Maurice Wilkes in 1951, instead of implementing their order code directly in hardware. This was Wilkes's second big contribution to computer hardware design - the first was to have the world's first real stored-program computer (EDSAC) completed and fully working in 1949 (leading to the Lyons adaptation of EDSAC, which had the LEO 1 computer running the first commercial applications on a stored-program computer in 1951); his later EDSAC 2 was the first computer to have a microprogrammed order code.
Tom
August 8, 2017 at 7:53 am
What sometimes happens is smaller code size resulting in better code locality: the interpreted instructions do much more than a single machine instruction (if the right instruction set has been chosen for efficiency, which is a crucial part of the design), so the code size for the application is much smaller with that coding than when translated right down to machine code. The interpreter is itself much smaller than the app. So with reasonable caching there is a big reduction in cache misses for executable or interpretable stuff when compared to cache misses for instructions in the fully compiled version - a given amount of work incurs fewer cache misses.
Do you have a citation for that? Because if it were the case, then compilers would compile to byte codes as a matter of efficiency, right? I'm not dismissing the effects of cache locality, but if what you're saying is true then compiler writers would be producing this code as an optimization option, right? I know there are all sorts of optimization flags available in tool chains, so why wouldn't that be a flag? And if what you're saying were true in the broad sense, why did the Java folks feel the need for that HotSpot JIT stuff?
I'm not saying you're making it up, just looking for a citation and for why it isn't used more. It's not like various levels of cache are a new thing, after all; I've read lots about writing code to be cache friendly and I've never run across this before, although I certainly could have missed it and it sounds interesting.
August 8, 2017 at 8:09 am
n.ryan - Monday, August 7, 2017 2:25 PM
TomThomson - Monday, August 7, 2017 12:06 PM
(snip) Despite that, it would be nice not to have to remember 37 different idioms for testing whether one number is smaller than another - wouldn't it be nice if that was always done using the symbol " < "? Also, different hardware may need different languages - I can't imagine trying to write a compiler to get C++ executed efficiently on a distributed graph reduction engine, for example. (snip)
Comparing less than or greater than is usually fairly consistent... until one gets to a language where one has to use words instead (the most recent example of this for me is PowerShell). Where I usually come unstuck at least once or twice when switching languages is the equivalence operators and the often barely comprehensible rules overlaid on them. This is made particularly troublesome in a language where pointers are obscured from the developers, yet sometimes they are used in comparisons and other times they are not; sometimes objects are copied, sometimes objects are passed by reference (known as pointers to everyone else).
One of the annoyances I frequently have with languages is their desire to hide the machine implementation so much that, as a developer who often needs to interact with hardware more directly, working out the exact damn data size of a variable type is excessively annoying. Quite why the types in, for example, C - which is about as low as you can get before assembly - aren't named int8, int16, int32, int64, uint8, uint16 and so on, I just can't fathom. I'm quite happy to accept that sometimes I just don't care about the size (within some constraints), and in those cases accepting a type which is optimal for the target bitness of the processor is fine; however, on many other occasions I really need to know for sure that when I'm referring to a uint8 I am always referring to a uint8, and that it won't be compiled to a uint32 because that's more efficient for the processor to handle.
I can sympathize, but I really do treat references very differently from pointers: with references I just follow the intent of references, whereas with pointers (in C anyway) I don't mind incrementing them or some of the other array language-lawyering - but that's pretty much a C sort of thing anyway; you really are more responsible in every way for memory management, and it pays off to know what's happening. I don't mess with C++ at all, beyond having changed a few existing programs a bit.
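Something like this is all I mean by the incrementing stuff, by the way (just a throwaway sketch, not from anything real):

#include <stdio.h>

int main(void)
{
    int data[] = { 10, 20, 30, 40 };
    int sum = 0;
    /* p starts at the first element; p++ moves it along one int at a time */
    for (int *p = data; p < data + 4; p++)
        sum += *p;
    printf("sum = %d\n", sum);   /* prints: sum = 100 */
    return 0;
}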
With C types and sizes, there are times I want to know - for manipulating sound files, for instance - and times I just fill in the blanks and it's all good. For signedness issues (when it matters), I think it's best to just trace down the signedness of your data type, although there do seem to be some hacky macros out there that might work. For size, there's sizeof(), which I use when I need to know. Obviously you're probably up on all that stuff anyway, just throwing it out there. I hardly do any C on Windows, though, and don't program in C professionally at all - more of a hacker minus the "er" part LOL
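Oh, and one thing that might help with the size gripe above: if your compiler does C99 or later, <stdint.h> gives you exact-width types with the size right in the name, and sizeof() confirms it. A quick throwaway example (the variable names are just for illustration):

/* Exact-width types from C99's <stdint.h>, checked with sizeof(). */
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    uint8_t  sample8  = 0xFF;        /* always exactly 8 bits  */
    int16_t  sample16 = -12345;      /* always exactly 16 bits */
    uint32_t sample32 = 0xDEADBEEF;  /* always exactly 32 bits */

    printf("uint8_t  is %zu byte(s)\n", sizeof sample8);
    printf("int16_t  is %zu byte(s)\n", sizeof sample16);
    printf("uint32_t is %zu byte(s)\n", sizeof sample32);
    return 0;
}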
August 8, 2017 at 7:13 pm
patrickmcginnis59 10839 - Tuesday, August 8, 2017 7:53 AM
What sometimes happens is smaller code size resulting in better code locality: the interpreted instructions do much more than a single machine instruction (if the right instruction set has been chosen for efficiency, which is a crucial part of the design), so the code size for the application is much smaller with that coding than when translated right down to machine code. The interpreter is itself much smaller than the app. So with reasonable caching there is a big reduction in cache misses for executable or interpretable stuff when compared to cache misses for instructions in the fully compiled version - a given amount of work incurs fewer cache misses.
Do you have a citation for that? Because if it were the case, then compilers would compile to byte codes as a matter of efficiency, right? I'm not dismissing the effects of cache locality, but if what you're saying is true then compiler writers would be producing this code as an optimization option, right? I know there are all sorts of optimization flags available in tool chains, so why wouldn't that be a flag? And if what you're saying were true in the broad sense, why did the Java folks feel the need for that HotSpot JIT stuff?
I'm not saying you're making it up, just looking for a citation and for why it isn't used more. It's not like various levels of cache are a new thing, after all; I've read lots about writing code to be cache friendly and I've never run across this before, although I certainly could have missed it and it sounds interesting.
There were quite a few papers in technical journals about it, beginning quite a long way back - maybe the 1960s - and it continued to show up in journals into the 90s. I saw one or two practical demonstrations of it at some point, and I got performance boosts out of it on a couple of things, but I think most applications won't get significant gains from it, because the combination of very large code volume, less large data volume, and high IO throughput is probably not common. And of course as cache sizes increase a program may cease to need too much cache, and as main store sizes increase the program may cease to thrash between disc and RAM, so unless the program is expected to grow in complexity as fast as hardware grows in capability, the benefits of the interpreter approach may be rather short-lived.
But I don't remember where the papers and reports appeared, because at about the end of '95 I shifted from an R&D environment which had a very large R component to one that had very little R, and I soon lost track of where the papers I had read on most things had appeared; after that I was paying attention to research (whether academic or industrial) only in security and in relational theory (plus a little in high performance computing so that I could, when asked, evaluate requests for funding for the EU's ESPRIT programme).
Tom