Is there a command that shows the C-code the interpreter creates?

Re: Is there a command that shows the C-code the interpreter creates?

apjanke-floss


On 4/16/19 7:00 PM, GoSim wrote:

> Do you have access to the compiled structures that Bison creates?
>
> If yes, then changing the name of the variables in the m-file (instead of
> the c-file that doesn't exist) by adding a suffix everytime the type changes
> should be sufficient. Put the structures after eachother and that's the
> compiled code. Without loops there is no performance difference, but with
> loops if it is not needed to run the parser more than once for every line it
> could be a big difference.
>
> I need to know how you detect if it is a string, int, imaginary number etc.
>
>
> If I create a program that changes the m-code according to my idea will you
> take care of the Bison thing and put its compiled structures after eachother
> and run it?
> In the future though, busy atm.

I think you're a little optimistic about how hard a project this would
be. One big issue is that *variables* in Octave are essentially untyped.
Only *values* (arrays) have types. And that type is not known until run
time, when you execute the individual line of code that is referencing
the value. In general, you can't know what the type of the value in a
particular variable is at parse time, because function calls and most
Octave expressions are untyped. You could only deduce the type for
variables which are initialized with literals inside a given function.
And thanks to the existence of evalin() and assignin(), that value and
its type can change at any time during program execution, _even when_ no
assignment or reference to the variable holding it is made. So that type
deduction is only valid for the line where initialization occurs.
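
A rough illustration of the point above, written in Python rather than Octave so it can be run standalone. The `workspace` dictionary and the `assignin_like` helper are hypothetical stand-ins for an Octave workspace and assignin(); they are not Octave APIs, and the sketch only mirrors the behaviour described in this post.

```python
# Toy model: a workspace maps variable names to values; only values carry a type.
workspace = {}

def assignin_like(ws, name, value):
    """Hypothetical stand-in for assignin(): rebinds a name from 'outside'."""
    ws[name] = value

def helper(ws):
    # Nothing in the caller's source text mentions 'a' on the calling line,
    # yet the value (and its type) bound to 'a' changes here.
    assignin_like(ws, "a", "s")

workspace["a"] = 1                        # a = 1;  a literal, so the type is deducible
type_after_init = type(workspace["a"]).__name__

helper(workspace)                         # helper();  no visible assignment to 'a'
type_after_call = type(workspace["a"]).__name__

print(type_after_init, type_after_call)
```

A type deduced at the initialization line is already stale one line later, even though the caller's source never assigns to the variable again.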

All the Bison stuff is in the Octave source code repo in the "libinterp"
subdirectory. (I think.)

Cheers,
Andrew



Re: Is there a command that shows the C-code the interpreter creates?

GoSim
I read your informative posts and, of course, I have new ideas. I still believe
it is possible, but for now I will accept that GoSim will run at 100-500 Hz
rather than kHz, and maybe leave this for the future.

I understand that your variables are dynamic, changing objects. The idea is
to change the m-code so that these objects never change type; they are then
static, and maybe a compilable program can be created.

Let's say the m-file has two lines:

a=1;
a="s";

This requires the variable type to be dynamic. Now change the variables to:

a_0=1;
a_1="s";

Now these variables' types are static. At some point in your program you know
their types, and they can be declared. You have all the functionality needed to
create a compiled program if the m-script is written "statically", i.e. in a
way such that variables don't change type, and this can be done automatically
with added suffixes.
The parser has to be the thing that tells you what the type is, and since the
type now never changes, you know it before run time.
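
A minimal sketch of the suffix idea, in Python so it can be run standalone. It assumes straight-line code whose right-hand sides are all literals, which is the one case where the type is known without running anything. The function name `rename_by_type` and the (name, value) input format are made up for illustration.

```python
def rename_by_type(assignments):
    """Rename each assigned variable to name_N, bumping N when its type changes.

    `assignments` is a list of (name, value) pairs for straight-line code
    whose right-hand sides are all literals.
    """
    counter = {}     # name -> current suffix
    last_type = {}   # name -> type name of the most recent value
    renamed = []
    for name, value in assignments:
        tname = type(value).__name__
        if name not in counter:
            counter[name] = 0            # first assignment gets suffix _0
        elif last_type[name] != tname:
            counter[name] += 1           # type changed, so start a new name
        last_type[name] = tname
        renamed.append((f"{name}_{counter[name]}", tname, value))
    return renamed

program = [("a", 1), ("a", "s"), ("a", "t"), ("b", 2.5)]
for new_name, tname, value in rename_by_type(program):
    print(f"{new_name} = {value!r}  # {tname}")
```

Reassigning with the same type keeps the current suffix; only a type change creates a new name. Right-hand sides that are function calls or arbitrary expressions are outside what this sketch can handle.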

The biggest question for me is how the compiled data that Bison's parser
creates works: how accessible and usable is it?





--
Sent from: http://octave.1599824.n4.nabble.com/Octave-General-f1599825.html



Re: Is there a command that shows the C-code the interpreter creates?

nrjank
GoSim wrote:

> Let's say the m-file has two lines:
>
> a=1;
> a="s";
>
> this requires the variable type to be dynamic. Now change the variables to:
>
> a_0=1;
> a_1="s";
>
> Now these variables types are static.

So, are you suggesting that in order to compile, users would be restricted in the variable naming they use, or that this renaming would somehow happen "behind the scenes" in the interpreter and be transparent to the user? Could the user still write a = 1; a = "s"; and never see the a_0, a_1 formulation? The alternative would require a complete break in m-code compatibility with Matlab and in backward compatibility with previous versions of Octave. It would also place quite a cumbersome burden on code development to have one style of variable naming for general use and another, much stricter naming scheme for compiler-compatible m-code.





Re: Is there a command that shows the C-code the interpreter creates?

GoSim
It would be automatic and not done by the user. Presenting the variables is
easy: just remove the suffix. Adding a suffix with a counter to every variable
when it changes type is not a problem, since your interpreter has all the
info.
Steps would be:
m-code (by user) -> m-code auto changed to have "static variable types" ->
create runnable code


Also commands like assignin() that directly affect the dynamic objects would
not be supported, but only a few commands not working is acceptable.

Only the compiled data that Bison creates worries me; I have no idea how it
can be used. But assuming the pieces can be put one after another and the
variables in them can be declared at the beginning with another compiled
piece... that's a runnable program.

You already have variable-handling capability and experience. In this compiler
every variable is an object with a counter that increments every time the
variable changes type; the suffix follows the counter: _0, _1, _2, etc. The
types can be kept in a LinkedList (Java) or whatever container you are used to
in C++. Now you have all the names and types, and variables are static in
their type.
The solution is there.






Re: Is there a command that shows the C-code the interpreter creates?

GoSim
In reply to this post by nrjank
So basically every m-file variable becomes many C-file variables. It changes
name every time it changes type. The names and types are stored, and these are
used to create runnable code.




Re: Is there a command that shows the C-code the interpreter creates?

nrjank
On Wed, Apr 17, 2019 at 5:13 PM GoSim <[hidden email]> wrote:
> So basically every m-file-variable becomes many C-file-variables. It changes
> name everytime it changes type. The names and types are stored and these are
> used to create runnable code.


How well would that scale? We often have people generating data sets that approach the memory limit.



Re: Is there a command that shows the C-code the interpreter creates?

apjanke-floss


On 4/17/19 6:11 PM, Nicholas Jankowski wrote:

> On Wed, Apr 17, 2019 at 5:13 PM GoSim <[hidden email]
> <mailto:[hidden email]>> wrote:
>
>     So basically every m-file-variable becomes many C-file-variables. It
>     changes
>     name everytime it changes type. The names and types are stored and
>     these are
>     used to create runnable code.
>
>
> how well would that scale? we often have people generating
> approaching-memory-limit sized data.  
>
>

I believe this could be done automatically, and if you placed some
limits on the use of assignin(), evalin(), and eval() (probably
forbidding them outright like GoSim says), it would have the same memory
use properties as existing code, because you could just automatically
clear the variables immediately after their last reference. (Of course,
you'd have to implement your own reference counting if you weren't
simply using octave_value objects for everything; and if you're still
using octave_value objects for everything, then what's the point of
converting to C? This also raises the question of what data structures
you _would_ use to represent arrays, and how you would pass them to all
the existing Octave functions that take Octave Array objects.)

This is a known approach called Static Single Assignment.
https://en.wikipedia.org/wiki/Static_single_assignment_form
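
To make the "clear the variables immediately after their last reference" idea concrete, here is a small Python sketch that works on a made-up list of statements. The (text, names) statement format and the `insert_clears` name are assumptions for illustration. It only handles straight-line code; loops, branches, and eval-style calls would need a real liveness analysis.

```python
def insert_clears(statements):
    """Emit a 'clear' right after the last statement that mentions each variable.

    Each statement is (text, names), where `names` lists every variable the
    statement defines or reads. Assumes straight-line code only.
    """
    # First pass: record the index of the last statement mentioning each name.
    last_use = {}
    for index, (_, names) in enumerate(statements):
        for name in names:
            last_use[name] = index

    # Second pass: copy statements through, adding clears where names die.
    out = []
    for index, (text, names) in enumerate(statements):
        out.append(text)
        dead = sorted(n for n in set(names) if last_use[n] == index)
        if dead:
            out.append("clear " + " ".join(dead))
    return out

code = [
    ("a_0 = rand(1000);", ["a_0"]),
    ("a_1 = num2str(a_0);", ["a_1", "a_0"]),
    ("disp(a_1);", ["a_1"]),
]
print("\n".join(insert_clears(code)))
```

With the renamed variables a_0 and a_1, the large numeric value is released as soon as the string version has been built, so the renaming by itself does not have to increase peak memory use.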

It would probably be awfully hard to do this on top of the existing
bison/yacc approach rather than an IR that you could transform
programmatically. My guess is that this would be a year or more of work
for an experienced compiler hacker.

Cheers,
Andrew



Re: Is there a command that shows the C-code the interpreter creates?

GoSim
I think that, instead of trying to hack the compiled file, it is easier to
make the parser create compiled segments that are useful. This approach is
probably close to what Octave devs do all the time.




Re: Is there a command that shows the C-code the interpreter creates?

GoSim
In reply to this post by GoSim
GoSim wrote

> So basically every m-file-variable becomes many C-file-variables. It
> changes
> name everytime it changes type. The names and types are stored and these
> are
> used to create runnable code.
>


I was wrong here.

Every m-file variable is turned into many static m-file variables, not
C-file variables.




Re: Is there a command that shows the C-code the interpreter creates?

Ian McCallion
On Thursday, 18 April 2019, GoSim <[hidden email]> wrote:

> Every m-file variable is turned in to many static m-file variables, not
> C-file variables.

I don't think this works. How, for example, would it cope with:

    if condition
       A = 1;
    else 
       A = "1";
    end

    B = A;

By nature, an untyped interpreted language starts to fall short when inner loops need executing many times, and arguably it is a shame that at those points performance demands a switch to a different language and IDE. So here is an alternative approach to this particular issue: a language extension to bracket the code needing compilation. For example:

function x()
  Octave code
  compile
     (Octave code with language restrictions)
  endcompile
  More octave code
endfunction

The code would be compiled and cached when the containing function is first encountered. You could start with some severe language restrictions (e.g. only numeric data and for loops) as a proof of concept.
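
The "compile once, then reuse" part of this proposal can be sketched in Python, using the built-in compile() (which produces Python bytecode, not machine code) purely as an analogy. The names `get_compiled`, `run_block`, and the cache keyed by source text are assumptions for illustration and say nothing about how Octave would implement it.

```python
compile_count = 0   # counts real compilations, to show the cache is working
_cache = {}         # source text -> compiled code object

def get_compiled(source):
    """Return a code object for `source`, compiling only on first request."""
    global compile_count
    if source not in _cache:
        compile_count += 1
        _cache[source] = compile(source, "<compile-block>", "exec")
    return _cache[source]

def run_block(source, variables):
    """Execute the cached code object against a dict of variables."""
    exec(get_compiled(source), {}, variables)
    return variables

# A restricted block: numeric data and a for loop only.
block = "total = 0\nfor i in range(n):\n    total += i * i\n"
first = run_block(block, {"n": 10})
second = run_block(block, {"n": 100})
print(first["total"], second["total"], compile_count)
```

The second call reuses the cached code object, so the cost of compilation is paid only the first time the block is encountered.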

Cheers... Ian



Re: Is there a command that shows the C-code the interpreter creates?

GoSim
>     if condition
>        A = 1;
>     else
>        A = "1";
>     end
>
>     B = A;


A method could be: when an if condition is encountered, search it for
"declarations" and put them in front of the if construction. Initialize them
with values that the user cannot set. Afterwards, check which of these values
has changed and select that one.

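    A_0 = [];  A_0_set = false;   % the *_set flags play the role of "null":
    A_1 = [];  A_1_set = false;   % generated names the user cannot touch
    if condition
        A_0 = 1;    A_0_set = true;
    else
        A_1 = "1";  A_1_set = true;
    end

Find which A was set. Let's say it is A_1:

    B_0 = A_1;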

Some minor additions to your interpreter could handle this.
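
For comparison, the sketch below (Python, standalone) shows what a static analysis sees at the point where the two branches meet. In Static Single Assignment form this merge point is where a phi function goes. The function name `types_after_if` and the dictionary format are assumptions for illustration only.

```python
def types_after_if(then_types, else_types):
    """Merge per-variable types from the two arms of an if/else.

    Each argument maps a variable name to the type name assigned in that arm.
    The result maps each name to the set of types it may hold afterwards.
    """
    merged = {}
    for name in set(then_types) | set(else_types):
        merged[name] = {
            arm.get(name, "undefined") for arm in (then_types, else_types)
        }
    return merged

after = types_after_if({"A": "double"}, {"A": "char"})
# B = A copies whatever A holds, so B inherits A's whole set of possible types.
after["B"] = set(after["A"])

for name in sorted(after):
    kinds = sorted(after[name])
    status = "static" if len(kinds) == 1 else "needs a run-time tag"
    print(name, kinds, status)
```

Here both A and B end up with two possible types after the if, so whichever copy is selected has to be decided while the program runs, and the same question then applies to every later use of B.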




Re: Is there a command that shows the C-code the interpreter creates?

GoSim
In reply to this post by Ian McCallion
http://octave.1599824.n4.nabble.com/compile-m-file-loops-nested-loops-more-comprehensive-solution-td4691043.html

This is an older post where I solved nested loops. At the time I thought ints
and doubles were different in Octave, but the idea is the same.






Re: Is there a command that shows the C-code the interpreter creates?

apjanke-floss

On 4/18/19 12:13 PM, GoSim wrote:
> http://octave.1599824.n4.nabble.com/compile-m-file-loops-nested-loops-more-comprehensive-solution-td4691043.html
>
> this is an older post where I solved nested loops, I thought ints and
> doubles were different in octave then, but the idea is the same.
>

If you actually want to do this, you might have better luck writing it
as an Octave package rather than waiting for the core Octave interpreter
to grow this functionality. You could start with one of these ANTLR
grammars for Matlab:

https://www.mathworks.com/matlabcentral/fileexchange/32769-mparser
https://github.com/antlr/grammars-v4/tree/master/matlab

and modify it to support Octave's dialect. ANTLR can build parse tree
data structures, not just execute parsing. (Of course, anything that can
execute parsing could also be used to build a parse tree, by having it
create parse tree nodes as its action on visiting each parse token or
whatever.) That could be the basis for an IR on which you do the
transformations.
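
As a rough picture of "create parse tree nodes as the action for each rule", here is a tiny hand-written recursive-descent parser in Python. It is neither ANTLR nor Bison, and it only understands statements of the form `name = expr;` with numbers, identifiers, `+` and `*`. The node shapes (nested tuples) and function names are made up for illustration.

```python
import re

TOKEN = re.compile(r"\s*(?:(\d+\.?\d*)|([A-Za-z_]\w*)|(.))")

def tokenize(text):
    """Split text into (kind, value) pairs: num, id, or op."""
    tokens = []
    for number, ident, op in TOKEN.findall(text):
        if number:
            tokens.append(("num", float(number)))
        elif ident:
            tokens.append(("id", ident))
        elif op.strip():                      # ignore stray whitespace matches
            tokens.append(("op", op))
    return tokens

def parse_assignment(tokens):
    """Parse `name = expr ;` and return a nested-tuple parse tree."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else ("eof", None)

    def take(kind, value=None):
        nonlocal pos
        tok = peek()
        if tok[0] != kind or (value is not None and tok[1] != value):
            raise SyntaxError(f"expected {value or kind}, got {tok}")
        pos += 1
        return tok

    def primary():
        # The "action" for this rule: build a leaf node.
        if peek()[0] == "num":
            return ("num", take("num")[1])
        return ("var", take("id")[1])

    def term():
        node = primary()
        while peek() == ("op", "*"):
            take("op", "*")
            node = ("mul", node, primary())   # action: build a 'mul' node
        return node

    def expr():
        node = term()
        while peek() == ("op", "+"):
            take("op", "+")
            node = ("add", node, term())      # action: build an 'add' node
        return node

    target = take("id")[1]
    take("op", "=")
    tree = ("assign", target, expr())
    take("op", ";")
    return tree

print(parse_assignment(tokenize("a = 1 + 2 * b;")))
```

A tree like this, rather than the parser itself, is the kind of structure a renaming or type-tracking pass would walk and rewrite.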

For that matter, contributing an Octave grammar to that ANTLR grammars
repo or as a standalone project might be a fun project for someone who
knows Octave well. That could be useful as a basis for a linter, code
formatter, etc., in addition to this C/C++ converter. Those could work
as Octave packages since the Octave interpreter and desktop
automatically pick up changes to "externally" modified files.

Cheers,
Andrew



Re: Is there a command that shows the C-code the interpreter creates?

GoSim
For an outsider, hacking the compiled code could be the easiest route, as you
first suggested. What does your compiled code look like? Is it readable or
just 0011010101? If it is readable, would it be hard to make it accessible to
outsiders through a command?


