Is there a command that shows the C-code the interpreter creates?

GoSim
A command that shows the C-code the interpreter creates, does such a thing
exist?
I would like to create an m-file compiler for Octave and need this because I
don't want to look at your source code :-)



--
Sent from: http://octave.1599824.n4.nabble.com/Octave-General-f1599825.html


Re: Is there a command that shows the C-code the interpreter creates?

siko1056
On Tue, Apr 16, 2019 at 7:28 AM GoSim <[hidden email]> wrote:
A command that shows the C-code the interpreter creates, does such a thing
exist?
I would like to create a m-file compiler for Octave and need this because I
don't want to look at your source code :-)


In general it might be difficult to write an Octave related tool without looking at its source code?!  But if you take a look at https://hg.savannah.gnu.org/hgweb/octave/file/159402e52cfa/libinterp/parse-tree, currently Octave uses flex and bison for parsing m-code.  Thus objects are created and evaluated while parsing, without an intermediate language representation as you seek to find.

HTH,
Kai 


Re: Is there a command that shows the C-code the interpreter creates?

John W. Eaton
Administrator
In reply to this post by GoSim
On 4/15/19 6:27 PM, GoSim wrote:
> A command that shows the C-code the interpreter creates, does such a thing
> exist?

What do you mean by "the C-code the interpreter creates"?  Why do you
think it creates C code?  How do you think it works?

> I would like to create a m-file compiler for Octave and need this because I
> don't want to look at your source code :-)

Unless you do your work completely from scratch without using Octave, I
don't see how you would do this job without looking at Octave source code.

jwe



Re: Is there a command that shows the C-code the interpreter creates?

Jordi Gutiérrez Hermoso-2
In reply to this post by GoSim
On Mon, 2019-04-15 at 17:27 -0500, GoSim wrote:
> A command that shows the C-code the interpreter creates, does such a thing
> exist?

I don't think it exists, but I'm surprised you're expecting it to
already exist. Is it because you think Octave was implemented in LLVM?
Octave is implemented in flex and bison. Others have pointed out where
those source files are in our tree.

Are you perhaps looking for an example of the bison and flex output?

Making Octave compile m-files into any other language is a huge
undertaking.

We did have someone a long time ago start to do some basic porting of
Octave to start using LLVM so that the IR that you might want could
exist, but the porting effort was using the unstable C++ LLVM API
which nobody could keep up to date, so that effort is mostly dead.
Reviving the LLVM bindings, or using any other JIT compiler like
libgccjit, would be a huge and significant contribution.


Re: Is there a command that shows the C-code the interpreter creates?

GoSim
In reply to this post by John W. Eaton
I think the interpreter creates C-code which is run....?

I thought you created the interpreter but now I understand you are using a
tool called Bison. But Bison creates C code, no? Is this code not available
to you?



Re: Is there a command that shows the C-code the interpreter creates?

GoSim
In reply to this post by Jordi Gutiérrez Hermoso-2
Yes, the Bison and/or flex output. This is C code, isn't it?

Wikipedia says it can create Java code also.



Re: Is there a command that shows the C-code the interpreter creates?

GoSim
In reply to this post by GoSim
Wikipedia says it can create Java code also. Since I am a Java guy I would
prefer that, if Java code can run your other files.



Re: Is there a command that shows the C-code the interpreter creates?

GoSim
In reply to this post by GoSim
The Bison parser implementation file is C code which defines a function named
yyparse which implements that grammar. This function does not make a
complete C program: you must supply some additional functions. One is the
lexical analyzer. Another is an error-reporting function which the parser
calls to report an error. In addition, a complete C program must start with
a function called main; you have to provide this, and arrange for it to call
yyparse or the parser will never run. See Parser C-Language Interface.


from:
http://www.gnu.org/software/bison/manual/html_node/Bison-Parser.html

I am assuming this is something that you do? Somehow you are running the
code, can't you just make it available? I will try to solve the differing
variable problem and create a m-file compiler which everyone wants.

Are you running it line by line? How do you run your c-code?



Re: Is there a command that shows the C-code the interpreter creates?

Ian McCallion
On Tue, 16 Apr 2019 at 18:42, GoSim <[hidden email]> wrote:

>
> The Bison parser implementation file is C code which defines a function named
> yyparse which implements that grammar. This function does not make a
> complete C program: you must supply some additional functions. One is the
> lexical analyzer. Another is an error-reporting function which the parser
> calls to report an error. In addition, a complete C program must start with
> a function called main; you have to provide this, and arrange for it to call
> yyparse or the parser will never run. See Parser C-Language Interface.
>
>
> from:
> http://www.gnu.org/software/bison/manual/html_node/Bison-Parser.html
>
> I am assuming this is something that you do? Somehow you are running the
> code, can't you just make it available? I will try to solve the differing
> variable problem and create a m-file compiler which everyone wants.
>
> Are you running it line by line? How do you run your c-code?

I think there is a terminological misunderstanding here. Clearly,
during Octave execution, functions are parsed when first encountered
and the parser output is cached as a data structure in memory.  These
cached data structures could be considered as compiled code (and could
be the starting point of your compiler) but obviously no-one with
knowledge of Octave internals thinks this way, whence the
misunderstanding!
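
A minimal sketch of the "parse once, keep the tree in memory" idea described above. It is written in Python rather than Octave's C++, and it borrows Python's own `ast` parser as a stand-in for the Bison-generated one; the `FunctionTable` class and its fields are hypothetical names for illustration, not Octave's real data structures.

```python
import ast
import operator

# Binary operators the toy evaluator understands.
OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def evaluate(node, env):
    """Walk a parsed expression tree and compute its value."""
    if isinstance(node, ast.Expression):
        return evaluate(node.body, env)
    if isinstance(node, ast.Constant):
        return node.value
    if isinstance(node, ast.Name):
        return env[node.id]
    if isinstance(node, ast.BinOp) and type(node.op) in OPS:
        left = evaluate(node.left, env)
        right = evaluate(node.right, env)
        return OPS[type(node.op)](left, right)
    raise NotImplementedError(type(node).__name__)


class FunctionTable:
    """Hypothetical cache: function name -> parse tree (not Octave's real class)."""

    def __init__(self):
        self.trees = {}
        self.parse_count = 0  # how many times the parser actually ran

    def call(self, name, source, **variables):
        if name not in self.trees:  # first encounter: parse and remember
            self.trees[name] = ast.parse(source, mode="eval")
            self.parse_count += 1
        return evaluate(self.trees[name], variables)


if __name__ == "__main__":
    table = FunctionTable()
    for x in range(3):
        print(table.call("line", "a * x + b", a=2, b=1, x=x))
    print("parser ran", table.parse_count, "time(s)")
```

The point of the sketch is only that the loop evaluates the same in-memory tree three times while the parser runs once; no C code or other source text is produced along the way.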

For what it's worth, a decade or so ago I had access to Matlab and
experimented briefly with their compiler. From memory the compiled C
code consisted almost entirely of function calls to a large runtime
library containing entry points to do such things as "add". Could this
be your first step? It would have no significant performance advantage
but could be the basis for further development.
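
As a rough illustration of what "compiled code that is almost entirely calls into a runtime library" could look like, here is a small Python sketch that turns an expression tree into nested call text. The entry-point names (`rt_add`, `rt_mul`, `rt_scalar`, and so on) are invented for this example and do not correspond to any real Matlab or Octave runtime; Python's `ast` parser again stands in for an M-code parser.

```python
import ast

# Hypothetical runtime entry points, one per operator (names are made up).
RUNTIME_CALLS = {
    ast.Add: "rt_add",
    ast.Sub: "rt_sub",
    ast.Mult: "rt_mul",
    ast.Div: "rt_div",
}


def lower(node):
    """Turn an expression tree into nested runtime-library calls, as C-like text."""
    if isinstance(node, ast.Expression):
        return lower(node.body)
    if isinstance(node, ast.Name):
        return node.id
    if isinstance(node, ast.Constant):
        # Literals are boxed by a (hypothetical) runtime constructor.
        return f"rt_scalar({node.value!r})"
    if isinstance(node, ast.BinOp) and type(node.op) in RUNTIME_CALLS:
        callee = RUNTIME_CALLS[type(node.op)]
        return f"{callee}({lower(node.left)}, {lower(node.right)})"
    raise NotImplementedError(type(node).__name__)


def compile_expression(source):
    """Parse an expression and emit the equivalent runtime-call text."""
    return lower(ast.parse(source, mode="eval"))


if __name__ == "__main__":
    print(compile_expression("a + b * 2"))
    # prints: rt_add(a, rt_mul(b, rt_scalar(2)))
```

Because every operation still goes through a generic runtime routine that has to inspect its operands, output of this shape would not be expected to run noticeably faster than interpreting the tree directly.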

Cheers... Ian


Re: Is there a command that shows the C-code the interpreter creates?

GoSim
No, I want the readable code, not compiled code.

--------------------------------------------------------
Here is a Bison tutorial:
http://alumni.cs.ucr.edu/~lgao/teaching/bison.html

Steps to use Bison:

Write a lexical analyzer to process input and pass tokens to the parser
(calc.lex).
Write the grammar specification for bison (calc.y), including grammar rules,
yyparse() and yyerror().
Run Bison on the grammar to produce the parser. (Makefile)
*Compile the code output by Bison, as well as any other source files.*
Link the object files to produce the finished product.
------------------------------------------------------

I want the bold marked line, "code output by Bison". Is this doable?





Re: Is there a command that shows the C-code the interpreter creates?

Mike Miller-4
On Tue, Apr 16, 2019 at 14:49:53 -0500, GoSim wrote:

> No, I want the readable code, not compiled code.
>
> --------------------------------------------------------
> Here is a Bison tutorial:
> http://alumni.cs.ucr.edu/~lgao/teaching/bison.html
>
> Steps to use Bison:
>
> Write a lexical analyzer to process input and pass tokens to the parser
> (calc.lex).
> Write the grammar specification for bison (calc.y), including grammar rules,
> yyparse() and yyerror().
> Run Bison on the grammar to produce the parser. (Makefile)
> *Compile the code output by Bison, as well as any other source files.*
> Link the object files to produce the finished product.
> ------------------------------------------------------
>
> I want the bold marked line, "code output by Bison". Is this doable?

This is the code produced by Bison for Octave:

  https://sources.debian.org/src/octave/5.1.0-1/libinterp/parse-tree/oct-parse.cc/

--
mike


Re: Is there a command that shows the C-code the interpreter creates?

Jordi Gutiérrez Hermoso-2
In reply to this post by GoSim
On Tue, 2019-04-16 at 12:20 -0500, GoSim wrote:
> Yes, the Bison and/or flex output. This is C code, isn't it?

It's C++, not C. Here you go, I attached the output of an old Octave
build I had lying around.

> Wikipedia says it can create Java code also.

I think we'd have to rewrite our source to do so, since I believe our
bison and flex sources are written for C++ output.


bison-flex-octave-output.tar.gz (128K) Download Attachment
Re: Is there a command that shows the C-code the interpreter creates?

GoSim
In reply to this post by Mike Miller-4
Ok, I misunderstood, Bison doesn't parse the m-code but creates a parser. So
Bison is a general interpreter. And the parser creates a parse tree.

And you run this parser on every line in the m-file. I thought the m-code
was converted to C-code somewhere...

How does your parser know if it is an int or a double?



Re: Is there a command that shows the C-code the interpreter creates?

GoSim
In reply to this post by Jordi Gutiérrez Hermoso-2
That was not necessary, I had misunderstood how it worked. I am sorry.



Re: Is there a command that shows the C-code the interpreter creates?

GoSim
In reply to this post by Ian McCallion
>I think there is a terminological misunderstanding here. Clearly,
>during Octave execution, functions are parsed when first encountered
>and the parser output is cached as a data structure in memory.  These
>cached data structures could be considered as compiled code (and could
>be the starting point of your compiler) but obviously no-one with
>knowledge of Octave internals thinks this way, whence the
>misunderstanding!


This is very interesting. If these data structures are put after
each other... do you do anything manually or does Bison do everything? I am
guessing you have to manually take care of the variables?



Re: Is there a command that shows the C-code the interpreter creates?

apjanke-floss
In reply to this post by GoSim


On 4/16/19 4:56 PM, GoSim wrote:
> Ok, I misunderstood, Bison doesn't parse the m-code but creates a parser. So
> Bison is a general interpreter. And the parser creates a parse tree.

Pretty much. Bison generates a C program which is a parser, and the
execution of that parser procedurally processes the parse tree, though
the parse tree itself may not be represented explicitly in a data structure.

> And you run this parser on every line in the m-file. I thought the m-code
> was converted to C-code somewhere...

Nope.

> How does your parser know if it is an int or a double?

All numerics in Octave are doubles unless you explicitly convert them to
ints using int32() or similar conversion functions. All numeric literals
in Octave produce doubles. So if you see a number in Octave code, it's a
double.

Cheers,
Andrew


Re: Is there a command that shows the C-code the interpreter creates?

apjanke-floss
In reply to this post by Ian McCallion


On 4/16/19 3:12 PM, Ian McCallion wrote:

> On Tue, 16 Apr 2019 at 18:42, GoSim <[hidden email]> wrote:
>>
>> The Bison parser implementation file is C code which defines a function named
>> yyparse which implements that grammar. This function does not make a
>> complete C program: you must supply some additional functions. One is the
>> lexical analyzer. Another is an error-reporting function which the parser
>> calls to report an error. In addition, a complete C program must start with
>> a function called main; you have to provide this, and arrange for it to call
>> yyparse or the parser will never run. See Parser C-Language Interface.
>>
>>
>> from:
>> http://www.gnu.org/software/bison/manual/html_node/Bison-Parser.html
>>
>> I am assuming this is something that you do? Somehow you are running the
>> code, can't you just make it available? I will try to solve the differing
>> variable problem and create a m-file compiler which everyone wants.
>>
>> Are you running it line by line? How do you run your c-code?
>
> I think there is a terminological misunderstanding here. Clearly,
> during Octave execution, functions are parsed when first encountered
> and the parser output is cached as a data structure in memory.  These
> cached data structures could be considered as compiled code (and could
> be the starting point of your compiler) but obviously no-one with
> knowledge of Octave internals thinks this way, whence the
> misunderstanding!
>
> For what it's worth, a decade or so ago I had access to Matlab and
> experimented briefly with their compiler. From memory the compiled C
> code consisted almost entirely of function calls to a large runtime
> library containing entry points to do such things as "add". Could this
> be your first step? It would have no significant performance advantage
> but could be the basis for further development.
>
> Cheers... Ian

There are two different Matlab "compiler" products that work in
completely different ways.

The Matlab Compiler, which is used to "compile" and deploy/redistribute
arbitrary Matlab code, doesn't actually compile code at all. It just
bundles up M-code into obfuscated encrypted zip files, and then the
Matlab Runtime executes them by running a headless (no visible desktop
IDE) version of Matlab that is embedded in your application. There's no
performance advantage at all here; in fact, there's some overhead due to
the decryption stage at execution time.

The Matlab Coder generates C/C++ programs from M-code programs. It only
supports a subset of Matlab language features and library functions. The
performance advantage is going to be limited to the parts of your code
that do scalar operations, have stateful loops, have non-vectorizable
code, or the like, since most vectorized/array operations are going to
be implemented with basically the same library functions that the M-code
interpreter uses at run time, but it is there. I'm guessing this is the
one you were using, Ian, because of the library entry points like "add"?

There's also the pcode() command, which is an alternate code obfuscation
mechanism. I believe this is purely an obfuscation mechanism, and
doesn't alter how the program is interpreted at runtime, so also offers
no performance or portability advantage.

Cheers,
Andrew


Re: Is there a command that shows the C-code the interpreter creates?

GoSim
In reply to this post by apjanke-floss
apjanke-floss wrote

> On 4/16/19 4:56 PM, GoSim wrote:
>> Ok, I misunderstood, Bison doesn't parse the m-code but creates a parser.
>> So
>> Bison is a general interpreter. And the parser creates a parse tree.
>
> Pretty much. Bison generates a C program which is a parser, and the
> execution of that parser procedurally processes the parse tree, though
> the parse tree itself may not be represented explicitly in a data
> structure.
>
>> And you run this parser on every line in the m-file. I thought the m-code
>> was converted to C-code somewhere...
>
> Nope.
>
>> How does your parser know if it is an int or a double?
>
> All numerics in Octave are doubles unless you explicitly convert them to
> ints using int32() or similar conversion functions. All numeric literals
> in Octave produce doubles. So if you see a number in Octave code, it's a
> double.
>
> Cheers,
> Andrew

How does your parser know if it is a string or int or matrix or imaginary
number?
thanks for clearing stuff up btw.




Re: Is there a command that shows the C-code the interpreter creates?

GoSim
Do you have access to the compiled structures that Bison creates?

If yes, then changing the name of the variables in the m-file (instead of
the c-file that doesn't exist) by adding a suffix every time the type changes
should be sufficient. Put the structures after each other and that's the
compiled code. Without loops there is no performance difference, but with
loops, if the parser does not need to run more than once for every line, it
could be a big difference.

I need to know how you detect if it is a string, int, imaginary number etc.


If I create a program that changes the m-code according to my idea, will you
take care of the Bison thing and put its compiled structures after each other
and run it?
In the future though, busy atm.



Re: Is there a command that shows the C-code the interpreter creates?

apjanke-floss
In reply to this post by GoSim


On 4/16/19 6:38 PM, GoSim wrote:

> apjanke-floss wrote
>> On 4/16/19 4:56 PM, GoSim wrote:
>>> Ok, I misunderstood, Bison doesn't parse the m-code but creates a parser.
>>> So
>>> Bison is a general interpreter. And the parser creates a parse tree.
>>
>> Pretty much. Bison generates a C program which is a parser, and the
>> execution of that parser procedurally processes the parse tree, though
>> the parse tree itself may not be represented explicitly in a data
>> structure.
>>
>>> And you run this parser on every line in the m-file. I thought the m-code
>>> was converted to C-code somewhere...
>>
>> Nope.
>>
>>> How does your parser know if it is an int or a double?
>>
>> All numerics in Octave are doubles unless you explicitly convert them to
>> ints using int32() or similar conversion functions. All numeric literals
>> in Octave produce doubles. So if you see a number in Octave code, it's a
>> double.
>>
>> Cheers,
>> Andrew
>
> How does your parser know if it is a string or int or matrix or imaginary
> number?
> thanks for clearing stuff up btw.

You're welcome!

See the Octave docs for how strings, ints, imaginaries, and so on are
represented in M-code source: https://octave.org/doc/interpreter/

From the parser's point of view:

> a string

Strings are represented with single-quoted or double-quoted string
literals ('...' or "...").

> or int

Ints are only created with conversion functions, like "int32(42)".

> or matrix

In Octave, *every* value is a matrix. There is no such thing as a scalar
value, like there is in most every other programming language.
*Everything* is a matrix. (Or, rather, everything is an N-dimensional
array, and we specifically call 2-D arrays "matrices".) Let *that* sink
in. :)

> imaginary number

Imaginary numbers are produced by suffixing a numeric literal with "i"
or "j". E.g. "1.23i". To create a complex number, you add a real and an
imaginary number with the regular addition operator, e.g. "1.23 + 4.56i".
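
To make the parser's-eye view concrete, here is a small Python sketch that classifies a single literal the way the rules above describe: a bare number is a double, a number with an "i" or "j" suffix is imaginary, and quoted text is a string. The patterns are simplified assumptions for illustration, not Octave's actual flex rules; they ignore things like hexadecimal literals and the fact that a single quote can also mean transpose.

```python
import re

# Simplified numeric literal: 42, 1.23, .5, 1e3 (an assumption, not Octave's lexer).
NUMBER = r"(?:\d+\.?\d*|\.\d+)(?:[eE][+-]?\d+)?"
SINGLE_QUOTED = r"'(?:[^']|'')*'"
DOUBLE_QUOTED = r'"(?:[^"\\]|\\.)*"'

# Order matters: the imaginary pattern is tried before the plain number.
TOKEN_KINDS = [
    ("imaginary", re.compile(NUMBER + r"[ij]")),
    ("double", re.compile(NUMBER)),
    ("string", re.compile(SINGLE_QUOTED + "|" + DOUBLE_QUOTED)),
]


def classify_literal(text):
    """Return the kind of literal `text` is, judged from its spelling alone."""
    for kind, pattern in TOKEN_KINDS:
        if pattern.fullmatch(text):
            return kind
    return "not a literal"


if __name__ == "__main__":
    for sample in ["42", "1.23i", "'abc'", '"abc"', "int32(42)"]:
        print(sample, "->", classify_literal(sample))
```

Note that `int32(42)` comes back as "not a literal": as far as the source text is concerned it is an ordinary function call applied to the double literal `42`.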

And variables have no type; it's all dynamically typed. (In fact,
because Octave is so dynamic, and function calls & array indexing use
the same syntax, you don't necessarily know at parse time whether a
given identifier is a variable or a function.)
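
A tiny sketch of why `f(2)` cannot be fully classified at parse time: the same text means "index into f" or "call f" depending on what exists when the line runs. This is Python with hypothetical dictionaries standing in for the interpreter's variable and function tables, and it assumes the lookup order in which a variable shadows a function of the same name.

```python
def resolve(name, args, variables, functions):
    """Decide at run time whether `name(args)` indexes a variable or calls a function."""
    if name in variables:
        # A variable with this name shadows any function: treat as indexing.
        # M-code indexing is 1-based, so shift to Python's 0-based lists.
        return variables[name][args[0] - 1]
    if name in functions:
        return functions[name](*args)
    raise NameError(f"'{name}' undefined")


if __name__ == "__main__":
    functions = {"f": lambda n: n * 10}
    print(resolve("f", [2], {}, functions))                # function call
    print(resolve("f", [2], {"f": [5, 6, 7]}, functions))  # variable indexing
```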

As for how the *interpreter* knows this stuff about live objects after
the parsing stage, it's all done dynamically with data structures.
(Almost) every Octave value is represented by a C++ object of class
Array or one of its subclasses, and that object contains fields that
indicate the length/size/dimensionality, imaginariness, and data type
(int32/double/char/MCOS-object) of the array, along with a pointer to
the C array containing the raw underlying data and some
reference-counting/bookkeeping data. Variables are all untyped; type and
size information is only contained in values. So there's nothing like
the C notion of a primitive scalar int or char or double. It's more like
every value in Octave is like a Java java.lang.Object, where there's a
heap-allocated interpreter data structure holding run-time type info for
it, and every Octave variable is like a Java variable of type
java.lang.Object, that just holds a reference to a structure whose type
you can query dynamically.
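
The "untyped variables, typed values" model can be sketched in a few lines of Python. The `Value` class and its field names are hypothetical and much simpler than Octave's real C++ classes; the sketch only shows that type and size live in the value, while a variable is just a name bound to whatever value was assigned last. Real int32 conversion rules (rounding, saturation) are not modelled.

```python
from dataclasses import dataclass
from typing import Any, List, Tuple


@dataclass
class Value:
    """Hypothetical run-time value: carries its own type and size information."""
    class_name: str          # e.g. "double", "int32", "char"
    dims: Tuple[int, ...]    # even a "scalar" is a 1x1 array
    is_complex: bool
    data: List[Any]          # flat storage for the elements


def make_double(x):
    """Box a Python number as a 1x1 double array."""
    return Value("double", (1, 1), isinstance(x, complex), [x])


def make_string(s):
    """Box text as a 1xN char array."""
    return Value("char", (1, len(s)), False, list(s))


def make_int32(v):
    """Convert a real double value to int32 (rounding rules simplified)."""
    return Value("int32", v.dims, False, [int(round(x)) for x in v.data])


if __name__ == "__main__":
    workspace = {}                       # variable name -> Value, no declared types
    workspace["x"] = make_double(42.0)
    print(workspace["x"].class_name, workspace["x"].dims)
    workspace["x"] = make_int32(workspace["x"])
    print(workspace["x"].class_name, workspace["x"].data)
    workspace["x"] = make_string("hello")  # same variable, now a char array
    print(workspace["x"].class_name, workspace["x"].dims)
```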

This stuff is Octave internals, so there's not much doco for it, but you
can see the source code and all the details in the "liboctave"
subdirectory of the octave source code. https://www.octave.org/hg/octave
The "libinterp" subdir holds most of the interpreter/parser stuff, with
"libinterp/octave-value" holding the stuff for variables. There's also
Appendix A: External Code Interface in the Octave manual that describes
how oct-files (Octave functions written in C++) interact with the Octave
internals, which could be relevant here.

Cheers,
Andrew

