Learning the Octave interpreter code to implement Java class dot-referencing

apjanke-floss
Hi, Octave maintainers,

A while ago, I took a stab at adding support for dot-reference syntax
for Java classes. (https://savannah.gnu.org/bugs/index.php?41239) I
totally failed, because I don't understand the Octave interpreter code
well enough, and don't have bison/yacc skills.

Can anyone point me at resources for learning about the Octave
interpreter code, or bison/yacc generically, that would help me get good
enough to write a patch for this?

Cheers,
Andrew


Re: Learning the Octave interpreter code to implement Java class dot-referencing

John W. Eaton
Administrator
On 7/8/20 2:27 AM, Andrew Janke wrote:

> Hi, Octave maintainers,
>
> A while ago, I took a stab at adding support for dot-reference syntax
> for Java classes. (https://savannah.gnu.org/bugs/index.php?41239) I
> totally failed, because I don't understand the Octave interpreter code
> well enough, and don't have bison/yacc skills.
>
> Can anyone point me at resources for learning about the Octave
> interpreter code, or bison/yacc generically, that would help me get good
> enough to write a patch for this?

Unless you need new syntax for this feature, I don't think you'll need
to know too much about Bison.

Do you want to make expressions like

   java.lang.Double (42)

?  Is this similar to invoking static methods for classdef classes?  If
so, maybe we can do the same kind of thing that we do for those?   Or,
maybe there is a better way?

Here is what happens for classdef: an expression like

   myclass.method (args)

is evaluated in tree_index_expression::evaluate_n in pt-idx.cc.

When evaluating the first component of the expression ("myclass"),
Octave will find and load the constructor for the myclass object.  This
step happens in the fcn_info::fcn_info_rep::xfind function (around line
740 of fcn-info.cc).  But since we don't know at that point whether we
are looking up a constructor call or the name will be used to invoke a
static method, we get a classdef_meta object (a builtin function object)
instead of the constructor function itself.

Then, back in tree_index_expression::evaluate_n, we see that we are
indexing an object (classdef_meta) and then gather up the remaining
arguments to pass to the classdef_meta subsref method.

So, to handle "java" similarly, we would need some kind of java_meta
object with an appropriate subsref method.  That object would be
inserted in the function table when the octave_java type is installed.
Then fcn_info::xfind can return that object when it sees the "java"
symbol.  Or, if we only have to handle the single word "java", maybe it
could just be a special case at the appropriate place in fcn_info::xfind
to get the precedence right.
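
The lookup-then-subsref flow described above can be sketched with a toy model. This is only an illustration under stated assumptions: `toy_meta`, `toy_fcn_table`, and `make_table` are hypothetical names, and none of this is Octave's real classdef_meta or fcn_info code. The point it shows is that name lookup hands back a meta object, and the choice between constructor and static method is made later, once the rest of the index expression is known.

```cpp
#include <functional>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Toy stand-in for a "meta" object (hypothetical, NOT Octave's real
// classdef_meta).  It represents a class name whose use, constructor
// call or static method call, is decided only when the rest of the
// expression has been seen.
struct toy_meta
{
  using args_t = std::vector<double>;
  using fcn_t = std::function<double (const args_t&)>;

  fcn_t constructor;
  std::map<std::string, fcn_t> static_methods;

  // Analogue of a subsref method: "path" holds the dot components that
  // follow the class name, "args" the parenthesized argument list.
  double subsref (const std::vector<std::string>& path,
                  const args_t& args) const
  {
    if (path.empty ())
      return constructor (args);          // myclass (args)

    auto it = static_methods.find (path.front ());
    if (path.size () != 1 || it == static_methods.end ())
      throw std::runtime_error ("no such static method");

    return it->second (args);             // myclass.method (args)
  }
};

// Toy function table: looking up a class name returns the meta object,
// not the constructor function itself.
using toy_fcn_table = std::map<std::string, toy_meta>;

// Build a table with one hypothetical class, "myclass", that has a
// constructor and one static method, "twice".
toy_fcn_table make_table ()
{
  toy_meta m;
  m.constructor = [] (const toy_meta::args_t& a) { return a.at (0); };
  m.static_methods["twice"]
    = [] (const toy_meta::args_t& a) { return 2 * a.at (0); };
  return toy_fcn_table {{"myclass", m}};
}
```

A java_meta object would presumably play the same role as `toy_meta` here, with its subsref forwarding to the existing Java interface instead of to stored lambdas.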

I'm not 100% certain, but I think the classdef_meta object is derived
from octave_function instead of just being a value so that it can't be
wiped out by a variable definition.  But if there is a better way, then
maybe we can simplify the way the whole classdef_meta thing works at the
same time we implement support for java static methods.

jwe




Re: Learning the Octave interpreter code to implement Java class dot-referencing

apjanke-floss


On 7/8/20 1:28 PM, John W. Eaton wrote:

> On 7/8/20 2:27 AM, Andrew Janke wrote:
>> Hi, Octave maintainers,
>>
>> A while ago, I took a stab at adding support for dot-reference syntax
>> for Java classes. (https://savannah.gnu.org/bugs/index.php?41239) I
>> totally failed, because I don't understand the Octave interpreter code
>> well enough, and don't have bison/yacc skills.
>>
>> Can anyone point me at resources for learning about the Octave
>> interpreter code, or bison/yacc generically, that would help me get good
>> enough to write a patch for this?
>
> Unless you need new syntax for this feature,  I don't think you'll need
> to know too much about Bison.
>
> Do you want to make expressions like
>
>   java.lang.Double (42)
>
> ?  Is this similar to invoking static methods for classdef classes?  If
> so, maybe we can do the same kind of thing that we do for those?   Or,
> maybe there is a better way?

Exactly. I want to support the

  java.lang.Double(42)

constructor invocation form, and the

  java.lang.Double.parseDouble("42")

static method invocation form. It is completely analogous to invoking
constructors or static methods on M-code classdef classes.

> Here is what happens for classdef: an expression like
>
>   myclass.method (args)
>
> is evaluated in tree_index_expression::evaluate_n in pt-idx.cc.
>
> When evaluating the first component of the expression ("myclass"),
> Octave will find and load the constructor for the myclass object.  This
> step happens in the fcn_info::fcn_info_rep::xfind function (around line
> 740 of fcn-info.cc).  But since we don't know at that point whether we
> are looking up a constructor call or the name will be used to invoke a
> static method, we get a classdef_meta object (a builtin function object)
> instead of the constructor function itself.
>
> Then, back in tree_index_expression::evaluate_n, we see that we are
> indexing an object (classdef_meta) and then gather up the remaining
> arguments to pass to the classdef_meta subsref method.
>
> So, to handle "java" similarly, we would need some kind of java_meta
> object with an appropriate subsref method.  That object would be
> inserted in the function table when the octave_java type is installed.
> Then fcn_info::xfind can return that object when it sees the "java"
> symbol.  Or, if we only have to handle the single word "java", maybe it
> could just be a special case at the appropriate place in fcn_info::xfind
> to get the precedence right.

Yep, a similar mechanism for Java sounds right to me.

(It's not just the single word "java"; Java packages can start with any
identifier, just like Octave packages, and most of the interesting ones
don't start with "java.*", so we'd have to handle all identifiers.)

The one complication is in resolving package-qualified names. Consider:

  x = foo.bar.baz.Qux(42);

The namespace "foo.bar.baz" could be either a Java package or an Octave
classdef package. And in fact, it could be both in the same codebase: in
Matlab at least, M-code and Java namespaces do not mask each other, and
identifier resolution is done at the fully-qualified class level. So in
order to determine whether this expression should dispatch to a Java
constructor or an M-code classdef constructor, the interpreter needs to
consider the full expression "foo.bar.baz.Qux", and not consider "foo"
on its own and resolve that independently.

That is where I got stuck when I first tried this. Based on my reading
of the libinterp code, the Octave interpreter takes the components
"foo", "bar", "baz", and "Qux" one at a time, and ends up eagerly
resolving "foo" to an Octave namespace, and then proceeds to resolve
"bar" relative to that. Doing it this way I think means that package
prefixes in Java and M-code would need to mask each other, and do so at
the prefix/component level.
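
To make the difference concrete, here is a rough sketch of the two resolution strategies. Everything in it is an assumption for illustration: the `registries` struct, the `kind` enum, and both `resolve_*` functions are hypothetical, the "eager" version is a simplification of what the interpreter does, and the M-code-before-Java precedence in the full-name version is an arbitrary choice.

```cpp
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Split "foo.bar.baz.Qux" into {"foo", "bar", "baz", "Qux"}.
std::vector<std::string> split_dots (const std::string& name)
{
  std::vector<std::string> parts;
  std::istringstream in (name);
  std::string part;
  while (std::getline (in, part, '.'))
    parts.push_back (part);
  return parts;
}

enum class kind { none, mcode, java };

// Hypothetical registries, keyed by fully-qualified names.
struct registries
{
  std::set<std::string> mcode_classes;    // e.g. "foo.util.Helper"
  std::set<std::string> mcode_packages;   // e.g. "foo", "foo.util"
  std::set<std::string> java_classes;     // e.g. "foo.bar.baz.Qux"
};

// Strategy 1 (simplified "eager" resolution): decide from the first
// component alone.  Once "foo" is known as an M-code package, the Java
// registry is never consulted, so M-code prefixes mask Java ones.
kind resolve_eager (const registries& r, const std::string& name)
{
  std::vector<std::string> parts = split_dots (name);
  if (parts.empty ())
    return kind::none;

  if (r.mcode_packages.count (parts.front ()))
    return r.mcode_classes.count (name) ? kind::mcode : kind::none;

  return r.java_classes.count (name) ? kind::java : kind::none;
}

// Strategy 2 (full-name resolution): look at the whole dotted name, so
// the two namespaces only collide on identical class names.  Checking
// M-code first is an assumed precedence, not a documented rule.
kind resolve_full (const registries& r, const std::string& name)
{
  if (r.mcode_classes.count (name))
    return kind::mcode;
  if (r.java_classes.count (name))
    return kind::java;
  return kind::none;
}
```

With an M-code package "foo" and a Java class "foo.bar.baz.Qux" both present, `resolve_eager` fails to find the Java class while `resolve_full` finds it, which is the masking behavior described above.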

And in the Java world, package names are not hierarchical: the
existence of the "foo.bar.baz" package does not imply the existence of
or have any relationship to the "foo" package. The dots are just plain
characters that are part of the package name, and not special; arranging
those packages in a hierarchy based on prefixes is just something done
by humans and Java IDEs as a convenience for developers. So, given a
Java class "foo.bar.baz.Qux" loaded into your JVM, if you queried the
JVM for the existence of the "foo" package, it would return false
(unless some other class separately defined some "foo.Whatever" class).

In practice, maybe this isn't an issue: M-code developers don't use
package names the way Java developers do, so it seems unlikely to me
that there would actually be a namespace conflict. So it would probably
be fine if Octave did allow M-code and Java packages to mask each other.
And Octave could, I think, just get a list of the Java packages from the
JVM, and then build the fake subpackage hierarchy internally.
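
Building that fake hierarchy amounts to taking every dotted prefix of each flat package name. A minimal sketch, assuming the list of package names has already been obtained from the JVM somehow (that step is not shown, and `package_prefixes` is a hypothetical helper name):

```cpp
#include <set>
#include <string>
#include <vector>

// Given flat Java package names, return every dotted prefix, so that
// "foo" and "foo.bar" exist as synthetic parent packages of
// "foo.bar.baz" even though the JVM knows nothing about them.
std::set<std::string>
package_prefixes (const std::vector<std::string>& packages)
{
  std::set<std::string> prefixes;

  for (const std::string& pkg : packages)
    {
      // Add each proper prefix that ends just before a dot.
      for (std::string::size_type pos = pkg.find ('.');
           pos != std::string::npos;
           pos = pkg.find ('.', pos + 1))
        prefixes.insert (pkg.substr (0, pos));

      // Add the full package name itself.
      if (! pkg.empty ())
        prefixes.insert (pkg);
    }

  return prefixes;
}
```

The resulting set could then be consulted component by component, the same way M-code package prefixes are, at the cost of letting the two kinds of prefixes mask each other.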

A java_meta object sounds like a good approach to me.

Cheers,
Andrew

> I'm not 100% certain, but I think the classdef_meta object is derived
> from octave_function instead of just being a value so that it can't be
> wiped out by a variable definition.  But if there is a better way, then
> maybe we can simplify the way the whole classdef_meta thing works at the
> same time we implement support for java static methods.

I'll have a look at classdef_meta.

>
> jwe
>
>


Re: Learning the Octave interpreter code to implement Java class dot-referencing

John W. Eaton
Administrator
In reply to this post by John W. Eaton
On 7/8/20 1:28 PM, I wrote:
> On 7/8/20 2:27 AM, Andrew Janke wrote:

> I'm not 100% certain, but I think the classdef_meta object is derived
> from octave_function instead of just being a value so that it can't be
> wiped out by a variable definition.

No, wait, that can't be right.  Since it is just the object that is
returned when looking up a class constructor, it seems like it could be
a value object instead of a function, unless there are reasons for it to
behave more like a function.

In any case, things might be clearer if the fcn_info object used a name
like classdef_meta_info instead of class_constructors, since I think
that map doesn't actually contain the constructor functions themselves.

jwe




Re: Learning the Octave interpreter code to implement Java class dot-referencing

Hossein Sajjadi
In reply to this post by apjanke-floss
On 7/8/20, Andrew Janke <[hidden email]> wrote:

> Hi, Octave maintainers,
>
> A while ago, I took a stab at adding support for dot-reference syntax
> for Java classes. (https://savannah.gnu.org/bugs/index.php?41239) I
> totally failed, because I don't understand the Octave interpreter code
> well enough, and don't have bison/yacc skills.
>
> Can anyone point me at resources for learning about the Octave
> interpreter code, or bison/yacc generically, that would help me get good
> enough to write a patch for this?
>
> Cheers,
> Andrew
>
>

Based on my experience developing Octave Coder, here are my insights from
trying to create an external language interface for Python, before I
put it in my TODO queue (the same concepts apply to Java):
For each Java class or package used in an expression, an equivalent
Octave classdef class or package is generated on the fly. Each class
has a data member of type octave_java (ov-java.h) that holds the
underlying Java object. The generated class only contains public
members. Those on-the-fly classes can be written to disk as .m
classdef files using a code generator. That is the simple solution;
the better but more complex solution is to create in-memory
classdef classes.
For that you need to work with the functions in ov-classdef.h and
ov-classdef.cc, and finally use cdef_manager (ov-classdef.h) to register
the new on-the-fly packages and classes.
I'm not sure, but I think some functions in ov-classdef.h have been
made private, which may make that difficult. If you encounter problems,
you may need help from an admin to make those methods public.

Once such an interface is complete, there is no need for
special rules to evaluate indexing expressions containing Java
objects (base_expr_val.isjava () in libinterp/pt-eval.cc); Java
objects are evaluated like any other classdef objects.

This idea can be extended further and an Octave classdef object that
is inherited from base java object can be sent to java. But it
requires that on-the-fly java classes are generated from octave
classdef classes.
--

Sincerely,
Hossein


Re: Learning the Octave interpreter code to implement Java class dot-referencing

apjanke-floss


On 7/11/20 10:01 AM, Hossein Sajjadi wrote:
>
> This idea can be extended further and an Octave classdef object that
> is inherited from base java object can be sent to java. But it
> requires that on-the-fly java classes are generated from octave
> classdef classes.
>

The thing is, in this case, the Java External Interface for Octave is
already written, and it works fine. The issue is just that you have to
explicitly call the `javaObject(...)` and `javaMethod(...)` functions,
instead of using `myjavapackage.foo.bar.Class.Method(...)`
dot-referencing syntax in your M-code, which I think is solely in the
domain of the interpreter. All the other pieces are already in place.
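
In other words, the desired syntax is a thin rewrite onto calls that already exist. The sketch below shows that mapping as a plain string transformation. It is only an illustration: `to_explicit_call`, `format_call`, and the `java_classes` set are hypothetical, and a real implementation would dispatch inside the interpreter rather than generate M-code text. Only the `javaObject (classname, args...)` and `javaMethod (methodname, classname, args...)` call forms come from the existing interface.

```cpp
#include <set>
#include <string>

// Format "fn (first, args)" or "fn (first)" when there are no args.
static std::string format_call (const std::string& fn,
                                const std::string& first,
                                const std::string& args)
{
  return fn + " (" + first + (args.empty () ? "" : ", " + args) + ")";
}

// Rewrite a dotted name plus argument text into the equivalent explicit
// call, given a hypothetical set of known Java class names.  Returns an
// empty string if the name matches neither form.
std::string to_explicit_call (const std::set<std::string>& java_classes,
                              const std::string& dotted,
                              const std::string& args)
{
  // Constructor form: the whole dotted name is a class.
  if (java_classes.count (dotted))
    return format_call ("javaObject", "\"" + dotted + "\"", args);

  // Static method form: everything before the last dot is a class.
  std::string::size_type pos = dotted.rfind ('.');
  if (pos != std::string::npos)
    {
      std::string cls = dotted.substr (0, pos);
      std::string method = dotted.substr (pos + 1);
      if (java_classes.count (cls))
        return format_call ("javaMethod",
                            "\"" + method + "\", \"" + cls + "\"", args);
    }

  return "";
}
```

For example, with "java.lang.Double" registered, the name "java.lang.Double.parseDouble" is treated as a static method call on that class, while "java.lang.Double" alone is treated as a constructor call.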

Cheers,
Andrew


Re: Learning the Octave interpreter code to implement Java class dot-referencing

Hossein Sajjadi
In reply to this post by Hossein Sajjadi
On 7/11/20, Mike Miller <[hidden email]> wrote:

> Hi Hossein,
>
> On Sat, Jul 11, 2020 at 18:31:38 +0430, Hossein Sajjadi wrote:
>> Based on my experience developing Octave Coder, here are my insights from
>> trying to create an external language interface for Python, before I
>> put it in my TODO queue (the same concepts apply to Java):
>
> Are you aware of the Python language interface I've been developing for
> several years now? Care to help improve it instead of creating a new
> one?
>
>   https://wiki.octave.org/Pythonic
>   https://gitlab.com/mtmiller/octave-pythonic
>
> Thanks,
>
> --
> mike
>

Hi Mike!
Interesting! If I understand correctly, Pythonic defines a class
named 'py' in Octave and overloads subsref to enable dot-reference
indexing. The result of the indexing operation is a dot-separated
string plus the argument list. The argument list is converted to
Python objects, and the proper Python function is looked up and
executed.

One could follow the same approach to implement dot-referencing for Java,
with the benefit of having a working project at hand to follow. Anyone
who wants to follow the method I pointed out would need to do
different work, but then java.__ would be treated as a package
(namespace), and the idea can be extended so that one can subclass a
Java class in Octave and send its instances to Java.

Honestly, my TODO for Python is still just an idea. Currently I am
focused on Coder and some other things.
--

Sincerely,
Hossein


Re: Learning the Octave interpreter code to implement Java class dot-referencing

Hossein Sajjadi
In reply to this post by apjanke-floss
On 7/12/20, Andrew Janke <[hidden email]> wrote:

>
>
> On 7/11/20 10:01 AM, Hossein Sajjadi wrote:
>>
>> This idea can be extended further and an Octave classdef object that
>> is inherited from base java object can be sent to java. But it
>> requires that on-the-fly java classes are generated from octave
>> classdef classes.
>>
>
> The thing is, in this case, the Java External Interface for Octave is
> already written, and it works fine. The issue is just that you have to
> explicitly call the `javaObject(...)` and `javaMethod(...)` functions,
> instead of using `myjavapackage.foo.bar.Class.Method(...)`
> dot-referencing syntax in your M-code, which I think is solely in the
> domain of the interpreter. All the other pieces are already in place.
>
> Cheers,
> Andrew
>

It was written before any classdef implementation existed. The
`octave_java` type, as pointed out in the previous post, is tied to
javaObject and javaMethod. `octave_java` is one of the pieces of the
proposed design. I think the current interpreter already has everything
needed to implement Java dot-referencing. If MATLAB
compatibility is important, everything should be implemented as
classdef. Defining special rules for evaluating expressions
containing Java objects complicates the evaluator and may have
performance implications.
--

Sincerely,
Hossein


Re: Learning the Octave interpreter code to implement Java class dot-referencing

apjanke-floss


On 7/11/20 5:35 PM, Hossein Sajjadi wrote:

>
> It was written before any classdef implementation existed. The
> `octave_java` type, as pointed out in the previous post, is tied to
> javaObject and javaMethod. `octave_java` is one of the pieces of the
> proposed design. I think the current interpreter already has everything
> needed to implement Java dot-referencing. If MATLAB
> compatibility is important, everything should be implemented as
> classdef. Defining special rules for evaluating expressions
> containing Java objects complicates the evaluator and may have
> performance implications.
>

It's not a "special rule for evaluation of expressions containing java
objects"; no Java objects are involved at all. It's the general rule for
evaluating expressions that may contain dot-qualified identifiers which
may resolve to either Java classes (not objects) or M-code classdef
classes (not objects).

Cheers,
Andrew