String display

String display

Rik-4
jwe,

It's trivial to get 90% of the way there, but the last 10% is going to be a
PITA.  See the attached diff.

The trouble is that disp() does not print quote characters around
strings, and disp() is used by all the print routines in the octave-value
directory.  Going through disp() is required because disp() might be
overloaded by a user-defined class.  Here's a backtrace showing that
display() is called to print the name tag, which then calls disp(), which
then calls the print routines that are overloaded for a particular
octave_value type:

#0  octave_print_internal (os=..., chm=..., pr_as_read_syntax=false, pr_as_string=true)
    at libinterp/corefcn/pr-output.cc:2628
#1  0x00007f097b2a9976 in octave_print_internal (os=..., nda=..., pr_as_read_syntax=false, extra_indent=0, pr_as_string=true)
    at libinterp/corefcn/pr-output.cc:2678
#2  0x00007f097ad15ef3 in octave_char_matrix_str::print_raw (this=0x7f0944456fc0, os=..., pr_as_read_syntax=false)
    at libinterp/octave-value/ov-str-mat.cc:260
#3  0x00007f097ac6f3fc in octave_base_matrix<charNDArray>::print (this=0x7f0944456fc0, os=..., pr_as_read_syntax=false)
    at libinterp/octave-value/ov-base-mat.cc:454
#4  0x00007f097acf7000 in octave_value::print (this=0x7f0951317500, os=..., pr_as_read_syntax=false)
    at ./libinterp/octave-value/ov.h:1233
#5  0x00007f097b2ae343 in Fdisp (args=..., nargout=0)
    at libinterp/corefcn/pr-output.cc:3350
#6  0x00007f097ac5bde5 in octave_builtin::call (this=0x7f094409a960, tw=..., nargout=0, args=...)
    at libinterp/octave-value/ov-builtin.cc:62
#7  0x00007f097b1a7616 in octave::interpreter::feval (this=0x7f0944004a60, name="disp", args=..., nargout=0)
    at libinterp/corefcn/interpreter.cc:1402
#8  0x00007f097b1a74a5 in octave::interpreter::feval (this=0x7f0944004a60, name=0x7f097b3b90c5 "disp", args=..., nargout=0)
    at libinterp/corefcn/interpreter.cc:1388
#9  0x00007f097ade7402 in octave::feval (name=0x7f097b3b90c5 "disp", args=..., nargout=0)
    at libinterp/parse-tree/oct-parse.yy:4950
#10 0x00007f097b2ae884 in Fdisplay (args=...)
    at libinterp/corefcn/pr-output.cc:3517
#11 0x00007f097ac5bde5 in octave_builtin::call (this=0x7f094409afa0, tw=..., nargout=0, args=...)
    at libinterp/octave-value/ov-builtin.cc:62
#12 0x00007f097b1a7616 in octave::interpreter::feval (this=0x7f0944004a60, name="display", args=..., nargout=0)
    at libinterp/corefcn/interpreter.cc:1402
#13 0x00007f097b1a74a5 in octave::interpreter::feval (this=0x7f0944004a60, name=0x7f097b3953d0 "display", args=..., nargout=0)
    at libinterp/corefcn/interpreter.cc:1388
#14 0x00007f097ade7402 in octave::feval (name=0x7f097b3953d0 "display", args=..., nargout=0)
    at libinterp/parse-tree/oct-parse.yy:4950

The simplest solution would be for the print routine to know whether this
particular call to print is being made on behalf of the interpreter or
not.  Then the code within the overload of octave_print_internal for
charMatrix becomes:

if (for_interpreter)
  os << '\'' << row << '\'';
else
  os << row;

But I don't have a good idea of how to pass that information down.
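
One possible way to carry such a flag without changing the signature of
every intermediate print routine is the per-stream storage that standard
iostreams provide (std::ios_base::xalloc and iword).  The following is only
a sketch of that technique, not Octave code: set_for_interpreter,
for_interpreter, print_row, disp_like and render are hypothetical names,
and it assumes that display() and the low-level print routine write to the
same stream object.

```cpp
#include <ios>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical: reserve one slot of per-stream storage, once, for the flag.
inline int interpreter_flag_index ()
{
  static const int idx = std::ios_base::xalloc ();
  return idx;
}

// Mark (or unmark) a stream as printing on behalf of the interpreter.
inline void set_for_interpreter (std::ostream& os, bool flag)
{
  os.iword (interpreter_flag_index ()) = flag ? 1 : 0;
}

// Query the flag; a stream that was never marked reports false.
inline bool for_interpreter (std::ostream& os)
{
  return os.iword (interpreter_flag_index ()) != 0;
}

// Stand-in for the charMatrix overload of octave_print_internal.
void print_row (std::ostream& os, const std::string& row)
{
  if (for_interpreter (os))
    os << '\'' << row << '\'';
  else
    os << row;
}

// Stand-in for an intermediate layer (disp, print, print_raw, ...).
// Its signature does not mention the flag at all.
void disp_like (std::ostream& os, const std::string& row)
{
  print_row (os, row);
  os << '\n';
}

// Demonstration helper: render one row with the flag set or cleared.
std::string render (const std::string& row, bool interp)
{
  std::ostringstream buf;
  set_for_interpreter (buf, interp);
  disp_like (buf, row);
  return buf.str ();
}
```

With this sketch, render ("abc", true) yields the quoted form and
render ("abc", false) the bare text.  A real implementation would also need
to clear the flag again once display() returns, for example with a small
RAII guard.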

--Rik

Attachment: 56973.diff (760 bytes)

Re: String display

John W. Eaton
Administrator
On 10/23/19 5:47 PM, Rik wrote:

> It's trivial to get 90% of the way there, but the last 10% is going to be a
> PITA.  See the attached diff.
>
> The trouble is that disp() does not print surrounding characters around
> strings, and disp() is used by all the print routines in the octave-value
> directory.  This is required because disp() might be overloaded by a
> user-defined class.  Here's a backtrace showing that display() is called to
> print the name tag, which then calls disp(), which then calls the print
> routines which are overloaded for a particular octave_value type

Yeah, I see similar issues with the design when looking at how to add
the type and dimension info.

jwe

Re: String display

apjanke-floss
In reply to this post by Rik-4
Here's something else to consider while we're on the subject: the disp()
API has some limitations, especially when it comes to customizing output
for compound data structures.

Disp:
a) Combines both conversion of a data value to a displayable string
representation, and the outputting of that string to the console
b) Operates on an entire array at once, instead of on individual elements
c) Does not call disp() overrides for user-defined classes which are
displayed inside a compound data structure like a struct or cell array.

For example, b) is an issue if you want to compose some arrays into a
dataframe or table with heterogeneous columns: say you have a table foo
with a double, a char or cellstr, and a datetime object. The
dataframe/table class's disp needs to have string representations for
each element of the double and datetime arrays, and special handling for
the char/cellstr column. And then it needs to arrange those in its own
2-D layout, maybe with cell borders and whatnot.
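
To make point b) concrete, here is a rough sketch of what a table-like
container needs: one string per element from each column, which it then
arranges itself.  It is written in C++ purely for illustration; dispstrs
here is a hypothetical stand-in for a per-element conversion step, and
column and format_table are made-up names as well.

```cpp
#include <algorithm>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

using column = std::vector<std::string>;

// Per-element conversion for a numeric column: one string per value.
column dispstrs (const std::vector<double>& values)
{
  column out;
  for (double v : values)
    {
      std::ostringstream buf;
      buf << v;
      out.push_back (buf.str ());
    }
  return out;
}

// Arrange already-converted columns in a 2-D layout.  Every column except
// the last is padded to the width of its widest cell.
std::string format_table (const std::vector<column>& cols)
{
  std::size_t nrows = 0;
  std::vector<std::size_t> widths (cols.size (), 0);
  for (std::size_t j = 0; j < cols.size (); j++)
    {
      nrows = std::max (nrows, cols[j].size ());
      for (const std::string& cell : cols[j])
        widths[j] = std::max (widths[j], cell.size ());
    }

  std::ostringstream out;
  for (std::size_t i = 0; i < nrows; i++)
    {
      for (std::size_t j = 0; j < cols.size (); j++)
        {
          // Short columns are filled with empty cells.
          const std::string cell
            = i < cols[j].size () ? cols[j][i] : std::string ();
          if (j > 0)
            out << " | ";
          out << cell;
          if (j + 1 < cols.size ())
            out << std::string (widths[j] - cell.size (), ' ');
        }
      out << '\n';
    }
  return out.str ();
}
```

The layout code never needs to know how a column produced its strings,
which is exactly what a whole-array disp() cannot offer.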

Or let's say you make a struct with a double and a datetime in its fields.

>> s = struct('foo', 42, 'bar', datetime('2019-01-02 12:34'))

I'd like it to display like this:

s =
  scalar structure containing the fields:
    foo = 42
    bar = 02-Jan-2019 12:34:00
>>

Not like this:

s =
  scalar structure containing the fields:
    foo =  42
    bar =
    <object datetime>

>>
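
A sketch of how the first form could come about, assuming every value type
offers an overridable string conversion that the struct printer calls for
each field.  This is C++ for illustration only; value, number, timestamp
(a stand-in for a user-defined class such as datetime), fields and
struct_dispstr are all hypothetical names, not Octave internals.

```cpp
#include <memory>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical base class: string conversion only, no console output.
struct value
{
  virtual ~value () = default;
  virtual std::string dispstr () const = 0;
};

struct number : value
{
  double x;
  explicit number (double v) : x (v) { }
  std::string dispstr () const override
  {
    std::ostringstream buf;
    buf << x;
    return buf.str ();
  }
};

// Stand-in for a user-defined class with its own display override.
struct timestamp : value
{
  std::string text;
  explicit timestamp (std::string t) : text (std::move (t)) { }
  std::string dispstr () const override { return text; }
};

using fields = std::vector<std::pair<std::string, std::shared_ptr<value>>>;

// The struct printer delegates to each field's own dispstr (), so an
// override in a user-defined class is honored inside the container.
std::string struct_dispstr (const fields& s)
{
  std::ostringstream out;
  out << "  scalar structure containing the fields:\n";
  for (const auto& f : s)
    out << "    " << f.first << " = " << f.second->dispstr () << '\n';
  return out.str ();
}
```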


For Matlab programming, I came up with my own generic API called
"dispstr" to handle this: https://github.com/apjanke/dispstr. It defines
conventional, overridable functions dispstr() and dispstrs() that do
just the string conversion step of disp(), and do it respectively for
either an entire array, or for each element in an array. (Because it's
an add-on and not part of Matlab itself, I had to do horrible things
with monkeypatching to get it to work nicely. They're so horrible that I
didn't even include them in the public repo.)

Octave's disp() isn't as bad as Matlab's, because you can capture the
output without evalc(). But it still has the other limitations. And in
the case where you're disp()ing a string, you can't really tell whether
the user intended to generically get the string representation for a
value, or if they just wanted a convenient way to put some text on the
console.

Maybe Octave would like to consider something like that? Many other
modern programming languages, especially high-level or OOP ones, have an
equivalent, like toString, to_s, str/repr.
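
The str/repr split is the part that bears on the string case: one function
for "just put this text on the console" and another for an unambiguous
representation.  A minimal sketch in C++, with str and repr as hypothetical
names; the quoting rule used is the one for single-quoted Octave strings,
where an embedded single quote is doubled.

```cpp
#include <string>

// Plain text, for a caller who only wants to write something out.
std::string str (const std::string& s)
{
  return s;
}

// Unambiguous representation: wrapped in single quotes, with any embedded
// single quote doubled.
std::string repr (const std::string& s)
{
  std::string out = "'";
  for (char c : s)
    {
      if (c == '\'')
        out += '\'';
      out += c;
    }
  out += '\'';
  return out;
}
```

With two entry points the caller states the intent, so the print routine
does not have to guess it.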

I realize this would be a big change, so it's not something to take on
lightly.

Cheers,
Andrew

disp architecture

Rik-4
On 10/23/2019 03:34 PM, Andrew Janke wrote:

> Here's something else to consider while we're on the subject: the disp()
> API has some limitations, especially when it comes to customizing output
> for compound data structures.
>
> Disp:
> a) Combines both conversion of a data value to a displayable string
> representation, and the outputting of that string to the console
> b) Operates on an entire array at once, instead of on individual elements
> c) Does not call disp() overrides for user-defined classes which are
> displayed inside a compound data structure like a struct or cell array.

Indeed it does have problems.  Unfortunately, resolving this is going to be
difficult because it requires close conformance to Matlab.  In effect, we
need imagination within the confines of a straitjacket.

For point a), at least Octave's disp() function can either send the string
to the console or return it to the caller for further post-processing.  As
such, a user-defined class can overload disp, use the built-in disp to get
a string representation, and then modify it before displaying it.
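
The capture-then-modify pattern, sketched in C++ as an analogy only:
builtin_disp stands in for the built-in disp, disp_to_string for calling it
with an output argument, and metres is a made-up user-defined class.  None
of these names come from Octave.

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Stand-in for the built-in disp: writes the value and a newline.
void builtin_disp (std::ostream& os, double x)
{
  os << x << '\n';
}

// Capture what the built-in would have printed instead of printing it.
std::string disp_to_string (double x)
{
  std::ostringstream buf;
  builtin_disp (buf, x);
  return buf.str ();
}

// Hypothetical user-defined class wrapping a length.
struct metres
{
  double value;
};

// The overload reuses the built-in formatting, then post-processes the
// captured text before sending it to the output stream.
void disp (std::ostream& os, const metres& m)
{
  std::string text = disp_to_string (m.value);
  if (! text.empty () && text.back () == '\n')
    text.pop_back ();
  os << text << " m\n";
}
```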

For point b), Matlab also operates on whole arrays so we can't get rid of
that.  Example code:

x = magic (3);
disp (x)

For point c), Matlab seems to get around this by not calling disp() at all
on elements of an aggregating data structure like a struct or cell array. 
Instead, it merely prints the name tag for the object (class and size). 
Example code and the resulting output:

x = magic (3);
s.a = int8 (x);
s.b = single (x);
disp (s)

a: [3x3 int8]
b: [3x3 single]

We could shift to doing something like that in which case you would need to
use disp on individual elements to actually see what they contain.
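
For reference, producing that kind of tag only needs the dimensions and the
class name of each field.  A small C++ sketch, where name_tag and
summarize_field are hypothetical helpers and not existing Octave functions:

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Build a tag such as "[3x3 int8]" from dimensions and a class name.
std::string name_tag (const std::vector<std::size_t>& dims,
                      const std::string& class_name)
{
  std::ostringstream out;
  out << '[';
  for (std::size_t i = 0; i < dims.size (); i++)
    {
      if (i > 0)
        out << 'x';
      out << dims[i];
    }
  out << ' ' << class_name << ']';
  return out.str ();
}

// One line of the struct summary, e.g. "a: [3x3 int8]".
std::string summarize_field (const std::string& field,
                             const std::vector<std::size_t>& dims,
                             const std::string& class_name)
{
  return field + ": " + name_tag (dims, class_name);
}
```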

If any of this seems argumentative, it's not meant to be.  I'm just trying
to lay out what the baseline is and where innovation would need to start.

--Rik

Re: disp architecture

John W. Eaton
Administrator
On 10/25/19 1:08 PM, Rik wrote:

> For point c), Matlab seems to get around this by not calling disp() at all
> on elements of an aggregating data structure like a struct or cell array.
> Instead, it merely prints the name tag for the object (class and size).

Isn't the display method involved as well?  I thought that was the
method that was called to display an object, and that it might use disp
internally?  And that Matlab's display method accepts a second argument
to force the name tag that is used (Octave copies this feature since it
appears to be required for basic compatibility even though it is not
documented).

Anyway, I agree that compatibility is important for display and disp
because they are used by classdef classes to display objects.  So if we
aren't compatible, display of those objects won't work as expected.

I'm also just trying to understand what is required and how it is
supposed to work.

jwe

Re: disp architecture

Rik-4
On 10/25/2019 10:37 AM, John W. Eaton wrote:

> On 10/25/19 1:08 PM, Rik wrote:
>
>> For point c), Matlab seems to get around this by not calling disp() at all
>> on elements of an aggregating data structure like a struct or cell array.
>> Instead, it merely prints the name tag for the object (class and size).
>
> Isn't the display method involved as well?

display() is called by the interpreter if an object needs to be printed.
display() seems to take care of printing the name tag for the object, and
then it calls disp() for the actual display of the object.  Quoting the
original example,

x = magic (3);
s.a = int8 (x);
s.b = single (x);
disp (s)

a: [3x3 int8]
b: [3x3 single]

You can see that disp() neither displays a name tag nor does any
indentation.  However, if you just type 's' with no semicolon to display
the structure you get

s
s = struct with fields:
    a: [3x3 int8]
    b: [3x3 single]
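
The relationship between the two outputs can be modeled roughly as
"display = name tag + indented disp output".  The C++ sketch below only
reproduces the text shown above; whether Matlab really builds it this way
is an assumption, and indent_lines and this display function are
hypothetical.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Prefix every line of text with n spaces.
std::string indent_lines (const std::string& text, std::size_t n)
{
  std::istringstream in (text);
  std::ostringstream out;
  std::string line;
  while (std::getline (in, line))
    out << std::string (n, ' ') << line << '\n';
  return out.str ();
}

// Name tag first, then the indented output of disp.
std::string display (const std::string& name, const std::string& tag,
                     const std::string& disp_text)
{
  return name + " = " + tag + "\n" + indent_lines (disp_text, 4);
}
```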


> I thought that was the method that was called to display an object, and
> that it might use disp internally?  And that Matlab's display method
> accepts a second argument to force the name tag that is used (Octave
> copies this feature since it appears to be required for basic
> compatibility even though it is not documented).


In newer versions of Matlab they seem to have a different API.  But for the older one, which we need to support, see https://www.mathworks.com/help/matlab/matlab_oop/displaying-objects-in-the-command-window.html.

--Rik

