Matlab-compatible string class


John W. Eaton
Administrator
With this change:

   http://hg.savannah.gnu.org/hgweb/octave/rev/0b65949870e3

we are beginning to handle the existence of the Matlab string class that
is now created in Matlab when double quoted strings are used.  I expect
that it will not be too long before users will expect full compatibility
in this area.  But I'm not sure how we will transition from Octave's
current behavior for double-quoted string constants to the one now used
by Matlab.  For example, in Octave "foo\n" is a 4-element character
array containing 'f', 'o', 'o', and a linefeed character.  In Matlab it
is a string object, and it contains 5 characters: 'f', 'o', 'o',
'\', and 'n'.  I'm sure there are other differences, but this is sure to
cause some trouble.  I don't see a smooth transition path.  Does anyone
have any ideas about what to do?
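
A minimal sketch of the difference, assuming current Octave behavior (the variable names are only for illustration).  Single-quoted literals in Octave already keep the backslash the way Matlab's double-quoted strings do:

```octave
% Double-quoted literal: escape sequences are processed by the parser.
dq = "foo\n";
% Single-quoted literal: the backslash and the n stay as two characters.
sq = 'foo\n';

printf ("%d %d\n", numel (dq), numel (sq));   % prints "4 5"
disp (double (dq(end)));                      % prints 10 (linefeed)
disp (sq(end-1:end));                         % prints \n
```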

jwe


Re: Matlab-compatible string class

jbect
On 27/12/2017 at 23:47, John W. Eaton wrote:

> With this change:
>
>   http://hg.savannah.gnu.org/hgweb/octave/rev/0b65949870e3
>
> we are beginning to handle the existence of the Matlab string class
> that is now created in Matlab when double quoted strings are used.  I
> expect that it will not be too long before users will expect full
> compatibility in this area.  But I'm not sure how we will transition
> from Octave's current behavior for double-quoted string constants to
> the one now used by Matlab.  For example, in Octave "foo\n" is a
> 4-element character array containing 'f', 'o', 'o', and a linefeed
> character.  In Matlab it is a string object, but it also contains 5
> characters, 'f', 'o', 'o', '\', and 'n'. I'm sure there are other
> differences, but this is sure to cause some trouble.  I don't see a
> smooth transition path.  Does anyone have any ideas about what to do?

I don't know if this is feasible, but here is a suggestion:

1) Maintain both behaviors and provide a flag to select the desired one.

2) Use the "old behavior" for now, since the change is still recent
(R2016b I gather) and support for string arrays probably won't be
complete in the next few releases.

3) Make the transition to the new behavior when the support is
considered complete, while still providing the flag to support the
old behavior.

4) Remove the old behavior much, much later.

@++
Julien


Re: Matlab-compatible string class

Rik-4
In reply to this post by John W. Eaton
On 12/27/2017 02:47 PM, John W. Eaton wrote:
> With this change:
>
>   http://hg.savannah.gnu.org/hgweb/octave/rev/0b65949870e3
>
> we are beginning to handle the existence of the Matlab string class
> that is now created in Matlab when double quoted strings are used.  I
> expect that it will not be too long before users will expect full
> compatibility in this area.  But I'm not sure how we will transition
> from Octave's current behavior for double-quoted string constants to
> the one now used by Matlab.  For example, in Octave "foo\n" is a
> 4-element character array containing 'f', 'o', 'o', and a linefeed
> character.  In Matlab it is a string object, but it also contains 5
> characters, 'f', 'o', 'o', '\', and 'n'.  I'm sure there are other
> differences, but this is sure to cause some trouble.  I don't see a
> smooth transition path.  Does anyone have any ideas about what to do?
>
> jwe

I don't see a smooth transition path either.  Although Matlab presents the string class as a fundamental data type, it is actually closer to a container data type like a cell array.  In fact, by using cell arrays of strings you can make portable code that will run in either Octave or Matlab.  Maybe this is unkind, but the principal advantage seems to be the syntactic sugar of using parentheses '()' for indexing rather than cell array indexing '{}'.

Since string arrays are really a data container type, they just contain ordinary Matlab strings which we already know to be the equivalent of single-quoted 1xN character row vectors.  Hence, '\n' is two characters even within a string array created with quotes.
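
As a minimal sketch of that portable style (the variable names are only illustrative), a cell array of character vectors should behave the same way in Octave and Matlab:

```octave
% Cell array of character vectors: portable between Octave and Matlab.
names = {'alpha', 'beta', 'gamma'};

% Brace indexing returns the character vector itself ...
second = names{2};                  % 'beta'
% ... while parenthesis indexing returns a 1x1 cell.
sub = names(2);                     % {'beta'}

% Many string functions accept cell arrays directly.
upper_names = upper (names);        % {'ALPHA', 'BETA', 'GAMMA'}
is_beta = strcmp (names, 'beta');   % [0 1 0]

% Single-quoted literals keep backslashes as-is.
n = numel ('\n');                   % 2
```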

This is probably such an involved topic, with architectural implications, that we should discuss it face-to-face at OctConf 2018.

--Rik


Re: Matlab-compatible string class

Mike Miller-4
In reply to this post by John W. Eaton
On Wed, Dec 27, 2017 at 17:47:50 -0500, John W. Eaton wrote:
> I don't see a smooth transition path.  Does anyone have any ideas about
> what to do?

If we agree that it's important to support Matlab's double quote syntax
eventually, then we should probably start by discouraging the use of
backslash-escaped sequences in double quoted strings and other likely
incompatibilities.

We can start to implement the string data type without changing the
meaning of double quoted literals. We can implement the string() and
strings() constructor functions to create string objects without the
need for any change in how double quotes are interpreted.

After the type is supported, then we can look at changing how the
interpreter handles the double quoted literal string.

--
mike


Re: Matlab-compatible string class

John W. Eaton
Administrator
On 12/28/2017 03:39 PM, Mike Miller wrote:
> On Wed, Dec 27, 2017 at 17:47:50 -0500, John W. Eaton wrote:
>> I don't see a smooth transition path.  Does anyone have any ideas about
>> what to do?
>
> If we agree that it's important to support Matlab's double quote syntax
> eventually, then we should probably start by discouraging the use of
> backslash-escaped sequences in double quoted strings and other likely
> incompatibilities.

Do you think we should do that now?  If so, then something like the
attached diff will do it, but it will be very noisy because it will warn
about all strings with escape sequences.  It would be a lot better if we
could skip the warning if the string appears as the format argument of a
function that will do backslash escape processing for single-quoted
strings (and, presumably, string objects), but that will require quite a
bit more work.
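
The case where such a warning would be unnecessary can be sketched as follows.  This only illustrates existing Octave behavior, not the attached diff, and the variable names are made up for the example:

```octave
% printf-style functions expand escapes found in the format themselves,
% so a single-quoted format gives the same result as a double-quoted one.
a = sprintf ('x = %d\n', 42);   % escape expanded by sprintf
b = sprintf ("x = %d\n", 42);   % escape expanded by the parser
same = isequal (a, b);          % true

% Outside such functions the two kinds of literal differ.
differ = ~isequal ('x\n', "x\n");   % true
```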

> We can start to implement the string data type without changing the
> meaning of double quoted literals. We can implement the string() and
> strings() constructor functions to create string objects without the
> need for any change in how double quotes are interpreted.

Yes, we can create a string class independent of the way double-quoted
strings are handled.  This would probably be a good test of the
classdef implementation and is likely to uncover some bugs or missing
features.

> After the type is supported, then we can look at changing how the
> interpreter handles the double quoted literal string.

Yes.  This is the painful part...

jwe


Attachment: backslash-warning-diffs.txt (3K)

Re: Matlab-compatible string class

Richard Crozier
In reply to this post by jbect


On 28/12/17 05:38, Julien Bect wrote:
> On 27/12/2017 at 23:47, John W. Eaton wrote:
>> With this change:

>
> 2) Use the "old behavior" for now, since the change is still recent
> (R2016b I gather) and probably the support for string arrays won't be
> complete in the next releases.
>

Double quotes do not work in R2016b; you must use:

x = string('fdsafad')

to get a string.

Richard

--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.



Re: Matlab-compatible string class

John W. Eaton
Administrator
On 12/28/2017 07:42 PM, Richard Crozier wrote:

>
>
> On 28/12/17 05:38, Julien Bect wrote:
>> On 27/12/2017 at 23:47, John W. Eaton wrote:
>>> With this change:
>
>>
>> 2) Use the "old behavior" for now, since the change is still
>> recent (R2016b I gather) and probably the support for string arrays
>> won't be complete in the next releases.
>>
>
> double quotes do not work in R2016b, you must use:
>
> x = string('fdsafad')
>
> to get a string.

Yes, double quotes for strings were introduced in R2017a:

https://www.mathworks.com/help/matlab/release-notes.html

jwe



Re: Matlab-compatible string class

Richard Crozier
In reply to this post by Rik-4


On 28/12/17 16:41, Rik wrote:
> On 12/27/2017 09:38 PM, [hidden email] wrote:

>
> I don't see a smooth transition path either.  Although Matlab presents
> the string class as a fundamental data type, it is actually closer to a
> container data type like a cell array.  In fact, by using cell arrays of
> strings you can make portable code that will run in either Octave or
> Matlab.  Maybe this is unkind, but the principal advantage seems to be
> the syntactic sugar of using parentheses '()' for indexing rather than
> cell array indexing '{}'.
>

I believe you can also do things like:

x = "the cat " + "sat on the mat"

which is kind of handy. More like a python string.

Richard


Re: Matlab-compatible string class

ederag
In reply to this post by John W. Eaton
On Wednesday, December 27, 2017 17:47:50 John W. Eaton wrote:

> With this change:
>
>    http://hg.savannah.gnu.org/hgweb/octave/rev/0b65949870e3
>
> we are beginning to handle the existence of the Matlab string class that
> is now created in Matlab when double quoted strings are used.  I expect
> that it will not be too long before users will expect full compatibility
> in this area.  But I'm not sure how we will transition from Octave's
> current behavior for double-quoted string constants to the one now used
> by Matlab.  For example, in  Octave "foo\n" is a 4-element character
> array containing 'f', 'o', 'o', and a linefeed character.  In Matlab it
> is a string object, but it also contains 5 characters, 'f', 'o', 'o',
> '\', and 'n'.  I'm sure there are other differences, but this is sure to
> cause some trouble.  I don't see a smooth transition path.  Does anyone
> have any ideas about what to do?
>
> jwe
>

The Octave way is much smarter.
The double-quoted string is like a C string, very intuitive.
Changing that would be a regression, IMHO.

Could it be possible to expand escaped characters like \n
to '\', 'n' only in --braindead mode?

Ederag



Re: Matlab-compatible string class

John W. Eaton
Administrator
On 12/29/2017 06:41 AM, ederag wrote:

> Could it be possible to expand escaped characters like \n
> to '\', 'n' only in the --braindead mode ?

No, we've been down that path before and we won't do it again.  It makes
it too difficult to write code that will just work if the meaning of the
language can change based on some option.  What happens when you pass
your code that requires "\n" to mean LF to someone who uses Octave with
--braindead?

jwe


Re: Matlab-compatible string class

John W. Eaton
Administrator
In reply to this post by Richard Crozier
On 12/28/2017 07:57 PM, Richard Crozier wrote:

>
>
> On 28/12/17 16:41, Rik wrote:
>> On 12/27/2017 09:38 PM, [hidden email] wrote:
>
>>
>> I don't see a smooth transition path either.  Although Matlab presents
>> the string class as a fundamental data type, it is actually closer to
>> a container data type like a cell array.  In fact, by using cell
>> arrays of strings you can make portable code that will run in either
>> Octave or Matlab.  Maybe this is unkind, but the principal advantage
>> seems to be the syntactic sugar of using parentheses '()' for indexing
>> rather than cell array indexing '{}'.
>>
>
> I believe you can also do things like:
>
> x = "the cat " + "sat on the mat"
>
> which is kind of handy. More like a python string.

This could have also been done with character arrays.  Except that the
original design of Matlab chose to have character arrays work like
numbers when used with arithmetic operators.
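
A short sketch of that design choice, using made-up variable names.  With character arrays, "+" operates on character codes, so concatenation has to be written with brackets (or strcat):

```octave
% Character arrays are converted to their character codes in arithmetic.
code = 'a' + 1;                 % 98, a double
shifted = char ('abc' + 1);     % 'bcd'

% "+" on two char arrays of equal length adds the codes elementwise.
sums = 'ab' + 'cd';             % [196 198]

% Concatenation uses brackets, not "+".
x = ['the cat ', 'sat on the mat'];
```

Note that 'the cat ' + 'sat on the mat' is an error for character arrays, because the operands have different lengths.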

jwe


Re: Matlab-compatible string class

Rik-4
In reply to this post by John W. Eaton
On 12/29/2017 03:41 AM, ederag wrote:
> On Wednesday, December 27, 2017 17:47:50 John W. Eaton wrote:
>> With this change:
>>
>>    http://hg.savannah.gnu.org/hgweb/octave/rev/0b65949870e3
>>
>> we are beginning to handle the existence of the Matlab string class that
>> is now created in Matlab when double quoted strings are used.  I expect
>> that it will not be too long before users will expect full compatibility
>> in this area.  But I'm not sure how we will transition from Octave's
>> current behavior for double-quoted string constants to the one now used
>> by Matlab.  For example, in  Octave "foo\n" is a 4-element character
>> array containing 'f', 'o', 'o', and a linefeed character.  In Matlab it
>> is a string object, but it also contains 5 characters, 'f', 'o', 'o',
>> '\', and 'n'.  I'm sure there are other differences, but this is sure to
>> cause some trouble.  I don't see a smooth transition path.  Does anyone
>> have any ideas about what to do?
>>
>> jwe
>
> The octave way is much smarter.
> The double quote string is like a C string, very intuitive.
> Changing that would be a regression, IMHO.
>
> Could it be possible to expand escaped characters like \n
> to '\', 'n' only in the --braindead mode ?

I'm also not interested in taking away good features of the Octave language.  The vision is that Octave is a superset of Matlab--it can do everything Matlab does, and then some more.  In this case, the single/double quote distinction is very familiar to anyone who has done any sort of programming, and it is also useful for making strings more readable when they contain single quotes.

Consider,

'I can''t understand, don''t want, shan''t accept losing double quotes'
to
"I can't understand, don't want, shan't accept losing double quotes"
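
As a quick check in current Octave (a sketch with illustrative variable names), both spellings produce the same character array, so the choice of quotes is purely about readability here:

```octave
s1 = 'I can''t understand';
s2 = "I can't understand";
same = strcmp (s1, s2);         % true

% Inside double quotes, a double quote can be written as \".
s3 = "say \"hi\"";
s4 = 'say "hi"';
same_dq = strcmp (s3, s4);      % true
```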

Of course, the devil is in the details and finding some way of accommodating string arrays will be important.

--Rik


Re: Matlab-compatible string class

Mike Miller-4
On Fri, Dec 29, 2017 at 08:27:13 -0800, Rik wrote:
> Consider,
>
> 'I can''t understand, don''t want, shan''t accept losing double quotes'
> to
> "I can't understand, don't want, shan't accept losing double quotes"

This will probably continue to work the same during and after the
transition to the new string type.

--
mike


Re: Matlab-compatible string class

Daniel Sebald
In reply to this post by John W. Eaton
On 12/29/2017 07:22 AM, John W. Eaton wrote:

> On 12/29/2017 06:41 AM, ederag wrote:
>
>> Could it be possible to expand escaped characters like \n
>> to '\', 'n' only in the --braindead mode ?
>
> No, we've been down that path before and we won't do it again.  It makes
> it too difficult to write code that will just work if the meaning of the
> language can change based on some option.  What happens when you pass
> your code that requires "\n" to mean LF to someone who uses Octave with
> --braindead?

One could make such an option settable in software, e.g.,

interpconfig('-stringtype', 'escape');
AllMyOtherCode();

That avoids the command-line configuration option, but the interpconfig()
function wouldn't be backward compatible; not the most egregious problem
though.  It would be something like:

if (compare_versions(version(), "4.4.0", ">="))
   interpconfig('-stringtype', 'escape');
end
AllMyOtherCode();

That would be fairly backward compatible, I would think.

In some sense, the above is analogous to the "tex" setting of graphics text:

octave:45> get(get(gca,"title"), "interpreter")
ans = tex

With that in mind, an alternative approach would be to add such an option
to all the functions that use strings.  That is, leave the
interpretation/use of the string to the very last moment.  How to
display such strings at the command line (i.e., escape or non-escape) is
arbitrary I suppose then, but how important that is, I don't know.

Dan


Re: Matlab-compatible string class

John W. Eaton
Administrator
On 12/29/2017 03:58 PM, Daniel J Sebald wrote:

> On 12/29/2017 07:22 AM, John W. Eaton wrote:
>> On 12/29/2017 06:41 AM, ederag wrote:
>>
>>> Could it be possible to expand escaped characters like \n
>>> to '\', 'n' only in the --braindead mode ?
>>
>> No, we've been down that path before and we won't do it again.  It
>> makes it too difficult to write code that will just work if the
>> meaning of the language can change based on some option.  What happens
>> when you pass your code that requires "\n" to mean LF to someone who
>> uses Octave with --braindead?
>
> Could make such an option software settable, e.g.,
>
> interpconfig('-stringtype', 'escape');
> AllMyOtherCode();
>
> That avoids the command-line configuration option, but interpconfig()
> function wouldn't be backward compatible; not the most egregious problem
> though.  It would be something like:
>
> if (compare_versions(version(), "4.4.0", ">="))
>    interpconfig('-stringtype', 'escape');
> end
> AllMyOtherCode();
>
> That would be fairly backward compatible, I would think.
>
> In some sense, the above is analogous to the "tex" setting of graphics
> text:
>
> octave:45> get(get(gca,"title"), "interpreter")
> ans = tex
>
> With that in mind, an alternative approach would be to add such an option
> to all the functions that use strings.  That is, leave the
> interpretation/use of the string to the very last moment.  How to
> display such strings at the command line (i.e., escape or non-escape) is
> arbitrary I suppose then, but how important that is, I don't know.

No, it is no good to have some kind of option that controls this kind of
behavior.  It means that I can't write a function that uses a backslash
escape sequence unless I first ensure that it works the way I want.

Really, we don't want this.  We already did this experiment with many
other user-configurable settings that affect the way the language works
and we spent literally years removing all of them.  The configuration
options that remain affect things like output that don't matter so much.

jwe


Re: Matlab-compatible string class

Richard Crozier
In reply to this post by John W. Eaton


On 29/12/17 13:25, John W. Eaton wrote:

> On 12/28/2017 07:57 PM, Richard Crozier wrote:
>>
>>
>> On 28/12/17 16:41, Rik wrote:
>>> On 12/27/2017 09:38 PM, [hidden email] wrote:
>>
>>>
>>> I don't see a smooth transition path either.  Although Matlab
>>> presents the string class as a fundamental data type, it is actually
>>> closer to a container data type like a cell array.  In fact, by using
>>> cell arrays of strings you can make portable code that will run in
>>> either Octave or Matlab.  Maybe this is unkind, but the principal
>>> advantage seems to be the syntactic sugar of using parentheses '()'
>>> for indexing rather than cell array indexing '{}'.
>>>
>>
>> I believe you can also do things like:
>>
>> x = "the cat " + "sat on the mat"
>>
>> which is kind of handy. More like a python string.
>
> This could have also been done with character arrays.  Except that the
> original design of Matlab chose to have character arrays work like
> numbers when used with arithmetic operators.
>
> jwe

Definitely true, and I guess the string class is their way of trying to
fix this while maintaining backward compatibility.

Richard



Re: Matlab-compatible string class

Daniel Sebald
In reply to this post by John W. Eaton
On 12/29/2017 03:31 PM, John W. Eaton wrote:

> On 12/29/2017 03:58 PM, Daniel J Sebald wrote:
>> On 12/29/2017 07:22 AM, John W. Eaton wrote:
>>> On 12/29/2017 06:41 AM, ederag wrote:
>>>
>>>> Could it be possible to expand escaped characters like \n
>>>> to '\', 'n' only in the --braindead mode ?
>>>
>>> No, we've been down that path before and we won't do it again.  It
>>> makes it too difficult to write code that will just work if the
>>> meaning of the language can change based on some option.  What
>>> happens when you pass your code that requires "\n" to mean LF to
>>> someone who uses Octave with --braindead?
>>
>> Could make such an option software settable, e.g.,
>>
>> interpconfig('-stringtype', 'escape');
>> AllMyOtherCode();
>>
>> That avoids the command-line configuration option, but interpconfig()
>> function wouldn't be backward compatible; not the most egregious
>> problem though.  It would be something like:
>>
>> if (compare_versions(version(), "4.4.0", ">="))
>>    interpconfig('-stringtype', 'escape');
>> end
>> AllMyOtherCode();
>>
>> That would be fairly backward compatible, I would think.
>>
>> In some sense, the above is analogous to the "tex" setting of graphics
>> text:
>>
>> octave:45> get(get(gca,"title"), "interpreter")
>> ans = tex
>>
>> With that in mind, an alternative approach would be to add such an option
>> to all the functions that use strings.  That is, leave the
>> interpretation/use of the string to the very last moment.  How to
>> display such strings at the command line (i.e., escape or non-escape)
>> is arbitrary I suppose then, but how important that is, I don't know.
>
> No, it is no good to have some kind of option that controls this kind of
> behavior.  It means that I can't write a function that uses a backslash
> escape sequence unless I first ensure that it works the way I want.
>
> Really, we don't want this.  We already did this experiment with many
> other user-configurable settings that affect the way the language works
> and we spent literally years removing all of them.  The configuration
> options that remain affect things like output that don't matter so much.

Here are a couple of other ideas, but first, it seems that no matter what
one does here there is a backward compatibility issue.  Making the change
will mean that there are probably lots of user-written Octave scripts out
there using escape-strings that will no longer work correctly.  (One
solution is a translator, which I'll come back to.)

1) This idea isn't much different from the compatibility option.
Have the routine eval() accept an input option such as "legacy",
"escapestring", or "cstylestring" that will treat all strings as
escape-strings, i.e., C-style.  That way, if someone has some code with
escape characters in it, s/he can still use it.  (BTW, I assume the only
way to get escapes into new-strings will be via sprintf.)

2) Introduce a new escaped-string or C-style-string syntax using a
character that isn't likely to be used for something else in a
syntactical sense.  For example, currently the following (and many other
variations) creates a syntax error:

octave:1> x = \"This is an\nescaped string.\n"
parse error:

   syntax error

 >>> x = \"This is an\nescaped string.\n"
         ^

But if that syntax were to mean treat the string as C-style, then it is
close to the current format.  Note that the above is slightly unusual
from a language point of view because only the opening delimiter has a
backslash (\") while the closing delimiter is a plain ".  Otherwise there
would be no way of using \" inside the string.  For example:

octave:1> x = "This is a\nregular string.\n"
x = This is a\nregular string.\n
octave:2> x = \"This is an\n\"escape\" string.\n"
x = This is an
"escape" string.

If one wanted to retain the open/close delimiter similarity, then maybe

octave:2> x = /"This is an\n\"escape\" string.\n/"
x = This is an
"escape" string.

Or,

octave:2> x = @This is an\n\"escape\" string.\n@
x = This is an
"escape" string.

uses a single character on both ends, but it is getting too far afield from
what one normally might guess to be a string delimiter symbol.  I
sort of like the idea of just putting a backslash before any string to
make it an "escape-string".

So, here's where an Octave-supplied "translator-function" comes into
play.  It would take as input a string or file and identify any legacy
"..." strings in the item and convert the strings to \"...".  A bit
tricky in the sense that some parsing is required to identify what are
truly strings as opposed to say " used in a comment, and things that are
already \"" should not become \\"", etc.  However, editors like vim
already accomplish these sorts of language parsings for color
highlighting without too much hassle.

Dan


Re: Matlab-compatible string class

ederag
In reply to this post by John W. Eaton
On Friday, December 29, 2017 08:22:23 John W. Eaton wrote:

> On 12/29/2017 06:41 AM, ederag wrote:
>
> > Could it be possible to expand escaped characters like \n
> > to '\', 'n' only in the --braindead mode ?
>
> No, we've been down that path before and we won't do it again.  It makes
> it too difficult to write code that will just work if the meaning of the
> language can change based on some option.  What happens when you pass
> your code that requires "\n" to mean LF to someone who uses Octave with
> --braindead?
>
> jwe


OK, I missed some history.  I liked
https://savannah.gnu.org/bugs/?35911#comment3
but --braindead is very cosmetic now.
That solves many development headaches,
so the following is just sharing some thoughts.


Right now, in Octave,
numel("\t123\n")
ans =  5
makes sense.
In Matlab this would yield 2+3+2=7, which is very non-intuitive
for programmers coming from another language, isn't it?


Another one:
strtrim(" \t1 2 3\t\n")
ans = 1 2 3

But with matlab's current implementation, this would yield the same as
strtrim(' \t1 2 3\t\n')
ans = \t1 2 3\t\n


Of course it is possible to insert an sprintf() around the string, like
strtrim(sprintf(' \t1 2 3\t\n'))
but this looks ugly.
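
As a sketch of an alternative to the sprintf() wrapper, Octave also provides do_string_escapes, an Octave-specific function that expands escape sequences in an existing character array without treating it as a format.  The variable names below are only illustrative:

```octave
raw = ' \t1 2 3\t\n';               % single-quoted: 12 characters, nothing expanded
cooked = do_string_escapes (raw);   % escapes expanded: 9 characters
trimmed = strtrim (cooked);         % '1 2 3'

% Unlike sprintf, do_string_escapes does not interpret percent signs.
pct = do_string_escapes ('100%\n'); % '100%' followed by a linefeed
```

This keeps the escape handling explicit at the call site, at the cost of code that no longer runs unchanged in Matlab.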

Let's hope they think it over.
But I guess python will remain superior to matlab for string handling.
https://docs.python.org/3/reference/lexical_analysis.html#string-and-bytes-literals
especially
"": interpreted (like octave)
r"": raw (like current matlab)
not to mention the handy f"" format...

Let's say that r"" means raw string in Octave too.

One way to maintain octave's nice behavior while allowing
to use or distribute packages compatible with matlab
would be a way to declare _files_ as "matlab braindead".
The "pkg load" or initialization script would perform the declaration.
Especially for pkg load, this would be deterministic and transparent to the user.

Instead of braindead, borrowing Dan's suggestion, it could be
interpconfig({files}, '-defaultStringType', 'escape');

For these marked files, "" would be interpreted as r"" strings.

Since it affects only the interpretation of a file,
it seems safer and easier to maintain
than the old global --braindead that is definitely discarded.

Ederag


Re: Matlab-compatible string class

John W. Eaton
Administrator
On 12/31/2017 05:04 AM, ederag wrote:

> Let's hope they think it over.

I don't think that they will.

> But I guess python will remain superior to matlab for string handling.

I may agree with you, but I also suppose that is subjective.

> https://docs.python.org/3/reference/lexical_analysis.html#string-and-bytes-literals
> especially
> "": interpreted (like octave)
> r"": raw (like current matlab)
> not to mention the handy f"" format...
>
> Let's say that r"" mean raw string in octave too.

If you want this in Octave, then it has to be the other way around.

"" is compatible with Matlab, always.

i"" or e"" or something means "interpolated escape sequences in this
string".

> One way to maintain octave's nice behavior while allowing
> to use or distribute packages compatible with matlab
> would be a way to declare _files_ as "matlab braindead".
> The "pkg load" or initialization script would perform the declaration.
> Especially for pkg load, this would be deterministic and transparent to the user.
>
> Instead of braindead, borrowing Dan's suggestion, it could be
> interpconfig({files}, '-defaultStringType', 'escape');
>
> For these marked files, "" would be interpreted as r"" strings.
>
> Since it affects only the interpretation of a file,
> it seems safer and easier to maintain
> than the old global --braindead that is definitely discarded.

No, we are not doing any of that.  Sorry, but as I said, we've been down
that path before and we are not doing it again.  Do I have to throw in a
"this is non-negotiable"?  :-)

jwe


Re: Matlab-compatible string class

ederag
On Sunday, December 31, 2017 09:06:43 John W. Eaton wrote:
> > Let's say that r"" mean raw string in octave too.
>
> If you want this in Octave, then it has to be the other way around.
>
> "" is compatible with Matlab, always.
>
> i"" or e"" or something means "interpolated escape sequences in this
> string".


Forgetting about backward compatibility,
this would be an interesting addition.


> > One way to maintain octave's nice behavior while allowing
> > to use or distribute packages compatible with matlab
> > would be a way to declare _files_ as "matlab braindead".
> > The "pkg load" or initialization script would perform the declaration.
> > Especially for pkg load, this would be deterministic and transparent to the user.
> >
> > Instead of braindead, borrowing Dan's suggestion, it could be
> > interpconfig({files}, '-defaultStringType', 'escape');
> >
> > For these marked files, "" would be interpreted as r"" strings.
> >
> > Since it affects only the interpretation of a file,
> > it seems safer and easier to maintain
> > than the old global --braindead that is definitely discarded.
>
> No, we are not doing any of that.  Sorry, but as I said, we've been down
> that path before and we are not doing it again.  Do I have to throw in a
> "this is non-negotiable"?  :-)


Of course no need to :-)
Although I sometimes wish octave would be freer from matlab,
it was simple feedback.

After searching for posts about "braindead",
it just seemed an idea on the verge of being interesting.
And sufficiently distinct from the --braindead
to be exposed and probably discarded.
The proposition was more shallow:
only the file processed,
no propagation inside any called function for instance.

Now this too has been clearly discarded,
and again, I can see why.
Thanks for the clarification.

Ederag

