Converting (more or less) arbitrary strings to valid variable names

Converting (more or less) arbitrary strings to valid variable names

Bård Skaflestad
All,

In the long-running thread on "C-equivalent" structure (array) initialisation, someone asked for a way of converting an arbitrary string into a valid variable name.  Unfortunately, I deleted the e-mail too early so I cannot give proper attribution here.

Still, I may be able to provide at least a partial answer to the inquiry.  The built-in function 'genvarname', present since at least Octave 3.2.3, solves part of this problem.  Here's an example:

    > genvarname ("0  _f00  ba'r")
    ans = _0___f00__ba_r

Obviously, the normal restrictions on variable names apply (the limit of "namelengthmax" characters being the most severe).
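
A quick way to check a generated name is to pass it through 'isvarname' and compare its length against 'namelengthmax'.  The following is only a minimal sketch; the exact name that 'genvarname' returns may differ between Octave versions, so the code checks validity rather than a specific string.

```octave
str  = "0  _f00  ba'r";
name = genvarname (str);

% isvarname returns true when its argument is a legal variable name.
is_valid = isvarname (name);

% namelengthmax returns the maximum number of characters in a name.
fits = (numel (name) <= namelengthmax ());

printf ("'%s' -> '%s' (valid: %d, fits: %d)\n", str, name, is_valid, fits);
```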


I hope this helps a little bit.

Sincerely,
--
Bård Skaflestad
SINTEF ICT, Applied Mathematics
_______________________________________________
Help-octave mailing list
[hidden email]
https://mailman.cae.wisc.edu/listinfo/help-octave

Re: Converting (more or less) arbitrary strings to valid variable names

Sergei Steshenko




----- Original Message -----

> From: Bård Skaflestad <[hidden email]>
> To: "[hidden email]" <[hidden email]>
> Cc:
> Sent: Monday, November 19, 2012 11:47 PM
> Subject: Converting (more or less) arbitrary strings to valid variable names

There is an old Jewish joke.

The joke is a story about a poor Jew who had quite a hard life.

So he, as the tradition prescribes, decided to consult the local rabbi.

The rabbi said: "Buy a goat".

The poor Jew didn't quite understand the essence of the advice, but followed it.

...

Several months later the poor Jew visited the rabbi again, this time simply lamenting and crying about how unbearable his life had become.

The rabbi said: "Sell the goat!".

The day after selling the goat, the poor Jew couldn't find enough words of gratitude to thank the rabbi. He couldn't believe the relief he felt once the goat was gone.

...


The problem with your function is that there may already be underscores in an otherwise illegal name, e.g. 'foo_ _bar'. So, there will be no easy way to get back the original name.
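
The loss of information can be seen with two different inputs that end up as the same name.  This is a small sketch, assuming 'genvarname' replaces every invalid character with an underscore, as the example earlier in the thread suggests.

```octave
a = genvarname ("foo bar");    % the space is not allowed in a name
b = genvarname ("foo_bar");    % already a valid name
collide = strcmp (a, b);       % true when both inputs yield the same name
printf ("'%s' vs '%s' (collide: %d)\n", a, b, collide);
```

Once the two inputs collide, nothing in the generated name says which original string it came from.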

A much better way has already been suggested: uuencode/uudecode-like solutions, i.e. ones where reversibility is guaranteed.

But why do we need the "goat" in the first place?

I went carefully through https://en.wikipedia.org/wiki/List_of_data_structures and did _not_ find "struct arrays" there. So, this Matlab artifact should be implemented in the manner least obtrusive to the end user, only to make sure Matlab code runs in Octave.

A _consistent_ hash table implementation is fair enough.
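
For illustration only, here is what a hash table with arbitrary string keys might look like.  The sketch assumes an Octave release that provides the 'containers.Map' class; it is not something the thread itself relies on.

```octave
m = containers.Map ();          % char keys, values of any type by default
m("foo bar")   = 1;
m("foo_ _bar") = 2;

k = keys (m);                   % cell array holding the original key strings
v = m("foo_ _bar");
```

The keys are stored unchanged, so no conversion to a valid identifier (and no way back) is needed.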

Regards,
  Sergei.

RE: Converting (more or less) arbitrary strings to valid variable names

Bård Skaflestad
Sergei,

I'm afraid I don't fully understand your joke.  Maybe you wish to point out that imposing needless restrictions on your own (working) environment leads to grief that is better avoided altogether.  I think that's an astute observation and I won't argue the point.

Like I said, the 'genvarname' function is at best a partial solution.  It has the advantage of being built into Octave but, as you say, there is no direct support for recovering the original string from the 'genvarname' output.  If reversibility is important then I suppose uu{en,de}code might be a better solution.
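
A reversible encoding could be sketched as follows.  Uuencoded text itself contains characters that are not allowed in identifiers, so this sketch swaps in plain hexadecimal encoding, which gives the same reversibility.  The helper names 'str2name' and 'name2str' are hypothetical, and the encoded name is still subject to "namelengthmax".

```octave
% Hypothetical helpers, not part of Octave.
% Encode: "x" followed by two hex digits per character of the input.
str2name = @(str) ["x", sprintf("%02x", double (str))];
% Decode: drop the "x", split into two-digit groups, convert back to chars.
name2str = @(name) char (hex2dec (reshape (name(2:end), 2, []).')).';

orig = "foo_ _bar";
name = str2name (orig);
back = name2str (name);
printf ("'%s' -> %s -> '%s'\n", orig, name, back);
```

The price is that the names become unreadable and grow to twice the length of the original string plus one.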

On the other hand, you *could* create some sort of "translation table" to preserve reversibility when dynamically forming structure field names through 'genvarname'.  For instance, you could do something like this (pseudo code, glossing over a lot of detail, for instance what to do with repeated strings):

    symtab = cell(0, 2);    % columns: generated name, original string
    S = struct();
    while (get data)
         str = some string that uniquely identifies 'data';
         vname = genvarname(str, symtab(:,1));

         S.(vname) = data;
         symtab = [ symtab ; { vname, str } ];
    endwhile

Then, whenever you require the identification string of a field name, say, '_foo', you can extract it from 'symtab' using a statement such as

     str = symtab{strcmp(symtab(:,1), '_foo'), 2}

It is certainly not pretty, but it is a general approach that can be adapted to many interesting situations.
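
A runnable version of the translation-table idea might look like the sketch below.  The sample data in 'items' is made up for illustration and stands in for the "get data" loop; repeated strings are still not handled.

```octave
% Sample input standing in for the "get data" loop: { id string, value }.
items = { "foo bar", 1 ; "foo_bar", 2 ; "0 baz", 3 };

symtab = cell (0, 2);           % columns: field name, original string
S = struct ();

for k = 1:size (items, 1)
  str   = items{k,1};
  % Pass the names used so far so genvarname avoids them.
  vname = genvarname (str, symtab(:,1));

  S.(vname) = items{k,2};
  symtab = [ symtab ; { vname, str } ];
endfor

% Recover the original string for every field of S.
fn = fieldnames (S);
for k = 1:numel (fn)
  str = symtab{strcmp(symtab(:,1), fn{k}), 2};
  printf ("%s <- '%s'\n", fn{k}, str);
endfor
```

Because the names already in 'symtab' are passed as exclusions, "foo bar" and "foo_bar" end up in different fields even though they would otherwise collide.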

Sincerely,
--
Bård Skaflestad
SINTEF ICT, Applied Mathematics

________________________________________
From: Sergei Steshenko [[hidden email]]
Sent: 20 November 2012 03:23
To: Bård Skaflestad; [hidden email]
Subject: Re: Converting (more or less) arbitrary strings to valid variable names

Re: Converting (more or less) arbitrary strings to valid variable names

Sergei Steshenko




----- Original Message -----

> From: Bård Skaflestad <[hidden email]>
> To: Sergei Steshenko <[hidden email]>; "[hidden email]" <[hidden email]>
> Cc:
> Sent: Tuesday, November 20, 2012 3:22 PM
> Subject: RE: Converting (more or less) arbitrary strings to valid variable names

Please bottom-post.

The essence of the joke is that if someone complains about something, you first make that someone's life even harder (make the complainer buy the goat, which then lives in the same house as the complainer), and then suggest getting rid of this new artificial difficulty ("sell the goat").

The newly acquired relief masks the original complaint.


So, further along the "goat" lines.


The original complaint is that

    foo = struct
    setfield(foo, "foo bar", 1)

works, but

    foo = struct
    foo.("foo bar") = 1

doesn't work.


The "goat" is adding a check of the hash key to make sure it's a valid identifier.


"Selling the goat" is getting rid of the check.


Fixing the original problem (the poor Jew's original complaint regarding his hard life) is making

    foo.("foo bar") =

work, and additionally dropping the worship of Matlab's original idiocy (Matlab, IIUC, requires struct field names to be valid identifiers).


Regards,
  Sergei.


