character strings in octave

John Eaton-3
Octave currently has a very limited ability to handle character
strings.  I would like to improve this, but I'm not sure how to
proceed.

Matlab's way of treating character data as a numeric matrix with a
special flag set that says `print these values as characters instead
of numbers' is not very appealing to me.

For string operations, how important is it to be compatible with
Matlab?

I don't think it's a very good idea to encourage users of a
high-level language to think in terms of ASCII codes when they need to
do character manipulations.

Do very many people rely on Matlab's storage scheme, or do they use it
just because it's there, and there's no other way to do the operations
they need?

If it is not terribly important to be compatible with Matlab, what
features would you like to see?

The minimum functionality that I expect Octave to provide would be
some way to define individual character strings and arrays of
character strings, probably using the existing syntax:

  s = 'this is a string'

  A = [ 'this is an array of strings' ;
        'not all of the strings have to be the same length' ]

Unfortunately, there seem to be a couple of problems.

  * Should a string and an array of strings be different things, or
    should a string simply be an array of strings with a single
    element?  If they are treated as different objects, should there
    be an implicit conversion if an array of strings is created with
    only one element?

  * How should indexing work?  One possible approach is:

      -- A single index on an array of strings selects one (or
         several) of the strings.

      -- A single index on a character string selects a substring.

      -- Two indices for an array of strings could be used to do both
         of these operations at once.

      -- Two indices for a string results in an error.
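
To make the indexing rules above concrete, here is a rough sketch
written in Python, used only as neutral notation.  Nothing in it is
Octave code: the function names are hypothetical, a Python list stands
in for an array of strings, and indices are zero-based Python integers
or slices rather than Octave ranges.

```python
def index_string(s, *indices):
    """Hypothetical rule for a single string: one index selects a
    substring, and any other number of indices is an error."""
    if len(indices) != 1:
        raise IndexError("a string accepts exactly one index")
    return s[indices[0]]


def index_string_array(strings, *indices):
    """Hypothetical rule for an array of strings.

    One index selects one string (int) or several strings (slice).
    Two indices select strings first, then a substring of each.
    """
    if len(indices) == 1:
        return strings[indices[0]]
    if len(indices) == 2:
        which, part = indices
        picked = strings[which]
        if isinstance(picked, str):
            # A single string was selected: return its substring.
            return picked[part]
        # Several strings were selected: take the substring of each.
        return [p[part] for p in picked]
    raise IndexError("an array of strings accepts one or two indices")


A = ["this is an array of strings",
     "not all of the strings have to be the same length"]

second = index_string_array(A, 1)                        # the second string
word = index_string_array(A, 0, slice(0, 4))             # "this"
heads = index_string_array(A, slice(0, 2), slice(0, 3))  # ["thi", "not"]
```

Note that the sketch already runs into the first question above: when
one index selects a single element, it returns a plain string rather
than a one-element array, which is exactly the implicit conversion
that would have to be decided.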


Also, I am leaning toward making all numeric operations on strings
invalid.  However, if you really feel the need to do some
manipulations on the ASCII values, you will still be able to, because
it will always be possible to convert a string or an array of strings
to and from a numeric matrix.
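
As an illustration of what such an explicit conversion might look
like, here is another small Python sketch.  The function names are
hypothetical, and padding short strings with blanks to get a
rectangular matrix is only an assumption made for the example, not
something that has been decided.

```python
def string_to_codes(s):
    """Convert one string to a list of its character codes."""
    return [ord(ch) for ch in s]


def codes_to_string(codes):
    """Convert a list of character codes back to a string."""
    return "".join(chr(c) for c in codes)


def strings_to_matrix(strings, pad=" "):
    """Convert an array of strings to a rectangular matrix of codes.

    Assumption: shorter strings are padded on the right with `pad`
    so that every row has the same length.
    """
    width = max((len(s) for s in strings), default=0)
    return [string_to_codes(s.ljust(width, pad)) for s in strings]


def matrix_to_strings(matrix):
    """Convert a matrix of codes back to an array of strings.

    Padding is kept, because the matrix cannot distinguish it from
    trailing blanks that were part of the original string.
    """
    return [codes_to_string(row) for row in matrix]


# Numeric manipulation happens only on the converted values:
codes = string_to_codes("abc")                       # [97, 98, 99]
upper = codes_to_string([c - 32 for c in codes])     # "ABC"
```

The point of the design is that arithmetic such as `c - 32` is applied
to numbers the user asked for explicitly, never to the string itself.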

I would appreciate any comments or suggestions.

--
John W. Eaton      | The exam demonstrates a comminuted, slightly overlapping
[hidden email] | angulated fracture of the midfifth metatarsal.