io-2.4.13 released


PhilipNienhuis
Hi,

There's a new release of the io package, io-2.4.13 [1].
It's primarily a bug-fix release.

Please see the NEWS for what this new release brings [2]


For the Octave maintainers ML here's some extra info on the next plans
for the io package, motivated by a question by Markus Muetzel about the
status of the plans for moving spreadsheet I/O + XML functions to core
Octave, partly or completely.

Well, whether that will ever happen, let alone whether it is still
desired, is beyond me. Yet I do want to bring the spreadsheet I/O
functions code base into a much more maintainable state. A benefit
is that it'll make it easier to move them to core some day, maybe.

An important part is merging the now largely separate ods????? and
xls????? function sets into just one xls????? function set, morphing
the ods????? functions into wrappers for their xls????? counterparts and
deprecating them. After all, Matlab's xlsread can perfectly read .ods
these days and we do want to be compatible.
This way a lot of duplicate code can be eliminated.

A next step might be splitting up the OCT interface .xlsx and maybe .ods
private functions into smaller dedicated pieces. However, .xlsx and .ods
are very complicated file formats so maintaining these functions still
won't be easy peasy.

Another wish is to adapt the current test scripts (among others, test_spsh.m)
to "regular" BIST tests as much as possible. More on that once I get there.

I plan to start a 2.6.x branch for these developments soon.

Anyway, enjoy.

Philip

[1] https://octave.sourceforge.io/io/index.html
[2] https://octave.sourceforge.io/io/NEWS.html


Re: io-2.4.13 released

Olaf Till-2
On Thu, Oct 17, 2019 at 09:51:04PM +0200, Philip Nienhuis wrote:
> and morphing the ods?????
> functions into wrappers for their xls????? counterparts and deprecate them.
> After all, Matlab's xlsread can perfectly read .ods these days and we do
> want to be compatible.

According to Matlab online documentation:

https://de.mathworks.com/help/matlab/ref/xlsread.html?searchHighlight=xlsread&s_tid=doc_srchtitle

xlsread() is deprecated now in Matlab and readtable(), readmatrix(),
or readcell() should be used.

I'm not familiar with these functions, but from the perspective of
naming, this move of Matlab seems a good one to me. For many people,
'spreadsheet' (if they use this term at all) is the same as 'Excel',
which is a misconception we shouldn't contribute to. But we would, if
we used 'xls...()' for every type of spreadsheet.

Olaf

--
public key id EAFE0591, e.g. on x-hkp://pool.sks-keyservers.net


Re: io-2.4.13 released

PhilipNienhuis
Olaf Till-2 wrote

> On Thu, Oct 17, 2019 at 09:51:04PM +0200, Philip Nienhuis wrote:
>> and morphing the ods?????
>> functions into wrappers for their xls????? counterparts and deprecate
>> them.
>> After all, Matlab's xlsread can perfectly read .ods these days and we do
>> want to be compatible.
>
> According to Matlab online documentation:
>
> https://de.mathworks.com/help/matlab/ref/xlsread.html?searchHighlight=xlsread&s_tid=doc_srchtitle
>
> xlsread() is deprecated now in Matlab and readtable(), readmatrix(),
> or readcell() should be used.

Not so much "deprecated" but rather "not recommended".

Yeah, Matlab's xlsread/xlswrite have a lot of quirks that IMO are a direct
consequence of their dependence on MS-Excel behind the scenes. That might be
a reason for TMW to move away from these functions.
When I started writing the Octave counterparts ten years ago, incl. the
ods????? siblings, I tried to avoid those issues as much as I could, but
often Matlab compatibility was -and still is- in the way.


> I'm not familiar with these functions, but from the perspective of
> naming, this move of Matlab seems a good one to me. For many people,
> 'spreadsheet' (if they use this term at all) is the same as 'Excel',
> which is a misconception we shouldn't contribute to. But we would, if
> we used 'xls...()' for every type of spreadsheet.

Oh, I wouldn't quite mind if all spreadsheet I/O were done through
functions starting with "ods" rather than "xls" in their names :-)  Or maybe
a more neutral "spsh". But then that Matlab compatibility ...

I wouldn't be surprised if the spreadsheet I/O in the io package is among
the most used of all OF package functions. FWIW, the vast majority of io
package bug reports was about .xls and .xlsx; bug reports for .ods have been
very scarce. Adding in JWE's observation (expressed at an OctConf 2018
lecture) that the user base mostly cares about Matlab compatibility I think
we'd just have to yield to spreadsheet I/O function names starting with
"xls", at least for the time being. Whether in an OF package or in core
wouldn't matter.

Mind you, for Matlab there's a Spreadsheet Link toolbox. Just guess what the
underlying external SW for that toolbox is. Hint: it isn't LibreOffice nor
any other OSS spreadsheet SW.

For readtable() and friends Octave needs to have the Table class
implemented. To my knowledge there are no definite plans for that; there is
a prospect of a start by Markus Bergholz somewhere in (IIRC) the bug
tracker, from several years back.

Philip




--
Sent from: https://octave.1599824.n4.nabble.com/Octave-Maintainers-f1638794.html


Re: io-2.4.13 released

siko1056
On 10/18/19 5:30 PM, PhilipNienhuis wrote:

> [...]
>
> I wouldn't be surprised if the spreadsheet I/O in the io package is among
> the most used of all OF package functions. FWIW, the vast majority of io
> package bug reports was about .xls and .xlsx; bug reports for .ods have been
> very scarce. Adding in JWE's observation (expressed at an OctConf 2018
> lecture) that the user base mostly cares about Matlab compatibility I think
> we'd just have to yield to spreadsheet I/O function names starting with
> "xls", at least for the time being. Whether in an OF package or in core
> wouldn't matter.
>

Thank you for the new release, Philip.  Sure, Octave without the io
toolbox would be far less attractive for the many former Matlab and
current MS Windows users who no longer have a Matlab license but do have
tons of "old" Matlab code reading spreadsheet data with "xlsread".
Unfortunately, for many of those users the words "data" and "MS Excel
spreadsheets" are still synonyms.  But hopefully things will change in
the future and we can follow Olaf's and my desire.

> [...]
>
> For readtable() and friends Octave needs to have the Table class
> implemented. To my knowledge there are no definite plans for that; there is
> a prospect of a start by Markus Bergholz somewhere in (IIRC) the bug
> tracker, from several years back.
>

Regarding this, I want to remind you of Andrew's project [1].

Kai

[1] https://github.com/apjanke/octave-tablicious


Re: io-2.4.13 released

PhilipNienhuis
Kai Torben Ohlhus wrote:

> On 10/18/19 5:30 PM, PhilipNienhuis wrote:
>> [...]
>>
>> I wouldn't be surprised if the spreadsheet I/O in the io package is among
>> the most used of all OF package functions. FWIW, the vast majority of io
>> package bug reports was about .xls and .xlsx; bug reports for .ods have been
>> very scarce. Adding in JWE's observation (expressed at an OctConf 2018
>> lecture) that the user base mostly cares about Matlab compatibility I think
>> we'd just have to yield to spreadsheet I/O function names starting with
>> "xls", at least for the time being. Whether in an OF package or in core
>> wouldn't matter.
>>
>
> Thank you for the new release Philip.  Sure, Octave without the io
> toolbox would be by far less attractive for many former Matlab and
> current MS Windows users, that no longer have a Matlab license, but tons
> of "old" Matlab code reading spreadsheet data by "xlsread".
> Unfortunately, for many of those users the words "data" and "MS Excel
> spreadsheets" are still synonyms.  But hopefully things might change in
> the future and we can go Olafs and my desire.

... and my desire as well.
I sometimes frown upon the way Octave tends to mimic Matlab beyond just
code compatibility. But I guess in the end the user base dictates so.

>> [...]
>>
>> For readtable() and friends Octave needs to have the Table class
>> implemented. To my knowledge there are no definite plans for that; there is
>> a prospect of a start by Markus Bergholz somewhere in (IIRC) the bug
>> tracker, from several years back.
>>
>
> Regarding this, I want to remind of Andrews project [1].

Thanks, I forgot about that. Yeah, an impressive piece of work. But,
still no Table I/O there AFAICS.
On github (or gitlab?) I saw a readtable() that effectively is a wrapper
for xlsread. Could be a temporary solution.

Note that on Matlab, "If your system does not have Excel ..., the
[readtable] importing function .... reads only .xls, .xlsx, .xlsm,
.xltx, and .xltm files" and "... with Excel® software, the [readtable]
importing function reads any Excel spreadsheet file format recognized by
your version of Excel"
(https://nl.mathworks.com/help/matlab/ref/readtable.html).
I'm sure Octave (= we) can do better :-)

Philip


Re: io-2.4.13 released

jbect
In reply to this post by PhilipNienhuis
On 18/10/2019 at 10:30, PhilipNienhuis wrote:
> For readtable() and friends Octave needs to have the Table class
> implemented. To my knowledge there are no definite plans for that; there is
> a prospect of a start by Markus Bergholz somewhere in (IIRC) the bug
> tracker, from several years back.


About that, there is also the "tablicious" project by Andrew Janke:
https://apjanke.github.io/octave-tablicious/

As a statistician, I would love to see some progress on this missing
feature...

@++
Julien



Re: io-2.4.13 released

apjanke-floss
In reply to this post by PhilipNienhuis

On 10/18/19 2:29 AM, Philip Nienhuis wrote:

> Kai Torben Ohlhus wrote:
>> On 10/18/19 5:30 PM, PhilipNienhuis wrote:
>>> [...]
>>>
>>> For readtable() and friends Octave needs to have the Table class
>>> implemented. To my knowledge there are no definite plans for that;
>>> there is
>>> a prospect of a start by Markus Bergholz somewhere in (IIRC) the bug
>>> tracker, from several years back.
>>>
>>
>> Regarding this, I want to remind of Andrews project [1].
>
> Thanks, I forgot about that. Yeah, an impressive piece of work. But,
> still no Table I/O there AFAICS.
> On github (or gitlab?) I saw a readTable() that effectively is a wrapper
> for xlsread. Could be a temporary solution.

Thanks Kai and Philip! Table I/O is on my TODO list for Tablicious. Now
that I know there's appetite for it, I'll bump up its priority.

https://github.com/apjanke/octave-tablicious/issues/49

It looks like Forge io's spreadsheet reading is mature enough that I
could build Table I/O on top of that. And that seems like a good design:
no need for another package to re-implement basic spreadsheet I/O.

With one exception I can see: cell formatting. I need to be able to
detect which columns are formatted as dates, so they can automatically
be converted to @datetime objects. (And eventually I'd like to provide
the option to efficiently read selected columns as @categorical, but
that's much lower priority.)

Is there an efficient way in io's spreadsheet support to detect the
"type"/format of cells/columns? Or would you be willing to work with me
to add one to io? (Maybe by just exposing the cell format as yet another
output arg of xlsread()?)

Same with writing tables: I'd need a way to control cell formatting so
that dates could be formatted as dates explicitly instead of shooting
them in as strings and relying on Excel's auto-parsing functionality
(which I think only works when you're using the Excel COM interface
anyway). Would also be nice to do stuff like bold column headers and
freeze panes.

> Note that on Matlab, "If your system does not have Excel ..., the
> [readTable] importing function .... reads only .xls, .xlsx, .xlsm,
> .xltx, and .xltm files" and "... with Excel® software, the [readTable]
> importing function reads any Excel spreadsheet file format recognized by
> your version of Excel"
> (https://nl.mathworks.com/help/matlab/ref/readtable.html).
> I'm sure Octave (= we) can do better :-)

I'm sure we can. I don't intend to have this limitation in Tablicious.
Matlab's dependency on automating Excel itself for this stuff is
unfortunate. :)

Cheers,
Andrew


Table I/O [WAS: io-2.4.13 released]

PhilipNienhuis
apjanke-floss wrote

> On 10/18/19 2:29 AM, Philip Nienhuis wrote:
>> Kai Torben Ohlhus wrote:
>>> On 10/18/19 5:30 PM, PhilipNienhuis wrote:
>>>> [...]
>>>>
>>>> For readtable() and friends Octave needs to have the Table class
>>>> implemented. To my knowledge there are no definite plans for that;
>>>> there is
>>>> a prospect of a start by Markus Bergholz somewhere in (IIRC) the bug
>>>> tracker, from several years back.
>>>>
>>>
>>> Regarding this, I want to remind of Andrews project [1].
>>
>> Thanks, I forgot about that. Yeah, an impressive piece of work. But,
>> still no Table I/O there AFAICS.
>> On github (or gitlab?) I saw a readTable() that effectively is a wrapper
>> for xlsread. Could be a temporary solution.
>
> Thanks Kai and Philip! Table I/O is on my TODO list for Tablicious. Now
> that I know there's appetite for it, I'll bump up its priority.
>
> https://github.com/apjanke/octave-tablicious/issues/49
>
> It looks like Forge io's spreadsheet reading is mature enough that I
> could build Table I/O on top of that. And that seems like a good design:
> no need for another package to re-implement basic spreadsheet I/O.
>
> With one exception I can see: cell formatting. I need to be able to
> detect which columns are formatted as dates, so they can automatically
> be converted to @datetime objects. (And eventually I'd like to provide
> the option to efficiently read selected columns as @categorical, but
> that's much lower priority.)
>
> Is there an efficient way in io's spreadsheet support to detect the
> "type"/format of cells/columns? Or would you be willing to work with me
> to add one to io? (Maybe by just exposing the cell format as yet another
> output arg of xlsread()?)
>
> Same with writing tables: I'd need a way to control cell formatting so
> that dates could be formatted as dates explicitly instead of shooting
> them in as strings and relying on Excel's auto-parsing functionality
> (which I think only works when you're using the Excel COM interface
> anyway). Would also be nice to do stuff like bold column headers and
> freeze panes.

First off, sorry for a long reply.

I think, no I'm sure that all that you want there is possible. But it isn't
going to be easy.

The first thing you'll hit (at least, I hit it) is that Octave itself has no
Date or Time type. Only now that classdef has been implemented may there be a
way out, but for classdef objects there still is no reliable I/O to, e.g., .mat
files.
This (no dedicated date or time type) is one of the reasons I left dates and
times aside for spreadsheet I/O. In fact, for file types and support
libraries that do offer date/time types I made the io package convert
date/time values into Octave datenums = doubles.

Then, when you write about cell formatting to uncover cell type I can't help
smelling "Excel". But .ods has a much richer cell type spectrum, maybe you
can largely skip formatting there.

Furthermore, it looks like you imply something like dataframes with headers
and row id's but the spreadsheet file types I know of really only have
individual cell types inside. Spreadsheet I/O usually happens at a fairly
low level (i.e., individual cells). Formatting ("styles") might happen over
ranges but not necessarily along table paradigms.
But I know of Java support SW (e.g., jOpenDocument) that might also offer
higher level I/O - the level of entire tables. ActiveX also does that
("Ranges"). I once tried such I/O with jOpenDocument but it was too
complicated for me at the time as there's also a lot of Java itself
involved, i.e., it seemed I needed to build my own .jars. (Oh, and now I
remember) javaArrays didn't work well at the time; even these days I have
the impression they're not so robust (but I may well be wrong).

A bit of context:

.xls files (the old BIFF5 and BIFF8 ones) have no dedicated Date, Time or
DateTime cell type. In fact they basically only have Double, Text and
Boolean. The contents are further indicated by cell formatting; that way one
can find out -in theory!- which text cells are formulas and/or date/time. I
think this is what you referred to when mentioning "cell formatting".
BUT: AFAICS most of the Java libraries for spreadsheet I/O that I used (and
on Windows, ActiveX) shield this from you and offer at most a Formula cell
type. So DateTime etc. is harder to uncover; although in Visual Basic
(ActiveX) there is a dedicated type (see __COM__.cc in the OF windows
package).
The newer .xlsx files do have dedicated Date/Time cell types, see e.g.
__OCT_xlsx2oct__.m; the OCT interface explicitly processes them into
datenums :-)

.ods has a richer type spectrum, it does have Date, Time and even Currency
cell types. But again, not all spreadsheet interfaces return all cell types.

All in all, asking to uncover spreadsheet cell types beyond Double, Boolean,
String and Formula is asking to open a big and fully stuffed can of worms
:-)   It is all file type and interface dependent.
FYI, down in the io package NEWS file there's a table outlining which file
formats can be processed by what interfaces. That'll give a first indication
of complexity.

I am certainly willing to help you out, bearing in mind that I do not have
much time for Octave anymore these days. But yeah, I think it'll be fun :-)
As the io package's spreadsheet I/O is so old (it started > 10 years ago)
and has been Just Working all that time I have to admit that my
"operational" knowledge of its innards got fairly rusty.

Maybe open a task in the Task Tracker? That tracker is largely, but IMO
unduly, dormant.

Philip






Re: Table I/O [WAS: io-2.4.13 released]

apjanke-floss


On 10/18/19 2:19 PM, PhilipNienhuis wrote:

> apjanke-floss wrote
>> On 10/18/19 2:29 AM, Philip Nienhuis wrote:
>>> Kai Torben Ohlhus wrote:
>>>> On 10/18/19 5:30 PM, PhilipNienhuis wrote:
>>>>> [...]
>>>>>
>>>>> For readtable() and friends Octave needs to have the Table class
>>>>> implemented. To my knowledge there are no definite plans for that;
>>>>> there is
>>>>> a prospect of a start by Markus Bergholz somewhere in (IIRC) the bug
>>>>> tracker, from several years back.
>>>>>
>>>>
>>>> Regarding this, I want to remind of Andrews project [1].
>>>
>>> Thanks, I forgot about that. Yeah, an impressive piece of work. But,
>>> still no Table I/O there AFAICS.
>>> On github (or gitlab?) I saw a readTable() that effectively is a wrapper
>>> for xlsread. Could be a temporary solution.
>>
>> Thanks Kai and Philip! Table I/O is on my TODO list for Tablicious. Now
>> that I know there's appetite for it, I'll bump up its priority.
>>
>> https://github.com/apjanke/octave-tablicious/issues/49
>>
>> It looks like Forge io's spreadsheet reading is mature enough that I
>> could build Table I/O on top of that. And that seems like a good design:
>> no need for another package to re-implement basic spreadsheet I/O.
>>
>> With one exception I can see: cell formatting. I need to be able to
>> detect which columns are formatted as dates, so they can automatically
>> be converted to @datetime objects. (And eventually I'd like to provide
>> the option to efficiently read selected columns as @categorical, but
>> that's much lower priority.)
>>
>> Is there an efficient way in io's spreadsheet support to detect the
>> "type"/format of cells/columns? Or would you be willing to work with me
>> to add one to io? (Maybe by just exposing the cell format as yet another
>> output arg of xlsread()?)
>>
>> Same with writing tables: I'd need a way to control cell formatting so
>> that dates could be formatted as dates explicitly instead of shooting
>> them in as strings and relying on Excel's auto-parsing functionality
>> (which I think only works when you're using the Excel COM interface
>> anyway). Would also be nice to do stuff like bold column headers and
>> freeze panes.
>
> First off, sorry for a long reply.
>
> I think, no I'm sure that all that you want there is possible. But it isn't
> going to be easy.

Glad to hear!

> The first thing you'll hit (at least, I hit it) is that Octave itself has no
> Date or Time type. Only since classdef got implemented there may be a way
> out, but for classdef objects there still is no reliable I/O to e.g., .mat
> files.
> This (no dedicated date or time type) is one of the reasons I left dates and
> times aside for spreadsheet I/O. In fact, for file types and support
> libraries that do offer date/time types I made the io package convert
> date/time values into Octave datenums = doubles.

Tablicious provides a Matlab-compatible Octave @datetime class. If io
can pass me datenums (or strings) and some indication that "this cell is
a date value", Tablicious can do the rest.

The mat file I/O I'll leave for upstream core Octave to sort out.

> Then, when you write about cell formatting to uncover cell type I can't help
> smelling "Excel". But .ods has a much richer cell type spectrum, maybe you
> can largely skip formatting there.

You definitely smell Excel. That's all I'm familiar with; the .ods
format is new to me. If it exposes a true DateTime cell type indication,
then probably no need for sniffing the formatting.

> Furthermore, it looks like you imply something like dataframes with headers
> and row id's but the spreadsheet file types I know of really only have
> individual cell types inside. Spreadsheet I/O usually happens at a fairly
> low level (i.e., individual cells). Formatting ("styles") might happen over
> ranges but not necessarily along table paradigms.

Yep, table works like a dataframe, and contains homogeneously-typed
columns. Converting from spreadsheet per-cell typing/formatting to
table's column-oriented typing would be done with heuristics like "if
all cells in a column contain date values, store that column as a
@datetime vector; if they're all numbers (detected using other
heuristics), use a primitive double vector; otherwise, use a
heterogeneous cell vector".
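
The column heuristic described above could be sketched roughly as follows. This is only an illustration under assumptions: infer_columns is a hypothetical helper (not part of io or Tablicious), the isdate mask is something a reader would have to supply (io does not return one today), and date columns are kept as plain datenums rather than @datetime objects so the sketch has no dependencies.

```octave
1;  # script-file marker so the function below can be defined inline

## Hypothetical helper, not part of io or Tablicious.
##   raw    : R-by-C cell array of cell values from a spreadsheet reader
##   isdate : R-by-C logical mask, true where a cell was flagged as a date
##            (assumed input; no such mask is exposed by io at present)
## Returns a 1-by-C cell array of structs with fields "kind" and "values".
function cols = infer_columns (raw, isdate)
  ncols = columns (raw);
  cols = cell (1, ncols);
  for j = 1:ncols
    col = raw(:, j);
    ## A cell counts as numeric only if it holds a single number
    isnum = cellfun (@(x) isnumeric (x) && isscalar (x), col);
    if (all (isnum) && all (isdate(:, j)))
      ## every cell is a flagged date: keep as a datenum vector
      cols{j} = struct ("kind", "datenum", "values", cell2mat (col));
    elseif (all (isnum))
      ## every cell is a plain number: primitive double vector
      cols{j} = struct ("kind", "double", "values", cell2mat (col));
    else
      ## mixed content: fall back to a heterogeneous cell vector
      cols{j} = struct ("kind", "cell", "values", {col});
    endif
  endfor
endfunction

raw    = {737716, 1.5, "a"; 737717, 2.5, 3};
isdate = logical ([1 0 0; 1 0 0]);
cols   = infer_columns (raw, isdate);
kinds  = cellfun (@(c) c.kind, cols, "UniformOutput", false);
disp (kinds);
```

A real implementation would also have to decide what to do with empty cells and header rows, which this sketch simply treats as non-numeric content.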

> But I know of Java support SW (e.g., jOpendocument) that might also offer
> higher level I/O - the level of entire tables. ActiveX also does that
> ("Ranges"). I once tried such I/O with jOpenDocument but it was too
> complicated for me at the time as there's also a lot of Java itself
> involved, i.e., it seemed I needed to build my own .jars. (Oh, and now I
> remember) javaArrays didn't work well at the time; even these days I have
> the impression they're not so robust (but I may well be wrong).

When I've done this in the past for Matlab, I've used Apache POI. It
worked fine, except for performance: to be fast at reading large arrays
of numerics, you need to write a small custom buffering layer in Java so
that you're not looping over per-cell read() calls in an M-code level
loop to do the type discovery heuristics and array buffering there. So I
ended up needing to build my own JARs, and expect we would for Octave
too if we wanted to Go Fast using a Java spreadsheet interface. A Range
in either ActiveX or POI (and I assume jOpenDocument) doesn't work well for
Matlab/Octave because the contents of each cell are themselves
individual objects that need to be accessed with method or attribute
calls, whereas to be Fast, you need to pass an entire *primitive*
numeric array across the M-code/external language boundary.

But I'm happy to have it be slow for now, as long as the basics work.
Then, doing accelerated spreadsheet I/O could be an enhancement for the
io package some day, and if Tablicious were built on top of it, it would
get the speedup automatically when it happened.

> A bit of context:
>
> .xls files (the old BIFF5 and BIFF8 ones) have no dedicated Date, Time or
> DateTime cell type. In fact they basically only have Double, Text and
> Boolean. The contents are further indicated by cell formatting; that way one
> can find out -in theory!- which text cells are formulas and/or date/time. I
> think this is what you referred to when mentioning "cell formatting"

Yep, that's what I meant. Dates are detected by seeing that a cell holds
a Double, and then looking at the cell format to see if it's being
presented as a date.
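
Once a Double cell is known to be date-formatted, turning it into an Octave datenum is a fixed offset. A minimal sketch, assuming the workbook uses Excel's default 1900 date system and only holds dates from 1900-03-01 onward (earlier serials are affected by Excel's fictitious 1900-02-29); excel_serial_to_datenum is a hypothetical name, not an io function:

```octave
1;  # script-file marker so the function below can be defined inline

## Hypothetical helper: convert Excel serial dates (1900 date system,
## serial >= 61) to Octave datenums. The fractional part is the time of day
## in both representations, so it carries over unchanged.
function dn = excel_serial_to_datenum (serial)
  dn = serial + datenum (1899, 12, 30);
endfunction

serial = 43756.5;   # a date-formatted Double as stored in the sheet
disp (datestr (excel_serial_to_datenum (serial), "yyyy-mm-dd HH:MM"));
```

Detecting that the cell's number format is a date format is the harder part and is not shown here.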

> BUT: AFAICS most of the Java libraries for spreadsheet I/O that I used (and
> on Windows, ActiveX) shield this from you and offer at most a Formula cell
> type. So DateTime etc. is harder to uncover; although in Visual Basic
> (ActiveX) there is a dedicated type (see __COM__.cc in the OF windows
> package).

> The newer .xlsx files do have dedicated Date/Time cell types, see e.g.
> __OCT_xlsx2oct__.m; the OCT interface explicitly processes them into
> datenums :-)

I didn't know this! That'll make things easier. Or, rather, more
complicated, since a robust readtable() will need to support both .xls
and .xlsx, so it'll need code for both kinds of cell type detection.

> .ods has a richer type spectrum, it does have Date, Time and even Currency
> cell types. But again, not all spreadsheet interfaces return all cell types.
>
> All in all, asking to uncover spreadsheet cell types beyond Double, Boolean,
> String and Formula is asking to open a big and fully stuffed can of worms
> :-)   It is all file type and interface dependent.

I know it's a big can of worms and a mess, and will never be 100% Right;
I've done this whole shebang before for Matlab. But dates/times are *so*
important in data analysis, I want to support those out of the box in
Tablicious.

I don't care about Currency or other advanced types, I think. They don't
crop up as much in data analysis, there's probably no corresponding
Octave type to convert them to, and Matlab doesn't support them AFAIK,
so there's no Matlab-compatibility need for them.

> FYI, down in the io package NEWS file there's a table outlining which file
> formats can be processed by what interfaces. That'll give a first indication
> of complexity.
>
> I am surely willing to help you out, given that I do not have not so much
> time anymore for Octave these days. But yeah I think it'll be fun :-)
> As the io package's spreadsheet I/O is so old (it started > 10 years ago)
> and has been Just Working all that time I have to admit that my
> "operational" knowledge of its innards got fairly rusty.

Thanks! I'll have a look through that table and start reading the io
code to get familiar with it.

> Maybe open a task in the Task Tracker? that tracker is largely, but IMO
> unduly, dormant.

Where can I find the Task Tracker? I don't see it linked from the io
Octave Forge package page.

Cheers,
Andrew



Re: io-2.4.13 released

apjanke-floss
In reply to this post by PhilipNienhuis


On 10/17/19 12:51 PM, Philip Nienhuis wrote:

> Hi,
> [...]
>
> [...] I do want to adapt the spreadsheet I/O
> functions code base to a much more easily maintainable status. A benefit
> is that it'll make it easier to move them to core some day, maybe.
>
> An important part is merging the now largely separate ods????? and
> xls????? functions sets into just one xls????? function set and morphing
> the ods????? functions into wrappers for their xls????? counterparts and
> deprecate them. After all, Matlab's xlsread can perfectly read .ods
> these days and we do want to be compatible.
> This way a lot of duplicate code can be eliminated.

I like this idea of merging them, especially if the file type could be
auto-detected. One less thing for the user to worry about.
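
To make the auto-detection idea concrete, here is a rough sketch that goes by file extension only. The function name guess_spsh_type and the returned labels are made up for illustration; io's actual detection logic may well differ, and sniffing the file contents would be sturdier than trusting the extension.

```octave
1;  # script-file marker so the function below can be defined inline

## Hypothetical helper: guess the spreadsheet family from the extension.
function ftype = guess_spsh_type (filename)
  [~, ~, ext] = fileparts (filename);
  switch (lower (ext))
    case ".xls"
      ftype = "xls";      # old binary (BIFF) Excel format
    case {".xlsx", ".xlsm"}
      ftype = "ooxml";    # zipped XML Excel formats
    case ".ods"
      ftype = "ods";      # OpenDocument spreadsheet
    otherwise
      ftype = "unknown";
  endswitch
endfunction

printf ("%s\n", guess_spsh_type ("results.ODS"));
```

A single merged read function could call something like this first and then pick a suitable interface, so the user never has to choose between xls and ods entry points.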

> A next step might be splitting up the OCT interface .xlsx and maybe .ods
> private functions into smaller dedicated pieces. However, .xlsx and .ods
> are very complicated file formats so maintaining these functions still
> won't be easy peasy.

Here's an idea that you may or may not like: how about refactoring the
internal implementation into Octave OOP classdef classes? The pattern of
having "COM_", "JOD_", "JXL_", "POI_", etc variants of the same set of
"__XXX_chk_sprt__", "__XXX_spsh_open" etc functions, which are called
inside switch statements, just screams "polymorphism" to me. And then
there's the XLS vs XLSX vs ODS axis. Turning those into a family of
classes like "ComExcelAdapter", "PoiExcelAdapter", "UnoOdsAdapter" might
make organizing the code easier, and less painful to split up into small
methods. Then a bunch of OCT interface methods could live inside their
class without cluttering up the shared function namespace and making the
inst/ directory really big.
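
As a lighter-weight stand-in for the classdef idea (Octave wants each classdef in its own file, so it does not fit in one short snippet), the same "dispatch without a switch" effect can be shown with a table of function handles keyed by interface name. Everything below is hypothetical: poi_open, jod_open and spsh_open are placeholder names with dummy bodies, not io functions.

```octave
1;  # script-file marker so the functions below can be defined inline

## Stand-ins for per-interface private functions (dummy bodies)
function msg = poi_open (fname)
  msg = ["POI would open " fname];
endfunction

function msg = jod_open (fname)
  msg = ["jOpenDocument would open " fname];
endfunction

## One entry per interface; adding an interface means adding an entry here
## instead of touching a switch statement in every caller.
adapters = struct ("POI", struct ("open", @poi_open), ...
                   "JOD", struct ("open", @jod_open));

## Dispatch by looking the interface up rather than switching on its name
function msg = spsh_open (adapters, intf, fname)
  if (! isfield (adapters, intf))
    error ("spsh_open: unknown interface '%s'", intf);
  endif
  msg = adapters.(intf).open (fname);
endfunction

disp (spsh_open (adapters, "JOD", "book.ods"));
```

With classdef classes the lookup table disappears altogether, because each adapter object carries its own methods; the sketch only shows why the switch statements become unnecessary.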

OOP could also be a nice way of representing more complicated state like
cell formatting controls, style definitions, and so on. And the "file
handle" structs returned by xlsopen() and odsopen() could be replaced
with better-typed objects. Two "assert (isa (...), ...)" calls would
replace this verbose, approximate type checking:

   ## Check if xls struct pointer seems valid
   if (! isstruct (xls))
     error ("File ptr struct expected for arg # 1\n");
   endif
   test1 = ! isfield (xls, "xtype");
   test1 = test1 || ! isfield (xls, "workbook");
   test1 = test1 || isempty (xls.workbook);
   test1 = test1 || isempty (xls.app);
   test1 = test1 || ! ischar (xls.xtype);
   if (test1)
     error ("Invalid xls file pointer struct\n");
   endif

   ## Check worksheet ptr
   if (! (ischar (wsh) || isnumeric (wsh)))
     error ("Integer (index) or text (wsh name) expected for arg # 2\n");
   elseif (isempty (wsh))
     wsh = 1;
   endif


And then you could use Octave's handle class's delete() destructor to do
garbage collection on resources that need to be cleaned up, like active
COM Excel or UNO LibreOffice sessions running in the background, the
opened workbook files, and so on.

Stick them in a "forge.io.internal" namespace to indicate that they're
for Octave's internal use and end user code should not use them directly.

I realize this would be a big change, though.

Cheers,
Andrew


Re: Table I/O [WAS: io-2.4.13 released]

jbect
In reply to this post by apjanke-floss
On 18/10/2019 at 23:58, Andrew Janke wrote:
>> This (no dedicated date or time type) is one of the reasons I left dates and
>> times aside for spreadsheet I/O. In fact, for file types and support
>> libraries that do offer date/time types I made the io package convert
>> date/time values into Octave datenums = doubles.
>
> Tablicious provides a Matlab-compatible Octave @datetime class.


Hi Andrew,

How complete is this implementation of @datetime [1, 2]?

If it is complete, could you perhaps already extract it from tablicious and propose it on the patch tracker for inclusion in Octave core?

Or at least as an independent Octave Forge package as a first step?

@++
Julien


[1] https://github.com/apjanke/octave-tablicious/blob/master/inst/datetime.m

[2] https://www.mathworks.com/help/matlab/ref/datetime.html


Re: Table I/O [WAS: io-2.4.13 released]

apjanke-floss


On 10/18/19 10:30 PM, Julien Bect wrote:

> On 18/10/2019 at 23:58, Andrew Janke wrote:
>>> This (no dedicated date or time type) is one of the reasons I left
>>> dates and
>>> times aside for spreadsheet I/O. In fact, for file types and support
>>> libraries that do offer date/time types I made the io package convert
>>> date/time values into Octave datenums = doubles.
>>
>> Tablicious provides a Matlab-compatible Octave @datetime class.
>
>
> Hi Andrew,
>
> How complete is this implementation of @datetime [1, 2]?
>
> If it is complete, perhaps you could extract it from Tablicious
> and propose it on the patch tracker for inclusion in Octave core?

I'd say it's about 80% feature complete, and maybe 30% code complete. :)
From the user's standpoint, about 80% of what you'd want to use it for
works, though some of it is slow. But it still needs work to get that
last 20% and improve its speed, and if I'm unlucky, that could
necessitate big changes in the internals. And it has almost no unit tests.

So, it's not nearly ready to try donating to Octave core.

At some point, my goal is for the majority of Tablicious (@table,
@timetable, @datetime, @duration, @categorical, @string, and friends) to
move into Octave core. But that's a pretty long-term vision.

If you want to follow its development, related bugs and TODOs are tagged
"chrono" in Tablicious' issue tracker:
https://github.com/apjanke/octave-tablicious/labels/chrono.

And to see how far it is now, check the docs; I keep them pretty up to date.
* https://apjanke.github.io/octave-tablicious/doc/tablicious.html#Date_002fTime-Representation
* https://apjanke.github.io/octave-tablicious/doc/tablicious.html#datetime


> Or at least as an independent Octave Forge package as a first step?

I'm afraid that isn't an option either. It used to be its own separate package
(called Chrono), but development of @datetime ended up being so closely
tied to @table and @timetable, I decided to merge them to keep things
workable for myself.

If you want to play with @datetime, just install the full Tablicious
package: I'm trying to keep it easy to install and use, so aside from a
few megs of disk space, it shouldn't cost you anything.

Cheers,
Andrew

Re: Table I/O [WAS: io-2.4.13 released]

PhilipNienhuis
In reply to this post by apjanke-floss
apjanke-floss wrote
>
> <snip>
>> Maybe open a task in the Task Tracker? that tracker is largely, but IMO
>> unduly, dormant.
>
> Where can I find the Task Tracker? I don't see it linked from the io
> Octave Forge package page.

https://savannah.gnu.org/task/?group=octave

Or, just scroll up in the Octave bug tracker, and it's in the upper right
menu between Bugs and Patches; see attached pic.

Philip

<https://octave.1599824.n4.nabble.com/file/t248596/Octave_task_tracker.png>



--
Sent from: https://octave.1599824.n4.nabble.com/Octave-Maintainers-f1638794.html

Re: Table I/O [WAS: io-2.4.13 released]

apjanke-floss

On 10/19/19 2:12 AM, PhilipNienhuis wrote:

> apjanke-floss wrote
>>
>> <snip>
>>> Maybe open a task in the Task Tracker? that tracker is largely, but IMO
>>> unduly, dormant.
>>
>> Where can I find the Task Tracker? I don't see it linked from the io
>> Octave Forge package page.
>
> https://savannah.gnu.org/task/?group=octave
>
> Or, just scroll up in the Octave bug tracker, and it's in the upper right
> menu between Bugs and Patches; see attached pic.

Aha. Thanks.

"Submit New" under Tasks is crossed out for me, and not an active link.
Maybe the Octave project on Savannah is set up so that only project
members have permission to create new Tasks (or to Export bugs or tasks)?

Cheers,
Andrew

Re: Table I/O [WAS: io-2.4.13 released]

PhilipNienhuis
apjanke-floss wrote

> On 10/19/19 2:12 AM, PhilipNienhuis wrote:
>> apjanke-floss wrote
>>>
>>>
> <snip>
>>>> Maybe open a task in the Task Tracker? that tracker is largely, but IMO
>>>> unduly, dormant.
>>>
>>> Where can I find the Task Tracker? I don't see it linked from the io
>>> Octave Forge package page.
>>
>> https://savannah.gnu.org/task/?group=octave
>>
>> Or, just scroll up in the Octave bug tracker, and it's in the upper right
>> menu between Bugs and Patches; see attached pic.
>
> Aha. Thanks.
>
> "Submit New" under Tasks is crossed out for me, and not an active link.
> Maybe the Octave project on Savannah is set up so that only project
> members have permission to create new Tasks (or to Export bugs or tasks)?

Maybe you should be handed credentials by now (that is, if you want). I'd be
in favor of that.

But I can create a task, done:
https://savannah.gnu.org/task/index.php?15419

Now let's see if you can add comments there. If not, the Task tracker
doesn't work as I hoped.

Philip


Re: Table I/O [WAS: io-2.4.13 released]

apjanke-floss


On 10/19/19 5:44 AM, PhilipNienhuis wrote:

> apjanke-floss wrote
>> On 10/19/19 2:12 AM, PhilipNienhuis wrote:
>>> apjanke-floss wrote
>>>>
>>>>
>> <snip>
>>>>> Maybe open a task in the Task Tracker? that tracker is largely, but IMO
>>>>> unduly, dormant.
>>>>
>>>> Where can I find the Task Tracker? I don't see it linked from the io
>>>> Octave Forge package page.
>>>
>>> https://savannah.gnu.org/task/?group=octave
>>>
>>> Or, just scroll up in the Octave bug tracker, and it's in the upper right
>>> menu between Bugs and Patches; see attached pic.
>>
>> Aha. Thanks.
>>
>> "Submit New" under Tasks is crossed out for me, and not an active link.
>> Maybe the Octave project on Savannah is set up so that only project
>> members have permission to create new Tasks (or to Export bugs or tasks)?
>
> Maybe you should be handed credentials by now (that is, if you want). I'd be
> in favor of that.
>
> But I can create a task, done:
> https://savannah.gnu.org/task/index.php?15419
>
> Now let's see if you can add comments there. If not, the Task tracker
> doesn't work as I hoped.
>
> Philip
>

Yep, commenting worked. Thanks!

Cheers,
Andrew

Re: Table I/O [WAS: io-2.4.13 released]

PhilipNienhuis
apjanke-floss wrote

> On 10/19/19 5:44 AM, PhilipNienhuis wrote:
>> apjanke-floss wrote
> <snip>
>> But I can create a task, done:
>> https://savannah.gnu.org/task/index.php?15419
>>
>> Now let's see if you can add comments there. If not, the Task tracker
>> doesn't work as I hoped.
>>
>> Philip
>>
>
> Yep, commenting worked. Thanks!

Right, let's continue the technical discussion of Table I/O there.
I've put up a sort of roadmap there for the first stages.

Philip
