Loading Files with mixed text and numbers

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
6 messages Options
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Loading Files with mixed text and numbers

Fritz Sonnichsen
I did this all the time with Matlab--loaded files with the very common
form of records like this:
2017/07/07,   13:59:59,   022.2,   12.69

I found the matlab "textscan" absurdly complicated for this simple task,
generating nested table arrays with lots of associated book keeping of
braces, brackets etc. So I finally used matlab "readtable" which worked
fine.

When I tried this in Octave it became apparent that the readtable
function is not implemented. Using csvread leads me to invalid results
(it apparently is for numbers and does not support the date strings).

So - short or writing loops does Octave implement something similar to
readtable or some simple way to perform this task--loading comma
separated strings into a regular array?

Thanks
Fritz


_______________________________________________
Help-octave mailing list
[hidden email]
https://lists.gnu.org/mailman/listinfo/help-octave
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Loading Files with mixed text and numbers

siko1056
Fritz Sonnichsen wrote
I did this all the time with Matlab--loaded files with the very common
form of records like this:
2017/07/07,   13:59:59,   022.2,   12.69

I found the matlab "textscan" absurdly complicated for this simple task,
generating nested table arrays with lots of associated book keeping of
braces, brackets etc. So I finally used matlab "readtable" which worked
fine.

When I tried this in Octave it became apparent that the readtable
function is not implemented. Using csvread leads me to invalid results
(it apparently is for numbers and does not support the date strings).

So - short or writing loops does Octave implement something similar to
readtable or some simple way to perform this task--loading comma
separated strings into a regular array?

Thanks
Fritz
AFAIK, Octave does not support something similar to Matlabs table. To overcome this, try to read for example a file with these lines:

2017/07/07,   13:59:56,   022.3,   12.66
2017/07/08,   13:59:57,   022.4,   12.67
2017/07/09,   13:59:58,   022.5,   12.68
2017/07/10,   13:59:59,   022.6,   12.69

Using a naive approach:

A = fileread ("log.txt");
A = strsplit (A, "\n");
for i = 1:length(A)
  B(i,:) = strtrim (strsplit (A{i}, ","));
endfor
# convert numerical values 3rd and 4th column of cell array of chars
B(:,[3, 4]) = cellfun(@str2num, B(:,[3, 4]), "UniformOutput", false);
fprintf("%s %s %f%f\n", B'{:})

Output:

2017/07/07 13:59:56 22.300000 12.660000
2017/07/08 13:59:57 22.400000 12.670000
2017/07/09 13:59:58 22.500000 12.680000
2017/07/10 13:59:59 22.600000 12.690000

HTH,
Kai
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Loading Files with mixed text and numbers

Fritz Sonnichsen
Thanks Kai.
    This makes sense and is easy to understand when reading the code
later. I did not know about cellfun-good one to remember.

cheers
Fritz


_______________________________________________
Help-octave mailing list
[hidden email]
https://lists.gnu.org/mailman/listinfo/help-octave
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Loading Files with mixed text and numbers

Francesco Potortì
In reply to this post by Fritz Sonnichsen
>I did this all the time with Matlab--loaded files with the very common
>form of records like this:
>2017/07/07,   13:59:59,   022.2,   12.69
>
>I found the matlab "textscan" absurdly complicated for this simple task,
>generating nested table arrays with lots of associated book keeping of
>braces, brackets etc. So I finally used matlab "readtable" which worked
>fine.
>
>When I tried this in Octave it became apparent that the readtable
>function is not implemented. Using csvread leads me to invalid results
>(it apparently is for numbers and does not support the date strings).

The function textread (an alternative to textscan) behaves in a way
similar to Matlab's readtable, only thing is that you must provide it
with the type of columns data.  In your case:

[dat,tim,n1,n2] = textread (filename, "%s %s %f %f");

>So - short or writing loops does Octave implement something similar to
>readtable or some simple way to perform this task--loading comma
>separated strings into a regular array?

Matlab's readtable loads number-type columns into regular arrays, and
text-type columns into cell arrays of strings, like textread.  Is that
what you want?

--
Francesco Potortì (ricercatore)        Voice:  +39.050.621.3058
ISTI - Area della ricerca CNR          Mobile: +39.348.8283.107
via G. Moruzzi 1, I-56124 Pisa         Skype:  wnlabisti
(entrance 20, 1st floor, room C71)     Web:    http://fly.isti.cnr.it


_______________________________________________
Help-octave mailing list
[hidden email]
https://lists.gnu.org/mailman/listinfo/help-octave
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Loading Files with mixed text and numbers

Przemek Klosowski-7
In reply to this post by siko1056
On 07/09/2017 06:53 PM, siko1056 wrote:
Fritz Sonnichsen wrote
I did this all the time with Matlab--loaded files with the very common 
form of records like this:
2017/07/07,   13:59:59,   022.2,   12.69

Personally, I found that it's a little awkward to process dates in Octave, so I like to keep my time series data in sqlite because it has a builtin CSV import, and does dates very well. The purist may object to such mixing, but the practical person would point out that the Domain Specific Languages (DSL) concept is a cool idea.

For Octave use, it's convenient to convert dates to numbers, e.g. Julian day numbers used by astronomers.  I'd do something like

sqlite3 mydata.db -csv '.import mydata.csv m' # import the CSV file to table 'm'

sqlite3 mydata.db 'select julianday(date),col3,col4 from m'   # spit out the data

or actually use the sqlite extension in Octave to read the database directly instead of using []=system()


Caveats:

- it's easier if the first line of the CSV file has column headers---they become the SQL column names

- actually, since your dates use / instead of - for separator, and separates date and time, and doesn't specify timezone, it'd be

sqlite3 mydata.db 'select julianday(replace(date,"/","-")||" "||time,"localtime"),col3,col4 from m'


_______________________________________________
Help-octave mailing list
[hidden email]
https://lists.gnu.org/mailman/listinfo/help-octave
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Loading Files with mixed text and numbers

Fritz Sonnichsen
Yes-unfortunately dates are always a problem with many languages. I really got messed up with a  system in PERL a while back-had to account for leap seconds and other annoying facts due to the earths rotation (can't someone change that!)

Thanks for the comments-I am sure I will need them eventually.
Fritz

On 7/10/2017 11:15 AM, Przemek Klosowski wrote:
On 07/09/2017 06:53 PM, siko1056 wrote:
Fritz Sonnichsen wrote
I did this all the time with Matlab--loaded files with the very common 
form of records like this:
2017/07/07,   13:59:59,   022.2,   12.69

Personally, I found that it's a little awkward to process dates in Octave, so I like to keep my time series data in sqlite because it has a builtin CSV import, and does dates very well. The purist may object to such mixing, but the practical person would point out that the Domain Specific Languages (DSL) concept is a cool idea.

For Octave use, it's convenient to convert dates to numbers, e.g. Julian day numbers used by astronomers.  I'd do something like

sqlite3 mydata.db -csv '.import mydata.csv m' # import the CSV file to table 'm'

sqlite3 mydata.db 'select julianday(date),col3,col4 from m'   # spit out the data

or actually use the sqlite extension in Octave to read the database directly instead of using []=system()


Caveats:

- it's easier if the first line of the CSV file has column headers---they become the SQL column names

- actually, since your dates use / instead of - for separator, and separates date and time, and doesn't specify timezone, it'd be

sqlite3 mydata.db 'select julianday(replace(date,"/","-")||" "||time,"localtime"),col3,col4 from m'



_______________________________________________
Help-octave mailing list
[hidden email]
https://lists.gnu.org/mailman/listinfo/help-octave


_______________________________________________
Help-octave mailing list
[hidden email]
https://lists.gnu.org/mailman/listinfo/help-octave
Loading...