comments in datafile


Stefano Ghirlanda
Hi, I was wondering if it is possible to have some comment lines in a
datafile, apart from the #-lines stating types and dimensions.
I cannot use a source file since I read the file with gnuplot for fitting
purposes, and I thought that octave would ignore additional #-lines I put
for comments, but it seems it doesn't work...

        Thanks for help and for octave,

Stefano Ghirlanda, Zoologiska Institutionen, Stockholms Universitet
Office: Frescati Campus, Hus D, Rum 554. Office Phone: +46-8-164055
Mail: Svante Arrheniusv. 14, S-106 91, Stockholm, Sweden          
E-mail: [hidden email], Web: http://rerumnatura.zool.su.se



comments in datafile

John W. Eaton
On 20-Mar-1998, Stefano Ghirlanda <[hidden email]> wrote:

| Hi, I was wondering if it is possible to have some comment lines in a
| datafile, apart from the #-lines stating types and dimensions.
| I cannot use a source file since I read the file with gnuplot for fitting
| purposes, and I thought that octave would ignore additional #-lines I put
| for comments, but it seems it doesn't work...

Sorry, Octave doesn't have this feature.  Perhaps someone would like
to do the work to add it?

Longer term, I would like to make the load and save functions more
easily extensible so that users can add their own special-purpose
functions for reading data files in whatever format they want.

Thanks,

jwe



Re: comments in datafile

Dirk Eddelbuettel

  Stefano>  Hi, I was wondering if it is possible to have some comment lines
  Stefano> in a datafile, apart from the #-lines stating types and
  Stefano> dimensions.  I cannot use a source file since I read the file with
  Stefano> gnuplot for fitting purposes, and I thought that octave would
  Stefano> ignore additional #-lines I put for comments, but it seems it
  Stefano> doesn't work...

The highly recommended package of additional m-files by Kurt Hornik et al.
(at ftp://ftp.ci.tuwien.ac.at/pub/octave/octave-ci.tar.gz) contains a
function aload.m, which filters a flat ASCII file through egrep, sed and
awk before loading the (numerical) data. By default it skips empty lines
and lines whose first non-whitespace character is # or %. I include the
function below. There is also a matching function asave.m.
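
For illustration only (this is not part of the package): the
comment-stripping step can be sketched in plain shell. The data file and
its contents are made up, and the function name strip_comments is
hypothetical. The pattern is the default ignore_regexp from the function
below, written with a POSIX character class in place of [ \t].

```shell
#!/bin/sh
# Hypothetical data file with extra comment lines (contents are made up).
datafile=$(mktemp)
cat > "$datafile" <<'EOF'
# measured on the bench
1 2 3

  % a second kind of comment
4 5 6
EOF

# Drop empty lines and lines whose first non-blank character is # or %.
# [[:blank:]] matches a space or a tab, like the [ \t] class in aload.
strip_comments() {
  grep -E -v '^[[:blank:]]*(#|%|$)' "$1"
}

cleaned=$(strip_comments "$datafile")
printf '%s\n' "$cleaned"
rm -f "$datafile"
```

Only the two numeric lines should survive the filter, so the same file
can keep its comments for gnuplot and still be reduced to plain numbers
for loading.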

Hope this helps, Dirk


## Copyright (C) 1996, 1997  Kurt Hornik
##
## This program is free software; you can redistribute it and/or modify
## it under the terms of the GNU General Public License as published by
## the Free Software Foundation; either version 2, or (at your option)
## any later version.
##
## This program is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with this file.  If not, write to the Free Software Foundation,
## 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.

## x = aload (filename [, cw [, rw [, FS [, NA [, ignore_regexp]]]]])
## loads the flat ASCII data file `filename' into x.
##
## With the optional parameters cw and rw one can select the data
## columns (variables) and rows (observations) to load.  Both cw and rw
## may be index vectors or Inf (default), meaning to load everything.
##
## With FS, one can specify the field separator in the data file as one
## would do in AWK.  Default is " ".
##
## With NA, one can specify how unavailable data are represented in the
## data file, and how they should be loaded into Octave.  The default is
## "NA/NaN", meaning that NA's should be converted to NaN's.  (Note that
## this does not work yet.)
##
## Finally, ignore_regexp is an egrep regular expression specifying
## which lines in the data file should be ignored.  The default is
## "^[ \t]*(#|%|$)", meaning that empty lines and lines where # or % are
## the first non-whitespace characters are ignored.
##
## Note that rw selects the data line (observation) numbers and NOT the
## line numbers in the file!
##
## Note also that currently, only real numbers can be loaded.
 
## Author:  KH <[hidden email]>
## Description:  Load from a flat ASCII data file

function x = aload (filename, cw, rw, FS, NA, ignore_regexp)

  if ((nargin < 1) || (nargin > 6))
    usage ("aload (filename, cw, rw, FS, NA, ignore_regexp)");
  endif

  if (nargin < 6)
    ignore_regexp = "^[ \t]*(#|%|$)";
  endif
  if (nargin < 5)
    NA = "NA/NaN";
  endif
  if (nargin < 4)
    FS = " ";
  endif
  if (nargin < 3)
    rw = Inf;
  endif
  if (nargin < 2)
    cw = Inf;
  endif

  ## maybe_do_more_sanity_checks ();  

  if !is_struct (stat (filename))
    error (sprintf ("aload:  File '%s' not found", filename));
  endif

  tmpfile = octave_tmp_file_name ();

  system (["cat ", filename, " | ", ...
           "egrep -ve \'", ignore_regexp, "\' | ", ...
           "sed -e 's/", NA, "/g' > ", tmpfile]);

  eval (system (["cat ", tmpfile, " | ", ...
                 "awk 'BEGIN { FS = \"", FS, "\" }; ", ...
                 "END { printf \"rf = %g; cf = %g;\", NR, NF }'"]));

  if (cw == Inf)
    cw = 1 : cf;
  elseif (min (size (cw)) == 1)
    cw = cw (find (cw <= cf));
  else
    error ("aload:  cw must be a scalar or a vector");
  endif

  if (rw == Inf)
    rw = 1 : rf;
  elseif (min (size (rw)) == 1)
    rw = rw (find (rw <= rf));
  else
    error ("aload:  rw must be a scalar or a vector");
  endif

  loadfile = octave_tmp_file_name ();

  fd = fopen (loadfile, "w");
  fprintf (fd, "# name x\n# type: matrix\n");
  fprintf (fd, "# rows: %g\n# columns: %g\n", length (rw), length (cw));
  fclose (fd);
 
  s = sprintf ("$%d", cw(1));
  for i = 2 : length (cw);
    s = sprintf ("%s, $%d", s, cw(i));
  endfor

  system (["cat ", tmpfile, " | ", ...
           "awk 'BEGIN { FS = \"", FS, "\" }; { print ", s, " };' ", ...
           " >> ", loadfile]);

  eval (["load -force -ascii ", loadfile]);

  x = x(rw, :);

  system (sprintf ("rm -f %s %s", tmpfile, loadfile));
 
endfunction
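
As a rough sketch of what the second half of the function does (again
only an illustration, with made-up input values and a hard-coded column
selection), the shell below counts the rows, writes a header using the
keywords that Octave's text save format writes (# name:, # type:,
# rows:, # columns:), and appends the selected columns:

```shell
#!/bin/sh
# Hypothetical cleaned input: three columns, two observations (made-up values).
tmpfile=$(mktemp)
loadfile=$(mktemp)
printf '1 2 3\n4 5 6\n' > "$tmpfile"

# Count the data rows; remember the field count of the last line explicitly
# rather than relying on NF still being set inside END.
rows=$(awk 'END { print NR }' "$tmpfile")
cols_in_file=$(awk '{ nf = NF } END { print nf }' "$tmpfile")

# Write the header, then append columns 1 and 3 only (so 2 columns are kept).
{
  printf '# name: x\n# type: matrix\n'
  printf '# rows: %d\n# columns: %d\n' "$rows" 2
  awk '{ print $1, $3 }' "$tmpfile"
} > "$loadfile"

result=$(cat "$loadfile")
printf '%s\n' "$result"
rm -f "$tmpfile" "$loadfile"
```

The resulting file is what the function then hands to load. Selecting
rows afterwards inside Octave, as x(rw, :) does above, keeps the awk
part simple.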



--
mailto:[hidden email]              According to the latest official figures,
http://rosebud.ml.org/~edd      43% of all statistics are totally worthless.