Slow Processing Issue

Slow Processing Issue

Fritz Sonnichsen
I have a very simple script, shown below. I am processing a small
file, about 50,000 lines of 60 bytes each. The file loads just fine, but
the "for" loop takes around 15 minutes on a laptop that generally does
not have performance issues. I recall that early versions of MATLAB had
some type of "file reallocation" recommendation to speed things up.
   Should Octave be able to run this script in a reasonable amount of time?
   Am I doing anything wrong here?

Thanks
Fritz

=================================================================
1;
clear all
   workfile = fileread ('C:\Users\fsonnichsen\Desktop\conduct\minicom.cap');
   workfile = strsplit (workfile, "\n");
   disp("WORKFILE LOADED");fflush(stdout);
   for i = 1:length(workfile)
     Clog(i,:) = strtrim (strsplit (workfile{i}, ","));
     %if mod(i,1000)==0 display(i); fflush(stdout);end;
   endfor


_______________________________________________
Help-octave mailing list
[hidden email]
https://lists.gnu.org/mailman/listinfo/help-octave

Re: Slow Processing Issue

siko1056
Hello Fritz,

Did you try the approach from Francesco Potortì [1], using textread? Otherwise, you could try pre-allocating the cell array Clog, as in my adapted example:

> more off # instead of fflush, why are you using fflush anyway??
> clear all
> workfile = fileread ('C:\Users\fsonnichsen\Desktop\conduct\minicom.cap');
> workfile = strsplit (workfile, "\n");
> disp("WORKFILE LOADED");
>
> # pre-allocation (the 4 assumes four comma-separated fields per line;
> # adjust it to match your data)
> N = length(workfile);
> Clog = cell (N,4);
>
> for i = 1:N
>  Clog(i,:) = strtrim (strsplit (workfile{i}, ","));
>  %if mod(i,1000)==0 display(i); end;
> endfor
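
To see the effect of pre-allocation in isolation, here is a minimal sketch that runs the same loop twice on synthetic data: once growing the cell array row by row, once with the array allocated up front. The sample line, the row count N, and the variable names grown, pre, t_grown and t_pre are made up for illustration; they are not from the original file. Timings depend on the machine and Octave version, so none are quoted here.

```octave
% Synthetic stand-in for the file contents (hypothetical data, 4 fields per line)
N = 2000;
lines = repmat ({"a, 1.0, 2.0, b"}, N, 1);

% Variant 1: the cell array grows by one row on every iteration
tic;
grown = {};
for i = 1:N
  grown(i,:) = strtrim (strsplit (lines{i}, ","));
endfor
t_grown = toc;

% Variant 2: the cell array is allocated once, then filled in place
tic;
pre = cell (N, 4);
for i = 1:N
  pre(i,:) = strtrim (strsplit (lines{i}, ","));
endfor
t_pre = toc;

printf ("grown: %.3f s, pre-allocated: %.3f s\n", t_grown, t_pre);
```

Both variants produce the same cell array; only the memory handling differs. If the two timings come out close, the loop is probably dominated by the per-line strsplit calls rather than by reallocation, and a single bulk read is the better fix.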

HTH,
Kai

[1]: http://octave.1599824.n4.nabble.com/Loading-Files-with-mixed-text-and-numbers-td4684044.html#a4684050

Re: Slow Processing Issue

Fritz Sonnichsen
Thanks all!

I just looked at Francesco's post (busy week here) and wrote up the
script using textread. I had avoided it because MATLAB is phasing out
textread, but I am more than happy to use it. This solved the problem: it
is much faster. I ran 135,000 records in 13.6 seconds.

The pertinent code area was:
[C,R,V] = textread (filename, "%*s,%f,%*f,%*f,%*f,%f,%f,%*f,%*f,%*s,%*s", 'delimiter', ',');

I had to use the "delimiter" option because the file is CSV. The asterisk
("%*") format specifiers were handy for skipping fields I did not need,
which avoids using up memory.
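
For readers who want to avoid textread, the same skip-field idea can be sketched with textscan, which is a different function from the one used above. The sample text, the four-field layout, and the names txt, x and y are hypothetical; the real file has more fields, so the format string would need to be adapted.

```octave
% Hypothetical two-line CSV sample held in a string (not the real file)
txt = sprintf ("a,1.5,2,3\nb,2.5,4,5");

% %*s and %*f read a field and discard it; %f keeps it as a double.
% Only fields 2 and 4 are stored, so the output has two columns.
C = textscan (txt, "%*s %f %*f %f", "Delimiter", ",");

x = C{1};   % second field of every line
y = C{2};   % fourth field of every line
disp ([x y]);
```

To read from a file instead of a string, open it with fopen, pass the file id as the first argument to textscan, and close it with fclose afterwards.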

Thanks for the help
Fritz



Re: Slow Processing Issue

Philip Nienhuis
If you have the io package installed, you can also use csv2cell to read such files.
Because csv2cell is a compiled (binary) function, it is much faster than textread, which is an ordinary .m-file function.
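
A minimal sketch of the csv2cell route, assuming the io package is installed. The temporary file and its contents are made up for illustration, and the names fname, C and vals are hypothetical.

```octave
pkg load io                       % csv2cell is provided by the io package

% Write a small hypothetical CSV file so the example is self-contained
fname = [tempname() ".csv"];
fid = fopen (fname, "w");
fprintf (fid, "a,1.5,2\nb,2.5,4\n");
fclose (fid);

% Read the whole file into a cell array in one call
C = csv2cell (fname);

% Pull out the second column as a numeric vector
vals = cell2mat (C(:,2));
disp (vals);

unlink (fname);                   % remove the temporary file
```

Unlike the textread format string, this reads every field, so columns that are not needed have to be dropped afterwards by indexing into the cell array.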

Philip