Re: profiling results

Rik-4
I finally got one of the profiling tools to work well enough to identify
some of the hotspots.  I ended up using the Linux kernel tool 'perf',
attached to the running process:

perf record -g -p <PID>

When running the benchmark bm.toeplitz.orig.m, I find that the tree_evaluator
routines and tree_index_expression::lvalue seem to be the time-consuming ones.

  Children      Self  Samples  Command  Shared Object          Symbol
-   81.31%     0.27%      160  QThread  liboctinterp.so.7.0.0  [.] octave::tree_evaluator::visit_simple_assignment
   - 81.04% octave::tree_evaluator::visit_simple_assignment
      + 50.98% octave::tree_evaluator::evaluate
      + 19.84% octave::tree_index_expression::lvalue
      +  6.86% octave::octave_lvalue::assign
      +  0.72% octave::octave_lvalue::~octave_lvalue
         0.51% octave::octave_lvalue::octave_lvalue

If, instead of a call graph, I look directly at which functions are
consuming the most time, it does seem that a lot of time is spent
allocating/freeing memory and creating/destroying class objects.

  Overhead  Samples  Command  Shared Object          Symbol
+    8.87%     5253  QThread  libc-2.27.so           [.] cfree@GLIBC_2.2.5
+    5.44%     3217  QThread  libc-2.27.so           [.] malloc
+    5.31%     3140  QThread  liboctinterp.so.7.0.0  [.] octave_value::operator=
+    4.95%     2926  QThread  liboctgui.so.5.0.0     [.] octave_value::~octave_value
+    3.05%     1804  QThread  libc-2.27.so           [.] _int_malloc
+    2.61%     1543  QThread  liboctgui.so.5.0.0     [.] Array<std::__cxx11::basic_string<char, std::char_traits<char>, std:
+    2.41%     1427  QThread  liboctgui.so.5.0.0     [.] octave_value_list::octave_value_list
+    2.32%     1365  QThread  liboctgui.so.5.0.0     [.] Array<octave_value>::~Array
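
As a rough cross-check of the allocation claim, the rows of the flat
profile above can be summed per shared object and for the libc allocator
calls.  This is only a minimal Python sketch: the rows are hand-copied
from the listing, and the helper names (overhead_by_object,
allocator_overhead) and the choice of which symbols count as allocator
calls are my own assumptions, not anything provided by perf or Octave.

```python
from collections import defaultdict

# Rows hand-copied from the flat profile above: (overhead %, shared object, symbol).
# The truncated Array<...> symbol is kept exactly as perf printed it.
ROWS = [
    (8.87, "libc-2.27.so", "cfree@GLIBC_2.2.5"),
    (5.44, "libc-2.27.so", "malloc"),
    (5.31, "liboctinterp.so.7.0.0", "octave_value::operator="),
    (4.95, "liboctgui.so.5.0.0", "octave_value::~octave_value"),
    (3.05, "libc-2.27.so", "_int_malloc"),
    (2.61, "liboctgui.so.5.0.0",
     "Array<std::__cxx11::basic_string<char, std::char_traits<char>, std:"),
    (2.41, "liboctgui.so.5.0.0", "octave_value_list::octave_value_list"),
    (2.32, "liboctgui.so.5.0.0", "Array<octave_value>::~Array"),
]

# Assumption: these three libc symbols are the allocator entries of interest.
ALLOCATOR_SYMBOLS = {"cfree", "malloc", "_int_malloc"}


def overhead_by_object(rows):
    """Sum the overhead percentage per shared object."""
    totals = defaultdict(float)
    for pct, shared_object, _symbol in rows:
        totals[shared_object] += pct
    return dict(totals)


def allocator_overhead(rows):
    """Sum the overhead of rows whose symbol, minus any @VERSION suffix,
    is one of the allocator calls listed above."""
    return sum(pct for pct, _obj, symbol in rows
               if symbol.split("@")[0] in ALLOCATOR_SYMBOLS)


if __name__ == "__main__":
    by_object = overhead_by_object(ROWS)
    for shared_object, pct in sorted(by_object.items(), key=lambda kv: -kv[1]):
        print(f"{shared_object:<24}{pct:6.2f}%")
    print(f"{'libc malloc/free calls':<24}{allocator_overhead(ROWS):6.2f}%")
```

On the eight rows shown, the three libc allocator entries add up to
17.36% of samples.  The listing is only the top of the profile, so the
sums cover just those rows.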

--Rik