Re: Octave coding conventions

Rik-4
How strictly do we want to enforce a line break after the return type in a
function declaration?  We mostly do this, but not always.  For example, in
Array.cc there is

template <typename T>
Array<T>
Array<T>::permute (const Array<octave_idx_type>& perm_vec_arg, bool inv) const

but also

template <typename T>
T * do_index (const T *src, T *dest, int lev) const

And do we want to enforce this convention in .cc files only or also .h
files?  For example, Array.h contains

octave_idx_type numel (void) const { return len; }

which is small and compact, but which, if we enforced the line break after the
return type, would become

octave_idx_type
numel (void) const { return len; }

There is no problem with declarations in a header file as that is a
separate switch to throw in astyle.

--Rik

Re: Octave coding conventions

John W. Eaton
Administrator
On 1/13/20 11:02 AM, Rik wrote:

> How strictly do we want to enforce a line break after the return type in a
> function declaration?  We mostly do this, but not always.  For example, in
> Array.cc there is
>
> template <typename T>
> Array<T>
> Array<T>::permute (const Array<octave_idx_type>& perm_vec_arg, bool inv) const
>
> but also
>
> template <typename T>
> T * do_index (const T *src, T *dest, int lev) const
>
> And do we want to enforce this convention in .cc files only or also .h
> files?  For example, Array.h contains
>
> octave_idx_type numel (void) const { return len; }
>
> which is small and compact, but if enforcing the return type for a function
> would become
>
> octave_idx_type
> numel (void) const { return len; }
>
> There is no problem with declarations in a header file as that is a
> separate switch to throw in astyle.

The original reason for writing the return type on a separate line was
so that the function name would begin in column 1.  Then you could
easily grep for function declarations and definitions using an anchored
pattern like "^FCN_NAME".  But that doesn't work well for C++ member
functions or any function inside a namespace declaration if we are
indenting all code inside a namespace.
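
To illustrate the anchored search described above, here is a minimal Python sketch. The `SOURCE` text and the `anchored_matches` helper are hypothetical, made up for illustration; neither is part of Octave. It suggests why the search only finds a definition when the searched name really starts in column 1.

```python
import re

# Hypothetical sample in the declaration styles discussed in this thread.
SOURCE = """\
template <typename T>
Array<T>
Array<T>::permute (const Array<octave_idx_type>& perm_vec_arg, bool inv) const
{
}

template <typename T>
T * do_index (const T *src, T *dest, int lev) const
{
}

namespace octave
{
  int
  helper (void)
  {
  }
}
"""


def anchored_matches(name, text):
    """Return 1-based numbers of the lines on which `name` starts in column 1."""
    pattern = re.compile(r"^" + re.escape(name))
    return [number
            for number, line in enumerate(text.splitlines(), start=1)
            if pattern.match(line)]


for name in ("Array<T>::permute", "permute", "do_index", "helper"):
    print(name, anchored_matches(name, SOURCE))
```

With this sample, only `Array<T>::permute` has an anchored match (on line 3). The bare member name `permute`, the one-line `do_index` declaration, and the indented `helper` inside the namespace are all missed.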

Recently, I've been putting the declaration all on one line if possible
or splitting after the return type if the return type is long.  I don't
see that there is one rule that will fit all cases.  I just try to do
what looks best and makes the most sense for each case.

jwe

Re: Octave line length

Rik-4
On 01/13/2020 11:30 AM, John W. Eaton wrote:

> On 1/13/20 11:02 AM, Rik wrote:
>> How strictly do we want to enforce a line break after the return type in a
>> function declaration?  We mostly do this, but not always.  For example, in
>> Array.cc there is
>>
>> template <typename T>
>> Array<T>
>> Array<T>::permute (const Array<octave_idx_type>& perm_vec_arg, bool inv) const
>>
>> but also
>>
>> template <typename T>
>> T * do_index (const T *src, T *dest, int lev) const
>>
>> And do we want to enforce this convention in .cc files only or also .h
>> files?  For example, Array.h contains
>>
>> octave_idx_type numel (void) const { return len; }
>>
>> which is small and compact, but if enforcing the return type for a function
>> would become
>>
>> octave_idx_type
>> numel (void) const { return len; }
>>
>> There is no problem with declarations in a header file as that is a
>> separate switch to throw in astyle.
>
> The original reason for writing the return type on a separate line was so
> that the function name would begin in column 1.  Then you could easily
> grep for function declarations and definitions using an anchored pattern
> like "^FCN_NAME".  But that doesn't work well for C++ member functions or
> any function inside a namespace declaration if we are indenting all code
> inside a namespace.
>
> Recently, I've been putting the declaration all on one line if possible
> or splitting after the return type if the return type is long.  I don't
> see that there is one rule that will fit all cases.  I just try to do
> what looks best and makes the most sense for each case.

"Beauty is in the eye of the beholder"

This makes sense to me.  I want the tools to help get close to a solution,
but ultimately, I think it comes down to programmer judgment and what
effectively communicates the intent of the code.

With that in mind, I would suggest we stop enforcing an 80-character
limit.  It isn't particularly important given that programmers are working
on 25" HiDPI monitors these days.  Instead, what I value is the clarity of
intent.  If items are grouped together on a single line it is likely
because they are all part of a single idea.  Breaking at an arbitrary point
within an expression thus leads to an incomplete thought which is harder to
understand.  An example from urlwrite.cc is

  std::string filename = args(1).xstring_value ("urlwrite: LOCALFILE must be a string");

This declares a variable, attempts to initialize it with a string value,
and if that is not possible emits an error.  That is all one single idea
(input processing and validation).  Moving to a vertical coding style just
to stay under an 80-column limit leads to less concise code that isn't any
more effective than the original at communicating what it does:

  std::string filename;
  if (! args(1).is_string ())
    error ("urlwrite: LOCALFILE must be a string");
  filename = args(1).string_value ();

The other typical case is a rather ordinary expression, such as an
if conditional, that sits within a namespace, within a function, within a
for loop, etc.  The overall indent is therefore large, but since it is all
blank space it isn't confusing in the way that 80 characters of pure code
would be.  In this case, enforcing a line length limit is also of no real
utility.

I'd still say we use some (large-ish) limit in the formatting tools just as
a gentle reminder that extremely long lines are still likely to be confusing.

--Rik

Re: Octave line length

Andrew Janke-2


On 1/13/20 6:06 PM, Rik wrote:
> On 01/13/2020 11:30 AM, John W. Eaton wrote:
>> On 1/13/20 11:02 AM, Rik wrote:

> With that in mind, I would suggest we stop enforcing an 80-character
> limit.  It isn't particularly important given that programmers are working
> on 25" HiDPI monitors these days.  Instead, what I value is the clarity of
> intent.

Let's not go too long! Now that we have large monitors, some of us use
them to fit multiple editor and console windows side-by-side. A limit of
80-100 columns is still useful in those cases. Also some of us are working
on 13" laptops. And I'll bet there's some old curmudgeons in the Octave
community still running on 80-column terminal emulators.

Cheers,
Andrew

Re: Octave line length

John W. Eaton
Administrator
On 1/13/20 5:18 PM, Andrew Janke wrote:

>
>
> On 1/13/20 6:06 PM, Rik wrote:
>> On 01/13/2020 11:30 AM, John W. Eaton wrote:
>>> On 1/13/20 11:02 AM, Rik wrote:
>
>> With that in mind, I would suggest we stop enforcing an 80-character
>> limit.  It isn't particularly important given that programmers are working
>> on 25" HiDPI monitors these days.  Instead, what I value is the clarity of
>> intent.
>
> Let's not go too long! Now that we have large monitors, some of us use
> them to fit multiple editor and console windows side-by-side. A limit of
> 80-100 columns is still useful in those cases. Also some of us are working
> on 13" laptops.

Is it easy to ask the reformatting tool to not break any expressions
into multiple lines and then get a distribution of line lengths for all
lines?  I think it would be interesting to see just how many lines are
longer than 80, 90, 100, ... characters.

> And I'll bet there's some old curmudgeons in the Octave
> community still running on 80-column terminal emulators.

Emulator?!?  I still have a Wyse CRT terminal!  (But it's in a box and I
haven't used it in many years.)

jwe

Re: Octave line length

siko1056
In reply to this post by Andrew Janke-2


On 1/14/20 8:18 AM, Andrew Janke wrote:

> On 1/13/20 6:06 PM, Rik wrote:
>> On 01/13/2020 11:30 AM, John W. Eaton wrote:
>>> On 1/13/20 11:02 AM, Rik wrote:
>
>> With that in mind, I would suggest we stop enforcing an 80-character
>> limit.  It isn't particularly important given that programmers are working
>> on 25" HiDPI monitors these days.  Instead, what I value is the clarity of
>> intent.
>
> Let's not go too long! Now that we have large monitors, some of us use
> them to fit multiple editor and console windows side-by-side. A limit of
> 80-100 columns is still useful in those cases. Also some of us are working
> on 13" laptops. And I'll bet there's some old curmudgeons in the Octave
> community still running on 80-column terminal emulators.
>
> Cheers,
> Andrew
>


Personally, I also favor sticking to old-school 80 columns, for the
reasons Andrew already gave.  Seeing documents side-by-side is
important for my editing and comparing.  Any vertical scrolling just
consumes time and the overview suffers a lot.  Additionally, the limit
forces me to think more about code structure, but as with any rule, a few
exceptions should be permitted.

Best,
Kai

Re: Octave line length

Rik-4
In reply to this post by John W. Eaton
On 01/13/2020 09:20 PM, John W. Eaton wrote:

> On 1/13/20 5:18 PM, Andrew Janke wrote:
>>
>>
>> On 1/13/20 6:06 PM, Rik wrote:
>>> On 01/13/2020 11:30 AM, John W. Eaton wrote:
>>>> On 1/13/20 11:02 AM, Rik wrote:
>>
>>> With that in mind, I would suggest we stop enforcing an 80-character
>>> limit.  It isn't particularly important given that programmers are working
>>> on 25" HiDPI monitors these days.  Instead, what I value is the clarity of
>>> intent.
>>
>> Let's not go too long! Now that we have large monitors, some of us use
>> them to fit multiple editor and console windows side-by-side. A limit of
>> 80-100 columns is still useful in those cases. Also some of us are working
>> on 13" laptops.

This is why I said "large-ish" value, rather than no limit.  It is still
important to nudge programmers towards writing clear code.

> Is it easy to ask the reformatting tool to not break any expressions into
> multiple lines and then get a distribution of line lengths for all
> lines?  I think it would be interesting to see just how many lines are
> longer than 80, 90, 100, ... characters.

I don't know how to do it with a reformatting tool, but with Perl it was
easy enough.  The following results are only for liboctave/ and libinterp/
including all *.cc, *.c, *.h files.  The line length begins at 1 because
the newline is counted as a character.

Line Length : Count

1: 82105
2: 15272
3: 9255
4: 8223
5: 1994
6: 13068
7: 5938
8: 4910
9: 1804
10: 7111
11: 2722
12: 4435
13: 3621
14: 5848
15: 3455
16: 4437
17: 4866
18: 5618
19: 5989
20: 5025
21: 4571
22: 6054
23: 5182
24: 5604
25: 4687
26: 4770
27: 5137
28: 4738
29: 3866
30: 4240
31: 4671
32: 5644
33: 4159
34: 4683
35: 4235
36: 5141
37: 4170
38: 3960
39: 5099
40: 3868
41: 3807
42: 3865
43: 3759
44: 3598
45: 3741
46: 3568
47: 3494
48: 4507
49: 3510
50: 4241
51: 3471
52: 4462
53: 3237
54: 3521
55: 3605
56: 3176
57: 5405
58: 3635
59: 3014
60: 2967
61: 3028
62: 4007
63: 2740
64: 3832
65: 4960
66: 3627
67: 2776
68: 2759
69: 7028
70: 2944
71: 2748
72: 2727
73: 4792
74: 6709
75: 2472
76: 2088
77: 1976
78: 1624
79: 1644
80: 1355
81: 990
82: 435
83: 404
84: 190
85: 163
86: 140
87: 95
88: 78
89: 121
90: 82
91: 97
92: 82
93: 65
94: 61
95: 38
96: 109
97: 68
98: 77
99: 45
100: 38
101: 43
102: 37
103: 37
104: 62
105: 40
106: 29
107: 22
108: 41
109: 23
110: 12
111: 18
112: 26
113: 17
114: 19
115: 20
116: 28
117: 10
118: 15
119: 11
120: 10
121: 16
122: 17
123: 14
124: 29
125: 11
126: 16
127: 19
128: 3
129: 7
130: 4
131: 3
132: 3
133: 4
134: 2
135: 7
136: 1
137: 1
138: 1
139: 1
141: 1
144: 1
145: 1
146: 1
147: 3
148: 4
149: 5
150: 1
151: 2
152: 7
154: 2
155: 1
157: 1
159: 2
160: 1
161: 1
163: 3
164: 2
165: 1
167: 1
170: 1
171: 1
189: 1
195: 1
202: 1
225: 1
237: 58
238: 37
239: 5
244: 5
245: 17
246: 2
247: 3
251: 1
277: 2
278: 1
279: 15
280: 6
281: 19
282: 4
312: 3
313: 2
314: 7
315: 7
316: 3
317: 1
384: 1
393: 1
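
A tally like the one above could be produced roughly as follows. This is a Python sketch, not the Perl script that was actually used (that script was not posted), and the function names are hypothetical. As in the table, the trailing newline counts toward the length, so an empty line has length 1. Lengths here are in decoded characters, which is an assumption; a byte-based count could differ for non-ASCII text.

```python
from collections import Counter


def line_length_histogram(lines):
    """Map line length -> number of lines, counting the trailing newline."""
    return Counter(len(line) for line in lines)


def histogram_for_files(paths):
    """Accumulate one histogram over several source files."""
    total = Counter()
    for path in paths:
        # newline="" keeps the original line endings untouched.
        with open(path, encoding="utf-8", errors="replace", newline="") as fh:
            total.update(line_length_histogram(fh))
    return total


def format_histogram(hist):
    """Render the histogram in the 'length: count' layout used above."""
    return "\n".join(f"{length}: {hist[length]}" for length in sorted(hist))


print(format_histogram(line_length_histogram(["\n", "ab\n", "cd\n"])))
```

For real use, `histogram_for_files` would be given the list of *.cc, *.c, and *.h paths under liboctave/ and libinterp/.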

The extremely long line lengths (> 200) should probably be checked and
dealt with.  I looked at one instance just to get an idea and it was

      return 0.11380523107427108222e0 + (0.43099572287871821013e-2 + (0.36544324341565929930e-4 + (0.47965044028581857764e-6 + (0.81819034238463698796e-8 + (0.17934133239549647357e-9 + (0.50956666166186293627e-11 + (0.18850487318190638010e-12 + 0.79697813173519853340e-14 * t) * t) * t) * t) * t) * t) * t) * t;

This could easily be split, but it is in the file Faddeeva.cc, where we
specifically do *not* apply Octave coding conventions so that we can more
easily merge changes from upstream.


Feeding the raw data into Octave you can ask what line length is required
to include a certain fraction of the distribution.

empirical_inv ([0.9, 0.95, 0.98, 0.99, 0.995], hraw)
ans =

   68   74   78   80   85

The results do show the limit around 80 characters that we have been trying
to enforce.  In this case 99.5% of all lines are <= 85 characters.
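
The same question can also be answered directly from a length/count table. The Python sketch below is an assumption about what such a lookup computes (the smallest length whose cumulative count reaches the requested fraction); Octave's `empirical_inv` may treat ties or interpolation differently, and `length_covering` is a hypothetical name.

```python
def length_covering(hist, fraction):
    """Smallest line length L such that at least `fraction` of all
    lines have length <= L.  `hist` maps length -> count."""
    total = sum(hist.values())
    running = 0
    for length in sorted(hist):
        running += hist[length]
        if running >= fraction * total:
            return length
    raise ValueError("fraction must not exceed 1")


# Tiny made-up histogram, not the Octave data.
toy = {10: 5, 50: 4, 90: 1}
print([length_covering(toy, f) for f in (0.5, 0.6, 0.95)])  # [10, 50, 90]
```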

I think trying something like 95 might be a good first step.

>
> > And I'll bet there's some old curmudgeons in the Octave
> > community still running on 80-column terminal emulators.
>
> Emulator?!?  I still have a Wyse CRT terminal!  (But it's in a box and I
> haven't used it in many years.)
>
> jwe
>

There is a histogram of developers as well.  I just don't imagine the
number who can *only* view 80 columns at a time is very large.

--Rik





Re: Octave line length

John W. Eaton
Administrator
On 1/14/20 11:25 AM, Rik wrote:

> I don't know how to do it with a reformatting tool, but with Perl it was
> easy enough.

How?  I'd like to take a look at some of the long lines you found.

> The extremely long line lengths (> 200) should probably be checked and
> dealt with.  I looked at one instance just to get an idea and it was
>
>        return 0.11380523107427108222e0 + (0.43099572287871821013e-2 +
> (0.36544324341565929930e-4 + (0.47965044028581857764e-6 +
> (0.81819034238463698796e-8 + (0.17934133239549647357e-9 +
> (0.50956666166186293627e-11 + (0.18850487318190638010e-12 +
> 0.79697813173519853340e-14 * t) * t) * t) * t) * t) * t) * t) * t;
>
> This could easily be split, but it is in the file Faddeeva.cc which we
> specifically do *not* use Octave coding conventions in so that we can more
> easily merge changes from upstream.

Yes, long initializer lists and pure arithmetic expressions like this
probably aren't really the problem.

> I think trying something like 95 might be a good first step.

Not 132, like old line printer output?

> There is a histogram of developers as well.  I just don't imagine the
> number who can *only* view 80 columns at a time is very large.

True, and terminal emulators wrap lines, don't they?

jwe



Re: Octave line length

Rik-4
On 01/14/2020 09:32 AM, John W. Eaton wrote:
> On 1/14/20 11:25 AM, Rik wrote:
>
>> I don't know how to do it with a reformatting tool, but with Perl it was
>> easy enough.
>
> How?  I'd like to take a look at some of the long lines you found.

I filed an issue report https://savannah.gnu.org/bugs/index.php?57599 with
an attached list of all lines > 100 characters.

--Rik


Re: Octave line length

John W. Eaton
Administrator
In reply to this post by Rik-4
On 1/14/20 11:25 AM, Rik wrote:
> On 01/13/2020 09:20 PM, John W. Eaton wrote:

>> Is it easy to ask the reformatting tool to not break any expressions into
>> multiple lines and then get a distribution of line lengths for all
>> lines?  I think it would be interesting to see just how many lines are
>> longer than 80, 90, 100, ... characters.
>
> I don't know how to do it with a reformatting tool, but with Perl it was
> easy enough.  The following results are only for liboctave/ and libinterp/
> including all *.cc, *.c, *.h files.  The line length begins at 1 because
> the newline is counted as a character.

I did the following:

   echo "LENGTH  COUNT"
   for f in $(hg locate '*.c' '*.f' '*.cc' '*.h' '*.ll' '*.yy'); do
     awk '{ n = length ($0); if (n > 80) print n; }' "$f"
   done | sort | uniq -c | awk '{ printf (" %4d   %4d\n", $2, $1); }' | sort -n

and my results are a little different from yours.

Using

   for f in $(hg locate '*.c' '*.f' '*.cc' '*.h' '*.ll' '*.yy'); do
     awk '{ if (length ($0) > 100) print FILENAME; }' "$f"
   done | sort | uniq -c | sort -nr

to get counts by file for lines longer than 100 characters, I see

     199 liboctave/external/Faddeeva/Faddeeva.cc
     141 libinterp/operators/op-int.h
      72 liboctave/numeric/lo-specfun.h
      70 libinterp/corefcn/besselj.cc
      54 libgui/src/gui-preferences-sc.h
      24 libgui/qterminal/libqterminal/unix/Filter.h
      23 libinterp/corefcn/data.cc
      22 libinterp/corefcn/graphics.in.h
      19 libgui/src/settings-dialog.cc
      18 libinterp/dldfcn/qr.cc
      18 libinterp/corefcn/rand.cc
      14 libinterp/corefcn/mappers.cc
      12 libinterp/octave-value/ov-java.cc
      11 libinterp/corefcn/graphics.cc
      10 liboctave/numeric/oct-rand.cc
      10 libinterp/corefcn/matrix_type.cc
      10 libgui/src/welcome-wizard.cc
       9 libinterp/corefcn/regexp.cc
       9 libinterp/corefcn/oct-stream.cc
       9 libgui/src/m-editor/file-editor-tab.cc
       8 liboctave/numeric/eigs-base.cc
       8 libinterp/parse-tree/oct-parse.yy
       8 libinterp/dldfcn/__glpk__.cc
       8 libinterp/corefcn/file-io.cc
       7 libinterp/octave-value/ov.cc
       7 libinterp/corefcn/time.cc
       7 libinterp/corefcn/lu.cc
       7 libinterp/corefcn/gsvd.cc
       7 libinterp/corefcn/cellfun.cc
       6 liboctave/array/Sparse.cc
       6 liboctave/array/CMatrix.cc
       6 libinterp/octave-value/ov-struct.cc
       6 libinterp/dldfcn/chol.cc
       6 libinterp/corefcn/utils.cc
       6 libinterp/corefcn/dasrt.cc
       6 libgui/qterminal/libqterminal/unix/TerminalView.h
       5 libgui/src/main-window.cc
       5 libgui/src/files-dock-widget.cc
       5 libgui/qterminal/libqterminal/unix/CharacterColor.h
       4 liboctave/operators/mx-op-defs.h
       4 liboctave/array/fCMatrix.cc
       4 libinterp/parse-tree/lex.ll
       4 libinterp/octave-value/ov-class.cc
       4 libinterp/corefcn/quadcc.cc
       4 libinterp/corefcn/quad.cc
       4 libinterp/corefcn/oct-tex-parser.yy
       4 libinterp/corefcn/event-manager.cc
       4 libgui/src/news-reader.cc
       4 libgui/qterminal/libqterminal/unix/ScreenWindow.h
       3 src/mkoctfile.in.cc
       3 libinterp/octave.cc
       3 libinterp/dldfcn/__init_fltk__.cc
       3 libinterp/dldfcn/audioread.cc
       3 libinterp/corefcn/sysdep.cc
       3 libinterp/corefcn/syscalls.cc
       3 libinterp/corefcn/strfns.cc
       3 libinterp/corefcn/stream-euler.cc
       3 libinterp/corefcn/stack-frame.cc
       3 libinterp/corefcn/pr-output.cc
       3 libinterp/corefcn/lsode.cc
       3 libinterp/corefcn/gl-render.cc
       3 libgui/src/m-editor/file-editor.cc
       3 libgui/qterminal/libqterminal/unix/Screen.h
       3 libgui/qterminal/libqterminal/unix/KeyboardTranslator.h
       3 libgui/qterminal/libqterminal/unix/History.h
       2 liboctave/wrappers/unistd-wrappers.c
       2 liboctave/util/lo-regexp.cc
       2 liboctave/numeric/sparse-qr.cc
       2 liboctave/array/CSparse.cc
       2 liboctave/array/CRowVector.cc
       2 libinterp/parse-tree/jit-typeinfo.h
       2 libinterp/operators/ops.h
       2 libinterp/octave-value/ov-fcn-handle.cc
       2 libinterp/dldfcn/__eigs__.cc
       2 libinterp/dldfcn/audiodevinfo.cc
       2 libinterp/corefcn/urlwrite.cc
       2 libinterp/corefcn/symtab.cc
       2 libinterp/corefcn/qz.cc
       2 libinterp/corefcn/ls-mat5.cc
       2 libinterp/corefcn/help.cc
       2 libinterp/corefcn/gl2ps-print.cc
       2 libinterp/corefcn/fcn-info.cc
       2 libinterp/corefcn/dot.cc
       2 libgui/src/qt-interpreter-events.cc
       2 libgui/src/gui-preferences-mw.h
       2 libgui/src/find-files-dialog.cc
       2 libgui/qterminal/libqterminal/unix/TerminalCharacterDecoder.h
       1 liboctave/util/quit.h
       1 liboctave/util/oct-sparse.h
       1 liboctave/util/oct-shlib.cc
       1 liboctave/util/lo-utils.cc
       1 liboctave/util/lo-ieee.cc
       1 liboctave/numeric/schur.cc
       1 liboctave/numeric/oct-fftw.cc
       1 liboctave/numeric/LSODE.cc
       1 liboctave/numeric/hess.cc
       1 libinterp/parse-tree/pt-mat.cc
       1 libinterp/octave-value/ov-typeinfo.cc
       1 libinterp/octave-value/ov-scalar.h
       1 libinterp/octave-value/ov-intx.h
       1 libinterp/octave-value/ov-float.h
       1 libinterp/octave-value/ov-cell.cc
       1 libinterp/octave-value/ov-bool-mat.cc
       1 libinterp/octave-value/ov-bool.h
       1 libinterp/octave-value/cdef-package.cc
       1 libinterp/octave-value/cdef-class.cc
       1 libinterp/dldfcn/__ode15__.cc
       1 libinterp/dldfcn/__fltk_uigetfile__.cc
       1 libinterp/dldfcn/__delaunayn__.cc
       1 libinterp/dldfcn/convhulln.cc
       1 libinterp/dldfcn/amd.cc
       1 libinterp/corefcn/toplev.cc
       1 libinterp/corefcn/sylvester.cc
       1 libinterp/corefcn/strfind.cc
       1 libinterp/corefcn/sparse.cc
       1 libinterp/corefcn/__qp__.cc
       1 libinterp/corefcn/psi.cc
       1 libinterp/corefcn/ordschur.cc
       1 libinterp/corefcn/load-save.cc
       1 libinterp/corefcn/load-path.cc
       1 libinterp/corefcn/__lin_interpn__.cc
       1 libinterp/corefcn/input.h
       1 libinterp/corefcn/input.cc
       1 libinterp/corefcn/__ilu__.cc
       1 libinterp/corefcn/hex2num.cc
       1 libinterp/corefcn/errwarn.cc
       1 libinterp/corefcn/ellipj.cc
       1 libinterp/corefcn/dassl.cc
       1 libinterp/corefcn/daspk.cc
       1 libgui/src/workspace-view.cc
       1 libgui/src/variable-editor.cc
       1 libgui/src/shortcut-manager.cc
       1 libgui/src/octave-qobject.cc
       1 libgui/src/m-editor/octave-qscintilla.cc
       1 libgui/src/gui-preferences-ed.h
       1 libgui/graphics/Panel.cc
       1 libgui/graphics/ButtonGroup.cc
       1 libgui/graphics/ButtonControl.cc
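
For comparison, the per-file count above can be sketched in Python. This is only an illustration of the same idea, not a replacement for the shell pipeline: `long_line_counts` is a hypothetical helper, and it takes the file contents as an in-memory mapping rather than calling `hg locate`. Like awk's `length ($0)`, it does not count the newline.

```python
from collections import Counter


def long_line_counts(files, limit=100):
    """`files` maps file name -> iterable of lines.  Returns a list of
    (name, count) pairs for files with lines longer than `limit`,
    largest count first."""
    counts = Counter()
    for name, lines in files.items():
        n = sum(1 for line in lines if len(line.rstrip("\n")) > limit)
        if n:
            counts[name] = n
    return counts.most_common()


# Made-up contents, for illustration only.
demo = {
    "a.cc": ["x" * 101 + "\n", "short\n"],
    "b.h": ["y" * 100 + "\n"],
    "c.cc": ["z" * 150 + "\n"] * 2,
}
print(long_line_counts(demo))  # [('c.cc', 2), ('a.cc', 1)]
```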

I bet a bunch of those are long error messages.

The thing I'm really curious about is how many lines longer than N (say
100 or 132) characters do we have if we reformat the sources so that no
expressions are split across lines?  Do things get any worse than what
we currently have, or would most of the expressions that are split to
fit in 80 character lines fit in 100 or 132 character lines?

jwe

Re: Octave line length

Rik-4
On 01/15/2020 08:32 AM, John W. Eaton wrote:

> On 1/14/20 11:25 AM, Rik wrote:
>> On 01/13/2020 09:20 PM, John W. Eaton wrote:
>
>>> Is it easy to ask the reformatting tool to not break any expressions into
>>> multiple lines and then get a distribution of line lengths for all
>>> lines?  I think it would be interesting to see just how many lines are
>>> longer than 80, 90, 100, ... characters.
>>
>> I don't know how to do it with a reformatting tool, but with Perl it was
>> easy enough.  The following results are only for liboctave/ and libinterp/
>> including all *.cc, *.c, *.h files.  The line length begins at 1 because
>> the newline is counted as a character.
>
> I did the following:
>
>   echo "LENGTH  COUNT"
>   for f in $(hg locate '*.c' '*.f' '*.cc' '*.h' '*.ll' '*.yy'); do
>     awk '{ n = length ($0); if (n > 80) print n; }' "$f"
>   done | sort | uniq -c | awk '{ printf (" %4d   %4d\n", $2, $1); }' | sort -n
>
> and my results are a little different from yours.

The differences are unlikely to change the conclusions.  I didn't include
"*.ll", "*.yy", or "*.f" files.  There are only two "*.ll" and two "*.yy"
files so they can't skew the result too much, and I didn't include Fortran
files at all because I wasn't sure it was appropriate to use coding
conventions for C/C++ files on that language.

>
> Using
>
>   for f in $(hg locate '*.c' '*.f' '*.cc' '*.h' '*.ll' '*.yy'); do
>     awk '{ if (length ($0) > 100) print FILENAME; }' "$f"
>   done | sort | uniq -c | sort -nr
>
> to get counts by file for lines longer than 100 characters, I see
>
>     199 liboctave/external/Faddeeva/Faddeeva.cc
>     141 libinterp/operators/op-int.h
>      72 liboctave/numeric/lo-specfun.h
>      70 libinterp/corefcn/besselj.cc
>      54 libgui/src/gui-preferences-sc.h
>      24 libgui/qterminal/libqterminal/unix/Filter.h
>      23 libinterp/corefcn/data.cc
>      22 libinterp/corefcn/graphics.in.h
>      19 libgui/src/settings-dialog.cc
>      18 libinterp/dldfcn/qr.cc
>      18 libinterp/corefcn/rand.cc
>      14 libinterp/corefcn/mappers.cc
>      12 libinterp/octave-value/ov-java.cc
>      11 libinterp/corefcn/graphics.cc
>      10 liboctave/numeric/oct-rand.cc
>      10 libinterp/corefcn/matrix_type.cc
>      10 libgui/src/welcome-wizard.cc
>
>
> I bet a bunch of those are long error messages.

Clearly a power law sort of thing.  The total number of long lines is
1,061.  But files with < 10 long lines only contribute 334 lines to the
total.  Taking care of the 17 files at the start would cover 727/1061 =
68.5% of the instances.
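
The arithmetic can be checked from the quoted per-file counts. The short Python sketch below only re-adds numbers that already appear in this thread; the variable names are made up for illustration.

```python
# Per-file counts of lines longer than 100 characters, copied from the
# listing quoted above (only the files with at least 10 such lines).
top_counts = [199, 141, 72, 70, 54, 24, 23, 22, 19,
              18, 18, 14, 12, 11, 10, 10, 10]

# Lines contributed by all files with fewer than 10 long lines (from the text).
small_file_lines = 334

top_total = sum(top_counts)
grand_total = top_total + small_file_lines
share = 100.0 * top_total / grand_total

print(len(top_counts), top_total, grand_total, round(share, 1))
# prints: 17 727 1061 68.5
```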

>
> The thing I'm really curious about is how many lines longer than N (say
> 100 or 132) characters do we have if we reformat the sources so that no
> expressions are split across lines?  Do things get any worse than what we
> currently have, or would most of the expressions that are split to fit
> in 80 character lines fit in 100 or 132 character lines?
>
This is hard to judge accurately.  I worked up a Perl script that checked
for lines which did not end in a ';' or ')' character.  If a line did not, the
script joined the following line to it and checked the combined length.
Checking only the *.cc files showed more than 2,000 new lines which would
be > 100 characters.  As an example, errors and warnings in liboctave/ such
as this

        (*current_liboctave_warning_with_id_handler)
          (warning_id_nearly_singular_matrix,
           "matrix singular to machine precision, rcond = %g", rcond);

become

        (*current_liboctave_warning_with_id_handler)
          (warning_id_nearly_singular_matrix,"matrix singular to machine precision, rcond = %g", rcond);

In general, since we have tried to keep things below 80 characters, joining
two long lines gives roughly 160 characters minus the indent, and the
indent is rarely more than 28, so even a large limit like 132 is insufficient.
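
The joining heuristic described above can be sketched in Python. This is not the Perl script itself (it was not posted). The sketch assumes a single join with the next line, skips blank lines, and puts one space between the joined parts; `joined_lengths` is a hypothetical name.

```python
def joined_lengths(lines, limit=100):
    """For each line that does not end in ';' or ')', join the next line
    to it and report (line_number, joined_length) when the result is
    longer than `limit`.  Line numbers are 1-based."""
    hits = []
    for i in range(len(lines) - 1):
        current = lines[i].rstrip()
        if not current or current.endswith((";", ")")):
            continue
        # Drop the continuation line's indent before joining.
        joined = current + " " + lines[i + 1].strip()
        if len(joined) > limit:
            hits.append((i + 1, len(joined)))
    return hits


# The liboctave warning call quoted above, with its original indentation.
sample = [
    " " * 8 + "(*current_liboctave_warning_with_id_handler)",
    " " * 10 + "(warning_id_nearly_singular_matrix,",
    " " * 11 + '"matrix singular to machine precision, rcond = %g", rcond);',
]
print(joined_lengths(sample))
```

On this sample the first line ends in ')' and is left alone, while the second line joined with the third comes out longer than 100 characters.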

--Rik