buildbot server down?

Dmitri A. Sergatskov
$ wget http://buildbot.octave.org:8010/
--2020-09-04 15:11:28--  http://buildbot.octave.org:8010/
Resolving buildbot.octave.org (buildbot.octave.org)... 162.243.101.184
Connecting to buildbot.octave.org (buildbot.octave.org)|162.243.101.184|:8010... failed: Connection refused.
$ wget https://buildbot.octave.org:8010/
--2020-09-04 15:11:32--  https://buildbot.octave.org:8010/
Resolving buildbot.octave.org (buildbot.octave.org)... 162.243.101.184
Connecting to buildbot.octave.org (buildbot.octave.org)|162.243.101.184|:8010... failed: Connection refused.
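
The same reachability check can be scripted. The following is a minimal sketch, not part of the original report: the helper name `is_port_open` and the 5-second timeout are assumptions, while the host and port come from the wget commands above. It only tests whether a TCP connection is accepted, which is the step that fails with "Connection refused" here.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port is accepted."""
    try:
        # create_connection resolves the name and attempts the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "connection refused", timeouts, and name-resolution failures.
        return False


# Example call (hypothetical usage, host and port from the wget output):
#   is_port_open("buildbot.octave.org", 8010)
```

A False result does not distinguish a stopped service from a firewall or DNS problem; wget's own message is more specific on that point.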

Re: buildbot server down?

John W. Eaton
Administrator
On 9/4/20 3:12 PM, Dmitri A. Sergatskov wrote:

> $ wget http://buildbot.octave.org:8010/
> --2020-09-04 15:11:28--  http://buildbot.octave.org:8010/
> Resolving buildbot.octave.org (buildbot.octave.org)... 162.243.101.184
> Connecting to buildbot.octave.org (buildbot.octave.org)|162.243.101.184|:8010... failed: Connection refused.
> $ wget https://buildbot.octave.org:8010/
> --2020-09-04 15:11:32--  https://buildbot.octave.org:8010/
> Resolving buildbot.octave.org (buildbot.octave.org)... 162.243.101.184
> Connecting to buildbot.octave.org (buildbot.octave.org)|162.243.101.184|:8010... failed: Connection refused.

I don't know why the server failed.  I restarted it.

jwe

Re: buildbot server down?

Dmitri A. Sergatskov


On Fri, Sep 4, 2020 at 3:49 PM John W. Eaton <[hidden email]> wrote:
> On 9/4/20 3:12 PM, Dmitri A. Sergatskov wrote:
>> $ wget http://buildbot.octave.org:8010/
>> --2020-09-04 15:11:28--  http://buildbot.octave.org:8010/
>> Resolving buildbot.octave.org (buildbot.octave.org)... 162.243.101.184
>> Connecting to buildbot.octave.org (buildbot.octave.org)|162.243.101.184|:8010... failed: Connection refused.
>> $ wget https://buildbot.octave.org:8010/
>> --2020-09-04 15:11:32--  https://buildbot.octave.org:8010/
>> Resolving buildbot.octave.org (buildbot.octave.org)... 162.243.101.184
>> Connecting to buildbot.octave.org (buildbot.octave.org)|162.243.101.184|:8010... failed: Connection refused.
>
> I don't know why the server failed.  I restarted it.
>
> jwe

And it is down again...
If this is an old system I'd suspect its power supply.

Dmitri.

Re: buildbot server down?

John W. Eaton
Administrator
On 9/4/20 7:28 PM, Dmitri A. Sergatskov wrote:

> And it is down again...
> If this is an old system I'd suspect its power supply.

It's running on a DigitalOcean droplet, so I hope it is not the power
supply :-).  The system has been up for 100+ days.

Looking at log files, it seems that some processes were killed when the
system had memory issues.  At first, it looked like that was happening
while rotating log files, but I'm not sure about that being the actual
cause.
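
One way to confirm that processes were killed for lack of memory is to search the kernel log for OOM-killer messages. The sketch below is an assumption-laden illustration, not what was actually run: the function name `find_oom_kills` is made up, and the regular expression targets the "Out of memory: Killed process ..." wording used by common Linux kernels, which varies between kernel versions.

```python
import re

# Matches kernel messages such as
#   "Out of memory: Killed process 931 (python3) ..."
# and the older "Out of memory: Kill process 931 (python3) ..." wording.
# The exact text depends on the kernel version (assumption).
OOM_RE = re.compile(r"Out of memory: Kill(?:ed)? process (\d+) \(([^)]+)\)")


def find_oom_kills(lines):
    """Return a (pid, command) pair for each OOM-killer message found."""
    kills = []
    for line in lines:
        match = OOM_RE.search(line)
        if match:
            kills.append((int(match.group(1)), match.group(2)))
    return kills
```

It could be fed the lines of a kernel log file or the output of `dmesg`; where those logs live depends on the distribution.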

In any case, I looked at the log files and it seems they are filling up
with failed login attempts from random hosts.  I don't know the best way
to fix that, so I'd gladly accept advice from anyone with current
sysadmin experience.
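
To get a sense of the scale of the problem, the failed attempts can be counted per source address. This is only a sketch under assumptions: the message does not say which service is being probed, so the code assumes SSH and the usual sshd "Failed password for ... from ADDRESS" log line; the name `count_failed_logins` is hypothetical.

```python
import re
from collections import Counter

# sshd typically logs lines like
#   "Failed password for invalid user admin from 203.0.113.7 port 4242 ssh2"
# (assumed format; other services log failures differently).
FAILED_RE = re.compile(r"Failed password for (?:invalid user )?\S+ from (\S+)")


def count_failed_logins(lines):
    """Count failed password attempts per source address."""
    counts = Counter()
    for line in lines:
        match = FAILED_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

Calling `count_failed_logins(lines).most_common(10)` would list the busiest sources.  Tools such as fail2ban automate this kind of counting and block repeat offenders.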

Thanks,

jwe

Re: buildbot server down?

John W. Eaton
Administrator
On 9/4/20 9:59 PM, John W. Eaton wrote:

Thinking that maybe we were just running near the edge on resource
limits, I resized the droplet from 2GB to 4GB RAM.  When buildbot
starts, the python process uses about 80MB.  But it seems to grow fairly
rapidly.  Here is the output from top after a few hours:

   PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
   931 buildbot  20   0 3141096   3.0g   9780 R 103.0  76.4 111:40.02 python3

So it seems something is wrong there.  I don't think the buildbot server
process should need that much memory and I don't recall this being a
problem previously.

I updated to a newer version.   We were using 2.7.0-1, now we have
2.8.2-3.  I'll watch and see if that improves things.

jwe


Re: buildbot server down?

John W. Eaton
Administrator
On 9/5/20 10:17 AM, John W. Eaton wrote:

> On 9/4/20 9:59 PM, John W. Eaton wrote:
>
> Thinking that maybe we were just running near the edge on resource
> limits, I resized the droplet from 2GB to 4GB RAM.  When buildbot
> starts, the python process uses about 80MB.  But it seems to grow fairly
> rapidly.  Here is the output from top after a few hours:
>
>    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
>    931 buildbot  20   0 3141096   3.0g   9780 R 103.0  76.4 111:40.02 python3
>
> So it seems something is wrong there.  I don't think the buildbot server
> process should need that much memory and I don't recall this being a
> problem previously.
>
> I updated to a newer version.   We were using 2.7.0-1, now we have
> 2.8.2-3.  I'll watch and see if that improves things.

Hmm, it seems to be doing the same thing again.  After 90 minutes or so
we are up to:

   PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
  1148 buildbot  20   0 3142408   3.0g   9636 S  75.0  76.5  56:44.73 python3

I have no idea why this is happening.  Is it buildbot?  Python 3?
Something wrong with our configuration?

jwe

Re: buildbot server down?

siko1056
On 9/6/20 12:32 AM, John W. Eaton wrote:

> On 9/5/20 10:17 AM, John W. Eaton wrote:
>> On 9/4/20 9:59 PM, John W. Eaton wrote:
>>
>> Thinking that maybe we were just running near the edge on resource
>> limits, I resized the droplet from 2GB to 4GB RAM.  When buildbot
>> starts, the python process uses about 80MB.  But it seems to grow
>> fairly rapidly.  Here is the output from top after a few hours:
>>
>>    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
>>    931 buildbot  20   0 3141096   3.0g   9780 R 103.0  76.4 111:40.02 python3
>>
>> So it seems something is wrong there.  I don't think the buildbot
>> server process should need that much memory and I don't recall this
>> being a problem previously.
>>
>> I updated to a newer version.   We were using 2.7.0-1, now we have
>> 2.8.2-3.  I'll watch and see if that improves things.
>
> Hmm, it seems to be doing the same thing again.  After 90 minutes or so
> we are up to:
>
>    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
>   1148 buildbot  20   0 3142408   3.0g   9636 S  75.0  76.5  56:44.73 python3
>
> I have no idea why this is happening.  Is it buildbot?  Python 3?
> Something wrong with our configuration?
>
> jwe
>

Do I understand the setup correctly: only the Buildbot Master runs in
that DigitalOcean droplet, and nothing else?

In my experience, the Buildbot Master does not need that much CPU or
RAM.  My instance is not as busy as yours, but all of its files (SQLite
database, log files, etc.) are below 1 MB, the CPU is mostly idle, and
RAM consumption is minimal.

In another thread [1], a tool called buildbot-profiler [2] is used.
Maybe this can be of some help to find the problem?

Kai


[1] https://github.com/buildbot/buildbot/issues/3444
[2] https://pypi.python.org/pypi/buildbot-profiler/

Re: buildbot server down?

John W. Eaton
Administrator
On 9/5/20 12:40 PM, Kai Torben Ohlhus wrote:

> Do I understand the setup correctly: only the Buildbot Master runs in
> that DigitalOcean droplet, and nothing else?
>
> In my experience, the Buildbot Master does not need that much CPU or
> RAM.  My instance is not as busy as yours, but all of its files (SQLite
> database, log files, etc.) are below 1 MB, the CPU is mostly idle, and
> RAM consumption is minimal.
>
> In another thread [1], a tool called buildbot-profiler [2] is used.
> Maybe this can be of some help to find the problem?

Thanks, yes, the buildbot master should never grow like that.

I finally discovered what was happening.  It was trying to build copies
of Octave that included changeset 549c10384cc2, which has a stray
"keyboard" in a test.  So it was apparently filling up some internal
buffers for the results of the tests with "keyboard> " repeated endlessly.
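
The failure mode here is a producer that never stops writing while the consumer keeps everything in memory.  As a generic illustration (not a description of buildbot's internals, which the message above only guesses at), output can be collected with a hard size cap.  The name `capture_bounded` and the limit used are assumptions for this sketch.

```python
import itertools


def capture_bounded(chunks, max_chars):
    """Collect text from an iterable of chunks, stopping at max_chars.

    Returns (text, truncated), where truncated is True if input was cut off.
    """
    parts = []
    size = 0
    for chunk in chunks:
        remaining = max_chars - size
        if len(chunk) > remaining:
            # Keep only what still fits and stop reading further input.
            parts.append(chunk[:remaining])
            return "".join(parts), True
        parts.append(chunk)
        size += len(chunk)
    return "".join(parts), False


# Simulate a test stuck at the debugger prompt, printing it forever.
endless_prompt = itertools.repeat("keyboard> ")
text, truncated = capture_bounded(endless_prompt, max_chars=1000)
print(len(text), truncated)
```

With a cap like this, an endless stream of "keyboard> " prompts costs a fixed amount of memory instead of growing until the kernel kills the process.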

I didn't need the profiler, but it's good to know that it exists in case
we do need something like that in the future.

So, I reduced the size of the droplet again (no need to double the cost
if we don't need it) and I'm attempting to make it skip that revision.
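
How the revision was actually skipped is not stated.  One possible approach, sketched here under assumptions, is a predicate that rejects changes whose revision matches the bad changeset; Buildbot's `ChangeFilter` accepts a `filter_fn` callable that receives a change object, and a function like this could be passed to it.  The function name `should_build` is hypothetical, and `SimpleNamespace` merely stands in for Buildbot's change object.

```python
from types import SimpleNamespace

# Changeset named in the message above.  Mercurial hashes are often
# abbreviated, so the comparison is done by prefix.
BAD_REVISION_PREFIX = "549c10384cc2"


def should_build(change) -> bool:
    """Return False for the changeset with the stray "keyboard" call."""
    revision = str(getattr(change, "revision", "") or "")
    return not revision.startswith(BAD_REVISION_PREFIX)


# Stand-in for a change object, for illustration only.
example = SimpleNamespace(revision="549c10384cc2")
print(should_build(example))
```

Note that this only prevents builds triggered by that one change; later revisions still contain the stray call until it is removed from the test.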

jwe