Pirates@Home

Database was unavailable

Profile Pepo
Chief Petty Officer
Volunteer tester
Joined: 13 Sep 04
Slovakia
TeamVision42
Credit: 924.2
RAC: 0.00
Joined: Sep 13, 2004
Verified: Aug 4, 2009
Dubloons: 3
Pieces of Eight: 5
Punishment: Cat o' Nine Tails
Message 6730 - Posted: 4 Nov 2007 | 19:16:22 UTC

When I tried to go directly to http://pirates.spy-hill.net/forum_thread.php?id=847, I got a polite response:

Database unavailable

Unable to connect to database - please try again later

Error: 1045
Access denied for user 'apache'@'localhost' (using password: NO)

It was 3-4 minutes ago. I tried multiple times, with the same result. Instead of my logo and the "Logout" button at the upper right corner, there was only a "Login" button, and pressing it did not allow me to log in. Then suddenly everything was fine.

What happened? A DB restart, or high load?
Uptime

alvarez:
14:05:06 up 33 days, 6:06, 5 users, load average: 11.72, 4.91, 2.18

____________
Peter .-)

Profile Wormholio
Captain
Joined: 6 Jun 04
United States
Away
Credit: 4,009.8
RAC: 0.00
Joined: Jun 6, 2004
Verified: Mar 13, 2008
Dubloons: 3
Pieces of Eight: 10
Punishment: Aztec curse
Message 6742 - Posted: 5 Nov 2007 | 19:07:56 UTC
Last modified: 5 Nov 2007 | 19:10:22 UTC

I'm seeing a pattern here. We get these occasional high load conditions when the database is full of work (done or otherwise). But not when the database is empty of work (as it was recently after a long lull).

It makes sense that the database daemon has to work harder when it has to sort through the Result tables, and this occasionally causes a backlog and load spike. It's interesting enough to study further. Thanks for noting it.

____________
-- Eric Myers

"Education is not the filling of a pail, but the lighting of a fire." -- William Butler Yeats

Profile Pepo
Message 6756 - Posted: 7 Nov 2007 | 16:40:06 UTC
Last modified: 7 Nov 2007 | 16:40:27 UTC

Again just now, and no load this time:

Database unavailable
Unable to connect to database - please try again later

Error: 1045
Access denied for user 'apache'@'localhost' (using password: NO)


Uptime
alvarez:
11:30:03 up 36 days, 3:31, 6 users, load average: 0.11, 0.31, 0.61

Spy Hill Research Cluster Status
Uptime & Load
alvarez up 36+03:35, 2.00,
grendel up 6+02:22, 0.35,
harrahs up 1+17:32, 0.11,
moonflower up 36+03:34, 0.13,
Wed Nov 7 11:33:56 EST 2007

Database server status
Querys/sec (avg): 14.780
Open tables: 64
Threads: 37
Uptime: 36 days 4 hours 31 min


Captain's Cabin temperature monitor
Current temperature is 69.5 F, as of 11:24 AM EST on Wednesday, 7 November 2007

Cold enough :-)

And:
Workflow status
State                                    #
Total Workunits                     36,088
Total Results                       73,384
Results ready to send                    0
Results in progress                     37
Results ready for validation            36
Workunits waiting for validation         0
Workunits waiting for assimilation       0
Workunits waiting for deletion         472
Results waiting for deletion           982


CACHED 11:31 AM EST on 7 Nov 2007 (updated every 10 minutes)

____________
Peter .-)

Profile Wormholio
Message 6763 - Posted: 8 Nov 2007 | 21:47:16 UTC

To track this problem further I've added the number of "slow" queries to the database section of the server status page. I previously didn't list this, as it was just the total count since the server was started, but now I divide by the uptime to get the average number per hour. We'll see how this does as we go through normal operations.
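
The averaging described above can be sketched in a few lines. This is only an illustration, assuming the database is MySQL and that the two inputs come from its `Slow_queries` and `Uptime` status counters; the function name and the sample values are hypothetical and are not taken from the Pirates@Home status page code.

```python
def slow_queries_per_hour(slow_queries: int, uptime_seconds: int) -> float:
    """Average number of slow queries per hour since the server started.

    slow_queries   -- cumulative count since startup (assumed: MySQL Slow_queries)
    uptime_seconds -- seconds since startup (assumed: MySQL Uptime)
    """
    if uptime_seconds <= 0:
        raise ValueError("uptime_seconds must be positive")
    return slow_queries * 3600.0 / uptime_seconds


# Hypothetical counter values, chosen only to show the arithmetic.
rate = slow_queries_per_hour(slow_queries=120, uptime_seconds=48 * 3600)
print(f"Slow queries/hr: {rate:.2f}")
```

Because this is an average over the whole uptime, a short burst of slow queries is diluted more and more as the server stays up longer.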

I have other changes in mind as well.

____________
-- Eric Myers

"Education is not the filling of a pail, but the lighting of a fire." -- William Butler Yeats

Profile Wormholio
Message 6771 - Posted: 9 Nov 2007 | 20:29:41 UTC

I have increased the maximum number of database connections from (presumably) 100 to 150, which I hope will cut down on the number of "database unavailable" errors and load spikes.
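
One way to judge whether a higher limit leaves any room is to compare the peak number of connections seen since startup with the configured limit. The sketch below assumes a MySQL server, where those two numbers are exposed as the `Max_used_connections` status variable and the `max_connections` system variable; the function name and the peak value of 100 are hypothetical.

```python
def connection_headroom(max_used_connections: int, max_connections: int) -> float:
    """Fraction of the connection limit that was never in use since startup."""
    if max_connections <= 0:
        raise ValueError("max_connections must be positive")
    return 1.0 - max_used_connections / max_connections


# Hypothetical peak of 100 simultaneous connections, checked against
# the old (presumed) limit of 100 and the new limit of 150.
for limit in (100, 150):
    print(f"limit {limit}: headroom {connection_headroom(100, limit):.0%}")
```

A headroom of zero means the limit was reached at least once, which is when "Too many connections" style errors would be expected.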

____________
-- Eric Myers

"Education is not the filling of a pail, but the lighting of a fire." -- William Butler Yeats

Profile Pepo
Message 6797 - Posted: 12 Nov 2007 | 11:04:58 UTC

And again a few times, about an hour ago.

CACHED 4:59 AM EST on 12 Nov 2007 (updated every 10 minutes)
Database server status
Querys/sec (avg): 16.576
Slow queries/hr: 8.67
Threads: 34
Open tables: 64
Uptime: 2 days 14 hours 3 min

Captain's Cabin temperature monitor
Current temperature is 59.2 F,
as of 4:36 AM EST on Monday, 12 November 2007 .

Workflow status
State                              #
Total Workunits               25,280
Total Results                 51,518
Results ready to send              0
Results in progress              682
Results ready for validation     482
Workunits waiting for ****         0

Uptime
alvarez:
05:00:05 up 40 days, 21:01, 2 users, load average: 1.07, 0.53, 0.84


I can see nothing wrong in these numbers; the clue must be somewhere else.
____________
Peter .-)

Profile Ageless
Chief Petty Officer
Volunteer tester
Joined: 20 Jul 04
Netherlands
Machinae Supremacy
Credit: 1,295.9
RAC: 0.00
Joined: Jul 20, 2004
Verified: Jul 9, 2011
Dubloons: 3
Pieces of Eight: 7
Punishment: Cat o' Nine Tails
Message 6805 - Posted: 13 Nov 2007 | 17:09:50 UTC

Just had another one.

I couldn't even post to a thread. I got this as the answer and was shown as logged off as well:

Database unavailable

Unable to connect to database - please try again later

Error: 1040
Too many connections

____________
Jord.

The BOINC FAQ Service.

Profile Wormholio
Message 6806 - Posted: 14 Nov 2007 | 1:03:29 UTC

Thanks for reporting these events. I wish they were not happening, and I had hoped that increasing the max clients to 150 would fix it, but it helps to know what's happening in any case. I'll have to think more about what might help fix this.

____________
-- Eric Myers

"Education is not the filling of a pail, but the lighting of a fire." -- William Butler Yeats

Profile Scott Brown
Volunteer tester
Joined: 25 Jul 04
United States
Duke University
Credit: 38,955.3
RAC: 0.00
Joined: Jul 25, 2004
Verified: Oct 7, 2011
Dubloons: 3
Pieces of Eight: 11
Punishment: Keel Haul
Message 6807 - Posted: 14 Nov 2007 | 1:16:55 UTC - in response to Message 6806.



You could go to considerably higher max clients (500-1000 on Solaris or up to 4000 on some Linux MySQL setups), but setting the value higher will be irrelevant if your RAM is being maxed out. Since adding 50% more connections than before didn't help, it sounds like it might be specific client activity rather than the number of clients that is causing the problem. Are there any high-RAM-load activities that clients might be doing now that weren't happening before? Do the connection problems correlate with the addition of the mechanical turk (maybe too many clients are making rapid connections with this at times)?
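
The trade-off between the connection limit and RAM can be made concrete with a rough worst-case estimate: memory shared by the whole server plus a per-connection allowance multiplied by the limit. This is a back-of-the-envelope sketch, not a description of this server; the function name and every number below (256 MB shared, 2.5 MB per connection, 1024 MB of RAM) are hypothetical placeholders.

```python
def worst_case_memory_mb(global_buffers_mb: float,
                         per_connection_mb: float,
                         max_connections: int) -> float:
    """Rough upper bound on database memory if every connection slot is in use."""
    return global_buffers_mb + per_connection_mb * max_connections


PHYSICAL_RAM_MB = 1024  # hypothetical machine size

for limit in (100, 150, 500, 1000):
    need = worst_case_memory_mb(global_buffers_mb=256,
                                per_connection_mb=2.5,
                                max_connections=limit)
    verdict = "fits" if need <= PHYSICAL_RAM_MB else "exceeds RAM"
    print(f"max clients {limit:>4}: about {need:.0f} MB ({verdict})")
```

With these placeholder values the estimate stays under the RAM size at 100 and 150 clients and goes over it at 500 and 1000, which is the sense in which a higher limit stops helping once memory is the bottleneck.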


____________

Profile Wormholio
Message 6810 - Posted: 14 Nov 2007 | 16:35:25 UTC - in response to Message 6807.

Scott Brown wrote:

You could go to considerably higher max clients (500-1000 on Solaris or up to 4000 on some Linux MySQL setups), but setting the value higher will be irrelevant if your RAM is being maxed out. Since adding 50% more connections than before didn't help, it sounds like it might be specific client activity rather than the number of clients that is causing the problem. Are there any high-RAM-load activities that clients might be doing now that weren't happening before? Do the connection problems correlate with the addition of the mechanical turk (maybe too many clients are making rapid connections with this at times)?


I suspect the overall status page is the culprit. Not by itself, though it does do a lot with the database, but combined with the regular load of machines constantly hitting the scheduler to ask for new work. That, in turn, is likely higher than "normal" because Pirates workunits are both short and few.

I can try increasing the max clients further. I hadn't thought yet of increasing physical memory, but now I will. Thanks.
____________
-- Eric Myers

"Education is not the filling of a pail, but the lighting of a fire." -- William Butler Yeats

Profile Scott Brown
Message 6824 - Posted: 15 Nov 2007 | 2:33:09 UTC - in response to Message 6810.


Wormholio wrote:
I suspect the overall status page is the culprit. Not by itself, though it does do a lot with the database, but combined with the regular load of machines constantly hitting the scheduler to ask for new work. That, in turn, is likely higher than "normal" because Pirates workunits are both short and few...


That is why I was curious about the mechanical turk addition. It looks like the database is hit every time a CAPTCHA is completed, to award credit under the current reward scheme? If that is the case, then the additional load from the turk might be enough (in addition to the normal hits rather than by itself) to push the load over the maximum when several members of the crew are typing them in.

Profile Wormholio
Message 6828 - Posted: 15 Nov 2007 | 17:36:05 UTC - in response to Message 6824.
Last modified: 15 Nov 2007 | 17:36:22 UTC

Scott Brown wrote:
That is why I was curious about the mechanical turk addition. It looks like the database is hit every time a CAPTCHA is completed, to award credit under the current reward scheme? If that is the case, then the additional load from the turk might be enough (in addition to the normal hits rather than by itself) to push the load over the maximum when several members of the crew are typing them in.

It's true that there is a DB hit with each CAPTCHA, but the number of people using the turk is small compared to the number of active hosts hitting the scheduler. That, in turn, changes with the amount of work available and length of workunits.

In the past the worst load spikes seemed to appear when we were releasing lots of work. I thought at first it was just the increased load of releasing the work, but I now think it's also the increased work required per hit to search the larger volume of data in the database.

So I think the M/T load is small in comparison, but it's good to try to think of everything that might be relevant.
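
The two effects discussed above, more hits and more database work for each hit, multiply rather than add. Little's law says the average number of connections open at once is roughly the arrival rate times the average time each request holds a connection. The sketch below only illustrates that arithmetic; the function name, the request rates and the hold times are all hypothetical, and only the limit of 150 comes from this thread.

```python
def average_open_connections(requests_per_second: float,
                             seconds_per_request: float) -> float:
    """Little's law: mean concurrency = arrival rate * mean time in system."""
    return requests_per_second * seconds_per_request


MAX_CONNECTIONS = 150  # the limit mentioned earlier in this thread

# (label, requests per second, seconds each request holds a connection)
scenarios = [
    ("little work, fast queries", 5.0, 2.0),
    ("more hits, fast queries", 10.0, 2.0),
    ("more hits, slow queries", 10.0, 16.0),
]
for label, rate, hold in scenarios:
    open_now = average_open_connections(rate, hold)
    flag = "over the limit" if open_now > MAX_CONNECTIONS else "ok"
    print(f"{label}: about {open_now:.0f} open connections ({flag})")
```

In this made-up example doubling the hit rate alone stays well under the limit, while the same rate with slower queries goes over it.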
____________
-- Eric Myers

"Education is not the filling of a pail, but the lighting of a fire." -- William Butler Yeats

Profile JKeck {pirate}
Volunteer tester
Joined: 19 Jul 04
United States
Team Starfire World BOINC
Credit: 3,011.5
RAC: 0.00
Joined: Jul 19, 2004
Verified: Jan 17, 2009
Dubloons: 3
Pieces of Eight: 4
Punishment: Mess Duty
Message 6835 - Posted: 16 Nov 2007 | 8:16:43 UTC - in response to Message 6828.

Wormholio wrote:

In the past the worst load spikes seemed to appear when we were releasing lots of work. I thought at first it was just the increased load of releasing the work, but I now think it's also the increased work required per hit to search the larger volume of data in the database.

It may also be because the clients try again sooner. When there is no work, the average interval between connections is going to be close to 4 hours. When there is plenty of work that interval will drop dramatically, possibly to nearly as short as the average task.
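
To get a feel for the size of this effect, the scheduler request rate can be approximated as the number of active hosts divided by the time each host waits between requests. This is a rough sketch under stated assumptions: the host count of 1000 and the 10-minute task length are hypothetical, and only the 4-hour interval comes from the paragraph above.

```python
def scheduler_requests_per_second(active_hosts: int,
                                  seconds_between_requests: float) -> float:
    """Approximate request rate if every host contacts the scheduler once per interval."""
    if seconds_between_requests <= 0:
        raise ValueError("seconds_between_requests must be positive")
    return active_hosts / seconds_between_requests


HOSTS = 1000                # hypothetical number of active hosts
NO_WORK_INTERVAL = 4 * 3600  # about 4 hours between contacts when there is no work
TASK_LENGTH = 10 * 60        # hypothetical average task length of 10 minutes

idle = scheduler_requests_per_second(HOSTS, NO_WORK_INTERVAL)
busy = scheduler_requests_per_second(HOSTS, TASK_LENGTH)
print(f"no work:        {idle:.3f} requests/sec")
print(f"plenty of work: {busy:.3f} requests/sec ({busy / idle:.0f}x higher)")
```

The ratio depends only on the two intervals, not on the assumed host count, so shorter tasks raise the scheduler load in direct proportion.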
____________
BOINC WIKI

Dirty John Rackham

Profile Wormholio
Message 6837 - Posted: 17 Nov 2007 | 15:26:28 UTC - in response to Message 6835.

Dirty John wrote:
It may also be because the clients try again sooner. When there is no work, the average interval between connections is going to be close to 4 hours. When there is plenty of work that interval will drop dramatically, possibly to nearly as short as the average task.

That seems to be the case. The DB query rate was around 14/sec with no work, and now it's closer to 20/sec. I'm stopping work generation for a while and we'll see how that changes, if it does.

____________
-- Eric Myers

"Education is not the filling of a pail, but the lighting of a fire." -- William Butler Yeats


Copyright © 2013 Capt. Jack Sparrow