Message boards : Announcements : Einstein@Home is down
Author | Message |
---|---|
Einstein@Home is down due to a failure of the air conditioning in the server room and the subsequent crash of the database server. The project server was shut down cleanly. There is no word yet on the extent of damage to the server, and no estimate of how long repairs will take (it's a holiday weekend in the US). | |
ID: 3346 | Rating: 0 | |
Many thanks. | |
ID: 3347 | Rating: 0 | |
Many thanks. We hope, but won't know until they get the power restored and try to bring the servers up. Bruce Allen has promised an update in 24 hrs. They will be down for at least that long. ____________ -- Eric Myers "Education is not the filling of a pail, but the lighting of a fire." -- William Butler Yeats | |
ID: 3349 | Rating: 0 | |
Those 24 hours are almost done. I hope everything will be well, because some of my processors are getting cold (no more work_units left...). | |
ID: 3363 | Rating: 0 | |
Is there any good news somewhere about Einstein? | |
ID: 3364 | Rating: 0 | |
This is getting a bit frustrating! Still no info on the server status of Einstein@Home. Most of my machines have run out of work_units. | |
ID: 3365 | Rating: 0 | |
Google is a multi-million dollar company. | |
ID: 3366 | Rating: -1 | |
I don't know how it is in the US, but from a university you may expect the server-room(s) to be well equipped and secured, with adequate backups, etc. For every university those computers are vital, without them a university cannot exist nowadays. So i.m.h.o. a server-crash should be handled within hours. | |
ID: 3367 | Rating: 0 | |
Sorry no news here, just comments on comments. | |
ID: 3368 | Rating: 0 | |
As of Wednesday morning (EDT) there is still no news on the status of Einstein@Home. | |
ID: 3369 | Rating: 0 | |
Good luck, folks. Anyone who's ever done any kind of support is empathizing and pulling for you. Hear, hear! As my old team leader used to say to the customer when stuff went wrong: "Do you want me to fix it, or do you want me to talk about fixing it?" ____________ Join BOINC Synergy Team | |
ID: 3370 | Rating: 0 | |
| |
ID: 3371 | Rating: 0 | |
"I don't know how it is in the US, but from a university you may expect the server-room(s) to be well equipped and secured, with adequate backups, etc. For every university those computers are vital, without them a university cannot exist nowadays. So i.m.h.o. a server-crash should be handled within hours." Had you read the actual news given by Eric, you'd have seen that it was the air conditioning that broke down, which caused the database server to crash, probably due to excessive heat. So wouldn't you want to fix the AC first? Or do you think they have backup ACs in place? Plus, one never knows what damage the database server took from overheating before it crashed. It's possible that it has lost all the tables for the data that's currently out there, which in essence would mean a restart of the S4 program. "Secondly: even with the server down it should still be possible to get some information out about how the repairs are going. It's this lack of information that's the most frustrating." The normal server that houses the forums and all sits in the same room, which is probably the size of a cupboard. If the AC unit is in this same room, I can imagine them taking all the computers out to make enough room to reach the AC. And don't think these AC units are the size of your AC unit, or your table fan. They are big refrigeration units. Eric, while you were around there, did you ever see this? "And, as a last point: I don't want to run other projects. I have chosen to support EAH and I stick to that (for the moment)." Then that's your own choice. Yet everyone complaining that EAH doesn't come back quickly enough for their liking, that they aren't handling things professionally, that they should do this and that, should just have patience. If your main complaint is that your computers are now sitting idle, then just shut them down. Essent won't like you, but your wallet may. (Just be patient.) ____________ Jord. The BOINC FAQ Service. | |
ID: 3372 | Rating: -1 | |
"And don't think these AC units are the size of your AC unit, or your table fan. They are big refrigeration units." Yes. I have no real new information, but maybe I can provide some context, hopefully without causing undue speculation. I expect the folks at UWM are working very hard on the problem, and Bruce will provide an update when he knows something definite. I will pass on whatever I learn when I hear it. First of all, as I understand it, the AC failure was in the *new* cluster room. So it is possible that there was a problem with the installation or with new AC hardware. But I do not know this. I do know that the Einstein@Home servers are only a small part of a much larger computer installation at UWM, which includes the 300-node "Medusa" Beowulf cluster. When I toured the machine room (what I expect is now the *old* cluster room) I found it very impressive (pictures here and here). The power consumption was also very impressive, and the AC system has to remove the excess heat from all of those nodes. I know that there were plans to upgrade the cluster to new hardware, but I do not know if the "new" cluster room has new computing hardware or the original Medusa nodes. In any case, it is possible that there is damage to many of the nodes in the cluster, not just the Einstein@Home servers. But I do not know this. I have not attempted to contact Bruce because I am sure he is extremely busy right now, and I also know that he likes to wait to release information until he knows the full extent of the situation. I will pass on whatever I hear as soon as I can. Meanwhile, BOINC was designed so that participants can crank on other projects when one project is off-line for whatever reason. Or those who only want to do work for Einstein@Home can wait until it comes back up, which it will. I'm sorry I don't have any further news or information. We will all just have to wait patiently and hope for the best.
____________ -- Eric Myers "Education is not the filling of a pail, but the lighting of a fire." -- William Butler Yeats | |
ID: 3373 | Rating: 0 | |
Thank you, Eric. So my estimate of a small room was a bit off as well. :) | |
ID: 3374 | Rating: -1 | |
Einstein is back up. You can upload your work. | |
ID: 3377 | Rating: -1 | |
"Thank you, Eric. So my estimate of a small room was a bit off as well. :)" For Einstein@Home, yes, your estimate was off. But it's not far off for Pirates@Home. ____________ -- Eric Myers "Education is not the filling of a pail, but the lighting of a fire." -- William Butler Yeats | |
ID: 3378 | Rating: 0 | |
Those who can, do. Those who can't, bully. From here | |
ID: 3379 | Rating: 0 | |
Re: The pics... | |
ID: 3384 | Rating: 0 | |
"Do you have a Beowulf? How many nodes?" No, I have 5 machines in a small (cozy?) office, but they are not configured as a Beowulf cluster. ____________ -- Eric Myers "Education is not the filling of a pail, but the lighting of a fire." -- William Butler Yeats | |
ID: 3385 | Rating: 0 | |
So you are saying that you have a small, nice and warm office that is cosy to be in during the winter cold?! ;-D ____________ Those who can, do. Those who can't, bully. From here | |
ID: 3409 | Rating: 0 | |