Posts by skgiven
1)
Message boards : Docking@Home Science : Going out with a whimper, not a BANG! ( Message 7321 ) Posted 1145 days ago by skgiven
GPUGrid have been running CPU-based Docking research for a few months now, if anyone is interested.
2)
Message boards : Number crunching : server status? ( Message 7287 ) Posted 1195 days ago by skgiven
02/05/2014 10:16:50 | Docking | Sending scheduler request: Requested by user.
02/05/2014 10:16:50 | Docking | Reporting 35 completed tasks
02/05/2014 10:16:50 | Docking | Not requesting tasks: some task is suspended via Manager
02/05/2014 10:16:52 | Docking | Scheduler request completed
02/05/2014 10:16:52 | Docking | Server error: can't attach shared memory
I've suspended most of my remaining tasks for now. If and when this gets fixed (memory gets allocated to the server) and my completed work gets reported, I will allow the remaining WUs to run. I've only got about 60 non-started tasks on my main system and about 100 in total, so no biggie for me.
We know that the project hasn't worked properly since it went down for maintenance yesterday. Good luck getting it sorted.
3)
Message boards : Number crunching : server status? ( Message 7280 ) Posted 1196 days ago by skgiven
A few of us are having trouble reporting results. Getting - Server error: can't attach shared memory
Same problem here; tasks complete but don't upload, and the BOINC Manager Event Log says "Server error: can't attach shared memory":
01/05/2014 19:28:51 | Docking | Sending scheduler request: To report completed tasks.
01/05/2014 19:28:51 | Docking | Reporting 12 completed tasks
01/05/2014 19:28:51 | Docking | Requesting new tasks for NVIDIA
01/05/2014 19:28:52 | Docking | Scheduler request completed: got 0 new tasks
01/05/2014 19:28:52 | Docking | Server error: can't attach shared memory
I'm not keen on running tasks that can't be uploaded (been there before, several times), so I'll see how this pans out over the next few hours or a day, and if it's not fixed I'll suspend the project until it is.
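For anyone who wants to spot these failures without reading the Event Log by eye, here is a minimal sketch in Python. It assumes only the line layout visible in the excerpts above (timestamp | project | message, with day/month/year dates); the names `LogEntry`, `parse_line` and `find_server_errors` are made up for illustration and are not part of BOINC.

```python
from datetime import datetime
from typing import Iterable, List, NamedTuple


class LogEntry(NamedTuple):
    timestamp: datetime
    project: str
    message: str


def parse_line(line: str) -> LogEntry:
    # Split on the first two "|" separators only, so a "|" inside the
    # message text would be kept as part of the message.
    stamp, project, message = (part.strip() for part in line.split("|", 2))
    # Assumption: dates are day/month/year, as in the excerpts above.
    return LogEntry(datetime.strptime(stamp, "%d/%m/%Y %H:%M:%S"), project, message)


def find_server_errors(lines: Iterable[str]) -> List[LogEntry]:
    # Keep only the entries whose message begins with "Server error:".
    entries = (parse_line(line) for line in lines if line.strip())
    return [e for e in entries if e.message.startswith("Server error:")]


SAMPLE_LOG = [
    "01/05/2014 19:28:51 | Docking | Reporting 12 completed tasks",
    "01/05/2014 19:28:52 | Docking | Scheduler request completed: got 0 new tasks",
    "01/05/2014 19:28:52 | Docking | Server error: can't attach shared memory",
]

for entry in find_server_errors(SAMPLE_LOG):
    print(entry.timestamp.isoformat(), entry.project, entry.message)
```

Lines that do not follow the three-field layout would raise an error in this sketch; a real script would probably want to skip them instead.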
4)
Message boards : Cafe Docking : Docking@Home is Retiring - Share Your Stories ( Message 7234 ) Posted 1218 days ago by skgiven
For the most part this has been a set and forget project (a good thing). Unfortunately there have been times when tasks didn't progress (which caused problems). Hopefully the results will be of great use to other researchers. Sorry to hear this project is winding up, but nothing lasts forever.
5)
Message boards : Number crunching : cpu/gpu question ( Message 7174 ) Posted 1353 days ago by skgiven
When crunching, there is nothing worse than seeing a £200+ GPU sit idle to facilitate a CPU task that needs but one CPU thread. It's been the bane of my crunching experience for months.
I expect that Mikey is experiencing the 'new and improved' BOINC scheduler! There have been major changes to it over the past few months, and frankly it doesn't work well when crunching for both GPU and CPU projects. It's already resulted in some crunchers quitting altogether! It pushed me to build a system exclusively for GPU crunching (two GPUs, dual-core processor, no CPU work).
Unfortunately many projects don't play by the rules: they overestimate the task size (GFlops), force the task to run at high priority, and set very short deadlines... The scheduler can't handle these issues, and no doubt numerous other settings too.
To use the latest BOINC versions you really need to keep a very low cache, make sure the CPU is not saturated (better for GPU projects anyway, especially GPUGRID and POEM), stick to one CPU project, and not make any big changes to the BOINC setup, such as increasing the cache by 0.5 days or suspending, resuming or adding projects. It likes stability (as in no changes ever). It also helps to have massive GPU project weightings (10000) and very low CPU project weightings (1). Even with all these boxes ticked, it doesn't always prevent GPU work being suspended to facilitate CPU work, but it goes a long way.
A more foolproof solution is to manually download VirtualBox, install BOINC in a VM and run the CPU tasks there, keeping the GPU tasks (and no CPU tasks) on the host system. It's fairly easy and isolates CPU tasks completely. As you can set the CPU core count for the VM, you completely prevent GPU stoppages.
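To put the weightings above in perspective, here is a rough Python sketch of the arithmetic, assuming that a project's share is simply its weighting divided by the sum of all weightings. The project names are placeholders, and this says nothing about how the scheduler actually behaves from moment to moment.

```python
from typing import Dict


def share_fractions(weightings: Dict[str, float]) -> Dict[str, float]:
    # Each project's fraction is its weighting over the total of all weightings.
    total = sum(weightings.values())
    if total <= 0:
        raise ValueError("weightings must sum to a positive number")
    return {name: weight / total for name, weight in weightings.items()}


# Placeholder project names; the 10000 and 1 figures are the ones from the post.
fractions = share_fractions({"gpu_project": 10000, "cpu_project": 1})
for name, fraction in fractions.items():
    print(f"{name}: {fraction:.4%}")
```

With 10000 against 1, the GPU project's nominal share is 10000/10001, so the CPU project is left with about one part in ten thousand.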
6)
Message boards : Number crunching : HELP - Consistant 0% Progress - Client Problem? ( Message 6888 ) Posted 1765 days ago by skgiven
I have set Docking@Home to no new tasks and have aborted all current tasks. I had two more tasks run for hours this morning with 0 progress. Please let us know when you have fixed this problem.
Yes, please let us know when the problem has been fixed. Presently, I think the server isn't sending new tasks, which is good seeing as they don't work.
Can you send server aborts, to expedite the resolution?
GL
7)
Message boards : Number crunching : HELP - Consistant 0% Progress - Client Problem? ( Message 6871 ) Posted 1766 days ago by skgiven
I had this issue on one 2008x64 server: 3 tasks running on a quad-core Opteron, with no progress on any task after 18.5h, 14.4h and 14.3h. CPU usage was at 75% (the tasks), and memory was being used as expected. I aborted the said tasks. The next tasks started running but didn't progress either, so I restarted the system.
After the reboot one task had reached 1% progress by the time I had logged on (BOINC runs as a daemon). It took about 3min to reach this 1%, and the checkpoint was at 23sec. About 8min into the run the same task went to 3.475%. Neither of the other tasks had progressed (0%), so I suspended them. When I suspended the tasks, two new tasks immediately failed, but another 2 started, reached 1% and then 3.475%. A while later BOINC decided to run new docking tasks; these started but didn't progress after 10min, so I aborted them.
On a W7x64 system (i7-2600K) the tasks are running normally so far.
I prefer tasks that fail immediately to tasks that don't progress for hours on end. Anyway, try a restart, and if tasks don't progress after say 10 or 15min just abort them. Others should run, but babysitting seems to be the order of the day. Of note is that the tasks that don't progress don't checkpoint, so we might be able to abort them earlier?
My uninformed guess is that these perpetual tasks were built incorrectly, from a dataset that contains a non-standard amino acid, or an atom type/range/angle that Charmm can't handle... Perhaps their names would be useful in tracking the issue down?
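The rule of thumb above (no progress and no checkpoint after 10 to 15 minutes means the task is probably stuck) can be written down as a small Python sketch. The `TaskSnapshot` record and the task names are hypothetical; nothing here talks to BOINC, so the figures would have to be read off the Manager by hand or gathered some other way.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class TaskSnapshot:
    # Hypothetical record of what the Manager shows for one task.
    name: str
    elapsed_minutes: float
    progress_percent: float
    has_checkpointed: bool


def stalled_tasks(tasks: Iterable[TaskSnapshot],
                  grace_minutes: float = 15.0) -> List[TaskSnapshot]:
    # A task is flagged only if it has run past the grace period,
    # still shows 0% progress, and has never written a checkpoint.
    return [
        task for task in tasks
        if task.elapsed_minutes >= grace_minutes
        and task.progress_percent == 0.0
        and not task.has_checkpointed
    ]


snapshots = [
    TaskSnapshot("task_a", elapsed_minutes=8, progress_percent=3.475, has_checkpointed=True),
    TaskSnapshot("task_b", elapsed_minutes=18.5 * 60, progress_percent=0.0, has_checkpointed=False),
    TaskSnapshot("task_c", elapsed_minutes=5, progress_percent=0.0, has_checkpointed=False),
]

for task in stalled_tasks(snapshots):
    print("candidate for abort:", task.name)
```

The checkpoint test is kept alongside the progress test because a healthy task in the run described above checkpointed within the first minute, so a missing checkpoint may be the earlier warning sign.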
8)
Message boards : Cafe Docking : Personal Milestones ( Message 6589 ) Posted 1976 days ago by skgiven
500,000
9)
Message boards : Cafe Docking : Personal Milestones ( Message 6463 ) Posted 2158 days ago by skgiven
300K
10)
Message boards : Number crunching : Work Units That Never Want To End ( Message 5582 ) Posted 2791 days ago by skgiven
I had 3 tasks that ran for 21h, 19h and 17h on a Q9400 @ 3.46 GHz. All tasks showed 0.000% progress.
Application: Charmm 34a2 6.23
1k1i_89_mod0014trypsin_18046_294560_0
1k1i_89_mod0014trypsin_18910_123518_0
1k1i_89_mod0014trypsin_19550_291072_0
I exited BOINC and started it again. All task times went back to zero, and again no progress was made. Bye-bye!