HELP - Consistent 0% Progress - Client Problem?


Gandelf Joined: Apr 11 09 Posts: 1 ID: 9674 Credit: 3,684,116 RAC: 0	Message 5333 - Posted 19 Aug 2009 8:01:44 UTC
	I have a laptop which is refusing to budge past zero percent unit progress despite eating CPU. Symptoms: 2 processes running at approx. 50% CPU, each using 653 KB RAM; the executables are charmm34_6.23_windows_x86_64; the processes run continuously for more than 6 hours with zero progress, and the time remaining does not change from approx. 4 hours. Setup: the CPU is a T9800 dual core at 2.93 GHz; the laptop runs genuine licensed 64-bit Windows 7 Ultimate (from MSDN). Tried: I have reset the project, detached and reattached, uninstalled and reinstalled, and turned off DEP, and still no luck. If I close the client and stop the processes, the units restart from 0 time again. Please help...

Trilce Estrada Forum moderator Project administrator Project developer Project tester Joined: Sep 19 06 Posts: 189 ID: 119 Credit: 1,217,236 RAC: 0	Message 5351 - Posted 25 Aug 2009 16:37:20 UTC Last modified: 25 Aug 2009 16:37:29 UTC
	Hi Gandelf, is this the machine with the problem? I don't know exactly what could be causing it. We know for sure that there are occasionally workunits that take much longer than the average, but it would be really improbable for more than one to be assigned to the same machine in such a short interval of time. We will look at your tasks to see if we can find a possible cause. If you find something, please let us know. Thank you.

David Ball Forum moderator Volunteer tester Joined: Sep 18 06 Posts: 274 ID: 115 Credit: 1,634,401 RAC: 0	Message 5375 - Posted 6 Sep 2009 19:12:35 UTC
	Does that machine really only have 512MB ram? Maybe it's swapping to disk. ____________ The views expressed are my own. Facts are subject to memory error :-) Have you read a good science fiction novel lately?

SazanEyes Joined: Feb 17 09 Posts: 2 ID: 7266 Credit: 830,596 RAC: 0	Message 5389 - Posted 17 Sep 2009 4:24:41 UTC
	I'm seeing the same issue. Three of four tasks show at least 71 hours elapsed, with about 45 minutes to completion, but progress is still 0%. The fourth task hasn't started yet. The tasks are here: http://docking.cis.udel.edu/community/results.php?userid=7266 One specific task that has been running: http://docking.cis.udel.edu/community/result.php?resultid=7802724 My PC: http://docking.cis.udel.edu/community/show_host_detail.php?hostid=44672 My situation is not a RAM issue. The Docking processes use very little RAM. This box is also running yoyo@home and The Lattice Project with no issues.

Trilce Estrada Forum moderator Project administrator Project developer Project tester Joined: Sep 19 06 Posts: 189 ID: 119 Credit: 1,217,236 RAC: 0	Message 5390 - Posted 17 Sep 2009 16:41:45 UTC - in response to Message ID 5389 .
	Hi Suzan, we have not been able to reproduce such odd behavior, and we don't have an answer about what could be wrong. Please abort those workunits or detach and reattach the project. Let us know whether doing this changes the situation. Thanks a lot.

SazanEyes Joined: Feb 17 09 Posts: 2 ID: 7266 Credit: 830,596 RAC: 0	Message 5399 - Posted 17 Sep 2009 23:43:18 UTC
	I aborted the workunits. I'll let you know if I see the same problem again.

j2satx Volunteer tester Joined: Dec 22 06 Posts: 183 ID: 339 Credit: 16,191,581 RAC: 0	Message 5401 - Posted 18 Sep 2009 14:23:49 UTC - in response to Message ID 5333 .
	"I have a laptop which is refusing to budge past zero percent unit progress despite eating cpu. Symptoms are:- 2 processes running at approx 50% cpu, each using 653kb ram, exe's are charmm34_6.23_windows_x86_64, processes run continuously >6 hours with zero progress, time remaining does not change from approx 4 hours. Setup:- CPU is T9800 dual core 2.93Ghz, Laptop is genuine licensed 64bit Windows 7 ultimate (from MSDN). Tried:- I have reset the project, detached and reattached, uninstalled and reinstalled,turned off DEP and still no luck. If I close the client and stop the processes the units restart from 0 time again. Please help..."
	I have the same issue on one Intel CPU Q9550. My other Intel CPUs (Q6600, Q6700, Q9300, Q9450) with the same mobo run Docking fine. All W7RC 64-bit. I did not find a solution to the issue; I just stopped running Docking on the Q9550.

Twodee Joined: Jul 2 09 Posts: 2 ID: 14800 Credit: 4,938,070 RAC: 0	Message 5444 - Posted 13 Oct 2009 20:41:34 UTC
	Same problem here. Q9650, 4 gigs of RAM, all at stock, running on Windows 7 64-bit. Several tests and always the same result: 0% after hours. But similar systems (Q6600, Q9300, etc.) with the same operating system and software base work fine.

Erkan Yilmaz Joined: Mar 29 09 Posts: 3 ID: 9000 Credit: 13,217 RAC: 0	Message 5478 - Posted 25 Oct 2009 4:20:44 UTC Last modified: 25 Oct 2009 5:04:35 UTC
	Also no progress here.
	About the 2 processes:
	1. one belongs to the current calculation
	2. one did not stop from a previous BOINC session the day before
	So I killed process 2, but that did not help. When I pause 1. in BOINC, I still see CPU being consumed by the Docking app. So I will also pause Docking until the problem is solved. No other info gathered about this anomaly so far. On request I could provide more info (e.g. remote connection).
	System: Q9400 with Win7 64 RC, activated, admin user, BOINC 6.10.11
	Happened with these WUs:
	http://docking.cis.udel.edu/community/result.php?resultid=8635180
	http://docking.cis.udel.edu/community/result.php?resultid=8621734
	http://docking.cis.udel.edu/community/result.php?resultid=8246167
	Erkan YILMAZ iaskquestions.com

Trilce Estrada Forum moderator Project administrator Project developer Project tester Joined: Sep 19 06 Posts: 189 ID: 119 Credit: 1,217,236 RAC: 0	Message 5487 - Posted 28 Oct 2009 0:07:41 UTC - in response to Message ID 5478 .
	Yes, sadly it is still a mystery. It could be a particular initial random configuration of the ligand or something about BOINC, but we have not been able to reproduce this behavior on our machines so far.

yuuwaku Joined: Sep 19 09 Posts: 3 ID: 18776 Credit: 4,810 RAC: 0	Message 5499 - Posted 1 Nov 2009 19:12:39 UTC
	Just thought I'd add that I've been having the same problem. I currently have a work unit beginning 1c5q that has been running for 12 hours without any progress being made on it. This has happened with several work units but I think I remember them all beginning with 1c5q. Interestingly, when I suspend the project BOINC assigns two cores to other projects and charmm keeps eating up an entire core by itself even though I have BOINC set to use a maximum of two cores right now. It also uses a constant 704K of memory, which seems low but I haven't been paying much attention to what it usually uses. System specs: Windows 7 Ultimate 64 bit Intel Core 2 Quad Q9550 8 Gigs of ram

yuuwaku Joined: Sep 19 09 Posts: 3 ID: 18776 Credit: 4,810 RAC: 0	Message 5500 - Posted 2 Nov 2009 1:02:02 UTC
	Well it isn't a particular type of work unit as I thought it might be. The same thing happened on a new one I tried and that began 1hvi. I tried aborting the task and the cpu remained in use again. I guess docking@home will have to be suspended for a little while.

Jakester Joined: Nov 7 09 Posts: 1 ID: 21258 Credit: 0 RAC: 0	Message 5509 - Posted 8 Nov 2009 14:38:53 UTC
	This is happening to me too. No matter what I try, it won't budge past zero. I detach and reattach, I suspend and resume, I abort and retry. Nothing, and it still eats my processor. Docking@home is suspended for now; let me know when this problem is fixed. Oh, my setup, if that matters:
	Pentium Dual-Core E6300 @ 2.83 GHz
	4 GB Dual-Channel 800 MHz RAM (brand is OCZ)
	Radeon HD 4650 512MB (brand is HIS)
	Never heard of these brands? That's because they're generic: I built this machine myself. Runs great, and it should; it's a borderline entry-level gamer machine. Anyway, try to sort this out so I can help. Thanks!

ampakal Joined: Nov 2 09 Posts: 1 ID: 21040 Credit: 92,531 RAC: 0	Message 5512 - Posted 10 Nov 2009 2:42:02 UTC Last modified: 10 Nov 2009 2:43:27 UTC
	Same thing here. Here are the details on my laptop:
	Intel Core 2 Duo 2.4 GHz
	6 GB RAM
	Nvidia 9600M GT
	Windows 7 64
	Machine id is 49006
	Every single WU has had the same problem. It just runs and uses processor time but never progresses beyond 0.00%. They even appear to start over when the laptop is restarted. Would love to contribute. I hope this can be fixed soon.

yuuwaku Joined: Sep 19 09 Posts: 3 ID: 18776 Credit: 4,810 RAC: 0	Message 5526 - Posted 21 Nov 2009 9:02:22 UTC
	So far everyone who has posted their OS has reported using windows 7 64 bit. Perhaps that is where the problem lies.

The Dirts Joined: Apr 13 09 Posts: 1 ID: 9813 Credit: 692,497 RAC: 0	Message 5529 - Posted 21 Nov 2009 20:20:34 UTC
	Just thought I'd add I also have this problem. Yes, I am using Win 7 64 bit, but it's only a recent problem. Started about 3 to 4 days ago. Stopped the tasks at 64 hrs running with 0% complete. Also getting this message:
	21/11/2009 19:29:11 Docking@Home Sending scheduler request: Requested by user.
	21/11/2009 19:29:11 Docking@Home Requesting new tasks for CPU
	21/11/2009 19:29:16 Docking@Home Scheduler request completed: got 0 new tasks
	21/11/2009 19:29:16 Docking@Home Message from server: No work sent
	21/11/2009 19:29:16 Docking@Home Message from server: (reached daily quota of 2 results)
	Cheers, TheDirts - TPR
	Win 7 - 64bit, 8Gb Ram, E8400

Marco Vuano Joined: Mar 4 09 Posts: 2 ID: 7913 Credit: 16,969 RAC: 0	Message 5532 - Posted 21 Nov 2009 23:57:30 UTC
	I have a desktop PC with a P5Q motherboard, an Intel Core2Duo E8400, 4 GB DDR2 RAM, a Sapphire Radeon HD 4670. When I was using Windows Vista Ultimate Service Pack 1 32 bit, Docking@Home was working perfectly, but when I switched to Windows 7 Ultimate 64 bit the Work Units stopped working correctly: the cpu time (in the "Properties" section of the WU) was always "---" and the graphics window showed "No Model Formed Yet.", even after many hours of continued processing. The work Units used 100% of the CPU core they were assigned to and used only some MBs of RAM (more than 2 GB of RAM were free). I tried to reset the project many times, with no success. The other BOINC projects were running fine. Then I tried running Docking@Home on Ubuntu 9.10 64 bit: this time the Work Units were correctly processed (the complexes of the work Units processed with Linux are so far 1pph, 1qb6 and 1ce5) while on Windows 7 they still aren't working (even the complexes 1qb6 and 1ce5). I think this problem is related to Windows 7 64 bit.

[AF>France>Aquitaine>Cote-Adour-et-Gaves]Bernard 64250 Joined: Nov 22 09 Posts: 3 ID: 21818 Credit: 500,076 RAC: 0	Message 5545 - Posted 23 Nov 2009 15:06:28 UTC Last modified: 23 Nov 2009 15:31:19 UTC
	I just joined docking@home. I have 2 active WUs. Both are blocked, at 43,07% and 1% progress respectively, while the elapsed time counters have kept running for a few hours without any special activity on my PC. Is this normal? Is there any special HW requirement? May I abort?

Marco Vuano Joined: Mar 4 09 Posts: 2 ID: 7913 Credit: 16,969 RAC: 0	Message 5550 - Posted 23 Nov 2009 22:16:19 UTC
	@[AF>France>Aquitaine>Cote-Adour-et-Gaves] Bernard du 40 I don't think it would be wise to abort. Your progress is far from 0%. Check whether the CPU time counter (in the properties section of the "Work Units" tab in the Advanced View of BOINC Manager) is blocked, and check whether there are other processes with higher priority that are "stealing" CPU time from Docking@Home.

arcturus Joined: Sep 22 08 Posts: 4 ID: 1145 Credit: 767,313 RAC: 0	Message 5563 - Posted 1 Dec 2009 2:00:06 UTC
	Confirming the same problem with a Q9550, 4 gigs of RAM on Win 7 64 bit. 0% after a number of hours. However - no problem on a Phenom II 940 on Win 7 64 bit. Looks to be a lot of processing power going to waste without a solution.

Jaxis Joined: Oct 20 09 Posts: 1 ID: 20049 Credit: 559,626 RAC: 0	Message 5564 - Posted 1 Dec 2009 20:23:17 UTC
	Another confirmation of WUs with 0% progress.
	Windows7 64-bit
	Intel Core 2 Quad CPU Q8400
	ATI Radeon HD 4650
	8 GB RAM
	Boinc Ver. 6.10.18
	"Properties" of task 1k1m_52_mod0014trypsin_6028_100105_0
	Application: Charmm 34a2 6.23
	Workunit Name: 1k1m_52_mod0014trypsin_6028_100105
	State: Running
	Received: 12/1/2009 12:58:37PM
	Report deadline: 12/15/2009 11:38:56AM
	CPU time at last checkpoint: ---
	CPU time: ---
	Elapsed time: 00:12:40
	Estimated time remaining: 3:59:11
	Fraction done: 0.000%
	Virtual memory size: 153.32 MB
	Working set size: 2.63 MB
	Directory: slots/10
	When prompted to "Show Graphics" it states "No Model Formed Yet."
	Continuously restarts approx. every 3 minutes:
	12/1/2009 1:00:24 PM Docking@Home Starting 1k1m_52_mod0014trypsin_6028_100105_0
	12/1/2009 1:00:25 PM Docking@Home Starting task 1k1m_52_mod0014trypsin_6028_100105_0 using charmm34 version 623
	12/1/2009 1:06:33 PM Docking@Home Restarting task 1k1m_52_mod0014trypsin_6028_100105_0 using charmm34 version 623
	12/1/2009 1:09:37 PM Docking@Home Restarting task 1k1m_52_mod0014trypsin_6028_100105_0 using charmm34 version 623
	12/1/2009 1:12:41 PM Docking@Home Restarting task 1k1m_52_mod0014trypsin_6028_100105_0 using charmm34 version 623
	12/1/2009 1:15:45 PM Docking@Home Restarting task 1k1m_52_mod0014trypsin_6028_100105_0 using charmm34 version 623

vaughan Volunteer tester Joined: Oct 3 06 Posts: 9 ID: 177 Credit: 3,108,281 RAC: 0	Message 5569 - Posted 6 Dec 2009 11:13:18 UTC Last modified: 6 Dec 2009 11:14:47 UTC
	31 hours and still at 0 percent. Estimated run-time is 3 hours. WTF. I stopped BOINC and killed BOINCtray from Windows Task Manager (why doesn't this process close when you shut down BOINC?). Win7 64 Ultimate, Intel C2D Wolfdale E8600 @ 4GHz. After restarting BOINC, the tasks resume at 0 percent done and the time is 0 again. Did I just waste 31 hours of crunching? Developers, please address this issue ASAP.

Yoran Joined: Dec 4 09 Posts: 3 ID: 22405 Credit: 1,117 RAC: 0	Message 5573 - Posted 10 Dec 2009 10:33:19 UTC
	Just to let you know, I'm having the same problem. Specs: Intel E8600, Nvidia GTX295, 4 GB of RAM, Windows 7 Ultimate x86. Info: charmm is using 50% CPU time and only 500 KB RAM... I will check this thread once in a while; until then I'll disable docking@home :( Please solve this problem quickly :(

steve Joined: Jun 22 09 Posts: 4 ID: 13910 Credit: 62,355 RAC: 0	Message 5587 - Posted 19 Dec 2009 19:53:12 UTC
	I see this is an ongoing problem. I just aborted all Docking WUs because no progress was shown after 13+ hours of CPU time on a 3 hr WU. I'll check back after the first of the year to see if any solutions are offered. Specs: Win 7 64bit, Intel Q9550 CPU, 8 Gig of RAM, ATI Radeon HD 5850 graphics card, BOINC 6.10.19. Bill

Luciano Joined: Dec 29 09 Posts: 2 ID: 23483 Credit: 224,791 RAC: 0	Message 5604 - Posted 31 Dec 2009 10:31:17 UTC - in response to Message ID 5573 .
	I'm having the same problem. Spec: Processor: Intel Core2 Duo CPU T9550; Cache: 6.00 MB; OS: Windows 7 Ultimate x64 Edition; Memory: 4 GB; Kaspersky 2010 (no scan on folder boinc/*). The docking process restarts continuously about every 3 min. Charmm is using only 500 KB RAM. I have disabled docking@home.

steve Joined: Jun 22 09 Posts: 4 ID: 13910 Credit: 62,355 RAC: 0	Message 5622 - Posted 8 Jan 2010 3:37:31 UTC
	Downloaded and started new work. Three units ran for 13+ hours while showing no progress. After aborting the Docking work units, the charmm processes continued to run for several hours until I killed them in Task Manager. They had all four cores pegged at 100%. Since there is no interest in fixing this bug from Docking@Home, I'll be detaching all computers. I'm sure my CPU cycles can be put to use on another project. Bill

robertmiles Joined: Apr 16 09 Posts: 96 ID: 9967 Credit: 1,290,747 RAC: 0	Message 5645 - Posted 15 Jan 2010 15:18:49 UTC
	From what I've seen, the problem may be specific to the combination of Windows 7 and a sufficiently recent Intel CPU. Anyone ready to agree or disagree? Also, anyone with this problem may want to search their log files for anything mentioning the boinc_lockfile; this probably indicates a problem I've seen on some other BOINC projects, where the problem can cascade from any workunit that originates the problem to any other workunit using the same slot before the next boinc.exe restart.
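	The log search suggested above could be done with a short script. This is only a minimal sketch: the function names are hypothetical, and which files hold the client log (for example stdoutdae.txt and stderrdae.txt in the BOINC data directory) is an assumption to check against your own installation.

```python
def find_mentions(lines, needle="boinc_lockfile"):
    """Return (line_number, text) pairs for lines that contain `needle`.

    The match is case-insensitive; line numbers start at 1.
    """
    needle = needle.lower()
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if needle in line.lower()
    ]


def scan_log(path, needle="boinc_lockfile"):
    """Scan one log file; undecodable bytes are replaced rather than raising."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return find_mentions(handle, needle)
```

	Calling scan_log() on each log file and getting an empty list back would suggest the lockfile problem is not involved, at least as far as that log records it.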

Yoran Joined: Dec 4 09 Posts: 3 ID: 22405 Credit: 1,117 RAC: 0	Message 5647 - Posted 15 Jan 2010 18:56:28 UTC - in response to Message ID 5645 .
	"From what I've seen, the problem may be specific to the combination of Windows 7 and a sufficiently recent Intel CPU. Anyone ready to agree or disagree? Also, anyone with this problem may want to search their log files for anything mentioning the boinc_lockfile; this probably indicates a problem I've seen on some other BOINC projects, where the problem can cascade from any workunit that originates the problem to any other workunit using the same slot before the next boinc.exe restart."
	I've searched the log, and all I could find was:
	07:05:05 [Docking@Home] Restarting task 1k1l_93_mod0014trypsin_18710_128757_0 using charmm34 version 623
	07:08:11 [Docking@Home] Restarting task 1k1l_93_mod0014trypsin_18710_128757_0 using charmm34 version 623
	07:11:16 [Docking@Home] Restarting task 1k1l_93_mod0014trypsin_18710_128757_0 using charmm34 version 623
	07:14:21 [Docking@Home] Restarting task 1k1l_93_mod0014trypsin_18710_128757_0 using charmm34 version 623
	07:17:26 [Docking@Home] Restarting task 1k1l_93_mod0014trypsin_18710_128757_0 using charmm34 version 623
	07:20:31 [Docking@Home] Restarting task 1k1l_93_mod0014trypsin_18710_128757_0 using charmm34 version 623
	07:23:36 [Docking@Home] Restarting task 1k1l_93_mod0014trypsin_18710_128757_0 using charmm34 version 623
	07:26:41 [Docking@Home] Restarting task 1k1l_93_mod0014trypsin_18710_128757_0 using charmm34
	and it just goes on and on and on...
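	The restart pattern in a log like this one can be measured rather than eyeballed. The sketch below assumes log lines of the form "HH:MM:SS [Project] Restarting task NAME ..." as pasted in this post; the function name is hypothetical, and it does not handle logs that cross midnight.

```python
import re
from collections import defaultdict
from datetime import datetime

# Matches e.g. "07:05:05 [Docking@Home] Restarting task <name> using ..."
RESTART = re.compile(r"^(\d{2}:\d{2}:\d{2}) \[[^\]]+\] Restarting task (\S+)")


def restart_intervals(lines):
    """Map each task name to the gaps (in seconds) between its restarts."""
    times = defaultdict(list)
    for line in lines:
        match = RESTART.match(line.strip())
        if match:
            stamp = datetime.strptime(match.group(1), "%H:%M:%S")
            times[match.group(2)].append(stamp)
    return {
        task: [int((b - a).total_seconds()) for a, b in zip(stamps, stamps[1:])]
        for task, stamps in times.items()
    }


if __name__ == "__main__":
    task = "1k1l_93_mod0014trypsin_18710_128757_0"
    sample = [
        f"07:05:05 [Docking@Home] Restarting task {task} using charmm34 version 623",
        f"07:08:11 [Docking@Home] Restarting task {task} using charmm34 version 623",
        f"07:11:16 [Docking@Home] Restarting task {task} using charmm34 version 623",
    ]
    print(restart_intervals(sample))  # gaps of 186 and 185 seconds
```

	A long run of nearly identical gaps for a single task, as here, points to the task being restarted on a timer instead of making progress.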

robertmiles Joined: Apr 16 09 Posts: 96 ID: 9967 Credit: 1,290,747 RAC: 0	Message 5648 - Posted 16 Jan 2010 12:10:05 UTC Last modified: 16 Jan 2010 12:10:55 UTC
	Looks like the cause of the problem is different than what I've seen before.

adrianxw Volunteer tester Joined: Dec 30 06 Posts: 164 ID: 343 Credit: 1,669,741 RAC: 0	Message 5654 - Posted 17 Jan 2010 10:31:49 UTC
	I've suggested that the users of this thread post here to keep it all together. More people with the same problem raises the profile somewhat, but doesn't seem to offer much help yet. ____________ Wave upon wave of demented avengers march cheerfully out of obscurity into the dream.

Inya Joined: Jan 16 10 Posts: 4 ID: 24388 Credit: 62,023 RAC: 0	Message 5655 - Posted 17 Jan 2010 13:48:33 UTC Last modified: 17 Jan 2010 13:50:25 UTC
	Same here, no Docking WUs show any progress. Tested 10 to 15 different ones with different number/letters combination at start of their name. My info: Processor: Intel Core2 Quad CPU Q8300 @ 2.50GHz OS: Windows 7 Home Premium 64Bit process docking restarts continuously every 3 to 4 min. Charmm is using only 500KB to 600KB RAM BOINC Version 6.10.18 Other projects are running smoothly without problems (like ABC, Rosetta, RCN, NFS, SETI, PG, WCG). Docking runs smoothly at my older PCs/laptops with Intel processors (no quads) and all with WIN XP.

Yeti Joined: Sep 3 08 Posts: 2 ID: 606 Credit: 243,169 RAC: 0	Message 5656 - Posted 17 Jan 2010 15:49:28 UTC
	Hm, added 6 machines for the charity race; 5 running okay, 1 is having the described problem with no progress.
	1x Win7 32Bit on Intel QuadCore having problems:
	http://docking.cis.udel.edu/community/show_host_detail.php?hostid=56940
	2x Win7 32Bit on Intel QuadCore without problems:
	http://docking.cis.udel.edu/community/show_host_detail.php?hostid=45880
	http://docking.cis.udel.edu/community/show_host_detail.php?hostid=56941
	3x Server2K3 x64 without problems:
	http://docking.cis.udel.edu/community/show_host_detail.php?hostid=56942
	http://docking.cis.udel.edu/community/show_host_detail.php?hostid=56943
	http://docking.cis.udel.edu/community/show_host_detail.php?hostid=56944

adrianxw Volunteer tester Joined: Dec 30 06 Posts: 164 ID: 343 Credit: 1,669,741 RAC: 0	Message 5659 - Posted 18 Jan 2010 10:47:36 UTC Last modified: 18 Jan 2010 10:48:42 UTC
	Those that have the continual "non-run", was it running okay then start the problem? What I'm wondering is that something happens/ is set/ written to a file/ etc. by a wu and thereafter, the later wu's are seeing some flag/ setting/ something or other and "failing to run" as a result of that? ____________ Wave upon wave of demented avengers march cheerfully out of obscurity into the dream.

Yeti Joined: Sep 3 08 Posts: 2 ID: 606 Credit: 243,169 RAC: 0	Message 5660 - Posted 18 Jan 2010 11:08:35 UTC - in response to Message ID 5659 .
	"Those that have the continual 'non-run', was it running okay then start the problem? What I'm wondering is that something happens/ is set/ written to a file/ etc. by a wu and thereafter, the later wu's are seeing some flag/ setting/ something or other and 'failing to run' as a result of that?"
	No, for me, the problem started directly with the first Docking WU on the machine.

Inya Joined: Jan 16 10 Posts: 4 ID: 24388 Credit: 62,023 RAC: 0	Message 5661 - Posted 18 Jan 2010 14:18:09 UTC
	Same here. Never run Docking before on that machine.

Inya Joined: Jan 16 10 Posts: 4 ID: 24388 Credit: 62,023 RAC: 0	Message 5665 - Posted 19 Jan 2010 22:12:29 UTC
	Is there any chance of getting the problem with non-running/permanently restarting WUs on some machines solved?

robertmiles Joined: Apr 16 09 Posts: 96 ID: 9967 Credit: 1,290,747 RAC: 0	Message 5669 - Posted 20 Jan 2010 16:35:52 UTC
	Anyone want to mention if they've seen this problem recently on any machine NOT running Windows 7 on an Intel CPU? Anyone else want to mention what Windows 7 version on which Intel CPU type, if you haven't already, in order to help pin down what machines to use in pinning down this problem?

Inya Joined: Jan 16 10 Posts: 4 ID: 24388 Credit: 62,023 RAC: 0	Message 5675 - Posted 22 Jan 2010 20:21:17 UTC
	That is, what was said in Planet 3DNow! forum yesterday: http://www.planet3dnow.de/vbulletin/showpost.php?p=4126023&postcount=215 In English: The condition for the problem is Win7 + Intel-Yorkfield/Wolfdale (whether dual or quad, cache size does not matter). The problem may or may not occur.

Yoran Joined: Dec 4 09 Posts: 3 ID: 22405 Credit: 1,117 RAC: 0	Message 5690 - Posted 25 Jan 2010 16:28:36 UTC
	"may or may not occur" sounds to me as if they don't have a clue...

Matthias Lehmkuhl Joined: Sep 9 08 Posts: 9 ID: 801 Credit: 151,820 RAC: 0	Message 5719 - Posted 9 Feb 2010 13:07:44 UTC
	I have the same problem. Docking works fine with XP SP3; since the change to Win 7, the program runs with no progress bar and no checkpointing or other changes in the slot dir. Also, no resultname..._0 to resultname..._3 files were created in the project dir. Affected: resultid=10723677 and resultid=10724590. Resetting the project and deleting all remaining files did not help. The first result was resultid=10599879, which was started under XP SP3 and should have finished under Win 7. Both OSes were 32-bit, on the same hardware. The other projects' results finished without problems. I have set Docking on this machine to NNW and will wait with aborting the result till 22.02. So if you need some information, feel free to contact me. ____________ Matthias

Fred Verster Joined: May 8 09 Posts: 26 ID: 11034 Credit: 2,647,353 RAC: 0	Message 5720 - Posted 9 Feb 2010 17:29:06 UTC Last modified: 9 Feb 2010 17:58:35 UTC
	Hi, I must be getting bored; anyway, it looks like I have the same problems. From the messages tab:
	9-2-2010 5:59:47 Docking [error] 1ebw1ajv_mod0014crossdockinghiv1_166_409078_0: negative FLOPs left -32006514441217.789000
	9-2-2010 6:00:49 Docking [error] 1ebw1ajv_mod0014crossdockinghiv1_166_409078_0: negative FLOPs left -32112238476814.980000
	9-2-2010 6:01:50 Docking [error] 1ebw1ajv_mod0014crossdockinghiv1_166_409078_0: negative FLOPs left -32215566952693.199000
	9-2-2010 6:02:50 Docking [error] 1ebw1ajv_mod0014crossdockinghiv1_166_409078_0: negative FLOPs left -32319001897892.266000
	9-2-2010 6:03:51 Docking [error] 1ebw1ajv_mod0014crossdockinghiv1_166_409078_0: negative FLOPs left -32422436843091.328000
	9-2-2010 6:04:52 Docking [error] 1ebw1ajv_mod0014crossdockinghiv1_166_409078_0: negative FLOPs left -32525845170960.180000
	9-2-2010 6:05:52 Docking [error] 1ebw1ajv_mod0014crossdockinghiv1_166_409078_0: negative FLOPs left -32629280116159.242000
	9-2-2010 6:06:53 Docking [error] 1ebw1ajv_mod0014crossdockinghiv1_166_409078_0: negative FLOPs left -32732661826697.883000
	9-2-2010 6:08:07 Docking Computation for task 1ebw1ajv_mod0014crossdockinghiv1_166_409078_0 finished
	9-2-2010 6:08:07 Docking Starting 1dif1ajv_mod0014crossdockinghiv1_1318_257996_0
	9-2-2010 6:08:07 Docking Starting task 1dif1ajv_mod0014crossdockinghiv1_1318_257996_0 using charmm34 version 623
	9-2-2010 6:08:07 Docking Starting 1dif1ajv_mod0014crossdockinghiv1_1317_300510_0
	9-2-2010 6:08:07 Docking Starting task 1dif1ajv_mod0014crossdockinghiv1_1317_300510_0 using charmm34 version 623
	9-2-2010 6:08:09 Docking Started upload of 1ebw1ajv_mod0014crossdockinghiv1_166_409078_0_0
	9-2-2010 6:08:09 Docking Started upload of 1ebw1ajv_mod0014crossdockinghiv1_166_409078_0_1
	9-2-2010 6:08:16 Docking Finished upload of 1ebw1ajv_mod0014crossdockinghiv1_166_409078_0_0
	9-2-2010 6:08:16 Docking Finished upload of 1ebw1ajv_mod0014crossdockinghiv1_166_409078_0_1
	9-2-2010 6:08:16 Docking Started upload of 1ebw1ajv_mod0014crossdockinghiv1_166_409078_0_2
	9-2-2010 6:08:16 Docking Started upload of 1ebw1ajv_mod0014crossdockinghiv1_166_409078_0_3
	9-2-2010 6:08:20 Docking Finished upload of 1ebw1ajv_mod0014crossdockinghiv1_166_409078_0_2
	9-2-2010 6:08:20 Docking Finished upload of 1ebw1ajv_mod0014crossdockinghiv1_166_409078_0_3
	I can't find one WU that was not validated, though. Does anyone have a clue as to why this is happening? Some WUs just stop on my laptop, almost all at about 90% (HP Pavilion, T2400 CPU; WIN XP x86). The other hosts (Q6600s) don't have this problem! They use a minimum of 2 GiG (DDR2), except for the XP64 host, which has 4 GiG DDR2. We have just reached and passed 10 million tasks and WU IDs. ____________ Knight who says N! Ni Ni

adrianxw Volunteer tester Joined: Dec 30 06 Posts: 164 ID: 343 Credit: 1,669,741 RAC: 0	Message 5721 - Posted 10 Feb 2010 15:31:10 UTC Last modified: 10 Feb 2010 15:32:09 UTC
	Looked like a similar issue at Rosetta here but it has appreciable differences. I don't think there is commonality. ____________ Wave upon wave of demented avengers march cheerfully out of obscurity into the dream.

mickey Joined: Jan 11 10 Posts: 1 ID: 24216 Credit: 0 RAC: 0	Message 5728 - Posted 26 Feb 2010 18:23:54 UTC
	Same problem for me: many hours of computation, progress still 0.000%, and in the screensaver a message says something like "no protein created yet". If it can help, this is my PC: http://docking.cis.udel.edu/community/show_host_detail.php?hostid=56027

Hacker Joined: Mar 20 09 Posts: 2 ID: 8510 Credit: 297,087 RAC: 0	Message 5736 - Posted 2 Mar 2010 14:12:05 UTC
	Same problem. My PC: http://docking.cis.udel.edu/community/show_host_detail.php?hostid=60297

AU518987077 Joined: May 21 09 Posts: 2 ID: 11728 Credit: 594,363 RAC: 0	Message 5737 - Posted 3 Mar 2010 15:59:41 UTC
	as the people have said before. Same Here. http://docking.cis.udel.edu/community/show_host_detail.php?hostid=60472 i will watch this thread as it develops but for now I'm suspending that task as it has already done 50% of the work time (6 hours 23 minutes 13 seconds) according to the completion time of 11 hours 55 minutes and 59 seconds and shows 0% completion yet all my other tasks happily chug along.

MechWarrior Joined: Jan 25 10 Posts: 2 ID: 24925 Credit: 134,391 RAC: 0	Message 5740 - Posted 5 Mar 2010 18:32:54 UTC
	Yep, same thing here... Intel Mobile Core 2 Duo P8700 Penryn, 6 GB RAM: http://docking.cis.udel.edu/community/results.php?hostid=59103 Running fine on an Intel and an AMD system with Win 7 RC2 64bit. Tried several reinstalls and no luck. I have not had an issue with any other OS or computer (also running on an Intel P4 3 GHz and an AMD 3500+, both with XP). All other projects run without issues, including Collatz, SETI, GPU Grid, Rosetta, and Yoyo (the first 3 running the GPU app).

robertmiles Joined: Apr 16 09 Posts: 96 ID: 9967 Credit: 1,290,747 RAC: 0	Message 5742 - Posted 6 Mar 2010 12:53:56 UTC
	I currently have two Charmm34a2 6.23 workunits showing around 2 hours elapsed time, 0.000% progress, 00:27:37 to completion, no checkpoints written yet, and a CPU core in use. Is this combination normal for this version? 64-bit Vista SP2 BOINC 6.10.18 1hvi_37_mod0013b type workunits

Ananas Joined: Aug 29 09 Posts: 56 ID: 17736 Credit: 2,500,425 RAC: 0	Message 5744 - Posted 6 Mar 2010 13:13:56 UTC Last modified: 6 Mar 2010 13:20:36 UTC
	I have some of those too now. It seems that this is caused by empty input files (file size = 0 bytes), so if you have any of those, abort them.

Jim Joined: Jan 3 10 Posts: 2 ID: 23788 Credit: 342,233 RAC: 0	Message 5745 - Posted 6 Mar 2010 15:53:57 UTC Last modified: 6 Mar 2010 15:59:35 UTC
	I have several that are of the Charmm 34a2 6.23 type. They were still at 0% after nearly 4 hours. They show a total completion run of 1 hr 9 minutes. Aborted the things. Running these on an 8-core Intel machine. Also, the graphics display shows that there is no model. On the Intel 4-core machine some had run for nearly 8 hours. Bad batch.

robertmiles Joined: Apr 16 09 Posts: 96 ID: 9967 Credit: 1,290,747 RAC: 0	Message 5747 - Posted 6 Mar 2010 16:41:25 UTC
	Closer to 6 hours elapsed time before I saw the last two messages in this thread. Will abort them now.

MechWarrior Joined: Jan 25 10 Posts: 2 ID: 24925 Credit: 134,391 RAC: 0	Message 5748 - Posted 6 Mar 2010 17:38:04 UTC - in response to Message ID 5744 .
	"I have some of those too now, it seems that this is caused by empty input files (file size = 0 bytes), so if you have any of those, abort them."
	Yep, I found a few on a couple of my systems that were hanging. All had the .inp files at 0 bytes and seem to have downloaded in the past 18 hours. I may try once again to run this project on my laptop. Maybe I was just getting a few bad WUs to start with and had others that would have run.
	ID: 5748 \| Rating: 0 \| rate: /

Ananas Joined: Aug 29 09 Posts: 56 ID: 17736 Credit: 2,500,425 RAC: 0	Message 5749 - Posted 6 Mar 2010 17:57:34 UTC Last modified: 6 Mar 2010 17:59:07 UTC
	You can "abort" the damaged WUs before they start to crunch by deleting all 0-bytes .inp files in advance - but give them time to download ;-) p.s.: from what I can see, the bad batch is through now, the input files I received lately all had contents.
	ID: 5749 \| Rating: 0 \| rate: /

outlnder Joined: Sep 18 08 Posts: 1 ID: 1026 Credit: 4,215,011 RAC: 0	Message 5752 - Posted 7 Mar 2010 0:32:07 UTC
	UNBELIEVABLE!!! I lost more than 864 hours of computing time because of this. 18 boxen, 4 cores per box, 12 hours per core. A half day of electricity is $14. I will complete my 5 mil cobbles as I promised my teammates, then permanently "Detach" from this project.
	ID: 5752 \| Rating: 0 \| rate: /

Neil Polson Joined: Jan 18 10 Posts: 2 ID: 24578 Credit: 347,368 RAC: 0	Message 5753 - Posted 7 Mar 2010 7:15:47 UTC Last modified: 7 Mar 2010 7:19:16 UTC
	Upon waking this morning I discovered I had 3 of these too. Mine were issued around 10:00 UTC yesterday. As Ananas stated, all had 0-byte .inp files. Teach me to think a thread didn't apply to me and not read it again! Could have saved myself 6 hours.
	ID: 5753 \| Rating: 0 \| rate: /

Ananas Joined: Aug 29 09 Posts: 56 ID: 17736 Credit: 2,500,425 RAC: 0	Message 5754 - Posted 7 Mar 2010 12:59:52 UTC
	Dang, some of those damaged things are still under way :-(
	ID: 5754 \| Rating: 0 \| rate: /

Minardi Joined: Oct 21 09 Posts: 4 ID: 20057 Credit: 3,888,211 RAC: 0	Message 5755 - Posted 7 Mar 2010 13:16:37 UTC - in response to Message ID 5744 .
	I have some of those too now, it seems that this is caused by empty input files (file size = 0 bytes), so if you have any of those, abort them . Thanks. I had this problem start about 24 hours ago. I aborted tasks until I came to some where the progress bar came off zero after a minute or so. Thanks for the heads up. I am a relatively new BOINC user - how do you see the specific files that are downloaded? I am running the BOINC client, 6.10.18, and not using a client manager, but attaching to projects on each PC through the BOINC client. Thanks
	ID: 5755 \| Rating: 0 \| rate: /

Ananas Joined: Aug 29 09 Posts: 56 ID: 17736 Credit: 2,500,425 RAC: 0	Message 5756 - Posted 7 Mar 2010 13:31:16 UTC Last modified: 7 Mar 2010 13:41:55 UTC
	In your BOINC data directory, there should be "projects/docking.cis.udel.edu". Look there for files that have the file extension ".inp" (the Windoze default is to hide extensions, so you might have to enable showing them in Windows Explorer) and a file size of 0 bytes. Be careful: there are probably result files with a 0-byte file size too, so make sure to delete only those with .inp at the end.
	______________________________________________
	It seems that there are more incomplete WUs around. I found some that end in the middle of the code, like:
	625 SEG1 41 ARG HN 1 0.250000 1.00800 0 0.00000 -0.301140E-02
	626 SEG1 41 ARG CA 10 0.500000E-01 12.0110 0 0.00000 -0.301
	or
	1866 SEG2 18 GLN HE21 1 0.300000 1.00800 0 0.00000 -0.301140E-02
	1867 SEG2 18 GLN HE22 1 0.300000 1.00800 0 0.00000 -0.301140E-02
	1868 SEG2
	or even
	set params lpdb
	set paramfile @params
	set rtffile @params_amino.rtf
	set prmfile @p
	Not sure what to do with those. I doubt that they produce valid results :-/
	ID: 5756 \| Rating: 0 \| rate: /
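The manual check described above (look in the project folder for `.inp` files of 0 bytes) can be sketched in a few lines of Python. This is only an illustration, not a project tool: the function name `find_empty_inp_files` is made up here, and the folder you pass in has to be the `projects/docking.cis.udel.edu` folder inside your own BOINC data directory. It only lists files and deletes nothing.

```python
from pathlib import Path


def find_empty_inp_files(project_dir):
    """Return the .inp files directly inside project_dir that are 0 bytes long.

    Only files ending in .inp are considered, so 0-byte result files
    (which may be legitimate) are left alone.
    """
    root = Path(project_dir)
    return sorted(
        path
        for path in root.glob("*.inp")
        if path.is_file() and path.stat().st_size == 0
    )
```

Calling `find_empty_inp_files(...)` with the full path to the project folder returns the suspect files; whether to delete them or abort the matching tasks in the BOINC client is left to the user.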

Ananas Joined: Aug 29 09 Posts: 56 ID: 17736 Credit: 2,500,425 RAC: 0	Message 5757 - Posted 7 Mar 2010 14:00:52 UTC Last modified: 7 Mar 2010 14:24:39 UTC
	I have now contacted the project leader by mail. If it had been only the 0-byte files, it wouldn't have been that bad, since aborted WUs don't go into the science database. The incomplete ones might produce invalid results without being caught by the validator, so those might mess up the scientific contents of the project. (I hope that not everyone had the same idea, she might be mad then) edit : Got a response already ... Quote : " We will look at this immediately. " ... so it will surely be fixed soon :-)
	ID: 5757 \| Rating: 0 \| rate: /

Michela Forum moderator Project administrator Project developer Project tester Project scientist Joined: Sep 13 06 Posts: 163 ID: 10 Credit: 97,083 RAC: 0	Message 5758 - Posted 7 Mar 2010 14:26:57 UTC - in response to Message ID 5757 .
	We are looking at the problem. We may need to stop the distribution of work temporarily. We will keep you posted. Michela ____________ If you are interested in working on Docking@Home in a great group at UDel, contact me at 'taufer at acm dot org'!
	ID: 5758 \| Rating: 0 \| rate: /

Michela Forum moderator Project administrator Project developer Project tester Project scientist Joined: Sep 13 06 Posts: 163 ID: 10 Credit: 97,083 RAC: 0	Message 5759 - Posted 7 Mar 2010 15:21:46 UTC - in response to Message ID 5758 .
	We temporarily suspended the generation of new jobs while investigating some issues with the charmm script. Stay tuned .. Michela ____________ If you are interested in working on Docking@Home in a great group at UDel, contact me at 'taufer at acm dot org'!
	ID: 5759 \| Rating: 0 \| rate: /

Aegis Maelstrom Joined: Feb 19 09 Posts: 2 ID: 7346 Credit: 90,121 RAC: 0	Message 5760 - Posted 7 Mar 2010 19:27:12 UTC
	Workunit 1iiq_43_mod0013b_1581_18995: 31+ hrs of continuous work and 0.00% progress. No problem with the machine. My fault for not noticing this error earlier. Halting Docking@Home until the problem is resolved. Additionally, I would suggest adding some process termination (abort WU) to the watchdog: abort the WU when it has been crunched without any progress for some given period of time (2 hrs?). Similar solutions have been tried in Rosetta@Home as urgent workarounds, and I think it would help to spot the problem far earlier. So far we have lost a lot of computing power. Best regards, a.m., BOINC@Poland
	ID: 5760 \| Rating: 0 \| rate: /
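The watchdog idea suggested above (abort a task that has been crunching for some period without any progress) could look roughly like the following Python sketch. It is not how the Docking@Home or Rosetta@Home applications actually implement it; the class name `ProgressWatchdog`, the 2-hour default taken from the post, and the idea of feeding it (CPU seconds, fraction done) samples are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ProgressWatchdog:
    """Flags a task that keeps using CPU time while its progress stays put."""

    # Hypothetical default: the "2 hrs?" proposed in the post.
    max_stall_seconds: float = 2 * 3600
    _last_fraction: float = field(default=0.0, init=False)
    _last_change_cpu: float = field(default=0.0, init=False)

    def update(self, cpu_seconds, fraction_done):
        """Record one sample; return True if the task should be aborted."""
        if fraction_done > self._last_fraction:
            # Progress moved, so restart the stall timer from here.
            self._last_fraction = fraction_done
            self._last_change_cpu = cpu_seconds
            return False
        stalled_for = cpu_seconds - self._last_change_cpu
        return stalled_for >= self.max_stall_seconds
```

A task stuck at 0% would trip this after two hours of CPU time, while a slow but advancing task resets the timer every time its fraction done increases.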

Fred Verster Joined: May 8 09 Posts: 26 ID: 11034 Credit: 2,647,353 RAC: 0	Message 5761 - Posted 7 Mar 2010 20:24:49 UTC Last modified: 7 Mar 2010 20:37:11 UTC
	Hi, apart from problems with BOINC version 6.10.18 (not only* on my Vista host), I noticed the same problems as stated above: 0% progress after 6 hours, and a restart of BOINC just started the same WUs again, with zero time and progress. When suspending a task, it just starts another one, with the same result. * BOINC also runs 5 tasks?!? After a restart:
	7-3-2010 21:10:12 Starting BOINC client version 6.10.18 for windows_intelx86
	7-3-2010 21:10:12 log flags: file_xfer, sched_ops, task
	7-3-2010 21:10:12 Libraries: libcurl/7.19.4 OpenSSL/0.9.8l zlib/1.2.3
	7-3-2010 21:10:12 Data directory: C:\ProgramData\BOINC
	7-3-2010 21:10:12 Running under account Fred
	7-3-2010 21:10:13 Processor: 4 GenuineIntel Intel(R) Core(TM)2 Quad CPU @ 2.40GHz [x86 Family 6 Model 15 Stepping 7]
	7-3-2010 21:10:13 Processor: 4.00 MB cache
	7-3-2010 21:10:13 Processor features: fpu tsc pae nx sse sse2 pni mmx
	7-3-2010 21:10:13 OS: Microsoft Windows Vista: Home Premium x86 Edition, Service Pack 2, (06.00.6002.00)
	7-3-2010 21:10:13 Memory: 2.00 GB physical, 4.28 GB virtual
	7-3-2010 21:10:13 Disk: 290.58 GB total, 225.62 GB free
	7-3-2010 21:10:13 Local time is UTC +1 hours
	7-3-2010 21:10:13 NVIDIA GPU 0: GeForce 8500 GT (driver version 19038, CUDA version 2030, compute capability 1.1, 512MB, 30 GFLOPS peak)
	7-3-2010 21:10:13 Not using a proxy
	7-3-2010 21:10:13 Docking URL http://docking.cis.udel.edu/; Computer ID 51168; resource share 300
	7-3-2010 21:10:13 Docking Restarting task 1t7k_50_mod0013b_9228_321667_0 using charmm34 version 623
	7-3-2010 21:10:13 Docking Restarting task 1t7k_50_mod0013b_9227_285203_0 using charmm34 version 623
	7-3-2010 21:10:38 Docking task 1t7k_50_mod0013b_9228_321667_0 suspended by user
	7-3-2010 21:10:39 Docking Starting 1t7k_50_mod0013b_9226_208436_0
	7-3-2010 21:10:39 Docking Starting task 1t7k_50_mod0013b_9226_208436_0 using charmm34 version 623
	7-3-2010 21:10:42 Docking task 1t7k_50_mod0013b_9227_285203_0 suspended by user
	7-3-2010 21:10:43 Docking Starting 1t7k_50_mod0013b_9144_184886_0
	7-3-2010 21:10:43 Docking Starting task 1t7k_50_mod0013b_9144_184886_0 using charmm34 version 623
	On my Vista quad, it started just after 15:15; the previous (same) WUs did fine?! No changes during this time, I wasn't even at home . . . ____________ Knight who says N! Ni Ni
	ID: 5761 \| Rating: 0 \| rate: /

Ananas Joined: Aug 29 09 Posts: 56 ID: 17736 Credit: 2,500,425 RAC: 0	Message 5763 - Posted 7 Mar 2010 21:26:23 UTC
	A sanity check that scans the file for a logical EOF mark would surely help: something like the // terminator in the Stockholm format, or just a comment line with !EOF, which would not disturb the current syntax.
	ID: 5763 \| Rating: 0 \| rate: /
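A check for a logical end-of-file mark, as proposed above, can be sketched in Python as follows. The `!EOF` marker is the one floated in the post and is not something the project's input files are known to contain; the function name `has_eof_marker` is likewise hypothetical.

```python
def has_eof_marker(path, marker="!EOF"):
    """Return True if the last non-blank line of the file is exactly `marker`.

    An empty or truncated file has no such line, so it fails the check.
    """
    last = ""
    with open(path, "r", errors="replace") as handle:
        for line in handle:
            stripped = line.strip()
            if stripped:
                last = stripped
    return last == marker
```

The point of the design is that a file cut off at any byte, including at byte 0, fails the check without the program needing to know the expected size in advance.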

MacDitch Volunteer tester Joined: Sep 13 06 Posts: 27 ID: 24 Credit: 377,838 RAC: 0	Message 5765 - Posted 7 Mar 2010 22:16:40 UTC
	Workunit 1ohr_47_mod0013b_238_18734 Currently at 03:19:57 and 0.000%. Now suspended pending further instructions from the project.
	ID: 5765 \| Rating: 0 \| rate: /

Fred Verster Joined: May 8 09 Posts: 26 ID: 11034 Credit: 2,647,353 RAC: 0	Message 5766 - Posted 8 Mar 2010 12:39:59 UTC Last modified: 8 Mar 2010 12:46:10 UTC
	Hi, it appears the Never Ending WUs are not over yet!
	7-3-2010 21:46:07 Docking URL http://docking.cis.udel.edu/; Computer ID 51168; resource share 300
	7-3-2010 21:46:07 Reading preferences override file
	7-3-2010 21:46:07 Preferences limit memory usage when active to 1023.29MB
	7-3-2010 21:46:07 Preferences limit memory usage when idle to 1534.93MB
	7-3-2010 21:46:10 Preferences limit disk usage to 10.00GB
	7-3-2010 21:46:12 Docking Restarting task 1t7k_50_mod0013b_9226_208436_0 using charmm34 version 623
	7-3-2010 21:46:13 Docking Restarting task 1t7k_50_mod0013b_9144_184886_0 using charmm34 version 623
	7-3-2010 21:48:02 Docking suspended by user
	7-3-2010 21:48:40 Docking resumed by user
	7-3-2010 21:48:51 Docking task 1t7k_50_mod0013b_9228_321667_0 resumed by user
	7-3-2010 21:48:54 Docking task 1t7k_50_mod0013b_9227_285203_0 resumed by user
	7-3-2010 22:48:47 Docking Resuming task 1t7k_50_mod0013b_9226_208436_0 using charmm34 version 623
	7-3-2010 23:04:08 Docking Sending scheduler request: To fetch work.
	7-3-2010 23:04:08 Docking Reporting 6 completed tasks, requesting new tasks for GPU
	7-3-2010 23:04:13 Docking Scheduler request completed: got 0 new tasks
	7-3-2010 23:16:11 Docking Resuming task 1t7k_50_mod0013b_9144_184886_0 using charmm34 version 623
	7-3-2010 23:18:04 Docking Restarting task 1t7k_50_mod0013b_9228_321667_0 using charmm34 version 623
	7-3-2010 23:18:04 Docking Restarting task 1t7k_50_mod0013b_9227_285203_0 using charmm34 version 623
	Four tasks have been running for 16 hours with 0% progress. Nothing seems more logical than skipping these, at least, but since I don't know where or what is going wrong, I'll have to baby-sit the WUs, which is a bit absurd and has nothing to do with donating spare CPU and GPU cycles . . . Has this event already reached the project devs and staff? [ADDED] I did stop the 4 tasks that had run 16 hours with no progress; no new tasks are started?! ____________ Knight who says N! Ni Ni
	ID: 5766 \| Rating: 0 \| rate: /

7ri9991 [MM] Joined: Apr 20 09 Posts: 14 ID: 10169 Credit: 304,285 RAC: 0	Message 5767 - Posted 8 Mar 2010 13:54:10 UTC
	Verster, Everything you're bringing up has already been addressed in this thread. Abort the tasks that are running forever with no progress because they have incomplete input files that will not and cannot finish. Also stated above, the project admins are looking into the problem.
	ID: 5767 \| Rating: 0 \| rate: /

MacDitch Volunteer tester Joined: Sep 13 06 Posts: 27 ID: 24 Credit: 377,838 RAC: 0	Message 5768 - Posted 8 Mar 2010 17:11:01 UTC
	Not sure if we're meant to report or not, but just in case... I've now aborted the following due to no progress:
	Workunit 1ohr_47_mod0013b_238_18734 at 03:19:57
	Workunit 1ohr_47_mod0013b_3921_337894 at 07:28:59
	Workunit 1ohr_48_mod0013b_5768_90544 at 07:28:41
	ID: 5768 \| Rating: 0 \| rate: /

Fred Verster Joined: May 8 09 Posts: 26 ID: 11034 Credit: 2,647,353 RAC: 0	Message 5769 - Posted 8 Mar 2010 20:19:32 UTC
	--[snip]-- Also stated above, the project admins are looking into the problem. Thanks Trigggl, yesterday I just noticed Docking WUs being retrieved/deleted/??? But this morning, I had a new load, with the same problems. Hope they get it fixed without too much hassle :) ____________ Knight who says N! Ni Ni
	ID: 5769 \| Rating: 0 \| rate: /

7ri9991 [MM] Joined: Apr 20 09 Posts: 14 ID: 10169 Credit: 304,285 RAC: 0	Message 5770 - Posted 8 Mar 2010 21:45:20 UTC - in response to Message ID 5769 .
	--[snip]-- Also stated above, the project admins are looking into the problem. Thanks Trigggl, yesterday I just noticed Docking WUs being retrieved/deleted/??? But this morning, I had a new load, with the same problems. Hope they get it fixed without too much hassle :) I'm doing some RNA work until these problems are cleaned up.
	ID: 5770 \| Rating: 0 \| rate: /

[B^S] Acmefrog Volunteer tester Joined: Nov 14 06 Posts: 45 ID: 252 Credit: 1,604,407 RAC: 0	Message 5771 - Posted 9 Mar 2010 3:10:26 UTC Last modified: 9 Mar 2010 3:10:58 UTC
	I picked up a few of these never-ending WUs. I have aborted them. My cache is drying up, so I will see what the remaining few are doing. Has anyone seen any kind of pattern to which ones hang? All the ones I aborted seem to be different types. ____________
	ID: 5771 \| Rating: 0 \| rate: /

7ri9991 [MM] Joined: Apr 20 09 Posts: 14 ID: 10169 Credit: 304,285 RAC: 0	Message 5772 - Posted 9 Mar 2010 4:14:50 UTC - in response to Message ID 5749 .
	You can "abort" the damaged WUs before they start to crunch by deleting all 0-bytes .inp files in advance - but give them time to download ;-) p.s.: from what I can see, the bad batch is through now, the input files I received lately all had contents. Everyone keeps asking how to figure out which ones hang. Check the .inp files. Most of the ones that are going to hang are empty files. The ones that will succeed should be roughly 1.2M and end with something like this: END goto donereadpdbfile5 In Linux you can check it with tail <crossdocking-file>.inp
	ID: 5772 \| Rating: 0 \| rate: /
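For machines without a `tail` command (for example Windows), the check described above can be approximated in Python. This is a rough sketch: the helper names `tail` and `looks_complete` are invented here, and the expected ending `goto donereadpdbfile` is only inferred from the single example quoted in the post ("something like this"), so treat a negative result as a hint rather than proof.

```python
from collections import deque


def tail(path, count=10):
    """Return the last `count` lines of a text file, without newlines."""
    with open(path, "r", errors="replace") as handle:
        return [line.rstrip("\n") for line in deque(handle, maxlen=count)]


def looks_complete(path, expected_prefix="goto donereadpdbfile"):
    """Heuristic: the last non-blank line starts with the expected text.

    The prefix is an assumption based on one sample ending from the thread.
    """
    lines = [line for line in tail(path, 5) if line.strip()]
    return bool(lines) and lines[-1].strip().startswith(expected_prefix)
```

An empty file yields no lines at all, so it fails the check in the same way as a file that stops in the middle of a coordinate record.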

Calphor Joined: Sep 19 08 Posts: 1 ID: 1042 Credit: 2,653,930 RAC: 0	Message 5773 - Posted 9 Mar 2010 4:50:59 UTC
	I've noticed that one of my machines has not been affected by the bug. It is running Win7 64bit with BOINC 6.10.29. My other machines that have been bogged down with errors are all running XP 32bit and BOINC 6.10.18. Is there a relationship? ____________
	ID: 5773 \| Rating: 0 \| rate: /

Ananas Joined: Aug 29 09 Posts: 56 ID: 17736 Credit: 2,500,425 RAC: 0	Message 5778 - Posted 9 Mar 2010 19:42:26 UTC - in response to Message ID 5773 .
	I've noticed that one of my machines has not been affected by the bug. It is running Win7 64bit with BOINC 6.10.29. My other machines that have been bogged down with errors are all running XP 32bit and BOINC 6.10.18. Is there a relationship? Empty is empty, an x64 binary cannot change that fact :-) It is possible that it handles the empty input differently and aborts those tasks immediately, or that the box has just been lucky.
	ID: 5778 \| Rating: 0 \| rate: /

7ri9991 [MM] Joined: Apr 20 09 Posts: 14 ID: 10169 Credit: 304,285 RAC: 0	Message 5779 - Posted 9 Mar 2010 20:31:18 UTC
	I'm finally getting work again and the input files are all complete. It may be safe to download work again.
	ID: 5779 \| Rating: 0 \| rate: /

Ananas Joined: Aug 29 09 Posts: 56 ID: 17736 Credit: 2,500,425 RAC: 0	Message 5780 - Posted 9 Mar 2010 21:50:25 UTC - in response to Message ID 5779 .
	I'm finally getting work again and the input files are all complete. ... Same here :-) The last failures reported have probably been cached files from before the bugfix.
	ID: 5780 \| Rating: 0 \| rate: /

Fred Verster Joined: May 8 09 Posts: 26 ID: 11034 Credit: 2,647,353 RAC: 0	Message 5784 - Posted 10 Mar 2010 18:21:56 UTC
	Hi still got some tasks. Which never end, presumably, 0% progress. Best to abort them? I think. ____________ Knight who says N! Ni Ni
	ID: 5784 \| Rating: 0 \| rate: /

TheFiend Joined: Apr 7 09 Posts: 70 ID: 9482 Credit: 20,705,527 RAC: 0	Message 5787 - Posted 11 Mar 2010 7:38:37 UTC
	I must have been lucky........ have only come across 2 of these units on my 4 crunchers, which have been aborted before reaching the top of the cache.
	ID: 5787 \| Rating: 0 \| rate: /

Jim Joined: Jan 3 10 Posts: 2 ID: 23788 Credit: 342,233 RAC: 0	Message 5800 - Posted 14 Mar 2010 17:04:38 UTC
	Most of the units that start with the name: "1iiq_43_" have failed to start on my Vista and Windows 7 machines. Some ran as long as 85 hours before I noticed them not completing. I aborted all the 1iiq_43_ units. NICE PROJECT
	ID: 5800 \| Rating: 0 \| rate: /

Mark Brown Joined: Dec 31 09 Posts: 6 ID: 23636 Credit: 4,678,904 RAC: 0	Message 5802 - Posted 15 Mar 2010 2:27:18 UTC
	I have several systems (19), with a variety of OSes, that were stuck at 0%. I had a 2.5 day cache on all of them. I finally decided to abort all WUs that either haven't started yet or were stuck at 0%. I won't get new tasks for a few days. My average credit has dropped like a rock this week. Hope everything gets fixed by next week. I think the project is well worth my CPU time. ____________
	ID: 5802 \| Rating: 0 \| rate: /

AU518987077 Joined: May 21 09 Posts: 2 ID: 11728 Credit: 594,363 RAC: 0	Message 5805 - Posted 15 Mar 2010 21:49:07 UTC - in response to Message ID 5737 .
	As people have said before: same here. http://docking.cis.udel.edu/community/show_host_detail.php?hostid=60472 I will watch this thread as it develops, but for now I'm suspending that task, as it has already used 50% of the work time (6 hours 23 minutes 13 seconds) according to the completion time of 11 hours 55 minutes and 59 seconds and shows 0% completion, yet all my other tasks happily chug along. Here's what I've pulled from my message logs:
	100315 044535 Docking Restarting task 1yqj_117_mod0013bp38alpha_2963_55009_0 using charmm34 version 623
	146 times, every 3 minutes give or take 20 seconds, with things in between like
	100315 052003 Project communication failed: attempting access to reference site
	100315 052005 Internet access OK - project servers may be temporarily down.
	100315 134934 Docking Restarting task 1yqj_117_mod0013bp38alpha_2963_55009_0 using charmm34 version 623
	100315 135546 Docking Restarting task 1yqj_117_mod0013bp38alpha_2963_55009_0 using charmm34 version 623
	100315 135851 Docking Restarting task 1yqj_117_mod0013bp38alpha_2963_55009_0 using charmm34 version 623
	100315 140157 Docking Restarting task 1yqj_117_mod0013bp38alpha_2963_55009_0 using charmm34 version 623
	100315 140809 Docking Restarting task 1yqj_117_mod0013bp38alpha_2963_55009_0 using charmm34 version 623
	100315 141114 Docking Restarting task 1yqj_117_mod0013bp38alpha_2963_55009_0 using charmm34 version 623
	100315 141420 Docking Restarting task 1yqj_117_mod0013bp38alpha_2963_55009_0 using charmm34 version 623
	100315 141726 Docking Restarting task 1yqj_117_mod0013bp38alpha_2963_55009_0 using charmm34 version 623
	100315 142031 Docking Restarting task 1yqj_117_mod0013bp38alpha_2963_55009_0 using charmm34 version 623
	100315 142337 Docking Restarting task 1yqj_117_mod0013bp38alpha_2963_55009_0 using charmm34 version 623
	100315 142643 Docking Restarting task 1yqj_117_mod0013bp38alpha_2963_55009_0 using charmm34 version 623
	100315 143254 Docking Restarting task 1yqj_117_mod0013bp38alpha_2963_55009_0 using charmm34 version 623
	100315 143600 Docking Restarting task 1yqj_117_mod0013bp38alpha_2963_55009_0 using charmm34 version 623
	100315 143629 Suspending computation - user is active
	100315 143931 Resuming computation
	100315 144234 Docking Restarting task 1yqj_117_mod0013bp38alpha_2963_55009_0 using charmm34 version 623
	100315 144539 Docking Restarting task 1yqj_117_mod0013bp38alpha_2963_55009_0 using charmm34 version 623
	100315 144845 Docking Restarting task 1yqj_117_mod0013bp38alpha_2963_55009_0 using charmm34 version 623
	100315 144912 Suspending computation - user is active
	It has spent about 7 hours calculating at this point (task switch every 10 minutes, so it has really been running more like 3 days) and has 0% to show for it.
	ID: 5805 \| Rating: 0 \| rate: /
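Counting how often a task was restarted, as done by hand in the post above, can be automated with a short Python sketch. It assumes only the message wording visible in the pasted log ("Restarting task NAME using ..."); the function name `count_restarts` is made up, and other BOINC client versions may word the message differently.

```python
import re
from collections import Counter

# Matches the task name in lines such as
# "100315 134934 Docking Restarting task <name> using charmm34 version 623".
RESTART = re.compile(r"Restarting task (\S+) using")


def count_restarts(log_lines):
    """Return a Counter mapping task name to its number of restart messages."""
    counts = Counter()
    for line in log_lines:
        match = RESTART.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

A task that shows up with a three-digit count, like the 146 restarts reported here, is a strong candidate for aborting.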

Fred Verster Joined: May 8 09 Posts: 26 ID: 11034 Credit: 2,647,353 RAC: 0	Message 5806 - Posted 16 Mar 2010 11:38:54 UTC Last modified: 16 Mar 2010 12:01:14 UTC
	Hi, for the 3rd time I'm seeing tasks with >100 K seconds runtime and no progress?!? It is not only counter-productive but very annoying too. Have a look ??? This WU !? A waste of resources, IMO. What can be done, except setting some debug flags, if it's a client error? I still have some on 1 host and it seems best to me to abort all of them. It's only a waste of time letting them run. BOINC 6.10.18 started the trouble on one of my (3) hosts (quads). Has anyone looked into this, or has some sort of explanation? Btw, I've changed BOINC versions 3 times, because BOINC 6.10.xx doesn't handle large amounts of WUs (of course from different projects). If you have too many WUs in cache, boinc.exe can rise to >20% when there are > 4000 tasks in cache, or it loses contact with the local host. I only have a 3-day cache, which appears to work better. This problem has existed for about half a year! I've read 'somewhere' that it should be set and forget, maybe set and forgive, but baby-sitting BOINC can be pretty time consuming and I have something you can call a life, too :) ____________ Knight who says N! Ni Ni
	ID: 5806 \| Rating: 0 \| rate: /

Jim Strait Joined: Jul 27 09 Posts: 1 ID: 16229 Credit: 816,722 RAC: 0	Message 5818 - Posted 20 Mar 2010 17:17:47 UTC
	I had run into a few of the 0% completion sessions a few days ago and encountered 4 more just now. They were all just past their (same) deadline. One had been running for 12 hours and the other 3 for 5 hours. I usually can knock one out in half an hour. I aborted the 4 sessions, and 2 other Docking sessions started and ran to completion normally. All were running high priority. I am running Windows XP service pack 3 with an Intel quad core i7 CPU 920 @ 2.67GHz with 2.66, 3.25 GB RAM. -Jim
	ID: 5818 \| Rating: 0 \| rate: /

Ananas Joined: Aug 29 09 Posts: 56 ID: 17736 Credit: 2,500,425 RAC: 0	Message 5819 - Posted 20 Mar 2010 18:21:47 UTC
	They do still deliver a lot of those damaged results, but those are still old ones with status "no reply" on the previous host. From what I can see, no new damaged results have been created lately. It would have been smart to delete all .inp files with 0 bytes from the download directory, so a redelivery would have given us a download error instead of getting stuck.
	ID: 5819 \| Rating: 0 \| rate: /

Beyond Joined: Feb 9 09 Posts: 8 ID: 6984 Credit: 3,132,056 RAC: 0	Message 5820 - Posted 21 Mar 2010 4:57:47 UTC
	I just had to delete a boatload of WUs, some running for over 16 hours and at 0%. This has gotten ridiculous.
	ID: 5820 \| Rating: 0 \| rate: /

FalconFly Joined: Jan 17 10 Posts: 1 ID: 24493 Credit: 946,295 RAC: 0	Message 5821 - Posted 21 Mar 2010 12:22:08 UTC - in response to Message ID 5820 . Last modified: 21 Mar 2010 12:22:29 UTC
	Darn, after pausing a while due to the problem and getting back into it after the supposed fix, I just found I lost some ~500 hours of CPU time - again. That's really annoying, but at least some workunits seem to run normally. ____________ Scientific Network : 44800 MHz - 77824 MB - 1970 GB
	ID: 5821 \| Rating: 0 \| rate: /

Fred Verster Joined: May 8 09 Posts: 26 ID: 11034 Credit: 2,647,353 RAC: 0	Message 5822 - Posted 21 Mar 2010 12:49:47 UTC Last modified: 21 Mar 2010 12:56:31 UTC
	Hi, well, the empty, no-progress WUs are gone, at least it looks that way. Now 1 host, running Vista x86, has almost only Docking WUs, but they have all been started (~100); some are in High Priority, not consistent with their deadline, though. When switching projects, BOINC again starts a new WU instead of finishing the one it was working on. IMO, maybe it has something to do with BOINC 6.10.37? Has anyone seen this odd behavior, before or now? At least work is done, but this seems to be blocking BOINC from getting other tasks! ____________ Knight who says N! Ni Ni
	ID: 5822 \| Rating: 0 \| rate: /

Mark Brown Joined: Dec 31 09 Posts: 6 ID: 23636 Credit: 4,678,904 RAC: 0	Message 5823 - Posted 21 Mar 2010 14:05:18 UTC
	I'm back to processing WU here, but am still getting some of those nasty 0% complete jobs. I'll just abort them when needed. ____________
	ID: 5823 \| Rating: 0 \| rate: /

Ananas Joined: Aug 29 09 Posts: 56 ID: 17736 Credit: 2,500,425 RAC: 0	Message 5825 - Posted 21 Mar 2010 18:54:23 UTC - in response to Message ID 5822 . Last modified: 21 Mar 2010 18:56:47 UTC
	Hi, well the empty; no-progress WU's , are gone, at least looks that way. ... not the redelivered ones. I have killed about 10 just now, several of which had already eaten quite some CPU time. I run only a tiny cache, so those have been quite fresh. Example sent 21 Mar 2010 14:27:43 UTC
	ID: 5825 \| Rating: 0 \| rate: /

Fred Verster Joined: May 8 09 Posts: 26 ID: 11034 Credit: 2,647,353 RAC: 0	Message 5826 - Posted 22 Mar 2010 1:31:18 UTC Last modified: 22 Mar 2010 1:37:43 UTC
	Ahh, I have 32 Docking WUs paused. It looks like a task that was switched out after 60 min. is paused and another is started, but it looks like they are finishing normally. At least, I hope so. Now 3 are running High Priority (1 0.04CPU+GPU SETI). How could this happen? I did uncheck keep in memory when paused! I think this shouldn't happen in the first place. Network trouble @ SETI, empty WUs, WUs randomly starting . . . Knock on wood :) ____________ Knight who says N! Ni Ni
	ID: 5826 \| Rating: 0 \| rate: /

crystalsys Joined: May 28 09 Posts: 4 ID: 12210 Credit: 738,141 RAC: 0	Message 5827 - Posted 22 Mar 2010 16:31:49 UTC Last modified: 22 Mar 2010 16:33:10 UTC
	I'm seeing similar. Get the WU, shows some reasonable estimate - I had some that said 48 minutes last week. I just killed one that showed 66 hours elapsed, 0% progress, nothing under time-to-complete. No other projects are behaving this way. The only other bad actor is PrimeGrid which keeps shoving WUs that immediately run high priority. I've got that one on no-new-tasks. ____________
	ID: 5827 \| Rating: 0 \| rate: /

crystalsys Joined: May 28 09 Posts: 4 ID: 12210 Credit: 738,141 RAC: 0	Message 5828 - Posted 23 Mar 2010 11:01:30 UTC - in response to Message ID 5827 .
	I'm seeing similar. Get the WU, shows some reasonable estimate - I had some that said 48 minutes last week. I just killed one that showed 66 hours elapsed, 0% progress, nothing under time-to-complete. No other projects are behaving this way. The only other bad actor is PrimeGrid which keeps shoving WUs that immediately run high priority. I've got that one on no-new-tasks. Last night I had one ready to start with an estimated 4:15 run time. This morning it has run for an hour, still shows 0% complete. ____________
	ID: 5828 \| Rating: 0 \| rate: /

Conan Volunteer tester Joined: Sep 13 06 Posts: 219 ID: 100 Credit: 4,256,493 RAC: 0	Message 5829 - Posted 23 Mar 2010 13:00:16 UTC
	Yes I just aborted 3 of these tasks, one at 16 hours 0.00% another at 14 hours 0.00% and one at nearly 4 hours 0.00%. So a couple still floating around. ____________
	ID: 5829 \| Rating: 0 \| rate: /

TPR_Mojo Joined: Mar 26 09 Posts: 6 ID: 8777 Credit: 7,205,188 RAC: 0	Message 5830 - Posted 23 Mar 2010 16:02:36 UTC
	Please can we have some sort of response to this problem, even if it is just an acknowledgement and "we are looking into it"? It's ongoing, and we are dealing with it as best we can, but the silence from the project team is deafening.
	ID: 5830 \| Rating: 0 \| rate: /

rebel9 Joined: Sep 3 08 Posts: 2 ID: 421 Credit: 67,272 RAC: 0	Message 5831 - Posted 23 Mar 2010 17:18:29 UTC
	Hear, hear. I haven't had a good WU for at least a month and probably longer. I'm sick of aborting WUs with dozens and dozens of wasted hours invested in them. I am understanding of problems but this is getting ridiculous and I'm on the cusp of disabling this project. If you don't sort it you're going to find the level of interest tumbling like a house of cards out here. Thanks.
	ID: 5831 \| Rating: 0 \| rate: /

King Leo Joined: Jun 16 09 Posts: 10 ID: 13433 Credit: 4,464,450 RAC: 0	Message 5833 - Posted 23 Mar 2010 18:30:42 UTC
	Why is this problem not being addressed? I have three computers all showing 100% CPU usage but 0% PROGRESS. This has been happening on my computers for some time now so I will have to attach to another project until someone on your end finds a remedy.
	ID: 5833 \| Rating: 0 \| rate: /

cenit Joined: Sep 25 09 Posts: 1 ID: 18997 Credit: 22,829 RAC: 0	Message 5841 - Posted 24 Mar 2010 18:20:55 UTC - in response to Message ID 5833 .
	Why is this problem not being addressed? I have three computers all showing 100% CPU usage but 0% PROGRESS. This has been happening on my computers for some time now so I will have to attach to another project until someone on your end finds a remedy. I just aborted this wu because it was at 0% after 9 hours of computation. I put Docking on NNW
	ID: 5841 \| Rating: 0 \| rate: /

Scientific Frontline Joined: Mar 25 09 Posts: 42 ID: 8725 Credit: 788,015 RAC: 0	Message 5842 - Posted 24 Mar 2010 20:17:40 UTC
	This issue of zero progress is bad enough, yet the lack of respect to acknowledge us is truly far greater to me at this point. Almost seems you don't understand the value of the participants in this project. Simply put, without us... there is no Docking at home. Sincerely, Heidi-Ann Kennedy ____________ Recognized by the Carnegie Institute of Science . Washington D.C.
	ID: 5842 \| Rating: 0 \| rate: /

Cluster Physik Joined: Jul 2 09 Posts: 35 ID: 14795 Credit: 16,067,012 RAC: 0	Message 5843 - Posted 24 Mar 2010 23:24:09 UTC - in response to Message ID 5842 .
	This issue of zero progress is bad enough, yet the lack of respect to acknowledge us is truly far greater to me at this point. Almost seems you don't understand the value of the participants in this project. Simply put, without us... there is no Docking at home. Sincerely, Heidi-Ann Kennedy I second that. There needs to be at least a response, if not a solution, to this severe problem! I'm really thinking about setting Docking to "no new work" if nothing happens. I have better things to do than to constantly check all systems for those broken WUs. And I'm sure I'm not the only one considering the simplest "solution". It's just a click in my account manager to get rid of this annoyance.
	ID: 5843 \| Rating: 0 \| rate: /

DoubleTop Joined: Mar 29 09 Posts: 11 ID: 9044 Credit: 26,873,788 RAC: 0	Message 5844 - Posted 25 Mar 2010 8:37:14 UTC Last modified: 25 Mar 2010 8:37:45 UTC
	I totally agree with ScientificFrontline on this one, I'm having to put in masses of effort to keep my machines computing work units. Effort that I should not have to, imo. I've lost 100's of hours worth of computing time on these poor units, and there are some machines that I won't be onsite for another month and have the chance to check. Some form of technical response, or even an apology of sorts from the project team wouldn't go amiss. DT.
	ID: 5844 \| Rating: 0 \| rate: /

crystalsys Joined: May 28 09 Posts: 4 ID: 12210 Credit: 738,141 RAC: 0	Message 5851 - Posted 26 Mar 2010 13:56:20 UTC
	There's another thread calling for a boycott, which I'm not yet inclined to do. But where is "the man behind the curtain"? There are other, worthwhile projects out there, and we get to choose which ones to run, so some acknowledgment or comment would seem to be appropriate here. ____________
	ID: 5851 \| Rating: 0 \| rate: /

TPR_Mojo Joined: Mar 26 09 Posts: 6 ID: 8777 Credit: 7,205,188 RAC: 0	Message 5853 - Posted 26 Mar 2010 16:09:31 UTC
	22k RAC = ninth overall = bye bye Docking, at least until they have learned some basic communication skills. I'm not burning electricity and putting my time and effort in for a group who don't even think they need to talk to their volunteers. Plenty of units coming soon, I'm about to ditch about 1300
	ID: 5853 \| Rating: 0 \| rate: /

Trilce Estrada Forum moderator Project administrator Project developer Project tester Joined: Sep 19 06 Posts: 189 ID: 119 Credit: 1,217,236 RAC: 0	Message 5854 - Posted 26 Mar 2010 18:58:45 UTC Last modified: 26 Mar 2010 19:06:44 UTC
	During March 7-11 we had a big problem with the server ( http://docking.cis.udel.edu/about/project/news.php ), which was running out of space, and therefore several workunits were sent empty. On March 11 we stopped production, increased the server partition and resumed distribution. I want to think that the workunits with 0% progress are still some of those created during March 7-11; what worries me the most is that it has now been 15+ days and you are still having the problem. According to the example posted by Ananas, the problem was indeed an old workunit (created on March 7th), but if any of you could post a link to one of those eternal workunits it would be great, especially if it was created after March 11/12, so that we can take a look at the input files. In the meantime we are trying to identify what can be wrong, and whether the problem is confined to those old WUs or is more widespread.
	ID: 5854 \| Rating: 0 \| rate: /

DoubleTop Joined: Mar 29 09 Posts: 11 ID: 9044 Credit: 26,873,788 RAC: 0	Message 5856 - Posted 26 Mar 2010 19:20:02 UTC - in response to Message ID 5854 .
	unfortunately, I think a large proportion of users have aborted all or detached now Trilce, and you'll have a hard time getting those logs. I for one can vouch that I simply went through and found all units with a 0KB size and simply aborted them, so perhaps there is a way to analyse the "User aborted" units in the system? If you need help, then please do ask. I understand you have PhD deadlines to hit and that is fine, but I'm raising my hand for looking through an export of "Client Aborted", the difficulty will be putting them next to logs of files sent, as I think the results of such work may be now skewed from the number of people who have simply aborted all units. I wonder if the actual .exe should perform a check on the file sizes? hth, DT.
	ID: 5856 \| Rating: 0 \| rate: /

Scientific Frontline Joined: Mar 25 09 Posts: 42 ID: 8725 Credit: 788,015 RAC: 0	Message 5858 - Posted 26 Mar 2010 19:59:39 UTC - in response to Message ID 5854 .
	During March 7-11 we had a big problem with the server ( http://docking.cis.udel.edu/about/project/news.php ), which was running out of space, and therefore several workunits were sent empty. On March 11 we stopped production, increased the server partition and resumed distribution. I want to think that the workunits with 0% progress are still some of those created during March 7-11; what worries me the most is that it has now been 15+ days and you are still having the problem. According to the example posted by Ananas, the problem was indeed an old workunit (created on March 7th), but if any of you could post a link to one of those eternal workunits it would be great, especially if it was created after March 11/12, so that we can take a look at the input files. In the meantime we are trying to identify what can be wrong, and whether the problem is confined to those old WUs or is more widespread. Trilce, This is the only project I have ever truly been passionate about. I understand issues do arise, yet a response of any kind is always imperative. A scientist such as yourself knows the importance of communication; without it... there is failure. I'm also willing to help with communicating with members, but your team has to establish the connection. ____________ Recognized by the Carnegie Institute of Science . Washington D.C.
	ID: 5858 \| Rating: 0 \| rate: /

DoubleTop Joined: Mar 29 09 Posts: 11 ID: 9044 Credit: 26,873,788 RAC: 0	Message 5859 - Posted 26 Mar 2010 20:09:08 UTC
	http://docking.cis.udel.edu/community/workunit.php?wuid=11126773 I've just been sent this unit by the server, which falls into the problem timeframe of unit generation from the first send. The problem may be the aborted work units being re-sent, or even non-returned being resent. The user Miguel in this instance could be stuck on 0% with that unit and hence why I've received it now. Pure fluke that I spotted this one come into my queue, knowing that I had only just set the project to allow new work. hth, DT.
	ID: 5859 \| Rating: 0 \| rate: /

Trilce Estrada Forum moderator Project administrator Project developer Project tester Joined: Sep 19 06 Posts: 189 ID: 119 Credit: 1,217,236 RAC: 0	Message 5864 - Posted 26 Mar 2010 20:43:09 UTC - in response to Message ID 5859 .
	I think you are right, DoubleTop: we are facing (or will face) a different problem now. There is a bug in the transitioner that keeps generating workunits even when we say we want just one. We changed the workunit generator and other daemons around and haven't been able to keep things straight. I need to modify the validator to accept this kind of workunit; let's see if I can fix it soon. @Scientific Frontline, I'm sorry about this, and you are absolutely right. I'm seriously thinking about making a Facebook group, a Twitter account or something more effective than the forums, because this format is not helping much.
	ID: 5864 \| Rating: 0 \| rate: /

Scientific Frontline Joined: Mar 25 09 Posts: 42 ID: 8725 Credit: 788,015 RAC: 0	Message 5869 - Posted 27 Mar 2010 0:25:56 UTC - in response to Message ID 5864 .
	I think you are right, DoubleTop: we are facing (or will face) a different problem now. There is a bug in the transitioner that keeps generating workunits even when we say we want just one. We changed the workunit generator and other daemons around and haven't been able to keep things straight. I need to modify the validator to accept this kind of workunit; let's see if I can fix it soon. @Scientific Frontline, I'm sorry about this, and you are absolutely right. I'm seriously thinking about making a Facebook group, a Twitter account or something more effective than the forums, because this format is not helping much. As I am also; now let's all move forward with a better understanding of both sides. Heidi-Ann Kennedy ____________ Recognized by the Carnegie Institute of Science . Washington D.C.
	ID: 5869 \| Rating: 0 \| rate: /

Mark Brown Joined: Dec 31 09 Posts: 6 ID: 23636 Credit: 4,678,904 RAC: 0	Message 5870 - Posted 27 Mar 2010 1:09:43 UTC
	When the problem started, I stopped requesting new WUs. When I started getting WUs again, I started monitoring for 0% and killed only those. I figured there were enough good WUs to make it valid to keep pressing on. I haven't received any bad WUs for a while, but if I do, I will document them before killing them. I hope others proceed in this manner instead of moving to other projects. I do run other projects concurrently with D@H set to a much higher priority. Good luck. BTW, facebook would be a good forum for status problems/updates. ____________
	ID: 5870 \| Rating: 0 \| rate: /

Bryan Price Joined: Jun 22 09 Posts: 2 ID: 13990 Credit: 526,416 RAC: 0	Message 5874 - Posted 28 Mar 2010 15:58:34 UTC - in response to Message ID 5870 .
	Good luck. BTW, facebook would be a good forum for status problems/updates. With the screensaver, a quick update "If you see this and your project is at 0%, abort it!" would have sufficed! :)
	ID: 5874 \| Rating: 0 \| rate: /

Scientific Frontline Joined: Mar 25 09 Posts: 42 ID: 8725 Credit: 788,015 RAC: 0	Message 5879 - Posted 29 Mar 2010 0:40:39 UTC
	Personally, I would just like to see the notices on the main page, where they belong in my opinion. I don't use screensavers and am just about as negative towards social sites as one can get. Project news/updates belong on the project site, nowhere else. ____________ Recognized by the Carnegie Institute of Science . Washington D.C.
	ID: 5879 \| Rating: 0 \| rate: /

7ri9991 [MM] Joined: Apr 20 09 Posts: 14 ID: 10169 Credit: 304,285 RAC: 0	Message 5880 - Posted 29 Mar 2010 10:50:46 UTC - in response to Message ID 5874 .
	Good luck. BTW, facebook would be a good forum for status problems/updates. With the screensaver, a quick update "If you see this and your project is at 0%, abort it!" would have sufficed! :) Except the problem is a 0-byte input file. If they can code it to recognize that and switch the screensaver, it would be easier to code it to recognize and abort, or better yet, recognize and re-download. It would be easy to scan an input file for the word "END".
	ID: 5880 \| Rating: 0 \| rate: /
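A minimal sketch of the check suggested in the post above, written in Python purely for illustration. The function name `looks_complete` is made up here, and the idea that a usable input file contains the word "END" comes from the post itself, not from the real charmm wrapper or from any documented file format.

```python
from pathlib import Path


def looks_complete(path, marker="END"):
    """Return True if the file exists, is non-empty and contains the marker word.

    ASSUMPTION (taken from the forum post): a usable input file contains
    the word "END"; a 0-byte or truncated file does not.
    """
    p = Path(path)
    if not p.is_file() or p.stat().st_size == 0:
        return False
    # Ignore undecodable bytes so a damaged file cannot crash the check.
    words = p.read_text(errors="ignore").upper().split()
    return marker.upper() in words
```

A wrapper could call a check like this before starting the science application and report an error instead of spinning at 0%; whether "END" is the right marker for these files would have to be confirmed by the project.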

Trilce Estrada Forum moderator Project administrator Project developer Project tester Joined: Sep 19 06 Posts: 189 ID: 119 Credit: 1,217,236 RAC: 0	Message 5883 - Posted 29 Mar 2010 17:13:55 UTC
	The abort was supposed to be coded in the charmm wrapper, but it is obviously not working. We will have to revisit the code to add a reliable way to detect empty or truncated input files.
	ID: 5883 \| Rating: 0 \| rate: /

7ri9991 [MM] Joined: Apr 20 09 Posts: 14 ID: 10169 Credit: 304,285 RAC: 0	Message 5886 - Posted 29 Mar 2010 18:06:23 UTC - in response to Message ID 5883 .
	The abort was supposed to be coded in the charmm wrapper, but it is obviously not working. We will have to revisit the code to add a reliable way to detect empty or truncated input files. I'm not a programmer, but I play one on web forums. :-D
	ID: 5886 \| Rating: 0 \| rate: /

rebel9 Joined: Sep 3 08 Posts: 2 ID: 421 Credit: 67,272 RAC: 0	Message 5888 - Posted 30 Mar 2010 14:22:47 UTC - in response to Message ID 5870 .
	When the problem started, I stopped requesting new WUs. When I started getting WUs again, I started monitoring for 0% and killed only those. I figured there were enough good WUs to make it valid to keep pressing on. I haven't received any bad WUs for a while, but if I do, I will document them before killing them. I hope others proceed in this manner instead of moving to other projects. Yeees, unfortunately, there aren't any "good" WUs, or at least I haven't seen one for months, so the cusp is now behind me and I've stopped receiving work until such time as there is a clear indication that this has stopped. Shame.
	ID: 5888 \| Rating: 0 \| rate: /

Mark Brown Joined: Dec 31 09 Posts: 6 ID: 23636 Credit: 4,678,904 RAC: 0	Message 5889 - Posted 30 Mar 2010 23:48:55 UTC
	Another one bites the dust.. 1t7k_50_mod0013b_9971_238982_1
	Received: 3/21/2010
	Deadline: 4/4/2010
	CPU Time: 83:16:00
	Elapsed Time: 95:00:30
	Estimated Time Remaining: 04:08:41
	Fraction Done: 0.000%
	Task ID: 12021095
	Name: 1t7k_50_mod0013b_9971_238982_1
	Workunit: 11113581
	Created: 21 Mar 2010 7:03:10 UTC
	Sent: 21 Mar 2010 7:03:35 UTC
	Received: 30 Mar 2010 23:41:53 UTC
	Server state: Over
	Outcome: Client error
	Client state: Aborted by user
	Exit status: -197 (0xffffffffffffff3b)
	Computer ID: 61649
	Report deadline: 4 Apr 2010 5:43:35 UTC
	CPU time: 299902.6
	stderr out:
	<core_client_version>6.10.18</core_client_version>
	<![CDATA[
	<message> aborted by user </message>
	<stderr_txt>
	Calling BOINC init. Starting charmm run (initial or from checkpoint)...
	Calling BOINC init. Starting charmm run (initial or from checkpoint)...
	Calling BOINC init. Starting charmm run (initial or from checkpoint)...
	Calling BOINC init. Starting charmm run (initial or from checkpoint)...
	Calling BOINC init. Starting charmm run (initial or from checkpoint)...
	Unhandled Exception Detected...
	- Unhandled Exception Record -
	Reason: Breakpoint Encountered (0x80000003) at address 0x7C81A3E1
	Engaging BOINC Windows Runtime Debugger...
	********************
	____________
	ID: 5889 \| Rating: 0 \| rate: /

Trilce Estrada Forum moderator Project administrator Project developer Project tester Joined: Sep 19 06 Posts: 189 ID: 119 Credit: 1,217,236 RAC: 0	Message 5892 - Posted 1 Apr 2010 22:08:20 UTC - in response to Message ID 5889 .
	Another one bites the dust.. 1t7k_50_mod0013b_9971_238982_1 Yes, that was one of them. The good news is that this crisis helped us to take care of several pending issues, one was the generation and validation of retrials, like this one ending in _1
	ID: 5892 \| Rating: 0 \| rate: /

Trilce Estrada Forum moderator Project administrator Project developer Project tester Joined: Sep 19 06 Posts: 189 ID: 119 Credit: 1,217,236 RAC: 0	Message 5893 - Posted 1 Apr 2010 22:10:21 UTC - in response to Message ID 5889 .
	Another one bites the dust.. 1t7k_50_mod0013b_9971_238982_1 Yes, that was one of them. The good news is that this crisis helped us to take care of several pending issues, one was the generation and validation of retrials, like this one ending in _1
	ID: 5893 \| Rating: 0 \| rate: /

TPR_Mojo Joined: Mar 26 09 Posts: 6 ID: 8777 Credit: 7,205,188 RAC: 0	Message 5894 - Posted 2 Apr 2010 19:26:55 UTC - in response to Message ID 5893 . Last modified: 2 Apr 2010 19:28:59 UTC
	Yes, that was one of them. The good news is that this crisis helped us to take care of several pending issues, one was the generation and validation of retrials, like this one ending in _1 Yes, that was one of them. The good news is that this crisis helped us to take care of several pending issues, one was the generation and validation of retrials, like this one ending in _1 Unfortunately we have found a new bug in the message board software where under certain circumstances replies are posted twice.......... ;)
	ID: 5894 \| Rating: 0 \| rate: /

Trilce Estrada Forum moderator Project administrator Project developer Project tester Joined: Sep 19 06 Posts: 189 ID: 119 Credit: 1,217,236 RAC: 0	Message 5897 - Posted 5 Apr 2010 16:43:02 UTC - in response to Message ID 5894 . Last modified: 5 Apr 2010 16:43:54 UTC
	True =)
	ID: 5897 \| Rating: 0 \| rate: /

arcturus Joined: Sep 22 08 Posts: 4 ID: 1145 Credit: 767,313 RAC: 0	Message 5900 - Posted 5 Apr 2010 20:54:43 UTC
	<sigh> 4 months later, hoping things have improved? Nope. Downloaded 4 of the Charmm 34a2 6.23's which have (you guessed it!) the all too familiar 0% issue. Aborted them. Q9550 Yorkfield on Win 7 64 bit. Who has time to babysit? Hopeless.
	ID: 5900 \| Rating: 0 \| rate: /

Hacker Joined: Mar 20 09 Posts: 2 ID: 8510 Credit: 297,087 RAC: 0	Message 5916 - Posted 29 Apr 2010 14:00:05 UTC
	Still no solution for me, either. 0% on all workunits I get after aborting.
	ID: 5916 \| Rating: 0 \| rate: /

sam_spade Joined: May 30 09 Posts: 1 ID: 12330 Credit: 701,781 RAC: 0	Message 5987 - Posted 20 Aug 2010 11:05:46 UTC
	Is there any solution by now? I've got 2 computers with the above-mentioned problem: * GenuineIntel Intel(R) Core(TM)2 Quad CPU Q9550 @ 2.83GHz [Family 0 Model 0 Stepping 0] Microsoft Windows 7 Enterprise x64 Edition, (06.01.7600.00) Boinc 6.11.4 * GenuineIntel Pentium(R) Dual-Core CPU T4300 @ 2.10GHz [Family 6 Model 23 Stepping 10] Microsoft Windows 7 Enterprise x86 Edition, (06.01.7600.00) Boinc 6.10.58 The other computers work fine with Windows 7: * AuthenticAMD AMD Phenom(tm) II X6 1055T Processor [Family 0 Model 0 Stepping 0] * GenuineIntel Intel(R) Atom(TM) CPU N270 @ 1.60GHz [Family 6 Model 28 Stepping 2]
	ID: 5987 \| Rating: 0 \| rate: /

BF Volunteer tester Joined: Nov 14 06 Posts: 3 ID: 299 Credit: 147,913 RAC: 0	Message 6037 - Posted 1 Oct 2010 5:39:21 UTC
	bump
	ID: 6037 \| Rating: 0 \| rate: /

sharky Joined: Sep 16 10 Posts: 2 ID: 33046 Credit: 0 RAC: 0	Message 6040 - Posted 7 Oct 2010 5:36:31 UTC
	I just noticed the issue too, but unfortunately I aborted the tasks prior to looking for a thread on it. I had 4 tasks hung this morning, aborted them and 4 more started, got home from work and they were still at 0% Intel Q9550, windows 7 64bit
	ID: 6040 \| Rating: 0 \| rate: /

Felix Joined: Jan 19 11 Posts: 4 ID: 36973 Credit: 320 RAC: 0	Message 6148 - Posted 21 Jan 2011 18:24:48 UTC - in response to Message ID 6040 .
	Same problem here. Windows Seven 64bit, Intel SU9400 CPU. Docking is always 0,00%, Rosetta@Home goes flawless. I stopped Docking, tried to restart and to ask a new job. Fail. Any idea? I'm gonna try with an AMD machine with the same profile... I'll let u know.
	ID: 6148 \| Rating: 0 \| rate: /

robertmiles Joined: Apr 16 09 Posts: 96 ID: 9967 Credit: 1,290,747 RAC: 0	Message 6151 - Posted 23 Jan 2011 3:08:52 UTC - in response to Message ID 6148 . Last modified: 23 Jan 2011 3:16:56 UTC
	Same problem here. Windows Seven 64bit, Intel SU9400 CPU. Docking is always 0,00%, Rosetta@Home goes flawless. I stopped Docking, tried to restart and to ask a new job. Fail. Any idea? I'm gonna try with an AMD machine with the same profile... I'll let u know. On other BOINC projects, that often happens if you don't let it run long enough to reach the first checkpoint, for any workunits that only update their progress at checkpoints. Therefore, it would be useful to know how much CPU time and how much elapsed time it used while still showing 0.00% progress, to see if it should have reached a checkpoint by then. Also, some BOINC projects will wait about 24 hours after you report a failed or aborted workunit before sending you any more workunits at all.
	ID: 6151 \| Rating: 0 \| rate: /

Ananas Joined: Aug 29 09 Posts: 56 ID: 17736 Credit: 2,500,425 RAC: 0	Message 6158 - Posted 29 Jan 2011 0:08:43 UTC Last modified: 29 Jan 2011 0:10:11 UTC
	No such problem for months (actually I had not even one with no progress until now) but now 5 in a row, e.g. : http://docking.cis.udel.edu/community/result.php?resultid=19014897 2 ran for several hours (this one for 8h) and as the usual behavior is 1% (or more) after 1 or 2 minutes, I aborted 3 more after ~5 minutes at 0%. The ones that are running now are just normal, ~20% after an hour, progress constantly increasing. edit : If it was a checkpoint problem, it would not record the elapsed CPU time, it would reset to 0 after each restart - but it did record the time so it is not a checkpoint related error.
	ID: 6158 \| Rating: 0 \| rate: /

Tom Joined: Oct 31 10 Posts: 1 ID: 34448 Credit: 816,266 RAC: 0	Message 6161 - Posted 29 Jan 2011 16:59:25 UTC - in response to Message ID 6158 .
	No such problem for months (actually I had not even one with no progress until now) but now 5 in a row, e.g. : http://docking.cis.udel.edu/community/result.php?resultid=19014897 2 ran for several hours (this one for 8h) and as the usual behavior is 1% (or more) after 1 or 2 minutes, I aborted 3 more after ~5 minutes at 0%. The ones that are running now are just normal, ~20% after an hour, progress constantly increasing. edit : If it was a checkpoint problem, it would not record the elapsed CPU time, it would reset to 0 after each restart - but it did record the time so it is not a checkpoint related error. I've been having the same problem for the last few days and have aborted a number of workunits at various times (up to an hour) in their progress. The progress always remained at 0%. Finally decided to let one run for the entire estimated completion time of about 3 hours. The time to completion went down to zero, the progress stayed at 0%, and the elapsed time continued to count up. Are these workunits defective? I'm also running Seti and not having any problems. I would like to continue with this project, but don't want to waste my CPU time if the workunits are defective. Anybody got any answers?
	ID: 6161 \| Rating: 0 \| rate: /

P . P . L . Joined: Oct 20 08 Posts: 69 ID: 2725 Credit: 1,000,979 RAC: 0	Message 6162 - Posted 30 Jan 2011 4:33:30 UTC Last modified: 30 Jan 2011 5:04:00 UTC
	Hi. Looks like the problem has struck Linux now too. I noticed one stuck on my quad that had been running for 42min and was showing 0%. They are usually moving after 5min, so I aborted it. I'll have to keep an eye on them from now on! 1hvk1hbv_mod0014crossdockinghiv1_1648_108194 http://docking.cis.udel.edu/community/workunit.php?wuid=18475870 edit / two more, one from the hex core ran 2hrs 23min at 0%, the other 33min at 0%. Not good! 1hvk1hbv_mod0014crossdockinghiv1_1649_151280 http://docking.cis.udel.edu/community/workunit.php?wuid=18475871 1hvj1hbv_mod0014crossdockinghiv1_20102_333689 http://docking.cis.udel.edu/community/workunit.php?wuid=18472967 edit / I think I know what is wrong, at least with my two bad tasks: they are missing an input file. (I have another one running now; I'll let it go for 30min and abort it if it hasn't moved by then.) Back to the story: these two downloaded too fast, faster than they normally do, so not the full file size? ____________
	ID: 6162 \| Rating: 0 \| rate: /

adrianxw Volunteer tester Joined: Dec 30 06 Posts: 164 ID: 343 Credit: 1,669,741 RAC: 0	Message 6163 - Posted 30 Jan 2011 13:17:45 UTC
	1hvi1hbv_mod0014crossdockinghiv1_13789_408524 showing 09:13:21 CPU time and 48:27:37 elapsed. Fraction done 0.000%. Aborted. ____________ Wave upon wave of demented avengers march cheerfully out of obscurity into the dream.
	ID: 6163 \| Rating: 0 \| rate: /

Saenger Volunteer tester Joined: Sep 13 06 Posts: 125 ID: 79 Credit: 411,959 RAC: 0	Message 6164 - Posted 30 Jan 2011 13:22:26 UTC
	Got one as well, 1hvj1hbv_mod0014crossdockinghiv1_18736_423697 . CPU-time was 15:45h, process bar showed 0%, I just aborted it. It says CPU-time 0 sec, so nothing was recorded. My system was always at 100%, so CPU-time was definitely used, just not for something useful. ____________ Gruesse vom Saenger For questions about Boinc look in the BOINC-Wiki
	ID: 6164 \| Rating: 0 \| rate: /

adrianxw Volunteer tester Joined: Dec 30 06 Posts: 164 ID: 343 Credit: 1,669,741 RAC: 0	Message 6165 - Posted 30 Jan 2011 14:52:42 UTC
	1hvj1hbv_mod0014crossdocking_15775_294374 01:34:53 CPU time, 06:18:04 elapsed time, 0.000% done. Aborted. ____________ Wave upon wave of demented avengers march cheerfully out of obscurity into the dream.
	ID: 6165 \| Rating: 0 \| rate: /

MAPSIT Joined: Sep 2 10 Posts: 1 ID: 32608 Credit: 999,488 RAC: 0	Message 6166 - Posted 30 Jan 2011 16:05:48 UTC
	2011/01/30: I've gotten hit with a bunch of the "run forever - no progress" WUs on my Windows 7 Pro-64 bit, 8 cores machine. Being firmly committed to the "set it and forget it" philosophy for volunteering my cycles, I've simply aborted them after the remaining time went to zero. I'll leave the debugging to those with greater time and expertise. Having read the thread about the problem, I'll now delete any future apparent problems after one hour with no progress rather than the 20 hours I've been giving WUs with an estimated 15 hour completion. Good luck to those attempting to track down and resolve this issue. As others have said, it's annoying.
	ID: 6166 \| Rating: 0 \| rate: /

Trotador Joined: Sep 5 09 Posts: 5 ID: 18182 Credit: 6,445,766 RAC: 0	Message 6167 - Posted 30 Jan 2011 17:45:26 UTC Last modified: 30 Jan 2011 17:46:06 UTC
	Another one, aborted after 12 hours, 0% performed and 0% remaining 1hvk1hbv_mod0014crossdockinghiv1_1992_194045_0 ubuntu 9.1 64 bits
	ID: 6167 \| Rating: 0 \| rate: /

Ananas Joined: Aug 29 09 Posts: 56 ID: 17736 Credit: 2,500,425 RAC: 0	Message 6168 - Posted 30 Jan 2011 19:25:02 UTC Last modified: 30 Jan 2011 19:28:01 UTC
	50 more hours wasted, setting to no new work :-( edit : 4 out of 5 WUs had this problem this time.
	ID: 6168 \| Rating: 0 \| rate: /

Paratima Joined: May 31 10 Posts: 4 ID: 29644 Credit: 1,021,214 RAC: 0	Message 6169 - Posted 31 Jan 2011 3:14:39 UTC
	Having the same problem, only on my Win7 machine. However, NOT Intel CPU. Details: AuthenticAMD AMD Phenom(tm) II X4 945 Processor [AMD64 Family 16 Model 4 Stepping 2] Microsoft Windows 7 Home Premium x64 Edition, (06.01.7600.00) Am aborting the bad units & hoping for the best.
	ID: 6169 \| Rating: 0 \| rate: /

P . P . L . Joined: Oct 20 08 Posts: 69 ID: 2725 Credit: 1,000,979 RAC: 0	Message 6170 - Posted 31 Jan 2011 4:59:16 UTC
	Hi. I've had a few good ones and this one not so good, ran for 18min 0% aborted it. Some bad tasks are still around. http://docking.cis.udel.edu/community/workunit.php?wuid=18487328 1hvk1hbv_mod0014crossdockinghiv1_12743_296212 ____________
	ID: 6170 \| Rating: 0 \| rate: /

ZoSo Joined: Oct 14 10 Posts: 16 ID: 33872 Credit: 3,742,738 RAC: 0	Message 6172 - Posted 31 Jan 2011 18:00:43 UTC Last modified: 31 Jan 2011 18:01:55 UTC
	OK... only my linux boxes were showing this, not any of my windows machines, so I reported it on the linux board. I'll just keep an eye on it and abort them when it happens, since it appears to have been going on for over 17 months without a fix.
	ID: 6172 \| Rating: 0 \| rate: /

Strom Joined: Dec 29 09 Posts: 2 ID: 23444 Credit: 1,316,847 RAC: 0	Message 6174 - Posted 31 Jan 2011 18:32:46 UTC
	I just aborted seven of these with 0% progress after various run lengths. The four processing now are showing progress after only a few minutes of run time. Seems to be an issue with some WUs, but not all.
	ID: 6174 \| Rating: 0 \| rate: /

Trotador Joined: Sep 5 09 Posts: 5 ID: 18182 Credit: 6,445,766 RAC: 0	Message 6175 - Posted 31 Jan 2011 19:58:31 UTC
	new one 1hvk1hbv_mod0014crossdockinghiv1_17825_206714_0 aborted after 14 hours 0% progress, 3 hours remaining
	ID: 6175 \| Rating: 0 \| rate: /

mctonale Joined: Jan 21 10 Posts: 1 ID: 24730 Credit: 98,138 RAC: 0	Message 6176 - Posted 31 Jan 2011 21:20:20 UTC
	Also having this (recurring) problem on two machines. Core i7, 3x2GB DDR3, Win7, BOINC 6.12.12 (64). Phenom 9650, Ubuntu 10.10, BOINC 6.10.56. Over an hour and still at 0%. It appears to affect Docking only; Collatz, SETI, Rosetta and AQUA are all running fine. I don't think it is a memory issue, as I have swapped the memory on the Phenom between 2x1GB DDR2 1066 (running at 800 due to a motherboard restriction) and 2x2GB DDR2 800, with the same results. CPU is still at 100%?? Project suspended.. for now.
	ID: 6176 \| Rating: 0 \| rate: /

adrianxw Volunteer tester Joined: Dec 30 06 Posts: 164 ID: 343 Credit: 1,669,741 RAC: 0	Message 6178 - Posted 1 Feb 2011 2:58:14 UTC
	Yet another, 1hvk1hbv_mod0014crossdockinghiv1_24501_456879 . There is a problem here. ____________ Wave upon wave of demented avengers march cheerfully out of obscurity into the dream.
	ID: 6178 \| Rating: 0 \| rate: /

P . P . L . Joined: Oct 20 08 Posts: 69 ID: 2725 Credit: 1,000,979 RAC: 0	Message 6179 - Posted 1 Feb 2011 4:45:52 UTC
	Hi. I'm now getting nothing but 0 file size every time my rigs try get new tasks. Suspending DOCKING until someone from the project says it's fixed. ____________
	ID: 6179 \| Rating: 0 \| rate: /

adrianxw Volunteer tester Joined: Dec 30 06 Posts: 164 ID: 343 Credit: 1,669,741 RAC: 0	Message 6180 - Posted 1 Feb 2011 8:30:12 UTC - in response to Message ID 6179 . Last modified: 1 Feb 2011 8:33:44 UTC
	1hvk1hbv_mod0014crossdockinghiv1_33167_465761 and... 1hvk1hbv_mod0014crossdockinghiv1_32186_207448 and... 1hvk1hbv_mod0014crossdockinghiv1_27165_411078 All 0% and aborted. No response from the project, sadly, as usual. Michela, you are turning people off the project with this lack of attention to this and other issues on the boards here. No New Tasks set. ____________ Wave upon wave of demented avengers march cheerfully out of obscurity into the dream.
	ID: 6180 \| Rating: 0 \| rate: /

sandro Joined: Sep 3 08 Posts: 4 ID: 512 Credit: 4,076,636 RAC: 0	Message 6181 - Posted 1 Feb 2011 8:37:53 UTC
	Same here, I aborted a bunch of WUs today, all stuck at 0.0%.
	ID: 6181 \| Rating: 0 \| rate: /

TheFiend Joined: Apr 7 09 Posts: 70 ID: 9482 Credit: 20,705,527 RAC: 0	Message 6183 - Posted 1 Feb 2011 9:57:15 UTC
	These dodgy WUs that are coming through have .INP files with a size of 0KB. Have a look through your cache in your project directory and you will find them. I have just aborted about 70 of them so far, but it seems as though they are still being sent out. I run a large cache so I haven't crunched any of them.. luckily!!!
	ID: 6183 \| Rating: 0 \| rate: /

etrecords Joined: Nov 18 09 Posts: 2 ID: 21628 Credit: 1,943,170 RAC: 0	Message 6186 - Posted 1 Feb 2011 18:27:11 UTC
	I also found a number of these WUs on different systems. All with the .inp file of zero bytes. Since I don't have the time to babysit my systems, I may have to stop Docking temporarily.
	ID: 6186 \| Rating: 0 \| rate: /

MaW Joined: Jan 26 11 Posts: 17 ID: 37208 Credit: 114,943 RAC: 0	Message 6187 - Posted 1 Feb 2011 19:01:03 UTC Last modified: 1 Feb 2011 19:05:17 UTC
	Is this project still maintained, or did someone lock the server room in November and forget about it? And it's miraculously running by itself? I need to abort about half of the work units because of this problem... Edit: just now got another series of dead WUs.. I joined this project recently as it seems more focused on a certain task than Rosetta, but this is disappointing..
	ID: 6187 \| Rating: 0 \| rate: /

ZoSo Joined: Oct 14 10 Posts: 16 ID: 33872 Credit: 3,742,738 RAC: 0	Message 6188 - Posted 1 Feb 2011 19:12:52 UTC - in response to Message ID 6186 .
	... All with the .inp file of zero bytes. Good catch... I just aborted a half dozen with 0-byte .inp files here, before they started running. I've wasted over 200 hours of crunching because of those over the last few days. Now how do we abort work units with a script? Then we can just have it check /var/lib/boinc/projects/docking.cis.udel.edu/ say, every 15 minutes, and abort work units with the 0-byte *.inp file names. Or would it be enough to just delete the 0-byte .inp files so BOINC doesn't even try to run those WUs and instead aborts them itself?
	ID: 6188 \| Rating: 0 \| rate: /
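As a rough answer to the scripting question above, here is a small Python sketch, not a tested tool. It lists the 0-byte `.inp` files in the project directory named in the post and builds the `boinccmd --task <project URL> <task name> abort` call for a task. The helper names are hypothetical, the project URL is assumed to be the site address used throughout this thread, and the step that maps an input file name to its task name is deliberately left out because the thread does not show how the two correspond.

```python
from pathlib import Path

# Directory from the post above; the URL is an assumption based on the
# links in this thread. Adjust both for your own installation.
PROJECT_DIR = "/var/lib/boinc/projects/docking.cis.udel.edu"
PROJECT_URL = "http://docking.cis.udel.edu/"


def empty_inp_files(project_dir=PROJECT_DIR):
    """Return the 0-byte .inp/.INP files in project_dir, sorted by file name."""
    root = Path(project_dir)
    if not root.is_dir():
        return []
    hits = [
        p for p in root.iterdir()
        if p.is_file() and p.suffix.lower() == ".inp" and p.stat().st_size == 0
    ]
    return sorted(hits, key=lambda p: p.name)


def abort_command(task_name, project_url=PROJECT_URL):
    """Build (but do not run) the boinccmd call that aborts one task.

    ASSUMPTION: task_name must be the task's name as BOINC shows it;
    deriving it from an input file name is not covered here.
    """
    return ["boinccmd", "--task", project_url, task_name, "abort"]
```

Running `empty_inp_files()` from cron every 15 minutes and passing the matching task names to `subprocess.run(abort_command(name))` would give the behaviour asked for. Whether simply deleting the empty files is enough is not something this sketch answers.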

MaW Joined: Jan 26 11 Posts: 17 ID: 37208 Credit: 114,943 RAC: 0	Message 6189 - Posted 1 Feb 2011 19:16:37 UTC
	Well, for me >all< the good ones after 2 minutes are already 1%, so if after 4-5 minutes it's still 0 i abort them.
	ID: 6189 \| Rating: 0 \| rate: /
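The rule of thumb in the post above (good tasks show 1% within about 2 minutes, so anything still at 0% after 4-5 minutes is suspect) can be written down as a tiny Python filter. This is only a sketch: `TaskStatus` is a hypothetical record that you would fill in yourself from whatever BOINC Manager shows, and the 5-minute default is taken from this thread rather than from the project.

```python
from typing import NamedTuple


class TaskStatus(NamedTuple):
    """Hypothetical snapshot of one task (values entered by hand or by your own tooling)."""
    name: str
    elapsed_minutes: float
    fraction_done: float  # 0.0 means 0%, 1.0 means 100%


def stalled(tasks, grace_minutes=5.0):
    """Return the names of tasks that are still at 0% after the grace period."""
    return [
        t.name
        for t in tasks
        if t.elapsed_minutes >= grace_minutes and t.fraction_done <= 0.0
    ]
```

Others in the thread used longer limits (30 minutes, one hour), so `grace_minutes` is worth tuning to how quickly good tasks move on your own machines.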

TheFiend Joined: Apr 7 09 Posts: 70 ID: 9482 Credit: 20,705,527 RAC: 0	Message 6190 - Posted 1 Feb 2011 19:52:13 UTC - in response to Message ID 5854 .
	During March 7-11 we had a big problem with the server ( http://docking.cis.udel.edu/about/project/news.php ), which was running out of space, and therefore several workunits were sent empty. On March 11 we stopped production, increased the server partition and resumed distribution. I want to think that the workunits with 0% progress are still some of those created during March 7-11; what worries me the most is that it has now been 15+ days and you are still having the problem. According to the example posted by Ananas, the problem was indeed an old workunit (created on March 7th), but if any of you could post a link to one of those eternal workunits it would be great, especially if it was created after March 11/12, so that we can take a look at the input files. In the meantime we are trying to identify what can be wrong, and whether the problem is confined to those old WUs or is more widespread. Looks like it could be the same problem as happened March 2010!
	ID: 6190 \| Rating: 0 \| rate: /

Chris Granger Joined: Sep 17 10 Posts: 2 ID: 33087 Credit: 294,493 RAC: 0	Message 6191 - Posted 1 Feb 2011 20:26:55 UTC
	Both of my machines are experiencing this problem. One is Linux 64-bit with 8GB of RAM and the other is Windows Vista 32-bit with 2GB of RAM. Work units ran for over 12 hours with 0% progress before I aborted them.
	ID: 6191 \| Rating: 0 \| rate: /

MaW Joined: Jan 26 11 Posts: 17 ID: 37208 Credit: 114,943 RAC: 0	Message 6192 - Posted 1 Feb 2011 20:56:41 UTC - in response to Message ID 6190 .
	Looks like it could be the same problem as happened March 2010! Makes sense. But I'm a bit worried if there's anyone out there to fix the problem this time ^.=
	ID: 6192 \| Rating: 0 \| rate: /

TheFiend Joined: Apr 7 09 Posts: 70 ID: 9482 Credit: 20,705,527 RAC: 0	Message 6193 - Posted 1 Feb 2011 21:56:43 UTC
	Just fired a PM off to Trilce Estrada (see above) to see if she is aware of the current problems. Hopefully she will have a look into it.
	ID: 6193 \| Rating: 0 \| rate: /

_heinz Joined: Jun 16 09 Posts: 12 ID: 13437 Credit: 1,471,103 RAC: 0	Message 6194 - Posted 2 Feb 2011 1:36:06 UTC Last modified: 2 Feb 2011 1:50:23 UTC
	Hi, I have the same issue: no progress for 2 hours, wu canceled now http://docking.cis.udel.edu/community/workunit.php?wuid=18510268 Pity, lost time. I'm stopping work till the problems of the project are solved. Number 140 in the world statistics of Docking. heinz ____________ V8-Xeon-Docking
	ID: 6194 \| Rating: 0 \| rate: /

etrecords Joined: Nov 18 09 Posts: 2 ID: 21628 Credit: 1,943,170 RAC: 0	Message 6195 - Posted 2 Feb 2011 8:08:36 UTC
	I just did a check and all the wu's with this problem were created recently. I also found some new ones this morning. This problem is the only reason I have wu's aborted by the client, so you can see the workunits in my tasks list.
	ID: 6195 \| Rating: 0 \| rate: /

MaW Joined: Jan 26 11 Posts: 17 ID: 37208 Credit: 114,943 RAC: 0	Message 6196 - Posted 2 Feb 2011 10:22:56 UTC
	Arghh it's getting worse. Got almost only bad tasks today. Could delete them straight away because they indeed have 0-size INP file. http://docking.cis.udel.edu/community/results.php?hostid=85823&offset=0 ...
	ID: 6196 \| Rating: 0 \| rate: /

Scientific Frontline Joined: Mar 25 09 Posts: 42 ID: 8725 Credit: 788,015 RAC: 0	Message 6198 - Posted 2 Feb 2011 13:51:11 UTC
	Simplest solution / work-around: increase cache, get new tasks, then set no new tasks, delete the tasks with 0-size INP files, and keep on running until you need more. Repeat the process until they get this fixed. ____________ Recognized by the Carnegie Institute of Science . Washington D.C.
	ID: 6198 \| Rating: 0 \| rate: /

TheFiend Joined: Apr 7 09 Posts: 70 ID: 9482 Credit: 20,705,527 RAC: 0	Message 6199 - Posted 2 Feb 2011 15:44:54 UTC
	No response yet from the PM sent to Trilce Estrada :(
	ID: 6199 \| Rating: 0 \| rate: /

Scientific Frontline Joined: Mar 25 09 Posts: 42 ID: 8725 Credit: 788,015 RAC: 0	Message 6200 - Posted 2 Feb 2011 18:05:39 UTC - in response to Message ID 6199 .
	No response yet from the PM sent to Trilce Estrada :( Does not surprise me any. Been one of the biggest flaws of this project is the lack of communication. ____________ Recognized by the Carnegie Institute of Science . Washington D.C.
	ID: 6200 \| Rating: 0 \| rate: /

keyboards Joined: Jan 2 09 Posts: 3 ID: 5426 Credit: 1,061,087 RAC: 0	Message 6201 - Posted 2 Feb 2011 23:31:11 UTC Last modified: 2 Feb 2011 23:33:04 UTC
	Have a WU that has been running for 17+ hours showing 0% progress and time to completion as --- with a report deadline of 2/11/11. http://docking.cis.udel.edu/community/workunit.php?wuid=18446304 Running on this computer: http://docking.cis.udel.edu/community/show_host_detail.php?hostid=16437 Seriously considering aborting all WUs and suspending indefinitely! ____________ *!!REMEMBER - Stupidity should be PAINFUL!!*
	ID: 6201 \| Rating: 0 \| rate: /

MaW Joined: Jan 26 11 Posts: 17 ID: 37208 Credit: 114,943 RAC: 0	Message 6205 - Posted 3 Feb 2011 9:17:44 UTC Last modified: 3 Feb 2011 9:22:59 UTC
	The problem is definitely caused by empty (0-size INP file) or corrupt (INP file smaller than 1,16MB; it even crashed the app twice lol) work units. Unfortunately it's a server-side problem and it would be nice to see that anyone there cares about it... Now I'm 'hunting' for good work units :D Edit: ahh guess there's nothing to hunt.. seems that the generator is down and it's re-sending bad WUs?
	ID: 6205 \| Rating: 0 \| rate: /
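
For anyone hunting bad tasks by hand, the check described in this post can be scripted. This is only a minimal sketch: the function name is made up for illustration, the directory you point it at depends on your BOINC install, and the 1.16 MB threshold is just the rough figure observed above, not an official project limit.

```python
from pathlib import Path

# Assumption: threshold taken from the "less than 1,16MB" observation in this
# thread; it is not an official project limit.
SUSPECT_BELOW_BYTES = int(1.16 * 1024 * 1024)


def find_suspect_inputs(project_dir, min_bytes=SUSPECT_BELOW_BYTES):
    """Return (path, size, reason) for every *.inp file that looks empty or truncated."""
    suspects = []
    for path in sorted(Path(project_dir).glob("*.inp")):
        size = path.stat().st_size
        if size == 0:
            suspects.append((path, size, "empty"))
        elif size < min_bytes:
            suspects.append((path, size, "possibly truncated"))
    return suspects
```

The function only reports candidates; aborting the matching tasks is still done in BOINC Manager.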

Jesse Viviano Joined: Jan 14 10 Posts: 7 ID: 24317 Credit: 349,250 RAC: 0	Message 6208 - Posted 3 Feb 2011 14:08:58 UTC Last modified: 3 Feb 2011 14:09:35 UTC
	I have just started and got two more of the zero-byte *.inp file bad work units. I aborted them following the advice on this thread. They are work units 18530883 and 18531863 .
	ID: 6208 \| Rating: 0 \| rate: /

Jesse Viviano Joined: Jan 14 10 Posts: 7 ID: 24317 Credit: 349,250 RAC: 0	Message 6209 - Posted 3 Feb 2011 17:26:33 UTC
	I have another bad one that I aborted because of the empty input file problem. It is work unit 18537922 .
	ID: 6209 \| Rating: 0 \| rate: /

Paratima Joined: May 31 10 Posts: 4 ID: 29644 Credit: 1,021,214 RAC: 0	Message 6212 - Posted 4 Feb 2011 3:00:03 UTC
	Drop me a line when y'all get it fixed. I'll be crunching elsewhere, maybe POEM. Got no time for handholding.
	ID: 6212 \| Rating: 0 \| rate: /

googloo Joined: Nov 30 09 Posts: 6 ID: 22204 Credit: 1,182,026 RAC: 0	Message 6214 - Posted 4 Feb 2011 15:50:06 UTC
	I have Docking@home set to no new tasks until this is fixed.
	ID: 6214 \| Rating: 0 \| rate: /

adrianxw Volunteer tester Joined: Dec 30 06 Posts: 164 ID: 343 Credit: 1,669,741 RAC: 0	Message 6215 - Posted 4 Feb 2011 16:33:59 UTC Last modified: 4 Feb 2011 16:34:59 UTC
	I've e-mailed Michela concerning the issue here. Maybe she is around. ____________ Wave upon wave of demented avengers march cheerfully out of obscurity into the dream.
	ID: 6215 \| Rating: 0 \| rate: /

Michela Forum moderator Project administrator Project developer Project tester Project scientist Joined: Sep 13 06 Posts: 163 ID: 10 Credit: 97,083 RAC: 0	Message 6216 - Posted 4 Feb 2011 16:34:23 UTC - in response to Message ID 6214 .
	Hi All, we have a disk issue. We stopped the generation of new jobs and are looking at the issue. Sorry for the problem and thank you for the notes! Michela ____________ If you are interested in working on Docking@Home in a great group at UDel, contact me at 'taufer at acm dot org'!
	ID: 6216 \| Rating: 0 \| rate: /

Scientific Frontline Joined: Mar 25 09 Posts: 42 ID: 8725 Credit: 788,015 RAC: 0	Message 6220 - Posted 4 Feb 2011 16:47:35 UTC - in response to Message ID 6216 .
	Hi All, we have a disk issue. We stopped the generation of new jobs and are looking at the issue. Sorry for the problem and thank you for the notes! Michela You have more than a disk issue, the project admin has a total lack of respect for its members. ____________ Recognized by the Carnegie Institute of Science . Washington D.C.
	ID: 6220 \| Rating: 0 \| rate: /

MaW Joined: Jan 26 11 Posts: 17 ID: 37208 Credit: 114,943 RAC: 0	Message 6222 - Posted 4 Feb 2011 18:16:17 UTC - in response to Message ID 6220 .
	Hi All, we have a disk issue. We stopped the generation of new jobs and are looking at the issue. Sorry for the problem and thank you for the notes! Michela You have more than a disk issue, the project admin has a total lack of respect for its members. Nah, at least we know they're alive. I was getting worried.
	ID: 6222 \| Rating: 0 \| rate: /

Scientific Frontline Joined: Mar 25 09 Posts: 42 ID: 8725 Credit: 788,015 RAC: 0	Message 6223 - Posted 4 Feb 2011 21:23:30 UTC - in response to Message ID 6222 .
	Hi All, we have a disk issue. We stopped the generation of new jobs and are looking at the issue. Sorry for the problem and thank you for the notes! Michela You have more than a disk issue, the project admin has a total lack of respect for its members. Nah, at least we know they're alive. I was getting worried. Been there and done that with them too many times. Never was worried, just annoyed. ____________ Recognized by the Carnegie Institute of Science . Washington D.C.
	ID: 6223 \| Rating: 0 \| rate: /

Ronald Tilby Joined: Oct 22 10 Posts: 1 ID: 34143 Credit: 158,428 RAC: 0	Message 6224 - Posted 5 Feb 2011 0:14:32 UTC - in response to Message ID 6223 .
	I discovered that I have the "Long Run times with no progress" tasks after two of them had run for over 96 hours. I have three suggestions: 1. Fix the process that creates the task files to not create zero sized task files. 2. Fix the process that serves the task files to the client so that it won't send zero sized task files. 3. Fix the client docking program to detect empty/invalid task files and appropriately set their status.
	ID: 6224 \| Rating: 0 \| rate: /
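
Suggestion 1 above (never create zero-sized task files) is commonly handled by writing to a temporary file and renaming it into place only after the write is verified. A minimal sketch, assuming nothing about the actual D@H generator; `write_input_atomically` is a hypothetical helper, not project code:

```python
import os
import tempfile


def write_input_atomically(final_path, data):
    """Write data to final_path, or raise and leave no file behind."""
    if not data:
        raise ValueError("refusing to write an empty input file")
    directory = os.path.dirname(os.path.abspath(final_path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as tmp:
            # A full disk surfaces as OSError during write, flush or fsync.
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())
        if os.path.getsize(tmp_path) != len(data):
            raise OSError("short write: input file is incomplete")
        # Rename into place; atomic when both paths are on the same filesystem.
        os.replace(tmp_path, final_path)
    except BaseException:
        # Clean up the temporary file so no partial input is ever visible.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

With this pattern a disk-full condition makes the generator fail loudly instead of leaving a 0-byte file for the server to distribute.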

ZoSo Joined: Oct 14 10 Posts: 16 ID: 33872 Credit: 3,742,738 RAC: 0	Message 6225 - Posted 5 Feb 2011 5:51:35 UTC
	Since this problem has been occurring intermittently for over a year and a half, Ronald Tilby's suggestions all sound reasonable to me. Another issue with the work units that have been aborted is:
2011-02-04 23:58:25\|Docking\|Sending scheduler request: Requested by user. Requesting 0 seconds of work, reporting 3 completed tasks
2011-02-04 23:58:30\|Docking\|Scheduler request succeeded: got 0 new tasks
2011-02-04 23:58:30\|Docking\|Message from server: Server error: can't attach shared memory
Fri 04 Feb 2011 11:56:57 PM EST Docking Reporting 2 completed tasks, not requesting new tasks
Fri 04 Feb 2011 11:57:08 PM EST Docking Scheduler request completed
Fri 04 Feb 2011 11:57:08 PM EST Docking Message from server: Server error: can't attach shared memory
Both of those clips were grabbed on the same machine. The first group is from BOINCTasks (from the BOINC manager on another machine on my LAN), and the second group is from BOINC itself... I just synced its clock, which was about 1:06 slow... so if you're synced to a source like otc2.psu.edu:123, too, your log clips for those 2 should show up between 23:58:03 on the 4th and a little after midnight on the 5th. I found a couple links to http://www.spy-hill.net/~myers/help/boinc/Create_Project.html#feeder on berkeley.edu about that message. None of them have to do with the client/manager, though.
	ID: 6225 \| Rating: 0 \| rate: /

Michela Forum moderator Project administrator Project developer Project tester Project scientist Joined: Sep 13 06 Posts: 163 ID: 10 Credit: 97,083 RAC: 0	Message 6228 - Posted 5 Feb 2011 14:01:26 UTC - in response to Message ID 6225 .
	Dear All, a new update from D@H: 1) We are in a recovery mode. In other words, we are collecting and validating results but we are not generating and distributing new jobs for the moment, while we are investigating what caused the problem yesterday. 2) Please bear with us. We do not have a full time system administrator taking care of D@H but the work is done by students. They are doing their very best but they have also classes and homework. We are dedicating the weekend on understanding the problem and fixing it. Thanks for your several notes and support. Michela ____________ If you are interested in working on Docking@Home in a great group at UDel, contact me at 'taufer at acm dot org'!
	ID: 6228 \| Rating: 0 \| rate: /

Scientific Frontline Joined: Mar 25 09 Posts: 42 ID: 8725 Credit: 788,015 RAC: 0	Message 6233 - Posted 5 Feb 2011 15:09:27 UTC - in response to Message ID 6228 .
	Dear All, a new update from D@H: 1) We are in a recovery mode. In other words, we are collecting and validating results but we are not generating and distributing new jobs for the moment, while we are investigating what caused the problem yesterday. 2) Please bear with us. We do not have a full time system administrator taking care of D@H but the work is done by students. They are doing their very best but they have also classes and homework. We are dedicating the weekend on understanding the problem and fixing it. Thanks for your several notes and support. Michela I'll accept that as a reasonable answer. Academics first by all means, ____________ Recognized by the Carnegie Institute of Science . Washington D.C.
	ID: 6233 \| Rating: 0 \| rate: /

Jesse Viviano Joined: Jan 14 10 Posts: 7 ID: 24317 Credit: 349,250 RAC: 0	Message 6234 - Posted 5 Feb 2011 19:32:24 UTC - in response to Message ID 6228 .
	Dear All, a new update from D@H: 1) We are in a recovery mode. In other words, we are collecting and validating results but we are not generating and distributing new jobs for the moment, while we are investigating what caused the problem yesterday. 2) Please bear with us. We do not have a full time system administrator taking care of D@H but the work is done by students. They are doing their very best but they have also classes and homework. We are dedicating the weekend on understanding the problem and fixing it. Thanks for your several notes and support. Michela I doubt that your systems are capable of collecting results. Result uploads work fine, but they do nothing but waste space on your disk until they have been reported. That is when your server becomes aware of the results and prepares them for the postprocessing they need (checking to see if they can be validated, getting them validated, assimilated into the science database, and then deleted along with their associated work unit). However, when I try to do an update to report them or BOINC tries to do so automatically, I get these messages:
2/5/2011 2:18:10 PM Docking update requested by user
2/5/2011 2:18:14 PM Docking Sending scheduler request: Requested by user.
2/5/2011 2:18:14 PM Docking Reporting 4 completed tasks, requesting new tasks for CPU and GPU
2/5/2011 2:18:15 PM Docking Scheduler request completed: got 0 new tasks
2/5/2011 2:18:15 PM Docking Message from server: Server error: can't attach shared memory
The results then stay in my lists of unfinished tasks. Something is keeping your server from being able to accept the reporting of these tasks. When I searched for the matter, one possible scenario involves the feeder not running. Could you please fix the issue preventing us from reporting our results so that they can finally be processed and accepted?
I think that the space freed up by the deletion of the work units could help you with your disk issues if the problem turns out to be a full disk, which has caused other projects to generate empty work unit files that must be aborted or caused other errors.
	ID: 6234 \| Rating: 0 \| rate: /

Michela Forum moderator Project administrator Project developer Project tester Project scientist Joined: Sep 13 06 Posts: 163 ID: 10 Credit: 97,083 RAC: 0	Message 6235 - Posted 6 Feb 2011 15:54:01 UTC - in response to Message ID 6234 .
	Dear All, a new update from D@H: 1) We are in a recovery mode. In other words, we are collecting and validating results but we are not generating and distributing new jobs for the moment, while we are investigating what caused the problem yesterday. 2) Please bear with us. We do not have a full time system administrator taking care of D@H but the work is done by students. They are doing their very best but they have also classes and homework. We are dedicating the weekend on understanding the problem and fixing it. Thanks for your several notes and support. Michela I doubt that your systems are capable of collecting results. Result uploads work fine, but they do nothing but waste space on your disk until they have been reported. That is when your server becomes aware of the results and prepares them for the postprocessing they need (checking to see if they can be validated, getting them validated, assimilated into the science database, and then deleted along with their associated work unit). However, when I try to do an update to report them or BOINC tries to do so automatically, I get these messages: 2/5/2011 2:18:10 PM Docking update requested by user 2/5/2011 2:18:14 PM Docking Sending scheduler request: Requested by user. 2/5/2011 2:18:14 PM Docking Reporting 4 completed tasks, requesting new tasks for CPU and GPU 2/5/2011 2:18:15 PM Docking Scheduler request completed: got 0 new tasks 2/5/2011 2:18:15 PM Docking Message from server: Server error: can't attach shared memory The results then stay in my lists of unfinished tasks. Something is keeping your server from being able to accept the reporting of these tasks. When I searched for the matter, one possible scenario involves the feeder not running. Could you please fix the issue preventing us from reporting our results so that they can finally be processesd and accepted? 
I think that the space freed up by the deletion of the work units could help you with your disk issues if the problem turns out to be a full disk, which has caused other projects to generate empty work unit files that must be aborted or caused other errors. Hi, all the daemons are up and running. I am monitoring a couple of clients to see if I can reproduce your error message. ____________ If you are interested in working on Docking@Home in a great group at UDel, contact me at 'taufer at acm dot org'!
	ID: 6235 \| Rating: 0 \| rate: /

Hephaiston Joined: Feb 3 11 Posts: 3 ID: 37733 Credit: 159,759 RAC: 0	Message 6236 - Posted 7 Feb 2011 15:55:46 UTC
	Having the same problem as mentioned above several times. 0% progress after 76 hours. Still 5 more jobs to do. ____________ meine Kiste
	ID: 6236 \| Rating: 0 \| rate: /

Michela Forum moderator Project administrator Project developer Project tester Project scientist Joined: Sep 13 06 Posts: 163 ID: 10 Credit: 97,083 RAC: 0	Message 6237 - Posted 7 Feb 2011 17:33:50 UTC - in response to Message ID 6236 .
	Having the same problem as mentioned above several times. 0% progress after 76 hours. Still 5 more jobs to do. We removed all the jobs with potential 0% progress that were in our database. Unfortunately some jobs had already been distributed by the time we worked on the database. Can you abort the jobs with 0% progress and get new jobs? Thanks, Michela ____________ If you are interested in working on Docking@Home in a great group at UDel, contact me at 'taufer at acm dot org'!
	ID: 6237 \| Rating: 0 \| rate: /

Hephaiston Joined: Feb 3 11 Posts: 3 ID: 37733 Credit: 159,759 RAC: 0	Message 6243 - Posted 7 Feb 2011 21:18:55 UTC
	All jobs aborted ____________ meine Kiste
	ID: 6243 \| Rating: 0 \| rate: /

Michela Forum moderator Project administrator Project developer Project tester Project scientist Joined: Sep 13 06 Posts: 163 ID: 10 Credit: 97,083 RAC: 0	Message 6244 - Posted 8 Feb 2011 3:40:51 UTC - in response to Message ID 6243 .
	All jobs aborted Thanks! Let us know if the new jobs have any similar issues. Michela ____________ If you are interested in working on Docking@Home in a great group at UDel, contact me at 'taufer at acm dot org'!
	ID: 6244 \| Rating: 0 \| rate: /

adrianxw Volunteer tester Joined: Dec 30 06 Posts: 164 ID: 343 Credit: 1,669,741 RAC: 0	Message 6245 - Posted 8 Feb 2011 16:35:02 UTC Last modified: 8 Feb 2011 16:36:13 UTC
	I'd ramped up the quota for Docking on a couple of machines to check for problems, and none were found. Looks okay. "This problem", assuming it is the same that has been reported in the thread for a while, could be easily caught in the jobs. When the client starts, look for the file (or files if necessary); if found, look at its size; if it's "big enough", maybe open it and read it through for some end of input marker to see if it is complete/okay. Might not catch all things but is a trivial change to make. ____________ Wave upon wave of demented avengers march cheerfully out of obscurity into the dream.
	ID: 6245 \| Rating: 0 \| rate: /
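
The three-step check proposed in this post (file present, size plausible, end-of-input marker found) could look roughly like the sketch below. It is only an illustration: `min_bytes` and `end_marker` are hypothetical parameters, since the thread does not say how large a valid input is or what its last line should be.

```python
import os


def input_looks_complete(path, min_bytes, end_marker):
    """Check existence, minimum size, and a trailing end-of-input marker."""
    if not os.path.isfile(path):
        return False
    if os.path.getsize(path) < min_bytes:
        return False
    # Compare the last non-blank line against the expected marker,
    # ignoring case and surrounding whitespace.
    with open(path, "r", errors="replace") as handle:
        lines = [line.strip() for line in handle if line.strip()]
    return bool(lines) and lines[-1].lower() == end_marker.lower()
```

As the post says, this would not catch every kind of corruption, but an empty or truncated file fails one of the three tests before any CPU time is spent.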

Michela Forum moderator Project administrator Project developer Project tester Project scientist Joined: Sep 13 06 Posts: 163 ID: 10 Credit: 97,083 RAC: 0	Message 6246 - Posted 9 Feb 2011 17:01:57 UTC - in response to Message ID 6245 .
	I'd ramped up the quota for Docking on a couple of machines to check for problems, and none were found. Looks okay. "This problem", assuming it is the same that has been reported in the thread for a while, could be easily caught in the jobs. When the client starts, look for the file (or files if necessary); if found, look at its size; if it's "big enough", maybe open it and read it through for some end of input marker to see if it is complete/okay. Might not catch all things but is a trivial change to make. The solution we were considering would require us to recompile charmm with BOINC, and this task can be very challenging considering the complexity of charmm. At this point we have a nagios system in place alerting us on the quota of the disks and the status of the daemons. This should help a lot. Michela ____________ If you are interested in working on Docking@Home in a great group at UDel, contact me at 'taufer at acm dot org'!
	ID: 6246 \| Rating: 0 \| rate: /
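
For readers unfamiliar with this kind of monitoring: a Nagios-style disk check boils down to comparing disk usage against a warning and a critical threshold and reporting the result as a status code. The sketch below follows the usual plugin convention (0 = OK, 1 = WARNING, 2 = CRITICAL); the function names and the 80%/90% thresholds are assumptions for illustration, not the project's actual configuration.

```python
import shutil

# Nagios plugin convention: exit status 0 = OK, 1 = WARNING, 2 = CRITICAL.
OK, WARNING, CRITICAL = 0, 1, 2


def disk_status(used_fraction, warn_at=0.80, crit_at=0.90):
    """Map a used-space fraction (0.0 to 1.0) to a Nagios-style status code."""
    if used_fraction >= crit_at:
        return CRITICAL
    if used_fraction >= warn_at:
        return WARNING
    return OK


def check_disk(path, warn_at=0.80, crit_at=0.90):
    """Return (status, message) for the filesystem that contains path."""
    usage = shutil.disk_usage(path)
    used_fraction = usage.used / usage.total
    status = disk_status(used_fraction, warn_at, crit_at)
    label = ("OK", "WARNING", "CRITICAL")[status]
    return status, f"DISK {label} - {used_fraction:.0%} used on {path}"
```

A real plugin would print the message and exit with the status code; an alert at the warning level leaves time to free space before the generator starts writing empty files.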

adrianxw Volunteer tester Joined: Dec 30 06 Posts: 164 ID: 343 Credit: 1,669,741 RAC: 0	Message 6247 - Posted 10 Feb 2011 15:46:21 UTC Last modified: 10 Feb 2011 15:47:14 UTC
	If you've just installed nagios, then perhaps it will help. The problems we see here have been going on, (on and off), for a LONG time though. I can see that there is only one other member of our team that is still with the project now. Good luck. ____________ Wave upon wave of demented avengers march cheerfully out of obscurity into the dream.
	ID: 6247 \| Rating: 0 \| rate: /

vaughan Volunteer tester Joined: Oct 3 06 Posts: 9 ID: 177 Credit: 3,108,281 RAC: 0	Message 6248 - Posted 11 Feb 2011 10:16:32 UTC
	Continue to get problems with 0% progress for some tasks. What file do we need to check for zero length so we can abort the dud tasks early?
	ID: 6248 \| Rating: 0 \| rate: /

Michela Forum moderator Project administrator Project developer Project tester Project scientist Joined: Sep 13 06 Posts: 163 ID: 10 Credit: 97,083 RAC: 0	Message 6249 - Posted 11 Feb 2011 13:51:44 UTC - in response to Message ID 6248 .
	Continue to get problems with 0% progress for some tasks. What file do we need to check for zero length so we can abort the dud tasks early? Can you please tell us the name of the jobs with 0% progress? We deleted the old jobs still on the server and the space on disk is now plenty. We were not able to delete the jobs already distributed. I want to check if your jobs with 0% progress are old jobs. Thanks, Michela ____________ If you are interested in working on Docking@Home in a great group at UDel, contact me at 'taufer at acm dot org'!
	ID: 6249 \| Rating: 0 \| rate: /

vaughan Volunteer tester Joined: Oct 3 06 Posts: 9 ID: 177 Credit: 3,108,281 RAC: 0	Message 6250 - Posted 11 Feb 2011 14:11:30 UTC
	I have aborted all of them.
	ID: 6250 \| Rating: 0 \| rate: /

Michela Forum moderator Project administrator Project developer Project tester Project scientist Joined: Sep 13 06 Posts: 163 ID: 10 Credit: 97,083 RAC: 0	Message 6251 - Posted 11 Feb 2011 15:39:25 UTC - in response to Message ID 6250 .
	I have aborted all of them. OK, this is a good decision. Please send me an e-mail or submit an entry to this forum, together with the names of the jobs, if there are any other jobs with 0% progress. Michela ____________ If you are interested in working on Docking@Home in a great group at UDel, contact me at 'taufer at acm dot org'!
	ID: 6251 \| Rating: 0 \| rate: /

JonJen Joined: Aug 27 10 Posts: 2 ID: 32294 Credit: 209,242 RAC: 0	Message 6252 - Posted 11 Feb 2011 18:28:43 UTC
	Thank you so much Docking ppl 4 posting a message to me via the screen saver graphics. Not sure how many months I have been getting the 0% progress error WU, but I aborted it as ordered. (^:= Here are the messages I got today regarding it.
2/11/2011 10:14:27 AM Docking task 1hvk1hbv_mod0014crossdockinghiv1_28102_18972_0 aborted by user
2/11/2011 10:14:37 AM Docking update requested by user
2/11/2011 10:14:39 AM Docking Sending scheduler request: Requested by user.
2/11/2011 10:14:39 AM Docking Reporting 1 completed tasks, requesting new tasks for GPU
2/11/2011 10:14:40 AM Docking Scheduler request completed: got 0 new tasks
2/11/2011 10:14:40 AM Docking [error] garbage_collect(); still have active task for acked result 1hvk1hbv_mod0014crossdockinghiv1_28102_18972_0; state 5
2/11/2011 10:14:41 AM Docking Computation for task 1hvk1hbv_mod0014crossdockinghiv1_28102_18972_0 finished
2/11/2011 10:14:41 AM Docking Output file 1hvk1hbv_mod0014crossdockinghiv1_28102_18972_0_0 for task 1hvk1hbv_mod0014crossdockinghiv1_28102_18972_0 absent
2/11/2011 10:14:41 AM Docking Output file 1hvk1hbv_mod0014crossdockinghiv1_28102_18972_0_1 for task 1hvk1hbv_mod0014crossdockinghiv1_28102_18972_0 absent
2/11/2011 10:14:41 AM Docking Output file 1hvk1hbv_mod0014crossdockinghiv1_28102_18972_0_2 for task 1hvk1hbv_mod0014crossdockinghiv1_28102_18972_0 absent
2/11/2011 10:14:48 AM Docking [error] Couldn't delete file projects/docking.cis.udel.edu/1hvk1hbv_mod0014crossdockinghiv1_28102_18972.inp
2/11/2011 10:14:54 AM Docking [error] Couldn't delete file projects/docking.cis.udel.edu/1hvk1hbv_mod0014crossdockinghiv1_28102_18972_0_3
2/11/2011 10:14:54 AM Resuming computation
2/11/2011 10:15:54 AM Docking Sending scheduler request: To fetch work.
2/11/2011 10:15:54 AM Docking Requesting new tasks for GPU
2/11/2011 10:15:55 AM Docking Scheduler request completed: got 0 new tasks
2/11/2011 10:17:00 AM Docking Sending scheduler request: To fetch work.
2/11/2011 10:17:00 AM Docking Requesting new tasks for GPU
2/11/2011 10:17:02 AM Docking Scheduler request completed: got 0 new tasks
2/11/2011 10:18:07 AM Docking Sending scheduler request: To fetch work.
2/11/2011 10:18:07 AM Docking Requesting new tasks for GPU
2/11/2011 10:18:08 AM Docking Scheduler request completed: got 0 new tasks
2/11/2011 10:23:13 AM Docking Sending scheduler request: To fetch work.
2/11/2011 10:23:13 AM Docking Requesting new tasks for GPU
2/11/2011 10:23:14 AM Docking Scheduler request completed: got 0 new tasks
____________
	ID: 6252 \| Rating: 0 \| rate: /

Michela Forum moderator Project administrator Project developer Project tester Project scientist Joined: Sep 13 06 Posts: 163 ID: 10 Credit: 97,083 RAC: 0	Message 6253 - Posted 11 Feb 2011 21:20:46 UTC - in response to Message ID 6252 .
	2/11/2011 10:14:54 AM Resuming computation 2/11/2011 10:15:54 AM Docking Sending scheduler request: To fetch work. 2/11/2011 10:15:54 AM Docking Requesting new tasks for GPU 2/11/2011 10:15:55 AM Docking Scheduler request completed: got 0 new tasks 2/11/2011 10:17:00 AM Docking Sending scheduler request: To fetch work. 2/11/2011 10:17:00 AM Docking Requesting new tasks for GPU 2/11/2011 10:17:02 AM Docking Scheduler request completed: got 0 new tasks 2/11/2011 10:18:07 AM Docking Sending scheduler request: To fetch work. 2/11/2011 10:18:07 AM Docking Requesting new tasks for GPU 2/11/2011 10:18:08 AM Docking Scheduler request completed: got 0 new tasks 2/11/2011 10:23:13 AM Docking Sending scheduler request: To fetch work. 2/11/2011 10:23:13 AM Docking Requesting new tasks for GPU 2/11/2011 10:23:14 AM Docking Scheduler request completed: got 0 new tasks This is strange, your client is continuously asking for GPU jobs and we do not support GPUs yet. I would expect that eventually the client starts asking for CPU jobs. What version of the client do you have? ____________ If you are interested in working on Docking@Home in a great group at UDel, contact me at 'taufer at acm dot org'!
	ID: 6253 \| Rating: 0 \| rate: /

zioriga Joined: Sep 3 08 Posts: 1 ID: 409 Credit: 225,585 RAC: 0	Message 6256 - Posted 14 Feb 2011 6:46:17 UTC
	This is another task with 0% progress 2/14/2011 7:32:56 AM \| Docking \| Restarting task 1hvk1hbv_mod0014crossdockinghiv1_4627_410868_0 using charmm34 version 623 Few days ago I had some other WU with 0% progress in a neverending crunching time. I aborted them, as I did with the above. I use Boinc Manager 6.12.14 (x64), with XP 64b
	ID: 6256 \| Rating: 0 \| rate: /

Ananas Joined: Aug 29 09 Posts: 56 ID: 17736 Credit: 2,500,425 RAC: 0	Message 6257 - Posted 14 Feb 2011 22:15:44 UTC Last modified: 14 Feb 2011 22:17:04 UTC
	The daily project throughput went down below 1/3rd of what it used to be, I doubt that one specific client version or only a few specific boxes have a problem. For the tests, try one of those : 1hvi1hbv_mod0014crossdockinghiv1_11505_235903_0 1hvi1hbv_mod0014crossdockinghiv1_11107_146009_0 1hvi1hbv_mod0014crossdockinghiv1_11617_141160_0 (this is one with errorcode 1, no infinite runtime)
	ID: 6257 \| Rating: 0 \| rate: /

Michela Forum moderator Project administrator Project developer Project tester Project scientist Joined: Sep 13 06 Posts: 163 ID: 10 Credit: 97,083 RAC: 0	Message 6260 - Posted 15 Feb 2011 1:20:06 UTC - in response to Message ID 6257 .
	Most of the workunits that were affected during the disk problem were 1hvl1hbv and 1hvk1hbv. We have not been distributing empty jobs since last week. We are still in a recovery mode and thus we have reduced the generation of jobs by half while we make sure everything is fine. Michela ____________ If you are interested in working on Docking@Home in a great group at UDel, contact me at 'taufer at acm dot org'!
	ID: 6260 \| Rating: 0 \| rate: /

robertmiles Joined: Apr 16 09 Posts: 96 ID: 9967 Credit: 1,290,747 RAC: 0	Message 6261 - Posted 15 Feb 2011 4:44:37 UTC - in response to Message ID 6253 . Last modified: 15 Feb 2011 4:52:11 UTC
	2/11/2011 10:14:54 AM Resuming computation 2/11/2011 10:15:54 AM Docking Sending scheduler request: To fetch work. 2/11/2011 10:15:54 AM Docking Requesting new tasks for GPU 2/11/2011 10:15:55 AM Docking Scheduler request completed: got 0 new tasks 2/11/2011 10:17:00 AM Docking Sending scheduler request: To fetch work. 2/11/2011 10:17:00 AM Docking Requesting new tasks for GPU 2/11/2011 10:17:02 AM Docking Scheduler request completed: got 0 new tasks 2/11/2011 10:18:07 AM Docking Sending scheduler request: To fetch work. 2/11/2011 10:18:07 AM Docking Requesting new tasks for GPU 2/11/2011 10:18:08 AM Docking Scheduler request completed: got 0 new tasks 2/11/2011 10:23:13 AM Docking Sending scheduler request: To fetch work. 2/11/2011 10:23:13 AM Docking Requesting new tasks for GPU 2/11/2011 10:23:14 AM Docking Scheduler request completed: got 0 new tasks This is strange, your client is continuously asking for GPU jobs and we do not support GPUs yet. I would expect that eventually the client starts asking for CPU jobs. What version of the client do you have? The current versions of the BOINC client software will, if your computer has a BOINC-usable GPU, send ALL the connected projects requests for both CPU workunits and GPU workunits. However, the current versions of the BOINC server software allow you to reduce, but not totally eliminate, this - it allows the server to send a response telling the client not to ask for any more workunits of the type requested for up to about a week. This should allow you, for as long as you're not even planning any GPU workunits, to tell any client that sends a request for GPU workunits only that it should not send another such request for about a week. The 6.10.* series of BOINC client programs has this feature. It will eventually start asking for CPU workunits, but usually only after it gets at least one GPU workunit from SOME project. 
If you're looking for a project that sends only GPU workunits, I've found two: GPUGRID sends protein-folding workunits, but only if you have a sufficiently high-end Nvidia GPU (a GT 220 is currently about the lowest it will use). Collatz Conjecture sends workunits related to some math problem, but to almost any GPU that BOINC can use. http://www.gpugrid.net/ http://boinc.thesonntags.com/collatz/ An idea on how to handle the input file size checking: Add a wrapper program that checks the size of the input file, then passes control to the main application program ONLY if the input file passes this test.
	ID: 6261 \| Rating: 0 \| rate: /
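
The wrapper idea at the end of this post can be sketched in a few lines. This is an illustration only, not BOINC's API or the project's code: `run_if_input_ok`, the exit code, and the default `min_bytes` are all hypothetical.

```python
import os
import subprocess
import sys

EXIT_BAD_INPUT = 1  # assumption: any non-zero code that marks the task as failed


def run_if_input_ok(input_path, command, min_bytes=1):
    """Launch command only when input_path exists and has at least min_bytes bytes."""
    if not os.path.isfile(input_path) or os.path.getsize(input_path) < min_bytes:
        print(f"input file {input_path} missing or too small; not starting",
              file=sys.stderr)
        return EXIT_BAD_INPUT
    # Hand control to the real application and pass its exit code through.
    return subprocess.run(command).returncode
```

The point of the design is that a bad input makes the task fail within seconds with an error code, instead of sitting at 0% progress until a volunteer notices, and it avoids touching the science application itself.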

Fred Verster Joined: May 8 09 Posts: 26 ID: 11034 Credit: 2,647,353 RAC: 0	Message 6263 - Posted 16 Feb 2011 13:10:45 UTC Last modified: 16 Feb 2011 13:49:22 UTC
	16-2-2011 11:59:41 Docking Restarting task 1hvk1hbv_mod0014crossdockinghiv1_19536_75277_0 using charmm34 version 623
16-2-2011 11:59:42 Docking Restarting task 1hvk1hbv_mod0014crossdockinghiv1_19530_195401_0 using charmm34 version 623
16-2-2011 11:59:42 Docking Restarting task 1hvk1hbv_mod0014crossdockinghiv1_19843_57465_0 using charmm34 version 623
16-2-2011 11:59:42 Docking Restarting task 1hvk1hbv_mod0014crossdockinghiv1_23011_479024_0 using charmm34 version 623
Hi, long time since I posted here, but now I have again noticed some tasks, see above, showing no progress at all. Are they empty? The CPU, a Q6600, is showing 100% usage (4 x 25%), so that isn't likely! Anyone else experiencing this abnormal behavior? Some other WU's are paused, for whatever reason, and all 4 are at exactly 2 hours and 40 minutes and xx seconds. Should I delete these WU's??? I answer this myself: YES, delete them all! OK ......Done ......... They were due 14-15 and 16 feb.2011, so a little late and probably also empty?! ____________ Knight who says N! Ni Ni
	ID: 6263 \| Rating: 0 \| rate: /

adrianxw Volunteer tester Joined: Dec 30 06 Posts: 164 ID: 343 Credit: 1,669,741 RAC: 0	Message 6265 - Posted 17 Feb 2011 14:43:50 UTC Last modified: 17 Feb 2011 14:48:40 UTC
	Good number of "Client Error" wu's today. Most seem to fail after some multiple of ~300 seconds, (300, 600, 900, 1200 you get the picture). Example ____________ Wave upon wave of demented avengers march cheerfully out of obscurity into the dream.
	ID: 6265 \| Rating: 0 \| rate: /

Michela Forum moderator Project administrator Project developer Project tester Project scientist Joined: Sep 13 06 Posts: 163 ID: 10 Credit: 97,083 RAC: 0	Message 6266 - Posted 17 Feb 2011 16:14:54 UTC - in response to Message ID 6265 .
	Good number of "Client Error" wu's today. Most seem to fail after some multiple of ~300 seconds, (300, 600, 900, 1200 you get the picture). Example We are working on this problem right now! We will keep you posted. Thanks! ____________ If you are interested in working on Docking@Home in a great group at UDel, contact me at 'taufer at acm dot org'!
	ID: 6266 \| Rating: 0 \| rate: /

Michela Forum moderator Project administrator Project developer Project tester Project scientist Joined: Sep 13 06 Posts: 163 ID: 10 Credit: 97,083 RAC: 0	Message 6267 - Posted 17 Feb 2011 18:09:43 UTC - in response to Message ID 6266 .
	One of the ligands, ligand 1hih, really did not want to dock into protein conformations other than the one in which it was observed experimentally. So in the cross-docking simulation, no matter what protein conformation we were using, the simulation was very short and inconclusive, besides creating D@H problems. We removed the whole batch of simulations with this ligand and will work with our scientists to understand the scientific reason for this problem. We are distributing a new batch of jobs with another ligand and this time it seems to work OK. Protein-ligand docking is definitely not a deterministic thing! Thanks for the alert! Michela ____________ If you are interested in working on Docking@Home in a great group at UDel, contact me at 'taufer at acm dot org'!
	ID: 6267 \| Rating: 0 \| rate: /

Hephaiston Joined: Feb 3 11 Posts: 3 ID: 37733 Credit: 159,759 RAC: 0	Message 6268 - Posted 18 Feb 2011 1:35:21 UTC - in response to Message ID 6244 .
	All jobs aborted Thanks! Let us know if the new jobs have any similar issues. Michela Above is your message from 8 Feb. I received a new job that day, after aborting the old jobs. The jobs started today (charmm 34a2 6.23) and quit after one second of working time with "error while computing" and a progress of 100%. No problems with jobs from any other project today or in the last few days. ____________ meine Kiste
	ID: 6268 \| Rating: 0 \| rate: /

johnsone79 Joined: Apr 1 09 Posts: 1 ID: 9235 Credit: 995,312 RAC: 0	Message 6271 - Posted 23 Feb 2011 1:36:52 UTC
	For some reason all my workunits on my new computer i7 2600k are stalling at 0.000% progress while those on my old computer Intel core duo are still running fine. Any idea on how to fix this? My new computer is burning through workunits on other projects, and I want them to do the same here.
	ID: 6271 \| Rating: 0 \| rate: /

Ananas Joined: Aug 29 09 Posts: 56 ID: 17736 Credit: 2,500,425 RAC: 0	Message 6277 - Posted 27 Feb 2011 21:30:27 UTC
	"does not want to dock" should not be treated as an error status, it is a result just like "docks easily" or "might dock if there is no better interface available". If the program can handle this situation, there would probably be less results that error out or run forever.
	ID: 6277 \| Rating: 0 \| rate: /

Michela Forum moderator Project administrator Project developer Project tester Project scientist Joined: Sep 13 06 Posts: 163 ID: 10 Credit: 97,083 RAC: 0	Message 6279 - Posted 28 Feb 2011 14:18:41 UTC - in response to Message ID 6277 .
	"does not want to dock" should not be treated as an error status, it is a result just like "docks easily" or "might dock if there is no better interface available". If the program can handle this situation, there would probably be less results that error out or run forever. The docking simulation can evolve toward a state in which the energy of the complex does not make any sense and the traditional charmm executable aborts. We wrapped charmm to catch these errors, terminate gently, and send us proper information. We also changed the application to give partial credits for partial simulations, once the initial phase of the simulation (when the ligand is located into the docking pocket) is successful. Right now we run a set of short simulations on a testing server for each complex to make sure that the simulations can complete. Unfortunately, this does not necessity mean that we are always able to capture all the possible problems of a complex simulation. Error and energy violations are hard to predict a priori, especially with the type of simulations we are doing right now in which we cross-dock proteins and ligands that were not observed experimentally. Our next step toward preventing this problem is as follows: we will extend the testing phase. Michela ____________ If you are interested in working on Docking@Home in a great group at UDel, contact me at 'taufer at acm dot org'!
	ID: 6279 \| Rating: 0 \| rate: /

BeemerBiker Joined: Aug 8 09 Posts: 3 ID: 16805 Credit: 692,240 RAC: 0	Message 6350 - Posted 28 May 2011 19:16:42 UTC
	I have seen the same problem on 2 different systems (Linux and Windows). While monitoring with BOINCTASKS, I notice the % complete is way past 100%. Stopping and starting BOINC causes the task to start over at 0.0% even though it might have run for 8-12 hours at over 200% or more. Viewing results with BOINCMANAGER, one never sees past 0 percent, as the percent complete is not calculated the same way in BM as in BT. Isn't the project supposed to terminate the app if it goes beyond some magic number such as 10x the expected computation time? Nor should it start over at 0% when the system reboots or BOINC restarts.
	ID: 6350 \| Rating: 0 \| rate: /

robertmiles Joined: Apr 16 09 Posts: 96 ID: 9967 Credit: 1,290,747 RAC: 0	Message 6351 - Posted 29 May 2011 5:04:48 UTC - in response to Message ID 6350 .
	I have seen the same problem on 2 different systems (Linux and Windows). While monitoring with BOINCTASKS, I notice the % complete is way past 100%. Stopping and starting BOINC causes the task to start over at 0.0% even though it might have run for 8-12 hours at over 200% or more. Viewing results with BOINCMANAGER, one never sees past 0 percent, as the percent complete is not calculated the same way in BM as in BT. Isn't the project supposed to terminate the app if it goes beyond some magic number such as 10x the expected computation time? Nor should it start over at 0% when the system reboots or BOINC restarts. You may want to try observing the timing values for the last checkpoint before shutting down BOINC, since that's the critical factor in when workunits can be restarted. In my version of BOINC Manager, advanced view, Tasks, just click on the workunit, then Properties. Do not expect it to be able to resume any more recently than the last checkpoint after a system restart or a BOINC restart - BOINC simply does not have that capability. However, if your operating system supports sleep mode, and you suspend all workunits within BOINC but do not shut BOINC down entirely, the operating system should be able to go into sleep mode while still preserving the memory contents needed to resume the workunits where they were suspended, IF you have enabled the option to keep workunits in memory while they are suspended. Also, you may want to check if the operating system agrees that the workunit is still using any CPU time. If not, do not expect any time limits built into the application program to work - the code checking for exceeding that time limit cannot run with no CPU time at all.
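The checkpoint behaviour described above can be illustrated with a toy loop. This is a hedged sketch, not how charmm or the BOINC client actually stores checkpoints: the JSON file format, the function names, and the `stop_after` parameter (which simulates a shutdown) are all assumptions made for illustration.

```python
import json
import os


def load_checkpoint(path):
    """Return the last saved step, or 0 when no usable checkpoint exists."""
    try:
        with open(path) as f:
            return json.load(f)["step"]
    except (FileNotFoundError, ValueError, KeyError, TypeError):
        return 0


def save_checkpoint(path, step):
    """Write the checkpoint to a temporary file, then swap it into place."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump({"step": step}, f)
    os.replace(tmp, path)


def run(path, total_steps, checkpoint_every, stop_after=None):
    """Resume from the last checkpoint and return the last step reached.

    stop_after simulates a shutdown after that many steps in this run.
    """
    step = load_checkpoint(path)
    done_this_run = 0
    while step < total_steps:
        if stop_after is not None and done_this_run >= stop_after:
            break
        step += 1  # stand-in for one unit of real computation
        done_this_run += 1
        if step % checkpoint_every == 0:
            save_checkpoint(path, step)
    return step
```

If a run is stopped at step 25 with a checkpoint every 10 steps, the next run resumes from step 20; a task that never writes a checkpoint resumes from 0 every time, which matches the restart-from-zero symptom reported in this thread.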
	ID: 6351 \| Rating: 0 \| rate: /

Palamedes Joined: Dec 7 09 Posts: 2 ID: 22549 Credit: 16,855 RAC: 0	Message 6358 - Posted 11 Jun 2011 21:35:17 UTC - in response to Message ID 6271 .
	For some reason all my workunits on my new computer i7 2600k are stalling at 0.000% progress while those on my old computer Intel core duo are still running fine. Any idea on how to fix this? My new computer is burning through workunits on other projects, and I want them to do the same here. I'm having the same issue. I have 8 docking work units running at once and they all stay at zero percent. More over the elapsed time seems to count up to about a minute forty five or so then resets to zero. ------------------ System Information ------------------ Time of this report: 6/11/2011, 16:32:00 Machine name: VESPID Operating System: Windows 7 Professional 64-bit (6.1, Build 7600) (7600.win7_rtm.090713-1255) Language: English (Regional Setting: English) System Manufacturer: MSI System Model: MS-7681 BIOS: BIOS Date: 03/02/11 10:58:35 Ver: 04.06.04 Processor: Intel(R) Core(TM) i7-2600K CPU @ 3.40GHz (8 CPUs), ~3.4GHz Memory: 16384MB RAM Available OS Memory: 16364MB RAM Page File: 5003MB used, 27723MB available
	ID: 6358 \| Rating: 0 \| rate: /

Michela Forum moderator Project administrator Project developer Project tester Project scientist Joined: Sep 13 06 Posts: 163 ID: 10 Credit: 97,083 RAC: 0	Message 6360 - Posted 12 Jun 2011 22:19:01 UTC - in response to Message ID 6358 .
	For some reason all my workunits on my new computer i7 2600k are stalling at 0.000% progress while those on my old computer Intel core duo are still running fine. Any idea on how to fix this? My new computer is burning through workunits on other projects, and I want them to do the same here. I'm having the same issue. I have 8 docking work units running at once and they all stay at zero percent. More over the elapsed time seems to count up to about a minute forty five or so then resets to zero. ------------------ System Information ------------------ Time of this report: 6/11/2011, 16:32:00 Machine name: VESPID Operating System: Windows 7 Professional 64-bit (6.1, Build 7600) (7600.win7_rtm.090713-1255) Language: English (Regional Setting: English) System Manufacturer: MSI System Model: MS-7681 BIOS: BIOS Date: 03/02/11 10:58:35 Ver: 04.06.04 Processor: Intel(R) Core(TM) i7-2600K CPU @ 3.40GHz (8 CPUs), ~3.4GHz Memory: 16384MB RAM Available OS Memory: 16364MB RAM Page File: 5003MB used, 27723MB available We are looking at this. Thanks for the note! Michela ____________ If you are interested in working on Docking@Home in a great group at UDel, contact me at 'taufer at acm dot org'!
	ID: 6360 \| Rating: 0 \| rate: /

Michela Forum moderator Project administrator Project developer Project tester Project scientist Joined: Sep 13 06 Posts: 163 ID: 10 Credit: 97,083 RAC: 0	Message 6363 - Posted 12 Jun 2011 23:48:39 UTC - in response to Message ID 6360 .
	This is a report of the jobs associated with 1hvi1hpv. The testing process that is performed before distributing a new complex to volunteers did not capture any major problem. There are 11,684 jobs with server state = over, 692 of them with outcome = client error. That is around 6% of them (including aborted and failing), and the failures come from the same hosts (around 30-40 of them). The jobs distributed but not returned number 17,958; the jobs generated but not distributed number 252; the jobs still to be generated number 0, since we moved on to the next complex. Please abort jobs with 0% progress. Docking is not a deterministic simulation and some docking attempts can fail. Our testing can successfully capture most of the cases but not all. Thanks! MT ____________ If you are interested in working on Docking@Home in a great group at UDel, contact me at 'taufer at acm dot org'!
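The figures in this report can be checked with a few lines of arithmetic. The variable names are invented for illustration; the numbers are the ones quoted in the post.

```python
# Job counts quoted in the report for complex 1hvi1hpv.
jobs_over = 11_684        # server state = over
client_errors = 692       # of those, outcome = client error
jobs_in_progress = 17_958  # distributed but not yet returned
jobs_unsent = 252         # generated but not yet distributed

# Share of finished jobs that ended in a client error.
error_rate = client_errors / jobs_over

# All jobs generated for this complex so far.
total_generated = jobs_over + jobs_in_progress + jobs_unsent

print(f"client error rate: {error_rate:.1%}")
print(f"jobs generated:    {total_generated}")
```

The rate works out to just under 6% of the returned jobs, consistent with the "around 6%" in the report.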
	ID: 6363 \| Rating: 0 \| rate: /

Vepide Joined: Jun 13 11 Posts: 4 ID: 41381 Credit: 0 RAC: 0	Message 6365 - Posted 13 Jun 2011 19:43:35 UTC
	I just attached to D@H yesterday and I'm getting this problem too: WUs stay at 0% with no progress, and the screen saver says to abort any WUs displaying 0%. I have suspended D@H until this problem is resolved. Running Windows 7 Ultimate 64-bit on a Q9650 OC'd to 3.6GHz, ASUS P5E3 Premium, 8GB RAM and a 4870 X2. ____________
	ID: 6365 \| Rating: 0 \| rate: /

robertmiles Joined: Apr 16 09 Posts: 96 ID: 9967 Credit: 1,290,747 RAC: 0	Message 6366 - Posted 14 Jun 2011 2:43:54 UTC Last modified: 14 Jun 2011 2:45:58 UTC
	I have a 1hvi1hpv workunit on one of my computers, but since it's already at 18% progress I plan to let it run for now. Some ideas for Docking@Home to consider: Add another thread to their application program, and move most of the checking of whether the rest of the application program is still doing anything useful there. Depending on what the cause of the 0% progress is, this may allow checking for it to continue. Add a section to their application program which, if the workunit asks for it, will gather more information on the details of what kind of computer it is running on, and write this information to a separate output file BEFORE going on to the rest of the application program. If you later no longer need it, it should be easy to turn off by changing the workunits instead of the application program. If there is any need to write more to this file later, it should be reopened first so that it will be preserved past most workunit failures.
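The first suggestion above, a separate thread that checks whether the main computation is still making progress, might be sketched as follows. This is an illustration under assumptions, not the project's code: the class name `ProgressWatchdog`, the timeout, and the polling interval are all hypothetical, and the injectable `clock` exists only to make the logic easy to test.

```python
import threading
import time


class ProgressWatchdog:
    """Flags a stall when the reported progress has not changed for timeout_s."""

    def __init__(self, timeout_s, clock=time.monotonic):
        self._timeout_s = timeout_s
        self._clock = clock
        self._lock = threading.Lock()
        self._progress = 0.0
        self._last_change = clock()
        self.stalled = threading.Event()

    def report(self, progress):
        """Called by the worker; only a changed value resets the stall timer."""
        with self._lock:
            if progress != self._progress:
                self._progress = progress
                self._last_change = self._clock()

    def check(self):
        """Set the stalled flag if progress has been idle too long; return the flag."""
        with self._lock:
            idle = self._clock() - self._last_change
        if idle > self._timeout_s:
            self.stalled.set()
        return self.stalled.is_set()

    def watch(self, poll_s, stop):
        """Poll until a stall is detected or stop is set."""
        while not stop.is_set() and not self.check():
            stop.wait(poll_s)


def start_watchdog(watchdog, poll_s=1.0):
    """Run watchdog.watch in a daemon thread; return the thread and its stop event."""
    stop = threading.Event()
    thread = threading.Thread(target=watchdog.watch, args=(poll_s, stop), daemon=True)
    thread.start()
    return thread, stop
```

The worker would call `report()` with its fraction done, and the main program could abort cleanly once `stalled` is set. As noted earlier in the thread, a check like this only helps while the process still receives CPU time.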
	ID: 6366 \| Rating: 0 \| rate: /

vaughan Volunteer tester Joined: Oct 3 06 Posts: 9 ID: 177 Credit: 3,108,281 RAC: 0	Message 6369 - Posted 25 Jun 2011 13:52:26 UTC
	Does Docking still have the annoying 0.000% progress bug?
	ID: 6369 \| Rating: 0 \| rate: /

Conan Volunteer tester Joined: Sep 13 06 Posts: 219 ID: 100 Credit: 4,256,493 RAC: 0	Message 6370 - Posted 27 Jun 2011 11:46:47 UTC - in response to Message ID 6369 .
	Does Docking still have the annoying 0.000% progress bug? G'Day Vaughan, I have not noticed it in the past two weeks of processing work units. I am running both Windows and Linux, on 5 AMD Phenom processors, and so far there have been no problems at all. Conan ____________
	ID: 6370 \| Rating: 0 \| rate: /

vaughan Volunteer tester Joined: Oct 3 06 Posts: 9 ID: 177 Credit: 3,108,281 RAC: 0	Message 6371 - Posted 2 Jul 2011 7:05:36 UTC - in response to Message ID 6370 . Last modified: 2 Jul 2011 7:06:21 UTC
	Does Docking still have the annoying 0.000% progress bug? G'Day Vaughan, I have not noticed it in the past two weeks of processing work units. I am running both Windows and Linux, on 5 AMD Phenom processors, and so far there have been no problems at all. Conan Thanks Conan. Yes, it seems to be behaving now.
	ID: 6371 \| Rating: 0 \| rate: /

Ed Joined: Jul 30 11 Posts: 11 ID: 42642 Credit: 0 RAC: 0	Message 6426 - Posted 31 Jul 2011 15:51:10 UTC
	I just joined and I seem to have this 0% issue.
	ID: 6426 \| Rating: 0 \| rate: /

Michela Forum moderator Project administrator Project developer Project tester Project scientist Joined: Sep 13 06 Posts: 163 ID: 10 Credit: 97,083 RAC: 0	Message 6433 - Posted 3 Aug 2011 2:52:20 UTC - in response to Message ID 6426 .
	I just joined and I seem to have this 0% issue. Hi, can I please have the names of the jobs with 0% progress? I just checked the server and we have space on the disk (in the past, lack of disk space was one of the reasons for the problem). The testing machines in the lab seem to crunch well. We will look at this in detail tomorrow morning. Michela ____________ If you are interested in working on Docking@Home in a great group at UDel, contact me at 'taufer at acm dot org'!
	ID: 6433 \| Rating: 0 \| rate: /

Vepide Joined: Jun 13 11 Posts: 4 ID: 41381 Credit: 0 RAC: 0	Message 6543 - Posted 2 Jan 2012 17:30:36 UTC
	I just rejoined the project and found the same problem that made me leave it before: zero progress bar, WUs not terminating the process, etc. When I get around to it I will try to install Windows 8 onto a new hard drive and see if the problem extends to 64-bit Windows 8. I have another post here going back to June of 2011, and the problem clearly has not been solved, so I am suspending the project for now. It is unknown whether it is a processor-specific problem or a Win7 64-bit problem, in which case it's just bad project coding on the 64-bit Win7 platform. The only few INP files I found contained this html code. <soft_link>../../projects/docking.cis.udel.edu/1t7k1htf_mod0014crossdockinghiv1_66702_386423.inp</soft_link>
	ID: 6543 \| Rating: 0 \| rate: /

GlowClam Joined: Feb 3 12 Posts: 1 ID: 49581 Credit: 0 RAC: 0	Message 6560 - Posted 3 Feb 2012 20:18:02 UTC Last modified: 3 Feb 2012 20:39:33 UTC
	Today I joined this project and ran into the same 0% bug. I aborted the two WUs I was given because they showed 0% progress while eating CPU time. This problem seems to have existed for a while... I think that computing for docking@home is a good thing to do. But now I am not happy, because I have to tell my PC not to get WUs from this project anymore until this error is fixed by the team. Or is something wrong with my setup? The BOINC Manager log is: 03.02.2012 15:38:43 \| \| Starting BOINC client version 6.12.34 for windows_x86_64 03.02.2012 15:38:43 \| \| log flags: file_xfer, sched_ops, task 03.02.2012 15:38:43 \| \| Libraries: libcurl/7.21.6 OpenSSL/1.0.0d zlib/1.2.5 03.02.2012 15:38:43 \| \| Data directory: C:\ProgramData\BOINC 03.02.2012 15:38:43 \| \| Running under account ... 03.02.2012 15:38:43 \| \| Processor: 2 GenuineIntel Intel(R) Core(TM)2 Duo CPU E8400 @ 3.00GHz [Family 6 Model 23 Stepping 10] 03.02.2012 15:38:43 \| \| Processor: 6.00 MB cache 03.02.2012 15:38:43 \| \| Processor features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss htt tm pni ssse3 cx16 sse4_1 syscall nx lm vmx smx tm2 pbe 03.02.2012 15:38:43 \| \| OS: Microsoft Windows 7: Home Premium x64 Edition, Service Pack 1, (06.01.7601.00) 03.02.2012 15:38:43 \| \| Memory: 4.00 GB physical, 8.00 GB virtual 03.02.2012 15:38:43 \| \| Disk: 97.56 GB total, 60.01 GB free 03.02.2012 15:38:43 \| \| Local time is UTC +1 hours 03.02.2012 15:38:43 \| \| NVIDIA GPU 0: GeForce 9500 GT (driver version 28562, CUDA version 4010, compute capability 1.1, 1024MB, 72 GFLOPS peak) ... 03.02.2012 20:07:15 \| \| Attaching to http://docking.cis.udel.edu/ 03.02.2012 20:07:19 \| http://docking.cis.udel.edu/ \| Master file download succeeded 03.02.2012 20:07:24 \| http://docking.cis.udel.edu/ \| Sending scheduler request: Project initialization. 
03.02.2012 20:07:24 \| http://docking.cis.udel.edu/ \| Requesting new tasks for CPU and NVIDIA GPU 03.02.2012 20:07:27 \| Docking \| Scheduler request completed: got 1 new tasks 03.02.2012 20:07:27 \| \| Couldn't parse preferences file - using BOINC defaults 03.02.2012 20:07:27 \| \| Reading preferences override file 03.02.2012 20:07:27 \| \| Preferences: 03.02.2012 20:07:27 \| \| max memory usage when active: 2047.56MB 03.02.2012 20:07:27 \| \| max memory usage when idle: 3685.61MB 03.02.2012 20:07:27 \| \| max disk usage: 10.00GB 03.02.2012 20:07:27 \| \| don't use GPU while active 03.02.2012 20:07:27 \| \| suspend work if non-BOINC CPU load exceeds 25 % 03.02.2012 20:07:27 \| \| (to change preferences, visit the web site of an attached project, or select Preferences in the Manager) 03.02.2012 20:07:29 \| Docking \| Started download of charmm34_6.23_windows_x86_64 03.02.2012 20:07:29 \| Docking \| Started download of charmm34_6.23_graphics_windows_x86_64 03.02.2012 20:07:32 \| Docking \| Sending scheduler request: To fetch work. 
03.02.2012 20:07:32 \| Docking \| Requesting new tasks for NVIDIA GPU 03.02.2012 20:07:33 \| Docking \| Finished download of charmm34_6.23_graphics_windows_x86_64 03.02.2012 20:07:33 \| Docking \| Started download of 1iiq1hpv_mod0014crossdockinghiv1_9720_16010.inp 03.02.2012 20:07:34 \| Docking \| Scheduler request completed: got 0 new tasks 03.02.2012 20:07:34 \| \| Couldn't parse preferences file - using BOINC defaults 03.02.2012 20:07:34 \| \| Reading preferences override file 03.02.2012 20:07:34 \| \| Preferences: 03.02.2012 20:07:34 \| \| max memory usage when active: 2047.56MB 03.02.2012 20:07:34 \| \| max memory usage when idle: 3685.61MB 03.02.2012 20:07:34 \| \| max disk usage: 10.00GB 03.02.2012 20:07:34 \| \| don't use GPU while active 03.02.2012 20:07:34 \| \| suspend work if non-BOINC CPU load exceeds 25 % 03.02.2012 20:07:34 \| \| (to change preferences, visit the web site of an attached project, or select Preferences in the Manager) 03.02.2012 20:07:34 \| Docking \| Giving up on download of 1iiq1hpv_mod0014crossdockinghiv1_9720_16010.inp: file not found 03.02.2012 20:07:34 \| Docking \| Started download of grid_probes.rtf 03.02.2012 20:07:35 \| Docking \| Finished download of grid_probes.rtf 03.02.2012 20:07:35 \| Docking \| Started download of lpdb_amino.rtf 03.02.2012 20:07:38 \| Docking \| Finished download of lpdb_amino.rtf 03.02.2012 20:07:38 \| Docking \| Started download of lpdb.prm 03.02.2012 20:07:40 \| Docking \| Finished download of charmm34_6.23_windows_x86_64 03.02.2012 20:07:40 \| Docking \| Finished download of lpdb.prm 03.02.2012 20:07:40 \| Docking \| Started download of lpdb_probes.prm 03.02.2012 20:07:40 \| Docking \| Started download of logo.jpg 03.02.2012 20:07:41 \| Docking \| Finished download of lpdb_probes.prm 03.02.2012 20:07:41 \| Docking \| Started download of minus.jpg 03.02.2012 20:07:42 \| Docking \| Finished download of logo.jpg 03.02.2012 20:07:42 \| Docking \| Finished download of minus.jpg 03.02.2012 20:07:42 \| 
Docking \| Started download of plus.jpg 03.02.2012 20:07:42 \| Docking \| Started download of rotate_left.jpg 03.02.2012 20:07:44 \| Docking \| Finished download of plus.jpg 03.02.2012 20:07:44 \| Docking \| Finished download of rotate_left.jpg 03.02.2012 20:07:44 \| Docking \| Started download of rotate_right.jpg 03.02.2012 20:07:44 \| Docking \| Started download of helvetica.txf 03.02.2012 20:07:45 \| Docking \| Finished download of rotate_right.jpg 03.02.2012 20:07:46 \| Docking \| Finished download of helvetica.txf 03.02.2012 20:08:55 \| Docking \| Sending scheduler request: To fetch work. 03.02.2012 20:08:55 \| Docking \| Reporting 1 completed tasks, requesting new tasks for CPU 03.02.2012 20:08:59 \| Docking \| Scheduler request completed: got 2 new tasks 03.02.2012 20:08:59 \| \| Couldn't parse preferences file - using BOINC defaults 03.02.2012 20:08:59 \| \| Reading preferences override file 03.02.2012 20:08:59 \| \| Preferences: 03.02.2012 20:08:59 \| \| max memory usage when active: 2047.56MB 03.02.2012 20:08:59 \| \| max memory usage when idle: 3685.61MB 03.02.2012 20:08:59 \| \| max disk usage: 10.00GB 03.02.2012 20:08:59 \| \| don't use GPU while active 03.02.2012 20:08:59 \| \| suspend work if non-BOINC CPU load exceeds 25 % 03.02.2012 20:08:59 \| \| (to change preferences, visit the web site of an attached project, or select Preferences in the Manager) 03.02.2012 20:09:01 \| Docking \| Started download of 1t7k1m0b_mod0014crossdockinghiv1_541890_198100.inp 03.02.2012 20:09:01 \| Docking \| Started download of 1t7k1m0b_mod0014crossdockinghiv1_541891_65051.inp 03.02.2012 20:09:06 \| Docking \| Finished download of 1t7k1m0b_mod0014crossdockinghiv1_541890_198100.inp 03.02.2012 20:09:06 \| Docking \| Starting task 1t7k1m0b_mod0014crossdockinghiv1_541890_198100_0 using charmm34 version 623 03.02.2012 20:09:11 \| Docking \| Finished download of 1t7k1m0b_mod0014crossdockinghiv1_541891_65051.inp 03.02.2012 20:09:11 \| Docking \| Starting task 
1t7k1m0b_mod0014crossdockinghiv1_541891_65051_0 using charmm34 version 623 03.02.2012 20:09:21 \| \| Suspending computation - CPU is busy 03.02.2012 20:09:31 \| \| Resuming computation 03.02.2012 20:12:31 \| Docking \| Restarting task 1t7k1m0b_mod0014crossdockinghiv1_541890_198100_0 using charmm34 version 623 03.02.2012 20:12:31 \| Docking \| Restarting task 1t7k1m0b_mod0014crossdockinghiv1_541891_65051_0 using charmm34 version 623 03.02.2012 20:16:06 \| Docking \| Sending scheduler request: To fetch work. 03.02.2012 20:16:06 \| Docking \| Requesting new tasks for NVIDIA GPU 03.02.2012 20:16:10 \| Docking \| Scheduler request completed: got 0 new tasks 03.02.2012 20:16:10 \| \| Couldn't parse preferences file - using BOINC defaults 03.02.2012 20:16:10 \| \| Reading preferences override file 03.02.2012 20:16:10 \| \| Preferences: 03.02.2012 20:16:10 \| \| max memory usage when active: 2047.56MB 03.02.2012 20:16:10 \| \| max memory usage when idle: 3685.61MB 03.02.2012 20:16:10 \| \| max disk usage: 10.00GB 03.02.2012 20:16:10 \| \| don't use GPU while active 03.02.2012 20:16:10 \| \| suspend work if non-BOINC CPU load exceeds 25 % 03.02.2012 20:16:10 \| \| (to change preferences, visit the web site of an attached project, or select Preferences in the Manager) 03.02.2012 20:25:31 \| \| Suspending computation - CPU is busy 03.02.2012 20:25:41 \| \| Resuming computation 03.02.2012 20:28:41 \| Docking \| Restarting task 1t7k1m0b_mod0014crossdockinghiv1_541890_198100_0 using charmm34 version 623 03.02.2012 20:28:41 \| Docking \| Restarting task 1t7k1m0b_mod0014crossdockinghiv1_541891_65051_0 using charmm34 version 623 03.02.2012 20:32:15 \| Docking \| Sending scheduler request: To fetch work. 
03.02.2012 20:32:15 \| Docking \| Requesting new tasks for NVIDIA GPU 03.02.2012 20:32:19 \| Docking \| Scheduler request completed: got 0 new tasks 03.02.2012 20:32:19 \| \| Couldn't parse preferences file - using BOINC defaults 03.02.2012 20:32:19 \| \| Reading preferences override file 03.02.2012 20:32:19 \| \| Preferences: 03.02.2012 20:32:19 \| \| max memory usage when active: 2047.56MB 03.02.2012 20:32:19 \| \| max memory usage when idle: 3685.61MB 03.02.2012 20:32:19 \| \| max disk usage: 10.00GB 03.02.2012 20:32:19 \| \| don't use GPU while active 03.02.2012 20:32:19 \| \| suspend work if non-BOINC CPU load exceeds 25 % 03.02.2012 20:32:19 \| \| (to change preferences, visit the web site of an attached project, or select Preferences in the Manager) 03.02.2012 20:35:37 \| \| Fetching configuration file from http://bam.boincstats.com/get_project_config.php 03.02.2012 20:35:41 \| \| Contacting account manager at http://bam.boincstats.com/ ... 03.02.2012 20:35:43 \| \| Account manager contact succeeded 03.02.2012 20:36:50 \| Docking \| task 1t7k1m0b_mod0014crossdockinghiv1_541890_198100_0 aborted by user 03.02.2012 20:36:58 \| Docking \| task 1t7k1m0b_mod0014crossdockinghiv1_541891_65051_0 aborted by user 03.02.2012 20:37:50 \| Docking \| Computation for task 1t7k1m0b_mod0014crossdockinghiv1_541890_198100_0 finished 03.02.2012 20:37:58 \| Docking \| Computation for task 1t7k1m0b_mod0014crossdockinghiv1_541891_65051_0 finished 03.02.2012 20:40:50 \| Docking \| Sending scheduler request: To report completed tasks. 03.02.2012 20:40:50 \| Docking \| Reporting 2 completed tasks, not requesting new tasks 03.02.2012 20:40:53 \| Docking \| Scheduler request completed
	ID: 6560 \| Rating: 0 \| rate: /

D337z Joined: Mar 8 12 Posts: 1 ID: 51519 Credit: 0 RAC: 0	Message 6587 - Posted 9 Mar 2012 19:09:33 UTC - in response to Message ID 6560 .
	The problem appears to be with the program's ability to output its results. It appears to be with CPU only, but the graphics version is not being used. I have an Intel CPU as well. Perhaps the output code is having difficulty working properly?
	ID: 6587 \| Rating: 0 \| rate: /

UBT - Rick Horn Joined: Jan 10 11 Posts: 4 ID: 36735 Credit: 230,465 RAC: 0	Message 6614 - Posted 23 Mar 2012 9:49:13 UTC
	This thread has been running since August 2009, and the problem has still not been solved, despite promises that the admins are working on it. I can only say that they are working very slowly. My Win7 64-bit quad is available for Docking, and would more than double my output if only it could be used. Come on guys, pull your fingers out! ____________
	ID: 6614 \| Rating: 0 \| rate: /

NATE1 Joined: May 17 11 Posts: 4 ID: 40573 Credit: 109,598 RAC: 0	Message 6620 - Posted 25 Mar 2012 16:58:43 UTC
	OK, I have a number of Intel computers. The ones that will run Docking on 64-bit Win7 have VT-x; the ones that will not run Docking do not have VT-x. Go figure.
	ID: 6620 \| Rating: 0 \| rate: /

Michael Tillman Joined: Jan 30 10 Posts: 3 ID: 25162 Credit: 6,818 RAC: 0	Message 6837 - Posted 24 Sep 2012 16:50:55 UTC
	same problem here. no changes using the newest version for gpu usage. docking only 0.00 had to abort all dockings.
	ID: 6837 \| Rating: 0 \| rate: /

spuddly buddly Joined: Aug 16 12 Posts: 24 ID: 66176 Credit: 14,124 RAC: 0	Message 6838 - Posted 25 Sep 2012 9:24:07 UTC
	This messageboard has died and is now a terraforming/action art project involving large amounts of organic waste (hence the smell), run by the Knights who say Ni! If you want help running docking@home, it's pretty much go figure it out for yourselves, as no one at the project bothers to answer e-mails or messages posted here. Sorry! Now enjoy the terraforming/action art ... :) ____________ The Knights who say Ni!
	ID: 6838 \| Rating: 0 \| rate: /

mfarley Joined: Sep 25 12 Posts: 2 ID: 67712 Credit: 0 RAC: 0	Message 6843 - Posted 25 Sep 2012 22:01:20 UTC - in response to Message ID 6838 .
	This messageboard has died as is now a terraforming/action art project involving large amounts of organic waste (therefore the smell)run by the Knights who say Ni! If you want help running docking@home, it's pretty much go figure it out for yourselves, as no one at the project bothers to answer e-mails or messages posted here. Sorry! Now enjoy the terraforming/action art ... :) free courses online
	ID: 6843 \| Rating: 0 \| rate: /

mfarley Joined: Sep 25 12 Posts: 2 ID: 67712 Credit: 0 RAC: 0	Message 6844 - Posted 25 Sep 2012 22:03:12 UTC - in response to Message ID 6837 .
	same problem here. no changes using the newest version for gpu usage. docking only 0.00 had to abort all dockings. myfreecoursesonline
	ID: 6844 \| Rating: 0 \| rate: /

spuddly buddly Joined: Aug 16 12 Posts: 24 ID: 66176 Credit: 14,124 RAC: 0	Message 6845 - Posted 26 Sep 2012 5:31:03 UTC
	Damn! Instead of terraforming we've created a spam magent! ____________ The Knights who say Ni!
	ID: 6845 \| Rating: 0 \| rate: /

spuddly buddly Joined: Aug 16 12 Posts: 24 ID: 66176 Credit: 14,124 RAC: 0	Message 6846 - Posted 26 Sep 2012 19:12:46 UTC - in response to Message ID 6845 .
	Damn! Instead of terraforming we've created a spam magent! That should be magnet of course ... (slaps forehead) ____________ The Knights who say Ni!
	ID: 6846 \| Rating: 0 \| rate: /

King Leo Joined: Apr 26 12 Posts: 3 ID: 54980 Credit: 2,337,218 RAC: 0	Message 6863 - Posted 7 Oct 2012 16:43:53 UTC
	After over an hour of crunching, Progress remains at 0.000%. Can anyone help or explain to me what is happening? Thanks. It is happening on one of three of my computers. First began getting computational errors and then it changes to zero progress. There must be a problem on the far end not the user side as my other 2 machines appear to be working okay for now.
	ID: 6863 \| Rating: 0 \| rate: /

robertmiles Joined: Apr 16 09 Posts: 96 ID: 9967 Credit: 1,290,747 RAC: 0	Message 6865 - Posted 8 Oct 2012 5:26:33 UTC - in response to Message ID 6863 .
	After over an hour of crunching, Progress remains at 0.000%. Can anyone help or explain to me what is happening? Thanks. It is happening on one of three of my computers. First began getting computational errors and then it changes to zero progress. There must be a problem on the far end not the user side as my other 2 machines appear to be working okay for now. My guess is that the progress is only updated when a checkpoint is made, and many of the current workunits have some problem that makes them run at least 3 times, and perhaps 10 times, the initially estimated time before writing any checkpoints at all (if they ever get around to doing anything useful). I've had to abort at least my last 5 workunits on this computer for this reason. I haven't seen any such workunits in the last few days on my other two computers, perhaps because all three computers participate in many BOINC projects and the others just haven't reached a good time for their next batch of Docking@Home workunits.
	ID: 6865 \| Rating: 0 \| rate: /

der_Day Joined: Jan 16 10 Posts: 10 ID: 24434 Credit: 1,922,000 RAC: 0	Message 6867 - Posted 8 Oct 2012 11:53:23 UTC
	same problem with this one in another thread
	ID: 6867 \| Rating: 0 \| rate: /

Andreas38871 Joined: Jan 8 09 Posts: 2 ID: 5693 Credit: 8,459 RAC: 0	Message 6868 - Posted 8 Oct 2012 14:10:28 UTC
	Same problem! Andreas
	ID: 6868 \| Rating: 0 \| rate: /

robertmiles Joined: Apr 16 09 Posts: 96 ID: 9967 Credit: 1,290,747 RAC: 0	Message 6869 - Posted 8 Oct 2012 16:16:56 UTC
	Those mentioning this problem might mention whether they see it only on computers running Windows 7, and whether that happens to be the 64-bit version of Windows 7. For me, only one of my computers shows the problem, and that one is running 64-bit Windows 7. The other two, running 64-bit Windows Vista, do not show the problem but have had only one Docking@Home workunit each lately. I now have my Windows 7 computer on No New Tasks for Docking@Home while I check if its last batch of Docking@Home workunits takes much more than the initial estimated time in addition to having no checkpoints and no visible progress.
	ID: 6869 \| Rating: 0 \| rate: /

Toppie* Joined: Mar 21 12 Posts: 1 ID: 52537 Credit: 187,392 RAC: 0	Message 6870 - Posted 8 Oct 2012 16:56:34 UTC - in response to Message ID 6869 .
	Those mentioning this problem might mention whether they see it only on computers running Windows 7, and whether that happens to be the 64-bit version of Windows 7. For me, only one of my computers shows the problem, and that one is running 64-bit Windows 7. The other two, running 64-bit Windows Vista, do not show the problem but have had only one Docking@Home workunit each lately. I now have my Windows 7 computer on No New Tasks for Docking@Home while I check if its last batch of Docking@Home workunits takes much more than the initial estimated time in addition to having no checkpoints and no visible progress. Win Vista 64 / Win 7 64. Been downloading files with zero content, spread over 4 machines. Been downloading files with incorrect CRC checksums on all 4 machines. The workunits that do start on my Vista machine run up to 100% complete and after 16 hours are still the same. Aborted. On the same machine, same batch, zero % after two hours. On the other 3 machines I cannot even start to crunch. I'll wait for better days. Toppie.
	ID: 6870 \| Rating: 0 \| rate: /

skgiven Joined: Oct 10 08 Posts: 10 ID: 2331 Credit: 3,721,673 RAC: 0	Message 6871 - Posted 8 Oct 2012 19:00:24 UTC - in response to Message ID 6870 .
	I had this issue on one 2008x64 server. 3 tasks running on a quad core opteron. No progress on any task after 18.5h, 14.4h and 14h.3h. CPU usage at 75% (the tasks), and memory being used as expected. I aborted the said tasks. The next tasks started running but didn't progress either so I restarted the system. After the reboot one task had reached 1% progress by the time I had logged on (running as a daemon). The time was about 3min. to reach this 1% and the checkpoint was at 23sec. About 8min into the run and the same task went to 3.475%. Neither of the other tasks had progressed (0%), so I suspended them. When I suspended the tasks, two new tasks immediately failed, but another 2 started, reached 1% and then 3.475%. A while later Boinc decided to run new docking tasks, these started but didn't progress after 10min, so I aborted them. On a W7x64 system (i7-2600K) the tasks are running normally so far. I prefer the tasks that fail immediately than the tasks that don't progress for hours on end. Anyway, try a restart and if tasks don't progress after say 10 or 15 min. just abort them - others should run, but babysitting seems to be the order of the day. Of note is that the tasks that don't progress don't checkpoint, so we might be able to abort them earlier? My uninformed guess is that these perpetual tasks were built incorrectly; from a dataset that contains a non-standard a-a or Charmm can't handle an atom type/range/angle... Perhaps their names would be useful in tracking the issue down? ____________
	ID: 6871 \| Rating: 0 \| rate: /
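
The rule of thumb in the post above (abort a task that has shown no progress and written no checkpoint after 10 or 15 minutes) can be sketched as a small Python check. This is only an illustration: the `TaskSnapshot` fields and the 15-minute threshold are assumptions made for the example, not values read from the BOINC client.

```python
from dataclasses import dataclass

# Hypothetical snapshot of one task; the field names are illustrative only.
@dataclass
class TaskSnapshot:
    name: str
    elapsed_seconds: float      # run time so far
    fraction_done: float        # 0.0 .. 1.0
    checkpoint_seconds: float   # run time at the last checkpoint, 0 if none yet

# Assumed cutoff, following the "10 or 15 min." advice in the post.
STALL_THRESHOLD_SECONDS = 15 * 60

def looks_stalled(task, threshold=STALL_THRESHOLD_SECONDS):
    """True if the task ran past the threshold with no progress and no checkpoint."""
    return (
        task.elapsed_seconds >= threshold
        and task.fraction_done == 0.0
        and task.checkpoint_seconds == 0.0
    )

def stalled_tasks(tasks, threshold=STALL_THRESHOLD_SECONDS):
    """Names of the tasks that match the stall heuristic, in input order."""
    return [t.name for t in tasks if looks_stalled(t, threshold)]
```

A task that reached 3.475% after 8 minutes with a checkpoint at 23 seconds would not be flagged; one sitting at 0% with no checkpoint after an hour would be.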

der_Day Joined: Jan 16 10 Posts: 10 ID: 24434 Credit: 1,922,000 RAC: 0	Message 6872 - Posted 8 Oct 2012 19:10:48 UTC - in response to Message ID 6871 . Last modified: 8 Oct 2012 19:11:52 UTC
	I also have a Win7 x64 machine. after the reboot one task had reached 1% progress by the time I had logged on (running as a daemon). The time was about 3min. to reach this 1% and the checkpoint was at 23sec. About 8min into the run and the same task went to 3.475%. Neither of the other tasks had progressed (0%), so I suspended them. When I suspended the tasks, two new tasks immediately failed, but another 2 started, reached 1% and then 3.475%. A while later Boinc decided to run new docking tasks, these started but didn't progress after 10min, so I aborted them. On a W7x64 system (i7-2600K) the tasks are running normally so far. I prefer the tasks that fail immediately than the tasks that don't progress for hours on end. Anyway, try a restart and if tasks don't progress after say 10 or 15 min. just abort them - others should run, but babysitting seems to be the order of the day. Of note is that the tasks that don't progress don't checkpoint, so we might be able to abort them earlier? I don't wait so long. As you said, the first progress is visible after almost 45 sec. I checked the slot folders (for example d:\Boinc\Project_Data\slots) of the broken WUs and saw that several files are missing.
	ID: 6872 \| Rating: 0 \| rate: /

robertmiles Joined: Apr 16 09 Posts: 96 ID: 9967 Credit: 1,290,747 RAC: 0	Message 6873 - Posted 8 Oct 2012 19:54:37 UTC
	A little more to report: One more workunit on my 64-bit Windows 7 computer failed the same way last night. One workunit finished on each of my 64-bit Windows Vista computers last night. One failed, but in a different way. The other was validated.
	ID: 6873 \| Rating: 0 \| rate: /

Boyu Zhang Forum moderator Project administrator Project developer Project tester Joined: May 5 10 Posts: 88 ID: 28821 Credit: 2,013,795 RAC: 0	Message 6876 - Posted 9 Oct 2012 2:24:32 UTC - in response to Message ID 6873 .
	During the past weekend, the space on the D@H server filled up and, as a result, the server sent out some incomplete workunits. Please abort workunits with name "1iiq1hih" or "1ohr1hih". Currently, the server is back to normal again. Thanks for letting us know and for bearing with us during the difficulty! Boyu
	ID: 6876 \| Rating: 0 \| rate: /
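
For anyone with many tasks queued, picking out the affected workunits by name can be sketched in a few lines of Python. This is a hedged illustration, not a project tool: it assumes the affected task names begin with the strings given in the post above, and the sample task names in the usage note are made up.

```python
# Prefixes named in the post above; add more if the project announces them.
BAD_PREFIXES = ("1iiq1hih", "1ohr1hih")

def tasks_to_abort(task_names, bad_prefixes=BAD_PREFIXES):
    """Return the task names that start with any of the known bad prefixes."""
    prefixes = tuple(bad_prefixes)  # str.startswith accepts a tuple of prefixes
    return [name for name in task_names if name.startswith(prefixes)]
```

Given a hypothetical list such as `["1iiq1hih_sample_0", "2abc1xyz_sample_1"]`, only the first name would be returned. The list of names itself would have to come from the BOINC Manager task view or a similar source.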

Andrea [E.R.] Joined: Jul 4 11 Posts: 1 ID: 41944 Credit: 148,083 RAC: 0	Message 6877 - Posted 9 Oct 2012 10:06:15 UTC - in response to Message ID 6876 .
	During the past weekend, the space on the D@H server filled up and, as a result, the server sent out some incomplete workunits. Please abort workunits with name "1iiq1hih" or "1ohr1hih". Currently, the server is back to normal again. Thanks for letting us know and for bearing with us during the difficulty! Boyu Thanks!!! :) I think that I have the same problem with a "1ohr1htf". Should I abort this one too?
	ID: 6877 \| Rating: 0 \| rate: /

robertmiles Joined: Apr 16 09 Posts: 96 ID: 9967 Credit: 1,290,747 RAC: 0	Message 6878 - Posted 9 Oct 2012 13:12:07 UTC - in response to Message ID 6876 .
	During the past weekend, the space on the D@H server filled up and, as a result, the server sent out some incomplete workunits. Please abort workunits with name "1iiq1hih" or "1ohr1hih". Currently, the server is back to normal again. Thanks for letting us know and for bearing with us during the difficulty! Boyu Looks like the workunits need some test at the beginning that will quickly shut down any incomplete workunits. My current group of troublesome workunits all have names beginning with 1m0b1htf; should I abort all of them too?
	ID: 6878 \| Rating: 0 \| rate: /
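
The kind of start-up test suggested above, combined with the earlier reports of zero-length downloads and of files missing from the slot folders, might look like the following Python sketch. It is an assumption-laden illustration only: the function name and the idea of a list of required file names are hypothetical, and nothing here reflects how the Docking@Home application actually starts.

```python
from pathlib import Path

def missing_or_empty(slot_dir, required_files):
    """Return the required file names that are absent or zero bytes long.

    slot_dir       -- directory the task runs in (hypothetical layout)
    required_files -- file names the task is assumed to need before starting
    """
    slot = Path(slot_dir)
    problems = []
    for name in required_files:
        path = slot / name
        # A missing file and a zero-length file are both treated as incomplete.
        if not path.is_file() or path.stat().st_size == 0:
            problems.append(name)
    return problems
```

A task wrapper could call this once before launching the computation and exit with an error right away if the returned list is not empty, so that an incomplete workunit fails in seconds instead of running for hours at 0%.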

Boyu Zhang Forum moderator Project administrator Project developer Project tester Joined: May 5 10 Posts: 88 ID: 28821 Credit: 2,013,795 RAC: 0	Message 6882 - Posted 9 Oct 2012 15:13:01 UTC - in response to Message ID 6878 .
	Yes, please abort them too. Thanks! During the past weekend, the space on the D@H server filled up and, as a result, the server sent out some incomplete workunits. Please abort workunits with name "1iiq1hih" or "1ohr1hih". Currently, the server is back to normal again. Thanks for letting us know and for bearing with us during the difficulty! Boyu Looks like the workunits need some test at the beginning that will quickly shut down any incomplete workunits. My current group of troublesome workunits all have names beginning with 1m0b1htf; should I abort all of them too?
	ID: 6882 \| Rating: 0 \| rate: /

Boyu Zhang Forum moderator Project administrator Project developer Project tester Joined: May 5 10 Posts: 88 ID: 28821 Credit: 2,013,795 RAC: 0	Message 6883 - Posted 9 Oct 2012 15:13:27 UTC - in response to Message ID 6877 .
	Yes, please abort them too, thanks! During the past weekend, the space on the D@H server filled up and, as a result, the server sent out some incomplete workunits. Please abort workunits with name "1iiq1hih" or "1ohr1hih". Currently, the server is back to normal again. Thanks for letting us know and for bearing with us during the difficulty! Boyu Thanks!!! :) I think that I have the same problem with a "1ohr1htf". Should I abort this one too?
	ID: 6883 \| Rating: 0 \| rate: /

lohphat Joined: Jan 1 10 Posts: 3 ID: 23732 Credit: 3,321,943 RAC: 0	Message 6885 - Posted 9 Oct 2012 18:19:09 UTC
	Why isn't this problem posted as a news item in the server status section yet?
	ID: 6885 \| Rating: 0 \| rate: /

googloo Joined: Nov 30 09 Posts: 6 ID: 22204 Credit: 1,182,026 RAC: 0	Message 6886 - Posted 9 Oct 2012 19:06:52 UTC
	I have set Docking@Home to no new tasks and have aborted all current tasks. I had two more tasks run for hours this morning with 0 progress. Please let us know when you have fixed this problem.
	ID: 6886 \| Rating: 0 \| rate: /

skgiven Joined: Oct 10 08 Posts: 10 ID: 2331 Credit: 3,721,673 RAC: 0	Message 6888 - Posted 9 Oct 2012 19:20:48 UTC - in response to Message ID 6886 .
	I have set Docking@Home to no new tasks and have aborted all current tasks. I had two more tasks run for hours this morning with 0 progress. Please let us know when you have fixed this problem. Yes, please let us know when the problem has been fixed. Presently, I think the server isn't sending new tasks, which is good seeing as they don't work. Can you send server aborts, to expedite the resolution? GL ____________
	ID: 6888 \| Rating: 0 \| rate: /

robertmiles Joined: Apr 16 09 Posts: 96 ID: 9967 Credit: 1,290,747 RAC: 0	Message 6890 - Posted 10 Oct 2012 0:40:13 UTC - in response to Message ID 6882 . Last modified: 10 Oct 2012 0:40:58 UTC
	Yes, please abort them too. Thanks! My current group of troublesome workunits all have names beginning with 1m0b1htf; should I abort all of them too? Aborted. Could you let us know when you have a new batch of workunits that have been adequately tested under 64-bit Windows 7, and the other versions of Windows mentioned recently in this thread?
	ID: 6890 \| Rating: 0 \| rate: /

Aaron Finney Volunteer tester Joined: Mar 23 07 Posts: 74 ID: 367 Credit: 2,409,831 RAC: 0	Message 6894 - Posted 10 Oct 2012 14:13:59 UTC - in response to Message ID 6890 .
	Yes, please abort them too. Thanks! My current group of troublesome workunits all have names beginning with 1m0b1htf; should I abort all of them too? Aborted. Could you let us know when you have a new batch of workunits that have been adequately tested under 64-bit Windows 7, and the other versions of Windows mentioned recently in this thread? I have new workunits today with 1hbv1hih string at the beginning. All 8 of them 2 hours in and 0% complete.
	ID: 6894 \| Rating: 0 \| rate: /

TheFiend Joined: Apr 7 09 Posts: 70 ID: 9482 Credit: 20,705,527 RAC: 0	Message 6895 - Posted 10 Oct 2012 14:37:21 UTC
	The current 0% problem is not just restricted to Win 7 x64, all my Docking is done on XP x86 crunchers.
	ID: 6895 \| Rating: 0 \| rate: /

robertmiles Joined: Apr 16 09 Posts: 96 ID: 9967 Credit: 1,290,747 RAC: 0	Message 6897 - Posted 10 Oct 2012 21:01:06 UTC - in response to Message ID 6895 .
	The current 0% problem is not just restricted to Win 7 x64, all my Docking is done on XP x86 crunchers. Not restricted for me either. One of my 64-bit Windows Vista computers has now had two such failures, and is now on No New Tasks for Docking@Home. All three of my computers run BOINC 7.0.28.
	ID: 6897 \| Rating: 0 \| rate: /

rixx Joined: Mar 29 10 Posts: 4 ID: 27550 Credit: 1,112,714 RAC: 0	Message 6898 - Posted 10 Oct 2012 21:07:01 UTC
	This problem is in Linux too (Arch Linux x86_64).
	ID: 6898 \| Rating: 0 \| rate: /

hugos Joined: Jul 23 12 Posts: 1 ID: 64191 Credit: 1,716,384 RAC: 0	Message 6899 - Posted 10 Oct 2012 23:25:27 UTC
	I'm in it for the science (and subsequent speedup of medical research, ie, my life expectancy) and will keep testing WUs with new tasks even if my RAC takes a dive. Had loads of 1m0b1htf ones that are now aborted.
	ID: 6899 \| Rating: 0 \| rate: /

UBT - Timbo Volunteer tester Joined: Sep 13 06 Posts: 9 ID: 46 Credit: 159,440 RAC: 0	Message 6904 - Posted 11 Oct 2012 11:37:09 UTC
	Hi all, I just posted in another thread on this forum (http://docking.cis.udel.edu/community/forum/thread.php?id=499) that I've got the same issue: Docking WU's are just spinning their wheels and Progress stays at 0.000%. I've aborted the WU's and hope that someone, somewhere on this project can fix this issue, as it seems to have been problematic for about 3 years now (earliest post was in 2009 !!). I can't see that it's a client issue, as there seems to be no "constant" throughout the reports made on here: Win and Linux are affected, various versions of BOINC Manager are noted, and different types of PC's, with different CPU's. It seems to me to be a WU related issue? regards Tim Founder, UK BOINC Team
	ID: 6904 \| Rating: 0 \| rate: /

Boyu Zhang Forum moderator Project administrator Project developer Project tester Joined: May 5 10 Posts: 88 ID: 28821 Credit: 2,013,795 RAC: 0	Message 6905 - Posted 11 Oct 2012 17:39:24 UTC
	Hi all, Please abort all the 0% progress workunits. I posted an entry regarding this on the News: http://docking.cis.udel.edu/ Sorry for the inconvenience and thanks for bearing with us!! Boyu
	ID: 6905 \| Rating: 0 \| rate: /

Cluster Physik Joined: Jul 2 09 Posts: 35 ID: 14795 Credit: 16,067,012 RAC: 0	Message 6907 - Posted 11 Oct 2012 18:30:26 UTC - in response to Message ID 6905 .
	Hi all, Please abort all the 0% progress workunits. I posted an entry regarding this on the News: http://docking.cis.udel.edu/ Sorry for the inconvenience and thanks for bearing with us!! Boyu Can't you abort them remotely from the project's side (other projects like RNA do this regularly)? Would be much more convenient for people who don't have the time to check all machines for WUs blocking the computation.
	ID: 6907 \| Rating: 0 \| rate: /

Boyu Zhang Forum moderator Project administrator Project developer Project tester Joined: May 5 10 Posts: 88 ID: 28821 Credit: 2,013,795 RAC: 0	Message 6908 - Posted 11 Oct 2012 19:53:23 UTC - in response to Message ID 6907 .
	I aborted the ones that are "unsent" from server side, but for the workunits that are already sent to the volunteers, we do not have control from the server side. Sorry for the inconvenience! Boyu Hi all, Please abort all the 0% progress workunits. I posted an entry regarding this on the News: http://docking.cis.udel.edu/ Sorry for the inconvenience and thanks for bearing with us!! Boyu Can't you abort them remotely from the project's side (other projects like RNA do this regularly)? Would be much more convenient for people who don't have the time to check all machines for WUs blocking the computation.
	ID: 6908 \| Rating: 0 \| rate: /

Mark Rush Joined: Feb 15 09 Posts: 4 ID: 7162 Credit: 5,779,850 RAC: 0	Message 6909 - Posted 12 Oct 2012 1:44:11 UTC - in response to Message ID 6908 .
	I aborted the ones that are "unsent" from server side, but for the workunits that are already sent to the volunteers, we do not have control from the server side. Sorry for the inconvenience! Boyu As I am sure you realize, this situation makes for a rather large pain in the tush. I will have to check several machines to make certain that the defective Docking WUs are not blocking Docking and the other projects I run as well. I pay attention to BOINC, so while it's an issue, for me, it's not insurmountable. I expect that many other crunchers do not pay attention and for them the defective Docking WUs might be a major slowdown, not only for Docking but for other projects. As it happens, other projects (Malariacontrol for instance) have the ability to delete WUs after they are downloaded. I urge in the strongest possible terms for Docking to spend some resources developing this capability. Mark
	ID: 6909 \| Rating: 0 \| rate: /

Cluster Physik Joined: Jul 2 09 Posts: 35 ID: 14795 Credit: 16,067,012 RAC: 0	Message 6912 - Posted 12 Oct 2012 18:15:53 UTC - in response to Message ID 6909 .
	I aborted the ones that are "unsent" from server side, but for the workunits that are already sent to the volunteers, we do not have control from the server side. Sorry for the inconvenience! Boyu [..] As it happens, other projects (Malariacontrol for instance) have the ability to delete WUs after they are downloaded. I urge in the strongest possible terms for Docking to spend some resources developing this capability. I second that. And I can only reiterate that other projects can do it. Mark mentioned MalariaControl and I mentioned RNA World before. Both projects have the ability to cancel tasks remotely (for instance when the results are not needed anymore). The BOINC platform offers this somewhere for sure.
	ID: 6912 \| Rating: 0 \| rate: /

robertmiles Joined: Apr 16 09 Posts: 96 ID: 9967 Credit: 1,290,747 RAC: 0	Message 6918 - Posted 15 Oct 2012 2:14:12 UTC Last modified: 15 Oct 2012 2:15:10 UTC
	SOMETHING has allowed my Windows 7 computer to resume Docking@Home workunits. I can't tell if it was an improvement in the workunits, or the fact that I drained that computer of Docking@Home workunits and then told BOINC Manager to reset that project.
	ID: 6918 \| Rating: 0 \| rate: /

dgnuff Joined: Jan 7 11 Posts: 2 ID: 36644 Credit: 8,253,291 RAC: 0	Message 6925 - Posted 17 Oct 2012 9:27:58 UTC - in response to Message ID 6918 . Last modified: 17 Oct 2012 9:45:07 UTC
	-- Deleted --
	ID: 6925 \| Rating: 0 \| rate: /

Fred Verster Joined: May 8 09 Posts: 26 ID: 11034 Credit: 2,647,353 RAC: 0	Message 6934 - Posted 24 Oct 2012 11:21:30 UTC
	Hi, this morning I noticed several tasks running High Priority, but they don't make any progress after 105 hours! Still at 0%. Seems useless to let them run, so deleting these is the only(?) option? At least for the 4 tasks that are running now, all with 0% progress after >50 hours. ____________ Knight who says N! Ni Ni
	ID: 6934 \| Rating: 0 \| rate: /

Boyu Zhang Forum moderator Project administrator Project developer Project tester Joined: May 5 10 Posts: 88 ID: 28821 Credit: 2,013,795 RAC: 0	Message 6936 - Posted 25 Oct 2012 13:46:40 UTC - in response to Message ID 6934 .
	Dear Fred, Please abort all the 0% progress workunits; they are part of the incomplete workunits from the previous batch. Sorry for the inconvenience! Thanks! Boyu Hi, this morning I noticed several tasks running High Priority, but they don't make any progress after 105 hours! Still at 0%. Seems useless to let them run, so deleting these is the only(?) option? At least for the 4 tasks that are running now, all with 0% progress after >50 hours.
	ID: 6936 \| Rating: 0 \| rate: /

Aaron Finney Volunteer tester Joined: Mar 23 07 Posts: 74 ID: 367 Credit: 2,409,831 RAC: 0	Message 6989 - Posted 16 Nov 2012 16:21:55 UTC - in response to Message ID 6936 . Last modified: 16 Nov 2012 16:28:57 UTC
	Still getting these workunits. Had 6 today with 42 hours elapsed time.. They shouldn't be sent out if they are going to do this. 1d4h1hih_ <--- Workunits start with this prefix.
	ID: 6989 \| Rating: 0 \| rate: /
