Bug reports for charmm 5.01
Author | Message | |
---|---|---|
When a WU is aborted, this message appears on my client:
2006-09-13 14:55:16 [Docking@Home] Unrecoverable error for result 1tng_mod0001_149_70389_0 (aborted by user)
After I started the client again, this message was shown:
2006-09-13 14:57:09 [Docking@Home] State file error: result 1tng_mod0001_149_70389_0 is in wrong state
____________
I'm a volunteer participant; my views are not necessarily those of Docking@Home or its participating institutions. |
||
ID: 12 | Rating: 0 | rate: / | ||
It looks like the same problem as SIMAP had with its hmmer app. Excessive writing to disk caused some problems with BOINC (boinc.exe using 50% of the CPU!) |
||
ID: 27 | Rating: 0 | rate: / | ||
My first result on my Intel P4 3.06GHz completed in 4:28 hours (running BOINC only while I slept). A little longer than the 1.5 hours (1:30) noted on the front page.
|
||
ID: 31 | Rating: 0 | rate: / | ||
13.09.2006 15:14:04|Docking@Home|Computation for task 1tng_mod0001_902_375820_2 finished
|
||
ID: 36 | Rating: 0 | rate: / | ||
09/13/06 05:33:36|Docking@Home|Computation for task 1tng_mod0001_357_447685_2 finished
|
||
ID: 40 | Rating: 0 | rate: / | ||
I've changed max size upload size manually and will report later if it helps... |
||
ID: 47 | Rating: 0 | rate: / | ||
I've changed max size upload size manually and will report later if it helps... If all WUs error out with -131 then the complete series is faulty. I have suspended all WUs for now and am waiting; 6h is too long for testing :/ |
||
ID: 48 | Rating: 0 | rate: / | ||
I have successfully finished my first 2 WUs on Ubuntu 5.10, using BOINC 5.2.13.
|
||
ID: 50 | Rating: 0 | rate: / | ||
I have successfully finished my 2 first WU on Ubuntu 5.10, using Boinc 5.2.13. Ubuntu 5.10 - is this the linux application? Is it not at 5.01 as well? |
||
ID: 51 | Rating: 0 | rate: / | ||
We'll change this asap. Seems there is still a lot of debug info written to the logs that generate upload errors when they finish.
My first result on my Intel P4 3.06GHz completed in 4:28 hours (running BOINC only while I slept). A little longer than the 1.5 hours (1:30) noted on the front page. |
||
ID: 52 | Rating: 0 | rate: / | ||
I have successfully finished my 2 first WU on Ubuntu 5.10, using Boinc 5.2.13. 5.10 is the version of my Ubuntu operating system ;-) |
||
ID: 54 | Rating: 0 | rate: / | ||
I have successfully finished my 2 first WU on Ubuntu 5.10, using Boinc 5.2.13. Never heard of it. I guess it's true; you do learn something new every day. :-) |
||
ID: 55 | Rating: 0 | rate: / | ||
I've changed max size upload size manually and will report later if it helps... Doesn't work for me. I have no limit for upload size in either project or general preferences, and the same -131 error appears. |
||
ID: 77 | Rating: 0 | rate: / | ||
It's our problem. We are writing too much debug info in the windoze app. We'll create a new app later today when the developer gets out of his class.
I've changed max size upload size manually and will report later if it helps... |
||
ID: 79 | Rating: 0 | rate: / | ||
Thanks for the fast response, Andre. I think I'll wait patiently for the new app.
|
||
ID: 81 | Rating: 0 | rate: / | ||
It's our problem. We are writing too much debug info in the windoze app. We'll create a new app later today when the developer gets out of his class. Did you really mean the upload size preferences? OK, I found it, but what can I edit in the client_state.xml file while the WU is running? It's only possible when BOINC isn't running. |
||
ID: 82 | Rating: 0 | rate: / | ||
Did you really mean the upload size preferences? OK, I found it, but what can I edit in the client_state.xml file while the WU is running? It's only possible when BOINC isn't running. What to edit: can you copy and paste the line? PS: has anyone got more than 5% on a Linux machine? ____________ Live your life the way you will wish you had lived it when you die |
||
ID: 87 | Rating: 0 | rate: / | ||
Is there any valid result for windows up to now?
|
||
ID: 92 | Rating: 0 | rate: / | ||
<max_nbytes>1000000.000000</max_nbytes>
|
||
ID: 94 | Rating: 0 | rate: / | ||
Works flawlessly on all my linux boxes. Takes about 4 hours per wu.
|
||
ID: 95 | Rating: 0 | rate: / | ||
Charmm needs approx 50 MB for a run. If you set it to 100 MB you should be okay for now.
<max_nbytes>1000000.000000</max_nbytes> |
||
ID: 97 | Rating: 0 | rate: / | ||
Charmm needs approx 50 MB for a run. If you set it to 100 MB you should be okay for now. Um, Angus had this error: 09/13/06 05:33:36|Docking@Home|File size: 1714333.000000 bytes. Limit: 1000000.000000 bytes So the limit is smaller than the file size! |
||
ID: 98 | Rating: 0 | rate: / | ||
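The log line quoted above carries both numbers needed to see the problem. As a minimal sketch (the helper name `parse_size_limit` is made up for illustration and is not part of BOINC), the two values can be pulled out of such a line and compared:

```python
import re

# Matches the size-limit message exactly as it is quoted in this thread.
SIZE_LINE_RE = re.compile(r"File size: ([0-9.]+) bytes\. Limit: ([0-9.]+) bytes")

def parse_size_limit(line):
    """Return (file_size, limit) in bytes, or None if the line has no such message."""
    match = SIZE_LINE_RE.search(line)
    if match is None:
        return None
    return float(match.group(1)), float(match.group(2))

line = ("09/13/06 05:33:36|Docking@Home|File size: 1714333.000000 bytes. "
        "Limit: 1000000.000000 bytes")
size, limit = parse_size_limit(line)
print(f"over the limit by {size - limit:.0f} bytes")  # over the limit by 714333 bytes
```

Running the same helper over a whole message log would show which output files overshoot and by how much.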
I've changed max size upload size manually and will report later if it helps... I'm getting errors on upload in the log, but the server says OK with this result. It took about 3 hours to complete.
2006-09-13 20:09:19 [Docking@Home] Computation for task 1tng_mod0001_1238_240225_2 finished
2006-09-13 20:09:19 [Einstein@Home] Resuming task h1_0877.0_S5R1__184_S5R1a_1 using einstein_S5R1 version 424
2006-09-13 20:09:21 [Docking@Home] Started upload of file 1tng_mod0001_1238_240225_2_0
2006-09-13 20:09:21 [Docking@Home] Started upload of file 1tng_mod0001_1238_240225_2_1
2006-09-13 20:09:25 [Docking@Home] Error on file upload: invalid signature
2006-09-13 20:09:25 [Docking@Home] Error on file upload: invalid signature
2006-09-13 20:09:25 [Docking@Home] Permanently failed upload of 1tng_mod0001_1238_240225_2_0
2006-09-13 20:09:25 [Docking@Home] Giving up on upload of 1tng_mod0001_1238_240225_2_0: server rejected file
2006-09-13 20:09:25 [Docking@Home] Permanently failed upload of 1tng_mod0001_1238_240225_2_1
2006-09-13 20:09:25 [Docking@Home] Giving up on upload of 1tng_mod0001_1238_240225_2_1: server rejected file
2006-09-13 20:09:25 [Docking@Home] Started upload of file 1tng_mod0001_1238_240225_2_2
2006-09-13 20:09:25 [Docking@Home] Started upload of file 1tng_mod0001_1238_240225_2_3
2006-09-13 20:09:29 [Docking@Home] Error on file upload: invalid signature
2006-09-13 20:09:29 [Docking@Home] Permanently failed upload of 1tng_mod0001_1238_240225_2_3
2006-09-13 20:09:29 [Docking@Home] Giving up on upload of 1tng_mod0001_1238_240225_2_3: server rejected file
2006-09-13 20:11:11 [Docking@Home] Error on file upload: invalid signature
2006-09-13 20:11:11 [Docking@Home] Permanently failed upload of 1tng_mod0001_1238_240225_2_2
2006-09-13 20:11:11 [Docking@Home] Giving up on upload of 1tng_mod0001_1238_240225_2_2: server rejected file
2006-09-13 20:11:16 [Docking@Home] Sending scheduler request to http://docking.utep.edu/docking_cgi/cgi
2006-09-13 20:11:16 [Docking@Home] Reason: To report completed tasks
2006-09-13 20:11:16 [Docking@Home] Reporting 1 tasks |
||
ID: 99 | Rating: 0 | rate: / | ||
Is there any valid result for windows up to now? I'm not sure if it is valid (slower computers haven't returned the other results in the WU yet) but yes: I have completed 2 results successfully out of 2. But I had to deal with the upload size manually... |
||
ID: 100 | Rating: 0 | rate: / | ||
On MacOS 10.4.7 the WU computed to 100 % and aborted :
|
||
ID: 111 | Rating: 0 | rate: / | ||
Honza, "Charmm exited with code 0" just means success; you can ignore it. But what value did you change max_nbytes to? |
||
ID: 112 | Rating: 0 | rate: / | ||
and do we have to change it for each WU??? (42 on my account!!!) cheers DonaldXP |
||
ID: 113 | Rating: 0 | rate: / | ||
It's our problem. We are writing too much debug info in the windoze app. We'll create a new app later today when the developer gets out of his class. Could you explain how to do that, please ? Which "prefs" ? |
||
ID: 114 | Rating: 0 | rate: / | ||
It's our problem. We are writing too much debug info in the windoze app. We'll create a new app later today when the developer gets out of his class. You must look for "<max_nbytes>1000000.000000</max_nbytes>" in the client_state.xml file. Before you change anything you must close BOINC! @Donald, yes, you must change this for each WU! |
||
ID: 115 | Rating: 0 | rate: / | ||
ok,thanx,i'll try that... in ca.3 hours i will see if it works for me. cheers DonaldXP |
||
ID: 116 | Rating: 0 | rate: / | ||
My 1st WU crashed:
13/09/2006 10:35:59 AM|Docking@Home|Computation for task 1tng_mod0001_903_168857_1 finished This WU crashed for me: http://docking.utep.edu/workunit.php?wuid=902 |
||
ID: 119 | Rating: 0 | rate: / | ||
It's our problem. We are writing too much debug info in the windoze app. We'll create a new app later today when the developer gets out of his class. Actually you can change this globally in your "general preferences"; settings/changes there affect all BOINC projects. My upload/download settings are: 9999999999999999999999999999 bytes. Setting this high will allow each project to use whatever your pipe supports (or what you pay your ISP to give you) :D |
||
ID: 120 | Rating: 0 | rate: / | ||
We removed this debug information from the Windows executable. You should update your project and that should fix the problem.
My first result on my Intel P4 3.06GHz completed in 4:28 hours (running BOINC only while I slept). A little longer than the 1.5 hours (1:30) noted on the front page. |
||
ID: 121 | Rating: 0 | rate: / | ||
Now I am a little bit confused. The only settings I see in general prefs are the max up/download speeds. Our problem here is the size of the uploaded results, isn't it!? cheers DonaldXP ____________ |
||
ID: 122 | Rating: 0 | rate: / | ||
Correction:
We removed this debug information from the Windows executable. You should update your project and that should fix the problem. |
||
ID: 123 | Rating: 0 | rate: / | ||
Another crash! Looks like it crashes as soon as it finishes or something.
2006-09-13 18:10:15 [Docking@Home] Computation for task 1tng_mod0001_709_386735_2 finished http://docking.utep.edu/result.php?resultid=2125 Before I crunch any more WUs here (if this is happening to just me, I wasted about 16-18hrs of time), I want to know if anyone else is having this problem. EDIT: Just noticed post above me. What about the current WUs? I'm assuming they are still affected? |
||
ID: 125 | Rating: 0 | rate: / | ||
EDIT: Just noticed post above me. What about the current WUs? I'm assuming they are still affected? If they had finished fixing it, they would notify us in the thread. Why don't you wait a few more hours/days? :) For those who want to know the BOINC error codes, this page of the BOINC wiki will help. ERR_FILE_TOO_BIG | -131 | file size too big | an output file was bigger than max_nbytes ____________ I'm a volunteer participant; my views are not necessarily those of Docking@Home or its participating institutions. |
||
ID: 126 | Rating: 0 | rate: / | ||
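The wiki entry quoted above boils down to a size comparison. Here is a minimal sketch of that check, assuming the client simply compares an output file's size on disk against `max_nbytes`. The functions `size_error` and `check_output_size` are hypothetical illustrations, not the BOINC client's actual code.

```python
import os

ERR_FILE_TOO_BIG = -131  # code and meaning as quoted from the BOINC wiki in this thread

def size_error(file_size, max_nbytes):
    """Return 0 if file_size fits within max_nbytes, otherwise ERR_FILE_TOO_BIG."""
    return ERR_FILE_TOO_BIG if file_size > max_nbytes else 0

def check_output_size(path, max_nbytes):
    """Apply the same comparison to a file on disk (hypothetical wrapper)."""
    return size_error(os.path.getsize(path), max_nbytes)

# Sizes reported in this thread: a 1631304-byte output against the 1000000-byte limit.
print(size_error(1631304, 1000000))  # -131
```

With the limit raised to 100000000, the same file would pass, which is why editing `max_nbytes` made the uploads go through for some people here.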
As Richard mentioned below we've found a bug in the input file for charmm: a random seed is not properly doing what it is supposed to be doing and writes to one of the output files over and over; this output file is normally about 2kb in size, but because of this problem grows to way over 1 MB which the boinc client doesn't like. We'll fix this as soon as possible and yes all current wu's are affected :-(
EDIT: Just noticed post above me. What about the current WUs? I'm assuming they are still affected? |
||
ID: 127 | Rating: 0 | rate: / | ||
Just finished first w/u after 6:11:40 hours of crunchin. Running Windows XP, Amd 64, Boinc 5.4.11 , And also ended with the following error :
|
||
ID: 128 | Rating: 0 | rate: / | ||
They are talking about editing the BOINC files. Open the BOINC folder on your hard drive and there will be an XML file called client_state; this is the file they are changing. To change it you have to exit BOINC first. You need to use a text editor (I use Notepad to do this); then save and restart BOINC. They need to be more specific about which section they are editing; there are several different parts to that file. Speaking of client_state, the DCF (duration correction factor) isn't adjusting itself: my manager still thinks the units are 6 mins instead of 6 hrs. Love this testing stuff. Big Whiskey [edit] these boards are too weird[/edit] |
||
ID: 129 | Rating: 0 | rate: / | ||
Getting errors here as well:
|
||
ID: 132 | Rating: 0 | rate: / | ||
3h crunching and error -131;
<core_client_version>5.4.9</core_client_version>
<stderr_txt>
Starting charmm run...
SUCCESS - Charmm exited with code 0.
Resolving file charmm.out...
Calling BOINC finish.
</stderr_txt>
<message>
<file_xfer_error>
<file_name>1tng_mod0001_856_46936_2_2</file_name>
<error_code>-131</error_code>
</file_xfer_error>
</message>
( http://docking.utep.edu/result.php?resultid=2566 )
And the messages:
Don 14 Sep 2006 04:11:53 CEST|Docking@Home|Resuming task 1tng_mod0001_856_46936_2 using charmm version 501
Don 14 Sep 2006 06:30:24 CEST|Docking@Home|Computation for task 1tng_mod0001_856_46936_2 finished
Don 14 Sep 2006 06:30:24 CEST|Docking@Home|Output file 1tng_mod0001_856_46936_2_2 for task 1tng_mod0001_856_46936_2 exceeds size limit.
Don 14 Sep 2006 06:30:24 CEST|Docking@Home|File size: 1631304.000000 bytes. Limit: 1000000.000000 bytes
Don 14 Sep 2006 06:30:24 CEST||Resuming round-robin CPU scheduling.
Don 14 Sep 2006 06:30:26 CEST|Docking@Home|Unrecoverable error for result 1tng_mod0001_856_46936_2 (<file_xfer_error> <file_name>1tng_mod0001_856_46936_2_2</file_name> <error_code>-131</error_code></file_xfer_error>)
Don 14 Sep 2006 06:30:26 CEST|Docking@Home|Deferring scheduler requests for 1 minutes and 0 seconds
Don 14 Sep 2006 06:30:27 CEST|Docking@Home|Started upload of file 1tng_mod0001_856_46936_2_0
Don 14 Sep 2006 06:30:27 CEST|Docking@Home|Started upload of file 1tng_mod0001_856_46936_2_1
Don 14 Sep 2006 06:30:29 CEST|Docking@Home|Finished upload of file 1tng_mod0001_856_46936_2_0
Don 14 Sep 2006 06:30:29 CEST|Docking@Home|Throughput 5142 bytes/sec
Don 14 Sep 2006 06:30:29 CEST|Docking@Home|Finished upload of file 1tng_mod0001_856_46936_2_1
Don 14 Sep 2006 06:30:29 CEST|Docking@Home|Throughput 5575 bytes/sec
Don 14 Sep 2006 06:30:29 CEST|Docking@Home|Started upload of file 1tng_mod0001_856_46936_2_3
Don 14 Sep 2006 06:30:32 CEST|Docking@Home|Finished upload of file 1tng_mod0001_856_46936_2_3
Don 14 Sep 2006 06:30:32 CEST|Docking@Home|Throughput 1197 bytes/sec
What's that limit about, and where can I change it??? I've got DSL, so there's no problem even with several MB, 1.7 is nothing I would complain about.
Edit: Bloody mess with this right aligned formatting.
I'd like to add: BOINC 5.4.9, Suse Linux 9.2, AthlonXP2200+, DSL-flatrate
____________
Gruesse vom Saenger
For questions about Boinc look in the BOINC-Wiki |
||
ID: 133 | Rating: 0 | rate: / | ||
Preference settings in your account are far from solving any trouble with upload file size; they only limit throughput so as not to flood your connection.
|
||
ID: 137 | Rating: 0 | rate: / | ||
I had the same error (exceeds size limit) but updated the project BEFORE changing the upload size. Now the WU is marked with "Client error" and "Compute error". Did you get the result of the result anyway? Or are the 5:20 hours wasted time? |
||
ID: 138 | Rating: 0 | rate: / | ||
I finally had time to try this on the one 5.01 wu I didn't abort, and I got the same results as Honza, except that it took close to 5 hours on my system. Got the invalid sig & giving up on upload errors for all the files, but the result returned ok and claimed credit. Just d/l a 5.02 work unit, my DCF is now 41.something (wow) and the time estimate is now about 4 1/2 hours. |
||
ID: 139 | Rating: 0 | rate: / | ||
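The DCF figure in the post above fits the other numbers in this thread if, as I understand BOINC's behavior, the displayed time estimate is the raw estimate multiplied by the DCF. A rough arithmetic sketch under that assumption (the 6-minute raw estimate comes from an earlier post by another user, 41 is the rounded-down "41.something", and the function name is made up):

```python
def corrected_estimate_hours(raw_estimate_minutes, dcf):
    """Scale a raw runtime estimate by the DCF and convert minutes to hours."""
    return raw_estimate_minutes * dcf / 60.0

# 6 minutes scaled by a DCF of 41 lands close to the reported ~4.5 hours.
print(round(corrected_estimate_hours(6, 41), 1))  # 4.1
```

The gap between 4.1 and "about 4 1/2 hours" is what you would expect from the rounding and from the raw estimate differing slightly between hosts.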
Forgive my ignorance, but do I need to abort the unstarted 5.01 wu's and get 5.02's instead?
|
||
ID: 141 | Rating: 0 | rate: / | ||
Forgive my ignorance, but do I need to abort the unstarted 5.01 wu's and get 5.02's instead? Don't abort! Before you do this, make a copy of your BOINC folder in another place!
1. Exit BOINC.
2. Download http://docking.utep.edu/download/charmm_5.2_windows_intelx86
3. Replace the old charm_5.1_windows_intelx86 with the newer one.
4. Start BOINC. That's all.
I have not tested this yet, but it should work. |
||
ID: 143 | Rating: 0 | rate: / | ||
Ok, thanks. I'll try this shortly and let you know if it works. |
||
ID: 145 | Rating: 0 | rate: / | ||
Thu Sep 14 04:44:06 2006||Starting BOINC client version 5.4.9 for i686-pc-linux-gnu
|
||
ID: 147 | Rating: 0 | rate: / | ||
Forgive my ignorance, but do I need to abort the unstarted 5.01 wu's and get 5.02's instead? Step 3. Do you mean re-name the downloaded file and replace? This seems to be the only way to get things to run but it does mean that it will report as having used the wrong version... |
||
ID: 148 | Rating: 0 | rate: / | ||
Getting different problem.
|
||
ID: 149 | Rating: 0 | rate: / | ||
Forgive my ignorance, but do I need to abort the unstarted 5.01 wu's and get 5.02's instead? If you try to download both files 5.01/5.02, they have the same name. I overwrote the file and ran it; the worst thing I have seen was it falling back from 29% to 5%. No error so far, but it's not a good solution. I will try to change the max_nbytes value. If the time also falls back then I will abort them all and wait for new ones. The problem with high boinc.exe load is still present :/ (v5.02) Edit: The same happened with v5.01 and some changes to max_nbytes. Posted here. Update: I have posted an update of the percent status at the link above ;) |
||
ID: 150 | Rating: 0 | rate: / | ||
I forgot to say: don't allow BOINC to report WUs! |
||
ID: 155 | Rating: 0 | rate: / | ||
1. You must exit boinc Nope, this is not a correct manual application upgrade (you need to edit client_state in more places, edit slot files, etc.). It is a bit tricky; still, it is much easier with a simple application like Docking, which has only one application file (vs. CPDN). I think I'll stop writing about tweaking BOINC and .XML files, as it causes more trouble here than I expected (:- It needs experienced BOINC users or testers with some knowledge of how BOINC works... |
||
ID: 157 | Rating: 0 | rate: / | ||
1. You must exit boinc I know that, but v5.02 was supposed to be a fix and it has had no success: all problems are still present! |
||
ID: 159 | Rating: 0 | rate: / | ||
Correction: Oh, that sounds good. I edited my file and am now having successful uploads. When these run out I'll begin the next version. I wanted to see how this was done instead of aborting some work. Based on what I deciphered from this post, this is what I did, for those who want to know:
1. Exit boinc_manager.
2. Make a copy of client_state.xml (if you are not comfortable editing this file you may want to copy/back up the entire boinc directory, just in case you goof).
3. Edit client_state.xml.
4. Find <master_url>http://docking.utep.edu/</master_url>.
5. Below this, find <max_nbytes>1000000.000000</max_nbytes>.
6. Make sure the line 2 above what you found says <name>1tng......
7. Change the line to <max_nbytes>100000000.000000</max_nbytes> (or just add 2 zeros after the 1).
8. Do this for each 1tng... entry; there are 4 for each workunit.
9. Save the file and exit Notepad.
10. Restart boinc_manager.
11. Wait for work to finish ...
On host 5 it had 3 errors before; after, it had 3 successful uploads showing no error. On host 6 it had 1 error before; after, it had 1 successful upload but still showed an error, then it had 2 successful uploads without error. I'll let the rest I have (about 8 on each host) run, then it should automatically download the new version 5.02.
If you do not want to edit, either abort all the workunits one at a time or reset the project. (Note that resetting the project aborts any unsent results and results waiting to be reported, and you will not get credit for them even if they are good; they are aborted.)
I reported the wrong time before:
My Intel P4 HT 3.06GHz is at avg 5:30 for each of 4 workunits.
My Intel P4 HT 3.80GHz is at avg 4:28 for each of 6 workunits.
Off to do some boring for-pay work now... I'd much rather be BOINCin... |
||
ID: 161 | Rating: 0 | rate: / | ||
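The manual edit described in the post above can be sketched as a text substitution. This is only an illustration under stated assumptions: that each output file entry in client_state.xml is wrapped in a `<file_info>` block containing its `<name>` and `<max_nbytes>` lines, and that the function `raise_max_nbytes` (a made-up name) is run on a backup copy while BOINC is not running. It works on plain text rather than an XML parser, so the rest of the file is left byte-for-byte untouched.

```python
import re

# Assumed layout: one <file_info>...</file_info> block per file entry.
FILE_INFO_RE = re.compile(r"<file_info>.*?</file_info>", re.DOTALL)
NAME_RE = re.compile(r"<name>([^<]*)</name>")
MAX_NBYTES_RE = re.compile(r"<max_nbytes>[^<]*</max_nbytes>")

def raise_max_nbytes(state_text, name_prefix="1tng", new_limit=100000000):
    """Raise max_nbytes only in file entries whose name starts with name_prefix."""
    replacement = f"<max_nbytes>{new_limit:.6f}</max_nbytes>"

    def patch_block(match):
        block = match.group(0)
        name = NAME_RE.search(block)
        if name is None or not name.group(1).startswith(name_prefix):
            return block  # entry of another project or file: leave untouched
        return MAX_NBYTES_RE.sub(replacement, block)

    return FILE_INFO_RE.sub(patch_block, state_text)

sample = """<file_info>
    <name>1tng_mod0001_149_70389_0_2</name>
    <max_nbytes>1000000.000000</max_nbytes>
</file_info>
<file_info>
    <name>other_project_file</name>
    <max_nbytes>1000000.000000</max_nbytes>
</file_info>"""

patched = raise_max_nbytes(sample)
print(patched)
```

Filtering on the `1tng` name prefix mirrors the "make sure the line above says 1tng" check in the post, so entries belonging to other projects keep their original limits.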
I detached and re-attached to the project, and since then all my downloads fail :(
|
||
ID: 162 | Rating: 0 | rate: / | ||
I detached and re-attached to the project, and since then all my downloads fail :( I don't know if it'll necessarily fix anything but you are using a very old version of boinc. Current is 5.4.11 with 5.6.3 (I think) in development. |
||
ID: 163 | Rating: 0 | rate: / | ||
I detached and re-attached to the project, and since then all my downloads fail :( I know ;-) But my first 6 work units all went fine. Then I detached and re-attached and the problem started. No idea if the detaching and re-attaching is related to the problem. |
||
ID: 167 | Rating: 0 | rate: / | ||
I upgraded to 5.4.9 (Linux), but the problem remains. Here is a complete message log from the start of a work request to the end.
|
||
ID: 169 | Rating: 0 | rate: / | ||
I detached and re-attached to the project, and since then all my downloads fail :( It's not a detach/reattach issue. I have the same problem after attaching Docking on a new machine. A Linux box too (old BOINC 5.2.13) |
||
ID: 191 | Rating: 0 | rate: / | ||
This should not happen anymore. Let us know if it does.
I detached and re-attached to the project, and since then all my downloads fail :( |
||
ID: 201 | Rating: 0 | rate: / | ||
I still have problems with the 5.01 app on the Linux box! See this post in the Q&A...
|
||
ID: 233 | Rating: 0 | rate: / | ||
Stefan,
I still have problems with the 5.01 app on the Linux box! See this post in the Q&A... |
||
ID: 234 | Rating: 0 | rate: / | ||
For sure, Andre! ;)
|
||
ID: 236 | Rating: 0 | rate: / | ||
Any news on that? I understand that the windoze users are more important, but it would be nice to get an answer... |
||
ID: 255 | Rating: 0 | rate: / | ||
Not a real app problem (at least I think so) but one of my results is invalid
|
||
ID: 262 | Rating: 0 | rate: / | ||
Hello,
|
||
ID: 287 | Rating: 0 | rate: / | ||
I would have had my first credits by now if the others had not errored out and the errors had not been decided to be the valid result:
CPU time 7359.167235
stderr out <core_client_version>5.4.9</core_client_version> <stderr_txt> Starting charmm run... SUCCESS - Charmm exited with code 0. Resolving file charmm.out... Calling BOINC finish. </stderr_txt> Validate state Invalid Claimed credit 10.403250187519 Granted credit 0 application version 5.01 Other:
CPU time 244.50328
stderr out <core_client_version>5.5.0</core_client_version> <stderr_txt> Calling BOINC init. Starting charmm run... ERROR - Charmm exited with code 1. Calling BOINC finish. </stderr_txt> Validate state Valid Claimed credit 1.91242036206108 Granted credit 1.14108966428172 application version 5.01 Third:
CPU time 225.641697
stderr out <core_client_version>5.2.14</core_client_version> <stderr_txt> Starting charmm run... ERROR - Charmm exited with code 1. Calling BOINC finish. </stderr_txt> Validate state Valid Claimed credit 1.14108966428172 Granted credit 1.14108966428172 application version 5.01 What's that supposed to mean? Errors are good and success is wrong??? What does this ERROR stand for anyway? ____________ Gruesse vom Saenger For questions about Boinc look in the BOINC-Wiki |
||
ID: 300 | Rating: 0 | rate: / | ||
This is really good feedback that points us to a big problem we didn't even realize was there. Will look into this immediately.
I would have had the first credits yet, if the others had not errored out and errors were decided to be the valid result: |
||
ID: 317 | Rating: 0 | rate: / | ||
Not sure if this is a bug, but I sure think it's weird. If you look at http://docking.utep.edu/workunit.php?wuid=3395, you notice that my machine worked 145 seconds on the WU, and someone else's did it in 5710 seconds. My machine is not that fast :P I have seen several work units with similar results.
|
||
ID: 330 | Rating: 0 | rate: / | ||
Not sure if this is a bug, but I sure think it's weird. If you look at http://docking.utep.edu/workunit.php?wuid=3395 , you notice that my machine worked 145 seconds on the WU, and someone else's did it in 5710 seconds. My machine is not that fast :P I have seen several work units with similar results. Tutta, I am getting the same WUs as you and am taking twice as long as you. I saw the other one you speak of and do not understand how a machine can be that fast. But yours is actually "Real Fast". Kudos, tutta. Sincerely, Doug Worrall, Teammate |
||
ID: 331 | Rating: 0 | rate: / | ||
Not sure if this is a bug, but I sure think it's weird. If you look at http://docking.utep.edu/workunit.php?wuid=3395 , you notice that my machine worked 145 seconds on the WU, and someone else's did it in 5710 seconds. My machine is not that fast :P I have seen several work units with similar results. If you look at your results, you will see "charm exited with error code 1". The other WU has error code 0. I think error code 0 means that this WU was crunched without an error. But I don't know what this error 1 means... I have crunched 125 WUs with my Linux box, and all show this error 1... ;( |
||
ID: 336 | Rating: 0 | rate: / | ||
Make the application output lots of debugging stuff. Bloat your code with printfs! It will help a lot tracking down errors. Here's a result from my project, see how much cr4p is printed, apart from the normal POV-Ray messages (which start at "Persistence of Vision(tm) Ray Tracer Version 3.6.0.mingw-3.10(gcc-3.4.5)").
|
||
ID: 337 | Rating: 0 | rate: / | ||
It happened a second time.
stderr out
___________________
<core_client_version>5.2.14</core_client_version> <stderr_txt> Starting charmm run... ERROR - Charmm exited with code 1. Calling BOINC finish. </stderr_txt>
stderr out
__________________
<core_client_version>5.5.0</core_client_version> <stderr_txt> Calling BOINC init. Starting charmm run... ERROR - Charmm exited with code 1. Calling BOINC finish. </stderr_txt> Mine:
stderr out
<core_client_version>5.4.9</core_client_version> <stderr_txt> Starting charmm run... SUCCESS - Charmm exited with code 0. Resolving file charmm.out... Calling BOINC finish. </stderr_txt> A bit more about my puter :
CPU type AuthenticAMD
AMD Athlon(tm) XP 2200+ Number of CPUs 1 Operating System Linux 2.6.8-24.20-default Memory 503.55 MB Cache 256 KB Swap space 1004.05 MB Total disk space 100 GB Free Disk Space 94.72 GB Measured floating point speed 914.22 million ops/sec Measured integer speed 1583.97 million ops/sec Average upload rate 4.09 KB/sec Average download rate 97.07 KB/sec Average turnaround time 5 days Maximum daily WU quota per CPU 17/day Results 43 Number of times client has contacted server 11 Last time contacted server 18 Sep 2006 15:44:03 UTC % of time BOINC client is running 84.0286 % While BOINC running, % of time work is allowed 99.979 % Average CPU efficiency 0.935248 Result duration correction factor 1.345338 Linux flavour is Suse 9.2 Boinc 5.4.9 Docking 5.01 And the BOINC messages:
Son 17 Sep 2006 22:33:56 CEST|Docking@Home|Resuming task 1tng_mod0001_349_110746_0 using charmm version 501
Son 17 Sep 2006 22:42:40 CEST||Rescheduling CPU: application exited Son 17 Sep 2006 22:42:40 CEST|Docking@Home|Computation for task 1tng_mod0001_349_110746_0 finished Son 17 Sep 2006 22:42:43 CEST|Docking@Home|Started upload of file 1tng_mod0001_349_110746_0_0 Son 17 Sep 2006 22:42:43 CEST|Docking@Home|Started upload of file 1tng_mod0001_349_110746_0_1 Son 17 Sep 2006 22:42:47 CEST|Docking@Home|Finished upload of file 1tng_mod0001_349_110746_0_0 Son 17 Sep 2006 22:42:47 CEST|Docking@Home|Throughput 5674 bytes/sec Son 17 Sep 2006 22:42:47 CEST|Docking@Home|Finished upload of file 1tng_mod0001_349_110746_0_1 Son 17 Sep 2006 22:42:47 CEST|Docking@Home|Throughput 6221 bytes/sec Son 17 Sep 2006 22:42:47 CEST|Docking@Home|Started upload of file 1tng_mod0001_349_110746_0_2 Son 17 Sep 2006 22:42:47 CEST|Docking@Home|Started upload of file 1tng_mod0001_349_110746_0_3 Son 17 Sep 2006 22:42:49 CEST|Docking@Home|Finished upload of file 1tng_mod0001_349_110746_0_3 Son 17 Sep 2006 22:42:49 CEST|Docking@Home|Throughput 1347 bytes/sec Son 17 Sep 2006 22:42:54 CEST|Docking@Home|Finished upload of file 1tng_mod0001_349_110746_0_2 Son 17 Sep 2006 22:42:54 CEST|Docking@Home|Throughput 23274 bytes/sec |
||
ID: 355 | Rating: 0 | rate: / | ||
I still don't know what "ERROR - Charmm exited with code 1." and "SUCCESS - Charmm exited with code 0." stand for. What else is in the result? Is any trace left on my puter? I wasn't there both times it sent the result up, so I don't know. |
||
ID: 356 | Rating: 0 | rate: / | ||
And a third time:(
|
||
ID: 368 | Rating: 0 | rate: / | ||
Hello,
|
||
ID: 369 | Rating: 0 | rate: / | ||
So far, all 4 linux boxes have reported very quickly with the following error message. I see that I am not the only one experiencing this problem. Should I continue crunching, or wait for a fix? Afaik there is already a fix. Look here, and as I see it, you should perhaps abort the remaining WUs and start a bunch of new ones, so that you get the new app. As I have miraculously had success 'til now, I'll stick with my remaining WU. |
||
ID: 370 | Rating: 0 | rate: / | ||
So far, all 4 linux boxes have reported very quickly with the following error message. I see that I am not the only one experiencing this problem. Should I continue crunching, or wait for a fix? I *just now* downloaded the WUs, and they were 5.01. In any case, I aborted the remaining job on my PII (it has a queue of only 1 job), and the replacement that was downloaded was also 5.01. |
||
ID: 371 | Rating: 0 | rate: / | ||
I *just now* downloaded the WUs, and they were 5.01. So my questions over there weren't so dumb after all ;) And Nicolas was probably wrong. Let's ask again: How do we get the new application on our puters? |
||
ID: 374 | Rating: 0 | rate: / | ||
I *just now* downloaded the WUs, and they were 5.01. Try resetting the project, if just waiting for new WUs didn't work. |
||
ID: 377 | Rating: 0 | rate: / | ||
I just joined this project today. When I attached I got the 5.01 linux apps, not the 5.02
|
||
ID: 378 | Rating: 0 | rate: / | ||
Try resetting the project, if just waiting for new WUs didn't work. As you can see in the other posts here, it doesn't work. |
||
ID: 380 | Rating: 0 | rate: / | ||
Try resetting the project, if just waiting for new WUs didn't work. I gave it a shot, and nope, still downloads 5.01. Are we sure 5.02 really exists? And is it 5.02? Or 5.2? Over in the applications forum, it is listed as "charmm_5.2_i686-pc-linux-gnu". |
||
ID: 381 | Rating: 0 | rate: / | ||
It seems to exist. But I just remembered something: Message for admins: update_versions (which you use when you add a new app version) is supposed to touch a trigger file to make the feeder re-read the database, but for me it was creating it in ~/projects/reread_db instead of ~/projects/project_name/reread_db. If you see the file there, move it, or try restarting the project (./bin/stop;./bin/start). |
||
ID: 382 | Rating: 0 | rate: / | ||
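For admins who want to script the check from the post above, here is a small sketch. It only mirrors the two paths mentioned there; `project_name` is the placeholder from the post, and the helper `move_trigger` is a made-up name, not part of the BOINC server tools. The demo runs in a temporary directory instead of a real `~/projects` tree.

```python
import shutil
import tempfile
from pathlib import Path

def move_trigger(projects_dir, project_name, trigger="reread_db"):
    """Move a misplaced trigger file into the project directory; return True if moved."""
    projects = Path(projects_dir).expanduser()
    misplaced = projects / trigger
    project_dir = projects / project_name
    if not misplaced.is_file():
        return False  # nothing was created in the wrong place
    if not project_dir.is_dir():
        raise FileNotFoundError(f"no project directory at {project_dir}")
    shutil.move(str(misplaced), str(project_dir / trigger))
    return True

# Demo in a throwaway directory that imitates the layout described in the post.
demo_root = Path(tempfile.mkdtemp())
(demo_root / "project_name").mkdir()
(demo_root / "reread_db").touch()
moved = move_trigger(demo_root, "project_name")
print(moved)  # True
```

On a real server the call would be something like `move_trigger("~/projects", "project_name")`, with the actual project directory name filled in.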
(FWIW, the mac seems to be crunching successfully and should return results shortly.) Yep, they completed just fine, if anyone is interested. Both WUs took about 4800 seconds on my 2.17ghz macbook pro. |
||
ID: 384 | Rating: 0 | rate: / | ||
But I just remembered something: I tried to give this a shot, but there is no file anywhere on any of my linux boxes named reread_db, or that have "reread" anywhere in the name. ____________ Dublin, CA Team SETI.USA |
||
ID: 387 | Rating: 0 | rate: / | ||
But I just remembered something: I was talking about server :) When was last time you used update_versions? Edited original post. |
||
ID: 388 | Rating: 0 | rate: / | ||
I was talking about server :) When was last time you used update_versions? Okay, now you've lost me. You want me to change something on the Docking@Home server? FWIW, I am using boincmgr, not CLI. I have tried
o purging the queue to force new downloads
o resetting the project
o detaching/reattaching
All attempts result in 5.01, not 5.02 or 5.2. ____________ Dublin, CA Team SETI.USA |
||
ID: 390 | Rating: 0 | rate: / | ||
Okay, now you've lost me. You want me to change something on the Docking@Home server? I never addressed the message to you :) Admins read the forums as well. |
||
ID: 391 | Rating: 0 | rate: / | ||
Okay, now you've lost me. You want me to change something on the Docking@Home server? Ah. Got it. Back in message 382, you included a quote from a post of mine, so I thought you were addressing it to me. ____________ Dublin, CA Team SETI.USA |
||
ID: 393 | Rating: 0 | rate: / | ||
Please wait for a fix. We're working on it, but it might take a while since it is a pretty weird problem: not all linux boxes experience it, only some.
Hello, |
||
ID: 395 | Rating: 0 | rate: / | ||
I fixed this problem today. Linux 5.2 and windows 5.3 will not cause this weird behavior anymore. The same fix for mac will soon be available too, but I have problems with the compiler right now.
I would already have had my first credits if the others had not errored out and the errors had not been judged the valid result: |
||
ID: 396 | Rating: 0 | rate: / | ||
I've just checked the D@H cgi logfile and see linux 502 apps being sent out... so we're good from our side, I think. Try resetting or detaching/attaching the project as a last resort.
So far, all 4 linux boxes have reported very quickly with the following error message. I see that I am not the only one experiencing this problem. Should I continue crunching, or wait for a fix? |
||
ID: 397 | Rating: 0 | rate: / | ||
Saenger,
So far, all 4 linux boxes have reported very quickly with the following error message. I see that I am not the only one experiencing this problem. Should I continue crunching, or wait for a fix? |
||
ID: 398 | Rating: 0 | rate: / | ||
It's actually 5.2 (or 5.x for that matter), but the BOINC client shows it as 502 (or 50x).
Try resetting the project, if just waiting for new WUs didn't work. |
||
ID: 399 | Rating: 0 | rate: / | ||
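The 5.2 versus 502 naming in message 399 can be illustrated with a small sketch. It assumes the client stores an app version as a single integer computed as major * 100 + minor, which matches the numbers quoted in this thread; the function names are hypothetical and are not BOINC APIs.

```python
def version_num(major: int, minor: int) -> int:
    """Combine major and minor into one integer (assumed: major * 100 + minor)."""
    return major * 100 + minor


def parse_version(text: str) -> int:
    """Turn a dotted version such as '5.2' or '5.02' into its integer form."""
    major, minor = text.split(".")
    return version_num(int(major), int(minor))


def display(num: int) -> str:
    """Format the integer form back into the 'major.minor' style used in the forum."""
    major, minor = divmod(num, 100)
    return f"{major}.{minor:02d}"


print(parse_version("5.2"))   # 502
print(parse_version("5.02"))  # 502
print(display(502))           # 5.02
```

Under that assumption, "5.2", "5.02" and "502" all name the same application version, which is why the file listed as charmm_5.2_i686-pc-linux-gnu shows up in the client as 5.02.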
I've just checked the D@H cgi logfile and see linux 502 apps being sent out... so we're good from our side, I think. Try resetting or detaching/attaching the project as a last resort. Did so and succeeded. charmm 5.02 is now waiting for crunching on 2 WUs. Yes, I only run SuSE 9.2. I already have 10.1 on my desk as a DVD, but I have to wait for a free weekend to install it, as I don't want to lose any of my data. |
||
ID: 400 | Rating: 0 | rate: / | ||
I've just checked the D@H cgi logfile and see linux 502 apps being sent out... so we're good from our side, I think. Try resetting or detaching/attaching the project as a last resort. Interesting. At 20:41 UTC, detaching/attaching resulted in 5.01. At 21:05, detaching/attaching resulted in 5.02. Whatever the case, I have 5.02 running now. Thanks! ____________ Dublin, CA Team SETI.USA |
||
ID: 401 | Rating: 0 | rate: / | ||
Just a reminder: this is NOT a fix for the 'charmm exit 1' problem you see in your stderr.txt. This only fixed the problem where error results were validated successfully.
I've just checked the D@H cgi logfile and see linux 502 apps being sent out... so we're good from our side, I think. Try resetting or detaching/attaching the project as a last resort. |
||
ID: 402 | Rating: 0 | rate: / | ||
Now we're maybe getting somewhere... People who have 'charmm exit 1' in their stderr.txt files, please let me know what linux distro you are running; SuSE seems to be fine.
I've just checked the D@H cgi logfile and see linux 502 apps being sent out... so we're good from our side, I think. Try resetting or detaching/attaching the project as a last resort. |
||
ID: 403 | Rating: 0 | rate: / | ||
Just a reminder: this is NOT a fix for the 'charmm exit 1' problem you see in your stderr.txt. This only fixed the problem where error results were validated successfully. Ubuntu 6.06 (current version) with all updates installed. |
||
ID: 407 | Rating: 0 | rate: / | ||
Please also let us know if all WUs are erroring, or if you have some good ones on that box.
Just a reminder: this is NOT a fix for the 'charmm exit 1' problem you see in your stderr.txt. This only fixed the problem where error results were validated successfully. |
||
ID: 408 | Rating: 0 | rate: / | ||
I suggest moving this distro reporting to a separate thread; this one is long enough :) |
||
ID: 409 | Rating: 0 | rate: / | ||
Now we're maybe getting somewhere... People who have 'charmm exit 1' in their stderr.txt files, please let me know what linux distro you are running; Ubuntu 5.10 here. All giving the exit code 1. |
||
ID: 410 | Rating: 0 | rate: / | ||
Please also let us know if all WUs are erroring, or if you have some good ones on that box. So far, all WUs are erroring, both 5.01 and 5.02. Should I suspend all linux work until this is resolved? Or am I helping anything by returning more error results? Thanks. |
||
ID: 411 | Rating: 0 | rate: / | ||
Good suggestion. Please report your distro here: http://docking.utep.edu/forum_thread.php?id=44
I suggest moving this distro reporting to a separate thread; this one is long enough :) ____________ D@H the greatest project in the world... a while from now! |
||
ID: 413 | Rating: 0 | rate: / | ||
Good suggestion. Please report your distro here: http://docking.utep.edu/forum_thread.php?id=44 For the lazy guys'n'gals: http://docking.utep.edu/forum_thread.php?id=44 |
||
ID: 415 | Rating: 0 | rate: / | ||
You can see I'm not a very experienced BBCode chatter (yet) :-)
Good suggestion. Please report your distro here: http://docking.utep.edu/forum_thread.php?id=44 ____________ D@H the greatest project in the world... a while from now! |
||
ID: 419 | Rating: 0 | rate: / | ||