Yes, we know. Richard posted yesterday that 5.02 will only fix the excessive debugging info in stderr.txt. The fix for the -131 error will be deployed today and is only a change in the input file of the app. All existing WUs will have to be aborted for this, though. Keep an eye on the news over the next few hours.
Andre
No luck with 5.02 and the size of the upload (:-
Result is here
14/09/2006 14:18:44|Docking@Home|Successfully attached to Docking@Home
14/09/2006 14:18:46|Docking@Home|Started download of file charmm_5.2_windows_intelx86
14/09/2006 14:18:46|Docking@Home|Started download of file 1tng_mod0001_1576_93911.inp
14/09/2006 14:18:57||Rescheduling CPU: result suspended, resumed or aborted by user
14/09/2006 14:18:58|Docking@Home|Finished download of file 1tng_mod0001_1576_93911.inp
14/09/2006 14:18:58|Docking@Home|Throughput 106435 bytes/sec
14/09/2006 14:18:58|Docking@Home|Started download of file grid_probes.rtf
14/09/2006 14:18:59|Docking@Home|Incomplete read of less than 5KB for grid_probes.rtf - truncating
14/09/2006 14:18:59|Docking@Home|Temporarily failed download of grid_probes.rtf: HTTP file not found
14/09/2006 14:18:59|Docking@Home|Giving up on download of grid_probes.rtf: file was not found on server
14/09/2006 14:18:59|Docking@Home|Started download of file lpdb_amino.rtf
14/09/2006 14:18:59|Docking@Home|Checksum or signature error for grid_probes.rtf
14/09/2006 14:19:00|Docking@Home|Unrecoverable error for result 1tng_mod0001_1576_93911_3 (WU download error: couldn't get input files:<file_xfer_error> <file_name>grid_probes.rtf</file_name> <error_code>-163</error_code> <error_message>file was not found on server</error_message></file_xfer_error>)
14/09/2006 14:19:00|Docking@Home|Deferring scheduler requests for 1 minutes and 0 seconds
14/09/2006 14:19:00|Docking@Home|Unrecoverable error for result 1tng_mod0001_435_384075_4 (WU download error: couldn't get input files:<file_xfer_error> <file_name>grid_probes.rtf</file_name> <error_code>-163</error_code> <error_message>file was not found on server</error_message></file_xfer_error>)
14/09/2006 14:19:00|Docking@Home|Unrecoverable error for result 1tng_mod0001_436_462952_4 (WU download error: couldn't get input files:<file_xfer_error> <file_name>grid_probes.rtf</file_name> <error_code>-163</error_code> <error_message>file was not found on server</error_message></file_xfer_error>)
14/09/2006 14:19:00|Docking@Home|Unrecoverable error for result 1tng_mod0001_372_420443_4 (WU download error: couldn't get input files:<file_xfer_error> <file_name>grid_probes.rtf</file_name> <error_code>-163</error_code> <error_message>file was not found on server</error_message></file_xfer_error>)
14/09/2006 14:19:00|Docking@Home|Unrecoverable error for result 1tng_mod0001_373_78184_4 (WU download error: couldn't get input files:<file_xfer_error> <file_name>grid_probes.rtf</file_name> <error_code>-163</error_code> <error_message>file was not found on server</error_message></file_xfer_error>)
14/09/2006 14:19:00|Docking@Home|Deferring scheduler requests for 2 minutes and 7 seconds
14/09/2006 14:19:00|Docking@Home|Unrecoverable error for result 1tng_mod0001_374_272917_4 (WU download error: couldn't get input files:<file_xfer_error> <file_name>grid_probes.rtf</file_name> <error_code>-163</error_code> <error_message>file was not found on server</error_message></file_xfer_error>)
14/09/2006 14:19:00|Docking@Home|Deferring scheduler requests for 3 minutes and 33 seconds
14/09/2006 14:19:00|Docking@Home|Unrecoverable error for result 1tng_mod0001_2138_231828_2 (WU download error: couldn't get input files:<file_xfer_error> <file_name>grid_probes.rtf</file_name> <error_code>-163</error_code> <error_message>file was not found on server</error_message></file_xfer_error>)
14/09/2006 14:19:00|Docking@Home|Unrecoverable error for result 1tng_mod0001_1596_29389_2 (WU download error: couldn't get input files:<file_xfer_error> <file_name>grid_probes.rtf</file_name> <error_code>-163</error_code> <error_message>file was not found on server</error_message></file_xfer_error>)
14/09/2006 14:19:00|Docking@Home|Deferring scheduler requests for 12 minutes and 26 seconds
14/09/2006 14:19:00|Docking@Home|Unrecoverable error for result 1tng_mod0001_1597_308895_2 (WU download error: couldn't get input files:<file_xfer_error> <file_name>grid_probes.rtf</file_name> <error_code>-163</error_code> <error_message>file was not found on server</error_message></file_xfer_error>)
14/09/2006 14:19:00|Docking@Home|Deferring scheduler requests for 20 minutes and 21 seconds
14/09/2006 14:19:00|Docking@Home|Unrecoverable error for result 1tng_mod0001_1598_472366_2 (WU download error: couldn't get input files:<file_xfer_error> <file_name>grid_probes.rtf</file_name> <error_code>-163</error_code> <error_message>file was not found on server</error_message></file_xfer_error>)
14/09/2006 14:19:00|Docking@Home|Deferring scheduler requests for 41 minutes and 31 seconds
14/09/2006 14:19:00|Docking@Home|Unrecoverable error for result 1tng_mod0001_2139_344323_2 (WU download error: couldn't get input files:<file_xfer_error> <file_name>grid_probes.rtf</file_name> <error_code>-163</error_code> <error_message>file was not found on server</error_message></file_xfer_error>)
14/09/2006 14:19:00|Docking@Home|Deferring scheduler requests for 1 minutes and 0 seconds
14/09/2006 14:19:00|Docking@Home|Incomplete read of less than 5KB for lpdb_amino.rtf - truncating
14/09/2006 14:19:00|Docking@Home|Temporarily failed download of lpdb_amino.rtf: HTTP file not found
14/09/2006 14:19:00|Docking@Home|Giving up on download of lpdb_amino.rtf: file was not found on server
14/09/2006 14:19:00|Docking@Home|Started download of file lpdb.prm
14/09/2006 14:19:00|Docking@Home|Checksum or signature error for lpdb_amino.rtf
14/09/2006 14:19:01|Docking@Home|Incomplete read of less than 5KB for lpdb.prm - truncating
14/09/2006 14:19:01|Docking@Home|Temporarily failed download of lpdb.prm: HTTP file not found
14/09/2006 14:19:01|Docking@Home|Giving up on download of lpdb.prm: file was not found on server
14/09/2006 14:19:01|Docking@Home|Started download of file lpdb_probes.prm
14/09/2006 14:19:01|Docking@Home|Checksum or signature error for lpdb.prm
14/09/2006 14:19:02|Docking@Home|Incomplete read of less than 5KB for lpdb_probes.prm - truncating
14/09/2006 14:19:02|Docking@Home|Temporarily failed download of lpdb_probes.prm: HTTP file not found
14/09/2006 14:19:02|Docking@Home|Giving up on download of lpdb_probes.prm: file was not found on server
14/09/2006 14:19:02|Docking@Home|Started download of file 1tng_mod0001_435_384075.inp
14/09/2006 14:19:02|Docking@Home|Checksum or signature error for lpdb_probes.prm
14/09/2006 14:19:14|Docking@Home|Finished download of file 1tng_mod0001_435_384075.inp
Don't know where that comes from yet. See it on our test system as well. I'm looking into it. Thanks.
Andre
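For anyone who wants to summarise a log like the one above before posting it, here is a minimal Python sketch. It assumes the message log is available as plain text lines in the `date time|project|message` layout shown above; the function name `missing_files` and the sample list are made up for illustration and are not part of BOINC.

```python
import re
from collections import Counter

# Matches the client's "Giving up on download of <file>: <reason>" messages.
GIVE_UP = re.compile(r"Giving up on download of (?P<name>\S+): (?P<reason>.+)$")

def missing_files(log_lines):
    """Count, per file name, how often the client gave up on a download."""
    counts = Counter()
    for line in log_lines:
        # Message-log lines look like: date time|project|message
        parts = line.rstrip("\n").split("|", 2)
        if len(parts) != 3:
            continue  # not a message-log line
        match = GIVE_UP.search(parts[2])
        if match:
            counts[match.group("name")] += 1
    return counts

# A few lines copied from the log above.
sample = [
    "14/09/2006 14:18:59|Docking@Home|Giving up on download of grid_probes.rtf: file was not found on server",
    "14/09/2006 14:19:00|Docking@Home|Giving up on download of lpdb_amino.rtf: file was not found on server",
    "14/09/2006 14:19:01|Docking@Home|Giving up on download of lpdb.prm: file was not found on server",
    "14/09/2006 14:19:14|Docking@Home|Finished download of file 1tng_mod0001_435_384075.inp",
]

for name, count in sorted(missing_files(sample).items()):
    print(name, count)
```

On the full log this would list the same four shared input files (grid_probes.rtf, lpdb_amino.rtf, lpdb.prm, lpdb_probes.prm), which is quicker to read than the repeated per-result errors.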
Tried downloading WUs but got the same download errors as in the log above.
All existing workunits have been cancelled and about 500 new workunits have been created. These take about 1.5 hours on a P4 3.2 GHz and 2.5 hours on a Celeron 2 GHz. Please reset your project or detach and re-attach to start crunching the new wu's. Thanks for all the help!
Have tried downloading some of the new wu's and got the following again:
15/09/2006 05:17:51|Docking@Home|Started download of file grid_probes.rtf
15/09/2006 05:17:52|Docking@Home|Incomplete read of less than 5KB for grid_probes.rtf - truncating
15/09/2006 05:17:52|Docking@Home|Temporarily failed download of grid_probes.rtf: HTTP file not found
15/09/2006 05:17:52|Docking@Home|Giving up on download of grid_probes.rtf: file was not found on server
15/09/2006 05:17:52|Docking@Home|Started download of file lpdb_amino.rtf
15/09/2006 05:17:52|Docking@Home|Checksum or signature error for grid_probes.rtf
15/09/2006 05:17:53||Rescheduling CPU: project suspended by user
15/09/2006 05:17:54|Docking@Home|Unrecoverable error for result 1tng_mod0001_63_378078_0 (WU download error: couldn't get input files:<file_xfer_error> <file_name>grid_probes.rtf</file_name> <error_code>-163</error_code> <error_message>file was not found on server</error_message></file_xfer_error>)
15/09/2006 05:17:54|Docking@Home|Deferring scheduler requests for 1 minutes and 0 seconds
15/09/2006 05:17:54|Docking@Home|Unrecoverable error for result 1tng_mod0001_64_448585_1 (WU download error: couldn't get input files:<file_xfer_error> <file_name>grid_probes.rtf</file_name> <error_code>-163</error_code> <error_message>file was not found on server</error_message></file_xfer_error>)
15/09/2006 05:17:54|Docking@Home|Unrecoverable error for result 1tng_mod0001_65_284584_1 (WU download error: couldn't get input files:<file_xfer_error> <file_name>grid_probes.rtf</file_name> <error_code>-163</error_code> <error_message>file was not found on server</error_message></file_xfer_error>)
15/09/2006 05:17:54|Docking@Home|Unrecoverable error for result 1tng_mod0001_66_241073_1 (WU download error: couldn't get input files:<file_xfer_error> <file_name>grid_probes.rtf</file_name> <error_code>-163</error_code> <error_message>file was not found on server</error_message></file_xfer_error>)
15/09/2006 05:17:54|Docking@Home|Unrecoverable error for result 1tng_mod0001_67_149037_1 (WU download error: couldn't get input files:<file_xfer_error> <file_name>grid_probes.rtf</file_name> <error_code>-163</error_code> <error_message>file was not found on server</error_message></file_xfer_error>)
15/09/2006 05:17:54|Docking@Home|Unrecoverable error for result 1tng_mod0001_68_373871_0 (WU download error: couldn't get input files:<file_xfer_error> <file_name>grid_probes.rtf</file_name> <error_code>-163</error_code> <error_message>file was not found on server</error_message></file_xfer_error>)
15/09/2006 05:17:54|Docking@Home|Unrecoverable error for result 1tng_mod0001_69_430468_1 (WU download error: couldn't get input files:<file_xfer_error> <file_name>grid_probes.rtf</file_name> <error_code>-163</error_code> <error_message>file was not found on server</error_message></file_xfer_error>)
15/09/2006 05:17:54|Docking@Home|Incomplete read of less than 5KB for lpdb_amino.rtf - truncating
15/09/2006 05:17:54|Docking@Home|Temporarily failed download of lpdb_amino.rtf: HTTP file not found
15/09/2006 05:17:54|Docking@Home|Giving up on download of lpdb_amino.rtf: file was not found on server
15/09/2006 05:17:54|Docking@Home|Started download of file lpdb.prm
15/09/2006 05:17:54|Docking@Home|Checksum or signature error for lpdb_amino.rtf
15/09/2006 05:17:55|Docking@Home|Incomplete read of less than 5KB for lpdb.prm - truncating
15/09/2006 05:17:55|Docking@Home|Temporarily failed download of lpdb.prm: HTTP file not found
15/09/2006 05:17:55|Docking@Home|Giving up on download of lpdb.prm: file was not found on server
15/09/2006 05:17:55|Docking@Home|Started download of file lpdb_probes.prm
15/09/2006 05:17:55|Docking@Home|Checksum or signature error for lpdb.prm
15/09/2006 05:17:56|Docking@Home|Incomplete read of less than 5KB for lpdb_probes.prm - truncating
15/09/2006 05:17:56|Docking@Home|Temporarily failed download of lpdb_probes.prm: HTTP file not found
15/09/2006 05:17:56|Docking@Home|Giving up on download of lpdb_probes.prm: file was not found on server
15/09/2006 05:17:56|Docking@Home|Started download of file 1tng_mod0001_64_448585.inp
15/09/2006 05:17:56|Docking@Home|Checksum or signature error for lpdb_probes.prm
15/09/2006 05:18:42|Docking@Home|Started download of file grid_probes.rtf
15/09/2006 05:18:42|Docking@Home|Started download of file lpdb_amino.rtf
15/09/2006 05:18:43|Docking@Home|Incomplete read of less than 5KB for grid_probes.rtf - truncating
15/09/2006 05:18:43|Docking@Home|Incomplete read of less than 5KB for lpdb_amino.rtf - truncating
15/09/2006 05:18:43|Docking@Home|Temporarily failed download of grid_probes.rtf: HTTP file not found
15/09/2006 05:18:43|Docking@Home|Giving up on download of grid_probes.rtf: file was not found on server
15/09/2006 05:18:43|Docking@Home|Temporarily failed download of lpdb_amino.rtf: HTTP file not found
15/09/2006 05:18:43|Docking@Home|Giving up on download of lpdb_amino.rtf: file was not found on server
15/09/2006 05:18:43|Docking@Home|Started download of file lpdb.prm
15/09/2006 05:18:43|Docking@Home|Started download of file lpdb_probes.prm
15/09/2006 05:18:43|Docking@Home|Checksum or signature error for grid_probes.rtf
15/09/2006 05:18:43|Docking@Home|Checksum or signature error for lpdb_amino.rtf
15/09/2006 05:18:44|Docking@Home|Unrecoverable error for result 1tng_mod0001_70_156337_1 (WU download error: couldn't get input files:<file_xfer_error> <file_name>grid_probes.rtf</file_name> <error_code>-163</error_code> <error_message>file was not found on server</error_message></file_xfer_error><file_xfer_error> <file_name>lpdb_amino.rtf</file_name> <error_code>-163</error_code> <error_message>file was not found on server</error_message></file_xfer_error>)
15/09/2006 05:18:44|Docking@Home|Deferring scheduler requests for 1 minutes and 0 seconds
15/09/2006 05:18:44|Docking@Home|Unrecoverable error for result 1tng_mod0001_71_34287_1 (WU download error: couldn't get input files:<file_xfer_error> <file_name>grid_probes.rtf</file_name> <error_code>-163</error_code> <error_message>file was not found on server</error_message></file_xfer_error><file_xfer_error> <file_name>lpdb_amino.rtf</file_name> <error_code>-163</error_code> <error_message>file was not found on server</error_message></file_xfer_error>)
15/09/2006 05:18:44|Docking@Home|Unrecoverable error for result 1tng_mod0001_72_402547_1 (WU download error: couldn't get input files:<file_xfer_error> <file_name>grid_probes.rtf</file_name> <error_code>-163</error_code> <error_message>file was not found on server</error_message></file_xfer_error><file_xfer_error> <file_name>lpdb_amino.rtf</file_name> <error_code>-163</error_code> <error_message>file was not found on server</error_message></file_xfer_error>)
15/09/2006 05:18:44|Docking@Home|Unrecoverable error for result 1tng_mod0001_73_337348_1 (WU download error: couldn't get input files:<file_xfer_error> <file_name>grid_probes.rtf</file_name> <error_code>-163</error_code> <error_message>file was not found on server</error_message></file_xfer_error><file_xfer_error> <file_name>lpdb_amino.rtf</file_name> <error_code>-163</error_code> <error_message>file was not found on server</error_message></file_xfer_error>)
15/09/2006 05:18:44|Docking@Home|Unrecoverable error for result 1tng_mod0001_74_407127_1 (WU download error: couldn't get input files:<file_xfer_error> <file_name>grid_probes.rtf</file_name> <error_code>-163</error_code> <error_message>file was not found on server</error_message></file_xfer_error><file_xfer_error> <file_name>lpdb_amino.rtf</file_name> <error_code>-163</error_code> <error_message>file was not found on server</error_message></file_xfer_error>)
15/09/2006 05:18:44|Docking@Home|Deferring scheduler requests for 1 minutes and 41 seconds
15/09/2006 05:18:44|Docking@Home|Unrecoverable error for result 1tng_mod0001_75_247213_1 (WU download error: couldn't get input files:<file_xfer_error> <file_name>grid_probes.rtf</file_name> <error_code>-163</error_code> <error_message>file was not found on server</error_message></file_xfer_error><file_xfer_error> <file_name>lpdb_amino.rtf</file_name> <error_code>-163</error_code> <error_message>file was not found on server</error_message></file_xfer_error>)
15/09/2006 05:18:44|Docking@Home|Deferring scheduler requests for 2 minutes and 7 seconds
15/09/2006 05:18:44|Docking@Home|Unrecoverable error for result 1tng_mod0001_76_306711_1 (WU download error: couldn't get input files:<file_xfer_error> <file_name>grid_probes.rtf</file_name> <error_code>-163</error_code> <error_message>file was not found on server</error_message></file_xfer_error><file_xfer_error> <file_name>lpdb_amino.rtf</file_name> <error_code>-163</error_code> <error_message>file was not found on server</error_message></file_xfer_error>)
15/09/2006 05:18:44|Docking@Home|Deferring scheduler requests for 15 minutes and 27 seconds
15/09/2006 05:18:44|Docking@Home|Incomplete read of less than 5KB for lpdb.prm - truncating
15/09/2006 05:18:44|Docking@Home|Incomplete read of less than 5KB for lpdb_probes.prm - truncating
15/09/2006 05:18:44|Docking@Home|Temporarily failed download of lpdb.prm: HTTP file not found
15/09/2006 05:18:44|Docking@Home|Giving up on download of lpdb.prm: file was not found on server
15/09/2006 05:18:44|Docking@Home|Temporarily failed download of lpdb_probes.prm: HTTP file not found
15/09/2006 05:18:44|Docking@Home|Giving up on download of lpdb_probes.prm: file was not found on server
15/09/2006 05:18:44|Docking@Home|Checksum or signature error for lpdb.prm
15/09/2006 05:18:44|Docking@Home|Checksum or signature error for lpdb_probes.prm
<core_client_version>5.4.9</core_client_version>
<message>
WU download error: couldn't get input files:
<file_xfer_error>
<file_name>grid_probes.rtf</file_name>
<error_code>-163</error_code>
<error_message>file was not found on server</error_message>
</file_xfer_error>
<file_xfer_error>
<file_name>lpdb_amino.rtf</file_name>
<error_code>-163</error_code>
<error_message>file was not found on server</error_message>
</file_xfer_error>
</message>
I have reset the project, but BOINC still seems to d/l the 5.01 app and work.
EDIT: Out of curiosity I attached a Windows machine as well. There it downloads app 5.02, but I get the same type of download errors.
Same here, I have a P4 3.2 (on Windows) and it's showing 56% with 1:45 cpu time. Had to switch back to another project so this won't complete any time soon.
I am seeing a lot of disk-write activity with app 5.02. Can you check this? I think it also ignores the preferences. The worst problems are gone :)
Yes, I'm getting ~2 GB of disk reads every hour on each charmm_5.2.
That said, and with 2 results successfully uploaded, the excessive debug info in stderr.txt has been eliminated, but the excessive disk reads remain.
Also, memory usage went up from 13 to ~35 MB. That is still low memory usage, but I thought the extra 20 MB would help to eliminate the excessive disk reads...
Just returned 2 successful WUs. No crunching errors, no file errors.
However, I did notice that the reported time (12,550 sec approx.) does not agree with the message log run duration of approx. 4:10 for both WUs. There's about 40 minutes missing. The tasks ran without switching, and nothing else was running on the PC.
The 4:10 times I'm getting are on an XP2600 running W2K, quite a bit more than the estimate of 2 hours on a Celeron 2 GHz.
The estimated run time that is embedded in the WU when downloaded is still way out of sync with real run time, but the DCF seems to be working and adjusting the times of the remaining WUs in my queue. The low initial runtime estimate (28 minutes, if I recall?) still causes queues to be overfilled.
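To put numbers on the figures in this post, here is a small Python sketch of the arithmetic. The 28-minute initial estimate is the poster's own recollection, and the ratio computed at the end is only a plain actual/estimated ratio used for illustration; it is not claimed to be how BOINC itself updates the DCF.

```python
def to_seconds(hours, minutes=0, seconds=0):
    """Convert an h:m:s duration to seconds."""
    return hours * 3600 + minutes * 60 + seconds

reported_cpu = 12_550            # seconds, as reported for the result (approximate)
wall_clock = to_seconds(4, 10)   # 4:10 run duration from the message log

# Gap between the log duration and the reported time.
missing = wall_clock - reported_cpu
print(f"missing: {missing} s = {missing / 60:.1f} min")

# How far off the initial estimate was (28 minutes is recalled, not confirmed).
initial_estimate = to_seconds(0, 28)
ratio = wall_clock / initial_estimate
print(f"actual / estimated = {ratio:.2f}")
```

The gap comes out at 2,450 seconds, which matches the "about 40 minutes" above, and the run took roughly nine times the recalled initial estimate, which is why a queue sized from that estimate ends up overfilled.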
I have found the problem that is causing this error. It seems that our resultCollector (a fancy name for the file_deleter that does a little bit more) is consistently removing a couple of files from the download directory, which produces the -163 "file was not found on server" download errors in your logs. I have fixed this for any new WUs that will be created (I've put the no_delete flag in the workunit template for these files), but for the current ones I am still looking for a good solution, because I think that BOINC doesn't allow what we are currently doing with our files.
That's interesting. I checked my machines again and every result that I crunch takes 2.5 hours on a 2 GHz Celeron and 1.5 hours on a 3.2 GHz P4. The difference is that I am running Linux, and we all know that Linux performs a bit better than Windows... but 2 hours is quite a difference. We should try to get more data on this. Can any of the other Linux guys/girls comment on this?
Andre
Since I used up my daily quota on both machines, I d/l some new work in a Vista virtual machine. The d/l now works fine, and the WUs start off full of enthusiasm :) It looks like the problem is solved. Unfortunately my Vista installation is so sluggish and eats so much of my machine's resources, even when idle, that I won't let the WUs run till the end. I hope you don't mind <blush>
Should be better now after my temporary fix. Next problem we'll look at is the excessive disk writing. Richard has started on this already this afternoon.
Thanks for all of your patience :-)
Andre
One successful result returned, finally. WOOT!
No transfer problems.
You're on the right track now.
How's the app working for ya, running O.K.?
I guess I'm the first to try this; I checked boincstats and saw no Windows 98 hosts.
I attached my Windows 98 host, went off for breakfast, and came back to find it still downloading, or so I thought. No, it was downloading another workunit after chewing through 16 others at 5 seconds each.
The error in the result is:
<core_client_version>5.4.11</core_client_version>
<stderr_txt>
Starting charmm run...
CHARMM.OUT OPEN ERROR - Charmm exited with code 2.
Calling BOINC finish.
Is the 5.02 version of Docking@home just for Windows? My Linux machine still only downloads 5.01 workunits and they all take about 3 1/2 minutes. All have the same error code even though they say they are successful.
Error message is
Starting charmm run...
ERROR - Charmm exited with code 1.
That's about 320 total units with about 112 aborted and the rest 5.01 WU's that have all taken 3 1/2 minutes. I can't get 5.02 WU's and no 5.01 WU will run for 5 minutes let alone 2 to 4 hours.
As for the version numbers, I got the answer from Andre in another thread, so don't worry about the 5.02 and 5.01 thing: 5.02 is for Windows and 5.01 is for Linux and Macs.
The rest of my last post should be in the "problems with 5.01" thread, so sorry about that.
@ JShadic - No heartbeat means that the BOINC core client is having trouble finding the science application alive. The application should send an "alive" message periodically.
It can happen when another task takes too many CPU cycles, so the BOINC project application doesn't get any, since it runs at low priority.
Or it can happen on Windows machines when the clock on XP is updated to the correct time (done automatically in Windows), and the BOINC core client gets out of sync with the application.
More on wiki
http://boinc-wiki.ath.cx/index.php?title=No_heartbeat_from_core_client_-_exiting
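To illustrate the clock-update case, here is a toy Python sketch of a heartbeat check. It is not BOINC's actual code: the function `heartbeat_lost` and the 30-second timeout are hypothetical, and timestamps are plain numbers so the effect of a clock jump is easy to see.

```python
HEARTBEAT_TIMEOUT = 30.0  # seconds; hypothetical value chosen for illustration

def heartbeat_lost(last_beat, now, timeout=HEARTBEAT_TIMEOUT):
    """Return True when no heartbeat has been seen for longer than `timeout`."""
    return (now - last_beat) > timeout

last_beat = 1000.0  # wall-clock time of the last heartbeat

# One real second later, no clock change: everything looks healthy.
print(heartbeat_lost(last_beat, 1001.0))

# One real second later, but the system clock was just set forward by 60 s:
# the check sees 61 s of silence and raises a false alarm.
print(heartbeat_lost(last_beat, 1001.0 + 60.0))

# Clock set backward by 60 s: elapsed time is negative, so the check never
# fires until the clock catches up with the old timestamp.
print(heartbeat_lost(last_beat, 1001.0 - 60.0))
```

A check built on a monotonic clock (Python's `time.monotonic()`, for example) is not affected by system clock updates, which is one way to avoid this kind of false alarm.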
@ JShadic - No heartbeat means that the BOINC core client is having trouble finding the science application alive. The application should send an "alive" message periodically.
Note that this is done automatically by the BOINC library; it's not something the application should do 'by hand'.
Or it can happen on Windows machines when the clock on XP is updated to the correct time (done automatically in Windows), and the BOINC core client gets out of sync with the application.
Wonder what will happen if the clock goes *backwards* when it updates! "No heartbeat for -10 seconds"? :D
Yes, 5.2 is only for windows. I am currently working on a 5.2 for linux, because some of you report app crashes on linux (not everybody and I cannot reproduce any of these crashes on the system in my test lab). We also will release a fix for the validation problem soon. So many new versions to come.
Thanks
Andre
We haven't tested on win98 for the simple reason we don't have a system like that and have a hard time finding cd's to set one up. We could definitely use some help in that corner. Seems that our app cannot open its logfile charmm.out on your box. That will be a hard problem to solve since we don't even have a logfile.. Are the permissions set right on the boinc directory? (projects and or slots)
Thanks
Andre
I guess I'm the first to try this; I checked boincstats and saw no Windows 98 hosts.
I attached my Windows 98 host and went off for breakfast. When I came back I thought it was still downloading, but no: it was downloading another workunit after chewing through 16 others at 5 seconds each.
The error in the result is:
<core_client_version>5.4.11</core_client_version>
<stderr_txt>
Starting charmm run...
CHARMM.OUT OPEN ERROR - Charmm exited with code 2.
Calling BOINC finish.
I can give you full access via VNC to a Win98 host (virtual machine). Although it's Spanish version of Windows...
I've never had a problem like this on any of my 6 or 7 windows 98 hosts running any other BOINC projects/applications. They all run other BOINC projects without ever having to set any permissions so I don't know about that.
We haven't tested on win98 for the simple reason we don't have a system like that...(snip) Are the permissions set right on the boinc directory? (projects and or slots)
Thanks
Andre
Windows 9x doesn't even have a permissions/ownership system. That's only on NT-based Windows versions.
I suspect that your other WUs weren't successful, but only validated successful because of a bug in 5.1. Do you have any result numbers for us to check?
Thanks
Andre
My linux box just updated the application from 5.01 to 5.02. 5.01 was running just fine. all WU's that it crunched completed successfully.
5.02, on the other hand, is giving me the following error.
9/19/2006 6:31:22 AM Unrecoverable error for result 1tng_mod0001_1123_110866_3 (process exited with code 1 (0x1))
Any other info needed?
____________
D@H the greatest project in the world... a while from now!
I had picked out several wu's and listed them individually, but I inadvertently hit the "back" button, which erased my post before it got sent. :( oh well. Here is some of the info.
Anything else needed?
We have finally found the cause of the problem that some users were experiencing on their Linux systems. It has to do with the stacksize setting on your machine which is for some distros (SuSE 9.3 and 10 for example) set to unlimited and for others (FCx, Ubuntu, etc) set to a limited value like 10240. Your setting can be seen by typing 'ulimit -s' in a terminal. To make the Charmm 'exit 1' errors go away, please set the stacksize to unlimited using the command 'ulimit -s unlimited'. This is not saying that Charmm will use all of your memory (it won't), but it gives us a little bit more space to do our simulations correctly and without errors. Please let us know if this does not work for you. If it does work, please add this command to your shell initialization file (.bashrc, .tcshrc, .kshrc, etc) in your home directory. Of course don't forget to resume the D@H project on your boincmgr in case you suspended it before.
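For reference, a minimal sketch of the check and the change described above, written for a Bourne-style shell such as bash. Whether the limit can actually be raised depends on the hard limit of your account, which this sketch only reports and does not change.

```shell
# Show the current (soft) stack limit, in KB or "unlimited".
current=$(ulimit -s)
echo "current stack limit: $current"

# Try to raise it. This fails if the hard limit (ulimit -Hs) is lower.
if ulimit -s unlimited 2>/dev/null; then
    echo "stack limit now: $(ulimit -s)"
else
    echo "could not raise the stack limit; hard limit is $(ulimit -Hs)"
fi
```

The change applies only to the shell that runs it and to programs started from that shell afterwards.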
I am running a test now. The default ulimit was set to 8192 for Ubuntu 5.10. The test is now beyond the point where it used to exit (now close to 10%, where it used to exit at 4%), so it's looking good. Unfortunately I have to leave now, I will only see in the morning if it really ran to the end.
Remains the question why the program needs that much stack space. 8192 KB is a lot! How does the program come to that high usage? Is there very deep recursion in the coding? Or large chunks of memory that are put on the stack instead of allocating them from the heap?
____________
I suspect that charmm is using stack space to allocate memory instead of the heap. Or maybe it uses both I'm not sure. Also it's a piece of fortran code that is under development for more than 30 years now which doesn't make it easier to analyze ;-) We will get back to the charmm developers (a whole different community) to ask why the stack. For now there's not too much we can do except asking people to increase their stacksize.
Thanks
Andre
Where is ulimit located? I am getting "command not found", and I cannot locate it anywhere.
Just a thought. You said to put that command in the shell initialization file, but that won't work for me. I have BOINC running as a daemon. I am actually rarely logged into either one of my Linux boxes.
It will work even as a daemon: in the boinc start script or init script (or whatever means you use to start), put this command before you start the actual boinc process. Make sure to set the stack limit for the user that boinc runs under.
Andre
PS We are looking for a better solution (one where this hack on the user side is not necessary), but for now this is the workaround.
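A minimal sketch of that idea, assuming a Bourne-style start script. The function name run_with_big_stack is made up for illustration, and the demo runs a harmless command where a real script would start the BOINC client. Limits are inherited by child processes, so whatever the wrapper starts gets the raised limit.

```shell
# Hypothetical wrapper: raise the stack limit in a subshell, then replace
# that subshell with the given command so the command inherits the limit.
run_with_big_stack() (
    ulimit -s unlimited 2>/dev/null ||
        echo "warning: could not raise the stack limit" >&2
    exec "$@"
)

# Demo with a harmless command; a real start script would launch the
# BOINC client here instead (its path depends on your installation).
inherited=$(run_with_big_stack sh -c 'ulimit -s')
echo "child process sees stack limit: $inherited"
```

Because the limit is set inside a subshell, the script that calls the wrapper keeps its own limit unchanged.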
Sorry for all the noob questions. The reason I am running these Linux machines is to get a better handle on Linux, and it is working, slowly but surely.
OK, now on to my question...
In my init script, how do I set the stack limit for a particular user?
What distro are you running and which shell do you use?
Ubuntu 6.06, tcsh
Thanks.
Okay, you asking about the shell got me thinking that this is a bash-only command. So I changed my shell to bash, added the line to my .bashrc, and then rebooted for good measure.
I have boincmgr set to run at login via the sessions manager, so it started automatically. After 3 minutes or so, the WUs failed in the usual way.
I fired up a terminal and checked, yep, "unlimited". Everything looks right there.
So I quit boincmgr, went to the GUI filemanager, and double clicked boincmgr to start it again. After 3 minutes or so, the WUs still failed in the usual way.
So I quit boinc manager again, went back to the terminal, and launched boincmgr from the command line. This time, it appears to have worked. It's up to 9 minutes now.
Issues with this solution:
1) I don't like bash
2) This won't work when there is a power failure, as I have my machines set to automatically boot, log in, and run boincmgr. And if I have to launch it manually from the command line, it won't get fixed until whenever I notice and get back to the machine.
____________
Dublin, CA
Team
SETI.USA
1) On tcsh the command is called 'limit' and you set stacksize to unlimited with 'limit stacksize unlimited'. For ksh it is 'ulimit'.
Edit - I've updated the front page news as well with this info.
2) I never automatically boot, log in as a certain user, and run an app, so I don't know how this works. But somehow it must be possible to set your stack to unlimited. Could you run the ./run_manager script that comes standard with boinc? You could add the 'limit' command to that script and use that to fire up boincmgr either from the commandline or by clicking on it.
Let me know if that works.
Andre
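To summarise the shell differences mentioned above, here is a small sketch that prints the right command for a given shell name. The helper name stack_cmd_for is invented for this example, and treating every shell other than csh/tcsh as Bourne-style is an assumption.

```shell
# Hypothetical helper: print the command that lifts the stack limit
# for the shell whose name is passed as the first argument.
stack_cmd_for() {
    case "$1" in
        csh|tcsh) echo "limit stacksize unlimited" ;;
        *)        echo "ulimit -s unlimited" ;;   # bash, ksh, sh, ...
    esac
}

# Use the name of the login shell, falling back to sh if SHELL is unset.
login_shell=${SHELL:-sh}
stack_cmd_for "${login_shell##*/}"
```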
1) On tcsh the command is called 'limit' and you set stacksize to unlimited with 'limit stacksize unlimited'. For ksh it is 'ulimit'.
Thanks!
2) I never automatically boot, log in as a certain user, and run an app, so I don't know how this works. But somehow it must be possible to set your stack to unlimited. Could you run the ./run_manager script that comes standard with boinc? You could add the 'limit' command to that script and use that to fire up boincmgr either from the commandline or by clicking on it.
Yikes. I'm afraid that is beyond my skills. But a thought occurred to me: perhaps I can start boincmgr from one of the .cshrc/.bashrc files. Let me try that.
____________
Dublin, CA
Team
SETI.USA
Sorry, it's beyond my linux skills too :(
I opened a linux "session" (terminal) under root.
Typed: ulimit -s, result is 8192
I changed ulimit to unlimited, then restarted boinc.
I typed it again to make sure: ulimit -s, result is "unlimited"
I closed (exit) the session
Same problem...
So, I opened a session
I typed : ulimit -s
Bloody hell ! the value is 8192 again
After many years, Windows made me stupid, I'm afraid :(
My knowledge of linux is very poor, and I'm not sure I want to learn those crazy commands. It's chinese for me...
____________
In the terminal where you type 'ulimit -s unlimited' also start the boincmgr process. Every terminal that you open will have the setting 8192 again unless you put that command in a file called .bashrc in your home directory. That file can be edited with any GUI editor (doesn't have to be vi ;-)
The other method is editing the file run_manager in your BOINC directory and add the line there:
ulimit -s unlimited
cd "/data/BOINC" && exec ./boincmgr $@
Then use the command run_manager to start boinc.
Hope that makes it a little clearer...
Andre
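The behaviour described above (each terminal keeps its own limit, and programs started from a terminal inherit that terminal's limit) can be seen with a small experiment. This is only a sketch; the value 2048 is arbitrary and assumes your hard limit is at least that high.

```shell
# Limit as seen by this shell before the experiment.
before=$(ulimit -s)

# Change the limit inside a subshell and start a child process from it.
# The child inherits the subshell's limit.
child=$( ulimit -s 2048; sh -c 'ulimit -s' )

# The change did not leak back into this shell.
after=$(ulimit -s)

echo "before: $before, child saw: $child, after: $after"
```

This is why the value was back at 8192 in a newly opened session: the new terminal never saw the change made in the old one.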
Great! Happy to hear that :-)
Linux is not so hard after all, but watch out you might get addicted to it ;-)
Andre
2) I never automatically boot, log in as a certain user, and run an app, so I don't know how this works. But somehow it must be possible to set your stack to unlimited. Could you run the ./run_manager script that comes standard with boinc? You could add the 'limit' command to that script and use that to fire up boincmgr either from the commandline or by clicking on it.
yikes. I'm afraid that is beyond my skills. But a thought occured to me. perhaps I can start boincmgr from one of the .cshrc/.bashrc files. Let me try that.
I figured out how to do it after all. I added "limit stacksize unlimited" as the first line in run_manager (my shell is tcsh). Then I went into the sessions -> startup items.
Deleted boincmgr
added run_manager
Works like a charm!
Maybe you could put the ulimit command in the startup script before the daemon is started?
Any other gentoo boxes out there who still have the problem? If not, how did you solve it?
Thanks
Andre
Hi Andre,
I am running Gentoo Linux and have BOINC starting as a daemon at boot under a user called boinc. I installed BOINC from portage (Gentoo ebuild) and am running version 5.5.6.
To implement this work around, I su'd to boinc and edited /home/boinc/.bashrc
I added the line "ulimit -s unlimited" to the start of this script before it checks whether it is running an interactive shell (so that it takes this setting either way). I confirmed that the setting was holding by closing all terminal windows, opening a new window, su'ing to boinc and running ulimit
Where is ulimit located? I am getting "command not found", and I cannot locate it anywhere.
I'm having the same problem, running Xubuntu (XFCE uses Terminal). Ulimit, limit, none of them work! :-(
Additionally, this is what happened when I've crunched my first WU:
Wed 20 Sep 2006 05:36:36 PM AST|Docking@Home|Starting task 1tng_mod0001_1530_1466_4 using charmm version 502
Wed 20 Sep 2006 05:40:44 PM AST|Docking@Home|Unrecoverable error for result 1tng_mod0001_1530_1466_4 (process exited with code 1 (0x1))
Wed 20 Sep 2006 05:40:44 PM AST|Docking@Home|Deferring scheduler requests for 1 minutes and 0 seconds
Wed 20 Sep 2006 05:40:44 PM AST||Rescheduling CPU: application exited
Wed 20 Sep 2006 05:40:44 PM AST|Docking@Home|Computation for task 1tng_mod0001_1530_1466_4 finished
ulimit is a builtin command of bash (and other Bourne-style shells), not a separate program on disk. I am not familiar with Xubuntu/XFCE, but its shell should have a similar command. You can try looking in the man pages for references to the stack size.
I'm not sure if the error you are getting is related, but I will look into this too. Thanks
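A quick way to see why ulimit cannot be found on disk: in Bourne-style shells it is built into the shell itself. A small sketch, assuming bash or another sh-compatible shell:

```shell
# Ask the shell what kind of command "ulimit" is.
kind=$(type ulimit)
echo "$kind"
```

In csh and tcsh there is no ulimit builtin, which would explain a "command not found"; the equivalent there is 'limit stacksize'.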
Lost power overnight (damn storms!) and my UPS shut down my computers... hopefully will work to report soon enough.
Paul.
[off-topic]At least you have a UPS! Yesterday, AND the day before, I had 30-second long power outages at night. Both times VMware was running, so that's two operating systems being shut down uncleanly. And only one of the times could I be bothered to turn the computer back on and start up everything again, so I lost CPU time (BOINC) and download time.[/off-topic]
Back on topic... Looks like adding the ulimit command to the startup script did the job. This result (http://docking.utep.edu/result.php?resultid=20675) looks to have completed successfully!
Sorry to bother again, but I can't for the life of me find anything even remotely related to that command. I've looked *everywhere*... :-(
Oh, and the same thing happened with a second WU now:
Fri 22 Sep 2006 12:21:01 PM AST|Docking@Home|Starting task 1tng_mod0001_1127_24451_6 using charmm version 502
Fri 22 Sep 2006 12:25:26 PM AST|Docking@Home|Unrecoverable error for result 1tng_mod0001_1127_24451_6 (process exited with code 1 (0x1))
Fri 22 Sep 2006 12:25:26 PM AST|Docking@Home|Deferring scheduler requests for 1 minutes and 0 seconds
Fri 22 Sep 2006 12:25:26 PM AST||Rescheduling CPU: application exited
Fri 22 Sep 2006 12:25:26 PM AST|Docking@Home|Computation for task 1tng_mod0001_1127_24451_6 finished
It sounds like you need advice from someone who has more intimate knowledge of the distro you run. You might wait a long time before someone with that knowledge shows up in this small forum and stumbles upon your post. Have you tried explaining your problem in a forum dedicated to the distro you're running? I think you would get fairly quick results that way.
I got these messages in my BOINC when I looked there:
Sam 23 Sep 2006 11:07:43 CEST|Docking@Home|Resuming task 1tng_mod0001_1254_335479_3 using charmm version 502
Sam 23 Sep 2006 11:47:17 CEST|Docking@Home|Computation for task 1tng_mod0001_1254_335479_3 finished
Sam 23 Sep 2006 11:47:20 CEST|Docking@Home|Started upload of file 1tng_mod0001_1254_335479_3_0
Sam 23 Sep 2006 11:47:20 CEST|Docking@Home|Started upload of file 1tng_mod0001_1254_335479_3_1
Sam 23 Sep 2006 11:47:22 CEST|Docking@Home|Error on file upload: invalid signature
Sam 23 Sep 2006 11:47:22 CEST|Docking@Home|Error on file upload: invalid signature
Sam 23 Sep 2006 11:47:22 CEST|Docking@Home|Permanently failed upload of 1tng_mod0001_1254_335479_3_0
Sam 23 Sep 2006 11:47:22 CEST|Docking@Home|Giving up on upload of 1tng_mod0001_1254_335479_3_0: server rejected file
Sam 23 Sep 2006 11:47:22 CEST|Docking@Home|Permanently failed upload of 1tng_mod0001_1254_335479_3_1
Sam 23 Sep 2006 11:47:22 CEST|Docking@Home|Giving up on upload of 1tng_mod0001_1254_335479_3_1: server rejected file
Sam 23 Sep 2006 11:47:22 CEST|Docking@Home|Started upload of file 1tng_mod0001_1254_335479_3_2
Sam 23 Sep 2006 11:47:22 CEST|Docking@Home|Started upload of file 1tng_mod0001_1254_335479_3_3
Sam 23 Sep 2006 11:47:24 CEST|Docking@Home|Error on file upload: invalid signature
Sam 23 Sep 2006 11:47:24 CEST|Docking@Home|Permanently failed upload of 1tng_mod0001_1254_335479_3_3
Sam 23 Sep 2006 11:47:24 CEST|Docking@Home|Giving up on upload of 1tng_mod0001_1254_335479_3_3: server rejected file
Sam 23 Sep 2006 11:47:28 CEST|Docking@Home|Error on file upload: invalid signature
Sam 23 Sep 2006 11:47:28 CEST|Docking@Home|Permanently failed upload of 1tng_mod0001_1254_335479_3_2
Sam 23 Sep 2006 11:47:28 CEST|Docking@Home|Giving up on upload of 1tng_mod0001_1254_335479_3_2: server rejected file
I don't know what happened, as I see the corresponding result in my account as "Checked, but no consensus yet", but at least successfully uploaded.
What went wrong, and where? And did anything go wrong at all besides the worrisome messages popping up?
It sounds like you need advice from someone who has more intimate knowledge of the distro you run. You might wait a long time before someone with that knowledge shows up in this small forum and stumbles upon your post. Have you tried explaining your problem in a forum dedicated to the distro you're running? I think you would get fairly quick results that way.
Nevermind, I've found the command with the help of someone, and my stacksize is already set to unlimited. So, the errors I get must be coming from something else...
Saenger,
I wouldn't worry too much about this one. The validator is confused about this wu, because there is one valid result which actually ended with an error. This one was crunched with the 5.1 app, which had an error and was replaced with 5.2 for that reason. So basically the validator gets 4 valid results, of which 1 is different from the other 3, and this makes it set the validate_state of this wu to 4 (no consensus yet). The one that is still pending will determine the final result, I suspect (and hope).
Hope that helps explain it...
Thanks, Andre
PS I'm on vacation for 4 days starting tomorrow. Going to check out the Grand Canyon :-)
I don't worry about the credits, I worry about the messages.
My BOINC says it's not uploaded, it went wrong, sorry, but it failed.
My account here says everything's fine, no probs at all.
Both can't be right, my question is why they don't agree.
If I go to the PCLinux forums and talk about BOINC, I get no reply for my distro.
Maybe next week's distro du jour will take the 'ulimit -s unlimited'.
Until that time, I have tried 14 times with no luck; I will try each day to get some results that help "Docking".
Thanks Andre
Can you show your ulimit output using the 'ulimit -a' command on the same terminal you started your boinc client on? (start a terminal, cd into your BOINC directory, enter 'ulimit -s unlimited', enter 'run_manager.sh &', enter 'ulimit -a')
Thanks
Andre
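Besides checking the terminal, it can help to look at the limit of the process that is actually running. The sketch below assumes a Linux kernel recent enough to provide /proc/PID/limits; the helper name stack_limit_of is made up, and the demo inspects the current shell rather than BOINC.

```shell
# Hypothetical helper: print the stack limit line of a running process.
stack_limit_of() {
    grep '^Max stack size' "/proc/$1/limits"
}

# Demo on the current shell. For BOINC you would pass the client's PID,
# for example the output of: pgrep -x boinc_client
# (the process name is an assumption and may differ on your system).
own=$(stack_limit_of $$)
echo "$own"
```

The two columns after the label are the soft and hard limits, followed by the unit.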
:-(
Wed 27 Sep 2006 10:02:25 AM AST|Docking@Home|Starting task 1tng_mod0001_4039_71682_2 using charmm version 502
Wed 27 Sep 2006 10:07:23 AM AST|Docking@Home|Unrecoverable error for result 1tng_mod0001_4039_71682_2 (process exited with code 1 (0x1))
Wed 27 Sep 2006 10:07:23 AM AST|Docking@Home|Deferring scheduler requests for 1 minutes and 0 seconds
Wed 27 Sep 2006 10:07:23 AM AST||Rescheduling CPU: application exited
Wed 27 Sep 2006 10:07:23 AM AST|Docking@Home|Computation for task 1tng_mod0001_4039_71682_2 finished
Anyone? My stacksize is already set to unlimited.
Ah, seems you haven't downloaded a boinc client from boinc.berkeley.edu but installed one from an rpm or deb package. That may mean that you don't have a run_manager.sh script... Can you do the same, but instead of run_manager.sh run boincmgr? Let me know what that does.
Andre
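The "Exit 127" reported for run_manager.sh in the posted output is the shell's standard exit status for a command it could not find, which fits a missing script. A small sketch that reproduces it with a deliberately nonexistent command name:

```shell
# Run a command that does not exist and capture the exit status.
sh -c 'no_such_command_xyz' 2>/dev/null
status=$?
echo "exit status: $status"   # 127 means "command not found"
```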
The problem is, I can't find my BOINC directory. That may sound stupid, but all I can see is the executables in /usr/bin (boinc_client, boinc_cmd, boincmgr), and I have no idea where all the data is stored. There's nothing in ~/ either except a small text file with basic settings. Anyhow, executing those commands in /usr/bin gave me the following:
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
max nice (-e) 20
file size (blocks, -f) unlimited
pending signals (-i) unlimited
max locked memory (kbytes, -l) unlimited
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) unlimited
max rt priority (-r) unlimited
stack size (kbytes, -s) unlimited
cpu time (seconds, -t) unlimited
max user processes (-u) unlimited
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
[1]+ Exit 127 run_manager.sh
I don't know what to make of it. :-/
____________
D@H the greatest project in the world... a while from now!
Indeed, this is the boinc client that comes packaged with Xubuntu, which I've downloaded through Synaptic. But when I try that command, it gives me a "command not found" error. I've tried:
run_boincmgr
run_boincmgr &
run boincmgr
run boincmgr &
None of them work. As you can see I'm still a n00b with GNU/Linux, so go easy on me. ;-)
Ubuntu is based on Debian. I am running Debian and I am using the version of BOINC released through them. My solution to the ulimit problem was to go to /etc/init.d and find the startup script for BOINC. I edited the startup script and put the ulimit command close to the beginning of the script, before any other commands were executed.
Jim
No joy here, only hard resets for computation errors to endure. I tried the ulimit setting after a couple, still hanging. Ah well, this old Athlon with its new Red Hat has failed every WU of nearly every kind for a few weeks, after running all kinds successfully (even sap) for months. I've run memtest and all that other sort of stuff; perhaps it's time to re-emerge, or recycle that PC, lol.
Thanks for letting me play eh.
Thank you, Jim. I've done that to /etc/init.d/boinc-client. Is there anything else I should do, or will the problem be fixed now?
Couple of questions:
Did you stop BOINC after you changed the setting?
Where did you add the command ulimit -s unlimited: to a file, or did you just type it in your shell?
Good morning Memo,
After the first 12 were crunched (and reported) I opened a terminal window and entered 'ulimit -s' at the prompt. The reply was 8192.
At that moment BOINC was still running, doing Seti and Rosetta.
I then entered 'ulimit -s unlimited' and left the terminal window open.
The newly downloaded WU seemed to get to 4.something% this time but then stopped anyway (so did the remaining 11).
Then I reran the 'ulimit -s' command to check the stack size, but the reply was still 8192, so it did not seem to have changed.
Did I do something wrong?
Greetings
Rene
Rene
The thing is that the command must be run by a script before BOINC starts.
If you run in text mode, adding the command to .bashrc will do. If it's running in graphical mode, I believe (I run BOINC in text mode) it has to be in the run_client script in the BOINC directory.
Don't forget to restart BOINC so this setting is picked up by the client.
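A small sketch of why typing the command in a separate terminal has no effect on a BOINC that is already running: a limit set with ulimit applies only to the shell that ran it and to whatever that shell starts afterwards. The value 4096 is only an illustration, and the sketch assumes a POSIX-style sh whose ulimit accepts -S and -s.

```shell
#!/bin/sh
# Record the soft stack limit of this (parent) shell.
before=$(ulimit -S -s)

# A child shell changes its own soft stack limit and reports it.
# 4096 is an arbitrary illustrative value, not a recommendation.
child=$(sh -c 'ulimit -S -s 4096 2>/dev/null; ulimit -S -s')

# The parent shell's limit is untouched by what the child did.
after=$(ulimit -S -s)

echo "parent before: $before"
echo "child shell:   $child"
echo "parent after:  $after"
```

The same logic applies in the other direction: a client started earlier keeps the limit it inherited at start-up, which is why the setting has to go into the script that launches it.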
Let me know if you have more problems.
That's where I put it the first time (run_client), thinking the run_manager script would trigger run_client.
Now it seems that I've fixed it.
I've edited the run_manager script (the one that I use to start up the manager) and added "ulimit -s unlimited" at the beginning.
The WU has now been running for over an hour and has reached approx. 70%.
Thanks and will report back if the first ones have been crunched.
Just stop / start or restart your boinc client. Other than that, it should work.
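One way to check that the restarted client really picked up the new limit, as a rough sketch: on Linux kernels that provide /proc/<pid>/limits, the running process's limits can be read directly. The function name stack_limit_of is made up for this example, and the process name boinc_client in the comment is an assumption based on the executables mentioned earlier in the thread.

```shell
#!/bin/sh
# Print the stack-size line from /proc/<pid>/limits (Linux only).
# stack_limit_of is a hypothetical helper name used for this sketch.
stack_limit_of() {
    limits="/proc/$1/limits"
    if [ -r "$limits" ]; then
        grep 'Max stack size' "$limits"
    else
        echo "cannot read $limits" >&2
        return 1
    fi
}

# Demo on the current shell. For BOINC, pass the client's PID instead,
# for example the output of: pidof boinc_client   (process name assumed)
stack_limit_of $$
```

The values in that file are given in bytes, so a limit of 8192 kbytes shows up as 8388608.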
I've got some finished wu's now (pending) and all seems well.
So here's what did the trick for me:
Close (if needed) the BOINC manager.
Open Gedit ---> Open run_manager in the BOINC directory --->
Add "ulimit -s unlimited" at the beginning of the file --->
Save the file and use this to start up the BOINC manager.
You can also save the file under another name (run_manager_docking or something) and use that script for as long as it is needed here at Docking.
Greetings
Rene
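The steps above could also be done with a separate wrapper instead of editing run_manager itself. This is only a sketch under assumptions: the BOINC_DIR default is a guess and must be pointed at the directory that actually holds run_manager, and the fallback to the hard limit is an addition for systems where "unlimited" is not permitted.

```shell
#!/bin/sh
# run_manager_docking: wrapper name borrowed from the post above.
# BOINC_DIR is an assumption; set it to the directory holding run_manager.
BOINC_DIR="${BOINC_DIR:-$HOME/BOINC}"

# Raise the soft stack limit. Try "unlimited" first; if the hard limit
# forbids that, fall back to the highest value the hard limit allows.
ulimit -S -s unlimited 2>/dev/null || ulimit -S -s "$(ulimit -H -s)"
echo "soft stack limit is now: $(ulimit -S -s)"

# Hand over to the stock script so the manager inherits the new limit.
if [ -x "$BOINC_DIR/run_manager" ]; then
    cd "$BOINC_DIR" && exec ./run_manager "$@"
else
    echo "run_manager not found in $BOINC_DIR" >&2
fi
```

Keeping the change in a wrapper leaves the stock run_manager untouched, so it is easy to go back once the setting is no longer needed.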
Just as a side note, this setting will not affect any other project. It just gives charmm a little more space to work, that's all.
I have added a new computer running Linux and added the "ulimit -s unlimited" command to run_manager.
My first lot of work units (40 of them, 10 per CPU) have all errored out with either 'error code 2' or an error message saying it can't get input files.
Computer is this one:- http://docking.utep.edu/show_host_detail.php?hostid=410
Am I missing something?
Error 2 means that the app cannot open its own logfile called charmm.out. Could you check permissions on the slots and projects directories, etc?
Andre
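A quick way to run the permission check Andre asks for, as a sketch only: the helper name check_writable and the BOINC_DIR default are assumptions, and the check is only meaningful when run as the same user that the BOINC client runs as.

```shell
#!/bin/sh
# Report whether each given directory exists and can be entered and written to
# by the current user. check_writable is a hypothetical helper name.
check_writable() {
    status=0
    for dir in "$@"; do
        if [ -d "$dir" ] && [ -w "$dir" ] && [ -x "$dir" ]; then
            echo "ok: $dir"
        else
            echo "problem: $dir"
            status=1
        fi
    done
    return $status
}

# BOINC_DIR is an assumption; point it at the real BOINC data directory.
BOINC_DIR="${BOINC_DIR:-$HOME/BOINC}"
check_writable "$BOINC_DIR/slots" "$BOINC_DIR/projects" ||
    echo "at least one directory is missing or not writable" >&2
```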
Hello Andre,
No restrictions that I can see. Set up the same as my other Linux machine.
Both are AMD Opteron computers; the working one is an 848 (2 CPUs) and the one having trouble is a 275 (2 dual-core CPUs). The 275 machine has no trouble running CP, Einstein, Rosetta, Ralph, QMC and Predictor.
The 848 computer also runs QMC, Rosetta, Einstein, Ralph, LHC.
Where is the 'charmm.out' file kept? I am unable to find it on either machine when I look in the project folder or the Boinc folder, I have but 5 files in the project folder (both machines).
You can find them in one of the folders in ../BOINC/SLOTS
The folders are called "0", "1", "2", etc., depending on how many projects are running.
The files we send back to the server are actually 'symlinks' in the slots directory. These files point to files called 1tng_xxxx_xxxxxx_x_x in the projects directory that actually contain the real content. The file name resolving is being done by the boinc client.
Andre
One thing to try is suspending the job that is going to crash right after you have downloaded it, and checking which files are present in the projects and slots directories. The charmm.out file will be named something like 1tng_xxxx_xxxxxx_x_3 in the projects directory (the charmm.out file in the slots directory will contain the real file name). Let me know what you find.
Andre
I will try, Andre, but with that last lot all terminating in seconds due to the errors, even if I had been home (I was at work) I would have had trouble trapping a WU to see what was in the SLOT directory. I notice that the SLOT folders only hold information while a project is being processed.
I am also trying to work out why the 848 machine only has 7 SLOT folders (with 6 projects) but the 275 machine has 17 SLOT folders for 7 projects. It seems a bit weird, but as the extras hold no data it is no real problem.
As I type this the 275 machine has Docking work downloading so we shall see what I find.
Well, it did not get better. After 3 minutes 20 seconds all the WUs started to error out, so I suspended the project.
The "charmm.out" SLOT file held this information:
<soft_link>../../projects/docking.utep.edu/1tng_mod0001_5104_384550_0_3</soft_link>
The Project folder held these files (plus all WU files):
grid_probes.rtf 1pdb_amino.rtf 1pdb.prm 1pdb_probes.prm charmm_5.2_i686-pc-linux-gnu
I am now getting "Unrecoverable error for result xxxxx (process exited with code 1 (0x1))".
This is the same as the original error with Linux machines, but I have added the 'ulimit -s unlimited' command in the 'run_manager' BOINC file.
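For anyone who wants to follow such a slot file to the real file by hand, here is a minimal sketch that pulls the path out of the soft_link tag with sed. It assumes the file holds a single soft_link element on one line, as in the post above; resolve_soft_link is a made-up helper name, and the demo writes the example path from this thread into a temporary file rather than touching a real slot.

```shell
#!/bin/sh
# Print the target path stored in a BOINC slot "soft link" file.
# resolve_soft_link is a hypothetical helper name used for this sketch.
resolve_soft_link() {
    sed -n 's:.*<soft_link>\(.*\)</soft_link>.*:\1:p' "$1"
}

# Demo with the content quoted in the post, written to a temporary file.
demo=$(mktemp)
printf '%s\n' '<soft_link>../../projects/docking.utep.edu/1tng_mod0001_5104_384550_0_3</soft_link>' > "$demo"
target=$(resolve_soft_link "$demo")
echo "$target"
rm -f "$demo"
```

The path is relative to the slot directory, so it has to be resolved from inside that directory to reach the file under projects.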
Yay! Thank you all who helped me. Now Docking@Home is working properly for me. Anyone running Xubuntu who doesn't know what to do, here are the summarized steps:
sudo nano -w /etc/init.d/boinc-client
- Add the text "ulimit -s unlimited" to the beginning of the file and save that file (Ctrl+O). Then:
>>> I did not have to add the 'ulimit' command to 'run_client' on my other working machine (AMD Opteron 848, 2 CPUs, same OS: Fedora Core 3), but I will try.
OK, I stopped BOINC and added the 'ulimit -s unlimited' command as the first line of 'run_client'. Started BOINC, but still the same: error code 1 after 3 minutes 21 seconds.
Right, nothing for it, I will reboot the computer.
You little ripper, she's a goer now and has gone past 9 minutes for the first time and is still crunching. It says the WU will take 35 minutes 13 seconds; I will wait.
It would appear that I may not have needed the 'ulimit -s unlimited' command in 'run_client' if I had rebooted the machine in the first place. I made the assumption that, as I did not have to reboot the first computer, I did not need to reboot this one. How wrong I was. Crunching now for 19 minutes.
As it looks like it is going to take longer than 35 minutes, I will give an update later as I have to go to work.
All now working OK; I have now processed a successful WU after 1 hour 25 minutes. Also have 6 pending.
I should have done the reboot at the start.
Thanks for everyone's help, all systems go.
Well done... ;-)
Let's hope that an app update will make this "hack" unnecessary.