Charmm 5.02
Author | Message | |
---|---|---|
Can you pls post changes with new versions? Thx |
||
ID:
140 | Rating: 0
||
No luck with 5.02 and the upload file size (:-
|
||
ID:
158 | Rating: 0
||
Yes, we know. Richard posted yesterday that 5.02 will only fix the excessive debugging info in stderr.txt. The fix for the -131 error will be deployed today and is only a change in the input file of the app. All existing wu's will have to be aborted for this though. Keep an eye on the news for the next few hours.
No luck with 5.02 and the upload file size (:- |
||
ID:
165 | Rating: 0
||
Tried downloading wu's but got the following:
|
||
ID:
173 | Rating: 0
||
Don't know where that comes from yet. See it on our test system as well. I'm looking into it. Thanks.
Tried downloading wu's but got the following: |
||
ID:
174 | Rating: 0
||
Tried on another box and got the same |
||
ID:
175 | Rating: 0
||
Just attached one new computer to the project and downloaded some WUs without any problem.
|
||
ID:
179 | Rating: 0
||
Just started downloading, problem seems to have been solved.
|
||
ID:
180 | Rating: 0
||
First wu still ok after 2hrs |
||
ID:
187 | Rating: 0
||
All existing workunits have been cancelled and about 500 new workunits have been created. These take about 1.5 hours on a P4 3.2 GHz and 2.5 hours on a Celeron 2 GHz. Please reset your project or detach and re-attach to start crunching the new wu's. Thanks for all the help!
First wu still ok after 2hrs |
||
ID:
203 | Rating: 0
||
Have tried downloading some of the new wu's and got the following again:
|
||
ID:
205 | Rating: 0
||
Strange. I got my several WUs about 30mins ago fine. |
||
ID:
206 | Rating: 0
||
Had the same problem yesterday, then it seemed to be ok later on
|
||
ID:
207 | Rating: 0
||
I still have the same download problem on my Linux box. See
http://docking.utep.edu/result.php?resultid=9227
|
||
ID:
210 | Rating: 0
||
Just reset the project and downloaded 12 brand-new results with 5.02 to crunch; no download errors.
|
||
ID:
213 | Rating: 0
||
1.5 h on a P4 3.2 HT? That must be a joke. I am now at ~20% after 1 h, so about 5 h to complete :/ |
||
ID:
215 | Rating: 0
||
Same here, I have a P4 3.2 (on Windows) and it's showing 56% with 1:45 cpu time. Had to switch back to another project so this won't complete any time soon. |
||
ID:
216 | Rating: 0
||
I've got a P4 1.6GHz currently at 48% after 3h50 so I'm guessing about 8.5-9 hours for completion. |
||
ID:
218 | Rating: 0
||
I've successfully completed and uploaded 2 results in 2 hours.
|
||
ID:
219 | Rating: 0
||
I am seeing a lot of disk write activity with app 5.02. Can you check this? I think it also ignores the preferences. The worst things are gone :) |
||
ID:
221 | Rating: 0
||
Yes, I'm getting ~2GB of disk reads every hour on each charmm_5.2
|
||
ID:
222 | Rating: 0
||
Just returned 2 successful WUs. No crunching errors, no file errors.
|
||
ID:
224 | Rating: 0
||
I detached/reattached my windows box this morning. Downloaded about 15 wu's. All errored out on download. See http://docking.utep.edu/result.php?resultid=9467 |
||
ID:
225 | Rating: 0
||
I have found the problem that is causing this error. It seems that our resultCollector (a fancy name for the file_deleter that does a little bit more) is consistently removing a couple of files from the download directory, so that you see the error below in your logs. I have fixed this for any new wu's that will be created (I've put the no_delete flag in the workunit template for these files), but for the current ones I am still looking for a good solution, because I think that boinc doesn't allow what we are currently doing with our files.
I still have the same download problem on my Linux box. See http://docking.utep.edu/result.php?resultid=9227 |
||
ID:
227 | Rating: 0
||
That's interesting. I checked my machines again and every result that I crunch takes 2.5 hours on a 2 GHz Celeron and 1.5 hours on a 3.2 GHz P4. The difference is that I am running Linux, and we all know that Linux performs a bit better than Windows... but 2 hours is quite a difference. We should try to get more data on this. Can any of the other Linux guys/girls comment on this?
Just returned 2 successful WUs. No crunching errors, no file errors. |
||
ID:
228 | Rating: 0
||
Another one finished fine
|
||
ID:
229 | Rating: 0
||
We'll check into this.
I am seeing a lot of disk write activity with app 5.02. Can you check this? I think it also ignores the preferences. The worst things are gone :) |
||
ID:
230 | Rating: 0
||
Since I used up my daily quota on both machines, I d/l some new work in a Vista virtual machine. The d/l now works fine, and the WUs start off full of enthusiasm :) It looks like the problem is solved. Unfortunately my Vista installation is so sluggish and eats so much of my machine's resources, even when idle, that I won't let the WU run till the end. I hope you don't mind <blush> |
||
ID:
231 | Rating: 0
||
My P4 3.2 HT on Win XP Pro with 2GB RAM took 4:49 h. I think the disk writing ate up some of that time. Oh, Linux ^^ |
||
ID:
232 | Rating: 0
||
One successful returned finally WOOT!
|
||
ID:
243 | Rating: 0
||
Should be better now after my temporary fix. Next problem we'll look at is the excessive disk writing. Richard has started on this already this afternoon.
One successful returned finally WOOT! |
||
ID:
244 | Rating: 0
||
I guess I'm the first to try this; I checked boincstats and saw no Windows 98 hosts.
|
||
ID:
299 | Rating: 0
||
I am wondering, has anyone seen a result like this one?
|
||
ID:
327 | Rating: 0
||
Is the 5.02 version of Docking@home just for Windows? My Linux machine still only downloads 5.01 workunits and they all take about 3 1/2 minutes. All have the same error code even though they say they are successful.
|
||
ID:
328 | Rating: 0
||
As for the version numbers, I got the answer from another thread by Andre, so don't worry about the 5.02 and 5.01 thing: 5.02 is for Windows and 5.01 is for Linux and Macs.
|
||
ID:
329 | Rating: 0
||
@ JShadic - No heartbeat means that the BOINC core client is having trouble finding the science application alive. The application should send an "alive" message periodically.
|
||
ID:
332 | Rating: 0
||
@ JShadic - No heartbeat means that the BOINC core client is having trouble finding the science application alive. The application should send an "alive" message periodically.
Note that this is done automatically by the BOINC library; it's not something the application should do 'by hand'. Or, it can happen on Windows machines when the clock on XP is updated to the correct time (done automatically in Windows), and the BOINC core client gets out of sync with the application. Wonder what will happen if the clock goes *backwards* when it updates! "No heartbeat for -10 seconds"? :D |
||
ID:
347 | Rating: 0
||
@ JShadic - No heartbeat means that the BOINC core client is having trouble finding the science application alive. The application should send an "alive" message periodically.
Note that this is done automatically by the BOINC library; it's not something the application should do 'by hand'. Or, it can happen on Windows machines when the clock on XP is updated to the correct time (done automatically in Windows), and the BOINC core client gets out of sync with the application. Wonder what will happen if the clock goes *backwards* when it updates! "No heartbeat for -10 seconds"? :D
EDIT: any mod around to delete my doublepost? :( |
||
ID:
348 | Rating: 0
||
Yes, 5.2 is only for windows. I am currently working on a 5.2 for linux, because some of you report app crashes on linux (not everybody and I cannot reproduce any of these crashes on the system in my test lab). We also will release a fix for the validation problem soon. So many new versions to come.
Is the 5.02 version of Docking@home just for Windows? My Linux machine still only downloads 5.01 workunits and they all take about 3 1/2 minutes. All have the same error code even though they say they are successful. |
||
ID:
349 | Rating: 0
||
We haven't tested on win98 for the simple reason we don't have a system like that and have a hard time finding cd's to set one up. We could definitely use some help in that corner. Seems that our app cannot open its logfile charmm.out on your box. That will be a hard problem to solve since we don't even have a logfile.. Are the permissions set right on the boinc directory? (projects and or slots)
I guess I'm the first to try this; I checked boincstats and saw no Windows 98 hosts. |
||
ID:
350 | Rating: 0
||
Correct. This should be taken care of by the boinc client. We don't touch that functionality at all (wouldn't know how to :-)
|
||
ID:
351 | Rating: 0
||
We haven't tested on win98 for the simple reason we don't have a system like that and have a hard time finding cd's to set one up. We could definitely use some help in that corner. Seems that our app cannot open its logfile charmm.out on your box. That will be a hard problem to solve since we don't even have a logfile.. Are the permissions set right on the boinc directory? (projects and or slots)
I can give you full access via VNC to a Win98 host (virtual machine). Although it's a Spanish version of Windows... |
||
ID:
354 | Rating: 0
||
Thank you Andre, Nicolas, and Honza for the explanation. Happy to let Docking use my spare cycles on this old clunker of mine. |
||
ID:
359 | Rating: 0
||
We haven't tested on win98 for the simple reason we don't have a system like that and have a hard time finding cd's to set one up. We could definitely use some help in that corner. Seems that our app cannot open its logfile charmm.out on your box. That will be a hard problem to solve since we don't even have a logfile.. Are the permissions set right on the boinc directory? (projects and or slots)
I've never had a problem like this on any of my 6 or 7 windows 98 hosts running any other BOINC projects/applications. They all run other BOINC projects without ever having to set any permissions so I don't know about that. |
||
ID:
392 | Rating: 0
||
We haven't tested on win98 for the simple reason we don't have a system like that...(snip) Are the permissions set right on the boinc directory? (projects and or slots)
Windows 9x doesn't even have a permissions/ownership system. That's only on NT-based Windows versions. |
||
ID:
394 | Rating: 0
||
My linux box just updated the application from 5.01 to 5.02. 5.01 was running just fine. all WU's that it crunched completed successfully.
|
||
ID:
452 | Rating: 0
||
I suspect that your other WUs weren't successful, but only validated successful because of a bug in 5.1. Do you have any result numbers for us to check?
My linux box just updated the application from 5.01 to 5.02. 5.01 was running just fine. all WU's that it crunched completed successfully. ____________ D@H the greatest project in the world... a while from now! |
||
ID:
455 | Rating: 0
||
Here are my computers
I suspect that your other WUs weren't successful, but only validated successful because of a bug in 5.1. Do you have any result numbers for us to check? |
||
ID:
458 | Rating: 0
||
We have finally found the cause of the problem that some users were experiencing on their Linux systems. It has to do with the stacksize setting on your machine which is for some distros (SuSE 9.3 and 10 for example) set to unlimited and for others (FCx, Ubuntu, etc) set to a limited value like 10240. Your setting can be seen by typing 'ulimit -s' in a terminal. To make the Charmm 'exit 1' errors go away, please set the stacksize to unlimited using the command 'ulimit -s unlimited'. This is not saying that Charmm will use all of your memory (it won't), but it gives us a little bit more space to do our simulations correctly and without errors. Please let us know if this does not work for you. If it does work, please add this command to your shell initialization file (.bashrc, .tcshrc, .kshrc, etc) in your home directory. Of course don't forget to resume the D@H project on your boincmgr in case you suspended it before.
Here are my computers ____________ D@H the greatest project in the world... a while from now! |
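For readers following the stack size advice above, here is a minimal sketch of checking and raising the limit in a bash-style shell. The helper name `raise_stack_limit` is made up for illustration and is not part of BOINC or Charmm. It raises the soft limit to whatever the hard limit allows, which is often (but not always) "unlimited", so the result on your machine may differ.

```shell
#!/bin/bash
# Sketch only: raise the soft stack limit for this shell and for every
# program started from it afterwards.
# raise_stack_limit is a hypothetical helper name, not a BOINC command.
raise_stack_limit() {
    # The hard limit is the ceiling a normal user may raise the soft limit to.
    hard=$(ulimit -H -s)
    # Set the soft limit to that ceiling (a number in KB, or "unlimited").
    if ulimit -S -s "$hard" 2>/dev/null; then
        echo "stack limit is now: $(ulimit -S -s)"
    else
        echo "could not raise stack limit, still: $(ulimit -S -s)" >&2
        return 1
    fi
}

echo "stack limit before: $(ulimit -S -s)"
raise_stack_limit
```

The new limit only applies to this shell and to programs launched from it after the change. A BOINC client that is already running keeps the limit it was started with.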
||
ID:
459 | Rating: 0
||
I am running a test now. The default ulimit was set to 8192 for Ubuntu 5.10. The test is now beyond the point where it used to exit (now close to 10%, where it used to exit at 4%), so it's looking good. Unfortunately I have to leave now, I will only see in the morning if it really ran to the end.
|
||
ID:
464 | Rating: 0
||
I suspect that charmm is using stack space to allocate memory instead of the heap. Or maybe it uses both I'm not sure. Also it's a piece of fortran code that is under development for more than 30 years now which doesn't make it easier to analyze ;-) We will get back to the charmm developers (a whole different community) to ask why the stack. For now there's not too much we can do except asking people to increase their stacksize.
Remains the question why the program needs that much stack space. 8192 KB is a lot! How does the program come to that high usage? Is there very deep recursion in the coding? Or large chunks of memory that are put on the stack instead of allocating them from the heap? ____________ D@H the greatest project in the world... a while from now! |
||
ID:
465 | Rating: 0
||
We have finally found the cause of the problem that some users were experiencing on their Linux systems. It has to do with the stacksize setting on your machine which is for some distros (SuSE 9.3 and 10 for example) set to unlimited and for others (FCx, Ubuntu, etc) set to a limited value like 10240. Your setting can be seen by typing 'ulimit -s' in a terminal. To make the Charmm 'exit 1' errors go away, please set the stacksize to unlimited using the command 'ulimit -s unlimited'. This is not saying that Charmm will use all of your memory (it won't), but it gives us a little bit more space to do our simulations correctly and without errors. Please let us know if this does not work for you. If it does work, please add this command to your shell initialization file (.bashrc, .tcshrc, .kshrc, etc) in your home directory. Of course don't forget to resume the D@H project on your boincmgr in case you suspended it before.
Where is ulimit located? I am getting "command not found", and I cannot locate it anywhere.
____________ Dublin, CA Team SETI.USA |
||
ID:
467 | Rating: 0
||
My stack size was also 8192. I have just increased it to unlimited as you asked. I will keep an eye on things from here on out. |
||
ID:
468 | Rating: 0
||
Just a thought. You said to put that command in the shell initialization file. But that won't work for me. I have BOINC running as a daemon. I am actually rarely logged into either one of my linux boxes. |
||
ID:
469 | Rating: 0
||
What distro are you running and which shell do you use?
____________ D@H the greatest project in the world... a while from now! |
||
ID:
470 | Rating: 0
||
What distro are you running and which shell do you use?
Ubuntu 6.06, tcsh. Thanks.
____________ Dublin, CA Team SETI.USA |
||
ID:
471 | Rating: 0
||
It will work even as a daemon: in the boinc start script or init script (or whatever means you use to start), put this command before you start the actual boinc process. Make sure to set the stack limit for the user that boinc runs under.
Just a thought. You said to put that command in the shell initialization file. But that won't work for me. I have BOINC running as a daemon. I am actually rarely logged into either one of my linux boxes. ____________ D@H the greatest project in the world... a while from now! |
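The daemon advice above could look roughly like the following wrapper. This is a sketch under assumptions: `run_with_raised_stack` is a made-up helper name, and the `/usr/bin/boinc_client` path in the comment is only an example, so substitute whatever your start script actually launches.

```shell
#!/bin/bash
# Sketch only: raise the stack limit, then start a command so that the
# command and all of its children inherit the raised limit.
# run_with_raised_stack is a hypothetical helper, not part of BOINC.
run_with_raised_stack() {
    (
        # Work in a subshell so the caller's own limit stays untouched.
        ulimit -S -s "$(ulimit -H -s)" || exit 1
        # Replace the subshell with the real program.
        exec "$@"
    )
}

# In a start script this would wrap the client, for example (path assumed):
#   run_with_raised_stack /usr/bin/boinc_client
# Harmless demonstration that the child really sees the raised limit:
run_with_raised_stack sh -c 'echo "child sees stack limit: $(ulimit -s)"'
```

The limit has to be set in the same process chain that starts the client. Setting it in an unrelated login shell does not reach a daemon started by init.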
||
ID:
472 | Rating: 0
||
Sorry for all the NOOB questions. The reason I am running these Linux machines is to get a better handle on Linux, and it is working, slowly but surely.
It will work even as a daemon: in the boinc start script or init script (or whatever means you use to start), put this command before you start the actual boinc process. Make sure to set the stack limit for the user that boinc runs under. |
||
ID:
476 | Rating: 0
||
What distro are you running and which shell do you use?
Okay, you asking about the shell got me thinking that this is a bash-only command. So I changed my shell to bash, added the line to my .bashrc, and then rebooted for good measure. I have boincmgr set to run at login via the sessions manager, so it started automatically. After 3 minutes or so, the WUs failed in the usual way. I fired up a terminal and checked: yep, "unlimited". Everything looks right there.
So I quit boincmgr, went to the GUI file manager, and double-clicked boincmgr to start it again. After 3 minutes or so, the WUs still failed in the usual way.
So I quit boinc manager again, went back to the terminal, and launched boincmgr from the command line. This time, it appears to have worked. It's up to 9 minutes now.
Issues with this solution:
1) I don't like bash
2) This won't work when there is a power failure, as I have my machines set to automatically boot, log in, and run boincmgr. And if I have to launch it manually from the command line, it won't get fixed until whenever I notice and get back to the machine.
____________ Dublin, CA Team SETI.USA |
||
ID:
477 | Rating: 0
||
1) On tcsh the command is called 'limit' and you set stacksize to unlimited with 'limit stacksize unlimited'. For ksh it is 'ulimit'.
What distro are you running and which shell do you use? ____________ D@H the greatest project in the world... a while from now! |
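Since the command name depends on the shell, a small lookup like the one below may help. It is only a sketch: `stack_unlimit_command` is a hypothetical helper, and the list of shells covers just the ones named in this thread plus a few close relatives.

```shell
#!/bin/bash
# Sketch only: print the command that lifts the stack limit in a given shell.
# stack_unlimit_command is a hypothetical helper name.
stack_unlimit_command() {
    case "$(basename "$1")" in
        csh|tcsh)          echo "limit stacksize unlimited" ;;  # C-shell family
        bash|ksh|zsh|dash) echo "ulimit -s unlimited" ;;        # Bourne-style shells
        *) echo "unknown shell: $1" >&2; return 1 ;;
    esac
}

# Show the command for the current login shell, if it is one of the known ones.
stack_unlimit_command "${SHELL:-/bin/bash}" || true
```

For example, `stack_unlimit_command /bin/tcsh` prints the `limit` form, which is the line to put in `.tcshrc` or in a tcsh start script.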
||
ID:
478 | Rating: 0
||
1) On tcsh the command is called 'limit' and you set stacksize to unlimited with 'limit stacksize unlimited'. For ksh it is 'ulimit'.
Thanks!
2) I never automatically boot, log in as a certain user, and run an app, so I don't know how this works. But somehow it must be possible to set your stack to unlimited. Could you run the ./run_manager script that comes standard with boinc? You could add the 'limit' command to that script and use that to fire up boincmgr either from the commandline or by clicking on it.
Yikes. I'm afraid that is beyond my skills. But a thought occurred to me: perhaps I can start boincmgr from one of the .cshrc/.bashrc files. Let me try that.
____________ Dublin, CA Team SETI.USA |
||
ID:
479 | Rating: 0
||
Sorry, it's beyond my linux skills too :(
|
||
ID:
480 | Rating: 0
||
In the terminal where you type 'ulimit -s unlimited' also start the boincmgr process. Every terminal that you open will have the setting 8192 again unless you put that command in a file called .bashrc in your home directory. That file can be edited with any GUI editor (doesn't have to be vi ;-)
Sorry, it's beyond my linux skills too :( ____________ D@H the greatest project in the world... a while from now! |
||
ID:
481 | Rating: 0
||
Many thanks for your time, Andre!
|
||
ID:
485 | Rating: 0
||
2) I never automatically boot, log in as a certain user, and run an app, so I don't know how this works. But somehow it must be possible to set your stack to unlimited. Could you run the ./run_manager script that comes standard with boinc? You could add the 'limit' command to that script and use that to fire up boincmgr either from the commandline or by clicking on it.
I figured out how to do it after all. I added "limit stacksize unlimited" as the first line in run_manager (my shell is tcsh). Then I went into sessions -> startup items, deleted boincmgr, and added run_manager. Works like a charm!
____________ Dublin, CA Team SETI.USA |
||
ID:
494 | Rating: 0
||
Great! Happy to hear that :-)
2) I never automatically boot, log in as a certain user, and run an app, so I don't know how this works. But somehow it must be possible to set your stack to unlimited. Could you run the ./run_manager script that comes standard with boinc? You could add the 'limit' command to that script and use that to fire up boincmgr either from the commandline or by clicking on it. ____________ D@H the greatest project in the world... a while from now! |
||
ID:
498 | Rating: 0
||
In the terminal where you type 'ulimit -s unlimited' also start the boincmgr process. Every terminal that you open will have the setting 8192 again unless you put that command in a file called .bashrc in your home directory. That file can be edited with any GUI editor (doesn't have to be vi ;-) It works for me ! Thanks again Andre ____________ ![]() |
||
ID:
519 | Rating: 0
||
I got this message as I started BOINC:
|
||
ID:
523 | Rating: 0
||
Doesn't appear to be working for me... see the latest results from this host...
|
||
ID:
524 | Rating: 0
||
Paul,
Doesn't appear to be working for me... see the latest results from this host... ____________ D@H the greatest project in the world... a while from now! |
||
ID:
525 | Rating: 0
||
Hi Andre,
paul@GentooPC ~ $ su - boinc
Password:
boinc@GentooPC ~ $ ulimit
unlimited
boinc@GentooPC ~ $
I then stopped and started boinc:
paul@GentooPC ~ $ su - root
Password:
GentooPC ~ # cd /etc/init.d/
GentooPC init.d # ./boinc stop
 * Caching service dependencies ... [ ok ]
 * Stopping BOINC ... [ ok ]
GentooPC init.d # ./boinc start
 * Starting BOINC ... [ ok ]
GentooPC init.d #
The actual command to start the client as a daemon is below. The variables are all standard stuff and populated earlier in the start script.
setsid start-stop-daemon --quiet --start --chdir ${RUNTIMEDIR} \
    --exec ${BOINCBIN} --chuid ${USER}:${GROUP} \
    --nicelevel ${NICELEVEL} -- ${ARGS} > ${LOGFILE} 2>&1 &
To launch BOINC Manager, I log into KDE and run the command below via a desktop shortcut.
/usr/bin/boinc_gui
I think that should have implemented the workaround properly!
cheers, Paul. ____________ |
||
ID:
528 | Rating: 0
||
Maybe you could put the ulimit command in the startup script before the daemon is started?
Hi Andre, ____________ D@H the greatest project in the world... a while from now! |
||
ID:
529 | Rating: 0
||
Ok, the start script now reads (only the interesting bit below!):
ulimit -s unlimited
setsid start-stop-daemon --quiet --start --chdir ${RUNTIMEDIR} \
    --exec ${BOINCBIN} --chuid ${USER}:${GROUP} \
    --nicelevel ${NICELEVEL} -- ${ARGS} > ${LOGFILE} 2>&1 &
I have restarted BOINC and will see what happens. Paul. |
||
ID:
537 | Rating: 0
||
[quote]...
I'm having the same problem, running Xubuntu (XFCE uses Terminal). Ulimit, limit, none of them work! :-( Additionally, this is what happened when I crunched my first WU:
Wed 20 Sep 2006 05:36:36 PM AST|Docking@Home|Starting task 1tng_mod0001_1530_1466_4 using charmm version 502
Wed 20 Sep 2006 05:40:44 PM AST|Docking@Home|Unrecoverable error for result 1tng_mod0001_1530_1466_4 (process exited with code 1 (0x1))
Wed 20 Sep 2006 05:40:44 PM AST|Docking@Home|Deferring scheduler requests for 1 minutes and 0 seconds
Wed 20 Sep 2006 05:40:44 PM AST||Rescheduling CPU: application exited
Wed 20 Sep 2006 05:40:44 PM AST|Docking@Home|Computation for task 1tng_mod0001_1530_1466_4 finished
____________ |
||
ID:
540 | Rating: 0
||
|
||
ID:
547 | Rating: 0
||
|
||
ID:
548 | Rating: 0
||
I generated more work units. That should solve the problem. Please let us know if it doesn't. Thanks. |
||
ID:
552 | Rating: 0
||
[quote]...
ulimit is a shell parameter for bash (and other shells). I am not familiar with Xubuntu/XFCE, but its shell should have a similar parameter. You can try looking in the man pages for references to the stack size. I'm not sure if the error you are getting is related, but I will look into this too. Thanks |
||
ID:
553 | Rating: 0
||
Maybe you could put the ulimit command in the startup script before the daemon is started?
I don't want to jinx it, but since making this change I am 4 hours and 80% into a result, the longest one yet by some margin. Here's hoping! Paul. |
||
ID:
554 | Rating: 0
||
Lost power overnight (damn storms!) and my UPS shut down my computers... hopefully will work to report soon enough.
|
||
ID:
557 | Rating: 0
||
Lost power overnight (damn storms!) and my UPS shut down my computers... hopefully will work to report soon enough.
[off-topic]At least you have a UPS! Yesterday, AND the day before, I had 30-second long power outages at night. Both times VMware was running, so that's two operating systems being shut down uncleanly. And only one of the times could I be bothered to turn the computer back on and start everything up again, so I lost CPU time (BOINC) and download time.[/off-topic] |
||
ID:
559 | Rating: 0
||
[off topic]
|
||
ID:
561 | Rating: 0
||
[quote]...
Sorry to bother again, but I can't for the life of me find anything even remotely related to that command. I've looked *everywhere*... :-( Oh, and the same thing happened with a second WU now:
Fri 22 Sep 2006 12:21:01 PM AST|Docking@Home|Starting task 1tng_mod0001_1127_24451_6 using charmm version 502
Fri 22 Sep 2006 12:25:26 PM AST|Docking@Home|Unrecoverable error for result 1tng_mod0001_1127_24451_6 (process exited with code 1 (0x1))
Fri 22 Sep 2006 12:25:26 PM AST|Docking@Home|Deferring scheduler requests for 1 minutes and 0 seconds
Fri 22 Sep 2006 12:25:26 PM AST||Rescheduling CPU: application exited
Fri 22 Sep 2006 12:25:26 PM AST|Docking@Home|Computation for task 1tng_mod0001_1127_24451_6 finished |
||
ID:
568 | Rating: 0
||
|
||
ID:
572 | Rating: 0
||
I got these messages in my BOINC when I looked there:
Sam 23 Sep 2006 11:07:43 CEST|Docking@Home|Resuming task 1tng_mod0001_1254_335479_3 using charmm version 502
Sam 23 Sep 2006 11:47:17 CEST|Docking@Home|Computation for task 1tng_mod0001_1254_335479_3 finished
Sam 23 Sep 2006 11:47:20 CEST|Docking@Home|Started upload of file 1tng_mod0001_1254_335479_3_0
Sam 23 Sep 2006 11:47:20 CEST|Docking@Home|Started upload of file 1tng_mod0001_1254_335479_3_1
Sam 23 Sep 2006 11:47:22 CEST|Docking@Home|Error on file upload: invalid signature
Sam 23 Sep 2006 11:47:22 CEST|Docking@Home|Error on file upload: invalid signature
Sam 23 Sep 2006 11:47:22 CEST|Docking@Home|Permanently failed upload of 1tng_mod0001_1254_335479_3_0
Sam 23 Sep 2006 11:47:22 CEST|Docking@Home|Giving up on upload of 1tng_mod0001_1254_335479_3_0: server rejected file
Sam 23 Sep 2006 11:47:22 CEST|Docking@Home|Permanently failed upload of 1tng_mod0001_1254_335479_3_1
Sam 23 Sep 2006 11:47:22 CEST|Docking@Home|Giving up on upload of 1tng_mod0001_1254_335479_3_1: server rejected file
Sam 23 Sep 2006 11:47:22 CEST|Docking@Home|Started upload of file 1tng_mod0001_1254_335479_3_2
Sam 23 Sep 2006 11:47:22 CEST|Docking@Home|Started upload of file 1tng_mod0001_1254_335479_3_3
Sam 23 Sep 2006 11:47:24 CEST|Docking@Home|Error on file upload: invalid signature
Sam 23 Sep 2006 11:47:24 CEST|Docking@Home|Permanently failed upload of 1tng_mod0001_1254_335479_3_3
Sam 23 Sep 2006 11:47:24 CEST|Docking@Home|Giving up on upload of 1tng_mod0001_1254_335479_3_3: server rejected file
Sam 23 Sep 2006 11:47:28 CEST|Docking@Home|Error on file upload: invalid signature
Sam 23 Sep 2006 11:47:28 CEST|Docking@Home|Permanently failed upload of 1tng_mod0001_1254_335479_3_2
Sam 23 Sep 2006 11:47:28 CEST|Docking@Home|Giving up on upload of 1tng_mod0001_1254_335479_3_2: server rejected file
I don't know what happened, as I see the corresponding result in my account as "Checked, but no consensus yet", so it was at least successfully uploaded. What went wrong where? And did anything go wrong at all besides the worrisome messages popping up? |
||
ID:
576 | Rating: 0
||
Nevermind, I've found the command with the help of someone, and my stacksize is already set to unlimited. So, the errors I get must be coming from something else... |
||
ID:
588 | Rating: 0
||
Saenger,
I got these messages in my BOINC when I looked there: ____________ D@H the greatest project in the world... a while from now! |
||
ID:
589 | Rating: 0
||
I don't worry about the credits, I worry about the messages.
|
||
ID:
597 | Rating: 0
||
If I go to the PCLinux forums and talk about BOINC, I get no reply for my distro.
|
||
ID:
599 | Rating: 0
||
:-(
Wed 27 Sep 2006 10:02:25 AM AST|Docking@Home|Starting task 1tng_mod0001_4039_71682_2 using charmm version 502
Wed 27 Sep 2006 10:07:23 AM AST|Docking@Home|Unrecoverable error for result 1tng_mod0001_4039_71682_2 (process exited with code 1 (0x1))
Wed 27 Sep 2006 10:07:23 AM AST|Docking@Home|Deferring scheduler requests for 1 minutes and 0 seconds
Wed 27 Sep 2006 10:07:23 AM AST||Rescheduling CPU: application exited
Wed 27 Sep 2006 10:07:23 AM AST|Docking@Home|Computation for task 1tng_mod0001_4039_71682_2 finished
Anyone? My stacksize is already set to unlimited. |
||
ID:
637 | Rating: 0
||
Can you show your ulimit output using the 'ulimit -a' command on the same terminal you started your boinc client on? (start a terminal, cd into your BOINC directory, enter 'ulimit -s unlimited', enter 'run_manager.sh &', enter 'ulimit -a')
____________ D@H the greatest project in the world... a while from now! |
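A terminal only reports its own limit, so checking there says nothing about a client that was started elsewhere. On current Linux kernels the limit of an already running process can be read from `/proc/<pid>/limits`. The sketch below assumes such a kernel (this file did not exist on the distributions of 2006), and `show_stack_limit` is a hypothetical helper name.

```shell
#!/bin/bash
# Sketch only: print the stack limit of a running process by PID.
# Assumes a Linux kernel that provides /proc/<pid>/limits.
# show_stack_limit is a hypothetical helper name.
show_stack_limit() {
    limits="/proc/$1/limits"
    if [ ! -r "$limits" ]; then
        echo "cannot read $limits" >&2
        return 1
    fi
    # The line lists the soft limit, the hard limit and the unit.
    grep '^Max stack size' "$limits"
}

# Demonstration on the current shell itself.
show_stack_limit "$$"
```

To inspect the client, pass its PID instead, for example the one printed by `pgrep -x boinc_client` (the process name is an assumption based on the executable names mentioned in this thread).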
||
ID:
669 | Rating: 0
||
The problem is, I can't find my BOINC directory. That may sound stupid, but all I can see is the executables in /usr/bin (boinc_client, boinc_cmd, boincmgr); I have no idea where all the data is stored. There's nothing in ~/ either except a small text file with basic settings. Anyhow, executing those commands in /usr/bin gave me the following:
|
||
ID:
677 | Rating: 0
||
Ah, seems you haven't downloaded a boinc client from boinc.berkeley.edu but installed one from an rpm or deb package. That may mean that you don't have a run_manager.sh script... Can you do the same, but instead of run_manager.sh run boincmgr? Let me know what that does.
The problem is, I can't find my BOINC directory. That may sound stupid, but all I can see is the executables in /usr/bin (boinc_client, boinc_cmd, boincmgr); I have no idea where all the data is stored. There's nothing in ~/ either except a small text file with basic settings. Anyhow, executing those commands in /usr/bin gave me the following: ____________ D@H the greatest project in the world... a while from now! |
||
ID:
678 | Rating: 0
||
Indeed, this is the boinc client that comes packaged with Xubuntu, which I've downloaded through Synaptic. But when I try that command, it gives me a "command not found" error. I've tried:
|
||
ID:
696 | Rating: 0
||
Ubuntu is based on Debian. I am running Debian and I am using the version of BOINC released through them. My solution to the ulimit problem was to go to /etc/init.d and find the startup script for BOINC. I edited the startup script and put the ulimit command close to the beginning of the script, before any other commands were executed.
Jim |
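What such an edit might look like, as a sketch only: the actual contents of the Debian or Ubuntu startup script are not shown in this thread, so the function below is a hypothetical fragment, and the fallback to the hard limit is an added assumption for systems where the script is not allowed to request 'unlimited'.

```shell
# Hypothetical fragment for the top of a BOINC startup script.
raise_stack_limit() {
  # Ask for an unlimited soft stack limit; if that is refused,
  # settle for the highest value the hard limit allows.
  ulimit -S -s unlimited 2>/dev/null || ulimit -S -s "$(ulimit -H -s)"
}

raise_stack_limit
# ...the script's existing commands that start the client would follow here...
```

Because the limit is raised before the client is launched, the client and the applications it starts inherit it.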
||
ID:
741 | Rating: 0
||
Hi there..
|
||
ID:
757 | Rating: 0
||
Couple of questions:
|
||
ID:
761 | Rating: 0
||
Good morning Memo,
After the first 12 were crunched (and reported), I opened a terminal window and entered 'ulimit -s' at the prompt. The reply was 8192. At that moment BOINC was still running, doing Seti and Rosetta. I then entered 'ulimit -s unlimited' and left the terminal window open. The newly downloaded WU seemed to get to 4.something% this time but then stopped anyway (as did the remaining 11). Then I ran 'ulimit -s' again to check the stack size, but the reply was still 8192; it did not seem to have changed. Did I do something wrong?
Greetings, Rene |
||
ID:
766 | Rating: 0
||
No joy here, only hard resets for computation errors to endure. Tried the ulimit setting after a couple, still hanging. Ah well, this old Athlon with its new Red Hat has failed every WU of nearly every kind for a few weeks, after successfully running all kinds (even sap) for months. I've run memtest and all that other sort of stuff; perhaps it's time to re-emerge, or recycle that PC, lol.
|
||
ID:
767 | Rating: 0
||
Thank you, Jim. I've done that to /etc/init.d/boinc-client. Is there anything else I should do, or will the problem be fixed now? |
||
ID:
769 | Rating: 0
||
Rene, the thing is that the command must be run by a script before BOINC starts. If you run in text mode, adding the command to .bashrc will do. If it's running in graphical mode, I believe (I run BOINC in text mode) it has to be in the run_client script in the BOINC directory. Don't forget to restart BOINC so that the client picks up this setting. Let me know if you have more problems. |
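This is also why typing 'ulimit -s unlimited' in a separate terminal changed nothing for the running client: each process keeps the limit it was started with. A minimal, Linux-only sketch of that behaviour follows. The function name is hypothetical, sleep stands in for an already-running client, and reading /proc/<pid>/limits is an assumption about the system rather than something from this thread.

```shell
# Hypothetical helper: print "<child_kB> <shell_kB>", where the child was
# started before the soft stack limit was changed and the shell value is
# read after the change.
limit_before_and_after() (
  sleep 30 >/dev/null 2>&1 &     # stand-in for a client that is already running
  pid=$!
  ulimit -S -s "$1"              # change the limit in this subshell only
  # Linux-specific: column 4 of this line is the soft limit in bytes.
  child=$(awk '/^Max stack size/ {print $4}' "/proc/$pid/limits")
  kill "$pid" 2>/dev/null
  wait "$pid" 2>/dev/null || true
  # Convert bytes to kB unless the value is the word "unlimited".
  case $child in
    unlimited) ;;
    *) child=$((child / 1024)) ;;
  esac
  echo "$child $(ulimit -S -s)"
)

limit_before_and_after 2048
```

The first number stays at whatever the limit was when the child started; only the second one reflects the change, which is why the client has to be restarted from a shell or script that already has the new limit.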
||
ID:
786 | Rating: 0
||
That's where I put it the first time (run_client), thinking the run_manager script would trigger run_client. Now it seems that I've fixed it. I've edited the run_manager script (the one that I use to start up the manager) and added "ulimit -s unlimited" at the beginning. The WU has now been running for over an hour and has reached approx. 70%. Thanks, I will report back when the first ones have been crunched. ;-) |
||
ID:
788 | Rating: 0
||
Just stop and start, or restart, your BOINC client. Other than that, it should work. |
||
ID:
795 | Rating: 0
||
I've got some finished wu's now (pending) and all seems well.
|
||
ID:
801 | Rating: 0
||
Just as a side note, this setting will not affect any other project. It just gives charmm a little more space to work with, that's all. |
||
ID:
802 | Rating: 0
||
Have added a new computer using Linux and added the "ulimit -s unlimited" command to run_manager.
|
||
ID:
803 | Rating: 0
||
Error 2 means that the app cannot open its own logfile called charmm.out. Could you check permissions on the slots and projects directories, etc?
____________ D@H the greatest project in the world... a while from now! |
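One way to run that permission check, as a rough sketch: the function name is hypothetical, and the paths in the example comment are placeholders, since the location of the BOINC directory differs between installs.

```shell
# Hypothetical helper: list every directory under the given roots that the
# current user cannot write to (no output means none were found).
unwritable_dirs() {
  for root in "$@"; do
    find "$root" -type d 2>/dev/null | while IFS= read -r dir; do
      [ -w "$dir" ] || printf '%s\n' "$dir"
    done
  done
}
# Example: unwritable_dirs /path/to/BOINC/slots /path/to/BOINC/projects
```

It should be run as the same user the BOINC client runs as; run as root, the test reports every directory as writable.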
||
ID:
806 | Rating: 0
||
Hello Andre,
|
||
ID:
807 | Rating: 0
||
You can find them in one of the folders in ../BOINC/SLOTS. The folders are called "0", "1", "2", etc., depending on how many projects are running. |
||
ID:
808 | Rating: 0
||
The files we send back to the server are actually 'symlinks' in the slots directory. These files point to files called 1tng_xxxx_xxxxxx_x_x in the projects directory that contain the real content. The file name resolution is done by the BOINC client.
Andre
____________ D@H the greatest project in the world... a while from now! |
||
ID:
813 | Rating: 0
||
One thing to try is suspending the job that is going to crash right after you have downloaded it and checking which files are present in the projects and slots directories. The charmm.out file will be called something like 1tng_xxxx_xxxxxx_x_3 in the projects directory (the charmm.out file in the slots directory will contain the real file name). Let me know what you find.
____________ D@H the greatest project in the world... a while from now! |
||
ID:
814 | Rating: 0
||
I will try, Andre, but with that last lot all terminating in seconds due to the errors, even if I had been home (I was at work), I would have had trouble trapping a WU to see what was in the SLOT directory. I notice that the SLOT folders only hold information while a project is being processed.
|
||
ID:
855 | Rating: 0
||
Well, it did not get better. After 3 minutes 20 seconds all the WUs started to error out, so I suspended the project.
The "charmm.out" SLOT file held this information:
<soft_link>../../projects/docking.utep.edu/1tng_mod0001_5104_384550_0_3</soft_link>
The SLOT folder held these files:
1tng_0.bin 1tng.bin 1tng.crt 1tng_grid.bin 1tng_min.pdb 1tng.streamfile boinc_lockfile charmm_5.2_i686-pc-linux-gnu charmm.inp charmm.out grid_probes.rtf init_data.xml ligandmingrid.bin ligand.pdb ligand.psf 1pdb_amino.rtf 1pdb.prm 1pdb_probes.prm minenergy.pdb minrmsd.pdb percentdone.str receptor.pdb receptor.psf stderr.txt summary.txt
The Project folder held these files (plus all WU files):
grid_probes.rtf 1pdb_amino.rtf 1pdb.prm 1pdb_probes.prm charmm_5.2_i686-pc-linux-gnu
I am now getting "Unrecoverable error for result xxxxx (process exited with code 1 (0x1))". This is the same as the original error on Linux machines, even though I have added the 'ulimit -s unlimited' command to the 'run_manager' BOINC file. |
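The charmm.out file in the slot directory is thus a small text file that names the real output file. A minimal sketch for pulling that path out, assuming the one-line soft_link format quoted above is all the file contains (the function name is hypothetical):

```shell
# Hypothetical helper: print the path stored between <soft_link> and
# </soft_link> in a BOINC slot file.
soft_link_target() {
  sed -n 's:.*<soft_link>\(.*\)</soft_link>.*:\1:p' "$1"
}
# Example (run inside a slot directory): soft_link_target charmm.out
```

The printed path is relative to the slot directory, so it has to be resolved from there when checking whether the target file exists and is writable.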
||
ID:
865 | Rating: 0
||
Please add it to "run_client" as well!
|
||
ID:
872 | Rating: 0
||
Yay! Thank you all who helped me. Now Docking@Home is working properly for me. Anyone running Xubuntu who doesn't know what to do, here are the summarized steps:
|
||
ID:
878 | Rating: 0
||
>>> While I did not have to add the 'ulimit' command to 'run_client' on my other working machine (AMD Opteron 848 (2 cpus) same OS Fedora Core 3), but I will try.
|
||
ID:
885 | Rating: 0
||
All now working ok, have now processed a successful WU after 1 hour 25 minutes. Also have 6 pending.
|
||
ID:
896 | Rating: 0
||
Well done... ;-) Let's hope that an app update will remove the need for this "hack". |
||
ID:
897 | Rating: 0
||