Posts by Arun

11)

Message boards : Number crunching : Checkpointing

( Message 4106 )
Posted 3332 days ago by Arun
Hi,

the docking script seems to be missing the boinc_checkpoint_completed call, which is why you do not see our checkpointing. We will add it tomorrow.

Regarding the CPU time that restarts at 0: there must be some issue with the timing. We will look at this tomorrow as well.

We checkpoint the coordinates at the end of each conformation (a random structure of the ligand). Each conformation is randomly rotated a certain number of times, and each time it is docked in the protein. We will soon provide a table of the times between two checkpoints on different machines.

The percentdone.txt file keeps track of the point when the last checkpoint was done. The fraction done is the variable FDONE.
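As a rough illustration of the scheme described above (not the project's actual charmm script), here is a minimal Python sketch of checkpointing after each conformation and recording the fraction done. Only percentdone.txt and the idea of a fraction-done value come from this post; the function name `run_with_checkpoints`, the `checkpoint.json` file, and the `dock` stand-in are hypothetical. In a real BOINC application, the call to boinc_checkpoint_completed would follow each successful checkpoint write.

```python
import json
import os


def run_with_checkpoints(n_conformations, workdir, dock=lambda i: i * 0.5):
    """Process conformations one by one, checkpointing after each (sketch)."""
    state_path = os.path.join(workdir, "checkpoint.json")  # hypothetical file name
    done_path = os.path.join(workdir, "percentdone.txt")

    # Resume from the last checkpoint if one exists.
    if os.path.exists(state_path):
        with open(state_path) as f:
            state = json.load(f)
    else:
        state = {"next": 0, "results": []}

    for i in range(state["next"], n_conformations):
        state["results"].append(dock(i))  # stand-in for the real docking step
        state["next"] = i + 1

        # Write to a temporary file and rename it, so an interruption
        # never leaves a half-written checkpoint behind.
        tmp_path = state_path + ".tmp"
        with open(tmp_path, "w") as f:
            json.dump(state, f)
        os.replace(tmp_path, state_path)

        # Record the fraction done at the moment of the last checkpoint.
        fdone = state["next"] / n_conformations
        with open(done_path, "w") as f:
            f.write(f"{fdone:.4f}\n")

    return state
```

If the task is stopped between two conformations, the next run picks up at `state["next"]` and repeats at most one conformation's worth of work.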

Thanks for all your help and patience!

Michela



Thanks Dr. Taufer. We have added boinc_checkpoint_completed to the charmm script, so the CPU time should now resume from the value it had when the task was suspended or stopped. In the stderr.txt file you should now see the message: "Starting charmm run (initial or from checkpoint)". We will distribute 300 WUs to test this.

Thanks for your feedback and help!

Arun
12)

Message boards : Number crunching : errors on Mac, still

( Message 4092 )
Posted 3335 days ago by Arun
Even with the latest version, charmm (with screensaver) is still crashing instantly on my Mac. Here's a sample task result.

Running charmm manually exits immediately because there's no input data -- if someone would point me at an init file so I can try to see what's going wrong I'd appreciate it.

Or at least respond... am I the only one having problems with a Mac running Charmm??


hi [B^S] sTrey,
Thanks for your feedback. We will be updating the charmm executable for Intel Mac soon and that should fix the problem you are having. Thanks for your patience.

cheers
Arun
13)

Message boards : Number crunching : Homogenous Redundancy?

( Message 4091 )
Posted 3335 days ago by Arun
Are we using this feature on Docking?
My AMD Linux machines have been paired with Intel Vista machines and my AMD XP machine has been paired with an AMD Linux machine.
In all cases the one machine has not validated against the two machines of the other type. In fact the odd one out often just errors out.


Hi Conan,
Thanks for your message. We are using homogeneous redundancy. We will check the results and see where the problem is.

Thanks
Arun
14)

Message boards : Number crunching : Test WUs

( Message 4090 )
Posted 3335 days ago by Arun


OK Arun, the FC6 machine had the '1abe_mod0011sc_' work units and the FC3 machine (the one not giving seg faults) had a '1tng_mod0011sc_' work unit.
So not the same type, will wait for more work units.


Conan, thanks for your response. Actually I want to know if your client downloaded charmm_7.0_i686-pc-linux-gnu or charmm_7.0_x86_64-pc-linux-gnu when you attached to the project before downloading the wu.

Thanks
Arun


Arun, I have no idea. As I currently have no work units I am unable to locate this information. I will wait for more work and if I can catch it in time (they error out in 20 odd seconds) then I may be able to get this information.


G'Day Arun,
Today I have been able to get a few work units and again I had four error out on the same machine with the same error.
I checked and found that the file "charmm_7.2_i686-pc-linux-gnu" and the file "charmm_7.2_i686-pc-linux-gnu-main" are currently in my Boinc/projects/docking folder.
Hope this can be of help as I will have to detach this machine if I can't get it to work. It worked before the move to this new university.
Perhaps I need to detach and reattach ???

Awaiting your reply, Conan.

EDIT: I have found that a work unit has downloaded to the same spec machine running FC3, and in its Boinc folder it has not only the charmm_7.2_i686-pc-linux-gnu and charmm_7.2_i686-pc-linux-gnu-main files but also the charmm_5.8_i686-pc-linux-gnu, charmm_5.8_i686-pc-linux-gnu-main, charmm_7.0_i686-pc-linux-gnu and charmm_7.0_i686-pc-linux-gnu-main files.
Could this be why my FC6 machine is playing up: it has lost old and possibly needed files?


Hi Conan,
Thanks for your feedback. We will use the information to figure out the problem. The problem is not caused by the absence of the old files on the FC6 machine.

Thanks
Arun
15)

Message boards : Number crunching : error on Windows

( Message 4084 )
Posted 3335 days ago by Arun
I'm getting this Windows XP Error:

6/14/2008 17:06:59|Docking@Home|Sending scheduler request: Requested by user. Requesting 3822 seconds of work, reporting 0 completed tasks
6/14/2008 17:07:04|Docking@Home|Scheduler request succeeded: got 0 new tasks
6/14/2008 17:07:04|Docking@Home|Message from server: No work sent
6/14/2008 17:07:04|Docking@Home|Message from server: Charmm with screensaver is not available for your type of computer.

Then it holds for 24 hours before making a new connection to the server. Does that mean no Windows work?


Thanks for your feedback. We will look into the problem.
16)

Message boards : Getting started : Invitation Code?

( Message 4083 )
Posted 3335 days ago by Arun
May I ask, whether you intend to open up this weekend for n00bs, or may I otherwise ask to get an invitation code for some eager teammates, who would like to join?


New volunteers need the invitation code. Please check your PM.

cheers
Arun
17)

Message boards : Number crunching : Reminder for the project admins

( Message 4082 )
Posted 3335 days ago by Arun
hi Zombie, Thanks for pointing out the thread. We will take care of the request.

cheers,
Arun
18)

Message boards : Number crunching : with few other projects?

( Message 4079 )
Posted 3340 days ago by Arun
On my machines, they run so quickly that they never get suspended. I can force some switching if you need, but there are very few, so experimental opportunities are equally few. I exited BOINC and then restarted it to see what happened to the Dock wu: it seemed to restart and the %complete continued to rise, but the CPU time reset to zero and started counting up again. The wu is this one; my machine is 2253.

Odd thing I noticed, when looking at the link above, when I clicked on the 2253 to check it was mine, then went "Back", the 2253 number was absent, the table entry was blank. Trivial, but possibly indicative of a lurking problem.

I caught one running, a "Charmm with screensaver 7.00", so out of curiosity, I opened the graphics, something I don't usually do, the picture below is what I saw, probably not what was intended...



Windows XP, BOINC 5.10.45, ATI Radeon HD 2400 PRO graphics adaptor and driver.

Just ask if you need anything.


Have you looked at the graphics window before? Detaching from the project and reattaching to it fixed the graphics problem for many clients.

Because I underestimated the FLOPS count for the tasks, the workunits were generating compute errors once the estimated CPU time was exceeded.
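To illustrate why an underestimated FLOPS count leads to this error, here is a small Python sketch of the arithmetic. It assumes the usual BOINC behaviour in which a workunit carries an estimate (rsc_fpops_est) and an upper bound (rsc_fpops_bound) on floating-point operations, and the client aborts a task that runs longer than the bound divided by the host's benchmarked speed. The numbers and function names below are made up for illustration and are not the values used for these workunits.

```python
def time_limits(fpops_est, fpops_bound, host_flops):
    """Return (estimated seconds, maximum allowed seconds) for one host."""
    return fpops_est / host_flops, fpops_bound / host_flops


def would_abort(actual_seconds, fpops_bound, host_flops):
    """True if the task runs past the limit implied by the FLOPS bound."""
    return actual_seconds > fpops_bound / host_flops


# Hypothetical host benchmarked at 2 GFLOPS; the task really needs one hour.
host_flops = 2e9
actual_seconds = 3600

# Underestimated workunit: the limit is 2000 s, so the task is aborted.
print(time_limits(1e12, 4e12, host_flops))
print(would_abort(actual_seconds, 4e12, host_flops))

# Corrected workunit with a tenfold larger count: the limit is 20000 s.
print(time_limits(1e13, 4e13, host_flops))
print(would_abort(actual_seconds, 4e13, host_flops))
```

Raising the estimate and the bound together, as in the second case, keeps the progress estimate realistic while leaving enough headroom for slower hosts.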

There seem to be some issues with the new checkpointing; we are working on resolving them.

Thanks
Arun
19)

Message boards : Number crunching : Test WUs

( Message 4078 )
Posted 3340 days ago by Arun


OK Arun, the FC6 machine had the '1abe_mod0011sc_' work units and the FC3 machine (the one not giving seg faults) had a '1tng_mod0011sc_' work unit.
So not the same type, will wait for more work units.


Conan, thanks for your response. Actually I want to know if your client downloaded charmm_7.0_i686-pc-linux-gnu or charmm_7.0_x86_64-pc-linux-gnu when you attached to the project before downloading the wu.

Thanks
Arun
20)

Message boards : Number crunching : with few other projects?

( Message 4072 )
Posted 3341 days ago by Arun
Hi all,
Some workunits sent out over the weekend had an estimated time lower than it should have been, because I underestimated the CPU time needed for the workunit. As a result, many clients ended up with a 'compute error'.

Later I sent workunits with an increased CPU time estimate, and most of them ran successfully. Did anyone who got the 'compute error' complete these later workunits successfully?

Thanks,
Arun

