with few other projects?



Message boards : Number crunching : with few other projects?

zombie67 [MM]
Volunteer tester

Joined: Sep 18 06
Posts: 207
ID: 114
Credit: 2,817,648
RAC: 0
Message 4053 - Posted 13 Jun 2008 21:44:44 UTC

June 13, 2008 19:37:00
We are testing a new algorithm for checkpointing on docking, please run docking with few other projects and give us your feedback. Thanks for your patience. We will be distributing 300 workunits.


I can read this two opposite ways, so I am asking for clarification.

Do you want us to run with other projects, to try out the checkpointing?

Or do you want us to minimize the number of other projects to run this with, to speed up the crunching and return of results?


____________
Dublin, CA
Team SETI.USA
Arun
Volunteer tester

Joined: Apr 30 08
Posts: 40
ID: 379
Credit: 10,385
RAC: 0
Message 4054 - Posted 13 Jun 2008 21:54:35 UTC - in response to Message ID 4053.

June 13, 2008 19:37:00
We are testing a new algorithm for checkpointing on docking, please run docking with few other projects and give us your feedback. Thanks for your patience. We will be distributing 300 workunits.


I can read this two opposite ways, so I am asking for clarification.

Do you want us to run with other projects, to try out the checkpointing?

Or do you want us to minimize the number of other projects to run this with, to speed up the crunching and return of results?



We are trying the new checkpointing algorithm, so we want docking to be run with other projects to test if checkpointing is working correctly.

Also, the RSC limit specified may be lower than it needs to be, so some workunits may have problems. We are generating workunits with higher RSC limits. Thanks for your patience and feedback.
adrianxw
Volunteer tester

Joined: Dec 30 06
Posts: 164
ID: 343
Credit: 1,669,741
RAC: 0
Message 4062 - Posted 14 Jun 2008 16:33:16 UTC
Last modified: 14 Jun 2008 16:41:36 UTC

On my machines, they run so quickly that they never get suspended. I can force some switching if you need, but there are very few, so experimental opportunities are equally few. I exited BOINC and then restarted it to see what happened to the Dock WU. It seemed to restart, and the % complete continued to rise, but the CPU time reset to zero and started counting up again. The WU is this one; my machine is 2253.
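
The symptom described here (progress resumes after a restart, but CPU time starts again from zero) is what one would expect if a checkpoint stores the progress but not the CPU time accumulated so far. The following is a minimal, generic sketch of a checkpoint that keeps both values together. It is not the Docking@Home or BOINC implementation; the file layout and the names `write_checkpoint`, `read_checkpoint`, `step`, and `cpu_seconds` are assumptions made purely for illustration.

```python
import json
import os
import tempfile


def write_checkpoint(path, step, cpu_seconds):
    """Atomically save the progress and the CPU time used so far.

    Hypothetical format: a small JSON file. Writing to a temporary
    file and renaming it means a crash mid-write cannot leave a
    half-written checkpoint behind.
    """
    state = {"step": step, "cpu_seconds": cpu_seconds}
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as tmp:
        json.dump(state, tmp)
        tmp.flush()
        os.fsync(tmp.fileno())
    os.replace(tmp_path, path)  # atomic rename over the old checkpoint


def read_checkpoint(path):
    """Return the saved state, or a fresh state if none exists."""
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError:
        return {"step": 0, "cpu_seconds": 0.0}
```

On restart, a program using this sketch would continue from `state["step"]` and add new CPU time on top of `state["cpu_seconds"]` instead of starting the counter at zero.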

Odd thing I noticed, when looking at the link above, when I clicked on the 2253 to check it was mine, then went "Back", the 2253 number was absent, the table entry was blank. Trivial, but possibly indicative of a lurking problem.

I caught one running, a "Charmm with screensaver 7.00", so out of curiosity, I opened the graphics, something I don't usually do, the picture below is what I saw, probably not what was intended...



Windows XP, BOINC 5.10.45, ATI Radeon HD 2400 PRO graphics adaptor and driver.

Just ask if you need anything.
____________
Wave upon wave of demented avengers march cheerfully out of obscurity into the dream.

adrianxw
Volunteer tester

Joined: Dec 30 06
Posts: 164
ID: 343
Credit: 1,669,741
RAC: 0
Message 4064 - Posted 14 Jun 2008 19:16:09 UTC

Update: I see it errored out, along with several others. I don't know if that was me fiddling, or simply bad stuff.
____________
Wave upon wave of demented avengers march cheerfully out of obscurity into the dream.

Arun
Volunteer tester

Joined: Apr 30 08
Posts: 40
ID: 379
Credit: 10,385
RAC: 0
Message 4072 - Posted 16 Jun 2008 13:54:48 UTC

hi all,
Some workunits sent over during the weekend had the estimated time lower than it should have been. It resulted in many clients ending up with 'compute error'. I underestimated the CPU time needed for the workunit.

Later I sent workunits with an increased CPU time estimate, and most of them ran successfully. Did anyone who got the 'compute error' get successful completion of these workunits?

Thanks,
Arun

mewbysea

Joined: Apr 14 07
Posts: 6
ID: 369
Credit: 3,078,191
RAC: 0
Message 4075 - Posted 17 Jun 2008 0:47:39 UTC - in response to Message ID 4072.
Last modified: 17 Jun 2008 1:10:01 UTC

hi all,
Some workunits sent over during the weekend had the estimated time lower than it should have been. It resulted in many clients ending up with 'compute error'. I underestimated the CPU time needed for the workunit.

Later I sent workunits with an increased CPU time estimate, and most of them ran successfully. Did anyone who got the 'compute error' get successful completion of these workunits?

Thanks,
Arun


This may not be your revised version, but I got some that finally were able to at least claim their computer time and credit, but they also errored out with the same exit code. They did have the de-bugger run though so that may be of use. See result ID 8044 for an example. The BOINC Manager message for this one, which ran for about 1:20:00 with about 8 projects sharing 3 cpu cores, had the same message as the zero credit claimed wu's (which also ran for about that long):

6/14/2008 5:53:33 PM|Docking@Home|Aborting task 1tng_mod0011sc_50_467992_6: exceeded CPU time limit 4288.403915
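
For anyone collecting these messages from several hosts, a small parser makes it easier to pull out the task name and the limit. This is only a sketch that assumes every message has the same pipe-separated shape as the single line quoted above (timestamp, project, text); the function name `parse_abort` and the returned keys are made up for illustration.

```python
import re
from datetime import datetime

# Pattern based only on the one message quoted in this thread.
ABORT_RE = re.compile(
    r"Aborting task (?P<task>\S+): exceeded CPU time limit (?P<limit>[0-9.]+)"
)


def parse_abort(line):
    """Parse one 'exceeded CPU time limit' message.

    Returns a dict with the timestamp, project, task name, and limit
    in seconds, or None if the line does not have the expected shape.
    """
    parts = line.split("|", 2)
    if len(parts) != 3:
        return None
    stamp, project, message = parts
    match = ABORT_RE.search(message)
    if match is None:
        return None
    return {
        "time": datetime.strptime(stamp.strip(), "%m/%d/%Y %I:%M:%S %p"),
        "project": project,
        "task": match.group("task"),
        "limit_seconds": float(match.group("limit")),
    }


line = ("6/14/2008 5:53:33 PM|Docking@Home|Aborting task "
        "1tng_mod0011sc_50_467992_6: exceeded CPU time limit 4288.403915")
result = parse_abort(line)
print(result["task"], result["limit_seconds"])
```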
____________
Dotsch
Volunteer tester

Joined: Sep 13 06
Posts: 49
ID: 75
Credit: 57,728
RAC: 0
Message 4076 - Posted 17 Jun 2008 5:05:18 UTC - in response to Message ID 4054.


We are trying the new checkpointing algorithm, so we want docking to be run with other projects to test if checkpointing is working correctly.

There appear to be some checkpointing problems with 5.08. The WU http://docking.cis.udel.edu/workunit.php?wuid=2957 has state Checked, but no consensus yet. The WU was computed by two Intel Macs, and one of the two results was restarted four times.
Arun
Volunteer tester

Joined: Apr 30 08
Posts: 40
ID: 379
Credit: 10,385
RAC: 0
Message 4079 - Posted 17 Jun 2008 21:12:02 UTC - in response to Message ID 4062.

On my machines, they run so quickly that they never get suspended. I can force some switching if you need, but there are very few, so experimental opportunities are equally few. I exited BOINC and then restarted it to see what happened to the Dock WU. It seemed to restart, and the % complete continued to rise, but the CPU time reset to zero and started counting up again. The WU is this one; my machine is 2253.

Odd thing I noticed, when looking at the link above, when I clicked on the 2253 to check it was mine, then went "Back", the 2253 number was absent, the table entry was blank. Trivial, but possibly indicative of a lurking problem.

I caught one running, a "Charmm with screensaver 7.00", so out of curiosity, I opened the graphics, something I don't usually do, the picture below is what I saw, probably not what was intended...



Windows XP, BOINC 5.10.45, ATI Radeon HD 2400 PRO graphics adaptor and driver.

Just ask if you need anything.


Have you looked at the graphics window before? Detaching from the project and attaching to it again fixed the problem with the graphics for many clients.

Because I underestimated the FLOPS count for the tasks, the workunits were generating compute error after the estimated CPU time was exceeded.
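
To illustrate why an underestimated FLOPS count leads to an abort, here is a rough sketch. It assumes the client derives the CPU time limit by dividing the workunit's floating-point operation bound by the host's benchmarked speed; that relationship is an assumption stated for illustration, and the names `cpu_time_limit` and `would_abort` and all the numbers below are hypothetical.

```python
def cpu_time_limit(fpops_bound, host_fpops):
    """CPU time limit in seconds (assumed model: bound / host speed)."""
    if host_fpops <= 0:
        raise ValueError("host_fpops must be positive")
    return fpops_bound / host_fpops


def would_abort(cpu_seconds_needed, fpops_bound, host_fpops):
    """True if a task needing this much CPU time would hit the limit."""
    return cpu_seconds_needed > cpu_time_limit(fpops_bound, host_fpops)


# Hypothetical numbers: a host benchmarked at 2 GFLOPS and a task
# that really needs 4800 seconds of CPU time.
host_speed = 2.0e9
needed = 4800.0

low_bound = 8.0e12    # bound set too low: limit is 4000 s
high_bound = 2.0e13   # raised bound: limit is 10000 s

print(cpu_time_limit(low_bound, host_speed), would_abort(needed, low_bound, host_speed))
print(cpu_time_limit(high_bound, host_speed), would_abort(needed, high_bound, host_speed))
```

Under this model, raising the bound is enough to let the same task finish on the same host, which matches the later workunits running successfully.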

There seem to be some issues with the new checkpointing; we are working on resolving them.

Thanks
Arun
