checkpoint problem



Message boards : Number crunching : checkpoint problem

_heinz

Joined: Jun 16 09
Posts: 12
ID: 13437
Credit: 1,471,103
RAC: 0
Message 5483 - Posted 27 Oct 2009 8:40:57 UTC

Once progress reaches 100%, a final checkpoint should be written.
As this WU shows, the work restarted from the last checkpoint even though the WU had already reached 100% and called boinc_finish (the red line).
--------------------
<core_client_version>6.2.19</core_client_version>
<![CDATA[
<stderr_txt>
Calling BOINC init.
Starting charmm run (initial or from checkpoint)...
Calling BOINC init.
Starting charmm run (initial or from checkpoint)...
Calling BOINC init.
Starting charmm run (initial or from checkpoint)...
Calling BOINC init.
Starting charmm run (initial or from checkpoint)...
SUCCESS - Charmm exited with code 0.
Resolving file charmm.out...
Calling BOINC finish.
called boinc_finish
Calling BOINC init.
Starting charmm run (initial or from checkpoint)...
SUCCESS - Charmm exited with code 0.
Resolving file charmm.out...
Calling BOINC finish.
called boinc_finish

</stderr_txt>
]]>
--------------------------------
This was caused by the round-robin scheduler interrupting the work at 100%.
The situation can be avoided by writing an additional checkpoint at 100%.
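
To illustrate the idea, here is a minimal Python simulation. It is not the Docking@Home application and does not use the BOINC API; every name in it (run_work_unit, save_checkpoint, redone_steps, the JSON checkpoint file) is made up for the sketch. It assumes a work unit of 25 steps with a checkpoint every 10 steps, and an interruption that arrives after the last step but before the task is reported as finished.

```python
import json
import os
import tempfile


class Interrupted(Exception):
    """Stand-in for the scheduler preempting the task before it finishes."""


def save_checkpoint(path, step):
    # Write to a temporary file first, then rename, so a crash in the
    # middle of the write never leaves a half-written checkpoint behind.
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump({"step": step}, f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, path)


def load_checkpoint(path):
    # No checkpoint yet means the work unit starts from step 0.
    if not os.path.exists(path):
        return 0
    with open(path) as f:
        return json.load(f)["step"]


def run_work_unit(path, total_steps, every, final_checkpoint,
                  interrupt_before_finish=False):
    """Resume from the checkpoint; return how many steps were computed."""
    start = load_checkpoint(path)
    computed = 0
    for step in range(start + 1, total_steps + 1):
        computed += 1  # placeholder for one unit of real work
        if step % every == 0:
            save_checkpoint(path, step)
    if final_checkpoint:
        # The suggested fix: record 100% before reporting completion.
        save_checkpoint(path, total_steps)
    if interrupt_before_finish:
        raise Interrupted()
    return computed


def redone_steps(final_checkpoint):
    """Interrupt a run at 100%, restart it, and count the repeated steps."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "checkpoint.json")
        try:
            run_work_unit(path, 25, 10, final_checkpoint,
                          interrupt_before_finish=True)
        except Interrupted:
            pass
        return run_work_unit(path, 25, 10, final_checkpoint)


if __name__ == "__main__":
    print("steps redone without final checkpoint:", redone_steps(False))
    print("steps redone with final checkpoint:   ", redone_steps(True))
```

Without the final checkpoint the restarted run resumes from step 20 and repeats the last 5 steps; with it, the restart finds the checkpoint already at 100% and has nothing left to compute.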



____________
V8-Xeon-Docking

Trilce Estrada
Forum moderator
Project administrator
Project developer
Project tester

Joined: Sep 19 06
Posts: 189
ID: 119
Credit: 1,217,236
RAC: 0
Message 5486 - Posted 28 Oct 2009 0:04:25 UTC - in response to Message ID 5483 .


We will seriously consider your suggestion for the next version of our application. This behavior has been observed before; it's just bad luck to have the interruption land at the very end.

Thanks a lot
