Wrong numbers in "max # of error/total/success tasks"

Message boards : Number crunching : Wrong numbers in "max # of error/total/success tasks"

Profile Saenger
Volunteer tester

Joined: Sep 13 06
Posts: 125
ID: 79
Credit: 411,959
RAC: 0
Message 5056 - Posted 2 Jun 2009 8:45:07 UTC

In general I have no problem with the numbers 0/1/1 in those fields of the WUs, but you should never resend a delayed WU to another cruncher if you won't accept 2 valid results.

I received one as the second cruncher on 31 May 2009 at 20:07:58 UTC, but before I could report it back (or rather, before my BOINC client could), the original cruncher reported it on 31 May 2009 at 23:22:36 UTC. On 1 Jun 2009 at 7:55:12 UTC mine was invalidated because of "Too many total results".

If you resend WUs, the number of totals must be at least 2, and so must the number of successes; the number of errors can be one less.
____________
Gruesse vom Saenger

For questions about Boinc look in the BOINC-Wiki

Profile Saenger
Volunteer tester

Joined: Sep 13 06
Posts: 125
ID: 79
Credit: 411,959
RAC: 0
Message 5064 - Posted 5 Jun 2009 16:52:58 UTC
Last modified: 5 Jun 2009 16:53:31 UTC

As you haven't changed the numbers, am I correct that you will never resend non-valid completed WUs ever?

In connection with this thread, I think that's a fair assumption. Does that mean you don't need all WUs crunched to achieve your result?

Profile SG-Booster

Joined: Oct 25 08
Posts: 2
ID: 2918
Credit: 2,218,124
RAC: 0
Message 5072 - Posted 8 Jun 2009 9:47:56 UTC

I agree with Saenger...

The minimum value should be 2...

I also have some of these WUs... lost crunching time!

best RS

Profile Trilce Estrada
Forum moderator
Project administrator
Project developer
Project tester

Joined: Sep 19 06
Posts: 189
ID: 119
Credit: 1,217,236
RAC: 0
Message 5073 - Posted 9 Jun 2009 1:51:58 UTC

As an initial answer to this problem: the server was sending additional jobs of the same workunit because of a bug in the transitioner. We fixed this issue a while ago, but during the last relocation the transitioner was overwritten. It has now been restored, and the maximum number of jobs shouldn't be a problem anymore. Regarding the maximum number of successes, you are correct: a better value is 2, and it has now been changed.

More about this issue later

Thank you!!

Profile Michela
Forum moderator
Project administrator
Project developer
Project tester
Project scientist

Joined: Sep 13 06
Posts: 163
ID: 10
Credit: 97,083
RAC: 0
Message 5078 - Posted 10 Jun 2009 21:52:03 UTC - in response to Message ID 5064.

As you haven't changed the numbers, am I correct that you will never resend non-valid completed WUs ever?

In connection with this thread, I think that's a fair assumption. Does that mean you don't need all WUs crunched to achieve your result?


Hi, I want to address the question "Does that mean you don't need all WUs crunched to achieve your result?" We consider every WU important; however, a WU is relevant not as a single result but as part of a larger set of results. We post-process the set of results and identify tendencies. The more results we have, the higher the probability that they converge toward a correct answer.

In other words, we replaced redundancy (that resulted in wasted cycles) with a powerful post-processing analysis that allows us to identify outliers (possible results affected by errors) and convergent results (tentative answers).
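
To illustrate the idea only: the thread does not describe the project's actual analysis, so the following is a minimal Python sketch under stated assumptions. It assumes each returned result can be reduced to a single numeric score, and it flags outliers with a median-absolute-deviation rule. The function name `split_outliers`, the threshold of 3.5, and the sample scores are all hypothetical.

```python
from statistics import median


def split_outliers(scores, threshold=3.5):
    """Split scores into (convergent, outliers).

    Assumption: one number per result, and a result is an outlier when
    its modified z-score (based on the median absolute deviation)
    exceeds `threshold`. This is a generic technique, not the
    project's documented method.
    """
    scores = list(scores)
    med = median(scores)
    mad = median(abs(s - med) for s in scores)
    if mad == 0:
        # At least half the scores equal the median: nothing to flag.
        return scores, []
    convergent, outliers = [], []
    for s in scores:
        modified_z = 0.6745 * (s - med) / mad
        if abs(modified_z) > threshold:
            outliers.append(s)
        else:
            convergent.append(s)
    return convergent, outliers


# Hypothetical scores: four results agree, one is far off.
sample = [-10.2, -10.0, -9.9, -10.1, -3.0]
convergent, outliers = split_outliers(sample)
print("convergent:", convergent)
print("outliers:", outliers)
```

With a rule like this, a single missing or faulty result does not need a redundant copy from a second cruncher; it simply contributes less (or nothing) to the set that is analysed.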

Michela


____________
If you are interested in working on Docking@Home in a great group at UDel, contact me at 'taufer at acm dot org'!

Steven Meyer

Joined: May 26 09
Posts: 23
ID: 12091
Credit: 130,335
RAC: 0
Message 5090 - Posted 25 Jun 2009 16:19:47 UTC - in response to Message ID 5078.
Last modified: 25 Jun 2009 16:29:04 UTC

As you haven't changed the numbers, am I correct that you will never resend non-valid completed WUs ever?

In connection with this thread, I think that's a fair assumption. Does that mean you don't need all WUs crunched to achieve your result?


Hi, I want to address the question "Does that mean you don't need all WUs crunched to achieve your result?" We consider every WU important; however, a WU is relevant not as a single result but as part of a larger set of results. We post-process the set of results and identify tendencies. The more results we have, the higher the probability that they converge toward a correct answer.

In other words, we replaced redundancy (that resulted in wasted cycles) with a powerful post-processing analysis that allows us to identify outliers (possible results affected by errors) and convergent results (tentative answers).

Michela


Michela, some time ago, when I was first starting to crunch D@H WUs, I was sent a large number of WUs with a short deadline. Since D@H is not my only project, the work overload caused D@H to run in "High Priority", thus shutting down the other project. To reduce the overload, I aborted about half of the D@H WUs.

Then I checked one of the aborted WUs on the web site and saw this line:

max # of error/total/success tasks 0, 1, 1

Since the abort was counted as an error, all of the aborted Work Units will never be sent out again.

This may or may not be an issue, given your post-processing.

Now, however, I see that the number of success tasks has been set to 2, but the error and total numbers are unchanged.

max # of error/total/success tasks 0, 1, 2

It might make sense to change the counts to be 1 for errors, 1 or 2 for total, and 2 for success so that an abort will not prevent the WU from being reissued.

D@H settings of errors 0, total 1, success 2 will cause the WU to be abandoned with one error or any 2 results.

Again, maybe this is OK with your post-processing...
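
To make the three settings discussed in this thread concrete, here is a minimal Python sketch. It assumes a workunit is abandoned as soon as a count strictly exceeds its limit, that an aborted task counts as an error, and that every issued task counts toward the total. The names `Limits` and `abandon_reason` are made up for illustration; this is not BOINC's actual transitioner code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Limits:
    max_error: int
    max_total: int
    max_success: int


def abandon_reason(limits: Limits, n_error: int, n_success: int) -> Optional[str]:
    """Return why the WU would be abandoned, or None if it stays alive.

    Assumption: total = errors + successes, and a limit is violated
    only when the count is strictly greater than the limit.
    """
    n_total = n_error + n_success
    if n_error > limits.max_error:
        return "too many error results"
    if n_success > limits.max_success:
        return "too many success results"
    if n_total > limits.max_total:
        return "too many total results"
    return None


old = Limits(max_error=0, max_total=1, max_success=1)        # 0, 1, 1
current = Limits(max_error=0, max_total=1, max_success=2)    # 0, 1, 2
suggested = Limits(max_error=1, max_total=2, max_success=2)  # one reading of the suggestion

# One aborted task: abandoned under 0/1/1 and 0/1/2, kept under 1/2/2.
for name, lim in [("old", old), ("current", current), ("suggested", suggested)]:
    print(name, "after one abort:", abandon_reason(lim, n_error=1, n_success=0))

# Two successful results for the same WU under the current settings.
print("current after two successes:", abandon_reason(current, n_error=0, n_success=2))
```

Under these assumptions, a total of 1 is too small for the suggested settings: an abort followed by a successful resend already makes two tasks, so the sketch uses 2.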
____________
