Computation error - now up to 50% of the WUs fail


Message boards : Number crunching : Computation error - now up to 50% of the WUs fail

Scott A. Howard*

Joined: Nov 10 08
Posts: 6
ID: 3568
Credit: 5,203,255
RAC: 0
Message 5797 - Posted 14 Mar 2010 15:10:19 UTC
Last modified: 14 Mar 2010 16:01:27 UTC

Over the last week, I have been seeing a small number of Docking WUs failing with 'Computation error'. This morning I found that nearly 50% of the WUs on my work machine are failing. At this time there is no similar problem with WUs from other projects, though I am not processing many WUs from other projects.

The only relevant recent change I can think of is moving from BOINC Manager 6.10.10 to 6.10.36 (I think I was on 6.10.10 before). Also, since QMC is not putting out as many WUs, almost all my jobs are Docking now. There are no GPU jobs running.

On this particular machine, I am only attached to boincsimap, The Lattice Project (no jobs run yet), Docking, QMC@Home, and Ralph.

Needless to say, I am wasting a lot of energy. I have a dual quad-core machine, so a significant number of CPU cycles are going to waste. I will need to disconnect if this continues.

Here is the description of my machine from the messages pane:

3/12/2010 8:24:15 AM Processor: 8 GenuineIntel Intel(R) Xeon(R) CPU E5410 @ 2.33GHz [Family 6 Model 23 Stepping 6]
3/12/2010 8:24:15 AM Processor: 6.00 MB cache
3/12/2010 8:24:15 AM Processor features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss htt tm pni ssse3 cx16 sse4_1 nx lm vmx tm2 dca pbe
3/12/2010 8:24:15 AM OS: Microsoft Windows XP: Professional x86 Edition, Service Pack 3, (05.01.2600.00)
3/12/2010 8:24:15 AM Memory: 3.00 GB physical, 10.82 GB virtual
3/12/2010 8:24:15 AM Disk: 368.10 GB total, 296.37 GB free

Any suggestions from anyone on where to go from here?

Edit: Note to the moderator. I am purposely double posting this message here and under the Windows forum. Feel free to delete it from here if this is not the appropriate forum. :-)

Ananas

Joined: Aug 29 09
Posts: 56
ID: 17736
Credit: 2,500,425
RAC: 0
Message 5801 - Posted 14 Mar 2010 19:37:05 UTC
Last modified: 14 Mar 2010 19:41:27 UTC

You have a very large cache, so those crashed WUs might be old ones (delivery date 5 days ago). Something has been fixed in the work generator lately.

The error is "Reached the end of the file. (0x26)", which tells me that they most likely have the problem described in this thread.

Check the news section on the front page of the project too; it has related information.

Scott A. Howard*

Joined: Nov 10 08
Posts: 6
ID: 3568
Credit: 5,203,255
RAC: 0
Message 5807 - Posted 16 Mar 2010 12:01:41 UTC

Thanks for the reply and info.

Out of the 40 WUs processed in the last 18 hours or so, only 3 have ended in failure. Hopefully, failures will reach 0 soon.

It looks like the problem was transient.

Scott A. Howard*

Joined: Nov 10 08
Posts: 6
ID: 3568
Credit: 5,203,255
RAC: 0
Message 5812 - Posted 17 Mar 2010 11:25:36 UTC

Not out of the woods yet.

Out of the ~100 WUs run last night, ~60 of them failed.

I'll do the upload this morning and then reset the project.
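
For comparison, the counts quoted in the last two posts can be turned into failure rates with a quick sketch like the one below. The function name is only illustrative, and the 17 Mar figures are the approximate "~100" and "~60" given above, so the second percentage is a rough estimate rather than an exact measurement.

```python
def failure_rate(failed, total):
    """Return the share of failed work units as a percentage."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * failed / total


# Counts taken from the posts above; the 17 Mar figures are approximate.
reports = {
    "16 Mar": (3, 40),
    "17 Mar": (60, 100),
}

for day, (failed, total) in reports.items():
    print(f"{day}: {failure_rate(failed, total):.1f}% failed")
```

On these numbers the rate went from under 10% on 16 Mar to roughly 60% on 17 Mar, which is why the problem no longer looks transient.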
