WU abort due to output file size


Profile David Ball
Forum moderator
Volunteer tester

Joined: Sep 18 06
Posts: 274
ID: 115
Credit: 1,634,401
RAC: 0
Message 1965 - Posted 6 Jan 2007 14:49:49 UTC

My Fedora Core 3 machine usually runs fine with docking, but I had a WU abort with the following in the log file:

2007-01-06 06:48:11 [Docking@Home] Restarting task 1tng_mod0001_17265_398526_3 using charmm version 502
2007-01-06 06:48:59 [Docking@Home] Task 1tng_mod0001_17265_398526_3 exited with zero status but no 'finished' file
2007-01-06 06:48:59 [Docking@Home] If this happens repeatedly you may need to reset the project.
2007-01-06 06:48:59 [---] Rescheduling CPU: application exited
2007-01-06 06:48:59 [Docking@Home] Temporarily failed upload of 1tng_mod0001_17654_225340_2_2: http error
2007-01-06 06:48:59 [Docking@Home] Backing off 1 minutes and 0 seconds on upload of file 1tng_mod0001_17654_225340_2_2
2007-01-06 06:48:59 [Docking@Home] Temporarily failed upload of 1tng_mod0001_17654_225340_2_3: http error
2007-01-06 06:48:59 [Docking@Home] Backing off 1 minutes and 0 seconds on upload of file 1tng_mod0001_17654_225340_2_3
2007-01-06 06:48:59 [Docking@Home] Restarting task 1tng_mod0001_17265_398526_3 using charmm version 502
2007-01-06 06:49:12 [Docking@Home] Started upload of file 1tng_mod0001_17654_225340_2_0
2007-01-06 06:49:12 [Docking@Home] Started upload of file 1tng_mod0001_17654_225340_2_1
2007-01-06 06:49:20 [Docking@Home] Finished upload of file 1tng_mod0001_17654_225340_2_1
2007-01-06 06:49:20 [Docking@Home] Throughput 965 bytes/sec
2007-01-06 06:49:21 [Docking@Home] Finished upload of file 1tng_mod0001_17654_225340_2_0
2007-01-06 06:49:21 [Docking@Home] Throughput 666 bytes/sec
2007-01-06 06:50:00 [Docking@Home] Started upload of file 1tng_mod0001_17654_225340_2_2
2007-01-06 06:50:00 [Docking@Home] Started upload of file 1tng_mod0001_17654_225340_2_3
2007-01-06 06:50:07 [Docking@Home] Finished upload of file 1tng_mod0001_17654_225340_2_3
2007-01-06 06:50:07 [Docking@Home] Throughput 240 bytes/sec
2007-01-06 06:50:14 [Docking@Home] Aborting task 1tng_mod0001_17265_398526_3: exceeded disk limit: 276973242.000000 > 165000000.000000
2007-01-06 06:50:14 [Docking@Home] Unrecoverable error for result 1tng_mod0001_17265_398526_3 (Maximum disk usage exceeded)
2007-01-06 06:50:14 [Docking@Home] Deferring scheduler requests for 1 minutes and 0 seconds
2007-01-06 06:50:15 [---] Rescheduling CPU: application exited
2007-01-06 06:50:15 [Docking@Home] Computation for task 1tng_mod0001_17265_398526_3 finished
2007-01-06 06:50:15 [Docking@Home] Output file 1tng_mod0001_17265_398526_3_3 for task 1tng_mod0001_17265_398526_3 exceeds size limit.
2007-01-06 06:50:15 [Docking@Home] File size: 275166086.000000 bytes. Limit: 100000000.000000 bytes
2007-01-06 06:50:15 [Docking@Home] Starting task 1tng_mod0001_17422_430545_2 using charmm version 502


Here's the output of "df -h" on that machine:

Filesystem  Size  Used  Avail  Use%  Mounted on
/dev/hda5    34G  3.8G    29G   12%  /
/dev/hda1    99M   14M    80M   15%  /boot
none        490M     0   490M    0%  /dev/shm
/dev/hda2   2.0G   37M   1.9G    2%  /tmp
/tmp        2.0G   37M   1.9G    2%  /var/tmp

Here's the output of "df" without the "-h"
Filesystem  1K-blocks     Used  Available  Use%  Mounted on
/dev/hda5    35262476  3890756   29580476   12%  /
/dev/hda1      101086    14149      81718   15%  /boot
none           501508        0     501508    0%  /dev/shm
/dev/hda2     2063536    37108    1921604    2%  /tmp
/tmp          2063536    37108    1921604    2%  /var/tmp


BOINC is in /home/BOINC

"ulimit -a" says

core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
file size (blocks, -f) unlimited
pending signals (-i) 15871
max locked memory (kbytes, -l) 32
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
stack size (kbytes, -s) unlimited
cpu time (seconds, -t) unlimited
max user processes (-u) 15871
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited


My General preferences say

Do work while computer is running on batteries? (matters only for portable computers) yes
Do work while computer is in use? yes
Do work only between the hours of: (no restriction)
Leave applications in memory while suspended? (suspended applications will consume swap space if 'yes') yes
Switch between applications every 60 minutes (recommended: 60 minutes)
On multiprocessors, use at most 2 processors
Use at most 100 percent of CPU time

Disk and memory usage
Use no more than 100 GB disk space
Leave at least 2 GB disk space free (values smaller than 0.001 are ignored)
Use no more than 75% of total disk space
Write to disk at most every 60 seconds
Use no more than 75% of total virtual memory

Network usage
Connect to network about every 1 days (determines size of work cache; maximum 10 days)
Confirm before connecting to Internet? (matters only if you have a modem, ISDN or VPN connection) no
Disconnect when done? (matters only if you have a modem, ISDN or VPN connection) no
Maximum download rate: 16 KB/s
Maximum upload rate: 16 KB/s
Use network only between the hours of: (no restriction) (enforced by versions 4.46 and greater)
Skip image file verification? (Check this ONLY if your Internet provider modifies image files; UMTS does this, for example. Skipping verification reduces the security of BOINC.) no

The only projects on that machine are Docking and LHC, each with a 100 share, but LHC doesn't have any work.

It's running Fedora Core 3 Linux and BOINC version "5.4.11 i686-pc-linux-gnu".

NOTE: This is NOT the RHEL3 machine that won't run Docking. This machine usually runs Docking fine.

-- David

Profile Andre Kerstens
Forum moderator
Project tester
Volunteer tester

Joined: Sep 11 06
Posts: 749
ID: 1
Credit: 15,199
RAC: 0
Message 1967 - Posted 6 Jan 2007 20:24:42 UTC - in response to Message ID 1965 .

I've seen one of these problems before: the output file exceeding the size limit. charmm.out (the file ending in _3) is 275 MB (I didn't realize this file could grow that large), but we have set a 100 MB limit in the result template file on the server. That is probably a good thing, since you wouldn't want to upload a 275 MB file to the docking server. This is the first time I've seen such a big file.
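To put numbers on the two limit messages, here is a minimal Python sketch that pulls the actual and allowed sizes out of the log lines quoted in the first post and prints how far each limit was exceeded. The regular expressions and the helper name `overshoot` are made up for this illustration and only match the message wording shown in this thread; other client versions may word the messages differently.

```python
import re

# Two lines copied verbatim from the log in the first post.
LOG = """\
2007-01-06 06:50:14 [Docking@Home] Aborting task 1tng_mod0001_17265_398526_3: exceeded disk limit: 276973242.000000 > 165000000.000000
2007-01-06 06:50:15 [Docking@Home] File size: 275166086.000000 bytes. Limit: 100000000.000000 bytes
"""

# Hypothetical patterns, written only against the wording seen above.
DISK_RE = re.compile(r"exceeded disk limit: ([\d.]+) > ([\d.]+)")
FILE_RE = re.compile(r"File size: ([\d.]+) bytes\. Limit: ([\d.]+) bytes")


def overshoot(log):
    """Return {label: (actual_bytes, limit_bytes)} for each limit message found."""
    found = {}
    for label, pattern in (("task disk usage", DISK_RE), ("output file", FILE_RE)):
        match = pattern.search(log)
        if match:
            found[label] = (float(match.group(1)), float(match.group(2)))
    return found


for label, (actual, limit) in overshoot(LOG).items():
    # Decimal megabytes, to match the "275 MB" and "100 MB" figures in this thread.
    print(f"{label}: {actual / 1e6:.1f} MB vs limit {limit / 1e6:.1f} MB "
          f"({actual / limit:.2f}x)")
```

The two messages report different limits (165,000,000 and 100,000,000 bytes), so they appear to be two separate checks: one on the task's total disk usage and one on the single output file.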

The first problem, the missing 'finished' file, must be a BOINC client problem. Did you try resetting the project as the CC suggests? Have you seen this before?

The http error usually occurs when the server cannot be reached. Maybe there was a temporary problem somewhere on the network. In any case, the uploads succeeded later on.

The CC also seems to be confused about your disk limit: it reports a limit of 165000000.000000 bytes (165 MB), but your preferences specify 100 GB. This also looks like a BOINC client problem.
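As a rough cross-check, the disk preferences and the `df` figures for `/` from the first post can be combined as in the sketch below. It assumes that the most restrictive of the three disk preferences wins, that a gigabyte is 10^9 bytes, and that BOINC's own current disk usage can be ignored; the function name `allowed_disk_bytes` is made up for this illustration and none of this is taken from the BOINC source.

```python
KB = 1024
GB = 1e9  # assumption: decimal gigabytes; the client may use 2**30 instead


def allowed_disk_bytes(total, available, max_used_gb, min_free_gb, max_used_pct):
    """Most restrictive of the three disk preferences, in bytes (simplified)."""
    by_absolute = max_used_gb * GB             # "Use no more than 100 GB disk space"
    by_percent = total * max_used_pct / 100.0  # "Use no more than 75% of total disk space"
    by_free = available - min_free_gb * GB     # "Leave at least 2 GB disk space free"
    return min(by_absolute, by_percent, by_free)


# 1K-block figures for /dev/hda5 (mounted on /) from the "df" output in the first post.
total = 35262476 * KB
available = 29580476 * KB

budget = allowed_disk_bytes(total, available,
                            max_used_gb=100, min_free_gb=2, max_used_pct=75)
print(f"preference-based budget: {budget / 1e9:.1f} GB")
print(f"limit in abort message:  {165000000 / 1e9:.3f} GB")
```

Under these assumptions the 75% rule is the tightest of the three, and it still leaves a budget of tens of gigabytes, so none of the preferences comes anywhere near the 165,000,000 bytes quoted in the abort message.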

Thanks
Andre




____________
D@H the greatest project in the world... a while from now!
