Exceeded CPU time quota.



Message boards : Number crunching : Exceeded CPU time quota.

adrianxw
Volunteer tester

Joined: Dec 30 06
Posts: 164
ID: 343
Credit: 1,669,741
RAC: 0
Message 4411 - Posted 24 Sep 2008 7:24:06 UTC

Just returned from 3 weeks away and found that this wu crashed out with exceeded quota. I saw the news item from 22nd about dodgy wu's but note that this wu was sent today. Is this the same error? Are there more of these dodgy wu's in the queue ready to send?

24/09/2008 00:46:53|Docking@Home|Sending scheduler request: To fetch work. Requesting 8 seconds of work, reporting 0 completed tasks
24/09/2008 00:46:58|Docking@Home|Scheduler request succeeded: got 1 new tasks
24/09/2008 00:47:01|Docking@Home|Started download of 1d4j_mod0013sc_11768_409420.inp
24/09/2008 00:47:13|Docking@Home|Finished download of 1d4j_mod0013sc_11768_409420.inp
24/09/2008 01:19:21|Docking@Home|Starting 1d4h_mod0013sc_69953_284934_1
24/09/2008 01:19:21|Docking@Home|Starting task 1d4h_mod0013sc_69953_284934_1 using charmm34 version 615
24/09/2008 01:38:45|Docking@Home|Starting 1d4j_mod0013sc_11768_409420_0
24/09/2008 01:38:45|Docking@Home|Starting task 1d4j_mod0013sc_11768_409420_0 using charmm34 version 615
.
.
.
24/09/2008 07:41:27|Docking@Home|Aborting task 1d4h_mod0013sc_69953_284934_1: exceeded CPU time limit 22874.078707
.
.
.
24/09/2008 07:41:32|Docking@Home|Computation for task 1d4h_mod0013sc_69953_284934_1 finished
24/09/2008 07:41:32|Docking@Home|Output file 1d4h_mod0013sc_69953_284934_1_0 for task 1d4h_mod0013sc_69953_284934_1 absent
24/09/2008 07:41:32|Docking@Home|Output file 1d4h_mod0013sc_69953_284934_1_1 for task 1d4h_mod0013sc_69953_284934_1 absent
24/09/2008 07:41:32|Docking@Home|Output file 1d4h_mod0013sc_69953_284934_1_2 for task 1d4h_mod0013sc_69953_284934_1 absent
24/09/2008 07:41:34|Docking@Home|Started upload of 1d4h_mod0013sc_69953_284934_1_3
24/09/2008 07:41:35|Docking@Home|Finished upload of 1d4h_mod0013sc_69953_284934_1_3

____________
Wave upon wave of demented avengers march cheerfully out of obscurity into the dream.
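
The abort line in the log above can be picked out mechanically. The following is a minimal Python sketch, not part of BOINC or Docking@Home; the function name find_cpu_limit_aborts and the regular expression are illustrative assumptions based only on the log lines quoted in this thread (timestamp, project, and message separated by "|").

```python
import re

# Matches client log lines of the form quoted in this thread, e.g.
# 24/09/2008 07:41:27|Docking@Home|Aborting task <name>: exceeded CPU time limit 22874.078707
# The pattern is an assumption derived from those quoted lines only.
ABORT_RE = re.compile(
    r"^(?P<stamp>[^|]+)\|(?P<project>[^|]+)\|"
    r"Aborting task (?P<task>\S+): exceeded CPU time limit (?P<limit>\d+(?:\.\d+)?)$"
)


def find_cpu_limit_aborts(lines):
    """Return (task_name, limit_in_seconds) for each CPU-time-limit abort."""
    aborts = []
    for line in lines:
        match = ABORT_RE.match(line.strip())
        if match:
            aborts.append((match.group("task"), float(match.group("limit"))))
    return aborts


log = [
    "24/09/2008 01:19:21|Docking@Home|Starting 1d4h_mod0013sc_69953_284934_1",
    "24/09/2008 07:41:27|Docking@Home|Aborting task 1d4h_mod0013sc_69953_284934_1: exceeded CPU time limit 22874.078707",
    "24/09/2008 07:41:32|Docking@Home|Computation for task 1d4h_mod0013sc_69953_284934_1 finished",
]

for task, limit in find_cpu_limit_aborts(log):
    print(f"{task}: limit {limit:.0f} s ({limit / 3600:.2f} h)")
```

Because the timestamp field is matched loosely, the same pattern should also accept the "Thu 15 Jan 2009 08:13:37 EST" style of timestamp seen elsewhere in this thread.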

mewbysea

Joined: Apr 14 07
Posts: 6
ID: 369
Credit: 3,078,191
RAC: 0
Message 4416 - Posted 24 Sep 2008 23:51:54 UTC

If this is the same problem, can you do a server-abort (if that's the right term) to recall these units that may still be queued but not yet started?
____________

Trilce Estrada
Forum moderator
Project administrator
Project developer
Project tester

Joined: Sep 19 06
Posts: 189
ID: 119
Credit: 1,217,236
RAC: 0
Message 4417 - Posted 25 Sep 2008 2:23:34 UTC

We already did it. The problem is with existing workunits that created an additional replica, which was the case for 1d4h_mod0013sc_69953_284934_1: the trailing _1 indicates that it was an additional replica created for 1d4h_mod0013sc_69953_284934_0.

On the server there are no more of these workunits marked as unsent, but some additional replicas may still be generated. If that happens I'll try to intercept them before

adrianxw
Volunteer tester

Joined: Dec 30 06
Posts: 164
ID: 343
Credit: 1,669,741
RAC: 0
Message 4459 - Posted 3 Oct 2008 12:38:38 UTC

There was another wu here which arrived 29th that crapped out after 12+ hours with exceeded quota.
____________
Wave upon wave of demented avengers march cheerfully out of obscurity into the dream.

dividedbymyself

Joined: Sep 14 08
Posts: 2
ID: 930
Credit: 92,857
RAC: 0
Message 4695 - Posted 12 Jan 2009 14:27:50 UTC

I just had a WU that aborted with a "resource limit exceeded" message. It had been running for 20 1/2 hours. The next WU has now been running for 2 1/4 hours and is not even at 4.5%, so by my calculation this one will exceed the resource limit as well. Before these last WUs I never had this problem, as far as I can remember.
Is this because of the limited memory (384 MB) of this computer, or could there be other reasons?
And what would be best: abort this task and download a new one to see what happens, rather than waste time on a task that obviously won't finish in time?
Or is my computer just too "small" for the current tasks?

I'd appreciate some advice.

Thanks,
Bart
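
Bart's estimate can be written out as a quick calculation. This is a rough Python sketch that assumes progress is linear in time, which nothing in the thread establishes; the function name projected_total_hours is made up for illustration, and the 20.5 hour figure is simply the run time at which his previous WU was aborted, used here as a stand-in for the limit on his host.

```python
def projected_total_hours(elapsed_hours, fraction_done):
    """Extrapolate the total run time, assuming progress is linear in time."""
    if not 0 < fraction_done <= 1:
        raise ValueError("fraction_done must be in the range (0, 1]")
    return elapsed_hours / fraction_done


elapsed_hours = 2.25        # "busy for 2 1/4 hours"
fraction_done = 0.045       # "not even at 4.5%", so this is an upper bound on progress
previous_abort_hours = 20.5  # run time at which the previous WU was aborted

total = projected_total_hours(elapsed_hours, fraction_done)
print(f"projected total: {total:.1f} h")
print("likely to hit the limit:", total > previous_abort_hours)
```

Under those assumptions the projection comes out at about 50 hours, well beyond the point where the previous task was aborted. Since progress was below 4.5%, the real figure would be even higher.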

Trilce Estrada
Forum moderator
Project administrator
Project developer
Project tester

Joined: Sep 19 06
Posts: 189
ID: 119
Credit: 1,217,236
RAC: 0
Message 4696 - Posted 12 Jan 2009 16:49:00 UTC - in response to Message ID 4695 .

Hi Bart,

I have the same type of workunits on my own computer. The problem is in the workunits and not in your computer. If you calculate that they won't finish in time, just abort them. However, some of them take from 4 to 7 hours and finish fine. Running longer than that may mean they went into some sort of undesirable loop.

Thank you

dividedbymyself

Joined: Sep 14 08
Posts: 2
ID: 930
Credit: 92,857
RAC: 0
Message 4697 - Posted 12 Jan 2009 17:50:09 UTC - in response to Message ID 4696 .
Last modified: 12 Jan 2009 17:53:40 UTC

Hi Bart,

I have the same type of workunits on my own computer. The problem is in the workunits and not in your computer. If you calculate that they won't finish in time, just abort them. However, some of them take from 4 to 7 hours and finish fine. Running longer than that may mean they went into some sort of undesirable loop.

Thank you


Well, this particular computer of mine is pretty slow as well (1300 MHz). Can't this be part of the problem?
But I just aborted it, as I think it would eventually give an error again. I just hope there will not be too many of these "strange" WUs. My latest D@H WUs before the last long ones took from 11 minutes to 9 hours to complete. I'll see what's coming in next...

Thanks,
Bart
Saenger
Volunteer tester

Joined: Sep 13 06
Posts: 125
ID: 79
Credit: 411,959
RAC: 0
Message 4717 - Posted 14 Jan 2009 17:16:42 UTC

I ran into the "Maximum CPU time exceeded" error as well with one of my WUs. It was well before the deadline, but somehow 27720.43 seconds (07:42:00.43) were declared as being too much.

I don't have the faintest idea what's wrong with them; it would just be nice if such errors could be detected and the WU stopped a bit earlier.
____________
Gruesse vom Saenger

For questions about Boinc look in the BOINC-Wiki
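
The conversion Saenger quotes is easy to check. Below is a small Python sketch (the helper name format_cpu_seconds is hypothetical) that turns CPU time figures like the ones reported in this thread into hours, minutes and seconds. It works in whole hundredths of a second to avoid floating point rounding surprises.

```python
def format_cpu_seconds(seconds):
    """Format a CPU time in seconds as HH:MM:SS.cc (hundredths of a second)."""
    if seconds < 0:
        raise ValueError("seconds must be non-negative")
    total_cs = round(seconds * 100)            # whole hundredths of a second
    hours, rest = divmod(total_cs, 360_000)    # 3600 s * 100
    minutes, rest = divmod(rest, 6_000)        # 60 s * 100
    secs, centis = divmod(rest, 100)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{centis:02d}"


# CPU time figures quoted earlier in this thread
for cpu_seconds in (27720.43, 22874.078707):
    print(cpu_seconds, "->", format_cpu_seconds(cpu_seconds))
```

For 27720.43 seconds this gives 07:42:00.43, matching the figure in the post above.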

P . P . L .

Joined: Oct 20 08
Posts: 69
ID: 2725
Credit: 1,000,979
RAC: 0
Message 4722 - Posted 14 Jan 2009 22:23:33 UTC

Hi.

I had this one error out after 10 hrs 37 min, see below.

Thu 15 Jan 2009 08:13:37 EST|Docking@Home|Aborting task 1pph_mod0013sctryp_2014_41716_0: exceeded CPU time limit 38265.722102

http://docking.cis.udel.edu/community/workunit.php?wuid=2552468

pete.

____________


Trilce Estrada
Forum moderator
Project administrator
Project developer
Project tester

Joined: Sep 19 06
Posts: 189
ID: 119
Credit: 1,217,236
RAC: 0
Message 4731 - Posted 16 Jan 2009 16:25:06 UTC

I'm sorry, I found that the problem was especially present in 1pph. I'm trying to find a solution. If the problem cannot be detected in time, at least the credit should be assigned and the results returned, because in those cases the results are still valid and are even more stable than in the case of shorter workunits.

For the moment I have reduced the number of conformations, so stable workunits shouldn't take long.

SUNY-GT Dock

Joined: Jan 2 09
Posts: 1
ID: 5427
Credit: 12,442
RAC: 0
Message 4776 - Posted 22 Jan 2009 20:38:25 UTC

Hate to rain on your parade Trilce, but right now I've got a 1bl7 workunit (1bl7_mod0013scp38alpha_19273_467291_0 is the exact designation) that's been running for over 5.5 hours now and hasn't budged from 0%. When I display the graphics, it says that "no model has been formed yet." It hasn't timed out (yet), but it may be the same kind of problem that others have been having, or maybe it's just a dud WU. (shrugs shoulders)

I figure it's probably related to the transition troubles you've been having... >.>

P . P . L .

Joined: Oct 20 08
Posts: 69
ID: 2725
Credit: 1,000,979
RAC: 0
Message 4780 - Posted 23 Jan 2009 20:42:29 UTC
Last modified: 23 Jan 2009 20:43:05 UTC

Hi.

Fri 23 Jan 2009 19:01:27 EST|Docking@Home|Aborting task 1bl7_mod0013scp38alpha_1408_97433_0: exceeded CPU time limit 29810.229190

I had this one error out after 8 hrs 16 min. I've had one other finish O.K. after a bit over 3 hrs 40 min. As I have about 6 more of the same type of task, should I let them run or abort the rest?

pete.
____________


Michela
Forum moderator
Project administrator
Project developer
Project tester
Project scientist

Joined: Sep 13 06
Posts: 163
ID: 10
Credit: 97,083
RAC: 0
Message 4782 - Posted 24 Jan 2009 1:47:24 UTC - in response to Message ID 4780 .

Hi.

Fri 23 Jan 2009 19:01:27 EST|Docking@Home|Aborting task 1bl7_mod0013scp38alpha_1408_97433_0: exceeded CPU time limit 29810.229190

I had this one error out after 8 hrs 16 min. I've had one other finish O.K. after a bit over 3 hrs 40 min. As I have about 6 more of the same type of task, should I let them run or abort the rest?

pete.


Please abort the rest. We had a problem with the p38 protein that we isolated and fixed. We will distribute new jobs with the p38 protein on Saturday. This time the error should not be present.

Thanks,

Michela
____________
If you are interested in working on Docking@Home in a great group at UDel, contact me at 'taufer at acm dot org'!
P . P . L .

Joined: Oct 20 08
Posts: 69
ID: 2725
Credit: 1,000,979
RAC: 0
Message 4783 - Posted 24 Jan 2009 2:57:18 UTC

Michela.

Thanks for that, done.

pete.

____________


ChertseyAl

Joined: Sep 9 08
Posts: 3
ID: 805
Credit: 251,157
RAC: 0
Message 4787 - Posted 24 Jan 2009 19:48:40 UTC - in response to Message ID 4782 .

Please abort the rest. We had a problem with the p38 protein that we isolated and fixed. We will distribute new jobs with the p38 protein on Saturday. This time the error should not be present.


Meanwhile, what happens to the valuable crunching effort that has been discarded? The results are probably valid, but declared invalid by BOINC. Any chance that the results can be used? Or have we just wasted our time due to a dumb configuration problem?

Thanks,

Al.

P . P . L .

Joined: Oct 20 08
Posts: 69
ID: 2725
Credit: 1,000,979
RAC: 0
Message 4789 - Posted 25 Jan 2009 2:48:12 UTC
Last modified: 25 Jan 2009 2:50:57 UTC

Here's another that ran for 10 hrs 38 min, a different type of task. I guess I won't see any credit for it.

Sun 25 Jan 2009 13:18:33 EST|Docking@Home|Aborting task 1k1n_mod0013sctryp_6168_258228_1: exceeded CPU time limit 38327.437530

http://docking.cis.udel.edu/community/workunit.php?wuid=2500935

pete.
____________


Augustine
Volunteer tester

Joined: Sep 13 06
Posts: 46
ID: 5
Credit: 143,502
RAC: 0
Message 4792 - Posted 25 Jan 2009 17:29:04 UTC

Rather disappointing that a WU like this is crunched for about 8h only to be aborted because of a too pessimistic CPU time limit...

____________

Michela
Forum moderator
Project administrator
Project developer
Project tester
Project scientist

Joined: Sep 13 06
Posts: 163
ID: 10
Credit: 97,083
RAC: 0
Message 4799 - Posted 26 Jan 2009 1:43:43 UTC - in response to Message ID 4792 .

Rather disappointing that a WU like this is crunched for about 8h only to be aborted because of a too pessimistic CPU time limit...


Hi All,

Please look at this answer.

Sorry for the inconvenience.

Michela
____________
If you are interested in working on Docking@Home in a great group at UDel, contact me at 'taufer at acm dot org'!
Rene
Volunteer tester

Joined: Oct 2 06
Posts: 121
ID: 160
Credit: 109,415
RAC: 0
Message 4812 - Posted 29 Jan 2009 19:26:14 UTC

Hello Michela and all,

Well it's been a while since my last post, but it still feels like home... ;-)
The "website wallpaper" here has been replaced... and I might say a job well done..!!

But back to the title of this thread: I also ran into a wu that exceeded.

No harm done... also wu's that go wrong can solve future problems.

Keep up the good work,
Rene

____________

Conan
Volunteer tester

Joined: Sep 13 06
Posts: 219
ID: 100
Credit: 4,256,493
RAC: 0
Message 4824 - Posted 2 Feb 2009 11:22:20 UTC

I have a remote computer that downloaded a swag of jobs on 22/1/09, and more than half of them are erroring out with the "Maximum cpu time exceeded" error.

This is only happening on one Windows machine (Host 614), and when it occurs the work units either run for 28,783 seconds or zero seconds.
Nothing between those values gets the error.

I hope to be able to contact the person looking after that computer and get them to reset, but I am unsure when that will be.

As a lot of work units were downloaded, there will be a lot of returned jobs with this error that have run for 8 hours (almost exactly that number of hours) and will not get any credit for the work. I hope they can be used.

____________

Wang Solutions
Volunteer tester

Joined: Nov 14 06
Posts: 5
ID: 272
Credit: 5,326,180
RAC: 0
Message 4897 - Posted 23 Mar 2009 6:17:32 UTC - in response to Message ID 4824 .

Pretty much every work unit that I have downloaded for Linux in the past 24 hours has gone over the maximum allocated CPU time (well over 8 hours). Is this because of a problem with the new work units? There appear to be no problems with my computers, and resetting has not helped.

These are all 1ajv_mod0014 units.

____________
Proud member of BOINC@AUSTRALIA

Trilce Estrada
Forum moderator
Project administrator
Project developer
Project tester

Joined: Sep 19 06
Posts: 189
ID: 119
Credit: 1,217,236
RAC: 0
Message 4898 - Posted 23 Mar 2009 22:57:49 UTC - in response to Message ID 4897 .

It is very strange; so far only 4 users have this problem. I can see that it is not a problem with the speed of the computer, because you are using a very fast one. We increased the allowed CPU time limit so that you won't get the error, but we need to find the pattern behind this error's occurrence.

Thank you, and we will keep you posted
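
The thread does not say how the CPU time limit is derived, or why it differs from host to host (22874 s, 27720 s, 38265 s and so on). One plausible explanation, offered here as an assumption rather than something the project staff confirmed: BOINC workunits carry a floating point operation bound (rsc_fpops_bound), and the client of that era aborted a task once its CPU time exceeded that bound divided by the host's benchmarked speed in floating point operations per second. The Python sketch below only illustrates that division; the function name and all numbers are hypothetical.

```python
def cpu_time_limit_seconds(rsc_fpops_bound, host_fpops):
    """Assumed rule: limit = operation bound / benchmarked host speed."""
    if host_fpops <= 0:
        raise ValueError("host_fpops must be positive")
    return rsc_fpops_bound / host_fpops


# Hypothetical values, chosen only to show the shape of the calculation.
bound = 6.0e13                  # operation bound set by the project (made up)
hosts = {
    "slower host": 1.3e9,       # benchmarked FLOPS (made up)
    "faster host": 2.6e9,
}

for name, fpops in hosts.items():
    limit = cpu_time_limit_seconds(bound, fpops)
    print(f"{name}: limit {limit:.0f} s ({limit / 3600:.1f} h)")

# Raising the bound on the server raises every host's limit in proportion.
doubled = cpu_time_limit_seconds(2 * bound, hosts["faster host"])
print(f"faster host with doubled bound: {doubled / 3600:.1f} h")
```

If that assumption holds, raising the allowed limit as described above would amount to raising the bound on newly generated workunits, and a faster benchmark score would produce a shorter limit in seconds for the same workunit.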
