Incorrect Function 1
Message boards : Number crunching : Incorrect Function 1
Author | Message | |
---|---|---|
It seems this problem has reared its ugly head again. I am seeing dozens of WUs on a number of PCs all fail with:
|
||
ID: 6850 | Rating: 0 | rate: / | ||
same here :-(
|
||
ID: 6853 | Rating: 0 | rate: / | ||
I am also getting the same failure messages. |
||
ID: 6854 | Rating: 0 | rate: / | ||
I don't get "incorrect function". What I've been getting lately is "computation error" ? |
||
ID: 6855 | Rating: 0 | rate: / | ||
I don't get "incorrect function". What I've been getting lately is "computation error" ? Yeah, I get "computation error" on my results too, but when I click on the reported WU results, they list the "incorrect function 1" error. |
||
ID: 6856 | Rating: 0 | rate: / | ||
I don't get "incorrect function". What I've been getting lately is "computation error" ? Now that I check that page, I DO get the same error. What's happening? |
||
ID: 6857 | Rating: 0 | rate: / | ||
... If I remember right, somewhere in the middle of a calculation the application finds that one value is out of bounds and exits with a status value of 1. AFAIK, the translation of "1" into "incorrect function" is a BOINC thing and doesn't really reflect the reason for this error. When they start the calculation, they cannot tell whether it is a candidate for this type of error, but I still think they should treat it as a valid result on the BOINC side, since no technical fault caused the exit. It could very well be handled when the result is transferred (or, in this case, not transferred) from BOINC into the scientific database, instead of filling our task lists with error results. P.S.: Here's an older thread about the same problem. It is a known problem, which is why Brian wrote "again". |
||
ID: 6858 | Rating: 0 | rate: / | ||
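To illustrate the point above about the wording: on Windows, system error code 1 carries the standard message "Incorrect function.", so a client that looks up an application's exit status as if it were an operating-system error code would show that text for a plain `exit 1`. The following is only a sketch of that idea, assuming such a lookup; the table and the function name `describe_exit_status` are hypothetical and are not taken from the BOINC source.

```python
# Hypothetical lookup table: a few Windows system error codes with their
# standard messages. This is an illustration, not BOINC's actual code.
WINDOWS_ERROR_TEXT = {
    0: "The operation completed successfully.",
    1: "Incorrect function.",
    2: "The system cannot find the file specified.",
}


def describe_exit_status(status: int) -> str:
    """Render an exit status the way a generic error-code lookup would."""
    text = WINDOWS_ERROR_TEXT.get(status, "Unknown error")
    return f"{text} (0x{status:x})"


# An application that deliberately exits with status 1 ends up labelled
# with a message that has nothing to do with why it stopped.
print(describe_exit_status(1))
```

The message therefore says nothing about the out-of-bounds value that made the application stop; it is just the generic text attached to the number 1.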
I've also some of these buggy WUs...
|
||
ID: 6859 | Rating: 0 | rate: / | ||
So I have the same problem, no checkpoints. After 3 hours and 30 minutes no more time is displayed. Is relatively poor when the computer is to be issued once.
|
||
ID: 6860 | Rating: 0 | rate: / | ||
So I have the same problem, no checkpoints. After 3 hours and 30 minutes no more time is displayed. Is relatively poor when the computer is to be issued once. I looked in the slot folder (for example d:\Boinc\Project_Data\slots\) of the broken WUs and saw that several files were missing. I cancelled these jobs. |
||
ID: 6861 | Rating: 0 | rate: / | ||
Over the past weekend, the disk space on the D@H server filled up, and as a result the server sent out some incomplete workunits. Please abort workunits with "1iiq1hih" or "1ohr1hih" in the name. The server is now back to normal.
|
||
ID: 6875 | Rating: 0 | rate: / | ||
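For anyone with a long task list, picking out the affected workunits can be sketched as a simple name filter. This is a minimal sketch under stated assumptions: the two series identifiers come from the post above, while the function name `tasks_to_abort` and the sample task names are made up for illustration (they only imitate the naming pattern seen elsewhere in this thread). The actual abort is left to the BOINC Manager.

```python
# Series identifiers named in the post above.
BAD_SERIES = ("1iiq1hih", "1ohr1hih")


def tasks_to_abort(task_names, bad_series=BAD_SERIES):
    """Return the task names that begin with one of the bad series IDs."""
    # str.startswith accepts a tuple and matches any of its entries.
    return [name for name in task_names if name.startswith(bad_series)]


# Hypothetical task names, for illustration only.
sample_tasks = [
    "1iiq1hih_mod0014crossdockinghiv1_00001_000001",
    "1t7k1htf_mod0014crossdockinghiv1_00002_000002",
    "1ohr1hih_mod0014crossdockinghiv1_00003_000003",
]

for name in tasks_to_abort(sample_tasks):
    print("abort:", name)
```

Matching on the start of the name keeps unrelated series, such as the 1t7k1htf example here, out of the list.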
During the past weekend, the space on D@H server is getting filled up and as a result, the server sent out some incomplete workunits, please abort workunits with name "1iiq1hih" or "1ohr1hih". Currently, the server is back to normal again. I have the 0% problem on 3 WUs that start with 1m0b1htf; some work just fine, though. What should I do? |
||
ID: 6879 | Rating: 0 | rate: / | ||
During the past weekend, the space on D@H server is getting filled up and as a result, the server sent out some incomplete workunits, please abort workunits with name "1iiq1hih" or "1ohr1hih". Currently, the server is back to normal again. I also have the 0% problem with 3 WUs that start with "1m0b1htf", and also 3 that start with "1ohr1htf". They have been running for over 20 hours and counting; I noticed it this morning. |
||
ID: 6880 | Rating: 0 | rate: / | ||
Please abort the ones with 0% progress.
During the past weekend, the space on D@H server is getting filled up and as a result, the server sent out some incomplete workunits, please abort workunits with name "1iiq1hih" or "1ohr1hih". Currently, the server is back to normal again. |
||
ID: 6881 | Rating: 0 | rate: / | ||
Just downloaded some 1ebz1hih_mod0014crossdockinghiv1 and they are coming up with the 0% progress problem... Aborting them.
|
||
ID: 6887 | Rating: 0 | rate: / | ||
Two separate problems I think ...
|
||
ID: 6891 | Rating: 0 | rate: / | ||
Hello,
|
||
ID: 6916 | Rating: 0 | rate: / | ||
Out of the current batch of 1ohr1htf units I've had 7 errors out of 121 crunched so far and those were limited to 1 PC - a 1090T hex core.
|
||
ID: 6917 | Rating: 0 | rate: / | ||
After a little timeout ... currently the results seem to run much better: no "exit 1" error, and my last five went through flawlessly. |
||
ID: 6927 | Rating: 0 | rate: / | ||
After a little timeout ... currently the results seem to run much better, no "exit 1" error, my last five went through flawless. That was too early :-( After 25 valid 1t7k1htf, 1dif1htf_mod0014crossdockinghiv1_23875_162815 failed with -1. There always seem to be certain series that have this flaw; other series are completely unaffected. |
||
ID: 6928 | Rating: 0 | rate: / | ||
Out of the current batch of 1ohr1htf units I've had 7 errors out of 121 crunched so far and those were limited to 1 PC - a 1090T hex core. ... Your box started working fine when you ran out of 1ohr1htf and received 1t7k1htf instead. I doubt that it is a voltage issue; I bet it would have done that with the lower voltage too. (You had way more than 7 errors, by the way, and all only in certain series.) 1dif1htf (those that caused problems for me) fail on your box too, and on all other hosts I checked. Mine is a Xeon L5520 at standard voltages and frequencies. @project: Again, as this "ERROR - Charmm exited with code 1." is a program-controlled exit, you should "exit 0" and set a flag that the result ran into a condition where further processing doesn't make sense for scientific reasons. On the BOINC side it should be successful. Compare it to a prime project: there, the numbers that turn out not to be prime do not exit with an error either. |
||
ID: 6929 | Rating: 0 | rate: / | ||
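The "exit 0 and set a flag" suggestion above could look roughly like the following. This is a minimal sketch, not the project's wrapper: it assumes that exit code 1 always means the program-controlled stop described in the post, and the names `run_and_classify` and `scientifically_unusable` are hypothetical. A tiny Python child process stands in for the science application so the example can run anywhere.

```python
import subprocess
import sys


def run_and_classify(command):
    """Run the science app and separate 'no usable result' from a crash.

    Assumption for this sketch: exit code 1 is the controlled stop, so it
    is reported as success (0) together with a flag for the science side.
    Any other exit code is passed through unchanged.
    """
    completed = subprocess.run(command)
    if completed.returncode == 1:
        return 0, {"scientifically_unusable": True}
    return completed.returncode, {"scientifically_unusable": False}


# Stand-in for the real application: a child process that exits with 1.
stand_in = [sys.executable, "-c", "import sys; sys.exit(1)"]
status, flags = run_and_classify(stand_in)
print(status, flags)
```

With this split, the task list would show a finished task, and the flag would tell the scientific database not to import the result, much like a prime project reporting "not prime" as a normal outcome.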
Out of the current batch of 1ohr1htf units I've had 7 errors out of 121 crunched so far and those were limited to 1 PC - a 1090T hex core. ... Came to the conclusion it was dodgy WUs and dropped the voltage again a couple of days ago. My 1055T has just had a few 1dif units error out, so I'm aborting all 1dif on both crunchers. |
||
ID: 6931 | Rating: 0 | rate: / | ||
1hvi1htf is another buggy series, I'll abort all I get from this type. |
||
ID: 6937 | Rating: 0 | rate: / | ||
A large fraction, but not all, of my 1hvi1htf workunits are now giving this compute error:
|
||
ID: 6938 | Rating: 0 | rate: / | ||
Add 1hvj1htf and 1hvk1htf to the bad list. |
||
ID: 6939 | Rating: 0 | rate: / | ||
Hmmm ... the non-1hv*** results are getting rare; maybe it's time for another timeout. The last one was caused by series with tons of "ERROR - Charmm exited with code 1." results, but no one has fixed it since then, and no one seems to care about tons of crashing results at all. |
||
ID: 6940 | Rating: 0 | rate: / | ||
Well, the problem has lasted almost a month now, and indeed the project leaders don't seem to bother at all. I think I will detach from this project permanently. |
||
ID: 6941 | Rating: 0 | rate: / | ||