Posts by KSMarksPsych

11)

Message boards : Number crunching : Invalid results reported thread [Use Here]

( Message 3349 )
Posted 3737 days ago by KSMarksPsych
Here are the Docking entries in stdoutdae.txt from before the crash. I don't see anything of note, except that the WU appears to have crashed a few minutes after I suspended the network.


2007-05-16 21:42:37 [Docking@Home] Starting task 1tng_mod0011_12328_94826_1 using charmm version 507
2007-05-16 21:42:39 [Docking@Home] [file_xfer] Started upload of file 1tng_mod0011_12120_243524_1_0
2007-05-16 21:42:39 [Docking@Home] [file_xfer] Started upload of file 1tng_mod0011_12120_243524_1_1
2007-05-16 21:42:42 [Docking@Home] [file_xfer] Finished upload of file 1tng_mod0011_12120_243524_1_0
2007-05-16 21:42:42 [Docking@Home] [file_xfer] Throughput 11903 bytes/sec
2007-05-16 21:42:42 [Docking@Home] [file_xfer] Finished upload of file 1tng_mod0011_12120_243524_1_1
2007-05-16 21:42:42 [Docking@Home] [file_xfer] Throughput 11903 bytes/sec
2007-05-16 21:42:42 [Docking@Home] [file_xfer] Started upload of file 1tng_mod0011_12120_243524_1_2
2007-05-16 21:42:42 [Docking@Home] [file_xfer] Started upload of file 1tng_mod0011_12120_243524_1_3
2007-05-16 21:42:46 [Docking@Home] [file_xfer] Finished upload of file 1tng_mod0011_12120_243524_1_3
2007-05-16 21:42:46 [Docking@Home] [file_xfer] Throughput 2633 bytes/sec
2007-05-16 21:42:48 [Docking@Home] [file_xfer] Finished upload of file 1tng_mod0011_12120_243524_1_2
2007-05-16 21:42:48 [Docking@Home] [file_xfer] Throughput 115673 bytes/sec
2007-05-16 21:57:16 [Docking@Home] Sending scheduler request: To report completed tasks
2007-05-16 21:57:16 [Docking@Home] Reporting 1 tasks
2007-05-16 21:57:21 [Docking@Home] Scheduler RPC succeeded [server version 509]
2007-05-16 21:57:21 [Docking@Home] Deferring communication for 11 sec
2007-05-16 21:57:21 [Docking@Home] Reason: requested by project
2007-05-16 22:18:26 [---] Suspending network activity - user request
2007-05-16 22:22:33 [Docking@Home] Deferring communication for 1 min 0 sec
2007-05-16 22:22:33 [Docking@Home] Reason: Unrecoverable error for result 1tng_mod0011_12328_94826_1 ( - exit code 1073807364 (0x40010004))
2007-05-16 22:22:33 [Docking@Home] Computation for task 1tng_mod0011_12328_94826_1 finished
[05/16/07 22:22:33] TRACE [5968]: ***** Console Event Detected *****

[05/16/07 22:22:33] TRACE [5968]: Event: CTRL-SHUTDOWN Event

2007-05-16 22:22:34 [---] Exit requested by user
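
For anyone trying to read the exit code in the log above: it can be unpacked as a Windows NTSTATUS-style value. The following is only a minimal sketch, assuming the standard NTSTATUS bit layout (severity in the top two bits, facility in bits 16-27, code in the low 16 bits). The `decode_ntstatus` helper is a hypothetical name for illustration and is not part of BOINC or charmm.

```python
def decode_ntstatus(value: int) -> dict:
    """Split a 32-bit NTSTATUS-style value into its bit fields.

    Assumed layout: bits 31-30 severity, bit 29 customer flag,
    bits 27-16 facility, bits 15-0 code.
    """
    return {
        "hex": f"0x{value:08X}",
        "severity": (value >> 30) & 0x3,    # 0=success, 1=informational, 2=warning, 3=error
        "customer": (value >> 29) & 0x1,    # 0 means a Microsoft-defined value
        "facility": (value >> 16) & 0xFFF,  # 1 is the debugger facility
        "code": value & 0xFFFF,
    }


# Exit code exactly as reported in the log line for task 1tng_mod0011_12328_94826_1
fields = decode_ntstatus(1073807364)
print(fields)
```

Under that assumption the value breaks down to informational severity, facility 1 and code 4, which is commonly listed as DBG_TERMINATE_PROCESS. That would fit the CTRL-SHUTDOWN console event logged in the same second, i.e. the process was terminated from outside rather than failing on its own.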



Specs just for the record...

Vista Home Premium

2007-05-16 22:26:37 [---] Starting BOINC client version 5.8.16 for windows_intelx86
2007-05-16 22:26:37 [---] log flags: task, file_xfer, sched_ops
2007-05-16 22:26:37 [---] Libraries: libcurl/7.16.0 OpenSSL/0.9.8a zlib/1.2.3
2007-05-16 22:26:37 [---] Data directory: C:\BOINC
2007-05-16 22:26:38 [---] Processor: 2 GenuineIntel Genuine Intel(R) CPU T2300 @ 1.66GHz [x86 Family 6 Model 14 Stepping 8] [fpu tsc pae nx sse sse2 sse3 mmx]
2007-05-16 22:26:38 [---] Memory: 1.99 GB physical, 4.20 GB virtual
2007-05-16 22:26:38 [---] Disk: 139.31 GB total, 121.65 GB free
12)

Message boards : Number crunching : Invalid results reported thread [Use Here]

( Message 3347 )
Posted 3738 days ago by KSMarksPsych
I managed to kill this WU. I'm not exactly sure what in the sequence of events did it...


I just got a new laptop with Vista on it. Since I'm going to wipe it soon, I didn't do much with the installed software. The pop ups to register McAfee were driving me nuts. So I decided to uninstall it and stick Avast on it for the time being. Even though I'm behind a router, I wanted to pull the network cable while I did this. I suspended networking in BOINC. I went to uninstall McAfee. While this was going on, either the entire manager crashed or the communications between the manager and the daemon died.

After rebooting to get rid of the last of McAfee and another reboot to finish the Avast installation, I started up BOINC again. My Predictor WU and my Riesel Sieve WU made it through okay. The Docking one was toast.

I'm not in front of that computer right now, but I'll check the logs when I'm back down there.
13)

Message boards : Cafe Docking : GoodBye Andre

( Message 3204 )
Posted 3754 days ago by KSMarksPsych
Good luck Andre!

Sorry to hear you're leaving UT. It's a great system. I did my grad work at UT - Austin.
14)

Message boards : Windows : Running Vista?

( Message 3019 )
Posted 3772 days ago by KSMarksPsych
Here's a registry hack Jord put up on the BOINC boards to make Vista wait a bit longer to shut down.
15)

Message boards : Number crunching : Invalid results reported thread [Use Here]

( Message 2869 )
Posted 3784 days ago by KSMarksPsych
It might help with cases where Charmm exits with code 0, but in this case there was an error: incorrect function. That one is still kind of a mystery to us and we are investigating it. It seems to be a Fortran-related error, as the CPDN guys have seen this before as well.

If everything goes as planned we will release a charmm with atomic checkpointing for all platforms next week, which will help solve many of the invalid cases; or at least we think so...

Andre

Looks like my one invalid is probably the problem referenced above with restarting a work unit...

http://docking.utep.edu/result.php?resultid=131429


It's the only one I have like this. Would it help to bump up the switch time on that machine to like 4 hours (since units are taking about 3.5 hours to complete)?





Anything you want me to do Andre? It was the only WU like this. And everything else, including CPDN/CPDN Beta are running happily on that machine.
16)

Message boards : Number crunching : Invalid results reported thread [Use Here]

( Message 2858 )
Posted 3785 days ago by KSMarksPsych
Looks like my one invalid is probably the problem referenced above with restarting a work unit...

http://docking.utep.edu/result.php?resultid=131429


It's the only one I have like this. Would it help to bump up the switch time on that machine to like 4 hours (since units are taking about 3.5 hours to complete)?

17)

Message boards : Number crunching : Charmm 5.04 (Windows)

( Message 2273 )
Posted 3854 days ago by KSMarksPsych
David, that is correct. 5.04 is just for the speed optimization. Probably the next version will do something about the checkpointing interval (Richard is working on this). We still need lots of debug info though, so the charmm logfile will probably be written to much more often. But that will only be for the alpha test phase.

Thanks
Andre



Thanks Andre. Now I remember reading that. That'll teach me to post in the middle of the night....
18)

Message boards : Number crunching : Charmm 5.04 (Windows)

( Message 2249 )
Posted 3854 days ago by KSMarksPsych
I haven't finished my first 5.04 WU yet, but it does seem significantly faster than 5.03.

Now to my questions. Should it be checkpointing so frequently?


1/20/2007 12:38:02 AM|Docking@Home|[task_debug] result 1tng_mod0001_22706_487147_2 checkpointed
1/20/2007 12:38:03 AM|Docking@Home|[task_debug] result 1tng_mod0001_22706_487147_2 checkpointed
1/20/2007 12:38:08 AM|Docking@Home|[task_debug] result 1tng_mod0001_22706_487147_2 checkpointed
1/20/2007 12:38:11 AM|Docking@Home|[task_debug] result 1tng_mod0001_22706_487147_2 checkpointed
1/20/2007 12:38:14 AM|Docking@Home|[task_debug] result 1tng_mod0001_22706_487147_2 checkpointed
1/20/2007 12:38:16 AM|Docking@Home|[task_debug] result 1tng_mod0001_22706_487147_2 checkpointed
1/20/2007 12:38:20 AM|Docking@Home|[task_debug] result 1tng_mod0001_22706_487147_2 checkpointed
1/20/2007 12:38:22 AM|Docking@Home|[task_debug] result 1tng_mod0001_22706_487147_2 checkpointed
1/20/2007 12:38:24 AM|Docking@Home|[task_debug] result 1tng_mod0001_22706_487147_2 checkpointed
1/20/2007 12:38:27 AM|Docking@Home|[task_debug] result 1tng_mod0001_22706_487147_2 checkpointed
1/20/2007 12:38:29 AM|Docking@Home|[task_debug] result 1tng_mod0001_22706_487147_2 checkpointed
1/20/2007 12:38:33 AM|Docking@Home|[task_debug] result 1tng_mod0001_22706_487147_2 checkpointed


My write to disk interval is 60 seconds. Running 5.8.3 on Windows XP Pro. It's a P4 2.8 (no HT) with 512 MB of RAM. This host.
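
To put a number on "so frequently", here is a rough Python sketch that measures the gaps between the checkpoint messages pasted above and compares them with the 60 second write to disk interval. The timestamps are copied from the log; the names `CHECKPOINT_TIMES`, `WRITE_INTERVAL_SEC` and `checkpoint_gaps` are made up for this illustration and do not come from BOINC.

```python
from datetime import datetime

# Timestamps copied from the [task_debug] lines above (all on 1/20/2007)
CHECKPOINT_TIMES = [
    "12:38:02 AM", "12:38:03 AM", "12:38:08 AM", "12:38:11 AM",
    "12:38:14 AM", "12:38:16 AM", "12:38:20 AM", "12:38:22 AM",
    "12:38:24 AM", "12:38:27 AM", "12:38:29 AM", "12:38:33 AM",
]

# "Write to disk at most every N seconds" preference from the post
WRITE_INTERVAL_SEC = 60


def checkpoint_gaps(times, day="1/20/2007"):
    """Return the seconds elapsed between consecutive checkpoint messages."""
    stamps = [datetime.strptime(f"{day} {t}", "%m/%d/%Y %I:%M:%S %p") for t in times]
    return [(later - earlier).total_seconds() for earlier, later in zip(stamps, stamps[1:])]


gaps = checkpoint_gaps(CHECKPOINT_TIMES)
print("gaps (s):", gaps)
print("longest gap (s):", max(gaps))
print("every gap shorter than the write interval:",
      all(g < WRITE_INTERVAL_SEC for g in gaps))
```

In this sample no two checkpoints are more than a few seconds apart, so the app is checkpointing far more often than the 60 second preference would suggest.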
19)

Message boards : Number crunching : Who is overclocking their machine?

( Message 2203 )
Posted 3856 days ago by KSMarksPsych
Neither of my hosts (I think I attached the second one at one point in time) is overclocked.

I wonder if they just did a MySQL query looking for invalids of any kind. I had a bunch way back when I joined that I had to manually abort because of a foul-up with my connect to interval ::blushes::

I glanced briefly through my results list and I didn't see any that completed successfully on my end that ended up not validating.
20)

Message boards : Cafe Docking : BOINC manager 5.7.x experiences ?

( Message 1687 )
Posted 3903 days ago by KSMarksPsych
The current one (5.7.5) is rock solid for me aside from one annoying bug that'll be fixed in the next alpha release.

Snooze only snoozes for 1 minute instead of one hour. Apparently hours and minutes got screwed up somewhere in the code, according to Rom's check-in notes.

In this release CPU throttling is working correctly. 5.7.4 and previous had a bug where the science app would suddenly stop all processing.


David A. has posted to the alpha list that 5.8 should be ready for release in a week or 2.

