Excessive disk activity
Message boards : Number crunching : Excessive disk activity
Author | Message | |
---|---|---|
I have been noticing that the current application seems to be doing a heck of a lot of disk reading and writing all the time while a WU is running.
|
||
ID: 2503 | Rating: 0 | rate: / | ||
"I have been noticing that the current application seems to be doing a heck of a lot of disk reading and writing all the time while a WU is running."

That's correct. There are a couple of threads on this subject already available.

"According to task manager charmm_5.4_windows_intelx86.exe is doing 200 I/O Reads (1,000,000+ Read Bytes) per second and 4-5 I/O Writes (1,000-2,000 Write Bytes) per second."

Reading: the docking grid is read from disk every time we do a docking trial (currently about 80x80=6400 times per run). This is the way charmm does it currently, but we are going to try to change this, as it shouldn't be necessary to read the same data from disk over and over again.

Writing: the charmm log file (which currently contains lots of info to help us with debugging) and the app checkpointing (also after every docking trial). The checkpointing is going to be more effective later on (or at least that is the plan), and we are going to check the volunteers' 'write to disk' preference in the future (we don't do this yet).

"Is there any way to update the application so it does not use so much disk activity?"

It's on our to-do list :-)

Thanks
Andre
____________
D@H the greatest project in the world... a while from now! |
||
ID: 2504 | Rating: 0 | rate: / | ||
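The read pattern described in the reply above (the same docking grid read from disk on each of roughly 80x80=6400 docking trials) and the intended fix (read it once and reuse it) can be illustrated with a small sketch. This is not CHARMM code: `load_grid_uncached`, `load_grid_cached`, `run_trials`, and the file name `grid.dat` are hypothetical names chosen for illustration, and the 1 KB dummy file stands in for the real grid.

```python
import functools
import os
import tempfile

READ_COUNT = 0  # how many times the file was actually opened


def load_grid_uncached(path):
    """Open and read the grid file on every call."""
    global READ_COUNT
    READ_COUNT += 1
    with open(path, "rb") as f:
        return f.read()


@functools.lru_cache(maxsize=None)
def load_grid_cached(path):
    """Read the grid file once; later calls reuse the bytes kept in memory."""
    return load_grid_uncached(path)


def run_trials(loader, path, trials):
    """Stand-in for the docking loop: every trial needs the grid."""
    total = 0
    for _ in range(trials):
        total += len(loader(path))
    return total


def demo(trials=80 * 80):
    """Return (file opens without caching, file opens with caching)."""
    global READ_COUNT
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "grid.dat")  # hypothetical file name
        with open(path, "wb") as f:
            f.write(b"\0" * 1024)  # dummy 1 KB "grid"

        READ_COUNT = 0
        run_trials(load_grid_uncached, path, trials)
        uncached = READ_COUNT

        READ_COUNT = 0
        load_grid_cached.cache_clear()
        run_trials(load_grid_cached, path, trials)
        cached = READ_COUNT
    return uncached, cached
```

With the default of 6400 trials, the uncached loader opens the file 6400 times and the cached loader opens it once. Doing the same inside an existing interpreter is of course harder than this toy suggests.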
Within a few days of running docking again i blew up another hard disk. So it may be stressing them a little more than i would like. Hope it gets reduced soon.
|
||
ID: 2507 | Rating: 0 | rate: / | ||
The 'ancientness' of the disk would explain the failure. Modern SATA disks have a very low failure rate, on the order of 1 in 100,000 hours. An example is the SATA disks in the RAID system of our 128-CPU cluster here at UTEP. None of the disks has failed in more than a year now, and they are hammered on continuously (much more than a desktop disk will ever experience).
Within a few days of running docking again i blew up another hard disk. So it may be stressing them a little more than i would like. Hope it gets reduced soon. ____________ D@H the greatest project in the world... a while from now! |
||
ID: 2512 | Rating: 0 | rate: / | ||
Yeah, I have some hardware I should be ashamed of lol. That same machine has a NIC with both 10Mb Ethernet and a BNC connector, but it works so I use it lol. This is the third small HDD I've lost in this machine in the last few months, and my spares are slowly getting newer as the old stuff finally bites the dust. I wouldn't mind a reduction in disk writes anyway, as it will protect some of my slightly more modern 4-40 GB PATA disks. The SATA disk I have hasn't missed a beat yet, nor do I expect it would any time soon.
|
||
ID: 2513 | Rating: 0 | rate: / | ||
Any news about a fix for windows? |
||
ID: 2621 | Rating: 0 | rate: / | ||
Windows, Mac, Linux: all the same problem: our checkpointing (which we are working on) and the amount of log entries we write to charmm.out (which we need because we are in alpha).
Any news about a fix for windows? ____________ D@H the greatest project in the world... a while from now! |
||
ID: 2622 | Rating: 0 | rate: / | ||
Nothing changed with v5.05, my harddisk is a spaceshuttle :o |
||
ID: 2992 | Rating: 0 | rate: / | ||
That doesn't make sense, since we reduced log writing by a factor of at least 50. I will be out of town the rest of the week, but will ask Memo to check it tomorrow.
Nothing changed with v5.05, my harddisk is a spaceshuttle :o ____________ D@H the greatest project in the world... a while from now! |
||
ID: 2994 | Rating: 0 | rate: / | ||
Here is a short debug run of the Windows app v5.05 on BOINC 5.8.15, for better understanding:
|
||
ID: 3003 | Rating: 0 | rate: / | ||
Thanks. We'll check it out!!
Here a short debugging of windows app v5.05 for better understanding on Boinc 5.8.15: ____________ D@H the greatest project in the world... a while from now! |
||
ID: 3007 | Rating: 0 | rate: / | ||
"Nothing changed with v5.05, my harddisk is a spaceshuttle :o"

I concur. 36 minutes and over 400,000 I/O reads. The I/O WRITES, however, are only 9,281.

The I/O reads need to be looked at. |
||
ID: 3032 | Rating: 0 | rate: / | ||
"That doesn't make sense since we reduced log writing by a factor 50 at least. I will be out of town the rest of the week, but will ask Memo to check it tomorrow."

It's not the I/O WRITES, it's the READS :-D (at least on my rig)

Hope that clarifies. |
||
ID: 3033 | Rating: 0 | rate: / | ||
Same here. I looked at task manager on XP, and a WU about 75% complete shows charmm_5.5_windows_intelx86 at nearly 1 million reads totaling over 7 GB from disk, while it has only written a bit over 11 MB. Memory size is 12744 KB and peak memory is the same. Memory delta is staying at zero, and page faults total 3357 and stay constant. VM size is constant at 64212. It shows 4 threads running, and the handle count is usually 65 with an occasional 66. Something called I/O Other shows about half a million, and I/O Other bytes is over 7 million.
|
||
ID: 3035 | Rating: 0 | rate: / | ||
On checking my AMD 4800+ Windows machine I found that I was in a similar boat to David and Aaron.
|
||
ID: 3044 | Rating: 0 | rate: / | ||
You will see many reads, because Charmm is an interpreter for a script file that implements the docking algorithm that we use. This script (ending in .inp in your projects/docking.utep.edu directory) is about a megabyte in size. This does not mean that we read it from your physical disk continuously, though: every operating system has something called a buffer cache, where data is placed the first time it is read from disk and then kept as long as there is space and the data doesn't change. If you use the iostat utility on Linux, you will see that the number of physical reads/sec is basically 0 after a couple of seconds of running time. The number of reads from the cache won't be zero, but they will be done very fast, because this cache resides in memory. I am sure that even Windows will have some sort of buffer cache, although I don't know what it is called (I'm a unix guy :-)
On checking my Amd 4800+ Windows machine I found that I was in a similar boat to David and Aaron. ____________ D@H the greatest project in the world... a while from now! |
||
ID: 3082 | Rating: 0 | rate: / | ||
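The buffer cache behaviour described above can be observed with a small experiment. The sketch below is an illustration only, not part of the project's code: it writes a dummy file of about a megabyte (the hypothetical name `script.inp` merely echoes the .inp script mentioned in the post), reads it back many times, and records how long each pass takes. Note that a file that has just been written is normally already in the cache, so even the first pass is likely to be served from memory; seeing a genuinely cold first read requires a file that has not been touched recently.

```python
import os
import tempfile
import time


def timed_reads(path, repeats):
    """Read the whole file `repeats` times.

    Returns (total bytes read, list of seconds taken per pass).
    """
    timings = []
    total = 0
    for _ in range(repeats):
        start = time.perf_counter()
        with open(path, "rb") as f:
            total += len(f.read())
        timings.append(time.perf_counter() - start)
    return total, timings


def demo(size=1_000_000, repeats=50):
    """Create a dummy file of `size` bytes and read it `repeats` times."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "script.inp")  # hypothetical file name
        with open(path, "wb") as f:
            f.write(os.urandom(size))
        return timed_reads(path, repeats)


if __name__ == "__main__":
    total, timings = demo()
    print(f"read {total} bytes in {len(timings)} passes")
    print(f"slowest pass: {max(timings):.6f} s, fastest: {min(timings):.6f} s")
```

The process still issues every read request, so a per-process read counter keeps climbing even when the physical disk is idle.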
"You will see many reads, because Charmm is an interpreter for a script file that implements the docking algorithm that we use. This script (ending in .inp in your projects/docking.utep.edu directory) is about a megabyte in size. This does not mean that we read it from your physical disk continuously, though: every operating system has something called a buffer cache, where data is placed the first time it is read from disk and then kept as long as there is space and the data doesn't change. If you use the iostat utility on Linux, you will see that the number of physical reads/sec is basically 0 after a couple of seconds of running time. The number of reads from the cache won't be zero, but they will be done very fast, because this cache resides in memory. I am sure that even Windows will have some sort of buffer cache, although I don't know what it is called (I'm a unix guy :-)"

May want to sticky this or put it into the FAQ. I could see people wondering why this app is 'seemingly' doing 8 gigabytes of disk reading per WU. |
||
ID: 3093 | Rating: 0 | rate: / | ||
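The figures quoted in this thread can be checked against each other with some rough arithmetic. The sketch below rests on an assumption that is not stated outright anywhere above: that roughly one megabyte is re-read once per docking trial. The trial count (80x80) and the "about a megabyte" size come from the project's replies, and the "nearly 1 million reads totaling over 7 GB" comes from a volunteer's report; the function names are made up for illustration.

```python
TRIALS_PER_RUN = 80 * 80      # docking trials per run, as stated in the thread
BYTES_PER_TRIAL = 1_000_000   # ASSUMPTION: about 1 MB re-read on each trial


def estimated_read_bytes(trials=TRIALS_PER_RUN, bytes_per_trial=BYTES_PER_TRIAL):
    """Total bytes requested over one run under the assumption above."""
    return trials * bytes_per_trial


def average_read_size(total_bytes, read_ops):
    """Average number of bytes per read request."""
    return total_bytes / read_ops


if __name__ == "__main__":
    print(estimated_read_bytes() / 1e9, "GB per run (estimate)")
    print(average_read_size(7_000_000_000, 1_000_000), "bytes per read (reported)")
```

Under that assumption the estimate is 6.4 GB per run, the same order of magnitude as the 7 to 8 GB volunteers reported, and the reported totals work out to about 7 KB per read request.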
I just stuck it in the FAQ 1 hour ago :-)
____________ D@H the greatest project in the world... a while from now! |
||
ID: 3094 | Rating: 0 | rate: / | ||
Charmm 5.07 for windows:
|
||
ID: 3131 | Rating: 0 | rate: / | ||
Where are you seeing this. I just looked at the _0 through _3 , std???dae.txt and std???gui.txt files for a 90% complete WU (the real files not the windows soft links) and I'm not seeing any excessive output like that in them. This is on Vista.
|
||
ID: 3134 | Rating: 0 | rate: / | ||
"Where are you seeing this. I just looked at the _0 through _3 , std???dae.txt and std???gui.txt files for a 90% complete WU (the real files not the windows soft links) and I'm not seeing any excessive output like that in them. This is on Vista."

The debug code is only for testing; without it, it's the same problem.

You can use cc_config.xml with <task_debug>1</task_debug> to switch debugging on or off while BOINC is running. |
||
ID: 3135 | Rating: 0 | rate: / | ||
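For anyone unsure where the `<task_debug>` flag goes: in BOINC's cc_config.xml, logging flags sit inside a `<log_flags>` element under the `<cc_config>` root. The sketch below only builds that fragment as a string; the helper name `build_cc_config` is made up for illustration, and the result would still have to be saved as cc_config.xml in the BOINC data directory by hand.

```python
import xml.etree.ElementTree as ET


def build_cc_config(task_debug=True):
    """Return a minimal cc_config.xml document as a string."""
    root = ET.Element("cc_config")
    flags = ET.SubElement(root, "log_flags")
    # 1 switches task debugging output on, 0 switches it off
    ET.SubElement(flags, "task_debug").text = "1" if task_debug else "0"
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_cc_config())
```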
I just wrote this reply to Reb in an email, in answer to his question and concerns:
Where are you seeing this. I just looked at the _0 through _3 , std???dae.txt and std???gui.txt files for a 90% complete WU (the real files not the windows soft links) and I'm not seeing any excessive output like that in them. This is on Vista. ____________ D@H the greatest project in the world... a while from now! |
||
ID: 3175 | Rating: 0 | rate: / | ||
Hi,
But according to SpeedFan, the continuous checkpointing doesn't do much harm to the hard disk (that is, the temperature didn't increase, which it does when I convert a video file, for example).

But still: isn't it possible to somehow collect the data in RAM first and then write it to the hard disk every minute or every 10 minutes? I don't know what's more stressful, though: writing 100 bytes every second or 6 KBytes every minute (60 KBytes every 10 minutes)?

Cheers,
Shai
____________
My NEW BOINC-Site Why people joined BOINC Synergy... |
||
ID: 3291 | Rating: 0 | rate: / | ||
But all this checkpointing also has an adverse effect on BOINC.exe: it constantly shows anywhere from 4% up to 17% CPU usage in Task Manager. In fact, I can't even get one PC to run right, because BOINC.exe is using 23% to 25% of the CPU (Q6600 quad core). The BOINC Manager will just lock up for seconds at a time if I try to use any of the tabs.
|
||
ID: 3294 | Rating: 0 | rate: / | ||
"But all this checkpointing also has an adverse effect on BOINC.exe: it constantly shows anywhere from 4% up to 17% CPU usage in Task Manager. In fact, I can't even get one PC to run right, because BOINC.exe is using 23% to 25% of the CPU (Q6600 quad core). The BOINC Manager will just lock up for seconds at a time if I try to use any of the tabs."

Just checked that and it's true. While crunching Docking, BOINC.exe uses 3-4% CPU constantly, which it doesn't do while crunching WCG, Rosetta or Einstein, for example.
____________
My NEW BOINC-Site Why people joined BOINC Synergy... |
||
ID: 3295 | Rating: 0 | rate: / | ||
That is one of the solutions we are looking at at the moment. It's not an easy thing to do in Charmm, though, because reads and writes are done everywhere in the 300,000 lines of code, and unfortunately we didn't write it: approx. 100 devs are working on this code and everybody has their own coding style :-(
But still: Isnt it possible to somehow collect the data in RAM first and then write it to the harddisc every minute or every 10 minutes? - Dont know whats more stressful, though: Writing 100bytes every second or 6 KBytes every minute (60KBytes every 10 minutes)? ____________ D@H the greatest project in the world... a while from now! |
||
ID: 3301 | Rating: 0 | rate: / | ||
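The suggestion discussed above (collect output in RAM and write it to disk once a minute instead of once a second) can be sketched as follows. This is a generic illustration, not how Charmm or the project's wrapper does it: the class `BufferedLog`, its parameters, and the temporary file name are all hypothetical, and a fake clock is used so the ten simulated minutes run instantly.

```python
import os
import tempfile
import time


class BufferedLog:
    """Collect records in memory; append them to a file at most once per interval."""

    def __init__(self, path, interval_s=60.0, clock=time.monotonic):
        self.path = path
        self.interval_s = interval_s
        self.clock = clock
        self.pending = []
        self.last_flush = clock()
        self.flush_count = 0  # number of times the file was actually written

    def write(self, record):
        """Queue a record and flush if the interval has elapsed."""
        self.pending.append(record)
        if self.clock() - self.last_flush >= self.interval_s:
            self.flush()

    def flush(self):
        """Write all queued records in one append; no-op if nothing is queued."""
        if self.pending:
            with open(self.path, "ab") as f:
                f.write(b"".join(self.pending))
            self.pending.clear()
            self.flush_count += 1
        self.last_flush = self.clock()


def demo():
    """Simulate 10 minutes of one 100-byte record per second."""
    now = [0.0]  # fake clock, advanced manually
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "charmm.out")  # hypothetical file name
        log = BufferedLog(path, interval_s=60.0, clock=lambda: now[0])
        for _ in range(600):
            now[0] += 1.0
            log.write(b"x" * 100)
        log.flush()  # final flush so nothing is left in memory
        return log.flush_count, os.path.getsize(path)
```

In this simulation the 600 one-second records reach the disk in 10 writes of 6,000 bytes each, 60,000 bytes in total. The trade-off is that anything still held in memory is lost if the process is killed before the next flush, which matters for checkpoint data.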