@John
The short answer is NO, HR should not have an effect on the runtime.
For the long answer, the rest of this deals with HR, the BOINC server's scheduler and shared memory, the effects of excessive disk I/O, disk file access timestamps, and details on disk caching, including the effects of cache size, cache algorithm, interface (IDE or SATA), and NCQ.
The HR (Homogeneous Redundancy) algorithm is used on the D@H BOINC server to determine which machines have compatible CPUs when handing out work. I believe that when they validate results they look for an exact match on the returned dataset. Because different CPUs might have different floating point precision or capabilities, a particular work unit needs to be handed to machines with the same CPU characteristics so that the returned results can be validated against each other. For instance, I believe that the HR group your Xeon MacPro falls into is:
OS: Mac OS X (Darwin)
CPU Vendor: Intel
CPU Family: 6
That means that when a particular work unit is handed out, it should only go to other machines with the same characteristics. Remember when people were getting the "There is work, but it has been committed to other platforms" error when trying to get work from D@H a while back? Well, the "other platforms" it refers to are other HR groups.
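To illustrate the idea, here's a minimal sketch of what "same HR class" means in code. The struct fields and function name are my own invention for illustration, not the actual BOINC scheduler code:

// Hypothetical sketch of HR classification; names are illustrative,
// not the real BOINC scheduler code.
#include <string>

struct HostInfo {
    std::string os_name;     // e.g. "Darwin"
    std::string cpu_vendor;  // e.g. "GenuineIntel"
    int cpu_family;          // e.g. 6
};

// Two hosts may validate each other's results only if they fall in
// the same HR class, i.e. all three characteristics match.
bool same_hr_class(const HostInfo& a, const HostInfo& b) {
    return a.os_name == b.os_name &&
           a.cpu_vendor == b.cpu_vendor &&
           a.cpu_family == b.cpu_family;
}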
On the scheduler, the work units which need to be assigned are kept in RAM for fast access. This is called shared memory since it is a special section of RAM that the BOINC server software has had the operating system set aside to allow multiple programs on the BOINC server to access it. In this case, it is accessed by the programs on the BOINC server which are involved in handing out work units to clients. Since only so many work units will fit in the table in RAM, and some HR groups only have a few computers in them while we're in alpha test, the shared memory table was getting full of work units which had already been handed out to one machine and were waiting for another machine in the same HR class to come along and ask for work. They had to change the BOINC scheduler on the D@H server to rotate work units between the database and the shared memory table so that there would always be available work units for each HR class, or work units that hadn't yet been committed to an HR class.
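For anyone curious what that "special section of RAM" looks like, here is a rough sketch of the mechanism using POSIX shared memory. The segment name, slot layout, and table size are all made up for the example; the point is just that one process creates the segment and any other scheduler process can map the same table:

// Sketch of a shared-memory work-unit table; names and sizes are
// illustrative, not the actual BOINC server layout.
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

struct WorkUnitSlot {
    long wu_id;
    int  hr_class;   // 0 = not yet committed to an HR class
    bool assigned;
};

const int SLOTS = 100;  // only so many work units fit in the table

int main() {
    // One server process creates the segment...
    int fd = shm_open("/wu_table", O_CREAT | O_RDWR, 0600);
    ftruncate(fd, SLOTS * sizeof(WorkUnitSlot));

    // ...and any scheduler process can map the same table into its
    // own address space and hand out the slots it finds there.
    WorkUnitSlot* table = static_cast<WorkUnitSlot*>(
        mmap(nullptr, SLOTS * sizeof(WorkUnitSlot),
             PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0));

    munmap(table, SLOTS * sizeof(WorkUnitSlot));
    close(fd);
    shm_unlink("/wu_table");
    return 0;
}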
The reason the scheduler needed to be changed was because the standard rules for handing out work units don't allow a work unit to be handed out to other machines owned by the same user. This was for historical reasons, because of the cheating that some people did on projects like SETI to get huge scores. Because there are so few machines (Macs especially) on the project, this rule was relaxed in the D@H server's scheduler to allow different machines owned by the same user to crunch the same work unit. The only rule besides HR class now, as far as I know, is that if a machine has already crunched a particular work unit, it isn't allowed to crunch it again: if there was a problem in the CPU on that machine, it might return the exact same incorrect result again. The whole point of redundancy is to have multiple machines run the same work unit and verify that they got the same results. The number of machines a single work unit will be handed out to, and the number of them that have to return a matching result before that result is considered reliable, is configurable. Currently, a single work unit is handed out to at least 3 machines (more if one or more of them errors out on the work unit) and at least 2 of the machines must return matching results for the result to be considered good (and for them to get points for crunching the result). If the third machine also matches (which it usually does) then the third machine also gets points for the work unit.
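The quorum logic boils down to something like the following sketch. The function and constant names are mine, and real validators compare results in project-specific ways rather than by a single hash, but the "at least 2 of 3 must match" mechanic is the same:

// Illustrative quorum check: at least 2 matching results are needed
// before a canonical result is accepted. Not the actual validator code.
#include <map>
#include <string>
#include <vector>

const int MIN_QUORUM = 2;  // matching results needed

// Returns the canonical result hash if a quorum agrees, else "".
std::string find_canonical(const std::vector<std::string>& result_hashes) {
    std::map<std::string, int> votes;
    for (const auto& h : result_hashes)
        if (++votes[h] >= MIN_QUORUM)
            return h;  // quorum reached: every host matching h gets credit
    return "";         // no quorum yet; send the WU to another host
}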
Older BOINC clients don't return the CPU Family/Model/Stepping information, so it's possible (although I'm not aware of it) that something has been added to the D@H client to gather that information and return it to the server. That would mean that, while the first work unit crunched with that version of the D@H client might fail to validate, it would also give the D@H BOINC server the information to properly classify the CPU in that machine for future HR matches. It only takes a few microseconds to gather that information, and even if that code were stuck in a loop, it wouldn't match the patterns I've seen.
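To show why gathering that information is only a microseconds-level operation, here is roughly how a client could read Family/Model/Stepping on x86 with the CPUID instruction (using the GCC/Clang intrinsic; I'm not claiming this is what the D@H client actually does):

// Sketch of gathering CPU Family/Model/Stepping via CPUID (x86).
#include <cpuid.h>
#include <cstdio>

int main() {
    unsigned eax, ebx, ecx, edx;
    __get_cpuid(1, &eax, &ebx, &ecx, &edx);  // leaf 1: processor signature
    unsigned stepping = eax & 0xF;
    unsigned model    = (eax >> 4) & 0xF;
    unsigned family   = (eax >> 8) & 0xF;
    // Newer CPUs also use extended family/model bits; the base fields
    // are shown here for brevity.
    printf("family %u, model %u, stepping %u\n", family, model, stepping);
    return 0;
}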
For instance, I believe that your machines were dropping to around 74% CPU utilization with the new D@H client. That and other data I've seen indicate that the client is re-reading some data from disk millions of times while processing a work unit. Having to wait on physical disk I/O would also explain why your CPU utilization has dropped from 100% on the older D@H client to about 74% on the new one. You have 4 cores (IIRC) and are running 4 instances of the D@H client, so they're doing so much disk I/O that they spend about 26% of the time queued up waiting for the hard drive. On a single core machine, the disk can keep up better, and some disk operations can occur asynchronously (such as updating the access timestamp on the file being read), so the CPU stays pretty much at 100% utilization. Also, if the data being read fits in the cache on the hard drive, then the only real disk activity would be actually writing the timestamp updates. Maybe with 4 cores all reading different data, since they're running different work units, the data being read exceeds the hard drive's cache and has to be physically read from the disk. Varying amounts of hard drive cache, and varying numbers of D@H instances running on multiple cores simultaneously, would also explain why some machines are actually slightly faster on 5.05 than they were on 5.04 while other machines are taking several times as long to run a work unit on 5.05 as they did on 5.04.
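Here's the suspected pattern in miniature, along with the obvious fix. This is purely my guess at what 5.05 might be doing, and the file name is hypothetical:

// Re-reading the same file on every call forces constant I/O; reading
// it once into memory keeps the CPU busy. Illustrative only.
#include <fstream>
#include <sstream>
#include <string>

std::string slow_lookup() {            // what 5.05 appears to be doing
    std::ifstream f("params.dat");     // hypothetical file, opened every call
    std::stringstream ss;
    ss << f.rdbuf();
    return ss.str();
}

const std::string& fast_lookup() {     // read once, reuse forever
    static const std::string cached = slow_lookup();
    return cached;
}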
There are several factors to consider with hard drive (aka hard disk, aka disk drive, aka disk, aka drive) caches:
1. Older drives tend to have smaller caches.
2. Different drives, and even drives with the same model number but different manufacturing dates, might have different algorithms for using the cache. Hard drives have different ways of storing data in the cache, differing amounts of memory overhead, different levels of "intelligent" prefetching, different ways of deciding when to purge data from the cache because the drive's software doesn't think it will be needed again, etc. Even if the hard drive's cache has enough memory to theoretically hold the data being read over and over again by the system, it might still be reading it from the actual disk because of any of those factors.
3. If I understand it correctly, SATA drives with NCQ (Native Command Queuing) let the host issue several commands to the drive and let the drive optimize the order in which it performs them. The writes for the timestamp update could be sent to a SATA NCQ drive and performed asynchronously.
4. IDE drives and SATA drives without NCQ can only have 1 read/write in progress on the drive at a time. If that write involves issuing multiple commands to the drive, then only the last command issued can be performed asynchronously unless the operating system has another process doing the updating. (One way to sidestep the timestamp writes entirely is sketched just after this list.)
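As promised above, here is one way an application can avoid those access-timestamp writes altogether. On Linux, the O_NOATIME open flag asks the kernel not to update atime on reads (the caller must own the file), and mounting with the "noatime" option has the same effect system-wide. This is just a workaround sketch; I have no indication that D@H does this:

// Open a file for reading without triggering atime updates where the
// platform supports it. Illustrative only.
#include <fcntl.h>
#include <unistd.h>

int open_for_crunching(const char* path) {
#ifdef O_NOATIME
    int fd = open(path, O_RDONLY | O_NOATIME);
    if (fd >= 0) return fd;          // fall through if not permitted
#endif
    return open(path, O_RDONLY);     // portable fallback
}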
With respect to D@H 5.05 on my own machines, all but one of which are single core, a D@H machine with an old 40 GB IDE drive suffered the most (IIRC, about 30 or 40 minutes slower per WU), followed by one with a newer IDE drive with an 8 MB cache, which only slowed down (execution time increased) by a few hundred seconds per WU. A cheap ($150 Walmart Christmas special) Sempron SATA machine is faster by about 100 seconds per WU, and a dual core PD 925 with a SATA NCQ drive is about 300 seconds faster per work unit.
Here are some links:
homogeneous redundancy (HR) in the D@H FAQ
"There is work, but it has been committed to other platforms" in the D@H FAQ
Shared Memory page on the D@H website. See the HR info at the bottom.
Homogeneous Redundancy (HR) in the unofficial BOINC wiki. Note that this needs to be updated to include the CPU Family/Model/Stepping data.
Redundancy and Errors in the unofficial BOINC wiki.
Happy Crunching,
-- David
____________
The views expressed are my own.
Facts are subject to memory error :-)
Have you read a good science fiction novel lately?
I couldn't have explained that better myself. Thanks for taking the time for this, David. There is definitely something tricky going on, though, and Memo and I are working to find out what is causing the runtime increase (we haven't found the exact reason yet). But I agree with David that it probably has something to do with I/O.
Thanks
Andre