64-Bit / 32 bit, SSE3 and other extensions?


Aaron Finney
Volunteer tester

Joined: Mar 23 07
Posts: 74
ID: 367
Credit: 2,409,831
RAC: 0
Message 2933 - Posted 4 Apr 2007 3:13:39 UTC

What optimizations are currently being used for the 32-bit Windows application? Any?

Are SSE, SSE2, or SSE3 being used?

Stupid question: Would it make sense to enable some standard optimizations in the 32-bit application that gets sent out to 64-bit Windows clients? Would it be wrong to assume that anyone running the 64-bit Windows client has a CPU capable of SSE2 (and similar) extensions?
____________

Andre Kerstens
Forum moderator
Project tester
Volunteer tester

Joined: Sep 11 06
Posts: 749
ID: 1
Credit: 15,199
RAC: 0
Message 2934 - Posted 4 Apr 2007 3:50:51 UTC - in response to Message ID 2933 .

No CPU-specific optimizations are used, because the application has to run on so many different processors without crashing. And yes, this is a restriction, but hardly one that can be avoided in a BOINC project.

When we have a real 64-bit CHARMM application, some common CPU extensions like SSE might be considered. But that's a while from now, since we don't have a 64-bit compile yet.

Thanks! Certainly some things to think about.
Andre

____________
D@H the greatest project in the world... a while from now!
clownius
Volunteer tester

Joined: Nov 14 06
Posts: 61
ID: 280
Credit: 2,677
RAC: 0
Message 2938 - Posted 4 Apr 2007 6:13:28 UTC

All 64-bit x86 processors support SSE2, so that optimization can safely go into a 64-bit app. I wish you luck with 64-bit, as it's what I use these days, and the wrapper, while it works, is not the best solution.
____________

Aaron Finney
Volunteer tester

Joined: Mar 23 07
Posts: 74
ID: 367
Credit: 2,409,831
RAC: 0
Message 2941 - Posted 4 Apr 2007 7:39:34 UTC - in response to Message ID 2938 .
Last modified: 4 Apr 2007 8:12:38 UTC

Er... well, couldn't you just run the same compile again, but optimized for SSE2 and whatever other extensions? Then it would be an optimized 32-bit application that is only sent to the 64-bit clients. I know, it sounds simple just to say it like that, hahaha, but...?

This would be a kind of middle ground without having a 64-bit application, as I don't think it would take many modifications just to switch the compiler to optimize for processor extensions. It still wouldn't be a 64-bit application, but it would be an improvement over the 32-bit 'wrapper' app, correct?

I know I asked this once in the BOINC forums, but when you run the compiler, there are options so that it will make the SSE2, MMX, etc. optimizations for you, right? It's all automated? To me this seems like a totally effortless way to get 10-20% faster work: all you have to do is set a few options and let the compiler run for a few hours (or however long it takes), then replace the application that gets sent to 64-bit clients with the new one, right?

If there is no real performance increase, I could let you know fairly quickly, and then we'd know whether you should run the compile for future iterations of the CHARMM app or whether it was a waste of time.
David Ball
Forum moderator
Volunteer tester

Joined: Sep 18 06
Posts: 274
ID: 115
Credit: 1,634,401
RAC: 0
Message 2942 - Posted 4 Apr 2007 7:49:25 UTC - in response to Message ID 2934 .

Just wanted to make a few comments:

1. It would probably mean more HR (homogeneous redundancy) classes.

2. Don't the Intel compilers support compiling for those features and falling back to not using them if they detect an AMD CPU, or a lack of support for the feature, at runtime? If it means a big speed increase, it might be worth doing.

3. Alternatively, the application could include 2 or 3 versions of CHARMM, and the wrapper could decide which one to execute. Everyone would download all 2 or 3 versions of CHARMM and use only one of them, but the big download would only occur when the CHARMM version changes.

4. SSE4 instructions are on the way as well.
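
The wrapper idea in point 3 could be sketched roughly as follows. This is only an illustration in Python, not the project's actual wrapper: the binary names (charmm_generic, charmm_sse2, charmm_sse3) are hypothetical, and the CPU feature list is passed in as a plain string (on Linux it could be taken from the "flags" line of /proc/cpuinfo, where SSE3 is reported as "pni").

```python
# Hypothetical build variants, ordered from most to least demanding.
# Each entry: (binary name, CPU features that binary requires).
VARIANTS = [
    ("charmm_sse3", {"sse", "sse2", "sse3"}),
    ("charmm_sse2", {"sse", "sse2"}),
    ("charmm_generic", set()),  # plain x86 build, runs everywhere
]


def parse_features(feature_line):
    """Turn a whitespace-separated feature list into a lowercase set."""
    features = {name.lower() for name in feature_line.split()}
    # Linux /proc/cpuinfo reports SSE3 as "pni"; treat both spellings alike.
    if "pni" in features:
        features.add("sse3")
    return features


def pick_variant(features):
    """Return the first (most optimized) binary whose requirements are met."""
    for binary, required in VARIANTS:
        if required <= features:  # subset test: every required feature is present
            return binary
    # Not reachable while a variant with no requirements is in the list.
    raise RuntimeError("no suitable build variant")


if __name__ == "__main__":
    # A CPU with SSE2 but without SSE3 gets the SSE2 build.
    print(pick_variant(parse_features("FPU MMX FXSR SSE SSE2")))
```

Keeping the generic build as the last entry means a machine with none of the extensions still gets a working application, which is the fallback behaviour the list above is asking about.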

Happy Crunching!

-- David

____________
The views expressed are my own.
Facts are subject to memory error :-)
Have you read a good science fiction novel lately?
Andre Kerstens
Forum moderator
Project tester
Volunteer tester

Joined: Sep 11 06
Posts: 749
ID: 1
Credit: 15,199
RAC: 0
Message 2946 - Posted 4 Apr 2007 15:57:26 UTC

I'll ask Memo to look into this. Enabling some optimization for 64-bit clients makes sense to me. We'll have to see whether we can enable specific extensions with compiler options, and whether the app will still work if an extension isn't available. But I'm sure we can figure that out by doing a bit of research :-)

Thanks
Andre
____________

clownius
Volunteer tester

Joined: Nov 14 06
Posts: 61
ID: 280
Credit: 2,677
RAC: 0
Message 2971 - Posted 10 Apr 2007 11:51:41 UTC

I wait with bated breath :) Another project to turn my C2D on 64-bit Linux loose on. Better still, if I ever finish building my C2Q, something to really let it loose on.
____________

zombie67 [MM]
Volunteer tester

Joined: Sep 18 06
Posts: 207
ID: 114
Credit: 2,817,648
RAC: 0
Message 2973 - Posted 10 Apr 2007 15:20:49 UTC

What about the Intel Mac application? No need to worry about old processor support. The *oldest* chip is the Core Solo/Duo, which means that all Intel Macs support the following:

FPU VME DE PSE TSC MSR PAE MCE CX8 APIC SEP MTRR PGE MCA CMOV PAT CLFSH DS ACPI MMX FXSR SSE SSE2 SS HTT TM SSE3 MON VMX EST TM2 TPR

At a minimum, all Intel Mac applications should be built to use SSE2, if not SSE3... right?

Same question for the PPC Mac application, as they all support AltiVec, I think.
____________
Dublin, CA
Team SETI.USA

fubared
Volunteer tester

Joined: Nov 14 06
Posts: 11
ID: 293
Credit: 57,379
RAC: 0
Message 2986 - Posted 11 Apr 2007 11:01:40 UTC - in response to Message ID 2973 .

But the Core Duo doesn't support 64-bit, so for maximum support you still need 32-bit apps.
zombie67 [MM]
Volunteer tester

Joined: Sep 18 06
Posts: 207
ID: 114
Credit: 2,817,648
RAC: 0
Message 2989 - Posted 11 Apr 2007 16:23:31 UTC - in response to Message ID 2986 .
Last modified: 11 Apr 2007 16:24:11 UTC

Not sure I follow. I didn't say anything about 64 bit.

I asked about optimization used for the Mac applications.
____________
Aaron Finney
Volunteer tester

Joined: Mar 23 07
Posts: 74
ID: 367
Credit: 2,409,831
RAC: 0
Message 2999 - Posted 12 Apr 2007 6:44:42 UTC - in response to Message ID 2989 .

I agree. There certainly aren't any Intel Mac computers around that could not take advantage of many of the same extensions. I hear that in some cases the processing time can be halved or more.

The compiles have to be made anyway, so they might as well contain these optimizations.
Dotsch
Volunteer tester

Joined: Sep 13 06
Posts: 49
ID: 75
Credit: 57,728
RAC: 0
Message 3002 - Posted 12 Apr 2007 7:21:48 UTC - in response to Message ID 2999 .
Last modified: 12 Apr 2007 7:22:56 UTC

The speedup from 64-bit and from compiler flags depends heavily on the application and on the architecture. Some applications can even run slower as 64-bit than as 32-bit.
In my experience, heavily optimized flags change the speed by about -5 to +10%, and 64-bit with good optimization by about -15 to +15%. If you get more, you are lucky...
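
For anyone comparing two builds, percentages like these can be worked out from the run time of the same work unit on the same machine. A minimal sketch in Python; the function name and the timings are made up for illustration, and "faster" is taken here to mean the gain in throughput (so a halved run time counts as 100% faster).

```python
def speedup_percent(baseline_seconds, optimized_seconds):
    """Throughput gain of the optimized build over the baseline, in percent.

    Positive means the optimized build is faster; negative means it is slower.
    """
    if baseline_seconds <= 0 or optimized_seconds <= 0:
        raise ValueError("run times must be positive")
    return (baseline_seconds / optimized_seconds - 1.0) * 100.0


if __name__ == "__main__":
    # Made-up timings for one work unit, in seconds.
    print(round(speedup_percent(3600, 3273), 1))  # roughly a 10% gain
    print(round(speedup_percent(3600, 3790), 1))  # a slowdown shows up as negative
```

A single work unit is a small sample, so averaging over several results before drawing conclusions would be the safer approach.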
zombie67 [MM]
Volunteer tester

Joined: Sep 18 06
Posts: 207
ID: 114
Credit: 2,817,648
RAC: 0
Message 3018 - Posted 13 Apr 2007 1:18:35 UTC - in response to Message ID 2973 .

Bump? Do the Intel Mac and/or the PPC Mac applications take advantage of things like SSE2 and/or AltiVec? If not, any plans to do so? Seems a waste not to, since no legacy support is required for either.
____________
fubared
Volunteer tester

Joined: Nov 14 06
Posts: 11
ID: 293
Credit: 57,379
RAC: 0
Message 3031 - Posted 13 Apr 2007 22:53:47 UTC - in response to Message ID 2989 .

Oops, sorry. I thought we were talking about 64-bit apps, as the title suggests. My bad.
Andre Kerstens
Forum moderator
Project tester
Volunteer tester

Joined: Sep 11 06
Posts: 749
ID: 1
Credit: 15,199
RAC: 0
Message 3081 - Posted 16 Apr 2007 22:27:44 UTC - in response to Message ID 3018 .


We compile the Mac Intel and PPC apps with the -O2 compiler flag, which enables a lot of code optimization. We'll look into the SSE2/AltiVec optimizations specifically and enable them if -O2 doesn't already do so.

Thanks
Andre
____________
zombie67 [MM]
Volunteer tester

Joined: Sep 18 06
Posts: 207
ID: 114
Credit: 2,817,648
RAC: 0
Message 3087 - Posted 17 Apr 2007 4:57:53 UTC - in response to Message ID 3081 .

Great! Please keep us updated!
____________
Augustine
Volunteer tester

Joined: Sep 13 06
Posts: 46
ID: 5
Credit: 143,502
RAC: 0
Message 3088 - Posted 17 Apr 2007 14:33:01 UTC - in response to Message ID 2946 .

The AMD64 ABIs for both Linux and Windows require SSE2 support. Therefore, a 64-bit application built with default settings uses SSE and SSE2 for all single- and double-precision floating-point calculations. And given that SSE and SSE2 can be taken for granted on AMD64 systems, it follows that vectorization can be used as well.

So Aaron is right to suggest that the 32-bit application sent to AMD64 clients could have SSE and SSE2 enabled, as well as vectorization.

IIUC, Docking uses the GCC compiler. If so, make sure to add "-msse2 -mfpmath=sse -ftree-vectorize" (-msse2 implies -msse) when building a 32-bit application for AMD64 clients.

The Intel compiler generates code for several processors and at run time picks the code path best suited to the features supported. However, it is known to pick the slowest code path possible when run on AMD processors.

The MS compiler, however, doesn't support vectorization, but it does support SSE and SSE2 through the option "/arch:SSE2".
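
To make the options above concrete, here is a small Python sketch that assembles the two command lines. The SSE2-related flags are the ones named above; "-m32" (GCC's switch for producing 32-bit code), "-O2", "/O2" and the compile-only switches are additions for the sake of a complete command, the source file name charmm_main.c is a placeholder, and the helper itself is hypothetical rather than part of the project's build system.

```python
# SSE2/vectorization flags as named in the post; "-m32" and "-O2" are
# assumptions added so the command describes an optimized 32-bit build.
GCC_SSE2_FLAGS = ["-m32", "-O2", "-msse2", "-mfpmath=sse", "-ftree-vectorize"]
# "/O2" is likewise an added assumption; "/arch:SSE2" is the option from the post.
MSVC_SSE2_FLAGS = ["/O2", "/arch:SSE2"]


def build_command(compiler, source="charmm_main.c"):
    """Return the argument list for an SSE2-enabled 32-bit compile step.

    `source` is a placeholder file name used only for illustration.
    """
    if compiler == "gcc":
        return ["gcc", *GCC_SSE2_FLAGS, "-c", source]
    if compiler == "msvc":
        return ["cl", *MSVC_SSE2_FLAGS, "/c", source]
    raise ValueError(f"unknown compiler: {compiler}")


if __name__ == "__main__":
    print(" ".join(build_command("gcc")))
    print(" ".join(build_command("msvc")))
```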

HTH

____________
Andre Kerstens
Forum moderator
Project tester
Volunteer tester

Joined: Sep 11 06
Posts: 749
ID: 1
Credit: 15,199
RAC: 0
Message 3089 - Posted 17 Apr 2007 18:28:02 UTC - in response to Message ID 3088 .

We only use the GCC compiler (g95 for the Fortran code) on Mac PPC. For all the other platforms we use the Intel compiler.

Thanks for the info Augustine!
Andre


____________
Tom Philippart
Volunteer tester

Joined: Dec 22 06
Posts: 17
ID: 340
Credit: 44,929
RAC: 0
Message 5103 - Posted 2 Jul 2009 19:25:11 UTC

Any update on an x64 SSE Windows app?
____________
