Replies: 9 - Last Post: Aug 3, 2007 3:16 AM by: fintanr

tunla
RAM throughput - how can it be measured ?
Posted: Jul 30, 2007 4:10 PM
To: Communities » performance » discuss
Hi,
I have given myself a new computer :-) The old one was an Athlon 64 X2 4800+ at 2.4 GHz with DDR400 (PC3200) RAM; the new one is an Athlon 64 X2 6000+ at 3.0 GHz with DDR2-800 (PC2-6400) RAM.
This should mean that I have twice the memory throughput, but the CAS latency, counted in clock cycles, is also doubled: DDR400 RAM has CL2 while DDR2-800 RAM has CL4.
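The arithmetic behind that expectation can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a 64-bit (8-byte) memory channel and the standard I/O clocks of 200 MHz for DDR400 and 400 MHz for DDR2-800, neither of which is given in the post. Since the CL figure is counted in clock cycles and the clock doubles as well, the latency in nanoseconds comes out the same for both.

```python
# Peak transfer rate and CAS latency in nanoseconds for one memory channel.
# Assumptions (not from the post): a 64-bit (8-byte) channel, and I/O clocks
# of 200 MHz for DDR400 and 400 MHz for DDR2-800.

def peak_mb_per_s(mega_transfers_per_s, bus_bytes=8):
    """Peak rate of one channel in MB/s (1 MB = 10**6 bytes)."""
    return mega_transfers_per_s * bus_bytes


def cas_latency_ns(cas_cycles, io_clock_mhz):
    """Convert a CAS latency given in clock cycles to nanoseconds."""
    return cas_cycles * 1000.0 / io_clock_mhz


for name, mt_s, clock_mhz, cl in [("DDR400", 400, 200, 2),
                                  ("DDR2-800", 800, 400, 4)]:
    print(f"{name}: {peak_mb_per_s(mt_s)} MB/s per channel, "
          f"CL{cl} = {cas_latency_ns(cl, clock_mhz):.1f} ns")
```

These are theoretical peaks per channel; a measured figure will be lower.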
I realise that I have never really measured RAM throughput on Unix. We have lots of tools to measure disk throughput and CPU load, and trapstat to measure TLB misses. But is there any way to get a figure for data throughput to and from RAM?
Regards //Lars

Rayson Ho
rayrayson@gmail.com
Re: RAM throughput - how can it be measured ?
Posted: Jul 30, 2007 6:25 PM
in response to: tunla
Take a look at the "STREAM benchmark":
http://www.cs.virginia.edu/stream/
Rayson

tunla
Re: RAM throughput - how can it be measured ?
Posted: Jul 31, 2007 12:54 AM
in response to: Rayson Ho
Thank you for the STREAM benchmark reference.
I compiled stream.c with "cc -fast -o stream stream.c" using the Studio 12 compiler on SNV_67, with no warnings or errors.
The results for my older PC with the Athlon 64 X2 4800+ and DDR400 RAM are:
-------------------------------------------------------------
Function      Rate (MB/s)   Avg time     Min time     Max time
Copy:           2301.1752       0.0142       0.0139       0.0145
Scale:          2229.3462       0.0145       0.0144       0.0147
Add:            2580.2267       0.0189       0.0186       0.0194
Triad:          2609.1122       0.0186       0.0184       0.0190
-------------------------------------------------------------
and the results for the new box, running SNV_69 with the Athlon 64 X2 6000+ and DDR2-800 RAM, are:
-------------------------------------------------------------
Function      Rate (MB/s)   Avg time     Min time     Max time
Copy:           4475.4722       0.0072       0.0072       0.0076
Scale:          4085.3004       0.0079       0.0078       0.0080
Add:            4878.4870       0.0099       0.0098       0.0099
Triad:          4924.0952       0.0098       0.0097       0.0102
-------------------------------------------------------------
So yes, thanks to the STREAM benchmark I have now sort of proved that the new DDR2 memory controller on the socket AM2 Athlons very nearly doubles memory throughput.
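To put a number on "very nearly doubles", the per-kernel ratios can be computed directly from the two tables. The small Python sketch below uses only the rates posted above; the function name is made up for the illustration. The ratios come out at roughly 1.8x to 1.9x. Note that the two runs also differ in CPU clock (2.4 vs 3.0 GHz) and OS build (SNV_67 vs SNV_69), so the memory is not the only variable.

```python
# STREAM rates in MB/s, copied from the two result tables in this post.
old = {"Copy": 2301.1752, "Scale": 2229.3462, "Add": 2580.2267, "Triad": 2609.1122}
new = {"Copy": 4475.4722, "Scale": 4085.3004, "Add": 4878.4870, "Triad": 4924.0952}


def speedups(before, after):
    """Ratio of the new rate to the old rate for each STREAM kernel."""
    return {kernel: after[kernel] / before[kernel] for kernel in before}


ratios = speedups(old, new)
for kernel, ratio in ratios.items():
    print(f"{kernel:6s} {ratio:.2f}x")
```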
Thanks again
//Lars

Rayson Ho
rayrayson@gmail.com
Re: RAM throughput - how can it be measured ?
Posted: Jul 31, 2007 6:49 AM
in response to: tunla
On 7/31/07, Lars Tunkrans <lars dot tunkrans at bredband dot net> wrote:
> So yes , Thanks to the Stream Benchmark I have now sort of proved
> that the new DDR2 memory controller on the AM2 socket Athlon's very nearly
> doubles memory throughput.
BTW, if you are really into getting the best possible Stream number for your machines, try some extra flags like: -m64 -xopenmp
* -xopenmp will use both cores of your X2s, so it may be interesting to see the results... :)
Also, large pages and -xprefetch may help, but you will end up spending lots of time to get an extra few percent.
Rayson

fintanr
Re: RAM throughput - how can it be measured ?
Posted: Jul 31, 2007 8:12 AM
in response to: Rayson Ho
Hi,
> BTW, if you are really into getting the best possible Stream number
> for your machines, try some extra flags like: -m64 -xopenmp
>
> * -xopenmp will use both cores of your X2s, so it may be interesting
> to see the results... :)

Setting OMP_NUM_THREADS to the physical processor and core count and comparing results after compiling with -xopenmp is generally an interesting metric to look at as well (the default behaviour is to rely on OMP_DYNAMIC, but I'm not sure how that evaluates cores vs. threads).

> Also, large page and -xprefetch may help, but you will end up spending
> lots of time to get an extra few %

I found -xprefetch -xprefetch_level=3 made the most difference the last time I looked, but as Rayson stated, it's a lot of experimentation for a small difference.
- Fintan

Michael Pogue
Michael.Pogue@Sun.COM
Re: RAM throughput - how can it be measured ?
Posted: Jul 31, 2007 1:16 PM
in response to: fintanr
A side comment about the Stream benchmark: it is HIGHLY sensitive to placement and length of the arrays in memory. Different compilers can place the arrays differently in memory, which can result in very different results. This can happen with different compilers, different switches on the same compiler, or even different versions of the same compiler.
So, when you run the Stream test for comparison purposes, make sure you compile once, measure twice. :-)
Mike

tunla
Re: RAM throughput - how can it be measured ?
Posted: Jul 31, 2007 3:20 PM
in response to: Michael Pogue
Well,
compiling with -m64 -xopenmp gives one or two hundred MB/s more, but the values also vary by almost the same amount between runs. So one probably has to run a series of 10 tests or more and compute the mean to see a consistent difference.
Thinking about it, there's only one memory controller on the Athlon 64, even if I use both cores to throw data at it. And the CPU is somewhere between four and six times faster than RAM. So using one core should be enough to drive RAM at top speed.
//Lars

fintanr
Re: RAM throughput - how can it be measured ?
Posted: Aug 2, 2007 1:49 AM
in response to: tunla
Hi,
> compling with -m64 -xopenmp gives one or two more hundred MB/s
> but the variation of the values between different runs can also vary almost
> the same amount. So one probably have to run a series of 10 tests or more and compute
> the mean value to see a consistent difference.
Was this with OMP_NUM_THREADS set to processor count?
You tend to need quite a few iterations. When we do runs in my group, we tend to run STREAM twenty times, average the results, then reboot, and collect at least ten more of those twenty-run averages.
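A repeated-run procedure like this is easy to script. The following Python sketch is one possible way to do it, not the harness Fintan's group uses: it assumes the binary is called ./stream (as in the compile command earlier in the thread) and that each result line starts with the kernel name followed by the rate in MB/s, as in the posted tables. The function names are invented for the example.

```python
import re
import subprocess
from statistics import mean, stdev

# Matches result lines such as "Triad:  2609.1122  0.0186  0.0184  0.0190".
RATE_LINE = re.compile(r"^(Copy|Scale|Add|Triad):\s+([0-9.]+)", re.MULTILINE)


def parse_rates(stream_output):
    """Return {kernel: rate in MB/s} from the text of one STREAM run."""
    return {name: float(rate) for name, rate in RATE_LINE.findall(stream_output)}


def collect(n_runs=20, cmd=("./stream",)):
    """Run the benchmark n_runs times and parse each run's output.

    The default command is an assumption; point it at your own binary.
    """
    runs = []
    for _ in range(n_runs):
        done = subprocess.run(list(cmd), capture_output=True, text=True, check=True)
        runs.append(parse_rates(done.stdout))
    return runs


def summarize(runs):
    """Mean and sample standard deviation per kernel (needs >= 2 runs)."""
    return {
        kernel: (mean([r[kernel] for r in runs]), stdev([r[kernel] for r in runs]))
        for kernel in runs[0]
    }
```

Calling summarize(collect(20)) would give one twenty-run average per kernel; comparing the standard deviation with the difference between two builds shows whether that difference is larger than the run-to-run noise.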
> Thinking about it theres only one MMU on the Athlon64 even if I use both
> cores to throw data at it. And the CPU is in the region of six to four times faster
> than RAM. So using one Core should be enough to drive RAM at top speed.

What size are you setting N in the STREAM code to? We pull this out of smbios info where it's available.

- Fintan

tunla
Re: RAM throughput - how can it be measured ?
Posted: Aug 2, 2007 11:23 AM
in response to: fintanr
Fintanr wrote:
>Was this with OMP_NUM_THREADS set to processor count?
>What size are you setting N in the streams code to? We pull this out of
>smbios info where its available.
Yes, I had the environment variable set.
It was the default N = 20000 setting.
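The problem size can be roughly cross-checked against the tables posted earlier in the thread. The Python sketch below assumes STREAM's usual accounting for the Copy kernel: 16 bytes moved per element (one 8-byte read plus one 8-byte write) and a rate computed as 1e-6 * bytes / minimum time. Under that assumption the reported numbers point to N of about 2,000,000 rather than 20,000. The times are printed to only four decimals, so this is an estimate and not a correction of the figure above.

```python
def estimate_n(copy_rate_mb_s, min_time_s, bytes_per_element=16):
    """Estimate STREAM's array length N from a reported Copy rate and min time.

    Assumes rate [MB/s] = 1e-6 * bytes_per_element * N / min_time.
    """
    return copy_rate_mb_s * 1e6 * min_time_s / bytes_per_element


# Copy rate and min time taken from the two tables posted in this thread.
print(round(estimate_n(2301.1752, 0.0139)))  # older box
print(round(estimate_n(4475.4722, 0.0072)))  # newer box
```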
I'll get back to this subject in a while. I now intend to get on with the reason I acquired this machine: to use it as a platform for VMware Server and to play with all the canned virtual machines that can be had around the net.
//Lars

fintanr
Re: RAM throughput - how can it be measured ?
Posted: Aug 3, 2007 3:16 AM
in response to: tunla
Hi,
>> What size are you setting N in the streams code to? We pull this out of
>> smbios info where its available.
>
> Yes I had the Env-Var set.
>
> It was the default N= 20000 setting.

Ah, what was the L2 cache size?
See the "Adjust the Problem Size" section of the STREAM reference doc for more: http://www.cs.virginia.edu/stream/ref.html
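The reason the cache size matters is that arrays which fit in cache make STREAM measure the cache and not RAM. A quick way to check a given N is sketched below in Python. The 1 MiB L2 figure is a placeholder, and the factor of 4 is an assumed rule of thumb; the reference doc linked above is the authority on the exact recommendation. The helper names are invented for the illustration, and N = 2,000,000 is included only for comparison.

```python
def stream_footprint_bytes(n, element_bytes=8, arrays=3):
    """Total memory touched by STREAM's three arrays of n doubles."""
    return n * element_bytes * arrays


def exceeds_cache(n, cache_bytes, factor=4, element_bytes=8):
    """True if each array is at least `factor` times the cache size.

    The factor of 4 is an assumed rule of thumb, not taken from the thread.
    """
    return n * element_bytes >= factor * cache_bytes


L2_BYTES = 1024 * 1024  # hypothetical: 1 MiB of L2; substitute the real value

for n in (20_000, 2_000_000):
    mib = stream_footprint_bytes(n) / 2**20
    print(f"N={n}: {mib:.2f} MiB total, large enough: {exceeds_cache(n, L2_BYTES)}")
```

With these placeholder values, N = 20,000 gives arrays that fit entirely in the assumed cache, while N = 2,000,000 does not.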
> Ill get back to this subject in a while. I now intend to get on with the reason I accuierd
> this machine. To use it as a platform for Vmware server and to play with all the
> canned virtual machines that can be had around the net.

Enjoy.
- Fintan