OpenSolaris



Thread: UFS Direct I/O


Replies: 2 - Last Post: Aug 9, 2005 11:18 AM by: iriomote
Matty
matty91@gmail.com
UFS Direct I/O
Posted: Aug 8, 2005 1:52 PM


Howdy,

While reading through Solaris Internals this weekend, I came to the
section on UFS direct I/O. The book states that random and large
sequential workloads benefit from direct I/O. Does anyone happen
to know how big a "large sequential" I/O needs to be to benefit from
direct I/O? Are there any advantages to using direct I/O with volumes
devoted to Oracle redo/undo and archive logs? I have read that
it is best to avoid direct I/O with redo/undo, since the file system
will cluster small writes, and boost total throughput (especially during
log switches). I have also read that due to the transient nature of
redo/undo, the CPU and memory resources devoted to creating the pages
would be wasted, since these pages would not be re-used for future
reads/writes. Has anyone sat down and looked at direct I/O in depth? Any
idea which workloads (if any) work best with redo/undo on UFS direct I/O
file systems? If there is a set of documentation that explains this,
please let me know.

Thanks,
- Ryan

_______________________________________________
perf-discuss mailing list
perf-discuss at opensolaris dot org



Jarod Jenson
jarod@aeysis.com
Re: UFS Direct I/O
Posted: Aug 9, 2005 6:59 AM   in response to: Matty




Matty's email at 8/8/2005 3:52 PM, said:
> Howdy,
>
> While reading through Solaris Internals this weekend, I came to the
> section on UFS direct I/O. The book states that random and large
> sequential workloads benefit from direct I/O. Does anyone happen
> to know how big a "large sequential" I/O needs to be to benefit from
> direct I/O? Are there any advantages to using direct I/O with volumes
> devoted to Oracle redo/undo and archive logs? I have read that
> it is best to avoid direct I/O with redo/undo, since the file system
> will cluster small writes, and boost total throughput (especially during
> log switches). I have also read that due to the transient nature of
> redo/undo, the CPU and memory resources devoted to creating the pages
> would be wasted, since these pages would not be re-used for future
> reads/writes. Has anyone sat down and looked at direct I/O in depth? Any
> idea which workloads (if any) work best with redo/undo on UFS direct I/O
> file systems? If there is a set of documentation that explains this,
> please let me know.
>
> Thanks,
> - Ryan

There is an easy way to think about this in the case of Oracle. Use direct I/O
anywhere Oracle uses O_DSYNC. This is a pretty good rule of thumb that will be
true 99% of the time. This means data, redo, and control files all get direct
I/O and archive does not. The presence of O_DSYNC is going to cause UFS to
"break" all of the rules you are familiar with. For instance, there is no write clustering when
O_DSYNC is combined with buffered I/O. This is the configuration I use on smallish systems
all the way up to fully loaded 25Ks.
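
The rule of thumb above can be written down as a small lookup. This is only a sketch: the file classes are the ones named in this post, the mount points in the usage section are hypothetical, and the forcedirectio / noforcedirectio strings are the UFS mount options assumed to be the mechanism for switching direct I/O per filesystem.

```python
# Rule of thumb from the post: use direct I/O wherever Oracle opens files with
# O_DSYNC. The True/False values restate the post; they are not measured here.
USES_O_DSYNC = {
    "data": True,
    "redo": True,
    "control": True,
    "archive": False,
}


def ufs_mount_option(file_class):
    """Return the UFS mount option suggested by the rule of thumb."""
    try:
        uses_dsync = USES_O_DSYNC[file_class]
    except KeyError:
        raise ValueError(f"unknown file class: {file_class!r}") from None
    return "forcedirectio" if uses_dsync else "noforcedirectio"


if __name__ == "__main__":
    # Hypothetical layout: one filesystem per class of Oracle file.
    layout = {
        "/u01/oradata": "data",
        "/u02/redo": "redo",
        "/u03/control": "control",
        "/u04/arch": "archive",
    }
    for mount_point, file_class in layout.items():
        print(f"{mount_point} ({file_class}): -o {ufs_mount_option(file_class)}")
```

Keeping each class of file on its own filesystem is what makes a per-mount choice like this possible; mixing archive destinations with redo logs on one filesystem would force a compromise.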

Be on the lookout in the (hopefully) near future for a fix to direct I/O that
will make it behave the way it really should ;) I'll give details later.

Thanks,

Jarod




iriomote

Posts: 16
From: Washington, DC USA

Registered: 8/9/05
Re: UFS Direct I/O
Posted: Aug 9, 2005 11:18 AM   in response to: Matty


Ryan:

The goodness of UFS direct I/O is highly application-specific. The benefits arise from two fundamental
factors - one being avoidance of OS page cache scaling limitations, and the other being removal
of the POSIX single-writer lock constraint - which allows multiple I/O operations to a file to occur
concurrently when a write is active. Of these factors, the latter usually has the broadest impact.

For Oracle online redo logs, the consensus is that UFS direct I/O is pretty much always a good thing.
By default, Oracle's logging uses asynchronous writes (aio_write()), with data-synchronous
completion criteria (because the logs are opened with O_DSYNC). Because these writes are
synchronous, write coalescing at the filesystem level has no opportunity to help.
Because the writes are asynchronous, they can benefit from improved throughput by removal
of the single-writer lock. (Yes - it's confusing: 'synchronous' and 'asynchronous' are not
plain-English opposites here, but rather different topics altogether. I/O that is not asynchronous is
'blocking' (e.g. pwrite()), and I/O that is not synchronous is 'deferred' (i.e. only flushed by fsync(),
maxcontig fills, or moved along by fsflush).)
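
To make the synchronous-versus-deferred axis concrete, here is a minimal Python sketch. It is illustrative only: Python's standard library has no aio_write(), so both writes below are blocking; the file names are hypothetical; and os.O_DSYNC is assumed to be available on the platform (it is on Solaris and Linux).

```python
import os
import tempfile


def write_synchronous(path, data):
    """Blocking, data-synchronous write.

    The file is opened with O_DSYNC, so pwrite() does not return until the
    data itself has been handed to stable storage.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_DSYNC, 0o600)
    try:
        return os.pwrite(fd, data, 0)
    finally:
        os.close(fd)


def write_deferred(path, data):
    """Blocking but deferred write.

    Without O_DSYNC, pwrite() returns once the data is buffered; it is only
    forced to disk by the explicit fsync() (or later by the OS).
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
    try:
        written = os.pwrite(fd, data, 0)
        os.fsync(fd)
        return written
    finally:
        os.close(fd)


if __name__ == "__main__":
    workdir = tempfile.mkdtemp()
    block = b"\xa5" * 512
    print(write_synchronous(os.path.join(workdir, "sync.dat"), block))
    print(write_deferred(os.path.join(workdir, "deferred.dat"), block))
```

Both functions block the caller, so neither shows the asynchronous axis; that would need aio_write() from C or a similar interface.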

The only downside to using UFS direct I/O for online logs comes from the archiver losing the
performance advantage of UFS filesystem pre-fetching when reading these files. However,
since the archiver uses larger I/O sizes, I'm not aware that this has ever become anyone's
constraining bottleneck. Therefore, the improved write throughput to logs with UFS direct I/O is
pretty much always a good tradeoff. There are also tradeoffs and limitations associated with the
memory management overhead underlying the OS page cache with and without filesystem buffering.
At high throughput rates, these factors can absolutely be limiting, but most folks are far more
impacted by the single-writer lock than the cost of memory management, so I consider these
impacts to be secondary.

Note that you would *never* want UFS direct I/O on log archive destinations, since the archiver
does *not* use O_DSYNC on its output files, and expects to enjoy the performance benefit of
deferred writes!

The size threshold at which UFS direct I/O would be beneficial can depend on a great many
factors - including UFS tunables; volume management factors; I/O multipathing factors; the
actual APIs used by the application; whether or not space allocation is occurring; and backend
configuration factors. For any given configuration, what's best is most reliably determined by I/O
microbenchmarking techniques. Formulating an appropriate microbenchmark requires an accurate
understanding of the actual APIs and tuning factors used by your actual application. For Oracle
logging, a correct microbenchmark would use O_DSYNC on open() and aio_write() for writing -
and the target files would be pre-allocated so that filesystem logging will not bias the results.
Assuming a high transaction rate, Oracle itself will probably coalesce log writes to 8K operations,
but for a single-stream workload of iterated single-row INSERT/COMMIT operations, log writes may be
quite small. Unfortunately, in the area of I/O microbenchmarking, errors occur quite frequently due
to inappropriate experiment design and incorrect interpretation of results - so be careful!
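
As a starting point, the shape of such a microbenchmark might look like the Python sketch below. Several things here are assumptions rather than a faithful model of Oracle: Python's standard library has no aio_write(), so this uses blocking pwrite() and therefore measures single-stream O_DSYNC write latency only, not the concurrency gained by removing the single-writer lock; the file size, I/O sizes, and write counts are arbitrary; and whether direct I/O is in effect depends on how the filesystem holding the target file is mounted, which the script does not control.

```python
import os
import tempfile
import time


def preallocate(path, size):
    """Fill the file with zeros so timed writes do not allocate space."""
    chunk = b"\0" * (1 << 20)
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        remaining = size
        while remaining > 0:
            remaining -= os.write(fd, chunk[:min(len(chunk), remaining)])
        os.fsync(fd)
    finally:
        os.close(fd)


def time_dsync_writes(path, io_size, count):
    """Issue `count` sequential O_DSYNC writes of `io_size` bytes.

    Offsets wrap around inside the preallocated file so its size never
    grows. Returns the elapsed wall-clock time in seconds.
    """
    file_size = os.path.getsize(path)
    if io_size <= 0 or io_size > file_size:
        raise ValueError("io_size must be between 1 and the file size")
    buf = b"\xa5" * io_size
    fd = os.open(path, os.O_WRONLY | os.O_DSYNC)
    try:
        offset = 0
        start = time.perf_counter()
        for _ in range(count):
            if offset + io_size > file_size:
                offset = 0
            if os.pwrite(fd, buf, offset) != io_size:
                raise OSError("short write")
            offset += io_size
        return time.perf_counter() - start
    finally:
        os.close(fd)


if __name__ == "__main__":
    # Hypothetical target; point this at the filesystem under test.
    target = os.path.join(tempfile.mkdtemp(), "redo_like.dat")
    preallocate(target, 16 << 20)
    for io_size in (512, 2048, 8192):
        elapsed = time_dsync_writes(target, io_size, 200)
        print(f"{io_size:5d} bytes: {200 / elapsed:10.1f} writes/s")
```

Running the same script against the same device mounted with and without direct I/O gives a first comparison, but the cautions above about experiment design apply to it in full.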

For each application and category of I/O, there are tradeoffs to consider in using UFS direct I/O.
As a rule, high-end scaling requires use of some storage option with the essential characteristics of
UFS direct I/O - and that would include RAW, QFS direct I/O with Q-writes, VxFS Quick I/O or VxFS
ODM. All of these should be expected to perform 'similarly' - but the UFS option is free! When moving
to one of these options from 'out-of-the-box' buffered I/O, it is typically necessary to do some Oracle
tuning to make use of the system memory that is liberated when filesystem buffering is switched off.
It is also typical that the impact of these options on backup and restore operations needs to be
properly evaluated.

The physics underlying these factors is all well-understood. The problems come in making policy
decisions around the tradeoffs associated with these factors. There is a load of misinformation
available online. Beware any posting that says "you should always use UFS direct I/O". There is
a complex set of tradeoffs here, including operational constraints and logistics of changing from
other options. The best guidance I can offer in a small space is to "make well-informed decisions
regarding these factors". To promote a better understanding of these factors, I wrote a paper a
while ago called "Oracle I/O: Supply and Demand". That paper is due for an upgrade, and I hope
to push it out this Fall - with the scope expanded to include RAC/Grid considerations and factors
affecting 'direct path' and NOLOGGING write performance. Shucks - this posting is getting way too
close to *being* a whitepaper! ;-)

Hope this helps,
-- Bob Sneed



