Creation Zone

Tuesday, 4 October 2005

Dynamic TSB support in Solaris 10

Posted on 00:54 by Unknown
Recently, at a partner (ISV, short for Independent Software Vendor) site, I was asked to look into an unexpectedly huge performance improvement, compared to Solaris 9, when running their application on an UltraSPARC based server running Solaris 10. The application was built on Solaris 8. Although Solaris 10 has many improvements, the mileage varies from one customer application to another, depending on the nature of the application being run. The application under discussion is a userland, CPU intensive financial application written in C++.

I didn't have much information about how the application was compiled, or on which platform (hardware and software). So I set up a Sun Fire V480 as a dual-boot server, with Solaris 9 and Solaris 10 installed on two partitions, installed the ISV's application on a third partition, and ran a quick test by loading up the machine with virtual users until the average CPU consumption was about 85%. Interestingly, I didn't see the phenomenal improvement that the ISV observed, but the gain was decent enough (~2.88%) to lead me to investigate further.

Here's the trapstat output from Solaris 9 env:

cpu m size| itlb-miss %tim itsb-miss %tim | dtlb-miss %tim dtsb-miss %tim |%tim
----------+-------------------------------+-------------------------------+----
      ttl |   1339705  7.7     12031  0.5 |    869027  6.3     86899  4.7 |19.2
      ttl |   1371385  7.9     12165  0.5 |    931897  6.8     93874  5.1 |20.3
      ttl |   1261136  7.2     11227  0.5 |    862982  6.3     86420  4.7 |18.7
      ttl |   1334286  7.7     12201  0.5 |    871144  6.4     90464  4.9 |19.4
      ttl |   1423610  8.2     14101  0.6 |    957773  7.1    105544  5.7 |21.6
      ttl |   1399334  8.1     14120  0.6 |    973754  7.2    110116  6.0 |21.9
      ttl |   1478324  8.5     13310  0.6 |    975822  7.2    104689  5.7 |21.9
      ttl |   1416840  8.1     12698  0.5 |    962725  7.1     98593  5.3 |21.0
      ttl |   1464161  8.4     13149  0.6 |    974467  7.2    105842  5.8 |21.9
      ttl |   1412006  8.2     13685  0.6 |    915772  6.9    107461  5.9 |21.5

Average time spent in virtual to physical memory translations: 20.74%

trapstat output from Solaris 10 env:

cpu m size| itlb-miss %tim itsb-miss %tim | dtlb-miss %tim dtsb-miss %tim |%tim
----------+-------------------------------+-------------------------------+----
      ttl |   1449113  8.7      5045  0.2 |   1015584  7.5     35621  1.7 |18.1
      ttl |   1504522  9.0      5771  0.3 |   1056137  7.9     39809  1.9 |19.0
      ttl |   1372013  8.2      4824  0.2 |    965968  7.2     33577  1.6 |17.2
      ttl |   1366566  8.2      5194  0.2 |    988130  7.3     34719  1.6 |17.3
      ttl |   1433062  8.6      5170  0.2 |   1006544  7.4     34607  1.6 |17.9
      ttl |   1463364  8.8      5403  0.2 |   1023112  7.6     37313  1.7 |18.3
      ttl |   1356094  8.1      4904  0.2 |    979501  7.3     34212  1.6 |17.2
      ttl |   1497592  9.0      5816  0.3 |   1060080  7.9     39844  1.9 |19.0
      ttl |   1468445  8.8      6166  0.3 |   1079617  8.1     42968  2.0 |19.2
      ttl |   1505277  9.0      5737  0.3 |   1062101  7.8     39025  1.8 |18.9

Average time spent in virtual to physical memory translations: 18.21%

On Solaris 9, the OS spent 2.53% more CPU cycles serving TLB/TSB misses than on Solaris 10 (20.74% vs 18.21%). A closer look at the trapstat outputs reveals that Solaris 10 carries less of a burden serving TSB misses for data. The average dTSB miss %tim differs by about 3.64% between Solaris 9 and 10 (5.38% vs 1.74%), but Solaris 10 spent ~0.75% more cycles serving dTLB misses than Solaris 9 (7.60% vs 6.85%), which leaves a net 2.89% (i.e., 3.64% - 0.75%). Surprisingly, this number (2.89%) matches the gain (2.88%) I observed by running the same application on both Solaris 9 and 10.
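For anyone who wants to re-check the arithmetic, here is a small Python sketch that averages the %tim columns of the `ttl` rows from the two trapstat outputs above. The helper names (`parse_ttl`, `averages`) are my own and purely illustrative, and the parsing assumes the exact row layout shown in those outputs.

```python
from statistics import mean

# The "ttl" summary rows copied from the trapstat outputs above.
SOLARIS_9 = """\
ttl | 1339705 7.7 12031 0.5 | 869027 6.3 86899 4.7 |19.2
ttl | 1371385 7.9 12165 0.5 | 931897 6.8 93874 5.1 |20.3
ttl | 1261136 7.2 11227 0.5 | 862982 6.3 86420 4.7 |18.7
ttl | 1334286 7.7 12201 0.5 | 871144 6.4 90464 4.9 |19.4
ttl | 1423610 8.2 14101 0.6 | 957773 7.1 105544 5.7 |21.6
ttl | 1399334 8.1 14120 0.6 | 973754 7.2 110116 6.0 |21.9
ttl | 1478324 8.5 13310 0.6 | 975822 7.2 104689 5.7 |21.9
ttl | 1416840 8.1 12698 0.5 | 962725 7.1 98593 5.3 |21.0
ttl | 1464161 8.4 13149 0.6 | 974467 7.2 105842 5.8 |21.9
ttl | 1412006 8.2 13685 0.6 | 915772 6.9 107461 5.9 |21.5
"""

SOLARIS_10 = """\
ttl | 1449113 8.7 5045 0.2 | 1015584 7.5 35621 1.7 |18.1
ttl | 1504522 9.0 5771 0.3 | 1056137 7.9 39809 1.9 |19.0
ttl | 1372013 8.2 4824 0.2 | 965968 7.2 33577 1.6 |17.2
ttl | 1366566 8.2 5194 0.2 | 988130 7.3 34719 1.6 |17.3
ttl | 1433062 8.6 5170 0.2 | 1006544 7.4 34607 1.6 |17.9
ttl | 1463364 8.8 5403 0.2 | 1023112 7.6 37313 1.7 |18.3
ttl | 1356094 8.1 4904 0.2 | 979501 7.3 34212 1.6 |17.2
ttl | 1497592 9.0 5816 0.3 | 1060080 7.9 39844 1.9 |19.0
ttl | 1468445 8.8 6166 0.3 | 1079617 8.1 42968 2.0 |19.2
ttl | 1505277 9.0 5737 0.3 | 1062101 7.8 39025 1.8 |18.9
"""


def parse_ttl(text):
    """Return the %tim figures of every 'ttl' row as a list of dicts."""
    rows = []
    for line in text.splitlines():
        if not line.startswith("ttl"):
            continue
        # Layout: ttl | itlb-miss %tim itsb-miss %tim | dtlb-miss %tim dtsb-miss %tim |%tim
        _, instr, data, total = [part.split() for part in line.split("|")]
        rows.append({
            "itlb": float(instr[1]), "itsb": float(instr[3]),
            "dtlb": float(data[1]), "dtsb": float(data[3]),
            "total": float(total[0]),
        })
    return rows


def averages(rows):
    """Average each %tim column over all samples, rounded to 2 decimals."""
    return {key: round(mean(row[key] for row in rows), 2) for key in rows[0]}


s9 = averages(parse_ttl(SOLARIS_9))
s10 = averages(parse_ttl(SOLARIS_10))

total_diff = round(s9["total"] - s10["total"], 2)   # extra time on Solaris 9
dtsb_saving = round(s9["dtsb"] - s10["dtsb"], 2)    # dTSB time saved on Solaris 10
dtlb_cost = round(s10["dtlb"] - s9["dtlb"], 2)      # extra dTLB time on Solaris 10
net_data_gain = round(dtsb_saving - dtlb_cost, 2)

print("Solaris 9 averages :", s9)
print("Solaris 10 averages:", s10)
print("total diff:", total_diff, "dTSB saving:", dtsb_saving,
      "dTLB cost:", dtlb_cost, "net (data side):", net_data_gain)
```

Note that the net figure here only looks at the data-side columns (dTLB and dTSB), as in the text; the instruction-side columns are averaged too, in case you want to compare them.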

Solaris 10's dynamic TSB support

Since I ran the same application on the same hardware with different versions of Solaris, I can directly attribute the improvement in performance to Solaris 10. Given the drop in time spent on TSB misses, it is a no-brainer to conclude that this is the result of the dynamic TSB support introduced in Solaris 10.

On Solaris 9 and prior versions, the system allocates a fixed number of TSBs of size 128KB or 512KB at boot time, depending on the physical memory installed on the machine. Since the number is fixed, all processes have to share those TSBs. With only two supported TSB sizes, any process that needs a TSB somewhere between 128KB and 512KB, say 256KB, either experiences more misses (e.g., if its translations are held in a 128KB TSB) or wastes some memory (e.g., if they are held in a 512KB TSB).

Prior to version 10, Solaris lacked the flexibility of using the right TSB size for the right process. Recent versions of UltraSPARC chips can support TSBs of eight different sizes (8K, 16K, 32K, 64K, 128K, 256K, 512K and 1M). By sticking to only 128K and 512K TSBs, Solaris 9 and prior versions couldn't take full advantage of the hardware capability.
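To put rough numbers on those sizes, here is a small Python sketch. It assumes 16-byte TSB entries (an 8-byte tag plus 8 bytes of data) and 8KB base pages; both are assumptions on my part, not something measured in this test. The `smallest_fit` helper is hypothetical and only illustrates the 256KB example above; it is not the sizing algorithm Solaris actually uses.

```python
TSB_ENTRY_BYTES = 16         # assumption: 8-byte tag + 8-byte data per entry
BASE_PAGE_BYTES = 8 * 1024   # assumption: 8KB base page size

HW_TSB_SIZES_KB = [8, 16, 32, 64, 128, 256, 512, 1024]  # sizes the hardware supports
FIXED_TSB_SIZES_KB = [128, 512]                         # sizes used up to Solaris 9


def tsb_entries(size_kb):
    """Number of translation entries that fit in a TSB of the given size."""
    return size_kb * 1024 // TSB_ENTRY_BYTES


def tsb_reach_mb(size_kb):
    """Upper bound on memory (in MB) mapped by a full TSB of base pages."""
    return tsb_entries(size_kb) * BASE_PAGE_BYTES // (1024 * 1024)


def smallest_fit(needed_kb, available_kb):
    """Smallest available TSB size >= needed_kb, else the largest available."""
    for size in sorted(available_kb):
        if size >= needed_kb:
            return size
    return max(available_kb)


for size in HW_TSB_SIZES_KB:
    print(f"{size:5d}KB TSB: {tsb_entries(size):6d} entries, "
          f"maps up to {tsb_reach_mb(size)}MB")

needed = 256  # the 256KB example from the text
fixed = smallest_fit(needed, FIXED_TSB_SIZES_KB)
dynamic = smallest_fit(needed, HW_TSB_SIZES_KB)
print(f"need {needed}KB: fixed sizes give {fixed}KB "
      f"({fixed - needed}KB wasted), all sizes give {dynamic}KB")
```

Under these assumptions, a process that wants a 256KB TSB ends up with a 512KB one when only 128KB and 512KB are on offer, wasting 256KB, whereas the full range of hardware sizes gives an exact fit.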

Solaris 10 overcomes the drawbacks mentioned above by creating TSBs on the fly, sized to the needs of each process. Here's the corresponding RFE that fixed the issues seen up to Solaris 9: Integrate support for Dynamic TSBs.

This also makes sense of the 3% reduction in memory footprint per user that I saw in my test runs.

To complete the original story of the huge performance difference between Solaris 9 and 10: I gave them a check list to make sure they were doing an apples-to-apples comparison, but I never heard back from them. Anyway, here's the check list that I sent:

  1. On which hardware (US II/III/IV/..) was the application built?

    This is extremely important to know, because building on US II and running the binary on later processors (US III, III+, ..) can have a significant impact on the overall performance of the application. For example, I have seen a nearly 4% difference in CPU utilization with an application that was built on US II and run on a US III+ machine, under similar workloads.

  2. Is it the same binary that was run on both Solaris 9 & 10?

  3. Check the difference(s) in the run-time environments of both experiments. Were the tests conducted on the same kind of hardware? With the same number of processors? With the same load? And so on.

  4. Make sure to use the same application and OS tunables in both experiments

  5. Which version of Solaris 10 is in use to test the application?

    The later builds of Solaris (the Solaris Express builds) enable large pages for data and instructions by default, if the OS thinks doing so is beneficial. Large pages for data (MPSS) were already introduced in Solaris 9; now Solaris 11 extends them to instructions. So, if Solaris Express bits are being used (less likely, but possible), there could be an almost 13 to 15% improvement in CPU utilization (based on the trapstat data shown above) with no effort from the users.

  6. Make sure large pages are either enabled on both platforms (S9 and S10) or disabled on both

  7. If the application binary is not the same, check for changes in the application that could improve performance significantly. Also check for changes in compiler flags