Creation Zone

Sunday, 4 September 2005

Sun Studio C/C++: Profile Feedback Optimization II

Posted on 23:26 by Unknown
Most of the related information is already available at: Sun Studio C/C++: Profile Feedback Optimization. This blog post covers the pieces of PFO (aka Feedback Based Optimization, FBO) that were missing from the previous post.

Compiling with multiple profiles

Even though the C++ compiler options documentation does not say so explicitly, the Sun C/C++ compilers accept multiple profiles on the compile line, given as multiple -xprofile=use:<dir> options. Listing them in a single option, as in -xprofile=use:<dir>:<dir>..<dir>, results in a compilation error.

e.g.,
CC -xO4 -xprofile=use:/tmp/prof1.profile -xprofile=use:/tmp/prof2.profile driver.cpp

When the compiler encounters multiple profiles on the compile line, it merges all the data before performing optimizations based on the feedback data.
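
As a rough mental model of what merging might involve, the sketch below sums per-routine execution counts from two runs. The counter record, the merge_counts helper, and the idea that merging is a plain sum are all assumptions made for illustration; this post does not describe the compiler's actual profile format or merge algorithm.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical record: one execution count per routine. */
struct counter {
    const char *name;
    unsigned long count;
};

/*
 * Add the counts from `src` into `dst`, matching routines by name.
 * Routines that appear only in `src` are appended while room remains.
 * Returns the number of entries in `dst` after merging.
 */
size_t merge_counts(struct counter *dst, size_t ndst, size_t cap,
                    const struct counter *src, size_t nsrc)
{
    for (size_t i = 0; i < nsrc; ++i) {
        size_t j;
        for (j = 0; j < ndst; ++j) {
            if (strcmp(dst[j].name, src[i].name) == 0) {
                dst[j].count += src[i].count;
                break;
            }
        }
        if (j == ndst && ndst < cap) {
            dst[ndst++] = src[i];   /* routine seen only in `src` */
        }
    }
    return ndst;
}
```

In this model, a routine exercised in both runs ends up with the combined count, and a routine exercised in only one run keeps the count from that run.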

Building patches (contd.)

In general, it is recommended to collect fresh profile data whenever the source code changes. However, that may not be feasible for very large applications built with feedback optimization. So organizations tend to skip feedback data collection when the changes are limited to a few lines (quick fixes), and collect the data again once the quick fixes have accumulated enough to release a patch cluster (aka fix pack). Normally a fix pack contains the binaries for the entire product, and all the old binaries are replaced with the new ones when the patch is applied.

It is important to know how a simple change in the source code affects feedback optimization in the presence of old profile data. Assume that an application is linked with a library, libstrimpl.so, that implements string comparison (__strcmp) and string length calculation (__strlen).

e.g.,
% cat strimpl.h
int __strcmp(const char *, const char *);
int __strlen(const char *);

% cat strimpl.c
#include <stdlib.h>
#include "strimpl.h"

int __strcmp(const char *str1, const char *str2 ) {
    int rc = 0;

    for(;;) {
        rc = *str1 - *str2;
        if(rc != 0 || *str1 == 0) {
            return (rc);
        }
        ++str1;
        ++str2;
    }
}

int __strlen(const char *str) {
    int length = 0;

    for(;;) {
        if (*str == 0) {
            return (length);
        } else {
            ++length;
            ++str;
        }
    }
}

% cat driver.c
#include <stdio.h>
#include "strimpl.h"

int main() {
    int i;

    for (i = 0; i < 50; ++i) {
        printf("\nstrcmp(pod, podcast) = %d", __strcmp("pod", "podcast"));
        printf("\nstrlen(Solaris10) = %d", __strlen("Solaris10"));
    }

    return (0);
}

Now let's assume that the driver was built with the feedback data, with the following commands:
cc -xO2 -xprofile=collect -G -o libstrimpl.so strimpl.c
cc -xO2 -xprofile=collect -lstrimpl -o driver driver.c
./driver
cc -xO2 -xprofile=use:driver -G -o libstrimpl.so strimpl.c
cc -xO2 -xprofile=use:driver -lstrimpl -o driver driver.c

For the next release of the driver, let's say the string library is extended with a routine to reverse a given string (__strreverse). Let's see what happens if we skip the profile data collection for this library after integrating the code for the __strreverse routine. The new code can be added anywhere (top, middle or bottom) in the source file.

Case 1: Assuming the routine was added at the bottom of the existing routines

% cat strimpl.c
#include <stdlib.h>
#include "strimpl.h"

int __strcmp(const char *str1, const char *str2 ) { ... }

int __strlen(const char *str) { ... }

char *__strreverse(const char *str) {
    int i, length = 0;
    char *revstr = NULL;

    length = __strlen(str);
    /* one extra byte for the terminating NUL */
    revstr = (char *) malloc (sizeof (char) * (length + 1));

    for (i = length; i > 0; --i) {
        *(revstr + i - 1) = *(str + length - i);
    }
    *(revstr + length) = '\0';

    return (revstr);
}

% cc -xO2 -xprofile=use:driver -G -o libstrimpl.so strimpl.c
warning: Profile feedback data for function __strreverse is inconsistent. Ignored.

Adding the new code at the bottom of the source file is the recommended thing to do if we don't want to collect feedback data for the new code. The profile data for the existing routines remains consistent, so they get optimized as before. Since there is no feedback data available for the new code, the compiler simply optimizes it as it usually does without -xprofile.

Case 2: Assuming the routine was added somewhere in the middle of the source file

% cat strimpl.c
#include <stdlib.h>
#include "strimpl.h"

int __strcmp(const char *str1, const char *str2 ) { ... }

char *__strreverse(const char *str) {
    int i, length = 0;
    char *revstr = NULL;

    length = __strlen(str);
    /* one extra byte for the terminating NUL */
    revstr = (char *) malloc (sizeof (char) * (length + 1));

    for (i = length; i > 0; --i) {
        *(revstr + i - 1) = *(str + length - i);
    }
    *(revstr + length) = '\0';

    return (revstr);
}

int __strlen(const char *str) { ... }

% cc -xO2 -xprofile=use:driver -G -o libstrimpl.so strimpl.c
warning: Profile feedback data for function __strreverse is inconsistent. Ignored.
warning: Profile feedback data for function __strlen is inconsistent. Ignored.

Because the compiler keeps track of the routines by line number, introducing code in a routine makes its profile data inconsistent. And since the position of every routine underneath the newly introduced code may change, their feedback data becomes inconsistent too; the compiler ignores that profile data to avoid introducing functional errors.

The same argument holds when the new code is added at the top of the existing routines, but the outcome is even worse, since the profile data for all the routines of this object becomes unusable (inconsistent). Have a look at the warnings from the following example:

Case 3: Assuming the routine was added at the top of the source file

% cat strimpl.c
#include <stdlib.h>
#include "strimpl.h"

char *__strreverse(const char *str) {
    int i, length = 0;
    char *revstr = NULL;

    length = __strlen(str);
    /* one extra byte for the terminating NUL */
    revstr = (char *) malloc (sizeof (char) * (length + 1));

    for (i = length; i > 0; --i) {
        *(revstr + i - 1) = *(str + length - i);
    }
    *(revstr + length) = '\0';

    return (revstr);
}

int __strcmp(const char *str1, const char *str2 ) { ... }

int __strlen(const char *str) { ... }

% cc -xO2 -xprofile=use:driver -G -o libstrimpl.so strimpl.c
warning: Profile feedback data for function __strreverse is inconsistent. Ignored.
warning: Profile feedback data for function __strcmp is inconsistent. Ignored.
warning: Profile feedback data for function __strlen is inconsistent. Ignored.
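
The three cases can be pictured with a small model in which each profile record remembers the line where its routine started when the data was collected. The prof_entry record and the profile_usable check below are hypothetical and only mimic the behaviour described above; the real consistency check inside the compiler is not documented in this post.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical record: where a routine started when it was profiled. */
struct prof_entry {
    const char *name;
    int start_line;
};

/*
 * Return 1 when stored data exists for `name` and the routine still
 * starts on the same line; return 0 when the data should be ignored.
 */
int profile_usable(const struct prof_entry *prof, size_t n,
                   const char *name, int current_line)
{
    for (size_t i = 0; i < n; ++i) {
        if (strcmp(prof[i].name, name) == 0) {
            return prof[i].start_line == current_line;
        }
    }
    return 0; /* no data at all, e.g. a newly added routine */
}
```

In this model, appending __strreverse at the bottom leaves the start lines of __strcmp and __strlen untouched, so their records still match. Inserting it in the middle moves __strlen, and inserting it at the top moves both routines, which corresponds to the growing list of warnings in Cases 2 and 3.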

SPARC, x86/x64 compatibility

At this time, the way profile data is generated and processed on SPARC is not compatible with x86/x64. That is, feedback data generated by the C/C++ compilers on SPARC cannot be used on x86/x64 platforms, and vice versa.

However, there seems to be a plan in place to make them compatible in the Sun Studio 12 release.

Asynchronous profile collection

Current profile data collection requires the process to terminate in order to dump the feedback data. Also, with multithreaded processes, some profile data may be incomplete due to lock contention between threads. If the process dynamically loads and unloads libraries with dlopen() and dlclose(), it leads to indirect call profiling, which has its own share of problems in collecting the data.

Asynchronous profile collection eases all of the problems mentioned above by letting the profiler thread periodically write the profile data it is collecting. With asynchronous data collection, the probability of getting proper feedback data is high.

This feature will be available by default in Sun Studio 11, and as a patch for the Sun Studio 9 and 10 compilers. Stay tuned for the exact patch numbers for Studio 9 and 10.
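
The general idea of a background thread that periodically writes out counters can be sketched as follows. This is only an illustration in C11 with POSIX sleep(); the counters array, dump_counters, dumper_thread and the one-second interval are all made-up names and values, not the Sun Studio runtime's actual implementation or file format.

```c
#include <stdatomic.h>
#include <stdio.h>
#include <unistd.h>

#define NCOUNTERS 2

/* Hypothetical counters, bumped by instrumented code. */
static atomic_ulong counters[NCOUNTERS];
static atomic_int stop_requested;

/* Write a snapshot of all counters to `path`, replacing older data. */
int dump_counters(const char *path)
{
    FILE *fp = fopen(path, "w");
    if (fp == NULL)
        return -1;
    for (int i = 0; i < NCOUNTERS; ++i)
        fprintf(fp, "%d %lu\n", i, atomic_load(&counters[i]));
    return fclose(fp);
}

/* Thread body: dump once per second until asked to stop. */
void *dumper_thread(void *arg)
{
    const char *path = arg;
    while (!atomic_load(&stop_requested)) {
        dump_counters(path);
        sleep(1);
    }
    dump_counters(path);   /* final snapshot */
    return NULL;
}
```

A program would start this with something like pthread_create(&tid, NULL, dumper_thread, path). Because a snapshot reaches the disk every second, a long-running process that never exits cleanly still leaves usable data behind.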

Notes:
  1. When -xprofile=collect is used to compile a program for profile collection and -xprofile=use is used to compile a program with profile feedback, the source files and the compiler options other than -xprofile=collect and -xprofile=use must be identical in both compilations.

  2. If both -xprofile=collect and -xprofile=use are specified on the same command line, the rightmost -xprofile option on the command line is applied.

  3. If the code was compiled with the -g or -g0 option, the er_src utility shows how the compiler is optimizing with the feedback data. Here's how: Sun Studio C/C++: Annotated listing (compiler commentary) with er_src
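
The "rightmost option wins" rule from note 2 can be modelled with a few lines of C. The effective_xprofile helper below is hypothetical and covers only that one rule for a command line that mixes collect and use; it is not the compiler's real option parser.

```c
#include <stddef.h>
#include <string.h>

/*
 * Return the last argument that begins with "-xprofile=", or NULL
 * when none is present. argv[0] is assumed to be the compiler name.
 */
const char *effective_xprofile(int argc, char *const argv[])
{
    const char *last = NULL;
    for (int i = 1; i < argc; ++i) {
        if (strncmp(argv[i], "-xprofile=", 10) == 0)
            last = argv[i];
    }
    return last;
}
```

For a command line such as cc -xO2 -xprofile=collect -xprofile=use:driver driver.c, this helper returns -xprofile=use:driver, so under the rule in note 2 the compilation would use the existing feedback data rather than collect new data.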

Acknowledgements:
Chris Aoki, Sun Microsystems