Friday, 8 September 2006
Monday, 4 September 2006
Solaris 10/Oracle: Fixing ORA-27102: out of memory Error
Suppose that, as part of a database tuning effort, you increase the SGA/PGA sizes, and Oracle greets you with an ORA-27102: out of memory error message even though the system has enough free memory to serve the needs of Oracle.
SQL> startup
ORA-27102: out of memory
SVR4 Error: 22: Invalid argument
Diagnosis
$ oerr ORA 27102
27102, 00000, "out of memory"
// *Cause: Out of memory
// *Action: Consult the trace file for details
Not so helpful. Let's look at the alert log for some clues.
% tail -2 alert.log
WARNING: EINVAL creating segment of size 0x000000028a006000
fix shm parameters in /etc/system or equivalent
Oracle is trying to create a 10G shared memory segment (the size depends on the SGA/PGA sizes), but the operating system (Solaris in this example) responded with an invalid argument (EINVAL) error. There is a little hint about setting shm parameters in /etc/system.
Prior to Solaris 10, the shmsys:shminfo_shmmax parameter had to be set in /etc/system to the maximum size of a shared memory segment that can be created. 8M is the default value on Solaris 9 and prior versions, whereas 1/4th of the physical memory is the default on Solaris 10 and later. On a Solaris 10 (or later) system, it can be verified as shown below:
% prtconf | grep Mem
Memory size: 32760 Megabytes
% id -p
uid=59008(oracle) gid=10001(dba) projid=3(default)
% prctl -n project.max-shm-memory -i project 3
project: 3: default
NAME PRIVILEGE VALUE FLAG ACTION RECIPIENT
project.max-shm-memory
privileged 7.84GB - deny -
system 16.0EB max deny -
Now it is clear that the system is using the default value of roughly 8G (7.84GB) in this scenario, whereas the application (Oracle) is trying to create a memory segment (10G) larger than that. Hence the failure.
So, the solution is to configure the system with a value large enough for the shared segment being created, so Oracle succeeds in starting up the database instance.
On Solaris 9 and prior releases, this can be done by adding the following line to /etc/system, followed by a reboot for the system to pick up the new value.
set shmsys:shminfo_shmmax=0x000000028a006000
However, the shminfo_shmmax parameter was obsoleted with the release of Solaris 10, and Sun doesn't recommend setting this parameter in /etc/system even though it works as expected.
On Solaris 10 and later, this value can be changed dynamically on a per-project basis with the help of the resource control facilities. This is how we do it on Solaris 10 and later:
% prctl -n project.max-shm-memory -r -v 10G -i project 3
% prctl -n project.max-shm-memory -i project 3
project: 3: default
NAME PRIVILEGE VALUE FLAG ACTION RECIPIENT
project.max-shm-memory
privileged 10.0GB - deny -
system 16.0EB max deny -
Note that changes made with the prctl command on a running system are temporary, and will be lost when the system is rebooted. To make the changes permanent, create a project with the projadd command and associate it with the user account as shown below:
% projadd -p 102 -c 'eBS benchmark' -U oracle -G dba -K 'project.max-shm-memory=(privileged,10G,deny)' OASB
% usermod -K project=OASB oracle
Finally, make sure the project was created, using either the projects -l or the cat /etc/project command.
% projects -l
...
...
OASB
projid : 102
comment: "eBS benchmark"
users : oracle
groups : dba
attribs: project.max-shm-memory=(privileged,10737418240,deny)
% cat /etc/project
...
...
OASB:102:eBS benchmark:oracle:dba:project.max-shm-memory=(privileged,10737418240,deny)
With these changes, Oracle would start the database up normally.
SQL> startup
ORACLE instance started.
Total System Global Area 1.0905E+10 bytes
Fixed Size 1316080 bytes
Variable Size 4429966096 bytes
Database Buffers 6442450944 bytes
Redo Buffers 31457280 bytes
Database mounted.
Database opened.
Related information:
- What's New in Solaris System Tuning in the Solaris 10 Release?
- Resource Controls (overview)
- System Setup Recommendations for Solaris 8 and Solaris 9
- Man page of prctl(1)
- Man page of projadd
Addendum : Oracle RAC settings
Anonymous Bob suggested the following settings for Oracle RAC in the form of a comment for the benefit of others who run into similar issue(s) when running Oracle RAC. I'm pasting the comment as is (Disclaimer: I have not verified these settings):
Thanks for a great explanation, I would like to add one comment that will help those with an Oracle RAC installation. Modifying the default project covers oracle processes great and is all that is needed for a single instance DB. In RAC however, the CRS process starts the DB and it is a root owned process and root does not use the default project. To fix ORA-27102 issue for RAC I added the following lines to an init script that runs before the init.crs script fires.
# Recommended Oracle RAC system params
ndd -set /dev/udp udp_xmit_hiwat 65536
ndd -set /dev/udp udp_recv_hiwat 65536
# For root processes like crsd
prctl -n project.max-shm-memory -r -v 8G -i project system
prctl -n project.max-shm-ids -r -v 512 -i project system
# For oracle processes like sqlplus
prctl -n project.max-shm-memory -r -v 8G -i project default
prctl -n project.max-shm-ids -r -v 512 -i project default
So simple yet it took me a week working with Oracle and SUN to come up with that answer...Hope that helps someone out.
Bob
# posted by Blogger Bob : 6:48 AM, April 25, 2008
Technorati tags:
Solaris | Open Solaris | Oracle | troubleshooting
Monday, 14 August 2006
Solaris: Workaround to stdio's 255 open file descriptors limitation
RFE #1085341, "32-bit stdio routines should support file descriptors >255", is a 14-year-old request that explains the problem. The bug report links to a handful of other bugs which are somehow related to stdio's open file descriptor limitation.
Now the good news: the wait is finally over. Sun Microsystems has made a fix/workaround available to the community in the form of a new library called extendedFILE. If you are running Solaris Express (SX) 06/06 or later builds, your system already has the workaround. You just need to enable it to get around the 255 open file descriptors problem with stdio's API. This workaround will be part of the Update 3 release of Solaris 10, which is due in October 2006. There are some plans to backport this workaround to Solaris 9 and 10. However, there is no clear timeline for completion of this backport at the moment.
The workaround does not require any source code changes or re-compilation of the objects. You just need to increase the file descriptor limit using the limit or ulimit commands, and pre-load /usr/lib/extendedFILE.so.1 before running the application.
However, applications fiddling with the _file field of the FILE structure may not work. This is because when the extendedFILE library is pre-loaded, descriptors > 255 are stored in an auxiliary location and a fake descriptor is stored in the FILE structure's _file field. In fact, accessing the _file field has long been discouraged; and to discourage non-standard practices even further, _file has been renamed to _magic starting with SX 06/06. So applications which access _file directly, rather than through the fileno() function, may encounter compilation errors starting with S10 U3. This step is necessary to ensure that the source code is in a clean state, so the resulting object code is not vulnerable to data corruption at run time.
The following example shows the failure and the steps to work around the issue. Note that with the extendedFILE library pre-loaded, the process can open up to 65532 files excluding stdin, stdout and stderr.
* Test case (simple C program tries to open 65536 files):
% cat files.c
#include <stdio.h>
#include <stdlib.h>	/* for exit() */
#define NoOfFILES 65536
int main()
{
    char filename[10];	/* large enough for "65535.log" plus the terminating NUL */
    FILE *fds[NoOfFILES];
    int i;
    for (i = 0; i < NoOfFILES; ++i)
    {
        sprintf (filename, "%d.log", i);
        fds[i] = fopen(filename, "w");
        if (fds[i] == NULL)
        {
            printf("\n** Number of open files = %d. fopen() failed with error: ", i);
            perror("");
            exit(1);
        }
        else
        {
            fprintf (fds[i], "some string");
        }
    }
    return (0);
}
% cc -o files files.c
* Re-producing the failure:
% limit
cputime unlimited
filesize unlimited
datasize unlimited
stacksize 8192 kbytes
coredumpsize unlimited
descriptors 256
memorysize unlimited
% uname -a
SunOS sunfire4 5.11 snv_42 sun4u sparc SUNW,Sun-Fire-280R
% ./files
** Number of open files = 253. fopen() failed with error: Too many open files
* Showcasing the Solaris workaround:
% limit descriptors 5000
% limit
cputime unlimited
filesize unlimited
datasize unlimited
stacksize 8192 kbytes
coredumpsize unlimited
descriptors 5000
memorysize unlimited
% setenv LD_PRELOAD_32 /usr/lib/extendedFILE.so.1
% ./files
** Number of open files = 4996. fopen() failed with error: Too many open files
% limit descriptors 65536
% limit
cputime unlimited
filesize unlimited
datasize unlimited
stacksize 8192 kbytes
coredumpsize unlimited
descriptors 65536
memorysize unlimited
% ./files
** Number of open files = 65532. fopen() failed with error: Too many open files
(Note that descriptors 0, 1 and 2 are used by stdin, stdout and stderr)
% ls -l 65531.log
-rw-rw-r-- 1 gmandali ccuser 11 Aug 9 12:35 65531.log
For further information about the extendedFILE library and the extensions to fopen, fdopen, popen etc., please have a look at the new and updated man pages:
extendedFILE
enable_extended_FILE_stdio
fopen
fdopen
__________
Technorati tags:
Sun | Solaris | OpenSolaris
Thursday, 20 July 2006
Java Troubleshooting & Diagnostic Guide
The guide has copious information about tools like jinfo, jmap, jstack, jconsole, jps, jstat, Heap Analysis Tool (HAT), Heap Profiler (HPROF), jdb and dbx, and discusses fatal error handling, deadlock detection, memory leak diagnosis, and diagnosing crashes in native code, compiled code, VM and HotSpot compiler threads etc. in detail. A must-have document for the majority of Java developers.
__________
Technorati tags:
Sun | java | troubleshooting
Friday, 14 July 2006
Siebel Analytics 7.8.4 support for Solaris 10
Check Siebel Support Web site for the following revised SRSP document:
System Requirements & Supported Platforms for the Siebel Business Analytics Platform
Version 7.8.4, Rev. B July 2006
Related post:
Siebel CRM 7.8 certified on Solaris 10
______________
Technorati tag: Solaris | Siebel | Analytics
Tuesday, 20 June 2006
Ubuntu 6.06: setting up the root password
From what I read and understand, the user that we create during the installation of the operating system has sudo privileges to run root-only commands. But I am pretty sure that it is inconvenient to prepend sudo to all such commands. So when a large number of root-only commands, tools or utilities need to be run, it is better to log in as the root user rather than as the normal user. The first step is to set the password for the root user. Use the sudo privileges of the default user to run the passwd root command. When it prompts for a password, enter the password of the default normal user (hint in the transcript: enter your non-root user password). When you are prompted for the new UNIX password, type in the password chosen for the root user (hint: enter new password for root). Re-type the chosen password one more time in the next step, and you are done.
% sudo passwd root
Password: (enter your non-root user password) <- default normal user's password
Enter new UNIX password: (enter new password for root) <- choose a password for the root user. It is different
from the one you typed in previous step -- and of course the default normal user's
password is not going to change after this step.
Retype new UNIX password: (re-enter new password for root)
passwd: password updated successfully
Perhaps the steps are the same for earlier versions of Ubuntu, but this is the very first time I have got my hands on an Ubuntu distro.
_______________
Technorati tags: Ubuntu | Linux
Wednesday, 14 June 2006
Sun Studio: symbol collisions revisited
Symbol collisions can be detected with dbx and Solaris' link-edit tracing facility.
Let us start with the obligatory example first.
% cat dummy.c
#include <stdio.h>
void compare (int x, int y)
{
    printf("\ndummy.c: compare(int, int)");
}
void Comparable (int a, int b)
{
    /* one message, written as two adjacent string literals */
    printf("\nNext line should print \"dummy.c: compare(int, int)\". It you see something "
           "else, it must be a symbol collision.");
    compare (a, b);
}
% cc -G -o libdummy.so dummy.c
% cat thirdparty.c
#include <stdio.h>
int compare (char x, char y)
{
    printf("\nthirdparty.c: compare(char, char)");
    return (0);
}
% cc -G -o lib3rdparty.so thirdparty.c
% export LD_LIBRARY_PATH=.:$LD_LIBRARY_PATH
% cat test.c
void Comparable(int, int);
int main()
{
    Comparable(1, 2);
    return (0);
}
% cc -o test -l3rdparty -ldummy test.c
% ./test
Next line should print "dummy.c: compare(int, int)". It you see something else, it must be a symbol collision.
thirdparty.c: compare(char, char)
1) Using dbx to detect symbol collision
% dbx test
For information about new features see `help changes'
To remove this message, put `dbxenv suppress_startup_message 7.5' in your .dbxrc
Reading test
Reading ld.so.1
Reading lib3rdparty.so
Reading libdummy.so
Reading libc.so.1
(dbx) stop in compare
More than one identifier 'compare'.
Select one of the following:
0) Cancel
1) `lib3rdparty.so`compare
2) `libdummy.so`compare
a) All
> a
dbx: warning: 'compare' has no debugger info -- will trigger on first instruction
dbx: warning: 'compare' has no debugger info -- will trigger on first instruction
Will create handlers for all 2 hits
(2) stop in compare
(3) stop in compare
(dbx) run
Running: test
(process id 4404)
Next line should print "dummy.c: compare(int, int)". It you see something else, it must be a symbol collision.stopped in compare at 0xd27a0240
0xd27a0240: compare : pushl %ebp
(dbx) where
=>[1] compare(0x1, 0x2), at 0xd27a0240
[2] Comparable(0x1, 0x2), at 0xd27702d9
[3] main(0x1, 0x8047324, 0x804732c), at 0x80506f8
proc -map shows the list of loaded objects along with their addresses.
(dbx) proc -map
Loadobject mappings for Process ID: 4404
0x08050000 /export/home/techno/C/test
0xd27a0000 /export/home/techno/C/lib3rdparty.so
0xd2770000 /export/home/techno/C/libdummy.so
0xd2690000 /lib/libc.so.1
0xd27c8000 /lib/ld.so.1 [LM_ID_LDSO]
From the above mappings, it appears that the symbol compare (stopped at 0xd27a0240) was bound to the definition in lib3rdparty.so, which is loaded at 0xd27a0000. However, it should really be bound to the definition in libdummy.so. Another way to check the mappings is to get the process id with the proc -pid command, and then use the pmap <pid> command in the dbx environment.
(dbx) proc -pid
4404
(dbx) pmap 4404
4404: /export/home/techno/C/test
08046000 8K rwx-- [ stack ]
08050000 4K r-x-- /export/home/techno/C/test
08060000 4K rwx-- /export/home/techno/C/test
D261E000 384K rw--- [ anon ]
D2680000 24K rwx-- [ anon ]
D2690000 764K r-x-- /lib/libc.so.1
D275F000 24K rw--- /lib/libc.so.1
D2765000 8K rw--- /lib/libc.so.1
D2770000 4K r-x-- /export/home/techno/C/libdummy.so
D2780000 4K rwx-- /export/home/techno/C/libdummy.so
D2790000 4K rwx-- [ anon ]
D27A0000 4K r-x-- /export/home/techno/C/lib3rdparty.so
D27B0000 4K rwx-- /export/home/techno/C/lib3rdparty.so
D27BA000 4K rwxs- [ anon ]
D27C8000 140K r-x-- /lib/ld.so.1
D27FB000 4K rwx-- /lib/ld.so.1
D27FC000 8K rwx-- /lib/ld.so.1
total 1396K
The following outputs in the dbx environment confirm that the symbol compare was actually resolved in lib3rdparty.so rather than libdummy.so.
(dbx) which compare
`lib3rdparty.so`compare
The which command prints the full qualification of a given name.
(dbx) scopes
Function compare
File "(unknown OF)"
Loadobject /export/home/techno/C/lib3rdparty.so
The scopes command prints a list of active scopes.
(dbx) status
*(2) stop in compare
(3) stop in compare
The status command prints the stop breakpoints in effect. In the above output we do not know which compare belongs to which library. The -s option of status shows the information we need.
(dbx) status -s
stop in `lib3rdparty.so`#compare
stop in `libdummy.so`#compare
2) Using Solaris' link-edit tracing to detect symbol collision
Setting the run-time linker (ld.so.1) environment variable LD_DEBUG to the token bindings causes the run-time linker to show the binding of each symbol reference to a symbol definition. The output from the run-time linker is usually overwhelming if the application is big, with a large number of symbols. By default, all this output is displayed on the standard error, mixed in with the application's own messages, so it might be a little inconvenient to read. However, the tracing data from the run-time linker can be redirected to a file using the LD_DEBUG_OUTPUT environment variable as shown below.
eg.,
% export LD_DEBUG=bindings
% export LD_DEBUG_OUTPUT=/tmp/bindings.log
% ./test
Next line should print "dummy.c: compare(int, int)". It you see something else, it must be a symbol collision.
thirdparty.c: compare(char, char)
% grep compare /tmp/bindings.log.0443*
/tmp/bindings.log.04436:04436: binding file=./libdummy.so to file=./lib3rdparty.so: symbol `compare'
Observe that the call to the symbol (function or method) compare in libdummy.so was resolved in lib3rdparty.so. It confirms the symbol conflict. This problem arises when the symbols are left in the default global scope. Use the elfdump tool to check the scope and attributes of the symbol of interest (compare in this example).
% /usr/ccs/bin/elfdump -sN.symtab libdummy.so | grep compare
[40] 0x00000280 0x00000027 FUNC GLOB D 0 .text compare
% /usr/ccs/bin/elfdump -sN.symtab lib3rdparty.so | grep compare
[41] 0x00000240 0x00000036 FUNC GLOB D 0 .text compare
Symbols with global default scope are always vulnerable to symbol conflicts, as seen in the above example, the primary reason being that anyone can interpose upon these global default symbols. To prevent others from interposing on symbols that are of no use outside the defining component (executable or shared object), make them protected with Sun Studio's -xldscope=symbolic compiler option or the __symbolic attribute. Note that these language extensions were introduced in Sun Studio 8. Refer to the Reducing Symbol Scope with Sun Studio C/C++ article for detailed information about global and symbolic scopes, and also about the linker scoping language extension. If you are using an older compiler, use the -Bsymbolic linker option to build your executables and shared libraries, thereby making the symbols protected.
eg.,
% cc -G -o libdummy.so -xldscope=symbolic dummy.c
% /usr/ccs/bin/elfdump -sN.symtab libdummy.so | grep compare
[40] 0x00000280 0x00000027 FUNC GLOB P 0 .text compare
% cc -G -o lib3rdparty.so -xldscope=symbolic thirdparty.c
% /usr/ccs/bin/elfdump -sN.symtab lib3rdparty.so | grep compare
[41] 0x00000240 0x00000036 FUNC GLOB P 0 .text compare
% cc -o test -l3rdparty -ldummy test.c
% ./test
Next line should print "dummy.c: compare(int, int)". It you see something else, it must be a symbol collision.
dummy.c: compare(int, int)
Note that the compiler option -xldscope=symbolic and the linker option -Bsymbolic make all the symbols protected, so it is not possible for others to interpose upon the symbols of such libraries or executables. If for some reason you have to leave some symbols in global default scope, use the __symbolic specifier only on those symbols which must be protected from interposers.
eg.,
% cat dummy.c
#include <stdio.h>
__symbolic void compare (int x, int y)
{
    printf("\ndummy.c: compare(int, int)");
}
void Comparable (int a, int b)
{
    /* one message, written as two adjacent string literals */
    printf("\nNext line should print \"dummy.c: compare(int, int)\". It you see something "
           "else, it must be a symbol collision.");
    compare (a, b);
}
% cc -G -o libdummy_.so dummy.c
% /usr/ccs/bin/elfdump -sN.symtab libdummy_.so | grep -i compar
[40] 0x00000280 0x00000027 FUNC GLOB P 0 .text compare
[45] 0x000002b0 0x00000037 FUNC GLOB D 0 .text Comparable
Observe that the symbol compare is protected (although global in scope), and hence cannot be overridden by another global symbol named compare. However, Comparable can still be overridden by a different implementation, due to its global default scope.
Just to make sure that the symbol was bound to the definition in the right library, you can check the mappings, scope etc. again with dbx. Link-edit tracing won't show any binding information when the symbol gets resolved within the same library.
_______________
Technorati tags: Sun Studio | Solaris | linker | Programming

