[Updated: 04/07/2006] More accurate information is available in a better format at:
Mixed-Language Programming and External Linkage
__________________
Mixed-Language Programming and External Linkage
__________________
It is common practice to mix code written in one programming language with
code written in another. But the developer needs to take some additional care to
make such programs work; otherwise the build ends up with link errors about
unresolved symbols. Let's discuss the problems and solutions of mixing code written
in different programming languages with a simple example.
Assume that we're writing C++ code and wish to call a C function from it.
bpte4500s001:/sunbuild1/giri/testcases/%cat greet.h
char *greet();
bpte4500s001:/sunbuild1/giri/testcases/%cat greet.c
#include "greet.h"
char *greet()
{
    return ((char *) "Hello!");
}
bpte4500s001:/sunbuild1/giri/testcases/%cc -G -o libgreet.so greet.c
bpte4500s001:/sunbuild1/giri/testcases/%ls -l libgreet.so
-rwxrwxr-x 1 build engr 2788 Jan 12 12:21 libgreet.so*
Let's try to call the C function "greet()" from a C++ program:
bpte4500s001:/sunbuild1/giri/testcases/%cat mixedcode.cpp
#include <iostream>
extern char *greet();
int main() {
    char *greeting = greet();
    std::cout << greeting << "\n";
    return (0);
}
Note:
The "extern" keyword declares a variable or function and specifies that it has
external linkage, i.e., its name is visible from files other than the one in which
it's defined.
bpte4500s001:/sunbuild1/giri/testcases/%CC -lgreet mixedcode.cpp
Undefined                       first referenced
 symbol                             in file
char*greet()                        mixedcode.o
ld: fatal: Symbol referencing errors. No output written to a.out
Though the C++ code is linked with the dynamic library "libgreet.so", which holds the
implementation of greet(), the link failed with an undefined symbol error. What
went wrong?
The reason for the link error is that a typical C++ compiler mangles (encodes) some
of the symbols (e.g., function names) to support function overloading. So the
symbol "greet" is changed to something else, depending on the mangling algorithm
implemented in the compiler, and the object file does not contain the symbol "greet"
anywhere. The symbol table section of the mixedcode.o object file confirms this.
Let's have a look at the symbol tables of libgreet.so & mixedcode.o:
bpte4500s001:/sunbuild1/giri/testcases/%elfdump -s libgreet.so
Symbol Table Section: .symtab
index value size type bind oth ver shndx name
...
[1] 0x00000000 0x00000000 FILE LOCL D 0 ABS libgreet.so
...
[37] 0x00000268 0x00000004 OBJT GLOB D 0 .rodata _lib_version
[38] 0x000102f3 0x00000000 OBJT GLOB D 0 .data1 _edata
[39] 0x00000228 0x00000028 FUNC GLOB D 0 .text greet
[40] 0x0001026c 0x00000000 OBJT GLOB D 0 .dynamic _DYNAMIC
bpte4500s001:/sunbuild1/giri/testcases/%elfdump -s mixedcode.o
Symbol Table Section: .symtab
index value size type bind oth ver shndx name
[0] 0x00000000 0x00000000 NOTY LOCL D 0 UNDEF
[1] 0x00000000 0x00000000 FILE LOCL D 0 ABS mixedcode.cpp
[2] 0x00000000 0x00000000 SECT LOCL D 0 .rodata
[3] 0x00000000 0x00000000 FUNC GLOB D 0 UNDEF __1cDstd2l6Frn0ANbasic_ostream4Ccn0ALchar_traits4Cc____pkc_2_
[4] 0x00000000 0x00000000 FUNC GLOB D 0 UNDEF __1cFgreet6F_pc_
[5] 0x00000000 0x00000000 NOTY GLOB D 0 UNDEF __1cDstdEcout_
[6] 0x00000010 0x00000050 FUNC GLOB D 0 .text main
[7] 0x00000000 0x00000000 NOTY GLOB D 0 ABS __fsr_init_value
bpte4500s001:/sunbuild1/giri/testcases/%dem __1cFgreet6F_pc_
__1cFgreet6F_pc_ == char*greet()
char*greet() has been mangled to __1cFgreet6F_pc_ by the Sun Studio 9 C++ compiler.
That's why the static linker (ld) couldn't match the reference in the object file
with the unmangled symbol "greet" in libgreet.so.
What's the solution to this problem?
The solution to this problem is to disable name mangling for the C functions, so that
we can call external C functions from C++ code. This can be done by prepending
extern "C" to the declaration of the function to be called from C++ code.
Syntax:
extern "C" <function declaration>
Or, if we have more than one C function to be called from C++, put the
function declarations within an extern "C" block:
extern "C" {
<function declaration>
<function declaration>
...
<function declaration>
}
The linkage directive extern "C" tells the compiler to inhibit the default encoding
(name mangling) of the function names it applies to.
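Notes:
1) Functions declared as extern "C" cannot overload one another: at most one function with a given name can have C linkage
2) extern "C" can only appear at global (namespace) scope; it cannot be applied to class member functions
3) extern "C" declarations are usually placed after the last #include, so that unrelated headers are not wrapped in the linkage block by accident
4) It is possible to use a linkage directive with all the functions in a file. This
is useful if we wish to use C library functions in a C++ program: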
extern "C" {
#include "mylibrary.h"
}
Please do not use extern "C" when including standard C header files, because these
header files already contain extern "C" directives.
So let's modify the source of mixedcode.cpp a bit, and recompile the program:
bpte4500s001:/sunbuild1/giri/testcases/%cat mixedcode.cpp
#include <iostream>
extern "C" char *greet();
int main() {
    char *greeting = greet();
    std::cout << greeting << "\n";
    return (0);
}
bpte4500s001:/sunbuild1/giri/testcases/%CC -lgreet mixedcode.cpp
bpte4500s001:/sunbuild1/giri/testcases/%./a.out
Hello!
It works! Let's have a look at the symbol table of mixedcode.o again:
bpte4500s001:/sunbuild1/giri/testcases/%CC -c -lgreet mixedcode.cpp
bpte4500s001:/sunbuild1/giri/testcases/%elfdump -s mixedcode.o
Symbol Table Section: .symtab
index value size type bind oth ver shndx name
[0] 0x00000000 0x00000000 NOTY LOCL D 0 UNDEF
[1] 0x00000000 0x00000000 FILE LOCL D 0 ABS mixedcode.cpp
[2] 0x00000000 0x00000000 SECT LOCL D 0 .rodata
[3] 0x00000000 0x00000000 FUNC GLOB D 0 UNDEF __1cDstd2l6Frn0ANbasic_ostream4Ccn0ALchar_traits4Cc____pkc_2_
[4] 0x00000000 0x00000000 FUNC GLOB D 0 UNDEF greet
[5] 0x00000000 0x00000000 NOTY GLOB D 0 UNDEF __1cDstdEcout_
[6] 0x00000010 0x00000050 FUNC GLOB D 0 .text main
[7] 0x00000000 0x00000000 NOTY GLOB D 0 ABS __fsr_init_value
As expected, the function name "greet" was not mangled by the C++ compiler, so the
linker could match the reference in the object file against the symbol in libgreet.so
and build the executable.
Please note that an extern "C" declaration does not specify everything that must be
done to allow C & C++ to be mixed. Name mangling is commonly part of the problem to
be solved, but it is only a part. There are other issues with mixing languages, and
they need additional steps to resolve. For example, on some systems, C & C++
functions are called in different ways. If the declaration and definition don't
match, the program may crash or show abnormal behavior. For a broader description
of the other issues and their solutions, please read "Linkage Specification" in "The
C++ Programming Language".
Suggested Reading:
1) "Linkage Specification" of "The C++ Programming Language"
2) C++ name mangling - http://technopark02.blogspot.com/2004/11/c-name-mangling.html