Diffstat (limited to 'blog/2018-11-28-cpp-compiler.org')
-rw-r--r-- | blog/2018-11-28-cpp-compiler.org | 127 |
1 files changed, 127 insertions, 0 deletions
diff --git a/blog/2018-11-28-cpp-compiler.org b/blog/2018-11-28-cpp-compiler.org
new file mode 100644
index 0000000..2f4a8fb
--- /dev/null
+++ b/blog/2018-11-28-cpp-compiler.org
@@ -0,0 +1,127 @@
+#+date:2018-11-28
+#+title: The C++ Compiler
+
+* A Brief Introduction
+
+[[https://en.wikipedia.org/wiki/C%2B%2B][C++]] is a general-purpose programming language with object-oriented, generic, and
+functional features in addition to facilities for low-level memory manipulation.
+
+The source code, shown in the snippet below, must be compiled before it can be
+executed. There are many steps and intricacies to the compilation process, and
+this post is a personal exercise to learn and remember as much information as I
+can.
+
+#+BEGIN_SRC cpp
+#include <iostream>
+
+int main()
+{
+    std::cout << "Hello, world!\n";
+}
+#+END_SRC
+
+* Compilation Process
+
+** An Overview
+
+Compiling C++ projects is a frustrating task most days. Seemingly nonexistent
+errors keeping your program from successfully compiling can be annoying
+(especially since you know you wrote it perfectly the first time, right?).
+
+I'm learning more and more about C++ these days and decided to write this
+concept down so that I can cement it even further in my own head. However, C++
+is not the only compiled language. Check out [[https://en.wikipedia.org/wiki/Compiled_language][the Wikipedia entry for compiled
+languages]] for more examples of compiled languages.
+
+I'll start with a wonderful, graphical way to conceptualize the C++ compiler.
+View [[https://web.archive.org/web/20190419035048/http://faculty.cs.niu.edu/~mcmahon/CS241/Notes/compile.html][The C++ Compilation Process]] by Kurt McMahon, an NIU professor, to see the
+graphic and an explanation. The goal of the compilation process is to take the
+C++ code and produce a static library, a shared (dynamic) library, or an executable file.
+
+** Compilation Phases
+
+Let's break down the compilation process. There are four major steps to
+compiling C++ code.
+
+*** Step 1
+
+The first step is to expand the source code file to meet all dependencies. The
+C++ preprocessor pastes in the code from every header named in a directive such as
+=#include <iostream>=. Now, what does that mean? The previous example includes
+the =iostream= header. This tells the preprocessor that you want to use the
+=iostream= part of the standard library, which contains classes and functions written in the
+core language. This specific header allows you to manipulate input/output
+streams. After all this, you'll end up with a temporary file that contains the
+expanded source code.
+
+In the example of the C++ code above, the contents of the =iostream= header would be included
+in the expanded code.
+
+*** Step 2
+
+After the code is expanded, the compiler comes into play. The compiler takes the
+C++ code and converts it into assembly language for the target
+platform. You can see this in action if you head over to the [[https://godbolt.org][Godbolt Compiler
+Explorer]], which shows C++ being converted into assembly dynamically.
+
+For example, the =Hello, world!= code snippet above compiles into the following
+assembly code:
+
+#+BEGIN_SRC asm
+.LC0:
+        .string "Hello, world!\n"
+main:
+        push    rbp
+        mov     rbp, rsp
+        mov     esi, OFFSET FLAT:.LC0
+        mov     edi, OFFSET FLAT:_ZSt4cout
+        call    std::basic_ostream<char, std::char_traits<char> >& std::operator<< <std::char_traits<char> >(std::basic_ostream<char, std::char_traits<char> >&, char const*)
+        mov     eax, 0
+        pop     rbp
+        ret
+__static_initialization_and_destruction_0(int, int):
+        push    rbp
+        mov     rbp, rsp
+        sub     rsp, 16
+        mov     DWORD PTR [rbp-4], edi
+        mov     DWORD PTR [rbp-8], esi
+        cmp     DWORD PTR [rbp-4], 1
+        jne     .L5
+        cmp     DWORD PTR [rbp-8], 65535
+        jne     .L5
+        mov     edi, OFFSET FLAT:_ZStL8__ioinit
+        call    std::ios_base::Init::Init() [complete object constructor]
+        mov     edx, OFFSET FLAT:__dso_handle
+        mov     esi, OFFSET FLAT:_ZStL8__ioinit
+        mov     edi, OFFSET FLAT:_ZNSt8ios_base4InitD1Ev
+        call    __cxa_atexit
+.L5:
+        nop
+        leave
+        ret
+_GLOBAL__sub_I_main:
+        push    rbp
+        mov     rbp, rsp
+        mov     esi, 65535
+        mov     edi, 1
+        call    __static_initialization_and_destruction_0(int, int)
+        pop     rbp
+        ret
+#+END_SRC
+
+*** Step 3
+
+Third, the assembly code generated by the compiler is assembled into the object
+code for the platform. Essentially, this is when the assembler takes the assembly
+code and translates it into machine code in a binary format. After researching
+this online, I figured out that a lot of compilers will allow you to stop
+compilation at this step. This is useful for compiling each source code
+file separately, which saves time later: if a single file changes, only that file
+needs to be recompiled.
+
+*** Step 4
+
+Finally, the object code file generated by the assembler is linked together with
+the object code files for any library functions used to produce an executable
+file or a shared (dynamic) library. The linker replaces all references to
+undefined symbols with the correct addresses.