author: Christian Cleberg <hello@cleberg.net>, 2023-12-02 11:23:08 -0600
committer: Christian Cleberg <hello@cleberg.net>, 2023-12-02 11:23:08 -0600
commit: caccd81c3eb7954662d20cab10cc3afeeabca615
tree: 567ed10350c1ee319c178952ab6aa48265977e58 /blog/2018-11-28-cpp-compiler.org
initial commit
Diffstat (limited to 'blog/2018-11-28-cpp-compiler.org')
-rw-r--r-- blog/2018-11-28-cpp-compiler.org | 127
1 files changed, 127 insertions, 0 deletions
diff --git a/blog/2018-11-28-cpp-compiler.org b/blog/2018-11-28-cpp-compiler.org
new file mode 100644
index 0000000..2f4a8fb
--- /dev/null
+++ b/blog/2018-11-28-cpp-compiler.org
@@ -0,0 +1,127 @@
+#+date:2018-11-28
+#+title: The C++ Compiler
+
+* A Brief Introduction
+
+[[https://en.wikipedia.org/wiki/C%2B%2B][C++]] is a general-purpose programming language with object-oriented, generic, and
+functional features in addition to facilities for low-level memory manipulation.
+
+The source code, shown in the snippet below, must be compiled before it can be
+executed. The compilation process has many steps and intricacies, and this post
+is a personal exercise to learn and remember as much of it as I can.
+
+#+BEGIN_SRC cpp
+#include <iostream>
+
+int main()
+{
+ std::cout << "Hello, world!\n";
+}
+#+END_SRC
+
+* Compilation Process
+
+** An Overview
+
+Compiling C++ projects can be a frustrating task. Errors that seem to come from
+nowhere and keep your program from compiling are annoying (especially since you
+know you wrote it perfectly the first time, right?).
+
+I'm learning more and more about C++ these days and decided to write down how
+compilation works so that I can cement it in my own head. C++ is not the only
+compiled language, though; see [[https://en.wikipedia.org/wiki/Compiled_language][the Wikipedia entry for compiled languages]] for
+more examples.
+
+I'll start with a wonderful, graphical way to conceptualize the C++ compiler.
+View [[https://web.archive.org/web/20190419035048/http://faculty.cs.niu.edu/~mcmahon/CS241/Notes/compile.html][The C++ Compilation Process]] by Kurt McMahon, an NIU professor, to see the
+graphic and an explanation. The goal of the compilation process is to take the
+C++ code and produce a library (static or shared) or an executable file.
+
+** Compilation Phases
+
+Let's break down the compilation process. There are four major steps to
+compiling C++ code.
+
+*** Step 1
+
+The first step is to expand the source code file so that it contains everything
+it depends on. The C++ preprocessor handles directives such as
+=#include <iostream>= by copying in the contents of the named header file. Now,
+what does that mean? The previous example includes the =iostream= header. This
+tells the preprocessor that you want to use the =iostream= part of the standard
+library, which declares the classes and functions for input/output streams,
+such as =std::cout=. After all this, you'll end up with a temporary file that
+contains the expanded source code.
+
+In the example of the C++ code above, the contents of the =iostream= header
+would be included in the expanded code.
+
+*** Step 2
+
+After the code is expanded, the compiler comes into play. The compiler takes the
+preprocessed C++ code and converts it into assembly language for the target
+platform. You can see this in action if you head over to the [[https://godbolt.org][Compiler Explorer]]
+(also known as Godbolt), which shows C++ being converted into assembly as you
+type.
+
+For example, the =Hello, world!= code snippet above compiles into the following
+x86-64 assembly code:
+
+#+BEGIN_SRC asm
+.LC0:
+ .string "Hello, world!\n"
+main:
+ push rbp
+ mov rbp, rsp
+ mov esi, OFFSET FLAT:.LC0
+ mov edi, OFFSET FLAT:_ZSt4cout
+ call std::basic_ostream<char, std::char_traits<char> >& std::operator<< <std::char_traits<char> >(std::basic_ostream<char, std::char_traits<char> >&, char const*)
+ mov eax, 0
+ pop rbp
+ ret
+__static_initialization_and_destruction_0(int, int):
+ push rbp
+ mov rbp, rsp
+ sub rsp, 16
+ mov DWORD PTR [rbp-4], edi
+ mov DWORD PTR [rbp-8], esi
+ cmp DWORD PTR [rbp-4], 1
+ jne .L5
+ cmp DWORD PTR [rbp-8], 65535
+ jne .L5
+ mov edi, OFFSET FLAT:_ZStL8__ioinit
+ call std::ios_base::Init::Init() [complete object constructor]
+ mov edx, OFFSET FLAT:__dso_handle
+ mov esi, OFFSET FLAT:_ZStL8__ioinit
+ mov edi, OFFSET FLAT:_ZNSt8ios_base4InitD1Ev
+ call __cxa_atexit
+.L5:
+ nop
+ leave
+ ret
+_GLOBAL__sub_I_main:
+ push rbp
+ mov rbp, rsp
+ mov esi, 65535
+ mov edi, 1
+ call __static_initialization_and_destruction_0(int, int)
+ pop rbp
+ ret
+#+END_SRC
+
+*** Step 3
+
+Third, the assembly code generated by the compiler is assembled into the object
+code for the platform. Essentially, this is when the assembler takes the
+assembly code and translates it into machine code in a binary format. Many
+compilers let you stop the process at this step (with GCC and Clang, for
+example, by passing the =-c= flag). This is useful for compiling each source
+code file separately: if a single file changes later, only that file needs to
+be recompiled, which saves time.
+
+*** Step 4
+
+Finally, the object code file generated by the assembler is linked together with
+the object code files for any library functions used to produce a shared
+(dynamic) library or an executable file. The linker replaces all references to
+undefined symbols with the correct addresses.