Initial commit

2018-12-31 15:26:22 +01:00
commit 0045501c5e
5 changed files with 800 additions and 0 deletions
--- a/README.md
+++ b/README.md
@@ -0,0 +1,71 @@
+# Compiler Construction
+
+|    Date    |    Deadline     |
+| ---------- | --------------- |
+| 2019-03-15 | [Example Input] |
+| 2019-04-05 | Milestone 1     |
+| 2019-05-03 | Milestone 2     |
+| 2019-05-24 | Milestone 3     |
+| 2019-06-14 | Milestone 4     |
+| 2019-06-21 | Milestone 5     |
+| 2019-06-30 | Final           |
+
+[Example Input](example_input.md)
+
+- [mC Compiler Specification](specification.md)
+- [Getting Started Code-base](https://git.uibk.ac.at/c7031162/mcc)
+- [Submission Guideline](submission.md)
+
+## Structure
+
+The ultimate goal of this course is to build a working compiler according to the given specification.
+
+You are not allowed to use code from other people participating in this course or code that has been submitted previously by somebody else.
+However, a *getting started* code-base is provided.
+
+You will be able to work on your compiler during the lab.
+I'll be present for questions all the time, yet a big part of this course is to acquire the necessary knowledge yourself.
+
+Please note that minor modifications may be made to the specification until 1 week before the final deadline.
+Therefore, double check for modifications before submitting — Git provides you the diff anyway.
+
+Apart from this, there will be one *required* submission near the beginning of the semester.
+You have to submit an additional example input, which may be added to the set of example inputs — this way the number of integration tests is extended.
+
+Furthermore, there are five *optional* milestones.
+They provide a golden thread and enable you to receive feedback, plus you get a feel for my reviewing scheme.
+
+You can work together in teams of 1–3 people. Teams may span across pro-seminar groups.
+
+## Grading
+
+Your final grade is determined mostly by the final submission.
+A positive grade for your final submission is required to pass this course.
+
+In addition to this, I'll do short QA sessions during the course which influence your final grade.
+
+Other submissions are not graded.
+
+Be sure to adhere to the specification; deviating from it (without stating a proper reason) will negatively impact your grade.
+
+### Evaluation System
+
+I'll be using a virtualised, updated Ubuntu 18.04 LTS (64 bit) to examine your submissions.
+The submitted code has to compile and run on this system.
+This information tells you which software versions I'll be using.
+
+### Absence
+
+You must not be absent more than three times to pass this course.
+You do not have to inform me of your absence.
+
+## Contacting Me
+
+If you have questions or want to know more about a certain topic, I am always glad to help.
+You can find me in room 2W05 of the ICT building.
+
+You can also contact me by email, just be sure to send it from your university account.
+Please keep your email informal and include the course number in the subject.
+Preferably use the following link.
+
+📧 [send email](mailto:alexander.hirsch@uibk.ac.at?subject=703807%20-%20)
--- a/example_input.md
+++ b/example_input.md
@@ -0,0 +1,16 @@
+# Example Input
+
+Some example inputs for the compiler are already provided.
+These examples are to be used as integration tests.
+
+Your initial task is to create another example which may be added to the set.
+
+Try to use as many features of the mC language as possible.
+The example may read from `stdin` and write to `stdout` using the built-in functions.
+
+Provide an `.stdin.txt` and `.stdout.txt` for verification purposes.
+
+The getting started code-base provides a stub for the mC compiler.
+It converts mC to C and compiles it using GCC.
+
+See [Submission Guideline](submission.md).
--- a/images/fib_ast.png
+++ b/images/fib_ast.png
--- a/specification.md
+++ b/specification.md
@@ -0,0 +1,647 @@
+# mC Compiler Specification
+
+This document describes the mC compiler as well as the mC language itself along with some requirements.
+Like a regular compiler, the mC compiler is divided into 3 main parts: front-end, back-end, and a core in-between.
+
+The front-end's task is to validate a given input using syntactic and semantic checks.
+The syntactic checking is done by the *parser* which, on success, generates an abstract syntax tree (AST).
+This tree data structure is mainly used for semantic checking, although one can also apply transformations on it.
+Moving on, the AST is translated to the compiler's intermediate representation (IR) and passed to the core.
+Invalid inputs cause errors to be reported.
+
+The core provides infrastructure for running analyses and transformations on the IR.
+These analyses and transformations are commonly used for optimisation.
+Additional data structures, like the control flow graph (CFG), are utilised for this phase.
+Next, the (optimised) IR is passed to the back-end.
+
+The back-end translates the platform *independent* IR code to platform *dependent* assembly code.
+An assembler converts this code to *object code*, which is finally crafted into an executable by the linker.
+For these last two steps, GCC is used — referred to as *back-end compiler* in this context.
+
+The mC compiler is implemented using modern C (or C++) adhering to the C11 (or C++17) standard.
+
+## Milestones
+
+1. **Parser**
+    - Inputs are accepted / rejected correctly (syntax only).
+    - Syntactically invalid inputs result in a meaningful error message containing the corresponding source location.
+    - An AST is constructed for valid inputs.
+    - The obtained AST can be printed in the DOT format (see `mc_ast_to_dot`).
+2. **Semantic checks**
+    - The compiler rejects semantically wrong inputs.
+    - Invalid inputs trigger a meaningful error message including source location information.
+    - Type checking can be traced (see `mc_type_check_trace`).
+    - Symbol tables can be viewed (see `mc_symbol_table`).
+3. **Control flow graph**
+    - Valid inputs are converted to IR.
+    - The IR can be printed (see `mc_ir`).
+    - The CFG can be printed in the DOT format (see `mc_cfg_to_dot`).
+4. **Back-end**
+    - Valid inputs are converted to IR and then to assembly code.
+    - GCC is invoked to create the final executable.
+5. **Build Infrastructure**
+    - Your code builds and tests successfully on my evaluation system.
+
+## mC Language
+
+This section defines *mC* — a simple, C-like language.
+The semantics of mC are identical to C unless specified otherwise.
+
+### Grammar
+
+The next segment presents the grammar of mC using this notation:
+
+- `#` starts a line comment
+- `,` indicates concatenation
+- `|` indicates alternation
+- `( )` indicates grouping
+- `[ ]` indicates optional parts (0 or 1)
+- `{ }` indicates repetition (1 or more)
+- `[ ]` and `{ }` can be combined to build 0 or more repetition
+- `" "` indicates a terminal string
+- `/ /` indicates a [RegEx](https://www.regular-expressions.info/)
+
+```
+# Primitives
+
+alpha            = /[a-zA-Z_]/
+
+alpha_num        = /[a-zA-Z0-9_]/
+
+digit            = /[0-9]/
+
+identifier       = alpha , [ { alpha_num } ]
+
+bool_literal     = "true" | "false"
+
+int_literal      = { digit }
+
+float_literal    = { digit } , "." , { digit }
+
+string_literal   = /"[^"]*"/
+
+
+# Operators
+
+unary_op         = "-" | "!"
+
+binary_op        = "+"  | "-" | "*" | "/"
+                 | "<"  | ">" | "<=" | ">="
+                 | "&&" | "||"
+                 | "==" | "!="
+
+
+# Types
+
+type             = "bool" | "int" | "float" | "string"
+
+
+# Literals
+
+literal          = bool_literal
+                 | int_literal
+                 | float_literal
+                 | string_literal
+
+
+# Declarations / Assignments
+
+declaration      = type , [ "[" , int_literal , "]" ] , identifier
+
+assignment       = identifier , [ "[" , expression , "]" ] , "=" , expression
+
+
+# Expressions
+
+expression       = literal
+                 | identifier , [ "[" , expression , "]" ]
+                 | call_expr
+                 | unary_op , expression
+                 | expression , binary_op , expression
+                 | "(" , expression , ")"
+
+
+# Statements
+
+statement        = if_stmt
+                 | while_stmt
+                 | ret_stmt
+                 | declaration , ";"
+                 | assignment  , ";"
+                 | expression  , ";"
+                 | compound_stmt
+
+if_stmt          = "if" , "(" , expression , ")" , statement , [ "else" , statement ]
+
+while_stmt       = "while" , "(" , expression , ")" , statement
+
+ret_stmt         = "return" , [ expression ] , ";"
+
+compound_stmt    = "{" , [ { statement } ] , "}"
+
+
+# Function Definitions / Calls
+
+function_def     = ( "void" | type ) , identifier , "(" , [ parameters ] , ")" , compound_stmt
+
+parameters       = declaration , [ { "," , declaration } ]
+
+call_expr        = identifier , "(" , [ arguments ] , ")"
+
+arguments        = expression , [ { "," , expression } ]
+
+
+# Program
+
+program          = [ { function_def } ]
+```
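The primitive rules of the grammar translate almost directly into hand-written recognisers. The following is a minimal sketch in C, not the scanner of the getting started code-base; the function names (`is_identifier`, `is_int_literal`, `is_float_literal`) are hypothetical. It mainly illustrates the difference between `[ { … } ]` (0 or more) and `{ … }` (1 or more).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* alpha = /[a-zA-Z_]/ */
static bool is_alpha(char c)
{
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || c == '_';
}

/* digit = /[0-9]/ */
static bool is_digit(char c)
{
    return c >= '0' && c <= '9';
}

/* Returns the number of leading digits in s. */
static size_t count_digits(const char *s)
{
    size_t n = 0;
    while (is_digit(s[n])) n++;
    return n;
}

/* identifier = alpha , [ { alpha_num } ] */
bool is_identifier(const char *s)
{
    assert(s != NULL);
    if (!is_alpha(s[0])) return false;
    for (size_t i = 1; s[i] != '\0'; i++) {
        if (!is_alpha(s[i]) && !is_digit(s[i])) return false;
    }
    return true;
}

/* int_literal = { digit }, so at least one digit is required */
bool is_int_literal(const char *s)
{
    assert(s != NULL);
    size_t n = count_digits(s);
    return n > 0 && s[n] == '\0';
}

/* float_literal = { digit } , "." , { digit }, digits on both sides of the dot */
bool is_float_literal(const char *s)
{
    assert(s != NULL);
    size_t whole = count_digits(s);
    if (whole == 0 || s[whole] != '.') return false;
    size_t frac = count_digits(s + whole + 1);
    return frac > 0 && s[whole + 1 + frac] == '\0';
}
```

Under these rules `3.14` is a float literal while `3.` and `.5` are not. Note that keywords and `true` / `false` also match the `identifier` rule, so a real scanner needs to give them priority.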
+
+### Comments
+
+mC supports only *C-style* comments, starting with `/*` and ending with `*/`.
+Like in C, they can span across multiple lines.
+Comments are discarded by the parser, but do not forget to take newlines into account for line numbering.
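The remark about line numbering can be sketched as follows. This is an illustrative helper for a hand-written scanner, assuming the input is held as a NUL-terminated string; `skip_comment` is a hypothetical name, and a scanner generated by `flex` would handle this differently.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Skips a comment starting at src[pos] and returns the index right after
 * its closing delimiter. Every newline inside the comment increments *line.
 * If src[pos] does not start a comment, pos is returned unchanged.
 * Assumption: an unterminated comment consumes the rest of the input;
 * a real scanner would likely report an error instead.
 */
size_t skip_comment(const char *src, size_t pos, int *line)
{
    assert(src != NULL && line != NULL);
    if (src[pos] != '/' || src[pos + 1] != '*') return pos;
    pos += 2;
    while (src[pos] != '\0') {
        if (src[pos] == '*' && src[pos + 1] == '/') return pos + 2;
        if (src[pos] == '\n') (*line)++;
        pos++;
    }
    return pos;
}
```

Keeping the line counter up to date here is what allows later error messages to point at the correct source location.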
+
+### Size Limitations
+
+Inside your compiler, use `long` and `double` to store mC's `int` / `float` literals.
+You may assume that they are big and precise enough to store the corresponding literal.
+
+Similarly for arrays, you may assume that arrays are at most `LONG_MAX` bytes long.
+
+### Special Semantics
+
+#### Boolean
+
+For mC we consider `bool` a first-class citizen, distinct from `int`.
+The operators `!`, `&&`, and `||` can only be used for booleans.
+Additionally we do *not* support short-circuit evaluation.
+
+#### Strings
+
+Strings are immutable and do not support any operation (e.g. concatenation).
+Yet, like comments, strings can span across multiple lines.
+Furthermore, they do not support escape sequences.
+
+Their sole purpose is to be used with the built-in `print` function.
+
+#### Arrays
+
+Only one-dimensional arrays with static size are supported.
+The size must be stated during declaration and is part of the type.
+The following statement declares an array of integers with 42 elements.
+
+    int[42] my_array;
+
+We do not support *any* operations on whole arrays.
+For example, the following code is *invalid*:
+
+    int[10] a;
+    int[10] b;
+    int[10] c;
+
+    c = a + b;    /* not supported */
+
+You'd have to do this via a loop, assigning every element:
+
+    int i;
+    i = 0;
+    while (i < 10) {
+        c[i] = a[i] + b[i];
+        i = i + 1;
+    }
+
+Even further, one cannot assign to a variable of array type.
+
+    c = a;    /* not supported, even though both are of type int[10] */
+
+#### Call by Value
+
+Function arguments are always passed by value.
+
+`bool`, `int`, and `float` are passed directly.
+Strings and arrays are passed via pointers.
+
+#### Type Conversion
+
+There are no type conversions, neither implicit nor explicit.
+
+An expression used as a condition (for `if` or `while`) is expected to be of type `bool`.
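The rules on booleans and type conversion could be checked along the following lines. This is only a sketch: the enum and function names are hypothetical, and two points are assumptions rather than statements of the specification, namely that relational and equality operators yield `bool`, and that equality is not defined on strings.

```c
#include <assert.h>

enum type { TYPE_INVALID, TYPE_BOOL, TYPE_INT, TYPE_FLOAT, TYPE_STRING };

enum binary_op {
    OP_ADD, OP_SUB, OP_MUL, OP_DIV,
    OP_LT, OP_GT, OP_LE, OP_GE,
    OP_AND, OP_OR,
    OP_EQ, OP_NE
};

static int is_numeric(enum type t)
{
    return t == TYPE_INT || t == TYPE_FLOAT;
}

/* Returns the result type of `lhs op rhs`, or TYPE_INVALID on a type error. */
enum type check_binary_op(enum binary_op op, enum type lhs, enum type rhs)
{
    /* No implicit conversions: both operands must have the same type. */
    if (lhs != rhs) return TYPE_INVALID;

    switch (op) {
    case OP_ADD: case OP_SUB: case OP_MUL: case OP_DIV:
        return is_numeric(lhs) ? lhs : TYPE_INVALID;
    case OP_LT: case OP_GT: case OP_LE: case OP_GE:
        /* Assumption: comparisons produce bool, so they can serve as conditions. */
        return is_numeric(lhs) ? TYPE_BOOL : TYPE_INVALID;
    case OP_AND: case OP_OR:
        /* Logical operators are only defined for booleans. */
        return lhs == TYPE_BOOL ? TYPE_BOOL : TYPE_INVALID;
    case OP_EQ: case OP_NE:
        /* Assumption: strings support no operations, equality included. */
        return (lhs == TYPE_BOOL || is_numeric(lhs)) ? TYPE_BOOL : TYPE_INVALID;
    }
    return TYPE_INVALID;
}
```

With this sketch, `1 + 2.0` and `1 && 2` are rejected, whereas `1.5 < 2.5` has type `bool` and is therefore usable as an `if` or `while` condition.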
+
+#### Entry Point
+
+Your top-level rule is `program` which simply consists of 0 or more function definitions.
+While the parser happily accepts empty source files, a semantic check enforces that a function named `main` must be present.
+`main` takes no arguments and returns an `int`.
+
+#### Declaration, Definition, and Initialization
+
+`declaration` is used to declare variables which can then be initialised with `assignment`.
+
+Furthermore we do not provide a way to declare functions.
+All functions are declared by their definition.
+It is possible to call a function before it has been defined.
+
+#### Empty Parameter List
+
+In C, the parameter list of a function taking no arguments contains only `void`.
+For mC we simply use an empty parameter list.
+Hence, instead of writing `int main(void)` we write `int main()`.
+
+#### Dangling Else
+
+A [*dangling else*](https://en.wikipedia.org/wiki/Dangling_else) belongs to the innermost `if`.
+The following mC code snippets are semantically equivalent:
+
+    if (c1)              |        if (c1) {
+        if (c2)          |            if (c2) {
+            f2();        |                f2();
+        else             |            } else {
+            f3();        |                f3();
+                         |            }
+                         |        }
+
+### I/O
+
+The following built-in functions are provided by the compiler for I/O operations:
+
+- `void print(string)`      outputs the given string to `stdout`
+- `void print_nl()`         outputs the new-line character (`\n`) to `stdout`
+- `void print_int(int)`     outputs the given integer to `stdout`
+- `void print_float(float)` outputs the given float to `stdout`
+- `int read_int()`          reads an integer from `stdin`
+- `float read_float()`      reads a float from `stdin`
+
+## mC Compiler
+
+The mC compiler is implemented as a library.
+It can be used either programmatically or via the provided command-line applications.
+
+The focus lies on a clean and modular implementation as well as a straightforward architecture, rather than raw performance.
+For example, each semantic check may traverse the AST in isolation.
+
+- Exported symbols are prefixed with `mcc_`.
+- The library is thread-safe.
+- No memory is leaked — even in error cases.
+- Functions do not interact directly with `stdin`, `stdout`, or `stderr`.
+- No function terminates the application on correct usage.
+
+### Logging
+
+Logging infrastructure may be present; however, all log output is disabled by default.
+The log level can be set with the environment variable `MCC_LOG_LEVEL`.
+
+The output destination can be set with `MCC_LOG_FILE` and defaults to `stdout`.
+
+Log messages do not overlap on multi-threaded execution.
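A possible way to read the two environment variables is sketched below. The specification names `MCC_LOG_LEVEL` and `MCC_LOG_FILE` but does not define the available levels, so the numeric levels and both function names here are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical levels; the specification does not define them. */
enum log_level { LOG_OFF = 0, LOG_ERROR = 1, LOG_INFO = 2, LOG_DEBUG = 3 };

/* Parses a level value; anything missing or malformed keeps logging disabled. */
enum log_level parse_log_level(const char *value)
{
    if (value == NULL || *value == '\0') return LOG_OFF;
    char *end = NULL;
    long level = strtol(value, &end, 10);
    assert(end != NULL);
    if (*end != '\0' || level < LOG_OFF || level > LOG_DEBUG) return LOG_OFF;
    return (enum log_level)level;
}

/* Opens the log destination, falling back to stdout if no usable path is given. */
FILE *open_log_file(const char *path)
{
    if (path == NULL || *path == '\0') return stdout;
    FILE *file = fopen(path, "a");
    return file != NULL ? file : stdout;
}
```

An application would then call `parse_log_level(getenv("MCC_LOG_LEVEL"))` and `open_log_file(getenv("MCC_LOG_FILE"))` once during start-up. Serialising messages from multiple threads is not covered by this sketch.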
+
+### Parser
+
+The parser reads the given input and, if it conforms syntactically to an mC program, constructs the corresponding AST.
+An invalid input is rejected, resulting in a meaningful error message, for instance:
+
+    foo.mc:3:8: error: unexpected '{', expected '('
+
+It is recommended to closely follow the error message format of other compilers.
+Displaying the offending source line along with the error message is helpful, but not required.
+Parsing may stop on the first error.
+Error recovery is optional.
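The error format shown in the example can be produced with a small helper. This is a sketch with a hypothetical name (`format_error`); it writes into a caller-provided buffer, which fits the requirement that library functions do not print to `stderr` themselves.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Writes "<file>:<line>:<column>: error: <message>" into buf.
 * Returns 0 on success and -1 if the text did not fit or formatting failed.
 */
int format_error(char *buf, size_t size, const char *file, int line, int column, const char *message)
{
    assert(buf != NULL && file != NULL && message != NULL);
    int written = snprintf(buf, size, "%s:%d:%d: error: %s", file, line, column, message);
    return (written >= 0 && (size_t)written < size) ? 0 : -1;
}
```

Checking the return value of `snprintf` detects truncation, so the caller can decide whether to retry with a larger buffer.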
+
+The parser component may be generated by tools like `flex` and `bison`, or similar.
+However, pay attention to operator precedence.
+
+Note that partial mC programs, like an expression or statement, are not valid inputs for the main *parse* function.
+However, the library can provide additional functions for parsing single expressions or statements.
+
+### Abstract Syntax Tree
+
+The AST data structure definition itself is *not* specified.
+Consider using the visitor pattern for tree traversals.
+
+Given this example input:
+
+```c
+int fib(int n)
+{
+    if (n < 2) return n;
+    return fib(n - 1) + fib(n - 2);
+}
+```
+
+The visualisation of the AST for the `fib` function could look like this:
+
+![`fib` AST example](images/fib_ast.png)
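Since the AST definition is left open, the following is just one possible shape, reduced to three expression kinds to keep it short. All type and function names are hypothetical and do not reflect the API of the getting started code-base. The visitor is a struct of callbacks, one per node kind, which is a common way to express the visitor pattern in C.

```c
#include <assert.h>
#include <stddef.h>

enum expr_kind { EXPR_INT_LITERAL, EXPR_IDENTIFIER, EXPR_BINARY_OP };

struct expr {
    enum expr_kind kind;
    long int_value;         /* EXPR_INT_LITERAL */
    const char *name;       /* EXPR_IDENTIFIER */
    char op;                /* EXPR_BINARY_OP, e.g. '-' or '<' */
    struct expr *lhs, *rhs; /* EXPR_BINARY_OP */
};

/* One callback per node kind; NULL callbacks are skipped. */
struct expr_visitor {
    void (*int_literal)(const struct expr *e, void *userdata);
    void (*identifier)(const struct expr *e, void *userdata);
    void (*binary_op)(const struct expr *e, void *userdata);
    void *userdata;
};

/* Pre-order traversal: visit the node, then its children from left to right. */
void expr_visit(const struct expr *e, const struct expr_visitor *v)
{
    assert(e != NULL && v != NULL);
    switch (e->kind) {
    case EXPR_INT_LITERAL:
        if (v->int_literal) v->int_literal(e, v->userdata);
        break;
    case EXPR_IDENTIFIER:
        if (v->identifier) v->identifier(e, v->userdata);
        break;
    case EXPR_BINARY_OP:
        if (v->binary_op) v->binary_op(e, v->userdata);
        expr_visit(e->lhs, v);
        expr_visit(e->rhs, v);
        break;
    }
}

/* Example callback: counts how often it is invoked. */
void count_node(const struct expr *e, void *userdata)
{
    (void)e;
    (*(int *)userdata)++;
}
```

Registering `count_node` for every kind and visiting the tree for `n < 2` counts three nodes. A DOT printer or a semantic check can reuse `expr_visit` with different callbacks, so each check traverses the AST in isolation.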
+
+### Semantic Checks
+
+As the parser only does syntactic checking, additional semantic checks are implemented:
+
+- Checking for uses of undeclared variables
+- Checking for multiple declarations of variables with the same name in the same scope
+- Checking for multiple definitions of functions with the same name
+- Checking for calls to unknown functions
+- Checking for presence of `main` and correct signature
+- Checking that all execution paths of a non-void function return a value
+- Type checking (remember, there are neither implicit nor explicit conversions)
+    - This also includes checking arguments and return types for call expressions.
+
+In addition to the AST, *symbol tables* are created and used for semantic checking.
+Be sure to correctly model [*shadowing*](https://en.wikipedia.org/wiki/Variable_shadowing).
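Shadowing falls out naturally when every scope owns its own table and points to the enclosing scope. The sketch below uses a fixed-capacity array and stores types as strings purely to stay short; all names are hypothetical, and a real implementation would likely grow dynamically and store a proper type structure.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_SYMBOLS 16 /* fixed capacity keeps the sketch short */

struct symbol {
    const char *name;
    const char *type; /* e.g. "int"; a real table would store a type structure */
};

struct scope {
    struct scope *parent; /* enclosing scope, NULL for the outermost one */
    struct symbol symbols[MAX_SYMBOLS];
    size_t count;
};

/* Looks a name up in this scope only. */
static const struct symbol *scope_find_local(const struct scope *s, const char *name)
{
    for (size_t i = 0; i < s->count; i++) {
        if (strcmp(s->symbols[i].name, name) == 0) return &s->symbols[i];
    }
    return NULL;
}

/* Declares a name; returns -1 on redeclaration in the same scope or a full table. */
int scope_declare(struct scope *s, const char *name, const char *type)
{
    assert(s != NULL && name != NULL && type != NULL);
    if (scope_find_local(s, name) != NULL || s->count == MAX_SYMBOLS) return -1;
    s->symbols[s->count++] = (struct symbol){ name, type };
    return 0;
}

/* Resolves a name from the innermost scope outwards, so inner declarations shadow outer ones. */
const struct symbol *scope_lookup(const struct scope *s, const char *name)
{
    assert(name != NULL);
    for (; s != NULL; s = s->parent) {
        const struct symbol *found = scope_find_local(s, name);
        if (found != NULL) return found;
    }
    return NULL;
}
```

Declaring `x` twice in the same scope fails, which covers the multiple-declaration check, while declaring it again in a nested scope succeeds and hides the outer `x` for lookups that start there.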
+
+### Intermediate Representation
+
+As IR, a low-level [three-address code (TAC)](https://en.wikipedia.org/wiki/Three-address_code) is used.
+The instruction set of this code is *not* specified.
+
+Note that the compiler core is independent from the front-end or back-end.
+
+### Control Flow Graph
+
+A control flow graph data structure is present and can be constructed for a given IR program.
+This graph is commonly used by analyses for extracting structural information crucial for transformation steps.
+
+It is recommended to also provide a visitor mechanism for this graph.
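A common first step when building a CFG from three-address code is to find the *leaders*, the instructions that start a basic block. The sketch below assumes a hypothetical, heavily simplified instruction representation (the instruction set is not specified), where only the instruction kind matters.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical instruction kinds; only control flow is distinguished here. */
enum tac_kind { TAC_OTHER, TAC_LABEL, TAC_JUMP, TAC_COND_JUMP, TAC_RETURN };

struct tac_instr {
    enum tac_kind kind;
};

static bool ends_block(enum tac_kind kind)
{
    return kind == TAC_JUMP || kind == TAC_COND_JUMP || kind == TAC_RETURN;
}

/*
 * Marks basic-block leaders: the first instruction, every label (a possible
 * jump target), and every instruction following a jump or return.
 * Returns the number of basic blocks.
 */
size_t mark_leaders(const struct tac_instr *code, size_t count, bool *is_leader)
{
    assert(code != NULL && is_leader != NULL);
    size_t blocks = 0;
    for (size_t i = 0; i < count; i++) {
        bool after_branch = i > 0 && ends_block(code[i - 1].kind);
        is_leader[i] = i == 0 || code[i].kind == TAC_LABEL || after_branch;
        if (is_leader[i]) blocks++;
    }
    return blocks;
}
```

Each leader together with the instructions up to the next leader forms one basic block; the edges of the graph then follow from the jump targets and fall-through paths.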
+
+### Assembly Code Generation
+
+mC targets x86 and uses GCC as back-end compiler.
+On an x86_64 system, GCC multilib support must be available and the flag `-m32` is passed to the compiler.
+
+The code generated by the back-end is compiled with the [GNU Assembler](https://en.wikipedia.org/wiki/GNU_Assembler) (by GCC).
+Pay special attention to floating point and integer handling.
+
+Use [cdecl calling convention](https://en.wikipedia.org/wiki/X86_calling_conventions#cdecl).
+It is paramount to correctly implement the calling convention, otherwise you will corrupt your stack during function calls and returns.
+
+## Applications
+
+Apart from the main compiler executable `mcc`, additional auxiliary executables are implemented.
+These executables aid the development process and are used for evaluation.
+
+Most of the applications are defined by their usage information.
+Composing them with other command-line tools, like `dot`, is a core feature.
+
+Unless specified, the exact output format is up to you.
+However, do *not* omit details — like simplifying the AST.
+
+All applications exit with code `EXIT_SUCCESS` iff they succeeded in their operation.
+
+Note that each executable accepts multiple input files.
+Each input is parsed in isolation; the ASTs are merged before semantic checks are run.
+
+### `mcc`
+
+This is the main compiler executable, sometimes referred to as *driver*.
+
+    usage: mcc [OPTIONS] file...
+
+    The mC compiler. Takes mC input files and produces an executable.
+
+    Use '-' as input file to read from stdin.
+
+    OPTIONS:
+      -h, --help                displays this help message
+      -v, --version             displays the version number
+      -q, --quiet               suppress error output
+      -o, --output <file>       write the output to <file> (defaults to 'a.out')
+
+    Environment Variables:
+      MCC_BACKEND               override the backend compiler (defaults to 'gcc' in PATH)
+
+### `mc_ast_to_dot`
+
+    usage: mc_ast_to_dot [OPTIONS] file...
+
+    Utility for printing an abstract syntax tree in the DOT format. The output
+    can be visualised using graphviz. Errors are reported on invalid inputs.
+
+    Use '-' as input file to read from stdin.
+
+    OPTIONS:
+      -h, --help                displays this help message
+      -o, --output <file>       write the output to <file> (defaults to stdout)
+      -f, --function <name>     limit scope to given function
+
+### `mc_symbol_table`
+
+    usage: mc_symbol_table [OPTIONS] file...
+
+    Utility for displaying the generated symbol tables. Errors are reported on
+    invalid inputs.
+
+    Use '-' as input file to read from stdin.
+
+    OPTIONS:
+      -h, --help                displays this help message
+      -o, --output <file>       write the output to <file> (defaults to stdout)
+      -f, --function <name>     limit scope to given function
+
+### `mc_type_check_trace`
+
+    usage: mc_type_check_trace [OPTIONS] file...
+
+    Utility for tracing the type checking process. Errors are reported on
+    invalid inputs.
+
+    Use '-' as input file to read from stdin.
+
+    OPTIONS:
+      -h, --help                displays this help message
+      -o, --output <file>       write the output to <file> (defaults to stdout)
+      -f, --function <name>     limit scope to given function
+
+### `mc_ir`
+
+    usage: mc_ir [OPTIONS] file...
+
+    Utility for viewing the generated intermediate representation. Errors are
+    reported on invalid inputs.
+
+    Use '-' as input file to read from stdin.
+
+    OPTIONS:
+      -h, --help                displays this help message
+      -o, --output <file>       write the output to <file> (defaults to stdout)
+      -f, --function <name>     limit scope to given function
+
+### `mc_cfg_to_dot`
+
+    usage: mc_cfg_to_dot [OPTIONS] file...
+
+    Utility for printing a control flow graph in the DOT format. The output
+    can be visualised using graphviz. Errors are reported on invalid inputs.
+
+    Use '-' as input file to read from stdin.
+
+    OPTIONS:
+      -h, --help                displays this help message
+      -o, --output <file>       write the output to <file> (defaults to stdout)
+      -f, --function <name>     limit scope to given function
+
+### `mc_asm`
+
+    usage: mc_asm [OPTIONS] file...
+
+    Utility for printing the generated assembly code. Errors are reported on
+    invalid inputs.
+
+    Use '-' as input file to read from stdin.
+
+    OPTIONS:
+      -h, --help                displays this help message
+      -o, --output <file>       write the output to <file> (defaults to stdout)
+      -f, --function <name>     limit scope to given function
+
+## Project Structure
+
+The following directory layout is used.
+
+    mcc/
+    ├── app/                                # Each C file in this directory corresponds to one executable.
+    │   ├── mc_ast_to_dot.c
+    │   ├── mcc.c
+    │   └── …
+    ├── docs/                               # Additional documentation resides here.
+    │   └── …
+    ├── include/                            # All public headers live here, note the `mcc` subdirectory.
+    │   └── mcc/
+    │       ├── ast.h
+    │       ├── ast_print.h
+    │       ├── ast_visit.h
+    │       ├── parser.h
+    │       └── …
+    ├── src/                                # The actual implementation, may also contain private headers and so on.
+    │   ├── ast.c
+    │   ├── ast_print.c
+    │   ├── ast_visit.c
+    │   ├── parser.y
+    │   ├── scanner.l
+    │   └── …
+    ├── test/
+    │   ├── integration/                    # Example inputs for integration testing.
+    │   │   ├── fib/
+    │   │   │   ├── fib.mc
+    │   │   │   ├── fib.stdin.txt
+    │   │   │   └── fib.stdout.txt
+    │   │   └── …
+    │   └── unit/                           # Unit tests, typically one file per unit.
+    │       ├── parser_test.c
+    │       └── …
+    └── README.md
+
+The README is kept short and clean with the following sections:
+
+- Prerequisites
+- Build instructions
+- Known issues
+
+`src` contains the implementation of the library, while `include` defines its API.
+
+Each application (C file inside `app`) is linked against the shared library and uses the provided interface.
+They mainly contain argument parsing and combine the functionality provided by the library to achieve their task.
+
+The repository does not contain or track generated files.
+
+Under normal circumstances, all generated files are placed somewhere inside the build directory.
+
+### Known Issues
+
+At any point in time, the README contains a list of unfixed, known issues.
+
+Each entry is kept short and concise and should be justified.
+Complex issues may reference a dedicated document inside `docs` providing more details.
+
+## Build Infrastructure
+
+As build system (generator), use either [Meson](http://mesonbuild.com/), [CMake](https://cmake.org/), or plain Makefiles.
+Ensure dependencies between source files are modelled correctly.
+
+*Note:* Talk to me if you want to use a different build system.
+
+### Building
+
+The default build configuration is *release* (optimisations enabled).
+Unless Meson or CMake is used, the README documents how to switch to a *debug* configuration.
+
+Warnings are always enabled; at least `-Wall -Wextra` are used.
+
+### Testing
+
+Crucial or complicated logic is tested adequately.
+
+The project infrastructure provides a *simple* way to run all unit and integration tests.
+See the getting started code-base for an example (`scripts/run_integration_tests`).
+
+Similarly, a way to run unit tests using `valgrind` is provided.
+
+### Coverage
+
+An HTML coverage report can be obtained by following the *simple* instructions inside the README.
+
+### Dependencies
+
+The implementation should not have any dependencies apart from the C (or C++) standard library and a unit testing framework.
+The *prerequisites* section of the README enumerates the dependencies.
+
+The unit testing framework is *vendored* and automatically used by the build system.
+See the getting started code-base for an example.
+
+## Coding Guidelines
+
+Architectural design and readability of your code will be judged.
+
+- Don't be a git — use [Git](https://git-scm.com/)!
+- Files are UTF-8 encoded and use Unix line-endings (`\n`).
+- Files contain *one* newline at the end.
+- Lines do not contain trailing white-space.
+- Your code does not trigger warnings; justify any that remain.
+- Do not waste time or space (memory leaks).
+- Check for leaks using `valgrind`, especially in error cases.
+- Keep design and development principles in mind, especially KISS and DRY.
+- Always state the sources of non-original content.
+    - Use persistent links when possible.
+    - Ideas and inspirations should be referenced too.
+
+> Credit where credit is due.
+
+### C/C++
+
+- While not required, it is highly recommended to use a formatting tool, like [ClangFormat](https://clang.llvm.org/docs/ClangFormat.html).
+  A configuration file is provided with the getting started code-base; however, you are free to roll your own.
+- Lines should not exceed 120 columns.
+- The nesting depth of control statements should not exceed 4.
+    - Move inner code to dedicated functions or macros.
+    - Avoid using conditional and loop statements inside `case`.
+- Use comments *where necessary*.
+    - Code should be readable and tell *what* is happening.
+    - A comment should tell you *why* something is happening, or what to look out for.
+    - An overview at the beginning of a module header is welcome.
+- Use the following order for includes:
+    - Corresponding header (`ast.c` → `ast.h`)
+    - System headers
+    - Other library headers
+    - Public headers of the same project
+    - Private headers of the same project
+- The structure of a source file should be similar to its corresponding header file.
+    - Separators can be helpful, but they should not distract the reader.
+- Keep public header files free from implementation details; this also applies to the overview comment.
+- Use assertions to verify preconditions.
+- Ensure the correct usage of library functions, and always check return codes.
+- Prefer bounds-checking functions, like `snprintf`, over their non-bounds-checking variants.
+
+Also, keep the following in mind, taken from [Linux Kernel Coding Style](https://www.kernel.org/doc/html/v4.10/process/coding-style.html):
+
+> Functions should be short and sweet, and do just one thing.
+> They should fit on one or two screenfuls of text (the ISO/ANSI screen size is 80x24, as we all know), and do one thing and do that well.
+>
+> The maximum length of a function is inversely proportional to the complexity and indentation level of that function.
+> So, if you have a conceptually simple function that is just one long (but simple) case-statement, where you have to do lots of small things for a lot of different cases, it's OK to have a longer function.
+>
+> However, if you have a complex function, and you suspect that a less-than-gifted first-year high-school student might not even understand what the function is all about, you should adhere to the maximum limits all the more closely.
+> Use helper functions with descriptive names (you can ask the compiler to in-line them if you think it's performance-critical, and it will probably do a better job of it than you would have done).
+>
+> Another measure of the function is the number of local variables.
+> They shouldn't exceed 5–10, or you’re doing something wrong.
+> Re-think the function, and split it into smaller pieces.
+> A human brain can generally easily keep track of about 7 different things, anything more and it gets confused.
+> You know you’re brilliant, but maybe you'd like to understand what you did 2 weeks from now.
--- a/submission.md
+++ b/submission.md
@@ -0,0 +1,66 @@
+# Submission Guideline
+
+- `XX` is to be replaced with the number of your team with a leading zero (e.g. `02`).
+- `Y` is to be replaced with the corresponding milestone number.
+- One submission *per team*.
+
+## Example Input Submission
+
+Assuming your example input is named `mandelbrot`, zip the corresponding files like so:
+
+    mandelbrot.zip
+    └── mandelbrot
+        ├── mandelbrot.mc
+        ├── mandelbrot.stdin.txt
+        └── mandelbrot.stdout.txt
+
+Submit the zip archive via mail using the following line as subject (or link below).
+List your team members in the mail body.
+
+    703602 - Example Input
+
+📧 [send email](mailto:alexander.hirsch@uibk.ac.at?subject=703602%20-%20Example%20Input)
+
+## Milestone Submission
+
+1. `cd` into your repository.
+2. Commit all pending changes.
+3. Check out the revision you want to submit.
+4. Ensure everything builds.
+    - Warnings are okay
+    - Tests may fail
+    - Memory may be leaked
+    - Known issues should be present
+5. Run the following command:
+
+       $ git archive --prefix=team_XX_milestone_Y/ --format=zip HEAD > team_XX_milestone_Y.zip
+
+6. Verify that the resulting archive contains everything you want to submit and nothing more.
+7. Submit the zip archive via mail using the following line as subject (or link below).
+
+       703602 - Team XX Milestone Y
+
+   📧 [send email](mailto:alexander.hirsch@uibk.ac.at?subject=703602%20-%20Team%20XX%20Milestone%20Y)
+
+## Final Submission
+
+1. `cd` into your repository.
+2. Commit all pending changes.
+3. Check out the revision you want to submit.
+4. Ensure everything works.
+    - Everything builds
+    - No (unjustified) warnings
+    - All unit tests succeed
+    - All integration tests succeed
+    - No memory is leaked
+    - Known issues must be present
+5. Run the following command:
+
+       $ git archive --prefix=team_XX_final/ --format=zip HEAD > team_XX_final.zip
+
+6. Verify that the resulting archive contains everything you want to submit and nothing more.
+7. Submit the zip archive via mail using the following line as subject (or link below).
+
+       703602 - Team XX Final
+
+   📧 [send email](mailto:alexander.hirsch@uibk.ac.at?subject=703602%20-%20Team%20XX%20Final)