===[ Scientific method ]~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

-----------------------------------[ code ]------------------------------------
         .---------.    What do you want to solve?
  .----->| PROBLEM |    (You have some question you want to answer.)
  |      '---------'
  |           |
  |           v
  |     .------------.  Try to find out if someone already solved this problem.
  |     | (RE)SEARCH |  (This can be skipped for a learning experience :))
  |     '------------'
  |           |
  |           +-------> This can generate new questions,
  |           |         write them down for later tests.
  |           v
  |     .------------.  Hypothesis is a fancy word for:
  +---->| HYPOTHESES |  "I think it could work like this..."
  |     '------------'  "This happens if I do that ..."
  |           |
  |           v
  |    .-------------.
  |    | EXPERIMENTS |  Most fun lies here!
  |    |      &      |  Here you tackle the problem(s), you implement your
  |    |  ANALYSIS   |  hypothesis, executing it, and observing its outcome.
  |    '-------------'
  | something |
  | is wrong  |
  '-----------+         If hypothesis or question is wrong,
              |         you need to reformulate it.
              v
         .----------.   When you arrive here, you should have the answer
         | SOLUTION |   you need. (You should have at least one solution
         '----------'   for your problem.)
-------------------------------------------------------------------------------

===[ What is scientific method? ]~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

In a nutshell: The scientific method is a set of guidelines that helps us
solve almost any problem.

The term "scientific method" sounds mysterious and complicated, but it is
actually a pretty natural thing for us to do. When we are solving something,
we want to know *the* cause of it. The scientific method gives us a recipe for
tackling unknown problems. (Science is all about conquering unknowns; it needs
good tools to do that, and the scientific method is one of them.)

The scientific method is pretty flexible and we can use only a part of it.
There are some basic principles we should always keep in mind:

 1. First, we need to know what we ... want to know :). It typically takes
    the form of a question, e.g. "How is a binary executed in Linux?" or
    "Why the hell is it not working?!".

 2. We start searching for an answer. This is done by using a search engine
    (e.g. Google), reading through documentation, looking inside source code,
    tracing, asking on forums or, unbelievably, even asking someone in
    person, ...

 3. After the (re)search phase we may have a good idea of what we are dealing
    with and how to solve it. This is called a "hypothesis".

 4. Most often we also want to verify experimentally that the solutions we
    have found are correct.

 5. Then we take the results of our experiments, look at them, and decide
    whether they solve our problem. If not, we can return to research or
    experimentation, or we may even have to formulate a new question.

These steps are so natural that we are typically not even aware of them. What
the scientific method adds is a stress on a systematic approach. We should be
the ones who control an experiment, not the other way around.

===[ Systematic approach ]~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Working systematically means working effectively. We want to get reliable
results fast, and the method helps us achieve that. I highly recommend
learning it properly.

===[ Thinking about a problem ]~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Thinking hard about a problem and figuring out a good question is crucial if
we want to be effective. You have probably heard the saying "there is no such
thing as a stupid question". Well, maybe not, but there can be a poorly chosen
question for a specific problem.

Imagine that we have XXX

===[ Experimentation ]~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

 1. When experimenting, we should always be aware of all variables in a
    system[1]. Ideally, we want only one variable to change while everything
    else stays constant; then we know that this specific change causes this
    specific system behavior. If we have no such luxury and have to deal with
    multiple variables, we can try bisection, i.e. dividing the system until
    we have smaller areas with fewer variables and more reliable results.

 2. Understanding inputs and outputs. XXX

 3. Before we start experimenting, we need a relevant environment.
    Reproduce the behavior. XXX

 4. Documenting our progress. For me personally, this is crucial, because
    after a few hours or days of debugging I won't remember what I have
    already tried. I also frequently revisit older issues I have already
    solved, because after a while some problems start connecting with each
    other. I highly recommend writing notes. They do not have to be polished;
    a few words or a copy-paste of an input and output is enough, but they
    have to be in a form that your future self is able to read.

[1] https://en.wikipedia.org/wiki/System
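The bisection idea from point 1 can be sketched in a few lines of shell. This
is only an illustration under simplifying assumptions: the "system" is reduced
to a numbered sequence of steps (e.g. commits or config changes), everything
before some step is good and everything from that step on is bad. The names
'is_bad', 'bisect' and 'FIRST_BAD' are made up for this example; 'is_bad' is a
stand-in for a real experiment.

```shell
#!/bin/sh
# Bisection sketch: find the first "bad" step in a numbered sequence,
# assuming all steps before it are good and all steps from it on are bad.

# Hypothetical predicate: succeeds (exit 0) when step $1 shows the problem.
# Replace the body with a real experiment (build, run, compare output, ...).
is_bad() {
    [ "$1" -ge "${FIRST_BAD:-7}" ]
}

# bisect LOW HIGH: LOW is a known good step, HIGH is a known bad step.
# Prints the number of the first bad step.
bisect() {
    low=$1
    high=$2
    while [ $((high - low)) -gt 1 ]; do
        mid=$(( (low + high) / 2 ))
        if is_bad "$mid"; then
            high=$mid    # problem already present: keep the lower half
        else
            low=$mid     # still fine: keep the upper half
        fi
    done
    echo "$high"
}
```

With the placeholder predicate, 'bisect 0 10' prints 7 after three
experiments instead of up to ten, which is the whole point: each experiment
halves the area where the problem can hide.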

===[ How to use it? ]~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Suppose we want to know how binaries are executed in Linux.

 1. [Question] "How does Linux execute a binary?"

 2. [Research] A quick Google search for "How does Linux execute a binary?"
    gives us some garbage on how to set the execution bit on a binary/script
    with 'chmod'. This is not what we want. We have to reformulate the
    question: "linux kernel binary execution". That gives us way better
    results. One of them is the great "linux-insides":

      https://0xax.gitbooks.io/linux-insides/content/SysCall/linux-syscall-4.html

 3. [Hypothesis] Now we may have some idea that it is done by the 'execve'
    syscall and that the kernel opens the file, reads the magic header,
    checks whether the type of the binary is known and, if so, loads the
    binary into memory, sets up registers and jumps back to user space.

 4. [Question] How can we see it? Searching for "linux tracing" gets us:

      https://www.man7.org/linux/man-pages/man1/strace.1.html
      strace -- syscall tracer

 5. [Experimentation] Now we could create a C/ASM binary that calls 'execve',
    but the syscall is so ubiquitous that we can just run any program. Using
    'strace' is super easy:

------------------------------[ strace example ]-------------------------------
$ strace -vv /bin/true
execve("/bin/true", ["/bin/true"], ["BRM=brk", "ENV2=2"]) = 0
...
-------------------------------------------------------------------------------

    'strace' is great, but it only shows us how user space communicates with
    the kernel.

 6. [Question] Can we see what the kernel calls look like? By searching for
    "linux kernel tracing" we get:

      https://www.kernel.org/doc/html/latest/trace/ftrace.html
      ftrace -- kernel function tracer

    'ftrace' is a bit more complicated to set up, but let's say we have
    everything in place and we can just use tracefs ftrace[4].

 7. [Controlling variables] If we traced the whole system, we would not know
    which execution is ours, because there would be many background processes
    executing at the same time. To isolate ours, we can restrict tracing to a
    single PID. For example, we can trace the PID of a shell and call 'exec'
    inside that shell, like this:

--------------------------------[ shell exec ]---------------------------------
sh
echo $$           # This PID we need to write into ftrace filter.
exec /bin/true    # After ftrace setup, we can run 'execve'.
-------------------------------------------------------------------------------

    Although the output we get is big, the exec function is near the
    beginning.

 8. [Filtering noise] Let's say we do not want so many unrelated calls,
    because they make the trace hard to navigate. We can control it even
    further by creating specialized binaries that contain the bare minimum of
    code. For example, we can write assembly code that just exits:

------------------------------[ exit_only.nasm ]-------------------------------
BITS 64
GLOBAL _start
_start:
    mov rax, 0x3c       ; sys_exit (code)
    xor rdi, rdi        ; code=0
    syscall
-------------------------------------------------------------------------------

    Now we can run it instead of '/bin/true', and when we trace it, we should
    get only the relevant data of the '__x64_sys_execve' and '__x64_sys_exit'
    function calls and what they are calling (and garbage like interrupts).

[4] XXX
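The PID filtering from step 7 could look roughly like the sketch below. It is
not the exact setup used above, just one plausible way to do it, and it rests
on several assumptions: tracefs is mounted at /sys/kernel/tracing (older
systems may use /sys/kernel/debug/tracing), the kernel was built with the
'function_graph' tracer, and the commands run as root. The function names and
the 'TRACEFS' override are invented for this example.

```shell
#!/bin/sh
# Sketch: limit ftrace to one PID. Assumes tracefs at /sys/kernel/tracing,
# a kernel with the function_graph tracer, and root privileges.
# Set TRACEFS to override the mount point.

# trace_pid_start PID: trace only PID with the function_graph tracer.
trace_pid_start() {
    t=${TRACEFS:-/sys/kernel/tracing}
    echo 0 > "$t/tracing_on"                     # pause while configuring
    echo "$1" > "$t/set_ftrace_pid"              # the PID printed by 'echo $$'
    echo function_graph > "$t/current_tracer"    # record calls and callees
    echo 1 > "$t/tracing_on"                     # go; now run 'exec /bin/true'
}

# trace_pid_stop FILE: stop tracing and save the trace buffer to FILE.
trace_pid_stop() {
    t=${TRACEFS:-/sys/kernel/tracing}
    echo 0 > "$t/tracing_on"
    cat "$t/trace" > "$1"                        # save before switching tracer
    echo nop > "$t/current_tracer"               # back to the no-op tracer
}
```

From a second (root) terminal we would call 'trace_pid_start' with the PID
printed by the first shell, run 'exec /bin/true' in the first shell, and then
call 'trace_pid_stop' with an output file name. Because 'exec' keeps the PID,
the filter still matches after the shell is replaced by the new binary.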

===[ Magical thinking ]~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~


===[ Resources ]~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

https://en.wikipedia.org/wiki/Scientific_method

https://www.youtube.com/watch?v=nsnyl8llfH4
  (Mark Rober -- 1st place Egg Drop project ideas- using SCIENCE)

XXX