Machine code to execution, what is going on?
I have been learning computer programming over the years and a layer of abstraction is clouding what I understand. Say I compiled some source code and I have the executable machine code. Once the computer runs this code what is going on?
For example, say I run a simple if then else statement and have the machine code. What is going on in the circuitry of the computer that executes this instruction?
It depends on the language, but for a regular executable (like you get from C/C++) the machine code is a sequence of bit patterns that the CPU interprets directly as instructions. These machine language instructions map essentially 1-to-1 to assembly instructions, if you have done any assembly (if you are getting a CS degree you should end up taking a course or two where you learn assembly, and you may even have to manually translate between assembly and machine language using the CPU reference).
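To make "bit patterns" concrete, here is a minimal sketch of how one instruction word could be split into fields. The 16-bit layout, the opcode numbers, and the mnemonics are all made up for illustration; real CPUs define their own encodings in their reference manuals.

```python
# Hypothetical 16-bit instruction format (not a real CPU):
#   bits 15-12: opcode, bits 11-8: register number, bits 7-0: immediate value
OPCODES = {0x1: "LOAD", 0x2: "ADD", 0x3: "CMP", 0x4: "JMP"}

def decode(word):
    """Split a 16-bit instruction word into (mnemonic, register, immediate)."""
    opcode = (word >> 12) & 0xF   # top 4 bits
    reg = (word >> 8) & 0xF       # next 4 bits
    imm = word & 0xFF             # low 8 bits
    return OPCODES[opcode], reg, imm

def to_assembly(word):
    """Render the word as one line of (made-up) assembly text."""
    name, reg, imm = decode(word)
    return f"{name} r{reg}, {imm}"

print(to_assembly(0x1205))  # LOAD r2, 5
print(to_assembly(0x2203))  # ADD r2, 3
```

In hardware the same splitting is done by wiring: the opcode bits select which circuit is activated, and the remaining bits are routed to it as operands.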
Other languages like Java and C# are a little trickier: they compile to an intermediate "byte-code" that a virtual machine interprets or translates into machine code as the program runs, which lets them achieve some platform independence.
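Java and C# tooling is not shown here, but as a stand-in, CPython also compiles source to byte-code before running it, and its standard `dis` module lets you look at the result. The function `add` is just a throwaway example, and the exact instruction names differ between Python versions, so no output is listed.

```python
import dis

def add(a, b):
    return a + b

# Collect the names of the byte-code instructions the interpreter will execute.
opnames = [ins.opname for ins in dis.get_instructions(add)]
print(opnames)
```

These are instructions for a virtual machine rather than for the physical CPU, which is what makes the compiled form portable.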
As for what the instructions themselves do, it depends on the CPU, but some of them can be things like the following:
- move a value from a register* to a RAM address or vice-versa
- perform an operation on two values (like add, XOR, AND, OR, etc...)
- compare two values (determine if they are equal or if one is greater than the other)
- change the current instruction register (jump) to a different address, optionally depending on the result of the last comparison (this is how if/else and loops are implemented)
- move a value onto the stack**, or remove a value from the stack
- save our position/state on the stack, and jump to a new address
- recall a saved position/state from the stack, and jump back to where we used to be (these last two are used in function calls)
and probably more.
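To tie this back to the if/then/else in the question, here is a rough sketch of a CPU's fetch-decode-execute loop running a compiled branch. The instruction names (`MOV`, `CMP`, `JNE`, `JMP`, `HALT`) and the tuple format are invented for this example and do not correspond to any particular CPU.

```python
def run_branch_demo(program, registers):
    """Fetch-decode-execute loop for a made-up instruction set."""
    pc = 0          # address of the next instruction to execute
    equal = False   # flag set by the most recent CMP
    while pc < len(program):
        op, *args = program[pc]   # fetch and decode
        pc += 1                   # default: fall through to the next instruction
        if op == "MOV":           # MOV reg, value
            registers[args[0]] = args[1]
        elif op == "CMP":         # CMP reg, value: remember whether they were equal
            equal = registers[args[0]] == args[1]
        elif op == "JNE":         # jump only if the last comparison was "not equal"
            if not equal:
                pc = args[0]
        elif op == "JMP":         # unconditional jump
            pc = args[0]
        elif op == "HALT":
            break
        else:
            raise ValueError(f"unknown instruction: {op}")
    return registers

# Equivalent of:  if a == 1: b = 10  else: b = 20
PROGRAM = [
    ("CMP", "a", 1),    # 0: compare a with 1
    ("JNE", 4),         # 1: not equal, so skip to the else branch
    ("MOV", "b", 10),   # 2: then branch
    ("JMP", 5),         # 3: skip over the else branch
    ("MOV", "b", 20),   # 4: else branch
    ("HALT",),          # 5
]

print(run_branch_demo(PROGRAM, {"a": 1, "b": 0})["b"])  # 10
print(run_branch_demo(PROGRAM, {"a": 7, "b": 0})["b"])  # 20
```

A real CPU does the same thing with circuitry instead of a `while` loop: the comparison sets a flag bit, and the conditional jump either overwrites the address of the next instruction or leaves it alone.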
*A register is a very small but very fast memory space that exists within the CPU; certain registers are used for certain instructions. One of these is the current instruction register (usually called the program counter or instruction pointer), which holds the address of the next instruction to be executed. Changing this value directly will jump to a new address, similar to a goto statement.
**The stack is a region of memory in which values are stored and retrieved in a first-in-last-out manner. This is how function calling is achieved. Like Hansel and Gretel, we leave a trail of bread crumbs behind us so that we can find our way back home.
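The bread-crumb idea can be sketched in a few lines. This toy machine and its `CALL`, `RET`, `PRINT`, and `HALT` instructions are hypothetical; the point is only that a call pushes the return address and a return pops it.

```python
def run_call_demo(program):
    """Run a made-up program and return the messages it 'printed'."""
    pc = 0        # address of the next instruction
    stack = []    # return addresses, last in, first out
    trace = []
    while True:
        op, *args = program[pc]
        pc += 1
        if op == "CALL":
            stack.append(pc)    # leave a bread crumb: where to resume afterwards
            pc = args[0]        # jump to the function
        elif op == "RET":
            pc = stack.pop()    # follow the most recent bread crumb back
        elif op == "PRINT":
            trace.append(args[0])
        elif op == "HALT":
            return trace
        else:
            raise ValueError(f"unknown instruction: {op}")

CALL_PROGRAM = [
    ("PRINT", "main: before call"),   # 0
    ("CALL", 4),                      # 1: call the function at address 4
    ("PRINT", "main: after call"),    # 2: execution resumes here after RET
    ("HALT",),                        # 3
    ("PRINT", "function body"),       # 4
    ("RET",),                         # 5
]

print(run_call_demo(CALL_PROGRAM))
# ['main: before call', 'function body', 'main: after call']
```

Because the stack is last-in-first-out, nested calls unwind in the right order: the innermost function returns first.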
Additionally, there are two "schools of thought" on how instruction sets should be organized: RISC and CISC (reduced instruction set computers and complex instruction set computers). In a RISC computer, there are far fewer, simpler instructions, and it's up to the programmer or compiler to combine groups of them to do more complex things. In a CISC computer, there are many more instructions, and they are more powerful, with some redundancy. Intel's x86 CPUs are CISC.
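As a rough illustration of the difference, the sketch below adds one memory cell to another in two ways: with a single memory-to-memory instruction (CISC style) and with a load/add/store sequence that only does arithmetic on registers (RISC style). Both instruction sets here are invented for the example.

```python
def run_isa_demo(program, memory):
    """Execute a straight-line program of made-up instructions on a small memory."""
    regs = {}
    for op, *args in program:
        if op == "LOAD":       # LOAD reg, addr: memory to register
            regs[args[0]] = memory[args[1]]
        elif op == "ADD":      # ADD dest, src: registers only
            regs[args[0]] += regs[args[1]]
        elif op == "STORE":    # STORE reg, addr: register to memory
            memory[args[1]] = regs[args[0]]
        elif op == "ADDM":     # ADDM dest_addr, src_addr: memory to memory
            memory[args[0]] += memory[args[1]]
        else:
            raise ValueError(f"unknown instruction: {op}")
    return memory

# CISC style: one powerful instruction does everything.
CISC_PROGRAM = [("ADDM", 0, 1)]

# RISC style: several simple instructions achieve the same effect.
RISC_PROGRAM = [
    ("LOAD", "r1", 0),
    ("LOAD", "r2", 1),
    ("ADD", "r1", "r2"),
    ("STORE", "r1", 0),
]

print(run_isa_demo(CISC_PROGRAM, [5, 7]))  # [12, 7]
print(run_isa_demo(RISC_PROGRAM, [5, 7]))  # [12, 7]
```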