Differences between assembler, interpreter and compiler

Introduction

The first computers in history were not so much programmed as configured. The activity known as programming was very laborious and was often carried out with a large number of switches. The memory address was set in binary on up to 16 switches, the data value on a further eight switches, and written into memory with another button. Reading worked the same way, except that instead of the eight switches, light-emitting diodes displayed the value. Once the program had been keyed into memory bit by bit (in the truest sense of the word!), the program counter of the CPU still had to be loaded in the same way before the program could start. The first home computer, the Altair 8800, was programmed exactly like this.

With ever faster computers, ever more complex programs and, above all, the introduction of interactive (dialog) operation, a different form of programming became necessary. The first programming languages emerged (Fortran, Algol and COBOL); they were modelled on human language in order to be easier for the programmer to understand. Programs were written in text form and then fed to the computer (e.g. on punch cards); interactive programming on the computer itself only came later. One problem remained, however: the program text first had to be converted into a form the computer could execute (machine language), and there are three approaches to this.

Assembler

The assembler is the first approach and the simplest form of translation. The program is written in text form, in assembly language, and is translated 1:1 into machine language. Each machine instruction is assigned a word (mnemonic), so assembly language is primarily a readable form of machine language. There is no abstraction of the machine: the programmer works directly with the logic of the CPU. This means that an assembler programmer must know the CPU architecture and the hardware layout of the target system. It also means that assembler code has to be developed anew for each platform - even a program for the C64 does not run unchanged on a PET, although both use a compatible CPU.
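
To make the 1:1 nature of this translation concrete, here is a hypothetical sketch in C (not part of any real assembler) that lists a few 6502 instructions together with the machine-code bytes an assembler would emit for them; the opcodes are the documented 6502 encodings, and the mnemonics are merely their readable names.

  #include <stdio.h>

  /* Toy illustration of the 1:1 mapping: one assembler line, one instruction. */
  struct line { const char *mnemonic; unsigned char bytes[3]; int len; };

  int main(void) {
      /* Documented 6502 encodings; $0400 is the start of C64 screen memory,
         and screen code $01 is the letter 'A'.                               */
      struct line prog[] = {
          { "LDA #$01 ", { 0xA9, 0x01 },       2 },  /* load the accumulator          */
          { "STA $0400", { 0x8D, 0x00, 0x04 }, 3 },  /* address stored low byte first */
          { "RTS      ", { 0x60, 0, 0 },       1 },  /* return from subroutine        */
      };
      for (int i = 0; i < 3; i++) {
          printf("%s ->", prog[i].mnemonic);
          for (int j = 0; j < prog[i].len; j++)
              printf(" %02X", prog[i].bytes[j]);
          printf("\n");
      }
      return 0;
  }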

Assemblers - especially macro assemblers - nevertheless make the work much easier and already hint at high-level language programming. They save the programmer the time-consuming precalculation of addresses by using labels (for jump instructions or data access), and with macros, standard code sequences can be reused as often as required (even parameterized) without having to be typed in again each time.

Advantages:

  • Possibilities: Complete control and use of the machine is only possible in assembler - this is made clear by the following saying: "What you cannot program in assembler, you have to solder".
  • Speed: Since assembler programs run directly at CPU level and carry no extra ballast with them, assembler is fundamentally very fast.
  • Program size: In no other language can such compact programs be written as in assembler.

Disadvantages:

  • Effort: Since almost every task has to be composed of many small assembler steps, assembler programming is laborious. In addition, in contrast to high-level languages, the underlying process or algorithm is hard to recognize from the code.
  • Error-prone: Assembler programming is a tightrope walk without a net. Errors almost always lead to a crash of the computer, and troubleshooting is sometimes so time-consuming that you would not wish it on your worst enemy.
  • Expertise: To program successfully in assembler, the machine has to be understood and the handling of other number systems (hexadecimal, binary, octal) must have become second nature. In the long run, intimate knowledge of the use and behavior of the various peripherals (for a particular CPU family) is also unavoidable.

Compiler

The demand for system-independent languages with means of expressing data structures, control structures and, later, functions and procedures can no longer be met by assembler. Here we speak of high-level languages, and a 1:1 translation is no longer possible. Compiler languages are the second approach: they separate the description of the solution from the hardware and perform the semantic conversion of the high-level language logic into assembly language. The result of a compiler often consists of assembler code, which is only translated into machine code in a subsequent assembler run. With many compilers this second pass is transparent because the assembler is integrated directly into the compiler. The separation is important because it makes it possible to compile a high-level language for several target platforms. The result does not necessarily have to be machine language; today it is often a bytecode that is then executed by a simple and highly efficient virtual machine for this pseudo machine code - Java and .NET take this route today.
In contrast to the assembler programmer, the high-level language developer ideally does not need any knowledge of the underlying hardware and system architecture.
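
As a small illustration of how a compiler bridges the gap between a high-level description and the machine, consider the following C fragment; the 6502-style assembler in the comments is only a plausible sketch of what a compiler might generate, not the actual output of any particular compiler.

  #include <stdio.h>

  /* High-level source: the programmer only states *what* should happen. */
  unsigned char add(unsigned char a, unsigned char b) {
      return a + b;
  }

  /* A compiler targeting the 6502 might turn the body of add() into
     something like the following assembler (a plausible sketch only):
         CLC            ; clear the carry flag
         LDA a          ; load the first operand into the accumulator
         ADC b          ; add the second operand
         RTS            ; return, result in the accumulator
     That assembler output is then assembled into machine code and linked
     with other modules - with most compilers in one seemingly single step. */

  int main(void) {
      printf("%d\n", add(2, 3));   /* prints 5 */
      return 0;
  }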

Pure compiler languages are: C/C++, Pascal

Advantages:

  • Relatively fast: C and Pascal in particular lend themselves both to fast compiler runs and to efficient, fast compiled programs.
  • Portable: Programs can often be transferred to other platforms without modification - they just have to be recompiled there.

Disadvantages:

  • Time-consuming: After every change to the source code, the steps of compiling, assembling and linking have to be repeated. Since a compiler is more extensive than an assembler, it was often impossible in the 8-bit era to keep all three tools in memory at once, so the steps had to be carried out one after the other by loading the individual components, and these intermediate steps alone often took a great deal of time.
  • Troubleshooting: It is easier than with assembler, since at least a syntactic analysis is carried out during compilation - nevertheless, the "triple game" of compiling, assembling and linking is necessary after every error correction.
  • Memory problem: The 8-bit computers of the time rarely had more than 64 KB of memory - that is simply too little for many compiler languages. Most C compilers for the C64 therefore only support a subset of the features of the K&R or ANSI standard.

Interpreter

The assembler and compiler have in common that the result is an executable program that does exactly what the programmer specified in the source code. The idea of an interpreter is to dispense with translating the source code (apart from a possible more efficient intermediate representation using tokens) and instead to insert a software layer between the source code and the machine - the interpreter - which, at runtime, calls routines based on the source code that do exactly what the developer described. Contrary to popular belief, the interpreter does not translate into machine code at runtime: it is not a compiler or assembler, produces no compiled output and hence no machine code, and therefore does not belong to either of the approaches mentioned above. An interpreter is simply a program. If the interpreter encounters a PRINT, for example, a code section inside the interpreter is called that writes the text to the screen. Each BASIC command - to stay with BASIC as the example - thus has its counterpart in the form of a subroutine within the interpreter. The few exceptions concern purely syntactic structures or elements of the language (e.g. THEN, ELSE, TO).
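
What follows is a minimal, hypothetical sketch in C of this dispatch principle (far simpler than any real BASIC interpreter): for every keyword the interpreter knows there is a subroutine, and executing the program means calling those subroutines - nothing is ever translated into machine code.

  #include <stdio.h>
  #include <string.h>

  /* One handler routine per "command" - the interpreter's counterparts. */
  static void do_print(const char *arg) { printf("%s\n", arg); }
  static void do_rem(const char *arg)   { (void)arg; /* a comment: do nothing */ }

  /* Execute one source line by looking up the keyword and calling its routine. */
  static void interpret(const char *line) {
      if (strncmp(line, "PRINT ", 6) == 0)    do_print(line + 6);
      else if (strncmp(line, "REM", 3) == 0)  do_rem(line + 3);
      else                                    printf("?SYNTAX ERROR\n");
  }

  int main(void) {
      const char *program[] = {
          "REM A TINY PROGRAM",
          "PRINT HELLO, WORLD",
          "PLOT 10,10",            /* no handler provided -> error */
      };
      for (int i = 0; i < 3; i++)
          interpret(program[i]);   /* the source text itself is executed line by line */
      return 0;
  }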

Interpreter languages are: BASIC (old versions), Lisp, Logo, Perl, PHP, Ruby, shell scripting languages

Advantages:

  • "Easy to use": A BASIC interpreter was already in the 8-bit computers built-in and you could start immediately after switching it on. No additional software was required that had to be loaded beforehand.
  • Debugging: An interpreter simply stops execution when an error occurs. The error can be eliminated immediately and the program can be restarted immediately without running the compiler.

Disadvantages:

  • Execution speed: The reason interpreter languages are slow is not the routines being called, but the recurring parsing of the source code that is necessary at runtime (see the sketch after this list).
  • Memory consumption: The interpreter itself occupies part of the memory at runtime.
  • Possibilities: The interpreter sets the limit! Only things for which the interpreter provides corresponding commands can be programmed. This is also the reason why the C64's BASIC 2.0 did not support graphics or sound directly; programmers either had to call supplementary machine-language routines - loaded from disk or written into memory via POKE - or use a BASIC extension such as Simons' Basic.
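
The following hypothetical C sketch illustrates the token idea mentioned above and why pure text interpretation is slow: the keyword is looked up once when the line is stored ("crunched" into a one-byte token), so the runtime loop only compares bytes instead of re-scanning the keyword text on every pass. The token values are invented for the example; line numbers, variables and expressions still have to be parsed as text at runtime, which is why even tokenized BASIC remains comparatively slow.

  #include <stdio.h>
  #include <string.h>

  /* Invented example token values (not the real Commodore BASIC codes). */
  enum { TOK_PRINT = 0x99, TOK_REM = 0x8F, TOK_UNKNOWN = 0x00 };

  /* "Crunching": done once, when a line is entered or loaded. */
  static unsigned char tokenize(const char *keyword) {
      if (strcmp(keyword, "PRINT") == 0) return TOK_PRINT;
      if (strcmp(keyword, "REM") == 0)   return TOK_REM;
      return TOK_UNKNOWN;
  }

  int main(void) {
      unsigned char tok = tokenize("PRINT");   /* stored in program memory */

      /* At runtime the interpreter only compares single bytes ...         */
      for (int run = 0; run < 3; run++) {
          switch (tok) {
              case TOK_PRINT: printf("run %d: dispatch PRINT handler\n", run); break;
              case TOK_REM:   /* skip the rest of the line */                  break;
              default:        printf("?SYNTAX ERROR\n");                       break;
          }
      }
      /* ... instead of re-scanning the text "PRINT" on every pass,
         which is what makes a purely text-based interpreter slow.         */
      return 0;
  }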

Compreter (Just-In-Time Compiler)

Compreters are the third approach, one that did not yet exist at the time of the C64. Compreters do in effect what interpreters were once believed to do: the source code is not translated into machine language in advance, but only at runtime (just-in-time = JIT). This relatively young technique only became feasible and popular in the late 1990s, when computers grew fast enough to do the translation in real time. The Java and .NET virtual machines also use just-in-time compilers for their bytecode - but not for the languages themselves! Java and C#, for example, are pure compiler languages.
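
Here is a minimal sketch of the underlying idea of generating machine code at runtime. This is not how the Java or .NET JIT compilers actually work; it assumes a POSIX system on an x86-64 CPU, and some systems additionally forbid memory that is writable and executable at the same time.

  #include <stdio.h>
  #include <string.h>
  #include <sys/mman.h>

  int main(void) {
      /* x86-64 machine code for: mov eax, 42 ; ret */
      unsigned char code[] = { 0xB8, 0x2A, 0x00, 0x00, 0x00, 0xC3 };

      /* Obtain a block of memory that may be written to and executed. */
      void *mem = mmap(NULL, sizeof code, PROT_READ | PROT_WRITE | PROT_EXEC,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
      if (mem == MAP_FAILED) return 1;

      /* "Compile" at runtime: emit the machine code, then jump into it. */
      memcpy(mem, code, sizeof code);
      int (*generated)(void) = (int (*)(void))mem;
      printf("%d\n", generated());          /* prints 42 */

      munmap(mem, sizeof code);
      return 0;
  }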

Representatives of this genre are: Java, all CLR (.NET) languages (Visual Basic, C#, C++.NET).

Advantages:

  • Development effort: Combines the advantages of an interpreter language (interactive troubleshooting) with those of a compiler language (performance).
  • Cross-platform: If a virtual machine of the language has been implemented for a platform and is therefore available, the code runs there without changes.

Disadvantages:

  • “In the box”: The special possibilities of the underlying hardware cannot be used, or can only be used indirectly.
  • Memory consumption: In order to run the program, the entire environment for the language's virtual machine must always be loaded.

Hybrid technologies

While the separation between the individual techniques mentioned above was still clear-cut into the 1990s, there are more and more hybrid approaches today. BASIC was developed further into the compiler language Visual Basic, and Java and C# are first compiled and then executed in a virtual environment (a combination of interpretation and JIT compiling).

But as early as the 1960s a language with hybrid status emerged, namely Forth, which combines a compiler with an inner interpreter that processes the "intermediate code". The idea here is aimed at fast portability and compactness of the overall system, even if this comes at the price of compromises in execution speed.

There were also compilers for good old BASIC - but these were mostly of the kind that produce pseudo-code. Ultimately they replaced the source code with P-code or sequences of subroutine calls and thus only saved the most expensive parsing work that would otherwise take place at interpreter runtime. The interpreter in ROM is still needed to execute the actual BASIC commands and to support the data types, e.g. floating-point arithmetic.

C++ is also available as a .NET version, which differs significantly from native C++ in terms of application and syntax.

The combination of a compiler language and assembler already existed in the days of the C64: all good C compilers allow inline assembly, i.e. the use of assembler code directly within C (see the sketch below).
Even pure interpreter languages such as interpreted BASIC made it possible to call programs written in assembler or C, or to access memory directly - POKE/PEEK and SYS send their regards. BBC BASIC even provides an inline assembler, so that machine code merges directly with the BASIC program.
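
As a hedged illustration of inline assembly: the exact syntax is compiler-specific, but with cc65, a C cross-compiler for 6502 systems such as the C64, embedding assembler statements in a C function looks roughly like the sketch below. The register $D020 really is the C64's border color register; everything else is only an example.

  /* Sketch assuming cc65's inline-assembler syntax (e.g. cl65 -t c64 border.c);
     the __asm__ statements contain 6502 assembler and will not build with an
     ordinary PC compiler.                                                      */

  void flash_border(void) {
      __asm__("lda #$00");        /* black                                 */
      __asm__("sta $d020");       /* write to the VIC-II border color      */
      __asm__("lda #$01");        /* white                                 */
      __asm__("sta $d020");
  }

  int main(void) {
      flash_border();             /* C code and assembler mixed in one file */
      return 0;
  }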