Computing-thinking-Output
Worldmap
Context |
| ||
Goal |
Cultivate computational thinking by understanding computer architecture and understand the frontiers of the computer industry | ||
Criteria | Output | Process | Input |
|
|
|
|
Walking through the book, you will find that in addition to putting a "world map" at the very beginning, we also put a "map" at the beginning of each chapter. Why? We have found that logic models (i.e. maps) are very important and even more important than the knowledge you are learning (which will be specified in the preface).
While learning computer architecture knowledge, you should also try to write your own logic model to cultivate your computational thinking (details afterwards).
You are already a mature code monkey who should learn to write your own logic model :-)
Get Ready, Start Now!
Logic Model
Context | A group of noobs in computer systems and computational thinking want to get a rough view of the entire computer hardware and software system by this book. | ||
Goal | Appeal reader to read this book and get a preliminary grasp of computational thinking. | ||
Criteria | Output | Process | Input |
|
|
|
|
Editor's Words
- Hahaha, welcome into this big hole and see how many things you will have to learn!!! But don't be afraid, we will tell you about our experience and lessons.
- Although this system implements both hardware and software, the whole system is virtual using the simulator, so most of your work is coding (rather than hold a welding torch to drum the hardware in the garage)! So you can use version control tools to manage your code, like Git!
- Why Git? When you are learning programming language and looking at what other programmers are using, you will find that the textbook only teaches you the grammar (such as the output "Hello World"), but does not teach you how to deal with a big project (Have you ever thought about doing an instant messaging tool? Can you do it with pure C syntax?).
- Traditional programming materials will not teach you many high-quality tools and libraries (it is impossible as the industry upgrades very quickly, and it is better to look at the documentations). Using these tools and libraries (and even editing and creating these tools and libraries) is the right thing to do for creating software development efficiently with your programming skills (if you want to learn Eskimo, but you don't have Eskimos around, then why you learn it?)
- So we hope that while you are learning the knowledge (making wheels), you will also learn to use these advanced tools (wheels) so that you can reuse other people's code/tools and save time. (Since you have ready-made wheels, why bother to recreate the wheel yourself? However, you can create the wheels like compiler and operating system again for deeper understanding).
- The higher requirement is that at the end of your journey, you can try to summarize what you have learned into a book like us, and summarize it into a logic model (the details of the logic model will be introduced later). You will find that coding is the easiest, and the hardest is combining these "fragments" of knowledge into a three-dimensional structure and compress it into a single sentence. We hope that after learning this book, your biggest progress is not knowing how to code or what the hell a computer has, but realizing the essence of abstraction, modularization, and staged thinking by observing the construction of the computer, and apply this way of thinking to your study, work, and life.
How to read this book
- Each chapter is divided into five parts, namely: Logic Model, Introduction, Core Ideas, Skill Instruction, and Advanced
- You should read the logic model and the introduction before reading the contents of the book
- When you encounter difficulties in finishing the problems in "Computer System Elements", you can first refer to the core ideas (no direct answer, only give hints on ideas)
- If you encounter technical difficulties in the process, you can read the Skills Guide section (targeted answers to common BUG skills)
- After completing the homework, you need to read the Advanced. It will enhance the reader's cognition and computational thinking from the higher dimensions of the industry frontier and thinking mode.
- Check the logic model to make sure that you don't miss anything, then go to the next chapter
Come on, my warrior!
Logic Model
- What is logic model? The “world map” section gives a sample of logic model with six parts: Context, Goals, Criteria, Output, Process and Input (and sometimes external factors).
- Why do we want to use a logic model? When you want to accomplish something, a logic model will tell you:
- What do you want (Context and Goals)
- How do you judge whether you have achieved your goal (Criteria)
- What do you want to accomplish (Output)
- How do you achieve it (Process)
- What do you need (Input)
- Why do we emphasize logic model? The homework tests your skills while the logic model tests your thinking. It is like a map, and the fact that you can draw this map means that you have a clear understanding of the logic of the issue.
- How to write a logic model? You should start by your motivation: first establish the context and goals, then develop specific, quantifiable criteria and corresponding output, and then think about what process we need to go through to get output, what resources are needed in the process (Input).
- References
Computational Thinking
- The definition of computational thinking: Computational thinking is a method that reinterprets a seemingly difficult problem into one which we know how to solve by reduction, embedding, transformation, and simulation; it uses heuristic reasoning to seek answers, that is, thinking methods for planning, learning, and scheduling under uncertain situations.[3]
- The history of computational thinking: First proposed by Professor Yizhen Zhou in 2006.
- The meaning of computational thinking: Embodied in a logic model, it can help people accomplish his/her goals more efficiently by consciously cultivating computational thinking
Other References
Besides this book, there are plenty of resources on the Internet to help you learn. If you are limited to this book and the materials at hand, I can only be sorry that you have not seen a bigger world of Internet technology. The Internet is indeed wide enough, but have you figured out that the place you surf every day is just the place you are familiar with, have you ever tried to "pass by" someone else's world? You‘re on Weibo (Note: Twitter in China) and Zhihu (Note: Quora in China) for hours every day, knowing and playing with people with the same interests as if it is the whole world, but is the world really only that big?
So let it go, buddy!
This book is intended to provide assistance to you, and you can also find many other places that offer help. For example, the Nand2Tetris website,which provide not only the necessary tools, but also many tips (although it will not provide answers directly).
Readers can also discover some deeper materials if interested. There are many good blogs on the Internet that worth of a glance.
- Yifeng Ruan's Web Blog
- There are a number of practical uses besides basic computer architecture.
- V2EX
- A gathering place for technicians?
- FreeBuf
- Do you want power, my kid?
- Bloggers on CSDN
- They form an Encyclopedia!
First Stack:Every NAND gate counts for ALU
(Note: The original title is an analogy of Napoleon's "every french soldier carries a marshal's baton in his knapsack" in Chinese, and this translation follows the same rule (and do you see a pun that comes naturally?))
Logic Model
Context |
| ||
Goals |
| ||
Criteria | Output | Process | Input |
|
|
|
|
Introduction
1.George Boole
- Foundation of Computer Theory
- The Mathematical Analysis of Logic (1847)
- An Investigation of the Laws of Thought (1854)
2.Claude Shannon
- Systematic theory support for Boolean logic instantiation
- Analysis and Switching Circuits
3.Wing.J.M
- Initiator of computational thinking
- Wing.J.M Compiutational Thinking
Core Ideas
- Code is written line by line, but it is actually simulating the stitching of a circuit. For a logic gate, both its inputs and outputs are unordered.
- There can only be one input per pin but arbitary number for outputs. So you can connect the OUT of one chip to the IN of multiple chips.
- The specific data bits of the internal pins cannot be directly connected. To do this, you can create a new chip and use this pin as an input pin.
- The combinatorial logic chip needs to calculate the result immediately, so the connection among combinatorial logic chip cannot form a loop. The calculation result of the sequential logic chip is released in the next cycle, so it is legal to have at least one loop in sequetial logic chip.
- At the transition point of two adjacent clock cycles, the output of the DFF is updated, so all operations on the logic must be completed within one clock cycle. The faster these operations are done, the smaller the clock cycle and the higher CPU efficiency.
- A reference to the chip call relationship:
Note: In this figure, the chip with a white background indicates the built-in chip. The blue background indicates the chip that needs to be implemented in the first chapter of nand2tetris, the green indicates the one in the second chapter, and the yellow indicates the third chapter.
Skill Instruction
- HDL syntax
- Use chip call flexibly
- Suggested Addition:Xor16
- Ugly idea:
- No chip call,spread out manually: tedious!
- Write a program to manually spread out the code:
Problem:The code is too long to compile RAM16K within an hour.
- The customized array can not be accessed by interval [l..r],only in and out types can. To call a customized array one should create a new .hdl file to convert it to in and out type.
- Note that the outputs can be segmented,XXX(out[0..15]=a[0..15],out[8]=b) (used when switching between 15 and 16 bits))
- "No animation" is preferred during Hardware simulator test. If there're any problems, switch to "program flow" for debugging。
Advanced
Deep Thinking
- This chapter is a basis for later CPU construction.
- At the transition point of two adjacent clock cycles, the output of the DFF is updated, so all operations on the logic must be completed within one clock cycle. The faster these operations are done, the smaller the clock cycle and the higher the CPU efficiency.
- HDL
- HDL is a declarative programming language[4]
- Understanding the logic gate principle and relationship from a physical perspective
- Four elements of computational thinking
- Focusing
- Staging
- Modularity
- Abstraction
- Contract
- Concept: A contract refers to the proper analysis, the determination of input and output, and the most effective way to accomplish the task based on researches before doing a task. Whether the task is effectively completed and the contract is reached is determined by comparing with the originally set output. This mode of thinking is an important part of computational thinking.
- Significance: By formulating a contract, you can clarify your work objectives and achieve better results.
- Demonstration: Task splitting is a comprehensive demonstration of computational thinking. The first thing to do is task abstraction, then modularize the task in steps after setting the goals; then by staging the task to modules, the problem is solved by focusing on each layer at one time.
Industry Frontier
- The assistance that group theory provides to the connections among logic gates
- Link:File:Physics-Topology-Logic-and-Computation-A-Rosetta-Stone.pdf
- Introduction videos:
- Viewpoint: Transform all logic gates into arrows (symbols), and by studying the relationship between these arrows, we could obtain aimed possible relationship combination and reasonable connections.
The Second Stack:Fight for 3 hours to build a CPU
(Note: This title is an analogy of the joking slogan "Fight for 3 weeks to build a computer" for "Computer Orgnization", one of the compulsory courses in Tsinghua's CS bachelor's degree program (it's real, actually))
Logic Model
Context | The reader has mastered the basic gate circuit and wants to conbine the basic gate circuit to get them work together. | ||
Goal | Build a programmable computer at the hardware level; understand the syntax of machine language and assembly language; build a strong basis for the following implementation of the assembler compiler | ||
Criteria | Output | Process | Input |
Assembly language code passes the test after translation Computer chip successfully runs the given program |
Assembly language code Core hardware system(CPU,RAM32, etc.) |
Learn the syntax of machine language and assembly language Complete the small tasks given in the book in assembly language Build CPU, RAM and other chips according to the syntax of the machine language Splicing individual chips |
The related knowledge of combinatorial logic and sequential logic The chip completed in the first chapter Assembler, CPU simulator, hardware simulator |
Introduction
In this chapter, we need to build chips of CPU, Memory and Computer to complete the hardware part of the Hack machine.
We shall learn the syntax of the machine language and assembly language before we make the chip, since the CPU executes the instructions in the machine language, and then we shall write several small programs in assembly language. This on the one hand allows us to understand how the code of the machine language is executed on the hardware, on the other hand, it will familiarize us with the assembly process and lay a good basis for the next chapter.
Core Ideas
- The structure of the CPU is complicated, however HDL is a declarative language whose order between the statements has no practical meaning. Just connect the lines in the structure diagram of the CPU one by one.
- You can use Mux to implement the function of the if statement. As for else if--what do you think?:-)
- Reference for chip call relationship:
Note: In this figure, the chip with a white background indicates the built-in chip. The blue background indicates the chip that needs to be implemented in the first chapter of nand2tetris, the green indicates the one in the second chapter, the yellow indicates the third, and the orange indicates the fifth.
Skill Instruction
- Compilation
- Constant Assignment:When we need to calculate the values and constants in the A/D register, the C-instruction can only +1/-1 the value in the register. If we need to use large constants, the A-instruction provides the only way to input arbitrary constants into the computer.
- Simultaneous Assignment: an instruction like "AMD=xxx" can be used to assign values to multiple registers/memory locations at the same time.
- Address Conflict: The value in the A register can be used as the address of the data memory or the instruction memory. In order to avoid conflicts, you should not both use M and possibly triggers jump in the same instruction.
- Memory
- The input/output device occupies data memory at a specific address. When you tap a key on the keyboard, its corresponding ASCII code value will appear in the corresponding memory unit of the keyboard. When you modify the memory segment corresponding to the screen, the pixels on the screen will also change.
- Chip
- The A and D registers are two implemented chips, named ARegister and DRegister, which function the same as the Register chip you built in Chapter 1.
- Multiple if can directly use Mux8Way16 written in first chapter
Advanced
Deep Thinking:
- The syntax for computing with the C command looks similar to a high-level language, but actually in the command "A=A+1", "A+1" appears as a whole, not in the general sense. This is why the C command cannot calculate arbitrary constants.
- Hack computers use Harvard architecture for storage where data memory is separated from instruction memory. Therefore, the Hack language can only modify the contents of the data memory, but can not modify the contents of the instruction memory (that is, itself). Hereis an explanation of the differences between the Harvard structure and the Von Neumann structure.
Industry Frontier:
- Assembly Language
- In addition to the Hack introduced in the textbook, there are many well-known assembly languages in the industry, such as x86 (for intel x86 architecture) and 6502 (for FC).
- They can be divided into two categories,the Reduced Instruction Set (RISC) and the Complex Instruction Set (CISC) In general, the former is used for simpler cases (single-chip microcomputers, embedded development), while the latter generally handles more complex tasks (computers). The former includes ARM, RISC-V, while the latter includes Intel、AMD. Among them RISC-V is an open source machine instruction developed by Berkeley, which interested readers could learn by him/herself.
- CPU
- Famous CPU include Intel(such as common i3、i5、i7 processor), AMD(Ryzen)Qualcomm's Opteron processor (widely used in mobile devices such as mobile phones, tablets, etc.) and A12 processor carried by Apple mobile devices.
- Additionally you could refer to FPGA related literature and try to build a CPU that can run in the garage! (Like Steve Jobs!)
- Here is a brief history of CPU development
- Famous CPU include Intel(such as common i3、i5、i7 processor), AMD(Ryzen)Qualcomm's Opteron processor (widely used in mobile devices such as mobile phones, tablets, etc.) and A12 processor carried by Apple mobile devices.
- Other hardware devices
- Motherboard
- The Motherboard is an integrated circuit. It is equivalent to a bridge that connects a large number of chips and provides communication channels to integrate them into one.
- BIOS
- Short for Basic Input/Output System.It used to be written in ROM (Read Only Memory), but now more often in EEPROM or flash memory(which could be erased and written repeatedly) due to the increasing complexity and more frequent update of the BIOS program.
- Memory
- The same "Memory" we hear in "This computer / mobile phone runs in 8G memory". Random Access Memory (RAM) means any address can be accessed within same time interval. This is a crutial feature as it ensures that a complete thing can be done in a single period (note the connections with the sequential logic above!), which solves the big problem of computer storage and reading.
- There is no file system in the Jack computer, but in actual systems it is not productive. In real life, most of the personal computers have hard disk, optical disk and other storage devices, and nowadays a popular option is the solid state disk (SSD).
- GPU
- Short for Graphic Process Card, which is mainly used to process graphics. Its special architecture (large number of computing cores--for example GTX1080 has 2560 computing cores, while a CPU only have 2 to 6 cores) enables powerful parallel computing, thus it is also widely applied in neural network training and bitcoin mining. Major releasers include NVIDIA (the latest available RTX 2080Ti) and AMD.
- Sound card, screen, mouse, keyboard and other peripherals
- In fact, these peripheral devices are composed of chipsets plus special parts to meet the different needs of users. Say, a machine only has CPU, RAM and ROM (that you can't modify) and infrared emitter, it can also be regarded as a computer (a bit special though). This is called "Embedded Development".
- Motherboard
- Reference
- Wikipedia
- Computer Science: An Overview
The Third Stack:Assembly Compiler->Able-to-write Complier
(Note: In Chinese the word "Assembly" (pronounced as "hui-bian") sounds the same as "be able to write"(also "hui-bian"))
Logic Model
Context | Readers are familiar with the instruction set of Hack machine language and the syntax of corresponding assembly language (A-instruction and C-instruction), but do not understand the principle of assembly compilation | ||
Goal | Understand the principle of assembly and compilation, understand the significance of assembly and compilation, and preliminarily understand the engineering method of software engineering. | ||
Criteria | Output | Process | Input |
Whether one can write an assembly program that can pass the test or not |
Unsigned assembly compiler (Parser, code) and signed assembly compiler (Parser, code, symboltable) |
Master the correspondence between assembly language and machine language, understand the process of assembly and compilation Write Parser, code, symboltable step by step Understand the significance of the assembly process from the perspective of the entire computer |
Assembly language and machine language High-level language (Python recommended) Test tool Assembler |
Introduction
- From the "assembly compiler": This chapter will try to build the assembly compiler and reveal the engineering methods of the software part.
- To "Able-to-write Compiler":Grasp the modular approach to this chapter and you could realize other compilers
- Existing Assembly Language:X86、PowerPC、ARM……
- Classification:The difference in CPU leads to the difference in assembly language. Reduced Instruction Set (RISC) and Complex Instruction Set (CISC)
- Origin: In December 1953, IBM programmer J. Backus wrote a memo proposing a new programming language for IBM 704 as the machine language is considered not convenient enough. Bacchus led a 13-person team to complete the first computer high-level language, the FORTRAN language, in 1954. In 1966, the United States unifies its standard, called the FORTRAN 66 language. Bacchus therefore took the 1977 "Turing Award".
Core Ideas
- Assembly language is a mnemonic that can be literally translated.
- Workload is large here, therefore the whole assembly compiler is split into three parts: Parser, Symboltable, and Codewriter whose functions are realized separately, and finally these three parts are put together to form a complete program.
- From this chapter you will enter the software section of the book and begin writing a variety of "compilers" that translate the language of the upper layer into a lower-level language.
Thus, the "compiler" of each chapter is related to the language of the current and previous chapter so it is hard to complete the assignment if you are not familiar with them enough.
So before doing the homework for each chapter, it's better to review contents related to syntax in previous chapters, and give extra attention to syntax in current chapter to ensure that you have enough knowledge of both languages.
Skill Instruction
- This chapter is the first project written in a high-level language, and the simplest one. As a practice project, the focus of this chapter should be on standard modular methods such as reading and writing files.
- Parser's core function is CommandType, which is used to determine the type of each line of statements; the core of Symboltable is the keyword list, and the core of Codewriter is the write function.
- There is a correspondence between the six-bit instruction set and the input end of the ALU , and the 0/1 of the instruction set is also associated with 0/1 in the digital circuit. Understanding the relationship helps the creation of Symboltable.
- When dealing with extra symbols such as comments, spaces, etc., you can take advantage of string functions in Python (more convenient than other languages).
- When dealing with parentheses, internal coding requires some high-level language features more than the underlying computational storage logic.
Advanced
- Assembly language is a kind of mnemonic, which is a shorthand for 0/1 machine language. Its composition logic is consistent with 0/1 language, which is in essence a direct paraphrasing.
- Modular approach: Modularity refers to the process of dividing a system into several modules from top to bottom when solving a complex problem. There are multiple attributes that reflect their internal characteristics. --Baidu Encyclopedia
- Why modularity: Modularity is not only conducive to the decomposition of large workloads, but also beneficial to the management of the division of labor. At the same time, the approach to designing a modular is also an exercise in computational thinking.
- The reason to modularize is that both a compiler and a virtual machine are translators. A translator must have a translation part--Parser; besides there is also correspondence--Symboltable; and finally there is output for new language--the Codewriter part. The premise of this kind of fragmentation is to have a deep understanding of the nature of such software, that is, to analyze and abstract the entire software hierarchy through computational thinking.
- The methods to modularize Parser, Symboltable and Codewriter used in his chapter are also applicable in following chapters.
- Why do we implement such low-level software with a high-level language? Please refer to https://www.coursera.org/learn/nand2tetris2/lecture/YbkZG/unit-0-0-machine-language-primer
Recommendation
- An introduction video to Assembly Language: https://www.youtube.com/watch?v=ViNnfoE56V8
- Definition to Assembler: https://zh.wikipedia.org/zh-hans/%E6%B1%87%E7%BC%96%E8%AF%AD%E8%A8%80
The fourth stack: How To Run A Computer On A Computer
Logic Model
Context | Readers are familiar with assembly language and want to continue to complete the conversion from machine language to high-level language | ||
Goal | Learn the knowledge of virtual machines, and deeply understand 2-Tier Compilation, abstraction and other related ideas through the process | ||
Criteria | Output | Process | Input |
The assembly code to which VM translator translate from VM code could pass the test by emulator in toolkit |
VM translator written in high-level language |
|
|
Introduction
Dear readers, if you have completed the previous content in the order of this book, you will find that you have successfully completed the construction of one most primitive computers. We started with the most basic NAND gates, built out the sequential circuits, ALU, RAM and ROM, and finally combined them together to build a complete computer.
Such a computer is essential but also difficult to operate (imagine a modern programmer who needs to process a picture pixel by pixel--what a big workload!). So based on facts, we need to "abstract" some common features, and that's why class and function appear. Meanwhile for safety concern, we must implement a certain encapsulation of the original functions of the computer, leaving only some secure ports for the programmer to use, which gave birth to a variety of high-level languages. We use the high-level language Jack in this book.
- Two-tier translator: In this book, we are not directly translating assembly language into a high-level language, but introducing an intermediate platform: the virtual machine language. Why? There are two reasons: one is to reduce the difficulty--the difference between high-level language and machine language is significant, and you're likely to have trouble if you connect them directly; the other reason is that people use different high-level languages for different purposes, so it is time-consuming and ineffective if we rewrite a compiler from high-level language to machine language every time we make small innovations. Instead we extract a set of the most basic, rarely changed operations from different high-level languages, and integrate them into a new language as a platform to connect many high-level languages and machine languages. Hence the virtual machine language was born.
- Stack operation: The basic principle of a virtual machine is to take advantage of stack operations. Reader will learn more about the role of heap after learning the compiler.
Core Ideas
- The virtual machine language is LL(0), so it can be processed sentence by sentence.
- Don't store the SP in a high-level language, instead just find a piece of memory to store SP! (Do not worry about insufficient memory or being reused.) After initializing the SP, a piece of memory is obtained for the stack. Each push and pop operation method finds the top of the stack according to the SP. After the operation, the SP is moved.
Skill Instruction
- C++ has fewer string functions. It is recommended to use Python and call packages such as re, itertools.
- Temp is the only one that does not change the address from 5, however the first address of other memory areas will change, stored in LCL, ARG, THIS, THAT, find the first address each access, then address translation (see 5).
- The return in the virtual machine language returns only one value (for the convenience of processing, if void function is compiled, 0 will be pushed)
- -1 refers to true while false uses 0 (note that this is different from C language expression! But it is convenient for bitwise bit operations)
- Other logic operations like eq may use C instructions
- Consider @13~@15 when you need to save intermediate variables
- For call, function, return processing method, see table on page 163
- All arithmetic operations that can be used are listed on the table in p67. By using
- @xx
- D=A
- @M
- D=D+M
- A=D
- you can save a lot of A=A+1 'for' loops and jump directly to the desired address (there can be multiple variables on the left side of the equal sign, such as AMD=xxx)
- .Debugging method:
- GDB debugging tool
- Print out intermediate variables, observe value changes, set conditional breakpoints, etc.
Enhanced Understanding
- There are many ways to translate the code since this chapter. We should find for the optimal solution. One way to optimize your code is to use Mathematica to enumerate a lot of code and search for the shortest path.
- Modularity is divided into two parts: syntactic analysis and code generation, with each part subdivided internally. No matter what language is used, it is recommended to stick to the API specified in the textbook to ensure the modularity and encapsulation of the program.
- To ensure staging, you must keep the current code module after Chapter 7. Be careful when you change the code after Chapter 8.
- Be sure to make later debugging easier during the process of coding and designing (be careful with exception handling; add comments to the translated code, etc.).
Advanced
WebAssembly (WASM in short, introduced by W3C in March 2017) is a binary encoding set for virtual machines based on stack operations. It can instantly convert a high-level language (currently supporting C++) into a webassembly language and run it on a web page. The advantages includes:
- Fast and efficient
- Available under network, more portable
- Support different high-level languages for easy cooperation
- Secure
It is equivalent to the abstraction layer of the virtual machine in the hack computer.
The JVM (created by Sun Microsystems in 1994) is fully called the Java Virtual Machine.It is designed to provide a computer model based on abstract specification descriptions, providing great flexibility to interpreter developers while ensuring that Java code runs on any system that conforms to the specification. The JVM gives specific definitions of certain aspects of its implementation, particularly the explicit specification of the Java executable code, namely the Bytecode format. This specification includes the syntax and value of the opcode and operand, the numeric representation of the identifier, and the Java object in Java class file, the memory map of the constant buffer pool in the JVM. These definitions provide the information and development environment required for JVM interpreter developers. Java designers hopes to give developers the freedom to use Java.
Note: The virtual machine model in the book Nand2Teris imitates JVM.
The Fifth Stack:Jack!Joker!!!
Logic Model
Context | Readers have mastered the knowledge of virtual machines | ||
Goal | Establish a high-level language-compiler-operating trinity value system | ||
Criteria | Output | Process | Input |
Jack applet generates the expected effect in the simulator The .xml output from the compiler is the same as the sample .xml (can be verified by TextComparer) The .vm output from the compiler works fine in the simulator and achieves the desired effect. The OS functions the same with the given sample when running, after being compiled by the compiler you have written (through various tests) |
Your own Jack applet Two-stage compiler source code (several .cpp, .java files) and a runnable compiler program Operating system *.jack file |
Learn the syntax of the Jack language Write a small program with Jack Construct the first stage compiler (output as .xml) Construct the second stage compiler (output is .vm) Write an operating system with Jack |
Virtual machine related knowledge Ability to program in high level languages Virtual machine simulator, Jack compiler |
Core Ideas
- Although Jack, the compiler, and the operating system are three chapters being separated in the book, they are actually interrelated. For example, if the reader is not familiar with the operating system, he/she will find it difficult to understand and handle the compilation details of some operating systems when dealing with compiler.
- One needs to set exactly two pointers, this and that in the VM language: "this" is mostly used as pointer to the address of the current class object, while "that" is commonly used to point to the address of the array to be operated.
Skill Instruction
- Review
- Review the knowledge of the virtual machine and the syntax structure of VMCode before implementing the compiler.
- Meanwhile review the strategy used when writing the compiler and the program API. You can reuse some code (such as file input and output when writing the compiler).
- Confusion
- First read the chapter of operating system to understand what happens when you write a function that calls an operating system function.
- You can refer to the virtual machine code for the operating system in downloaded files for better understanding of the compilation details.
- How to compile parts of non-LL(0)
- The Jack language is almost a language of LL(0), but there are some places where you need to read two characters to judge (Hint: in the rules of term). General practice: When the program Parse goes to a specific place, read two characters at a time to judge.
- How to handle Array and String classes
- When a program declares an Array or String, you don't need to allocate the space needed for the class! The space allocation is done by Array.new(int size) and String.new(int length)! It is the duty of the constructor and not the duty of the compiler!
- How to handle the string constant "xxx"
- How to deal with this String constant? Turn it into an anonymous String class!
push constant size call String.new 1 /*anonymous String object*/ push constant xx(ASCII code of a character) call String.appendChar 2 /*You needn't to push again as this String object has been in memory already!*/ push constant xx call String.appendChar 2 ……
- Function call: method or function?
- When the function f() is called, the default mode is the method of the current class object this. If it is the function of the current class XXX, the Jack program will write XXX.f().
- When calling function A.f(), if A is a class object already defined by the current function (local variable/parameter/class member and static object), then f is method, and its corresponding class is the class of the object; otherwise it is function , A is the class name.
- Code
- Strictly follow the software engineering principle of “high cohesion and low coupling”, and avoid various functions in a single piece of code.
- Pay attention to the possibility of code reuse to reduce the amount of duplicate code and workload
- Variable naming should be as close as possible to the role of the variable, and not too long / overly abbreviated to increase readability
- If you have a lot of variables in your own code and the relationship is complicated, you can draw a picture on paper to indicate their relationship.
- This is not significant in the programming of the previous chapters because they are not too complicated.
- But it is still useful if you are working on a larger project in the future.
- (Examples are as follows):
- Debugging
- Printing intermediate variables is one way to debug, although a bit less intuitive. You can use breakpoint tools like gdb
- You can process Jack code that restore the operating system by referring to operating system virtual machine code and use it to check if the compiler has potential bugs
Advanced
- High-level Language
- In addition to Jack's "simple" high-level languages, there are many high-level languages, such as C++/Python/Java (perhaps you are using one of them to write the compiler for this book)
- There are also many types of higher-order languages, such as the declarative programming (HDL) mentioned above, while C is imperative programming, Java-style object-oriented programming, and so on.
- Compiler
- There are different compilers for different languages, and the generation of compilers is a more complicated process. Interested readers can read the "Construction and Interpretation of Computer Programs" (SICP) and "Compilation Principles" for in-depth understanding and learning. LLVM is a highly efficient tool for large-scale production of compilers and is worthy of attention.
- Operating System
- Popular OS include Windows, MacOS, Linux, etc. Although Linux is less common among individual users, it runs on a large number of servers.
- Birdbrother's linux Private Kitchen
- Linux is also a mainstream operating system, but it is slightly "distant" from everyday PCs. However, when you have a deep understanding of Linux, you will not only have a clear understanding of the entire computer (you must have to know these to use Linux), but also get rid of the "you may be a victim of pirate Windows" predicament!
- Refer to this article to get 101 knowledge of Linux: Work entirely with Linux!---Yin Wang
The Sixth Stack:Sword To Horizon!
Logic Model
Context | The reader has finished the whole process of constructing the computer. | ||
Goal | Write a Tetris using the knowledge learned in the process | ||
Criteria | Output | Process | Input |
The Tetris features basic block shapes, block transforms, block elimination, etc. |
Jack source code for Tetris |
|
|
Core Ideas
You have completed this training,
and we can't give you more.
Time to display your creativity,
and take the sword to the end of the earth.
You have been grown from a leek to be "harvested"
into a "reaper" with supreme power!
Now the computer world is yours ! ! !
References
- Wikipedia-zh,en
- Digital Publications for Foundamentals of Computational Thinking and System Design-2018 Spring