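TinyVM

Original repository: https://github.com/jakogut/tinyvm

TinyVM is a virtual machine that runs an assembly language similar to Intel x86. Its goals are low memory usage, a small code base, and a small binary.

It can be built on UNIX-like systems with make and GCC.

make

or

make rebuild

To build a debug version, add DEBUG=yes after the make command.

To build a version with profiling enabled, add PROFILE=yes after the make command.

There are no external dependencies other than the C standard library.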

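TODO

Basic features

  • Support for NASM-style local labels
  • Support for defining bytes/words/dwords
  • Add C-style defines to the preprocessor
  • Move parsed TVM program code into the VM's virtual address space

Advanced features

  • Fix/refactor the debugger (it currently does not work)
  • Interrupt support
  • SDL- or GLFW-based screen for displaying the contents of a framebuffer
  • JIT compilation
  • C interface
  • C standard library rewritten in TVM instructions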

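Syntax

The virtual machine loosely follows traditional Intel x86 assembly syntax.

  0. Values
  1. Registers
  2. Memory
  3. Labels
  4. Preprocessor
  5. Instruction listing
    • Memory
    • Stack
    • Calling conventions
    • Arithmetic operators
    • Binary operators
    • Comparison operators
    • Control flow manipulation
    • Input / output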

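0. Values

Values can be specified in decimal, hexadecimal, or binary. The only difference between Intel assembly syntax and TVM syntax is the delimiter between the value and the base specifier. By default, a value without a base specifier is treated as decimal. Any value prefixed with 0x is treated as hexadecimal.

Values can also be specified with a base identifier. For example, 32 can be written as 0x20 in hexadecimal, as 20|h using the hexadecimal base identifier, or as 100000|b using the binary base identifier.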

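1. Registers

TVM has 17 registers, modeled after the x86 registers.

Register names are written in lowercase.

(EAX - EDX are general purpose)

  • EAX

  • EBX

  • ECX

  • EDX

  • ESI

  • EDI

  • ESP - Stack pointer, points to the top of the stack

  • EBP - Base pointer, points to the base of the stack

  • EIP - Instruction pointer; modified by the jump instructions, never directly

  • R08 - R15, general purpose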

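2. Memory

Memory addresses are written in brackets, in units of four bytes. Programs running in the virtual machine have their own address space, so no positive address within that space is off limits.

Note that some areas of memory are used for vital program functions, such as the memory used by the stack.

Overwriting these areas will likely crash the program, so avoid absolute addressing and address memory relatively instead.

To specify the 256th word in the address space, you can use [256], [100|h], [0x100], or [100000000|b].

Any syntax that is valid for specifying a value is also valid for specifying an address.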

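3. Labels

A label is specified by appending a colon to an identifier. Local labels are not yet supported.

A label must appear at the beginning of a line or on a line of its own.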

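4. Preprocessor

TVM's preprocessor works much like that of most C compilers; its directives begin with %.

The following directives are supported:

I. Include

%include filename

Pastes the entire contents of filename into the source code before it is interpreted.

II. Define

%define identifier value

Defines a constant, so that every instance of identifier is replaced by value.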

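5. Instruction listing

Instructions are shown in complete usage form, with example arguments, enclosed in square brackets. The square brackets are not used in actual TVM programs.

I. Memory

[mov arg0, arg1]

Copies the value of arg1 into arg0.

II. Stack

[push arg]

Pushes arg onto the top of the stack.

[pop arg]

Pops the value at the top of the stack and stores it in arg.

[pushf]

Pushes the value of the FLAGS register onto the stack.

[popf arg]

Pops the FLAGS register to arg.

III. Calling conventions

[call address]

Pushes the current address onto the stack, then jumps to the specified subroutine.

[ret]

Pops the address at the top of the stack into the instruction pointer (EIP), returning control to the caller.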

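IV. Arithmetic operators

[inc arg]

Increments arg by 1

[dec arg]

Decrements arg by 1

[add arg0, arg1]

arg0 = arg0 + arg1

[sub arg0, arg1]

arg0 = arg0 - arg1

[mul arg0, arg1]

arg0 = arg0 * arg1

[div arg0, arg1]

arg0 = arg0 / arg1

[mod arg0, arg1]

Calculates arg0 % arg1 and stores the result in the remainder register

[rem arg]

Retrieves the value stored in the remainder register and stores it in arg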

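V. Binary operators

[not arg]

Calculates the binary NOT of arg, storing the result in arg

[xor arg0, arg1]

Calculates the binary XOR of arg0 and arg1, storing the result in arg0

[or arg0, arg1]

Calculates the binary OR of arg0 and arg1, storing the result in arg0

[and arg0, arg1]

Calculates the binary AND of arg0 and arg1, storing the result in arg0

[shl arg0, arg1]

Shifts arg0 left by arg1 places

[shr arg0, arg1]

Shifts arg0 right by arg1 places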

VI. Comparison operators

[cmp arg0, arg1]

Compares arg0 and arg1, storing the result in the FLAGS register.

VII. Control flow manipulation

[jmp address | label]

Jumps to an address or label

[je address | label]

Jumps to the address or label if equal

[jne address | label]

Jumps to the address or label if not equal

[jg address | label]

Jumps to the address or label if greater

[jge address | label]

Jumps to the address or label if greater or equal

[jl address | label]

Jumps to the address or label if less

[jle address | label]

Jumps to the address or label if less or equal

VIII. Input / output

[prn arg]

Prints an integer

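Execution flow

  1. Read the source file
    1. Get the file size
    2. Allocate memory based on the file size
    3. Copy the file contents into the newly allocated buffer
    4. Close the file handle
  2. Create the virtual machine object (allocate memory)
    1. The virtual machine struct
  3. Parse the file contents
    1. Iterate over the file line by line
    2. Process each line as a string (include handling, define handling)
      1. Include handling
        1. Parse the file name out of the string and read that file
        2. Replace the include statement with the file's contents
      2. Define handling
        1. Check whether the name is already stored in the global hash table
        2. If it is not, add it to the hash table
        3. If it is, replace the macro with its actual value
    3. Lexical analysis
    4. Syntax analysis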
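
Step 1 of the flow above (get the size, allocate, copy, close) can be sketched in C as follows. This is only an illustration built on the standard library; the function name read_source and its error handling are assumptions, not TinyVM's actual code.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads a whole file into a NUL-terminated buffer.
 * Returns NULL on failure; the caller frees the result. */
char *read_source(const char *path, size_t *out_len)
{
    assert(path != NULL && out_len != NULL);

    FILE *fp = fopen(path, "rb");
    if (fp == NULL) return NULL;

    /* 1. Get the file size by seeking to the end. */
    long size = -1;
    if (fseek(fp, 0, SEEK_END) == 0) size = ftell(fp);
    if (size < 0 || fseek(fp, 0, SEEK_SET) != 0) {
        fclose(fp);
        return NULL;
    }

    /* 2. Allocate a buffer sized to the file, plus a terminating NUL. */
    char *buf = malloc((size_t)size + 1);
    if (buf == NULL) {
        fclose(fp);
        return NULL;
    }

    /* 3. Copy the file contents into the buffer. */
    size_t got = fread(buf, 1, (size_t)size, fp);
    buf[got] = '\0';

    /* 4. Close the file handle. */
    fclose(fp);

    *out_len = got;
    return buf;
}
```

A caller would pass the path of a TVM source file, hand the returned buffer to the parsing stage, and free it afterwards.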

TinyVM

TinyVM is a virtual machine designed for a small footprint: low memory usage, a small amount of code, and a small binary.

Building can be accomplished on UNIX-like systems with make and GCC, using "make" or "make rebuild".

There are no external dependencies, save the C standard library.

To build a debug version, add "DEBUG=yes" after "make". To build a binary with profiling enabled, add "PROFILE=yes" after "make".

TODO

Basic

  • NASM style local labels
  • Implement defining bytes/words/dwords
  • Add C style defines to preprocessor
  • Move parsed TVM program code into the VM's virtual address space

Advanced

  • Fix/refactor the debugger (it doesn't work)
  • Interrupts
  • SDL or GLFW based screen for outputting the contents of a framebuffer
  • JIT compilation
  • C interface
  • C Library written in TVM code

Please send patches to Joseph Kogut <joseph.kogut(at)gmail.com>

Syntax

This virtual machine loosely follows traditional Intel x86 assembly syntax.

  0. VALUES
  1. REGISTERS
  2. MEMORY
  3. LABELS
  4. PREPROCESSOR
  5. INSTRUCTION LISTING
    • I. Memory
    • II. Stack
    • III. Calling Conventions
    • IV. Arithmetic Operators
    • V. Binary Operators
    • VI. Comparison
    • VII. Control Flow Manipulation
    • VIII. Input / Output

0. VALUES

Values can be specified in decimal, hexadecimal, or binary. The only difference between Intel syntax assembly and TVM syntax is the delimiter between the value and the base specifier. By default, values without a base specifier are assumed to be in decimal. Any value prepended with "0x" is assumed to be in hexadecimal.

Values can also be specified using base identifiers. To specify the value 32, you can use 0x20 for hexadecimal, 20|h for hexadecimal using a base identifier, or 100000|b for binary using a base identifier.
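
The rules above can be illustrated with a small C sketch. The function parse_value is hypothetical and is not TinyVM's parser; it assumes that h and b are the only base identifiers and that a token fits in 63 characters.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: converts a TVM-style value token to a number.
 * Accepts "32", "0x20", "20|h" and "100000|b". */
long parse_value(const char *tok)
{
    assert(tok != NULL);

    char digits[64];
    int base = 10;                          /* default: decimal */
    const char *bar = strchr(tok, '|');     /* delimiter before the base identifier */
    size_t len = bar ? (size_t)(bar - tok) : strlen(tok);

    if (len >= sizeof digits) len = sizeof digits - 1;
    memcpy(digits, tok, len);
    digits[len] = '\0';

    if (bar != NULL) {
        if (bar[1] == 'h') base = 16;
        else if (bar[1] == 'b') base = 2;
    } else if (len > 2 && digits[0] == '0' &&
               (digits[1] == 'x' || digits[1] == 'X')) {
        base = 16;                          /* "0x" prefix */
    }

    /* strtol accepts an optional "0x" prefix when base is 16. */
    return strtol(digits, NULL, base);
}
```

With this sketch, all four spellings of 32 from the paragraph above produce the same number. Error reporting for malformed tokens is left out for brevity.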

1. REGISTERS

TVM has 17 registers, modeled after x86 registers. Register names are written lower-case.

(EAX - EDX, General Purpose)

  • EAX

  • EBX

  • ECX

  • EDX

  • ESI

  • EDI

  • ESP - Stack pointer, points to the top of the stack

  • EBP - Base pointer, points to the base of the stack

  • EIP - Instruction pointer; modified by the jump instructions, never directly

  • R08 - R15, General Purpose

2. MEMORY

Memory addresses are specified using brackets, in units of four bytes. Programs running within the virtual machine have their own address space, so no positive address within the address space is off limits.

It is important to note that certain areas of memory are used for vital program function, such as the memory used by the stack.

Overwriting such parts of memory will likely cause the program to crash. As such, it is recommended that no absolute addressing be used. Instead, address memory relatively.

To specify the 256th word in the address space, you can use [256], [100|h], [0x100], or [100000000|b]. Any syntax that's valid when specifying a value is valid when specifying an address.
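
To make the four-byte addressing unit concrete, here is a minimal C sketch of a word-addressed memory. The vm_memory type and the helper names are assumptions made for illustration and do not reflect TinyVM's internal layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { WORD_SIZE = 4 };   /* bytes per addressable unit, per the text above */

/* Byte offset at which the word with the given index starts. */
size_t word_to_byte_offset(size_t word_index)
{
    return word_index * WORD_SIZE;
}

/* Hypothetical word-addressed memory with bounds checking. */
typedef struct {
    int32_t *words;
    size_t   count;       /* number of words, not bytes */
} vm_memory;

/* Stores value at [index]; returns -1 if the address is out of range. */
int store_word(vm_memory *m, size_t index, int32_t value)
{
    assert(m != NULL);
    if (index >= m->count) return -1;
    m->words[index] = value;
    return 0;
}

/* Loads the word at [index] into *out; returns -1 if out of range. */
int load_word(const vm_memory *m, size_t index, int32_t *out)
{
    assert(m != NULL && out != NULL);
    if (index >= m->count) return -1;
    *out = m->words[index];
    return 0;
}
```

Under this model, [256] and [0x100] name the same cell, which begins 1024 bytes into the address space.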

3. LABELS

Labels are specified by appending a colon to an identifier. Local labels are not yet supported.

Labels must be specified at the beginning of a line or on their own line.
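
As a rough illustration of the label rule, the C sketch below recognizes an identifier followed by a colon at the start of a line. The function parse_label is hypothetical; it assumes identifiers consist of letters, digits, and underscores, and that leading spaces or tabs are tolerated, neither of which the text above specifies.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 and copies the label name into name if line begins with
 * "identifier:", otherwise returns 0. */
int parse_label(const char *line, char *name, size_t name_size)
{
    assert(line != NULL && name != NULL && name_size > 0);

    /* Assumption: indentation before the label is allowed. */
    while (*line == ' ' || *line == '\t') line++;

    /* Measure the identifier. */
    size_t len = 0;
    while (isalnum((unsigned char)line[len]) || line[len] == '_') len++;

    /* A label needs a non-empty identifier directly followed by ':'. */
    if (len == 0 || line[len] != ':' || len >= name_size) return 0;

    memcpy(name, line, len);
    name[len] = '\0';
    return 1;
}
```

A line such as "loop: inc eax" yields the label loop, while "mov eax, 1" is not treated as a label.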

4. PREPROCESSOR

TVM's preprocessor works similarly to many C compilers and uses the prefix "%".

I. Include

%include filename

Pastes all of the contents of filename into the source code before interpreting it.

II. Define

%define identifier value

Define a constant so that all instances of the string "identifier" will be replaced by "value".
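
The replacement performed by %define can be pictured as plain string substitution on each source line. The following C sketch shows one way to do that; apply_define is a hypothetical helper, and TinyVM's own preprocessor may work differently.

```c
#include <assert.h>
#include <string.h>

/* Replaces every occurrence of name in line with value, writing the
 * result to out. Returns 0 on success, -1 if out is too small. */
int apply_define(const char *line, const char *name, const char *value,
                 char *out, size_t out_size)
{
    assert(line != NULL && name != NULL && value != NULL && out != NULL);

    size_t name_len = strlen(name);
    size_t value_len = strlen(value);
    size_t used = 0;

    if (name_len == 0 || out_size == 0) return -1;

    while (*line != '\0') {
        const char *hit = strstr(line, name);
        size_t plain = hit ? (size_t)(hit - line) : strlen(line);

        /* Copy the text before the match (or the rest of the line). */
        if (used + plain >= out_size) return -1;
        memcpy(out + used, line, plain);
        used += plain;
        line += plain;

        if (hit != NULL) {
            /* Emit the replacement and skip past the matched name. */
            if (used + value_len >= out_size) return -1;
            memcpy(out + used, value, value_len);
            used += value_len;
            line += name_len;
        }
    }

    out[used] = '\0';
    return 0;
}
```

For example, with COUNT defined as 10, the line "mov eax, COUNT" becomes "mov eax, 10". Because this is raw string replacement, a name that appears inside a longer identifier is replaced as well; a token-aware version would need to check the surrounding characters.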

5. INSTRUCTION LISTING

Instructions listed are displayed in complete usage form, with example arguments, enclosed in square brackets. The square brackets are not to be used in actual TVM programs.

I. Memory

[mov arg0, arg1]

Copies the value of arg1 into arg0

II. Stack

[push arg]

Pushes arg onto the stack

[pop arg]

Pops a value from the stack, storing it in arg

[pushf]

Pushes the FLAGS register to the stack

[popf arg]

Pops the FLAGS register to arg

III. Calling Conventions

[call address]

Push the current address to the stack and jump to the subroutine specified

[ret]

Pop the previous address from the stack to the instruction pointer to return control to the caller

IV. Arithmetic Operators

[inc arg]

Increments arg

[dec arg]

Decrements arg

[add arg0, arg1]

Adds arg1 to arg0, storing the result in arg0

[sub arg0, arg1]

Subtracts arg1 from arg0, storing the result in arg0

[mul arg0, arg1]

Multiplies arg1 and arg0, storing the result in arg0

[div arg0, arg1]

Divides arg0 by arg1, storing the quotient in arg0

[mod arg0, arg1]

Same as the '%' (modulus) operator in C. Calculates arg0 mod arg1 and stores the result in the remainder register.

[rem arg]

Retrieves the value stored in the remainder register, storing it in arg

V. Binary Operators

[not arg]

Calculates the binary NOT of arg, storing it in arg

[xor arg0, arg1]

Calculates the binary XOR of arg0 and arg1, storing the result in arg0

[or arg0, arg1]

Calculates the binary OR of arg0 and arg1, storing the result in arg0

[and arg0, arg1]

Calculates the binary AND of arg0 and arg1, storing the result in arg0

[shl arg0, arg1]

Shifts arg0 left by arg1 places

[shr arg0, arg1]

Shifts arg0 right by arg1 places

VI. Comparison

[cmp arg0, arg1]

Compares arg0 and arg1, storing the result in the FLAGS register

VII. Control Flow Manipulation

[jmp address]

Jumps to an address or label

[je address]

Jump if equal

[jne address]

Jump if not equal

[jg address]

Jump if greater

[jge address]

Jump if greater or equal

[jl address]

Jump if less

[jle address]

Jump if less or equal
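
How cmp and the conditional jumps fit together can be sketched in C as follows. The two-bit flag layout and the names op_cmp and jump_taken are assumptions made for this example; the actual encoding of the FLAGS register is not described here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bits set by cmp. */
enum { FLAG_EQUAL = 1 << 0, FLAG_GREATER = 1 << 1 };

typedef enum { JMP, JE, JNE, JG, JGE, JL, JLE } jump_kind;

/* cmp: records how arg0 relates to arg1. */
unsigned op_cmp(int32_t arg0, int32_t arg1)
{
    unsigned flags = 0;
    if (arg0 == arg1) flags |= FLAG_EQUAL;
    if (arg0 > arg1)  flags |= FLAG_GREATER;
    return flags;
}

/* Decides whether a jump of the given kind is taken for these flags. */
bool jump_taken(jump_kind kind, unsigned flags)
{
    assert(kind <= JLE);

    bool eq = (flags & FLAG_EQUAL) != 0;
    bool gt = (flags & FLAG_GREATER) != 0;

    switch (kind) {
    case JMP: return true;          /* unconditional */
    case JE:  return eq;
    case JNE: return !eq;
    case JG:  return gt;
    case JGE: return gt || eq;
    case JL:  return !gt && !eq;
    case JLE: return !gt;
    }
    return false;
}
```

After comparing 3 with 5, for instance, jl, jle, and jne are taken while je, jg, and jge fall through to the next instruction.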

VIII. Input / Output

[prn arg]

Print an integer

MIT License

Copyright (c) 2013 Joseph A. Kogut

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
