SPEED/SPACE Benchmarks of Java Object Serializing(V1)
Summary
There have been many discussions about which Java serialization system performs best, yet I think some of those comparisons have problems:
- Sample data structure is too simple
- Testing program did not generalize serialization systems to evaluate each of them fairly
- Only a few serialization systems are involved
- Testing program is not extendable to involve more serialization systems
That is why this testing program and article were written: to offer another option for building your own evaluation of Java serialization systems.
Serialization systems involved
Testing results of concern
Speed of serialization
Speed of deserialization
Space cost after serialization
Generalization
Protocol Buffers requires static compilation of its message definitions, and the JDK built-in serialization protocol requires objects to implement java.io.Serializable, while the other frameworks serialize any plain Java object dynamically at runtime. To compare them all on the same baseline, the following generalization constraints are defined.
Structure Generalization
- All domain objects should have the same structure as defined by .proto, with converters provided to convert POJOs and protobuf messages back and forth.
- All domain objects should implement java.io.Serializable
Input Generalization
Within the same benchmark round, every serialization system receives exactly the same input data and runs exactly the same number of loops.
How to build and run
Build The Testing Program
- Enter the build directory
cd master
- Start a clean build
mvn clean install
Run The Testing Program
- Enter the benchmark directory
cd benchmark
- Start the run by typing
java -jar target/benchmark-<version>.jar
- Check the logs in ${user.home}/benchmark.log
Testing Program Design
Sample Testing Models
To cover a diverse range of cases and evaluate space and speed performance more thoroughly, sample testing objects should satisfy the requirements below.
- C1-01 Testing objects and the properties of them are created randomly
- C1-02 Testing objects should have integer/string/float/enum properties; all of these types are mandatory.
- C1-03 Testing objects are supposed to hold at least 1 collection property
- C1-04 Testing objects are supposed to refer to each other
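As a rough illustration of C1-01 to C1-04, here is a minimal sketch of a randomly generated sample model. The names SampleNode and SampleFactory are hypothetical and are not the project's actual TestingModels types. Reading C1-04 as a parent/child back-reference is also an assumption. Passing in a seeded Random is one way to give every serialization system the same input.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical sample model, NOT the project's real testing model.
class SampleNode {
    enum Kind { SMALL, MEDIUM, LARGE }                    // C1-02: enum property

    final int id;                                         // C1-02: integer property
    final String name;                                    // C1-02: string property
    final float weight;                                   // C1-02: float property
    final Kind kind;
    final List<SampleNode> children = new ArrayList<>();  // C1-03: collection property
    SampleNode parent;                                    // C1-04: objects refer to each other

    SampleNode(int id, String name, float weight, Kind kind) {
        this.id = id;
        this.name = name;
        this.weight = weight;
        this.kind = kind;
    }
}

// Hypothetical factory: every property value is drawn from the supplied Random (C1-01).
final class SampleFactory {
    private SampleFactory() {}

    static SampleNode randomNode(Random random, int childCount) {
        SampleNode root = create(random);
        for (int i = 0; i < childCount; i++) {
            SampleNode child = create(random);
            child.parent = root;       // child refers to parent
            root.children.add(child);  // parent refers to child
        }
        return root;
    }

    private static SampleNode create(Random random) {
        SampleNode.Kind[] kinds = SampleNode.Kind.values();
        return new SampleNode(
                random.nextInt(),
                Long.toHexString(random.nextLong()),
                random.nextFloat(),
                kinds[random.nextInt(kinds.length)]);
    }
}
```

Two calls such as SampleFactory.randomNode(new Random(42L), 3) produce samples with identical property values, because the same seed yields the same random sequence.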
Serialized Object
A Serialized Object is a wrapper around a POJO and is supposed to satisfy the requirements below.
- C2-01 Accepts a POJO for initialization
- C2-02 Provides a byte[] method without args for accessing the serialized byte array
- C2-03 Provides an int method without args for accessing the length of the byte array
- C2-04 Provides a String method without args for accessing the UTF-8 form of the byte array
- C2-05 Provides a String method without args for accessing the Base64 form of the byte array
- C2-06 Provides a method without args that returns the same type as the accepted POJO, deserialized from the byte array
- C2-07 The method required by C2-06 shall not return the POJO accepted by C2-01
- C2-08 Serialized Object is supposed to be IMMUTABLE
  - Provides no mutators that change the object state, like set*, add*
  - The method defined by C2-02 should return a defensive copy of the internal byte array
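The C2 requirements can be sketched as a small immutable wrapper. This is only an illustration built on JDK built-in serialization. The class name JdkSerializedSketch and its method names are assumptions, not the project's actual SerializedObject API.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical wrapper: no setters or adders, the byte array is fixed at construction (C2-08).
final class JdkSerializedSketch<T extends Serializable> {
    private final byte[] bytes;

    // C2-01: accepts a POJO for initialization and serializes it once.
    JdkSerializedSketch(T pojo) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(pojo);
        }
        this.bytes = buffer.toByteArray();
    }

    // C2-02 + C2-08: hand out a defensive copy, never the internal array.
    byte[] toBytes() {
        return bytes.clone();
    }

    // C2-03: length of the serialized byte array.
    int length() {
        return bytes.length;
    }

    // C2-04: UTF-8 view of the bytes (lossy for binary formats, useful for logging).
    String toUtf8String() {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // C2-05: Base64 view of the bytes.
    String toBase64String() {
        return Base64.getEncoder().encodeToString(bytes);
    }

    // C2-06 + C2-07: deserialize a fresh instance instead of returning the original POJO.
    @SuppressWarnings("unchecked")
    T toObject() throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return (T) in.readObject();
        }
    }
}
```

Because the original POJO is not kept as a field, toObject() cannot return it by accident, which is one simple way to satisfy C2-07.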
Benchmark Testing Objects
Space Benchmark Testing Objects
Space benchmark testing is the simpler one: generate random samples and record the space cost of each serialization system, one by one.
Speed Benchmark Testing Objects
To compare the serialization systems fairly, the following constraints are defined for Speed Benchmark Testing Objects:
- C3-01 Provides a method which accepts 1 Object argument, the object to be serialized, and 1 int argument, the number of loops.
- C3-02 Timing starts and ends with the method defined by C3-01; the total elapsed time and the average elapsed time of each serialization are calculated.
- C3-03 The execution order of the Speed Benchmark Objects can be adjusted freely at runtime.
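Constraints C3-01 and C3-02 can be sketched roughly as below. The interface name SpeedBenchmarkSketch, the serializeOnce hook and the Result holder are hypothetical. In this sketch the timing is done inline with System.nanoTime() to keep the example self-contained; it does not claim to reproduce how the project measures time.

```java
// Hypothetical speed benchmark contract, not the project's actual Benchmark interface.
interface SpeedBenchmarkSketch {

    // Performs one serialization of the target with a specific serialization system.
    void serializeOnce(Object target);

    // C3-01: accepts the object to serialize and the number of loops.
    // C3-02: measures the total elapsed time and the average time per serialization.
    default Result run(Object target, int loops) {
        if (loops <= 0) {
            throw new IllegalArgumentException("loops must be positive");
        }
        long start = System.nanoTime();
        for (int i = 0; i < loops; i++) {
            serializeOnce(target);
        }
        long elapsedNanos = System.nanoTime() - start;
        long totalMillis = elapsedNanos / 1_000_000L;          // total in milliseconds
        double averageMicros = elapsedNanos / 1_000.0 / loops; // average in microseconds
        return new Result(loops, totalMillis, averageMicros);
    }

    // Simple immutable holder for one benchmark run.
    final class Result {
        final int loops;
        final long totalMillis;
        final double averageMicros;

        Result(int loops, long totalMillis, double averageMicros) {
            this.loops = loops;
            this.totalMillis = totalMillis;
            this.averageMicros = averageMicros;
        }
    }
}
```

Since the interface has a single abstract method, a benchmark can be written as a lambda, for example SpeedBenchmarkSketch b = target -> target.toString(); followed by b.run(sample, 1000). Keeping benchmarks as plain objects in a list also makes it easy to reorder them at runtime, as C3-03 asks.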
Implementing Testing Program
SerializedObject is the base class of all serialized objects and complies with 2.2.
- Sub-types of SerializedObject tell their super class how to serialize the wrapped object into a byte array.
- Sub-types of SerializedObject tell their super class how to deserialize the byte array into an object.
- Sub-types of SerializedObject may implement beforeSerilize() to initialize the utilities needed during serialization.
- Checked exceptions caught by SerializedObject during serialization are wrapped in SerializationException and rethrown.
- SerializedObject provides factory methods to initialize its known sub-types, since their constructors are package-private.
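The division of labour between the base class and its sub-types resembles a template method. The sketch below illustrates that shape under several assumptions: every class name ends in Sketch and is hypothetical, the hook names doSerialize and doDeserialize are invented for illustration, and the wrapping exception is assumed to be unchecked. Only beforeSerilize() is spelled as in the description above.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for the project's SerializationException (assumed unchecked).
class SerializationExceptionSketch extends RuntimeException {
    SerializationExceptionSketch(Throwable cause) {
        super(cause);
    }
}

// Hypothetical base class: owns the byte array, delegates the format to sub-types.
abstract class SerializedObjectSketch<T> {
    private byte[] bytes;

    // Hooks that each sub-type implements (names are assumptions).
    protected abstract byte[] doSerialize(T pojo) throws Exception;
    protected abstract T doDeserialize(byte[] data) throws Exception;

    // Optional hook for preparing utilities before serialization.
    protected void beforeSerilize() {}

    // Factory method: callers never touch the package-private constructors.
    static SerializedObjectSketch<String> utf8(String text) {
        Utf8SerializedSketch wrapper = new Utf8SerializedSketch();
        wrapper.init(text);
        return wrapper;
    }

    final void init(T pojo) {
        beforeSerilize();
        try {
            bytes = doSerialize(pojo);
        } catch (RuntimeException e) {
            throw e;                                   // unchecked: pass through unchanged
        } catch (Exception e) {
            throw new SerializationExceptionSketch(e); // checked: wrap and rethrow
        }
    }

    final byte[] toBytes() {
        return bytes.clone();
    }

    final T toObject() {
        try {
            return doDeserialize(bytes);
        } catch (RuntimeException e) {
            throw e;
        } catch (Exception e) {
            throw new SerializationExceptionSketch(e);
        }
    }
}

// Trivial sub-type used only to make the sketch runnable.
final class Utf8SerializedSketch extends SerializedObjectSketch<String> {
    Utf8SerializedSketch() {} // package-private constructor

    @Override
    protected byte[] doSerialize(String pojo) {
        return pojo.getBytes(StandardCharsets.UTF_8);
    }

    @Override
    protected String doDeserialize(byte[] data) {
        return new String(data, StandardCharsets.UTF_8);
    }
}
```

With this shape, supporting another serialization system only means adding one sub-type and one factory method.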
Hessian2SerializedObject requires extra configuration, placed under META-INF, which specifies custom serialization and deserialization strategies.
Benchmark is the Speed Benchmark interface.
Benchmark defines the method for a single benchmark run, in compliance with 2.3.2.
Benchmark executions are intercepted by ProfilingAspect for elapsed time calculation.
ProfilingAspect records the total elapsed time in milliseconds and the average elapsed time of a single call in microseconds.
- Arranges the known Benchmark implementations.
- Runs each Benchmark implementation for 1,000, 5,000, 20,000, 50,000 and 200,000 times.
- Defines the thread pool which executes the benchmark tests and manages its lifecycle.
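The three responsibilities above can be sketched with a plain ExecutorService. The names BenchmarkRunnerSketch and LoopTask are hypothetical. Using a single-threaded pool is an assumption made here so that benchmarks do not compete for CPU; the project's actual pool configuration is not shown in this document.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical runner, not the project's actual class.
final class BenchmarkRunnerSketch {
    // Loop counts taken from the description above.
    static final int[] LOOP_COUNTS = {1_000, 5_000, 20_000, 50_000, 200_000};

    // Minimal stand-in for a benchmark: runs its serialization `loops` times.
    interface LoopTask {
        void run(int loops);
    }

    private BenchmarkRunnerSketch() {}

    // Submits every task once per loop count, waits for completion,
    // and returns the number of finished executions.
    static int runAll(List<LoopTask> tasks) throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newSingleThreadExecutor(); // assumption: one worker
        int completed = 0;
        try {
            List<Future<?>> futures = new ArrayList<>();
            for (int loops : LOOP_COUNTS) {
                for (LoopTask task : tasks) {
                    futures.add(pool.submit(() -> task.run(loops)));
                }
            }
            for (Future<?> future : futures) {
                future.get(); // rethrows any failure from the task as ExecutionException
                completed++;
            }
        } finally {
            pool.shutdown(); // lifecycle: always release the worker thread
        }
        return completed;
    }
}
```

The order of the tasks list decides the execution order within each loop count, so reordering or shuffling that list before calling runAll changes the order at runtime.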
Protocol Buffers messages must be pre-generated by static compilation. Moreover, the testing program uses lombok to avoid verbose code. If classes or dependencies appear to be missing after you import the code into an IDE, go to the master directory and run mvn clean install first, then re-import the code.
TestingModels is the sample testing data provider, which randomly generates sample testing objects and enum values. The sample testing model types are generated by the lombok compiler. The original source files can be found under message/testing-models/src/main/lombok, whether or not the project has been compiled.
io.demo.message.domain.proto contains 2 kinds of classes:
- Message classes generated by the Protobuf compiler, which can be found under message/testing-models/target/generated-sources/protobuf/java after compilation.
- Converters which transform Testing Models and Protobuf messages back and forth.
How to extend the Testing Program