lijinting01 / message

SPEED/SPACE Benchmarks of Java Object Serialization (V1)

1.Summary

Many articles already compare the performance of Java serialization systems, but I think some of those comparisons are flawed:

  • The sample data structures are too simple
  • The test programs do not generalize the serialization systems, so the systems are not evaluated on equal terms
  • Only a few serialization systems are covered
  • The test programs are hard to extend with further serialization systems

This article and the accompanying test program were written to address those points and to offer one more reference for building your own evaluation of Java serialization systems.

1.1.Serialization systems involved

  • JDK built-in
  • Protobuf
  • Hessian2
  • Kryo
  • Fastjson
  • Jackson
  • Gson

1.2.Results of interest

  • Speed of serialization
  • Speed of deserialization (pending)
  • Space cost after serialization

1.3.Generalization

Protocol Buffers requires its message definitions to be compiled statically, and JDK built-in serialization requires serialized objects to implement java.io.Serializable, whereas the other frameworks serialize arbitrary plain Java objects dynamically at runtime. To compare all of them on the same baseline, the following generalization constraints are defined.

1.3.1.Structure Generalization

  • All domain objects used in testing must have the same structure as the messages predefined in the .proto file, and converters must be provided to convert between the POJOs and the Protobuf messages in both directions.
  • All domain objects used in testing must implement java.io.Serializable.

1.3.2.Input Generalization

Within one round of benchmark testing, every serialization system receives exactly the same input data and runs exactly the same number of loops.

1.3.3.How to build and run

Build the testing program

  • Enter the build directory: cd master
  • Run a full build: mvn clean install

Run the testing program

  • Enter the benchmark directory: cd benchmark
  • Start the run: java -jar target/benchmark-<version>.jar
  • Check the output and logs in ${user.home}/benchmark.log

2.Testing Program Design

2.1.Sample Testing Models

To keep the tests varied and to evaluate space and speed performance more thoroughly, the sample objects should satisfy the following requirements.

  • C1-01 The content of the sample objects is generated randomly
  • C1-02 The data types used include at least integer, string, floating point, and enum
  • C1-03 At least one collection type is used
  • C1-04 The sample objects reference each other
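As a rough illustration of requirements C1-01 to C1-04, the sketch below builds a small randomly populated object graph. The names Grade, Node and SampleModels are hypothetical and are not taken from the repository, whose real models are generated with lombok.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical sample model; class and field names are illustrative only.
enum Grade { LOW, MEDIUM, HIGH }

final class Node {
    int id;                                         // integer (C1-02)
    String name;                                    // string (C1-02)
    double weight;                                  // floating point (C1-02)
    Grade grade;                                    // enum (C1-02)
    final List<Node> children = new ArrayList<>();  // collection (C1-03)
    Node parent;                                    // back reference (C1-04)
}

final class SampleModels {
    private SampleModels() { }

    // Builds one parent with childCount children; every value comes from random (C1-01).
    static Node randomTree(Random random, int childCount) {
        Node root = randomNode(random);
        for (int i = 0; i < childCount; i++) {
            Node child = randomNode(random);
            child.parent = root;       // the child refers to its parent ...
            root.children.add(child);  // ... and the parent refers to the child
        }
        return root;
    }

    private static Node randomNode(Random random) {
        Node node = new Node();
        node.id = random.nextInt(1_000_000);
        node.name = Long.toHexString(random.nextLong());
        node.weight = random.nextDouble();
        Grade[] grades = Grade.values();
        node.grade = grades[random.nextInt(grades.length)];
        return node;
    }
}
```

Passing a seeded Random, for example new Random(42L), makes a sample reproducible, which may help when the same input has to be fed to every serialization system.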

2.2.Serialized Object

A Serialized Object is a wrapper around a plain Java object (POJO) and satisfies the following requirements.

  • C2-01 Accepts a POJO for initialization
  • C2-02 Provides a no-argument method returning byte[] that gives access to the serialized byte array
  • C2-03 Provides a no-argument method returning int that gives the length of the byte array
  • C2-04 Provides a no-argument method returning String that gives the UTF-8 form of the byte array
  • C2-05 Provides a no-argument method returning String that gives the Base64 form of the byte array
  • C2-06 Provides a no-argument method that deserializes the byte array and returns an object of the same type as the POJO passed in
  • C2-07 The method required by C2-06 must not simply return the POJO accepted by C2-01
  • C2-08 A Serialized Object must be immutable
    • It provides no mutators, such as set* or add*, that change its state
    • The method defined by C2-02 returns a defensive copy of the internal byte array
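One possible reading of C2-01 to C2-08 is sketched below using only JDK built-in serialization, so that it runs without third-party libraries. The class name JdkSerialized and its method names are assumptions made for illustration; they are not the repository's API.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical immutable wrapper; it serializes once, in the constructor.
final class JdkSerialized<T extends Serializable> {
    private final byte[] bytes; // never handed out directly (C2-08)

    JdkSerialized(T pojo) {     // C2-01
        this.bytes = serialize(pojo);
    }

    private static byte[] serialize(Object pojo) {
        try (ByteArrayOutputStream buffer = new ByteArrayOutputStream();
             ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(pojo);
            out.flush();
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new IllegalStateException("serialization failed", e);
        }
    }

    byte[] toBytes() { return bytes.clone(); }                               // C2-02, defensive copy
    int length() { return bytes.length; }                                    // C2-03
    String toUtf8() { return new String(bytes, StandardCharsets.UTF_8); }    // C2-04
    String toBase64() { return Base64.getEncoder().encodeToString(bytes); }  // C2-05

    // C2-06 and C2-07: always rebuilds a fresh object from the stored bytes.
    @SuppressWarnings("unchecked")
    T deserialize() {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return (T) in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("deserialization failed", e);
        }
    }
}
```

Because toBytes() returns a clone, a caller that modifies the returned array cannot alter what later calls to deserialize() or length() observe.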

2.3.Benchmark Testing Objects

2.3.1.Space Benchmark Testing Objects

Space benchmark testing is the simpler of the two: generate random samples and record the space cost of each serialization system, one by one.
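A space benchmark of this kind might look like the sketch below, which applies each registered serializer to the same sample and records the resulting byte count. SpaceBenchmark and measure are hypothetical names, and the two character encodings in main are only stand-ins for real serialization systems.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

final class SpaceBenchmark {
    private SpaceBenchmark() { }

    // Returns the serialized size in bytes per system, in the iteration order of systems.
    static <T> Map<String, Integer> measure(T sample, Map<String, Function<T, byte[]>> systems) {
        Map<String, Integer> sizes = new LinkedHashMap<>();
        for (Map.Entry<String, Function<T, byte[]>> entry : systems.entrySet()) {
            sizes.put(entry.getKey(), entry.getValue().apply(sample).length);
        }
        return sizes;
    }

    public static void main(String[] args) {
        // Stand-ins only: two encodings of a String instead of real serialization frameworks.
        Map<String, Function<String, byte[]>> systems = new LinkedHashMap<>();
        systems.put("utf8", s -> s.getBytes(StandardCharsets.UTF_8));
        systems.put("utf16be", s -> s.getBytes(StandardCharsets.UTF_16BE));
        System.out.println(measure("hello", systems)); // prints {utf8=5, utf16be=10}
    }
}
```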

2.3.2.Speed Benchmark Testing Objects

To compare the serialization systems fairly, the following constraints are defined for speed benchmark objects.

  • C3-01 Provides a method that accepts one Object argument, the object to be serialized, and one int argument, the number of loops
  • C3-02 Timing starts when the method defined by C3-01 begins and stops when it ends; the total elapsed time and the average time per serialization are calculated from it
  • C3-03 The execution order of the speed benchmark objects can be adjusted freely at runtime
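Constraints C3-01 and C3-02 could be sketched as follows. The names SpeedBenchmark, TimedResult and Timing are hypothetical, and the timing is done inline with System.nanoTime() purely for illustration, which is not necessarily how the repository measures it.

```java
// Hypothetical shape of a speed benchmark; names are not taken from the repository.
interface SpeedBenchmark {
    // C3-01: target is the object to serialize, loops is the number of repetitions.
    void run(Object target, int loops);
}

final class TimedResult {
    final long totalMillis;      // total elapsed time in milliseconds
    final double averageMicros;  // average time per loop in microseconds

    TimedResult(long totalNanos, int loops) {
        this.totalMillis = totalNanos / 1_000_000L;
        this.averageMicros = totalNanos / 1_000.0 / loops;
    }
}

final class Timing {
    private Timing() { }

    // C3-02: measures the whole call, then derives the per-loop average.
    static TimedResult time(SpeedBenchmark benchmark, Object target, int loops) {
        if (loops <= 0) {
            throw new IllegalArgumentException("loops must be positive");
        }
        long start = System.nanoTime();
        benchmark.run(target, loops);
        long elapsedNanos = System.nanoTime() - start;
        return new TimedResult(elapsedNanos, loops);
    }
}
```

Since each benchmark is just an object behind one interface, C3-03 can be met by keeping the benchmarks in a List and reordering that list before each round.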

3.Testing Program Implementation

3.1.SerializedObject

  • SerializedObject is the base class of all serialized objects and implements the requirements of 2.2.
  • Sub-types of SerializedObject tell the base class how to serialize the wrapped object into a byte array.
  • Sub-types of SerializedObject tell the base class how to deserialize the byte array back into an object.
  • Sub-types of SerializedObject may implement beforeSerilize() to initialize any utilities needed during serialization.
  • Checked exceptions caught during serialization inside SerializedObject are wrapped in SerializationException and rethrown.
  • SerializedObject provides factory methods for creating its sub-types, whose constructors are all package-private.
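The division of work between base class and sub-types described above resembles the template method pattern. The sketch below shows that pattern under stated assumptions: SerializedObjectSketch, doSerialize, doDeserialize and Utf8Serialized are invented names, the SerializationException body is a guess, and only beforeSerilize() keeps the spelling used in this README.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;

// Unchecked wrapper for checked failures; the constructor shown here is an assumption.
class SerializationException extends RuntimeException {
    SerializationException(String message, Throwable cause) { super(message, cause); }
}

// Hypothetical base class: it owns the bytes, sub-types supply the encoding.
abstract class SerializedObjectSketch<T> {
    private final byte[] bytes;

    // Package-private constructor: callers go through a factory method.
    // The hooks run from here, so sub-types must not depend on their own fields in them.
    SerializedObjectSketch(T source) {
        try {
            beforeSerilize();
            this.bytes = doSerialize(source);
        } catch (IOException e) {
            throw new SerializationException("serialization failed", e);
        }
    }

    protected void beforeSerilize() { } // optional hook, no-op by default
    protected abstract byte[] doSerialize(T source) throws IOException;
    protected abstract T doDeserialize(byte[] data) throws IOException;

    final byte[] toBytes() { return bytes.clone(); }

    final T deserialize() {
        try {
            return doDeserialize(bytes.clone());
        } catch (IOException e) {
            throw new SerializationException("deserialization failed", e);
        }
    }

    // Factory method for the single sub-type known to this sketch.
    static SerializedObjectSketch<String> utf8(String text) {
        return new Utf8Serialized(text);
    }
}

// Stand-in sub-type: encodes a String as UTF-8 instead of using a real framework.
final class Utf8Serialized extends SerializedObjectSketch<String> {
    Utf8Serialized(String source) { super(source); }

    @Override
    protected byte[] doSerialize(String source) throws IOException {
        if (source == null) {
            throw new IOException("nothing to serialize");
        }
        return source.getBytes(StandardCharsets.UTF_8);
    }

    @Override
    protected String doDeserialize(byte[] data) {
        return new String(data, StandardCharsets.UTF_8);
    }
}
```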

3.1.1.Hessian2

Hessian2SerializedObject requires extra configuration, placed under the META-INF directory, that specifies custom serialization and deserialization strategies.

3.2.Benchmark Interface

Benchmark is the speed benchmark interface.

  • Benchmark defines the method that executes a single benchmark run, in line with 2.3.2.
  • Benchmark executions are intercepted by ProfilingAspect, which measures the elapsed time.
  • ProfilingAspect records the total elapsed time in milliseconds and the average elapsed time of a single call in microseconds.

3.3.SpeedBenchmarks

  • Assembles all known implementations of the Benchmark interface.
  • Runs each Benchmark implementation 1,000, 5,000, 20,000, 50,000, and 200,000 times.
  • Defines the thread pool that executes the benchmark tests and manages its lifecycle.
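The responsibilities listed above might be arranged roughly as in the following sketch. LoopBenchmark and SpeedBenchmarksSketch are hypothetical names, and the single-threaded pool is an assumption chosen so that runs never overlap and disturb each other's timing; the README does not say how the real pool is sized.

```java
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical stand-in for the Benchmark interface described in 3.2.
interface LoopBenchmark {
    void run(Object target, int loops);
}

final class SpeedBenchmarksSketch {
    // Loop counts quoted in the list above.
    static final int[] LOOPS = {1_000, 5_000, 20_000, 50_000, 200_000};

    private SpeedBenchmarksSketch() { }

    // Runs every benchmark at every loop count, one task at a time; returns the number of runs.
    static int runAll(List<LoopBenchmark> benchmarks, Object target) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            int runs = 0;
            for (int loops : LOOPS) {
                for (LoopBenchmark benchmark : benchmarks) {
                    // get() blocks until the task is done, so runs are strictly sequential.
                    pool.submit(() -> benchmark.run(target, loops)).get();
                    runs++;
                }
            }
            return runs;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while benchmarking", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("benchmark failed", e.getCause());
        } finally {
            pool.shutdown(); // lifecycle: the pool is always released, even on failure
        }
    }
}
```

Reordering the list passed to runAll changes the execution order without touching the benchmarks themselves.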

3.4.Generated Code

Protocol Buffers message classes have to be generated in advance by static compilation. In addition, the testing program uses lombok to avoid verbose code. If classes or libraries appear to be missing after you import the code into an IDE, go to the master directory, run mvn clean install, and then re-import the code.

3.5.Testing Models

TestingModels is the sample data generator; it randomly generates the sample objects under test and their enum values. The sample model types are generated by the lombok compiler. Whether or not the project has been compiled, the original sources can be found under message/testing-models/src/main/lombok.

3.6.Package io.demo.message.domain.proto

The package io.demo.message.domain.proto contains two kinds of classes:

  • Message classes generated by the Protobuf compiler, which can be found under message/testing-models/target/generated-sources/protobuf/java after compilation.
  • Converters that transform between the Protobuf message classes and the testing model classes in both directions.

4.How to extend the Testing Program

License

Copyright 2018 lijinting01. Licensed under the Apache License, Version 2.0. A copy of the License is available at http://www.apache.org/licenses/LICENSE-2.0
