HAS Performance Test Report

1. Overview

HAS is a dedicated Hadoop authentication server that supports authentication mechanisms other than Kerberos. With HAS, users can keep their familiar login methods, and new authentication mechanisms can be customized and plugged in.

A Hadoop cluster can have thousands of nodes, so many authentication requests may reach the HAS server at the same time. Stability under high concurrency is therefore essential for HAS.

2. Test Environment

The test ran on Alibaba Cloud Elastic Compute Service. The detailed test environment is as follows:

2.1 Hardware environment

  • HAS Server:

CPU: Intel(R) Xeon(R) CPU E5-2682 @ 2.50GHz
MEM: 16GB
Disk: 43GB 86GB

  • HAS Client:

CPU: Intel(R) Xeon(R) CPU E5-2682 @ 2.50GHz
MEM: 16GB
Disk: 43GB 86GB * 3

2.2 Software environment

OS: CentOS 7.2
JAVA: 1.8
HAS: 1.0.0
MySQL: 5.5.52

3. Test Method

The test uses the login-test script and can be broadly divided into four steps:

  1. Add principals to HAS server

  2. Export keytab files to HAS Client

    cd HAS/has-dist         
    sh bin/login-test add <conf_dir> <work_dir> <principal_num>
    
  3. Use keytab files to login concurrently

    sh bin/login-test run <conf_dir> <work_dir> <concurrency_num>
    
  4. Record the login result and the time taken to log in
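
Steps 3 and 4 can be illustrated with a minimal sketch. This is not the login-test tool itself, whose internals are not shown in this report. The `fake_login` function, the keytab file names, and the thread pool cap of 64 are all hypothetical placeholders; a real test would perform a HAS keytab login in place of `fake_login`.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_login(keytab_path):
    """Hypothetical stand-in for a keytab login; a real test would contact HAS here."""
    time.sleep(0.001)  # simulate a short round trip
    return True


def run_login_test(keytabs, login=fake_login):
    """Submit one login per keytab to a thread pool and time the whole batch."""
    start = time.perf_counter()
    # The pool size (assumed cap of 64) limits how many logins truly run in parallel.
    with ThreadPoolExecutor(max_workers=min(len(keytabs), 64)) as pool:
        results = list(pool.map(login, keytabs))
    total_ms = (time.perf_counter() - start) * 1000
    return {
        "success": all(results),
        "total_ms": total_ms,
        "ms_per_request": total_ms / len(keytabs),
    }


# Hypothetical keytab names, one per principal.
report = run_login_test([f"principal_{i}.keytab" for i in range(100)])
print(report)
```

The sketch records the same two quantities reported in the results: the total time for the batch and the average time per login.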

The testing process is shown in the following figure:

[Figure: testing process]

4. Test Result

The test results consist of the total time and the time per request for logins using keytab files.

4.1 Using JSON Backend

Concurrency             100      500      1000     5000     8000     10000
Result                  Success  Success  Success  Success  Success  Success
Total time (ms)         540      1115     1661     4571     6328     7208
Time per request (ms)   5.400    2.230    1.661    0.914    0.791    0.721
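
The time per request in the table above matches the total time divided by the number of concurrent logins. The short sketch below reproduces that arithmetic from the JSON backend figures; the function and variable names are illustrative only.

```python
# Figures copied from the JSON backend table above.
concurrency = [100, 500, 1000, 5000, 8000, 10000]
total_ms = [540, 1115, 1661, 4571, 6328, 7208]


def time_per_request(total, n):
    """Average login time in ms: total elapsed time divided by the number of logins."""
    return round(total / n, 3)


per_request = [time_per_request(t, n) for t, n in zip(total_ms, concurrency)]
for n, v in zip(concurrency, per_request):
    print(f"{n:>6} logins: {v:.3f} ms per request")
```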

4.2 Using MySQL Backend

MySQL Configuration:

max connection: 5000
innodb buffer size: 8G

Concurrency             100      500      1000     5000     8000     10000
Result                  Success  Success  Success  Success  Success  Success
Total time (ms)         765      2880     4821     12712    21419    22968
Time per request (ms)   7.650    5.760    4.821    2.542    2.677    2.297

5. Conclusion

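[Figure: performance in different backends]

The figure above shows the time per request of HAS authentication for each backend and concurrency level. All logins succeeded at every concurrency level tested, up to 10000. At 10000 concurrent logins the average time per request was 0.721 ms with the JSON backend and 2.297 ms with the MySQL backend, and the JSON backend was faster at every level. These results indicate that HAS can handle the authentication load of a highly concurrent Hadoop cluster.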
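
To make the gap between the two backends concrete, the following sketch computes the ratio of MySQL to JSON total time at each concurrency level, using the figures from the two result tables in section 4. The variable names are illustrative only.

```python
# Total times in ms, copied from the result tables in section 4.
concurrency = [100, 500, 1000, 5000, 8000, 10000]
json_total_ms = [540, 1115, 1661, 4571, 6328, 7208]
mysql_total_ms = [765, 2880, 4821, 12712, 21419, 22968]

# Ratio > 1 means the MySQL backend took longer than the JSON backend.
ratios = [m / j for m, j in zip(mysql_total_ms, json_total_ms)]

for n, r in zip(concurrency, ratios):
    print(f"{n:>6} logins: MySQL took {r:.2f}x the JSON time")
```

Because both backends were measured at the same concurrency levels, the ratio of total times equals the ratio of times per request.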

The appendix shows the CPU utilization and network I/O of the HAS server at concurrency levels up to 10000. It indicates that the HAS server is not under heavy load when using the MySQL backend.

6. Appendix

  • CPU Utilization

[Figure: CPU utilization]

  • Network I/O

[Figure: network I/O]