Thursday, August 31, 2017

In-memory sysbench, a larger server and contention - part 1

Yesterday I shared results for in-memory sysbench on a large server. Today I have more results from a similar test but with more contention. The tests yesterday used 8 tables with 1M rows/table. The test here uses 1 table with 8M rows. In some ways the results are similar to yesterday's, but there are some interesting differences that I will explain in part 2 (another post).

tl;dr
  • MyRocks does worse than InnoDB 5.6.35 at low and mid concurrency for all tests except update-index. It suffers more on the read-heavy tests.
  • MyRocks and TokuDB are more competitive at mid and high concurrency than at low. MyRocks is faster than InnoDB 5.6.35 for many of the high concurrency tests.
  • InnoDB QPS at low concurrency tends to decrease after 5.6 for tests heavy on point queries, but something in MySQL 5.7 made InnoDB range scans much faster.
  • InnoDB QPS at mid and high concurrency tends to increase after 5.6.

Configuration

Everything is the same as explained in the previous post except I used 1 table with 8M rows here and 8 tables with 1M rows/table there. Using 1 table instead of 8 means there can be more contention in the database engine on things like the InnoDB per-index mutex. A command line is:
bash all.sh 1 8000000 180 300 180 innodb 1 0 /orig5717/bin/mysql none /sysbench.new

Results without charts

All of the data is here. Below I share the QPS ratios to compare the QPS for one engine with the QPS from InnoDB from MySQL 5.6.35. The engine is slower than InnoDB 5.6.35 when the ratio is less than 1.0.

QPS ratio:
* rocks = myrocks / inno5635
* inno = inno5717 / inno5635
* toku = toku5717 / inno5635
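
The ratios are easy to recompute from the raw QPS numbers. Here is a minimal sketch, assuming hypothetical files qps.myrocks and qps.inno5635 (names made up for this example) that each have one "test-name qps" pair per line, sorted by test name:
# join the files on the test name, then divide the QPS columns
join qps.myrocks qps.inno5635 | awk '{ printf "%.3f\t%s\n", $2 / $3, $1 }'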

1 connection
rocks   inno    toku    engine/test
0.460   0.946   0.195   update-inlist
0.862   0.906   0.278   update-one
1.496   1.239   0.569   update-index
0.674   0.906   0.248   update-nonindex
0.756   0.921   0.240   update-nonindex-special
0.793   0.866   0.319   delete-only
0.701   1.027   0.612   read-write.range100
0.812   1.657   1.033   read-write.range10000
0.701   1.089   0.737   read-only.range100
0.804   1.676   1.281   read-only.range10000
0.675   0.904   0.731   point-query
0.508   0.923   0.732   random-points
0.554   0.904   0.633   hot-points
0.760   0.857   0.257   insert-only
-----   -----   -----
0.754   1.059   0.562   average

8 connections
rocks   inno    toku    engine/test
0.968   1.587   0.293   update-inlist
1.014   0.843   0.190   update-one
1.837   2.183   0.608   update-index
0.879   1.090   0.307   update-nonindex
0.928   1.094   0.312   update-nonindex-special
0.968   1.068   0.340   delete-only
0.722   1.045   0.560   read-write.range100
0.814   1.626   1.108   read-write.range10000
0.714   1.126   0.825   read-only.range100
0.811   1.639   1.255   read-only.range10000
0.690   0.914   0.727   point-query
0.718   1.156   0.840   random-points
0.966   1.354   0.832   hot-points
0.859   1.104   0.310   insert-only
-----   -----   -----
0.921   1.274   0.608   average

48 connections
rocks   inno    toku    engine/test
1.679   3.087   0.788   update-inlist
0.982   0.979   0.231   update-one
1.222   1.986   0.606   update-index
1.379   1.947   0.886   update-nonindex
1.387   1.936   0.854   update-nonindex-special
1.189   1.876   0.578   delete-only
0.826   1.148   0.514   read-write.range100
0.840   1.316   0.953   read-write.range10000
0.743   1.112   0.740   read-only.range100
0.850   1.342   1.034   read-only.range10000
0.941   1.368   1.066   point-query
2.042   1.445   0.686   random-points
0.793   1.507   0.711   hot-points
1.820   1.605   0.692   insert-only
-----   -----   -----
1.192   1.618   0.739   average

Results with charts

Charts using the data from the previous section. For some of them I truncate the x-axis to make it easier to see differences between engines.

Wednesday, August 30, 2017

How to build MongoRocks for MongoDB 3.4

With advice from Igor I compiled MongoRocks from source for MongoDB 3.4. I previously shared how to do this for MongoDB 3.2. I did this to use a new version of RocksDB, 5.7.3 in this case, as the team continues to make RocksDB better. The instructions below are for Ubuntu 16.04. There are slightly different instructions here.

# Install many of the dependencies for MongoRocks. The scons command
# below also links zstd, so a zstd development package (for example
# libzstd-dev) may be needed if it is not already installed.
sudo apt-get install libsnappy-dev zlib1g-dev libbz2-dev liblz4-dev
sudo apt-get install scons g++ git

# Unpack MongoDB 3.4.7 source in $MONGOSRC

# Directory in which git repos are created
mkdir ~/git

# Get MongoRocks engine for MongoDB 3.4.7
cd ~/git
git clone https://github.com/mongodb-partners/mongo-rocks.git
cd mongo-rocks
git checkout tags/r3.4.7 -b r347

# get and build RocksDB libraries for version 5.7.3
cd ~/git
git clone https://github.com/facebook/rocksdb.git
cd rocksdb
git checkout tags/v5.7.3 -b v573
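# Notes on the flags below, based on my understanding rather than the
# upstream docs: USE_RTTI=1 because mongod is built with RTTI enabled,
# CFLAGS=-fPIC so the static library can be linked into mongod, and
# DISABLE_JEMALLOC=1 to avoid linking a second allocator into mongod.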
DISABLE_JEMALLOC=1 USE_RTTI=1 CFLAGS=-fPIC make static_lib -j 4

# prepare source build with support for RocksDB
cd $MONGOSRC
mkdir -p src/mongo/db/modules/
ln -sf ~/git/mongo-rocks src/mongo/db/modules/rocks

# build mongod & mongo binaries
# if you need to debug and see command lines then add --debug=presub
scons CPPPATH=/home/mdcallag/git/rocksdb/include \
      LIBPATH=/home/mdcallag/git/rocksdb \
      LIBS="lz4 zstd" mongod mongo

# install mongod
mkdir -p ~/b/m347
cd ~/b/m347
mkdir data
mkdir bin
cp $MONGOSRC/build/opt/mongo/mongod bin
cp $MONGOSRC/build/opt/mongo/mongo bin

# create mongo.conf file with the text that follows. You must
# change $HOME and consider changing the value for cacheSizeGB
---
processManagement:
  fork: true
systemLog:
  destination: file
  path: $HOME/b/m347/log
  logAppend: true
storage:
  syncPeriodSecs: 600
  dbPath: $HOME/b/m347/data
  journal:
    enabled: true
  rocksdb:
    cacheSizeGB: 4
operationProfiling:
  slowOpThresholdMs: 2000
replication:
  oplogSizeMB: 4000
---

# start mongod, consider using numactl --interleave=all
bin/mongod --config mongo.conf --master --storageEngine rocksdb
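# for example, with interleaved memory allocation:
numactl --interleave=all bin/mongod --config mongo.conf --master --storageEngine rocksdb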

# confirm RocksDB is there
$ head -1 data/db/LOG

Wednesday, August 30, 2017

In-memory sysbench and a large server

Today I share worst-case performance results for MyRocks -- in-memory sysbench and a small database. I like MyRocks because it reduces space and write amplification, but I don't show results for that here. Besides, there isn't much demand for better compression from such a small database. This is part 1 with results from a large server.

tl;dr
  • MyRocks does worse than InnoDB 5.6.35 at low and mid concurrency for all tests except update-index. It suffers more on the read-heavy tests.
  • MyRocks and TokuDB are more competitive at mid and high concurrency than at low
  • InnoDB QPS at low concurrency tends to decrease after 5.6
  • InnoDB QPS at mid and high concurrency tends to increase after 5.6

Configuration

I use my sysbench fork and helper scripts, release specific my.cnf files and a server with 48 HW threads, fast SSD and 256gb of RAM. The binlog was enabled and sync-on-commit was disabled for the binlog and database log. I remembered to disable SSL.

I tested MyRocks, TokuDB and InnoDB, with buffered IO and a 180g database cache for MyRocks/TokuDB and O_DIRECT and a 180gb buffer pool for InnoDB. The server is shared by the sysbench client and mysqld. For MyRocks I used a build from August 15 with git hash 0d76ae. For TokuDB I used Percona Server 5.7.17-12. For InnoDB I used upstream 5.6.35, 5.7.17 and 8.0.2. For InnoDB 8.0.2 I used latin1 charset and latin1_swedish_ci collation. Compression was not used for any engines. More details are in the release specific my.cnf files and I used the same my.cnf for InnoDB with 8.0.1 and 8.0.2. All mysqld use jemalloc.
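
As a rough sketch, the durability and cache settings described above map to my.cnf options like the following. The option names are real but the values and grouping are illustrative since the real settings are split across the release specific my.cnf files:
# binlog enabled, sync-on-commit disabled for binlog and database log
log-bin
sync_binlog=0
innodb_flush_log_at_trx_commit=2
rocksdb_flush_log_at_trx_commit=2
# InnoDB: O_DIRECT and a 180gb buffer pool
innodb_flush_method=O_DIRECT
innodb_buffer_pool_size=180G
# MyRocks: buffered IO and a 180gb block cache
rocksdb_block_cache_size=180G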

The test used 8 tables with 1M rows/table. My use of sysbench is explained here. Tests are run in an interesting pattern -- load, write-heavy, read-only, insert-only. On the large server each test is run for 1, 2, 4, 8, 16, 24, 32, 40, 48 and 64 concurrent connections for either 3 or 5 minutes per concurrency level. So each test runs for either 30 or 50 minutes total and I hope that is long enough to get the database into a steady state. An example command line to run the test with my helper scripts is:
bash all.sh 8 1000000 180 300 180 innodb 1 0 /orig5717/bin/mysql none /sysbench.new

Results without charts

All of the data is here. Below I share the QPS ratios to compare the QPS for one engine with the QPS from InnoDB from MySQL 5.6.35. The engine is slower than InnoDB 5.6.35 when the ratio is less than 1.0. Things that I see include:
  • MyRocks does worse than InnoDB 5.6.35 at low and mid concurrency for all tests except update-index. It suffers more on the read-heavy tests.
  • MyRocks and TokuDB are more competitive at mid and high concurrency than at low
  • InnoDB QPS at low concurrency tends to decrease after 5.6
  • InnoDB QPS at mid and high concurrency tends to increase after 5.6

QPS ratio:
* rocks = myrocks / inno5635
* inno = inno5717 / inno5635
* toku = toku5717 / inno5635

1 connection
rocks   inno    toku    engine/test
0.476   0.959   0.201   update-inlist
0.828   0.923   0.278   update-one
1.146   1.090   0.458   update-index
0.648   0.901   0.243   update-nonindex
0.740   0.898   0.236   update-nonindex-special
0.836   0.917   0.347   delete-only
0.711   1.041   0.586   read-write.range100
0.809   1.671   1.038   read-write.range10000
0.664   1.096   0.753   read-only.range100
0.801   1.657   1.263   read-only.range10000
0.641   0.905   0.735   point-query
0.442   0.923   0.753   random-points
0.480   0.900   0.694   hot-points
0.742   0.855   0.259   insert-only
-----   -----   -----
0.711   1.052   0.560   average of the above

8 connections
rocks   inno    toku    engine/test
0.966   1.611   0.308   update-inlist
0.871   1.029   0.276   update-one
1.201   1.467   0.417   update-index
0.858   1.093   0.315   update-nonindex
0.898   1.090   0.314   update-nonindex-special
0.949   1.058   0.338   delete-only
0.710   1.039   0.534   read-write.range100
0.811   1.621   1.128   read-write.range10000
0.675   1.098   0.851   read-only.range100
0.810   1.639   1.263   read-only.range10000
0.648   0.910   0.746   point-query
0.541   1.097   0.931   random-points
0.754   1.317   1.037   hot-points
0.776   1.028   0.286   insert-only
-----   -----   -----
0.819   1.221   0.625   average of the above

48 connections
rocks   inno    toku    engine/test
1.649   3.127   0.922   update-inlist
0.760   1.193   0.372   update-one
1.316   2.236   0.700   update-index
1.360   1.982   0.937   update-nonindex
1.374   1.965   0.995   update-nonindex-special
1.126   1.845   0.566   delete-only
0.804   1.129   0.507   read-write.range100
0.838   1.310   0.956   read-write.range10000
0.711   1.098   0.866   read-only.range100
0.823   1.305   1.034   read-only.range10000
0.932   1.347   1.084   point-query
1.417   2.920   2.248   random-points
1.840   3.226   2.350   hot-points
1.096   1.927   0.567   insert-only
-----   -----   -----
1.146   1.901   1.007   average of the above

Results with charts

The charts below have the same data from the previous section - the QPS for the engine relative to the QPS for InnoDB from MySQL 5.6.35.

Tuesday, August 29, 2017

IO-bound sysbench on a smaller server

This is part 3 of my performance report on IO-bound sysbench. In part 1 and part 2 I used a large server with 48 HW threads. Here I use a core i5 NUC with 4 HW threads.

tl;dr
  • results are similar to the first post
  • MyRocks is competitive on the write-heavy tests
  • MyRocks is slower on the read-heavy tests
  • I don't attempt to show it here, but MyRocks wins on space and write efficiency, and that is why I like it

Configuration

I use my sysbench fork and helper scripts and release specific my.cnf files. The server is a core i5 NUC with 4 HW threads, 16gb of RAM and a fast SSD. The binlog was enabled and sync-on-commit was disabled for the binlog and database log. I remembered to disable SSL.

I tested MyRocks and InnoDB, with buffered IO and a 4g block cache for MyRocks and O_DIRECT and a 12gb buffer pool for InnoDB. The server is shared by the sysbench client and mysqld. For MyRocks I used a build from August 15 with git hash 0d76ae. For InnoDB I used upstream 5.6.35, 5.7.17 and 8.0.2. For InnoDB 8.0.2 I used latin1 charset and latin1_swedish_ci collation. Compression was not used for InnoDB or MyRocks. Note that this description is correct where it disagrees with the my.cnf files that I shared.

The test used 4 tables with 80M rows/table. My use of sysbench is explained here. Tests are run in an interesting pattern -- load, write-heavy, read-only, insert-only. On the core i5 NUC each test is run for 1 and then 2 concurrent connections for either 5 or 10 minutes per concurrency level. So each test runs for either 10 or 20 minutes total and I hope that is long enough to get the database into a steady state. An example command line to run the test with my helper scripts is:
bash all.sh 4 80000000 600 600 300 innodb 1 0 /orig5717/bin/mysql none /sysbench.new

Results without charts

All of the data is here. The results here are mostly similar to the results from the large server; however, MyRocks does worse here on some of the read-heavy tests. The QPS ratio for point-query is 0.651 for MyRocks here versus 0.850 on the large server. Note that the hot-points workload is not IO-bound; regardless, we need to make MyRocks more effective on it. MySQL did something to make InnoDB range scans more efficient starting in 5.7. I don't know whether the problem for MyRocks is CPU or IO overhead in the range-scan heavy tests (read-write.*, read-only.*).
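
One quick way to distinguish the two -- this is a suggestion, not something I measured for this post -- is to watch CPU and disk utilization on the server while a range-scan test runs:
vmstat 1      # mostly user CPU time points at CPU overhead
iostat -x 1   # high device utilization points at IO overhead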

QPS ratio:
* rocks = myrocks.none / inno5635
* inno = inno5717 / inno5635
* value less than 1.0 means that InnoDB 5.6 is faster

1 connection
rocks   inno    engine/test
2.909   1.074   update-index
1.068   1.095   update-nonindex
1.006   0.935   update-nonindex-special
1.682   0.988   delete-only
1.053   0.961   read-write.range100
0.881   1.554   read-write.range10000
0.776   1.348   read-only.range100
0.898   1.584   read-only.range10000
0.651   1.197   point-query
1.000   1.285   random-points
0.267   0.943   hot-points
0.989   0.941   insert-only

Results with charts

Sorry, no charts this time. Charts from the previous post are close enough.

Monday, August 28, 2017

Sysbench, a large server and fast SSD vs MyRocks, InnoDB and TokuDB: part 2

This post provides more data using the setup in my previous post. The difference is that the previous post used 8 tables with 100M rows/table and this post uses 1 table with 800M rows, so this test has more opportunity for contention.

tl;dr
  • Results here are mostly similar to results from the previous test but ...
  • MyRocks does better here relative to other engines, so it suffers less from contention.
  • InnoDB and TokuDB don't do as well on this test as on the previous test, so they suffer more from contention.

Results without pictures

All of the data is here. The data below is the QPS ratio comparing the QPS for MyRocks, InnoDB 5.7.17 and TokuDB 5.7.17 with InnoDB 5.6.35. A value less than one means the QPS for the engine is less than for InnoDB 5.6.35.

QPS ratio:
* rocks = myrocks.none / inno5635
* inno = inno5717 / inno5635
* toku = toku5717.none / inno5635

1 connection
rocks   inno    toku    engine/test
1.765   1.306   0.995   update-index
1.049   1.122   0.421   update-nonindex
0.842   1.020   0.351   update-nonindex-special
4.415   1.821   1.111   delete-only
0.988   1.166   0.329   read-write.range100
1.189   1.300   1.339   read-write.range10000
0.910   1.092   0.358   read-only.range100
0.980   0.966   1.246   read-only.range10000
0.862   0.950   0.257   point-query
0.949   0.949   0.271   random-points
0.515   0.885   0.606   hot-points
0.963   0.991   0.324   insert-only

8 connections
rocks   inno    toku    engine/test
5.958   2.644   2.782   update-index
1.513   1.434   0.473   update-nonindex
1.289   1.422   0.365   update-nonindex-special
1.542   0.995   0.275   delete-only
1.206   1.306   0.253   read-write.range100
1.002   1.288   1.159   read-write.range10000
1.017   1.083   0.329   read-only.range100
0.871   1.069   1.191   read-only.range10000
0.862   0.855   0.204   point-query
0.915   0.859   0.208   random-points
0.890   1.194   0.758   hot-points
0.951   1.149   0.353   insert-only

48 connections
rocks   inno    toku    engine/test
3.836   2.340   1.492   update-index
2.425   2.520   0.312   update-nonindex
2.112   2.981   0.557   update-nonindex-special
3.028   1.950   0.306   delete-only
1.790   1.643   0.122   read-write.range100
1.488   1.883   0.768   read-write.range10000
0.820   1.005   0.093   read-only.range100
0.903   1.219   0.579   read-only.range10000
0.854   0.971   0.079   point-query
0.871   0.953   0.096   random-points
0.739   1.286   0.698   hot-points
1.768   1.579   0.491   insert-only

Results with pictures

Sorry, no charts this time. Charts from the previous post are close enough.

Thursday, August 24, 2017

Sysbench, a large server and fast SSD vs MyRocks, InnoDB and TokuDB

I have new performance results to share starting with modern sysbench, a large server, fast SSD and a database that is larger than RAM. Competitive read performance with much better space and write efficiency is why I like MyRocks and this is an example of that. However, we have work to do on read efficiency that will be apparent in future perf reports.

tl;dr
  • For write-heavy workloads MyRocks competes with InnoDB 5.7 and 8.0 and does better than 5.6 at all concurrency levels. 
  • For read-heavy workloads MyRocks is usually within 10% of InnoDB 5.6.
  • For write-heavy workloads InnoDB in 5.7 and 8.0 do better than 5.6.
  • For some read-heavy workloads InnoDB has a perf regression from 5.6 to 5.7/8.0.
  • TokuDB usually has the worst QPS

Configuration

I use my sysbench fork and helper scripts, release specific my.cnf files and a server with 48 HW threads, fast SSD and 256gb of RAM although only 50gb was available to the database and OS page cache. The binlog was enabled and sync-on-commit was disabled for the binlog and database log. I remembered to disable SSL.

I tested MyRocks, TokuDB and InnoDB, with buffered IO and a 10g database cache for MyRocks/TokuDB and O_DIRECT and a 35gb buffer pool for InnoDB. The server is shared by the sysbench client and mysqld. For MyRocks I used a build from August 15 with git hash 0d76ae and repeated the test first without compression and then with no compression for L0/L1/L2, LZ4 for the interior levels and zstd for the max level of the leveled LSM. For TokuDB I used Percona Server 5.7.17-12 and repeated the test first without compression and then with zlib compression. For InnoDB I used upstream 5.6.35, 5.7.17 and 8.0.2. For InnoDB 8.0.2 I used latin1 charset and latin1_swedish_ci collation. Compression was not used for InnoDB. More details are in the release specific my.cnf files and I used the same my.cnf for InnoDB with 8.0.1 and 8.0.2.
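
As a sketch, that per-level compression setup can be expressed through the MyRocks column family options. The option names below are real RocksDB options but the exact string, including the assumption of a 7 level LSM, is illustrative rather than copied from my my.cnf:
rocksdb_default_cf_options=compression_per_level=kNoCompression:kNoCompression:kNoCompression:kLZ4Compression:kLZ4Compression:kLZ4Compression:kZSTD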

The test used 8 tables with 100M rows/table. My use of sysbench is explained here. Tests are run in an interesting pattern -- load, write-heavy, read-only, insert-only. On the large server each test is run for 1, 2, 4, 8, 16, 24, 32, 40, 48 and 64 concurrent connections for either 3 or 5 minutes per concurrency level. So each test runs for either 30 or 50 minutes total and I hope that is long enough to get the database into a steady state. An example command line to run the test with my helper scripts is:
bash all.sh 8 100000000 180 300 180 innodb 1 0 /orig5717/bin/mysql none /sysbench.new

Results without pictures

I have a lot of numbers to share and struggled with the presentation. All of the data is here. I will start with relative QPS - the QPS for an engine versus the QPS for InnoDB from upstream 5.6.35. There is one column for MyRocks without compression, one for TokuDB without compression and one for InnoDB from upstream 5.7.17. I did not include columns for compressed MyRocks, compressed TokuDB and InnoDB from upstream 8.0.2 to improve readability, and because MyRocks/TokuDB QPS with compression is similar to QPS without it and InnoDB QPS from 8.0.2 is similar to InnoDB from 5.7.17.

When the QPS ratio is greater than one then the engine is faster than InnoDB 5.6.35. I show ratios for 1, 8 and 48 connections to compare low, medium and high concurrency workloads. Things that I notice in the results include:
  • For write-heavy workloads MyRocks competes with InnoDB 5.7 and 8.0 and does better than 5.6 at all concurrency levels.
  • For read-heavy workloads MyRocks is usually within 10% of InnoDB 5.6. 
  • For write-heavy workloads InnoDB in 5.7 and 8.0 do better than 5.6.
  • For some read-heavy workloads InnoDB has a perf regression from 5.6 to 5.7/8.0.
  • TokuDB usually has the worst QPS

QPS ratio:
* rocks = myrocks.none / inno5635
* inno = inno5717 / inno5635
* toku = toku5717.none / inno5635

1 connection
rocks   inno    toku    engine/test
2.516   1.338   1.395   update-index
1.103   1.210   0.482   update-nonindex
0.907   1.067   0.376   update-nonindex-special
1.325   1.099   0.337   delete-only
0.942   1.149   0.317   read-write.range100
1.064   1.157   1.210   read-write.range10000
0.798   0.968   0.323   read-only.range100
0.980   0.946   1.280   read-only.range10000
0.850   0.917   0.226   point-query
0.933   0.900   0.300   random-points
0.457   0.904   0.608   hot-points
0.890   1.009   0.317   insert-only

8 connections
rocks   inno    toku    engine/test
4.892   2.862   2.304   update-index
1.769   1.681   0.169   update-nonindex
1.382   1.506   0.476   update-nonindex-special
1.409   1.178   0.250   delete-only
0.989   1.082   0.202   read-write.range100
0.893   1.141   1.055   read-write.range10000
0.838   0.895   0.279   read-only.range100
0.870   1.046   1.258   read-only.range10000
0.857   0.840   0.215   point-query
0.911   0.824   0.245   random-points
0.621   1.194   0.819   hot-points
0.828   1.056   0.313   insert-only

48 connections
rocks   inno    toku    engine/test
2.294   2.223   0.817   update-index
2.685   2.726   0.393   update-nonindex
2.172   3.031   0.734   update-nonindex-special
2.830   3.054   0.283   delete-only
2.758   2.706   0.192   read-write.range100
1.113   1.394   0.577   read-write.range10000
0.810   0.996   0.101   read-only.range100
0.893   1.203   0.589   read-only.range10000
0.845   0.954   0.089   point-query
0.864   0.941   0.125   random-points
1.412   3.166   2.055   hot-points
1.111   1.855   0.579   insert-only

Results with pictures

Some people prefer graphs, so here is the relative QPS as a chart. I truncated the x-axis at 2 to make it easier to see differences. The first chart is from the test with 1 connection.
At 8 connections
At 48 connections

Tuesday, August 15, 2017

MyRocks my.cnf changes from June 16 to Aug 15

There have been several recent performance improvements to MyRocks that I want to test. The last build I did was from June 16. With the my.cnf options that I use there is one new option and three have been removed. The new option is rocksdb_max_background_jobs. The removed options are rocksdb_base_background_compactions, rocksdb_max_background_compactions and rocksdb_max_background_flushes.
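
In my.cnf terms the change looks roughly like this, where the option names come from the list above and the values are illustrative:
# my.cnf for the June 16 build
rocksdb_base_background_compactions=1
rocksdb_max_background_compactions=8
rocksdb_max_background_flushes=2
# my.cnf for the Aug 15 build
rocksdb_max_background_jobs=8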