==================================================
memaslap - Load testing and benchmarking a server
==================================================
.. envvar:: MEMCACHED_SERVERS

:program:`memaslap` is a load generation and benchmark tool for memcached
servers. It generates configurable workloads, with settings such as threads,
concurrency, connections, run time, overwrite rate, miss rate, key size, value
size, get/set proportion, and expected throughput. It also tests data
verification, expire-time verification, UDP, the binary protocol, the facebook
test, the replication test, multi-get, and reconnection.

Like memcached, memaslap manages network connections with libevent. Each
thread of memaslap is bound to a CPU core, the threads do not communicate with
each other, and each thread owns several socket connections. Each connection
keeps its own key size distribution, value size distribution, and command
distribution.

You can specify servers via the :option:`memslap --servers` option or via the
environment variable :envvar:`MEMCACHED_SERVERS`.
Memaslap was developed for the following purposes:

Manages network connections with libevent asynchronously.

Sets up both TCP and UDP to use non-blocking IO.

Improves parallelism: higher performance in multi-threaded environments.

Improves time efficiency: faster processing speed.

Generates keys and values more efficiently; key size distribution and value size distribution are configurable.

Supports get, multi-get, and set commands; command distribution is configurable.

Supports controllable miss rate and overwrite rate.

Supports data and expire-time verification.

Supports dumping statistics periodically.

Supports thousands of TCP connections.

Supports the binary protocol.

Supports the facebook test (set with TCP and multi-get with UDP) and the replication test.
Effective implementation of network.
____________________________________

For memaslap, both TCP and UDP use non-blocking network IO. All the network
events are managed by libevent, as in memcached. The network module of
memaslap is similar to that of memcached, and libevent lets memaslap handle
network traffic efficiently.
Effective implementation of multi-threads and concurrency
_________________________________________________________

Memaslap implements multi-threading in a way similar to memcached. Memaslap
creates one or more self-governed threads; each thread is bound to one CPU
core if the system supports setting CPU affinity.

In addition, each thread has its own libevent instance to manage network
events; each thread has one or more self-governed concurrencies; and each
concurrency has one or more socket connections. The concurrencies do not
communicate with each other, even when they belong to the same thread.

Memaslap can create thousands of socket connections, and each concurrency can
have tens of socket connections. Each concurrency randomly or sequentially
selects one socket connection from its connection pool to run, so each
concurrency handles one socket connection at any given time. Users can specify
the number of concurrencies and the number of socket connections per
concurrency according to their requirements.
Effective implementation of generating key and value
____________________________________________________

To improve time efficiency and space efficiency, memaslap creates a table of
10M random characters. All the suffixes of keys and values are generated from
this random character table.

Memaslap identifies a string by its offset in the character table and its
length, which saves a great deal of memory. Each key contains two parts, a
prefix and a suffix. The prefix is a uint64_t, 8 bytes. To verify previously
set data, memaslap needs to ensure that each key is unique, so it uses the
prefix to identify a key. The prefix cannot include illegal characters, such
as '\r', '\n', '\0' and ' ', and memaslap has an algorithm to ensure that.

Memaslap does not generate all the objects (key-value pairs) at the
beginning. It only generates enough objects to fill the task window (default
10K objects) of each concurrency. Each object carries the following basic
information: key prefix, key suffix offset in the character table, key length,
value offset in the character table, and value length.
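
The offset-and-length bookkeeping described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not memaslap's C implementation: the names ``TABLE_SIZE``, ``TaskObject``, ``make_object``, ``key_of`` and ``value_of`` are hypothetical, the table is shrunk from 10M to 64K characters, and the 64-bit prefix is rendered as 16 hexadecimal digits simply as one way to keep illegal characters out of the key.

```python
import random
import string
from dataclasses import dataclass

TABLE_SIZE = 64 * 1024  # hypothetical; the text describes a 10M-character table
PREFIX_LEN = 16         # assumption: the uint64_t prefix is printed as 16 hex digits

_rng = random.Random(0)
CHAR_TABLE = "".join(
    _rng.choices(string.ascii_letters + string.digits, k=TABLE_SIZE)
)


@dataclass
class TaskObject:
    key_prefix: int    # unique 64-bit prefix that identifies the key
    key_offset: int    # offset of the key suffix in CHAR_TABLE
    key_len: int       # total key length, prefix included
    value_offset: int  # offset of the value in CHAR_TABLE
    value_len: int     # value length


def make_object(prefix: int, key_len: int, value_len: int) -> TaskObject:
    """Describe an object by offsets and lengths instead of copying strings."""
    suffix_len = key_len - PREFIX_LEN
    return TaskObject(
        key_prefix=prefix,
        key_offset=_rng.randrange(TABLE_SIZE - suffix_len + 1),
        key_len=key_len,
        value_offset=_rng.randrange(TABLE_SIZE - value_len + 1),
        value_len=value_len,
    )


def key_of(obj: TaskObject) -> str:
    """Materialize the key only when it is needed."""
    suffix_len = obj.key_len - PREFIX_LEN
    suffix = CHAR_TABLE[obj.key_offset:obj.key_offset + suffix_len]
    return f"{obj.key_prefix:016x}{suffix}"


def value_of(obj: TaskObject) -> str:
    """Materialize the value only when it is needed."""
    return CHAR_TABLE[obj.value_offset:obj.value_offset + obj.value_len]
```

Each object costs a handful of integers regardless of its key and value sizes, which is the memory saving the text refers to.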
While running, each concurrency sequentially or randomly selects an object
from the window to perform a set or get operation. At the same time, each
concurrency kicks objects out of its window and adds new objects to it.
Simple but useful task scheduling
_________________________________

Memaslap uses libevent to schedule all the concurrencies of its threads, and
each concurrency schedules tasks based on its local task window. Memaslap
assumes that if each concurrency keeps the same key distribution, value
distribution, and command distribution, then memaslap as a whole keeps those
distributions too. Each task window holds many objects, and each object stores
its basic information, such as key, value, and expire time. At any time, all
the objects in the window keep the same, fixed key and value distribution. If
an object is overwritten, its value is updated. Memaslap verifies the data or
expire-time according to the object information stored in the task window.

Libevent selects which concurrency to handle based on a specific network
event. The concurrency then selects which command (get or set) to run based
on the command distribution. If it needs to kick out an old object and add a
new one, the new object must have the same key length and value length so that
the key and value distribution stays the same.

If the memcached server has two cache layers (memory and SSD), running
memaslap with different window sizes can produce different cache miss rates.
If memaslap adds enough objects into the windows at the beginning, and the
memcached cache cannot store all of them, then memaslap will fetch some
objects from the second cache layer, which means the first cache layer
misses. The user can therefore specify the window size to get the expected
miss rate of the first cache layer.
Useful implementation of multi-servers, UDP, TCP, multi-get and binary protocol
________________________________________________________________________________

Because each thread is self-governed, memaslap can assign different threads
to handle different memcached servers. This is one of the ways in which
memaslap tests multiple servers. The only limitation is that the number of
servers cannot be greater than the number of threads. The other way to test
multiple servers is the replication test, in which each concurrency has one
socket connection to each memcached server. In this mode, memaslap can set
some objects to one memcached server and get these objects from the other
servers.

By default, memaslap does single get. If the user specifies the multi-get
option, memaslap collects enough get commands, packs them, and sends them
together.

Memaslap supports both the ASCII protocol and the binary protocol, but it
runs on the ASCII protocol by default. Memaslap runs on TCP by default, but it
also supports UDP. Because UDP is unreliable, packets may be dropped or arrive
out of order. Memaslap creates a memory buffer to handle these problems: it
tries to read all the response data of one command from the server and
reorders it. If some packets are lost, the timeout mechanism ensures that
incomplete responses are discarded and the next command is sent.
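
The reordering buffer can be pictured with the sketch below. It relies on the 8-byte frame header that memcached puts in front of every UDP datagram (request ID, sequence number, total number of datagrams, reserved; each a 16-bit big-endian integer). Everything else is an assumption made for illustration: the ``Reassembler`` class, its method names, and the one-second default timeout are hypothetical and are not taken from the memaslap source.

```python
import struct
import time

# memcached UDP frame header: request id, sequence no., total datagrams, reserved
HEADER = struct.Struct("!HHHH")


class Reassembler:
    """Collect the datagrams of one response and return them in order."""

    def __init__(self, request_id, timeout=1.0):  # timeout value is hypothetical
        self.request_id = request_id
        self.deadline = time.monotonic() + timeout
        self.parts = {}
        self.total = None

    def feed(self, datagram):
        """Store one datagram; return the whole payload once every part is in."""
        req_id, seq, total, _reserved = HEADER.unpack_from(datagram)
        if req_id != self.request_id:
            return None  # stale packet that belongs to an earlier command
        self.total = total
        self.parts[seq] = datagram[HEADER.size:]
        if all(i in self.parts for i in range(total)):
            return b"".join(self.parts[i] for i in range(total))
        return None

    def expired(self, now=None):
        """True when the response is still incomplete after the deadline."""
        now = time.monotonic() if now is None else now
        incomplete = self.total is None or len(self.parts) < self.total
        return now >= self.deadline and incomplete
```

A caller would discard the ``Reassembler`` and move on to the next command as soon as ``expired()`` returns true, which mirrors the timeout behaviour described above.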
Below are some usage samples:

memaslap -s 127.0.0.1:11211 -S 5s

memaslap -s 127.0.0.1:11211 -t 2m -v 0.2 -e 0.05 -b

memaslap -s 127.0.0.1:11211 -F config -t 2m -w 40k -S 20s -o 0.2

memaslap -s 127.0.0.1:11211 -F config -t 2m -T 4 -c 128 -d 20 -P 40k

memaslap -s 127.0.0.1:11211 -F config -t 2m -d 50 -a -n 40

memaslap -s 127.0.0.1:11211,127.0.0.1:11212 -F config -t 2m

memaslap -s 127.0.0.1:11211,127.0.0.1:11212 -F config -t 2m -p 2

The user must specify at least one server to run memaslap. The rest of the
parameters have default values, as shown below:
Thread number = 1    Concurrency = 16

Run time = 600 seconds    Configuration file = NULL

Key size = 64    Value size = 1024

Get/set = 9:1    Window size = 10k

Execute number = 0    Single get = true

Multi-get = false    Number of sockets of each concurrency = 1

Reconnect = false    Data verification = false

Expire-time verification = false    ASCII protocol = true

Binary protocol = false    Dumping statistic information periodically = false

Overwrite proportion = 0%    UDP = false

TCP = true    Limit throughput = false

Facebook test = false    Replication test = false
Key size, value size and command distribution.
______________________________________________

All the distributions are read from the configuration file specified by the
user with the "--cfg_cmd" option. If the user does not specify a
configuration file, memaslap runs with the default distribution (key size =
64, value size = 1024, get/set = 9:1). For information on how to edit the
configuration file, refer to the "Configuration File" section.

The minimum key size is 16 bytes; the maximum key size is 250 bytes. The
precision of a proportion is 0.001; proportions are rounded to 3 decimal
places.

The minimum value size is 1 byte; the maximum value size is 1M bytes. The
precision of a proportion is 0.001; proportions are rounded to 3 decimal
places. Currently, memaslap only supports set and get commands, and it
supports 100% set and 100% get. For 100% get, it will preset some objects to
the server.
311 ____________________________
314 The high performance of memaslap benefits from the special
315 schedule of thread and concurrency. It's important to specify the proper
316 number of them. The default number of threads is 1; the default number of
317 concurrency is 16. The user can use "—threads" and "--concurrency" to
318 specify these variables.
320 If the system tests setting CPU affinity and the number of threads
321 specified by the user is greater than 1, memaslap will try to
322 bind each thread to a different CPU core. So if you want to get the best
323 performance memaslap, it is better to specify the number of
324 thread equal to the number of CPU cores. The number of threads specified by
325 the user can also be less or greater than the number of CPU cores. Because
326 of the limitation of implementation, the number of concurrencies could be
327 the multiple of the number of threads.
1. For a system with 8 CPU cores

--threads=2 --concurrency=128

--threads=8 --concurrency=128

--threads=8 --concurrency=256

--threads=12 --concurrency=144

2. For a system with 16 CPU cores

--threads=8 --concurrency=128

--threads=16 --concurrency=256

--threads=16 --concurrency=512

--threads=24 --concurrency=288
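
Every combination listed above divides evenly, which is easy to confirm. The helper below is a hypothetical sketch (``concurrency_per_thread`` is not a memaslap function); it assumes that the concurrencies are shared equally among the threads.

```python
def concurrency_per_thread(threads, concurrency):
    """Return how many concurrencies each thread runs."""
    if concurrency % threads != 0:
        raise ValueError("concurrency should be a multiple of the thread count")
    return concurrency // threads


for threads, concurrency in [(2, 128), (8, 128), (8, 256), (12, 144)]:
    print(threads, concurrency, concurrency_per_thread(threads, concurrency))
```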
Memaslap performs very well when used to test the performance of memcached
servers. Most of the time, the bottleneck is the network or the server. If
for some reason the user wants to limit the performance of memaslap, there
are two ways to do this:

Decrease the number of threads and concurrencies.

Use the "--tps" option that memaslap provides to limit the throughput. This
option allows the user to get the expected throughput. For example, if the
maximum throughput is 50 kops/s for a specific configuration, you can specify
a throughput equal to or less than that maximum using the "--tps" option.
Most of the time, the user does not need to specify the window size. The
default window size is 10k. For Schooner Memcached, the user can specify
different window sizes to get different cache miss rates based on the test
case. Memaslap supports cache miss rates between 0% and 100%. If you use this
utility to test the performance of Schooner Memcached, you can specify a
proper window size to get the expected cache miss rate. The formula for
calculating window size is as follows:

Assume that the key size is 128 bytes, and the value size is 2048 bytes, and

1. Small cache cache_size=1M, 100% cache miss (all data is fetched from SSD).

(1). cache miss rate 0%

(2). cache miss rate 5%

(1). cache miss rate 0%

The formula for calculating window size for cache miss rate 0%:

cache_size / concurrency / (key_size + value_size) \* 0.5

The formula for calculating window size for cache miss rate 5%:

cache_size / concurrency / (key_size + value_size) \* 0.7
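
The two formulas differ only in the final factor, so they can be evaluated with one function. In the sketch below the function name ``window_size`` is hypothetical, and the 4 GiB cache and the concurrency of 128 are illustrative inputs chosen for this example; only the key size, the value size, and the factors 0.5 and 0.7 come from the text above.

```python
def window_size(cache_size, concurrency, key_size, value_size, factor):
    """Evaluate cache_size / concurrency / (key_size + value_size) * factor."""
    return int(cache_size / concurrency / (key_size + value_size) * factor)


CACHE_SIZE = 4 * 1024 ** 3  # assumed 4 GiB cache, for illustration only
CONCURRENCY = 128           # assumed concurrency, for illustration only

print(window_size(CACHE_SIZE, CONCURRENCY, 128, 2048, 0.5))  # 0% miss rate: 7710
print(window_size(CACHE_SIZE, CONCURRENCY, 128, 2048, 0.7))  # 5% miss rate: 10794
```

Under these assumptions the results round to window sizes of roughly 8k and 11k objects.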
Memaslap supports both data verification and expire-time verification. The
user can use "--verify=" or "-v" to specify the proportion of data
verification. In theory, it supports 100% data verification. The user can use
"--exp_verify=" or "-e" to specify the proportion of expire-time
verification. In theory, it supports 100% expire-time verification. Specify
the "--verbose" option to get more detailed error messages.

For example, "--exp_verify=0.01 --verify=0.1" means that 1% of the objects
are set with an expire-time and 10% of the objects retrieved are verified. If
the objects are retrieved, memaslap verifies the expire-time and value.
multi-servers and multi-config
_______________________________

Memaslap supports multiple servers based on self-governed threads. The
limitation is that the number of servers cannot be greater than the number of
threads. Memaslap assigns at least one thread to handle each server. The user
can use the "--servers=" or "-s" option to specify multiple servers.

--servers=10.1.1.1:11211,10.1.1.2:11212,10.1.1.3:11213 --threads=6 --concurrency=36

The above command means that there are 6 threads, each with 6 concurrencies;
threads 0 and 3 handle server 0 (10.1.1.1); threads 1 and 4 handle server 1
(10.1.1.2); and threads 2 and 5 handle server 2 (10.1.1.3).
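
The assignment in this example is consistent with a simple round-robin rule in which thread ``i`` handles server ``i`` modulo the number of servers. That rule is inferred from the example rather than taken from the memaslap source, and ``server_for_thread`` is a hypothetical name.

```python
def server_for_thread(thread_index, server_count):
    """Round-robin mapping from a thread index to a server index."""
    return thread_index % server_count


servers = ["10.1.1.1:11211", "10.1.1.2:11212", "10.1.1.3:11213"]
assignment = {
    thread: servers[server_for_thread(thread, len(servers))]
    for thread in range(6)
}
for thread, server in assignment.items():
    print(f"thread {thread} -> {server}")
```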
All the threads and concurrencies in memaslap are self-governed.

So is memaslap itself. The user can start several memaslap instances, and
can run memaslap on different client machines to communicate with the same
memcached server at the same time. It is recommended that the user start the
memaslap instances on different machines with the same configuration.
Run with execute number mode or time mode
_________________________________________

By default, memaslap runs in time mode. The default run time is 10 minutes;
when the time is up, memaslap exits. Do not specify both execute number mode
and time mode at the same time; just specify one of them.

--time=30s (It means the test will run 30 seconds.)

--execute_number=100000 (It means that after running 100000 commands, the test will exit.)
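
The "--time" option takes a number followed by a unit suffix: s for seconds, m for minutes, h for hours, and d for days. The sketch below shows how such a value maps to seconds. ``parse_duration`` is a hypothetical helper, not memaslap code, and rejecting a value without a suffix is an assumption of this sketch.

```python
_UNITS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def parse_duration(text):
    """Convert a value such as '30s' or '2h' to a number of seconds."""
    unit = text[-1:].lower()
    if unit not in _UNITS:
        raise ValueError(f"missing or unknown time suffix in {text!r}")
    return int(text[:-1]) * _UNITS[unit]


print(parse_duration("30s"))  # 30
print(parse_duration("10m"))  # 600, the default run time
print(parse_duration("2h"))   # 7200
```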
Dump statistic information periodically.
________________________________________

The user can use "--stat_freq=" or "-S" to specify the frequency.

Memaslap will dump the statistics of the commands (get and set) every 20 seconds.

For more information on the format of the statistics output, refer to the "Format of Output" section.
The user can use "--division=" or "-d" to specify the number of keys per
multi-get. By default, memaslap does single get with TCP. Memaslap also
supports data verification and expire-time verification for multi-get.

Memaslap supports multi-get with both TCP and UDP. Because the ASCII protocol
and the binary protocol are implemented differently, the two behave
differently. For the ASCII protocol, memaslap sends a single "multi-get"
command to the server. For the binary protocol, memaslap sends several single
get commands together as a "multi-get" to the server.
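
For the ASCII protocol, a multi-get is one ``get`` command line that carries several space-separated keys. The sketch below builds such a request and splits a key list into groups of ``division`` keys. Both helper names are hypothetical and the code illustrates the wire format only; it is not how memaslap builds its requests.

```python
def ascii_multi_get(keys):
    """Build one ASCII-protocol get command for several keys."""
    return ("get " + " ".join(keys) + "\r\n").encode("ascii")


def split_by_division(keys, division):
    """Group keys so that each multi-get carries at most `division` keys."""
    return [keys[i:i + division] for i in range(0, len(keys), division)]


keys = [f"key{i}" for i in range(5)]
for group in split_by_division(keys, 2):
    print(ascii_multi_get(group))
```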
Memaslap supports both UDP and TCP. For TCP, memaslap does not reconnect to
the memcached server if socket connections are lost. If all the socket
connections are lost or the memcached server crashes, memaslap exits. If the
user specifies the "--reconnect" option, memaslap reconnects lost socket
connections.

The user can use "--udp" to enable the UDP feature, but UDP comes with some
limitations:

UDP cannot set data larger than 1400 bytes.

UDP is not supported with the binary protocol because the binary protocol of
memcached does not support it.

UDP does not support reconnection.
Set data with TCP and multi-get with UDP. Specify the following options:

"--facebook --division=50"

If you want to create thousands of TCP connections, specify the
"--conn_sock=" option.

For example: --facebook --division=50 --conn_sock=200

The above command means that memaslap runs the facebook test, and each
concurrency has 200 TCP socket connections and one UDP socket.

Memaslap sets objects with the TCP sockets and multi-gets 50 objects at a
time with the UDP socket.

If you specify "--division=50", the key size must be less than 25 bytes
because the UDP packet size is 1400 bytes.
For the replication test, the user must specify at least two memcached
servers. The user can use the "--rep_write=" option to enable this feature.

--servers=10.1.1.1:11211,10.1.1.2:11212 --rep_write=2

The above command means that there are 2 replication memcached servers.
Memaslap sets objects to both server 0 and server 1, gets objects previously
set to server 0 from server 1, and gets objects previously set to server 1
from server 0. If server 0 crashes, memaslap only gets objects from server 1.
If server 0 comes back to life, memaslap reconnects to server 0. If both
server 0 and server 1 crash, memaslap exits.
Supports thousands of TCP connections
_____________________________________

Start memaslap with "--conn_sock=" or "-n" to enable this feature. Make sure
that your system supports opening thousands of files and creating thousands
of sockets. However, this feature does not support reconnection if sockets
disconnect.

--threads=8 --concurrency=128 --conn_sock=128

The above command means that memaslap starts 8 threads, each thread has 16
concurrencies, each concurrency has 128 TCP socket connections, and the total
number of TCP socket connections is 128 \* 128 = 16384.
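
The arithmetic in this example, together with a check of the open-file limit, can be sketched as follows. Both functions are hypothetical helpers written for illustration. ``fd_limit_allows`` relies on Python's Unix-only ``resource`` module and only inspects the soft limit of the current process, which is an approximation of what a memaslap process would actually need.

```python
def connection_layout(threads, concurrency, conn_sock):
    """Return (concurrencies per thread, total TCP connections)."""
    per_thread = concurrency // threads
    total = concurrency * conn_sock
    return per_thread, total


def fd_limit_allows(total_connections):
    """Check the soft open-file limit of this process (Unix only)."""
    import resource  # imported here because the module is Unix-specific

    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return soft == resource.RLIM_INFINITY or soft >= total_connections


per_thread, total = connection_layout(threads=8, concurrency=128, conn_sock=128)
print(per_thread, total)  # 16 16384
```

If ``fd_limit_allows(total)`` returns false, the limit would have to be raised (for example with ``ulimit -n``) before a run of that size.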
Supports binary protocol
________________________

Start memaslap with the "--binary" or "-B" option to enable this feature. It
supports all the above features except UDP, because the latest memcached
1.3.3 does not implement the binary UDP protocol.

Since memcached 1.3.3 doesn't implement the binary UDP protocol, memaslap
does not support UDP with the binary protocol. In addition, memcached 1.3.3
does not support multi-get with the binary protocol. If you specify the
"--division=50" option, memaslap just sends 50 get commands together as a
"multi-get" to the server.
This section describes the format of the configuration file. By default,
when no configuration file is specified, memaslap reads the default one
located at ~/.memaslap.cnf.

Below is a sample configuration file:
---------------------------------------------------------------------------
#comments should start with '#'

#start_len end_len proportion

#key length range from start_len to end_len
#start_len must be equal to or greater than 16
#end_len must be equal to or less than 250
#start_len must be equal to or less than end_len
#memaslap will generate keys according to the key range
#proportion: indicates keys generated from one range accounts for the total

#example1: key range 16~100 accounts for 80%
#          key range 101~200 accounts for 10%
#          key range 201~250 accounts for 10%
#          total should be 1 (0.8+0.1+0.1 = 1)

#example2: all keys length are 128 bytes

#start_len end_len proportion

#value length range from start_len to end_len
#start_len must be equal to or greater than 1
#end_len must be equal to or less than 1M
#start_len must be equal to or less than end_len
#memaslap will generate values according to the value range
#proportion: indicates values generated from one range accounts for the
#total generated values

#example1: value range 1~1000 accounts for 80%
#          value range 1001~10000 accounts for 10%
#          value range 10001~100000 accounts for 10%
#          total should be 1 (0.8+0.1+0.1 = 1)

#example2: all value length are 128 bytes

#cmd_type cmd_proportion

#currently memaslap only supports get and set command.

#example: set command accounts for 50%
#         get command accounts for 50%
#         total should be 1 (0.5+0.5 = 1)
At the beginning, memaslap displays some configuration information as follows:

servers : 127.0.0.1:11211

set proportion: set_prop=0.10

get proportion: get_prop=0.90

The servers used by memaslap.

The number of threads memaslap runs with.

The number of concurrencies memaslap runs with.

How long to run memaslap.

The task window size of each concurrency.

The proportion of set command.

The proportion of get command.
The output of dynamic statistics is something like this:

---------------------------------------------------------------------------------------------------------------------------------

Type    Time(s)  Ops      TPS(ops/s)  Net(M/s)  Get_miss  Min(us)  Max(us)  Avg(us)  Std_dev  Geo_dist
Period  5        345826   69165       65.3      0         27       2198     203
Global  20       1257935  62896       71.8      0         26       3791     224

Type    Time(s)  Ops      TPS(ops/s)  Net(M/s)  Get_miss  Min(us)  Max(us)  Avg(us)  Std_dev  Geo_dist
Period  5        38425    7685        7.3       0         42       628      240
Global  20       139780   6989        8.0       0         37       3790     253

Type    Time(s)  Ops      TPS(ops/s)  Net(M/s)  Get_miss  Min(us)  Max(us)  Avg(us)  Std_dev  Geo_dist
Period  5        384252   76850       72.5      0         27       2198     207
Global  20       1397720  69886       79.7      0         26       3791     227

---------------------------------------------------------------------------------------------------------------------------------
Statistics information of get command

Statistics information of set command

Statistics information of both get and set command

Result within a period

Throughput, operations/second

How many objects could not be retrieved

The minimum response time

The maximum response time

The average response time

Standard deviation of response time

Geometric distribution based on natural exponential function
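
In the sample output above, the TPS column is consistent with the Ops column divided by the Time(s) column, truncated to an integer. This relationship is inferred from the sample numbers rather than from the memaslap source; the short check below uses only values copied from that sample, and the function name ``tps`` is hypothetical.

```python
def tps(ops, seconds):
    """Throughput as operations per second, truncated to an integer."""
    return ops // seconds


# (type, time in seconds, ops, reported TPS) copied from the sample output
rows = [
    ("Period", 5, 345826, 69165),
    ("Global", 20, 1257935, 62896),
    ("Period", 5, 38425, 7685),
    ("Global", 20, 139780, 6989),
    ("Period", 5, 384252, 76850),
    ("Global", 20, 1397720, 69886),
]
for label, seconds, ops, reported in rows:
    print(label, tps(ops, seconds), reported)
```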
At the end, memaslap will output something like this:

---------------------------------------------------------------------------------------------------------------------------------
Get Statistics (1257956 events)

8: 484890 459823 12543 824

Set Statistics (139782 events)

8: 50784 65574 2064 167

Total Statistics (1397738 events)

8: 535674 525397 14607 991

written_bytes: 242516030
read_bytes: 1003702556
object_bytes: 152086080

Run time: 20.0s Ops: 1397754 TPS: 69817 Net_rate: 59.4M/s
---------------------------------------------------------------------------------------------------------------------------------
Get statistics of response time

Set statistics of response time

Both get and set statistics of response time

The accumulated and minimum response time

The accumulated and maximum response time

The accumulated and average response time

Standard deviation of response time

Geometric distribution based on logarithm 2

Total get commands done

Total set commands done

How many objects could not be retrieved from the server

How many objects needed verification but could not be retrieved

How many objects had an inconsistent value

How many objects were expired but were still retrieved

How many objects were unexpired but could not be retrieved

How many UDP packets arrived out of order

How many UDP packets were lost

How many UDP timeouts occurred

Throughput, operations/second

The average rate of network
List one or more servers to connect to. The server count must not exceed the
thread count, e.g.: --servers=localhost:1234,localhost:11211

Number of threads to start; ideally equal to the number of CPU cores. Default 8.

Number of concurrencies to simulate with load. Default 128.

Number of TCP sockets per concurrency. Default 1.

-x, --execute_number=
Number of operations (get and set) to execute for the
given test. Default 1000000.

How long the test runs; suffix: s-seconds, m-minutes, h-hours,
d-days, e.g.: --time=2h.

Load the configuration file to get the command, key, and value distribution lists.

Task window size of each concurrency; suffix: K, M, e.g.: --win_size=10k.

Fixed length of value.

The proportion of data verification, e.g.: --verify=0.01

Number of keys to multi-get at once. Default 1, which means single get.

Frequency of dumping statistic information; suffix: s-seconds,
m-minutes, e.g.: --stat_freq=10s.

The proportion of objects with expire time, e.g.: --exp_verify=0.01.
By default, no object has an expire time.

The proportion of objects that need to be overwritten, e.g.: --overwrite=0.01.
By default, objects are never overwritten.

Reconnect support; when a connection is closed, it is reconnected.

UDP support; by default memaslap uses TCP. The TCP port and UDP port of
the server must be the same.

Whether to enable the facebook test feature: set with TCP and multi-get with UDP.

Whether to enable the binary protocol. Default is the ASCII protocol.

Expected throughput; suffix: K, e.g.: --tps=10k.

The first n servers can write data, e.g.: --rep_write=2.

Whether to output detailed information when verification fails.

Display this message and then exit.

Display the version of the application and then exit.
memaslap -s 127.0.0.1:11211 -S 5s

memaslap -s 127.0.0.1:11211 -t 2m -v 0.2 -e 0.05 -b

memaslap -s 127.0.0.1:11211 -F config -t 2m -w 40k -S 20s -o 0.2

memaslap -s 127.0.0.1:11211 -F config -t 2m -T 4 -c 128 -d 20 -P 40k

memaslap -s 127.0.0.1:11211 -F config -t 2m -d 50 -a -n 40

memaslap -s 127.0.0.1:11211,127.0.0.1:11212 -F config -t 2m

memaslap -s 127.0.0.1:11211,127.0.0.1:11212 -F config -t 2m -p 2

:manpage:`memcached(1)` :manpage:`libmemcached(3)`