==================================================
memaslap - Load testing and benchmarking a server
==================================================

.. envvar:: MEMCACHED_SERVERS

:program:`memaslap` is a load generation and benchmark tool for memcached
servers. It generates configurable workloads, with settings such as threads,
concurrency, connections, run time, overwrite, miss rate, key size, value
size, get/set proportion, and expected throughput. It also supports data
verification, expire-time verification, UDP, the binary protocol, the
facebook test, the replication test, multi-get, and reconnection.

Like memcached, memaslap manages network connections with libevent. Each
memaslap thread is bound to a CPU core, the threads do not communicate with
each other, and each thread holds several socket connections. Each
connection keeps its own key size distribution, value size distribution,
and command distribution.

You can specify servers via the :option:`memslap --servers` option or via the
environment variable :envvar:`MEMCACHED_SERVERS`.

Memaslap is developed for the following purposes:

Manages network connections asynchronously with libevent.

Sets up both TCP and UDP to use non-blocking IO.

Improves parallelism: higher performance in multi-threaded environments.

Improves time efficiency: faster processing speed.

Generates keys and values more efficiently; the key size distribution and value size distribution are configurable.

Supports get, multi-get, and set commands; the command distribution is configurable.

Supports a controllable miss rate and overwrite rate.

Supports data and expire-time verification.

Supports dumping statistics periodically.

Supports thousands of TCP connections.

Supports the binary protocol.

Supports the facebook test (set with TCP and multi-get with UDP) and the replication test.

Effective implementation of network.
____________________________________

In memaslap, both TCP and UDP use non-blocking network IO. All network
events are managed by libevent, as in memcached, and the network module of
memaslap is similar to that of memcached. Libevent lets memaslap handle
network traffic very efficiently.

Effective implementation of multi-threads and concurrency
_________________________________________________________

Memaslap implements multi-threading in a way similar to memcached. It
creates one or more self-governed threads; each thread is bound to one CPU
core if the system supports setting CPU affinity.

In addition, each thread has its own libevent instance to manage network
events; each thread has one or more self-governed concurrencies; and each
concurrency has one or more socket connections. The concurrencies do not
communicate with each other, even when they are in the same thread.

Memaslap can create thousands of socket connections, and each concurrency
can have tens of socket connections. Each concurrency randomly or
sequentially selects one socket connection from its socket connection pool
to run, so memaslap ensures that each concurrency handles one socket
connection at any given time. Users can specify the number of concurrencies
and the number of socket connections per concurrency according to their
needs.

Effective implementation of generating key and value
____________________________________________________

To improve time and space efficiency, memaslap creates a table of 10M
random characters. All key suffixes and values are generated from this
random character table.

Memaslap identifies a string by its offset in the character table and its
length, which saves a lot of memory. Each key contains two parts, a prefix
and a suffix. The prefix is a uint64_t (8 bytes). To verify previously set
data, memaslap needs each key to be unique, so it uses the prefix to
identify a key. The prefix cannot include illegal characters such as '\r',
'\n', '\0' and ' ', and memaslap has an algorithm to ensure that.

Memaslap does not generate all the objects (key-value pairs) at the
beginning. It only generates enough objects to fill the task window
(default 10K objects) of each concurrency. Each object carries the following
basic information: key prefix, key suffix offset in the character table, key
length, value offset in the character table, and value length.

While running, each concurrency sequentially or randomly selects an object
from the window to perform a set or get operation. At the same time, each
concurrency kicks objects out of its window and adds new objects to it.

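The offset-and-length scheme described above can be sketched in Python as follows. This is only an illustration, not memaslap's C implementation: the names ``make_table``, ``ObjectRef``, ``key_suffix`` and ``value`` are hypothetical, the alphabet is an assumption, and the table is scaled down from the 10M characters mentioned above so that the example runs quickly.

```python
import random
import string
from dataclasses import dataclass


def make_table(size, seed=0):
    """Build a table of random characters (hypothetical helper)."""
    rng = random.Random(seed)
    alphabet = string.ascii_letters + string.digits  # assumed alphabet
    return "".join(rng.choices(alphabet, k=size))


@dataclass(frozen=True)
class ObjectRef:
    """Per-object bookkeeping: integers only, no copies of the strings."""
    key_prefix: int    # unique 64-bit prefix identifying the key
    key_offset: int    # offset of the key suffix in the table
    key_len: int       # assumption: length of the suffix only
    value_offset: int  # offset of the value in the table
    value_len: int     # length of the value


def _slice(table, offset, length):
    """Return table[offset:offset+length], rejecting out-of-range requests."""
    if offset < 0 or length < 0 or offset + length > len(table):
        raise ValueError("slice falls outside the character table")
    return table[offset:offset + length]


def key_suffix(table, obj):
    return _slice(table, obj.key_offset, obj.key_len)


def value(table, obj):
    return _slice(table, obj.value_offset, obj.value_len)


table = make_table(64 * 1024)  # scaled-down table for the example
obj = ObjectRef(key_prefix=1, key_offset=10, key_len=56,
                value_offset=2048, value_len=1024)
```

Because every object stores only a prefix plus two (offset, length) pairs, a window of many objects costs a few integers per object rather than a copy of each key and value.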
Simple but useful task scheduling
_________________________________

Memaslap uses libevent to schedule all the concurrencies of its threads,
and each concurrency schedules tasks based on its local task window.
Memaslap assumes that if each concurrency keeps the same key distribution,
value distribution, and command distribution, then memaslap as a whole
keeps those distributions when seen from outside. Each task window holds
many objects, and each object stores its basic information, such as key,
value, and expire time. At any time, the objects in the window keep the
same fixed key and value distribution. If an object is overwritten, its
value is updated. Memaslap verifies data and expire time against the object
information stored in the task window.

Libevent selects which concurrency to handle based on a specific network
event. The concurrency then selects which command (get or set) to run based
on the command distribution. If it needs to kick out an old object and add
a new one, the new object must have the same key length and value length so
that the key and value distribution stays the same.

If the memcached server has two cache layers (memory and SSD), running
memaslap with different window sizes produces different cache miss rates.
If memaslap adds enough objects to the windows at the beginning, and the
memcached cache cannot store all the initialized objects, memaslap will get
some objects from the second cache layer, which causes misses in the first
cache layer. The user can therefore specify the window size to get the
expected miss rate of the first cache layer.

Useful implementation of multi-servers, UDP, TCP, multi-get and binary protocol
________________________________________________________________________________

Because each thread is self-governed, memaslap can assign different threads
to different memcached servers. This is just one of the ways in which
memaslap tests multiple servers. The only limitation is that the number of
servers cannot be greater than the number of threads. The other way to test
multiple servers is the replication test, in which each concurrency has one
socket connection to each memcached server. In this mode, memaslap can set
some objects on one memcached server and get them from the other servers.

By default, memaslap does single gets. If the user specifies the multi-get
option, memaslap collects enough get commands, packs them, and sends them
together.

Memaslap supports both the ASCII protocol and the binary protocol, but it
uses the ASCII protocol by default. Memaslap uses TCP by default, but it
also supports UDP. Because UDP is unreliable, packets may be dropped or
arrive out of order. Memaslap creates a memory buffer to handle these
problems: it tries to read all the response data of one command from the
server and reorders the response data. If some packets are lost, a wait
timeout ensures that incomplete responses are discarded and the next
command is sent.

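The reordering step can be illustrated with a small Python sketch. It relies on the memcached UDP frame header, which is 8 bytes long and holds four 16-bit big-endian fields: request ID, sequence number, total number of datagrams, and a reserved field. The buffering strategy and the names ``reassemble`` and ``frame`` are assumptions made for illustration; they are not taken from memaslap's source.

```python
import struct

# Request ID, sequence number, total datagrams, reserved (network byte order).
UDP_HEADER = struct.Struct("!HHHH")


def frame(request_id, seq, total, payload):
    """Build one datagram: frame header followed by the payload."""
    return UDP_HEADER.pack(request_id, seq, total, 0) + payload


def reassemble(datagrams, request_id):
    """Return the full response payload, or None if datagrams are missing."""
    parts = {}
    total = None
    for dgram in datagrams:
        if len(dgram) < UDP_HEADER.size:
            continue  # too short to carry a frame header
        rid, seq, count, _reserved = UDP_HEADER.unpack_from(dgram)
        if rid != request_id:
            continue  # belongs to another request
        total = count
        parts[seq] = dgram[UDP_HEADER.size:]
    if total is None or any(i not in parts for i in range(total)):
        return None  # incomplete: a caller would drop it after a timeout
    # Concatenate in sequence order, whatever the arrival order was.
    return b"".join(parts[i] for i in range(total))
```

A caller would keep collecting datagrams until ``reassemble`` returns a payload or its own timeout expires, at which point the partial response is discarded.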
Below are some usage samples:

memaslap -s 127.0.0.1:11211 -S 5s

memaslap -s 127.0.0.1:11211 -t 2m -v 0.2 -e 0.05 -b

memaslap -s 127.0.0.1:11211 -F config -t 2m -w 40k -S 20s -o 0.2

memaslap -s 127.0.0.1:11211 -F config -t 2m -T 4 -c 128 -d 20 -P 40k

memaslap -s 127.0.0.1:11211 -F config -t 2m -d 50 -a -n 40

memaslap -s 127.0.0.1:11211,127.0.0.1:11212 -F config -t 2m

memaslap -s 127.0.0.1:11211,127.0.0.1:11212 -F config -t 2m -p 2

The user must specify at least one server to run memaslap. The rest of the
parameters have default values, as shown below:

==========================================  ===========
Parameter                                   Default
==========================================  ===========
Thread number                               1
Concurrency                                 16
Run time                                    600 seconds
Configuration file                          NULL
Key size                                    64
Value size                                  1024
Get/set                                     9:1
Window size                                 10k
Execute number                              0
Single get                                  true
Multi-get                                   false
Number of sockets of each concurrency       1
Reconnect                                   false
Data verification                           false
Expire-time verification                    false
ASCII protocol                              true
Binary protocol                             false
Dumping statistic information periodically  false
Overwrite proportion                        0%
UDP                                         false
TCP                                         true
Limit throughput                            false
Facebook test                               false
Replication test                            false
==========================================  ===========

Key size, value size and command distribution.
______________________________________________

All the distributions are read from the configuration file specified by the
user with the "--cfg_cmd" option. If the user does not specify a
configuration file, memaslap runs with the default distribution (key size =
64, value size = 1024, get/set = 9:1). For information on how to edit the
configuration file, refer to the "Configuration File" section.

The minimum key size is 16 bytes; the maximum key size is 250 bytes. The
precision of a proportion is 0.001; proportions are rounded to 3 decimal
places.

The minimum value size is 1 byte; the maximum value size is 1M bytes. The
precision of a proportion is 0.001; proportions are rounded to 3 decimal
places. Currently, memaslap only supports the set and get commands, and it
supports 100% set and 100% get. For 100% get, it will preset some objects
to the server.

Multi-thread and concurrency
____________________________

The high performance of memaslap comes from its scheduling of threads and
concurrencies, so it is important to specify a proper number of each. The
default number of threads is 1; the default number of concurrencies is 16.
The user can use "--threads" and "--concurrency" to specify these values.

If the system supports setting CPU affinity and the number of threads
specified by the user is greater than 1, memaslap will try to bind each
thread to a different CPU core. To get the best performance from memaslap,
it is therefore better to set the number of threads equal to the number of
CPU cores. The number of threads specified by the user can also be less or
greater than the number of CPU cores. Because of a limitation of the
implementation, the number of concurrencies should be a multiple of the
number of threads.

1. For a system with 8 CPU cores

   --threads=2 --concurrency=128

   --threads=8 --concurrency=128

   --threads=8 --concurrency=256

   --threads=12 --concurrency=144

2. For a system with 16 CPU cores

   --threads=8 --concurrency=128

   --threads=16 --concurrency=256

   --threads=16 --concurrency=512

   --threads=24 --concurrency=288

Memaslap performs very well when used to test the performance of memcached
servers. Most of the time, the bottleneck is the network or the server. If
for some reason the user wants to limit the performance of memaslap, there
are two ways to do this:

Decrease the number of threads and concurrencies.

Use the "--tps" option that memaslap provides to limit the throughput. This
option allows the user to get the expected throughput. For example, if the
maximum throughput is 50 kops/s for a specific configuration, you can
specify a throughput equal to or less than that maximum with the "--tps"
option.

Most of the time, the user does not need to specify the window size. The
default window size is 10k. For Schooner Memcached, the user can specify
different window sizes to get different cache miss rates based on the test
case. Memaslap supports cache miss rates between 0% and 100%. If you use
this utility to test the performance of Schooner Memcached, you can specify
a proper window size to get the expected cache miss rate. The formula for
calculating window size is as follows:

Assume that the key size is 128 bytes, and the value size is 2048 bytes, and

1. Small cache cache_size=1M, 100% cache miss (all data is fetched from SSD).

(1). cache miss rate 0%

(2). cache miss rate 5%

(1). cache miss rate 0%

The formula for calculating window size for cache miss rate 0%:

cache_size / concurrency / (key_size + value_size) \* 0.5

The formula for calculating window size for cache miss rate 5%:

cache_size / concurrency / (key_size + value_size) \* 0.7

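The two formulas above differ only in the final factor, so they can be written as one small Python helper. This is a sketch, not part of memaslap: the function name ``window_size`` is hypothetical, and the cache size (4 GiB) and concurrency (128) used in the example call are assumed values chosen for illustration. The key size and value size are the 128 and 2048 bytes assumed in the text.

```python
FACTOR_MISS_0 = 0.5  # factor from the formula for a 0% cache miss rate
FACTOR_MISS_5 = 0.7  # factor from the formula for a 5% cache miss rate


def window_size(cache_size, concurrency, key_size, value_size, factor):
    """cache_size / concurrency / (key_size + value_size) * factor."""
    if cache_size <= 0 or concurrency <= 0 or key_size + value_size <= 0:
        raise ValueError("sizes and concurrency must be positive")
    return int(cache_size / concurrency / (key_size + value_size) * factor)


# Assumed inputs: 4 GiB cache and 128 concurrencies.
cache = 4 * 1024 ** 3
print(window_size(cache, 128, 128, 2048, FACTOR_MISS_0))
print(window_size(cache, 128, 128, 2048, FACTOR_MISS_5))
```

The result is a number of objects per concurrency, which is what the window size option expects.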
Memaslap supports both data verification and expire-time verification. The
user can use "--verify=" or "-v" to specify the proportion of data
verification. In theory, it supports 100% data verification. The user can
use "--exp_verify=" or "-e" to specify the proportion of expire-time
verification. In theory, it supports 100% expire-time verification. Specify
the "--verbose" option to get more detailed error information.

For example, "--exp_verify=0.01 --verify=0.1" means that 1% of the objects
are set with an expire time and 10% of the objects fetched will be
verified. If the objects are fetched, memaslap will verify the expire-time
and the value.

Multi-servers and multi-config
_______________________________

Memaslap supports multiple servers based on self-governed threads. There is
a limitation that the number of servers cannot be greater than the number of
threads. Memaslap assigns at least one thread to handle each server. The
user can use the "--servers=" or "-s" option to specify multiple servers.
For example:

--servers=10.1.1.1:11211,10.1.1.2:11212,10.1.1.3:11213 --threads=6 --concurrency=36

The above command means that there are 6 threads, each with 6
concurrencies, and that threads 0 and 3 handle server 0 (10.1.1.1); threads
1 and 4 handle server 1 (10.1.1.2); and threads 2 and 5 handle server 2
(10.1.1.3).

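One assignment rule that reproduces the example above is a simple round-robin: thread ``t`` handles server ``t % number_of_servers``. The Python sketch below shows that rule; it is an assumption that matches the example, not a statement about how memaslap is implemented, and the function name ``assign_servers`` is hypothetical.

```python
def assign_servers(threads, servers):
    """Map each thread index to a server, round-robin (assumed rule)."""
    if not servers:
        raise ValueError("at least one server is required")
    if len(servers) > threads:
        raise ValueError("the number of servers cannot exceed the number of threads")
    return {t: servers[t % len(servers)] for t in range(threads)}


servers = ["10.1.1.1:11211", "10.1.1.2:11212", "10.1.1.3:11213"]
plan = assign_servers(6, servers)
for thread, server in plan.items():
    print(f"thread {thread} -> {server}")
```

The check on the server count mirrors the limitation stated above: with fewer threads than servers, some server would be left without a thread.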
All the threads and concurrencies in memaslap are self-governed.

So is memaslap itself. The user can start up several memaslap instances and
can run memaslap on different client machines to communicate with the same
memcached server at the same time. It is recommended that the user start
the memaslap instances on different machines using the same configuration.

Run with execute number mode or time mode
_________________________________________

By default, memaslap runs in time mode. The default run time is 10 minutes.
When the time is up, memaslap exits. Do not specify both execute number
mode and time mode at the same time; just specify one of them.

--time=30s (the test will run for 30 seconds)

--execute_number=100000 (the test will exit after running 100000 commands)

Dump statistic information periodically.
________________________________________

The user can use "--stat_freq=" or "-S" to specify the frequency.

Memaslap will dump the statistics of the commands (get and set) at the frequency of every 20
seconds.

For more information on the format of the dumped statistics, refer to the "Format of Output" section.

The user can use "--division=" or "-d" to specify the number of keys per
multi-get. By default, memaslap does single gets with TCP. Memaslap also
supports data verification and expire-time verification for multi-get.

Memaslap supports multi-get with both TCP and UDP. Because the ASCII
protocol and the binary protocol are implemented differently, the two
behave differently. For the ASCII protocol, memaslap sends one "multi-get"
command to the server. For the binary protocol, memaslap sends several
single get commands together as a "multi-get" to the server.

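For the ASCII case, a multi-get is a single ``get`` line that lists several keys separated by spaces and ends with ``\r\n``, as defined by the memcached text protocol. The Python sketch below groups keys by a division count and builds one such command per group. The helper names ``batches`` and ``ascii_multi_get`` are hypothetical and the code is not taken from memaslap.

```python
def batches(keys, division):
    """Split keys into groups of at most `division` keys."""
    if division < 1:
        raise ValueError("division must be at least 1")
    return [keys[i:i + division] for i in range(0, len(keys), division)]


def ascii_multi_get(keys):
    """Build one ASCII-protocol get command for several keys."""
    if not keys:
        raise ValueError("at least one key is required")
    return b"get " + b" ".join(keys) + b"\r\n"


keys = [b"k1", b"k2", b"k3", b"k4", b"k5"]
commands = [ascii_multi_get(group) for group in batches(keys, 2)]
```

With a division of 1 each command carries a single key, which corresponds to the default single-get behaviour.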
Memaslap supports both UDP and TCP. For TCP, memaslap does not reconnect to
the memcached server if socket connections are lost. If all the socket
connections are lost or the memcached server crashes, memaslap will exit.
If the user specifies the "--reconnect" option, memaslap will reconnect
lost socket connections.

The user can use "--udp" to enable the UDP feature, but UDP comes with some
limitations:

UDP cannot set data larger than 1400 bytes.

UDP is not supported by the binary protocol because the binary protocol of
memcached does not support it.

UDP doesn't support reconnection.

Set data with TCP and multi-get with UDP. Specify the following options:

"--facebook --division=50"

If you want to create thousands of TCP connections, specify the
"--conn_sock=" option.

For example: --facebook --division=50 --conn_sock=200

The above command means that memaslap will run the facebook test, and each
concurrency has 200 TCP socket connections and one UDP socket.

Memaslap sets objects with the TCP socket, and multi-gets 50 objects at a
time with the UDP socket.

If you specify "--division=50", the key size must be less than 25 bytes
because the UDP packet size is 1400 bytes.

For the replication test, the user must specify at least two memcached
servers. The user can use the "--rep_write=" option to enable this feature.

--servers=10.1.1.1:11211,10.1.1.2:11212 --rep_write=2

The above command means that there are 2 replicated memcached servers.
Memaslap will set objects to both server 0 and server 1, get objects
previously set to server 0 from server 1, and get objects previously set to
server 1 from server 0. If server 0 crashes, memaslap will only get objects
from server 1. If server 0 comes back to life, memaslap will reconnect to
server 0. If both server 0 and server 1 crash, memaslap will exit.

Supports thousands of TCP connections
_____________________________________

Start memaslap with "--conn_sock=" or "-n" to enable this feature. Make
sure that your system supports opening thousands of files and creating
thousands of sockets. However, this feature does not support reconnection
if sockets disconnect.

--threads=8 --concurrency=128 --conn_sock=128

The above command means that memaslap starts up 8 threads, each thread has
16 concurrencies, each concurrency has 128 TCP socket connections, and the
total number of TCP socket connections is 128 \* 128 = 16384.

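The arithmetic in this example can be checked with a short Python sketch. The function name ``connection_plan`` is hypothetical; it only restates the relationships described above (concurrencies are split evenly across threads, and every concurrency owns ``conn_sock`` TCP sockets).

```python
def connection_plan(threads, concurrency, conn_sock):
    """Return (concurrencies per thread, total TCP socket connections)."""
    if threads < 1 or concurrency < 1 or conn_sock < 1:
        raise ValueError("all arguments must be positive")
    per_thread, remainder = divmod(concurrency, threads)
    if remainder:
        raise ValueError("concurrency should be a multiple of threads")
    return per_thread, concurrency * conn_sock


per_thread, total = connection_plan(threads=8, concurrency=128, conn_sock=128)
print(per_thread, total)
```

The total is a useful number to compare against the open-file limit of the client machine before starting a run.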
Supports binary protocol
________________________

Start memaslap with the "--binary" or "-B" option to enable this feature.
It supports all the above features except UDP, because the latest memcached
1.3.3 does not implement the binary UDP protocol.

Since memcached 1.3.3 doesn't implement the binary UDP protocol, memaslap
does not support UDP with the binary protocol. In addition, memcached 1.3.3
does not support multi-get. If you specify the "--division=50" option, it
just sends 50 get commands together as a "multi-get" to the server.

This section describes the format of the configuration file. By default,
when no configuration file is specified, memaslap reads the default one
located at ~/.memaslap.cnf.

Below is a sample configuration file::

  ---------------------------------------------------------------------------
  #comments should start with '#'

  #start_len end_len proportion

  #key length range from start_len to end_len
  #start_len must be equal to or greater than 16
  #end_len must be equal to or less than 250
  #start_len must be equal to or less than end_len
  #memaslap will generate keys according to the key range
  #proportion: indicates keys generated from one range accounts for the total
  #generated keys

  #example1: key range 16~100 accounts for 80%
  #          key range 101~200 accounts for 10%
  #          key range 201~250 accounts for 10%
  #          total should be 1 (0.8+0.1+0.1 = 1)

  #example2: all keys length are 128 bytes

  #start_len end_len proportion

  #value length range from start_len to end_len
  #start_len must be equal to or greater than 1
  #end_len must be equal to or less than 1M
  #start_len must be equal to or less than end_len
  #memaslap will generate values according to the value range
  #proportion: indicates values generated from one range accounts for the
  #total generated values

  #example1: value range 1~1000 accounts for 80%
  #          value range 1001~10000 accounts for 10%
  #          value range 10001~100000 accounts for 10%
  #          total should be 1 (0.8+0.1+0.1 = 1)

  #example2: all value length are 128 bytes

  #cmd_type cmd_proportion

  #currently memaslap only supports get and set command.

  #example: set command accounts for 50%
  #         get command accounts for 50%
  #         total should be 1 (0.5+0.5 = 1)

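The rules stated in the comments of the sample (a ``start_len end_len proportion`` layout, length bounds, and proportions that add up to 1) can be checked with a short Python sketch. This is not memaslap's parser: the function name ``parse_ranges`` is hypothetical, and the data rows in the example are written out from the "example1" comment for key ranges rather than copied from a real configuration file.

```python
def parse_ranges(lines, min_len, max_len):
    """Parse 'start_len end_len proportion' rows and validate them."""
    ranges = []
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and comments carry no data
        start_text, end_text, prop_text = line.split()
        start, end = int(start_text), int(end_text)
        proportion = round(float(prop_text), 3)  # precision is 0.001
        if not (min_len <= start <= end <= max_len):
            raise ValueError(f"invalid range: {line!r}")
        ranges.append((start, end, proportion))
    total = sum(p for _, _, p in ranges)
    if abs(total - 1.0) > 0.0005:
        raise ValueError(f"proportions add up to {total}, expected 1")
    return ranges


# Rows derived from the 'example1' comment for keys (16..250 bytes).
key_rows = [
    "#start_len end_len proportion",
    "16 100 0.8",
    "101 200 0.1",
    "201 250 0.1",
]
key_ranges = parse_ranges(key_rows, min_len=16, max_len=250)
```

The same helper could be applied to value ranges by passing the value bounds instead of the key bounds.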
At the beginning, memaslap displays some configuration information as follows:

servers : 127.0.0.1:11211

set proportion: set_prop=0.10

get proportion: get_prop=0.90

The servers used by memaslap.

The number of threads memaslap runs with.

The number of concurrencies memaslap runs with.

How long to run memaslap.

The task window size of each concurrency.

The proportion of set command.

The proportion of get command.

The output of dynamic statistics is something like this::

  ---------------------------------------------------------------------------------------------------------------------------------

  Get Statistics
  Type    Time(s)  Ops      TPS(ops/s)  Net(M/s)  Get_miss  Min(us)  Max(us)  Avg(us)  Std_dev  Geo_dist
  Period  5        345826   69165       65.3      0         27       2198     203
  Global  20       1257935  62896       71.8      0         26       3791     224

  Set Statistics
  Type    Time(s)  Ops      TPS(ops/s)  Net(M/s)  Get_miss  Min(us)  Max(us)  Avg(us)  Std_dev  Geo_dist
  Period  5        38425    7685        7.3       0         42       628      240
  Global  20       139780   6989        8.0       0         37       3790     253

  Total Statistics
  Type    Time(s)  Ops      TPS(ops/s)  Net(M/s)  Get_miss  Min(us)  Max(us)  Avg(us)  Std_dev  Geo_dist
  Period  5        384252   76850       72.5      0         27       2198     207
  Global  20       1397720  69886       79.7      0         26       3791     227

  ---------------------------------------------------------------------------------------------------------------------------------

Statistics information of get command

Statistics information of set command

Statistics information of both get and set command

Result within a period

Throughput, operations/second

How many objects could not be fetched

The minimum response time

The maximum response time

The average response time

Standard deviation of response time

Geometric distribution based on natural exponential function

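In the sample rows shown earlier, the TPS column is consistent with the operation count divided by the elapsed time and rounded down. The Python sketch below checks that relationship against those sample values. It is an observation about the sample output, not a description of memaslap's code, and the function name ``tps`` is hypothetical.

```python
def tps(ops, seconds):
    """Throughput in operations per second, rounded down."""
    if seconds <= 0:
        raise ValueError("elapsed time must be positive")
    return ops // seconds


# (time in seconds, ops, reported TPS) taken from the sample output.
sample_rows = [
    (5, 345826, 69165),
    (20, 1257935, 62896),
    (5, 38425, 7685),
    (20, 139780, 6989),
    (5, 384252, 76850),
    (20, 1397720, 69886),
]
for seconds, ops, reported in sample_rows:
    print(seconds, ops, tps(ops, seconds), reported)
```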
At the end, memaslap will output something like this::

  ---------------------------------------------------------------------------------------------------------------------------------
  Get Statistics (1257956 events)

  8: 484890 459823 12543 824

  Set Statistics (139782 events)

  8: 50784 65574 2064 167

  Total Statistics (1397738 events)

  8: 535674 525397 14607 991

  written_bytes: 242516030
  read_bytes: 1003702556
  object_bytes: 152086080

  Run time: 20.0s Ops: 1397754 TPS: 69817 Net_rate: 59.4M/s
  ---------------------------------------------------------------------------------------------------------------------------------

Get statistics of response time

Set statistics of response time

Both get and set statistics of response time

The accumulated and minimum response time

The accumulated and maximum response time

The accumulated and average response time

Standard deviation of response time

Geometric distribution based on logarithm 2

Total get commands done

Total set commands done

How many objects could not be fetched from the server

How many objects needed verification but could not be fetched

How many objects had an inconsistent value

How many objects were expired but were still returned

How many objects were unexpired but could not be fetched

How many UDP packets arrived out of order

How many UDP packets were lost

How many times a UDP timeout happened

Throughput, operations/second

The average network rate

List one or more servers to connect to. The server count must not be
greater than the thread count. e.g.: --servers=localhost:1234,localhost:11211

Number of threads to start up; ideally equal to the number of CPUs. Default 8.

Number of concurrencies to simulate load with. Default 128.

Number of TCP sockets per concurrency. Default 1.

-x, --execute_number=

Number of operations (get and set) to execute for the given test. Default
1000000.

How long the test runs. Suffixes: s-seconds, m-minutes, h-hours, d-days.
e.g.: --time=2h.

Load the configuration file to get the command, key and value distribution lists.

Task window size of each concurrency. Suffixes: K, M. e.g.: --win_size=10k.

Fixed length of value.

The proportion of data verification, e.g.: --verify=0.01

Number of keys to multi-get at once. Default 1, which means single get.

Frequency of dumping statistic information. Suffixes: s-seconds,
m-minutes. e.g.: --stat_freq=10s.

The proportion of objects with an expire time, e.g.: --exp_verify=0.01.
By default no object has an expire time.

The proportion of objects that need to be overwritten, e.g.:
--overwrite=0.01. By default objects are never overwritten.

Reconnect support: when a connection is closed, it will be reconnected.

UDP support. By default memaslap uses TCP. TCP port and UDP port of

Whether to enable the facebook test feature: set with TCP and multi-get with UDP.

Whether to enable the binary protocol. The default is the ASCII protocol.

Expected throughput. Suffix: K. e.g.: --tps=10k.

The first nth servers can write data, e.g.: --rep_write=2.

Whether to output detailed information when verification fails.

Display this message and then exit.

Display the version of the application and then exit.

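The time suffixes listed for the run-time option (s, m, h, d) map naturally to seconds. The Python sketch below shows one way to interpret such values, for example when preparing a test plan in a wrapper script. It is not memaslap's own parser: the function name ``parse_duration`` is hypothetical, and treating a bare number as seconds and accepting upper-case suffixes are both assumptions.

```python
SECONDS_PER_UNIT = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def parse_duration(text):
    """Convert values such as '30s', '2m' or '2h' to seconds."""
    value = text.strip().lower()  # assumption: suffix case is ignored
    if not value:
        raise ValueError("empty duration")
    if value[-1] in SECONDS_PER_UNIT:
        number, factor = value[:-1], SECONDS_PER_UNIT[value[-1]]
    else:
        number, factor = value, 1  # assumption: bare numbers are seconds
    if not number.isdigit():
        raise ValueError(f"invalid duration: {text!r}")
    return int(number) * factor


for sample in ("30s", "2m", "2h", "1d"):
    print(sample, parse_duration(sample))
```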
memaslap -s 127.0.0.1:11211 -S 5s

memaslap -s 127.0.0.1:11211 -t 2m -v 0.2 -e 0.05 -b

memaslap -s 127.0.0.1:11211 -F config -t 2m -w 40k -S 20s -o 0.2

memaslap -s 127.0.0.1:11211 -F config -t 2m -T 4 -c 128 -d 20 -P 40k

memaslap -s 127.0.0.1:11211 -F config -t 2m -d 50 -a -n 40

memaslap -s 127.0.0.1:11211,127.0.0.1:11212 -F config -t 2m

memaslap -s 127.0.0.1:11211,127.0.0.1:11212 -F config -t 2m -p 2

:manpage:`memcached(1)` :manpage:`libmemcached(3)`