==================================================
memaslap - Load testing and benchmarking a server
==================================================

:program:`memaslap` is a load generation and benchmark tool for memcached
servers. It generates configurable workloads, controlled by settings such as
threads, concurrencies, connections, run time, overwrite rate, miss rate, key
size, value size, get/set proportion, and expected throughput. It also
supports data verification, expire-time verification, UDP, the binary
protocol, the facebook test, the replication test, multi-get, and
reconnection.

Memaslap manages network connections with libevent, as memcached does. Each
memaslap thread is bound to a CPU core, threads do not communicate with each
other, and each thread holds several socket connections. Each connection
keeps its own key size distribution, value size distribution, and command
distribution.

You can specify servers via the :option:`--servers` option or via the
environment variable :envvar:`MEMCACHED_SERVERS`.

Memaslap was developed for the following purposes:

- Manages network connections with libevent asynchronously.
- Uses non-blocking IO for both TCP and UDP.
- Improves parallelism: higher performance in multi-threaded environments.
- Improves time efficiency: faster processing speed.
- Generates keys and values more efficiently; key size distribution and
  value size distribution are configurable.
- Supports get, multi-get, and set commands; command distribution is
  configurable.
- Supports controllable miss rate and overwrite rate.
- Supports data and expire-time verification.
- Supports dumping statistics periodically.
- Supports thousands of TCP connections.
- Supports the binary protocol.
- Supports the facebook test (set with TCP and multi-get with UDP) and the
  replication test.

Effective implementation of network.
____________________________________

Memaslap uses non-blocking network IO for both TCP and UDP. All network
events are managed by libevent, as in memcached, and the network module of
memaslap is similar to that of memcached. Libevent lets memaslap handle
network traffic very efficiently.

Effective implementation of multi-threads and concurrency
_________________________________________________________

The multi-threading implementation of memaslap is similar to that of
memcached. Memaslap creates one or more self-governed threads; each thread is
bound to one CPU core if the system supports setting CPU affinity.

In addition, each thread has its own libevent instance to manage network
events; each thread has one or more self-governed concurrencies; and each
concurrency has one or more socket connections. Concurrencies do not
communicate with each other, even when they are in the same thread.

Memaslap can create thousands of socket connections, and each concurrency
can have tens of socket connections. Each concurrency randomly or
sequentially selects one socket connection from its socket connection pool
to run, so each concurrency handles one socket connection at any given time.
Users can specify the number of concurrencies and the number of socket
connections of each concurrency according to their

Effective implementation of generating key and value
____________________________________________________

To improve time efficiency and space efficiency, memaslap creates a random
character table with 10M characters. All the suffixes of keys and values are
generated from this random character table.

Memaslap identifies a string by its offset in the character table and its
length, which saves a lot of memory. Each key contains two parts, a prefix
and a suffix. The prefix is a uint64_t (8 bytes). To verify data that was set
earlier, memaslap needs to ensure that each key is unique, so it uses the
prefix to identify a key. The prefix cannot include illegal characters, such
as '\\r', '\\n', '\\0' and ' '. Memaslap has an algorithm to ensure that.

Memaslap does not generate all the objects (key-value pairs) at the
beginning. It only generates enough objects to fill the task window (default
10K objects) of each concurrency. Each object has the following basic
information: key prefix, key suffix offset in the character table, key
length, value offset in the character table, and value length.

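The offset-and-length scheme described above can be sketched in Python. This
is only an illustrative model, not the C implementation of memaslap: the names
``CHAR_TABLE``, ``TaskObject``, ``make_object``, ``key_of`` and ``value_of``
are hypothetical, the table is shrunk from 10M to 10K characters, and the
prefix is rendered as 16 hexadecimal digits, which sidesteps the
illegal-character handling that memaslap applies to its 8-byte prefix.

.. code-block:: python

   import random
   import string
   from dataclasses import dataclass

   TABLE_SIZE = 10_000  # memaslap uses 10M characters; smaller for this sketch
   PREFIX_LEN = 16      # assumption: prefix printed as 16 hex digits

   _rng = random.Random(0)
   _ALPHABET = string.ascii_letters + string.digits
   CHAR_TABLE = "".join(_rng.choice(_ALPHABET) for _ in range(TABLE_SIZE))


   @dataclass
   class TaskObject:
       key_prefix: int         # unique number that identifies the key
       key_suffix_offset: int  # where the key suffix starts in CHAR_TABLE
       key_length: int         # total key length, prefix included
       value_offset: int       # where the value starts in CHAR_TABLE
       value_length: int


   def make_object(prefix, key_length, value_length, rng=_rng):
       """Describe one object without copying any key or value bytes."""
       suffix_len = key_length - PREFIX_LEN
       return TaskObject(
           key_prefix=prefix,
           key_suffix_offset=rng.randrange(TABLE_SIZE - suffix_len + 1),
           key_length=key_length,
           value_offset=rng.randrange(TABLE_SIZE - value_length + 1),
           value_length=value_length,
       )


   def key_of(obj):
       """Materialize the key: printable prefix plus a slice of the table."""
       start = obj.key_suffix_offset
       suffix = CHAR_TABLE[start:start + obj.key_length - PREFIX_LEN]
       return f"{obj.key_prefix:016x}" + suffix


   def value_of(obj):
       """Materialize the value as a slice of the shared table."""
       return CHAR_TABLE[obj.value_offset:obj.value_offset + obj.value_length]

Each object costs five integers regardless of how long its key and value are,
which is the memory saving the text refers to.
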
In the work process, each concurrency sequentially or randomly selects an
object from the window to do a set operation or a get operation. At the same
time, each concurrency kicks objects out of its window and adds new objects
to it.

Simple but useful task scheduling
_________________________________

Memaslap uses libevent to schedule all the concurrencies of its threads, and
each concurrency schedules tasks based on its local task window. Memaslap
assumes that if each concurrency keeps the same key distribution, value
distribution, and command distribution, then memaslap as a whole keeps those
distributions when viewed from outside. Each task window holds many objects,
and each object stores its basic information, such as key, value, and expire
time. At any time, the objects in the window keep the same fixed key and
value distribution. If an object is overwritten, its value is updated.
Memaslap verifies the data or expire-time according to the object information
stored in the task window.

Libevent selects which concurrency to handle based on a specific network
event. The concurrency then selects which command (get or set) to run based
on the command distribution. If it needs to kick out an old object and add a
new one, the new object must have the same key length and value length so
that the key and value distribution stays the same.

If the memcached server has two cache layers (memory and SSD), running
memaslap with different window sizes produces different cache miss rates. If
memaslap adds enough objects to the windows at the beginning, and the cache
of memcached cannot store all of the initialized objects, then memaslap will
get some objects from the second cache layer, which causes misses in the
first cache layer. The user can therefore specify the window size to get the
expected miss rate of the first cache layer.

Useful implementation of multi-servers, UDP, TCP, multi-get and binary protocol
________________________________________________________________________________

Because each thread is self-governed, memaslap can assign different threads
to handle different memcached servers. This is just one of the ways in which
memaslap tests multiple servers. The only limitation is that the number of
servers cannot be greater than the number of threads. The other way to test
multiple servers is the replication test. Each concurrency has one socket
connection to each memcached server. In this mode, memaslap can set some
objects on one memcached server and get these objects from the other servers.

By default, memaslap does single gets. If the user specifies the multi-get
option, memaslap collects enough get commands, packs them, and sends them
together.

Memaslap supports both the ASCII protocol and the binary protocol, but it
runs on the ASCII protocol by default. Memaslap runs on TCP by default, but
it also supports UDP. Because UDP is unreliable, dropped packets and
out-of-order packets may occur. Memaslap creates a memory buffer to handle
these problems. It tries to read all the response data of one command from
the server and reorders the response data. If some packets get lost, the
waiting timeout mechanism ensures that incomplete responses are discarded and
the next command is sent.

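The reordering step can be illustrated with a short Python sketch. It is not
taken from the memaslap source. It relies on the UDP frame header described in
the memcached protocol documentation (four 16-bit big-endian fields: request
ID, sequence number, total number of datagrams, reserved). The function name
``reassemble`` is hypothetical, and the timeout is left to the caller, which
would discard the command when ``None`` is still returned after the wait.

.. code-block:: python

   import struct

   # request id, sequence number, datagram count, reserved (network byte order)
   HEADER = struct.Struct("!HHHH")


   def reassemble(datagrams, request_id):
       """Return the ordered payload of one response, or None if incomplete."""
       parts = {}
       total = None
       for datagram in datagrams:
           rid, seq, count, _reserved = HEADER.unpack_from(datagram)
           if rid != request_id:
               continue  # datagram belongs to a different command
           parts[seq] = datagram[HEADER.size:]
           total = count
       if total is None or any(i not in parts for i in range(total)):
           return None  # nothing received yet, or at least one datagram lost
       return b"".join(parts[i] for i in range(total))

Keying the buffer by sequence number makes arrival order irrelevant, so the
same code path handles both in-order and out-of-order delivery.
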
Below are some usage samples::

   memaslap -s 127.0.0.1:11211 -S 5s
   memaslap -s 127.0.0.1:11211 -t 2m -v 0.2 -e 0.05 -b
   memaslap -s 127.0.0.1:11211 -F config -t 2m -w 40k -S 20s -o 0.2
   memaslap -s 127.0.0.1:11211 -F config -t 2m -T 4 -c 128 -d 20 -P 40k
   memaslap -s 127.0.0.1:11211 -F config -t 2m -d 50 -a -n 40
   memaslap -s 127.0.0.1:11211,127.0.0.1:11212 -F config -t 2m
   memaslap -s 127.0.0.1:11211,127.0.0.1:11212 -F config -t 2m -p 2

The user must specify at least one server to run memaslap. The rest of the
parameters have default values, as shown below::

   Thread number = 1                    Concurrency = 16
   Run time = 600 seconds               Configuration file = NULL
   Key size = 64                        Value size = 1024
   Get/set = 9:1                        Window size = 10k
   Execute number = 0                   Single get = true
   Multi-get = false                    Number of sockets of each concurrency = 1
   Reconnect = false                    Data verification = false
   Expire-time verification = false     ASCII protocol = true
   Binary protocol = false              Dumping statistic information
   Overwrite proportion = 0%            UDP = false
   TCP = true                           Limit throughput = false
   Facebook test = false                Replication test = false

Key size, value size and command distribution.
______________________________________________

All the distributions are read from the configuration file specified by the
user with the ``--cfg_cmd`` option. If the user does not specify a
configuration file, memaslap will run with the default distribution (key
size = 64, value size = 1024, get/set = 9:1). For information on how to edit
the configuration file, refer to the "Configuration File" section.

The minimum key size is 16 bytes; the maximum key size is 250 bytes. The
precision of proportion is 0.001. The proportion of distribution will be
rounded to 3 decimal places.

The minimum value size is 1 byte; the maximum value size is 1M bytes. The
precision of proportion is 0.001. The proportion of distribution will be
rounded to 3 decimal places.

Currently, memaslap only supports set and get commands. It also supports
100% set and 100% get. For 100% get, it will preset some objects to the
server.

Multi-thread and concurrency
____________________________

The high performance of memaslap benefits from its special scheduling of
threads and concurrencies. It is important to specify a proper number of
each. The default number of threads is 1; the default number of
concurrencies is 16. The user can use ``--threads`` and ``--concurrency`` to
specify these values.

If the system supports setting CPU affinity and the number of threads
specified by the user is greater than 1, memaslap will try to bind each
thread to a different CPU core. So to get the best performance from
memaslap, it is better to set the number of threads equal to the number of
CPU cores. The number of threads specified by the user can also be less or
greater than the number of CPU cores. Because of an implementation
limitation, the number of concurrencies should be a multiple of the number
of threads.

1. For a system with 8 CPU cores::

      --threads=2 --concurrency=128
      --threads=8 --concurrency=128
      --threads=8 --concurrency=256
      --threads=12 --concurrency=144

2. For a system with 16 CPU cores::

      --threads=8 --concurrency=128
      --threads=16 --concurrency=256
      --threads=16 --concurrency=512
      --threads=24 --concurrency=288

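The multiple-of-threads rule is easy to check before starting a run. The
helper below is a hypothetical sketch, not part of memaslap. It assumes the
concurrencies are spread evenly over the threads and returns how many
concurrencies each thread would own.

.. code-block:: python

   def concurrency_per_thread(threads, concurrency):
       """Return the concurrencies owned by each thread, or raise if uneven."""
       if threads < 1 or concurrency < 1:
           raise ValueError("threads and concurrency must be positive")
       if concurrency % threads != 0:
           raise ValueError("concurrency should be a multiple of threads")
       return concurrency // threads


   # The combinations listed above all divide evenly.
   for threads, concurrency in [(2, 128), (8, 128), (8, 256), (12, 144),
                                (8, 128), (16, 256), (16, 512), (24, 288)]:
       print(threads, concurrency, concurrency_per_thread(threads, concurrency))
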
Memaslap performs very well when used to test the performance of memcached
servers. Most of the time, the bottleneck is the network or the server. If
for some reason the user wants to limit the performance of memaslap, there
are two ways to do this:

- Decrease the number of threads and concurrencies.
- Use the ``--tps`` option that memaslap provides to limit the throughput.
  This option allows the user to get the expected throughput. For example,
  if the maximum throughput is 50 kops/s for a specific configuration, you
  can specify a throughput equal to or less than that maximum with the
  ``--tps`` option.

Most of the time, the user does not need to specify the window size. The
default window size is 10k. For Schooner Memcached, the user can specify
different window sizes to get different cache miss rates based on the test
case. Memaslap supports cache miss rates between 0% and 100%. If you use
this utility to test the performance of Schooner Memcached, you can specify
a proper window size to get the expected cache miss rate. The formula for
calculating window size is as follows:

Assume that the key size is 128 bytes, and the value size is 2048 bytes, and

1. Small cache cache_size=1M, 100% cache miss (all data is read from SSD).

(1). cache miss rate 0%

(2). cache miss rate 5%

(1). cache miss rate 0%

The formula for calculating window size for cache miss rate 0%::

   cache_size / concurrency / (key_size + value_size) * 0.5

The formula for calculating window size for cache miss rate 5%::

   cache_size / concurrency / (key_size + value_size) * 0.7

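As a worked example, the two formulas can be evaluated in Python. The key and
value sizes come from the text above. The cache size (1 GiB) and the
concurrency (128) are hypothetical inputs chosen only for illustration, and
the function name ``window_size`` and the rounding down to a whole number of
objects are assumptions.

.. code-block:: python

   def window_size(cache_size, concurrency, key_size, value_size, factor):
       """Apply the window-size formula and round down to whole objects."""
       return int(cache_size / concurrency / (key_size + value_size) * factor)


   CACHE_SIZE = 1024 ** 3  # hypothetical: 1 GiB of cache
   CONCURRENCY = 128       # hypothetical concurrency

   miss_0 = window_size(CACHE_SIZE, CONCURRENCY, 128, 2048, 0.5)
   miss_5 = window_size(CACHE_SIZE, CONCURRENCY, 128, 2048, 0.7)
   print(miss_0, miss_5)  # 1927 2698

A larger factor puts more objects in each window than the cache can comfortably
hold, which is what pushes the miss rate up.
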
Memaslap supports both data verification and expire-time verification. The
user can use ``--verify=`` or ``-v`` to specify the proportion of data
verification. In theory, it supports 100% data verification. The user can
use ``--exp_verify=`` or ``-e`` to specify the proportion of expire-time
verification. In theory, it supports 100% expire-time verification. Specify
the ``--verbose`` option to get more detailed error information.

For example, ``--exp_verify=0.01 --verify=0.1`` means that 1% of the objects
are set with an expire-time and 10% of the objects retrieved will be
verified. If the objects are gotten, memaslap will verify the expire-time and

multi-servers and multi-config
_______________________________

Memaslap supports multiple servers based on self-governed threads. There is a
limitation that the number of servers cannot be greater than the number of
threads. Memaslap assigns at least one thread to handle each server. The user
can use the ``--servers=`` or ``-s`` option to specify the servers.

For example::

   --servers=10.1.1.1:11211,10.1.1.2:11212,10.1.1.3:11213 --threads=6 --concurrency=36

The above command means that there are 6 threads, with each thread having 6
concurrencies, and that threads 0 and 3 handle server 0 (10.1.1.1); threads
1 and 4 handle server 1 (10.1.1.2); and threads 2 and 5 handle server 2
(10.1.1.3).

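The thread-to-server mapping in this example is consistent with a simple
round-robin rule (thread index modulo number of servers). That rule is an
inference from the example, not something confirmed against the memaslap
source, and the function name ``assign_servers`` is hypothetical.

.. code-block:: python

   def assign_servers(servers, threads):
       """Map each thread index to a server, round-robin (assumed rule)."""
       if not servers:
           raise ValueError("at least one server is required")
       if len(servers) > threads:
           raise ValueError("servers cannot outnumber threads")
       return {t: servers[t % len(servers)] for t in range(threads)}


   servers = ["10.1.1.1:11211", "10.1.1.2:11212", "10.1.1.3:11213"]
   for thread, server in assign_servers(servers, 6).items():
       print(f"thread {thread} -> {server}")
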
All the threads and concurrencies in memaslap are self-governed. So is
memaslap itself: the user can start up several memaslap instances, and can
run memaslap on different client machines to communicate with the same
memcached server at the same time. It is recommended that the user start the
different memaslap instances on different machines using the same
configuration.

Run with execute number mode or time mode
_________________________________________

By default, memaslap runs in time mode. The default run time is 10 minutes.
When the run time elapses, memaslap exits. Do not specify both execute
number mode and time mode at the same time; just specify one of them.

For example::

   --time=30s (It means the test will run 30 seconds.)
   --execute_number=100000 (It means that after running 100000 commands, the test will exit.)

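Time values such as ``30s`` use the suffixes listed for the time option
(s-seconds, m-minutes, h-hours, d-days). A minimal Python sketch of
converting such a value to seconds is shown here. It is not the parser used by
memaslap, the name ``parse_duration`` is hypothetical, and requiring a suffix
on every value is an assumption of this sketch.

.. code-block:: python

   _UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


   def parse_duration(text):
       """Convert a value such as '30s' or '2h' to a number of seconds."""
       text = text.strip()
       if len(text) < 2 or text[-1] not in _UNIT_SECONDS:
           raise ValueError(f"missing or unknown suffix in {text!r}")
       number = text[:-1]
       if not number.isdigit():
           raise ValueError(f"not a whole number: {number!r}")
       return int(number) * _UNIT_SECONDS[text[-1]]


   print(parse_duration("30s"))  # 30
   print(parse_duration("2m"))   # 120
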
Dump statistic information periodically.
________________________________________

The user can use ``--stat_freq=`` or ``-S`` to specify the frequency. For
example, with ``-S 20s`` memaslap dumps the statistics of the commands (get
and set) every 20 seconds.

For more information on the format of the dumped statistics, refer to the
"Format of Output" section.

The user can use ``--division=`` or ``-d`` to specify the number of keys per
multi-get. By default, memaslap does single gets with TCP. Memaslap also
supports data verification and expire-time verification for multi-get.

Memaslap supports multi-get with both TCP and UDP. Because the ASCII
protocol and the binary protocol are implemented differently, there are some
differences between the two. For the ASCII protocol, memaslap sends one
"multi-get" to the server at a time. For the binary protocol, memaslap sends
several single get commands together as a "multi-get" to the server.

Memaslap supports both UDP and TCP. For TCP, memaslap does not reconnect to
the memcached server if socket connections are lost. If all the socket
connections are lost or the memcached server crashes, memaslap will exit. If
the user specifies the ``--reconnect`` option, memaslap will reconnect
socket connections when they are lost.

The user can use ``--udp`` to enable the UDP feature, but UDP comes with some
limitations:

- UDP cannot set data larger than 1400 bytes.
- UDP is not supported by the binary protocol because the binary protocol of
  memcached does not support it.
- UDP does not support reconnection.

Set data with TCP and multi-get with UDP. Specify the following options::

   --facebook --division=50

If you want to create thousands of TCP connections, specify the
``--conn_sock=`` option.

For example::

   --facebook --division=50 --conn_sock=200

The above command means that memaslap will do the facebook test, and each
concurrency has 200 TCP socket connections and one UDP socket.

Memaslap sets objects with the TCP sockets, and multi-gets 50 objects at a
time with the UDP socket.

If you specify ``--division=50``, the key size must be less than 25 bytes
because the UDP packet size is 1400 bytes.

For the replication test, the user must specify at least two memcached
servers. The user can use the ``--rep_write=`` option to enable the feature.

For example::

   --servers=10.1.1.1:11211,10.1.1.2:11212 --rep_write=2

The above command means that there are 2 replication memcached servers.
Memaslap will set objects to both server 0 and server 1, get objects that
were previously set to server 0 from server 1, and also get objects that
were previously set to server 1 from server 0. If server 0 crashes, memaslap
will only get objects from server 1. If server 0 comes back to life,
memaslap will reconnect to server 0. If both server 0 and server 1 crash,
memaslap will exit.

Supports thousands of TCP connections
_____________________________________

Start memaslap with ``--conn_sock=`` or ``-n`` to enable this feature. Make
sure that your system supports opening thousands of files and creating
thousands of sockets. However, this feature does not support reconnection if
sockets disconnect.

For example::

   --threads=8 --concurrency=128 --conn_sock=128

The above command means that memaslap starts up 8 threads, each thread has
16 concurrencies, each concurrency has 128 TCP socket connections, and the
total number of TCP socket connections is 128 \* 128 = 16384.

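The arithmetic in this example can be reproduced with a few lines of Python.
The function name ``connection_totals`` is hypothetical. It assumes that
concurrencies are split evenly across threads and that every concurrency
opens ``conn_sock`` TCP connections.

.. code-block:: python

   def connection_totals(threads, concurrency, conn_sock):
       """Return (concurrencies per thread, total TCP connections)."""
       if concurrency % threads != 0:
           raise ValueError("concurrency should be a multiple of threads")
       return concurrency // threads, concurrency * conn_sock


   per_thread, total = connection_totals(threads=8, concurrency=128, conn_sock=128)
   print(per_thread, total)  # 16 16384

The total is a useful number to compare against the open-file limit of the
client machine before starting the test.
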
Supports binary protocol
________________________

Start memaslap with the ``--binary`` or ``-B`` option to enable this
feature. It supports all the above features except UDP, because the latest
memcached 1.3.3 does not implement the binary UDP protocol.

Since memcached 1.3.3 does not implement the binary UDP protocol, memaslap
does not support UDP. In addition, memcached 1.3.3 does not support
multi-get. If you specify the ``--division=50`` option, it just sends 50 get
commands together as a "multi-get" to the server.

This section describes the format of the configuration file. By default,
when no configuration file is specified, memaslap reads the default one
located at ~/.memaslap.cnf.

Below is a sample configuration file::

   ---------------------------------------------------------------------------
   #comments should start with '#'
   #start_len end_len proportion
   #key length range from start_len to end_len
   #start_len must be equal to or greater than 16
   #end_len must be equal to or less than 250
   #start_len must be equal to or less than end_len
   #memaslap will generate keys according to the key range
   #proportion: indicates keys generated from one range accounts for the total
   #generated keys
   #example1: key range 16~100 accounts for 80%
   #          key range 101~200 accounts for 10%
   #          key range 201~250 accounts for 10%
   #          total should be 1 (0.8+0.1+0.1 = 1)
   #example2: all keys length are 128 bytes

   #start_len end_len proportion
   #value length range from start_len to end_len
   #start_len must be equal to or greater than 1
   #end_len must be equal to or less than 1M
   #start_len must be equal to or less than end_len
   #memaslap will generate values according to the value range
   #proportion: indicates values generated from one range accounts for the
   #total generated values
   #example1: value range 1~1000 accounts for 80%
   #          value range 1001~10000 accounts for 10%
   #          value range 10001~100000 accounts for 10%
   #          total should be 1 (0.8+0.1+0.1 = 1)
   #example2: all value length are 128 bytes

   #cmd_type cmd_proportion
   #currently memaslap only supports get and set command.
   #example: set command accounts for 50%
   #         get command accounts for 50%
   #         total should be 1 (0.5+0.5 = 1)

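The constraints stated in the comments of the sample file (range bounds and
proportions that add up to 1) can be checked with a small Python sketch
before a file is handed to memaslap. This is not how memaslap parses the
file. The function name ``validate_ranges`` and the tuple layout
``(start_len, end_len, proportion)`` are hypothetical, and the tolerance used
for the sum is an assumption based on the stated precision of 0.001.

.. code-block:: python

   def validate_ranges(ranges, min_len, max_len):
       """Check (start_len, end_len, proportion) rows against the stated rules."""
       total = 0.0
       for start_len, end_len, proportion in ranges:
           if not (min_len <= start_len <= end_len <= max_len):
               raise ValueError(f"bad range {start_len}~{end_len}")
           if not 0 <= proportion <= 1:
               raise ValueError(f"bad proportion {proportion}")
           total += round(proportion, 3)  # proportions have 0.001 precision
       if abs(total - 1.0) > 0.0005:
           raise ValueError(f"proportions add up to {total}, expected 1")
       return True


   # Rows taken from "example1" in the sample file above.
   key_ranges = [(16, 100, 0.8), (101, 200, 0.1), (201, 250, 0.1)]
   value_ranges = [(1, 1000, 0.8), (1001, 10000, 0.1), (10001, 100000, 0.1)]
   print(validate_ranges(key_ranges, 16, 250))
   print(validate_ranges(value_ranges, 1, 1024 * 1024))
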
At the beginning, memaslap displays some configuration information as
follows::

   servers : 127.0.0.1:11211
   set proportion: set_prop=0.10
   get proportion: get_prop=0.90

The servers used by memaslap.

The number of threads memaslap runs with.

The number of concurrencies memaslap runs with.

How long to run memaslap.

The task window size of each concurrency.

The proportion of set command.

The proportion of get command.

The output of dynamic statistics is something like this::

   ---------------------------------------------------------------------------------------------------------------------------------
   Type    Time(s)  Ops      TPS(ops/s)  Net(M/s)  Get_miss  Min(us)  Max(us)  Avg(us)  Std_dev  Geo_dist
   Period  5        345826   69165       65.3      0         27       2198     203
   Global  20       1257935  62896       71.8      0         26       3791     224

   Type    Time(s)  Ops      TPS(ops/s)  Net(M/s)  Get_miss  Min(us)  Max(us)  Avg(us)  Std_dev  Geo_dist
   Period  5        38425    7685        7.3       0         42       628      240
   Global  20       139780   6989        8.0       0         37       3790     253

   Type    Time(s)  Ops      TPS(ops/s)  Net(M/s)  Get_miss  Min(us)  Max(us)  Avg(us)  Std_dev  Geo_dist
   Period  5        384252   76850       72.5      0         27       2198     207
   Global  20       1397720  69886       79.7      0         26       3791     227
   ---------------------------------------------------------------------------------------------------------------------------------

Statistics information of get command

Statistics information of set command

Statistics information of both get and set command

Result within a period

Throughput, operations/second

How many objects could not be retrieved

The minimum response time

The maximum response time

The average response time

Standard deviation of response time

Geometric distribution based on natural exponential function

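The ``TPS(ops/s)`` column in the sample output is consistent with dividing
``Ops`` by ``Time(s)`` and dropping the fraction. That relationship is
inferred from the sample rows rather than taken from the memaslap source, and
the name ``throughput`` is hypothetical. A short Python check over the rows
shown above:

.. code-block:: python

   def throughput(ops, seconds):
       """Operations per second, truncated to a whole number (assumed)."""
       return ops // seconds


   # (Time(s), Ops, TPS(ops/s)) copied from the sample output above.
   SAMPLE_ROWS = [
       (5, 345826, 69165), (20, 1257935, 62896),
       (5, 38425, 7685), (20, 139780, 6989),
       (5, 384252, 76850), (20, 1397720, 69886),
   ]

   for seconds, ops, reported in SAMPLE_ROWS:
       print(seconds, ops, throughput(ops, seconds) == reported)
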
At the end, memaslap will output something like this::

   ---------------------------------------------------------------------------------------------------------------------------------
   Get Statistics (1257956 events)
      8:     484890     459823      12543        824

   Set Statistics (139782 events)
      8:      50784      65574       2064        167

   Total Statistics (1397738 events)
      8:     535674     525397      14607        991

   written_bytes: 242516030
   read_bytes: 1003702556
   object_bytes: 152086080

   Run time: 20.0s Ops: 1397754 TPS: 69817 Net_rate: 59.4M/s
   ---------------------------------------------------------------------------------------------------------------------------------

Get statistics of response time

Set statistics of response time

Both get and set statistics of response time

The accumulated and minimum response time

The accumulated and maximum response time

The accumulated and average response time

Standard deviation of response time

Geometric distribution based on logarithm 2

Total get commands done

Total set commands done

How many objects could not be retrieved from the server

How many objects needed verification but could not be retrieved

How many objects have an inconsistent value

How many objects are expired but were still retrieved

How many objects are unexpired but could not be retrieved

How many UDP packets arrived out of order

How many UDP packets were lost

How many times a UDP timeout happened

Throughput, operations/second

The average rate of network

List one or more servers to connect to. The server count must not be greater
than the thread count. e.g.: --servers=localhost:1234,localhost:11211

Number of threads to start up; ideally equal to the number of CPUs.
Default 8.

Number of concurrencies to simulate with load. Default 128.

Number of TCP sockets per concurrency. Default 1.

-x, --execute_number=
   Number of operations (get and set) to execute for the given test.
   Default 1000000.

How long the test should run, suffix: s-seconds, m-minutes, h-hours,
d-days e.g.: --time=2h.

Load the configuration file to get the command, key and value distribution
list.

Task window size of each concurrency, suffix: K, M e.g.: --win_size=10k.

Fixed length of value.

The proportion of data verification, e.g.: --verify=0.01

Number of keys to multi-get at once. Default 1, which means single get.

Frequency of dumping statistic information. suffix: s-seconds,
m-minutes, e.g.: --stat_freq=10s.

The proportion of objects with expire time, e.g.: --exp_verify=0.01.
By default, no object has an expire time.

The proportion of objects that need to be overwritten, e.g.:
--overwrite=0.01. By default, objects are never overwritten.

Reconnect support: when a connection is closed, it will be reconnected.

UDP support. By default memaslap uses TCP. The TCP port and UDP port of the
server must be the same.

Whether to enable the facebook test feature: set with TCP and multi-get with
UDP.

Whether to enable the binary protocol. The default is the ASCII protocol.

Expected throughput, suffix: K, e.g.: --tps=10k.

The first n servers can write data, e.g.: --rep_write=2.

Whether to output detailed information when verification fails.

Display this message and then exit.

Display the version of the application and then exit.

::

   memaslap -s 127.0.0.1:11211 -S 5s
   memaslap -s 127.0.0.1:11211 -t 2m -v 0.2 -e 0.05 -b
   memaslap -s 127.0.0.1:11211 -F config -t 2m -w 40k -S 20s -o 0.2
   memaslap -s 127.0.0.1:11211 -F config -t 2m -T 4 -c 128 -d 20 -P 40k
   memaslap -s 127.0.0.1:11211 -F config -t 2m -d 50 -a -n 40
   memaslap -s 127.0.0.1:11211,127.0.0.1:11212 -F config -t 2m
   memaslap -s 127.0.0.1:11211,127.0.0.1:11212 -F config -t 2m -p 2

To find out more information please check:
`http://libmemcached.org/ <http://libmemcached.org/>`_

Mingqiang Zhuang <mingqiangzhuang@hengtiansoft.com> (Schooner Technology)

Brian Aker, <brian@tangent.org>

:manpage:`memcached(1)` :manpage:`libmemcached(3)`