In this lesson, we're going to see how we can set up a horizontally sharded database spread out across multiple EC2 instances. For this, we'll need a total of 9 instances. vitess-main will be the server responsible for running etcd, vtctld, and vtadmin. vitess-gate-1, vitess-gate-2, vitess-gate-3, and vitess-gate-4 will be used to run the VTGates, as well as the scripts that run a workload against the cluster. vitess-shard-1, vitess-shard-2, vitess-shard-3, and vitess-shard-4 will run the instances of VTTablet and MySQL for a 4-shard keyspace.
Each of these needs to be configured with the same steps outlined in this lesson. In addition to this, each needs a few custom things configured, which will be covered in the next few sections.
First, log in to vitess-main. This is the same server that has been used in the previous lessons. In this one, we'll need to split up 101_initial_cluster.sh into two parts, but differently than last time. Create a 101_before.sh script that contains everything from 101_initial_cluster.sh before the VTGate is started. Then create 101_after.sh, which should have source ../common/env.sh and set up VTAdmin.
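As a rough guide, 101_after.sh can be as small as the sketch below. This assumes that, as in the stock Vitess local example, VTAdmin is brought up by ../common/scripts/vtadmin-up.sh; copy the exact VTAdmin lines from the tail end of your own 101_initial_cluster.sh.
# 101_after.sh -- illustrative sketch
source ../common/env.sh
# start vtadmin
../common/scripts/vtadmin-up.sh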
In addition to this, a few other small adjustments will need to be made to the helper scripts. First, a few changes to env.sh:
sed -i 's/ETCD_SERVER="localhost:2379"/ETCD_SERVER="${hostname}:2379"/' ../common/env.sh
sed -i 's/localhost:15999/${hostname}:15999/' ../common/env.sh
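You can quickly confirm that both substitutions took effect with a grep (purely illustrative; the surrounding contents of env.sh vary by Vitess version):
grep -nE 'ETCD_SERVER|:15999' ../common/env.sh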
These changes replace localhost with ${hostname} in a few key places, which is important for allowing this server to communicate with the other servers that will be set up. A few changes will also be needed in the script that starts up etcd:
sed -i 's/--listen-client-urls "http:\/\/${ETCD_SERVER}"/--listen-client-urls "http:\/\/0.0.0.0:2379"/' ../common/scripts/etcd-up.sh
sed -i 's/"${cell}"/--topo-global-server-address "${ETCD_SERVER}" "${cell}"/' ../common/scripts/etcd-up.sh
These changes tell etcd to listen for connections from any IP address, and also add the --topo-global-server-address argument. Also add the public IPv4 DNS address in the script that sets up the VTAdmin infrastructure. For this, add hostname="YOUR_SERVER_PUBLIC_IPv4_DNS" right after the source ../common/env.sh line in ../common/scripts/vtadmin-up.sh.
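After that edit, the relevant lines of ../common/scripts/vtadmin-up.sh would look roughly like this (illustrative; the rest of the script stays unchanged):
source ../common/env.sh
hostname="YOUR_SERVER_PUBLIC_IPv4_DNS"   # override so the VTAdmin web UI points at an address your browser can reach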
A few other files to change:
- Change the 201_customer_tablets.sh script to use none instead of semi_sync for the durability policy. This change is needed since we won't be doing any replication on the sharded keyspace.
- Update ../common/vtadmin/discovery.json to have an entry for each of the four VTGate servers, adding the IP address of each vitess-gate-* server (see the sketch after this list).
- Modify 303_reshard.sh and replace -80,80- with -40,40-80,80-c0,c0-. This is needed because we are going to be using four shards instead of two.
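For reference, here is a rough sketch of what discovery.json could look like with four VTGate entries. This assumes the file follows the layout shipped with the Vitess local example and the default ports (15001 for the VTGate web port, 15991 for its gRPC port, 15000/15999 for vtctld); check the file you already have for the exact field names and ports, and substitute the placeholder addresses with your real ones.
{
    "vtctlds": [
        { "host": { "fqdn": "VITESS_MAIN_ADDRESS:15000", "hostname": "VITESS_MAIN_ADDRESS:15999" } }
    ],
    "vtgates": [
        { "host": { "fqdn": "VITESS_GATE_1_ADDRESS:15001", "hostname": "VITESS_GATE_1_ADDRESS:15991" } },
        { "host": { "fqdn": "VITESS_GATE_2_ADDRESS:15001", "hostname": "VITESS_GATE_2_ADDRESS:15991" } },
        { "host": { "fqdn": "VITESS_GATE_3_ADDRESS:15001", "hostname": "VITESS_GATE_3_ADDRESS:15991" } },
        { "host": { "fqdn": "VITESS_GATE_4_ADDRESS:15001", "hostname": "VITESS_GATE_4_ADDRESS:15991" } }
    ]
}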
Complete these steps on each of the four vitess-gate-* EC2 instances. Open up ../common/env.sh and change hostname=$(hostname -f) to hostname="VITESS_MAIN_PRIVATE_IP_ADDRESS", where VITESS_MAIN_PRIVATE_IP_ADDRESS is the private IP that EC2 assigned to vitess-main. Also execute the following:
sed -i 's/ETCD_SERVER="localhost:2379"/ETCD_SERVER="${hostname}:2379"/' ../common/env.sh
sed -i 's/localhost:15999/${hostname}:15999/' ../common/env.sh
sed -i '48i hostname="$(hostname -f)"' ../common/scripts/vtgate-up.sh
We next need to set up the vitess-shard-* servers. First, add the following shard.sh script to each server:
source ../common/env.sh

# first argument: tablet UID, second argument: shard key range (e.g. -40)
T_UID_ARG=${1}
SHARD_ARG=${2}

echo "TABLET IDENTIFIER: ${T_UID_ARG}"
echo "SHARD: ${SHARD_ARG}"

# bring up mysqld for this tablet
SHARD=${SHARD_ARG} CELL=zone1 KEYSPACE=customer TABLET_UID=${T_UID_ARG} ../common/scripts/mysqlctl-up.sh
sleep 2
echo "Waiting for mysqlctls to start..."
wait
echo "mysqlctls are running!"

# bring up the vttablet that fronts this mysqld
SHARD=${SHARD_ARG} CELL=zone1 KEYSPACE=customer TABLET_UID=${T_UID_ARG} ../common/scripts/vttablet-up.sh
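shard.sh takes a tablet UID as its first argument and a shard key range as its second. For example (the same invocations appear in the start-up sequence below):
bash shard.sh 300 -40    # tablet 300 serving the -40 shard of the customer keyspace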
On each server, open ../common/env.sh and replace the line hostname=$(hostname -f) with hostname="PRIVATE_IP_ADDRESS_OF_ETCD_SERVER", where PRIVATE_IP_ADDRESS_OF_ETCD_SERVER is the private IP that EC2 assigned to vitess-main. Then, also run the following updates:
sed -i 's/ETCD_SERVER="localhost:2379"/ETCD_SERVER="${hostname}:2379"/' ../common/env.sh
sed -i 's/localhost:15999/${hostname}:15999/' ../common/env.sh
sed -i '18i hostname="$(hostname -f)"' ../common/scripts/vttablet-up.sh
sed -i "s/tablet_hostname=''/tablet_hostname=\$hostname/" ../common/scripts/vttablet-up.sh
With all of that in place, the next step is to start the cluster. To get it up-and-running, execute the following sequence of scripts:
- On vitess-main, execute: bash 101_before.sh
- On each of the vitess-gate-* servers, execute: CELL=zone1 bash ../common/scripts/vtgate-up.sh
- On vitess-main, execute: bash 101_after.sh
- On vitess-main, execute all of the 200-level scripts in sequence.
- On vitess-main, execute: bash 301_customer_sharded.sh
- On vitess-shard-1, execute: bash shard.sh 300 -40
- On vitess-shard-2, execute: bash shard.sh 400 40-80
- On vitess-shard-3, execute: bash shard.sh 500 80-c0
- On vitess-shard-4, execute: bash shard.sh 600 c0-
- On vitess-main, execute 303_reshard.sh and onward.
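At this point, a quick sanity check from any of the vitess-gate-* servers can confirm that VTGate is up and all four shards are registered. This assumes a MySQL client is installed and uses the default VTGate MySQL port (15306) from the example scripts:
mysql -h 127.0.0.1 -P 15306 -u root -e 'show vitess_shards;'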
If you visit the admin panel, you should now have a fully-configured Vitess cluster. The customer keyspace, containing the customer and corder tables, has been sharded across four shards.
Let's run a stress-test workload on the cluster to see how many QPS it is capable of processing. For this, place the following load.py script on each of the vitess-gate-* servers:
import threading
import time
import random
import sys

import MySQLdb

connections = []
threads = []

QUERIES = int(sys.argv[-2])
THREADS = int(sys.argv[-1])


def make_connection():
    connection = MySQLdb.connect(
        host='127.0.0.1',
        user='root',
        port=15306,
        autocommit=True)
    return connection


def run_load(index):
    cursor = connections[index].cursor()
    for i in range(QUERIES):
        cursor.execute('SELECT * FROM customer WHERE customer_id = %s;', (random.randint(1, 5000),))
        cursor.fetchone()


if __name__ == "__main__":
    for i in range(0, THREADS):
        c = make_connection()
        connections.append(c)

    for i in range(0, THREADS):
        t = threading.Thread(target=run_load, args=(i,))
        threads.append(t)

    print('starting query workload')
    start = time.time()

    for t in threads:
        t.start()
    for t in threads:
        t.join()

    elapsed = time.time() - start
    print('completed query workload in', round(elapsed, 2), 'seconds')
    print('QPS =', round((QUERIES*THREADS)/elapsed, 2))
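Note that load.py uses the MySQLdb module, which is provided by the mysqlclient package; if it is not already present on the vitess-gate-* servers, install it first (depending on the distribution, you may also need the MySQL or MariaDB client development headers):
pip3 install mysqlclient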
Then, execute the following on each:
python3 load.py 1000000 100
This command will spin up 100 connections to the Vitess cluster and execute 1 million queries per connection. Doing this across all four servers means that 400 million queries will be executed if left to run to completion.
In the example from the video, you can see that we achieved approximately 60k queries per second with this configuration. Running the workload in this way uses some of the resources on the VTGate servers for executing the workload itself, and thus is not totally realistic. It is also an extremely simple and small database, meaning that the queries execute unrealistically fast.
However, it still acts as a useful benchmark for checking the limits of your cluster's performance. You can experiment by setting up clusters with more or fewer VTGates and VTTablets and comparing performance.