Commit 9d109fa

NikTJ777, acabrele, paulc4, and pchapman-nuodb authored
Ntj/refactor and allinone (#1)
* Refactor scripts into separate script files. Mount script file or dir into each container. Incorporate Aaron's db-in-a-container. Add profiles to control which services (containers) are started.
* Changes after first test runs
* Rename various services and profiles. Remove profiles for the default "distributed" deployment. Move waitfor-nuoadmin code back into nuosm. Add log capture and better wait semantics to start-monolith. Add stop-nuodb script for internally triggered graceful shutdown of processes.
* Improve arg parsing in stop-nuodb.
* Add explicit docker-network config to all services.
* Only enable nuocd-te2 in the insights profile.
* Enable monolith to be scaled to multiple instances. Map ports to dynamic "ephemeral" ports on host. Remove hostname setting to allow dynamic hostname generation by docker compose.
* Revert monolith to statically mapped ports. Set ENTRYPOINT and API_SERVER to localhost in monolith. Add instadb service that is db-in-a-container with dynamically-mapped ports, so multiple instances can run simultaneously.
* Set LOGDIR default value in nuote.
* Add --timeout option to delete server-processes command to force immediate cleanup of stranded processes after a service restart. Bump default NuoDB version to 5.0.1-2. Improve diagnostics and logging.
* Reimplement all timeouts and gates with the nuodocker timeouts. Add new IMPORT_TIMEOUT variable. Add and improve logging and diagnostics. Refactor raftstate cleanup into new 'remove-zombie' script.
* Inject remove-zombie into every container.
* Add separate compose files for the different deployment styles. This allows commands such as: docker compose -f instadb.yaml up -d and: docker compose -f instadb.yaml down
* Improve how errors are logged to console and file. Update the README with the latest options.
* Update README file.
* Fixed typos, formatting. Some rewording/clarifications.
* Force TE internal port to be 48006, and force SM internal port to be 48007. Add clarifications to README. Fix typos in README.
* More tidying
* Fix a typo in the README.

---------

Co-authored-by: acabrele <acabrele@nuodb.com>
Co-authored-by: Paul Chapman <paulc4@users.noreply.github.com>
Co-authored-by: Paul Chapman <pchapman@nuodb.com>
1 parent b232215 commit 9d109fa

12 files changed: +814 -260 lines changed

README.md

Lines changed: 214 additions & 95 deletions
Large diffs are not rendered by default.

nuodb/docker-compose.yaml

Lines changed: 171 additions & 157 deletions
Large diffs are not rendered by default.

nuodb/env-default

Lines changed: 16 additions & 8 deletions
@@ -2,25 +2,36 @@
 # default ENV VAR values
 #

-NUODB_IMAGE=nuodb/nuodb-ce:4.1.2.vee-4
+NUODB_IMAGE=nuodb/nuodb-ce:5.0.1-2

 DB_NAME=demo
 DB_USER=dba
 DB_PASSWORD=dba
 ENGINE_MEM=1Gi
 SQL_ENGINE=vee
+LOGDIR=/var/log/nuodb

-# Set to a larger value if SM startup takes unusually long
-# - for example if IMPORT_LOCAL or IMPORT_REMOTE (see below) is a large file that takes multiple minutes to extract.
-# Value is in seconds.
-STARTUP_TIMEOUT=60
+# docker compose restart policy.
+# set to one of:
+# - "no"
+# - always
+# - on-failure
+# - unless-stopped
+RESTART_POLICY=unless-stopped
+
+# Set to a larger value if database startup takes unusually long
+STARTUP_TIMEOUT=90

 # Uncomment and set, or set on the docker-compose command-line to add further engine options
 # ENGINE_OPTIONS=

 # normally this is left unset, causing the default to be used.
 ARCHIVE_PATH=

+# Set to a larger value if IMPORT_x is set to a large file or dir that takes multiple minutes to restore.
+# Value is in seconds.
+IMPORT_TIMEOUT=
+
 # set IMPORT_LOCAL to the path of a LOCAL tar file on the host where docker-compose is being run.
 # The SM container will mount the file, extract (untar) it and use the contents as the initial state of the database.
 IMPORT_LOCAL=
@@ -39,9 +50,6 @@ IMPORT_AUTH=
 # can advise on any non-standard value required for IMPORT_LEVEL.
 IMPORT_LEVEL=1

-# Set this to 'true' if the content to be imported is a backupset output from a hotcopy --full WITHOUT the --simple option.
-IMPORT_IS_BACKUPSET=false
-
 # This value is not normally changed.
 IMPORT_MOUNT=/var/opt/nuodb/import
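
Several values in env-default (ARCHIVE_PATH, IMPORT_TIMEOUT) are deliberately left empty so that a default applies. The compose files in this commit read them with the `${VAR:-default}` form, which falls back when the variable is unset or empty; POSIX shell gives the same form the same meaning. A minimal sketch of that behaviour, where `archive_dir` is a hypothetical helper written only for this illustration and `/data/archive` is a made-up override:

```shell
#!/bin/sh
# Sketch: "${VAR:-default}" falls back when VAR is unset OR empty.
# archive_dir is a hypothetical helper, not part of the commit.
archive_dir() {
    ARCHIVE_PATH=$1
    echo "${ARCHIVE_PATH:-/var/opt/nuodb/archive}"
}

archive_dir ""                # empty value, as shipped in env-default
archive_dir "/data/archive"   # explicit (made-up) override
```

The first call prints the built-in default and the second prints the override, which is why leaving `ARCHIVE_PATH=` empty in the env file is harmless.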

nuodb/instadb.yaml

Lines changed: 47 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,47 @@
1+
version: '3'
2+
3+
networks:
4+
instadb:
5+
6+
services:
7+
instadb:
8+
image: $NUODB_IMAGE
9+
# profiles: [ "instadb" ]
10+
restart: ${RESTART_POLICY:-unless-stopped}
11+
networks:
12+
instadb:
13+
14+
# Do NOT remove this env_file value!!
15+
env_file: .env
16+
17+
environment:
18+
PEER_ADDRESS: localhost
19+
NUODB_DOMAIN_ENTRYPOINT: localhost
20+
NUOCMD_API_SERVER: localhost:8888
21+
STARTUP_TIMEOUT: ${STARTUP_TIMEOUT:-90}
22+
EXTERNAL_ADDRESS: ${EXTERNAL_ADDRESS:-localhost}
23+
ARCHIVE_DIR: ${ARCHIVE_PATH:-/var/opt/nuodb/archive}
24+
DB_OPTIONS: "mem ${ENGINE_MEM:-1Gi} execution-engine ${SQL_ENGINE:-vee} ${ENGINE_OPTIONS:-}"
25+
ports:
26+
- :48004-48006
27+
- :8888
28+
volumes:
29+
- ./scripts:/usr/local/scripts
30+
- ./scripts/stop-nuodb:/usr/local/bin/stop-nuodb
31+
- ${IMPORT_LOCAL:-./empty-file}:${IMPORT_MOUNT:-/var/tmp/env}
32+
33+
command: [ "/usr/local/scripts/start-monolith" ]
34+
35+
36+
# ycsb-demo:
37+
# image: nuodb/ycsb:latest
38+
# networks:
39+
# net:
40+
# depends_on:
41+
# - te1
42+
# environment:
43+
# PEER_ADDRESS: ${PEER_ADDRESS:-nuoadmin1}
44+
# DB_NAME:
45+
# DB_USER:
46+
# DB_PASSWORD:
47+
# command: ["/driver/startup.sh"]
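
Because instadb publishes its ports without a fixed host port, each instance gets host ports chosen by Docker, and a client has to look them up. `docker compose port SERVICE PRIVATE_PORT` prints the mapping as an `address:port` string. The sketch below only shows extracting the port number from such a string; `host_port` is a hypothetical helper and the sample mappings are made up.

```shell
#!/bin/sh
# host_port: strip everything up to the last ':' from an "address:port" string.
# Hypothetical helper for illustration; not part of the commit.
host_port() {
    echo "${1##*:}"
}

# Made-up sample mappings. Against a running instance one might instead use:
#   host_port "$(docker compose -f instadb.yaml port instadb 8888)"
host_port "0.0.0.0:49153"
host_port "[::]:49153"
```

Stripping up to the last colon, rather than the first, keeps the helper correct for bracketed IPv6 addresses.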

nuodb/monolith.yaml

Lines changed: 48 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,48 @@
1+
version: '3'
2+
3+
networks:
4+
net:
5+
6+
services:
7+
monolith:
8+
image: $NUODB_IMAGE
9+
# profiles: [ "monolith" ]
10+
restart: ${RESTART_POLICY:-unless-stopped}
11+
networks:
12+
net:
13+
14+
# Do NOT remove this env_file value!!
15+
env_file: .env
16+
17+
environment:
18+
PEER_ADDRESS: ${PEER_ADDRESS:-db}
19+
NUODB_DOMAIN_ENTRYPOINT: ${PEER_ADDRESS:-db}
20+
NUOCMD_API_SERVER: localhost:8888
21+
STARTUP_TIMEOUT: ${STARTUP_TIMEOUT:-90}
22+
EXTERNAL_ADDRESS: ${EXTERNAL_ADDRESS:-localhost}
23+
ARCHIVE_DIR: ${ARCHIVE_PATH:-/var/opt/nuodb/archive}
24+
DB_OPTIONS: "mem ${ENGINE_MEM:-1Gi} execution-engine ${SQL_ENGINE:-vee} ${ENGINE_OPTIONS:-}"
25+
hostname: ${PEER_ADDRESS:-db}
26+
ports:
27+
- 48004-48006:48004-48006
28+
- 8888:8888
29+
volumes:
30+
- ./scripts:/usr/local/scripts
31+
- ./scripts/stop-nuodb:/usr/local/bin/stop-nuodb
32+
- ${IMPORT_LOCAL:-./empty-file}:${IMPORT_MOUNT:-/var/tmp/env}
33+
34+
command: [ "/usr/local/scripts/start-monolith" ]
35+
36+
37+
# ycsb-demo:
38+
# image: nuodb/ycsb:latest
39+
# networks:
40+
# net:
41+
# depends_on:
42+
# - te1
43+
# environment:
44+
# PEER_ADDRESS: ${PEER_ADDRESS:-nuoadmin1}
45+
# DB_NAME:
46+
# DB_USER:
47+
# DB_PASSWORD:
48+
# command: ["/driver/startup.sh"]

nuodb/scripts/import-archive

Lines changed: 81 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,81 @@
1+
#!/bin/sh
2+
3+
# import the contents of the database archive
4+
5+
: ${IMPORT_LEVEL:=1}
6+
7+
# If archive IMPORT has been defined, and there is no existing archive, then perform the import
8+
if [ -n "$IMPORT_LOCAL$IMPORT_REMOTE" -a ! -f "$ARCHIVE_DIR/1.atm" -a "$runningArchives" -eq 0 ]; then
9+
echo "Importing into empty archive..."
10+
[[ -n "$IMPORT_REMOTE" && "$IMPORT_REMOTE" != ?*://?* ]] && echo "ERROR: IMPORT_REMOTE is not a valid URL: $IMPORT_REMOTE - import aborted" && exit 98
11+
12+
# clean up any tombstone of the archive for this SM
13+
if [ -n "$myArchive" ]; then
14+
echo "Cleaning up archive tombstone for $HOSTNAME: $myArchive..."
15+
[ $(nuocmd get archives --db-name $DB_NAME | wc -l) -eq 1 ] && echo "Cleaning up database first..." && nuocmd delete database --db-name $DB_NAME 2>&1 || exit 98
16+
nuocmd delete archive --archive-id $myArchive --purge 2>&1 || exit 98
17+
fi
18+
19+
# if IMPORT_REMOTE is set - work out whether to import from existing (IMPORT_LOCAL) cache
20+
importFromCache='false'
21+
if [ -n "$IMPORT_REMOTE" ]; then
22+
[ -n "$IMPORT_AUTH" -a "$IMPORT_AUTH" != ':' ] && curlAuth="--user $IMPORT_AUTH"
23+
if [ -n "$IMPORT_LOCAL" ]; then
24+
25+
# IMPORT_LOCAL is an empty dir
26+
if [ -d "$IMPORT_MOUNT" -a $(ls -1 "$IMPORT_MOUNT" | wc -l) -eq 0 ]; then
27+
echo "Extracting and caching $IMPORT_REMOTE into directory host:$IMPORT_LOCAL..."
28+
time curl -k ${curlAuth:-} "$IMPORT_REMOTE" | tar xzf - --strip-components ${IMPORT_LEVEL} -C $IMPORT_MOUNT || exit 98
29+
importFromCache='true'
30+
31+
# IMPORT_LOCAL is an empty file
32+
elif [ ! -s "$IMPORT_MOUNT" ]; then
33+
echo "Caching $IMPORT_REMOTE into file host:$IMPORT_LOCAL..."
34+
time curl -k ${curlAuth:-} "$IMPORT_REMOTE" > "$IMPORT_MOUNT" || exit 98
35+
importFromCache='true'
36+
37+
# IMPORT_LOCAL is not empty - assume it is a valid cache
38+
else
39+
echo "host:$IMPORT_LOCAL is not empty - assuming it contains a cached copy of $IMPORT_REMOTE."
40+
importFromCache='true'
41+
fi
42+
43+
# IMPORT_LOCAL is not set - so there is no local cache
44+
else
45+
echo "IMPORT_LOCAL is not set - caching disabled."
46+
echo "Importing from $IMPORT_REMOTE into $ARCHIVE_DIR..."
47+
time curl -k ${curlAuth:-} "$IMPORT_REMOTE" | tar xzf - --strip-components ${IMPORT_LEVEL} -C $ARCHIVE_DIR || exit 98
48+
fi
49+
50+
# IMPORT_REMOTE is not set, so check that IMPORT_LOCAL is not empty
51+
else
52+
[ -f "$IMPORT_MOUNT" -a ! -s "$IMPORT_MOUNT" ] && echo "ERROR: IMPORT_LOCAL file host:$IMPORT_LOCAL is empty." && exit 98
53+
[ -d "$IMPORT_MOUNT" -a $(ls -1 "$IMPORT_MOUNT" | wc -l) -eq 0 ] && echo "ERROR: IMPORT_LOCAL directory host:$IMPORT_LOCAL is empty." && exit 98
54+
importFromCache='true'
55+
fi
56+
57+
# IMPORT_LOCAL should now have the correct content - import it into the archive
58+
if [ -n "$IMPORT_LOCAL" ]; then
59+
[ -n "$IMPORT_REMOTE" -a "$importFromCache" = 'true' -a -s "$IMPORT_MOUNT" ] && echo "Using host:$IMPORT_LOCAL as a cached copy of $IMPORT_REMOTE..."
60+
if [ -d "$IMPORT_MOUNT" ]; then
61+
echo "Importing directory host:$IMPORT_LOCAL into $ARCHIVE_DIR..."
62+
time nuodocker restore archive --origin-dir $IMPORT_MOUNT --restore-dir $ARCHIVE_DIR --db-name "$DB_NAME" --clean-metadata || exit 98
63+
elif [ "$importFromCache" = 'true' -a -s "$IMPORT_MOUNT" ]; then
64+
echo "Importing file host:$IMPORT_LOCAL into $ARCHIVE_DIR..."
65+
time tar xf "$IMPORT_MOUNT" --strip-components ${IMPORT_LEVEL} -C "$ARCHIVE_DIR" || exit 98
66+
else
67+
echo "ERROR: IMPORT_LOCAL has been specified, but host:$IMPORT_LOCAL is not a valid import source - IMPORT_LOCAL must be a directory, an initially empty file, or a cached copy of IMPORT_REMOTE - import aborted..."
68+
exit 98
69+
fi
70+
fi
71+
72+
# sanity check the imported content in the archive
73+
[ -d "$ARCHIVE_DIR/full" ] && echo "ERROR: Imported data looks like a BACKUPSET (in which case IMPORT_LOCAL must be a DIRECTORY): $(ls -l $ARCHIVE_DIR | head -n 10)" && exit 98
74+
[ ! -f "$ARCHIVE_DIR/1.atm" ] && echo "ERROR: Imported archive does not seem to contain valid data: $(ls -l $ARCHIVE_DIR | head -n 10)" && exit 98
75+
echo "Imported data looks good: $(ls -l $ARCHIVE_DIR | head -n 5)"
76+
77+
# if the archive was not imported from a dir, then clean the meta-data in the archive
78+
if [ ! -d "$IMPORT_MOUNT" ]; then
79+
nuodocker restore archive --origin-dir "$ARCHIVE_DIR" --restore-dir "$ARCHIVE_DIR" --db-name "$DB_NAME" --clean-metadata || exit 99
80+
fi
81+
fi
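
The IMPORT_REMOTE check above uses the glob `?*://?*` inside `[[ ... ]]`, which is a bash/ksh construct, while the script declares `#!/bin/sh`. It works where `/bin/sh` is bash; whether that holds in every NuoDB image is an assumption not confirmed by this commit. The same glob can be tested portably with `case`. A minimal sketch, where `is_url` is a hypothetical helper and the candidate strings are made up:

```shell
#!/bin/sh
# is_url: succeed if $1 looks like "<scheme>://<rest>", using the same glob
# as import-archive. Hypothetical helper for illustration only.
is_url() {
    case "$1" in
        ?*://?*) return 0 ;;
        *)       return 1 ;;
    esac
}

for candidate in "https://example.com/backup.tgz" "backup.tgz" "://no-scheme"; do
    if is_url "$candidate"; then
        echo "valid:   $candidate"
    else
        echo "invalid: $candidate"
    fi
done
```

The glob only requires at least one character on each side of `://`, so it is a plausibility check rather than full URL validation.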

nuodb/scripts/remove-zombie

Lines changed: 25 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,25 @@
1+
#!/bin/sh
2+
#
3+
# remove a zombie of the engine that is trying to start
4+
5+
# the caller mut specify the hostname
6+
hostType=$1
7+
hostName=$2
8+
9+
# wait until the admin layer has become ready
10+
msg=$(nuocmd check servers --timeout ${STARTUP_TIMEOUT} --check-converged --check-active)
11+
if [ $? -ne 0 ]; then
12+
echo "$me: ERROR: Timed out waiting for admin layer to be ready: $msg"
13+
exit 98
14+
fi
15+
16+
myStartIds="$(nuocmd get processes --db-name $DB_NAME | grep 'type=$hostType' | grep 'address=$hostName/' | grep -o 'start-id: [0-9]*' | sed 's/start-id: //' )"
17+
18+
count=$(echo $myStartIds | wc -l)
19+
echo "$(basename $0): Found $((count - 1)) matching start-ids: $myStartIds"
20+
21+
for id in $myStartIds ; do
22+
# delete any matching engine processes still in the Raft state
23+
msg="$(nuocmd shutdown process --server-id --start-id $id --evict --timeout 0)"
24+
[ $? -ne 0 ] && echo "ERROR: Unable to remove engine with start-id $id: $msg"
25+
done
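
The start-id lookup in remove-zombie is a chain of `grep` filters followed by `sed`. The filters on host type and hostname need double quotes so that the shell expands `$hostType` and `$hostName`; inside single quotes the variable names would be matched literally. The sketch below runs the same chain against made-up input. The sample lines are an assumption chosen to contain the fields the script greps for and do not reproduce the real `nuocmd get processes` output; `find_start_ids` is a hypothetical helper.

```shell
#!/bin/sh
# Hypothetical sample lines; the real `nuocmd get processes` format may differ.
sample='type=SM address=sm1/172.18.0.3 start-id: 3
type=TE address=te1/172.18.0.5 start-id: 7
type=TE address=te2/172.18.0.6 start-id: 9
type=TE address=te1/172.18.0.5 start-id: 12'

# find_start_ids TYPE HOST: print the start-id of every line matching both.
find_start_ids() {
    printf '%s\n' "$sample" |
        grep "type=$1" |
        grep "address=$2/" |
        grep -o 'start-id: [0-9]*' |
        sed 's/start-id: //'
}

ids=$(find_start_ids TE te1)
# Quote "$ids" so the newlines survive and grep -c counts one id per line.
count=$(printf '%s\n' "$ids" | grep -c .)
echo "Found $count matching start-ids:" $ids
```

Counting with `grep -c .` on the quoted value avoids reporting one line when the list is empty or has been collapsed onto a single line by an unquoted `echo`.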

nuodb/scripts/start-monolith

Lines changed: 63 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,63 @@
1+
#!/bin/sh
2+
3+
# Start all 3 processes needed for a database in this (single) container.
4+
5+
PATH=$PATH:/usr/local/scripts
6+
7+
PEER_ADDRESS=$HOSTNAME
8+
9+
me="$(basename $0)"
10+
11+
echo "=================================="
12+
13+
# start a background nuoadmin process
14+
start-nuoadmin &
15+
16+
# wait until the admin layer has become ready
17+
msg=$(nuocmd check servers --timeout ${STARTUP_TIMEOUT} --check-converged --check-active)
18+
if [ $? -ne 0 ]; then
19+
echo "$me: ERROR: Timed out waiting for admin layer to be ready: $msg"
20+
exit 98
21+
fi
22+
23+
# delete any engine processes still in the Raft state
24+
nuocmd shutdown server-processes --server-id "${PEER_ADDRESS}" --db-name "$DB_NAME" --evict --timeout 0
25+
26+
echo "$me: AP is ready - starting SM and TE"
27+
28+
# start a background nuosm process
29+
start-nuosm &
30+
31+
# start a background nuote process
32+
start-nuote &
33+
34+
echo "$me: Waiting for DB $DB_NAME to become RUNNING..."
35+
nuocmd check database --db-name $DB_NAME --check-running --wait-for-acks --timeout "${STARTUP_TIMEOUT}" # wait for RUNNING SM
36+
nuocmd check database --db-name $DB_NAME --check-running --wait-for-acks --timeout 10 # wait for RUNNING SM + all other engines are alive
37+
if [ -n "$(nuocmd get processes --db-name $DB_NAME | grep 'type=TE' | grep 'state=RUNNING')" -a $? = 0 ]; then
38+
echo "$me: Database is RUNNING..."
39+
else
40+
echo "$me: Database check timed out after $STARTUP_TIMEOUT sec"
41+
42+
echo "$me: $(nuocmd show database --db-name "$DB_NAME" --all-incarnations)"
43+
44+
if [ -n "$NUODB_DEBUG" ]; then
45+
echo "$me: SM logs"
46+
cat /var/log/nuodb/SM.log
47+
48+
echo
49+
echo "$me: TE logs"
50+
cat /var/log/nuodb/TE.log
51+
52+
echo
53+
echo "$me: AP logs"
54+
cat /var/log/nuodb/AP.log
55+
fi
56+
fi
57+
58+
echo "$me: $(nuocmd show domain)"
59+
60+
# wait for all child processes to stop
61+
wait
62+
63+
echo "$me: Database $DB_NAME has been stopped. Exiting."
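
start-monolith keeps the container alive by launching the AP, SM and TE as background jobs and then calling `wait` with no arguments, which returns only after every background child has exited. A minimal, self-contained sketch of that pattern, with a hypothetical `worker` function standing in for start-nuosm and start-nuote:

```shell
#!/bin/sh
# worker is a hypothetical stand-in for start-nuosm / start-nuote.
worker() {
    sleep 1
    echo "$1 exited"
}

run_all() {
    worker sm &
    worker te &
    wait    # blocks until every background child has exited
    echo "all children stopped"
}

run_all
```

The two "exited" lines may appear in either order; the final line is printed only after both workers have finished, which mirrors the closing "has been stopped" message in the script.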

nuodb/scripts/start-nuoadmin

Lines changed: 13 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,13 @@
1+
#!/bin/bash
2+
#
3+
# Start a nuoadmin AP process
4+
5+
: ${LOGDIR:=/var/log/nuodb}
6+
echo "Starting AP..."
7+
8+
nuoadmin -- \
9+
pendingProcessTimeout=${STARTUP_TIMEOUT}000 \
10+
pendingReconnectTimeout=90000 \
11+
thrift.message.max=1073741824 \
12+
processLivenessCheckSec=30 \
13+
1>/dev/null | tee $LOGDIR/AP.log
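
The last line of start-nuoadmin combines a redirection with a pipe, and the order matters: the pipe is connected first, then `1>/dev/null` redirects stdout away from it, so `tee` receives nothing unless stderr is also sent into the pipe. The commit does not state which streams are meant to reach AP.log, so the sketch below only demonstrates the shell behaviour. `emit` is a hypothetical stand-in for a process that writes to both streams, and the log file names are made up.

```shell
#!/bin/sh
# Illustration only: emit stands in for a process that writes to both streams.
emit() {
    echo "to stdout"
    echo "to stderr" >&2
}

tmp=$(mktemp -d)

# Same shape as the script: stdout goes to /dev/null, so the pipe carries nothing.
# (stderr is silenced here only to keep the demo output tidy.)
{ emit 1>/dev/null | tee "$tmp/as-written.log"; } 2>/dev/null

# Alternative: send stderr into the pipe first, then discard stdout.
emit 2>&1 1>/dev/null | tee "$tmp/stderr-only.log" >/dev/null

[ -s "$tmp/as-written.log" ] && echo "as-written: has content" || echo "as-written: empty"
[ -s "$tmp/stderr-only.log" ] && echo "stderr-only: has content" || echo "stderr-only: empty"

# Seconds to milliseconds by appending three zeros, as pendingProcessTimeout does.
STARTUP_TIMEOUT=90
echo "pendingProcessTimeout=${STARTUP_TIMEOUT}000"
```

With the default STARTUP_TIMEOUT of 90 the appended zeros give 90000, matching the fixed pendingReconnectTimeout value in the script.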
