AndreyKoltsov1997
Member

Expected behavior
Reading from an item-input-file with N lines should yield N successful operations.

Observed behavior
With 4.2.23 we don't get N successful operations, only about 90-95% of N. The remaining operations are left in the pending state. 4.2.22 doesn't have this issue.

Related JIRA issue: https://mongoose-issues.atlassian.net/browse/BASE-1396
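
The expected and observed counts above can be compared with a small helper. This is only a sketch: the function name `checkSuccCount` is hypothetical, it assumes the item input file holds one item per line, and the succeeded count is assumed to be read off the Mongoose run output by hand rather than parsed.

```shell
#!/bin/bash

# Hypothetical helper, not part of Mongoose.
# PARAMS:
# * $1 - path to the item input file (assumed: one item per line);
# * $2 - number of succeeded operations reported by the run.
checkSuccCount(){
    itemInputFile=$1
    succCount=$2
    expectedCount=$(wc -l < "$itemInputFile")
    # Normalize the value: some wc implementations pad it with spaces
    expectedCount=$((expectedCount))
    if [ "$expectedCount" -eq 0 ]; then
        echo "EMPTY: $itemInputFile has no lines"
        return 2
    fi
    if [ "$succCount" -eq "$expectedCount" ]; then
        echo "OK: $succCount of $expectedCount operations succeeded"
        return 0
    fi
    echo "MISMATCH: $succCount of $expectedCount operations succeeded ($((succCount * 100 / expectedCount))%)"
    return 1
}
```

For example, `checkSuccCount items.csv 950` (with a hypothetical `items.csv`) returns a non-zero status whenever the file does not contain exactly 950 lines.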

@AndreyKoltsov1997 AndreyKoltsov1997 added the bug Something isn't working label Feb 7, 2020
@AndreyKoltsov1997 AndreyKoltsov1997 self-assigned this Feb 7, 2020
AndreyKoltsov1997 commented Feb 7, 2020

Things to check:

  • Semaphore acquisition behavior;
  • FAIL_UNKNOWN status setting removal;

@AndreyKoltsov1997
Member Author

Since the issue doesn't always reproduce, I've written a simple script that launches multiple byte stream read operations that are expected to fail.
Its purpose is to let you add debug prints and/or breakpoints in the necessary places. Once the issue reproduces, you can analyze the output and/or debug step by step.

#!/bin/bash

# NOTE: Script checks localhost for running ...
# ... pravega. In case of runni
TARGET_PRAVEGA_HOST="127.0.0.1"
TARGET_PRAVEGA_PORT="9090"
TARGET_PRAVEGA_ADDRS="$TARGET_PRAVEGA_HOST:$TARGET_PRAVEGA_PORT"
WGET_UNRESOLVED_HOST_MSG="can't"
STATUS_FAILURE=-1
STATUS_SUCCESS=0

# PARAMS: 
# * $1 - address of Pravega;
pravegaHealthcheck(){
    targetPravegaAddrs=$1
    echo "Checking pravega's availability..."
    pravegaHealthcheckMsg=$(wget --server-response --spider --quiet "${targetPravegaAddrs}" 2>&1 | awk 'NR==1{print $2}')
    if [[ "$pravegaHealthcheckMsg" == "$WGET_UNRESOLVED_HOST_MSG" ]]; then
        echo "Pravega is not available at $targetPravegaAddrs. Restart Pravega and launch again."
        killall -9 java
        exit 1
    fi
}

hasReproduced=$STATUS_FAILURE
while [ $hasReproduced -ne $STATUS_SUCCESS ]
do
    for i in 1000 10000; do
        pravegaHealthcheck $TARGET_PRAVEGA_ADDRS &
        mongooseOutput=$(java -jar mongoose-4.2.16.jar --storage-driver-type=pravega  --storage-net-node-port=9090 --storage-net-node-addrs=127.0.0.1  --load-op-limit-rate=100 --load-step-limit-time=5m --storage-driver-threads=10 --storage-driver-stream-data=bytes  --storage-net-node-conn-pooling=false --item-data-size=10B --storage-driver-scaling-segments=1 --storage-driver-limit-concurrency=1000  --load-op-limit-count=${i}  --storage-namespace=koltscopebug18 --load-op-type=read)
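        # Extract the failed operation count from the captured Mongoose output
        failedOpCount=$(echo "$mongooseOutput" | grep "Failed:" | grep -o '[0-9]\+')
        if [[ -z "$failedOpCount" ]]; then
            echo "Mongoose seems to be unavailable"
            exit 1
        fi
        # Every read is expected to fail; a different count means the issue has reproduced
        if [ "$failedOpCount" -ne "$i" ]; then
            hasReproduced=$STATUS_SUCCESS
        fi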
        echo "current failed op count is $failedOpCount"
    done
done
echo "Reproduced!"
