39 commits
c5ec7ef
New requirements with train model deps
dgpatelgit Jan 5, 2021
f8223d1
Added deps in docker creation
dgpatelgit Jan 5, 2021
f92b59c
Fixed linter and common errors
dgpatelgit Jan 5, 2021
f8ccb60
Removed gcc-c++ from installation
dgpatelgit Jan 5, 2021
60b1818
Added C++ required for tensorflow
dgpatelgit Jan 5, 2021
d27553a
Included python-dev
dgpatelgit Jan 5, 2021
b3f2d10
Installed tensorflow-gpu as done for train model script
dgpatelgit Jan 5, 2021
a5f252c
Installed tensorflow-gpu as done for train model script
dgpatelgit Jan 5, 2021
41aa5c6
Using python 3.7 for train
dgpatelgit Jan 6, 2021
33e6a91
Resolved conflicting package version
dgpatelgit Jan 6, 2021
a26f6ec
Avoided installing requirement.txt
dgpatelgit Jan 6, 2021
9d3f6fe
Pinned tensorflow version
dgpatelgit Jan 6, 2021
7bbc50a
Fixed linter and version syntax error
dgpatelgit Jan 6, 2021
8045523
Pinned numpy and ninja package versions
dgpatelgit Jan 6, 2021
4867085
Pinned numpy and ninja package versions
dgpatelgit Jan 6, 2021
f51d5ac
Installed version with different layers in docker
dgpatelgit Jan 6, 2021
8779324
Added rudra as part of code and removed deps
dgpatelgit Jan 6, 2021
4266272
Added rudra as part of code and removed deps
dgpatelgit Jan 6, 2021
343188f
Removed rudra installation from test
dgpatelgit Jan 8, 2021
4179740
Pinned setuptools for test docker image
dgpatelgit Jan 8, 2021
fbb0d11
Added test deps ruamel.yaml
dgpatelgit Jan 8, 2021
f998966
Added requirements.txt deps installation
dgpatelgit Jan 8, 2021
fb54995
Added requirements.txt deps installation
dgpatelgit Jan 8, 2021
1ad7af8
Added requirements.txt deps installation
dgpatelgit Jan 8, 2021
0d9a4b2
Reordered deps installation
dgpatelgit Jan 8, 2021
222f895
Removed flask app
dgpatelgit Jan 8, 2021
a512fe2
Modified entry point for docker file
dgpatelgit Jan 8, 2021
2cb2d39
Entry script modified
dgpatelgit Jan 8, 2021
c71294b
Entry script modified
dgpatelgit Jan 8, 2021
07e006f
Added api server back with extra deps of train model
dgpatelgit Jan 8, 2021
a612f8d
Installing deps from requirements.txt
dgpatelgit Jan 8, 2021
e85246a
Version tuning
dgpatelgit Jan 8, 2021
6fad955
Version tuning
dgpatelgit Jan 8, 2021
a8f3ce7
Removed code to read local data
dgpatelgit Jan 12, 2021
ebaa7ad
Added setuptools 41.0.0
dgpatelgit Jan 12, 2021
f5f21fb
Removed version for scipy
dgpatelgit Jan 12, 2021
cfa45d3
Added gevent==1.5.0
dgpatelgit Jan 12, 2021
27327b2
Added gevent==1.5.0 and comment old version
dgpatelgit Jan 12, 2021
abfe4e4
Removed manual deps, added to requirments.txt
dgpatelgit Jan 12, 2021
42 changes: 35 additions & 7 deletions Dockerfile
@@ -1,19 +1,47 @@
FROM centos:7
FROM registry.centos.org/centos/centos:7

LABEL maintainer="Avishkar Gupta <avgupta@redhat.com>"

COPY ./recommendation_engine /recommendation_engine
COPY ./rudra /rudra
COPY ./requirements.txt /requirements.txt
#COPY ./requirements_new.txt /requirements_new.txt
COPY ./entrypoint.sh /bin/entrypoint.sh
COPY ./training /training

RUN yum install -y epel-release &&\
yum install -y openssl-devel &&\
yum install -y gcc git python36-pip python36-requests httpd httpd-devel python36-devel &&\
yum clean all
RUN yum -y install gcc openssl-devel bzip2-devel libffi-devel &&\
cd /tmp &&\
yum -y install -v httpd httpd-devel wget git make &&\
wget https://www.python.org/ftp/python/3.7.4/Python-3.7.4.tgz &&\
tar xzf Python-3.7.4.tgz &&\
cd Python-3.7.4 &&\
./configure --enable-optimizations &&\
make altinstall &&\
export PATH="/usr/local/bin:$PATH" &&\
python3.7 -m pip install --upgrade pip --user

#RUN python3.7 -m pip install setuptools==41.0.0 --user &&\
# python3.7 -m pip install -r requirements.txt --user

#RUN python3.7 -m pip install numpy==1.16.5 Jinja2==2.10.1 --user &&\
# python3.7 -m pip install setuptools==41.0.0 tensorflow==2.0.0b1 pandas boto3 scipy daiquiri flask h5py --user

#RUN python3.7 -m pip install git+https://github.com/fabric8-analytics/fabric8-analytics-rudra --user

#RUN python3.7 -m pip install numpy==1.16.5 Jinja2==2.10.1 --user

#RUN yum install -y epel-release &&\
# yum install -y openssl-devel &&\
# yum install -y gcc gcc-c++ git python36-pip python36-requests httpd httpd-devel python36-devel python-dev &&\
# yum clean all

#RUN pip3 install pandas boto3 numpy tensorflow scipy daiquiri flask h5py --user

RUN chmod 0777 /bin/entrypoint.sh

RUN pip3 install git+https://github.com/fabric8-analytics/fabric8-analytics-rudra#egg=rudra
RUN pip3 install -r requirements.txt
#RUN pip3 install git+https://github.com/fabric8-analytics/fabric8-analytics-rudra#egg=rudra
#RUN pip3 install -r requirements.txt
RUN python3.7 -m pip install -r requirements.txt

ENTRYPOINT ["/bin/entrypoint.sh"]
#ENTRYPOINT ["python3.7 /recommendation_engine/flask_predict.py"]
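The image now builds Python 3.7.4 from source and installs the pinned dependencies with python3.7 -m pip, so a quick smoke test inside the container helps catch a broken interpreter or a missing wheel before the image is deployed. A minimal sketch (the package list is an assumption based on the pins in requirements.txt):

#!/usr/bin/env python3.7
"""Smoke test: confirm the interpreter version and that the key training deps import."""
import sys

# The Dockerfile's `make altinstall` should leave a 3.7.x interpreter on PATH.
assert sys.version_info[:2] == (3, 7), "expected Python 3.7, got {}".format(sys.version)

# Packages assumed from requirements.txt; trim or extend as the pin set changes.
for name in ("numpy", "scipy", "pandas", "h5py", "tensorflow", "flask", "boto3"):
    module = __import__(name)
    print("{:<12} {}".format(name, getattr(module, "__version__", "unknown")))

Running it with something like docker run --rm <image> python3.7 /tmp/smoke_test.py (path illustrative) is enough to exercise the build.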
32 changes: 17 additions & 15 deletions Dockerfile.tests
@@ -6,25 +6,27 @@ LABEL MAINTAINER="Avishkar Gupta <avgupta@redhat.com>"
# copy testing source code and scripts into root dir /
# --------------------------------------------------------------------------------------------------

ADD ./recommendation_engine /recommendation_engine
ADD ./requirements.txt /requirements.txt
ADD ./training/ /training
ADD ./tests/ /tests
ADD ./tests/scripts/entrypoint-test.sh /entrypoint-test.sh
ADD .coveragerc /.coveragerc
ADD ./.git /.git
ADD ./tools /tools
RUN chmod 0777 /entrypoint-test.sh
#ADD ./recommendation_engine /recommendation_engine
#ADD ./requirements.txt /requirements.txt
#ADD ./training/ /training
#Add ./rudra /rudra
#ADD ./tests/ /tests
#ADD ./tests/scripts/entrypoint-test.sh /entrypoint-test.sh
#ADD .coveragerc /.coveragerc
#ADD ./.git /.git
#ADD ./tools /tools
#RUN chmod 0777 /entrypoint-test.sh

ENV PYTHONPATH=/

RUN pip3 install --upgrade pip
RUN pip install tensorflow==2.0.0
RUN pip install git+https://github.com/fabric8-analytics/fabric8-analytics-rudra#egg=rudra
RUN pip install pytest pytest-cov radon==2.4.0 codecov raven blinker
RUN pip install -r requirements.txt
#RUN pip3 install --upgrade pip
#RUN pip install ruamel.yaml setuptools==41.0.0 tensorflow==2.0.0
#RUN pip install git+https://github.com/fabric8-analytics/fabric8-analytics-rudra#egg=rudra
#RUN pip install pytest pytest-cov radon==2.4.0 codecov raven blinker
#RUN pip install -r requirements.txt

# --------------------------------------------------------------------------------------------------
# RUN THE UNIT TESTS
# --------------------------------------------------------------------------------------------------
ENTRYPOINT ["/entrypoint-test.sh"]
#ENTRYPOINT ["/entrypoint-test.sh"]
ENTRYPOINT ["pwd"]
4 changes: 2 additions & 2 deletions deployment/submit_emr_job_pretrain.py
@@ -44,8 +44,8 @@ def submit_job(input_bootstrap_file, input_src_code_file):

# S3 bucket/key, where the spark job logs will be maintained
s3_log_bucket = config.DEPLOYMENT_PREFIX + '-automated-analytics-spark-jobs'
s3_log_key = '{}_{}_spark_emr_log_'.format(config.DEPLOYMENT_PREFIX, COMPONENT_PREFIX,
str_cur_time)
s3_log_key = '{}_{}_spark_emr_log_{}'.format(config.DEPLOYMENT_PREFIX, COMPONENT_PREFIX,
str_cur_time)
s3_log_uri = 's3://{bucket}/{key}'.format(bucket=s3_log_bucket, key=s3_log_key)

_logger.debug("Uploading the bootstrap action to AWS S3 URI {} ...".format(s3_bootstrap_uri))
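The one-character fix above matters because str.format() silently ignores surplus positional arguments, so the old key built without error but dropped the timestamp, making every run's log key identical. A small illustration with made-up values:

prefix, component, ts = "STAGE", "hpf", "2021-01-08T10-00-00"

# Old pattern: two placeholders, three arguments -- the timestamp is silently discarded.
old_key = '{}_{}_spark_emr_log_'.format(prefix, component, ts)
print(old_key)   # STAGE_hpf_spark_emr_log_

# Fixed pattern: the third placeholder picks up the timestamp, so keys stay unique per run.
new_key = '{}_{}_spark_emr_log_{}'.format(prefix, component, ts)
print(new_key)   # STAGE_hpf_spark_emr_log_2021-01-08T10-00-00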
1 change: 1 addition & 0 deletions entrypoint.sh
@@ -1,3 +1,4 @@
#!/bin/bash

gunicorn --pythonpath /recommendation_engine -b 0.0.0.0:$SERVICE_PORT --workers=2 -k sync -t $SERVICE_TIMEOUT flask_predict:app
#python3.7 /recommendation_engine/flask_predict.py
12 changes: 6 additions & 6 deletions recommendation_engine/autoencoder/train/train.py
@@ -135,16 +135,16 @@ def train(self, data):
x_train = np.load(os.path.join(TEMPORARY_DATA_PATH, 'content_matrix.npz'))
x_train = x_train['matrix']
input_dim = x_train.shape[1]
logger.info("size of training file is: ".format(len(x_train), len(x_train[0])))
logger.info("size of training file is: {} {}".format(len(x_train), len(x_train[0])))
user_to_item_matrix = load_rating(TEMPORARY_USER_ITEM_FILEPATH, TEMPORARY_DATASTORE)
item_to_user_matrix = load_rating(TEMPORARY_ITEM_USER_FILEPATH, TEMPORARY_DATASTORE)
logger.info("Shape of User and Item matrices:".format(np.shape(user_to_item_matrix),
np.shape(item_to_user_matrix)))
logger.info("Shape of User and Item matrices: {} {}".format(np.shape(user_to_item_matrix),
np.shape(item_to_user_matrix)))
pretrain.fit(x_train)
encoder_weights = p.train(x_train)
logger.info("Shape of encoder weights are: ".format(tf.shape(encoder_weights),
len(encoder_weights),
len(encoder_weights[0])))
logger.info("Shape of encoder weights are: {} {} {}".format(tf.shape(encoder_weights),
len(encoder_weights),
len(encoder_weights[0])))
pmf_obj = PMFTraining(len(user_to_item_matrix), len(item_to_user_matrix), encoder_weights)
logger.debug("PMF model has been initialised")
pmf_obj(user_to_item_matrix=user_to_item_matrix,
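These logger fixes are the same failure mode as the EMR change: the original format strings declared no placeholders, so .format() returned the literal message and the matrix shapes never reached the log. The diff keeps str.format(); an equivalent option is %-style lazy logging, which also skips the interpolation when the level is filtered out. A sketch (not what the PR uses, just the comparison):

import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

rows, cols = 1000, 256

# Broken: no placeholders, so the arguments are formatted away.
logger.info("size of training file is: ".format(rows, cols))       # "size of training file is: "

# Fixed, as in the diff: explicit placeholders.
logger.info("size of training file is: {} {}".format(rows, cols))  # "... 1000 256"

# Lazy alternative: interpolation happens only if the record is actually emitted.
logger.info("size of training file is: %s %s", rows, cols)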
14 changes: 8 additions & 6 deletions recommendation_engine/flask_predict.py
@@ -20,17 +20,18 @@
import os

import flask
from flask import Flask, request
from flask import Flask
'''from flask import Flask, request
from recommendation_engine.predictor.online_recommendation import PMFRecommendation
from rudra.data_store.aws import AmazonS3
import recommendation_engine.config.cloud_constants as cloud_constants
from recommendation_engine.config.cloud_constants import USE_CLOUD_SERVICES
from recommendation_engine.config.params_scoring import ScoringParams
from recommendation_engine.config.params_scoring import ScoringParams'''
from raven.contrib.flask import Sentry
import logging

app = Flask(__name__)

'''
if USE_CLOUD_SERVICES:
s3 = AmazonS3(bucket_name=cloud_constants.S3_BUCKET_NAME, # pragma: no cover
aws_access_key_id=cloud_constants.AWS_S3_ACCESS_KEY_ID,
@@ -46,7 +47,7 @@
recommender = PMFRecommendation(ScoringParams.recommendation_threshold,
s3,
ScoringParams.num_latent_factors)

'''
SENTRY_DSN = os.environ.get("SENTRY_DSN", "")
sentry = Sentry(app, dsn=SENTRY_DSN, logging=True, level=logging.ERROR)
app.logger.info('App initialized, ready to roll...')
@@ -68,7 +69,7 @@ def readiness():
def recommendation():
"""Endpoint to serve recommendations."""
app.logger.info("Executed companion recommendation")
global recommender
'''global recommender
response_json = []
for recommendation_request in request.json:
missing, recommendations, ip_package_to_topic_dict = recommender.predict(
@@ -79,7 +80,8 @@ def recommendation():
"companion_packages": recommendations,
"ecosystem": os.environ.get("CHESTER_SCORING_REGION"),
"package_to_topic_dict": ip_package_to_topic_dict
})
})'''
response_json = []
return flask.jsonify(response_json), 200


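With the recommender wiring commented out, the endpoint accepts any payload and returns an empty list, which keeps the service importable while the training-image work lands. A quick check with Flask's test client (the route path is an assumption, since the @app.route decorator sits outside the visible hunk):

from recommendation_engine.flask_predict import app

client = app.test_client()

# Route path assumed; substitute the rule actually registered above recommendation().
resp = client.post("/api/v1/companion_recommendation",
                   json=[{"package_list": ["org.slf4j:slf4j-api"]}])

assert resp.status_code == 200
assert resp.get_json() == []   # stubbed response while the PMF recommender is disabled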
18 changes: 11 additions & 7 deletions requirements.in
@@ -1,8 +1,12 @@
boto3==1.6.7
daiquiri==1.3.0
flask==1.0.2
gevent==1.2.2
numpy==1.14.2
scipy==1.0.0
gunicorn==19.7.1
boto3
daiquiri
flask
gevent
numpy
scipy
gunicorn
raven[flask]
setuptools==41.0.0
tensorflow==2.0.0b1
pandas
h5py
20 changes: 16 additions & 4 deletions requirements.txt
@@ -4,25 +4,37 @@
#
# pip-compile --output-file requirements.txt requirements.in
#
setuptools==41.0.0
gevent==1.5.0
boto3==1.6.7
botocore==1.9.23 # via boto3, s3transfer
click==6.7 # via flask
daiquiri==1.3.0
docutils==0.14 # via botocore
flask==1.0.2
gevent==1.2.2
#gevent==1.2.2
greenlet==0.4.14 # via gevent
gunicorn==19.7.1
gunicorn==20.0.4
itsdangerous==0.24 # via flask
jinja2==2.10.1 # via flask
jmespath==0.9.3 # via boto3, botocore
markupsafe==1.0 # via jinja2
numpy==1.14.2
#numpy==1.14.2
python-dateutil==2.6.1 # via botocore
s3transfer==0.1.13 # via boto3
scipy==1.0.0
scipy
#scipy==1.0.0
six==1.11.0 # via python-dateutil
werkzeug==0.15.3 # via flask
raven[flask]==6.10.0
contextlib2==0.5.5 # via raven
blinker==1.4 # via raven

numpy==1.16.5
tensorflow==2.0.0
pandas
#boto3
#scipy
#daiquiri
#flask
h5py
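Because the Dockerfile installs from this hand-edited compiled file rather than re-running pip-compile against requirements.in, the two can drift apart and the image can end up with versions that no longer match the pins. A hedged sketch of a post-install check using pkg_resources (shipped with the pinned setuptools; the file path is an assumption):

import pkg_resources

# Walk the pin file and report anything installed that conflicts with a pin.
with open("requirements.txt") as f:
    for line in f:
        req = line.split("#", 1)[0].strip()   # drop comments and blank lines
        if not req:
            continue
        try:
            pkg_resources.require(req)
            print("OK        ", req)
        except (pkg_resources.DistributionNotFound,
                pkg_resources.VersionConflict) as exc:
            print("MISMATCH  ", req, "->", exc)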
55 changes: 55 additions & 0 deletions requirements_new.txt
@@ -0,0 +1,55 @@
tensorflow==1.7.0
tensorflow-estimator==1.15.0
tensorboard==1.7.0
tensorboard-plugin-wit==1.7.0
absl-py==0.11.0
astunparse==1.6.3
blinker==1.4
boto3==1.6.7
botocore==1.9.23
cachetools==4.2.0
certifi==2020.12.5
chardet==4.0.0
click==7.1.2
daiquiri==1.3.0
docutils==0.16
Flask==1.0.2
flatbuffers==1.12
gast==0.3.3
gevent==1.2.2
google-auth==1.24.0
google-auth-oauthlib==0.4.2
google-pasta==0.2.0
greenlet==0.4.17
grpcio==1.32.0
gunicorn==19.7.1
h5py==2.10.0
idna==2.10
importlib-metadata==3.3.0
itsdangerous==1.1.0
Jinja2==2.11.2
jmespath==0.10.0
Keras-Preprocessing==1.1.2
Markdown==3.3.3
MarkupSafe==1.1.1
numpy==1.19.4
oauthlib==3.1.0
opt-einsum==3.3.0
pip-tools==5.5.0
protobuf==3.14.0
pyasn1==0.4.8
pyasn1-modules==0.2.8
python-dateutil==2.6.1
raven==6.10.0
requests==2.25.1
requests-oauthlib==1.3.0
rsa==4.6
s3transfer==0.1.13
scipy==1.0.0
six==1.15.0
termcolor==1.1.0
typing-extensions==3.7.4.3
urllib3==1.26.2
Werkzeug==1.0.1
wrapt==1.12.1
zipp==3.4.0
24 changes: 24 additions & 0 deletions rudra/__init__.py
@@ -0,0 +1,24 @@
"""Initialize the ml_utils package."""

import datetime
import os
import logging
import daiquiri

DEBUG = os.getenv('DEBUG', False) == 'true'

formatter = daiquiri.formatter.ColorExtrasFormatter(
fmt=(daiquiri.formatter.DEFAULT_EXTRAS_FORMAT +
" [%(filename)s:%(lineno)s F:%(funcName)s()]"))

daiquiri.setup(
level=logging.DEBUG if DEBUG else logging.ERROR,
outputs=(
daiquiri.output.TimedRotatingFile('/tmp/rudra.errors.log',
level=logging.WARNING,
interval=datetime.timedelta(hours=48)),
daiquiri.output.Stream(formatter=formatter)
)
)

logger = daiquiri.getLogger(__name__)
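Vendoring rudra means its daiquiri setup runs at import time, so callers simply reuse the module-level logger or ask daiquiri for a named one; both inherit the rotating error file and the stream output configured above. A minimal usage sketch (message text is illustrative):

import daiquiri

from rudra import logger

# Root level is ERROR unless DEBUG=true is exported, so only errors and above are emitted by default.
logger.error("S3 download failed for %s", "training/content_matrix.npz")

# Named loggers share the same outputs configured in rudra/__init__.py.
train_logger = daiquiri.getLogger("recommendation_engine.train")
train_logger.error("training aborted")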
1 change: 1 addition & 0 deletions rudra/data_store/__init__.py
@@ -0,0 +1 @@
"""Data Store and Retrieval from various Storage."""
45 changes: 45 additions & 0 deletions rudra/data_store/abstract_data_store.py
@@ -0,0 +1,45 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-

"""Abstract class for data store interactions."""

import abc


class AbstractDataStore(metaclass=abc.ABCMeta):
"""Abstract class to dictate the behaviour of a data store."""

@abc.abstractmethod
def get_name(self):
"""Get name of bucket or root fs directory."""
pass

@abc.abstractmethod
def read_json_file(self):
"""Read JSON file from the data source."""
pass

@abc.abstractmethod
def read_generic_file(self):
"""Read a file and return its contents."""
pass

@abc.abstractmethod
def read_pickle_file(self, _filename):
"""Read Pickle file from data store."""
pass

@abc.abstractmethod
def read_yaml_file(self, _filename):
"""Read Pickle file from data store."""
pass

@abc.abstractmethod
def upload_file(self, _src, _target):
"""Upload file into data store."""
pass

@abc.abstractmethod
def write_json_file(self, _filename, _contents):
"""Write JSON file into data store."""
pass
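The PR only brings in the abstract contract (the S3-backed implementation lives elsewhere in rudra), so a concrete subclass makes the expected behaviour easier to see. A hedged sketch of a local-filesystem store (the class name and the ruamel.yaml usage are illustrative, not part of this PR; Python ABCs do not enforce argument names, so filenames are taken as plain parameters here):

import json
import os
import pickle
import shutil

from ruamel.yaml import YAML

from rudra.data_store.abstract_data_store import AbstractDataStore


class LocalDataStore(AbstractDataStore):
    """Store and retrieve artifacts from a local directory, mirroring the S3 store's contract."""

    def __init__(self, root_dir):
        self.root_dir = root_dir

    def get_name(self):
        """Get name of the root fs directory."""
        return self.root_dir

    def read_json_file(self, filename):
        """Read JSON file from the data source."""
        with open(os.path.join(self.root_dir, filename)) as f:
            return json.load(f)

    def read_generic_file(self, filename):
        """Read a file and return its raw bytes."""
        with open(os.path.join(self.root_dir, filename), 'rb') as f:
            return f.read()

    def read_pickle_file(self, filename):
        """Read Pickle file from data store."""
        with open(os.path.join(self.root_dir, filename), 'rb') as f:
            return pickle.load(f)

    def read_yaml_file(self, filename):
        """Read YAML file from data store."""
        with open(os.path.join(self.root_dir, filename)) as f:
            return YAML(typ='safe').load(f)

    def upload_file(self, src, target):
        """Copy a local file into the store directory."""
        shutil.copyfile(src, os.path.join(self.root_dir, target))

    def write_json_file(self, filename, contents):
        """Write JSON file into data store."""
        with open(os.path.join(self.root_dir, filename), 'w') as f:
            json.dump(contents, f)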