Celery Configuration
Tutorial to follow along
Configuring celery workers and celery beat for async task execution and scheduling alerts, reports, etc.
- Python and Python-venv (Version 3.10)
- Postgres (Metadata db)
- Redis
I am installing everything on a single VM, since running several machines would cost a lot, but the same steps apply if you split the components across servers.
The process is the same as the normal installation we saw earlier, with a few changes. Let's start.
    sudo apt update -y && sudo apt upgrade -y
Installing Redis
sudo apt install redis-server
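Before moving on, it can help to confirm that Redis is actually accepting connections. The following is a minimal sketch, not part of the original setup: it assumes Redis listens on the default `localhost:6379` without a password, and `redis_ping` is a hypothetical helper name. It sends the `PING` command over a plain socket, so it needs no extra packages.

```python
import socket


def redis_ping(host="localhost", port=6379, timeout=2.0):
    """Return True if a Redis server at host:port answers PING with PONG."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            # Inline command form of the Redis protocol: a bare line ending in CRLF.
            conn.sendall(b"PING\r\n")
            reply = conn.recv(64)
    except OSError:
        # Connection refused, timeout, name resolution failure, and similar errors.
        return False
    # A server without authentication replies with the simple string "+PONG".
    return reply.startswith(b"+PONG")


if __name__ == "__main__":
    print("Redis reachable:", redis_ping())
```

If this prints `False`, check the service with `sudo systemctl status redis-server` before continuing.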
Installing Apache Superset
- Install dependencies
    sudo apt-get install build-essential libssl-dev libffi-dev python3.10-dev python3.10-pip libsasl2-dev libldap2-dev default-libmysqlclient-dev python3.10-venv libpq-dev
If 3.10 gives an error, you can add the repository using
    sudo add-apt-repository ppa:savoury1/python
    sudo apt update
- Create the app directory for Superset and its dependencies (here user stands for your own login name)
    sudo mkdir /app
    sudo chown user /app
    cd /app
- Create the Python environment
    mkdir superset
    cd superset
    python3 -m venv superset_env
    . superset_env/bin/activate
    pip install --upgrade setuptools pip
- Install Python dependencies
    pip install pillow
    pip install apache-superset
    pip install psycopg2
    pip install gunicorn
    pip install celery
    pip install gevent
- Create the Superset config file and set the environment variable
    touch superset_config.py
    export SUPERSET_CONFIG_PATH=/app/superset/superset_config.py
- Edit superset_config.py using nano superset_config.py and put the following code in it
    # Superset specific config
    ROW_LIMIT = 5000

    # Flask App Builder configuration
    # Your App secret key will be used for securely signing the session cookie
    # and encrypting sensitive information on the database
    # Make sure you are changing this key for your deployment with a strong key.
    # Alternatively you can set it with `SUPERSET_SECRET_KEY` environment variable.
    # You MUST set this for production environments or the server will refuse
    # to start and you will see an error in the logs accordingly.
    SECRET_KEY = 'YOUR_OWN_RANDOM_GENERATED_SECRET_KEY'

    # The SQLAlchemy connection string to your database backend
    # This connection defines the path to the database that stores your
    # superset metadata (slices, connections, tables, dashboards, ...).
    # Note that the connection information to connect to the datasources
    # you want to explore are managed directly in the web UI
    # The check_same_thread=false property ensures the sqlite client does not attempt
    # to enforce single-threaded access, which may be problematic in some edge cases
    SQLALCHEMY_DATABASE_URI = 'sqlite:////app/superset/superset.db?check_same_thread=false'

    TALISMAN_ENABLED = False
    WTF_CSRF_ENABLED = False

    # Set this API key to enable Mapbox visualizations
    MAPBOX_API_KEY = ''

    # Celery Redis configuration for async query execution
    class CeleryConfig(object):
        broker_url = "redis://localhost:6379/0"
        imports = (
            "superset.sql_lab",
            "superset.tasks.scheduler",
        )
        result_backend = "redis://localhost:6379/0"
        worker_prefetch_multiplier = 10
        task_acks_late = True
        task_annotations = {
            "sql_lab.get_sql_results": {
                "rate_limit": "100/s",
            },
        }

    CELERY_CONFIG = CeleryConfig

    # On Redis
    from flask_caching.backends.rediscache import RedisCache
    RESULTS_BACKEND = RedisCache(
        host='localhost', port=6379, key_prefix='superset_results')
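The config above imports `superset.tasks.scheduler` but does not tell Celery beat what to run or when, so beat would have nothing to trigger for alerts and reports. The sketch below shows one way the `CeleryConfig` class could be extended with a `beat_schedule`. It is an assumption layered on the original file, modelled on the Alerts and Reports example in the Superset documentation: the task names `reports.scheduler` and `reports.prune_log` and the `ALERT_REPORTS` feature flag should be checked against your Superset version.

```python
# Sketch of additions to superset_config.py (assumes Celery is installed in the venv).
from celery.schedules import crontab

# Assumption: Alerts & Reports is switched on through this feature flag.
FEATURE_FLAGS = {"ALERT_REPORTS": True}


class CeleryConfig(object):
    broker_url = "redis://localhost:6379/0"
    imports = ("superset.sql_lab", "superset.tasks.scheduler")
    result_backend = "redis://localhost:6379/0"
    worker_prefetch_multiplier = 10
    task_acks_late = True
    task_annotations = {"sql_lab.get_sql_results": {"rate_limit": "100/s"}}
    beat_schedule = {
        # Check every minute for alerts and reports that are due.
        "reports.scheduler": {
            "task": "reports.scheduler",
            "schedule": crontab(minute="*", hour="*"),
        },
        # Prune old report execution logs once a day at midnight.
        "reports.prune_log": {
            "task": "reports.prune_log",
            "schedule": crontab(minute=0, hour=0),
        },
    }


CELERY_CONFIG = CeleryConfig
```

This class would replace the `CeleryConfig` definition in the file rather than sit next to it; the other settings stay the same.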
- Replace YOUR_OWN_RANDOM_GENERATED_SECRET_KEY in the file above with the value returned by the following command
    openssl rand -base64 42
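If `openssl` is not available, a comparable key can be produced with the Python standard library. This is a sketch of an alternative to the command above, not something the original setup requires; `generate_secret_key` is a hypothetical helper name. Like the `openssl` command, it base64-encodes 42 random bytes, which gives a 56-character string.

```python
import base64
import secrets


def generate_secret_key(num_bytes=42):
    """Return num_bytes of OS-provided randomness encoded as base64 text."""
    return base64.b64encode(secrets.token_bytes(num_bytes)).decode("ascii")


if __name__ == "__main__":
    print(generate_secret_key())
```

Paste the printed value into `SECRET_KEY`, and keep it out of version control.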
- Once done, let us initialize the database with the following commands
    # Create an admin user in your metadata database (use `admin` as username to be able to load the examples)
    export FLASK_APP=superset
    superset db upgrade
    superset fab create-admin
    # As this is going to be production I have commented out the load examples step, but you can run it if you need it
    # superset load_examples
    # Create default roles and permissions
    superset init
- Now our environment is ready, so let's try running it. To run Superset I have created a shell script that starts the server. Create the script using the following command
    nano run_superset.sh
and paste the following code in it.
    #!/bin/bash
    export SUPERSET_CONFIG_PATH=/app/superset/superset_config.py
    . /app/superset/superset_env/bin/activate
    gunicorn \
        -w 10 \
        -k gevent \
        --timeout 120 \
        -b 0.0.0.0:8088 \
        --limit-request-line 0 \
        --limit-request-field_size 0 \
        --statsd-host localhost:8125 \
        "superset.app:create_app()"
- To run the script we need to grant it execute permission. To do that, run the following command.
    chmod +x run_superset.sh
- Let's run it and test whether it works.
    sh run_superset.sh
- Check that you are able to log in using the admin credentials at server-ip-address:8088. If everything is working fine, we can go ahead and create a service that starts automatically when the server boots or reboots.
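The login check can also be done from a terminal on the server. The sketch below assumes Superset's `/health` endpoint is available on port 8088 and answers with HTTP 200 when the web server is up; `superset_is_healthy` is a hypothetical helper name, and the base URL is an assumption you should adjust to your host.

```python
import http.client
import urllib.error
import urllib.request


def superset_is_healthy(base_url="http://localhost:8088", timeout=5.0):
    """Return True if GET <base_url>/health answers with HTTP 200."""
    # An empty ProxyHandler disables any proxy picked up from the environment,
    # so the request goes straight to the given host.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(f"{base_url}/health", timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Covers refused connections, timeouts, and non-2xx responses.
        return False


if __name__ == "__main__":
    print("Superset healthy:", superset_is_healthy())
```

A `True` result only shows that gunicorn is serving requests; logging in through the browser is still the real test of the admin account.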
[!NOTE]
If you connect a database now and async query execution is enabled, queries will run indefinitely because no Celery worker is running yet. We will set up Celery once the Superset service is created.
Let's create a service called superset using the following command
    sudo nano /etc/systemd/system/superset.service
Paste the following code in it
    [Unit]
    Description = Apache Superset Webserver Daemon
    After = network.target

    [Service]
    PIDFile = /app/superset/superset-webserver.PIDFile
    Environment=SUPERSET_HOME=/app/superset
    Environment=PYTHONPATH=/app/superset
    WorkingDirectory = /app/superset
    ExecStart = /app/superset/run_superset.sh
    ExecStop = /bin/kill -s TERM $MAINPID

    [Install]
    WantedBy=multi-user.target
Once copied, run the following commands to enable and start the service
    sudo systemctl daemon-reload
    sudo systemctl enable superset.service
    sudo systemctl start superset.service
Run and Test Celery
- To run Celery I have created a shell script that starts the worker and the beat scheduler. Create the script using the following command
    nano run_celery.sh
and paste the following code in it.
    #!/bin/bash
    export SUPERSET_CONFIG_PATH=/app/superset/superset_config.py
    . /app/superset/superset_env/bin/activate
    celery --app=superset.tasks.celery_app:app worker --pool=prefork -O fair -c 4 &
    celery --app=superset.tasks.celery_app:app beat
In the script above, -c 4 sets how many worker processes the Celery worker runs.
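The value 4 is fixed in the script. If you would rather size it from the machine, the sketch below builds the same worker command with the concurrency taken from the CPU count, which is also what Celery falls back to when `-c` is omitted. `worker_command` is a hypothetical helper for illustration only; the flags are copied from the script above.

```python
import os


def worker_command(concurrency=None):
    """Build the Celery worker command from run_celery.sh as an argument list."""
    if concurrency is None:
        # Fall back to the number of CPUs Python can see (1 if unknown).
        concurrency = os.cpu_count() or 1
    return [
        "celery", "--app=superset.tasks.celery_app:app", "worker",
        "--pool=prefork", "-O", "fair", "-c", str(concurrency),
    ]


if __name__ == "__main__":
    print(" ".join(worker_command()))
```

On a single VM that also hosts gunicorn and Redis, a value below the CPU count leaves room for the other processes.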
- To run the script we need to grant it execute permission. To do that, run the following command.
    chmod +x run_celery.sh
- Let's run it and test whether it works.
    sh run_celery.sh
- Create the Celery service. Edit or create the file
    sudo nano /etc/systemd/system/celery.service
and paste the following code in it
    [Unit]
    Description = Apache Celery worker Daemon
    After = network.target

    [Service]
    PIDFile = /app/superset/celery.PIDFile
    Environment=SUPERSET_HOME=/app/superset
    Environment=PYTHONPATH=/app/superset
    WorkingDirectory = /app/superset
    ExecStart = /app/superset/run_celery.sh
    ExecStop = /bin/kill -s TERM $MAINPID

    [Install]
    WantedBy=multi-user.target
Once copied, run the following commands to enable and start the service
    sudo systemctl daemon-reload
    sudo systemctl enable celery.service
    sudo systemctl start celery.service
Yay! Your enterprise server is up and running. You can test it by restarting the server and checking that both services come back on their own.
If you have any issues, you can contact me at [email protected].
