This repository contains sample projects that use dbt.
Use `pipenv` to install `dbt-core`. Alongside `dbt-core`, you may need to install the adapter for the database or data warehouse you want to work with. We'll use Docker to pull the Postgres image and run it as our database.
sudo apt-get update
sudo apt-get install pwgen
export PGPASSWORD=$(pwgen -1)
echo "export PGPASSWORD=${PGPASSWORD}" >> ~/.bashrc
echo "export PGPASSWORD=${PGPASSWORD}" >> ~/.profile
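Quoting matters when the export line is appended to a startup file. The sketch below compares the two quoting styles; `DEMO_SECRET` and the temporary file are hypothetical stand-ins for `PGPASSWORD` and `~/.bashrc`, so nothing in your real shell configuration is touched.

```shell
# Hypothetical stand-ins: DEMO_SECRET for PGPASSWORD, a temp file for ~/.bashrc.
DEMO_SECRET="s3cr3t"
rc_file="$(mktemp)"

# Single quotes: the literal text ${DEMO_SECRET} is written, not its value.
echo 'export DEMO_SECRET=${DEMO_SECRET}' >> "$rc_file"

# Double quotes: the current value is expanded before the line is written.
echo "export DEMO_SECRET=${DEMO_SECRET}" >> "$rc_file"

# Show both lines side by side.
cat "$rc_file"
```

Only the double-quoted form stores the generated value, which is what a new terminal session needs in order to see the same password.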
pipenv install dbt-core
pipenv install dbt-postgres
docker pull postgres
docker run -it -d -e POSTGRES_PASSWORD=${PGPASSWORD} -p 5432:5432 --name messy_range postgres:latest
NOTE: If you get the error `The container name is already in use by container "<Id>"`, either remove the existing container with `docker rm messy_range` or choose a different container name when you execute the `docker run` command. If the container is only stopped, you can simply execute `docker restart messy_range`.
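The cases described in the note can be folded into one helper. This is a minimal sketch, not part of the repo: `ensure_postgres_container` is a hypothetical function name, and it assumes `PGPASSWORD` is already exported and the Docker CLI is on your `PATH`.

```shell
# Hypothetical helper: create the container if it is missing, start it if it
# is stopped, and do nothing if it is already running.
ensure_postgres_container() {
  name="$1"
  if ! docker container inspect "$name" >/dev/null 2>&1; then
    # No container with this name exists yet, so create it.
    docker run -d -e POSTGRES_PASSWORD="${PGPASSWORD}" -p 5432:5432 \
      --name "$name" postgres:latest
  elif [ "$(docker container inspect -f '{{.State.Running}}' "$name")" != "true" ]; then
    # The container exists but is stopped.
    docker start "$name"
  else
    echo "container $name is already running"
  fi
}

# Usage (uncomment to run against your local Docker daemon):
# ensure_postgres_container messy_range
```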
NOTE: You are not required to use the name `messy_range` for your container. You can choose any other name.
We will now create a new project using the `dbt init` command. Execute it on the command line and provide the project name when prompted. The first project we are going to create is `dbt_profiler_example`. Once the project is created, you should see a new directory named after the project.
cd dbt_profiler
code packages.yml
Follow the package's installation instructions to add the `dbt_profiler` dependency to `packages.yml`.
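As a rough illustration, the file can also be written from the shell instead of an editor. The package name `data-mie/dbt_profiler` is the one published on the dbt package hub; the pinned version is an assumption, so check the hub for the release that matches your dbt version. `PACKAGES_FILE` is a hypothetical variable used here only to make the target path overridable.

```shell
# Target file; defaults to packages.yml in the current (project) directory.
PACKAGES_FILE="${PACKAGES_FILE:-packages.yml}"

# Write a minimal package list. The quoted heredoc keeps the content literal.
cat > "$PACKAGES_FILE" <<'EOF'
packages:
  - package: data-mie/dbt_profiler
    version: 0.8.1  # assumed version; confirm on the dbt package hub
EOF

# Print the result for a quick visual check.
cat "$PACKAGES_FILE"
```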
Once you execute the `dbt deps` command, you should see a `dbt_packages` directory created in the project folder.
We can set up the connection details for our database in the `profiles.yml` file, which we can either create in our project's directory or keep in the `~/.dbt` directory.
Because everything lives in this repo and the data is not very sensitive, the cloned repo already contains a `.dbt` directory with a profile file in it.
For more information on the profile file, please visit -
https://docs.getdbt.com/docs/core/connect-data-platform/connection-profiles
The parent directory for `profiles.yml` is determined using the following precedence:
1. `--profiles-dir` option
2. `DBT_PROFILES_DIR` environment variable
3. current working directory
4. `~/.dbt/` directory
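The lookup order above can be sketched as a small shell function. This is an illustration of the precedence only, not dbt's actual implementation: `resolve_profiles_dir` is a hypothetical helper, its first argument stands in for the value of `--profiles-dir`, and it assumes the current working directory counts only when it contains a `profiles.yml`.

```shell
# Hypothetical helper mirroring the precedence list.
# $1: value passed to --profiles-dir (empty string if the option is absent).
resolve_profiles_dir() {
  cli_dir="$1"
  if [ -n "$cli_dir" ]; then
    echo "$cli_dir"              # 1. --profiles-dir option
  elif [ -n "${DBT_PROFILES_DIR:-}" ]; then
    echo "$DBT_PROFILES_DIR"     # 2. DBT_PROFILES_DIR environment variable
  elif [ -f "./profiles.yml" ]; then
    pwd                          # 3. current working directory (assumed: only if the file is there)
  else
    echo "$HOME/.dbt"            # 4. ~/.dbt/ directory
  fi
}

# Example: an explicit option wins over the environment variable.
DBT_PROFILES_DIR=/tmp/from-env resolve_profiles_dir /tmp/from-cli
```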
export DBT_PROFILES_DIR="$(pwd)/.dbt"
code ./.dbt/profiles.yml
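For orientation, a Postgres profile for the container started earlier might look like the sketch below. The values are assumptions, and the file shipped in this repo may differ: the profile name `dbt_profiler_example` must match the `profile:` entry in your `dbt_project.yml`, and `postgres` is the default user and database of the official image. The sample is written to a temporary directory so that it cannot overwrite the existing `.dbt/profiles.yml`.

```shell
# Write the sample into a scratch directory, not into the repo's .dbt folder.
sample_dir="$(mktemp -d)"

# Quoted heredoc: the Jinja expression is written literally and is resolved
# by dbt at run time from the PGPASSWORD environment variable.
cat > "$sample_dir/profiles.yml" <<'EOF'
dbt_profiler_example:        # assumed profile name
  target: dev
  outputs:
    dev:
      type: postgres
      host: localhost
      port: 5432
      user: postgres         # default user of the postgres image
      password: "{{ env_var('PGPASSWORD') }}"
      dbname: postgres       # default database of the postgres image
      schema: public
      threads: 4
EOF

echo "sample profile written to $sample_dir/profiles.yml"
```

Reading the password through `env_var` keeps the secret out of the file, which matters more once the data is less disposable than in these samples.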
Note: Make sure that host port `5432` is mapped to the container's port `5432` when you start the Postgres Docker container; the next steps depend on it. You must also activate your pipenv environment with the `pipenv shell` command, or else run each of the following commands as `pipenv run <command>`. Alternatively, you can run the `activate.sh` file after cloning the repo, and again every time you restart your machine and `cd` into this folder.
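A quick way to confirm the port mapping before moving on is to try opening a TCP connection from the host. The helper below is a hypothetical sketch that relies on bash's `/dev/tcp` pseudo-device and the coreutils `timeout` command; it only checks that something accepts connections on the port, not that it is Postgres.

```shell
# Hypothetical helper: succeeds if a TCP connection to host $1, port $2 opens
# within two seconds.
port_open() {
  timeout 2 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

if port_open localhost 5432; then
  echo "port 5432 is reachable"
else
  echo "port 5432 is not reachable; check the -p 5432:5432 mapping"
fi
```

`docker port messy_range` lists the mappings of the container itself, which helps when the port is reachable but you are unsure which container is answering.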
Execute the following command in the terminal. If the output shows `All checks passed!`, the settings in the connection profile in your home directory or your project directory worked.
dbt debug
Note: If you would rather use Snowflake or any other database instead of the dockerized Postgres, you are welcome to do so. Just make sure to update your profile file accordingly.
For the next set of steps visit the individual project folders.
- First, identify the release version of your Ubuntu with the command `lsb_release -a`.
- If the version is `20.04`, execute the following command to install all the dependencies:
sudo apt-get install build-essential libssl-dev libffi-dev python3-dev python3-pip libsasl2-dev libldap2-dev default-libmysqlclient-dev
- Install `apache-superset` using the command `pipenv install apache-superset`.
- After the installation, activate the virtual environment and run the following commands:
pipenv shell
# Create an admin user in your metadata database (use `admin` as username to be able to load the examples)
export FLASK_APP=superset
superset fab create-admin
# Load some data to play with
superset load_examples
# Create default roles and permissions
superset init
# To start a development web server on port 8088, use -p to bind to another port
superset run -p 8088 --with-threads --reload --debugger