Setting up your Dev Environment
Prerequisites#
In order to contribute to Great Expectations, you will need the following:
- A GitHub account—this is sufficient if you only want to contribute to the documentation. 
- If you want to contribute code, you will also need a working version of Git on your computer. Please refer to the Git setup instructions for your environment. 
- We also recommend going through the SSH key setup process on GitHub for easier authentication. 
Fork and clone the repository#
1. Fork the Great Expectations repo#
- Go to the Great Expectations repo on GitHub. 
- Click the Fork button in the top right. This will make a copy of the repo in your own GitHub account.
- GitHub will take you to your forked version of the repository. 
2. Clone your fork#
- Click the green Clone button and choose the SSH or HTTPS URL depending on your setup.
- Copy the URL and run `git clone <url>` in your local terminal.
- This will clone the `develop` branch of the great_expectations repo. Please use `develop` (not `main`!) as the starting point for your work.
- Atlassian has a nice tutorial for developing on a fork. 
3. Add the upstream remote#
- On your local machine, cd into the great_expectations repo you cloned in the previous step. 
- Run: `git remote add upstream git@github.com:great-expectations/great_expectations.git`
- This sets up a remote called `upstream` to track changes in the main Great Expectations repository.
4. Create a feature branch to start working on your changes.#
- Example: `git checkout -b feature/my-feature-name`
- We do not currently follow a strict naming convention for branches. Please pick something clear and self-explanatory, so that it will be easy for others to get the gist of your work. 
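As a rough illustration of steps 3 and 4 together, the sketch below defines a small helper that fetches the `upstream` remote and creates a feature branch from its `develop` branch. The function name `start_feature_branch` is hypothetical and not part of the repository. Branching from `upstream/develop` rather than from your local `develop` is an assumption made here so that the branch starts from the latest upstream state.

```shell
# Hypothetical helper (not part of great_expectations): create a feature
# branch from the latest upstream develop. Assumes it is run inside your
# clone and that the `upstream` remote from step 3 already exists.
start_feature_branch() {
    local name="$1"
    # Fetch first so upstream/develop is current, then branch from it.
    # --no-track keeps the new branch from tracking upstream/develop.
    git fetch upstream &&
        git checkout --no-track -b "feature/${name}" upstream/develop
}
```

For example, `start_feature_branch my-feature-name` would leave you on a branch called `feature/my-feature-name`.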
Install Python dependencies#
5. Create a new virtual environment#
- Make a new virtual environment (e.g. using virtualenv or conda), name it “great_expectations_dev” or similar. 
- Ex virtualenv: `python3 -m venv <path_to_environments_folder>/great_expectations_dev` and then `source <path_to_environments_folder>/great_expectations_dev/bin/activate`
- Ex conda: `conda create --name great_expectations_dev` and then `conda activate great_expectations_dev`
This is not required, but highly recommended.
6. Install dependencies from requirements-dev.txt#
- `pip install -r requirements-dev.txt -c constraints-dev.txt`
- macOS users can install `requirements-dev.txt` with the above command (using pip or pip3) from within conda, but Windows users working in a conda environment will need to install each package listed in requirements-dev.txt individually.
- This ensures that you have the right libraries installed in your Python environment.
- Note that you can also substitute requirements-dev-test.txt to install only the requirements needed for testing all backends, and requirements-dev-spark.txt or requirements-dev-sqlalchemy.txt if you would like to add support for Spark or SQLAlchemy tests, respectively. For some database backends, such as MSSQL, additional driver installation may be required in your environment; see below for more information.
 
7. Install great_expectations from your cloned repo#
- `pip install -e .`
- `-e` will install Great Expectations in “editable” mode. This is not required, but is often very convenient as a developer.
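Steps 5 through 7 can be sketched as a single helper for the virtualenv case. The function name `setup_dev_env` and its two arguments (the path to your clone and the path for the new virtual environment) are hypothetical; the commands inside are the ones listed above. The early check for `requirements-dev.txt` is an added safeguard, not something the project requires.

```shell
# Hypothetical helper (not part of great_expectations): wraps steps 5-7
# for the virtualenv case.
#   $1 = path to your great_expectations clone
#   $2 = path where the virtual environment should be created
setup_dev_env() {
    local repo_dir="$1"
    local venv_dir="$2"

    # Fail early if this does not look like a great_expectations checkout.
    if [ ! -f "${repo_dir}/requirements-dev.txt" ]; then
        echo "requirements-dev.txt not found in ${repo_dir}" >&2
        return 1
    fi

    # Step 5: create and activate the virtual environment.
    python3 -m venv "${venv_dir}" || return 1
    . "${venv_dir}/bin/activate" || return 1

    # Steps 6 and 7: install dev dependencies, then the package in editable
    # mode. The subshell keeps the caller's working directory unchanged.
    (
        cd "${repo_dir}" || exit 1
        pip install -r requirements-dev.txt -c constraints-dev.txt &&
            pip install -e .
    )
}
```

A call might look like `setup_dev_env ~/code/great_expectations ~/envs/great_expectations_dev`, where both paths are placeholders. The environment stays activated in the calling shell afterwards.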
(Optional) Configure resources for testing and documentation#
Depending on which features of Great Expectations you want to work on, you may want to configure different backends for local testing, such as PostgreSQL and Spark. Also, there are a couple of extra steps if you want to build documentation locally.
If you want to develop against local PostgreSQL:#
- To simplify setup, the repository includes a `docker-compose` file that can stand up a local PostgreSQL container. To use it, you’ll need to have Docker installed.
- Navigate to `assets/docker/postgresql` in your `great_expectations` repo and run `docker-compose up -d`
- Within the same directory, you can run `docker-compose ps` to verify that the container is running. You should see something like:

      Name                     Command                         State   Ports
      --------------------------------------------------------------------------------------
      postgresql_travis_db_1   docker-entrypoint.sh postgres   Up      0.0.0.0:5432->5432/tcp

- Once you’re done testing, you can shut down your PostgreSQL container by running `docker-compose down` from the same directory.
- Caution: If another service is using port 5432, Docker may start the container but silently fail to set up the port. In that case, you will probably see errors like this:

      psycopg2.OperationalError: could not connect to server: Connection refused
          Is the server running on host "localhost" (::1) and accepting
          TCP/IP connections on port 5432?
      could not connect to server: Connection refused
          Is the server running on host "localhost" (127.0.0.1) and accepting
          TCP/IP connections on port 5432?

- Or this:

      sqlalchemy.exc.OperationalError: (psycopg2.OperationalError) FATAL: database "test_ci" does not exist
      (Background on this error at: http://sqlalche.me/e/e3q8)
If you want to develop against local MySQL:#
- To simplify setup, the repository includes a `docker-compose` file that can stand up a local MySQL container. To use it, you’ll need to have Docker installed.
- Navigate to `assets/docker/mysql` in your `great_expectations` repo and run `docker-compose up -d`
- Within the same directory, you can run `docker-compose ps` to verify that the container is running. You should see something like:

      Name               Command                       State   Ports
      ------------------------------------------------------------------------------------------
      mysql_mysql_db_1   docker-entrypoint.sh mysqld   Up      0.0.0.0:3306->3306/tcp, 33060/tcp

- Once you’re done testing, you can shut down your MySQL container by running `docker-compose down` from the same directory.
- Caution: If another service is using port 3306, Docker may start the container but silently fail to set up the port. 
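Because a port conflict can fail silently, it may help to check whether something is already listening on 5432 or 3306 before running `docker-compose up -d`. The sketch below is one possible way to do that. The function name `port_in_use` is hypothetical, and it relies on bash's `/dev/tcp` pseudo-device, so it will not work in other shells.

```shell
# Hypothetical helper (bash only): succeeds if something accepts TCP
# connections on the given port of the local machine.
port_in_use() {
    local port="$1"
    # bash treats /dev/tcp/HOST/PORT as a TCP connection attempt. Running it
    # in a subshell means the file descriptor is closed again right away.
    (exec 3<>"/dev/tcp/127.0.0.1/${port}") 2>/dev/null
}

# Check the default PostgreSQL and MySQL ports used by the compose files.
for port in 5432 3306; do
    if port_in_use "${port}"; then
        echo "Port ${port} is already in use; stop the other service first."
    fi
done
```

This only detects services reachable on 127.0.0.1, so treat a clean result as a hint rather than a guarantee.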
If you want to develop against local Spark:#
- In most cases, `pip install -r requirements-dev.txt` should set up pyspark for you.
- If you don’t have Java installed, you will probably need to install it and set your `PATH` or `JAVA_HOME` environment variables appropriately.
- You can find official installation instructions for Spark here. 
If you want to build documentation locally:#
- `pip install -r docs/requirements.txt`
- To build documentation, the command is `cd docs; make html`
- Documentation will be generated in `docs/build/html/` with `index.html` as the index page.
- Note: we use `autoapi` to generate API reference docs, but it’s not compatible with pandas 1.1.0. You’ll need to have pandas 1.0.5 (or a previous version) installed in order to successfully build docs.
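To check the pandas constraint before building, something like the following sketch could be used. Both function names are hypothetical, and the version comparison relies on `sort -V`, which is assumed to be available (it is part of GNU coreutils). The suggested `pip install 'pandas<=1.0.5'` is simply one way to satisfy the note above.

```shell
# Hypothetical helper: succeeds if version $1 is less than or equal to
# version $2. Relies on `sort -V` (version sort).
version_le() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Hypothetical helper: report whether the installed pandas is old enough
# for the docs build described above.
check_pandas_for_docs() {
    local installed
    installed="$(python -c 'import pandas; print(pandas.__version__)')" || return 1
    if version_le "${installed}" "1.0.5"; then
        echo "pandas ${installed} should work for building docs"
    else
        echo "pandas ${installed} is newer than 1.0.5; consider: pip install 'pandas<=1.0.5'" >&2
        return 1
    fi
}
```

Running `check_pandas_for_docs && (cd docs && make html)` would then build the docs only when the check passes.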
Run tests to confirm that everything is working#
- You can run all tests by running `pytest` in the root of the great_expectations directory. Please see Testing for testing options and details.
Start coding!#
At this point, you have everything you need to start coding!