Appendix A — Setting Up Git, Python and Jupyterlab

A.1 Getting Set Up

Follow the steps, in order, to set up your computer.

A.1.1 Opening the terminal

You may, at various points in time, need to open a terminal window. The terminal is a command-line interface to your computer.

On Windows: Open the Start menu and type “cmd”. Open the Command Prompt program.

On macOS: Press Cmd+Space to open Spotlight, or click the Launchpad (rocket ship) icon. Type “Terminal”, and open the Terminal program.

A.2 Install Git

If you already have Git installed, you may skip this step.

Git is the most widely used program for version control, that is, for tracking changes to code over time, and it is a standard tool among professional software developers. It should be installed on your computer before anything else. Follow the instructions below for your operating system.

Git is safe and open-source, and licensed under the GNU General Public License.

On Windows: Visit https://git-scm.com/install/windows and download the latest version of Git for Windows. Run the installer, and select all default options.

On macOS: The easiest option is to install Git via Homebrew. Open the Terminal app and complete the following two steps in order, accepting all default options.

  1. Install Homebrew via instructions at https://brew.sh/
  2. Install Git via the command brew install git

Other options for installation are available at https://git-scm.com/install/mac.
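Once Git is installed, and once Python is available later in this appendix, a short script can confirm that the git program is reachable from your terminal. The following is a minimal sketch using only the Python standard library; the function name find_git_version is hypothetical and chosen for illustration.

```python
import shutil
import subprocess


def find_git_version():
    """Return the output of `git --version`, or None if Git is not usable."""
    git_path = shutil.which("git")  # full path to the git executable, if any
    if git_path is None:
        return None
    result = subprocess.run(
        [git_path, "--version"], capture_output=True, text=True
    )
    if result.returncode != 0:  # git exists but did not run successfully
        return None
    return result.stdout.strip()


version = find_git_version()
if version is None:
    print("Git was not found; revisit the installation steps above.")
else:
    print(version)
```

If the script reports that Git was not found right after installing it, closing and reopening the terminal is worth trying first, since a terminal that was already open may not see newly installed programs.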

A.3 Install miniforge

The best way to organize data science code on your computer is with a package manager. Coding in Python involves installing lots of other people’s code (pandas, plotly, scikit-learn), and different projects might need different versions of this code, so you should always install Python and its libraries/packages through a package manager. The package manager of choice among many data scientists is conda, which installs packages from the community-maintained conda-forge channel; miniforge is a minimal installer that sets up conda with conda-forge as its default channel. Conda allows you to create different environments which have different packages installed depending on what you need for a particular data science task.

The miniforge installer is located at https://conda-forge.org/download/.

Follow the instructions below that correspond to your operating system.

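On Windows: Visit https://conda-forge.org/download/ and download the Windows installer to your computer. Run it, selecting all default settings. From now on, when you open the terminal, open the miniforge terminal (listed in the Start menu as Miniforge Prompt) rather than the Command Prompt.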

On macOS: Determine which macOS installer you need. Click the Apple menu in the top-left corner of the screen, and select About This Mac to see whether your Mac has an Apple silicon chip or an Intel processor.

(If you need additional help, follow instructions at this link: https://support.apple.com/en-us/109033.)

If your Mac has an Apple silicon chip: Visit https://conda-forge.org/download/ and download the ARM_64 installer to your computer. Run it, selecting all default settings.

If your Mac has an Intel processor: Visit https://conda-forge.org/download/ and download the x86_64 installer to your computer. Run it, selecting all default settings.
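If a Python interpreter already happens to be available on your computer, the processor type can also be checked programmatically. The sketch below maps the value reported by platform.machine() to the installer labels used above; the function name miniforge_installer_label is hypothetical. Treat About This Mac as the authoritative answer, since a Python running under Rosetta translation on an Apple silicon Mac can report x86_64.

```python
import platform


def miniforge_installer_label(machine):
    """Map a machine string (as from platform.machine()) to an installer label."""
    machine = machine.lower()
    if machine in ("arm64", "aarch64"):
        return "ARM_64"   # Apple silicon
    if machine in ("x86_64", "amd64"):
        return "x86_64"   # Intel
    return "unknown"


detected = platform.machine()
print(f"{platform.system()} {detected} -> {miniforge_installer_label(detected)}")
```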

A.4 Working with conda environments

Miniforge allows you to create environments to organize versions of code packages, in case you have multiple projects that use different versions of Python or of other packages. These environments are managed by a tool called conda. It is best practice to always code within an isolated environment, whether you use conda, Docker, or other tools. This way, you have full control over which versions of packages are used for different projects, and you can provide that information to others to make your code reproducible.

A.4.1 Create a Conda environment

Open your terminal. Decide on an environment name that is meaningful for your project, e.g. “data”, and stick to an all-lowercase name. Type the command below, replacing “envname” with that name. At the time this was written, Python 3.14 was the latest version of Python.

conda create -n envname python=3.14

This creates a fresh installation of Python 3.14 on your system inside the named environment “envname”, with no additional packages/libraries (pandas, plotly, etc.) installed yet.

A.4.2 Open your Conda environment

Do this every time you plan to edit code or install packages!

Type the following into your terminal to activate / open your Conda environment.

conda activate envname

You will know your conda environment is active if you see (envname) at the start of your terminal prompt.
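Besides looking at the prompt, you can ask Python itself which environment it is running in. The following is a minimal sketch that relies on the CONDA_DEFAULT_ENV environment variable, which conda sets when an environment is activated; the function name describe_environment is hypothetical.

```python
import os
import sys


def describe_environment(environ=os.environ):
    """Summarize which Python interpreter and conda environment are in use."""
    return {
        "python_version": f"{sys.version_info.major}.{sys.version_info.minor}",
        "executable": sys.executable,  # path of the running interpreter
        "conda_env": environ.get("CONDA_DEFAULT_ENV"),  # None outside conda
    }


for key, value in describe_environment().items():
    print(f"{key}: {value}")
```

If conda_env does not show the name you expect, or the executable path does not point inside your environment, run conda activate envname again before installing packages.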

A.4.3 Install Packages

Make sure your conda environment is open and active (see the previous section).

Most Python packages you might want to install can be installed directly from the terminal. To install a package and make it available to your code, type the following command, replacing packagename with the name of the package:

conda install -c conda-forge packagename

I recommend installing the following packages: jupyterlab, jupyterlab-git, numpy, pandas, polars, matplotlib, plotly, seaborn, and scikit-learn. You can replace packagename with each of these in turn, or list several in one command, e.g. conda install -c conda-forge package1 package2 package3.

A.4.3.2 Installing all of the above packages at once

The following command installs all of the recommended packages in one step:

conda install -c conda-forge jupyterlab jupyterlab-git numpy pandas polars matplotlib plotly seaborn scikit-learn
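After the installation finishes, a quick check from Python can confirm which of the packages are present and at which versions. The sketch below uses importlib.metadata from the standard library; the names RECOMMENDED and installed_versions are hypothetical. It assumes each conda package ships standard Python package metadata, which is usually but not always the case, so a package reported as missing may still be importable.

```python
from importlib.metadata import PackageNotFoundError, version

# Package names taken from the install command above.
RECOMMENDED = [
    "jupyterlab", "jupyterlab-git", "numpy", "pandas", "polars",
    "matplotlib", "plotly", "seaborn", "scikit-learn",
]


def installed_versions(packages):
    """Return a dict mapping each package name to its version, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


for name, ver in installed_versions(RECOMMENDED).items():
    print(f"{name:15} {ver or 'not installed'}")
```

Saving this output alongside a project is one lightweight way to record which package versions the code was written against.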

A.4.3.3 Installing with pip

In the future, you may find packages that cannot be installed from conda-forge but can be installed with the command pip install packagename. I recommend always attempting to install packages from conda-forge first, and installing with pip only if no conda package is available. If you do use pip, run it with your conda environment active so that the package is installed into that environment.

A.4.4 Open Jupyter Lab

Open your terminal and activate your conda environment with conda activate envname. Then type jupyter lab (two words, with a space). If everything has gone well, a Jupyter Lab session will open in your web browser. You can then create new notebooks by clicking the “New .ipynb” button.

A.4.5 Create Working Directory

Note: At this point, you should create a new, empty folder on your computer, ideally at Documents/Development or username/Development, to hold all of your code projects. Every new project should be placed in its own directory/folder inside it.
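The folder can be created in your file browser, or with a few lines of Python. The following is a minimal sketch using pathlib; the function name make_project_dir and the default location (a Development folder in your home directory) are assumptions for illustration. The demonstration at the end runs in a temporary directory so that nothing is left behind.

```python
import tempfile
from pathlib import Path


def make_project_dir(project_name, base=None):
    """Create base/project_name (plus any missing parents) and return its path."""
    if base is None:
        base = Path.home() / "Development"  # assumed location; adjust as needed
    project_dir = Path(base) / project_name
    project_dir.mkdir(parents=True, exist_ok=True)  # no error if it already exists
    return project_dir


# Demonstrate inside a throwaway directory rather than the real home folder.
with tempfile.TemporaryDirectory() as tmp:
    created = make_project_dir("data-projects", base=tmp)
    print(created.name, created.is_dir())
```

Calling make_project_dir("data-projects") without a base argument would create Development/data-projects under your home directory instead.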

A.5 Sharing your code on GitHub

GitHub (https://github.com/) is the gold-standard website where professionals and academics share their open-source code projects. GitHub is excellent for showing potential employers what you have done and for sharing code projects so that other people can use them. However, you should only download and run code from GitHub if you have reason to trust the author and/or you know what it does.

GitHub also enables you and other code authors to collaborate on a code project in a more standard manner than CoCalc – though this will not be discussed in this section.

A.5.1 Create a GitHub account

If you already have a GitHub account, skip this step. Navigate to https://github.com/ and click “Sign up.” Create a minimal account. Select a professional username: it is common to use your first initial and last name, or a similar pattern. Most amateur, hobbyist and academic programmers stick to the Free tier account.

A.5.2 Create a new repository

Create a repository on GitHub with a simple name like data-projects.

A.5.3 Upload code into this repository

If you want to share your code in this repository, you can upload any file or directory structure to it through the GitHub website. However, keep in mind:

  • Licensing. If you use someone else’s data or code, make sure you have permission to share it. You can get into legal trouble if you copy another person’s data / code without permission.

  • There are better ways to do this. If you plan to update and edit your code, or work on it with someone else, there are additional workflow steps you should follow.

The following steps are optional, for now.

A.5.4 Download GitHub Desktop

Download and install GitHub Desktop by following instructions at https://desktop.github.com/download/. Connect GitHub Desktop to your GitHub account by going to File > Options (on macOS, GitHub Desktop > Settings) and signing in under the Accounts tab.

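Once the repository has been cloned to your computer, you can edit its files as normal. Whenever you make substantial changes, you should create a new “commit.” Write a brief summary of the changes, and click “Commit to main.” A commit is saved only on your computer until you click “Push origin” to upload it to GitHub.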

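If you prefer, you can also run git commit -a -m "commit description" in the terminal from within JupyterLab. The -a flag includes changes only to files that Git already tracks; a new file must first be added with git add filename.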
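To see how the terminal commit workflow behaves without touching a real project, the sketch below drives Git from Python inside a throwaway repository. It shows a new file being added and committed, followed by a second commit made with the -a flag after the file is edited. The helper names git and demo_commit, the file name analysis.py, and the user name and email are all hypothetical, and the script assumes Git is installed and on your PATH.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path


def git(repo, *args):
    """Run a git command inside `repo` and return its standard output."""
    result = subprocess.run(
        ["git", *args], cwd=repo, capture_output=True, text=True, check=True
    )
    return result.stdout


def demo_commit():
    """Make two commits in a temporary repository; return the commit subjects."""
    with tempfile.TemporaryDirectory() as tmp:
        repo = Path(tmp)
        git(repo, "init")
        # Settings applied to this throwaway repository only.
        git(repo, "config", "user.name", "Example User")
        git(repo, "config", "user.email", "user@example.com")
        git(repo, "config", "commit.gpgsign", "false")

        # A new file is untracked, so it needs an explicit `git add`.
        (repo / "analysis.py").write_text("print('hello')\n")
        git(repo, "add", "analysis.py")
        git(repo, "commit", "-m", "Add analysis script")

        # The file is now tracked, so `-a` picks up the edit automatically.
        (repo / "analysis.py").write_text("print('hello, world')\n")
        git(repo, "commit", "-a", "-m", "Update greeting")

        # `git log` lists the newest commit first.
        return git(repo, "log", "--format=%s").splitlines()


if shutil.which("git"):
    for subject in demo_commit():
        print(subject)
```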