If you are attending one of our workshops, you can use Google Colab to run all the code. If you want to set up your own computer to run the analysis demonstrated on this course, you can follow the instructions below.
The data used in these materials is provided as a zip file. Download and unzip the folder to your Desktop to follow along with the materials.
Alternatively, you can use the link below to download the data from Google Drive:
How to use in Colab:
Open a new notebook in Google Colab
If you want to run the notebooks in Colab, you can also use the Open in Colab badge below:
Run the commands in code cells.
You can now create notebooks and run any of the scripts in Google Colab.
Repository link:
https://cambiotraining.github.io/ml-unsupervised/
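Upload the data files to your Google Drive (in a folder named data) and mount the drive in Colab. Open a new Google Colab notebook. Then create a new code cell, type the following commands, and click the play button to run the cell:
from google.colab import drive
import os
import pandas as pd
# mount Google Drive under /content/drive
drive.mount('/content/drive')
# change directory to where the data is stored
os.chdir('/content/drive/My Drive/data')
# load the sample data; as the last line of the cell, the table is displayed
pd.read_csv('diabetes_sample_data.csv')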
Colab will ask for permission to access your Google Drive when you run the drive.mount() command.
Follow the instructions in the output cell to complete the authentication process. You will need a Gmail account, or you can use your cam.ac.uk account. Some screenshots are shown below to guide you through the process:


If you have your files in a different directory, please change the path in the os.chdir() command above accordingly.
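If the path given to os.chdir() does not exist, Python raises a FileNotFoundError without saying what is actually in the parent folder. The following is a minimal sketch of a more defensive version. The helper name enter_data_dir is hypothetical and not part of the course materials; the Colab path in the final comment is the one used above.

```python
import os
from pathlib import Path


def enter_data_dir(path):
    """Change into `path` if it exists; otherwise report what was found instead.

    Hypothetical helper for illustration, not part of the course materials.
    """
    target = Path(path)
    if not target.is_dir():
        parent = target.parent
        # List the sibling entries so a misspelt folder name is easy to spot.
        siblings = sorted(p.name for p in parent.iterdir()) if parent.is_dir() else []
        raise FileNotFoundError(
            f"{target} is not a directory. Entries in {parent}: {siblings}"
        )
    os.chdir(target)
    return Path.cwd()


# In Colab, after drive.mount('/content/drive'), this could be called as:
# enter_data_dir('/content/drive/My Drive/data')
```

If the folder is missing, the error message lists the entries next to it, which usually makes a typo or a different upload location obvious.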
A template notebook is also available to get you started.
Google Colab comes with most of the required packages pre-installed.
If you need to install any additional packages, you can do so using the !pip install package-name command in a code cell.
Create a new folder named data in the My Drive folder. Download the data files and copy them into the data folder (if you want to access the files from Colab). Your directory structure should look like this:
My Drive/
└─ data/
(Optional) You can also follow the instructions below to set up from the browser.
Open a new Google Colab notebook.


Click the Run this cell button.
Create a new folder named data in the My Drive folder. Then upload the data files into the data folder.
Install Python (we recommend using Anaconda or Miniconda to manage your Python environment).
Install Visual Studio Code (see below) or Spyder.
Download the data folder from the link here or unzip the file here and save it on your computer.
Open a terminal in the location where you saved the data folder. Then change directory to the data folder:
cd data
Create a virtual environment
python3 -m venv .venv
Activate the virtual environment (use the first command on Windows, the second on macOS or Linux)
.venv\Scripts\activate
source .venv/bin/activate
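To confirm that the activation worked, the interpreter itself can be asked whether it is running inside a virtual environment. This is a minimal sketch using only the standard library; the function name in_virtual_environment is hypothetical. It applies to environments created with venv, and a conda environment is not detected this way.

```python
import sys


def in_virtual_environment():
    """Return True when the running interpreter belongs to a venv."""
    # Inside a venv, sys.prefix points at the environment, while
    # sys.base_prefix still points at the Python installation it was created from.
    return sys.prefix != sys.base_prefix


print("Interpreter:", sys.executable)
print("Virtual environment active:", in_virtual_environment())
```

Run it with the same python command you intend to use for the course; the interpreter path should point inside the .venv folder when the environment is active.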
Install required Python packages
pip install numpy pandas scikit-learn seaborn matplotlib scanpy pca
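After the installation finishes, it can be useful to check which of the packages are actually present in the active environment. The sketch below uses importlib.metadata from the standard library and the package names from the pip command above; the function name installed_versions is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names exactly as passed to pip in the command above.
REQUIRED = ["numpy", "pandas", "scikit-learn", "seaborn", "matplotlib", "scanpy", "pca"]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


for name, ver in installed_versions(REQUIRED).items():
    print(f"{name:15s} {ver if ver is not None else 'NOT INSTALLED'}")
```

Any package reported as not installed can be added by re-running pip install for that name inside the activated environment.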
Check that your data files are in place (in a folder named data) and that you have installed the required packages (see above). Your directory structure should look like this:
data/
└─
import os
# where are we?
print(os.getcwd())
# change directory to where the data is stored
os.chdir('data/')
# where are we now?
print(os.getcwd())
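Changing the working directory affects every relative path used later in the session. As an alternative, paths can be built explicitly and the working directory left alone. This is a minimal sketch, assuming the data sits in a folder named data below the current directory; the helper name list_data_files is hypothetical.

```python
from pathlib import Path


def list_data_files(data_dir="data", pattern="*.csv"):
    """Return the files in `data_dir` that match `pattern`, sorted by name."""
    folder = Path(data_dir)
    if not folder.is_dir():
        raise FileNotFoundError(
            f"No folder {folder.resolve()}; check where the data was unzipped"
        )
    return sorted(folder.glob(pattern))


# Only list files when the folder is present, so the cell is safe to run anywhere.
if Path("data").is_dir():
    for path in list_data_files():
        print(path)
```

Each returned path can be passed directly to pd.read_csv(), which accepts path objects as well as strings.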