In this article, I walk through how to download a Kaggle dataset directly into Google Colab. Pulling the data straight from Kaggle makes it available in your Colab notebook for analysis, machine learning, or data engineering tasks.
Prerequisites
- A Google Colab notebook
- A Kaggle account and a dataset to download
Steps:
1. Choose Dataset
Pick the dataset you want to import into Colab. I will be using Reviews for Hotels Worldwide (Booking.com).
2. API Token
Kaggle requires authentication before it lets you download a dataset, so you first need an API token.
You can generate this token from your Kaggle account's profile page:
Select Account -> find the API section -> click Create New API Token
A file named kaggle.json will be downloaded to your local machine.
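Before uploading the token, it can help to confirm the file is intact. The sketch below assumes kaggle.json is a small JSON document with `username` and `key` fields, which is the layout Kaggle uses at the time of writing. The helper name `load_kaggle_token` is my own and is not part of any Kaggle library.

```python
import json
from pathlib import Path


def load_kaggle_token(path="kaggle.json"):
    """Read a Kaggle API token file and return (username, key).

    Hypothetical helper: raises ValueError if either expected field
    is missing or empty.
    """
    data = json.loads(Path(path).read_text())
    missing = [field for field in ("username", "key") if not data.get(field)]
    if missing:
        raise ValueError(f"kaggle.json is missing: {', '.join(missing)}")
    return data["username"], data["key"]
```

Treat the key like a password: do not print it in a notebook you plan to share.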
3. Colab Notebook
Open your Colab notebook and upload the kaggle.json file you downloaded in step 2.
Then install the kaggle library, create the .kaggle directory, copy kaggle.json into it, and restrict the file's permissions so only you can read it.
! pip install kaggle
! mkdir -p ~/.kaggle
! cp kaggle.json ~/.kaggle/
! chmod 600 ~/.kaggle/kaggle.json
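The same setup can be done in plain Python if you would rather not use shell commands. This is a minimal sketch, assuming kaggle.json sits in the notebook's working directory. The function name `install_kaggle_token` and its `home` parameter are hypothetical and exist only to make the example easy to test.

```python
import os
import shutil
from pathlib import Path


def install_kaggle_token(token_path="kaggle.json", home=None):
    """Copy the token into <home>/.kaggle/ and make it owner-readable only."""
    home = Path(home) if home else Path.home()
    config_dir = home / ".kaggle"
    # exist_ok=True lets the cell be re-run without failing
    config_dir.mkdir(parents=True, exist_ok=True)
    target = config_dir / "kaggle.json"
    shutil.copy(token_path, target)
    # Same effect as `chmod 600`: read/write for the owner, nothing for others
    os.chmod(target, 0o600)
    return target
```

Calling `install_kaggle_token()` with no arguments mirrors the mkdir, cp, and chmod commands above.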
4. Download data
Now you just need to download the dataset. There are two types of datasets:
- Competitions
- Datasets
Downloading a competition dataset:
! kaggle competitions download «name-of-competition»
Downloading a dataset (the name takes the form owner/dataset-name, as it appears in the dataset's URL):
! kaggle datasets download «name-of-dataset»
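If you download several datasets in one notebook, the two commands above can be assembled from Python and run with `subprocess`. This is only a sketch: the function names are my own, the `-p` flag (download directory) is the one extra option I rely on, and the names in the usage comment are placeholders rather than real datasets.

```python
import subprocess


def kaggle_download_command(kind, name, dest="."):
    """Build the kaggle CLI argument list for a competition or a dataset."""
    if kind == "competition":
        return ["kaggle", "competitions", "download", name, "-p", dest]
    if kind == "dataset":
        return ["kaggle", "datasets", "download", name, "-p", dest]
    raise ValueError(f"kind must be 'competition' or 'dataset', got {kind!r}")


def kaggle_download(kind, name, dest="."):
    """Run the download; raises CalledProcessError if the CLI reports failure."""
    subprocess.run(kaggle_download_command(kind, name, dest), check=True)


# Hypothetical usage (placeholder names):
# kaggle_download("dataset", "some-owner/some-dataset", dest="data")
```

For competitions, Kaggle usually requires that you accept the competition rules on the website before the download succeeds.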
If the dataset is downloaded as a .zip archive, you can extract it with the Linux unzip command:
! unzip «name-of-file»
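Extraction can also be done from Python with the standard-library `zipfile` module, which is handy when you want the list of extracted files in a variable. A minimal sketch; the function name `extract_archive` and the default `data` directory are assumptions for illustration.

```python
import zipfile
from pathlib import Path


def extract_archive(zip_path, dest="data"):
    """Extract every member of zip_path into dest and return the member names."""
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
        return sorted(archive.namelist())
```

The returned names can be passed straight to a reader such as `pandas.read_csv` once you know which file you need.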