Dragon Arrow written by Tatsuya Nakaji, all rights reserved

Divide data into train and validation in PyTorch

updated on 2020-03-01



Folder


The data folder is structured as shown below.

This is just an example for an animal image classifier.

root/
 ├ train/
 │ ├ horse/
 │ │  ├ 8537.png
 │ │  └ ...
 │ ├ butterfly/
 │ │  ├ 2857.png
 │ │  └ ...
 ├ test/
 │ ├ horse/
 │ │  ├ 8536.png
 │ │  └ ...
 │ ├ butterfly/
 │ │  ├ 2856.png
 │ │  └ ...
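
With this layout, torchvision's ImageFolder treats each subfolder of its root as one class and numbers the classes in sorted order of the folder names (it exposes the result as class_to_idx). The sketch below imitates that mapping with the standard library only. The helper name find_class_to_idx and the temporary empty folders are made up for illustration; this is not ImageFolder's actual code.

```python
import os
import tempfile


def find_class_to_idx(directory):
    """Hypothetical helper: map each class subfolder to an integer label, in sorted order."""
    classes = sorted(entry.name for entry in os.scandir(directory) if entry.is_dir())
    return {name: index for index, name in enumerate(classes)}


# Build a throwaway copy of the example layout (empty folders, no images)
with tempfile.TemporaryDirectory() as root:
    for class_name in ("horse", "butterfly"):
        os.makedirs(os.path.join(root, "train", class_name))
    class_to_idx = find_class_to_idx(os.path.join(root, "train"))

print(class_to_idx)  # {'butterfly': 0, 'horse': 1}
```

So in this example, a target of 0 would mean butterfly and 1 would mean horse.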


How to divide the data


Split the data into train (0.8) and validation (0.2) sets, stratified by target class so that both sets keep the same class proportions.
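
To make "stratified" concrete, here is a minimal standard-library sketch of the idea: group the sample indices by class, then move the same fraction of every class into the validation set. The helper stratified_split and the toy labels are made up for illustration, and scikit-learn's train_test_split may allocate samples differently in edge cases.

```python
import random
from collections import defaultdict


def stratified_split(targets, valid_fraction=0.2, seed=0):
    """Hypothetical helper: split indices so each class keeps the same ratio."""
    rng = random.Random(seed)

    # Group sample indices by their class label
    by_class = defaultdict(list)
    for index, label in enumerate(targets):
        by_class[label].append(index)

    train_idx, valid_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_valid = round(len(indices) * valid_fraction)
        valid_idx.extend(indices[:n_valid])
        train_idx.extend(indices[n_valid:])
    return train_idx, valid_idx


# Toy labels (made up): 80 images of class 0 and 20 images of class 1
toy_targets = [0] * 80 + [1] * 20
toy_train_idx, toy_valid_idx = stratified_split(toy_targets)

print(len(toy_train_idx), len(toy_valid_idx))
```

Both toy splits keep the original 80/20 class balance, which a plain random split does not guarantee.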


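# load libraries
import numpy as np
import torch
import torchvision
from sklearn.model_selection import train_test_split
from torch.utils.data import SubsetRandomSampler
from torchvision import datasets, transforms

# transform: convert to a tensor, then scale each RGB channel from [0, 1] to [-1, 1]
transform = transforms.Compose(
    [transforms.ToTensor(),
     transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])

# ImageFolder: each subfolder of ./train is one class
trainset = datasets.ImageFolder(root='./train',
                                transform=transform)

# target array: one class index per image
targets = trainset.targets

# stratified split for validation
train_idx, valid_idx = train_test_split(
    np.arange(len(targets)),
    test_size=0.2,
    shuffle=True,
    stratify=targets)

# samplers that draw only from the train / validation indices
train_sampler = SubsetRandomSampler(train_idx)
valid_sampler = SubsetRandomSampler(valid_idx)

# both loaders read from the same dataset; the samplers keep the two subsets apart
trainloader = torch.utils.data.DataLoader(trainset, batch_size=4, sampler=train_sampler, num_workers=2)
validloader = torch.utils.data.DataLoader(trainset, batch_size=4, sampler=valid_sampler, num_workers=2)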


Now you have train and validation loaders built from a stratified split!
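
If you want to double-check the split, one option is to compare the class ratios of both index sets. The sketch below uses a made-up helper, class_ratios, and small stand-in lists in place of the real targets, train_idx and valid_idx from the code above, so it runs without any images.

```python
from collections import Counter


def class_ratios(targets, indices):
    """Hypothetical helper: fraction of each class among the given indices."""
    counts = Counter(targets[i] for i in indices)
    total = sum(counts.values())
    return {label: count / total for label, count in sorted(counts.items())}


# Made-up stand-ins for targets, train_idx and valid_idx
example_targets = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
example_train_idx = [0, 1, 2, 3, 5, 6, 7, 8]
example_valid_idx = [4, 9]

train_ratios = class_ratios(example_targets, example_train_idx)
valid_ratios = class_ratios(example_targets, example_valid_idx)
print(train_ratios)  # {0: 0.5, 1: 0.5}
print(valid_ratios)  # {0: 0.5, 1: 0.5}
```

With the real data you would pass targets together with train_idx or valid_idx; the two dictionaries should be close to each other if the split is stratified.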