Projects

Useful Information:

This page will contain details regarding the Course Projects.

Students are encouraged to select a topic and work on their own projects. However, it is highly recommended that students talk to members of various labs at IISc who are experienced in the topic they have selected.

Projects can be broadly classified into two kinds:

Application
: when deep learning is used for image processing in other research fields. For example, medical tool segmentation in operation videos.
Improving deep learning architectures
: when interesting new ideas are introduced to pre-existing models to improve their performance. For example, exploring the benefits of higher-dimensional convolution in a traditional classification setup.

As the course is on using Deep Learning for Computer Vision, the project must use visual data in the form of pixels, etc.

Resources

Some resources for selecting projects:

Awesome Deep Vision
CVPR: IEEE Conference on Computer Vision and Pattern Recognition
ICCV: International Conference on Computer Vision
ECCV: European Conference on Computer Vision
NIPS: Neural Information Processing Systems
ICLR: International Conference on Learning Representations
Kaggle challenges: An online machine learning competition website. For example, a Yelp classification challenge.

For models, ConvNets have been successfully used in a variety of computer vision tasks. This type of project would involve understanding the state-of-the-art vision models, and building new models or improving existing models for a vision task. The list below presents some papers on recent advances of ConvNets in the computer vision community.

Object recognition: [Krizhevsky et al.], [Russakovsky et al.], [Szegedy et al.], [Simonyan et al.], [He et al.]
Object detection: [Girshick et al.], [Sermanet et al.], [Erhan et al.]
Image segmentation: [Long et al.]
Video classification: [Karpathy et al.], [Simonyan and Zisserman]
Scene classification: [Zhou et al.]
Face recognition: [Taigman et al.]
Depth estimation: [Eigen et al.]
Image-to-sentence generation: [Karpathy and Fei-Fei], [Donahue et al.], [Vinyals et al.]
Visualization and optimization: [Szegedy et al.], [Nguyen et al.], [Zeiler and Fergus], [Goodfellow et al.], [Schaul et al.]

We also provide a list of popular computer vision datasets:

Meta Pointer: A large collection organized by CV Datasets.
Yet another Meta pointer
ImageNet: a large-scale image dataset for visual recognition organized by WordNet hierarchy
SUN Database: a benchmark for scene recognition and object detection with annotated scene categories and segmented objects
Places Database: a scene-centric database with 205 scene categories and 2.5 million labelled images
NYU Depth Dataset v2: an RGB-D dataset of segmented indoor scenes
Microsoft COCO: a new benchmark for image recognition, segmentation and captioning
Flickr100M: 100 million Creative Commons Flickr images
Labeled Faces in the Wild: a dataset of 13,000 labeled face photographs
Human Pose Dataset: a benchmark for articulated human pose estimation
YouTube Faces DB: a face video dataset for unconstrained face recognition in videos
UCF101: an action recognition dataset of realistic action videos with 101 action categories
HMDB-51: a large human motion dataset of 51 action classes

Grading Policy

This information will be added soon.

Some Example Projects

Projects at VAL by the current PhD students will be added to this Google Doc. Link to Google Doc

Projects in a similar course at Stanford have been listed here. These can help guide you.

Acknowledgement

Material from CS231n has been used for preparing this page, as well as the basic course material.