This page will contain details regarding the Course Projects.
Students are encouraged to select a topic and work on their own projects. However, it is highly recommended that students talk to members of various labs at IISc who are experienced in the topic they have selected.
Projects can be broadly classified into two kinds:
- Application: Deep learning is applied to image processing problems in other research fields. For example, segmenting medical tools in videos of operations.
- Improving Deep learning architectures: interesting new ideas are introduced into existing models to improve their performance. For example, exploring the benefits of higher-dimensional convolutions in a traditional classification setup.
As the course is on using Deep Learning for Computer Vision, the project must involve visual data, in the form of pixels etc.
Some resources for selecting projects:
- Awesome Deep Vision
- CVPR: IEEE Conference on Computer Vision and Pattern Recognition
- ICCV: International Conference on Computer Vision
- ECCV: European Conference on Computer Vision
- NIPS: Neural Information Processing Systems
- ICLR: International Conference on Learning Representations
- Kaggle challenges: An online machine learning competition website. For example, a Yelp classification challenge.
For models, ConvNets have been successfully used in a variety of computer vision tasks. Projects of this type involve understanding state-of-the-art vision models, and building new models or improving existing ones for a vision task. The list below presents some papers on recent advances in ConvNets from the computer vision community.
- Object recognition: [Krizhevsky et al.], [Russakovsky et al.], [Szegedy et al.], [Simonyan et al.], [He et al.]
- Object detection: [Girshick et al.], [Sermanet et al.], [Erhan et al.]
- Image segmentation: [Long et al.]
- Video classification: [Karpathy et al.], [Simonyan and Zisserman]
- Scene classification: [Zhou et al.]
- Face recognition: [Taigman et al.]
- Depth estimation: [Eigen et al.]
- Image-to-sentence generation: [Karpathy and Fei-Fei], [Donahue et al.], [Vinyals et al.]
- Visualization and optimization: [Szegedy et al.], [Nguyen et al.], [Zeiler and Fergus], [Goodfellow et al.], [Schaul et al.]
We also provide a list of popular computer vision datasets:
- Meta Pointer: A large collection organized by CV Datasets.
- Yet another Meta pointer
- ImageNet: a large-scale image dataset for visual recognition organized by WordNet hierarchy
- SUN Database: a benchmark for scene recognition and object detection with annotated scene categories and segmented objects
- Places Database: a scene-centric database with 205 scene categories and 2.5 million labelled images
- NYU Depth Dataset v2: an RGB-D dataset of segmented indoor scenes
- Microsoft COCO: a new benchmark for image recognition, segmentation and captioning
- Flickr100M: 100 million creative commons Flickr images
- Labeled Faces in the Wild: a dataset of 13,000 labeled face photographs
- Human Pose Dataset: a benchmark for articulated human pose estimation
- YouTube Faces DB: a face video dataset for unconstrained face recognition in videos
- UCF101: an action recognition data set of realistic action videos with 101 action categories
- HMDB-51: a large human motion dataset of 51 action classes
This information will be added soon.
Some Example Projects
Projects at VAL by the current PhD students will be added to this Google Doc. Link to Google Doc
Projects from a similar course at Stanford have been listed here. These can help guide you.
Material from CS231n has been used for preparing this page, as well as the basic course material.