CSE 527: Introduction to Computer Vision
Location and time: Mon, Fri 1:00 - 2:20, Room: ENGINEERING 143
In this course students will gain knowledge of the theory and practice of Computer Vision, and by the end will be able to implement working solutions to real-world problems in image and video analysis. Students will get hands-on experience both in deriving the mathematical underpinnings and in implementing solutions to classical vision problems such as image classification, object detection and tracking, segmentation, pose estimation, visual scene reasoning (e.g. captioning, question answering) and more. Students will use deep neural networks of many types (convolutional, residual, recurrent, variational, adversarial, etc.), learn how to use cloud[1] GPU-enabled virtual machines, evaluate their implementations on standard vision datasets, and compare their results to state-of-the-art work from computer vision laboratories worldwide. This is an intensive hands-on class with weekly theoretical and practical programming assignments, which will give students the confidence to tackle computer vision problems in the wild as professionals.
[1] We will use the Google Cloud Platform, courtesy of Google Cloud Education.
|Introduction to computer vision, prereqs review|
|2/1||1-2||Human vision, optics, light, color, image formation||Hello Vision World|
|Images as functions, Convolutions, Filters, Edges|
|2/8||2-2||Interest points and corners, Features|
|2/11||3-1||Shapes, Models, Image matching and alignment||Image processing|
|Object detection: Templates, Haar-like features, HoG|
|2/18||4-1||Object tracking: Mean-shift, Kalman filter||Detection & Tracking|
|Stereo & 3D||MVG Intro, Epipolar geometry, Stereo|
|2/25||5-1||Camera calibration, Stereo reconstruction|
|3/1||5-2||Structure-from-Motion, Multi-View Stereo, Visual odometry||SfM & MVS|
|3/4||6-1||Machine learning||Introduction, Linear classifiers, Logistic regression, Probabilistic inference|
|3/8||6-2||SVMs, Boosting, Bag-of-Words, Decision trees, Statistical analysis of classifiers||Hello Machine Learning|
|Biological and Artificial neural networks, Convolutional neural networks, TensorFlow|
|3/15||7-2||Autoencoders, VAE, ML matters: losses, metrics, training, overfitting, regularization||Hello CNNs and AEs|
|3/18 - 3/24 Spring break|
|3/29||8-2||Deep Learning 1||Conv-Pool nets, Binary vs. Multiclass, SoftMax, X-Entropy, ImageNet, AlexNet|
|4/1||9-1||Region proposals, SSDs, multi-task losses, YOLO and R-CNNs||Deep object detection|
|Deep Learning 2||Visual questions and dialogs, Captioning, RNNs & Intro to NLP, BLEU|
|4/8||10-1||(Visual) Attention, Image2Text, Text2Image, pix2code|
|4/12||10-2||Video action recognition, Sign language translation||Image/Video Captioning|
|4/15||11-1||Deep Learning 3||Hourglass nets, Fully-Convolutional nets (FCN), UNet, SegNet|
|4/19||11-2||Residual blocks, Skip connections, SkipUNet, FRRN|
|4/22||12-1||Atrous convolution blocks, CRF, DeepLab nets, Mask-RCNN||Semantic segmentation|
|Deep Learning 4||Hacking & Visualizing ConvNets, Gradient ascent, DeepDream, Visual style transfer|
|4/29||13-1||Generative Adversarial Nets (GANs), DCGAN, VAEGAN, WGAN, CGAN, CycleGAN|
|5/3||13-2||Transfer learning, auxiliary training, bootstrapping, fine-tuning||Generative vision|
|5/6||14-1||Review and prep||TBD|
|5/10||14-2||Class review, final prep|
Piazza, Emailing, Appointments, TAs
We will be using Piazza for class discussion. Rather than emailing questions to the teaching staff, I encourage you to post your questions on Piazza so everyone may benefit from the answers.
Our class page is https://piazza.com/stonybrook/spring2019/cse527/home
Our TAs for this semester: TBD
My office hours are Mon and Fri, 2:30-4:00 PM (after class), in room 145 of the New CS building.
Recommended Textbooks and Online Material
Computer Vision: Models, Learning, and Inference
by Dr Simon J. D. Prince
Link: Amazon, Download: http://www.computervisionmodels.com/
My notes: A good reference, but a bit outdated. Contains a good primer on the relevant math.
Computer Vision: Algorithms and Applications
by Richard Szeliski
Link: Amazon, Download: http://szeliski.org/Book/
My notes: Many algorithms, mathematical underpinnings and citations, but little intuition. Predates deep learning.
Feature Extraction and Image Processing for Computer Vision, Third Edition
Hands-On Machine Learning with Scikit-Learn and TensorFlow
by Aurélien Géron
My notes: Adequate coverage of scikit-learn and an old version of TensorFlow (0.9~1.0).
Deep Learning with Keras
by Antonio Gulli, Sujit Pal
Deep Learning with Python
Machine learning, data science books: https://github.com/faizalazman/Data-Science-Books
- Solid programming experience required: In this course you will program complete computer vision systems, from reading dataset files and processing pixels through coding an optimization problem. You should be comfortable picking up a new API (with some help). We will be working strictly in Python3 (OpenCV, NumPy, SciKit, TensorFlow/Keras, Jupyter). If you are uncomfortable in Python - please contact me.
- Mathematics -- relevant knowledge required: This course relies on established knowledge of computer-science-oriented mathematics. You are assumed to have taken undergraduate-level courses in the following: Calculus, Linear Algebra, Probability Theory, and Algorithms.
- Machine learning experience very useful (but not required): Coding projects will usually have a machine learning component to them. It would be beneficial if you are familiar with basic ML pipelines (feature design, training, testing) and concepts (supervision, classification, regression, clustering, confusion matrices, precision-recall, ROC, AUC, F1 score, loss/gain functions, etc.)
If you do not have a solid programming[2] background or a solid understanding of linear algebra[3] and algorithms[4], you should not take this course.
If you do not have any experience in machine learning:
- Take an online course, for example:
- Pick up a book: http://amzn.to/2uqRlxy, http://amzn.to/2t3vOY2
- Good list of resources for machine learning oriented linear algebra and other maths: https://machinelearningmastery.com/resources-for-linear-algebra-in-machine-learning/
- Be prepared to do a lot of self-learning.
- You must be comfortable with learning a sizable set of new tools and processes quickly, and putting them to work right away. In other words, you are expected to be a Hacker :)
- If you have a concern regarding your prior knowledge - contact me.
[2] "Solid programming" means you are capable of writing, documenting, running, debugging and analyzing the complexity of a Python program for a non-trivial task.
[3] "Solid linear algebra" means you have a full grasp of concepts such as vectors, matrices, bases, operations, normalization, dot and cross products, factorizations, etc. and are able to derive equations by hand.
[4] "Solid algorithms" means you have a full grasp of elementary data structures (lists, sets, vectors, trees, stacks, queues, hash tables), sorting, graph theory, combinatorics, dynamic programming and complexity analysis.
Grades in this class are determined as follows:
- Attendance: A record of attendance will be kept in class. More than 5 absences (~20%) without just and confirmed cause[5] will result in a failing grade for the entire course. Grade component: 10%.
- Assignments: Students are expected to complete all assignments. A missing assignment receives a failing grade. More than 3 missing assignments over the semester will result in a failing grade for the entire course. Grade component: 35%.
- Late submission: Assignments will be accepted only up to 48 hours past the deadline. Each hour past the deadline incurs a 1.5% grade penalty.
- Mid-term: Students are expected to complete the mid-term. Missing the mid-term will not by itself result in a failing grade for the entire course. Grade component: 25%.
- Final: Students are expected to complete the final in order to pass the class. A passing grade for the course is still possible after failing or missing the final, but it is extremely unlikely. Grade component: 30%.
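As a rough illustration of how the late penalty and the grade components above combine, here is a minimal Python sketch. It is not the official grading script. It assumes that all scores are on a 0-100 scale, that the 1.5% per hour is deducted as percentage points, and that every started hour counts as a full hour; none of these details are specified above. The function names `late_adjusted_score` and `course_grade` are hypothetical, and the automatic-fail rules for absences and missing assignments are not modeled.

```python
import math

# Grade components from the list above (percent of the course grade).
WEIGHTS = {"attendance": 10, "assignments": 35, "midterm": 25, "final": 30}

MAX_LATE_HOURS = 48      # submissions later than this are not accepted
PENALTY_PER_HOUR = 1.5   # percent deducted per hour past the deadline


def late_adjusted_score(score, hours_late):
    """Return an assignment score (0-100) after the late penalty.

    Assumptions (not stated in the syllabus): the penalty is taken off
    in percentage points, and a started hour counts as a full hour.
    """
    if hours_late <= 0:
        return score
    if hours_late > MAX_LATE_HOURS:
        return 0.0  # past the 48-hour window: treated as missing
    penalty = PENALTY_PER_HOUR * math.ceil(hours_late)
    return max(0.0, score - penalty)


def course_grade(scores):
    """Weighted average of the four component scores (each 0-100)."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


if __name__ == "__main__":
    # An assignment scored 90 and submitted 10 hours late.
    print(late_adjusted_score(90, 10))
    # Overall grade for one hypothetical set of component scores.
    print(course_grade(
        {"attendance": 100, "assignments": 80, "midterm": 70, "final": 90}
    ))
```

Under these assumptions, a submission at the 48-hour limit loses 72 points, so even a short delay is costly.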
[5] See the university policies regarding absences: https://www.stonybrook.edu/commcms/provost/faculty/handbook/academic_policies/minimal_instructional_and_student_responsibilities.php
The following are classes of similar nature in which you may find useful information:
- Last year's CSE 527: http://hi.cs.stonybrook.edu/teaching/cse527-fall17
- Brown: https://cs.brown.edu/courses/csci1430/
- Georgia Tech: http://www.cc.gatech.edu/~hays/compvision/
- Udacity: https://www.udacity.com/course/introduction-to-computer-vision--ud810
- UNC: http://www.cs.unc.edu/~lazebnik/spring11/
- Stanford: http://vision.stanford.edu/teaching/cs131_fall1617/
- Cornell: http://www.via.cornell.edu/ece547/
- U Rochester: https://www.cs.rochester.edu/~cxu22/t/577F16/syllabus.pdf
- U of Toronto: http://www.cs.toronto.edu/~kyros/courses/2503/syllabus.html
- Columbia: http://www.cs.columbia.edu/~areiter/CS_Webpage/COMS4731.html
- Illinois: http://slazebni.cs.illinois.edu/spring16/
- Duke: https://www.cs.duke.edu/courses/fall12/compsci527/syllabus.html
- MIT: http://6.869.csail.mit.edu/fa15/
Class Policy Statements
Disruptive Behavior Zero-Tolerance Policy:
We will have absolutely NO abusive, demeaning, derogatory, offensive, defamatory, slanderous or rude communication in or out of the classroom, or in any of the online channels of communication. There will be zero tolerance towards such behavior, and any such behavior that is observed will be reported immediately.
Stony Brook University expects students to maintain standards of personal integrity that are in harmony with the educational goals of the institution; to observe national, state, and local laws and University regulations; and to respect the rights, privileges, and property of other people.
See the university policy on disruptive behavior: https://www.stonybrook.edu/commcms/provost/faculty/handbook/academic_policies/responding_to_distressed_disruptive_students
Disability Support Services (DSS) Statement:
If you have a physical, psychological, medical or learning disability that may impact your course work, please contact the Student Accessibility Support Center, ECC (Educational Communications Center) Building, Room 128, (631)632-6748. They will determine with you what accommodations, if any, are necessary and appropriate. All information and documentation is confidential.
Students who require assistance during emergency evacuation are encouraged to discuss their needs with their professors and the Student Accessibility Support Center. For procedures and information go to the following website: http://www.stonybrook.edu/ehs/fire/disabilities.
Academic Integrity Statement:
Each student must pursue his or her academic goals honestly and be personally accountable for all submitted work. Representing another person's work as your own is always wrong. Faculty is required to report any suspected instances of academic dishonesty to the Academic Judiciary. Faculty in the Health Sciences Center (School of Health Technology & Management, Nursing, Social Welfare, Dental Medicine) and School of Medicine are required to follow their school-specific procedures. For more comprehensive information on academic integrity, including categories of academic dishonesty please refer to the academic judiciary website at http://www.stonybrook.edu/commcms/academic_integrity/index.html
Critical Incident Management Statement:
Stony Brook University expects students to respect the rights, privileges, and property of other people. Faculty are required to report to the Office of University Community Standards any disruptive behavior that interrupts their ability to teach, compromises the safety of the learning environment, or inhibits students' ability to learn. Faculty in the HSC Schools and the School of Medicine are required to follow their school-specific procedures. Further information about most academic matters can be found in the Undergraduate Bulletin, the Undergraduate Class Schedule, and the Faculty-Employee Handbook.
* Header image: https://flic.kr/p/WvXsZX