Computer Vision Applications, Fall 2022

Aug 29, 2022

Details

Course: COMP 388-002 / COMP 488-002 Computer Science Topics
Format: Seminar
Level: Undergraduate and Graduate
Instructor: Daniel Moreira (dmoreira1@luc.edu)

Lectures: MON, 4:15 to 6:45 PM, 117 Cuneo Hall
Office Hours: TUE and THU, 5:00 to 7:00 PM, by appointment
Sakai: https://sakai.luc.edu/x/4tCa9j

Grades are now available.

Overview

How might Google or TinEye reverse image search operate? How can a computer program process the pixel values of images and video frames to classify the depicted scene, or leverage the captured faces to perform person identification? What about images manipulated with tools such as Photoshop? Are there methods that help expose these manipulations? These are some of the questions we will address in this course. We will focus on state-of-the-art Computer Vision (CV) solutions that reduce the semantic gap between pixel values and the desired outcome of complex tasks such as content-based image retrieval, content classification and recognition, biometric identification, and media forensics, always with the greater good in mind.

Prerequisites for this course are basic programming skills (especially in Python) and familiarity with statistics and probability.

Schedule

Date	Topic	Leaders	References	Assignment
08/29	Introduction to CV	Instructor	N.A.	N.A.
09/05	Labor Day	N.A.	N.A.	A01, due on 09/15
09/12	Letter Soup: AI, ML, NN, DL, etc.	Instructor	[1, 2, 3]	A02, due on 09/20
09/19	Image Description	Instructor	[4, 5, 6]	A03, due on 09/27
09/26	Image Retrieval	Nick and Jesus	[7, 8, 9, 10]	A04, due on 10/04
10/03	Image Classification	Nick and Kenneth	[11, 12, 13, 14, 15]	A05, due on 10/18
10/10	Fall Break	N.A.	N.A.	N.A.
10/17	Object Detection	John and Kenneth	[16, 17, 18, 19, 20]	A06, due on 10/25
10/24	Image Segmentation	Mujtaba and Matt	[21, 22, 23, 24, 25]	A07, due on 11/01
10/31	Face Detection	John and Amol	[26, 27, 28, 29]	A08, due on 11/08
11/07	Face Recognition	Mujtaba and Amol	[30, 31, 32, 33, 34]	A09, due on 11/15
11/14	Generative Adversarial Nets	Jakob and Matt	[35, 36, 37, 38, 39]	A10, due on 11/29
11/21	Attacks & Deep Fake Detection	Instructor	[40, 41, 42]	N.A.
11/28	Sensitive Video Analysis	Instructor	[43, 44]	N.A.
12/05	Provenance Analysis	Jakob and Jesus	[45, 46, 47, 48, 49]	N.A.
12/12	Final Exam	N.A.	N.A.	N.A.

Assignments

~~A01: Image Descriptors, [4, 5, 6], due on 09/15 at noon.~~
~~A02: Image Retrieval, [8, 9, 10], due on 09/20 at noon.~~
~~A03: Image Classification, [11, 12, 13, 14, 15], due on 09/27 at noon.~~
~~A04: Object Detection, [16, 17, 18, 19, 20], due on 10/04 at noon.~~
~~A05: Image Segmentation, [21, 22, 23, 24, 25], due on 10/18 at noon.~~
~~A06: Face Detection, [26, 27, 28, 29], due on 10/25 at noon.~~
~~A07: Face Recognition, [30, 31, 32, 33, 34], due on 11/01 at noon.~~
~~A08: Generative Adversarial Nets, [35, 36, 37, 38, 39], due on 11/08 at noon.~~
~~A09: Attacks and Sensitive Video Analysis, [40, 41, 42, 43, 44], due on 11/16 at noon.~~
~~A10: Provenance Analysis, [46, 47, 48, 49], due on 11/29 at noon.~~

Students will complete at most eight assignments. Each assignment comprises a particular set of scientific articles. For each assignment, students choose one of the articles and submit a summary by the due date. There is no page limit for the summaries. Each summary should answer the following questions:
(1) What is the problem addressed in the article?
(2) Why is it important to address this problem?
(3) How do the authors address the problem?
(4) What are the authors’ claims?
(5) What methodology did they adopt (e.g., datasets, problem metrics, experiments) to prove their claims?
(6) Do you agree with the authors’ claims?
(7) For the graduate students, how do you think you may use this work in your research?
(8) What open questions do you have about the article?

Discussion Leaders

~~Image Retrieval, Nick and Jesus, on 09/26.~~
~~Image Classification, Nick and Kenneth, on 10/03.~~
~~Object Detection, John and Kenneth, on 10/17.~~
~~Image Segmentation, Mujtaba and Matt, on 10/24.~~
~~Face Detection, John and Amol, on 10/31.~~
~~Face Recognition, Mujtaba and Amol, on 11/07.~~
~~Generative Adversarial Networks, Jakob and Matt, on 11/14.~~
~~Provenance Analysis, Jakob and Jesus, on 12/05.~~

Each student will play the role of discussion leader twice during the course. Students will lead discussions in groups, preferably in pairs of one graduate and one undergraduate student. The graduate students are expected to help their undergraduate peers.

Discussion leaders will be responsible for organizing a 1.5-hour presentation on the topic of the day, using slides, videos, and demonstrations. The instructor advises the discussion leaders to share their material with him a couple of days before the presentation day.
Discussion leaders will also receive the article summaries and open questions related to their topics from the other students at least 5 days before their presentation.

The discussion and assignment topics coincide; as a consequence, discussion leaders are not required to provide summaries for the topics they will present.

Final Exam

Date and Location: 12/12, 4:15 PM, 117 Cuneo Hall
Format: Oral quiz, questions here.

Grading

Concept	Point Interval	Concept	Point Interval	Concept	Point Interval	Concept	Point Interval
A	[94, 100]	B+	[88, 89]	C+	[78, 79]	D	[60, 69]
A-	[90, 93]	B	[84, 87]	C	[74, 77]	F	[0, 59]
		B-	[80, 83]	C-	[70, 73]

Distribution

Total: 100 points
Class Presence and Participation: 6 points (x13)
Assignments: 1 point (x8)
Discussion Leadership: 3 points (x2)
Final Exam: 8 points
CV-on-the-news Post: 1 point (extra)
Demonstration on Discussion Day: 5 points (extra)
Late Assignments: -0.1 point per day

Each student has two “Oopsie” cards (OC). Each card allows them either to avoid losing points for an absence or to extend a due date until 12/11. They may use an OC at their discretion for any task, except for their assigned days of discussion leadership and the final exam. Please let the instructor know when you want to use an OC.
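
As an illustration of how the point distribution adds up, the following Python sketch totals a student's points and maps the result to a letter using the lower bound of each grading interval. The function and constant names are hypothetical. Two behaviors are assumptions not stated in this syllabus: fractional totals fall into the interval whose lower bound they reach, and extra points are added without a cap.

```python
# Point values copied from the Distribution list; all names are illustrative.
ATTENDANCE_POINTS = 6        # per class, 13 classes
ASSIGNMENT_POINTS = 1        # per assignment, 8 assignments
LEADERSHIP_POINTS = 3        # per discussion led, 2 discussions
LATE_PENALTY_PER_DAY = 0.1   # deducted per day an assignment is late

# Lower bound of each interval in the Grading table, highest first.
# Assumption: a fractional total belongs to the interval whose floor it reaches.
LETTER_FLOORS = [
    (94, "A"), (90, "A-"), (88, "B+"), (84, "B"), (80, "B-"),
    (78, "C+"), (74, "C"), (70, "C-"), (60, "D"), (0, "F"),
]


def total_points(classes_attended, assignments_done, discussions_led,
                 exam_points, extra_points=0.0, late_days=0):
    """Sum the points earned in each category, minus late penalties."""
    return (ATTENDANCE_POINTS * classes_attended
            + ASSIGNMENT_POINTS * assignments_done
            + LEADERSHIP_POINTS * discussions_led
            + exam_points
            + extra_points
            - LATE_PENALTY_PER_DAY * late_days)


def letter_grade(points):
    """Return the letter whose interval floor is the highest one reached."""
    for floor, letter in LETTER_FLOORS:
        if points >= floor:
            return letter
    return "F"  # totals below zero


if __name__ == "__main__":
    full = total_points(13, 8, 2, 8)
    print(full, letter_grade(full))
```

With full attendance, eight assignments, two discussion leaderships, and a perfect final exam, `total_points(13, 8, 2, 8)` adds up to the 100-point total (78 + 8 + 6 + 8).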

CV On the News

Posted by the students on Sakai.

https://bit.ly/3HebdsS (Matt’s submission).
https://intel.ly/3B2o5hS (Jakob’s).
https://yhoo.it/3Fl3sjx (Nick’s).

Google Colab

Practical material used in class.

Harris’ corner detector (https://bit.ly/3gRv0E2).
SIFT detector (https://bit.ly/3gXsZWL).
SURF detector (https://bit.ly/3B7dNxg).
Face recognition (https://bit.ly/3F4D5NM).
FGSM (https://bit.ly/3Fl4Oe7).
Lippe’s FGSM (https://bit.ly/3FmvbAD).
Final Exam Raffler (https://bit.ly/3BQEBBY).

References

[1] LeCun, Y., Bengio, Y., Hinton, G. Deep learning. Nature 521 (1), 2015.
[2] Hearst, M., Dumais, S., Osuna, E., Platt, J., Scholkopf, B. Support vector machines. IEEE Intelligent Systems and their Applications 13 (4), 1998.
[3] Ho, T. The Random Subspace Method for Constructing Decision Forests. IEEE Transactions on Pattern Analysis and Machine Intelligence 20 (8), 1998.
[4] Lowe, D. Distinctive Image Features from Scale-Invariant Keypoints. Springer International Journal of Computer Vision 60 (2), 2004.
[5] Bay, H., Tuytelaars, T., Van Gool, L. SURF: Speeded Up Robust Features. Springer European Conference on Computer Vision (ECCV), 2006.
[6] Noh, H., Araujo, A., Sim, J., Weyand, T., Han, B. Large-Scale Image Retrieval with Attentive Deep Local Features. IEEE International Conference on Computer Vision (ICCV), 2017.
[7] Jegou, H., Douze, M., Johnson, J. Faiss: A library for efficient similarity search. Available at https://bit.ly/3BiGYg9. Meta Platforms, Inc., 2017.
[8] Brogan, J., Bharati, A., Moreira, D., Rocha, A., Bowyer, K., Flynn, P., Scheirer, W. Fast Local Spatial Verification for Feature-Agnostic Large-Scale Image Retrieval. IEEE Transactions on Image Processing 30 (1), 2021.
[9] Kalantidis, Y., Avrithis, Y. Locally Optimized Product Quantization for Approximate Nearest Neighbor Search. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2014.
[10] Jegou, H., Douze, M., Schmid, C. Product quantization for nearest neighbor search. IEEE Transactions on Pattern Analysis and Machine Intelligence 33 (1), 2010.
[11] Howard, A., Zhu, M., Chen, B., Kalenichenko, D., Wang, W., Weyand, T., Andreetto, M., Adam, H. Mobilenets: Efficient convolutional neural networks for mobile vision applications. ArXiv Preprint, 2017.
[12] Krizhevsky, A., Sutskever, I., Hinton, G. Imagenet classification with deep convolutional neural networks. ACM Communications 60 (6), 2017.
[13] He, K., Zhang, X., Ren, S., Sun, J. Deep residual learning for image recognition. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016.
[14] Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., Erhan, D., Vanhoucke, V., Rabinovich, A. Going deeper with convolutions. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015.
[15] Simonyan, K., Zisserman, A. Very deep convolutional networks for large-scale image recognition. International Conference on Learning Representations (ICLR), 2015.
[16] Lin, T-Y., Goyal, P., Girshick, R., He, K., Dollar, P. Focal loss for dense object detection. IEEE International Conference on Computer Vision (ICCV), 2017.
[17] Redmon, J., Divvala, S., Girshick, R., Farhadi, A. You only look once: Unified, real-time object detection. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016.
[18] Ren, S., He, K., Girshick, R., Sun, J. Faster R-CNN: Towards real-time object detection with region proposal networks. Advances in Neural Information Processing Systems 28 (1), 2015.
[19] Girshick, R. Fast R-CNN. IEEE International Conference on Computer Vision (ICCV), 2015.
[20] Girshick, R., Donahue, J., Darrell, T., Malik, J. Rich feature hierarchies for accurate object detection and semantic segmentation. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2014.
[21] Zheng, S., Lu, J., Zhao, H., Zhu, X., Luo, Z., Wang, Y., Fu, Y., Feng, J., Xiang, T., Torr, P., Zhang, L. Rethinking semantic segmentation from a sequence-to-sequence perspective with transformers. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2021.
[22] He, K., Gkioxari, G., Dollar, P., Girshick, R. Mask R-CNN. IEEE International Conference on Computer Vision (ICCV), 2017.
[23] Badrinarayanan, V., Kendall, A., Cipolla, R. SegNet: A deep convolutional encoder-decoder architecture for image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence 39 (12), 2017.
[24] Long, J., Shelhamer, E., Darrell, T. Fully convolutional networks for semantic segmentation. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015.
[25] Ronneberger, O., Fischer, P., Brox, T. U-net: Convolutional networks for biomedical image segmentation. Springer International Conference on Medical Image Computing and Computer-assisted Intervention (MICCAI), 2015.
[26] Viola, P., Jones, M. Robust real-time face detection. Springer International Journal of Computer Vision 57 (2), 2004.
[27] Li, H., Lin, Z., Shen, X., Brandt, J., Hua, G. A convolutional neural network cascade for face detection. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015.
[28] Xu, X., Kakadiaris, I. Joint head pose estimation and face alignment framework using global and local CNN features. IEEE International Conference on Automatic Face and Gesture Recognition (FG), 2017.
[29] Albiero, V., Chen, X., Yin, X., Pang, G., Hassner, T. img2pose: Face alignment and detection via 6dof, face pose estimation. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2021.
[30] Parkhi, O., Vedaldi, A., Zisserman, A. Deep face recognition. British Machine Vision Conference (BMVC), 2015.
[31] Schroff, F., Kalenichenko, D., Philbin, J. Facenet: A unified embedding for face recognition and clustering. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015.
[32] Wen, Y., Zhang, K., Li, Z., Qiao, Y. A discriminative feature learning approach for deep face recognition. Springer European Conference on Computer Vision (ECCV), 2016.
[33] Liu, W., Wen, Y., Yu, Z., Li, M., Raj, B., Song, L. Sphereface: Deep hypersphere embedding for face recognition. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.
[34] Deng, J., Guo, J., Xue, N., Zafeiriou, S. Arcface: Additive angular margin loss for deep face recognition. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019.
[35] Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., Bengio, Y. Generative adversarial nets. ArXiv preprint (https://bit.ly/3OXpCvn), 2014.
[36] Mirza, M., Osindero, S. Conditional generative adversarial nets. ArXiv preprint (https://bit.ly/3OZwM2j), 2014.
[37] Isola, P., Zhu, J., Zhou, T., Efros, A. Image-to-image translation with conditional adversarial networks. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.
[38] Zhu, J., Park, T., Isola, P., Efros, A. Unpaired image-to-image translation using cycle-consistent adversarial networks. IEEE International Conference on Computer Vision (ICCV), 2017.
[39] Karras, T., Laine, S., Aila, T. A style-based generator architecture for generative adversarial networks. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019.
[40] Goodfellow, I., Shlens, J., Szegedy, C. Explaining and harnessing adversarial examples. International Conference on Learning Representations (ICLR), 2015.
[41] Bai, T., Zhao, J., Zhu, J., Han, S., Chen, J., Li, B., Kot, A. AI-gan: Attack-inspired generation of adversarial examples. IEEE International Conference on Image Processing (ICIP), 2021.
[42] Wang, S., Wang, O., Zhang, R., Owens, A., Efros, A. CNN-generated images are surprisingly easy to spot… for now. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2020.
[43] Perez, M., Avila, S., Moreira, D., Moraes, D., Testoni, V., Valle, E., Goldenstein, S., Rocha, A. Video pornography detection through deep learning techniques and motion information. Elsevier Neurocomputing 230 (1), 2017.
[44] Moreira, D., Avila, S., Perez, M., Moraes, D., Testoni, V., Valle, E., Goldenstein, S., Rocha, A. Temporal robust features for violence detection. IEEE Winter Conference on Applications of Computer Vision (WACV), 2017.
[45] Moreira, D., Theisen, W., Scheirer, W., Bharati, A., Brogan, J., Rocha, A. Image Provenance Analysis. Springer Multimedia Forensics (Book), 2022. Available at https://bit.ly/3ESzK5q.
[46] Pinto, A., Moreira, D., Bharati, A., Brogan, J., Bowyer, K., Flynn, P., Scheirer, W., Rocha, A. Provenance filtering for multimedia phylogeny. IEEE International Conference on Image Processing (ICIP), 2017.
[47] Moreira, D., Bharati, A., Brogan, J., Pinto, A., Parowski, M., Bowyer, K., Flynn, P., Rocha, A., Scheirer, W. Image provenance analysis at scale. IEEE Transactions on Image Processing 27 (12), 2018.
[48] Bharati, A., Moreira, D., Brogan, J., Hale, P., Bowyer, K., Flynn, P., Rocha, A., Scheirer, W. Beyond pixels: Image provenance analysis leveraging metadata. IEEE Winter Conference on Applications of Computer Vision (WACV), 2019.
[49] Bharati, A., Moreira, D., Flynn, P., Rocha, A., Bowyer, K., Scheirer, W. Transformation-aware embeddings for image provenance. IEEE Transactions on Information Forensics and Security 16 (1), 2021.

Academic Integrity

Students are expected to adhere to the LUC statements on academic integrity available at https://bit.ly/3TmiQkQ. These policies fully apply to this course. The penalty for academic misconduct on a task is zero points for that task. Multiple instances of misconduct will result in failing the entire course (with an F grade). All cases of academic misconduct will be reported to the proper department offices.

Accommodations

Students who have disabilities and wish to request academic accommodations are advised to contact the Services for Students With Disabilities (SSWD) office at 773-508-3700 or SSWD@luc.edu as soon as possible. The SSWD office will provide accommodation letters that, once shared with the instructor, will be fully honored according to their terms, with no further questions.

Teaching Computer Vision

Daniel Moreira

Assistant Professor of Computer Science

Computer scientist with interests in (but not limited to) Computer Vision, Machine Learning, Media Forensics, and Biometrics.