

Classes completed: 374
Quizzes submitted: 301
Projects submitted: 325

Students will apply AI-powered tools to generate captions, descriptions, and summaries from images. Starting from a simple image caption, they will use a text generation model (GPT-2) to expand it and a text-to-image model (Stable Diffusion) to render the result, building a complete, interactive text-to-image workflow. Along the way they will practice generating descriptive text, summarizing content, and exploring creative applications.
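One possible shape for the caption-to-image workflow is sketched below. It is a minimal illustration, not the assignment's reference solution. The function names (`expand_caption`, `caption_to_image`, `load_models`), the prompt wording, and the word limit are all assumptions made for this sketch. The models are passed in as plain callables so the workflow can be tried with stand-ins before downloading anything. `load_models` shows one way to wire in GPT-2 through the Hugging Face `transformers` library and Stable Diffusion through `diffusers`, and the checkpoint id is left as a parameter because the description does not name one.

```python
from typing import Any, Callable, Tuple


def expand_caption(caption: str, generate: Callable[[str], str],
                   max_words: int = 50) -> str:
    """Expand a short caption into a longer description.

    `generate` is any function mapping a prompt to generated text.
    The prompt template and the 50-word cap are assumptions for this sketch.
    """
    prompt = f"Image caption: {caption}\nDetailed description:"
    completion = generate(prompt)
    # Text-generation models commonly echo the prompt; keep only the new text.
    if completion.startswith(prompt):
        completion = completion[len(prompt):]
    # Collapse whitespace and trim, since text-to-image prompts are length-limited.
    return " ".join(completion.split()[:max_words])


def caption_to_image(caption: str,
                     generate: Callable[[str], str],
                     render: Callable[[str], Any]) -> Tuple[str, Any]:
    """Run the full workflow: caption -> expanded description -> image."""
    description = expand_caption(caption, generate)
    if not description:
        # Fall back to the original caption if the model produced nothing new.
        description = caption
    return description, render(description)


def load_models(sd_checkpoint: str):
    """Hypothetical wiring for the real models (requires transformers, diffusers).

    `sd_checkpoint` is whichever Stable Diffusion checkpoint id the course uses;
    the assignment text does not specify one.
    """
    from transformers import pipeline
    from diffusers import StableDiffusionPipeline

    gpt2 = pipeline("text-generation", model="gpt2")
    sd = StableDiffusionPipeline.from_pretrained(sd_checkpoint)

    def generate(prompt: str) -> str:
        return gpt2(prompt, max_new_tokens=40)[0]["generated_text"]

    def render(text: str) -> Any:
        return sd(text).images[0]

    return generate, render
```

With both libraries installed, calling `load_models` with a checkpoint id and passing the returned `generate` and `render` to `caption_to_image("a dog on a beach", generate, render)` would return the expanded description together with the generated image. Keeping the models behind callables also makes it easy to swap in a summarization step or a different generator later.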
In this assignment, you will build a complete console-based voice translation application using speech recognition, Google Translate, and text-to-speech. The program takes spoken English input, converts it to text, translates it into a user-selected language, and then speaks the translated result.
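The listen, translate, speak flow described above could be organized roughly as follows. This is a sketch under stated assumptions rather than the expected solution. The language menu, the function names, and the choice of libraries are illustrative: `SpeechRecognition` (with PyAudio for microphone access) for speech-to-text, `deep-translator` as one way to reach Google Translate, and `gTTS` for text-to-speech. The assignment text names none of these packages. Each stage is a separate function, and the core `translate_speech` function only depends on callables, so it can be exercised without a microphone or network access.

```python
from typing import Callable

# Hypothetical menu of target languages: menu key -> (display name, language code).
LANGUAGES = {
    "1": ("Spanish", "es"),
    "2": ("French", "fr"),
    "3": ("German", "de"),
    "4": ("Hindi", "hi"),
}


def choose_language(choice: str) -> str:
    """Map a menu choice to a language code, rejecting unknown choices."""
    try:
        return LANGUAGES[choice.strip()][1]
    except KeyError:
        raise ValueError(f"Unknown menu choice: {choice!r}") from None


def translate_speech(listen: Callable[[], str],
                     translate: Callable[[str, str], str],
                     speak: Callable[[str, str], None],
                     target_code: str) -> str:
    """Run one listen -> translate -> speak cycle and return the translation."""
    text = listen()
    if not text:
        # Nothing was recognized, so there is nothing to translate or speak.
        return ""
    translated = translate(text, target_code)
    speak(translated, target_code)
    return translated


def listen_from_microphone() -> str:
    """Capture English speech from the default microphone (needs PyAudio)."""
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    with sr.Microphone() as source:
        audio = recognizer.listen(source)
    try:
        return recognizer.recognize_google(audio, language="en-US")
    except sr.UnknownValueError:
        # The audio could not be understood.
        return ""


def translate_text(text: str, target_code: str) -> str:
    """Translate English text via Google Translate (deep-translator is an assumption)."""
    from deep_translator import GoogleTranslator

    return GoogleTranslator(source="en", target=target_code).translate(text)


def speak_to_file(text: str, lang_code: str, path: str = "translated.mp3") -> None:
    """Synthesize speech with gTTS and save it; playback is left to the platform."""
    from gtts import gTTS

    gTTS(text=text, lang=lang_code).save(path)
```

A console loop would then print the `LANGUAGES` menu, read the user's choice with `input()`, and call `translate_speech(listen_from_microphone, translate_text, speak_to_file, choose_language(choice))`. Playing the saved audio file is platform-dependent and is left out of the sketch.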