IS 5320 – Hrishabh Kulkarni


  • Summary Post – HW 8

    Summary Post 8


    Time Log Teams – time spent on other Teams’ sites (must have 3 entries or more):
    Date: Feb. 27, 2026 From: 09:05am To: 09:17am
    Date: Feb. 27, 2026 From: 06:10pm To: 06:22pm
    Date: Feb. 28, 2026 From: 10:15am To: 10:27am


    Time Log Students – time spent on other students’ sites (must have 3 entries or more):
    Date: Feb. 27, 2026 From: 10:10am To: 10:21am
    Date: Feb. 27, 2026 From: 07:30pm To: 07:41pm
    Date: Feb. 28, 2026 From: 11:05am To: 11:16am
    Date: Feb. 28, 2026 From: 08:15pm To: 08:26pm

    Essay I – Summary of Content Activities

    This week, I focused on creating two new in-depth blog posts centered on emerging AI trends that are shaping 2026. The first article explores Multimodal AI, breaking down how modern AI systems can simultaneously process text, images, audio, and video — and why this marks a fundamental shift in human-computer interaction. The second article covers AI Reasoning Models, explaining how systems like OpenAI’s o3 use chain-of-thought reasoning to “think before they respond,” and why this represents a paradigm leap beyond traditional language models. Both posts include proper image citations, are open for visitor comments, and have been categorized and tagged appropriately. I also updated the General Menu to reflect the new content under relevant categories and added both posts to the HW8 section of the HWs Menu for grading purposes. In addition, I added a Thank You page and gave it its own parent item in the menu. Finally, I visited all Teams’ and students’ sites, leaving thoughtful comments on posts I found engaging, and moderated and approved incoming comments on my own site through the WordPress admin dashboard.

    New Content Published This Week:

    • Multimodal AI – When AI Finally Got Eyes, Ears, and a Voice
    • AI Reasoning Models – The Revolution of “Think Before You Speak”

    Essay II – Summary of “Thank You” Event Conversion

    This week, I set up a “Thank You” page conversion event in Google Analytics 4 using Google Tag Manager. First, I created a dedicated Thank You page (not a post) in WordPress, which serves as the destination users land on after submitting a contact form. In GA4, I navigated to Admin → Data Display → Conversions and created a new conversion event named thank_you. To ensure the event fires correctly, I followed Conversion II → Method 1 and created a new event tag in Google Tag Manager — configuring it to trigger when a user lands on the Thank You page URL. The GTM tag was published and verified using GTM’s Preview/Tag Assistant mode, which confirmed the event fired successfully on page load. After the standard 12–24 hour delay, the thank_you conversion event appeared in GA4 under Admin → Events and was toggled as a conversion. Screenshots of the GA4 conversion setup and GTM tag configuration are included below.
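
    To make the setup concrete, here is a minimal hand-coded sketch of what that GTM tag effectively does when the Thank You page loads. This is an illustration only, assuming the GA4 global tag (gtag.js) is already installed on the site; the /thank-you/ path and the G-XXXXXXXXXX Measurement ID are placeholders, not my site’s actual values.

    ```typescript
    // Sketch of the GTM "Thank You page" tag written by hand (illustration only).
    // Assumes the GA4 global site tag (gtag.js) is already loaded on every page.
    declare function gtag(...args: unknown[]): void;

    // Fire the thank_you event only when the visitor lands on the Thank You page,
    // mirroring the page-URL trigger configured in Google Tag Manager.
    if (window.location.pathname === "/thank-you/") {   // placeholder path
      gtag("event", "thank_you", {
        send_to: "G-XXXXXXXXXX",                         // placeholder Measurement ID
        page_location: window.location.href,
      });
    }
    ```

    In practice GTM handles all of this through its UI, which is why no code had to be added to the WordPress theme itself.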

    Essay III – Summary of “Menu Click” Event Conversion

    To track user engagement with my site’s navigation, I created a “menu click” custom event in Google Tag Manager for one of the links in my main menu. I started by entering GTM Preview mode and clicking the target menu link to identify its Click Text value in the Tag Assistant window. Using this, I configured a new Click Trigger in GTM with the condition set to Click Text → contains → [menu link text], which is simpler than relying on Click Classes, since that variable was empty for this link. I then created a corresponding GA4 Event Tag in GTM, named menu_click, linked it to this trigger, and tied it to my GA4 Measurement ID. After publishing the GTM container, I verified the tag fired correctly in Preview mode. Following the 12–24 hour propagation window, the menu_click event appeared in GA4 under Reports → Engagement → Events, confirming successful tracking. I also marked it as a conversion in GA4 to monitor navigation-driven engagement going forward. Screenshots of the GTM trigger setup and the GA4 event report are included below.
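
    For reference, here is a rough sketch of what the Click Text trigger and GA4 event tag amount to if written as plain page code. It is an illustration under assumptions: gtag.js is loaded, and “Thank You” stands in for the actual [menu link text] used in the trigger.

    ```typescript
    // Hand-coded equivalent of the GTM setup described above (illustration only).
    declare function gtag(...args: unknown[]): void;

    document.addEventListener("click", (event) => {
      const target = event.target as HTMLElement | null;
      const link = target?.closest("a");
      const text = link?.textContent ?? "";
      // Same condition as the GTM trigger: Click Text -> contains -> [menu link text]
      if (link && text.includes("Thank You")) {   // placeholder menu link text
        gtag("event", "menu_click", {
          link_text: text.trim(),   // which menu item was clicked
          link_url: link.href,      // where the link points
        });
      }
    });
    ```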

  • AI Reasoning Models

    AI Reasoning Models – The Revolution of “Think Before You Speak”

    For years, AI has been praised for its speed. Ask a question, get an answer in milliseconds. But speed without accuracy is just a fast mistake.

    In 2026, a new breed of AI is changing the game, not by being faster, but by being smarter. Meet AI Reasoning Models: the systems that actually think before they respond.

    What Are Reasoning Models?

    Most AI you’ve used works by predicting the next most likely word or token based on patterns in training data. It’s incredibly fast, but it struggles with complex, multi-step problems that require logical deduction.

    Reasoning models are different. They use a technique called chain-of-thought reasoning, essentially an internal scratchpad where the model breaks a problem down step by step before giving you a final answer. The longer and harder it “thinks,” the better and more accurate its output becomes.

    Think of it like the difference between a student who blurts out the first answer that comes to mind versus one who carefully works through the problem on paper first. Same raw knowledge — completely different quality of output.
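
    To make that concrete, here is a tiny sketch of the “work through it on paper” idea using the public OpenAI chat completions endpoint. Reasoning models like o3 do this planning internally on their own, so this is only an imitation; the gpt-4o model name and the OPENAI_API_KEY environment variable are placeholder assumptions, not details from the sources cited below.

    ```typescript
    // Illustration only: asking an ordinary chat model to show its scratchpad work,
    // imitating what reasoning models do internally before answering.
    async function solveStepByStep(question: string): Promise<string> {
      const response = await fetch("https://api.openai.com/v1/chat/completions", {
        method: "POST",
        headers: {
          "Content-Type": "application/json",
          Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,  // placeholder key
        },
        body: JSON.stringify({
          model: "gpt-4o",  // placeholder; a true reasoning model plans internally instead
          messages: [
            {
              role: "system",
              content: "Work through the problem step by step, then give the final answer on its own line.",
            },
            { role: "user", content: question },
          ],
        }),
      });
      const data = await response.json();
      return data.choices[0].message.content;  // the steps, followed by the final answer
    }
    ```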

    The Numbers That Shocked the AI World

    When OpenAI released o3, the AI community took notice — and for good reason:

    • On ARC-AGI, a visual reasoning benchmark previously thought to be years away from AI capability, o3 scored 87.5% accuracy
    • On AIME 2024 (elite-level math competition problems), o3 scored 96.7% — compared to o1’s 83.3%, a massive leap in just one generation
    • These aren’t just benchmarks — they represent AI solving problems that genuinely require abstract thinking, planning, and reasoning

    This is not incremental improvement. This is a paradigm shift.

    From Autocomplete to Deep Thinking

    Here’s the evolution in simple terms:

    • GPT-3 era: Predict the next word really well
    • GPT-4 era: Understand context, write coherently at length
    • Reasoning model era: Analyze, deliberate, reason, and solve like a specialist consultant

    The Feedback Loop Nobody’s Talking About

    Here’s the part that makes reasoning models truly significant: they are now being used to train the next generation of AI models. The outputs of o3-level reasoning are becoming the training data for future systems, creating an accelerating feedback loop of intelligence improvement.

    This means every new model release won’t just be “a bit better.” It will compound on the reasoning capacity of its predecessor. We are, in effect, building AI that gets better at thinking with every generation.

    Why This Matters to You

    Whether you’re a researcher, developer, student, or professional, reasoning models are the tools that will handle your hardest, most intellectually demanding tasks. They’re not replacing creative or emotional intelligence. They’re taking the heavy cognitive lifting off your plate.

    The shift from “fast AI” to “thinking AI” is already here. The real question is: are you using it?


    References:
    Bratincevic, N. (2025, March 27). OpenAI’s o3: Hype or a real step toward AGI? Forrester Research. https://www.forrester.com/blogs/openais-o3-hype-or-a-real-step-toward-agi/
    Microsoft Azure AI Foundry. (2025, April 21). Everything you need to know about reasoning models. Microsoft Tech Community. https://techcommunity.microsoft.com/blog/azure-ai-foundry-blog/everything-you-need-to-know-about-reasoning-models-o1-o3-o4-mini-

  • Multimodal AI

    Multimodal AI – When AI Finally Got Eyes, Ears, and a Voice

    Remember when AI was just a chatbot you typed questions into? Those days are officially over.

    We are living through one of the most exciting shifts in artificial intelligence: the rise of Multimodal AI. And if you think this is just another buzzword, think again. Multimodal AI is quietly becoming the backbone of how we interact with machines in 2026.

    So, What Exactly Is Multimodal AI?

    Traditional AI models were built around a single type of input, usually text. You typed, it responded. Simple, but limited.

    Multimodal AI breaks that boundary. These models can simultaneously process and generate text, images, audio, and video, just like a human does naturally. Show it a photo, it understands it. Play it an audio clip, it transcribes and analyzes it. Give it a video, it summarizes the narrative. It’s AI that perceives the world through multiple “senses” at once.

    Think of it this way: earlier AI was like talking to someone on a phone call, text only. Multimodal AI is like sitting across from someone in a room, full sensory engagement.
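
    As a small illustration of the “show it a photo, it understands it” idea, here is a sketch that sends a text question and an image together to a multimodal model through the OpenAI chat completions endpoint. The gpt-4o model name, the image URL, and the OPENAI_API_KEY variable are placeholders assumed for the example, not details from the sources below.

    ```typescript
    // Illustration only: one request that mixes a text question with an image.
    async function describeImage(imageUrl: string): Promise<string> {
      const response = await fetch("https://api.openai.com/v1/chat/completions", {
        method: "POST",
        headers: {
          "Content-Type": "application/json",
          Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,  // placeholder key
        },
        body: JSON.stringify({
          model: "gpt-4o",  // placeholder for any model that accepts text and images together
          messages: [
            {
              role: "user",
              content: [
                { type: "text", text: "What is happening in this photo?" },
                { type: "image_url", image_url: { url: imageUrl } },
              ],
            },
          ],
        }),
      });
      const data = await response.json();
      return data.choices[0].message.content;  // the model's description of the image
    }
    ```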

    Why Is It Exploding Right Now?

    The momentum behind multimodal AI in 2026 is undeniable. Here’s what’s driving it:

    • GPT-4o, Gemini 1.5, and Claude 3 have made multimodal capability the new baseline standard, not a premium feature
    • Disney invested $1 billion into OpenAI specifically to leverage multimodal tools like Sora, enabling users to generate clips featuring Marvel, Pixar, and Star Wars characters
    • ByteDance’s Seedance 2.0, released in early 2026, went viral for producing 2K AI video with native audio and lip-synced dialogue, a jaw-dropping demonstration of how far this has come
    • In healthcare, multimodal models are being used for autonomous diagnostics: reading MRI scans, cross-referencing patient notes, and flagging anomalies, all at once

    Real-World Applications You’ll See Everywhere

    The impact isn’t just in labs or big tech companies. Multimodal AI is creeping into everyday use cases:

    • Content Creation: Generate a thumbnail, write the caption, and produce the voiceover, all from one prompt
    • Education: Upload a handwritten equation or a chart; the AI explains it step by step
    • Customer Support: AI that reads a product photo, listens to the complaint audio, and resolves the issue — no human needed
    • Research: Feed a PDF, a dataset, and an audio interview; the model synthesizes insights across all three

    What This Means for You

    Whether you’re a creator, developer, or business owner — multimodal AI is going to fundamentally change how you build, communicate, and create. The era of single-mode AI is behind us. The next chapter is one where AI sees the world as richly and fully as we do.

    The question isn’t whether multimodal AI will impact your field. It’s whether you’ll be ready when it does.


    References:
    Webuters. (2025, November 9). The evolution of multimodal generative AI in 2026. https://www.webuters.com/evolution-of-multimodal-generative-ai
    Tran, K. (2025, December 26). Why 2026 belongs to multimodal AI. Fast Company. https://www.fastcompany.com/91466308/why-2026-belongs-to-multimodal-ai