Blog radlak.com

…what’s there in the world

AI Weekly Roundup: Impressive New Models, Video Editing Tools, and More

This week’s AI advancements feature a surprisingly powerful open-weight language model, new video editing tools that allow for targeted modifications, and an image generator that promises character consistency. The central theme is the rapid pace of innovation, with new tools offering practical and creative capabilities that were previously complex or unavailable.

GLM 4.5: A Powerful Open-Weight LLM

GLM 4.5, a new open-weight large language model, performs on par with proprietary models such as GPT-4 and Claude Opus 4, particularly in agentic reasoning and coding. Key highlights include:

Advanced Capabilities: The model excels at complex tasks, such as researching a topic online and generating a well-designed, multi-slide presentation with images, a significant improvement over the text-only slides that competitors produce.
One-Shot Coding: It demonstrated the ability to create a functional, browser-based game (a clone of Vampire Survivors) from a single prompt, showcasing impressive coding and logical reasoning skills.
Accessibility: The model is freely available for anyone to use via the Z.ai website.

Advanced AI Video Editing

Both Runway and Luma Labs have released new features that allow users to modify existing videos using text prompts, opening up new creative possibilities.

Runway Aleph: This feature is effective at changing a video’s environment (e.g., placing a jet in space) or replacing specific objects (e.g., swapping babies for a sandwich). It can also generate new camera angles from an original clip.
Luma Labs’ Modify with Instructions: Like Runway’s feature, this tool can alter video scenes. In a direct comparison, it successfully added a chasing alien spaceship to a video, a detail Runway’s output missed. A “strength” slider controls the degree of modification.

Other Key Tool Updates

Several other platforms rolled out significant new features:

Google Veo’s Emergent Behavior: Users discovered an unexpected capability: Google’s Veo model can follow text instructions written directly onto a source image and use them to guide video generation.
Ideogram’s Character Feature: This tool allows for easy face-swapping and character placement in any image using just a single reference photo, making it simple to create consistent characters or insert a face into existing scenes.
Meshy 5 for 3D Modeling: The new version of Meshy offers improved text-to-3D and image-to-3D generation, creating cleaner and more detailed models. It successfully generated a high-quality 3D pizza from a text prompt and a surprisingly coherent 3D scene from a complex 2D image.
Photoshop Beta: Adobe introduced a “Harmonize” feature that automatically matches the lighting and color of a pasted object to its new background, as well as a new generative upscale tool for enhancing low-resolution images.
ChatGPT Study Mode: OpenAI released a mode designed to help users with homework by guiding them through problems step-by-step rather than providing direct answers.

Rapid-Fire News and Takeaways

The week also saw Amazon invest in Fable, an “AI Netflix” aiming to generate full cartoon episodes, and Higgsfield AI make its powerful Halo video model temporarily free. In robotics, a Figure robot was shown performing laundry tasks, and the Unitree R1 humanoid robot was announced at a relatively low price of $5,900.

The overarching conclusion is that AI tools are becoming more powerful, accessible, and specialized: open-weight models are closing the gap with proprietary leaders, and new features offer creative control and practical utility that were previously out of reach.

Mentoring question

Given the rapid advancements in AI-powered creative tools for video, images, and 3D models, which of these new capabilities could you integrate into your personal or professional projects to enhance your workflow or create something entirely new?

Source: https://youtube.com/watch?v=99fmkEYQuaM&si=wnfQoD4pMNymqYvD