Weekly AI Roundup: GLM 4.5, In-Video Editing, Consistent Characters & More

This video gives an overview and hands-on demonstration of the week's most significant AI releases. It focuses on new tools and model capabilities, covering their practical applications, their current limitations, and the rapid pace of innovation.

GLM 4.5: An Impressive Open-Weight LLM

A new, powerful open-weight language model, GLM 4.5, is introduced as a standout release. Despite the narrator’s initial skepticism towards marginal model improvements, GLM 4.5 proved genuinely impressive. It competes with top-tier models like GPT-4 and Claude Opus, even outperforming them in specific coding and reasoning benchmarks. Key demonstrated features include:

Slide Deck Creation: It can generate visually appealing and well-researched slide decks, far surpassing the quality of previous attempts with models like ChatGPT.
One-Shot Coding: The model successfully created a playable JavaScript clone of the game “Vampire Survivors” from a single prompt, showcasing strong coding abilities.
Accessibility: It is fast and freely available to use on the Z.ai website.

Advanced AI Video Editing & Generation

The video covers several new tools that allow for sophisticated video manipulation and creation using text prompts.

Runway Aleph & Luma Modify: Both platforms introduced features for editing existing videos with text instructions. They excel at changing environments (e.g., placing a jet in space) and replacing objects (e.g., swapping babies for a sandwich), but struggle with more complex contextual changes. Runway also showed a novel capability to generate new camera angles from an existing shot.
Google’s Veo (Emergent Behavior): Users discovered an unexpected capability: they can write text instructions directly onto an image, and the Veo model will animate the image according to those instructions.
Midjourney Update: Users can now specify a start and end frame for video generation, enabling morphing effects and seamless loops. However, the narrator’s tests showed the feature still has significant limitations and may require more refined prompting.

AI Image & 3D Model Generation

New tools have made creating consistent characters and 3D models more accessible.

Ideogram Character: This feature allows for easy face-swapping and consistent character generation using just a single reference image, a significant simplification over older methods. The tool successfully placed the narrator’s face into various famous photos.
Meshy 5: An updated version of the text/image-to-3D tool that generates cleaner, more detailed 3D models. It performed well creating a detailed pizza and surprisingly capable models from complex 2D images like a “Rick and Morty” scene.
Hunyuan 3D World Model: A model from Tencent that creates explorable 3D worlds from prompts, though access to custom generation is currently on a waitlist.

Platform Updates and Rapid-Fire News

The video closes with several other notable updates:

Platform Enhancements: ChatGPT introduced a “Study Mode” for guided problem-solving, and Photoshop (Beta) added a “Generative Upscale” for increasing image resolution and a “Harmonize” feature to seamlessly blend objects into new scenes by matching lighting and color.
Other News: Google’s AI Mode in Search rolled out in the UK; Amazon invested in Fable, an “AI Netflix” for generating cartoon episodes; and Figure demonstrated a robot capable of doing laundry.

Conclusion

The week was marked by the release of highly capable and accessible AI tools that push the boundaries of creative and practical applications. While some features are still experimental, the overall trend points towards more powerful, user-friendly AI that can handle increasingly complex tasks in content creation, coding, and problem-solving.

Mentoring question

With the rapid release of new AI tools for video, image, and 3D creation, how can you strategically integrate one of these into your existing creative or professional workflow to solve a real problem, rather than just experimenting with them?

Source: https://youtube.com/watch?v=99fmkEYQuaM&si=o5bVYcw5NL4rNfwA