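The AI race has gone into overdrive: in the space of just over ten hours on February 16, OpenAI and Google released back-to-back bombshell results.
OpenAI released its first text-to-video model: Sora. Put simply, AI video is about to change for good!
Not only can it create scenes that are both realistic and imaginative from text instructions, it can also generate videos up to one minute long, in a single continuous take.
AI video tools such as Runway Gen 2 and Pika are still struggling to stay coherent for a few seconds, while OpenAI has already set a record of epic proportions.
In a 60-second continuous take, the female lead and the background figures stay remarkably consistent, and the characters remain rock-steady even as the camera moves freely between shots.
How did OpenAI do it? According to its website: "By giving the model foresight of many frames at a time, we've solved a challenging problem."
Clearly this trump-card technology is revolutionary; even Sam Altman can't tear himself away!
He not only tweeted about it nonstop, he also stepped in personally to generate videos for users: send any prompt you like, and I'll output them one by one.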
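A wizard wearing a pointed hat and a blue robe embroidered with white stars is casting a spell, lightning shooting from one hand while the other holds an old book.
In a Tuscan country-style kitchen with cinematic lighting, a social-media-savvy grandmother teaches you how to make delicious homemade gnocchi.
We take you on a street-level tour of a future city where high technology and nature coexist in harmony, with a distinctive cyberpunk style.
The city is spotless, with advanced futuristic trams, dazzling fountains, giant holograms, and patrolling robots everywhere.
Imagine a human tour guide from the future leading a group of curious alien visitors, showing them the pinnacle of human creativity: this incomparable, captivating future city.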
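Thanks to its deep understanding of language, Sora accurately interprets what a user's prompt is asking for and grasps how those elements appear in the real world.
As a result, the characters Sora creates can express rich emotions!
The complex scenes it produces can include multiple characters, specific types of motion, and precise details of both subjects and backgrounds.
Look: in the image below, the pupils, eyelashes, and skin texture are so realistic that no flaw is visible, with none of the usual AI look.
From now on, what difference is left between video and reality?!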
Prompt: Extreme close up of a 24 year old woman’s eye blinking, standing in Marrakech during magic hour, cinematic film shot in 70mm, depth of field, vivid colors, cinematic
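In addition, Sora can create multiple shots within a single video while keeping characters and visual style consistent.
Bear in mind that earlier AI videos were all generated as a single shot.
This time OpenAI keeps subjects consistent across cuts between multiple camera angles, which is nothing short of remarkable!
Multi-shot consistency at this level is completely out of reach for Gen 2 and Pika...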
Prompt: A movie trailer featuring the adventures of the 30 year old space man wearing a red wool knitted motorcycle helmet, blue sky, salt desert, cinematic style, shot on 35mm film, vivid colors.
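For example: "Snowy Tokyo is bustling. The camera moves through the busy streets, following several people enjoying the beautiful snow and shopping at nearby stalls. Beautiful cherry blossom petals drift in the wind along with the snowflakes."
What Sora renders from this prompt is a dreamlike scene of Tokyo in winter.
A drone shot follows a couple strolling at leisure along the street; on the left, vehicles drive along a riverside road, and on the right, shoppers weave between a row of small shops.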
Prompt: Beautiful, snowy Tokyo city is bustling. The camera moves through the bustling city street, following several people enjoying the beautiful snowy weather and shopping at nearby stalls. Gorgeous sakura petals are flying through the wind along with snowflakes.
Prompt: Animated scene features a close-up of a short fluffy monster kneeling beside a melting red candle. The art style is 3D and realistic, with a focus on lighting and texture. The mood of the painting is one of wonder and curiosity, as the monster gazes at the flame with wide eyes and open mouth. Its pose and expression convey a sense of innocence and playfulness, as if it is exploring the world around it for the first time. The use of warm colors and dramatic lighting further enhances the cozy atmosphere of the image.
Prompt: A gorgeously rendered papercraft world of a coral reef, rife with colorful fish and sea creatures.
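Otherwise, can the line between the virtual and the real still be drawn clearly?
But there is no denying the alarming fact now in front of us: a model that can already understand and simulate the real world means that AGI is not far off.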
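Industry veteran Zhang Qixuan (张启煊) commented: "Sora is the only work I have seen so far that breaks away from generating empty scenery shots and is true video generation."
In his view, Sora is a generation ahead of Pika and Runway, and the video generation field is finally dominated by OpenAI. Perhaps one day the 3D video field will come to know this fear as well.
Technical overview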
Prompt: Reflections in the window of a train traveling through the Tokyo suburbs.
Prompt: Several giant wooly mammoths approach treading through a snowy meadow, their long wooly fur lightly blows in the wind as they walk, snow covered trees and dramatic snow capped mountains in the distance, mid afternoon light with wispy clouds and a sun high in the distance creates a warm glow, the low camera view is stunning capturing the large furry mammal with beautiful photography, depth of field.
Prompt: Drone view of waves crashing against the rugged cliffs along Big Sur’s garay point beach. The crashing blue waters create white-tipped waves, while the golden light of the setting sun illuminates the rocky shore. A small island with a lighthouse sits in the distance, and green shrubbery covers the cliff’s edge. The steep drop from the road down to the beach is a dramatic feat, with the cliff’s edges jutting out over the sea. This is a view that captures the raw beauty of the coast and the rugged landscape of the Pacific Coast Highway.
Prompt: Aerial view of Santorini during the blue hour, showcasing the stunning architecture of white Cycladic buildings with blue domes. The caldera views are breathtaking, and the lighting creates a beautiful, serene atmosphere.
Prompt: A young man at his 20s is sitting on a piece of cloud in the sky, reading a book.
Prompt: A litter of golden retriever puppies playing in the snow. Their heads pop out of the snow, covered in.
Prompt: The camera directly faces colorful buildings in burano italy. An adorable dalmation looks through a window on a building on the ground floor. Many people are walking and cycling along the canal streets in front of the buildings.
A tilt-shift photograph of a construction site filled with workers, equipment, and heavy machinery.
Prompt: Tiltshift of a construction site filled with workers, equipment, and heavy machinery.
Prompt: A petri dish with a bamboo forest growing within it that has tiny red pandas running around.
Prompt: A cartoon kangaroo disco dances.
Prompt: Photorealistic closeup video of two pirate ships battling each other as they sail inside a cup of coffee.