Try OpenAI Sora
Create videos from text or images, generate looping videos, and extend videos forward or backward
Be the first to know when Sora goes live!

About OpenAI Sora

What is Sora

OpenAI's text-to-video model. Sora can generate videos up to one minute long while maintaining visual quality and adherence to the user's text instructions.

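Sora's goal

Sora serves as a foundation for models that can understand and simulate the real world, helping people solve problems that require real-world interaction.

Progress

Available only to red teamers and to invited visual artists, designers, and filmmakers.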

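Features

Supports multiple characters, specific types of motion, and accurate detail in subjects and backgrounds. The model understands how these things exist in the physical world and can create multiple shots within a single video.

Limitations

Difficulty accurately simulating complex physics, confusion of spatial details, spontaneous appearance of objects and characters, inaccurate physical modeling, and unnatural object deformation.

Safety

Works with red teamers on adversarial testing to identify and address safety issues in the model, and builds tools that use a detection classifier and C2PA metadata to help detect misleading content.

Showcase - updated daily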

Prompt

a brown and white border collie stands on a skateboard, wearing sunglasses

Prompt

1st person view taking the longest zip-line in the world through Dubai

Prompt

Style: Modern cinematic realism with vivid visual accents. A summer evening. A group of young friends is gathered on a rooftop, overlooking the glowing city lights. They’re laughing, chatting, and enjoying the vibe with soft music playing in the background. The camera slowly zooms in on a bottle of YOMI beer on the table. Cold condensation drips down the glass, highlighting the vibrant golden hue of the drink. The focus shifts to a hand reaching for the bottle. The camera follows the motion, capturing the crisp sound of the bottle cap popping open. A sip. A deep breath. A smile. In the background, a voice speaks: ‘YOMI — the taste of the moment. Capture your inspiration.’ Final scene: A bottle of YOMI stands against the backdrop of a setting sun, its golden light refracting through the beer. The brand logo and tagline appear on screen: ‘YOMI. The time of your story.

Prompt

The camera follows behind a white vintage SUV with a black roof rack as it speeds up a steep dirt road surrounded by pine trees on a steep mountain slope, dust kicks up from its tires, the sunlight shines on the SUV as it speeds along the dirt road, casting a warm glow over the scene

Prompt

POV, ACTION SHOTS, JUMPCUTS, Montage,, tracking shot, from the side hyperspeed, 30x speed, cinematic atmosphere, person having a futuristic neon beachpunk in punkexosuit form around them, suiting up, glow and light, Phanto-Cinematic still, beachpunk gigadream, kodak etkar 100, hypersurrealist retrowave religiouscience fiction, Southern California, emocore, hyperfuturistic, beachpunk ISO: T2.8, compression: ARRIRAW, lighting_conditions: ultraviolet blacklight, backlit,

Prompt

Close-up shot of a freeride skier carving through deep, untouched powder snow during a vibrant sunset in the Alps. The camera starts low, tracking alongside the skier as they make a powerful turn, sending a spray of fine snow into the air. The spray catches the warm golden-pink light of the setting sun, creating a stunning glow and sparkling reflections. The camera then pans upward and slightly rotates, revealing the majestic alpine peaks bathed in the sunset’s hues. The skier continues gracefully downhill, leaving a glowing trail of light and snow in their wake as the scene fades into the serene mountain landscape.

Prompt

An elegant scene set in Egypt featuring a female anthropomorphic fox character. She has vibrant red-orange fur and vivid green eyes, posing gracefully near ancient Egyptian ruins with the iconic pyramids in the background. She is wearing a flowing, semi-transparent, culturally inspired robe with golden patterns. The setting includes sandy terrain, scattered palm trees, and hints of ancient stone structures adorned with hieroglyphics. The sky is clear, and the sun casts a warm glow over the scene, emphasizing the mystique of the Egyptian desert landscape.

Prompt

A stylish woman walks down a Seoul street filled with warm glowing neon and animated city signage. She wears a black leather jacket, a long red dress, and black boots, and carries a black purse. She wears sunglasses and red lipstick. She walks confidently and casually. The street is damp and reflective, creating a mirror effect of the colorful lights. Many pedestrians walk about.

Other AI video products

Company	Generation Type	Max Length	Extend?	Camera Controls? (zoom, pan)	Motion Control? (amount)	Other Features	Format
Runway	Text-to-video, image-to-video, video-to-video	4 sec	Yes	Yes	Yes	Motion brush, upscale	Website
Pika	Text-to-video, image-to-video	3 sec	Yes	Yes	Yes	Modify region, expand canvas, upscale	Website
Genmo	Text-to-video, image-to-video	6 sec	No	Yes	Yes	FX presets	Website
Kaiber	Text-to-video, image-to-video, video-to-video	16 sec	No	No	No	Sync to music	Website
Stability	Image-to-video	4 sec	No	No	Yes		Website, local model, SDK
Zeroscope	Text-to-video	3 sec	No	No	No		Local model
ModelScope	Text-to-video	3 sec	No	No	No		Local model
AnimateDiff	Text-to-video, image-to-video, video-to-video	3 sec	No	No	No		Local model
Morph	Text-to-video	3 sec	No	No	No		Discord bot
Hotshot	Text-to-video	2 sec	No	No	No		Website
Moonvalley	Text-to-video, image-to-video	3 sec	No	Yes	No		Discord bot
Deforum	Text-to-video	14 sec	No	Yes	No	FX presets	Discord bot
Leonardo	Image-to-video	4 sec	No	No	Yes		Website
Assistive	Text-to-video, image-to-video	4 sec	No	No	Yes		Website
Neural Frames	Text-to-video, image-to-video, video-to-video	Unlimited	No	No	No	Sync to music	Website
MagicHour	Text-to-video, image-to-video, video-to-video	Unlimited	No	No	No	Face swap, sync to music	Website
Vispunk	Text-to-video	3 sec	No	Yes	No		Website
Decohere	Text-to-video, image-to-video	4 sec	No	No	Yes		Website
Domo AI	Image-to-video, video-to-video	3 sec	No	No	Yes		Discord bot

People are talking about Sora on X

SoraAI by OpenAI is wild.

These are 100% generated only from text and take just 1 minute 🤯

10 wild examples ( 2nd is WOW ) pic.twitter.com/NLetbJVa2v
— Alamin (@iam_chonchol) February 18, 2024

If you think OpenAI Sora is a creative toy like DALLE, ... think again. Sora is a data-driven physics engine. It is a simulation of many worlds, real or fantastical. The simulator learns intricate rendering, "intuitive" physics, long-horizon reasoning, and semantic grounding, all… pic.twitter.com/pRuiXhUqYR
— Jim Fan (@DrJimFan) February 15, 2024

"this close-up shot of a futuristic cybernetic german shepherd showcases its striking brown and black fur..."

Video generated by Sora. pic.twitter.com/Bopbl0yv0Y
— Bill Peebles (@billpeeb) February 18, 2024

Sora and Stable Video, text to video compare. pic.twitter.com/pZzSeSXPtN
— Retropunk (@RetropunkAI) February 17, 2024

OpenAI's Sora is the most advanced text-to-video tool yet. 💡

It can generate compellingly realistic characters, create multiple dynamic shots in a single video, with accurate details of both subjects and background.

Here's the 10 best generations so far
🧵👇 pic.twitter.com/FHp0cxt0Ll
— Escher (@Escher_AI) February 16, 2024

OpenAI's Sora is going to change marketing forever, enabling anyone to unleash his inner creativity.

Check this 100% AI-generated video of Mammoth generated with the new "text-to-video" OpenAI model: pic.twitter.com/DcDGPjpBXC
— William Briot (@WilliamBriot) February 15, 2024

"a photorealistic video of a butterfly that can swim navigating underwater through a beautiful coral reef"

Video generated by Sora pic.twitter.com/nebCKLa09U
— Tim Brooks (@_tim_brooks) February 17, 2024

Another Sora video, Sora can generate multiple videos side-by-side simultaneously.

This is a single video sample from Sora. It is not stitched together; Sora decided it wanted to have five different viewpoints all at once! pic.twitter.com/q2rfxh61CQ
— 🅱️WhiteAfricanSpaceJesus (@zespacejesus) February 18, 2024

Sora can also generate stories involving a sequence of events, although it's far from perfect.

For this video, I asked that a golden retriever and samoyed should walk through NYC, then a taxi should stop to let the dogs pass a crosswalk, then they should walk past a pretzel and… pic.twitter.com/OhqVFqR5vA
— Bill Peebles (@billpeeb) February 17, 2024

https://t.co/uCuhUPv51N pic.twitter.com/nej4TIwgaP
— Sam Altman (@sama) February 15, 2024

https://t.co/P26vJHlw06 pic.twitter.com/AW9TfYBu3b
— Sam Altman (@sama) February 15, 2024

https://t.co/rPqToLo6J3 pic.twitter.com/nPPH2bP6IZ
— Sam Altman (@sama) February 15, 2024

https://t.co/WJQCMEH9QG pic.twitter.com/Qa51e18Vph
— Sam Altman (@sama) February 15, 2024

a wizard wearing a pointed hat and a blue robe with white stars casting a spell that shoots lightning from his hand and holding an old tome in his other hand
— biden or buster (@willofdoug) February 15, 2024

Frequently asked questions

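What is Sora?

Sora is an AI model developed by OpenAI that can create realistic and imaginative video scenes from text instructions. It is designed to simulate the physical world in motion and can generate videos up to one minute long while maintaining visual quality and adhering to the user's prompt.

How does Sora work?

Sora is a diffusion model: it starts with a video that resembles static noise and gradually transforms it by removing the noise over many steps. It uses a transformer architecture similar to the GPT models and represents videos and images as collections of smaller units of data called patches.

What kinds of videos can Sora generate?

Sora can generate a wide range of videos, including complex scenes with multiple characters, specific types of motion, and accurate details of the subject and background. It can also animate an existing still image, or extend an existing video or fill in its missing frames.

What are Sora's limitations?

Sora may struggle to accurately simulate the physics of a complex scene, to understand specific instances of cause and effect, and to keep spatial details consistent over time. It sometimes produces physically implausible motion or mixes up spatial details.

How does OpenAI ensure the safety of Sora's content?

OpenAI is working with red teamers to adversarially test the model and is building tools to help detect misleading content. It plans to include C2PA metadata in the future and is leveraging existing safety methods from its other products, such as text classifiers and image classifiers.

Who can access Sora?

Sora is currently available to red teamers, who assess critical areas for harms and risks, and to visual artists, designers, and filmmakers, who provide feedback on how to advance the model for creative professionals.

How can I use Sora for my creative projects?

If you are a creative professional, you can apply for access to Sora through OpenAI. Once access is granted, you can use the model to generate videos from text prompts and enrich your creative projects with unique, imaginative scenes.

What is the future of Sora in research?

Sora serves as a foundation for models that can understand and simulate the real world, a capability OpenAI believes will be an important milestone toward achieving artificial general intelligence (AGI).

How does Sora handle text prompts?

Sora has a deep understanding of language, which lets it interpret text prompts accurately and generate compelling characters and scenes that express vibrant emotions. It can create multiple shots within a single video while keeping characters and visual style consistent.

What are the technical details of Sora's architecture?

Sora uses a transformer architecture similar to the GPT models and represents videos and images as collections of smaller units of data called patches. Unifying the data representation this way allows the model to be trained on a wider range of visual data.

How does Sora keep subjects consistent in generated videos?

By giving the model foresight of many frames at a time, Sora keeps a subject consistent even when it temporarily goes out of view.

What role does the recaptioning technique play in Sora's training?

Sora uses the recaptioning technique from DALL·E 3, which generates highly descriptive captions for the visual training data. This helps the model follow the user's text instructions more faithfully in the generated video.

How does OpenAI plan to integrate Sora into its products?

Before integrating Sora into its products, OpenAI plans to take several safety steps, including adversarial testing, developing detection classifiers, and leveraging existing safety methods from other products such as DALL·E 3.

What are Sora's potential applications in the creative industries?

Filmmakers, animators, game developers, and other creative professionals can use Sora to generate video content, storyboards, and even prototypes of ideas quickly and efficiently.

What are the ethical considerations when using Sora?

OpenAI is actively engaging with policymakers, educators, and artists to understand their concerns and to identify positive use cases for the technology. It acknowledges that it cannot predict every beneficial use or every form of abuse, and it regards learning from real-world use as critical to building safer AI systems over time.

How does Sora handle the generation of potentially risky content?

OpenAI has a text classifier that checks text input prompts and rejects those that violate its usage policies, such as requests for extreme violence, sexual content, hateful imagery, or unauthorized use of intellectual property.

What is a "world model" in the context of AI and Sora?

In AI, a "world model" is a computational model that simulates the physical world and its dynamics, allowing an AI to understand and predict how objects and entities interact within it. In the context of Sora, this means the model is trained to follow not only text instructions but also the physical laws and behaviors of the real world, such as gravity, motion, and object interactions. This capability is important for creating realistic, coherent video content from text descriptions.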
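
The FAQ says that Sora represents videos and images as collections of smaller units of data called patches. As a rough illustration of that idea only, the sketch below cuts a tiny grayscale clip into non-overlapping blocks that span a few frames and a few pixels each, then flattens every block into one list. The function name patchify, the 2 x 2 x 2 patch size, and the toy clip are all assumptions made for this example; OpenAI has not published Sora's code, and nothing here should be read as its actual implementation.

```python
from typing import List

# frames x rows x columns; a single grayscale channel keeps the example short.
Video = List[List[List[float]]]


def patchify(video: Video, pt: int, ph: int, pw: int) -> List[List[float]]:
    """Split a video into non-overlapping pt x ph x pw blocks, each flattened.

    Hypothetical helper for illustration; patch sizes are arbitrary choices.
    """
    frames, height, width = len(video), len(video[0]), len(video[0][0])
    if frames % pt or height % ph or width % pw:
        raise ValueError("video dimensions must be divisible by the patch size")

    patches = []
    for t0 in range(0, frames, pt):          # step through time
        for y0 in range(0, height, ph):      # step down the rows
            for x0 in range(0, width, pw):   # step across the columns
                patch = [
                    video[t][y][x]
                    for t in range(t0, t0 + pt)
                    for y in range(y0, y0 + ph)
                    for x in range(x0, x0 + pw)
                ]
                patches.append(patch)
    return patches


# A toy 4-frame, 4x4-pixel clip whose pixel values encode their own position
# (value = frame * 100 + row * 10 + column), so each patch is easy to inspect.
clip = [[[t * 100 + y * 10 + x for x in range(4)] for y in range(4)] for t in range(4)]
tokens = patchify(clip, pt=2, ph=2, pw=2)
print(len(tokens), len(tokens[0]))  # 8 patches of 8 values each
```

Because every patch has the same length regardless of the clip's duration or resolution, a sequence model such as a transformer can consume clips of different sizes as sequences of different lengths, which is one plausible reading of why a unified patch representation widens the range of usable training data.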
