AI Image Contest Exhibit Q&A

Feb 20, 2023

On the last day of the AI Image Contest Exhibition in Yokohama, we were pleased to partake in a Q&A that gathered questions from the Japanese community.

Here is the complete rundown of all questions, both in English and Japanese!

AI画像展 x NovelAIコラボ | NovelAIについてのQ&A ミラー配信

Twitter: https://twitter.com/AI_contest


横浜で行われたAIコンテストの最終日、日本の方々から質問を受け付ける機会を設けさせていただきました。

以下は実際の質疑応答を英語と日本語でまとめたものになります！

Q: What’s amazing about Novel AI?
Novel AIは何がすごいのか

One of our strongest suits is the uninhibited freedom we offer through privacy and encryption. We strongly believe in unlimited creativity. Aside from that, it is most likely the fact that we are a small team of 12 people that has made it this far! We launched with text generation back in June 2021, we continue to build out our model offering, and we’re certainly only getting stronger!

NovelAIの強みとしては、暗号化とプライバシーを保護する事で使用者の皆さんに制限されることのない自由を提供することだと思っています。私たちは、無限の創造性を強く信じています。しかし「何がすごいのか」という質問に立ち返るなら、私は12人という小さなチームでここまでやってこられたという事実が一番”すごい”と思っています。 2021年6月に文章生成サービス（Text generations）を立ち上げてからAIモデル提供サービスを続け、今日まで確実に成長してこられているのです！

Q: What was the hardest part of developing Novel AI?
Novel AIを開発するのに一番苦労したのはどこか

The biggest hurdle so far has been the attempted implementation of basic Stable Diffusion. There were some harder issues to tackle, such as the Center Crop issues and the mitigation of unwanted content, which led us to abandon the basic model implementation and provide only our own niche trained models (Anime & Furry) instead. We were also really confident in the final product.
Oh, another one of our pain points is keeping up with promised timelines. We learned over the past year that setting roadmaps isn’t very realistic in the wild west of Artificial Intelligence.

サービスの開発に当たっての最大の難関は、Stable Diffusionを実装しようとしたことでしょうか？Center Cropの問題や、不要な要素を取り除くなど、取り組むべき難しい問題があり、けっきょくはSDモデルを使う事を断念し、代わりに私たち独自のトレーニング済みモデル（NAI Diffusion Animeと Furry）だけを提供することになりました。今では独自モデルを使ってサービスを作った事をとても誇りに思っています。それともう一つ、私たちの悩みの種は、約束した納期を守ることです。人工知能開発という先進領域を進むにあたり、正確なロードマップを設定することはあまり現実的ではないことをこの一年で学ばされました。

Q: I’m curious about how you work, what your office is like, and the atmosphere during development.
どういう形態で仕事をしているのか, オフィスの紹介とか, 開発の雰囲気が気になる

We don’t have an office — we are located all over the world, and everyone works remotely whenever they see fit. Since there are only 12 of us on the Anlatan team, the atmosphere feels more like a group of friends — the hardships over the last year have certainly brought us together.

実は私たちにオフィスはなく、世界中に散らばった仲間たちがオンラインで仕事をしています。Anlatanのチームには１２人しかいない上に、先の一年の苦難を一緒に乗り越えた仲間たちなので、会社の同僚というよりは親しい友達の集まりのような雰囲気です。

Q: How will AI image technology evolve and be used in the future (e.g. evolution of video)?
今後AI画像の技術はどのように進化してどのように使われるのか（例：動画の進化）

StabilityAI is already aiming for video generation, and there have been some Stable Diffusion-powered prototype applications floating around. I could imagine movies might be doable in the next few years. Personally, we love seeing AI used as a tool to truly expand upon your own skillset. Imagine the new types of creations that will pop up from people that previously lacked the opportunity to express themselves!

StabilityAIは既に動画の自動生成を目指していますし、StablityDiffusionを使ったアプリがたくさん出回っています。次の数年は動画分野の開発が盛んにおこなわれると考えています。私としてはＡＩが便利なツールとして使われ、人間のスキルを拡張する事を快く思っています。ＡＩの台頭によって、以前は自分を表現する技術を持たなかった方々がクリエイターとして活躍できるようになる……そんな素敵な時代が来ると信じています！

Q: When do you expect the singularity to occur? Will AGI happen?
シンギュラリティはいつ起きると予想しますか？, AGIは起きる？

Kurumuz, our lead AI developer & CEO, is convinced it will happen between 2030–2035.

２０３０年から２０３５年の間で起きるのではないか、とAI開発者でNovelAICEOのKurumuz氏は予想しています。

Q: What made you choose anime as the subject of your Image Generation AI model?
画像生成AIの中でなぜアニメ系に着目したのか

Most of the team loves the anime aesthetic, and we simply wanted this to work for our own enjoyment as well. With the aforementioned hurdles during the image generation development, we realized that while we couldn’t control the outputs of the basic Stable Diffusion model, we were able to do that with NovelAI Diffusion Anime & Furry.
After a few delays during the development, we realized just how good our work turned out on its own and made the choice to drop the basic model.
There are plenty of alternative platforms offering Stable Diffusion in its more basic forms, so why not stand out in this market with something unique? NovelAI is focused on storytelling first and foremost. We think that the omission of photorealism from the models lends itself to that concept as well. We do hope to expand further in terms of styles, though.

開発チームの大半がアニメの画風を愛していたので、私達自身が楽しめるサービスを作りたいと思っていました。StableDiffusionを使った開発を続けるうちに、SDでは生成結果をコントロールするのが難しいと気づきました。そこで開発したNovelAI AnimeとFurryは非常に優れていたのです。それに基本のSDモデルを使ったWebプラットフォームは沢山あるので、市場で目立つためには何かユニークな事をすることが必要でした。さらに、NovelAIは”物語性”を重要にしているので、フォトリアル（写真的、現実的）な絵を除外する事はそういうコンセプトにあっています。ただ、これからはより多様なスタイルが出力できたら、とも思っています。

Q: What kind of jobs are likely to survive in the future?
今後、生き残りそうな職業は？

Even in the unlikely possibility that AI manages to harness everything in terms of skill, humans will always be needed to guide and control the AI — but we are not so pessimistic. There is a joy in creating art and expressing yourself that is not replaceable through AI.
The hope is to allow everyone to express themselves and tell the stories they want to tell — whether it is in writing or visually.

もしもAIが人間のスキルを使えるようになったとしても、私達は悲観的な未来は予想していません。人間の存在はAIをコントロールして導くために必要不可欠だと思いますし、アート作品を作る楽しさ、自分を表現する喜びはAIで決して代替できないものです。全ての人が自分を表現できるようになって、自分の物語を文章や絵を通じて語る事が出来るようになるというのが私達の希望です。

Q: Do you have any plans to add more functions, such as inpainting and outpainting, in the future?
今後、inpaintingやoutpaintingなどの機能を増やす予定はありますか？

We want those two functionalities ourselves, but we do not have any release plans or ETA at this time. They are on the to-do list, but there are no guarantees.

私達開発チームもそういった機能が欲しいと思っていますが、今後の開発、リリース予定は立っておらず、また保証も出来ないのです。

Q: What is the final goal of the company, not just NovelAIDiffusion?
NovelAIDiffusionに限らず、企業として最終目標は何なのか

Good question! Our general goal is to marry text and image generation into the text editor so that you can create illustrated stories. We do have some additional plans for the future, but our hands have been quite full!

良い質問です！私達のゴールは文章生成と画像生成機能を統合し、イラスト付きの物語を作れるサービスを提供する事です。更なる計画もあるのですが、今はちょっと手一杯な感じです！

Q: Nice to meet you, I love Cool & Kawaii illustrations such as NovelAI’s game CG and novel illustrations! There are many drawing AIs out there today, and I believe each one has different characteristics and strengths. So, I would like to know how NovelAI plans to differentiate itself from those drawing AIs in the future!
はじめまして、自分はNovelAIの表現するゲームのCGや小説の挿絵のような、Cool&Kawaiiイラストが大好きです！現在は数多くのお絵描きAIが登場しており、それぞれに異なった特徴や強みがあると、私は思っています。そこでNovelAIは今後、それらのお絵描きAIとどのような差別化をはかっていく予定か、ぜひとも教えてほしいです！

Right now, we are very aware that the Anime model generally creates a very specific art style and a noticeable NovelAIDiffusion pattern that is quite frankly hard to unsee once you notice it! We are working on new models that hopefully won’t have this issue anymore.

現在のモデルにはNovelAIDiffusionパターンともいえる特徴的なア―トスタイルでの出力があり、一度気になり始めると無視できないという事もあると思います。現在はそういった問題が起きないモデルを開発中です。

Q: You released a version that is strong at so-called Kemono, but do you have any plans to incorporate versions that specialize in other subjects in the future? (If possible, I would like you to be able to generate Japanese “kotatu” lol)
いわゆる、ケモに強いバージョンを公開されましたが、今後も〇〇特化のようなバージョンを組み込む予定はありますか？（できればぜひ、日本の「kotatu」を生成できるようにしてほしいです笑）

We’re not quite sure whether we need to keep models separate or whether we can find a way to combine the different models into one that is adept at different styles.

Also I don’t think I’ve ever heard of kotatu? Please let us know what style this is over on Twitter sometime!

今後ともそういった特化モデルが分かれて存在するかは分かりません。もしそういった異なるアートスタイルを持ったモデルをミックスして、その上で個々のアートスタイルを出力できるなら特化モデルは要らなくなると思います。

質問にある”コタツ”というモノは聞いたことがないのでTwitterとかで教えてくれたら嬉しいです！

Q: It was rumored that NovelAI would be open-sourced, but what kind of timeframe do you have in mind? Since there is no sign of it being published, I imagine that it will be published after the model is no longer popular.
NovelAIがオープンソース化されるという話ですが、それはどの程度のスパンで考えられていますか？公開される気配がないのでモデルがマイナーになってから公開されるのかなと想像してしまいます。

There have been no more developments in that direction.

現在はオープンソース化の方向で開発は行っていません。

Q: There are some areas where the AI is currently weak. Will you improve it?
現在AIは苦手な部位があったりしますが、改善したりしますか？

Hands! Everyone always points out the hands!
We definitely hope to find a way to fix those.

手です！既に多くの人がお気づきですね！私達はそういった問題を解決できるモデルを作れるようこれからも努力します。

Q: Can you be an illustrator’s sidekick?
イラストレーターの相棒になりえますか？

Yes! Aini has already extensively tested using NovelAI’s img2img to assist her with lighting and to generate endless versions of how to shade and add detail to specific areas of her work. It’s like a personal teacher that takes a few seconds to analyze and point out possible flaws and solution references at the click of a button. Using generated images in your workflow can also greatly increase the speed of your work, allowing you to do more of what you want faster!

もちろんです！コミュニティマネジャーのAiniさんは日常的にNovelAIのI2I（元画像ありでのAI画像生成）を自分の絵に陰影を付けたり、様々なパターンのディテールを加えるのに使うなど、広範囲に渡ってイラスト制作の工程で使っています。画像生成AIはボタンを押すだけで、数秒のうちに可能性のある欠陥や解決策を指摘してくれるので、絵の家庭教師のようなものだと思います。イラストレーションの工程にAIを参加させる事で、作業スピードが大幅に向上し、より速く、より多くの作業を行うことができるようになりました！

Q: Although the period from the release of Stable Diffusion to the start of the NovelAIDiffusion service was very short, the generated images, UI, and resilience under heavy access give the impression of high quality. Were you considering such a service before SD was released?
stable diffusion公開からNovelAIDiffusionサービス開始までは非常に短期間だったにもかかわらず生成画像、UIやアクセス耐性などは高品質な印象があります。SD公開前からこういったサービスを検討していたのでしょうか？

We did play with the idea of image generation back in December 2021. We even had some very badly aged examples in a previous blog post. There wasn’t much work done on an image editor as we have now, though — more so a general design scheme for how to implement images into the text generation (which continues to be a goal of ours). Our frontend developers generally created the image editor as we developed the image models and put in anything we needed as we went along.
Feature requests would pop up during use, and TabloidA would sometimes get a chance to design them before the frontend team implemented them, sometimes not.

It is certainly a work in progress. If you ever have quality-of-life requests or feedback, please don’t hesitate to send them our way! We’ll make sure to find a way to translate them and see if they are something we might be able to implement.

私達が画像生成の分野で色々し始めたのは２０２１年の１２月で、実際その時に（時代遅れであまり面白いとはいえない）ブログもポストしていたんです。AIで生成した文章に画像を実装する為のちょっとしたデザインはありましたが今のような形式ではありませんでした。フロントエンドの開発者は、画像モデルを開発しながら画像エディタを作り、必要なものをどんどん組み込んでいきました。機能追加の要望があれば開発者たちが手を付ける前にTabloidA氏が設計することもありました。現在進行形で私達のサービスは進化しているので、QOL（クオリティ・オブ・ライフ）に関する要望やフィードバックがあれば、遠慮なくお寄せください（英語でなくても翻訳して実装できるか検討させていただきます！）

Q: From now on, we expect that models will be released one after another, and the number of competitors will increase. What do you think about the differentiation that is unique to NovelAI?
これからはモデルが次々と公開され、競合も増えてくると予想します。その中でNovelAIだからこその差別化についてどのように考えていますか？

We’re absolutely thrilled to watch new and unique image models arrive. It’s nice that there will be more variety to pick from and even learn from. We’re certainly excited to go on and continue making new models ourselves!

新しいユニークな画像生成モデルが登場するのには、本当にワクワクします。表現の幅が増え、さらに学ぶ事が増えるのは嬉しいことです。私たち自身も、新しいモデルを作り続けることを楽しみにしています！

Q: Hello. I always enjoy creating images. To get straight to the point, I would like to ask about the optimal order in which to write prompts. For example, I generally write in the order of “background -> angle/pose -> style -> body shape -> face -> hair -> skin -> clothes -> fine movements of arms and hands”. Internally, which order generates more beautiful or less broken images?
こんにちは。いつも楽しく画像生成させて頂いています。早速ですが私はプロンプトの書く順序の最適解についてお訊きしたいです。例えば私は大体「背景→アングル・ポーズ→画風→体型→顔→髪→肌→服→腕や手の細かい所作」のような順序でなんとなく書いているのですが、内部的にはどういった順序のほうがより美麗または破綻少なめに生成出来るのでしょうか？

We have generally noticed that the most important aspects of an image should be in the front half of the prompt, but it isn’t an exact science, and many factors in tokenization can create a placebo effect that makes you “feel” like a certain prompt is working.

しばしば画像の重要な要素はプロンプトの前半にくるようにするべきと言われますが、科学的な理由は見つかっていません。トークン化には複数の要素が絡んでいるので、実際には効果のないプロンプトが機能しているように感じてしまうプラシーボ効果が起きている場合もあります。
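
The advice above about placing the most important aspects early in the prompt can be sketched in a few lines of Python. This is only an illustration: `build_prompt` and the numeric priorities are hypothetical and not part of NovelAI, and as the answer notes, prompt ordering is not an exact science.

```python
def build_prompt(tags_with_priority):
    """Join tags into a comma-separated prompt, highest priority first.

    tags_with_priority: list of (tag, priority) pairs, where a larger
    priority means "more important to the image". Both the function and
    the priority scheme are hypothetical, for illustration only.
    """
    # sorted() is stable, so tags with equal priority keep their original order.
    ordered = sorted(tags_with_priority, key=lambda item: item[1], reverse=True)
    return ", ".join(tag for tag, _ in ordered)


tags = [("detailed background", 1), ("1girl", 3), ("silver hair", 2)]
prompt = build_prompt(tags)
print(prompt)
```

Comparing outputs for a few different orderings of the same tags, with the seed held fixed, is one way to check whether an ordering really helps or is just a placebo.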

Q: I’m having a lot of trouble with not having enough tokens for prompts and negative prompts (especially the latter). Do you have any plans to expand the number of tokens in the future?
プロンプトとネガティブプロントのトークンが足りなくてよく困っている（特に後者）のですが、今後トークンの枠数を拡張する予定はあるのでしょうか？

There hasn’t been much focus on this internally. The token amount now is quite generous — should we find ways to lengthen it, we most likely will, though.
We actually recommend using less text, since any type of symbol or token can have more of a negative than a positive effect. Looking at a lot of the prompts floating around online, we do believe there are a lot of placebos.

この点については、社内でもあまり焦点が当たっていません。現在のトークンの量はかなり多めですが、もっと長くする方法が見つかれば、そうする可能性もあります。

プロンプトは短くすることをおススメします。シンボルやトークンは、ポジティブな効果よりもネガティブな効果をもたらす事が多いです。ネットに流れている多くのプロンプトを見ていると、プラシーボが多いように思います。

Q: Are there any big version upgrades planned in the future?
今後大きなバージョンアップは何か予定されていますか？

Hopefully! The team has only been back to unhindered research for a relatively short time, but they’re happily working away again.

きっとあると思います！現在は大仕事が終わりチームは自由な研究生活を送っていますが、大きなプロジェクトに向けて忙しくするのも良いと思っています。

Q: Do you have any plans with the release of StableDiffusion2?
StableDiffusion2が出たことによって何か考えている計画はありますか？

The team is still busy evaluating it at this time!

まだSD2が使用に耐えるか検証中です！

Q: In NovelAI, it is said that the prompt “masterpiece” is very effective for painting, but did you do anything special?
NovelAIでは、masterpieceというプロンプトが絵にすごく効きやすいとされていますが、なにか特殊な処理を行ったのですか？

As a result of adding quality tags to our training data, the AI is given a sense of aesthetics and good visual concepts. However, this can have some disadvantages, such as default prompts often generating girls or picture frames… We recommend experimenting with both the Add Quality Tags toggle on and off!

AIのトレーニングデータにクオリティタグを追加したことにより、AIは人間の美的感覚を理解しクオリティの高いビジュアルが生成できるようになりましたが、同時に、デフォルトのプロンプトでは女の子や額縁がよく生成されるなどのデメリットもでてきました。「Add Quality Tags（品質タグの追加）」のオンオフを切り換えて実験するのがおススメです！
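
As a rough illustration of what a toggle like Add Quality Tags could do, the sketch below prepends a fixed list of quality tags to the user's prompt. The tag list, the function name `apply_quality_tags`, and the prepend behaviour are all assumptions made for this example. The answer above only states that quality tags were added to the training data, not how the toggle is implemented.

```python
# Assumed example tags. "masterpiece" comes from the question above;
# "best quality" is a guess added purely for illustration.
QUALITY_TAGS = ["masterpiece", "best quality"]


def apply_quality_tags(prompt, add_quality_tags=True):
    """Return the prompt with quality tags prepended when the toggle is on.

    Hypothetical helper: it mimics a UI toggle and does not describe
    NovelAI's actual implementation.
    """
    if not add_quality_tags:
        return prompt
    parts = QUALITY_TAGS + ([prompt] if prompt else [])
    return ", ".join(parts)


print(apply_quality_tags("1girl, silver hair", add_quality_tags=True))
print(apply_quality_tags("1girl, silver hair", add_quality_tags=False))
```

Generating the same prompt and seed with the toggle on and off makes it easier to see side effects such as unwanted picture frames.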

Q: Currently, NovelAI can generate images of some famous anime characters. Are you willing to add more data for anime characters?

現在版権キャラの立ち絵などをNovelAIで生成する事が出来ますが、今後、新しいアニメのキャラなどを生成する事ができるようになる可能性はありますか？

Most likely, you will have to make do with how recent the AI model’s training data is and how much of it there is. The AI creates connections through knowledge. I don’t think we will generally aim for the reproducibility of characters. We don’t purposely feed specific characters’ concepts into the model. In general, it will learn the visual aspects of characters simply as the model gets smarter and better, even though that is not exactly our goal.

一般的に、キャラクターを出力するにはそのAIモデルが何時のデータをどれだけ学習したかに依ります。現在そういった既存のキャラクターが出力されるという現象は確かにあるものの、私達が意図して実装したものではありません。ただ私達が目標とするかしないかに関係なく、これからAIモデルがより賢くなるにしたがって、副産物的にそういった特定のキャラクターを出力することが容易になっていくと考えられます。

Q：二次創作やオリジナルに限らず、同じキャラクターを連続して生成することが難しいです。なにか良い方法はあるでしょうか？また、そういった機能を追加する予定はありますか？

I am having difficulty generating a series of the same character on NovelAI. Is there a good solution? Also, do you have any plans to update the service to enhance this function?

There is not an easy solution to this, but you could get by with image-to-image generation and the editor to define some specifics. In addition to your prompts, you can give the AI some hints about what you want to see, such as a hair color.
Other than that, you can think your tagging through carefully, but at this time even mastering tagging will not make it easy to keep an original character consistent. We are hoping that the smarter our models get, the easier it will be to get your ideal images.

i2i(画像から画像を生成する方法)とNovelAIのエディターを使って特定のキャラクター画像を出力するのは容易ではないので、髪色などの具体的にどういった見た目のなのかといったヒントを与える事が重要です。また、詳細なプロンプトで外見を制御をする事も考えられるものの、それでも特定のキャラクターを出し続けるは難しいです。ただ、将来的にはより賢いAIによって簡単になるのではないかと思います。

同じ質問をクリスさんにもしてみました。

We also asked Chris the above question.

There is no easy solution at this time other than mastering your tagging and Undesired Content. A larger prompt in combination with a large Undesired Content will lead you in the right direction. It largely depends on your tag usage and adjusting it to your desired outcome. Once you have mastered that, you can change the style of your image and still keep the same character.

現段階でプロンプトとネガティブプロンプト（望まないコンテンツをタグにしたもの）を使いこなす以外に簡単な解決法はありません。大量のタグをプロンプトとネガティブプロンプトに使えば望む方向性で出力しやすくなるので、タグの使用と調整次第で理想的なイメージを得やすくなります。もしあなたがタグの使い方に精通したなら、出力する絵のスタイルを変えたとしても、同じキャラクターを出力しやすくなるはずです！

That concludes the in-person Q&A!
Many thanks to the AI Contest team, everyone that submitted questions, and the amazing people that helped us with translation & localization, as well as the in-person attendees who asked us a few more questions!
It was an honor to be able to attend our first in-person event and meet everyone.

これでQ＆Aコーナーを終了させていただきます。AIコンテスト実行委員チームに最大の感謝を、そして質問者の方々、翻訳、ローカライズを担当した方々には大変感謝しております。

今回のコンテストは私達の初めての会場イベントであり、この場に出席し、皆さんと会えたことを本当に感謝しています。