Most Popular Ready-Made Models

We have explored several approaches to generating images with neural networks.
However, you do not need to train these models yourself: many pre-trained models are available and can be used to generate new data through web applications or APIs.

DALL-E 3 (by OpenAI)

DALL-E 3 is known for generating highly realistic and imaginative images from textual descriptions. It handles intricate concepts and diverse artistic styles well, which makes it popular among artists and designers. It is available through ChatGPT and the OpenAI API.

Midjourney

An independent AI image generator, Midjourney is celebrated for the quality of its outputs. While it may not match DALL-E 3 in photorealism, it stands out for producing unique and artistic interpretations based on user prompts. Midjourney boasts a robust community and offers a user-friendly interface for exploring creative visions.

Stable Diffusion

Stable Diffusion prioritizes customization and control. Although its initial results may not always be the most polished, it allows users to finely adjust the image generation process. This makes it an excellent choice for those interested in experimentation and gaining deeper insights into AI-driven image creation.

How is the text prompt converted into an image?

The initial step involves converting your text prompt into a format understandable by the AI model. This is achieved by encoding words into numerical representations while considering their sequence and relationships.

def text_to_numerical_representation(text):
    """Encode a text prompt as a flat list of character codes (toy example)."""
    # Split the text on whitespace into individual tokens (words)
    tokens = text.split()

    # Map every character of every token to its code point
    # (identical to the ASCII value for plain English text)
    numerical_representation = [ord(char) for token in tokens for char in token]

    return numerical_representation

This encoding serves as the condition for the model. In some cases, the text encoding can even be used as the generator's input in place of random noise.

Consequently, the network is trained to produce images that closely match the encoded text description.
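
One common way to supply the condition is to concatenate the text encoding with the random noise vector before passing it to the generator. The sketch below only illustrates that idea using the toy character encoding from this chapter. The function name build_generator_input and the parameters noise_dim, text_dim and seed are hypothetical choices made for this example, not part of any particular model.

```python
import random


def text_to_numerical_representation(text):
    """Toy encoding from this chapter: one character code per character."""
    return [ord(char) for token in text.split() for char in token]


def build_generator_input(text, noise_dim=8, text_dim=16, seed=0):
    """Concatenate a random noise vector with a fixed-length text condition.

    All names and sizes here are illustrative assumptions.
    """
    rng = random.Random(seed)
    noise = [rng.gauss(0.0, 1.0) for _ in range(noise_dim)]

    codes = text_to_numerical_representation(text)
    # Scale codes to roughly [0, 1], then truncate or zero-pad to text_dim
    scaled = [min(code, 255) / 255 for code in codes][:text_dim]
    condition = scaled + [0.0] * (text_dim - len(scaled))

    # The generator would receive noise and condition as a single vector
    return noise + condition


generator_input = build_generator_input("a red apple")
print(len(generator_input))  # noise_dim + text_dim values
```

With this layout, changing the seed changes only the noise part, so the same prompt can yield different images that all follow the same description.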

Note

We used ASCII encoding as a simple example. In practice, more advanced techniques such as Word2Vec, TF-IDF, BERT and others are employed. You can find more information in the Introduction to NLP course.
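
To give a feel for one of these techniques, here is a minimal sketch of TF-IDF written with the standard library only. It uses the plain textbook formula (term frequency multiplied by the logarithm of the inverse document frequency); real libraries typically apply smoothing and normalization on top of this. The function name tf_idf and the sample prompts are assumptions made for illustration.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {word: weight} dict per document (unsmoothed TF-IDF)."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each word appears
    doc_freq = Counter(word for tokens in tokenized for word in set(tokens))

    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            word: (count / len(tokens)) * math.log(n_docs / doc_freq[word])
            for word, count in counts.items()
        })
    return vectors


prompts = ["a red apple", "a green apple", "a red car"]
weights = tf_idf(prompts)
print(weights[1])
```

Words that appear in every prompt, such as "a", receive a weight of zero, while words that distinguish a prompt from the others, such as "green", receive the highest weight.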
