
AI images: The opportunity and the bull$#!%

16 Minute Read

Matt Howland, Chief Product & Engineering Officer, Cordial

This is the second post in a series that will explore different aspects of AI application—where there is opportunity today, where I believe it will emerge tomorrow, and where to look out for people slinging BS.

You will learn about the current state of AI image generation, the benefits it offers, the hurdles it faces, and realistic steps for leveraging AI without falling for the hype. Read my first post on AI copywriting here.

When your dog has more AI chops than most marketing teams (true story)

All right, friends, I’m going to make a potentially offensive statement. My dog, Franklin, might have a more impressive AI resume than some of your marketing teams.

No, I’m not kidding.

I wanted to collect all those candid shots from friends and family for my recent wedding. So, naturally, I gave Franklin a virtual phone and an AI brain. Guests texted their photos to his number (or WhatsApp; he’s international, after all), and he used AI magic to extract labels from the images (multimodal model). Then, he’d generate a whimsical, storybook-style illustration from the labels and description (using diffusion, more on that later) featuring himself right in the middle of the action, write a message describing what he’s up to (multimodal model), and text both back. It was a hit, and let’s be honest, it was way cooler than just asking people to email me their pics. Sometimes it’s the non-obvious things that win.

Oh my gosh, I just had the best dream! Sian and Matt were having a fancy dinner under twinkly lights, and I was right there with them! They were eating yummy food, and I was hoping for some tasty treats to fall my way. It was so magical and cozy, just like a fairy tale! Woof! 🐾✨ – AI Franklin

The point? AI isn’t just for tech nerds anymore.

If my dog can use AI to create personalized artwork, imagine what you could do with it for your retail brand. In this blog post, we’re diving into the world of AI-generated images. We’ll explore what’s possible today, what’s on the horizon, and of course, we’ll call out the BS.

For the product nerds out there like me, I had 89% adoption of AI Franklin, with >70% using it three more times.

What AI image generation can do NOW (and it’s not just for your Instagram feed…or fake influencers)

Forget the days of endless photoshoots and scouring stock image libraries. AI is here to revolutionize the way I create and use visuals for my retail brand. While it might not be quite ready to replace my entire product photography team (yet), AI image generation is already making a serious impact. And yes, I’m side-eyeing those AI “influencers” too (although they are here to stay). Here’s what legitimate AI can do for me right now:

AI-powered lighting adjustments (studio lighting, minus the studio) 

Say goodbye to complex lighting setups and expensive reshoots. AI can now intelligently adjust lighting and shadows in your product photos, creating a professional look and feel with just a few clicks. Need to brighten a dimly lit shot or add dramatic shadows for a more artistic effect? AI has got you covered.

Image infill for seamless product placement (no more Photoshop gymnastics) 

Need to remove an unwanted object from a product photo or extend the background for a wider shot? AI-powered infilling tools can seamlessly fill in missing parts of an image, creating a natural and realistic result. This can save you hours of tedious editing and ensure your product photos are always picture-perfect.

  • Tools to Explore: Photoshop’s Firefly-powered “Generative Fill” excels at image infilling, while other tools like DALL-E 2 and RunwayML’s Inpainting feature offer similar capabilities.

Social media visuals that pop

Sick of scrolling through endless generic stock photos? AI can help you create eye-catching visuals that are tailored to your brand and audience. Whether you need a post for Instagram, a banner for your website, or a thumbnail for a YouTube video, AI can generate unique and engaging images that will stop scrollers in their tracks (without resorting to deceptive tactics).

  • Companies to watch: Typeface and Canva (and of course Photoshop Firefly) are incorporating AI image generation tools directly into their platforms, making it easy for marketers like me to create professional-looking visuals without any design experience.

Concept visualization

Have a brilliant product idea but struggling to communicate it to your team or investors? AI can help you visualize your concept with detailed mockups. This can accelerate the design and development process, saving you time and money in the long run.

  • Companies to watch: Midjourney and DALL-E 2 are generating buzz for their ability to create stunning visuals from even the most abstract text prompts. But I think there are some even more interesting open-source tools, which I’ll get to in a minute.

Nerd time. Behind the pixels: How AI conjures up images 

You might not be coding AI models yourself, but understanding the basic principles behind them will give you a significant edge. Here’s the lowdown, minus the jargon:

The neural network playground: Diffusion models and beyond

Most AI image generators, like Midjourney, Stable Diffusion and Flux, are built on diffusion models. These models learn by first corrupting training data (images) with noise and then figuring out how to reverse that process. It’s akin to watching a masterpiece disintegrate and then meticulously reconstructing it; the model learns to reverse the disintegration.
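To make the "corrupt, then reverse" idea concrete, here is a minimal sketch in plain Python. It assumes a toy four-pixel "image" and a linear noise schedule with illustrative values; the function names (make_alpha_bars, add_noise, remove_noise) are hypothetical and not taken from any library. A real diffusion model never gets to see the exact noise: it trains a neural network to predict it, which is what makes the reversal possible on new images.

```python
import math
import random

def make_alpha_bars(steps, beta_start=1e-4, beta_end=0.02):
    """Fraction of the original signal left after each step (assumed linear schedule)."""
    alpha_bars = []
    running = 1.0
    for t in range(steps):
        beta = beta_start + (beta_end - beta_start) * t / (steps - 1)
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars

def add_noise(pixels, alpha_bar, rng):
    """Forward process: blend the clean pixels with Gaussian noise."""
    noise = [rng.gauss(0.0, 1.0) for _ in pixels]
    noisy = [math.sqrt(alpha_bar) * x + math.sqrt(1.0 - alpha_bar) * n
             for x, n in zip(pixels, noise)]
    return noisy, noise

def remove_noise(noisy, noise, alpha_bar):
    """Reverse process with the noise known exactly; a trained model must predict it."""
    return [(y - math.sqrt(1.0 - alpha_bar) * n) / math.sqrt(alpha_bar)
            for y, n in zip(noisy, noise)]

rng = random.Random(0)
image = [0.9, -0.3, 0.5, 0.1]            # toy "image" of four pixel values
alpha_bars = make_alpha_bars(steps=1000)
noisy, noise = add_noise(image, alpha_bars[499], rng)   # halfway through the schedule
recovered = remove_noise(noisy, noise, alpha_bars[499])
```

By the last step almost none of the original signal remains, which is why generation can start from pure noise and work backwards.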

The bridge between words and images: CLIP

CLIP (Contrastive Language-Image Pre-training) is a groundbreaking model that plays a crucial role in many AI image generators. Think of it as the translator that helps the AI understand your text prompts and convert them into visual representations.

CLIP is trained on a massive dataset of images and their corresponding text descriptions, learning to associate words and phrases with visual concepts. When you provide a text prompt, CLIP helps the image generator understand what you’re asking for and guides the generation process to produce images that align with your description.
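The matching step can be sketched in a few lines. This is an illustration under loud assumptions: the three-number vectors below are made up by hand, whereas real CLIP embeddings have hundreds of dimensions and come from trained text and image encoders. The helper names (cosine_similarity, best_caption) are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Similarity of two vectors by direction: 1.0 means same direction, 0.0 unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Made-up embeddings standing in for what a text encoder would produce.
text_embeddings = {
    "a dog at a dinner table": [0.9, 0.1, 0.2],
    "a red running shoe": [0.1, 0.95, 0.1],
    "a mountain at sunrise": [0.1, 0.2, 0.9],
}

# Made-up embedding standing in for what an image encoder would produce.
image_embedding = [0.85, 0.2, 0.25]

def best_caption(image_vec, captions):
    """Pick the caption whose embedding points in the most similar direction."""
    return max(captions, key=lambda c: cosine_similarity(image_vec, captions[c]))

match = best_caption(image_embedding, text_embeddings)
```

Because text and images live in the same vector space, the same similarity score can be used to steer a generator toward a prompt or to label an image it was never explicitly trained to classify.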

Why CLIP is a big deal

  • Zero-shot learning: CLIP’s ability to connect language and images enables “zero-shot” image generation, meaning the AI can generate images from text descriptions it has never seen before. This opens up a world of possibilities for creative expression and customization.

  • Improved prompt understanding: CLIP helps the AI interpret complex and nuanced prompts, leading to more accurate and relevant image generation.

LoRA (Low-Rank Adaptation): Fine-tuning made easy

LoRA is a powerful technique that allows you to fine-tune pre-trained AI image generation models like Stable Diffusion on a smaller, specific dataset. It works by freezing the model’s original weights and training a pair of small, low-rank matrices whose product is added to those weights, enabling the model to learn new concepts or styles from your data without requiring extensive retraining of the entire model.
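Here is a rough sketch of that low-rank update in plain Python. The tiny 4x4 weight matrix, the scaling factor alpha, and the 768-wide layer with rank 8 used for the parameter count are all illustrative assumptions, and the helper names are hypothetical rather than part of any LoRA library.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def lora_effective_weight(w, a, b, alpha):
    """Return W + (alpha / rank) * B @ A. W stays frozen; only A and B are trained."""
    rank = len(a)
    scale = alpha / rank
    delta = matmul(b, a)
    return [[w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]

# Toy frozen weight matrix (4x4 identity) and a rank-1 update.
w = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
b = [[0.5], [0.0], [0.0], [0.0]]   # shape: 4 x rank
a = [[0.0, 2.0, 0.0, 0.0]]         # shape: rank x 4
adapted = lora_effective_weight(w, a, b, alpha=1.0)

# Why it is cheap: trainable parameters for one assumed 768 x 768 layer.
full_params = 768 * 768            # full fine-tuning of the layer
lora_params = 8 * (768 + 768)      # rank-8 LoRA: matrices A and B only
```

In this sketch the adapter trains 12,288 numbers instead of 589,824 for the layer, and swapping styles means swapping only the small A and B matrices.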

Benefits of LoRA:

  • Efficiency: LoRA is significantly faster and less computationally expensive than full model fine-tuning.
  • Flexibility: You can train multiple LoRAs for different styles, subjects, or even individual products, then easily switch between them during image generation.
  • Customization: LoRA allows you to tailor the AI’s output to your specific needs, creating images that align with your brand’s aesthetic or showcase your products in a unique way.

ControlNet: Guiding the AI’s brushstrokes

ControlNet is a revolutionary technique that gives you unprecedented control over the structure and composition of AI-generated images. It works by providing additional input to the model in the form of control images or conditions, such as:

  • Edge Maps: Define the outlines and contours of objects in the image.
  • Pose Estimation: Control the pose and movement of human figures or other objects.
  • Depth Maps: Guide the AI’s understanding of perspective and spatial relationships within the image.
  • Semantic Segmentation: Specify the types of objects or regions present in the image.

By leveraging these control inputs, you can guide the AI’s creative process and ensure that the generated images align with your specific vision.
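To show what an edge-map control image actually is, here is a minimal sketch that turns a tiny grayscale grid into outlines. It is a simplified stand-in: it uses a plain brightness-difference threshold rather than a production edge detector such as Canny, and the function name edge_map and the 4x4 "product shot" are hypothetical.

```python
def edge_map(gray, threshold=0.5):
    """Mark pixels where brightness jumps sharply versus the right or lower neighbor."""
    height, width = len(gray), len(gray[0])
    edges = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # At the image border, compare the pixel with itself (no jump).
            right = gray[y][x + 1] if x + 1 < width else gray[y][x]
            below = gray[y + 1][x] if y + 1 < height else gray[y][x]
            gradient = abs(right - gray[y][x]) + abs(below - gray[y][x])
            edges[y][x] = 1 if gradient > threshold else 0
    return edges

# A bright 2x2 "product" on a dark background.
product_shot = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 1.0, 0.0],
    [0.0, 1.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
control_image = edge_map(product_shot)
```

The resulting grid of ones and zeros is the kind of outline that gets handed to the model alongside the text prompt, so the generated image keeps the product’s shape while everything else changes.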

ComfyUI: Orchestrating the AI symphony

ComfyUI is an open-source node-based workflow tool designed specifically for AI image generation. It allows you to chain together multiple AI models, LoRAs, ControlNet extensions, and other image processing nodes to create complex and customized workflows.

Benefits of ComfyUI:

  • Flexibility: Design and experiment with intricate workflows that combine various AI models and techniques.
  • Control: Fine-tune every step of the image generation process, from initial concept to final output.
  • Automation: Automate repetitive tasks and streamline your image creation pipeline.
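The node-based idea can be sketched generically: each node declares which other nodes feed it, and the runner executes them in dependency order. This is not ComfyUI’s actual API or workflow file format; the run_workflow function, the node names, and the string-returning stand-ins for model calls are all hypothetical. It needs Python 3.9 or later for graphlib.

```python
from graphlib import TopologicalSorter

def run_workflow(nodes):
    """Run each node after the nodes it depends on, passing their outputs along."""
    order = TopologicalSorter({name: spec["inputs"] for name, spec in nodes.items()})
    results = {}
    for name in order.static_order():
        spec = nodes[name]
        args = [results[dep] for dep in spec["inputs"]]
        results[name] = spec["run"](*args)
    return results

# Each "run" is a placeholder for a real step (prompt encoding, ControlNet, upscaling).
workflow = {
    "prompt": {"inputs": [], "run": lambda: "storybook dog at dinner"},
    "edges": {"inputs": [], "run": lambda: "edge-map"},
    "generate": {"inputs": ["prompt", "edges"],
                 "run": lambda p, e: f"image({p} | {e})"},
    "upscale": {"inputs": ["generate"], "run": lambda img: f"upscaled({img})"},
}
outputs = run_workflow(workflow)
```

Swapping one node (say, a different upscaler) leaves the rest of the graph untouched, which is the practical appeal of building pipelines this way.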

Where to find help and resources:

  • Online Communities: OpenArt, Hugging Face, Civitai, and Reddit’s r/StableDiffusion & r/Flux  are great places to connect with other AI enthusiasts, share tips, and find tutorials on LoRA training, ControlNet, and ComfyUI.
  • Open-Source Tools: Stable Diffusion, ControlNet, and ComfyUI are all open-source projects, giving you the freedom to explore, experiment, and even contribute to their development.

What’s on the near-term horizon (prepare to be amazed…and maybe a little freaked out)

Hold onto your hats, because AI image generation is about to blow your mind. We’re not talking about sentient robots painting masterpieces (yet), but the advancements on the horizon are truly game-changing for retail marketers. Get ready to have your expectations exceeded and maybe even question your reality a little:

Hyper-realistic imagery: Where reality and imagination blur

Forget the uncanny valley – AI is closing the gap between generated images and photographs at an astonishing pace. Advancements in diffusion models and neural rendering techniques are leading to images so realistic, it’s getting hard to tell them apart from the real deal. Imagine product photos that look like they were shot in a professional studio, complete with perfect lighting and textures, but without the hassle and expense of a photoshoot.

  • Why this matters: Hyper-realistic imagery can elevate your brand’s visual storytelling, create more immersive shopping experiences, and ultimately drive conversions.
  • Companies to watch: FLUX.1 [pro] is a prime example, generating highly realistic images, and this is the worst it will ever be.

Interactive and 3D models: Your products, in a whole new dimension

Tired of static product photos that only show one angle? Get ready for interactive 3D models that customers can spin, zoom, and explore from every angle, all from the comfort of their own screens. These models can be seamlessly integrated into your website or app, giving shoppers a more immersive and engaging experience, almost like they’re holding the product in their hands. 

  • Why this matters: Interactive 3D models can significantly reduce return rates, increase customer engagement, and boost sales. By allowing customers to visualize products in a more realistic way, you’re empowering them to make more informed purchase decisions. 
  • Companies to watch: VNTANA is a platform that helps retailers create and deploy interactive 3D models at scale, while Threekit offers a suite of tools for product customization and visualization.

AI-powered fashion design (where algorithms meet aesthetics)

Move over, human designers! AI is starting to flex its creative muscles in the fashion world. Imagine AI algorithms generating unique clothing designs, patterns, and color palettes based on trend analysis, customer preferences, and even your brand’s DNA. This could lead to hyper-personalized fashion recommendations and even on-demand clothing creation.

  • Why this matters: AI-powered fashion design has the potential to disrupt the industry, offering faster design cycles, more personalized offerings, and potentially even sustainable production practices.
  • Companies to watch: Stitch Fix’s Freestyle uses AI to recommend clothing based on personal style, while companies like Vue.ai and Heuritech are leveraging AI for trend forecasting and product design.

AI-generated virtual photoshoots

Say goodbye to expensive photoshoots and logistical nightmares. AI is making it possible to create virtual photoshoots with stunningly realistic models showcasing your products. These AI models can be customized to represent diverse body types, ethnicities, and styles, allowing you to cater to a wider audience and showcase your products in a more inclusive way.

  • Why this matters: Virtual photoshoots offer unparalleled flexibility and cost-effectiveness. You can change backgrounds, outfits, and even model poses with just a few clicks, all without the need for physical models, photographers, or studio space.
  • Companies to watch: Lalaland.ai and Rosebud AI are leading the charge in creating realistic AI-generated models for fashion and beauty brands.

AI talent agencies (the rise of the machines…in your marketing department)

Think AI image generation is impressive? Wait till you hear about AI talent agencies. These platforms are leveraging AI models to create everything from logos and marketing copy to entire advertising campaigns. They’re not just tools for creating visuals – they’re full-fledged creative partners. 

  • Why this matters (and why you might be freaking out a little): AI talent agencies are disrupting the traditional creative agency model. They can deliver high-quality creative assets at a fraction of the cost and time, potentially democratizing access to top-notch design and marketing talent. But it also raises questions about the role of human creativity and the future of work in the creative industries.
  • Companies to watch: Pencil and Flair AI are just a few of the companies exploring the potential of AI-powered creative services. They’re offering everything from AI-generated video ads to social media campaigns that adapt in real time based on performance data.

What’s overhyped 

Let’s pump the brakes on the AI hype train for a minute. While AI image generation is undeniably cool and powerful, it’s not a magical solution for all your visual content needs. Here are a few things to keep in mind before you go all-in on AI-generated images:

AI replacing human designers and photographers (not so fast) 

Look, AI is a tool, not a replacement for human creativity and expertise. While it can generate impressive images, it still needs human guidance and direction. AI can’t understand the nuances of your brand’s aesthetic, your target audience’s preferences, or the subtle emotional cues that make an image truly impactful. Think of AI as a talented assistant, not the creative director.

  • Why this is misleading: Believing this myth could lead you to devalue the skills of your human creatives. It’s essential to recognize the unique value that human designers and photographers bring to the table – their ability to interpret briefs, develop creative concepts, and make subjective decisions based on experience and intuition. AI can enhance their work, but it can’t replace it entirely.

Perfect images every time 

Let’s be real – AI image generation is still a work in progress. While it can produce stunning results, it’s not always perfect. It can struggle with complex prompts, generate images that are blurry or distorted, or produce results that are just plain weird or off-putting. And let’s not forget the potential for bias, where AI models may inadvertently perpetuate harmful stereotypes or exclude certain groups. 

  • Why this is misleading: Expecting perfection from AI will only lead to disappointment. It’s important to set realistic expectations and understand that AI is still a tool in development. While it can save you time and resources, it’s not a substitute for human quality control and artistic judgment.

“Effortless” image creation

Generating high-quality AI images still requires skill and effort. Crafting effective prompts, fine-tuning models, and curating datasets takes time and expertise. It’s not as simple as typing a few words and getting a masterpiece. 

  • Why this is misleading: The illusion of effortless image creation can lead to unrealistic expectations and wasted resources. Be prepared to invest time and effort into learning the tools and techniques, and don’t expect instant perfection.

The BS meter 

Time to put on your BS detectors. The AI image generation space is rife with overblown claims, wild exaggerations, and just plain nonsense. Let’s take a closer look at some of the biggest offenders:

  • AI understands aesthetic preferences like a human: AI can analyze data and identify patterns, but it doesn’t “understand” beauty or aesthetics in the way humans do. Taste is subjective, and what one person finds beautiful, another might find boring or even ugly. AI can learn to mimic certain styles and preferences, but it can’t replicate the nuanced and often emotional responses that humans have to visual art.
  • AI eliminates the need for human creatives: Sure, AI can generate visuals faster and cheaper than humans. But that doesn’t mean it’s going to replace us anytime soon. Human creatives bring essential skills to the table, like critical thinking, problem-solving, and the ability to understand and connect with audiences on a deeper level. AI can be a valuable tool in the creative process, but it’s not about to make artists, designers, and photographers obsolete.
  • AI generates art that’s just as good as the real thing: Look, we appreciate a good AI-generated landscape or portrait as much as the next person. But let’s not kid ourselves – it’s not the same as a piece of art created by a human hand and imbued with human emotion and experience. AI can mimic styles and generate visually pleasing images, but it can’t replicate the soul and depth of true art.

Bonus round: Other BS claims 

  • AI can read your mind and create images you’ll love: While AI can learn your preferences based on your behavior, it can’t read your mind. True creativity often involves pushing boundaries and exploring new ideas, something AI is not yet capable of.
  • AI-generated images are always ethical and unbiased: AI models are trained on data created by humans, which means they can inherit our biases and prejudices. It’s crucial to be mindful of this and ensure that AI-generated images are used responsibly and ethically.

The AI image revolution is here (but let’s not get carried away)

So, there you have it – a glimpse into the thrilling (and sometimes bewildering) world of AI image generation. We’ve explored what’s possible today, what’s just around the corner, and what’s just plain BS.

The bottom line? AI image generation is a game-changer for retail marketers. It can help you:

  • Slash production costs and timelines: Say goodbye to expensive photoshoots and endless revisions.
  • Unleash your creativity: Generate a virtually limitless supply of visuals tailored to your brand and audience.
  • Boost engagement and conversions: Create more immersive and emotionally resonant experiences for your customers.
  • Iterate faster: More swings at bat typically mean more hits, and AI in the proper workflow allows you to iterate and improve incredibly quickly.

Next up: AI and the sound of the future – audio and video

And since you made it this far, here’s my favorite photo Franklin created. When my friends sent one doing duck face (look out, Instagram influencers)… welp, Franklin decided he wanted in on the action. Sometimes AI hallucinations work out in fantastic ways!