How to Write Grok Image Prompts That Actually Work

Grok’s image generator is one of the more capable text-to-image tools available right now, and it is accessible to anyone with an X Premium subscription or via grok.com. The model behind it, Aurora, is an autoregressive mixture-of-experts network trained on billions of examples of interleaved text and image data. It follows prompts closely, handles flat vector illustration and photorealistic styles well, and generates usable results quickly.

The quality of the output depends almost entirely on the quality of the prompt. Here is what works, based on prompts used in production for fastai.news today.

How to Access It

Grok image generation is available in three ways. On X, open Grok in the sidebar; an X Premium subscription is required in practice, because free accounts have heavily limited or no access. At grok.com, the same capability is available in the browser without the X app. Grok Imagine, launched in February 2026, is a standalone image-focused tool from xAI that offers batch generation, multiple aspect ratios, and resolutions up to 2K on the Pro tier.

To generate an image, simply describe what you want in plain language. Grok handles the translation. The skill is in how precisely you describe it.

Prompt Structure That Works

Effective Grok prompts follow the same five-part structure across different use cases:

Style first. Open with the visual style before anything else. “Flat vector illustration” produces clean, graphic results. “Photorealistic” pushes toward lifelike rendering. “Minimal editorial” gives clean, publication-ready output. Aurora reads the style descriptor as a frame for everything that follows.

Background and palette second. State the background colour and primary accent colours explicitly. Aurora will pick its own palette if you don’t, and it may not match your brand or intended context. “Dark navy background, teal and white accents” is precise. “Dark background” is not.

Subject and composition third. Describe the primary subject, its position, and what is happening in the image. Be specific about what the subject is doing, what surrounds it, and how the composition is arranged.

Mood and tone fourth. “Clean, modern, editorial tech aesthetic” tells Aurora the overall feel. This affects lighting, spacing, and how elements relate to each other.

Negative instructions last. Tell Aurora what to exclude. “No text” is the most commonly needed instruction – Aurora will add labels and captions unless told not to. Other useful exclusions: “no people,” “no gradients,” “no drop shadows.”

Real Examples from Today

These are the exact prompts used to generate images for fastai.news articles published on Tuesday, April 7, 2026.

Article hero – WordPress automation:

Flat vector illustration, dark navy background, teal and white accents. A minimalist robot figure sits at a desk with a glowing laptop screen showing the WordPress logo and lines of clean code. Data flows as thin teal lines from the laptop into floating icons representing a document, a search result snippet, and a schema diagram. Clean, modern, editorial tech aesthetic. No text.

Result: a robot at a desk with the WordPress logo visible on screen, teal data streams flowing outward to floating UI elements. Clean, on-brand, immediately readable at article width. Generated in one attempt without revision.

Article hero – Algrow YouTube MCP:

Flat vector illustration, dark navy background, teal and white accents. A minimalist AI figure sits at a screen showing a YouTube play button surrounded by flowing data streams – subscriber counts, view graphs, channel thumbnails. The data flows in teal lines from the YouTube logo into a Claude-style chat interface. Clean, modern, editorial tech aesthetic. No text.

Result: an AI figure at a screen with YouTube data flowing into a chat interface. The play button reads clearly and the data stream elements give the composition structure without clutter.

X profile header banner:

Wide banner image, 1500×500 pixels, dark navy background. The fastai.news logo with lightning bolt appears left of centre in teal and white. Thin teal horizontal data stream lines flow across the banner from left to right, fading at the edges. Minimal, modern, editorial tech aesthetic. No text other than the logo.

Result: a clean banner with the logo positioned correctly, teal circuit lines across the dark background, suitable for direct upload to X without any editing. The logo was recognised and rendered accurately from the brand name alone.

What Aurora Does Well

Flat and minimal illustration styles are a strength. Aurora produces clean vector-style output reliably when prompted for it, which makes it well suited for editorial and tech publication use where over-rendered photorealistic images would be out of place.

Prompt adherence is high. Aurora follows specific instructions about composition, colour, and subject position closely. In the banner example above, “left of centre” put the logo left of centre, and in all three examples the specified palette was used.

Text rendering has improved. The January 2026 update reportedly improved typography handling. In practice, Aurora can render legible text in images when prompted, though “no text” remains the safer instruction for illustrations where you want a clean result.

What to Watch For

Abstract concepts need concrete anchors. “Artificial intelligence” as a subject produces unpredictable results. “A robot figure at a desk” or “a network diagram” gives Aurora something specific to render. Ground abstract ideas in physical objects and recognisable interfaces.

Anatomy in complex scenes can be inconsistent. For editorial illustration work where human figures are incidental rather than central, this is rarely a problem. For portrait work it is worth generating several variations and selecting the best.

Free access is limited. X Premium is required for regular use on the X platform. The grok.com interface offers some free generations but throttles quickly. For production use, a subscription is the practical requirement.

The Workflow

For publication use, the process that worked reliably today was: write the prompt in Claude using the structure above, pass it to Grok, review the result, upload directly to the WordPress media library. No post-processing was needed on any of the three images generated. Total time from prompt to uploaded image: under two minutes per image.
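
If the final upload step were scripted rather than done by hand, it could go through the WordPress REST API media endpoint (/wp-json/wp/v2/media), which accepts the raw file bytes in a POST request. The sketch below only builds that request using the Python standard library. It assumes authentication with a WordPress application password over HTTPS, and the function name, site URL, and credentials are placeholders, not details of the fastai.news setup.

```python
import base64
import mimetypes
import urllib.request


def build_media_upload_request(site_url, filename, image_bytes, username, app_password):
    """Build (but do not send) a POST request for the WordPress media endpoint.

    Assumes Basic auth with an application password; all arguments are
    placeholders supplied by the caller.
    """
    credentials = f"{username}:{app_password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    # Guess the MIME type from the file extension, with a generic fallback.
    content_type = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    return urllib.request.Request(
        url=f"{site_url.rstrip('/')}/wp-json/wp/v2/media",
        data=image_bytes,
        method="POST",
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": content_type,
            "Content-Disposition": f'attachment; filename="{filename}"',
        },
    )


# Sending is left to the caller, for example:
#   with urllib.request.urlopen(request) as response:
#       print(response.status)
```

Building the request separately from sending it keeps the review step in the loop: the image is still checked by eye before anything reaches the media library.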

The prompt quality is the variable that matters. A well-structured prompt produces a usable image in one attempt. A vague prompt produces results that require multiple iterations or editing. The structure outlined above eliminates most of that friction.

John Moore

John Moore is the editor of fastai.news, an independent publication covering developments in artificial intelligence.

He founded fastai.news in April 2026 to apply the same rigorous, neutral reporting standards he established at Powerboat News – his international publication – to the fast-moving world of AI.

With no filler and no opinion, fastai.news reports what is happening in AI models, research, business and tools, and leaves readers to draw their own conclusions.

John is based in Buckinghamshire, England.