Canonical generative AI pattern for assistants or companions that conduct multi-turn conversations across text plus at least one other modality (image, voice, audio, video, or others), often with memory, personalization, moderation, policy controls, or grounded tool use. Map only when conversational interaction is the primary AI product and multimodality is explicit. Do not map one-way content generation, non-conversational multimodal classification, or simple chatbot wrappers with no multimodal capability.
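
The mapping rule above can be read as a simple predicate. The following is a minimal sketch, not a definitive implementation: the `ProductProfile` type, its field names, and the `maps_to_pattern` function are hypothetical names introduced for illustration, and treating "multimodality is explicit" as "at least one explicitly supported non-text modality" is an assumption about how the prose should be interpreted.

```python
from dataclasses import dataclass, field


@dataclass
class ProductProfile:
    """Hypothetical description of a product being considered for mapping."""
    conversation_is_primary: bool  # conversational interaction is the primary AI product
    multi_turn: bool               # the assistant conducts multi-turn conversation
    modalities: set[str] = field(default_factory=set)  # explicitly supported modalities


def maps_to_pattern(p: ProductProfile) -> bool:
    """Return True when the product fits the pattern as read from the definition.

    Assumption: multimodality is "explicit" when at least one non-text
    modality is listed for the product.
    """
    non_text = p.modalities - {"text"}
    return p.conversation_is_primary and p.multi_turn and bool(non_text)


# Exclusions named in the definition, expressed as profiles.
voice_companion = ProductProfile(True, True, {"text", "voice", "image"})
one_way_generator = ProductProfile(False, False, {"text", "image"})
multimodal_classifier = ProductProfile(False, False, {"image", "audio"})
text_only_chatbot = ProductProfile(True, True, {"text"})
```

Under these assumptions, only `voice_companion` satisfies the predicate. The other three profiles correspond to the stated exclusions: one-way content generation, non-conversational multimodal classification, and a chatbot wrapper with no multimodal capability.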