AgentCon Vienna 2026 Recap: From Chat to Voice Agents

AgentCon Vienna 2026 was a fantastic day of hands‑on sessions and lively hallway conversations around practical AI, autonomous agents, and real-world enterprise adoption. Thanks to everyone who joined and contributed.

If you’re looking for the agenda and event details, see the official page: https://globalai.community/chapters/vienna/events/agentcon-vienna/

General Recap

The spotlight was on autonomous agents moving beyond pure chat into action.
Safety, observability, and control were recurring themes across sessions.
Enterprise integration (data, tools, identity) dominated demos and discussions.
Real-time interaction modalities (voice, streaming, parallel tool-calls) gained momentum.
Tooling maturity improved: clearer patterns for planning, tool orchestration, and guardrails.

My Session: Voice Agents: The Next Step After Chat

Chat-based AI is everywhere — but it’s not the end of the journey. The next logical step is Voice Agents that listen, speak, and perform actions autonomously.

In my session, I demonstrated a fictional enterprise use case to show how a voice agent works in practice:

Responds to voice commands.
Performs tool-calls to query data and execute concrete actions.
Streams results and confirmations back to the user.
Operates with enterprise guardrails (auth, logging, rate limits) for safety.
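
To make the guardrail point concrete, here is a minimal sketch of a tool-call wrapper that checks authorization, logs each call, and enforces a simple rate limit. It is not the demo code and does not use any Azure AI Foundry API. The names `Tool`, `GuardedToolRunner`, `lookup_order`, and the role `"support"` are hypothetical and exist only for illustration.

```python
import logging
import time
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Deque, Dict, Optional, Set

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("voice_agent")


@dataclass
class Tool:
    name: str
    handler: Callable[..., str]
    required_role: str  # hypothetical role a caller needs for this tool


@dataclass
class GuardedToolRunner:
    tools: Dict[str, Tool]
    max_calls: int = 3            # allowed calls per window (assumed value)
    window_seconds: float = 60.0  # sliding window length (assumed value)
    _calls: Deque[float] = field(default_factory=deque, init=False)

    def call(self, tool_name: str, user_roles: Set[str],
             now: Optional[float] = None, **kwargs: str) -> str:
        now = time.monotonic() if now is None else now
        tool = self.tools.get(tool_name)
        if tool is None:
            return f"Unknown tool: {tool_name}"
        # Auth: the caller must hold the role the tool requires.
        if tool.required_role not in user_roles:
            log.warning("denied tool=%s roles=%s", tool_name, sorted(user_roles))
            return "Not authorized."
        # Rate limit: forget timestamps that fell out of the window.
        while self._calls and now - self._calls[0] >= self.window_seconds:
            self._calls.popleft()
        if len(self._calls) >= self.max_calls:
            log.warning("rate limited tool=%s", tool_name)
            return "Rate limit reached, please try again later."
        self._calls.append(now)
        # Logging: record every executed call for later audit.
        log.info("calling tool=%s args=%s", tool_name, kwargs)
        return tool.handler(**kwargs)


def lookup_order(order_id: str) -> str:
    # Hypothetical tool; a real one would query an enterprise system.
    return f"Order {order_id} is shipped."
```

A runner built as `GuardedToolRunner(tools={"lookup_order": Tool("lookup_order", lookup_order, "support")})` would then answer `runner.call("lookup_order", {"support"}, order_id="A1")` and refuse callers without the role. Keeping the checks in one wrapper means every tool gets the same treatment regardless of which model or voice front end triggers it.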

The demo is built on Azure AI Foundry, illustrating how modern enterprise voice agents can be implemented, controlled, and safely operated.

A Memorable Opening

I opened with a playful dream: all attendees were riding in a big harvester while a rooster was crowing — the perfect cue to introduce voice. From that scene, we transitioned into the core idea: speaking is the most natural interface, and agents should meet us there.

From Chat to Voice: The Evolution

From chat to dictation: capturing intent faster than typing.
Adding STT and TTS: benefits and challenges (latency, punctuation, disfluencies, accents).
Real-time models: streaming speech and tool-calls for true conversational workflows.
Enterprise needs: reliability, auditability, handoffs, and guardrails baked into the flow.
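
As a small illustration of the disfluency challenge mentioned above, the following sketch cleans a raw speech-to-text transcript before it is passed on for intent detection. It assumes the transcript already arrives as plain text. The filler list and the function name `clean_transcript` are my own illustrative choices, not part of any STT service.

```python
import re

# Assumed filler words; a real deployment would tune this per language.
FILLERS = ("um", "uh", "erm", "you know")

_FILLER_RE = re.compile(
    r"\b(?:" + "|".join(re.escape(f) for f in FILLERS) + r")\b,?\s*",
    re.IGNORECASE,
)
# Matches a word immediately repeated one or more times ("the the").
_REPEAT_RE = re.compile(r"\b(\w+)(\s+\1\b)+", re.IGNORECASE)


def clean_transcript(raw: str) -> str:
    """Remove filler words and stuttered repeats, then normalize spaces."""
    text = _FILLER_RE.sub("", raw)
    text = _REPEAT_RE.sub(r"\1", text)
    return re.sub(r"\s+", " ", text).strip()


print(clean_transcript("um, show me the the open orders uh please"))
# prints: show me the open orders please
```

Real-time speech models handle much of this internally, so a step like this mainly matters in a classic STT, then LLM, then TTS pipeline.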

Live Demo Highlights (Azure AI Foundry)

Voice command routing: recognize intents and select the right tools.
Tool orchestration: query data sources and perform actions with confirmations.
Streaming UX: low-latency feedback with partial results and turn-level summaries.
Safety controls: observability, limits, and fallbacks to keep operations in bounds.
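
The routing, confirmation, and streaming ideas above can be sketched in a few lines. This is a simplified stand-in, not the demo implementation: it uses keyword matching instead of a model, and the tools `check_inventory` and `create_ticket` as well as the `ROUTES` table are hypothetical.

```python
from typing import Callable, Dict, Iterator, Optional, Tuple


def check_inventory(item: str) -> str:
    # Hypothetical read-only tool.
    return f"{item}: 12 in stock"


def create_ticket(item: str) -> str:
    # Hypothetical tool with a side effect, so it needs confirmation.
    return f"ticket created for {item}"


# keyword -> (tool, needs_confirmation)
ROUTES: Dict[str, Tuple[Callable[[str], str], bool]] = {
    "stock": (check_inventory, False),
    "ticket": (create_ticket, True),
}


def route(transcript: str) -> Optional[Tuple[Callable[[str], str], bool]]:
    """Pick the first tool whose keyword appears in the transcript."""
    text = transcript.lower()
    for keyword, target in ROUTES.items():
        if keyword in text:
            return target
    return None


def handle_turn(transcript: str, item: str,
                confirmed: bool = False) -> Iterator[str]:
    """Yield messages one by one so the caller can speak them as they come."""
    match = route(transcript)
    if match is None:
        yield "Sorry, I did not understand that."
        return
    tool, needs_confirmation = match
    if needs_confirmation and not confirmed:
        yield f"Please confirm: run {tool.__name__} for {item}?"
        return
    yield f"Running {tool.__name__}..."      # early, low-latency feedback
    yield tool(item)                          # the actual result
    yield f"Done: {tool.__name__} finished."  # turn-level summary


for message in handle_turn("How much stock do we have?", "widgets"):
    print(message)
```

Because `handle_turn` is a generator, a voice front end can start speaking the first message while the tool is still running, and an action with side effects only executes once the caller passes `confirmed=True`.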

Samples and Key Facts

The session samples repo is here: https://github.com/petkir/session-samples/blob/main/agentcon_vie_2025

The samples illustrate the end-to-end flow for voice agents in an enterprise context: intent capture, tool-calling, action execution, streaming responses, and safety/observability patterns.

Resources

AgentCon Vienna: https://globalai.community/chapters/vienna/events/agentcon-vienna/
Session samples: https://github.com/petkir/session-samples/blob/main/agentcon_vie_2025
Sample notes: https://github.com/petkir/session-samples/blob/main/agentcon_vie_2025/idea.md

If you’d like me to share the deck or a starter scaffold for building a voice agent on Azure AI Foundry, let me know.
