RTSP, SIP, and H.323 each manage the state of a video or audio session but do not send the video or audio data itself; the media is normally carried by RTP.
RTSP was designed for setting up and controlling one-way real-time video and audio feeds. Once the session has been established, RTSP typically doesn’t carry the audio and video data itself, and the player isn’t required to keep a persistent TCP connection open. When the player needs to interact with the server again, it can re-establish the connection and pass the session ID in the Session header of its request.
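As a rough illustration of the reconnect-and-resume behavior described above, here is a minimal Python sketch that only formats and parses RTSP text; it does not open any sockets. The helper names (`build_rtsp_request`, `parse_session_id`), the URL, and the sample session value are hypothetical and chosen for illustration. The request line, `CSeq`, and `Session` header follow the general RTSP/1.0 message layout.

```python
def build_rtsp_request(method, url, cseq, session_id=None, extra_headers=None):
    """Format an RTSP/1.0 request as text, optionally tagged with a session ID."""
    lines = [f"{method} {url} RTSP/1.0", f"CSeq: {cseq}"]
    if session_id is not None:
        # The Session header ties this request to a session created by SETUP,
        # even if it arrives over a brand-new TCP connection.
        lines.append(f"Session: {session_id}")
    for name, value in (extra_headers or {}).items():
        lines.append(f"{name}: {value}")
    return "\r\n".join(lines) + "\r\n\r\n"


def parse_session_id(response):
    """Return the session ID from a response's Session header, or None if absent."""
    for line in response.split("\r\n")[1:]:  # skip the status line
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "session":
            # Drop optional parameters such as ";timeout=60".
            return value.split(";")[0].strip()
    return None


# Hypothetical SETUP response, as a server might return it.
setup_response = (
    "RTSP/1.0 200 OK\r\n"
    "CSeq: 2\r\n"
    "Session: 12345678;timeout=60\r\n"
    "Transport: RTP/AVP;unicast;client_port=8000-8001\r\n"
    "\r\n"
)

session_id = parse_session_id(setup_response)

# Later, possibly on a fresh TCP connection, the player refers back to the session.
pause_request = build_rtsp_request(
    "PAUSE", "rtsp://example.com/stream", cseq=3, session_id=session_id
)
print(pause_request)
```

The point of the sketch is that session state is keyed by the ID rather than by the TCP connection, which is why the player can drop the connection between requests.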
If your use case fits RTSP, Wowza is a good option to get up and running quickly.
SIP and H.323 are used for two- and multi-way calls.
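SIP is the simpler, text-based protocol with a smaller core feature set, while H.323 is a larger ITU-T protocol suite that specifies more in the standard itself, including multipoint conference control such as adding a participant to a meeting after it begins. Both support authentication: SIP uses HTTP-style digest authentication, and H.323 defines its security in H.235.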
The Asterisk media server has support for SIP and H.323, and is open source. The company that produces Asterisk, Digium, also sells audio hardware that can be used for audio compression, as well as voice T1 and T3 integration.