© 2000, Video Development Initiative



brought to you by the

Video Development Initiative

with contributions from
University of Alabama at Birmingham
Canarie
Georgia Institute of Technology
University of North Carolina at Chapel Hill
NYSERNet, Inc.
University of South Carolina
Southeastern University Research Association
University of Tennessee
William & Mary
Yale University

Version 2.0
April, 2000

Table of Contents

Preface

Introduction

Uses of Video Conferencing

Getting Started with Video Clients

Practical Video Conferencing Steps

Network Requirements

Selecting and Tuning your PC

Advanced Video Conferencing Functionality and Management

Related Topics

Glossary of Terms

Appendices

  1. Developing a Productive Video Conference Room
  2. H.323 Specification
  3. Client Experience Summaries
  4. Interesting Web Sites on Video Conferencing

Bibliography

Contributors

Acknowledgements


Preface for Version 2.0

The first public videoconference was held in April 1930 between AT&T headquarters and its Bell Laboratory in New York City. [Rosen] Microphones and loudspeakers transmitted the audio while, under a blue light, the participants' images were captured and transmitted as they looked into photoelectric cells. An article in the April 10 edition of the New York Daily Mirror described the audio as clear and the image as inoffensive (a term commonly used for driver's license photos but not often heard today for videoconferencing!). It was at this time that the value of face-to-face conversation at a distance was expressed.

Shortly thereafter, in 1933, the FCC was formed when much radio and television traffic began to collide. In 1934, the standards wars between companies began with the FCC intervening to establish hearings and approve standards. In 1941 the first analog standard for television, with 4.2 MHz of bandwidth (525 scan lines and 30 frames per second), was adopted. By the 1950's we had 83 channels covering the frequencies 54 to 890 MHz.

But it was more than thirty years after that first AT&T videoconference before the first videoconferencing product was introduced on the market. In 1964, AT&T introduced its Picturephone at the New York City World's Fair. This system, marketed as an exclusive executive tool, required 1 MHz processing power (considered daunting at the time) and provided the first data sharing feature. In 1971, the first transatlantic videoconference occurred between two Ericsson systems (a product named LME). Some twenty years after that, desktop videoconferencing clients became available. Intel, PictureTel, and VTEL, names with which we are still familiar today, were some of the early desktop players.

Welcome to Version 2.0 of the cookbook. We're delighted that you decided to stop by. If you are new to video conferencing and the cookbook, we hope you find it easy to use and that the cookbook helps ease your entry into one of the newest and most promising uses of the Internet. If you've already read Version 1.0 or are a veteran of video conferencing, we hope that you find the new content here interesting and useful.

Changed material is noted as follows:

Introduction
The content description has been revised to include new sections and major revisions. The standards summary has been updated to current status. The ingredients have been changed to more up-to-date terms.

Applications
This section has undergone major review. Some new material has been included.

Getting Started
The brief networking and PC selection discussions have been eliminated since these are now full sections of the cookbook. The component section has been revised slightly.

Practical Steps
The VCON example has been updated for release 4.01. Application Sharing scenarios have been added, one for point to point and one for multipoint.

Network Requirements
This is a new section. It describes the typical network connection needed for good video conferencing. It includes a list of typical problems seen when the network connection is underconfigured or experiencing problems. It also suggests whom to talk to at your site and which tools to use to monitor your connection.

Selecting and Tuning Your PC
This is a new section. Presented in "Dear Gabby" form, it provides a sort of FAQ on things we've discovered are necessary on the PC side for good video conferencing.

Advanced Functionality and Management
This section has undergone major review and rewrite.

Related Topics
This section is new though some of the topics in it have been moved here from other sections. One new topic is "What About Multicast", an overview of the differences between regular IP and Multicast over IP for video conferencing along with its strengths and the issues facing its deployment. Another new topic is "Models for Campus Deployment" which outlines the state of H.323 deployment, gives several case studies of deployment, and finally discusses the issues that have affected that deployment.

Glossary
Terms have been updated or added as needed.

Appendices
Summary sections on H.323 and T.120 have been reviewed and updated. Summary sections on SIP and MGCP/MEGACO have been added. The Experience Summary has been totally revised. Interesting VC Sites has been updated with new links to other interesting work as well as with presentations by ViDe members on video conferencing.

Contributors
This section has been updated to include authors from the Phase II ViDe group.

Acknowledgements
This section has been updated to include new sponsoring schools and organizations.

Introduction


How to Use this Cookbook
What is Video Conferencing?
Who are the Intended Readers?
Why are Standards, Openness, and Interoperability Important?
What are the Basic Ingredients?


How to Use this Cookbook

The Interface

Since this cookbook is a web document, we should give you a few guidelines on how best to view it. First, this is a framed document. The left-hand frame contains a general table of contents. Clicking on any item in this list will cause the content in that section to come up into the main frame on the right-hand side (where you are probably reading this now.)

Secondly, this cookbook has been developed to work with Netscape Navigator 4.0+ and Internet Explorer 4.0+. Some aspects of the interface may not work with earlier versions of these browsers. It has been designed to work best with a 17 inch monitor. If you are using a smaller monitor, scroll bars will appear in several of the frames. Your window should be resized as large as possible before loading this document. Sizing afterwards can cause formatting problems.

You will notice that the bottom frame contains a number of buttons.
  This button will load a printable copy of the cookbook in a separate browser window.
  These buttons will allow you to move back and forth in the cookbook sections. This is not the same back and forward as the window history back and forward. The window history can be accessed, as usual, with the browser commands within the main content window.
  This button will load a more detailed table of contents in the main window.
  This button will load the glossary of terms into a separate browser window. Using a separate window allows you to look up terms as you are reading cookbook content.
  We intend to expand and improve this document in future versions. We would like to hear your comments and suggestions. This button will load a feedback form in a separate browser window.

Lastly, the bottom frame contains copyright and contact information. You can easily reach the cookbook editor by clicking on the link.

The Content

The early or novice user will find the first six sections (Introduction, Applications, Getting Started, Practical Steps, Network Requirements, and Selecting and Tuning your PC) of this cookbook useful. They will introduce you to ways to use video conferencing and tell you how to get started.

More advanced users of video conferencing will probably find the next two sections (Advanced Functionality and Management and Related Topics) more interesting. They will tell you how to better evaluate components and how to move on to larger conference groups.

Early integrators may want to review all sections of this cookbook, both to prepare for likely needs within their community and to become familiar with the types of equipment they will be asked to support.

We hope everyone will benefit from the information contained in the glossary, appendices, and bibliography.

Version 2.0

This cookbook has undergone one revision. The Version 1.0 content was updated as needed. Several new sections were added, including Best Practices, Video Conferencing Etiquette, Network Requirements, Selecting and Tuning Your PC, What About Multicast, and Models for Campus Deployment. Advanced Features and Going Further were reorganized into Advanced Functionality and Management. The Experience Summaries have been reformatted.

Major revisions will appear in a shaded box, as is this text.

What is Video Conferencing?

Video conferencing in its most basic form is the transmission of image (video) and speech (audio) back and forth between two or more physically separate locations. This is accomplished through the use of cameras (to capture and send video from your local endpoint), video displays (to display video received from remote endpoints), microphones (to capture and send audio from your local endpoint), and speakers (to play audio received from remote endpoints). Although there are many factors that serve to modify or increase the complexity of this basic definition (several of which are discussed in this cookbook), it is useful to keep the concept simple in the beginning when deciding why or how you may be able to use video conferencing for yourself or your organization.

In understanding the role that video conferencing could play, consider two general situations: a) those where you are already able to communicate with someone who is not physically nearby, but you wish that communication could be richer, and b) those where you wish to access or communicate to a location that may or may not be nearby but is limited by situational or physical constraints. Distance education often comes to mind first when considering the former situation, but several other existing types of communications can also be enhanced or extended. These include organizational and cross-organizational meetings, counseling, foreign language and cultural exchanges, and telecommuting. Communication is already occurring in each of these applications, but they could be made more compelling, more effective, or less expensive via video conferencing. (Imagine a telephone call where you can see the speaker, or a television through which you can talk.) For the latter situation, the introduction of video conferencing has enabled communication to restricted areas such as clean rooms, nuclear facilities, operating rooms, and the space shuttle. It has been used to observe wildlife in their natural habitat, to establish interactive surveillance and security, and, combined with micro-instrumentation, to observe inside the human body. This side of video conferencing may not come to mind as readily as the enhancement of simple communication but it can be quite powerful. Simply imagine situations where you might like to be a "fly on the wall", with the ability to interact if desired. To imagine even further, consider that video conferencing can be point-to-point (between two endpoints), or multi-point (combining two or more endpoints into the same "conversation"). When you begin to combine diverse endpoints into one setting where audio and video from each can be shared in real-time, whole new levels of interaction are enabled and entirely new ideas for communication can result.

Once you determine that video conferencing is for you, you need to be aware that it is not currently a "plug-and-play" technology. Video conferencing actually began over a decade ago with the introduction of expensive group conferencing systems designed to send and receive compressed audio and video over network connections that could guarantee a dedicated rate of transmission and predictable service (i.e., point-to-point T1 or fractional T1 communication links, or switched connections using ISDN). Standards surrounding how the audio and video would be compressed, how the endpoints would communicate with each other (i.e., initiating/terminating calls, negotiating audio/video compatibility, indicating error conditions during a call), and how the video streams would travel over the network eventually evolved, but systems were not fully interoperable at the start. Still, evolution persisted and useful video conferencing using what is today called the H.320 standard did finally emerge. However, this also meant that video conferencing was restricted to a) those who could afford the technology and network connections to establish meeting rooms, and b) those who were able to travel to a video-conference enabled meeting location.
 
 
As time has gone on, the above restrictions have changed. Technology for conducting video conferencing has become less expensive, more flexible, and now includes options for desktop video conferencing as well as group video conferencing. More ubiquitous network types, particularly TCP/IP as used on the Internet, are being called upon to provide less expensive and more flexible connections. In conjunction with this, a new ITU (International Telecommunication Union) standard has emerged for supporting audio/video conferencing over IP. This new standard, called H.323, was first approved by the ITU in 1996. Since then, the standard has evolved through additional versions and has been implemented in multiple vendors' products. Those products will be the focus of this cookbook, which will also touch on many other factors required for a thorough understanding of video conferencing. These include the importance of standards, video conferencing needs assessment, application possibilities, basic equipment selection and use, and advanced components and services. It is both hoped and anticipated that this cookbook will help you to move from imagining what you might do with video conferencing to a successful and effective video conferencing deployment.

Who are the Intended Readers?

This video conferencing cookbook has been prepared for academic and research users on advanced IP networks around the world. We feel that the span of that topography and experience will make the cookbook valuable to any academic institution desiring to implement video conferencing for local, state, regional, national, and international communications.

The application examples here are targeted towards the academic and research community in particular and include meetings (one-on-one to many-on-many), classes, and collaboration. The audience levels will range from the beginning user of video conferencing to the intermediate/advanced user of video conferencing to the new organizational integrator. It is expected that the beginning user has operational skills on a Windows type of workstation, including general software installation skills. It is expected that the new integrator has knowledge and skills relating to their local and extended networks as well as general server-asset support.

In attempting to analyze the audience attitudes toward video conferencing, it is acknowledged that attitudes will range from excitement and abandon to caution and order to even skepticism and apprehension. The expectations of video conferencing are likely to range from top end audio and video ("Why, it's like you're right across the table!") down to good audio with passable video ("Is that a new hair-do or is your camera malfunctioning?")

For the beginning user the objective is to familiarize you with the concept and uses of video conferencing. To that end we will lay out a potential strategy for selecting and purchasing a video conferencing product, suggest steps to follow in learning the use of the product, and share ideas about how to introduce the product into your professional life.

For the intermediate user the objective is to bring out new ideas for your use of video conferencing and to familiarize you with some advanced features and enhanced components for video conferencing.

For the new integrator the objective is to familiarize you with potential uses of video conferencing at your site, to introduce the different components and services which will be required to support those uses, and to share experiences and shortcuts for such support.

Why are Standards, Openness, and Interoperability Important?

H.323 is an International Telecommunication Union (ITU) standard for video conferencing over IP. It is an umbrella standard that specifies mandatory and optional requirements in several areas to enable a complete "call" or communication sequence. The standard also defines four major components that may be part of the call: terminals, gateways, gatekeepers, and multi-point control units. The reason for the standard is to enable interoperability between different vendors' implementations of these components. As is the case with all standards, there is a danger of either over-specification or under-specification. If the standard is over-specified, it may become difficult to implement in the form of a cost-effective product. If the standard is under-specified, there may be room for different interpretations that lead to equally compliant yet non-interoperable implementations. Version 1.0 of the H.323 specification left significant latitude for vendor interpretation. This latitude enabled wide differentiation in the marketplace but led to poor interoperability among early products. Subsequent versions of the standard are addressing this issue by becoming more specific in key areas, but interoperability between vendor implementations remains an issue, as does interaction of the various H.323 components across the Internet vs. an intranet.

Fortunately, market forces have resulted in several strategic partnerships among video conferencing vendors which will tend to increase interoperability in this arena. In some cases vendors have sought to acquire complementary products in order to offer complete "turnkey" solutions. In others, joint ventures have been formed to assure interoperability within a broader product line.

We are still at the early stages of the H.323 lifecycle. While the specifications paint a picture of seamless conferencing over the Internet, today's reality is uneven interoperability among H.323 products most reliably suited for intranet deployment. As the standard evolves through future versions (version 2.0 was finalized December 1998; version 3.0 is in process) and as product cycles have time to reflect that evolution, this situation should improve. 

What are the Basic Ingredients?

H.323 Video Conference Recipe
(serves 2 but may be increased with a proportionate increase in ingredients)
2 Video conferencing pioneers (choose only hefty ones with plenty of positive attitude and patience.)
2 Video conferencing terminal end stations (often called clients; can be of same or different vendors as long as H.323 compliance is verified.)
2 Peripheral sets, including camera, microphone, and speakers.
2 Workstations (probably of the Windows variety though a few Mac and Unix terminal end stations may become available in specialty shops.)
1 High speed network connection (at least 128Kbps.)
2 IP addresses for workstations.
1 Multipoint Conferencing Unit (MCU) - optional.

Attach IP address to associated workstation. Combine one pioneer, one workstation, one terminal end station, and one peripheral set at each end of network. Open user interface window on each terminal end station. Adjust camera on each end. Arrange microphone and speakers for best sound and least feedback. Have first pioneer find the dialup dialog window of their terminal, type in IP address of other pioneer's workstation. Hit the associated "dial" button. Say hello and enjoy!!
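Before the first pioneer hits "dial", it can save some frustration to confirm that the address typed in is well formed and that the other workstation answers at all. The sketch below is one way to do that and is not part of any vendor's client. The function names are made up for illustration, and the port number is an assumption: 1720 is the TCP port conventionally used for H.323 call signalling, but a given product or firewall may be set up differently.

```python
import ipaddress
import socket

# Assumption: 1720 is the conventional TCP port for H.323 call signalling.
# Check your own client's documentation before relying on it.
H323_CALL_SIGNALLING_PORT = 1720


def parse_dial_target(text):
    """Return the IPv4 address typed into the dial window, or raise ValueError."""
    return ipaddress.IPv4Address(text.strip())


def can_reach(address, port=H323_CALL_SIGNALLING_PORT, timeout=3.0):
    """Try a plain TCP connection to the remote terminal's signalling port.

    A True result only shows that something is listening there.
    It does not place a call or prove the two clients interoperate.
    """
    try:
        with socket.create_connection((str(address), port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # 192.0.2.10 is a documentation-only address; substitute the other
    # pioneer's real workstation address.
    target = parse_dial_target(" 192.0.2.10 ")
    print("Dialing", target, "- reachable:", can_reach(target, timeout=1.0))
```

If the check fails, the usual suspects are a mistyped address, a client that is not running on the far end, or a firewall between the two pioneers.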

Uses of Video Conferencing


Determining Your Needs
General Uses
     Meetings
     Classroom
     Collaboration
Special Applications
     Telemedicine
     Telecommuting
     Judicial Applications
     Remote Laboratories
     Campus Surveillance & Security


Determining Your Needs

The H.323 standard supports applications ranging from a single person-to-person voice-only call to a multi-party interactive conference employing voice, video, and data. Knowing as much as possible about your current and future needs will help you to determine how much money to spend on equipment and to select equipment that will help carry you into the near future. It is important to consider equipment that can be upgraded and/or scaled to meet near-future needs without having to be replaced at each step. The following will help you in assessing your conferencing requirements:

Number of conference participants

If there will be more than two participants in a video call then there are two choices for handling the interaction: using an MCU or using multicast. In the case of the MCU, choices must be made about budget and network. MCUs range from moderately priced software-based MCUs (several thousand dollars to twenty thousand dollars) to hardware-based MCUs (from fifteen thousand dollars to over two hundred thousand dollars). It is important to remember that software MCUs rely on host computers that must be fast enough to keep up with all of the video streams in the conference and that the load on the computer increases with the number of people in a conference. Stability and support of the host system should also be taken into account.

Networking infrastructure is a major concern when it comes to hosting an MCU. For every person participating in a conference on an MCU, bandwidth is taken out of the total network where the MCU is hosted. For instance, if ten people were conferencing at one time at 384Kbps, it would require a total of 7.680Mbps at the MCU (3.840Mbps of streams arriving from the participants plus 3.840Mbps of streams returned to them). This would load down a typical 10Mbps LAN, and further consideration would have to be given to any participants who are conferencing over the Internet or WAN links. For a site with T-1 connectivity (1.544Mbps), there would only be room for two (at most four) Internet clients to connect.
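The arithmetic above is easy to redo for your own numbers. The short sketch below is only a back-of-the-envelope calculator. It assumes that every participant sends one stream to the MCU and gets one back (which is how ten calls at 384Kbps add up to 7.680Mbps), takes the nominal T-1 rate as 1544Kbps, and ignores packet overhead and any other traffic on the link. The function names are ours, not from any product.

```python
T1_KBPS = 1544       # nominal T-1 line rate (assumption: no overhead counted)
LAN_KBPS = 10_000    # the 10Mbps LAN from the example above


def mcu_site_load_kbps(participants, call_rate_kbps):
    """Traffic at the MCU's network: one stream in and one stream out per participant."""
    return 2 * participants * call_rate_kbps


def max_clients_on_link(link_kbps, call_rate_kbps):
    """Upper bound on simultaneous calls through a full-duplex link.

    Ignores protocol overhead and competing traffic, so the practical
    number is lower.
    """
    return link_kbps // call_rate_kbps


total = mcu_site_load_kbps(10, 384)
print(total / 1000, "Mbps at the MCU")                        # 7.68 Mbps at the MCU
print(round(100 * total / LAN_KBPS), "% of a 10Mbps LAN")     # 77 % of a 10Mbps LAN
print(max_clients_on_link(T1_KBPS, 384), "calls fit on a T-1 at most")  # 4
```

Four calls at 384Kbps would use 1536Kbps of a 1544Kbps T-1, leaving essentially nothing for overhead or anything else, which is why two is the more realistic figure.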

Multicasting is an appealing option from a price perspective and a networking point of view. Multicasting a multipoint conference eliminates the need for an MCU and frees the network of a concentrated point of multiple single connections. There is only one network connection between any two routers in a multicast multipoint conference. But this assumes that a multicast conference can be hosted in the first place. Multicasting to the desktop still seems to be facing a number of issues which are slowing down deployment. Before planning a multicast multipoint conference, make sure that each participant will have multicasting capabilities. If one of the end points does not have that capability then other arrangements will have to be made (such as tunneling a unicast connection to the multicast network). See What About Multicast for more detailed information on how multicast works.

Environment of the participants

For one to three individuals conferencing from a PC, a desktop unit with an inexpensive camera will suffice. Close proximity to the camera is the key here. If there is a need to show other objects in the room or to control the camera from a distance, then a pan/tilt/zoom (PTZ) camera can make a big difference. A camera that can autofocus and can be controlled remotely will allow objects to be shown in greater detail as needed. For conference rooms and auditoriums, a PTZ camera is a must. A second, and equally important, consideration for the environment is the set up of microphones and speakers. For one individual at a PC, a headset for use as a microphone and speakers may be ideal as this eliminates the echo commonly found in desktop systems. For systems with little echo (e.g., those with built-in echo cancellation), using the supplied microphone and speakers will be sufficient. For systems that have poor or no echo cancellation, a noise-canceling speakerphone can usually be purchased as a separate item (and is well worth the money for hands-free operation and if more than one person will be in the conference). For conference rooms, multiple remote microphones may greatly enhance the quality of the conference. Having several desktop microphones for a conference table of several people will prevent the need for shouting during a conference. These types of microphones may or may not require that the user activate the microphone to speak. See Best Practices for the Audio and Video Environment for more information.

Gateways

Gateways are a necessity when a conference is to be held between two or more clients using different protocols. For example, if one client uses only H.323 protocol and the other client uses H.320 protocol, then a gateway is needed at one end to handle the conversion. The location of the gateway depends on several factors. The client using H.320 protocol may opt to have the gateway at their end since this would eliminate long distance charges on their ISDN lines (which would quickly add up for each pair of phone lines.) On the other hand, the H.323 site may have a frequent need to communicate with H.320 sites or there may be times when the H.320 sites don't have the services of a gateway available to them. While the H.323 site may incur long distance charges, it may be unavoidable and worth the money. Gateways are also a practical way of bringing in calls from regular telephone systems (POTS). For example, if someone is on the road on their wireless phone, they can still participate in a conference via a gateway which bridges them into the conference. See Advanced Functionality and Management -- Gateways for more information.
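To make the gateway decision a little more concrete, here is a small sketch of the two questions raised above: is a gateway needed at all, and roughly what will the ISDN leg cost if it has to cross a long distance boundary? Everything in it is illustrative. It assumes each 64Kbps ISDN bearer channel is billed as its own long distance call, and the per-minute rate is a made-up number, so substitute your carrier's actual tariff.

```python
import math

B_CHANNEL_KBPS = 64  # one ISDN bearer (B) channel; a BRI line carries two


def needs_gateway(client_protocols):
    """A gateway is needed when the clients do not all use the same protocol."""
    return len(set(client_protocols)) > 1


def isdn_leg_cost_cents(call_rate_kbps, minutes, cents_per_channel_minute):
    """Rough long distance cost of the ISDN side of a gatewayed call.

    Assumption: every B channel is charged separately at the same rate.
    """
    channels = math.ceil(call_rate_kbps / B_CHANNEL_KBPS)
    return channels * minutes * cents_per_channel_minute


print(needs_gateway(["H.323", "H.320"]))   # True
print(needs_gateway(["H.323", "H.323"]))   # False

# A one-hour 384Kbps call at a hypothetical 10 cents per channel-minute:
print(isdn_leg_cost_cents(384, 60, 10) / 100, "dollars")   # 36.0 dollars
```

A 384Kbps call ties up six B channels (three pairs of lines), which is why the charges add up quickly and why the placement of the gateway matters.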

Role of voice, video, and data

Voice quality should, and usually does, take precedence over video during a conference. A conference is unsuccessful if one of the voices is not clearly audible. Usually 16Kbps to 64Kbps is used in the audio portion of a videoconference, depending on the audio codec used. 64Kbps transmission generally gives a higher quality of voice and a wider dynamic range, since the level of compression is not as great as in the 16Kbps case (though this is not always so, as improved 16Kbps algorithms are now in production). As mentioned before, when video detail is required, pan/tilt/zoom cameras are a must for a quality conference. If the video is used only to maintain presence, then a less expensive camera should suffice for "talking heads" type conferences.

Text and graphical material exchange requires the use of application sharing features (often termed "T.120 support" after the standard itself). There is usually a fixed bandwidth for this, just as for the audio and video, so the transmission times for sending and receiving data should be predictable. Generally, application sharing is one-way or simply a "view" of the application being run on one workstation in the conference. If participants need to be able to "collaborate" (that is, execute the application together), some attention may need to be given to the enabling of collaboration on the T.120 implementation. Generally, application sharing is enabled separately from data collaboration to reduce the chances of security breaches or accidental destructive behavior. It is important to remember that when an application is shared for collaboration, all people in the conference have full control of that application, just as if they were in front of that computer; anything the local user can do with the application, the remote participant can do too. See Best Practices for the Audio and Video Environment for more information.
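Because audio and data each take a fixed slice of the call, it is simple to estimate what is left over for video and how long a shared file will take to arrive. The sketch below assumes the call rate is simply divided among audio, data, and video with no overhead, and the 64Kbps data share and 500 kilobyte file are made-up figures for illustration; real clients negotiate these values and may divide the call differently.

```python
def video_share_kbps(call_rate_kbps, audio_kbps, data_kbps=0):
    """Bandwidth left for video once the fixed audio and data shares are set aside."""
    remaining = call_rate_kbps - audio_kbps - data_kbps
    if remaining < 0:
        raise ValueError("audio and data shares exceed the call rate")
    return remaining


def transfer_seconds(size_kilobytes, data_kbps):
    """Time to send a file over a fixed data share (1 kilobyte = 8 kilobits)."""
    return size_kilobytes * 8 / data_kbps


# A 384Kbps call with the two ends of the audio range mentioned above:
print(video_share_kbps(384, 64))   # 320
print(video_share_kbps(384, 16))   # 368

# A 500 kilobyte slide deck over a hypothetical 64Kbps data share:
print(transfer_seconds(500, 64), "seconds")   # 62.5 seconds
```

On a 128Kbps call the same 64Kbps audio choice would leave only 64Kbps for video, which is one reason lower-rate audio codecs are attractive on slow connections.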

General Uses



In each of the above cases, the quality of the audio and video is critical to the success of the remote participation. Both will affect whether or not the remote participants feel like they are truly part of a meeting (not just observers) and also whether or not they are treated as part of the meeting by the other participants.

In the specific case of a multi-point meeting -- where more than one location is participating remotely -- several factors affect the success of the remote participation. These include the view participants have of each other, how well participants can hear each other and be heard by each other, and how participants determine who is leading the meeting or "has the floor" at any given time. Features for controlling these factors are discussed in greater detail in the Advanced Features: MCU (Multi-point Conferencing Unit) section of this cookbook, but are previewed below:
 
 

What participants see may be:

What participants hear may be:

How meeting control is achieved may be:

As with any new technology, successful integration of video conferencing into existing activities requires attention to the needs of the people who will be using it. The determination of what is acceptable and useful must be based on the reaction and comfort level of the end users. In the case of simple point-to-point meetings, there is not a lot of new learning required for participants to successfully interact with each other as long as the video and audio quality do not interfere. Care should be taken to ensure that participants feel they can see and hear each other clearly. More information is available in later sections (see Practical Video Conferencing Steps and also Appendix 1: Preparing Your Room for Video Conferencing), but typical "rules of thumb" include:
  • Microphones should be of sufficient quality to pick up the speaker's voice naturally (in terms of volume and physical position) and without excessive background noise.
  • Microphones and speakers should be positioned so that they do not cause feedback and interference with each other, such as when the microphone picks up the sound from the speakers. Using directional microphones will also help limit the interference.

Attention paid to the total "look and feel" of the meeting scenario prior to conferencing helps to ensure that the technology will enhance rather than detract from the success of the meeting.

  • Classroom

    A particularly exciting type of "meeting" that may be enhanced and expanded through the use of video conferencing is classroom instruction. Certainly all of the factors listed above for furthering the success of video conferencing within general meetings affect the classroom as well. In addition, the introduction of video conferencing into the classroom means that at least some things about the nature of the instruction necessarily have to change. In one case, remote participants may be additional students that the instructor must now accommodate in terms of instruction and try to integrate with any physically-present participants into one virtual student group. Remote participants should not feel that they are getting less out of the class than their physically-present counterparts and physically-present students should not feel that the presence of remote students is detracting from their instruction. In another case, remote participant(s) may be additions to the instruction itself, such as expert speakers, or co-instructors. As with any team-teaching, a cooperative balance of instructional duties is required but this can be made more complicated if video presence cannot compete with physical presence. For instance, instructor accessibility in the physical classroom can easily overtake the presence and command of the remote instructor, encouraging side conversations and inattention to remote instruction.

    Yet another aspect of video conferencing in the classroom is that the "participants" being shared via the video conference connection might not always be human. An instructor may want to incorporate an alternative video source (e.g., a document camera, a VCR) for sending to remote locations, or may want to receive video from an alternative video source at the remote site. The potential for combining video inputs and outputs can seem endless and readers are encouraged to fully explore these options when evaluating video conferencing equipment for use within a classroom. Two of the most typical classroom scenarios are illustrated below:
     
     


    Most importantly, use of video conferencing in the classroom requires special attention to the comfort level, teaching style, and instructional techniques of the instructor. In the ideal world, preparation for the use of video conferencing in the classroom would be minimal. However, today's reality dictates that there will have to be some adapting and learning on the part of instructors to use video conferencing successfully for instruction. Practice time outside of actual class time must be available and utilized to effectively integrate the technology with their own instructional style and methods, thereby ensuring a natural flow of classroom activities by the time the technology is experienced by the students.

  • Collaboration

  • As the previous sections describe, video conferencing can be used very effectively for meetings and classes. Travel costs and stress can be reduced while personal interaction can remain high. More people can be reached with knowledge and information when video conferencing is used in the classroom. This section will describe going one step further and actually collaborating within documents and applications that are being shared over the network.

    A video conferencing terminal will generally come with a number of software tools including electronic whiteboards, FTP, and chat. The whiteboard can be useful for dynamic lectures, collaborative diagramming, brainstorming, and sharing notes. FTP can be used to transfer files quickly without the need for a separate operating system window. Chat can be useful when audio quality is poor or unavailable for some participants or when a subset of participants needs to communicate privately. An interface is also often provided to enable sharing of third party applications that may be installed on participating workstations. This is particularly useful when group work is supported by project-specific software applications. Communications between the terminal end stations -- while they are sharing these tools and applications -- must be standardized to ensure the highest level of access and accuracy. This communication is supported by the ITU standard, T.120. As stated in the DataBeam Tutorial on the T.120 Series Standard,

    "Established by the International Telecommunications Union (ITU), T.120 is a family of open standards that was defined by leading data communication practitioners in the industry. Over 100 key international vendors, including Apple, AT&T, British Telecom, Cisco Systems, Intel, MCI, Microsoft, and PictureTel, have committed to implementing T.120-based products and services."
    Two terms often heard in discussions of T.120 are application sharing and data collaboration. The distinction here primarily revolves around who has control of material. In application sharing, the owner of the material is allowing the other participants to view it only. In data collaboration, the owner of the material is sharing both the view and the ability to modify the material. We will illustrate the use of these through several examples.

    Video conferencing terminals that support application sharing and data collaboration do so through buttons or pull down menus. In most cases, a button will be clicked or menu item selected while the relevant application window is active. The process is very simple. A mouse click will be assumed in these examples.

    Lecture - Large Class - You are an instructor who has the need to present material from a presentation, web page, or other application that you use to deliver course material. In this case (say it is a large group coming from several distributed sites), you simply want to present the material in one direction. So after activating the window, you simply click on the application sharing button. The material immediately shows up on the screens throughout the conference. As you navigate through the lecture, each screen changes to follow. (Note: it is not necessary for the application to be resident on the receiving machines.)

    Lecture - Small Class - This case is similar to the one above except that you are working with a much smaller class. In this case, you might want to have more than just video and audio dialog between yourself and the students (and student to student.) Perhaps you'd like to include some problem solving aspect to the class. You might bring up an electronic whiteboard or other application and start up data collaboration so that each student might present their ideas on a topic or solutions to particular problems.

    Presentation Planning - You are an educator, scientist, engineer, technologist. You have been working on a project with others in your field who are separated by quite some distance. Several of you are doing a team presentation so you would like to prepare your slides together. After activating the call between the presenters, one of you will bring up the presentation software and click on the button for application sharing (if only one person will be typing) or data collaboration (if all of you will be entering material.) You are able to discuss the material, analyze the potential audience, schedule each section in your face-to-face dialog. As you agree on layout and topics, you can enter them directly into the presentation.

    Proposal Preparation - You are an information technology director who is working with another information technology director at a different school. The two of you are proposing a joint project in educational technologies over advanced networks. You are preparing your material in your favorite publication software. After activation of the call, one of you will bring up the document and click on the data collaboration button. The document will appear on the other director's screen. Each of you can now type into the document. Control is transferred back and forth simply via mouse clicks. Changes will appear on each screen.

    Student Projects - It is very common to assign group projects, particularly in higher level classes and as term projects. This is a good team building strategy which allows the students to tackle larger problems and learn from each other. As long as the students have been located at the same campus, or reasonably close by, this works well. While application sharing and data collaboration could still be used locally (say for those night owls who don't want to drive late at night), a great deal of diversity can be added to the project if the students are in separate locations. Students in environmental studies might be teamed together from diverse locations such as a coastal environment, a mountain environment, a desert environment, etc. The students can use data collaboration to prepare their final reports, run data analysis for all to see, etc.

    Scientific Research - You are an engineer and you are studying aircraft wing design with several colleagues who are distributed around the country. You have implemented a large scale application on a parallel computing system at one of your sites (actually, it could be anywhere on the network!) The person at that site can begin the application and click on data collaboration so that each of you can interact with the model as it runs and see the results as they happen. You are also using CAD software (which runs in an X-Windowed environment) to analyze the output further. One of you will start up the CAD software and click on application sharing. All of you can then view the structures and discuss what happened, what to try next, etc.

    These are but a few examples of the diverse uses of video conferencing for collaboration. In thinking of your own scenarios, consider aspects of your project work or instructive activities where data is being passed back and forth in the form of file or document transfer but is currently being acted on or viewed individually. If manipulation of this data is really intended to support the development of a common product or understanding, these are aspects of your collaborative work that may be enhanced through application and/or data sharing.

    Special Applications

    Getting Started with Video Clients



      Selecting a vendor
         Contractual Issues 
         Vendor Services & Support 
         Ongoing Maintenance and Upgrades/Enhancements 
      Basic Components
      Add-on Components - Enhancement Software and Other Peripherals
      Beyond the Standard 
      State of Video Clients 
    Best Practices for the Audio and Video Environment
    Video Conferencing Etiquette


    Many questions may now be going through your mind about video conferencing. Perhaps earlier examples in this cookbook have given you some ideas for video conferencing that you would like to pursue. You may also be wondering how to go about getting the hardware and software you will need to pursue these things. The first part of this section will tell you how to go about selecting a video conferencing system and a system vendor. It will also give you recommendations for how to go through the purchasing process, and tell you what sort of support and maintenance services you should look for or expect.

    As you begin specifying video conferencing equipment to meet your individual or organizational needs, it will be important to understand the basic components that would be part of any video conferencing solution and the possible variations that may be employed to "customize" solutions for specific uses. In subsequent sections, the basic components are discussed, followed by a list of components that may be used to replace or supplement the basics. In keeping with the intent of this cookbook, the discussion focuses on H.323 standards-based video conferencing. However, much of the information would also apply to other standards-based video conferencing (e.g., H.320), the least-transferable details being those related to network connections and data collaboration.

    Selecting a vendor

    The first step in vendor selection is a survey of the market, looking at existing technology, new standards and emerging technologies, customer deployment of current technology, customer satisfaction surveys, the experiences of your colleagues, and of course the web pages of all the videoconferencing vendors identified in the market survey. Even in cases where a vendor has already been pre-selected, such as in a statewide videoconferencing initiative, knowing the available technology and offerings of competing vendors is invaluable for working with the pre-selected vendor. In your market survey, concentrate on articles and web sites that survey the market, evaluate existing vendors and technologies and that predict future enhancements for videoconferencing technology. Sign up for electronic discussion lists that include users of videoconferencing technology. Ask questions about different vendors and their offerings. Sources for information include:
    • Manufacturers

    • Contact references provided by the manufacturers of video conferencing products of interest to you. Ask for customers with a large number and variety of conferencing needs. Manufacturers will, of course, give names of customers that are happy with their products, but this will also let you know why they are happy with that particular manufacturer.

    After a market survey to familiarize yourself with the state-of-the-art for videoconferencing, it is necessary to select the functionalities that are both critical and desirable for your project and codify those functionalities into a purchasing document, whether an RFI, RFP or a purchase order. If at all possible, design an instrument that can be sent to a large number of vendors.

    Be sure that the bid section will result in competitive pricing that can be compared uniformly across vendors. A good practice is to provide a bid sheet with individual line items for each meaningful system component. Meaningful system components vary by project and are best determined by the individual institution, after an extensive market survey. These line items can include the entire system (hardware & software); individual line items for component pieces (e.g., MCU, terminal software, etc.); and line items for services, such as installation, training, and ongoing maintenance. Be sure the bid component includes price breaks for item multiples, such as terminal software, cameras, etc. It is critical to request information about warranty and maintenance costs. One often-overlooked pricing differential is warranty period, with some vendors offering three months and others a year or longer. It is common to require multi-year bids on ongoing maintenance costs for large-scale purchases, to ensure that an organization is able to financially maintain a selected system over time and to ensure that vendors do not offset low purchase costs with high maintenance pricing.
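
    To see why warranty period and multi-year maintenance pricing belong on the bid sheet, it can help to total each bid over the period you expect to run the system. The following is a minimal sketch only. The vendor names and all prices are invented for illustration, and it assumes maintenance is free during the warranty period and billed pro rata by the month afterwards, which your own contract may not do.

```python
from dataclasses import dataclass


@dataclass
class Bid:
    """One vendor's response to a bid sheet (all figures are hypothetical)."""
    vendor: str
    system_price: float        # hardware and software, one-time
    services_price: float      # installation and training, one-time
    annual_maintenance: float  # quoted maintenance cost per year
    warranty_months: int       # assumed: no maintenance charge during warranty


def total_cost(bid: Bid, years: int) -> float:
    """Total cost over `years`, charging maintenance only after the warranty ends."""
    billable_months = max(0, years * 12 - bid.warranty_months)
    maintenance = bid.annual_maintenance * billable_months / 12
    return bid.system_price + bid.services_price + maintenance


# Hypothetical bids: Vendor A is cheaper to buy, Vendor B is cheaper to keep.
bids = [
    Bid("Vendor A", 40000.0, 5000.0, 6000.0, 3),
    Bid("Vendor B", 44000.0, 5000.0, 4000.0, 12),
]

for bid in sorted(bids, key=lambda b: total_cost(b, 3)):
    print(f"{bid.vendor}: {total_cost(bid, 3):,.2f} over 3 years")
```

    With these invented numbers, the vendor with the lower purchase price turns out to be the more expensive choice over three years, which is exactly the offsetting that multi-year bids are meant to expose.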

    Although a purchase is implied, be sure to include language that states that the organization you represent is not required to issue a purchase order in response to bids received.

    Distribute your purchasing instrument to the widest possible vendor pool. You will probably work closely with your institution's purchasing department, but do not rely solely on their list of identified vendors. Supplement that list with the vendors you discovered in your market survey.

    Your purchasing instrument should require the names and contact information of all customers similar in size and mission to your institution. Do not ask for selected customers, but the complete list of customers meeting your description. A critical component of the selection process is the checking of references. Be sure to ask standard questions of each reference, for comparison purposes, as well as open-ended questions about their experiences. Many vendors may not provide a complete list, even though it is requested. If necessary, ask the contacts provided what institutions or companies they contacted for references, and expand your reference pool in this manner.

    Contractual Issues

    Depending on your institution and the nature of your videoconferencing project, no contract may be required, or there may be a purchase contract and a maintenance contract. For expensive projects, where expense includes not just the purchase itself but the staffing and training required for deployment, a purchase contract is a good idea. A purchase contract can provide financial and risk protection. If the contract includes innovations not yet available, the purchase contract can outline staggered payments for scheduled deliverables. The contract can also define financial performance incentives for functionalities that are very new or that do not perform as specified, particularly if you select a vendor for very good reasons in spite of concerns expressed in reference checks about the performance of certain functionalities or problems with ongoing troubleshooting and support. Most vendors have an honest desire to serve customers well, but they are frequently understaffed and focused more on generating new business than on support for existing customers. Financial incentives (also known as financial penalties, when the vendor steps out of the room!) are an effective way to ensure service and minimize risk, particularly for very new technologies. Vendors are more likely to agree to financial incentives for performance for large, expensive projects and for projects that will be heavily promoted by the purchasing institution.

    For government entities, which of course includes state universities, financial penalties can be tricky, but not impossible. Steep reductions in ongoing maintenance costs, free extension of the warranty period, payment in free enhancements, free additional terminal software, etc. can usually be worked out with your contracts department as well as with the vendor. The goal is to avoid enriching the coffers of your institution's "general fund" (which might go toward the purchase of uniforms for the football team), and instead to impose performance penalties that directly compensate your videoconferencing project.

    Financial protection can and should include price caps for ongoing maintenance and should, if at all possible, lock in prices for enhancements that are part of the purchasing instrument response (and thus the contract), but not yet available for purchase. In terms of innovations, a contract is a good place to negotiate for functionalities requested in the purchasing instrument which the vendor is willing to develop but unable to currently supply.

    Vendor Services & Support

    Vendor services include installation support, technical documentation, ongoing troubleshooting and maintenance, and upgrades and enhancements to current service.

    Installation and troubleshooting support can include any or all of the following:

    The best installation and troubleshooting support includes preventive support. At a minimum, vendors should test all non-bundled and third-party hardware and software and provide a list of compatible products. Using tested, compatible products is the surest way to avoid installation and ongoing performance problems.
    This is a particular concern where PCs are used to host video conferencing products. Be sure that the vendor points out information on compatible equipment configurations, including:
    • number of processors allowed
    • operating systems that are compatible
    • video cards that are compatible
    • speed of processor that is required
    • additional equipment requirements (cables for PTZ cameras, NT-1 for ISDN lines, IMUXs for multiple ISDN lines, ...)

    Technical documentation can include:

    Manufacturers usually leave support up to the vendor (seller) and use the vendors as front lines of support on the first tier of troubleshooting. Ask the vendor or manufacturer for the support number at the manufacturer so that higher level support calls can be initiated by you, the customer. The manufacturer's support line will almost always have better information on their products than the vendor's will. Maintenance services generally include several levels of service. Be sure the vendor specifies what levels of service or what enhancements can be provided. These can include:
     
     
    • telephone support
        • Be sure to ask if they have a toll-free number. They may not point this out up front.
    • web-based support
        • Is online chatting provided for support?
    • on-site service
        • How many on-site service calls do you get?
    • free upgrades and enhancements
        • Are the upgrades and enhancements software only, or is hardware included?
    • software upgrades and patches

    Training can include:

    Many vendors are not prepared to provide extensive training, but for large-scale projects where, for example, you will install videoconferencing services for 50 faculty members, training will be a critical component for the success of your project. If you are a project manager or an engineer, but not a practiced trainer, consider contracting training to a computer training firm specializing in technology transfer to non-computer professionals. The videoconferencing vendor and the computer training firm can be contractually required to work together to develop a training package. Be sure to specify who owns the training materials developed, which includes training manuals, course outlines and lesson plans. Be sure that either your institution owns the training materials or that the training firm and/or the videoconferencing vendor recompense your institution for the training materials developed at your expense if they wish to reuse them. We strongly recommend that large projects, particularly involving large numbers of end users, should include a significant training component.

    Ongoing Maintenance and Upgrades/Enhancements

    Before purchasing a videoconferencing system, be sure to identify, possibly through a non-disclosure agreement, any anticipated enhancements scheduled for release in the next six to fourteen months. In particular, pay attention to operating system changes in the market. Is the operating system you are using now changing to a new version that is not compatible with the current or future upgrade of the video conferencing product? Will a hardware upgrade be required to move to a newer version of an operating system or a new operating system altogether? Will you be required to pay to replace your hardware if this is the case? More than likely, in such instances, you will have to bear this cost. If you include any planned enhancements in the purchase contract, be sure to minimize the risk to you contractually.

    Be careful in a contract to negotiate only for enhancements to current functionalities and not replacement functionality that would result in the purchase of a non-standard current product. You do not want to risk ongoing problems with new releases and upgrades that will not interoperate with your nonstandard product. If current functionality requires re-working to customize service for your institution, you are probably buying the wrong product.

    If you identify a significant enhancement to service that you contract with the vendor to develop, be sure to use the purchase contract or another contract instrument to spell out the specifications and the financial incentives for completion. If your institution's involvement in designing and testing the enhancement will be significant, consider a joint marketing venture, or at least a substantial innovator's discount for the purchase and ongoing maintenance of the enhancement. Make sure all joint venture or pricing arrangements are clearly established in the contract.

    Basic Components

    As discussed in our introduction, "What is Video Conferencing," any video conferencing terminal must have a few basic components to "get the job done": a camera (to capture local video), a video display (to display remote video), a microphone (to capture local audio), and speakers (to play remote audio). In addition to these more obvious components, a video conferencing terminal also includes a codec ("COmpressor/DECompressor"), a user interface, a computer system to run on, and a network connection. Each of these components plays a key role in determining the quality, reliability, and user-friendliness of the video conferencing experience as well as any given video conferencing terminal's suitability to particular purposes. A basic understanding of each of these components' roles will help you map video conferencing technology capabilities to your specific application needs.

    Add-on Components - Enhancement Software and Other Peripherals

    Understanding the basic components of video conferencing is a necessary first step in planning your use of the technology. Understanding how these basics may be supplemented or enhanced is a critical next step to ensure a successful application match. The following "add-ons" are typical of the changes that can be made to a basic configuration.

    Beyond the Standard

    Version 1.0 of the H.323 standard was finalized in 1996. Since then, version 2.0 debuted (December 1998) and version 3.0 is "in the works". This development in the standard is an effort to keep pace with changes indicated through actual product field experience and new ideas about how the technology might be deployed (e.g., voice-only calls over IP, inter-networked environments). In addition, the H.323 standard specifies requirements for some aspects of video conferencing while leaving others designated as "optional", or it does not specify how required functionality should be implemented. Given this, it may be difficult to say exactly what features an "H.323-compliant" product will or won't include. In any given product evaluation, it is important to understand the level of H.323 compliance (H.323 version, optional features) present.

    For example, audio is already fairly well covered in the H.323 standard. A variety of audio codecs are specified as optional and a common codec, G.711, is also required. Available codecs cover a variety of service qualities at a variety of bandwidths. An H.323 product may support the required codec only, or the required codec plus any given set or subset of the others. Additionally, product developers may choose to support a proprietary audio codec to achieve improved functionality or quality. If this is the case, the proprietary performance will only be possible between terminals in the same product family, under the same developer's control. What you may gain in functionality with non-standard implementations, you lose in terms of interoperability and flexibility for "mixing and matching" terminals.
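
    The interoperability trade-off described above can be illustrated with a small sketch. This is not the capability exchange that real H.323 terminals perform. It is a simplified model, and the function name, the preference lists, and the "VendorX-Wideband" codec are all hypothetical. It only shows why a proprietary codec pays off between terminals of the same product family and why the required G.711 codec is what mixed terminals fall back on.

```python
# The one audio codec every H.323 terminal is required to support.
REQUIRED_AUDIO_CODEC = "G.711"


def pick_audio_codec(local_preferences, remote_supported):
    """Return the first locally preferred codec that the remote terminal supports.

    Simplified model: falls back to the required codec when nothing else matches.
    """
    for codec in local_preferences:
        if codec in remote_supported:
            return codec
    return REQUIRED_AUDIO_CODEC


# Hypothetical terminal that prefers its vendor's proprietary codec.
vendor_x_prefs = ["VendorX-Wideband", "G.722", "G.711"]

# Another terminal from the same product family understands the proprietary codec.
same_family = {"VendorX-Wideband", "G.711"}
# A terminal from a different developer supports only standard codecs.
other_vendor = {"G.723.1", "G.711"}

print(pick_audio_codec(vendor_x_prefs, same_family))
print(pick_audio_codec(vendor_x_prefs, other_vendor))
```

    In this model the call between mixed terminals still succeeds, but only at the quality and bandwidth of the common codec.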

    The presence of video, or a video codec, is not actually even required in the H.323 standard. An audio-only product can meet the H.323 standard as long as it complies with audio codec specifications and other specifications related to device communication and control. However, if video is included, the terminal must support the common codec, H.261. The video is then further defined by the size (in pixels) of the video window, most often QCIF (176x144 pixels) vs. CIF (352x288 pixels). H.261 itself requires QCIF support, but CIF is optional. Both the codec and the picture size will affect the video quality. Finally, as with audio codecs, there are other optional video codecs (e.g., H.263) or, again, a developer/vendor could substitute a proprietary codec to gain some performance or feature beyond what the standard provides. In evaluating the video capability of an H.323 terminal, you need to determine whether or not a video codec is available, whether or not standard video codecs are supported, and what picture size/formats are supported.
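
    To get a feel for what picture size means for the codec, the short sketch below compares the two formats by pixel count and by the uncompressed bit rate the codec starts from. It is an illustration only. The figure of 12 bits per pixel (4:2:0 color sampling) and the 30 frames per second rate are assumptions made for the arithmetic, and the function names are our own.

```python
# Picture formats named in the text: (width, height) in pixels.
PICTURE_FORMATS = {"QCIF": (176, 144), "CIF": (352, 288)}


def pixel_count(fmt):
    """Number of pixels in one frame of the given format."""
    width, height = PICTURE_FORMATS[fmt]
    return width * height


def raw_bitrate_bps(fmt, frames_per_second, bits_per_pixel=12):
    """Uncompressed bit rate, assuming 12 bits per pixel unless told otherwise."""
    return pixel_count(fmt) * bits_per_pixel * frames_per_second


for name in PICTURE_FORMATS:
    mbps = raw_bitrate_bps(name, 30) / 1_000_000
    print(f"{name}: {pixel_count(name)} pixels per frame, "
          f"{mbps:.1f} Mbit/s uncompressed at 30 frames per second")
```

    A CIF frame carries four times as many pixels as a QCIF frame, so at the same call bandwidth the codec must compress CIF video roughly four times harder.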

    Less obvious than audio or video, but certainly as important, are the communication and control features that enable H.323 terminals to talk to one another and allow network administrators to administer and control other H.323 network resources. When conferencing under H.323, each terminal often registers on the network with a "gatekeeper" application. The gatekeeper confirms what the terminal can do, and assists in call setup and take-down. Because of this involvement, the gatekeeper can also control how much of the total network bandwidth is allocated for video conferencing (e.g., "Sorry, you cannot place your call now; that call would put us over the limit on video conferencing bandwidth", similar to a busy signal on the telephone). As handy as the gatekeeper may seem, the H.323 standard does not require that one be present for H.323 terminals to conference with one another.
    There is currently no protocol or mechanism to force client registration with a gatekeeper. If a gatekeeper is present, you must rely on business or campus computing policies to enforce client registration. However, clients not registered will not be allowed access to advanced conferencing services such as MCUs or H.320 gateways. Firewall implementations on campus networks may also block traffic generated by unregistered clients.
    Though gatekeeper registration in such circumstances may be required, the standard does not explicitly say how registration must be done. It can be done statically (manually entered into the terminal's configuration) or discovered dynamically (the terminal comes up on the network and requests to know who its gatekeeper is). In the latter case, any gatekeeper can answer, and the first gatekeeper that does is the one the terminal registers with. Further refinement of this procedure within the standard is ongoing and current implementations are, for the most part, left to the discretion of developers.
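
    The gatekeeper's role in registration and bandwidth control can be pictured with a toy model. The sketch below is not an H.323 implementation and does not reflect any product's behavior. The class, its method names, and the bandwidth figures are hypothetical, chosen only to show the "busy signal" decision and the effect of not being registered.

```python
class Gatekeeper:
    """Toy model of gatekeeper registration and bandwidth admission."""

    def __init__(self, bandwidth_limit_kbps):
        self.bandwidth_limit_kbps = bandwidth_limit_kbps
        self.registered = set()   # terminals that have registered
        self.active_calls = {}    # call id -> bandwidth in use (kbps)

    def register(self, terminal):
        self.registered.add(terminal)

    def in_use_kbps(self):
        return sum(self.active_calls.values())

    def admit(self, call_id, terminal, kbps):
        """Reserve bandwidth and return True if the call may proceed."""
        if terminal not in self.registered:
            return False  # unregistered terminals get no gatekeeper services
        if self.in_use_kbps() + kbps > self.bandwidth_limit_kbps:
            return False  # the "busy signal" case: call would exceed the limit
        self.active_calls[call_id] = kbps
        return True

    def release(self, call_id):
        """Free the bandwidth held by a finished call."""
        self.active_calls.pop(call_id, None)


gk = Gatekeeper(bandwidth_limit_kbps=1000)
gk.register("alice")
gk.register("bob")

print(gk.admit("call-1", "alice", 768))  # fits under the limit
print(gk.admit("call-2", "bob", 384))    # would exceed the limit
print(gk.admit("call-3", "carol", 128))  # carol never registered
gk.release("call-1")
print(gk.admit("call-2", "bob", 384))    # fits once call-1 has ended
```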

    These are just a few examples to show that H.323 standards development may not currently address all of the functionality required for successful and controlled H.323 conferencing, especially in an inter-networked environment. We can expect changes in many areas of the standard over time, but perhaps they will not be fast enough for market demands. As new features wait to be included in the official standard, some of them may be implemented early by particular vendors, necessitating a retrofit later if the standard ends up implementing the feature in a different way. H.323 developers will be attempting to balance standards compliance against the implementation of currently non-standard features and enhancements that will allow them to differentiate their products. In evaluating the usability and interoperability of today's H.323 products, we must ask: Which H.323 standard is a particular product compliant with? What features, if any, deviate from or extend beyond the standard? What level of interoperability does the product have with other H.323-compliant devices? 

    State of Video Clients

    Here we summarize announced vendor implementations of H.323 Client Terminals (this may not be an exhaustive list).

    H.323 terminals are available in three general categories:

    Interoperability is still a question. While the desktop systems are behaving more cooperatively for point-to-point conversations, it is not uncommon for participants to find themselves required to settle on one particular T.120 implementation (be it NetMeeting or client homogeneity) in order to ensure proper sharing. Entering into multipoint conversations brings in the added complexity of interoperation with an MCU.

    Best Practices for the Video and Audio Environment

    In this section we attempt to give a very brief (possibly oversimplified) look at how audio and video are captured and transmitted in a videoconference, what problems you might see during the videoconference, and how you can address those problems. Network issues can also affect the videoconference, but that discussion and problem treatment are addressed under Network Requirements in this cookbook.

    The Audio Environment

    Audio is the most important part of a conversation. The audio system for video conferencing consists of some combination of headset, handset, microphone, speakers, and digitizing device (hardware and software). An ideal audio system is one that offers the widest frequency response (widest range of sounds or pitch) while using only a small amount of bandwidth and incurring minimal delay. For those who are interested, human hearing ranges from about 20 Hz to 20 kHz, with intelligible speech being around 2 kHz. Studies show that 100 ms delays are detectable but tolerable, 250 ms delays are annoying, and delays of 450 ms or more are unacceptable. [Network Week]
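
    If you measure the audio delay of a connection, the figures above can serve as a rough rating scale. The sketch below is one possible reading of them. Treating each figure as the lower edge of a band, and describing delays under 100 ms as likely unnoticed, are our assumptions rather than part of the cited study.

```python
def rate_audio_delay(delay_ms):
    """Rate a one-way audio delay using the thresholds quoted in the text.

    Assumption: each quoted figure is treated as the lower edge of a band.
    """
    if delay_ms < 0:
        raise ValueError("delay cannot be negative")
    if delay_ms >= 450:
        return "unacceptable"
    if delay_ms >= 250:
        return "annoying"
    if delay_ms >= 100:
        return "detectable but tolerable"
    return "likely unnoticed"


for measured in (40, 120, 300, 500):
    print(f"{measured} ms: {rate_audio_delay(measured)}")
```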
     

    Questions to ask yourself about the audio quality:

    Is the audio delivered at an appropriate volume with a minimum of background noise and hiss?

    Your input device is likely to be a handset, a headset, or a microphone. First, if others are having difficulty hearing you, check your input device. The standard handset is known to deteriorate quickly. Try replacing it with your telephone handset. If the sound of your voice improves at the other end, you have a bad handset.

    If you are using a headset, check the positioning of your microphone. Some headsets use microphone level output (meaning the sound of your voice generates the current required to carry the signal); therefore, the volume will drop quickly as the distance between your headset microphone and mouth increases. For instance, you can double the output by halving this distance.
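
    The distance rule of thumb above can be put into numbers. The sketch below assumes that microphone output is inversely proportional to distance, which is an idealized model of a small sound source in open air. A microphone very close to the mouth will only follow it approximately, and the function names and distances are our own.

```python
import math


def output_ratio(old_distance_cm, new_distance_cm):
    """Relative change in microphone output when the distance changes.

    Assumes output is inversely proportional to distance (idealized model).
    """
    if old_distance_cm <= 0 or new_distance_cm <= 0:
        raise ValueError("distances must be positive")
    return old_distance_cm / new_distance_cm


def gain_db(old_distance_cm, new_distance_cm):
    """The same change expressed in decibels (20 * log10 of the output ratio)."""
    return 20 * math.log10(output_ratio(old_distance_cm, new_distance_cm))


# Moving the microphone from 4 cm to 2 cm doubles the output, about a 6 dB gain.
print(output_ratio(4, 2), round(gain_db(4, 2), 1))
# Letting it drift from 2 cm out to 8 cm cuts the output to a quarter.
print(output_ratio(2, 8), round(gain_db(2, 8), 1))
```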

    The stock microphones are typically very basic units that can damage easily. Extension cables or damaged cables can add extraneous noise and hiss. A headset is often the best solution for basic equipment and good sound. If you plan to use video conferencing often and desire semi-privacy during your conversations, invest in a good headset. Before doing so, verify that your video conferencing client has microphone input or that you have access to a line-microphone input adapter.

    Speakers and microphones play an important part. Does the system handle echo cancellation?

    If you prefer to use speakers and a microphone instead of a headset or handset for your videoconferences, care must be taken in their selection. The standard speaker and microphone setups do not generally contain echo cancellation features. You can sometimes get by with the basic setup in a point-to-point call, but you will torture your colleagues in a multipoint call.

    As your colleague's voice flows out of your speakers, your basic microphone will pick it up and feed it back through to their speakers or headset. Thus they will hear their voice echo back to them a fraction of a second later. The reverse case is also possible with the echoing voice being yours back to you. This quickly becomes very distracting and annoying.

    In a multipoint call, through an MCU, the echo begins to take the form of bells (or an even worse screech), with ever increasing volume and speed. The only way to survive such a call is to ask those on standard speaker and microphone setups to constantly mute their audio output when they are not speaking.

    Companies like Polycom make echo cancellation speaker/microphone combinations, called speakerphones, that work well in a variety of settings.

    Most of this discussion applies to 1-3 people positioned at a desktop system. What if this is a large room?

    Good room audio solutions are sometimes expensive solutions. Clients with standard echo cancellation features, used with basic speaker and microphone systems, work adequately but a "fish bowl" effect is sometimes noticeable. Frequency response and switching response become more important. A desktop caliber microphone may make the camera or MCU switch inappropriately as someone near the microphone shuffles paper or coughs whereas someone further from the microphone needs to shout in order to accomplish the same switching. Professional audio services should probably be consulted if the highest quality audio is expected for video conferencing in a large room.

    Does the video conferencing client have automatic gain control to optimize volume on inputs and reduce background noise?

    Most desktop video conferencing clients require the end user to manually set the volume on the incoming call. In a point-to-point call, this isn't usually too cumbersome since you are dealing with one person at one volume level. In a multipoint video conference, it would be desirable for the MCU to do automatic gain control or volume leveling across the callers. Such features do not exist in current MCUs, and therefore each end user may have to adjust their incoming volume to cope with multiple inputs (voice levels, equipment mix, etc.).

    The Video Environment

    Reading facial expressions and body language are the next most important parts of a conversation. As stated by Trowt-Bayard in "Video Conferencing, the Whole Picture", most of us are children of the television. We were born around or after the time that TV was "invented". Being such, our expectations on video quality are very high.

    For those who remember, early television required much adjustment or fiddling with vertical and horizontal holds, adjusting the rabbit ears for better reception and sound, adjusting the contrast. Thanks to things like cable TV, digital video, and much higher bandwidths, there is no need to fuss with reception in this manner.


     
    We've come so far. What problems could be left? Video techniques have been designed to accommodate those things to which our eyes are sensitive (like foreground and focus) and to devote less time and bandwidth to those things which our eyes might overlook (backgrounds, motion.)   
     
    Click here  for more detailed information on how video (both analog and digital) works. This may make it easier to understand how to achieve the best video quality possible.

    Questions to ask yourself about video quality:

    How is the video resolution? Do the colors flow smoothly? Is there any banding or dithering? Is there bleeding between colors? Do you see video artifacts such as blocks, splotches, and distortions? Other subjective measures include sharpness, contrast, brightness, color saturation, stability (lack of snow or shimmering.)

    First of all, test your focus. You can often test this through a local window, though sometimes a remote opinion helps. The location of focus buttons will vary so see your manufacturer's instructions for this detail.

    Video formats are also defined for a particular pixel width and height (e.g. VGA is 640 by 480 pixels.) Has the encoding provided enough resolution for your purposes? Shrinking the picture size can help. (This is called scaling.) Enlisting an encoder format with higher resolution might be necessary, though associated bandwidth requirements will increase as well. Common Intermediate Format, or CIF, is a higher resolution format. QCIF, or Quarter CIF, produces compatible video at lower resolution (and bandwidth requirements.) Both CIF and QCIF can encode at 7.5, 10, 15, and 30 fps. If your client's vendor has chosen 7.5 fps, it will not handle motion as well. Some products also offer something called 16-CIF. Check your video format setting. If bandwidth is precious during your video conference, consider dropping back to QCIF. If bandwidth is plentiful and resolution is important, try CIF (or a client that offers a CIF option.)
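
    To get a feel for why the choice between CIF and QCIF matters, it helps to work out how much raw video each format represents before the codec compresses it. The sketch below uses the standard picture sizes (QCIF 176 by 144, CIF 352 by 288, 16CIF 1408 by 1152) and assumes 12 bits per pixel, which corresponds to 4:2:0 color sampling. The helper name is hypothetical, and the numbers are uncompressed rates, not what your client will actually send.

```python
# Picture sizes in pixels (width, height).
FORMATS = {
    "QCIF": (176, 144),
    "CIF": (352, 288),
    "16CIF": (1408, 1152),
}

# Assumption: 4:2:0 sampling, i.e. 8 bits of luma per pixel plus
# two chroma planes at quarter resolution = 12 bits per pixel.
BITS_PER_PIXEL = 12


def raw_bitrate_bps(fmt, fps):
    """Uncompressed bit rate for a picture format at a given frame rate."""
    width, height = FORMATS[fmt]
    return width * height * BITS_PER_PIXEL * fps


for name in FORMATS:
    for fps in (7.5, 15, 30):
        mbps = raw_bitrate_bps(name, fps) / 1_000_000
        print(f"{name:>5} at {fps:>4} fps: {mbps:8.2f} Mb/s uncompressed")
```

    CIF carries four times the pixels of QCIF, so at the same frame rate the encoder has four times as much raw picture to squeeze into the same call bandwidth. That is the trade-off behind dropping back to QCIF when bandwidth is precious.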

    Could it be the network?

    For an overall treatment of connectivity requirements, see the Network Requirements section of the cookbook. But remember, should you seem to be having an unusual amount of trouble during a videoconference (especially when previous videoconferences have gone well), check the paths to and from the other party. "Problems out on the net" do occur and you may be able to save yourself some unnecessary anguish if you postpone your meeting or accept issues over which you have no control. Tools like PingPlotter and traceroute can help you determine network outages or difficulties.

    Does the client end station provide a consistently high frame rate (15 - 30 fps) during motion without sacrificing clarity?

    With this question in mind, realize that frame rates will vary based on motion, dropped frames, network load, etc. "Jitter" and "stalled video" are symptoms of frame rate variation. Some client end stations have statistics on frame rates. If yours doesn't, check the bandwidth on your incoming and outgoing paths. They are likely to be different paths, which means that the video can look great in one direction and terrible in the other.

    What happens when you move quickly or wave your hands? Linearity is a good measure of a client end station's sensitivity to motion and how consistently it maintains the frame rate. Video frames may be dropped. If this happens in bursts, the motion will appear jerky. If it happens in a predictable or uniform way, the motion will be smoother. Dropping the frame rate setting can sometimes help smooth the motion, though this option is typically not offered on most clients. Dropping the bandwidth manually may help and, in certain instances (like DSL), will significantly improve the video component of the conference. Better results are seen when using a garden hose on the roses than are seen when using a firehose.

    Some client end stations perform video encoding on separate hardware. This encoding (typically H.261 or H.263) is a very computationally intensive process. Systems that do not provide extra hardware assist may experience some loss in clarity if encoding becomes a major burden on the workstation itself. Software codecs, especially on lower powered processors, may have more difficulty in supplying sustained and higher frame rates. Stopping all other applications may help.

    The Joint Audio/Video Environment

    Latency, as used here, is the delay between a video movement and the sound that goes with it -- the synchronization of sound with picture. As has been described above, the sound and picture in a videoconference are two distinct components that are produced simultaneously but captured and transmitted separately. It is up to the video conferencing system to split them apart at one end, send them down the line, and put them back together at the other end. The video conferencing system is also responsible for keeping them synchronized. Several things can impact this synchronization.

    Questions to ask yourself about synchronization

    Does the lip sync seem reasonable? Does a handclap synchronize?

    It should be taken as a given that frames will be dropped. The codec should assume this and see to it that frames are dropped uniformly in order to maintain a sense of smooth motion. If yours doesn't and the latency is serious, consider a different client/codec.
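
    To illustrate what "dropping frames uniformly" means, here is a minimal sketch that thins a stream from one frame rate to a lower one by spacing the kept frames evenly. It is not taken from any particular codec; the function name and the running-credit approach are assumptions chosen for clarity.

```python
def frames_to_keep(total_frames, source_fps, target_fps):
    """Return the indexes of frames to keep, spread evenly over the stream."""
    if not 0 < target_fps <= source_fps:
        raise ValueError("target_fps must be positive and no higher than source_fps")
    kept = []
    credit = 0
    for index in range(total_frames):
        # Each source frame earns target_fps units of credit; a frame is
        # kept whenever a full source_fps worth of credit has built up.
        credit += target_fps
        if credit >= source_fps:
            credit -= source_fps
            kept.append(index)
    return kept


print(frames_to_keep(8, 30, 15))
print(frames_to_keep(9, 30, 10))
```

    With this approach the gaps between kept frames stay nearly constant, which is what makes reduced-rate motion look smooth. Dropping the same number of frames in one burst would instead produce a visible freeze followed by a jump.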

    Data sharing may have an effect. Data packets are given priority over video packets. If you have an active data collaboration going on, you may begin to see some latency in things like lip sync. Perhaps your participation in the application will distract you from the synchronization problems.

    If you have other applications running, beyond those being used in the video conference, they may be siphoning cycles away from the codec, causing loss of frames and/or synchronization. This may be especially noticeable as you launch and stop applications. If you are experiencing serious synchronization problems, turn off other applications to see if that helps. A busy LAN and Microsoft Windows (the OS) buffering can also throw the synchronization off.

    Video Conferencing Etiquette

    Video conferencing, by its nature, is a social activity. As with any social activity, there are acceptable as well as expected behaviors that accompany it. Some of these behaviors are the result of culture or the environment whereas some may be said to reflect "common sense". Of course, there is also a range of definition as to what is "acceptable", "unacceptable", "desirable" and/or "expected" based on individual interpretation and temperament. Finally, when compared to other well-established social activities that combine people with technology (e.g., talking on the telephone, watching a movie in a public theatre, driving a car), video conferencing has not been around as long or had as much exposure. As a result, video conferencing "etiquette" is certainly not "carved in stone". However, there are some basic behaviors that will improve your own video conferencing experience as well as that of the people you are conferencing with:

    Testing, testing, 1, 2, 3…

    Perhaps the most overlooked experience-enhancing behavior in a video conference is simply to pay some attention to how others will be seeing and hearing you. In video conferencing, much of the experience at one end is affected by conditions at the other. Most video conferencing clients include a "self view" window. This lets you see how you appear to the remote end — whether or not you are completely viewable on camera, if there are distractions in the background, whether you are looking straight forward at the remote caller and not "gazing down from above" or "peering up from below". Even if the self-view window is not going to be kept up during the call, it’s a good idea to preview your image in the window and adjust accordingly prior to the call. Unfortunately, this doesn’t work for adjusting audio since your local audio is almost always suppressed from "feeding back" to you in local mode or even most test modes. In this case, testing and adjusting with a live call before a meeting begins or taking a few minutes to test and adjust at the start of a call is strongly recommended. Once a call is in progress, many people seem to tolerate poor audio or video conditions, not wanting to interrupt the conversational flow or simply because they figure it must be something "at their end". A short audio/video "rehearsal" is well worth the time spent as it contributes to making the technology as transparent as possible and enables comfortable, effective and rich communication.

    Leaving well enough alone…

    Once adjustments have been made at each end to produce optimal call conditions, perhaps the most important advice is to converse naturally and make as few additional adjustments as possible. True, some adjustments may be necessary in response to environmental changes (lights are turned on/off, background noise increases). However, unnecessary "twiddling" of audio or video can have very distracting results. For example, leaning forward and adjusting a desktop camera at your local end will produce the dreaded "giant palm monster" effect at the remote end, where viewers see friendly faces of acceptable proportions replaced by a far-too-intimate view of all or parts of a hand. Also, if limited range or uni-directional microphones are being used, excessive movement or position shifting at the local end will produce audio break-up, swells and fades at the receiving end.

    Are you still with me?

    Once your camera and incoming view window have been correctly positioned so that "eye contact" has been established between you and the remote site, you should remain focused in that direction. Shifts in attention, such as looking out a window, looking at other applications on the computer screen, or "multi-tasking" with other work in your office, have the same effect as not looking someone in the eye when talking to them in person. It’s important to realize that video conferencing is much more like an in-person exchange than a telephone call: body language and facial expression count!

    Talking out of turn…

    As with any in-person meeting, stray noises and side conversations within a video conference distract from the primary conversation. This can complicate point-to-point meetings and becomes even more noticeable in multi-point meetings. It seems to be easier for participants to forget that they are truly part of a group conversation since the meeting room is virtual rather than physical. Side conversations at remote sites seem to spring up more readily than they would if everyone were in the same actual room. The microphones and speakers necessary for sending/receiving audio complicate matters further in that they do not differentiate between relevant and irrelevant sounds. They will readily pick up any conversation that is taking place near them and send it along. They will also just as happily pick up and transmit a sneeze with as much sound quality as a well-intentioned remark. Given these "imperfections" with technology (and with people!), it is good practice to mute your own audio when you are not speaking. In a point-to-point conversation, this isn’t as necessary and may actually result in unnatural pauses in the conversation as muting at either end is turned on or off. However, it is useful in situations where audio may be poor at either end and can be used to minimize the effects of the audio problem on the overall conversation. In a multi-point conference, muting your local audio by default and unmuting only when you want to speak is almost always a good idea. This is especially relevant in the case of a voice-activated MCU since capturing the conference audio will also result in capturing the conference video. Any "side action" at your site will then be displayed along with any "side noise". Think about it - you may not want everyone seeing your facial expression as you dissolve in a fit of coughing, or watching you tumble oh-so-gracefully over the chair that you just knocked down!

    Wow! Where’d you get that shirt?

    Once you minimize audio distractions, it’s time to think about minimizing video distractions. How and how far to go about doing this is a topic of some debate. "Traditional" video conferencing has paid significant attention to proper lighting, room aesthetics, and attire, particularly in "board room" or group settings. This is a sensible approach to a technology that relies on cameras and monitors to create the conversational environment. Such "production" aspects are similar to those that are considered when producing high quality television and video presentations. These are especially applicable in preparing a conference room or a classroom for group use (see Appendix XX: "Preparing Your Room for Video Conferencing"). However, if conferencing is going to take place on a regular (maybe daily) and unplanned basis from desktops located in individuals’ offices or homes, the acceptable degree of "sensible preparation" becomes less clear. If communication via video conferencing becomes as commonplace as using the telephone, what will our norms for video etiquette be? Will we have to stop wearing favorite clothes if they have complicated patterns or loud colors, in case we get a call that day? Will we have to re-engineer lighting in our homes and offices, or set up "video friendly" areas to take all of our calls? What if we’re mobile? Get out of the sun? Step into a "video phone booth"? The answers to these questions are likely to change as the "human protocol" for video conferencing evolves and as the technology becomes more capable of simulating "reality". During this evolution, it’s important to consider what does and doesn’t work well at any given time and in any given situation to ensure that you are making informed choices.

    We’re all in this together!

    A final subtle but very important point of video etiquette is that, when you are in a video conference enabled meeting, though participants are located in physically different places, it is truly a "real" meeting! At first pass, this means things like "you should be on time", "you should pay attention", "you should make sure everyone has the same information going into the meeting", "you should bring enough materials for everyone". In the case of a multipoint meeting, these considerations are more complicated in delivery but compounded in importance. For example, if hard copy materials will be used in the meeting, they should be sent to all locations ahead of time (not unlike preparing for a teleconference). If printouts will be made from electronic material presented during the meeting, you should be sure that all sites have the capability to print the materials. If particular local objects or room locations will be shown during a meeting, care should be taken ahead of time to ensure that camera views of these are available for remote participants.

    Can I have some of that too?

    A couple of other considerations are a little less obvious but really do make a difference, particularly in multi-point meetings when groups of people have been brought together at each of the participating sites. The first is that information which is specific to each local site (e.g., where the restrooms are, where to find a phone) may need to be distributed to those that are at the local site but isn’t relevant to remote sites. Distribution of this information should be handled locally via pre-meeting communication, local handouts, or prior to the start of the meeting with local audio muted. In addition to this, if amenities differ from site to site, care should be taken to minimize group exposure to the differences in amenities. (In other words, if bagels and coffee are available at one site but not at another, it would be most polite to eat off camera!) Better yet, care should be taken to ensure that amenities are equal. Remember, it really is one meeting!

    Practical Video Conferencing Steps


    This section covers three products which have been used by ViDe members. The purpose of this section is to give you some idea of how the terminal end stations work, what makes them similar, what sets them apart. In general they are very easy to use, and this information should set your mind at ease about bringing such a system into your environment.

    The slides for the Video Conferencing session at the SURA Video Workshop on March 3-4, 1999 also provide good information on practical steps.
    We have also prepared several pages on application sharing and data collaboration. Instructions on Application Sharing and Data Collaboration describe several methods for setting up "App Sharing" under the VCON client. It addresses point-to-point calls as well as multi-point calls, though the latter must be used in conjunction with something like Instructions on Multipoint Application Sharing and Data Collaboration, which describes an additional step which instructs the MCU to orchestrate the "App Sharing".

    And we want to point out the Buyer's Guides & Tests that are provided at the Network World Fusion site. The reviewers at Network World give information on many terminal end stations including platform, cost, features and functionality such as:

    Network Requirements

    The Network Connection: Make no assumptions

    Video conferencing was originally deployed over networks that could provide some guarantees about the level of service that would be delivered to the application. The ISDN and/or dedicated T1 circuits of the H.320 standards-based world provided predictable delays over dedicated paths. This allowed video conferencing vendors to create products to work within these parameters. Dedicated circuits are also expensive circuits. H.323 standards-based video conferencing was engineered for video conferences that take place on a data network, such as the Internet. Such networks were not originally intended for delivery of sensitive near real-time applications, and do not provide any QoS (Quality of Service). The data network is used for multiple purposes: e-mail, web browsing, and other activities take place inter-mixed with H.323 video conferencing. The audio/video information within a video conference is put into a series of data packets. These data packets are likely to arrive at their destination at varying times, and out of order. To keep the "real time" impression of an interactive video conference, the packets must arrive on time and in time to be re-ordered for delivery through the video conferencing terminal. There is currently no method for giving one type of application priority over another; all packets look the same at the lowest levels of the network.
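
    The re-ordering step described above can be pictured as a small holding buffer at the receiving terminal. The sketch below is only an illustration of the idea, not any vendor's implementation: the class name, the fixed buffer depth, and the plain integer sequence numbers (standing in for the counter that media packets carry) are all assumptions. It ignores sequence number wrap-around and duplicate packets.

```python
import heapq


class ReorderBuffer:
    """Hold a few packets so late arrivals can be slotted back into order."""

    def __init__(self, depth):
        self.depth = depth        # packets to hold back before playout
        self._heap = []           # min-heap keyed on sequence number
        self._last_played = -1    # highest sequence number already released

    def push(self, seq, payload):
        """Accept one packet; return the packets now ready for playout."""
        if seq <= self._last_played:
            return []             # too late: its place has already been played
        heapq.heappush(self._heap, (seq, payload))
        ready = []
        while len(self._heap) > self.depth:
            item = heapq.heappop(self._heap)
            self._last_played = item[0]
            ready.append(item)
        return ready

    def flush(self):
        """Release whatever is still held, in order (end of call)."""
        ready = [heapq.heappop(self._heap) for _ in range(len(self._heap))]
        if ready:
            self._last_played = ready[-1][0]
        return ready


buffer = ReorderBuffer(depth=2)
played = []
for seq in (0, 2, 1, 3, 5, 4):          # hypothetical arrival order
    played.extend(buffer.push(seq, f"frame-{seq}"))
played.extend(buffer.flush())
print([seq for seq, _ in played])
```

    The trade-off is visible in the `depth` setting: a deeper buffer tolerates more disorder but adds delay before playout, while a shallow one keeps the conversation snappy and discards packets that arrive "on time" by network standards but too late for the conference.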

    It is best to make no assumptions at all about your network's readiness for videoconferencing. You are advised to inform your network staff of your plans early on in your decision process, and ask them if your network will support videoconferencing in the locations you have in mind, including off-campus locations.

    The network components to be considered for videoconferencing include the wiring, terminating jacks, and network electronics. Low-end videoconferencing systems (64-128 kb/s) may work well enough on most networks, but higher performance systems (> 700 kb/s) are more demanding.

    Category-5 (or better) horizontal network wiring combined with fiber optic vertical wiring is recommended. Many university campuses have older wiring installed in some locations; if your wiring does not meet these specifications you should upgrade it.

    The most common connection from a campus desktop to the network is through a device called a hub, which provides a shared Ethernet connection. Unfortunately, shared Ethernet hubs are not recommended for use in videoconferencing. Shared Ethernet is a "party line" communications system: every packet sent to or received from any computer plugged in to the hub is echoed to every connected device. When one computer is sending or receiving data, it is given sole access to the network and the other devices are blocked temporarily. This system works well enough if there are only a small number of devices sharing the hub, and if the data being transferred varies in size and is not time sensitive. Since video conferencing involves a continuing, bi-directional stream of traffic that is time-sensitive, use of a hub tends to cause degraded performance for all the computers involved. The recommended connection is a switched Ethernet connection. Switched Ethernet is a "private line" that keeps your traffic from interfering with other computers, or their traffic from interfering with yours.

    The building and campus backbones must provide enough bandwidth to support the use of switched connections. Because there is no way to prioritize Ethernet traffic, network architects design networks with an excess of bandwidth in the hope that traffic uses only a small portion of available bandwidth and will therefore flow freely without congestion. Implementing QoS for Ethernet networks is currently a research topic. The Internet2 QoS pages are a good source of information about these developments. Your network staff is familiar with your campus network architecture and capacity; you should discuss your requirements with them.

    The path along the network between video terminals, or from terminals to the MCU, will also affect the performance of your conference. Network packets do not necessarily take the shortest path from one location to another; routers determine which path is taken. A router must examine the destination address of the packet and then calculate where to send it. Every pass through a router is called a "hop". Because a calculation is involved, even though it occurs at very high speed, every "hop" adds a bit of delay to the total time required to transit the entire path. Excessive network "hops" can cause problems such as:

    To learn the network path involved you can use a tool called "traceroute". (There are many freeware and shareware packages available for all platforms that provide this utility.) Traceroute will reveal all the hops involved, and also provides information about the amount of delay, in milliseconds, at each hop. The traceroute utility checks the path FROM your computer TO the MCU (or other destination you specify). Traceroute does NOT check the path in the reverse direction. Routing is not symmetrical: the path from A to B will not necessarily be identical to the path from B to A. As a result, a videoconference may run wonderfully in one direction and poorly in the opposite direction (in fact, those symptoms indicate asymmetric routing). Therefore, it is important to initiate a traceroute from each location. A traceroute initiated from your end will show half the information you need; your videoconferencing partner should initiate a traceroute at their end to learn the rest.
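
    Once you have a traceroute in hand, one simple way to read it is to look for the hop where the delay rises the most. The sketch below assumes you have copied one representative delay figure per hop into a list by hand; the function name and the sample numbers are hypothetical, and a single trace is only a snapshot, so repeat it before drawing conclusions.

```python
def largest_delay_jump(hop_delays_ms):
    """Return (hop_number, increase_ms) for the biggest rise between hops.

    hop_delays_ms lists the delay reported for hop 1, hop 2, and so on.
    """
    if len(hop_delays_ms) < 2:
        raise ValueError("need at least two hops to compare")
    worst_hop = 0
    worst_jump = float("-inf")
    for i in range(1, len(hop_delays_ms)):
        jump = hop_delays_ms[i] - hop_delays_ms[i - 1]
        if jump > worst_jump:
            worst_hop = i + 1     # hops are numbered from 1
            worst_jump = jump
    return worst_hop, worst_jump


# Hypothetical delays copied from a traceroute toward an MCU.
delays = [1.0, 2.0, 3.0, 48.0, 50.0]
hop, jump = largest_delay_jump(delays)
print(f"largest rise is at hop {hop}: +{jump:.1f} ms")
```

    Sharing a result like this with your network staff, along with the matching trace from your conferencing partner's end, gives them a concrete place to start looking.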

    Routing changes can be handled only by the central network staff. They can either make the change themselves, or have procedures in place to contact someone who can.

    And finally, many universities have installed firewalls to protect their campus networks from hacker attacks. H.323 is complex, uses dynamic ports, and includes multiple UDP streams. Therefore, it turns out to be a non-trivial task to configure the firewall so that H.323 traffic can pass through. See the Intel paper, The Problems and Pitfalls of Getting H.323 Safely Through Firewalls for more information.

    Selecting and Tuning your PC


    This year we are welcoming the newest addition to the cookbook team, the columnist Gabby. Gabby is here to give you advice on PC selection and tuning for best performance with video conferencing. Gabby will also attempt to reply to odd (or even normal) behaviors you might see with your PC while video conferencing. You can send your own questions and/or answers to Gabby at Gabby's Mailbox. We'll be happy to post them here.

    Take it away, Gabby.


    Dear Gabby,
    Does the system configuration for my PC really matter?
    Signed,
    Sizing It Up

    Dear Shopper,
    The video conferencing components are housed inside a desktop computer or workstation; if the computer system is not powerful enough to support the videoconferencing hardware and software, the video conferencing terminal will provide poor performance. Once you have selected a particular videoconferencing product, be sure to review the vendor's PC specifications. Remember that the vendor will specify the MINIMUM requirements (usually operating system, processor speed, amount of random access memory, video display board, and video memory). In short: Overbuild.
    Signed,
    Gabby


    Dear Gabby,
    Can't I just get by with the minimum system configuration?
    Signed,
    Nursing My Pentium I

    Dear Florence,
    MINIMUM requirements should be interpreted to mean that there will be no other programs active on the PC while the videoconferencing software is running. If you anticipate that the videoconferencing PC will have an e-mail program running, a web browser open, and be playing music from the CD player while you are videoconferencing, you should select a PC that exceeds the minimum requirements. Random access memory and choice of video board/video memory have the greatest impact on videoconferencing performance. The video board is important because the videoconferencing hardware displays the camera image on the computer screen by using the computer's video board. If the codec is very fast but the computer video board is slow it will degrade overall performance.
    Signed,
    Gabby


    Dear Gabby,
    A bunch of folks at my organization frequently need to be in the same video conference at the same time. I suggested we all "squeeze in" around my desktop but everyone else just groaned. Don't you think that would work? Who's right -- them or me?
    Signed,
    The Queen of Makin' Do

    Dear Queen,
    Well, it's certainly possible to squeeze together in front of the desktop so it's hard to say who's right. I guess it depends on your definition of "would work"! Unless your desktop camera has a wide angle lens, you won't be able to fit everyone in front of it and the remote site will not be able to see all of you. Also, depending on the size of your PC monitor, people at your end are likely to have to strain to share the view. Sound might work out O.K. for a small group. Microphones and speakers often "stretch" better than the video peripherals do. Still, it's probably better to admit defeat on this one. Once you have more than a couple of people who want to conference together, you either need to bring everyone into the conference from their own desktops using an MCU or set up a group conferencing area where everyone can meet around a group sized system. Remember, makin' do isn't always makin' sense ;-)!
    Signed,
    Gabby


    Dear Gabby,
    I heard that I could spend just a little more money on the audio/video components for my PC and turn it into a group conferencing system. If that's true, why do group conferencing systems cost so much more?
    Signed,
    Careful Shopper

    Dear Careful,
    Imagine you have a VW Beetle but you'd like it to carry several passengers and their week's worth of camping gear and drive over rough terrain, like a Ford Explorer. You have a chance of making it work if you a) understand the design limitations of the Beetle, b) understand the design goals of the Explorer, c) have at least some money and time to spend, and d) are handy with tools and improvisation. Same idea. If you can build it yourself, you might be able to save some money. However, if you're not the type to re-engineer something and then support what you have re-engineered, buy a group system "out of the box"!
    Signed,
    Gabby


    Dear Gabby,
    Sometimes my PC seems so sluggish when I'm doing anything else during a video conference. Where is processing typically done on a VC system?
    Signed,
    Waiting For My Spreadsheet To Load

    Dear Abacus,
    Where your video processing occurs depends on the client you are using. Some clients come with a special add-on board which will offload some or all of the video work. Some clients, generally the cheaper ones, will rely on your main processor to handle the video. Therefore slower PCs may see worse performance during application sharing. The moral here is, if you want a cheaper video conferencing client, install it on a faster PC.
    Signed,
    Gabby


    Dear Gabby,
    What if I don't use a PC?
    Signed,
    Discriminating

    Dear Endangered,
    You are certainly in a "no-Win" situation!

    There is a notable gap in the H.323 market in terms of both UNIX and Macintosh video terminals. This is due to the large, general consumer market for Windows systems. Multicast tools are available for Unix; H.323 products may surface soon as well.
    Signed,
    Gabby


    Dear Gabby,
    I have a really nice, fast PC. It has two processors in it as well. Why can't I use this system with a board assisted video conferencing client?
    Signed,
    Cycle Big Shot

    Dear Tycoon,
    Video conferencing assist boards must go into certain PCI slots. It appears that vendors have programmed their software to address a particular range of IRQ numbers. The second processor typically fits into a slot higher up and therefore throws off the IRQ numbers for the video conferencing board. Should you install the video conferencing assist board in a dual processor PC, mayhem ranging from refusal to operate to constant, regular video freezes results.
    Signed,
    Gabby


    Dear Gabby,
    Just when I thought I was handling the Information Age pretty well, the other day I got all flustered. It all started when my video conferencing client "rang" at the same time that my email beeped, my telephone rang, and someone stopped outside my office door. Each interface seemed to demand that it be "first". I let the phone go to voice mail, left the email for later, took the video call but asked them to "hold" while I talked to the in-person person. But this sudden crisis of communication left me very confused. For the rest of the day, I kept trying to do strange things - Drag and drop a phone number from the address book on my PC to my telephone. Eat a bagel that was sitting on the conference table of a remote site I was video conferencing with. Pan my telephone handset around the room to "show" the caller my new office arrangement. I even became convinced at one point (and quite frustrated thereafter!) that I could cut and paste a good joke into a colleague's mind. Am I going crazy?
    Signed,
    Worried and Wondering

    Dear W&W,
    You're not nuts, just harassed as well as maybe a little ahead of your time (I mean, really, cutting and pasting into people's minds??!) Someday it will all come together. For now, go outside for a nice quiet sit. Don't take the cell phone, the pager, the PDA, the laptop, your pile of reading, your Dick Tracy watch, or your Maxwell Smart shoephone. Just stop. Look. Listen. Learn to be still.
    Signed,
    Gabby

    Advanced Video Conferencing Functionality and Management


    Gatekeepers
    Gateways
    MCUs
      State of Conferencing Services Products



     
     
    As we have discussed throughout this book, the H.323 standard defines a video conferencing terminal for making simple point-to-point video calls. The standard also defines three additional and related components that extend or improve access to video conferencing functionality. These components include gatekeepers, gateways, and multipoint conferencing units (MCUs). We take a closer look at what these components are intended to do as well as briefly discuss the current implementation state of each in the section below. 

    Gatekeepers

    An H.323 gatekeeper is assigned control of a particular set of video conferencing resources (terminals, gateways, MCUs) and functions somewhat like a video conferencing "traffic cop". In this role, the gatekeeper can provide or facilitate several services that enable H.323 conferencing to be more reliable and more secure. If a gatekeeper is present on the network, the H.323 standard requires that H.323 compliant terminals register themselves with the gatekeeper and allow the gatekeeper to identify them to others and control their activities within the zone. If a gatekeeper is not present, the standard allows the terminal to control its own calls, placing them via IP address with no gatekeeper registration or intervention required. In practice, however, gatekeeper registration behavior is somewhat unclear (how does the terminal know for sure that a gatekeeper is present? What if there is more than one gatekeeper readily available?) and difficult to enforce (what if a terminal registers with a "rogue" gatekeeper that has been installed on the network? What resources will the rogue gatekeeper be able to provide access to?).

    Once a terminal is registered with a gatekeeper, the H.323 standard identifies some broadly defined key services that the gatekeeper could provide. Gatekeepers today are available as full-featured standalone software applications and also as scaled down "built-in" functionality included within H.323 terminals, gateways, and MCUs. The degree of video resource identification and control provided by current gatekeepers varies widely and interoperability between one vendor's gatekeeper and another vendor's gatekeeper-controlled resource can be very uneven. Additionally, inter-zone communication and resource sharing between gatekeepers is far less than what would be needed for seamless conferencing on a global IP network such as the Internet or Internet2.

    The issues surrounding such implementation can be numerous and it is safe to say that discussions about standards development as well as implementation of H.323 gatekeepers often produce more questions than they answer. However, it is widely agreed that the gatekeeper is a key concept and component for enabling scalable, Internet-based video conferencing. Most organizations are approaching gatekeeper deployment with the mindset that gatekeepers must be deployed, even "as is", while the developers and the community work to make them what they can and should be.

    Gateways

    A gateway provides transcoding services such as address translation, network protocol translation and audio/video coding translation between dissimilar media. The most common current type of gateway transcodes between H.320 (ISDN) and H.323 (IP based LAN) protocols. H.320 and H.321 (ATM), as well as H.323 and H.321, gateways also exist.

    Gateways have multiple common uses, the most straightforward of which is to allow an ISDN-based system to join a video conference of LAN-based systems. This permits conference participation from areas that do not have high-performance networks available. While LAN-based video conferencing is the newer and more economic technology, ISDN-based systems are likely to be used into the foreseeable future.

    A secondary use for an H.320/H.323 gateway is to provide redundancy between LAN-based MCUs. Should a network break occur, a conference could be routed alternately from one MCU, across a local LAN, through a gateway, over the PSTN, back through a second gateway and onto the LAN local to the second MCU.

    Because gateways function between protocols, and not within a single protocol, some special configuration may be required. In particular, the RAS (registration, admission and status) section of the H.323 specification, which permits dynamic conference ID registration, has no functional equivalent in the H.320 specification. The result is that if a gatekeeper is present, the conference ID must be pre-defined for multipoint calls. Point-to-point calls not using a gatekeeper do not require special treatment.

    A second configuration issue to be careful of is that IVR (interactive voice response) systems often use the asterisk (" * ") to signal request for operator. In such an environment predefined groups intended for use with gateways shouldn't include asterisks. Unfortunately, this requirement conflicts with the trend among H.323-only users to utilize the asterisk as a delimiter.

    Some CPU intensive audio transcoding can cause significantly delayed audio, resulting in an objectionable lack of audio/video synchronization. H.323 systems use G.723 and G.711 while H.320 systems use G.728 and G.711. G.711, the protocol in common, provides toll quality audio but uses 64Kbps. Disabling transcoding minimizes audio delay due to transcoding but would leave only 64Kbps available for video in a 128Kbps single circuit ISDN call. Enabling G.728-G.711 transcoding would reduce the audio bandwidth requirement to 16Kbps and free an additional 48Kbps for video. In a 384Kbps triple circuit bonded ISDN call, minimizing the audio delay might be deemed worth the minimal video degradation. Whether to permit audio transcoding needs to be decided on a call-by-call basis.
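
    The trade-off above is simple arithmetic, and a small sketch can make it easier to repeat for other call speeds. The function name and the table of codec rates below are illustrative assumptions; the rates themselves are the ones quoted in the text, and framing or signalling overhead is deliberately ignored.

```python
# Audio codec bit rates in Kbps, as quoted in the text above.
AUDIO_KBPS = {"G.711": 64, "G.728": 16}


def video_budget_kbps(call_kbps: int, audio_codec: str) -> int:
    """Bandwidth left for video once the audio codec's share is removed.

    Simplification: ignores framing and signalling overhead, so real
    calls will have somewhat less than this available for video.
    """
    return call_kbps - AUDIO_KBPS[audio_codec]


if __name__ == "__main__":
    for call in (128, 384):
        for codec in ("G.711", "G.728"):
            left = video_budget_kbps(call, codec)
            print(f"{call}Kbps call with {codec} audio: {left}Kbps left for video")
```

    Comparing the two rows for each call speed shows why the decision is made call by call: the same audio saving is a large share of a 128Kbps call but a small share of a 384Kbps call.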

    MCUs

    The ability for two people at separate and remote locations to shrink the impact of the geographical boundaries between them via video conferencing is certainly exciting and valuable. The concept becomes even more powerful when several locations can be brought together into the same conference, creating a "virtual meeting room" that exists for that particular time and group configuration facilitated by the network. Such "meeting rooms" are created through the use of a Multipoint Conferencing Unit (MCU). The purpose of an MCU is to connect three or more video conferencing systems in the same conference, managing audio and video from each participant to the others such that group communication is achieved. Data sharing is also possible between all participants in a multipoint conference though current implementations vary greatly in terms of how this is done and also how well it works.

    The H.323 standard outlines two component processes that form the basis of any multipoint interaction, the MC (multipoint controller) and the MP (multipoint processor), and two different ways to provide multipoint functionality overall: centralized and decentralized. The MP is optional and, if present, there may be more than one.

    The MC provides for overall control of the conference. This involves forming connections between all endpoints, negotiating common capabilities, and communicating to the MP regarding any necessary switching of audio/video sources. The MP handles the actual processing of incoming and outgoing audio/video streams. Audio from all sites in a multipoint conference is typically mixed and delivered back to all sites in full duplex mode. Video, on the other hand, may be handled in a few different ways:

    1. Switched based on voice activation (everyone sees the current speaker)
    2. Switched via manual control ("chair control", where the designated chair decides whose video is being seen)
    3. Displayed together on a split screen display ("continuous presence", also sometimes called "Hollywood Squares")
    4. Displayed in individual video windows, one for each site that is being received.
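
    As an illustration of the first option, voice-activated switching can be sketched as choosing the loudest site, with a little hysteresis so that the picture does not flip on every cough. This is only a sketch under stated assumptions: the function name, the audio level units, and the `margin` parameter are hypothetical and are not taken from the H.323 standard or from any vendor's MCU.

```python
from typing import Dict, Optional


def select_speaker(levels: Dict[str, float],
                   current: Optional[str],
                   margin: float = 1.5) -> Optional[str]:
    """Pick the site whose video is shown to everyone.

    levels  -- site name mapped to its recent audio level (any consistent unit)
    current -- the site currently being shown, or None if nobody is shown yet
    margin  -- how many times louder a challenger must be to take over
               (hysteresis; an assumption made for this sketch)
    """
    if not levels:
        return current                      # nobody connected: keep the status quo
    loudest = max(levels, key=levels.get)
    if current not in levels:
        return loudest                      # no valid current speaker: show the loudest
    if levels[loudest] > margin * levels[current]:
        return loudest                      # challenger is clearly louder: switch
    return current                          # otherwise stay with the current speaker
```

    A chair-controlled conference would replace this selection with an explicit choice made by the designated chair.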

    In a centralized MCU, the MC and MP are included in a single unit to which all endpoints connect. This forms a physical and logical star configuration with the MCU at the center. Each endpoint is, in effect, in a point-to-point call with the MCU.

    In a decentralized MCU, there is no device that can readily be pointed to as "the MCU". Instead, the component processes (MC and MP) are present to some degree in the client endpoints. The MC of one endpoint will most likely be used to control the conference while each endpoint uses its own MP to send/receive streams in accordance with its own capabilities. The video/audio/data streams from each endpoint are sent one-to-many, which requires the use of IP multicast to facilitate group identification and participation.

    Arguments for and against centralized versus decentralized multipoint conferencing are not unlike those surrounding the debate of centralized server-based computing versus peer-to-peer computing. However, with particular respect to H.323 multipoint, the centralized approach has a practical lead at this time given the current state of the H.323 standard. Centralized MCUs are more thoroughly defined and more readily understood, therefore they are more widely available in standardized product implementations. Still, a quick review of the pros and cons of each approach can be helpful.

    Centralized functionality lends itself to improved reliability, control and management. It also allows for advanced capabilities to be introduced into one entity but made available to all, thereby reducing costs at the endpoints. Of course, cost is then shifted to the central unit - in this case, the MCU. Other functionality, such as additional transcoding or network gateways, can also be fairly readily added to a centralized MCU, extending the service capabilities further than "simple" multipoint call handling. Again, this increases the cost and complexity of the MCU while decreasing cost and complexity required for client endpoints. Another consideration is that, until quite recently, most centralized MCUs forced each conference participant to the lowest common denominator for call capabilities. For instance, if one participating endpoint could only send/receive QCIF calls at 128K bandwidth, all other participants in the same conference would be forced to send/receive the same. This limitation is changing as increased transcoding capabilities are being introduced into some centralized MCUs.

    Decentralized functionality more readily supports flexibility for end-users and a more distributed load over the network. Cost can be determined and distributed based on capabilities desired for particular endpoints. Each endpoint also determines its own send/receive capabilities and does not need to adjust these based on what other participants can do. Also, in addition to providing a mechanism for group calling, support for IP multicast allows for the most efficient use of bandwidth as determined by the placement and concentration of participating endpoints within the network.

    Another consideration for the implementation of an H.323 MCU is hardware versus software-based. Again, the factors influencing the decision are not exclusive to a discussion of H.323. Hardware implementations tend to be more expensive and are likely to contain a variety of proprietary components but are likely to be faster and also tend to be more reliable. Software implementations are more portable, more flexible, and less expensive but may suffer performance issues due to their reliance on the operating system and resources of the computer they are running on. Each type of implementation is available on the market today in a variety of forms. A careful matching of performance requirements to cost variables should be combined with a broad comparison of available products within each implementation type before a final buying decision is made.

    There are a few different hardware-based MCU configurations that are available as of this writing. One type features a modular chassis that holds one or more power supplies and a number of other interface cards. Connection "ports" are included on some of these interface cards with the number of ports available corresponding to the number of sites that can be participating in conferences at the same time. Other hardware-based MCUs are based on more streamlined units that do not feature pluggable modules but instead are ordered with the desired number/type of ports built in. In either case, multipoint conferences involving specific numbers of endpoints (e.g., a three-point conference, a six-point conference, a 20-point conference, etc.) are "brought up" on the MCU and encumber as many actual ports as necessary for the number/type of connections and the amount of time required. Some MCUs include scheduling capabilities that allow conferences to be configured/scheduled in advance and brought up automatically. Others only allow ad hoc use of available ports on a "first come, first served" basis.

    Software MCUs operate in much the same way as hardware-based MCUs but consist only of a software package running on a powerful server/computer. Software MCU manufacturers usually limit the number of simultaneous connections by a license key which is purchased by the customer. However, there are technical limits to the number of sites that can be connected together at one time based on the processing power and speed of the server.

    Both hardware and software-based MCUs can be connected together to allow larger numbers of sites to be conferenced together simultaneously. This is termed "cascading" and is a functionality that is described in the H.323 standard. MCUs from different vendors should therefore be able to be cascaded together quite readily. In order to do this, one of the ports on each of the MCUs is used to "call into" the other.

    Audio and video mixing/switching should still operate as if there is only one MCU involved; the cascading is transparent to the participants.
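
    Because each cascade link uses up one port on each of the two MCUs it joins, cascading adds less capacity than the raw port counts suggest. The sketch below estimates the number of endpoints that a set of cascaded MCUs can serve. It assumes, as described above, exactly one port per MCU per link, and additionally assumes the MCUs are joined with the minimum number of links (a chain or star with no redundant links); the function name is hypothetical.

```python
from typing import Sequence


def cascaded_endpoint_capacity(ports_per_mcu: Sequence[int]) -> int:
    """Endpoints that can join one conference across cascaded MCUs.

    Assumptions for this sketch: the MCUs are connected with the minimum
    number of links (n - 1 links for n MCUs), and every link consumes one
    port on each of the two MCUs it connects.
    """
    mcu_count = len(ports_per_mcu)
    if mcu_count == 0:
        return 0
    links = mcu_count - 1
    return sum(ports_per_mcu) - 2 * links


if __name__ == "__main__":
    # Two 24-port MCUs cascaded together.
    print(cascaded_endpoint_capacity([24, 24]))
```

    Vendor licensing or processing limits may reduce the figure further, so the result is best read as an upper bound.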

    State of Conferencing Services Products

    As it is early in the H.323 lifecycle, the service marketplace is not nearly as crowded as the terminal space. There are two general categories of vendor in the service space: those that provide software MCU products, and those that provide the entire suite of H.323 services - that is, gatekeepers, gateways, and MCUs. In the first category, Whitepine, PictureTel, and DataBeam provide software MCU products. Whitepine provides an MCU solution focused on conferencing known as MeetingPoint, and an alternative offering, ClassPoint, which is optimized for a tele-classing environment. The PictureTel 330 MCU software server runs under Windows NT and can handle up to 24 simultaneous H.323 terminals.

    VideoServer, RADVision, and Accord offer distinctly different hardware-based solutions for H.323 services delivery. VideoServer, the incumbent market leader in H.320 MCU services, provides H.323 services via their Encounter family of products. The Encounter product line includes the NetGate gateway, NetServer MCU, and gatekeeper software that is available for either platform. The VideoServer hardware is a Pentium PC platform running Windows NT and a Microsoft Web Server for management. The Encounter platforms are available in a work-group and the larger enterprise chassis.

    The RADVision offering is a custom RISC-based system running on an Intel i960 platform and includes both MCS and H.320/H.323 gateway implementations. The hardware is thin, rack-mountable, and stackable. As with the VideoServer solution, the gatekeeper software runs on either the gateway or the MCS platform. RADVision's products are closely coupled with their gatekeeper, so that rather than scheduling network resources, the gatekeeper searches for available multipoint or gateway services and provides them to the user. The RADVision gatekeeper also supports call forwarding and the ability to transfer calls to a 'video operator' or receptionist if the called party is unavailable.

    Accord offers what they call a "universal transcoding gateway." Effectively, this is an all-in-one video conferencing services platform. Accord claims that their platform supports LAN-based, ISDN, and broadband conferencing services in service provider packaging. The Accord hardware is a proprietary chassis-based architecture built for scalability and reliability. Accord recently announced an agreement to implement their H.323 services using RADVision software to become available in 1999.

    As is the case with the H.323 terminals, the current MCS offerings have limited interoperability. While developers are working on interoperability issues, it is critical that anyone deploying these services today pay close attention to the specific terminals, servers and versions they plan to implement and test these specific pieces together as a system.

    A further concern is that there is currently a lack of qualified system integrators. While there are many integrators that can successfully demonstrate several terminals working together and will certainly sell them to you, very few have an understanding of the complexities of full-scale H.323 deployment in an environment as complex as that found on a typical college campus today, much less an Internet2 site connected directly to the vBNS or Abilene. So, the deployment of advanced H.323 services remains a truly uncharted territory, somewhat frightening, but an exciting place for the adventurous video explorer.

    Related Topics


      Storing sessions on Video Servers
      Broadcasting Conferences 
      Supporting different video encoding formats
    What About Multicast?
    Models for Campus Deployment


    Storing sessions on Video Servers

    One way to greatly improve the utility of video conferencing would be to store a video conference on a server for playback later. For example, a class video conference could be stored for later viewing by someone who missed it, or a legal meeting could be stored as part of the record of the proceeding. However, this is not very easy to do at the present time. The best way to store a meeting on a video server currently is to feed the analog video output from the system into a video server (or a VCR for a low-tech solution). This has several drawbacks: the quality of the video is reduced, data transmitted with the video conferencing session is lost, and it requires a lot of manual setup on the part of the users.
    Over a year ago, vendors told us they were working on ways to feed the digital data directly into a video server, but we have yet to see any products with this feature even enter beta testing.

    Broadcasting Conferences

    Related to storing conferences is the notion that it would be nice to be able to broadcast a conference to many users. This scenario is often described as the 'brown bag lunch' scenario. A panel of 'experts' engages in a normal H.323 interactive videoconference. The output of this conference is broadcast to many hundreds or thousands of viewers elsewhere. The viewers are not 'in' the conference and cannot participate in the normal way, although it is possible that they could ask questions of the panel via chat or email. The scenario generally uses IP multicast as the means by which to broadcast the signal to many destinations. As with storing sessions on a video server, this functionality is currently in development and should be available in a similar time frame.
    VTEL's TurboCast product provides this functionality when using some of VTEL's videoconferencing clients. Most applications of this type are done using video broadcasting systems such as Real Networks or Microsoft Windows Media.

    Supporting different video encoding formats

    Although the H.323 video conferencing standard specifies the use of a standard video encoding format, H.261 and the optional H.263 video codec, some video conferencing software provides the flexibility to use additional audio and video encoding formats. Often, when making a call to another terminal made by the same vendor, the video conferencing software will automatically switch to a proprietary encoding format that the vendor feels provides superior quality audio or video. Also, users with access to high bandwidth networks might want to use a higher quality, higher bandwidth encoding format such as MPEG-1 or MPEG-2. Using a non-standard video format is okay, as long as the video conferencing software supports standard formats as well, so calls can be made to terminals made by another manufacturer.
    One popular system in the higher education market is Litton Network Access System's Camvision product. This system provides MPEG-2 over Ethernet or ATM links, however, it is not H.323 compliant. Four-user multipoint calls are supported with this system, without requiring an MCU. The H.321 standard for videoconferencing over ATM is also in use at some higher education sites. It too is incompatible with H.323 clients, unless some sort of bridge is used. Currently, bridges of this sort are done using the analog video ports on the devices.

    What About Multicast?

    How H.323 traffic travels the Internet

    H.323 videoconferencing sessions travel across the network on top of the network-layer protocol known as IP. The H.323 standard uses two types of transport over IP: TCP and UDP. TCP is designed to guarantee that the data arrives in full, in its original condition. UDP is designed to get most of the data to the destination most of the time. When you purchase something online on the Internet, you are using the TCP transport. Both you and the vendor want to be sure that your order is received completely and accurately. You would not be happy if a few bits were changed in the amount charged to your credit card. In the event that some error occurs during the data transfer, the transaction can be repeated over and over again until it is done correctly. In contrast, suppose you are watching a live broadcast of a sports event. If there is a glitch and a few frames get dropped, you probably don't care. If you had the last frames sent again and again until successfully delivered, what would you do with them? See them out of order? Stop live recording and wait? This type of transport is known as UDP.

    The H.323 standard requires the use of both TCP and UDP transport. TCP is used for control and data sharing such as file transfer. You do definitely want to be sure that sessions are set up correctly; you do want to guarantee that there are no errors in the transmission of a document. UDP is used when sending video, audio, and status information. Most of the time, most of this type of data arrives correctly; when it does not, we don't care unless the percentage of missing information becomes large enough to be noticeable.
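
    The difference between the two transports shows up directly in how a program opens its sockets. The following sketch uses Python's standard `socket` module over the loopback interface only; it is not H.323 code, and the function names and payloads are hypothetical. The TCP version sets up a connection before any data moves, while the UDP version simply sends a datagram and hopes it arrives.

```python
import socket


def tcp_roundtrip(payload: bytes) -> bytes:
    """Send payload over a loopback TCP connection (reliable, ordered)."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.settimeout(2.0)
    listener.bind(("127.0.0.1", 0))       # port 0: let the OS pick a free port
    listener.listen(1)
    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        client.connect(listener.getsockname())   # connection set up first
        conn, _addr = listener.accept()
        with conn:
            conn.settimeout(2.0)
            client.sendall(payload)
            client.shutdown(socket.SHUT_WR)       # signal end of data
            chunks = []
            while True:
                chunk = conn.recv(4096)
                if not chunk:                     # empty read: sender finished
                    break
                chunks.append(chunk)
        return b"".join(chunks)
    finally:
        client.close()
        listener.close()


def udp_roundtrip(payload: bytes) -> bytes:
    """Send one loopback UDP datagram (no connection, no delivery guarantee)."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    receiver.settimeout(2.0)              # give up rather than wait forever
    receiver.bind(("127.0.0.1", 0))
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sender.sendto(payload, receiver.getsockname())
        data, _addr = receiver.recvfrom(2048)
        return data
    finally:
        sender.close()
        receiver.close()
```

    On a real network the UDP receiver may time out because the datagram was lost; nothing in UDP itself will resend it, which is exactly the behavior wanted for live audio and video.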

    Efficient Network Transport

    H.323 sessions are typically "unicast", meaning that one copy of the transmission is created for and addressed to each unique end-user. If there are 5 users participating in a session, 5 copies must be sent, each one addressed to a different end user. The data is deconstructed into packets, each of which carries an end-user IP number as the destination address. Since IP numbers are unique, a separate copy must be sent to each end-user. In any stream of data, including a stream of video data, the actual encoding of the data can be separated from the details of the transport mechanism used. Today, the H.323 standard calls for TCP transport of control and data; therefore, unicast is required. However, it is possible to substitute a different transport mechanism for TCP. Substituting one transport mechanism for another should have no noticeable effect (from the end-user's point of view) on the quality of the video conferencing session; what the network sees, however, are two entirely different traffic patterns.
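
    To see how different the two traffic patterns are at the sending end, the sketch below compares the outbound bandwidth a source needs when it sends one copy per receiver with what it needs when the network replicates a single copy. The function name and the 384Kbps stream rate are assumptions chosen for illustration.

```python
def source_kbps(stream_kbps: int, receivers: int, single_copy: bool) -> int:
    """Outbound bandwidth needed at the sender.

    single_copy=False models unicast: one copy per receiver.
    single_copy=True models a transport where the network replicates
    one copy for all receivers.
    """
    if receivers <= 0:
        return 0                          # nobody listening, nothing to send
    copies = 1 if single_copy else receivers
    return stream_kbps * copies


if __name__ == "__main__":
    for receivers in (5, 50, 500):
        unicast = source_kbps(384, receivers, single_copy=False)
        shared = source_kbps(384, receivers, single_copy=True)
        print(f"{receivers} receivers: {unicast}Kbps unicast, {shared}Kbps single copy")
```

    The unicast figure grows in step with the audience, while the single-copy figure stays at the rate of one stream.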

    What is IP Multicast?

    IP multicast is a bandwidth efficient way of delivering data, in particular video and voice, to multiple recipients using a single copy for all rather than one copy each. The network can more efficiently transport the information by sending a single copy of the data through each piece of network equipment. Rather than addressing a unique IP number, multicast packets are addressed to a special set of broadcast addresses, known as Class D addresses (the block of IP numbers from 224.0.0.0 to 239.255.255.255). Since broadcasts are, by definition, addressed to "everyone", a single copy passing through a network device will be forwarded to every downstream connected device. The broadcast address serves as a virtual channel; the end-user selects the channel by selecting the broadcast address, and thus receives the data stream by request.
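
    The Class D range quoted above corresponds to the network 224.0.0.0/4, so testing whether an address belongs to it takes only a few lines with Python's standard `ipaddress` module. The helper name `is_class_d` is hypothetical.

```python
import ipaddress

# 224.0.0.0/4 covers 224.0.0.0 through 239.255.255.255.
CLASS_D = ipaddress.ip_network("224.0.0.0/4")


def is_class_d(address: str) -> bool:
    """Return True if the dotted-quad address lies in the Class D range."""
    return ipaddress.ip_address(address) in CLASS_D


if __name__ == "__main__":
    for candidate in ("224.2.127.254", "239.255.255.255", "192.0.2.10"):
        print(candidate, is_class_d(candidate))
```

    The standard library also exposes the same test as the `is_multicast` property of an address object.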

    You should now be asking yourself this question: suppose there are 500 programs being offered through multicast. Will my network be congested because one copy of each of these 500 programs is using bandwidth on the network, even if no one is watching any of them? To address this problem, the use of sparse mode PIM (Protocol Independent Multicast) is recommended. Sparse mode PIM makes efficient use of the network by making sure that no multicast broadcasts are sent to a router unless some end user behind the router has made a request to send or receive. Then, only the requested programs are allowed to pass through. Selecting a multicast broadcast is known as joining a multicast group. When you join a multicast group, your request is sent back through your router; the router sends a request towards the broadcast source(s). These actions build a delivery "tree" through which a single copy of the multicast is delivered. The end user will experience a noticeable pause between requesting to join a multicast group and the start of the requested stream of data due to the time it takes to build the delivery "tree".
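
    From the end user's machine, "joining a multicast group" comes down to one socket option. The sketch below uses the standard `IP_ADD_MEMBERSHIP` option from Python's `socket` module; setting it causes the host to tell its local router that it wants the group's traffic, after which the routers build the delivery tree described above. The group address 239.1.2.3, the port 5004, and the function names are hypothetical, and the join will only succeed on a host whose network interface supports multicast.

```python
import socket


def membership_request(group: str, interface: str = "0.0.0.0") -> bytes:
    """Build the 8-byte membership structure: group address + local interface.

    An interface of 0.0.0.0 asks the operating system to choose one.
    """
    return socket.inet_aton(group) + socket.inet_aton(interface)


def open_group_receiver(group: str, port: int) -> socket.socket:
    """Open a UDP socket and join the given multicast group on it."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))                 # receive on every local interface
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    membership_request(group))
    return sock


if __name__ == "__main__":
    receiver = open_group_receiver("239.1.2.3", 5004)   # hypothetical session
    receiver.settimeout(5.0)
    try:
        data, sender = receiver.recvfrom(2048)
        print(f"received {len(data)} bytes from {sender}")
    except socket.timeout:
        print("no traffic for this group within 5 seconds")
    finally:
        receiver.close()
```

    Closing the socket, or setting the matching `IP_DROP_MEMBERSHIP` option, leaves the group again.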

    Why are universities, as well as other ISPs, interested in conserving bandwidth? Even if universities are able to provide "unlimited bandwidth" on campus, off-campus connections are usually arranged through some commodity internet service provider, and that connection is an expensive one. Access to the Internet at large, even to Internet2, is usually a bottleneck in the network (the point with the most limited bandwidth). Videoconferencing can be used across campus, but the more typical application is to communicate with colleagues at a distance - off-campus. Anything that can be done to conserve bandwidth at the bottleneck is going to be cost-effective.

    Multi-point Sessions without an MCU

    Earlier in this cookbook we described point-to-point and multi-point sessions. We stated that multi-point sessions require an MCU to receive and re-broadcast the session to each individual participating in the multi-point conference. However, IP multicast makes it possible to engage in multi-point conversations WITHOUT USE OF AN MCU. Since MCUs are very expensive, it is easily apparent that multicast offers a more cost-effective approach to multi-point videoconferencing by using the existing router infrastructure.

    In a multi-point multicast conference, a single Class D broadcast address is assigned for the conference in advance. Suppose you want to create a multicast session. When the session is created, the session directory software, which describes sessions using the Session Description Protocol (SDP), assigns an unused Class D address. (Since there is no global repository of who has used what address, there are some interesting issues in defining how you know whether an address is "unused" or not). Everyone participating in the videoconference sends out network packets that are addressed to the same Class D address. When using an MCU, each participant's VC is transmitted to the MCU that acts as a server, re-broadcasting the data to all connected participants. In multicast, each user's data is broadcast directly from each user's VC system to all other participants, without need for a central server. As an end-user of multicast, these details are hidden from you. What you see is a user-friendly interface that presents you a list of available broadcasts, much like a TV guide. You see which sessions are currently running; you see which sessions are scheduled in the future. If you want to create or announce a new session, you click a button and fill out a few fields. To join a session, you click on one of the entries in the "TV Guide".

    Without an MCU, participants are free to add or remove themselves from conferences, without having to be pre-authorized through a gatekeeper. Should you want to create a private conference, multicast software permits you to do so (users have to know a password to sign on, similar to accessing a private web site). In either case, no central administration is required to establish the session. If a session requires security, whoever establishes the session can establish the security and send passwords to selected participants.

    What hardware, software, and network infrastructure are required to support IP multicast?

    Gee, if IP multicast saves bandwidth and eliminates central administration, why doesn't everyone just use it? One year ago, the answer was that most network backbones did not transport multicast traffic, so you couldn't get "from here to there". One year ago, the only videoconferencing software supporting multicast was the public domain set of software commonly referred to as the MBONE tools. Like much public domain software, it (vic, vat, rat, wboard, etc.) tended to be difficult for the average end-user to install and configure. Commercial H.323 vendors did not offer multicast transport for their products.

    What a difference a year makes! Both the vBNS and Abilene networks now fully support multicast traffic on the backbone. In addition, several commercial ISPs such as Sprint (multicast@sprint.com) and UUNET (http://www.uunet.com) are offering IP multicast services. Commercial vendors, including VCON (http://www.vcon.com) and Lucent (http://www.lucent.com/enterprise/ipapps/conferencing/), sell H.323 systems that support IP multicast. These offerings have all the gotchas and limitations that you would expect in Version 1 of any product, but they will improve in the future. A free version of the IP/TV multicast viewer, with a 1-year license, is available to Internet2 member institutions at http://netaid.uoregon.edu/. IP/TV allows you to watch and listen in on multicast conferences, but it doesn't allow you to contribute; it can operate at higher resolution than the public domain clients can. A nicely bundled set of MBONE tools for Windows (MASH) is available from UC Berkeley at http://bmrc.berkeley.edu/bibs/download/index.html. Another bundled set known as Shrimp is available from http://www.ja.net/development/video/shrimp/. The Unix/Linux community will find tools at University of Oregon Video Lab and the Internet2(tm) Networks Multicast Trial and Setting up MBone Tools for Windows95/NT, Macintosh and Unix.

    Current roadblocks to the use of IP multicast are found in campus network architectures. While the national backbones can deliver IP multicast to the campus "door", it is still difficult to deliver multicast traffic to all locations on campus. Very few campuses can deliver multicast traffic "anywhere" - the University of Oregon is a notable exception. Most campuses that do offer multicast services "everywhere" do so using proprietary network protocols.

    Just as proprietary H.323 protocols allow you to achieve sophisticated levels of H.323 videoconferencing at the cost of losing inter-operability, proprietary multicast implementations allow you to deliver multicast traffic, but at the cost of losing inter-operability. Ideally, a network implementation of multicast support would be vendor-neutral. Large campuses purchase network equipment over a period of time, often from different vendors, and thus rely on adherence to standards to assure cross-vendor compatibility at some predictable level. Even if your campus has managed to achieve single vendor, single generation equipment purchases, you will be communicating with colleagues at other institutions who may have standardized on some other vendor's equipment. Adherence to standards may limit you to lowest common denominator performance, but it's predictable and reliable.

    In order to deploy IP multicast, the following components must be considered: desktop software; desktop network interface card (NIC); campus wiring; use of network hubs or network switches; network routers (core electronics). Obviously, the software must support IP multicast. Windows95 machines require Winsock 2 drivers (updates available at ftp://ftp.microsoft.com/bussys/winsock/winsock2/ws295sdk.exe).

    It is very important that a good NIC is installed at the desktop. Communications will be most efficient if the NIC handles both multicast and address filtering in the NIC hardware (rather than occupying your PC's processor time to do so). Detailed information about NIC architecture and level of multicast support can be found at http://www.stl.nps.navy.mil/~mcgredo/projectNotebook/mcast/EthernetMain.html. However, even when the CPU does process multicast broadcasts, studies indicate that the burden on the CPU will not be overwhelming, even for large numbers of multicast groups (http://www.stl.nps.navy.mil/~mcgredo/projectNotebook/mcast/ethernet.html).

    Wiring infrastructure is a crucial component. Wiring that meets or exceeds Category 5 wiring standards is required for reliable transmission of videoconferencing data and especially for multicast. You will need to check with your campus network architects to determine the type of wiring installed at your locations.
     
     

    Use of switched rather than shared Ethernet connections is preferred. This refers to the network device that sits immediately behind the wall jack your PC is plugged into. In a shared connection (an Ethernet hub), all traffic that enters the hub is broadcast to EVERY connected device. If you have incoming or outgoing video traffic, it is going to be sent to every connected station, which will have to decide whether to keep or toss the transmission. If the number of devices connected to a shared hub is very small (for example, 12) AND if the video bandwidth is relatively low, it may be possible to deploy multicast over shared hubs. However, this approach is not advisable unless the campus network administrators can exert great control over the number of multicast sessions and their bandwidth. A switched network connection is better; traffic is sent only to the destination PC, not to every connected device. There is one "small problem", however; the multicast broadcast is addressed to "everyone", so multicast traffic can in effect turn your switch into a shared hub!

    To avoid this problem, IGMP (the Internet Group Management Protocol) is used. IGMP allows an end-user's PC to request to JOIN a multicast session or to LEAVE a multicast session. If the switch supports IGMP, it will know to send multicast traffic only to ports where the end user has requested a JOIN, and the switch will ignore ports that have not joined, or that have left, a multicast session. In short, the switch must be "IGMP aware" to be truly useful.
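    On the desktop side, the JOIN and LEAVE requests are normally triggered by the conferencing software through the operating system's socket interface. The following minimal Python sketch shows the general idea using the standard socket module; the helper names (build_membership_request, join_group, leave_group) are made up for this example, and any group address or port passed to them would be an arbitrary placeholder, not a value used by a real session.

```python
import socket

def build_membership_request(group: str, interface: str = "0.0.0.0") -> bytes:
    """Pack the 8-byte membership structure: group address, then local interface address."""
    return socket.inet_aton(group) + socket.inet_aton(interface)

def join_group(group: str, port: int) -> socket.socket:
    """Open a UDP socket and ask the operating system to JOIN the multicast group."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    # Setting this option is what causes the host to announce its group membership.
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    build_membership_request(group))
    return sock

def leave_group(sock: socket.socket, group: str) -> None:
    """Ask the operating system to LEAVE the group, then close the socket."""
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_DROP_MEMBERSHIP,
                    build_membership_request(group))
    sock.close()
```

    Calling join_group requires a host with a multicast-capable network interface; the sketch only defines the helpers and does not join anything by itself.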

    Switched Ethernet connections may have both end-users and shared hubs plugged into them. As stated above, transmission of multicast through a shared hub may prove to be unmanageable, so it may be desirable to allow multicast traffic to pass through a switch to individual end users, but be blocked from passing through to the hub. Ability to control multicast at the port level is a desirable switch feature.
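    The forwarding behavior described in the paragraphs above can be sketched as a toy model. This is only a simulation written for this explanation (the ToySwitch class and its methods are hypothetical, not a vendor API): it compares a switch that floods multicast to every port with one that is IGMP aware and that can also block multicast on a port feeding a shared hub.

```python
from collections import defaultdict

class ToySwitch:
    """Toy model of multicast forwarding; not a real switch implementation."""

    def __init__(self, ports, igmp_aware=True):
        self.ports = set(ports)
        self.igmp_aware = igmp_aware
        self.members = defaultdict(set)  # group address -> ports that have joined
        self.blocked = set()             # ports where multicast is administratively blocked

    def join(self, port, group):
        self.members[group].add(port)

    def leave(self, port, group):
        self.members[group].discard(port)

    def block_multicast(self, port):
        self.blocked.add(port)

    def forward(self, group, ingress_port):
        """Return the sorted list of ports that receive a frame sent to `group`."""
        if self.igmp_aware:
            targets = set(self.members[group])   # only ports that asked for the group
        else:
            targets = set(self.ports)            # behaves like a shared hub
        return sorted(targets - {ingress_port} - self.blocked)

switch = ToySwitch(ports=[1, 2, 3, 4])
switch.join(2, "239.1.2.3")
switch.join(4, "239.1.2.3")
switch.block_multicast(4)  # assume port 4 feeds a shared hub
print(switch.forward("239.1.2.3", ingress_port=1))    # [2]

flooding = ToySwitch(ports=[1, 2, 3, 4], igmp_aware=False)
print(flooding.forward("239.1.2.3", ingress_port=1))  # [2, 3, 4]
```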

    Recommendations from the Internet2 Multicast working group and NLANR are to configure your campus edge router to accept PIM-Sparse Mode traffic (only requested broadcasts transit your campus). Recommended campus settings include both PIM-Sparse Mode and PIM-Dense Mode. PIM-DM is suitable for campuses where end-users densely populate the network and there is plenty of bandwidth to spare. To the left, you will find a graph indicating the impact of multicast traffic on the University of Alabama at Birmingham (UAB) campus backbone under PIM Dense Mode. PIM-DM is a configuration which allows ALL available multicast traffic to flow through ("FLOOD") for several seconds, then any connections not being watched within the campus are stopped ("PRUNE") for several seconds. The net result on a bandwidth utilization graph appears to be a continuous flow of 3-10 Megabits per second of traffic. PIM Sparse Mode sends multicast traffic only to end-users who have "joined" a session. PIM-SM is recommended when there are only a few multicast receivers and bandwidth is to be conserved. Excellent information on IP multicast and associated network architecture issues can be found at the NLANR engineering site (http://www.ncne.nlanr.net/faq/multicast.html).

    Models for Campus Deployment

    H.323 Videoconferencing: Case Studies and Deployment Issues

    State of H.323 Deployment Today

    In the 12 months since the release of the first version of the Cookbook (April 1999), there has been a surge of interest and activity in H.323 videoconferencing in the higher education arena. Some key H.323 initiatives and events have been instrumental in both generating and supporting much of this interest and activity:

    As a result of these events and initiatives, H.323 has gained an initial foothold in many institutions, and a community of H.323 users has clearly emerged. ViDeNet is providing this community with the means of coordinating Gatekeeper zones and ensuring seamless interconnection, while participation in both the LSVNP project and the Megaconference provided the impetus for IT personnel and end-users at many institutions to purchase and deploy an H.323 client. The LSVNP project, meanwhile, is enabling end-users to explore the value and ease-of-use of H.323, and do so cost-effectively by taking advantage of multipoint and gatekeeping services and technical support offered at the four host sites.

    As well as these larger scale events and projects, many institutions are regularly conducting demos and seminars on H.323 for their faculty and staff, some of which are available on the web.

    Case Studies/ Model Applications

    The following examples of H.323 applications currently being deployed have been selected to illustrate the range and variety of emerging H.323 use, and the potential H.323 has for supporting collaboration and resource sharing on our campuses. Whereas H.323 videoconferencing is used in many of the same application domains as its precursor, H.320, (specifically, Distance Education, Telemedicine, and for communication purposes), these case studies should demonstrate that the convenience and cost-effectiveness of H.323 have been recognized by the academic community, and there is no shortage of creative application of the technology. The applications described herein are at various stages of maturity, but all are beyond the mere conceptual stage.

    University of North Carolina School of Social Work: Teaching and Training over the Internet.
    PEPNET Videoconferencing Testbed
    Evaluation of H.323 Videoconferencing for Medical School Planning on a High Performance Statewide Network.
    Virtual Rounds: Sharing of live animal clinical cases via H.323
    ViDe "Large Scale Video Network Prototype" Project

    Campus Deployment Issues

    Although H.323 use in higher education is growing, it is far from being systematically and ubiquitously deployed. H.323 enables more end-user autonomy than H.320 videoconferencing, and the technology is not prohibitively expensive, but these very attributes make the need for coordinated and judicious deployment all the more essential. Typical faculty/researcher purchases are most likely going to be limited to end-points/clients, leaving the provision of MCU/Gatekeeper services to either the central IT organization on campus or an external service provider. Although by no means exhaustive, the following issues and questions are some of the more critical that need to be addressed in an H.323 campus deployment plan. The significance and scope of many of these issues became apparent in the ViDe Large Scale Video Network Prototype project:

    1. Cost Model: Charge-back vs. centrally-funded service: The operational costs likely to be incurred in establishing and maintaining an H.323 network will be evident in the points to follow. Operational costs aside, the capacity to integrate the telephone network with an H.323 network will obviously have the effect of reducing the funding stream associated with traditional telephone services on our campuses. This issue, needless to say, is controversial, and demands careful consideration.
    2. Coordination/Management of Gatekeeper Zones: While a point-to-point H.323 call is relatively easy to make, there are additional elements in an H.323 communication system that must be in place for multipoint calls, for bandwidth management, and for integration with other H.32x standards: a Multipoint Control Unit (MCU), gatekeepers, and gateways, respectively. The gatekeeper function is critical for call management, and, if a gatekeeper is present on a LAN, clients must be registered with the gatekeeper. Gatekeepers at different sites must then register with each other for cross-site/Gatekeeper communication. Some centralized management and establishment of policies will be necessary in order to avoid a situation of multiple and conflicting gatekeepers on a LAN, and to coordinate with Gatekeepers at other sites.
    3. Management and Coordination of Directories: Since H.323 calls are currently based on the IP address of the end-point (although efforts are being pursued to establish a more intuitive scheme; see http://www.cavner.org/videnet/sites/naming_rfc.htm), there is an obvious need to publish names with IP addresses and Gatekeeper assignments.
    4. Support for Multipoint Conferences: How much centralized support can a service provider be expected to provide to end-users? What scale of H.323 use can be supported? If the central IT organization is responsible for provision of MCU and gatekeeper services, end-users can feasibly schedule their own multipoint conferences through a web-based interface, thus at least eliminating the need for scheduling support.
    5. Security: It is critical that H.323 is carefully managed to ensure secure and judicious use. Central coordination of gatekeepers and gatekeeper zones should support this endeavor. Firewalls currently cannot pass H.323 traffic, due to the underlying dynamic port allocation scheme used in the protocol. Therefore, firewalls must be configured to pass all traffic to H.323 endpoints within their domain. This creates a security problem for firewall administrators. Next-generation firewalls that perform stateful inspection of the packet streams, and thus generate secure access on-the-fly, should resolve this issue. Additionally, the security protocols within the H.323 standard are not widely adopted, which leads to potentially unauthorized use of expensive resources, such as voice and H.320 gateways.
    6. Infrastructure: H.323 does not perform well on commodity Internet connections, is not suitable for modem connections, and will not penetrate all market areas until high speed access to the home (DSL technologies, cable modems, for example) is more ubiquitous. However, in many academic/research projects, it is reasonable to anticipate that at least one of the sites will have less than optimum networking connections, which introduces the need to consider a variety of solutions (H.323, H.320, POTS), necessitating the addition of a gateway.
    7. Provision of assistive technologies: Consideration should be given to whether voice-activated MCUs will support all use on campus, or whether there will be a need for an MCU with continuous-presence features, for our deaf and hearing-impaired constituents, for example. Are there other instances of special needs on campus that should be met?
    8. Technical Support: Provision must be made for the same level of technical support, troubleshooting, client recommendation, installation and deployment as with any production technology. This is probably the single greatest cost factor in the deployment of H.323 services. Support requirements are especially heavy because H.323 represents a whole new concept in computer use, different from what most individuals and IT organizations are used to. Expect each support staff member to be able to serve only a small number of users in the early stages of deployment.
    9. Integration with H.320 Videoconferencing: Since many campuses have already invested in and have mature H.320 videoconferencing services, and since high speed/advanced networks are not ubiquitous, it is advisable to develop a model for deployment that integrates H.320 with H.323 and that complements any existing video conferencing service.
    10. Integration with the Telephone Network: H.323 is an applicable standard not only for video conferencing over IP, but for telephone communications as well. The same infrastructure (tech support, gatekeeper administration, account management, directory services, etc.) supports both applications. In fact, under H.323, the two applications are not really distinct; there are just multiple types of end stations on the network (e.g. IP phones, desktop video, teleclassrooms), all able to intercommunicate. Furthermore, since many organizations have different cost recovery models for telephone service (i.e. charge-back) than for Internet service (i.e. centrally-funded flat rate), moving telephone services over to the IP network will have the effect of reducing the funding stream associated with telephone services.
    11. Capture, Archiving and Serving of Conferences: Provision should be made for the capture, storage and serving of conferences on-demand, necessitating choice of streaming video format, etc. This activity, however, raises issues of copyright ownership: With multiple sites participating in a conference, which site owns the copyright of the content that is generated during that conference?
    12. H.323 Room Systems: Provision should be made for optimizing room conditions with the appropriate lighting and sound systems.
    13. T.120 Application Sharing/Data Collaboration: Provision should be made for instruction and support in the use of the T.120 application sharing tools that are typically integrated with the H.323 client.

    Appendix 1. Developing a Productive Video Conference Room

    Any current conference room can be adapted for use as a videoconference room by making adjustments based on the needs of video and audio equipment to capture signals. This is less of a concern for new construction, as these details will be an integral part of the function of the room, and will be designed in by the architect. More probable will be the conversion of an in-use conference room for video conferencing. The advances in technology have made the concept of an in-house video studio an attainable communication tool. As in movie and television production where the sound stage is a critical part of the process, the conferencing room is a critical part of productive video conferencing. The walnut paneled conference room is not the most conducive atmosphere, and creates a challenge for video and audio capture. There are several adaptations that will enhance the videoconference as a useful communications tool.

    The most difficult obstacle is maintaining a balance for the camera. The background colors and lighting will affect the view as seen by the remote participants. To be able to see all participants clearly, traditional ceiling light sources are supplemented with wall or discrete floor light sources. Lighting is one of the few critical factors in successful video conferencing. While there are several concepts popular with designers, one key design parameter appears throughout all the recommendations. To eliminate shadows, a combined lighting arrangement ratio of 60/40 for ceiling and wall lighting is recommended. Wall lighting should be indirect, and suitable fixtures are readily available from a wide range of suppliers. The key in this split lighting scheme is to equalize the available light on the participants and eliminate shadows, dark backgrounds, and bright spots in the center of the conference table. Lighting considerations for the intended room will factor heavily into the choice of wall coverings and table surfaces.

    The actual colors and patterns of the participants' clothing may affect video reproduction, but usually only in extreme cases of attendee dress. Specific colors are recommended for backgrounds and wall coverings to enable better recognition of attendees without straining the capture capabilities of the video camera. Soft, textured wall coverings are recommended, but smooth painted walls will work if colors are muted earth tones and the lighting is adjusted to suit.

    Audio technology has developed to a level where only the obvious interference from air conditioners, telephones and other extraneous noise sources would factor into microphone placement. While most conferencing systems use the speakers installed in the monitors, there are separate speaker systems available to meet the needs of larger room sizes.

    The next concern is the size of the room based on the available space. The video conference is directionally oriented by the visual focus capabilities of the camera, which factors into room layouts. Allowances must be made for furniture, additional wallboards, etc. The size of the attending group is not only dependent on the actual room size. A room layout will determine how many participants may attend. The actual seating arrangement is then defined to allow the participants to see and be seen through the conference. There is a minimum distance required for the camera to capture all of the attending participants, and this must be factored into the layout. Furniture manufacturers have developed conference tables specifically designed to allow meeting attendees to see and be seen by the video equipment. There are several sources available for specialized video equipment including custom conferencing tables and matching cabinets. The best capture angle for the video camera is a "down the table view" with the end seat closest to the camera empty. This avoids having an attendee in that seat, who can neither see the monitor nor be seen by the camera, and permits the assembled group to view the remote part of the meeting. This arrangement also creates a clear walkway into and around the table, and creates an aperture distance for the camera without unnecessary waste of available floor space.

    Video conferencing equipment does require room. There is a monitor required for receiving a conference, and where the H.323 terminal's system does not provide a screen-in-screen option, a second monitor is needed for the "self" image portion of the videoconference. Some conferencing equipment uses additional equipment requiring space. Most conferencing cabinets allow for the housing of this equipment in the base and placement of the monitor at an easily visible height on top of the cabinet. Additional cameras for enhanced teaching situations, with an additional monitor, would also factor into the space considerations in planning a video conferencing room. These design factors are dependent on the requirements of the manufacturer and the available space in and near the proposed conference room. There is a sliding scale for required space. Most equipment can be located within the room and installed within finished cabinetry. An ideal case scenario would be an adjoining "mechanical" room to house the associated equipment, leaving the monitors, camera and microphones the only physical presence in the room. With the advance of flat screen monitors, this presence will be diminished in the future, as these new monitors will require less space.

    For the issue of reliability and "clean" power, most manufacturers recommend individual service circuits for the equipment. Due to the influx of sensitive high technology equipment in business, most commercial real estate space has "clean" power lines available for computer, communications and operational equipment. The actual link is made over ISDN or the computer network, and these lines are currently commonplace in 99% of commercial real estate dedicated to business use.

    Attending to a few critical details will develop a modern videoconference room and there are designers who specialize in these concepts. These concepts permit development of a comfortable, functional videoconference room that meets the physical needs of the equipment and accommodates interior design tastes in the intended work environment.

    Sources of Information:

    Trowt-Bayard, Toby, and Jim R. Wilcox, Video Conferencing The Whole Picture, Flatiron Publishing, Inc., 2nd Edition, March 1997.

    AXIS Design Group, http://www.axisdg.com, accessed December 1998, general images and information provided under "Specialty", "Space Solutions", with specific study of: http://axisdg.com/Specialty/Space_Solutions/Experience/People/President/_Clients _Projects/ Enlarged_Briefing_1/enlarged_briefing_1.htm

    Bellwether Design, http://www.bellweather-design.com, accessed December 1998, with specific study of : http://www.bellwether-design.com/designtech.htm

    EPA Audio Visual Inc., http://www.epaaudio.com, accessed December 1998, with specific study of: http://www.epaaudio.com/design.html

    Accuwood Inc., http://www.accuwood.com, accessed December 1998.
     
     

    Appendix 2. H.323 Specification

    The information listed here was not written by ViDe. It is not contained on the ViDe Cookbook server. These materials are listed solely for the purpose of directing you to more detailed standards information should you be interested in such topics.

    Several excellent primers which describe the standards are:

    Appendix 4. Interesting Web Sites on Video Conferencing

    Products Information

    PolyCom
    Intel Business Video Conferencing
    PictureTel LiveLAN, LiveManager and LiveGateway Desktop Videoconferencing System
    VCON Telecommunications Ltd.
    VTEL Online
    RADVISION H.323 Homepage
    Accord Homepage
    VideoServer Homepage
    Global Videoconference Network

    For Further Reference

    Center for Advanced Video Network Engineering and Research
    Large Scale Video Network Prototype
    Videoconferencing Guide
    Videoconferencing Categories and Terms
    NetMeeting Overview and Download Site
    International Multimedia Teleconferencing Consortium
    International Telecommunications Union (ITU)
    The Internet Engineering Task Force (IETF)
    The IP Multicast Initiative
    Trillium H.323 Tutorial
    Trillium H.323 Tutorial Self Test
    A Primer on the H.323 Series Standard
    A Primer on the T.120 Series Standard
    PictureTel Standards Page
    Welcome to the OpenH323 Project
    Multimedia Streaming, University of Wisconsin - Madison
    TERENA DEVICE PROJECT, Desktop Video Conferencing - Current Products and their Interoperability
    W3C Synchronized Multimedia
    W3C Synchronized Multimedia Integration Language (SMIL) 1.0, Specification
     
     

    Glossary of Terms



    A

    antialiasing

    A method for smoothing the jagged edges (stairsteps) often seen in graphics or video. The method reduces the jagged edges by placing intermediate shades of color or gray around the steps.

    ASF

    Active Streaming Format. A Microsoft file format for digital video playback over the Internet, or on a standalone computer. Kind of a wrapper around any of a number of compression types, including MPEG. Part of Netshow, a proprietary streaming media solution from Microsoft. Biggest competitor is Real Networks. While this 'wrapper' supports many standard formats, ASF files are themselves proprietary.
     
     

    AVI

    Audio Video Interleaved. A Microsoft format for digital audio and video playback from Windows 3.1. Somewhat cross-platform, but mostly a Windows format. Has been replaced by the ASF format, but still used by some multimedia developers.

    B

    banding

    The presence of extraneous lines.

    bandwidth

    A measure of the amount of data that can fit on a network. Measured in Hertz or bits per second. For example, a regular Ethernet line has a bandwidth of 10 Mbps (10 million bits per second).
     
     

    bit rate

    The speed of a communication channel, usually used when referring to modems. Most new modems follow the V.90 standard, which has a bit rate of 56 kbps (56,000 bits per second).

    C

    CIF

    Common Intermediate Format. A video format that supports both NTSC and PAL signals. CIF is part of the ITU H.261 videoconferencing standard. It specifies a frame rate of 30 frames per second (fps), with each frame containing 288 lines and 352 pixels per line.
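
    To see why CIF video needs compression before it can travel over the channel rates mentioned in this glossary, the following minimal Python sketch works out the uncompressed bit rate. It assumes 12 bits per pixel (8-bit samples with the color information stored at one quarter of the brightness resolution); that figure is an assumption of this sketch, not something stated in the definition above.

```python
WIDTH, HEIGHT = 352, 288   # pixels per line, lines per frame (from the CIF definition)
FPS = 30                   # frames per second (from the CIF definition)
BITS_PER_PIXEL = 12        # assumption: 8-bit samples, color at quarter resolution

def raw_bit_rate(width: int, height: int, fps: int, bits_per_pixel: int) -> int:
    """Uncompressed video bit rate in bits per second."""
    return width * height * fps * bits_per_pixel

cif_bps = raw_bit_rate(WIDTH, HEIGHT, FPS, BITS_PER_PIXEL)
print(f"Uncompressed CIF: {cif_bps / 1e6:.1f} Mb/s")

# Compare against the low and high ends of the H.261 range given in this glossary.
for channel_bps in (64_000, 2_000_000):
    ratio = cif_bps / channel_bps
    print(f"{channel_bps // 1000} kb/s channel needs roughly {ratio:.0f}:1 compression")
```

    Under these assumptions the uncompressed stream comes to about 36.5 Mb/s, well beyond a 10 Mbps Ethernet line.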

    CODEC

    Stands for Coder/Decoder (a telecommunications term) or Compressor/Decompressor (a computer term). A telecom codec is the piece of hardware that connects a data line to the customer's local network. In the computer world, a codec is a piece of software that compresses and decompresses digital audio or video.

    chrominance

    color

    D

    decoder

    A piece of hardware or software that is used to convert video or audio (typically) from the digital form used in transmission or storage into a form that can be viewed.

    digital audio

    Audio that has been encoded in a digital form for processing, storage or transmission.

    dithering

    Giving the illusion of new color and shades by combining dots in various patterns. This is a common way of gaining gray scales and is commonly used in newspapers. The effects of dithering would not be optimal in the video produced during a videoconference.

    F

    full duplex

    Sending data in both directions at the same time. Usually higher quality, but requires more bandwidth. In video conferencing, full duplex will be much more natural and useable. Cheap speakerphones are half duplex, whereas more expensive ones are full duplex.

    G

    G.7xx

    A family of ITU standards for audio compression.

    gatekeeper

    In the H.323 world, the gatekeeper provides several important functions. First, it controls access to the network, allowing or denying calls and controlling the bandwidth of a call. Second, it helps with address resolution, making possible email type names for end users, and converting those into the appropriate network addresses. Gatekeepers also handle call tracking and billing, call signaling, and the management of gateways.
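
    The first two gatekeeper functions can be pictured with a toy model. The sketch below is purely illustrative: the ToyGatekeeper class, its method names, the alias, the address and the bandwidth budget are all hypothetical, and it does not implement the actual H.323 signaling messages.

```python
class ToyGatekeeper:
    """Toy model of address resolution and bandwidth-based admission."""

    def __init__(self, zone_budget_kbps: int):
        self.zone_budget_kbps = zone_budget_kbps  # total bandwidth allowed for calls
        self.in_use_kbps = 0
        self.registry = {}                        # email-style alias -> IP address

    def register(self, alias: str, ip_address: str) -> None:
        self.registry[alias] = ip_address

    def admit(self, callee_alias: str, requested_kbps: int):
        """Return the callee's IP address if the call is allowed, otherwise None."""
        ip_address = self.registry.get(callee_alias)
        if ip_address is None:
            return None  # unknown alias: cannot resolve the address
        if self.in_use_kbps + requested_kbps > self.zone_budget_kbps:
            return None  # not enough bandwidth left: deny the call
        self.in_use_kbps += requested_kbps
        return ip_address

    def release(self, kbps: int) -> None:
        self.in_use_kbps = max(0, self.in_use_kbps - kbps)

gk = ToyGatekeeper(zone_budget_kbps=768)
gk.register("alice@example.edu", "192.0.2.10")
first = gk.admit("alice@example.edu", 384)   # admitted, alias resolved
second = gk.admit("alice@example.edu", 384)  # admitted, budget now fully used
third = gk.admit("alice@example.edu", 384)   # denied, budget exhausted
print(first, second, third)
```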

    gateway

    Gateways provide a link between the H.323 world and other video conferencing systems. A common example would be a gateway to an H.320 (ISDN) video conferencing system.

    H

    H.261

    ITU standard for video coding for videoconferencing. H.261 is a discrete cosine transform (DCT) based algorithm for video in the 64 kb/s to 2 Mb/s range. All H.323 compliant video conferencing systems are required to support this codec.

    H.263

    ITU standard for video coding within videoconferencing. H.263 offers better compression than H.261, particularly in the low bitrate range used by modems.

    H.320

    ITU standard for videoconferencing over ISDN and fractional T1 lines.

    H.323

    ITU standard for videoconferencing over networks that do not guarantee bandwidth, such as the Internet. H.323 is the standard that this cookbook recommends for most users in the education community. For more detailed information on this and the other ITU standards see the bibliography of this document.

    H.324

    ITU standard for video conferencing over standard phone lines.

    half duplex

    A telecommunications system where data can only flow in one direction at a time. Cheaper speakerphones are a good example of this, where only one person can talk at a time.

    I

    IETF

    Internet Engineering Task Force. This is a group that develops and publishes new standards for use on the Internet.

    IGMP

    Internet Group Management Protocol. This protocol is used in multicasting.

    IP

    The Internet Protocol. IP is the basic language of the Internet. It was developed under U.S. government (DARPA) sponsorship for use in internetworking multiple computer networks.

    IP Multicast

    A system for sending IP transmissions out only one time, but allowing for multiple users to receive it. This would reduce the bandwidth required for audio and video broadcasting over the Internet, but it is not widely used yet.

    J

    jitter

    A flickering on a display screen. Besides a monitor or connector malfunction, jitter can be caused by a slow refresh rate. In networking, the term also refers to variation in the delay of arriving packets.

    K

    Kerberos

    Kerberos is a network authentication protocol developed by MIT. It is designed to provide strong authentication for client/server applications by using secret-key cryptography.

    L

    latency

    The length of time it takes a packet to move from source to destination; delay.

    lossless compression

    Refers to data compression techniques in which no data is lost. For most types of data, lossless compression techniques can reduce the space needed by only about half. Only certain types of data can tolerate lossy compression. A lossless compression technique must be used when compressing data and programs.
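
    The defining property of lossless compression, that the original data comes back exactly, can be demonstrated with a short Python sketch using the standard zlib module. The sample data here is deliberately repetitive and made up for the demonstration, so it compresses far better than typical data would.

```python
import zlib

# Made-up, highly repetitive sample data (compresses unusually well).
original = b"videoconference " * 64

compressed = zlib.compress(original)
restored = zlib.decompress(compressed)

print("original bytes:  ", len(original))
print("compressed bytes:", len(compressed))
print("identical after round trip:", restored == original)
```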

    lossy compression

    Refers to data compression techniques in which some amount of data is lost. Lossy compression technologies attempt to eliminate redundant or unnecessary information. Most video compression technologies, such as MPEG, use a lossy technique.

    luminance

    brightness

    M

    MBONE

    Multicast Backbone. The MBONE is a system for transmitting audio and video over a multicast network. Mostly available at universities and government facilities, the MBONE can be thought of as a testbed for technologies that will eventually be promulgated across the larger Internet. The MBONE has been replaced on the vBNS and Abilene by native multicast support.

    MIDI

    Musical Instrument Digital Interface is a standard for connecting electronic musical instruments and computers. MIDI files can be thought of as digital sheet music, where the computer acts as the musician playing back the file. MIDI files are much smaller than digital audio files, but the quality of playback will vary from computer to computer.

    MPEG

    MPEG (Moving Picture Experts Group) is a series of ISO standards for digital video and audio, designed for different uses and data rates.

    MPEG-1 is the initial MPEG standard, designed to encode full motion video so it could be played back from a CD (150 KB/s). The bit rate of standard MPEG-1 is 1.5 Mbps. MPEG-1 has a frame size of 352x240 pixels, which gives a picture quality slightly better than VHS video tape. MPEG-1 includes three audio standards; most video systems use MPEG-1 layer 1 or layer 2 audio. MPEG-1 layer 3 audio (commonly known as MP3) is widely used for audio on the Internet.
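
    To give a feel for how much compression these figures imply, the following back-of-the-envelope sketch compares the MPEG-1 bit rate with uncompressed video at the same frame size. The 24 bits per pixel and 30 frames per second are assumptions made for the example, and the 1.5 Mbps figure covers the whole stream, so the ratio is only approximate.

```python
WIDTH, HEIGHT = 352, 240     # MPEG-1 frame size from the entry above
BITS_PER_PIXEL = 24          # assumption: uncompressed 24-bit color
FRAMES_PER_SECOND = 30       # assumption: approximate NTSC frame rate
MPEG1_BPS = 1_500_000        # 1.5 Mbps, from the entry above

raw_bps = WIDTH * HEIGHT * BITS_PER_PIXEL * FRAMES_PER_SECOND
compression_ratio = raw_bps / MPEG1_BPS

print(f"Uncompressed: {raw_bps / 1e6:.1f} Mbps")               # 60.8 Mbps
print(f"Compression ratio: about {compression_ratio:.0f}:1")   # about 41:1
```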

    MPEG-2 was a follow-on standard supporting higher data rates, and thus higher quality. MPEG-2 is the standard used in DVD video players, most digital satellite systems in North America, and in the new North American Digital TV system.

    MPEG-3 was abandoned as its planned functionality was included in MPEG-2.

    MPEG-4 is a draft standard that will be better suited for use on the Internet. MPEG-4 delivers video at comparable quality to MPEG-1 at a much lower bit rate. MPEG-4 also supports a wide variety of elements that can be transmitted separately and combined to form the video frame, such as a talking head in one stream and the background in another. That is, MPEG-4 allows manipulation of objects within the video stream (addition, subtraction, object manipulation, etc.). If you don't like where a chair is in the video, you can move it (provided the chair has been coded as a movable object, of course). Approval is expected in the first half of 1999.

    MPEG-7 is a developing standard for the description of multimedia objects. Not a video encoding format, it is a way to describe elements in a multimedia stream so that they can be accessed via database. For example, it would be useful to be able to search a multimedia database for instances of 'red wagons.'

    Multipoint Conferencing Server (MCS) (also MCU)

    A hardware or software H.323 device that allows multiple video conferencing (or audio or data) users to connect together. Without an MCS, typically only point-to-point conferences can take place. An MCS commonly supports voice activated switching, where whoever is talking is broadcast to all users, but newer systems support "Hollywood squares", where multiple windows show each participant. ITU-T standard H.231 describes the standard way of doing this. Many current systems only support H.320 (ISDN), but many vendors are working to upgrade their products to support H.323 (LAN, Internet) as well. In the H.320 space, this functionality is referred to as a multipoint control unit (MCU). The terms are sometimes used interchangeably, although they refer to somewhat different implementations.

    P

    packet

    A unit of information sent across a (packet-switched) network. A packet generally contains the destination address as well as the data to be sent.
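
    The idea can be sketched with a toy packet format: a small header carrying the destination address and payload length, followed by the data. The layout and the Packet class are hypothetical and much simpler than a real IP header.

```python
import socket
import struct
from dataclasses import dataclass


@dataclass
class Packet:
    destination: str  # dotted-quad IPv4 address
    payload: bytes

    def to_bytes(self) -> bytes:
        # Toy header: 4-byte destination address + 2-byte payload length.
        header = struct.pack("!4sH", socket.inet_aton(self.destination),
                             len(self.payload))
        return header + self.payload

    @classmethod
    def from_bytes(cls, raw: bytes) -> "Packet":
        # Read the 6-byte header, then slice out exactly `length` payload bytes.
        addr, length = struct.unpack("!4sH", raw[:6])
        return cls(socket.inet_ntoa(addr), raw[6:6 + length])


wire = Packet("192.0.2.10", b"hello").to_bytes()
print(Packet.from_bytes(wire))
```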

    Q

    QCIF

    A standard related to CIF, QCIF (Quarter CIF), transfers one fourth the amount of data and is suitable for videoconferencing systems on slower connections or telephone lines.
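
    The "one fourth" figure follows from halving both dimensions of the picture. A quick check, assuming the commonly cited resolutions of 352x288 for CIF and 176x144 for QCIF:

```python
CIF = (352, 288)    # assumed CIF resolution in pixels
QCIF = (176, 144)   # half the width and half the height of CIF


def pixels(size):
    """Return the number of pixels in a (width, height) frame."""
    width, height = size
    return width * height


ratio = pixels(CIF) / pixels(QCIF)
print(pixels(CIF), pixels(QCIF), ratio)   # 101376 25344 4.0
```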

    QuickTime

    A file format and architecture developed by Apple for use with digital audio and video. Available on most computing platforms. A future version (QuickTime 3) will support streaming.

    R

    RealAudio

    A proprietary system for streaming audio (and now video) over the Internet. Before RealAudio, users had to download an entire audio file before they could listen to it. It also supports real-time broadcast of audio and video programs. Many radio stations now broadcast on the Internet using RealAudio.
     
     

    real time

    A transmission that occurs right away, without any perceptible delay. This is very important in video conferencing, since noticeable delay makes the system difficult to use.

    S

    streaming media

    Sending video or audio over a network as needed, such as Real Audio/Video or Microsoft NetShow, instead of forcing the user to download the entire file before viewing it. Typically a few seconds of data is sent ahead and buffered in case of network transmission delays. (Although some data is buffered to the hard drive, it is written to temporary storage and is gone once viewing is complete.)

    T

    T.120

    T.120 is an ITU-T (International Telecommunication Union) standard for document conferencing. Document conferencing allows two or more people to concurrently view and edit a document across a network.

    T.120 is the commonly used name to refer to a family of distinct standards. Many video conferencing companies were developing their own implementations of this until Microsoft released its free NetMeeting software. Now, many companies are using NetMeeting, while perhaps enhancing it in some way.

    Teleconferencing

    Two or more people who are geographically distant having a meeting of some sort across a telecommunications link. Includes audio conferencing, video conferencing, and/or data conferencing.

    Terminal End Station

    A terminal end station is the client endpoint that provides real-time, two-way communications. This is often shortened to just terminal.

    Transcoder

    A device that does transcoding. See below.

    Transcoding

    Converting a data stream from one format to another, such as MPEG-1 to H.263, or an H.320 videoconferencing session to H.323.

    Truespeech

    Truespeech is a codec used for low bandwidth encoding of speech (not music). It was created by the DSP Group. It is available on Microsoft Windows 98 among other systems.

    U

    unicast

    Sending each user their own copy of a video (or other data) stream, as opposed to multicast, where one copy is sent and whoever wants it listens to that copy. Unicast is the most commonly used method for video conferencing and video on demand today. Multicast, which is much more efficient, is slowly gaining ground, but requires Internet Service Providers to support it.
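
    The efficiency difference can be seen with simple arithmetic. In this sketch the 384 kbps stream rate and the audience of 25 viewers are hypothetical numbers picked for illustration; the functions estimate only the bandwidth leaving the sender.

```python
def unicast_sender_kbps(stream_kbps: float, viewers: int) -> float:
    """Unicast: the sender transmits one full copy per viewer."""
    return stream_kbps * viewers


def multicast_sender_kbps(stream_kbps: float, viewers: int) -> float:
    """Multicast: the sender transmits a single copy, however many listen."""
    return stream_kbps if viewers > 0 else 0.0


STREAM_KBPS = 384   # hypothetical stream rate
VIEWERS = 25        # hypothetical audience size

print(unicast_sender_kbps(STREAM_KBPS, VIEWERS))     # 9600
print(multicast_sender_kbps(STREAM_KBPS, VIEWERS))   # 384
```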

    V

    ViDe

    Video Development Initiative. Currently consists of the Georgia Institute of Technology, North Carolina State University, the University of North Carolina at Chapel Hill, and the University of Tennessee, Knoxville, in partnership with NYSERNet (New York State Education and Research Network).

    video on demand

    Being able to view any of a number of videos when you want to. Used on the Internet and in hotels, cable systems, etc.

    video server

    A computer server that has been designed to store large amounts of video and stream it to users as required. Usually a video server has large amounts of high speed disks and a large amount of network bandwidth to allow for many users to simultaneously view videos.

    voice activated switching

    Automatically switching the video feed to whoever is speaking in a multipoint video conference. Usually a function of the MCU (multipoint control unit).
     
     
     
     

    Bibliography


    Measuring Video Quality in Videoconferencing Systems, Roger Finger, Intel Corporation

    International Telecommunication Union

    Recommendation H.323 (09/99) - Packet-based multimedia communications systems

    I2 Multicast working group

    IP/TV multicast client for Internet2 members

    Setting up MBONE tools tutorial

    vBNS multicast

    Abilene router configuration for multicast specs

    Connecting to a High-Performance Network's Multicast Infrastructure

    Broadband Communications; Balaji Kumar; May 1998; McGraw Hill Text; ISBN: 007038293X

    Desktop Encyclopedia of Telecommunications; Nathan J. Muller; January 1998; McGraw Hill; ISBN: 0070444579

    Digital Compression for Multimedia: Principles and Standards; Jerry D. Gibson (Editor), Toby Berger, David Lindbergh; January 1998; Morgan Kaufmann Publishers; ISBN: 1558603697

    H.323 Videoconferencing Standard; Christine Perey; 1998; Chapman & Hall; ISBN: 0412148412

    Newton's Telecom Dictionary; Harry Newton; October 1998; Miller Freeman Books; ISBN: 1578200237

    Official Microsoft NetMeeting 2.1 Book; Bob Summers, Robert Summers; 1998; Microsoft Press; ISBN: 1572318163

    Personal Videoconferencing; Evan Rosen; 1996; Manning Publications Co.; ISBN: 1-884777-28-7

    The Essential Guide to Telecommunications; Annabel Z. Dodd; December 1998; Prentice Hall Trade; ISBN: 0132590115

    Voice and Data Communications Handbook: Signature Edition; Donald Gregory, Regis J. 'Bud' Bates; January 1998; McGraw Hill Text; ISBN: 0070063966

    Video Conferencing; Toby Trowt-Bayard, Jim Wilcox; 1997; Telecom Books (Miller Freeman, Inc.); ISBN: 1-57820-010-5

    "Measuring Quality in Video Conferencing Systems", Roger Finger, Business Communication Review, June 1998.

    "Virtual Meetings with Desktop Conferencing", Amitava Dutta-Roy, IEEE Spectrum, July 1998.

    Network Week "Picture this".

    Video Cookbook Contributors

    The Video Development Initiative (ViDe)

    The goal of The Video Development Initiative (ViDe) is to promote the deployment of digital video in higher education by leveraging collective resources and expertise to address challenges to deployment: poor interoperability, volatile standards and high cost. A multi-institutional effort, ViDe was founded by four educational institutions: The Georgia Institute of Technology, North Carolina State University, The University of North Carolina at Chapel Hill, and The University of Tennessee, Knoxville. NYSERNet (New York State Education and Research Network) became a working partner with ViDe shortly thereafter. In May 1999, ViDe expanded its membership to include nine additional institutions: University of Alabama at Birmingham, CANARIE, George Washington University, NYSERNet, Ohio State University, The University of Hawaii, The University of South Carolina, Vanderbilt University, The College of William and Mary, and Yale University.

    ViDe Phase I (August 1, 1998 - April 1, 1999) was funded by the Southeastern Universities Research Association (SURA), and focused on the specification of optimum video-on-demand and videoconferencing systems, the establishment of relationships with vendors willing to refine their products to meet those specifications, and the preparation and release of recommended practices and standards for video systems to SURA member institutions. In September 1998, an RFI Concerning Video Conferencing over IP and/or Video-on-Demand Server Technologies was released to video system vendors nationally. The responses to the RFI contributed to the creation of two deliverables for the SURA and NYSERNet communities: a videoconferencing cookbook and a whitepaper on video-on-demand, Digital Video for the Next Millennium.

    ViDe Phase II foci included the SURA-funded "Large Scale Video Network Prototype," a distributed H.323 testbed to explore issues critical to the deployment of seamless networked video between institutions and regions; ViDeNet, a testbed and model network in which to develop and promote ViDe's goals for highly scalable and robust networked video technologies, and to create a seamless global environment for teleconferencing and collaboration; and the updating of the Videoconferencing Cookbook.

    In the recently initiated Phase III (June 2000), ViDe has established two new working groups - the Video Access Working Group and the MPEG4 Working Group - to focus on testing, standards development and industry partnerships in the video-on-demand arena, and to accelerate the adoption of metadata for digital video assets. Further emphasis will be placed on application sharing tests for applications relevant to research and education in science and engineering.

    The following people have contributed to the development of this cookbook. A short biography of each person is included. In alphabetical order:

    Grace Agnew
    Georgia Institute of Technology

    Assistant Director, Systems and Technical Services, Price Gilbert Library

    Grace Agnew is a member of ViDe. She manages virtual library initiatives for the Georgia Tech Library, which includes metadatabase creation for electronic collections. She co-authored and co-administers a three-year $750,000 grant to create, with the Emory University Library, a multimedia digital virtual library, including video, audio, still image and textual material. She administers a three-year digital imaging grant to develop an imaging program for still images and three-dimensional artifacts. She is the author of numerous articles, as well as the LITA monograph, "Online System Migration Guide".

    Sean Brennan
    Georgia Institute of Technology

    Systems Support Specialist, Classroom Technology, Educational Technologies directorate of the Office of Information Technology

    Sean Brennan is responsible for the support of classroom technologies in Georgia Tech's general purpose classrooms. He has a background in professional sound reinforcement, video projection systems, automated control systems, multiple PC platforms, and web development. He has recently moved into video conferencing, both desktop and room-size systems, and is involved in the testing and support of new and existing video conferencing technologies.
     
     

    Jill Gemmill
    University of Alabama at Birmingham

    Senior Network Applications Specialist, UAB Telecommunications & Network Services
    Internet2 Applications Lead

    Jill Gemmill is a member of ViDe and will become ViDe administrative chair in 2001. At UAB, Jill works on the development and deployment of advanced networking applications, including: campus deployment of IP multicast technology; streaming video-on-demand services; H.323 video conferencing; measuring campus network performance; and Quality of Service. Jill is co-Principal Investigator on NSF and NSF-EPSCOR Advanced Network Infrastructure Research grants and is a member of the team that built the Gulf Central GigaPOP in Alabama. Additional activities include: Chairperson, SURA Advanced Network Applications Workshop in September 1999 and Steering Committee Member, UCAID Health Sciences Working Group. Jill received an MS in Computer and Information Sciences in 1984 and wrote data collection/analysis software for vision scientists and neuroscientists, including 3D reconstructions of neurons from electron, light, and confocal microscopy. She can be reached at jgemmill@uab.edu.

    Jeremy George
    Yale University

    Jeremy George directs advanced networking initiatives within ITS at Yale University. The majority of his time recently has been focused on real-time protocols, especially voice over IP.

    Chris Hodge
    University of Tennessee

    Coordinator, SunSITE@UTK

    Chris Hodge is a member of ViDe. He is the coordinator for SunSITE@UTK, one of over 55 educationally-affiliated sites worldwide, sponsored by Sun Microsystems and dedicated to the promotion of emerging technologies and the free distribution of information.

    Mark D. Johnson
    University of South Carolina

    Information Technology Manager, Internet Video Group

    Mark Johnson is a member of ViDe and SEPSCoR (SouthEast Partnership to Share Computational Resources), and has worked with H.320 and H.323 video conferencing for several years in collaboration with both groups. He has also worked on several other projects involving H.320 and H.323 video conferencing and streaming video, including the Megaconference and Project Connect. Project Connect is a statewide effort to connect children with special needs in K-12 schools with specialists who assist them remotely. His interests lie in wireless networking, video conferencing, multicasting and data collaboration. He is part of the USC campus effort to provide wireless access to the network backbone.
     
     

    Tyler Miller Johnson
    University of North Carolina at Chapel Hill

    Director, CAVNER Center for Advanced Video Network Engineering and Research
    Telecommunications Systems Engineer, Networking and Communications Group

    Tyler Miller Johnson serves as ViDe's technical co-chair. His area of expertise is networked video systems, helping UNC-CH become a leader in that area. He is responsible for UNC's migration to digital television and HDTV, and helped make UNC the first site to employ dense wavelength division multiplexing for multi-gigabit uncompressed video transport. Mr. Johnson also serves on the Internet2 video steering committee and the North Carolina Networking Initiative technical committees.

    Ms. Mairead Martin
    The University of Tennessee

    Director, Advanced Internet Technologies, Office of Research and Information Technology

    Mairead Martin is a member of ViDe and heads up a recently-established unit dedicated to the development, promotion, and implementation of next-generation technologies and applications at The University of Tennessee. Ms. Martin is chair of the ViDe MPEG4 Working Group, and is active in both videoconferencing and video-on-demand activities within ViDe. She also represents UT on the Internet2 Digital Video Steering Committee.

    Ed Price
    Georgia Institute of Technology

    Research Director, Interactive Media Technology Center, Georgia Center for Advanced Telecommunications Technology

    Ed Price is the current chairman of ViDe. In his day job, Ed is the research director at the Interactive Media Technology Center at Georgia Tech. He has been involved in video conferencing research for almost 10 years, including projects sponsored by Intel, Bell South, and the US Army. Currently, most of Ed's research is involved in video indexing, educational media and telemedicine. He can be reached at ed.price@imtc.gatech.edu.

    Mary Trauner
    Georgia Institute of Technology

    Senior Research Scientist, High Performance Computing, Educational Technologies directorate of the Office of Information Technology

    Mary Trauner is a member of ViDe and editor of this cookbook. Mary has led a small team of scientists and engineers in the development of a support infrastructure for high performance computing at Georgia Tech. An atmospheric scientist herself, she has worked to port several large applications to these systems. As Georgia Tech's Internet2 Application Liaison and as the directions in large scale computing have expanded to include national labs, local campus facilities, and distributed systems, Mary has increased her activities in the application of advanced networks to the solution of large scale problems. Her interest here, with ViDe and the Internet2 DVN groups, lies primarily in the educational support role for digital video and application sharing and data collaboration for scientific and engineering applications.

    Pat Watson
    University of Tennessee

    Senior Comm and Elec Tech, Telecommunications & Network Services

    Pat Watson is a member of ViDe. Pat works with several projects involving H.320 and H.323 video conferencing, streaming video, and digital video editing and archiving on the UT Knoxville campus and with other campuses across the country.

    Mary Fran Yafchak
    SURA

    IT Program Coordinator

    Mary Fran Yafchak is a member of ViDe and also the liaison between ViDe and its original sponsoring organization, SURA (Southeastern Universities Research Association). Mary Fran currently works under the direction of the SURA Director of Information Technology Initiatives to further the development as well as SURA sponsorship of collaborative information technologies within the SURA region. Prior to this, she was the Advanced Application Technology Manager for NYSERNet (New York State Education and Research Network), charged with instilling a strong application-driven approach to the network design of NYSERNet's next-generation network, NYSERNet 2000. In both current and past roles, Mary Fran has enabled and supported diverse initiatives related to the development and dissemination of advanced network technologies. Among these are the NYSERNet Video-IP Project, the NYSERNet Multipoint Conferencing Service trial, integration of Internet resources to benefit the K-12 community, and facilitation of university-based advanced application support teams. She is also presently co-chair of the Internet2 Digital Video group.

    Graphics design assistance by Greg Noe
    University of Tennessee

     

    Acknowledgements

    We would like to thank the following for making this work possible and for providing time and creativity during the production:
    ViDe is grateful to The Southeastern Universities Research Association for its support of our efforts. SURA support has contributed to the creation of this cookbook and the creation of a "best practices" white paper on video-on-demand. With SURA funding, ViDe was able to purchase and install desktop video conferencing terminals at the original institutions as well as multipoint conferencing facilities for the Large Scale Video Networking Prototype (LSVNP), thus establishing an experienced community of H.323 users and enabling testing and evaluating of video conferencing software for the purposes of this cookbook. SURA has provided funding for organizational and planning meetings as well as travel expenses to several conferences at which ViDe was presenting on behalf of our group and region. We are grateful also for the valuable support, help and advice SURA staff and officers have contributed to ViDe activities.

    The University of Tennessee for staff time, meeting facilities, and RFI mailouts.

    NYSERNet as a predominantly self-funded research partner in this project, contributing staff time, expertise, travel expenses, and NYSERNet Video-IP internship hours.

    The Georgia Institute of Technology for staff time and meeting facilities.

    The University of North Carolina at Chapel Hill for staff time.

    The University of Alabama at Birmingham for staff time.

    Yale University for staff time.

    The University of South Carolina for staff time.

    The College of William and Mary for staff time.

    North Carolina State University for staff time.