ABSTRACT

Today’s video-conferencing tools support a rich range of professional and social activities, but their generic meeting environments cannot be dynamically adapted to align with distributed collaborators’ needs. To enable end-user customization, we developed BlendScape, a rendering and composition system for video-conferencing participants to tailor environments to their meeting context by leveraging AI image generation techniques. BlendScape supports flexible representations of task spaces by blending users’ physical or digital backgrounds into unified environments and implements multimodal interaction techniques to steer the generation. Through an exploratory study with 15 end-users, we investigated whether and how they would find value in using generative AI to customize video-conferencing environments. Participants envisioned using a system like BlendScape to facilitate collaborative activities in the future, but required further controls to mitigate distracting or unrealistic visual elements. We implemented scenarios to demonstrate BlendScape’s expressiveness for supporting environment design strategies from prior work and propose composition techniques to improve the quality of environments.

Overview of BlendScape, a rendering and composition system for end-users to customize video-conference environments by leveraging AI image generation techniques.

Video

5-minute narrated video overview of BlendScape, covering the motivation, pipeline, and evaluation of the system.

Interface

Overview of the BlendScape interface. BlendScape offers two composition modes for creating meeting spaces: blending webcam feeds together via inpainting, and transforming the image on the canvas via image-to-image generation. To steer the environment generation, end-users can specify text-based prompts for the Meeting Activity and Meeting Theme (c), control the strength of stylistic prompts (b), upload custom image priors (f), and modify specific regions of the scene via selection tools (e). Users can return to and iterate on previous environment designs via the history tools (g). Automatic layout techniques help position users behind foreground objects in the scene (d). BlendScape also provides session management tools (h) and per-user controls (i) for adjusting how much of each user's video background is preserved during environment generation and for toggling between live webcam feeds and static frames.

CITING

@inproceedings{rajaramBlendScapeEnablingEndUser2024,
  title = {{{BlendScape}}: {{Enabling End-User Customization}} of {{Video-Conferencing Environments}} through {{Generative AI}}},
  booktitle = {Proceedings of the 37th {{Annual ACM Symposium}} on {{User Interface Software}} and {{Technology}}},
  author = {Rajaram, Shwetha and Numan, Nels and Kumaravel, Balasaravanan Thoravi and Marquardt, Nicolai and Wilson, Andrew D},
  year = {2024},
  month = oct,
  series = {{{UIST}} '24},
  publisher = {Association for Computing Machinery},
  address = {New York, NY, USA},
  doi = {10.1145/3654777.3676326},
}