ABSTRACT

Today’s video-conferencing tools support a rich range of professional and social activities, but their generic, grid-based environments cannot be easily adapted to meet the varying needs of distributed collaborators. To enable end-user customization, we developed BlendScape, a system for meeting participants to compose video-conferencing environments tailored to their collaboration context by leveraging AI image generation techniques. BlendScape supports flexible representations of task spaces by blending users' physical or virtual backgrounds into unified environments and implements multimodal interaction techniques to steer the generation. Through an evaluation with 15 end-users, we investigated their customization preferences for work and social scenarios. Participants could rapidly express their design intentions with BlendScape and envisioned using the system to structure collaboration in future meetings, but experienced challenges with preventing distracting elements. We implement scenarios to demonstrate BlendScape’s expressiveness in supporting distributed collaboration techniques from prior work and propose composition techniques to improve the quality of environments.



Overview of BlendScape, illustrating the architecture for creating customizable video-conferencing meeting experiences with AI image generation techniques.



Overview of the BlendScape interface: BlendScape offers two composition modes for creating meeting spaces (a): blending webcam feeds together via inpainting, and transforming the image on the canvas via image-to-image. To steer the environment generation, end-users can specify text-based prompts for the Meeting Activity and Meeting Theme (c), control the prioritization of stylistic prompts over user backgrounds (b), upload custom environment images (f), and modify specific regions of the scene via selection tools (e). Users can return to and iterate on previous environment designs via the history tools (g). The automatic layout techniques facilitate positioning users behind foreground objects in the scene (d). BlendScape also provides session management tools (h) and per-user controls (i) for adjusting how much of each user's video background is preserved during environment generation and for toggling between live webcam feeds and static frames.


@misc{rajaramBlendScapeEnablingUnified2024,
  title = {{{BlendScape}}: {{Enabling Unified}} and {{Personalized Video-Conferencing Environments}} through {{Generative AI}}},
  shorttitle = {{{BlendScape}}},
  author = {Rajaram, Shwetha and Numan, Nels and Kumaravel, Balasaravanan Thoravi and Marquardt, Nicolai and Wilson, Andrew D.},
  year = {2024},
  month = mar,
  number = {arXiv:2403.13947},
  eprint = {2403.13947},
  publisher = {arXiv},
  urldate = {2024-03-22},
  abstract = {Today's video-conferencing tools support a rich range of professional and social activities, but their generic, grid-based environments cannot be easily adapted to meet the varying needs of distributed collaborators. To enable end-user customization, we developed BlendScape, a system for meeting participants to compose video-conferencing environments tailored to their collaboration context by leveraging AI image generation techniques. BlendScape supports flexible representations of task spaces by blending users' physical or virtual backgrounds into unified environments and implements multimodal interaction techniques to steer the generation. Through an evaluation with 15 end-users, we investigated their customization preferences for work and social scenarios. Participants could rapidly express their design intentions with BlendScape and envisioned using the system to structure collaboration in future meetings, but experienced challenges with preventing distracting elements. We implement scenarios to demonstrate BlendScape's expressiveness in supporting distributed collaboration techniques from prior work and propose composition techniques to improve the quality of environments.},
  archiveprefix = {arXiv},
}