Sketching: Towards a computational model

Sketching is a powerful means of interpersonal communication. We view sketching as a combination of interactive drawing and linguistic interaction. The drawing carries the spatial aspects of what is to be communicated. The linguistic interaction provides a complementary conceptual channel that guides the interpretation of what is drawn and provides information that is not easily depicted spatially. Most people are not artists, and even artists cannot produce, in real time, drawings of complex objects and relationships that are recognizable solely visually without breaking the flow of conversation. The verbal description that occurs during drawing, punctuated by written labels, compensates for inaccuracies in drawing. Follow-up questions may be needed to disambiguate what aspects of a drawing are intended versus accidental.

Our goal is a computational model of sketching. Research on sketching provides an arena for investigating the intersection of conceptual knowledge, visual understanding, and language, which makes it valuable for understanding human cognition. Creating software that can interact with human partners via sketching could have a revolutionary impact on human-computer interaction.

The framework we are developing characterizes sketching in terms of four dimensions of knowledge and skills: visual understanding, language understanding, conceptual understanding, and drawing skills. Variations along these dimensions determine the range of interactions in which a participant with those skills can take part. Our work focuses heavily on improving visual understanding and conceptual understanding.

nuSketch: A Multimodal Architecture for Sketching

nuSketch is designed as a general-purpose multimodal architecture to support sketching. We are currently using it in three applications:

This research is leading us to develop domain theories that express visual symbology (i.e., the representational contents of visual symbols, genres of diagrams, and relationships between diagrams) and everyday physical semantics (i.e., the assumptions about the physical world that are typically used in interpreting diagrams). We are also using techniques from qualitative spatial reasoning to do high-level visual perception and visual analogies.

Selected Relevant Papers

Ferguson, R. W. & Forbus, K. D. (1995). Understanding illustrations of physical laws by integrating differences in visual and textual representations. AAAI Fall Symposium on Computational Models for Integrating Language and Vision.

Ferguson, R. W. & Forbus, K. D. (2000). GeoRep: A flexible tool for spatial representation of line drawings. Proceedings of the 17th National Conference on Artificial Intelligence. Austin, Texas: AAAI Press. [NB: on the paper the conference is incorrectly identified as the "18th National Conference"]

Forbus, K. D., Ferguson, R. W. & Usher, J. M. (2001). Towards a computational model of sketching. Proceedings of the International Conference on Intelligent User Interfaces. Santa Fe, New Mexico.

Forbus, K. D., Ferguson, R. W. & Usher, J. M. (2000). Boundary-based multimodal input for geographic planning sketches. In P. Healy (Ed.), Proceedings of the First International Workshop on Interactive Graphical Communication. London: Queen Mary College, University of London.

Ferguson, R. W., Rasch, R. A. J., Turmel, W. & Forbus, K. D. (2000). Qualitative spatial interpretation of Course-of-Action diagrams. Proceedings of the 14th International Workshop on Qualitative Reasoning. Morelia, Mexico.

Relevant Projects

Knowledge Acquisition via Analogy, Examples, and Sketching

Technologies for Multimodal Interfaces

Understanding and Fostering Spatial Competence

Building and Using Large Common Sense Knowledge Bases
