This chapter investigates how multiple participants coordinate their actions within a temporally unfolding "turn-constructional unit" (TCU) in Japanese conversation. Utterances are shaped by ongoing processes of participation, and speakers modulate the structure of their emerging TCU in response to recipients' dynamic stance displays. Building on Iwasaki (2011, 2013), the chapter examines how speakers delay the further realization of an unfolding unit by suspending its progressivity and incorporate recipients' actions into the design of the turn. The speaker creates interstitial spaces immediately before or right after producing a component that conveys the speaker's stance and affiliation, and invites the recipient's actions toward that particular component before completing the turn. The chapter illustrates systematic multimodal practices within unit construction that enable emergent forms of participation.