Stable Diffusion is a bit confusing, but we already have some great resources to help you wrap your head around it. In terms of input, you can use a depth map from a camera with lidar (many recent phones ...