Given the digital nature of LLMs, and the fact that they are trained on and predict text (well, token) streams, I wonder if the Sapir-Whorf hypothesis applies. In the end, they only learn what humans have been able to describe in words, which is not everything, and which might lead to different interpretations, particularly in non-physical (emotional, interactional) domains.
Of course, the next generation may already use audio/video input or even more sensors, which might close the gap with human perception of the world. Still, the same question as with octopuses applies: if an LLM is intelligent, but with a different kind of intelligence than ours, are we even able to recognise each other? Each is caught in its own perception bubble of reality.