The load folding optimization was overly conservative: it required the root OR instruction to have a single use. This prevented loads from being folded when only the root had multiple uses. For example:

  %val = or i32 ...          ; Assembles 4 bytes to i32
  %use1 = call @foo(%val)
  %use2 = call @bar(%val)
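
For context, a source-level pattern of roughly this shape could give rise to IR like the example above. This is only an illustrative sketch: the names load_le32 and two_uses and the bodies of foo and bar are hypothetical, and the exact IR a compiler emits for it is an assumption, not something taken from this change.

```c
#include <stdint.h>

/* Hypothetical consumers standing in for @foo and @bar in the IR example. */
static uint32_t foo(uint32_t v) { return v + 1u; }
static uint32_t bar(uint32_t v) { return v ^ 0xFFu; }

/* Assemble four consecutive bytes into one 32-bit value (little-endian).
 * In IR this typically becomes a chain of zext/shl/or over four i8 loads,
 * with the final OR acting as the root of the pattern. */
static uint32_t load_le32(const uint8_t *p) {
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* The assembled value is used twice, so the root OR has two uses,
 * while the intermediate values in the chain each have one. */
uint32_t two_uses(const uint8_t *p) {
    uint32_t val = load_le32(p);
    return foo(val) + bar(val);
}
```

Here only the final value has more than one use, which is the case the single-use requirement on the root rejected.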