This greatly reduces the number of allocators we create, while still
retaining thread safety. Reducing the number of allocators is much
better for locality and memory usage; this revision drops memory
usage for some MLIR heavy workloads (with lots of attributes/types)
by >=5%. This is due to the observation that the number of threads
is effectively always smaller than the number of parametric attributes/types.
Differential Revision: https://reviews.llvm.org/D145991