To reduce the overhead of synchronization operations on shared-memory multiprocessors, we previously proposed a mechanism named specMEM, which speculatively executes the memory accesses that follow a synchronization operation before completion of the synchronization is confirmed. A unique feature of the mechanism is that both the detection of speculation failure and the restoration of computational state on failure are implemented by a small extension of the coherent cache. Also notable is that the operations performed on speculation success and on speculation failure each take constant time, independent of the number of speculative accesses. Although we previously reported that specMEM significantly reduces execution time, for example by 13% for LU decomposition, we also observed that it could be implemented more efficiently. This paper discusses more efficient implementations of specMEM that use an extra cache state and/or a non-speculative secondary cache.