For the execution of an arbitrary parallel program P, consisting of a set of processes, we consider two alternative multiprocessors. The first multiprocessor has q processors and allocates parallel programs dynamically, i.e. processes may be reallocated from one processor to another. The second employs cluster allocation with k clusters and u processors in each cluster - here processes may be reallocated within a cluster only. Let T_d(P,q) and T_c (P,k,u) be execution times for the parallel program P with optimal allocations. We derive a formula for the program independent performance function G(k,u,q)=\sup_ all parallel programs P T_c(P,k,u)}{T_d(P,q)}. Hence, with optimal allocations, the execution of $P$ can never take more than a factor $G(k,u,q)$ longer time with the second multiprocessor than with the first, and there exist programs showing that the bound is sharp. The supremum is taken over all parallel programs consisting of any number of processes. Any interprocess dependency structure is allowed for the parallel programs, except deadlock. Overhead for synchronization and reallocation is neglected only. We further present optimal formulas which exploits a priori knowledge of the class of parallel programs intended for the multiprocessor, thus resulting in sharper optimal bounds. The function g(n,k,u,q) is the above maximum taken over all parallel programs consisting of n processes. The function s(n,v,k,u) is the same maximum, with q=n, taken over all parallel programs of $n$ processes which has a degree of parallelism characterized by a certain parallel profile vector v=(v_1,...,v_n). The functions can be used in various ways to obtain optimal performance bounds, aiding in multiprocessor architecture decisions. An immediate application is the evaluation of heuristic allocation algorithms. It is well known that the problems of finding the corresponding optimal allocations are NP-complete. We thus in effect present a methodology to obtain optimal control of NP-complete scheduling problems.