Record: 1002839 (2007-08-16)
Author: MENDES, C. L.
Title: Integrating message-passing with vector architectures.
Year: 1995
Pages: p. 151-165.

Abstract: Vector architectures provide excellent computational throughput while successfully tolerating memory latency by pipelining memory accesses. In this paper, we propose a generalization of vector architectures to message-passing multicomputers, which combines the efficiency of vector computation with the scalability of distributed-memory systems. In our proposed architecture, each node is a conventional vector processor (with chaining capability and pipelined functional units) augmented by native instructions to send and receive messages through vector registers. In this scheme, inter-node communication can be performed via vector-send/receive instructions, gaining the benefits of communication pipelining, reduced memory copies (memory-to-register-to-register instead of memory-to-memory-to-cache), and lower communication latency (due to tight processor-communication coupling). We show that this strong integration between functional and communication units can lead to substantial performance improvement over conventional message-passing multicomputers. We model pipelined computation-communication systems both analytically and with a detailed instruction-level simulation, and compare the simulation data with empirical results from an Intel Paragon. Preliminary data from a matrix multiplication example indicate that our proposed vector-parallel architecture offers significant scalability benefits over existing message-passing systems.

In: SIMPÓSIO BRASILEIRO DE ARQUITETURA DE COMPUTADORES - PROCESSAMENTO DE ALTO DESEMPENHO, 7.; CONGRESSO BRASILEIRO DA SOCIEDADE BRASILEIRA DE COMPUTAÇÃO, 15., 1995, Canela. Anais... Porto Alegre: UFRGS, Instituto de Informática, 1995.