MPI4py
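mpi4py implements many of the MPI interfaces and makes it easy to pass Python data structures between processes, so multi-process programs can be written in Python.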
https://mpi4py.readthedocs.io/en/stable/tu ...
Arm-Performance-Lib
Arm Performance Libraries is a performance library provided by Arm for the Arm architecture. It offers Fortran and C APIs, and its routines include BLAS and LAPACK.
doc:https://developer. ...
PETSc
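1. Installing and configuring PETSc
PETSc requires MPI and a BLAS library, plus basic packages such as gcc. mpich can be installed directly with apt-get:
sudo apt-get install mpich
If you are not sure whether BLAS is installed ...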
Merge-based Sparse Matrix-Vector Multiplication (SpMV) using the CSR Storage Format
This is a paper from PPoPP, from 2016 ...
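For readers unfamiliar with the CSR layout named in the title, here is a minimal sketch of a plain row-by-row CSR SpMV. It is a baseline for orientation only, not the merge-based method of the paper. The function name `csr_spmv` and the array names `row_ptr`, `col_idx`, and `values` are illustrative choices, not taken from the post.

```python
def csr_spmv(row_ptr, col_idx, values, x):
    """Compute y = A @ x for a matrix A stored in CSR form.

    row_ptr[i]..row_ptr[i + 1] delimits the nonzeros of row i;
    col_idx[k] and values[k] give the column and value of nonzero k.
    """
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        acc = 0.0
        # Walk over the nonzeros of row i only.
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        y[i] = acc
    return y


# Example matrix (the middle row is empty):
# [[1, 0, 2],
#  [0, 0, 0],
#  [3, 4, 0]]
row_ptr = [0, 2, 2, 4]
col_idx = [0, 2, 0, 1]
values = [1.0, 2.0, 3.0, 4.0]
x = [1.0, 2.0, 3.0]

y = csr_spmv(row_ptr, col_idx, values, x)
print(y)  # [7.0, 0.0, 11.0]
```

In this baseline the work per row is proportional to the number of nonzeros in that row, so splitting rows evenly across threads can leave the work unevenly distributed when row lengths vary widely.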
Performance Optimization of SpMV by Considering Scheduling on CPUs
Performance optimization of SpMV using CRS format by considering OpenMP scheduling on CPUs and MIC. This paper ...
Optimizing SpMV on Emerging Many-Core Architectures
This paper implements an adaptive format-selection model for many-core platforms, which chooses a suitable compressed matrix format for each matrix and then performs ...
HPC-Game-1th
Home page: https://hpcgame.pku.edu.cn/
Official solutions from the organizers: https://github.com/lcpu-club/hpcgame_1st_problems
1. Traffic: PCAP is a ...
MPI
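The MPI material here comes mainly from https://mpitutorial.com/tutorials/. Because the MPI tutorial is fairly basic and omits commonly used topics such as non-blocking communication and parallel file I/O, it is supplemented with 《高性能计算-MP ...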
SpV8_Pursuing_Optimal_Vectorization_and_Regular_Computation_Pattern_in_SpMV
In my earlier paper readings, I found that after some time had passed, even when looking at my notes ...
Algorithm and hardware co optimized solution for large SpMV problems
This paper relies on specific hardware for its co-optimization, so it does not generalize well. It did, however, remind me that by row ...