Performance analysis and optimization for SpMV based on ARM
Performance analysis and optimization for SpMV based on aligned storage formats on an ARM processor. This ...
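NEON Programming
The source material is Arm's NEON Programmer’s Guide and NEON Programmer Guide for Armv8-A.
In my experience, this documentation was of limited help; it only gave me a basic understanding of A64. In practice ...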
WISE: Predicting the Performance of Sparse Matrix
WISE: Predicting the Performance of Sparse Matrix. Abstract: Sparse matrix-vector multiplication (SpMV) is an important sparse kernel. Many methods have been developed to accelerate SpM ...
Efficiently Running SpMV on Long Vector Architectures
Efficiently Running SpMV on Long Vector Architectures. Abstract: Sparse matrix-vector multiplication (SpMV) is an important kernel in parallel numerical applications. SpMV involves sparse ...
CS149-Assignment-4(2023FALL)
CS149-Assignment-4(2023FALL).
CS149: https://gfxcourses.stanford.edu/cs149/fall23/
Assignment 4: Nano ...
An OpenMP Runtime for Transparent Work Sharing across Cache-Incoherent Heterogeneous Nodes
An OpenMP Runtime for Transparent Work Sharing across Cache-Incoherent Heterogeneous Nodes. Key Words: ...
CMU15-418notes(19-23)
Video: https://www.bilibili.com/video/BV1Rh4y1F7aU/?spm_id_from=333.788&vd_source=463e5b3e4b18e54534 ...
CS149-Assignment-1&2
CS149-Assignment-1&2.
CS149: https://gfxcourses.stanford.edu/cs149/fall23/
Assignment 1: Performa ...
Performance Optimization Tools
Some notes on commonly used performance optimization tools, mainly perf and VTune.
Reference documentation for perf:
https://perf.wiki.kernel.org/index.php/Tutorial
1. Lin ...
OpenMP
OpenMP notes. Since I have studied it before, this pass is just a quick review.
The study material comes from:
https://lemon-412.github.io/imgs/20200516OpenMP_simple_Program.p ...