MATRMUL0 - Matrix Multiplication 2K
Consider the following C++ code, it constructs n×n matrices A,B and vector V from matrix C=A×B which you need to calculate.
uint32_t n, i, j, d1, p1, r1, m1, d2, p2, m2, A[n][n], B[n][n]; uint64_t C[n][n], V[n]; //here you need to read n, p1, d1, r1, m1, p2, d2, r2, m2 from input. for (i=0; i<n; ++i) for (j=0; j<n; ++j) { d1 = d1 * p1 + r1; d2 = d2 * p2 + r2; A[i][j] = d1 >> (32 - m1); B[i][j] = d2 >> (32 - m2); } //here you need to calculate C=A*B for (i=0; i<n; ++i) { V[i] = 0; for (j=0; j<n; ++j) V[i] ^= C[i][j]; } //here you need to output V[0], V[1], ..., V[n-1], separated by spaces.
Input
You are given integers n, p1, d1, r1, m1, p2, d2, r2, m2, separated by spaces.
1≤n≤2048, 1≤m1,m2≤26, 0≤p1,d1,r1,p2,d2,r2<232.
Output
You need to output V[0], ..., V[n-1], separated by spaces.
Example
Input: 4 1664525 0 1013904223 26 1664525 1 1013904223 26 Output: 3929555766216722 770418013451752 7738542081270672 4286515685761206
P.S.
There are 4 tests with n≈2048. My best solution time — ~3.2s (~0.8s per test) which is 20 times faster than time limit.
Also try the challenge MATRMUL, to get AC there your time here should be less than 14s for cubic algorithm and 16s for Strassen's.
hide comments
Ishan:
2023-04-03 06:53:27
I have been hearing from my ECE friends that cache misses cost a lot more than ALU operations these days. This is probably the first time m trying to experience it firsthand. Good that you created a problem with one of the most important topics in CS research in modern times. |
|
Michael Kharitonov:
2017-12-31 12:18:07
tip: use Function Attributes or Function Specific Option Pragmas:
|
|
Sushant Moon:
2017-09-21 19:38:20
I need some hints for this problem. Please if anyone has any it would be a very much helpful Thank You
|
Added by: | Michael Kharitonov |
Date: | 2017-04-09 |
Time limit: | 16s |
Source limit: | 50000B |
Memory limit: | 1536MB |
Cluster: | Cube (Intel G860) |
Languages: | All |
Resource: | fastcomputing.org |