* [ptxdist] [RFC][PATCH] arm-memspeed: add patches for configurable cache sizes
@ 2014-06-19 17:24 Markus Niebel
2014-06-27 6:41 ` Markus Niebel
2014-06-27 7:49 ` Juergen Borleis
0 siblings, 2 replies; 3+ messages in thread
From: Markus Niebel @ 2014-06-19 17:24 UTC (permalink / raw)
To: ptxdist; +Cc: m.olbrich, Markus Niebel
From: Markus Niebel <Markus.Niebel@tq-group.com>
Cores such as the Cortex-A9 have both L1 and L2 caches, and cache
sizes vary between cores and SoCs. Make the buffer sizes
configurable so the user can match them to the hardware.
While at it, remove some compiler warnings.
Signed-off-by: Markus Niebel <Markus.Niebel@tq-group.com>
---
I'm not 100% sure about the following things:
1) With HF toolchains (tested the cortexa8 and v7a versions of OSELAS.Toolchain 2013.12.2)
the program segfaults when the cached tests are enabled, even with the unpatched
version. Built with the same toolchains without HF, the program runs without error.
2) The timing code looks strange. Wouldn't it be better to use clock_gettime with
CLOCK_MONOTONIC on systems with tickless kernels?
patches/arm-memspeed-1.0/0001-fix-cache-size.patch | 58 ++++++++++++++
patches/arm-memspeed-1.0/0002-use-getopt.patch | 82 ++++++++++++++++++++
.../0003-remove-compiler-warnings.patch | 14 ++++
patches/arm-memspeed-1.0/series | 4 +
4 files changed, 158 insertions(+)
create mode 100644 patches/arm-memspeed-1.0/0001-fix-cache-size.patch
create mode 100644 patches/arm-memspeed-1.0/0002-use-getopt.patch
create mode 100644 patches/arm-memspeed-1.0/0003-remove-compiler-warnings.patch
create mode 100644 patches/arm-memspeed-1.0/series
diff --git a/patches/arm-memspeed-1.0/0001-fix-cache-size.patch b/patches/arm-memspeed-1.0/0001-fix-cache-size.patch
new file mode 100644
index 0000000..4ac8a36
--- /dev/null
+++ b/patches/arm-memspeed-1.0/0001-fix-cache-size.patch
@@ -0,0 +1,58 @@
+Index: arm-memspeed-1.1/memspeed.c
+===================================================================
+--- arm-memspeed-1.1.orig/memspeed.c 2011-10-14 19:57:47.000000000 +0200
++++ arm-memspeed-1.1/memspeed.c 2014-05-12 13:51:23.000000000 +0200
+@@ -25,7 +25,12 @@
+ * than half the actual data cache size but not smaller than 4096.
+ * (4096 is the largest preload size)
+ */
+-#define CACHED_BUF_SIZE 4*1024
++#define CACHED_BUF_SIZE_MIN 4*1024
++
++static size_t large_buf_size = BUF_SIZE;
++static size_t cached_buf_size = CACHED_BUF_SIZE_MIN;
++
++
+
+ typedef void (test_fn)(void *src, void *dst, long size);
+
+@@ -101,7 +106,7 @@
+ long long t1, t2, tod_usecs, ck_usecs;
+ int i, loops, tries;
+ float x;
+- int buf_size = in_cache ? CACHED_BUF_SIZE : BUF_SIZE;
++ int buf_size = in_cache ? cached_buf_size : large_buf_size;
+ void *buf1 = test_buf;
+ void *buf2 = test_buf + buf_size;
+
+@@ -122,8 +127,8 @@
+
+ /* first a single pass to warm caches and page memory in */
+ if (in_cache) {
+- test_ldm_32_p8(buf1, NULL, CACHED_BUF_SIZE);
+- test_ldm_32_p8(buf2, NULL, CACHED_BUF_SIZE);
++ test_ldm_32_p8(buf1, NULL, cached_buf_size);
++ test_ldm_32_p8(buf2, NULL, cached_buf_size);
+ }
+ fn(buf2, buf1, buf_size);
+
+@@ -146,8 +151,8 @@
+ /* now the real test */
+ usleep(10000);
+ if (in_cache) {
+- test_ldm_32_p8(buf1, NULL, CACHED_BUF_SIZE);
+- test_ldm_32_p8(buf2, NULL, CACHED_BUF_SIZE);
++ test_ldm_32_p8(buf1, NULL, cached_buf_size);
++ test_ldm_32_p8(buf2, NULL, cached_buf_size);
+ }
+ fn(buf2, buf1, buf_size);
+ i = loops;
+@@ -229,7 +234,7 @@
+ }
+ }
+
+- test_buf_ = malloc(2*BUF_SIZE + CACHED_BUF_SIZE + 4096);
++ test_buf_ = malloc(2*large_buf_size + cached_buf_size + 4096);
+
+ /* page align */
+ test_buf = (char *)((long)(test_buf_ + 4095) & ~4095L);
diff --git a/patches/arm-memspeed-1.0/0002-use-getopt.patch b/patches/arm-memspeed-1.0/0002-use-getopt.patch
new file mode 100644
index 0000000..a587713
--- /dev/null
+++ b/patches/arm-memspeed-1.0/0002-use-getopt.patch
@@ -0,0 +1,82 @@
+Index: arm-memspeed-1.0/memspeed.c
+===================================================================
+--- arm-memspeed-1.0.orig/memspeed.c 2014-06-18 20:19:15.000000000 +0200
++++ arm-memspeed-1.0/memspeed.c 2014-06-18 20:20:03.000000000 +0200
+@@ -10,6 +10,7 @@
+ #include <unistd.h>
+ #include <string.h>
+ #include <sched.h>
++#include <errno.h>
+ #include <sys/time.h>
+ #include <sys/times.h>
+
+@@ -218,23 +219,63 @@
+ perror("Warning: unable to set scheduling priority, ");
+ }
+
++static void usage(int argc, char *argv[])
++{
++ fprintf(stderr, "Usage: %s [-c] -l <uncached size> -s <cached size>\n"
++ " -c\tinclude cached memory results\n"
++ " -l\t mem size for the uncached tests in kiBytes, must be larger than L2 cache\n"
++ " -s\t mem size for the cached tests in kiBytes\n",
++ argv[0]);
++}
++
+ int main(int argc, char *argv[])
+ {
+ void *test_buf_;
+ int i, j, include_cached = 0;
++ int c;
+
+- for (i = 1; i < argc; i++) {
+- if (strcmp(argv[i], "-c") == 0) {
++ while (-1 != (c = getopt(argc, argv, "cl:s:"))) {
++ switch(c) {
++ case 'c':
+ include_cached = 1;
+- } else {
+- fprintf(stderr, "Usage: %s [-c]\n"
+- " -c\tinclude cached memory results\n",
+- argv[0]);
++ break;
++ case 'l':
++ errno = 0;
++ large_buf_size = strtoul(optarg, 0, 10) * 1024;
++ if (errno || (0 == large_buf_size)) {
+ fprintf(stderr, "%s: -l needs positive non zero val\n", argv[0]);
++ usage(argc, argv);
++ exit(1);
++ }
++ break;
++ case 's':
++ errno = 0;
++ cached_buf_size = strtoul(optarg, 0, 10) * 1024;
+ if (errno || (0 == cached_buf_size)) {
++ fprintf(stderr, "%s: -s needs positive non zero val\n", argv[0]);
++ usage(argc, argv);
++ exit(1);
++ }
++ break;
++
++ default:
++ usage(argc, argv);
+ exit(1);
+ }
+ }
+
++ if (large_buf_size <= cached_buf_size * 4) {
++ fprintf(stderr, "%s: uncached buffer must be more than four times the cached size\n", argv[0]);
++ usage(argc, argv);
++ exit(1);
++ }
++
+ test_buf_ = malloc(2*large_buf_size + cached_buf_size + 4096);
++ if (!test_buf_) {
++ fprintf(stderr, "%s: unable to allocate test buffer\n", argv[0]);
++ usage(argc, argv);
++ exit(1);
++ }
+
+ /* page align */
+ test_buf = (char *)((long)(test_buf_ + 4095) & ~4095L);
diff --git a/patches/arm-memspeed-1.0/0003-remove-compiler-warnings.patch b/patches/arm-memspeed-1.0/0003-remove-compiler-warnings.patch
new file mode 100644
index 0000000..d3e255c
--- /dev/null
+++ b/patches/arm-memspeed-1.0/0003-remove-compiler-warnings.patch
@@ -0,0 +1,14 @@
+Index: arm-memspeed-1.0/memspeed.c
+===================================================================
+--- arm-memspeed-1.0.orig/memspeed.c 2014-06-18 11:25:46.000000000 +0200
++++ arm-memspeed-1.0/memspeed.c 2014-06-18 11:29:22.000000000 +0200
+@@ -231,7 +231,8 @@
+ int main(int argc, char *argv[])
+ {
+ void *test_buf_;
+- int i, j, include_cached = 0;
++ size_t i, j;
++ int include_cached = 0;
+ int c;
+
+ while (-1 != (c = getopt(argc, argv, "cl:s:"))) {
diff --git a/patches/arm-memspeed-1.0/series b/patches/arm-memspeed-1.0/series
new file mode 100644
index 0000000..b9d002e
--- /dev/null
+++ b/patches/arm-memspeed-1.0/series
@@ -0,0 +1,4 @@
+0001-fix-cache-size.patch
+0002-use-getopt.patch
+0003-remove-compiler-warnings.patch
+# 0004-debug
--
1.7.9.5
--
ptxdist mailing list
ptxdist@pengutronix.de
* Re: [ptxdist] [RFC][PATCH] arm-memspeed: add patches for configurable cache sizes
2014-06-19 17:24 [ptxdist] [RFC][PATCH] arm-memspeed: add patches for configurable cache sizes Markus Niebel
@ 2014-06-27 6:41 ` Markus Niebel
2014-06-27 7:49 ` Juergen Borleis
1 sibling, 0 replies; 3+ messages in thread
From: Markus Niebel @ 2014-06-27 6:41 UTC (permalink / raw)
To: ptxdist; +Cc: m.olbrich
Hello,
On 19.06.2014 19:24, Markus Niebel wrote:
> From: Markus Niebel <Markus.Niebel@tq-group.com>
>
> Cores like cortex a9 have L1 and L2 cache. Sizes of
> caches may vary between cores and SOCs. So it is good to
> have the sizes configurable and let the user decide.
>
> While at it, remove some compiler warnings.
Any comments, or should I split the patches from the discussion?
>
> Signed-off-by: Markus Niebel <Markus.Niebel@tq-group.com>
> ---
> I'm not 100% sure about the following things:
>
> 1) with HF toolchains (tested cortexa8 and v7a versions of OSELAS.Toolchain 2013.12.2)
> there is a segfault when enable to run the cached tests - even with the unpatched
> version. The same toolchains without HF the program runs without error.
> 2) The timing code looks strange. Wouldn't it be better to use clock_gettime with
> CLOCK_MONOTONIC on systems with tickless kernels?
>
Is there an expert (toolchain and/or ARM assembler) listening who could answer my question?
> [...]
Greetings
Markus
* Re: [ptxdist] [RFC][PATCH] arm-memspeed: add patches for configurable cache sizes
2014-06-19 17:24 [ptxdist] [RFC][PATCH] arm-memspeed: add patches for configurable cache sizes Markus Niebel
2014-06-27 6:41 ` Markus Niebel
@ 2014-06-27 7:49 ` Juergen Borleis
1 sibling, 0 replies; 3+ messages in thread
From: Juergen Borleis @ 2014-06-27 7:49 UTC (permalink / raw)
To: ptxdist
On Thursday 19 June 2014 19:24:27 Markus Niebel wrote:
> [...]
> I'm not 100% sure about the following things:
>
> 1) with HF toolchains (tested cortexa8 and v7a versions of OSELAS.Toolchain
> 2013.12.2) there is a segfault when enable to run the cached tests - even
> with the unpatched version. The same toolchains without HF the program runs
> without error.
The same happens here. Maybe I will find the time to dig into the sources.
> 2) The timing code looks strange. Wouldn't it be better to
> use clock_gettime with CLOCK_MONOTONIC on systems with tickless kernels?
Good question. Nicolas Pitre wrote that code; I found his sources a long time ago
and made a PTXdist package from them. Maybe I can change it while I am looking
into the sources anyway because of the compiler issue.
Regards,
Juergen
--
Pengutronix e.K. | Juergen Borleis |
Industrial Linux Solutions | http://www.pengutronix.de/ |