Posted to dev@nuttx.apache.org by David Sidrane <da...@nscdg.com> on 2023/06/01 15:46:37 UTC

RE: SD and eMMC performance in Nuttx

Hi Radek Pesina,

This may be way off base, but have you tried reverting
https://github.com/apache/nuttx/commit/7312a553bbc40f3771c5d53ccded89bed7391f2a

It released the CPU but traded that for potentially large, quantized delays.

David

-----Original Message-----
From: Radek Pesina <ra...@motec.com.au>
Sent: Wednesday, May 31, 2023 3:11 AM
To: dev@nuttx.apache.org
Subject: Re: SD and eMMC performance in Nuttx

Hi Nathan,

Thanks for the reply.  The round-robin interval was set to 200 ms, and
reducing it to 10 ms marginally improved the transfer speed.  Our current
code base and dev board are running a slightly quicker clock than the one I
used when I measured 75 KiB/s; the current setup transfers at 100 KiB/s
with an RR interval of 200 ms, which increases to 135 KiB/s with an RR
interval of 10 ms.

Yes, I do have access to an oscilloscope and logic analyzer, so I will
endeavour to obtain some traces tomorrow to rule in/out possible unexpected
delays and/or noise.

*Kind regards,*

*Radek Pesina*

On Wed, 31 May 2023 at 11:33, Nathan Hartman <ha...@gmail.com>
wrote:

> On Tue, May 30, 2023 at 8:07 PM Radek Pesina
> <ra...@motec.com.au>
> wrote:
>
> (snip)
>
> > *Configurations Tested:*
> >
> > For eMMC, I've tried optimising the menuconfig settings to improve it,
> > including options such as those below.  However, the performance remains
> > lacking:
> >
> >    - Turning on CONFIG_MEMCPY_VIK gave a slight improvement.
> >    - Setting USEC_PER_TICK to 1000 or below gave the most improvement,
> >      however at the detriment of other aspects of the system. The fastest
> >      speeds observed by adjusting this were ~370KiB/s write and 700KiB/s
> >      read, though overall this was unacceptable given the effect on the
> >      rest of the system.
> >    - Adjusting LittleFS parameters (e.g.
> >      CONFIG_FS_LITTLEFS_PROGRAM_SIZE_FACTOR,
> >      CONFIG_FS_LITTLEFS_READ_SIZE_FACTOR,
> >      CONFIG_FS_LITTLEFS_BLOCK_SIZE_FACTOR,
> >      CONFIG_FS_LITTLEFS_CACHE_SIZE_FACTOR,
> >      CONFIG_FS_LITTLEFS_LOOKAHEAD_SIZE).
> >    - Ensuring SD/eMMC DMA read/writes are enabled.
> >    - Setting MMCSD_MULTIBLOCK_LIMIT to 0.
> >
>
> (snip)
>
> Out of curiosity, what is the value of CONFIG_RR_INTERVAL, and, if you
> reduce it to something like 20 or 10, does that show any improvement?
>
> Do you have an oscilloscope or logic analyzer available to monitor the
> signals between the microcontroller and MMC? That might shed
> additional light on this. E.g., extremely noisy signals, intermittent
> signals, unexpectedly long delays between bursts of communication, etc.
>
> Nathan
>

Re: SD and eMMC performance in Nuttx

Posted by Radek Pesina <ra...@motec.com.au>.
Hi,

I've been busy with other tasks and so haven't obtained an oscilloscope
trace yet, but one of my peers has performed some profiling on a different
micro and non-secure core, with the mxrt1064-evk:nsh configuration +
tickless and 100us tick rate:

nsh> dd if=/dev/zero of=/dev/null bs=4096 count=100
409600 bytes copied, 2200 usec, 181818 KB/s
nsh> mkrd -m 1 -s 512 1024
nsh> mkfatfs /dev/ram1
nsh> mount -t vfat /dev/ram1 /mnt
nsh> dd if=/dev/zero of=/dev/null bs=4096 count=100
409600 bytes copied, 2200 usec, 181818 KB/s
nsh> dd if=/dev/zero of=/mnt/t bs=4096 count=100
409600 bytes copied, 4700 usec, 85106 KB/s
nsh> umount /mnt
nsh> mkfatfs /dev/mmcsd0
nsh> mount -t vfat /dev/mmcsd0 /mnt
nsh> dd if=/dev/zero of=/mnt/t bs=4096 count=100
409600 bytes copied, 959000 usec, 417 KB/s
nsh> dd if=/dev/zero of=/dev/sd0 bs=4096 count=100
409600 bytes copied, 361400 usec, 1106 KB/s

As observed, the raw speed for the RAM based block device is about 181 MB/s,
through vfat on a ramdisk it is about 85 MB/s, and through vfat on the SD
card (Class 4) it is about 0.4 MB/s. The raw speed using the BCH device
(sd0) is about 1.1 MB/s.
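
For reference, the KB/s figures in the log can be reproduced from the byte
count and elapsed time that dd prints. The following is a minimal C sketch,
assuming dd reports throughput in units of 1024 bytes per second; the
function name throughput_kib_per_sec is hypothetical and is not part of
NuttX:

```c
#include <assert.h>
#include <stdint.h>

/* Throughput in KiB/s (1 KiB = 1024 bytes), truncated to an integer,
 * from a byte count and an elapsed time in microseconds.
 * Hypothetical helper for illustration only.
 */
uint64_t throughput_kib_per_sec(uint64_t bytes, uint64_t usec)
{
  assert(usec > 0);  /* an elapsed time of zero has no defined rate */

  /* bytes / (usec / 1e6) / 1024, rearranged to stay in integer math */
  return (bytes * 1000000ULL) / (usec * 1024ULL);
}
```

Calling it with 409600 bytes and 2200, 4700, 959000 and 361400 usec gives
181818, 85106, 417 and 1106, which match the four dd results above.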




On Fri, 2 Jun 2023 at 05:18, Gregory Nutt <sp...@gmail.com> wrote:

>
> > That would help when tickless mode is used. But what about tickful mode? I
> > guess the intent of 7312a553b was to avoid wasting processor cycles on busy
> > waiting, but if tickless isn't being used, perhaps busy waiting is
> > necessary here? It could choose between the two wait types at compile time
> > based on tickless mode.
>
> In "tickful" mode, you could increase the timer frequency to improve
> resolution.  At 1000 Hz things run pretty well too.  At some point,
> however, the interrupt processing overhead becomes prohibitive.
>
> > Or a bigger question: if tickless mode is "better" (longer battery life,
> > fewer unnecessary interrupts, more processor cycles for real work) why
> > aren't we always using tickless mode? Are there limitations/bugs that make
> > it unsuitable in some situations? Not universally supported on all
> > microprocessors? Other reasons?
>
> It does rely on an available high resolution timer and, so, uses
> resources that not all MCUs may have.  It would also take a substantial
> effort to convert all existing architectures to support tickless.
>
> But the real reason is that tickless mode was experimental and not
> supported on all platforms on initial release and also not well
> trusted.  But I think it is well trusted these days and could be the
> standard.
>
> I am not opposed to the idea at all.  I do like the "tickful" mode for
> MCU bringup because it is a lot simpler.  Tickless support can be added
> later when the port is stable.

Re: SD and eMMC performance in Nuttx

Posted by Gregory Nutt <sp...@gmail.com>.
> That would help when tickless mode is used. But what about tickful mode? I
> guess the intent of 7312a553b was to avoid wasting processor cycles on busy
> waiting, but if tickless isn't being used, perhaps busy waiting is
> necessary here? It could choose between the two wait types at compile time
> based on tickless mode.

In "tickful" mode, you could increase the timer frequency to improve
resolution.  At 1000 Hz things run pretty well too.  At some point,
however, the interrupt processing overhead becomes prohibitive.

> Or a bigger question: if tickless mode is "better" (longer battery life,
> fewer unnecessary interrupts, more processor cycles for real work) why
> aren't we always using tickless mode? Are there limitations/bugs that make
> it unsuitable in some situations? Not universally supported on all
> microprocessors? Other reasons?

It does rely on an available high resolution timer and, so, uses 
resources that not all MCUs may have.  It would also take a substantial 
effort to convert all existing architectures to support tickless.

But the real reason is that tickless mode was experimental and not 
supported on all platforms on initial release and also not well 
trusted.  But I think it is well trusted these days and could be the 
standard.

I am not opposed to the idea at all.  I do like the "tickful" mode for 
MCU bringup because it is a lot simpler.  Tickless support can be added 
later when the port is stable.



Re: SD and eMMC performance in Nuttx

Posted by Nathan Hartman <ha...@gmail.com>.
On Thu, Jun 1, 2023 at 12:35 PM Gregory Nutt <sp...@gmail.com> wrote:

>
> > This may be way off base, but have you tried reverting
> > https://github.com/apache/nuttx/commit/7312a553bbc40f3771c5d53ccded89bed7391f2a
> >
> > It released the CPU but traded that for potentially large, quantized delays.
>
> Yes, I would expect the up_udelay to be in error by about 0.5 us
> (provided that the delay loop is properly calibrated).  The
> nxsig_usleep() should be in error by about 1.5 x system-timer-period plus
> context switching delays.
>
> With the default timer period of 10 ms, that would be an error of about
> 15 ms, always longer than requested.
>
> A fix would be to use the tickless mode with a timer period of about 1 us.



That would help when tickless mode is used. But what about tickful mode? I
guess the intent of 7312a553b was to avoid wasting processor cycles on busy
waiting, but if tickless isn't being used, perhaps busy waiting is
necessary here? It could choose between the two wait types at compile time
based on tickless mode.
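
As a rough illustration of that idea, a compile-time selection could look
like the C sketch below. CONFIG_SCHED_TICKLESS is the NuttX configuration
symbol for tickless mode; everything else here (the function names, the
counters and the stub bodies) is hypothetical and only stands in for
up_udelay() and nxsig_usleep() so that the fragment compiles on its own.
This is not the code from commit 7312a553b.

```c
#include <assert.h>

/* Counters so the stubs below have an observable effect (illustrative). */
int g_busy_waits = 0;
int g_sleeps = 0;

/* Hypothetical stand-in for a calibrated busy-wait such as up_udelay(). */
void busy_wait_usec(unsigned int usec)
{
  (void)usec;
  g_busy_waits++;
}

/* Hypothetical stand-in for a scheduler sleep such as nxsig_usleep(). */
void sleep_usec(unsigned int usec)
{
  (void)usec;
  g_sleeps++;
}

/* Hypothetical wrapper: pick the wait type at compile time. */
void driver_wait_usec(unsigned int usec)
{
  assert(usec > 0);

#ifdef CONFIG_SCHED_TICKLESS
  /* Tickless: the timer resolution is fine enough to yield the CPU. */
  sleep_usec(usec);
#else
  /* Tickful: a sleep is quantized to whole ticks, so spin instead. */
  busy_wait_usec(usec);
#endif
}
```

The trade-off stays the same as discussed: the busy-wait branch keeps short
delays short but burns CPU cycles, while the sleep branch frees the CPU and
relies on the timer resolution being fine enough.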

Or a bigger question: if tickless mode is "better" (longer battery life,
fewer unnecessary interrupts, more processor cycles for real work) why
aren't we always using tickless mode? Are there limitations/bugs that make
it unsuitable in some situations? Not universally supported on all
microprocessors? Other reasons?

Cheers,
Nathan

Re: SD and eMMC performance in Nuttx

Posted by Gregory Nutt <sp...@gmail.com>.
> This may be way off base, but have you tried reverting
> https://github.com/apache/nuttx/commit/7312a553bbc40f3771c5d53ccded89bed7391f2a
>
> It released the CPU but traded that for potentially large, quantized delays.

Yes, I would expect the up_udelay to be in error by about 0.5 us
(provided that the delay loop is properly calibrated).  The
nxsig_usleep() should be in error by about 1.5 x system-timer-period plus
context switching delays.

With the default timer period of 10 ms, that would be an error of about
15 ms, always longer than requested.

A fix would be to use the tickless mode with a timer period of about 1 us.
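
To illustrate the arithmetic, here is a simplified model in C of a
tick-based sleep. It assumes the request is rounded up to whole ticks and
that the wake-up then lands somewhere within one further tick, because the
call can start at any point inside the current tick. The type and function
names are hypothetical, and context switching delays are ignored; this is a
sketch of the reasoning, not the NuttX implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Shortest and longest delay the simplified model predicts (usec). */
typedef struct
{
  uint64_t min_usec;
  uint64_t max_usec;
} delay_bounds_t;

/* Hypothetical model of a tick-quantized sleep. */
delay_bounds_t tick_sleep_bounds(uint64_t requested_usec,
                                 uint64_t usec_per_tick)
{
  delay_bounds_t bounds;
  uint64_t ticks;

  assert(usec_per_tick > 0);

  /* Round the request up to a whole number of ticks. */
  ticks = (requested_usec + usec_per_tick - 1) / usec_per_tick;

  /* The phase within the current tick is unknown, so allow one more. */
  bounds.min_usec = ticks * usec_per_tick;
  bounds.max_usec = (ticks + 1) * usec_per_tick;
  return bounds;
}
```

For a 100 us request with a 10 ms tick, the model gives a delay between
10 ms and 20 ms, with a midpoint of 15 ms, i.e. 1.5 timer periods.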