 =====================
 NuttX Protected Build
 =====================

 .. warning::
     Migrated from :
     https://cwiki.apache.org/confluence/display/NUTTX/NuttX+Protected+Build

 The Traditional "Flat" Build
 ============================

 The traditional NuttX build is a "flat" build. By flat, I mean that when
 you build NuttX, you end up with a single "blob" called ``nuttx``. All of the
 components of the build reside in the same address space. All components
 of the build can access all other components of the build.

 The "Two Pass" Protected Build
 ==============================

 The NuttX protected build, on the other hand, is a "two-pass" build and
 generates two "blobs": (1) a separately compiled and linked `kernel` blob
 called, again, ``nuttx`` and (2) a separately compiled and linked `user` blob
 called ``nuttx_user.elf`` (in the existing build configurations). The user
 blob is created on pass 1 and the kernel blob is created on pass 2.

 These two make commands are equivalent:

 .. code-block:: bash

     make
     make pass1 pass2

 But the second is clearer and I prefer to use it for the protected build.
 In the second case, the user and kernel blobs are built separately; in the
 first, the kernel and user blob builds may be intermixed and somewhat
 confusing. You can also build the kernel and user blobs separately with
 one of the following commands:

 .. code-block:: bash

     make pass1
     make pass2

 At the end of the build, there will be several files in the top-level NuttX build directory. From Pass 1:

 * ``nuttx_user.elf``. The pass1 user-space ELF file
 * ``nuttx_user.hex``. The pass1 Intel HEX format file (selected in ``defconfig``)
 * ``User.map``. Symbols in the user-space ELF file

 From Pass 2:

 * ``nuttx``. The pass2 kernel-space ELF file
 * ``nuttx.hex``. The pass2 Intel HEX file (selected in ``defconfig``)
 * ``System.map``. Symbols in the kernel-space ELF file

 The Memory Protection Unit
 ==========================

 If the MCU supports a Memory Protection Unit (MPU), then all of the logic
 within the kernel blob executes in kernel-mode, i.e., with all privileges.
 These privileged threads can access all memory, all CPU instructions,
 and all MCU registers. The logic within the user-mode blob, on the other
 hand, executes in user-mode with certain restrictions as enforced by the
 MCU and by the MPU. The MCU may restrict access to certain registers and
 machine instructions; with the MPU, all access to kernel memory resources
 from the user logic is prohibited. This includes the kernel blob's FLASH,
 .bss/.data storage, and the kernel heap memory.

 Advantages of the Protected Build
 =================================

 The advantages of such a protected build are (1) security and (2)
 modularity. Since the kernel resources are protected, it will be much
 less likely that a misbehaving task will crash the system or that a
 wild pointer access will corrupt critical memory. This security also
 provides a safer environment in which to execute 3rd party software
 and prevents "snooping" into the kernel memory from the hosted applications.

 Modularity is assured because there is a strict control of the exposed
 kernel interfaces. In the flat build, all symbols are exposed and there
 is no enforcement of a kernel API. With the protected build, on the
 other hand, all interactions with the kernel from the user application
 logic must use `system calls` (or `syscalls`) to interface with the OS. A
 system call is necessary to transition from user-mode to kernel-mode;
 all user-space operating system interfaces are via syscall `proxies`.
 Then, while in kernel mode, the kernel system call handler will
 perform the OS service requested by the application. At the
 conclusion of system call processing, user privileges are restored
 and control is returned to the user application. Since the only
 interactions with the kernel can be through supported system calls,
 modularity of the OS is guaranteed.

 User-Space Proxies/Kernel-Space Stubs
 =====================================

 The same OS interfaces are exposed to the application in both the "flat"
 build and the protected build. The difference is that in the protected
 build, the user-code interfaces with a `proxy` for the OS function. For
 example, here is a proxy for the OS ``getpid()`` interface:

 .. code-block:: c

     #include <unistd.h>
     #include <syscall.h>
     pid_t getpid(void)
     {
         return (pid_t)sys_call0(SYS_getpid);
     }

 Thus the ``getpid()`` proxy is a stand-in for the real OS ``getpid()`` interface
 that executes a system call so the kernel code can perform the real
 ``getpid()`` operation on behalf of the user application. Proxies are
 auto-generated for all exported OS interfaces using the CSV file
 ``syscall/syscall.csv`` and the program ``tools/mksyscalls``. Similarly,
 on the kernel-side, there are auto-generated `stubs` that map the
 system calls back into real OS calls. These, however, are internal
 to the OS and the implementation may be architecture-specific.
 See the ``README.txt`` files in those directories for further information.

 Combining Intel HEX Files
 =========================

 One issue that you may face is that the two-pass build creates two
 FLASH images. Some debuggers that I use will allow me to write each
 image to FLASH separately. Others will expect to have a single Intel
 HEX image. In this latter case, you may need to combine the two Intel
 HEX files into one. Here is how you can do that:

 1) The `tail` of the ``nuttx.hex`` file should look something like this
    (with my comments and spaces added):

 .. code-block:: bash

     $ tail nuttx.hex
     # 00, data records
     ...
     :10 9DC0 00 01000000000800006400020100001F0004
     :10 9DD0 00 3B005A0078009700B500D400F300110151
     :08 9DE0 00 30014E016D0100008D
     # 05, Start Linear Address Record
     :04 0000 05 0800 0419 D2
     # 01, End Of File record
     :00 0000 01 FF

 Use an editor such as vi to remove the 05 and 01 records.

 2) The `head` of the ``nuttx_user.hex`` file should look something like this
    (again with my comments and spaces added):

 .. code-block:: bash

     $ head nuttx_user.hex
     # 04, Extended Linear Address Record
     :02 0000 04 0801 F1
     # 00, data records
     :10 8000 00 BD89 01084C800108C8110208D01102087E
     :10 8010 00 0010 00201C1000201C1000203C16002026
     :10 8020 00 4D80 01085D80010869800108ED83010829
     ...

 Nothing needs to be done here. The ``nuttx_user.hex`` file should be fine.

 3) Combine the edited ``nuttx.hex`` and un-edited ``nuttx_user.hex`` files to produce
    a single combined hex file:

 .. code-block:: bash

     $ cat nuttx.hex nuttx_user.hex >combined.hex

 Then use the ``combined.hex`` file with your FLASH/JTAG tool. If you do this
 a lot, you will probably want to invest a little time to develop a tool
 to automate these steps.
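
 As a minimal sketch of such a tool, the shell function below performs the
 three steps above in one go. It is an illustration under stated assumptions,
 not part of NuttX: the function name ``combine_hex`` is hypothetical, and it
 assumes the HEX files are in their normal on-disk form, that is, without the
 spaces and comments that were added to the listings above for readability.

```shell
#!/bin/sh
# combine_hex KERNEL_HEX USER_HEX OUTPUT_HEX
#
# Hypothetical helper (not part of NuttX): joins the pass2 kernel image and
# the pass1 user image into a single Intel HEX file.
combine_hex() {
    kernel_hex=$1
    user_hex=$2
    output_hex=$3

    # An Intel HEX record has the form ":LLAAAATT...": after the colon come
    # a 2-digit length, a 4-digit address, and the 2-digit record type.
    # Delete the type 05 (Start Linear Address) and type 01 (End Of File)
    # records from the kernel image so the combined file does not end early.
    sed '/^:[0-9A-Fa-f]\{6\}0[15]/d' "$kernel_hex" > "$output_hex" || return 1

    # Append the user image unchanged; its own End Of File record
    # terminates the combined file.
    cat "$user_hex" >> "$output_hex"
}
```

 With the function defined (for example, by sourcing the script), the
 equivalent of the manual steps is
 ``combine_hex nuttx.hex nuttx_user.hex combined.hex``. The input files are
 left untouched, so the result can be checked against a hand-edited file.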

 Files and Directories
 =====================

 Here is a summary of directories and files used by the STM32F4Discovery
 protected build:

 * ``boards/arm/stm32/stm32f4discovery/configs/kostest``. This is the kernel
   mode OS test configuration. The two standard configuration files
   can be found in this directory: (1) ``defconfig`` and (2) ``Make.defs``.
 * ``boards/arm/stm32/stm32f4discovery/kernel``. This is the first pass
   build directory. The Makefile in this directory is invoked to
   produce the pass1 object (``nuttx_user.elf`` in this case). The
   second pass object is created by ``arch/arm/src/Makefile``. Also
   in this directory is the file ``userspace.c``. The user-mode blob
   contains a header that includes information needed by the kernel
   blob in order to interface with the user-code. That header is
   defined by this file.
 * ``boards/arm/stm32/stm32f4discovery/scripts``. Linker scripts for
   the kernel mode build are found in this directory. This includes
   (1) ``memory.ld`` which holds the common memory map, (2) ``user-space.ld``
   that is used for linking the pass1 user-mode blob, and (3)
   ``kernel-space.ld`` that is used for linking the pass2 kernel-mode blob.

 Alignment, Regions, and Subregions
 ==================================

 There are some important comments in the ``memory.ld``
 file that are worth duplicating here:

 "The STM32F407VG has 1024Kb of FLASH beginning at address
 0x0800:0000 and 192Kb of SRAM. SRAM is split up into three blocks:

 * "112KB of SRAM beginning at address 0x2000:0000
 * "16KB of SRAM beginning at address 0x2001:c000
 * "64KB of CCM SRAM beginning at address 0x1000:0000

 "When booting from FLASH, FLASH memory is aliased to address
 0x0000:0000 where the code expects to begin execution by jumping
 to the entry point in the 0x0800:0000 address range.

 "For MPU support, the kernel-mode NuttX section is assumed to
 be 128Kb of FLASH and 4Kb of SRAM. That is an excessive amount
 for the kernel which should fit into 64KB and, of course, can
 be optimized as needed... Allowing the additional memory does
 permit additional debug instrumentation to be added to the kernel
 space without overflowing the partition.

 "Alignment of the user space FLASH partition is also a critical
 factor: The user space FLASH partition will be spanned with a
 single region of size 2^n bytes. The alignment of the user-space
 region must be the same. As a consequence, as the user-space
 increases in size, the alignment requirement also increases.

 "This alignment requirement means that the largest user space
 FLASH region you can have will be 512KB and it would have to be
 positioned at 0x08080000. If you change this address, don't
 forget to change the ``CONFIG_NUTTX_USERSPACE`` configuration
 setting to match and to modify the check in ``kernel/userspace.c``.

 "For the same reasons, the maximum size of the SRAM mapping is
 limited to 4KB. Both of these alignment limitations could be
 reduced by using multiple MPU regions to map the FLASH/SRAM
 range or perhaps with some clever use of subregions."
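
 The alignment rule quoted above can be checked with a little shell
 arithmetic. The sketch below is only an illustration: the function name
 ``mpu_region_ok`` is hypothetical and not part of NuttX, and it tests just
 the two conditions stated in the comments (the region size is a power of
 two, and the base address is a multiple of that size).

```shell
#!/bin/sh
# mpu_region_ok BASE SIZE
#
# Hypothetical helper (not part of NuttX). Succeeds (exit status 0) when SIZE
# is a power of two and BASE is a multiple of SIZE, i.e. when a single MPU
# region of SIZE bytes can start at BASE.
mpu_region_ok() {
    base=$(( $1 ))
    size=$(( $2 ))

    [ "$size" -gt 0 ] || return 1
    # A power of two has exactly one bit set, so size & (size - 1) is zero.
    [ $(( size & (size - 1) )) -eq 0 ] || return 1
    # BASE must have no bits set below the region size.
    [ $(( base & (size - 1) )) -eq 0 ]
}

# 0x08020000 is aligned for a 128KB region but not for a 256KB region.
mpu_region_ok 0x08020000 0x20000 && echo "128KB region at 0x08020000: ok"
mpu_region_ok 0x08020000 0x40000 || echo "256KB region at 0x08020000: misaligned"
```

 This shows why the alignment requirement grows with the user-space size:
 an address that is suitable for a 128KB region may have to move when the
 region is doubled to 256KB.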

 Memory Management
 =================

 At present, there are two options for memory management in the
 NuttX protected build:

 Single User Heap
 ----------------

 By default, there is only a single user-space heap and heap
 allocator that is shared by both kernel- and user-modes.
 PROs: Simple and makes good use of the heap memory space.
 CONs: Awkward architecture and no security for kernel-mode
 allocations.

 Dual, Partitioned Heaps
 -----------------------

 Two configuration options can change this behavior:

 * ``CONFIG_MM_MULTIHEAP=y``. This changes internal memory manager interfaces
   so that multiple heaps can be supported.
 * ``CONFIG_MM_KERNEL_HEAP=y``. Uses the multi-heap capability to enable
   a kernel heap.

 If both options are defined, then two heap partitions and
 two copies of the memory allocators are built:

 One un-protected heap partition that will allocate user accessible memory
 that is shared by both the kernel- and user-space code. That allocator
 physically resides in the user address space so that it can be called
 directly by both the user- and kernel-space code. There is a header at
 the beginning of the user-space blob; the kernel-space code gets the
 address of the user-space allocator from this header.

 And another protected heap partition that will allocate protected
 memory that is only accessible from the kernel code. This allocator
 is built into the kernel blob. This separate protected heap is
 required if you want to support security features.

 NOTE: There are security issues with calling into the user space
 allocators in kernel mode. That is a security hole that could be
 exploited to gain control of the system! Instead, the kernel code
 should switch to user mode before entering the memory allocator
 stubs (perhaps via a trap). The memory allocator stubs should
 then trap to return to kernel mode (as does the signal handler now).

 The Traditional Approach
 ------------------------

 A more traditional approach would use something like the interface
 ``sbrk()``. The ``sbrk()`` function adds memory to the heap space
 allocation of the calling process. In this case, there would
 still be kernel- and user-mode instances of the memory allocators.
 Each would ``sbrk()`` as necessary to extend their heap; the pages
 allocated for the kernel-mode allocator would be protected but
 the pages allocated for the user-mode allocator would not.
 PROs: Meets all of the needs. CONs: Complex. Memory losses
 due to quantization.

 This approach works well with CPUs that have very capable
 Memory Management Units (MMUs) that can coalesce the
 sbrk-ed chunks into a contiguous, `virtual` heap region.
 Without an MMU, the sbrk-ed memory would not be
 contiguous; this would limit the sizes of allocations
 due to the physical pages.

 Many MCUs will have Memory Protection Units (MPUs) that can
 support the security features (only). However, these lower-end
 MPUs may not support sufficient mapping capability to
 support this traditional approach. The ARMv7-M MPU, for
 example, only supports eight protection regions to manage
 all FLASH and SRAM and so this approach would not be
 technically feasible for the ARMv7-M family (Cortex-M3/4).

 Comparing the "Flat" Build Configuration with the Protected Build Configuration
 ===============================================================================

 Compare, for example, the configuration
 ``boards/arm/stm32/stm32f4discovery/configs/ostest`` and the
 configuration ``boards/arm/stm32/stm32f4discovery/configs/kostest``.
 These two configurations are identical except that one builds a
 "flat" version of OS test and the other builds a kernel version
 of the OS test. See the file ``boards/arm/stm32/stm32f4discovery/README.txt``
 for more details about those configurations.

 The configurations can be compared using the ``cmpconfig`` tool:

 .. code-block:: bash

     cd tools
     make -f Makefile.host cmpconfig
     cd ..
     tools/cmpconfig boards/arm/stm32/stm32f4discovery/configs/ostest/defconfig boards/arm/stm32/stm32f4discovery/configs/kostest/defconfig

 Here is a summary of the meaning of all of the important differences in the
 configurations. This should be enough information for you to convert any
 configuration from a "flat" to a protected build:

 * ``CONFIG_BUILD_2PASS=y``. This enables the two pass build.
 * ``CONFIG_BUILD_PROTECTED=y``. This option enables the "two pass"
   protected build.
 * ``CONFIG_PASS1_BUILDIR="boards/arm/stm32/stm32f4discovery/kernel"``.
   This tells the build system the (relative) location of the pass1 build directory.
 * ``CONFIG_PASS1_OBJECT=""``. In some "two pass" build configurations,
   the build system needs to know the name of the first pass object.
   This setting is not used for the protected build.
 * ``CONFIG_NUTTX_USERSPACE=0x08020000``. This is the expected location
   where the user-mode blob will be located. The user-mode blob
   contains a header that includes information needed by the kernel
   blob in order to interface with the user-code. That header will
   be expected to reside at this location.
 * ``CONFIG_PASS1_TARGET="all"``. This is the build target to use for
   invoking the pass1 make.
 * ``CONFIG_MM_MULTIHEAP=y``. This changes internal memory manager
   interfaces so that multiple heaps can be supported.
 * ``CONFIG_MM_KERNEL_HEAP=y``. NuttX supports the option of using a
   single user-accessible heap or, if this option is defined,
   two heaps: (1) one that will allocate user accessible memory
   that is shared by both the kernel- and user-space code, and
   (2) one that will allocate protected memory that is only
   accessible from the kernel code. Separate heap memory is required
   if you want to support security features.
 * ``CONFIG_MM_KERNEL_HEAPSIZE=8192``. This determines an approximate
   size for the kernel heap. The standard heap space is partitioned
   into a kernel- and user-heap space. The size of the kernel heap
   is only approximate because the user heap is subject to stringent
   alignment requirements. Because of the alignment requirements, the
   actual size of the kernel heap could be considerably larger than this.
 * ``CONFIG_BOARD_EARLY_INITIALIZE=y``. This setting enables a special,
   `early` initialization call to initialize board-specific resources.
 * ``CONFIG_BOARD_LATE_INITIALIZE=y``. This setting enables a special
   initialization call to initialize `late` board-specific resources.
   The difference between ``CONFIG_BOARD_EARLY_INITIALIZE`` and
   ``CONFIG_BOARD_LATE_INITIALIZE`` is that the ``CONFIG_BOARD_EARLY_INITIALIZE``
   logic runs earlier in initialization before the full operating
   system is up and running. ``CONFIG_BOARD_LATE_INITIALIZE``, on the
   other hand, runs at the completion of initialization, just before
   the user applications are started. Neither ``CONFIG_BOARD_EARLY_INITIALIZE``
   nor ``CONFIG_BOARD_LATE_INITIALIZE`` is used in the OS test
   configuration but other configurations (such as NSH)
   require some application-specific initialization before
   the application can run. In the "flat" build, such initialization
   is performed as part of the application start-up sequence.
   This includes such things as initializing device drivers.
   These same initialization steps must be performed in kernel
   mode for the protected build, and ``CONFIG_BOARD_LATE_INITIALIZE``
   provides the hook for doing so.
   See ``boards/arm/stm32/stm32f4discovery/src/up_boot.c`` for an
   example of such board initialization code.
 * ``CONFIG_NSH_ARCHINITIALIZE`` is not defined. The setting
   ``CONFIG_NSH_ARCHINITIALIZE`` does not apply to the OS test
   configuration, however, this is noted here as an example
   of initialization that cannot be performed in the protected build.

 Architecture-Specific Options:

 * ``CONFIG_SYS_RESERVED=8``. The user application logic
   interfaces with the kernel blob using system calls.
   The architecture-specific logic may need to reserve a
   few system calls for its own internal use. The ARMv7-M
   architectures all require 8 reserved system calls.
 * ``CONFIG_SYS_NNEST=2``. System calls may be nested. The
   system must retain information about each nested system
   call and this setting is used to set aside resources for
   nested system calls. In the current architecture, a maximum
   nesting level of two is all that is needed.
 * ``CONFIG_ARMV7M_MPU=y``. This setting enables support for
   the ARMv7-M Memory Protection Unit (MPU). The MPU is used
   to prohibit user-mode access to kernel resources.
 * ``CONFIG_ARMV7M_MPU_NREGIONS=8``. The ARMv7-M MPU supports 8
   protection regions.

 Size Expansion
 ==============

 The protected build will, of course, result in a FLASH image that is
 larger than that of the corresponding "flat" build. How much larger?
 I don't have the numbers in hand, but you can build
 ``boards/arm/stm32/stm32f4discovery/configs/nsh`` and
 ``boards/arm/stm32/stm32f4discovery/configs/kostest`` and compare
 the resulting binaries for yourself using the ``size`` command.

 Increases in size are expected because:

 * The syscall layer is included in the protected build but not the flat
   build.
 * The kernel-side `syscall` stubs will cause all enabled OS code to be
   drawn into the build. In the flat build, only those OS interfaces
   actually called by the application will be included in the final objects.
 * The dual memory allocators will increase size.
 * Code duplication. Some code, such as the C library, will be
   duplicated in both the kernel- and user-blobs.
 * Alignment. The alignments required by the MPU logic will leave
   relatively large regions of FLASH (and perhaps RAM) unusable.

 Performance Issues
 ==================

 The only performance differences using the protected build should
 result as a consequence of the `syscalls` used to interact with the
 OS vs. the direct C calls as used in the flat build. If your
 performance is highly dependent upon high-rate OS calls, then
 this could be an issue for you. But, in the typical application,
 OS calls do not often figure into the critical performance paths.

 The `syscalls` are, ultimately, software interrupts. If the platform
 does not support prioritized, nested interrupts then the `syscall`
 execution could also delay other hardware interrupt processing.
 However, `syscall` overhead is negligible: the handler really just
 configures a return in supervisor mode and vectors to the
 `syscall` stub. Syscalls should be lightning fast and, for the typical
 real-time applications, should cause no issues.
	=====================
	NuttX Protected Build
	=====================

	.. warning::
	Migrated from :
	https://cwiki.apache.org/confluence/display/NUTTX/NuttX+Protected+Build

	The Traditional "Flat" Build
	============================

	The traditional NuttX build is a "flat" build. By flat, I mean that when
	you build NuttX, you end up with a single "blob" called ``nuttx``. All of the
	components of the build reside in the same address space. All components
	of the build can access all other components of the build.

	The "Two Pass" Protected Build
	==============================

	The NuttX protected build, on the other hand, is a "two-pass" build and
	generates two "blobs": (1) a separately compiled and linked `kernel` blob
	called, again, `nuttx` and separately compiled and linked `user` blob called
	in ``nuttx_user.elf`` (in the existing build configurations). The user blob
	is created on pass 1 and the kernel blob is created on pass2.

	These two make commands are identical:

	.. code-block:: bash

	make
	make pass1 pass2

	But the second is clearer and I prefer to use it for the protected build.
	In the second case, the user and kernel blobs are built separately; in the
	first, the kernel and user blob builds may be intermixed and somewhat
	confusing. You can also build the kernel and user blobs separately with
	one of the following commands:

	.. code-block:: bash

	make pass1
	make pass2

	At the end of the build, there will be several files in the top-level NuttX build directory. From Pass 1:

	* ``nuttx_user.elf``. The pass1 user-space ELF file
	* ``nuttx_user.hex``. The pass1 Intel HEX format file (selected in ``defconfig``)
	* ``User.map``. Symbols in the user-space ELF file

	From Pass 2:

	* ``nuttx``. The pass2 kernel-space ELF file
	* ``nuttx.hex``. The pass2 Intel HEX file (selected in ``defconfig``)
	* ``System.map``. Symbols in the kernel-space ELF file

	The Memory Protection Unit
	==========================

	If the MCU supports a Memory Protection Unit (MPU), then the logic within
	the kernel blob all execute in kernel-mode, i.e., with all privileges.
	These privileged threads can access all memory, all CPU instructions,
	and all MCU registers. The logic executing within the user-mode blob,
	on the other hand, all execute in user-mode with certain restrictions
	as enforced by the MCU and by the MPU. The MCU may restrict access to
	certain registers and machine instructions; with the MPU, access to all
	kernel memory resources are prohibited from the user logic. This includes
	the kernel blob's FLASH, .bss/.data storage, and the kernel heap memory.

	Advantages of the Protected Build
	=================================

	The advantages of such a protected build are (1) security and (2)
	modularity. Since the kernel resources are protected, it will be much
	less likely that a misbehaving task will crash the system or that a
	wild pointer access will corrupt critical memory. This security also
	provides a safer environment in which to execute 3rd party software
	and prevents "snooping" into the kernel memory from the hosted applications.

	Modularity is assured because there is a strict control of the exposed
	kernel interfaces. In the flat build, all symbols are exposed and there
	is no enforcement of a kernel API. With the protected build, on the
	other hand, all interactions with the kernel from the user application
	logic must use `system calls` (or `syscalls`) to interface with the OS. A
	system call is necessary to transition from user-mode to kernel-mode;
	all user-space operating system interfaces are via syscall `proxies`.
	Then, while in kernel mode, the kernel system call handler will
	perform the OS service requested by the application. At the
	conclusion of system processing, user-privileges are restored
	and control is return to the user application. Since the only
	interactions with the kernel can be through support system calls,
	modularity of the OS is guaranteed.

	User-Space Proxies/Kernel-Space Stubs
	=====================================

	The same OS interfaces are exposed to the application in both the "flat"
	build and the protected build. The difference is that in the protected
	build, the user-code interfaces with a `proxy` for the OS function. For
	example, here is what a proxy for the OS ``getpid()`` interface:

	.. code-block:: c

	#include <unistd.h>
	#include <syscall.h>
	pid_t getpid(void)
	{
	return (pid_t)sys_call0(SYS_getpid);
	}

	Thus the ``getpid()`` proxy is a stand-in for the real OS ``getpid()`` interface
	that executes a system call so the kernel code can perform the real
	``getpid()`` operation on behalf of the user application. Proxies are
	auto-generated for all exported OS interfaces using the CSV file
	``syscall/syscall.csv`` and the program ``tools/mksyscalls``. Similarly,
	on the kernel-side, there are auto-generated `stubs` that map the
	system calls back into real OS calls. These, however, are internal
	to the OS and the implementation may be architecture-specific.
	See the ``README.txt`` files in those directories for further information.

	Combining Intel HEX Files
	=========================

	One issue that you may face is that the two pass builds creates two
	FLASH images. Some debuggers that I use will allow me to write each
	image to FLASH separately. Others will expect to have a single Intel
	HEX image. In this latter case, you may need to combine the two Intel
	HEX files into one. Here is how you can do that:

	1) The `tail` of the ``nuttx.hex`` file should look something like this
	(with my comments and spaces added):

	.. code-block:: bash

	$ tail nuttx.hex
	# 00, data records
	...
	:10 9DC0 00 01000000000800006400020100001F0004
	:10 9DD0 00 3B005A0078009700B500D400F300110151
	:08 9DE0 00 30014E016D0100008D
	# 05, Start Linear Address Record
	:04 0000 05 0800 0419 D2
	# 01, End Of File record
	:00 0000 01 FF

	Use an editor such as vi to remove the 05 and 01 records.

	2) The `head` of the ``nuttx_user.hex`` file should look something like this
	(again with my comments and spaces added):

	.. code-block:: bash

	$ head nuttx_user.hex
	# 04, Extended Linear Address Record
	:02 0000 04 0801 F1
	# 00, data records
	:10 8000 00 BD89 01084C800108C8110208D01102087E
	:10 8010 00 0010 00201C1000201C1000203C16002026
	:10 8020 00 4D80 01085D80010869800108ED83010829
	...

	Nothing needs to be done here. The ``nuttx_user.hex`` file should be fine.

	3) Combine the edited nuttx.hex and un-edited ``nuttx_user.hex`` file to produce
	a single combined hex file:

	.. code-block:: bash

	$ cat nuttx.hex nuttx_user.hex >combined.hex

	Then use the ``combined.hex`` file with for FLASH/JTAG tool. If you do this
	a lot, you will probably want to invest a little time to develop a tool
	to automate these steps.

	Files and Directories
	=====================

	Here is a summary of directories and files used by the STM32F4Discovery
	protected build:

	* ``boards/arm/stm32/stm32f4discovery/configs/kostest``. This is the kernel
	mode OS test configuration. The two standard configuration files
	can be found in this directory: (1) ``defconfig`` and (2) ``Make.defs``.
	* ``boards/arm/stm32/stm32f4discovery/kernel``. This is the first past
	build directory. The Makefile in this directory is invoked to
	produce the pass1 object (``nuttx_user.elf`` in this case). The
	second pass object is created by ``arch/arm/src/Makefile``. Also
	in this directory is the file ``userspace.c``. The user-mode blob
	contains a header that includes information need by the kernel
	blob in order to interface with the user-code. That header is
	defined in by this file.
	* ``boards/arm/stm32/stm32f4discovery/scripts``. Linker scripts for
	the kernel mode build are found in this directory. This includes
	(1) ``memory.ld`` which hold the common memory map, (2) ``user-space.ld``
	that is used for linking the pass1 user-mode blob, and (3)
	``kernel-space.ld`` that is used for linking the pass1 kernel-mode blob.

	Alignment, Regions, and Subregions
	==================================

	There are some important comments in the ``memory.ld``
	file that are worth duplicating here:

	"The STM32F407VG has 1024Kb of FLASH beginning at address
	0x0800:0000 and 192Kb of SRAM. SRAM is split up into three blocks:

	* "112KB of SRAM beginning at address 0x2000:0000
	* "16KB of SRAM beginning at address 0x2001:c000
	* "64KB of CCM SRAM beginning at address 0x1000:0000

	"When booting from FLASH, FLASH memory is aliased to address
	0x0000:0000 where the code expects to begin execution by jumping
	to the entry point in the 0x0800:0000 address range.

	"For MPU support, the kernel-mode NuttX section is assumed to
	be 128Kb of FLASH and 4Kb of SRAM. That is an excessive amount
	for the kernel which should fit into 64KB and, of course, can
	be optimized as needed... Allowing the additional memory does
	permit addition debug instrumentation to be added to the kernel
	space without overflowing the partition.

	"Alignment of the user space FLASH partition is also a critical
	factor: The user space FLASH partition will be spanned with a
	single region of size 2\|\|n bytes. The alignment of the user-space
	region must be the same. As a consequence, as the user-space
	increases in size, the alignment requirement also increases.

	"This alignment requirement means that the largest user space
	FLASH region you can have will be 512KB at it would have to be
	positioned at 0x08800000. If you change this address, don't
	forget to change the ``CONFIG_NUTTX_USERSPACE`` configuration
	setting to match and to modify the check in ``kernel/userspace.c``.

	"For the same reasons, the maximum size of the SRAM mapping is
	limited to 4KB. Both of these alignment limitations could be
	reduced by using multiple MPU regions to map the FLASH/SDRAM
	range or perhaps with some clever use of subregions."

	Memory Management
	=================

	At present, there are two options for memory management in the
	NuttX protected build:

	Single User Heap
	----------------

	By default, there is only a single user-space heap and heap
	allocator that is shared by both kernel- and user-modes.
	PROs: Simple and makes good use of the heap memory space,
	CONs: Awkward architecture and no security for kernel-mode
	allocations.

	Dual, Partitioned Heaps
	-----------------------

	Two configuration options can change this behavior:

	* ``CONFIG_MM_MULTIHEAP=y``. This changes internal memory manager interfaces
	so that multiple heaps can be supported.
	* ``CONFIG_MM_KERNEL_HEAP=y``. Uses the multi-heap capability to enable
	a kernel heap

	If this both options are defined defined, the two heap partitions and
	two copies of the memory allocators are built:

	One un-protected heap partition that will allocate user accessible memory
	that is shared by both the kernel- and user-space code. That allocator
	physically resides in the user address space so that it can be called
	directly by both the user- and kernel-space code. There is a header at
	the beginning of the user-space blob; the kernel-space code gets
	address of the user-space allocator from this header.

	And another protected heap partition that will allocate protected
	memory that is only accessible from the kernel code. This allocator
	is built into the kernel block. This separate protected heap is
	required if you want to support security features.

	NOTE: There are security issues with calling into the user space
	allocators in kernel mode. That is a security hole that could be
	exploited to gain control of the system! Instead, the kernel code
	should switch to user mode before entering the memory allocator
	stubs (perhaps via a trap). The memory allocator stubs should
	then trap to return to kernel mode (as does the signal handler now).
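
	The header lookup described above might look roughly like the
	following sketch. The structure layout and every name in it are
	hypothetical and chosen only for illustration; the real header is
	defined by the NuttX sources. So that the sketch can run on a host,
	a static object that wraps the C library stands in for the header
	that would reside at ``CONFIG_NUTTX_USERSPACE``.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical layout of the header at the start of the user-space blob */
struct user_header
{
  void *(*alloc)(size_t size);   /* user-space allocator entry point */
  void  (*release)(void *mem);   /* user-space free entry point */
};

/* Host stand-in for the header provided by the user blob */
static const struct user_header g_user_header = { malloc, free };

/* On a target this would instead be a cast of the fixed address, e.g.
 * (const struct user_header *)CONFIG_NUTTX_USERSPACE (an assumption).
 */
static const struct user_header *user_header_get(void)
{
  return &g_user_header;
}

/* Kernel-side helper: allocate user-accessible memory via the header */
static void *kernel_alloc_user(size_t size)
{
  return user_header_get()->alloc(size);
}

/* Kernel-side helper: return memory to the user-space allocator */
static void kernel_free_user(void *mem)
{
  user_header_get()->release(mem);
}
```

	Note that this sketch calls the user-space allocator directly from
	the kernel side, which is exactly the pattern that the NOTE above
	identifies as a security concern.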

	The Traditional Approach
	------------------------

	A more traditional approach would use something like the interface
	``sbrk()``. The ``sbrk()`` function adds memory to the heap space
	allocation of the calling process. In this case, there would
	still be kernel- and user-mode instances of the memory allocators.
	Each would ``sbrk()`` as necessary to extend their heap; the pages
	allocated for the kernel-mode allocator would be protected but
	the pages allocated for the user-mode allocator would not.
	PROs: Meets all of the needs. CONs: Complex. Memory losses
	due to quantization.
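
	The quantization loss can be illustrated with a small host-side
	simulation. This is a simplified sketch, not the POSIX ``sbrk()``
	interface: ``sim_sbrk()``, the 4KB page size, and the static arena
	are all assumptions made for the example.

```c
#include <stddef.h>
#include <stdint.h>

#define SIM_PAGE_SIZE  4096u                  /* hypothetical page size */
#define SIM_ARENA_SIZE (16u * SIM_PAGE_SIZE)  /* simulated memory pool */

static uint8_t g_arena[SIM_ARENA_SIZE];  /* stands in for physical pages */
static size_t  g_brk;                    /* current break offset in bytes */

/* Extend the simulated heap by 'incr' bytes rounded up to whole pages.
 * Returns the previous break, or NULL if the arena is exhausted.
 */
static void *sim_sbrk(size_t incr)
{
  size_t rounded;
  void *prev;

  if (incr > SIM_ARENA_SIZE)
    {
      return NULL;
    }

  rounded = (incr + SIM_PAGE_SIZE - 1) / SIM_PAGE_SIZE * SIM_PAGE_SIZE;
  if (rounded > SIM_ARENA_SIZE - g_brk)
    {
      return NULL;
    }

  prev   = &g_arena[g_brk];
  g_brk += rounded;
  return prev;
}

/* Bytes lost to page quantization for a request of 'incr' bytes */
static size_t sim_quantization_loss(size_t incr)
{
  size_t rem = incr % SIM_PAGE_SIZE;
  return rem == 0 ? 0 : SIM_PAGE_SIZE - rem;
}
```

	In this model a 100 byte request still consumes a full page, so
	most of that page is unavailable to other requests until the
	allocator hands it out itself.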

	This approach works well with CPUs that have very capable
	Memory Management Units (MMUs) that can coalesce the
	sbrk-ed chunks into a contiguous, `virtual` heap region.
	Without an MMU, the sbrk-ed memory would not be
	contiguous; this would limit the size of any single allocation
	to the size of a contiguous run of physical pages.

	Many MCUs will have Memory Protection Units (MPUs) that can
	support the security features (only). However, these lower-end
	MPUs may not support sufficient mapping capability to
	support this traditional approach. The ARMv7-M MPU, for
	example, only supports eight protection regions to manage
	all FLASH and SRAM and so this approach would not be
	technically feasible for the ARMv7-M family (Cortex-M3/4).

	Comparing the "Flat" Build Configuration with the Protected Build Configuration
	===============================================================================

	Compare, for example, the configuration
	``boards/arm/stm32/stm32f4discovery/configs/ostest`` and the
	configuration ``boards/arm/stm32/stm32f4discovery/configs/kostest``.
	These two configurations are identical except that one builds a
	"flat" version of OS test and the other builds a kernel version
	of the OS test. See the file ``boards/arm/stm32/stm32f4discovery/README.txt``
	for more details about those configurations.

	The configurations can be compared using the ``cmpconfig`` tool:

	.. code-block:: bash

	    cd tools
	    make -f Makefile.host cmpconfig
	    cd ..
	    tools/cmpconfig boards/arm/stm32/stm32f4discovery/configs/ostest/defconfig boards/arm/stm32/stm32f4discovery/configs/kostest/defconfig

	Here is a summary of the meaning of all of the important differences in the
	configurations. This should be enough information for you to convert any
	configuration from a "flat" to a protected build:

	* ``CONFIG_BUILD_2PASS=y``. This enables the two pass build.
	* ``CONFIG_BUILD_PROTECTED=y``. This option enables the "two pass"
	protected build.
	* ``CONFIG_PASS1_BUILDIR="boards/arm/stm32/stm32f4discovery/kernel"``.
	This tells the build system the (relative) location of the pass1 build directory.
	* ``CONFIG_PASS1_OBJECT=""``. In some "two pass" build configurations,
	the build system needs to know the name of the first pass object.
	This setting is not used for the protected build.
	* ``CONFIG_NUTTX_USERSPACE=0x08020000``. This is the expected location
	where the user-mode blob will be located. The user-mode blob
	contains a header that includes information needed by the kernel
	blob in order to interface with the user-code. That header will
	be expected to reside at this location.
	* ``CONFIG_PASS1_TARGET="all"``. This is the build target to use for
	invoking the pass1 make.
	* ``CONFIG_MM_MULTIHEAP=y``. This changes internal memory manager
	interfaces so that multiple heaps can be supported.
	* ``CONFIG_MM_KERNEL_HEAP=y``. NuttX supports the option of using a
	single user-accessible heap or, if this option is defined,
	two heaps: (1) one that will allocate user accessible memory
	that is shared by both the kernel- and user-space code, and
	(2) one that will allocate protected memory that is only
	accessible from the kernel code. Separate heap memory is required
	if you want to support security features.
	* ``CONFIG_MM_KERNEL_HEAPSIZE=8192``. This determines an approximate
	size for the kernel heap. The standard heap space is partitioned
	into a kernel- and user-heap space. The size of the kernel heap
	is only approximate because the user heap is subject to stringent
	alignment requirements. Because of the alignment requirements, the
	actual size of the kernel heap could be considerably larger than this.
	* ``CONFIG_BOARD_EARLY_INITIALIZE=y``. This setting enables a special,
	`early` initialization call to initialize board-specific resources.
	* ``CONFIG_BOARD_LATE_INITIALIZE=y``. This setting enables a special
	initialization call to initialize `late` board-specific resources.
	The difference between ``CONFIG_BOARD_EARLY_INITIALIZE`` and
	``CONFIG_BOARD_LATE_INITIALIZE`` is that the ``CONFIG_BOARD_EARLY_INITIALIZE``
	logic runs earlier in initialization before the full operating
	system is up and running. ``CONFIG_BOARD_LATE_INITIALIZE``, on the
	other hand, runs at the completion of initialization, just before
	the user applications are started. Neither ``CONFIG_BOARD_EARLY_INITIALIZE``
	nor ``CONFIG_BOARD_LATE_INITIALIZE`` is used in the OS test
	configuration, but other configurations (such as NSH)
	require some application-specific initialization before
	the application can run. In the "flat" build, such initialization
	is performed as part of the application start-up sequence.
	This includes such things as initializing device drivers.
	These same initialization steps must be performed in kernel
	mode for the protected build, and ``CONFIG_BOARD_LATE_INITIALIZE``
	provides the hook for doing so.
	See ``boards/arm/stm32/stm32f4discovery/src/up_boot.c`` for an
	example of such board initialization code.
	* ``CONFIG_NSH_ARCHINITIALIZE`` is not defined. The setting
	``CONFIG_NSH_ARCHINITIALIZE`` does not apply to the OS test
	configuration; however, it is noted here as an example
	of initialization that cannot be performed in the protected build.
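
	The remark that the kernel heap can end up considerably larger than
	``CONFIG_MM_KERNEL_HEAPSIZE`` can be made concrete with a sketch of
	one possible partitioning scheme. This is an assumption for
	illustration and not necessarily what any NuttX port does: the user
	heap is placed at the end of RAM, its size is rounded down to a
	power of two so that a single MPU region can cover it, and the
	kernel heap receives everything that is left over. All names are
	hypothetical, and the sketch assumes that ``ram_end`` is aligned to
	the resulting user heap size and that the request fits in the
	available RAM.

```c
#include <stddef.h>
#include <stdint.h>

struct heap_split
{
  uintptr_t kbase;  /* start of the kernel heap */
  size_t    ksize;  /* actual kernel heap size */
  uintptr_t ubase;  /* start of the user heap */
  size_t    usize;  /* user heap size, always a power of two */
};

/* Largest power of two that is <= n (n must be non-zero) */
static size_t pow2_floor(size_t n)
{
  size_t p = 1;

  while (p <= n / 2)
    {
      p *= 2;
    }

  return p;
}

/* Split [heap_start, ram_end) into a kernel heap of at least
 * 'kheap_request' bytes followed by a power-of-two sized user heap.
 */
static struct heap_split split_heap(uintptr_t heap_start, uintptr_t ram_end,
                                    size_t kheap_request)
{
  struct heap_split s;
  size_t avail = (size_t)(ram_end - heap_start) - kheap_request;

  s.usize = pow2_floor(avail);           /* round user heap down */
  s.ubase = ram_end - s.usize;           /* user heap ends at ram_end */
  s.kbase = heap_start;
  s.ksize = (size_t)(s.ubase - heap_start); /* kernel gets the remainder */
  return s;
}
```

	For example, with 120KB of free RAM and a request for 8192 bytes,
	this scheme yields a 64KB user heap and leaves 56KB for the kernel
	heap, far more than the requested size.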

	Architecture-Specific Options:

	* ``CONFIG_SYS_RESERVED=8``. The user application logic
	interfaces with the kernel blob using system calls.
	The architecture-specific logic may need to reserve a
	few system calls for its own internal use. The ARMv7-M
	architectures all require 8 reserved system calls.
	* ``CONFIG_SYS_NNEST=2``. System calls may be nested. The
	system must retain information about each nested system
	call and this setting is used to set aside resources for
	nested system calls. In the current architecture, a maximum
	nesting level of two is all that is needed.
	* ``CONFIG_ARMV7M_MPU=y``. This setting enables support for
	the ARMv7-M Memory Protection Unit (MPU). The MPU is used
	to prohibit user-mode access to kernel resources.
	* ``CONFIG_ARMV7M_MPU_NREGIONS=8``. The ARMv7-M MPU supports 8
	protection regions.

	Size Expansion
	==============

	The protected build will, of course, result in a FLASH image that is
	larger than that of the corresponding "flat" build. How much larger?
	I don't have the numbers in hand, but you can build
	``boards/arm/stm32/stm32f4discovery/configs/nsh`` and
	``boards/arm/stm32/stm32f4discovery/configs/kostest`` and compare
	the resulting binaries for yourself using the ``size`` command.

	Increases in size are expected because:

	* The syscall layer is included in the protected build but not the flat
	build.
	* The kernel-side `syscall` stubs will cause all enabled OS code to be
	drawn into the build. In the flat build, only those OS interfaces
	actually called by the application will be included in the final objects.
	* The dual memory allocators will increase size.
	* Code duplication. Some code, such as the C library, will be
	duplicated in both the kernel- and user-blobs.
	* Alignment. The alignments required by the MPU logic will leave
	relatively large regions of FLASH (and perhaps RAM) that are not usable.

	Performance Issues
	==================

	The only performance differences using the protected build should
	result as a consequence of the `syscalls` used to interact with the
	OS vs. the direct C calls as used in the flat build. If your
	performance is highly dependent upon high-rate OS calls, then
	this could be an issue for you. But, in the typical application,
	OS calls do not often figure into the critical performance paths.

	The `syscalls` are, ultimately, software interrupts. If the platform
	does not support prioritized, nested interrupts then the `syscall`
	execution could also delay other hardware interrupt processing.
	However, the `syscall` processing overhead is negligible: the handler
	really just configures the return to supervisor mode and vectors to the
	`syscall` stub. It should be lightning fast and, for typical
	real-time applications, should cause no issues.