AminetAminet
Search:
84757 packages online
About
Recent
Browse
Search
Upload
Setup
Services

util/boot/TLSFMem.lha

Mirror:Random
Showing: m68k-amigaos iconppc-amigaos iconppc-morphos iconi386-aros iconi386-amithlon iconppc-warpup iconppc-powerup icongeneric icon
No screenshot available
Short:TLSFMem O(1) Memory Allocator
Author:Chris Hodges
Uploader:polluks+aminet sdf lonestar org (Stefan Haubenthal)
Type:util/boot
Version:1.6 (19-Aug-08)
Architecture:m68k-amigaos
Date:2020-05-08
Requires:68020+
Download:util/boot/TLSFMem.lha - View contents
Readme:util/boot/TLSFMem.readme
Downloads:779

Introduction
~~~~~~~~~~~~
TLSFMem is an implementation of a very new memory allocation system  called
TLSF (two level segregated fit). TLSF was described in a paper by the three
spanish researchers M. Masmano, I. Ripoll, A. Crespo.

Originally designed for Realtime Operating Systems, all allocation and free
operations  run  with  constant  time  complexity  (O(1)).  This is a major
improvement over the original AmigaOS  memory  system,  which  gets  slower
while memory gets fragmented (O(m) where m is the number of fragments).

Moreover, the old AmigaOS allocator uses a first fit strategy, which causes
the  memory  to fragment pretty quickly. TLSF is an exact fit allocator for
memory blocks smaller than 512 bytes and a good fit allocator for all other
sizes:  It  will always find a free block which is always smaller than 103%
of the requested block.

***************************************************************************
TLSFMem is blindly fast and will reduce memory fragmentation significantly!
***************************************************************************

TLSFMem was written in optimized assembly language, but  more  importantly,
it  uses  these clever constant time algorithms. (There is really no use in
programs that are written in assembly, when the algorithm sucks (e.g. naïve
pattern matching algorithm, like in SearchStar vs. sub-linear Rabin-Karp).)

This software comes as freeware, but as I spent a lot of time in developing
it,  you are welcome to donate something. Audio CDs are welcome and there's
an Amazon.de wishlist aswell. No PayPal.

FAQ
~~~
Q: Does this 103% stuff mean it wastes memory (internal fragmentation)?
A: No. It only needs four extra bytes per allocation. The only internal
   fragmentation happening is for its alignment of 16 bytes per allocation.

Q: Does it run on WarpOS / PowerUP?
A: The last version V1.3 didn't due to a bug in AllocAbs. The new one
   should run under PowerUP, because according to Laire, PowerUP calls the
   exec functions. I suspect WarpOS still not to work because it's said to
   have its own PPC native implementation of the exec memory functions,
   which will cause WarpOS to run out of memory. This is due to TLSFMem
   allocating its memory from the normal memory lists and pouring each into
   TLSF mem headers, which would appear empty to the standard exec
   routines.

Q: There was another question here before. Where did this important
   question go?
A: A fascist idiot and troll blackmailed me for mentioning his realname
   here.

Q: Will it speed up any program or just those written for it in mind?
A: It will speed up all programs using the standard exec memory functions.
   It will not speed up programs that use their own allocators (but these
   are scarce).

Q: Will TLSFMem (not TLSFMemPool) also speed up Memory Pools?
A: Yes, as memory pools also depend on AllocMem() allocations when building
   new puddles. However, memory allocations within a puddle remain
   unchanged (I think exec from 3.9 BB2 uses AVL trees which have amortized
   O(log n) complexity, which is "good enough" -- but I don't know how
   exec <V45 memory pools are organized internally).

Q: So why should I use TLSFMemPool then?
A: Dunno. Maybe for that little bit of extra speed. It has a higher memory
   overhead and increased fragmentation though (no pooling).

Q: Is it really faster? How much?
A: It is not faster in a system with only a few free memory chunks (we're
   talking here about 20 and less). But the fragmentation will increase
   while the systems is running, so without TLSFMem, the performance will
   degrade. How much? You are free to run tools like FragIt in the
   background to experience the slow-down in real-time.

Q: It doesn't work here.
A: Remove patches like MemOptimizer, AllocP, AllocP32, PoolMem.
   Don't run debugging tools like Enforcer, MuForce, MuGuardianAngel,
   Mungwall, etc.
A: I've found several docs regarding a design flaw in layers.library that
   will attempt to release partial memory blocks. I didn't encounter this
   here, so it WorksForMe(TM). But TLSFMem does not have a workaround for
   layers.library like other patches might have... If you get a GURU with
   number #715F0041, this might be the problem.
A: Send a bug report. A real one. With helpful information. Please.


Requirements
~~~~~~~~~~~~
TLSFMem will run on Kick 3.0 and higher systems. It will work nicely  along
Piru's  exec44.  It  can  be  installed  in a FlashRom such as the Romulus,
Algor, or Deneb. It also can be included in a customly build Kickstart  rom
with  the Remus/RomSplit-Tool (e.g. for BlizKick). Please only use TLSFMem,
not TLSFMemPool for these cases.

If you don't want to use these features, you  can  still  just  install  it
residently or run it from your startup-sequence (or user-startup).

However, if you decide to run it  very  early  in  the  system  (Kickstart,
FlashRom,  reset-resident),  you  will have to make sure that you patch the
scsi.device, because there is a bug in  it  that  will  kill  TLSFMem  (see
below).


Usage
~~~~~
To simply try it out, just run TLSFMem or TLSFMemPool in a shell. This will
install patches to the following exec.library functions:

- AllocMem()
- FreeMem()
- AvailMem()
- AllocAbs()
- Allocate()
- Deallocate()
- AddMemList()

TLSFMemPool installs the following additional patches:

- CreatePool()
- DeletePool()
- AllocPooled()
- FreePooled()

TLSFMem cannot be started from Workbench, don't try this by  giving  it  an
icon.  When  run  from  shell, it will not output anything. If successfully
installed, an error code of 0 (OK) will returned, 5 (WARN) when  the  patch
has  already been installed or 20 (FAIL), if something went wrong. There is
no message reported back. There is no need to run it with "Run", as it will
not block.

TLSFMemPool will also patch memory pools to use the O(1) allocation system,
but  then  memory will not be pooled together into puddles. There is little
additional benefit in using these memory pool patches, as exec.library will
still  call  TLSF  functions  to  allocate  the  puddles  itself and larger
allocations. Also, TLSF pooled allocations have an  overhead  of  8  bytes,
which  could  be a waste compared to the small allocation usually happening
with memory pools. TLSFMemPool is  not  compatible  with  the  AROS/MorphOS
extension that provides Semaphore protection.

Do not run programs such as PoolMem  or  MemOptimizer  at  the  same  time.
TLSFMem cannot be run with MungWall or MuForce.

On start, TLSFMem will run through the existing  memory  headers  and  will
convert  large  chunks  of  at least 512 KB size to new TLSF memory headers
(the threshold is used to avoid adding dozens  of  memory  headers  to  the
system).  If  the memory header was still completely unused, it will eat it
alive.

TLSFMem obviously cannot be removed without resetting the system.

TLSFDebug will report some internal stats to serial debug  (or  Sashimi  if
you  have  this running). This is a standalone program, not a debug version
of TLSFMem.


Theory of Operation
~~~~~~~~~~~~~~~~~~~
I suggest you read the paper of the spanish  researchers  mentioned  above,
which  can  be  found  easily by searching for "TLSF Allocator", if you are
interested in details on how it works. In summary TLSF works by  build  two
level  bitmaps  of  size categories for free memory chunks. The first level
bitmap indicates the existence of at least one free block between  2^n  and
2^(n+1),  whereas  the  second  level  bitmap exists for each of the coarse
categories and thus splits each into another  32  smaller  categories.  The
magic  comes  from a MC68020+ instruction (bfffo) which will find the first
set bit in these bitmaps in constant time.

The complexity of the operations can be given as following:

                  Practical case   Theoretical worst case
- AllocMem()      O(1)*            (O(h))
- FreeMem()       O(1)*            (O(h))
- AvailMem()      O(1)**           (O(h*floor(t/c)))
- AllocAbs()      O(1)***          (O(m))
- Allocate()      O(1)
- Deallocate()    O(1)
- AddMemList()    O(1)

- CreatePool()    O(1)
- DeletePool()    O(n)****
- AllocPooled()   O(1)
- FreePooled()    O(1)


The  complexities  given  in  brackets   are   theoretically   worst   case
complexities,  which  will  actually  not  occur in 99,999999999999% of the
uses. The input value "h" represents the number of memory  headers  in  the
system (very few), "t" the amount of total free memory in bytes and "c" the
size of the largest chunk  of  memory.  In  practice,  all  operations  are
executed in constant time with very few CPU instructions.

* These functions iterate over the available memory headers (usually only a
few),  but  in  99,9%  of  all  memory allocation cases, the allocation and
deallocations will be satisfied by the very first memory header, which is a
TLSF  one. The number of memory headers usually doesn't change, hence these
operations can be assumed O(1).  The  original  functions  /also/  need  to
iterate over the memory headers, so this is not some drawback introduced by
TLSFMem.

** AvailMem() with MEMF_LARGEST will find the category of a memory  header,
where  the  largest  block is stored in O(1), but it theoretically needs to
iterate of the blocks inside this  category  (there  are  up  to  768  size
categories  in total). In nearly 100% of the cases, this category will hold
exactly one memory block, namely the largest one. Theoretically, there  can
be  up  to t/c such blocks (where t is the total free memory in bytes and c
is the size of the largest free memory block. It is  clear  that  t/c  will
normally  be 1, and only if memory starts running low, it could happen that
t/c raises to 2 or more. In theory.

*** AllocAbs() needs to iterate over the free chunks to  find  the  correct
one.  Hence,  worst  case  complexity  is  O(m), m being the number of free
chunks. But as it starts with the largest chunk going to smaller  ones,  it
is  very  probable  that  it  will  find  the  chunk *very* few iterations.
AllocAbs() is a very seldom operation  anyway  and  usually  only  used  by
aligned memory operations. Might be worth noticing that TLSFMem returns the
pointer of the requested block instead of the  one  actually  reserved  (as
stated in the AutoDocs), because a lot of programs seem to crash otherwise.
For freeing memory allocated by AllocAbs(), you would need to  provide  the
pointer  of the originally requested block anyway, and NOT the one returned
by AllocAbs(). It seems that most  programs  don't  care  about  this,  but
TLSFMem does not allow memory to be freed partially.

**** When patched, DeletePool() is the only  operation,  which  potentially
might   take   longer   than   the  original  one.  Because  the  currently
implementation is very  simple,  it  will  use  linear  time  to  free  the
remaining  allocated  chunks.  But  I  guess most programs will empty their
pools manually anyway, instead of calling DeletePool() on  a  fully  loaded
pool.


Benchmarks
~~~~~~~~~~
I've included the program AllocatorBenchmark for your own experiments.  The
milage  will greatly vary on the fragmentation level of your system without
TLSF. I've measured factors of 30 and more.

For example:

AllocatorBenchmark RANDOMLY
TLSFMem
AllocatorBenchmark RANDOMLY

AllocatorBenchmark will currently not measure Memory Pool performance.


scsi.device Bug
~~~~~~~~~~~~~~~
There is a bug in all known version of the scsi.device (both  IDE  and  NCR
SCSI).  The devices will allocate an IORequest structure (32 bytes big) and
then use it as IOStdReq structure (which  is  48  bytes  big),  overwriting
innocent  memory past the 32 allocated bytes. By chance, this seems to have
no noticable effect on the  standard  AmigaOS  allocator,  but  immediately
kills   TLSFMem,  as  a  vital  pointer  in  its  internal  structures  are
overwritten. Don't get me wrong: This is a bug in the scsi.device and  that
TLSFMem triggers it makes it no bug of TLSFMem.

I've included spatch diffs for the  AmigaOS  Rom  Update  files  for  three
different  versions and the corresponding IDE and SCSI devices. In case you
are unable to patch these files  (I  don't  know  if  mine  were  from  the
original boing bags, because there are IDE speedup patches and I might have
already applied them to the files), you can still manually fix the devices,
by  searching  for  the  hex string "422D xxxx 7020 6100" and replace it by
"422D xxxx 7030 6100" (there are multiple occurrences in  the  AmigaOS  Rom
Update files).

You  won't  need  to  fix  these  bugs  if  you  only  start   TLSFMem   in
startup-sequence  or user-startup. If you want to benefit from its features
very early during booting,  running  it  residently  or  from  flashrom  is
recommended.


Pros and Cons
~~~~~~~~~~~~~
+ Memory allocations are very fast O(1) and remain fast during the whole
  runtime of the system.
+ Minimal fragmentation.
+ Compatible with all system conformly written programs.
+ Option to patch the memory pool system, too.

- Currently doesn't work with WarpOS (should do with PowerUP).
- Uses four additional bytes per allocation and has a minimum chunk size of
  16 instead of 8.
- Is not compatible with various development tools such as MungWall,
  MuForce, etc.
- Is not compatible with broken programs that try to free partial memory
  blocks. This never was legal and never will be.
- Is very sensitive to programs overwriting innocent memory past allocated
  memory or using memory freed before.
- Currently does not support MEMF_REVERSE (could be added, but there is not
  much sense in this).


History
~~~~~~~
V1.4 (22-Oct-07):
  - Fixed enforcer hit (memory write to 8) reported by Bernd Rösch.
  - Fixed TLSF memory being counted multiple times, if the original mem
    header was present for AvailMem(MEMF_TOTAL).
  - Fixed a slight inaccuracy for AvailMem(MEMF_LARGEST) if there were
    multiple blocks of the same category or multiple TLSF memory headers.
  - When eating a memory header, random MEMF flags were used. Fixed.
  - When converting memory headers and the TLSF Mem header is at the end
    of the original memory header, it will shrink the original one to
    remove the overlap.
  - AllocatorBenchmark was broken and would always use MEMF_CLEAR.
  - SaferPatches, PatchControl and (probably) SetMan would crash TLSFMem
    if started before, because they would try to allocate memory while
    TLSFMem calls SetFunction to modify the allocation routine, even before
    it could store the old function pointers itself. Hence, SaferPatches
    would call the patched TLSF pointer for memory, and TLSF would call
    a NULL pointer for legacy memory routines.
    To fix this catch22, TLSFMem now also implements the very last legacy
    routines Allocate() and Deallocate() itself.
  - TLSFMem would crash when run in ROM space, because it tried to save the
    old SetFunction() pointers in its code. Now that it doesn't call legacy
    routines anymore, this is fixed. However, TLSFMemPool still needs to
    store the old function pointers -- When run as resident TLSFMemPool
    will not patch memory pool functions.
  - Fixed some glitches in CreatePool() when running out of memory.
  - MEMF_CLEAR memory clearing now happens outside of Forbid state.
  - Fixed some evil glitches in AllocAbs() which could cause aligned memory
    allocations operations to fail when memory got fragmented. This could
    also explain why PowerUP programs refused to run.
  - Enabled minimum sanity check on FreeMem() for correct size parameter.
    Will recoverably GURU with $715F0041 when mismatch is detected.

V1.3 (14-Oct-07):
  - First public release.


Contact
~~~~~~~
Any mail, comments or donations welcome:

Email: chrisly(!)platon42.de
WWW: http://www.platon42.de/
Wishlist: http://www.amazon.de/gp/registry/wishlist/1123KBN787GJQ


Contents of util/boot/TLSFMem.lha
PERMISSION  UID  GID    PACKED    SIZE  RATIO METHOD CRC     STAMP     NAME
---------- ----------- ------- ------- ------ ---------- ------------ ----------
[generic]                 2153    3220  66.9% -lh5- 75d3 Oct 28  2007 TLSFMem/AllocatorBenchmark
[generic]                  125     136  91.9% -lh5- db65 Oct 14  2007 TLSFMem/ScsiBugfix/44_26/a1000.ld.strip.patch
[generic]                  123     132  93.2% -lh5- ca77 Oct 14  2007 TLSFMem/ScsiBugfix/44_26/a300.ld.strip.patch
[generic]                  121     136  89.0% -lh5- 8aa0 Oct 14  2007 TLSFMem/ScsiBugfix/44_26/a4000.ld.strip.patch
[generic]                  120     132  90.9% -lh5- 934d Oct 14  2007 TLSFMem/ScsiBugfix/44_26/a600.ld.strip.patch
[generic]                  125     140  89.3% -lh5- 54e7 Oct 14  2007 TLSFMem/ScsiBugfix/44_26/scsidisk.ld.strip.patch
[generic]                  126     136  92.6% -lh5- 4087 Oct 14  2007 TLSFMem/ScsiBugfix/44_57/a1000.ld.strip.patch
[generic]                  123     132  93.2% -lh5- 9fc4 Oct 14  2007 TLSFMem/ScsiBugfix/44_57/a300.ld.strip.patch
[generic]                  120     136  88.2% -lh5- d4eb Oct 14  2007 TLSFMem/ScsiBugfix/44_57/a4000.ld.strip.patch
[generic]                  120     132  90.9% -lh5- 30c7 Oct 14  2007 TLSFMem/ScsiBugfix/44_57/a600.ld.strip.patch
[generic]                  125     140  89.3% -lh5- 4cf5 Oct 14  2007 TLSFMem/ScsiBugfix/44_57/scsidisk.ld.strip.patch
[generic]                  127     136  93.4% -lh5- 9488 Oct 14  2007 TLSFMem/ScsiBugfix/44_6/a1000.ld.strip.patch
[generic]                  126     132  95.5% -lh5- ac98 Oct 14  2007 TLSFMem/ScsiBugfix/44_6/a300.ld.strip.patch
[generic]                  121     136  89.0% -lh5- c100 Oct 14  2007 TLSFMem/ScsiBugfix/44_6/a4000.ld.strip.patch
[generic]                  123     132  93.2% -lh5- 4d8e Oct 14  2007 TLSFMem/ScsiBugfix/44_6/a600.ld.strip.patch
[generic]                  126     140  90.0% -lh5- e86b Oct 14  2007 TLSFMem/ScsiBugfix/44_6/scsidisk.ld.strip.patch
[generic]                  151     176  85.8% -lh5- 2b2b Oct 14  2007 TLSFMem/ScsiBugfix/AmigaOS_ROM_Update.44_26.patch
[generic]                  149     180  82.8% -lh5- 1cc7 Oct 14  2007 TLSFMem/ScsiBugfix/AmigaOS_ROM_Update.44_57.patch
[generic]                  144     172  83.7% -lh5- 94a7 Oct 14  2007 TLSFMem/ScsiBugfix/AmigaOS_ROM_Update.44_6.patch
[generic]                 1200    2376  50.5% -lh5- a7cb Oct 28  2007 TLSFMem/TLSFDebug
[generic]                 2264    3104  72.9% -lh5- 9122 Oct 28  2007 TLSFMem/TLSFMemPool
[generic]                 2011    2748  73.2% -lh5- 0f29 Aug 20  2008 TLSFMem/TLSFMem
[generic]                 6796   16150  42.1% -lh5- 3c27 Aug 30  2008 TLSFMem/TLSFMem.readme
---------- ----------- ------- ------- ------ ---------- ------------ ----------
 Total        23 files   16719   30154  55.4%            May  8 03:13
Page generated in 0.02 seconds
Aminet © 1992-2024 Urban Müller and the Aminet team. Aminet contact address: <aminetaminet net>