util/boot/NewCMQ060.lha

Short: Patch CopyMem/Quick for 68060(040) v1.5d
Author: sintonen at iki.fi (Harry "Piru" Sintonen)
Uploader: sintonen iki fi (Harry "Piru" Sintonen)
Type: util/boot
Version: 1.5d
Architecture: m68k-amigaos
Date: 2000-11-08
Requires: 68060 or 68040, Kickstart 2.04
Download: util/boot/NewCMQ060.lha
Readme: util/boot/NewCMQ060.readme
Downloads: 1740

Description:
   This is a small patch which replaces the CopyMem and CopyMemQuick
   functions of exec.library.

   These functions are optimized for the 68060 processor. They should
   also work with the 68040 processor; however, they might not be the
   fastest possible for the 68040.

   The patch tests for a 68040 or 060 processor. If it can't find one,
   it doesn't install the patch and exits with a return code of 20 (=fail).
   It also fails if it can't allocate the necessary memory. If the MorphOS
   PPC kernel is running, it won't install the patch and will exit with a
   return code of 5 (=warn).
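
   The checks above can be sketched roughly as follows. This is only an
   illustration in Python, not the code from CMQ060.ASM: the function
   name, the order of the checks and the success code of 0 are
   assumptions; only the codes 20 (=fail) and 5 (=warn) come from this
   readme.

```python
# Return codes as quoted in the readme; RETURN_OK = 0 is an assumption.
RETURN_OK = 0
RETURN_WARN = 5
RETURN_FAIL = 20


def install_decision(cpu, morphos_running, alloc_ok):
    """Return (exit_code, installed) for a hypothetical startup check.

    The order of the three checks is assumed, not taken from the source.
    """
    if cpu not in ("68040", "68060"):
        return RETURN_FAIL, False   # no suitable CPU found
    if morphos_running:
        return RETURN_WARN, False   # MorphOS PPC kernel: do not patch
    if not alloc_ok:
        return RETURN_FAIL, False   # could not allocate memory
    return RETURN_OK, True          # patch installed
```

   For example, install_decision("68060", True, True) yields the warn
   code without installing anything.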

   If the CPU is a 68040, CMQ060 installs a slightly improved version
   of the v1.4 routines. If the CPU is a 68060, routines with a new
   movem-loop are picked instead. Note that due to these movem copy
   loops, v1.5 is slightly slower in chipmem copies than v1.4. However,
   fast->fast copies are sped up, so I don't consider this a problem,
   especially since most copies are fast->fast.

   On average (measured with "TestIt" from CopyMemQuicker V2.8) these
   routines are 29.4% faster than the Kickstart 3.1 ones. CMQ060 v1.5 is
   on average 2.5% faster than CMQ060 v1.4.

   The full source code is included. The source code was compiled with
   GenAm 3.14; it also compiles with PhxAss.

Installation:
   Just copy CMQ060 into c:
   and insert CMQ060 in your s:startup-sequence.

Some notes about Move16:
   Move16 is a new instruction of the 68040 and 060 processors. It
   moves 16 bytes at once and uses burst accesses. Andreas Kleinert and
   Thomas Richter said there could be problems with the Move16 instruction
   on the Amiga, especially in chipram, caused by the DMA of the custom
   chips.

   So v1.5 of CMQ060 doesn't use Move16 from or into memory below
   $01000000 (Chipram, ZorroII-Fastram, I/O-Space, Kickstart,...). Move16
   is only used when the source and destination addresses are both higher
   than $00ffffff (32-bit fastram).
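
   The address rule can be written down as a small check. This is a
   sketch in Python, not the routine from the source: the function and
   constant names are made up for illustration, and it only looks at
   the start addresses, which is an assumption about how the test is
   done.

```python
# First address above the 24-bit space ($00ffffff + 1).
MOVE16_MIN_ADDRESS = 0x01000000


def may_use_move16(src, dst):
    """True only if both addresses lie at or above $01000000.

    Hypothetical helper; only start addresses are examined here.
    """
    return src >= MOVE16_MIN_ADDRESS and dst >= MOVE16_MIN_ADDRESS
```

   With this rule a copy from chipram at $00001000 to 32-bit fastram at
   $08000000 is rejected for Move16, while a copy between $08000000 and
   $08100000 is allowed.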

   (If you didn't get any errors with V1.3 and want to get the most speed
    improvement, you could use CMQ060_Move16. This version also uses
    Move16 below $01000000, but you might get problems.

    If you want to avoid all problems which Move16 could cause [the 68040
    has some Move16 bugs], you should use Aminet:util/boot/CMQ030. This
    one never uses Move16 and is still faster than the other available
    patches.)

Some notes about the movem bug:
   Some CPU cards have a bug in the bus controller, and these cards fail
   to perform movem properly with odd addresses. CMQ060 v1.5 autodetects
   such cards and uses the move-loop instead of the movem-loop with them.
   If the move-loop is picked, performance drops slightly compared to the
   movem-loop. Fortunately such defective cards are rare. Special thanks
   to Harald Frank, who patiently explained the bug to me and gave me the
   idea of how to autodetect it.
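
   Put together with the CPU check, the choice of copy loop can be
   summarised as below. This is an illustrative sketch in Python with a
   made-up function name. It takes the result of the autodetection as
   an input and does not show how the bug is detected, since this
   readme doesn't describe that.

```python
def pick_copy_loop(cpu, movem_bug_detected):
    """Return the name of the copy loop for a given CPU (hypothetical helper).

    The movem-loop is used only on a 68060 whose card does not show
    the movem bus controller bug; everything else gets the move-loop.
    """
    if cpu == "68060" and not movem_bug_detected:
        return "movem-loop"
    return "move-loop"
```

   A 68040 always ends up with the move-loop here, which matches the
   1.5b entry in the history.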

Version 1.5 author:
   Harry "Piru" Sintonen
   <sintonen@iki.fi>

Original CMQ060 author:
   Dirk Busse
   Kropsburgstraße 8
   D-67141 Neuhofen
   Germany
   <dbusse@primus-online.de>
   <100.141999@germanynet.de>

Speed comparison:
   There are some similar patches available on the Aminet:
      CopyMemQuicker V2.8 from 1994 -> Aminet:util/boot/COPMQR28.lha
      PCM V1.0            from 1996 -> Aminet:util/boot/PCM_1.0.lha
      Also MCP patches these functions.

Here are some test results. All results were measured on the same AMIGA
1200 with a phase5 Blizzard PPC with 060 @ 50MHz. Blizzard PPC memory
speed setting for M68K was set to fastest possible.

The most surprising result is that PCM V1.0 is on average *slower* than
the original Kickstart 3.1 routines!

"TestIt" from CopyMemQuicker V2.8
                   orig    COPMQR   MCP   PCM  CMQ030 CMQ060 CMQ060 CMQ060
                   KS 3.1   V2.8  V1.33b1 V1.0  V1.1   V1.4   V1.5  Move16
                  routines                                            V1.5
CopyMem
565×64kB L->L       2.04    2.08   1.92   1.56   1.91   1.52   1.51   1.51
147×64kB L->L+1     0.94    0.68   0.57   0.68   0.56   0.57   0.56   0.56
413×64kB L->E       1.66    1.70   1.61   1.91   1.57   1.61   1.59   1.59
147×64kB L->E+1     0.94    0.68   0.57   0.68   0.56   0.57   0.56   0.56
147×64kB L+1->L     0.94    0.67   0.57   0.60   0.56   0.57   0.55   0.56
382×64kB L+1->L+1   1.62    1.39   1.29   1.05   1.30   1.03   1.02   1.02
147×64kB L+1->E     0.94    0.68   0.57   0.69   0.57   0.57   0.56   0.56
501×64kB L+1->E+1   1.91    1.89   1.95   2.34   1.96   1.96   1.93   1.93
501×64kB E->L       1.92    1.92   1.94   2.06   1.92   1.95   1.90   1.90
147×64kB E->L+1     0.94    0.67   0.57   0.68   0.57   0.57   0.55   0.55
382×64kB E->E       1.62    1.39   1.29   1.06   1.30   1.03   1.02   1.02
147×64kB E->E+1     0.94    0.68   0.57   0.68   0.57   0.57   0.56   0.56
147×64kB E+1->L     0.94    0.67   0.57   0.60   0.56   0.57   0.55   0.56
413×64kB E+1->L+1   1.71    1.70   1.60   1.93   1.61   1.60   1.56   1.56
147×64kB E+1->E     0.94    0.67   0.57   0.69   0.57   0.57   0.55   0.55
564×64kB E+1->E+1   2.10    2.06   1.91   1.56   1.92   1.52   1.50   1.50
33900×1kB L->L      0.43    0.42   0.37   1.49   0.36   0.36   0.36   0.36
9400×1kB L->L+1     0.58    0.33   0.20   0.24   0.20   0.19   0.19   0.19
24000×1kB E->E      0.68    0.30   0.26   1.01   0.27   0.26   0.26   0.26
196000×128B L->L    0.55    0.45   0.41   1.12   0.32   0.35   0.33   0.33
155000×128B E->E    0.75    0.40   0.34   1.10   0.34   0.30   0.30   0.31
588000×19B L->L     0.85    0.61   0.72   0.93   0.53   0.53   0.53   0.53
622000×18B L->L     0.86    0.51   0.71   0.89   0.51   0.50   0.50   0.51
663000×17B L->L     0.75    0.68   0.76   0.80   0.51   0.53   0.53   0.55
956000×16B L->L     0.82    0.71   1.04   1.05   0.59   0.72   0.55   0.55
1060000×8B L->L     0.85    0.72   0.89   1.03   0.62   0.53   0.55   0.55
1430000×4B L->L     0.80    0.63   0.94   1.12   0.71   0.45   0.45   0.48
2190000×1B L->L     0.74    0.61   1.40   0.88   0.44   0.66   0.66   0.70
CopyMemQuick
565×64kB L->L       2.04    2.06   1.91   1.56   1.91   1.52   1.51   1.51
33900×1kB L->L      0.43    0.43   0.37   1.27   0.36   0.36   0.35   0.35
196000×128B L->L    0.53    0.43   0.38   1.09   0.31   0.32   0.30   0.30
956000×16B L->L     0.73    0.63   0.94   1.06   0.42   0.58   0.42   0.42
1060000×8B L->L     0.53    0.57   0.80   0.63   0.44   0.42   0.42   0.42
1430000×4B L->L     0.43    0.51   0.80   0.60   0.31   0.28   0.28   0.31
Total              35.63   30.70  31.48  36.84  27.31  25.80  25.16  25.31
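
The percentages quoted in the description can be recomputed from the
Total row. The sketch below is in Python and assumes the figures are
elapsed times, so that a lower total is better; the dictionary keys and
the function name are made up for illustration.

```python
# Totals copied from the "Total" row of the table above.
totals = {
    "KS 3.1": 35.63,
    "COPMQR V2.8": 30.70,
    "MCP V1.33b1": 31.48,
    "PCM V1.0": 36.84,
    "CMQ030 V1.1": 27.31,
    "CMQ060 V1.4": 25.80,
    "CMQ060 V1.5": 25.16,
    "CMQ060 Move16 V1.5": 25.31,
}


def percent_less_time(baseline, candidate):
    """Percentage by which candidate's total is below baseline's total."""
    saved = totals[baseline] - totals[candidate]
    return round(saved / totals[baseline] * 100, 1)
```

Here percent_less_time("KS 3.1", "CMQ060 V1.5") gives 29.4 and
percent_less_time("CMQ060 V1.4", "CMQ060 V1.5") gives 2.5. A negative
value, as for PCM V1.0 against KS 3.1, means the candidate is slower.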

History:
   1.0 (12.Sep.1998)
       - First public version.
   1.1 (15.Sep.1998)
       - V1.0 exited with a return code of 10 (=error) if it couldn't
         find a 68040 or 68060 or couldn't get the necessary memory.
         V1.1 exits in these cases with a return code of 20 (=fail).
       - Fixed a mistake in the readme.
   1.1b (19.Sep.1998)
       (I didn't change the patch itself! It's the same as V1.1)
       - Added the test results of MCP V1.30 to the readme.
       - Added CMQ060beep and CMQ060beepCMQ (see above).
   1.2 (29.Nov.1998)
       - Added the test results of MCP V1.32b12 to the readme.
       - Changed the source code.
         There was a problem with an incorrectly written program which
         expects the address of the last source byte +1 in A0 and the
         address of the last destination byte +1 in A1.
         This version of CMQ060 solves problems with such badly written
         programs. It's now 100 bytes longer, but the speed is the same.
         Big moves by the CopyMem function will be one or two cycles
         faster, but you won't notice it.
   1.3 (5.Jan.1999)
       None of the changes made in this version affect the speed. They
       are only there to avoid problems with future versions of AMIGA OS.
       - changed the version string to the "standard" format
       - changed BMI to BCS and BPL to BCC
         -> now CMQ030 could move blocks bigger than 2 GigaByte ;-)
   1.4 (3.Apr.1999)
       - CMQ060 now doesn't use Move16 into/from memory below $01000000
       - added CMQ060Move16 (It's the same as CMQ060 V1.3)
       - added the test results of CMQ030 (which never uses Move16)
   1.5 (5.Sep.2000)
       - Totally rewrote the source code.
       - Bugfix: Fixed a major bug in the patch init: if the memory was
         allocated near a 64k boundary, CMQ060 trashed innocent memory
         and crashed the system completely. Odds were 1/8192 for this to
         happen.
       - Speedup: Removed two pipeline stalls from big copies.
       - Speedup: Optimized non-move16 copy loop, now it uses movem.l
         instead of move.l. Slightly slower in chipmem copies, however
         fast -> fast copies are sped up.
       - Speedup: Unrolled the bigcopy-loops to do 256 bytes per
         iteration.
       - Added MorphOS check; it makes no sense to slow down MorphOS
         with m68k patches.
       - Redid all speedtests, MCP test with 1.33b1. Added V1.4 result
         for reference. Cleaned up this readme.
   1.5b (6.Sep.2000)
       - With the 68040 the move-loop is faster than the movem-loop. So,
         now always pick the move-loop for 68040. Thanks to Chip for
         benchmark results.
       - Added autodetect for movem buscontroller bug. Now automagically
         pick between movem- and move-loop on 68060.
       - Fixed Kickstart requirement, 68040 wasn't officially supported
         before Kickstart 2.04.
   1.5c (7.Sep.2000)
       - Bugfix: movem buscontroller bug autodetect was itself buggy. Fixed.
       - Made the source compile with PhxAss.
   1.5d (11.Sep.2000)
       - Bugfix: movem buscontroller bug autodetect still had a potential
         problem. Fixed.
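
   The BMI to BCS change in 1.3 is about signed versus unsigned tests
   on a 32-bit length. The sketch below models the N and C flags of a
   68k 32-bit subtract in Python to show the difference. It is not the
   code from the source: the helper names are made up, and it assumes
   the copy loop subtracts a chunk size from the remaining length and
   then branches on the result.

```python
MASK32 = 0xFFFFFFFF


def sub_l_flags(dest, src):
    """Model result, N flag and C flag of a 32-bit subtract (dest - src)."""
    result = (dest - src) & MASK32
    negative = bool(result & 0x80000000)  # N: bit 31 of the result
    carry = src > dest                    # C: unsigned borrow occurred
    return result, negative, carry


def loop_ends_bmi(remaining, chunk):
    """Signed test: stop when the result looks negative (BMI taken)."""
    return sub_l_flags(remaining, chunk)[1]


def loop_ends_bcs(remaining, chunk):
    """Unsigned test: stop only on a real borrow (BCS taken)."""
    return sub_l_flags(remaining, chunk)[2]
```

   For small lengths both tests agree. With a remaining length of
   $90000000 and a chunk of 256 bytes, loop_ends_bmi reports the end at
   once because bit 31 is set, while loop_ends_bcs keeps going, which
   is why lengths of 2 GigaByte and more only work with the unsigned
   test.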


Contents of util/boot/NewCMQ060.lha
 PERMSSN    UID  GID    PACKED    SIZE  RATIO     CRC       STAMP          NAME
---------- ----------- ------- ------- ------ ---------- ------------ -------------
[generic]                 3788    9930  38.1% -lh5- c224 Sep 11  2000 NewCMQ060.readme
[generic]                  909    1772  51.3% -lh5- 9474 Sep 11  2000 NewCMQ060/CMQ060
[generic]                  904    1712  52.8% -lh5- 1d7d Sep 11  2000 NewCMQ060/CMQ060_Move16
[generic]                 5552   24272  22.9% -lh5- 43a6 Sep 11  2000 NewCMQ060/src/CMQ060.ASM
---------- ----------- ------- ------- ------ ---------- ------------ -------------
 Total         4 files   11153   37686  29.6%            Nov  8  2000