util/boot/CopyMem.lha

Short: Patch CopyMem/Quick for 68060(040) v1.1
Author: matthey7 at gmail.com (Matt Hey)
Uploader: matthey7 gmail com (Matt Hey)
Type: util/boot
Version: 1.1
Requires: 68020+, Kickstart 2.04
Architecture: m68k-amigaos >= 2.04
Date: 2009-08-12
Download: http://aminet.net/util/boot/CopyMem.lha
Readme: http://aminet.net/util/boot/CopyMem.readme
Downloads: 1321
Description:

This is a small patch that replaces the CopyMem and CopyMemQuick functions
of exec.library. These functions, especially CopyMem, are called very often.
When I logged the calls to CopyMem in ram, the log used up over 100MB of
memory in about a minute.

CopyMem060 is optimized for the 68060 processor. CopyMem040 is optimized for
the 68040 processor. CopyMem0x0 should run well on a 68040 or 68060 if the
move16 instruction is buggy or caching problems exist. It should run on a
68020 or 68030 processor as well but is not optimized for them.

CopyMem060 and CopyMem040 test for at least a 68040 processor. If they can't
find one, they don't install the patch and exit with a return code of 20
(=fail).
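
The check described above can be modelled as a small sketch. This is not the
code from the included assembler source. It assumes the CPU is identified from
the AttnFlags field of ExecBase, using the bit numbers from exec/execbase.h
(AFB_68040 = 3, AFB_68060 = 7). The function names install_allowed and startup
are hypothetical.

```python
# Attention-flag bit numbers as defined in AmigaOS exec/execbase.h.
AFB_68040 = 3
AFB_68060 = 7

RETURN_OK = 0
RETURN_FAIL = 20  # the "return code of 20 (=fail)" mentioned above


def install_allowed(attn_flags: int) -> bool:
    """Return True if the flags report a 68040 or a 68060 CPU."""
    wanted = (1 << AFB_68040) | (1 << AFB_68060)
    return bool(attn_flags & wanted)


def startup(attn_flags: int) -> int:
    """Hypothetical startup logic: refuse to patch below a 68040."""
    if not install_allowed(attn_flags):
        return RETURN_FAIL
    # The real program would install the patch at this point.
    return RETURN_OK


if __name__ == "__main__":
    flags_68030 = 0b0000_0111  # 68010, 68020 and 68030 bits only
    flags_68040 = 0b0000_1111  # 68040 bit set as well
    print(startup(flags_68030), startup(flags_68040))
```

On a real Amiga the flags would be read from SysBase rather than passed in as
a plain integer.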
   
The code is based in part on Harry "Piru" Sintonen's CMQ060 which is very
good. Thanks Piru. However, I thought it tried to do too much and could be
faster with small copies. I also wanted to learn how to optimize for the
68060 and see what's possible.


Features:

Much faster than Piru's CMQ060 at small memory copies which are common.

Doesn't use movem as it's the same speed as move on the 68060 and slower on
the 68040.

Doesn't touch the MMU directly. Leaving the MMU alone is the proper behavior
when using Thomas Richter's mmu.library and 68060.library, and touching it is
unnecessary anyway.

Installation is fast, doesn't fragment memory, and doesn't use much memory.

The full source code is included. Assembled with PhxAss.

Free.


Installation:

Copy your flavor of CopyMem wherever you like. It runs from Workbench, but I
recommend starting it from the cli in S:startup-sequence after the SetPatch
command. It does not detach from the shell, so the command needed is...

Run >NIL: CopyMem060

As little as 512 bytes of stack should be fine. Pressing Ctrl-C breaks
CopyMem and makes it uninstall the patch, which is dangerous because of how
exec function patching (SetFunction) works.


Notes about Move16:

move16 is an instruction of the 68040 and 68060 processors. It moves 16 bytes
at once and uses burst accesses if possible. Andreas Kleinert and Thomas
Richter said there could be problems with the move16 instruction on the
Amiga, especially in chip ram, caused by the DMA of the custom chips.
Personally, I have tested move16 with chip ram and it doesn't seem to be a
problem here. SysSpeed by Torsten Bach uses the move16 instruction in chip
ram and it's stable across a wide range of Amigas. The 68040 and 68060 user
manuals are not clear on the subject. My opinion is that if the MMU maps the
memory areas correctly, so that burst does not take place where it shouldn't,
then move16 will not use burst in those areas and will be safe to use. I have
included CopyMem040.safe and CopyMem060.safe, which use move16 in fast ram
only. They are slightly slower and larger.
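
As an illustration of when a copy routine could choose move16 at all, here is
a minimal sketch. It is not taken from the included source. It relies on the
fact that move16 works on 16-byte aligned lines, so source and destination
must share the same offset within a line. The function names, the "safe"
switch, and the rule that chip ram is everything below 0x200000 are
assumptions made for this example.

```python
LINE_SIZE = 16           # move16 transfers one 16-byte aligned line
CHIP_RAM_END = 0x200000  # assumption: chip ram occupies the first 2 MB


def in_chip_ram(addr: int) -> bool:
    """Simplified, hypothetical chip ram test based on the address alone."""
    return addr < CHIP_RAM_END


def move16_possible(src: int, dst: int, size: int, safe: bool = False) -> bool:
    """Return True if at least one full line could be copied with move16."""
    # Both pointers must reach a 16-byte boundary at the same time.
    if (src - dst) % LINE_SIZE != 0:
        return False
    # Bytes to copy with ordinary moves before the first aligned line.
    head = (-src) % LINE_SIZE
    if size - head < LINE_SIZE:
        return False
    # The ".safe" idea: avoid move16 when either area lies in chip ram.
    if safe and (in_chip_ram(src) or in_chip_ram(dst)):
        return False
    return True


if __name__ == "__main__":
    print(move16_possible(0x07800004, 0x07900004, 64))        # same offset
    print(move16_possible(0x07800004, 0x07900008, 64))        # offsets differ
    print(move16_possible(0x00010000, 0x07900000, 64, True))  # chip ram source
```

A real routine would copy the unaligned head and tail with ordinary moves and
use move16 only for the aligned middle part.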


Debugging:

I have included a Snoopy 2.0 (Aminet:util/moni/snoopy20.lha) script that
shows the values of A0 and A1 on return from CopyMem and CopyMemQuick. With
CopyMem060 only, A0 on return should equal source + size and A1 on return
should equal destination + size. Please report the values if they do not
match. Also, be careful about logging all calls (the default), as memory and
buffers will fill up *very* fast. It's best to use a filter so that only the
programs wanted are logged. The Snoopy output location is specified in the
script's icon (tooltypes). Please report all bugs to the e-mail address at
the top.
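
The register check described above is simple arithmetic and can be sketched
as follows. The function names and the sample addresses are hypothetical. The
values would come from a Snoopy log line for one call.

```python
def expected_return_registers(source: int, dest: int, size: int) -> tuple:
    """A0 and A1 expected after CopyMem060 returns, per the rule above."""
    return source + size, dest + size


def registers_ok(source: int, dest: int, size: int, a0: int, a1: int) -> bool:
    """Compare logged A0/A1 against the expected end-of-copy addresses."""
    return (a0, a1) == expected_return_registers(source, dest, size)


if __name__ == "__main__":
    # Made-up example: a 0x40 byte copy between two fast ram addresses.
    src, dst, size = 0x07800000, 0x07900000, 0x40
    print(registers_ok(src, dst, size, a0=0x07800040, a1=0x07900040))
```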


Speed comparison:

CopyMemQuicker V2.8 on Aminet:util/boot/COPMQR28.lha has a "TestIt"
program that is good for speed comparisons. Here is how you test CopyMem060
from a cli...

Run >NIL: CopyMem060
CopyMemQuicker
TestIt

and CMQ060...

CMQ060
CopyMemQuicker
TestIt

and AmigaOS 3.9 default...

CopyMemQuicker
TestIt

Some test results...

Using "TestIt" from CopyMemQuicker V2.8

orig=original AmigaOS 3.9 routines
CMQ28=CopyMemQuicker v2.8
MCP=MCP v1.46
CMQ060=CopyMemQuick v1.5
CM060=CopyMem060 v1.1

orig   CMQ28  MCP    CMQ060 CM060  Test

1.30   1.28   1.18   0.91   0.91   CM  565×64kB L->L
0.64   0.43   0.35   0.35   0.35   CM  147×64kB L->L+1
1.06   1.06   0.96   0.99   0.96   CM  413×64kB L->E
0.61   0.43   0.35   0.36   0.34   CM  147×64kB L->E+1
0.60   0.41   0.35   0.35   0.33   CM  147×64kB L+1->L
1.01   0.86   0.79   0.61   0.61   CM  382×64kB L+1->L+1
0.66   0.43   0.33   0.35   0.34   CM  147×64kB L+1->E
1.18   1.18   1.19   1.21   1.19   CM  501×64kB L+1->E+1
1.20   1.18   1.16   1.18   1.16   CM  501×64kB E->L
0.59   0.41   0.35   0.35   0.35   CM  147×64kB E->L+1
1.01   0.86   0.79   0.61   0.61   CM  382×64kB E->E
0.64   0.43   0.35   0.35   0.34   CM  147×64kB E->E+1
0.59   0.41   0.35   0.35   0.35   CM  147×64kB E+1->L
1.06   1.06   0.98   0.96   0.98   CM  413×64kB E+1->L+1
0.60   0.41   0.35   0.34   0.35   CM  147×64kB E+1->E
1.30   1.26   1.16   0.91   0.90   CM  564×64kB E+1->E+1
0.30   0.29   0.26   0.26   0.23   CM  33900×1kB L->L
0.39   0.21   0.13   0.13   0.13   CM  9400×1kB L->L+1
0.45   0.18   0.16   0.18   0.18   CM  24000×1kB E->E
0.36   0.30   0.26   0.24   0.21   CM  196000×128B L->L
0.48   0.26   0.21   0.20   0.16   CM  155000×128B E->E
0.54   0.41   0.45   0.34   0.30   CM  588000×19B L->L
0.55   0.33   0.46   0.31   0.30   CM  622000×18B L->L
0.48   0.45   0.44   0.33   0.31   CM  663000×17B L->L
0.53   0.48   0.53   0.46   0.39   CM  956000×16B L->L
0.56   0.48   0.54   0.35   0.16   CM  1060000×8B L->L
0.51   0.43   0.53   0.26   0.06   CM  1430000×4B L->L
0.45   0.41   0.83   0.38   0.18   CM  2190000×1B L->L

1.30   1.28   1.18   0.91   0.91   CMQ 565×64kB L->L
0.30   0.30   0.24   0.25   0.23   CMQ 33900×1kB L->L
0.34   0.28   0.23   0.21   0.20   CMQ 196000×128B L->L
0.46   0.43   0.46   0.36   0.28   CMQ 956000×16B L->L
0.34   0.40   0.46   0.25   0.15   CMQ 1060000×8B L->L
0.24   0.36   0.48   0.15   0.09   CMQ 1430000×4B L->L
----   ----   ----   ----   ----
22.79  19.53  18.99  15.89  14.69

14.30% speedup for CMQ v2.8
16.67% speedup for MCP v1.46
30.28% speedup for CMQ060 v1.5
35.54% speedup for CopyMem060 v1.1
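
The percentages above follow from the column totals as
(orig total - patched total) / orig total. A short sketch that reproduces
them from the totals in the table (the variable and function names are only
for illustration):

```python
ORIG_TOTAL = 22.79  # total for the original AmigaOS 3.9 routines

# Column totals taken from the last line of the table above.
PATCHED_TOTALS = {
    "CMQ v2.8": 19.53,
    "MCP v1.46": 18.99,
    "CMQ060 v1.5": 15.89,
    "CopyMem060 v1.1": 14.69,
}


def speedup_percent(orig: float, patched: float) -> float:
    """Time saved relative to the original routines, in percent."""
    return (orig - patched) / orig * 100.0


if __name__ == "__main__":
    for name, total in PATCHED_TOTALS.items():
        print(f"{speedup_percent(ORIG_TOTAL, total):.2f}% speedup for {name}")
```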

Actual results will vary, but these are typical when some of the data to be
copied is in the data cache. Testing with the data cache turned off shows the
results when none of the copy data is in the data cache. The move16
instruction has an even larger speed advantage when the data to be copied is
not in the data cache.

Results sent to me from Amigian show a larger speedup with CopyMem040 on the
68040. 


Future:

I may make a BlizKick/Remus patch for the AmigaOS 3.9 BB2 exec.library. I may
make a CPUpatch that detects the CPU and installs CPU-specific patches. I may
add an mmu.library protection option to protect the patches and possibly
speed them up more. I may make an optimized CopyMem020 and/or CopyMem030 for
the 68020 and 68030 if I find there is enough benefit and interest.


History:

1.0 (02.05.09)
 First release
1.1 (28.06.09)
 Fixed bugs in CopyMem0x0 and CopyMem060.safe
 A new CopyMem040 thanks to Amigian's testing
 A few optimizations in CopyMem060 and much smaller now
 

Thanks:

Amigian for bug and performance testing CopyMem040.
Harry "Piru" Sintonen and Dirk Busse for CMQ060.
Arthur Hagen for his TestIt program.


Contents of util/boot/CopyMem.lha
 PERMSSN    UID  GID    PACKED    SIZE  RATIO METHOD CRC     STAMP          NAME
---------- ----------- ------- ------- ------ ---------- ------------ -------------
[generic]                  193     371  52.0% -lh5- 90af Jun 28 22:59 CopyMem.script
[generic]                  956    1828  52.3% -lh5- 0c18 Aug 12 21:36 CopyMem.script.info
[generic]                  365     932  39.2% -lh5- c8aa Jun 28 23:09 CopyMem040
[generic]                 1542    8377  18.4% -lh5- cdbe Jul 25 11:48 CopyMem040.a
[generic]                  548     732  74.9% -lh5- cb9a Aug 12 21:36 CopyMem040.info
[generic]                  378     964  39.2% -lh5- 586a Jun 28 23:09 CopyMem040.safe
[generic]                  548     732  74.9% -lh5- ddab Aug 12 21:36 CopyMem040.safe.info
[generic]                  325     492  66.1% -lh5- 9fd7 Jun 28 21:12 CopyMem060
[generic]                 1445    5057  28.6% -lh5- 3dd2 Jul 25 11:47 CopyMem060.a
[generic]                  550     732  75.1% -lh5- b1f3 Aug 12 21:36 CopyMem060.info
[generic]                  346     524  66.0% -lh5- b4c2 Jun 28 21:15 CopyMem060.safe
[generic]                  548     732  74.9% -lh5- 2831 Aug 12 21:36 CopyMem060.safe.info
[generic]                  280     472  59.3% -lh5- f91a Aug 12 20:36 CopyMem0x0
[generic]                 1646    6089  27.0% -lh5- 2e80 Aug 12 20:36 CopyMem0x0.a
[generic]                  548     732  74.9% -lh5- 534c Aug 12 21:36 CopyMem0x0.info
[generic]                 2970    6965  42.6% -lh5- 79f5 Aug 12 21:30 ReadMe
[generic]                 1458    2550  57.2% -lh5- 67f3 Aug 12 21:36 ReadMe.info
---------- ----------- ------- ------- ------ ---------- ------------ -------------
 Total        17 files   14646   38281  38.3%            Aug 12 21:15
