util/boot/CopyMemAIO.lha

Short:CopyMem 4 all incl 68080 & NATIVE x86
Author:Holger.Hippenstiel AT gmx.de
Uploader:Holger Hippenstiel nc-online de
Type:util/boot
Version:4.0
Replaces:util/boot/CopyMemAIO.lha
Architecture:m68k-amigaos >= 3.0.0
Distribution:Aminet
Kurz:CopyMem fuer alle inkl 080 & NATIV x86
Date:2020-09-17
Download:http://aminet.net/util/boot/CopyMemAIO.lha
Readme:http://aminet.net/util/boot/CopyMemAIO.readme
Downloads:563
CopyMemAIO V4.0
===============

TL;DR Install CopyMemAIO in C: and call it in the Startup-Sequence after
SetPatch. No need to start it with "Run >NIL:"; it automatically selects the
best code for your CPU.

For native CopyMem, which is about 3 times quicker, copy
winuae_dll/CopyMemAIO.alib to your WinUAE-folder/winuae_dll.
You need to enable the option "Allow native code" under "Miscellaneous".
*** Native code only works on WinUAE_x86, not on x64 !! ***
There is no need to use the 64-bit version anyway.

CopyMem() and CopyMemQuick() are two essential functions of exec.library.
They are used a lot by the operating system, so all OS functions and
programs benefit from replacing them with quicker ones.
There have been a lot of replacements to speed them up:

Feb.1993 by Arthur Hagen - 68000-68020 Copies best function to ramblock,
relies on AllocMem for alignment, uses mostly movem.l (ax)+,dn-dm/an-am
http://aminet.net/package/util/misc/CopyMemQuicker

Aug.1994 by Arthur Hagen - 68000-68020 Copies best function to ramblock,
aligns the code entries to addresses divisible by 16, JmpTable, multiple movem.l
http://aminet.net/package/util/boot/COPMQR28

Oct.1996 by Allenbrand Brice - written for 68040, works with 68060
No copying/detaching, has to be started with "Run >NIL: ...",
code alignment purely dependent on hunk loading. Single move16 in a loop.
http://aminet.net/package/util/boot/PCM_1.0

May.1999 by Dirk Busse - written for 68030, works with 68020-68060, but all
use the same function, aligns the code entries to addresses divisible by 16,
unrolled move.l (an)+,(am)+ loops.
http://aminet.net/package/util/boot/CMQ030

Jul.1999 by Dirk Busse - written for 68060, works with 68040, but both use
the same function, aligns the code entries to addresses divisible by 16,
unrolled move16 & move.l (an)+,(am)+ loops.
CMQ060 permanently checks for the <$1000000 address range (SAFE-Mode).
CMQ060Move16 does not check for the <$1000000 address range, so it is a bit
faster.
http://aminet.net/package/util/boot/CMQ060

Nov.2000 by Harry "Piru" Sintonen - written for 68060, works with 68040,
but both use the same function, aligns the code entries to addresses
divisible by 16, unrolled move16, movem.l & move.l (an)+,(am)+ loops.
Won't install on MorphOS.
http://aminet.net/package/util/boot/NewCMQ060

Aug.2009 by Matt Hey - three versions - written for 680x0, 68040 & 68060.
No copying/detaching, has to be started with "Run >NIL: ...",
code alignment purely dependent on hunk loading.
Big unrolled move16 & move.l (an)+,(am)+ loops.
Best use of cache size/burst loading.
But it has to be started with Run, and there are separate files for each
processor and for Safe-Mode.
http://aminet.net/package/util/boot/CopyMem

Aug.2020 by Holger Hippenstiel - mainly based on CopyMem by Matt Hey.
Copies best function to ramblock, aligns the code entries to addresses
divisible by 16.
No need to start it with "Run >NIL:".
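
The /16 alignment mentioned for several of these patches amounts to rounding
each entry address up to the next multiple of 16. A minimal sketch of that
arithmetic in Python, for illustration only: the function names and the
example addresses are hypothetical and are not taken from the CopyMemAIO
source.

```python
def align_up(addr: int, alignment: int = 16) -> int:
    """Round addr up to the next multiple of alignment (a power of two)."""
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a positive power of two")
    return (addr + alignment - 1) & ~(alignment - 1)


def layout_entries(base: int, sizes: list, alignment: int = 16) -> list:
    """Place routines of the given byte sizes one after another,
    aligning every entry address (hypothetical layout helper)."""
    entries = []
    addr = base
    for size in sizes:
        addr = align_up(addr, alignment)
        entries.append(addr)
        addr += size
    return entries
```

For example, align_up(0x1001) gives 0x1010, while an address that is already
a multiple of 16 is returned unchanged.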

Thanks to Jan Zahurancik, who tested the code on a Vampire V4 Standalone
and confirmed it's the best method for 68080.

I wrote a Benchmark which tests all Functions written so far and it can
test if the copymem-functions work correctly with different sizes.
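
The idea of checking a copy routine with different sizes can be sketched as
follows. This is not the code of BenchCM or MemTest, only an illustration in
Python under assumed names: verify_copy, reference_copy and the guard-byte
scheme are hypothetical. Each case copies a known pattern and then checks
both the copied bytes and the guard bytes around the destination, so a
routine that copies too little or too much is caught.

```python
def reference_copy(src, dst, src_off, dst_off, size):
    """Stand-in for a CopyMem-style routine working on byte buffers."""
    dst[dst_off:dst_off + size] = src[src_off:src_off + size]


def verify_copy(copy_fn, sizes, offsets=(0, 1, 2, 3), guard=16):
    """Return the (size, src_off, dst_off) cases in which copy_fn failed."""
    failures = []
    for size in sizes:
        for src_off in offsets:
            for dst_off in offsets:
                # Source pattern, and a destination filled with a guard value.
                src = bytearray((i * 7 + 3) & 0xFF for i in range(src_off + size))
                total = guard + dst_off + size + guard
                dst = bytearray(b"\xAA" * total)
                start = guard + dst_off
                copy_fn(src, dst, src_off, start, size)
                ok = (
                    len(dst) == total
                    and dst[start:start + size] == src[src_off:src_off + size]
                    and dst[:start] == b"\xAA" * start            # nothing written before
                    and dst[start + size:] == b"\xAA" * guard     # nothing written after
                )
                if not ok:
                    failures.append((size, src_off, dst_off))
    return failures
```

Calling verify_copy(reference_copy, [0, 1, 15, 16, 17, 1024]) returns an
empty list; a routine that drops the last byte shows up as failing cases.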

The memory-layout is the same as CopyMemQuicker, so "TestIt" will believe
CopyMemQuicker is running.

For fast emulation, the most important settings are the Advanced JIT
Settings in WinUAE:
Cache Size: 16MB
Check FPU Support
Check Constant Jump
Uncheck Hard flush
Select Direct
Check No flags
Check Catch unexpected exceptions

If you want to try "TestIt" on WinUAE with a decently fast machine,
note that it will crash with a Division by Zero. Take a look at my
http://aminet.net/package/util/boot/NoMoreDiv0 to fix that problem.

For real Amigas it can be started with the argument "S" or "SAFE";
then source and destination must be in 24-bit space for move16 operation.
The SAFE option is only needed for controllers which can only do 24-bit DMA,
like the A2091, but there is a driver patch for that:
http://aminet.net/package/driver/media/vbak2091
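
The 24-bit condition is a plain range test against $1000000 (16 MB). A small
Python sketch of that test, for illustration: the function names are
hypothetical, the choice to test the whole block rather than only its start
address is an assumption, and how CopyMemAIO acts on the result is not
modeled here.

```python
ADDR_24BIT_LIMIT = 0x1000000  # first address outside the 24-bit range


def block_in_24bit_space(start: int, length: int) -> bool:
    """True if the block [start, start + length) lies below $1000000.
    Assumption: the whole block is tested, not just the start address."""
    if start < 0 or length < 0:
        raise ValueError("start and length must be non-negative")
    return start + length <= ADDR_24BIT_LIMIT


def both_in_24bit_space(src: int, dst: int, length: int) -> bool:
    """True if source and destination blocks are both in 24-bit space."""
    return block_in_24bit_space(src, length) and block_in_24bit_space(dst, length)
```

With these definitions a block at $200000 passes, while a block at $8000000
(typical for 32-bit Fast RAM) does not.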

Starting CopyMemAIO again removes the patches.

*****************************************************************************

Update V4.0:

Dedicated CopyMemQuick for 68080.

Major rework; implemented native CopyMem for (Win)UAE.

Test results from BenchCM:

AMD Ryzen 5 3600X 4.4GHz, 3466MHz RAM
--------+----------+----------+----------+---------+---------+---------+
Testsize|      64kb|       8kb|       4kb|      2kb|      1kb|512 bytes|
--------+----------+----------+----------+---------+---------+---------+
CM0x0   |12288 MB/s| 8090 MB/s| 6500 MB/s|4575 MB/s|2883 MB/s|1611 MB/s|
CM040   |11650 MB/s| 6152 MB/s| 4231 MB/s|4670 MB/s|2640 MB/s|1601 MB/s|
CM060   |10834 MB/s| 6575 MB/s| 4868 MB/s| 663 MB/s| 560 MB/s| 518 MB/s|
CMNative|45624 MB/s|18284 MB/s|10674 MB/s|5821 MB/s|2956 MB/s|1591 MB/s|

Intel Core i7-4790k 4.4GHz, 2400MHz RAM
--------+----------+----------+----------+---------+---------+---------+
Testsize|      64kb|       8kb|       4kb|      2kb|      1kb|512 bytes|
--------+----------+----------+----------+---------+---------+---------+
CM0x0   |12862 MB/s|10615 MB/s| 8595 MB/s|6080 MB/s|3832 MB/s|2245 MB/s|
CM040   |10073 MB/s| 8527 MB/s| 5989 MB/s|5827 MB/s|3760 MB/s|2260 MB/s|
CM060   | 9249 MB/s| 9155 MB/s| 6872 MB/s| 982 MB/s| 901 MB/s| 772 MB/s|
CMNative|36806 MB/s|26455 MB/s|15666 MB/s|8474 MB/s|4561 MB/s|2313 MB/s|

Intel Core i5-2500k 4GHz, 1600MHz RAM
--------+----------+----------+----------+---------+---------+---------+
Testsize|      64kb|       8kb|       4kb|      2kb|      1kb|512 bytes|
--------+----------+----------+----------+---------+---------+---------+
CM0x0   | 8454 MB/s| 6448 MB/s| 5369 MB/s|3833 MB/s|2407 MB/s|1365 MB/s|
CM040   | 8641 MB/s| 6186 MB/s| 4297 MB/s|3851 MB/s|2413 MB/s|1368 MB/s|
CM060   | 7790 MB/s| 6556 MB/s| 4859 MB/s| 752 MB/s| 668 MB/s| 552 MB/s|
CMNative|29090 MB/s|15409 MB/s| 8808 MB/s|4671 MB/s|2389 MB/s|1221 MB/s|

Intel Celeron J3355 2GHz, 1333MHz RAM
--------+----------+----------+----------+---------+---------+---------+
Testsize|      64kb|       8kb|       4kb|      2kb|      1kb|512 bytes|
--------+----------+----------+----------+---------+---------+---------+
CM0x0   | 3086 MB/s| 2475 MB/s| 2154 MB/s|1595 MB/s|1023 MB/s| 584 MB/s|
CM040   | 3176 MB/s| 2482 MB/s| 1974 MB/s|1578 MB/s|1030 MB/s| 580 MB/s|
CM060   | 2970 MB/s| 2423 MB/s| 1981 MB/s| 548 MB/s| 459 MB/s| 345 MB/s|
CMNative| 9532 MB/s| 4751 MB/s| 3635 MB/s|1960 MB/s| 978 MB/s| 523 MB/s|
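
The ratio between CMNative and CM0x0 can be worked out directly from the
tables above. The short Python sketch below does that for the 64kb and 1kb
columns; the numbers are copied from the tables, while the dictionary layout
and function names are only illustrative.

```python
# MB/s values copied from the tables above: (CM0x0, CMNative) per test size.
RESULTS = {
    "Ryzen 5 3600X": {"64kb": (12288, 45624), "1kb": (2883, 2956)},
    "Core i7-4790k": {"64kb": (12862, 36806), "1kb": (3832, 4561)},
    "Core i5-2500k": {"64kb": (8454, 29090), "1kb": (2407, 2389)},
    "Celeron J3355": {"64kb": (3086, 9532), "1kb": (1023, 978)},
}


def speedup(cm0x0: int, native: int) -> float:
    """How many times faster the native routine is than CM0x0."""
    return native / cm0x0


def speedup_table(results: dict) -> dict:
    """Map machine -> {test size -> speedup rounded to two decimals}."""
    return {
        machine: {size: round(speedup(*pair), 2) for size, pair in sizes.items()}
        for machine, sizes in results.items()
    }


if __name__ == "__main__":
    for machine, sizes in speedup_table(RESULTS).items():
        print(machine, sizes)
```

At 64kb the ratios lie between about 2.9 and 3.7. At 1kb they are close to 1,
and on two of the four machines slightly below 1.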

As you can see, the native code is roughly 3 times faster at 64kb. Depending
on the processor, however, the overhead of calling the native code only pays
off above about 1kb copy size, so when the native code is installed, CM0x0 is
still used for copies below 1024 bytes.
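
The size cutoff described above can be pictured as a simple dispatch. This is
a sketch in Python, not the actual patch code: the function name and the
constant name are hypothetical, and treating a copy of exactly 1024 bytes as
"native" is an assumption based on the wording "below 1024 bytes".

```python
NATIVE_THRESHOLD = 1024  # bytes; smaller copies stay on the 680x0 routine


def select_routine(size: int) -> str:
    """Name of the routine that would handle a copy of the given size
    when the native code is installed (illustrative only)."""
    if size < 0:
        raise ValueError("size must be non-negative")
    if size >= NATIVE_THRESHOLD:
        return "CMNative"
    return "CM0x0"
```

So select_routine(512) yields "CM0x0" and select_routine(65536) yields
"CMNative".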

Test results for 64kb may be a bit higher than what the RAM can really do,
because the large caches on x86 come into effect. Throughput maxes out
around 1MB test size and drops back to around 95% of the 64kb speed at 4MB
test size.
This cache effect applies to the 680x0 code as well, but flushing the cache
all the time or measuring for longer isn't really worth the effort.

A lot of functions will feel smoother now: icon drawing, window dragging
and so on all use CopyMem().

Possible Arguments are now:
S=SAFE=SAFEMODE/S,V=VERBOSE/S,NN=NONATIVE/S:

SafeMode is for Amigas with a Zorro II controller that can only do 24-bit DMA.
Verbose outputs which code was installed and when it was removed.
NoNative does not use the native function, just the optimized 680x0 code.

Included MemTest from http://aminet.net/package/misc/emu/RaMithlon,
which copies different memoryblocks and checks if the code is working.

On a 4790k with WinUAE 4.4 and 68060 emulation I get:
Size  |  Iter   | No CMQ | 040  | 0x0  | Native
------+---------+--------+------+------+-------
   4kb| 1000000 |   42   |  34  |  24  |   7
  16kb|  250000 |   40   |  25  |  22  |   5
  64kb|   22500 |   14   |   7  |   8  |   2
 256kb|    1125 |    4   |   2  |   2  |   0
1024kb|     350 |    4   |   2  |   2  |   0

How to install:
Install CopyMemAIO in C: and call it in the Startup-Sequence after SetPatch,
or in User-Startup. No need to start it with "Run >NIL:"; it automatically
selects the best code for your CPU.
For native CopyMem, which is about 3 times quicker, copy
winuae_dll/CopyMemAIO.alib to your WinUAE-folder/winuae_dll.
You need to enable the option "Allow native code" under "Miscellaneous".
*** Native code only works on WinUAE_x86, not on x64 !! ***
There is no need to use the 64-bit version anyway.

    DISCLAIMER

        This software is subject to the "Standard Amiga FD-Software Copyright
        Note". It is Giftware as defined in paragraph 4g. If you like it and
        use it regularly, please send me a small gift.
        For more information please read "AFD-COPYRIGHT".

        (/pub/aminet/docs/misc/AFD-FilesV-XX.lha V=Version,XX=Languages)

    AUTHOR

        Please send comments, bug-reports or small gifts like a Vampire V4
        or a now "worthless :P" NVidia RTX 2080 Ti, or Paypal me to:

        Holger.Hippenstiel AT gmx.de
        Hauptstr. 38
        71229 Leonberg
        Germany


Contents of util/boot/CopyMemAIO.lha
PERMISSION  UID  GID    PACKED    SIZE  RATIO METHOD CRC     STAMP     NAME
---------- ----------- ------- ------- ------ ---------- ------------ ----------
[unknown]                 2898    7381  39.3% -lh5- d49c Oct 27  1999 afd-copyright
[unknown]                  841    1611  52.2% -lh5- b804 Sep 10 13:50 AFD-COPYRIGHT.info
[unknown]                 1273    2136  59.6% -lh5- b0a5 Sep 17 15:13 benchmarks/BenchCM
[unknown]                 1225    2028  60.4% -lh5- b128 Jan  6  2002 benchmarks/MemTest
[unknown]                 3308    6048  54.7% -lh5- 3eb8 Sep  9 10:54 benchmarks/TestIt
[unknown]                 1625    4100  39.6% -lh5- 2033 Sep 17 15:13 CopyMemAIO
[unknown]                 3529    9202  38.4% -lh5- aab7 Sep 17 15:26 CopyMemAIO.txt
[unknown]                 1742    5367  32.5% -lh5- 8195 Sep 17 15:13 source/BenchCM.s
[unknown]                 2453    7333  33.5% -lh5- 0916 Sep 17 14:57 source/CopyMemAIO.s
[unknown]                  804    2578  31.2% -lh5- 66dc Sep  7 12:41 source/Func_CM040.s
[unknown]                  712    1974  36.1% -lh5- 8b2e Sep  7 12:44 source/Func_CM060.s
[unknown]                  292     586  49.8% -lh5- 94e5 Sep  9 08:20 source/Func_CM080.s
[unknown]                  495    1230  40.2% -lh5- c623 Aug 25 22:23 source/Func_CM0x0.s
[unknown]                  619    1817  34.1% -lh5- afac Sep  7 12:46 source/Func_CMMOVEM.s
[unknown]                 1134    3007  37.7% -lh5- 5b52 Sep 10 10:32 source/Func_CMNative.s
[unknown]                  688    1977  34.8% -lh5- b501 Sep  7 12:37 source/Func_CMQ040.s
[unknown]                  562    1419  39.6% -lh5- f3de Sep  7 12:39 source/Func_CMQ060.s
[unknown]                  187     357  52.4% -lh5- 4af8 Sep 17 14:55 source/Func_CMQ080.s
[unknown]                  394    1495  26.4% -lh5- 0163 Aug 25 22:12 source/Func_CMQ0x0.s
[unknown]                  404    1019  39.6% -lh5- ad39 Sep  7 12:46 source/Func_CMQMOVEM.s
[unknown]                 5826   11264  51.7% -lh5- 58a0 Sep 17 15:30 winuae_dll/CopyMemAIO.alib
---------- ----------- ------- ------- ------ ---------- ------------ ----------
 Total        21 files   31011   73929  41.9%            Sep 18 02:43