util/rexx/stms.lha

Short:*(.dclst).xml, manipulation scripts
Author:megacz(megacz at usa.com)
Uploader:megacz(megacz usa com)
Type:util/rexx
Version:0.00
Architecture:generic
Distribution:Aminet
Date:2006-07-10
Download:http://aminet.net/util/rexx/stms.lha - View contents
Readme:http://aminet.net/util/rexx/stms.readme
Downloads:573

megacz's simple text manipulation scripts, (megacz at usa.com)

hello!

this archive contains two ARexx scripts: -extractlist- 
and -scbsc- that could be very helpful for those who
need to extract *.dclst files for some reason.

please note that these scripts will not unpack *.dclst
for u! instead u need to download such a list using DC++,
add the things that interest u to the queue[.xml], then
copy that file and process it with my scripts.

also note that DC++ can be used with Wine under linux(x86)!


-extractlist-

"template: <in_file> <out_file> <word/#?[,T]>"

[,T] allows processing of non *.xml (plain text) files.

this script will extract the full paths from a queue.xml file,
example1: 'extractlist queue.xml plain.txt #?' or
example2: 'extractlist queue.xml plain.txt .avi'
-the last one will extract only paths that contain .avi

as a result u will get a plain text file which is
smaller and more user friendly(easier to read).
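
to picture what extractlist does, here is a rough python sketch. it is
not the ARexx script from the archive, just an illustration. assumptions:
the download path sits in a Target="..." attribute (the attribute name is
a guess), matching ignores case, and the names extract_paths and
plain_text are made up for this example.

```python
import re

# assumption: each queued download carries its path in a Target="..." attribute;
# the attribute name is a guess and does not come from the readme
TARGET = re.compile(r'Target="([^"]*)"')


def extract_paths(lines, word="#?", plain_text=False):
    """Return the paths (or whole lines when plain_text is set) that contain word."""
    out = []
    for line in lines:
        if plain_text:
            # rough stand-in for the [,T] switch: treat every line as plain text
            text = line.rstrip("\n")
        else:
            match = TARGET.search(line)
            if match is None:
                continue  # line carries no download path
            text = match.group(1)
        # '#?' is the AmigaDOS match-everything pattern
        # assumption: the word match ignores case
        if word == "#?" or word.lower() in text.lower():
            out.append(text)
    return out


sample = [
    '<Download Target="music/space rock/band/song.mp3" Size="1">',
    '<Source Nick="mate"/>',
    '<Download Target="video/clip.avi" Size="2">',
]
print(extract_paths(sample, ".avi"))
```

with '#?' as the word every path is kept, with '.avi' only the second
sample entry survives.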


-scbsc-

with this one, u can cut out words/paths ure not
interested in, plus remove dupes and cut out strings
that are N words long.

"
template: <in_file> <out_file> <'char'> <'last'/'first'/num[,to]>
          [dd] [cut:n[,m]]

          char could be like this '/' or like this 'X2C(2F)'
          if u want lower case use inverted commas to enclose.
          'last' and 'first' are literal words! 0 as a num will not
          touch the strings but still allows cutting. if the [,to]
          number is passed, num becomes 'from'. [dd] removes dupes.
          [cut:n[,m]] cuts out strings that contain n words,
          if n and m r specified it will work in the range of n to m.
"

example: 'scbsc plain.txt plain.clean.txt "" last dd cut:2'
-such setup will strip last entry after the "" from each
line plus remove duplicates plus remove 2 word long strings,
note that 2 word long strings doesnt mean those separated 
by spaces, but the 'char' like: "(1st)what now?(2nd)dont know".
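
the 'last dd cut:2' combination could be pictured like this in python.
again only a sketch, not the real scbsc: the function name
strip_dedupe_cut is made up, '/' is used as the 'char', and it assumes
that the word count for cut is checked after the last entry has been
stripped (the readme does not say which order the script uses).

```python
def strip_dedupe_cut(lines, char="/", cut=None):
    """Strip the last char-separated word, drop dupes, drop lines in the cut range."""
    seen = set()
    out = []
    for line in lines:
        words = line.rstrip("\n").split(char)
        words = words[:-1]  # 'last': drop the final entry (usually the file name)
        # assumption: the cut:n[,m] word count is taken after stripping
        if cut is not None and cut[0] <= len(words) <= cut[1]:
            continue
        kept = char.join(words)
        if kept in seen:  # 'dd': skip lines already emitted
            continue
        seen.add(kept)
        out.append(kept)
    return out


sample = [
    "music/stoner/band a/album/01.mp3",
    "music/stoner/band a/album/02.mp3",
    "music/stoner/readme.txt",
]
print(strip_dedupe_cut(sample, "/", cut=(2, 2)))
```

the two mp3 lines collapse into one directory entry and the 2 word
long 'music/stoner' string is cut out.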

check out 'queue.xml.lzx' that i needed to extract,
its huge(8 mib,62268 lines) so extracting may take some time!
this file contains the filelist from some mate that
has mega big collection of rock music and lots
of stoner related stuff and i wanted to have
clean list of 'stoner' bands for further use.

play with the options on smaller files to see what this
thing is capable of.

heres how i used it for "space"(genre):

"
1. extractlist l0g:Queue.xml RAM:Queue.txt space
2. scbsc RAM:Queue.txt RAM:Queue.txt.pre "" last dd cut:1,2
3. scbsc RAM:Queue.txt.pre RAM:Queue.txt.pre_2 "" 4,30
4. extractlist RAM:Queue.txt.pre_2 RAM:Queue.txt space,t
"

explanation:

1. searching for the word 'space' in the xml strings that contain the
   download path
2. cutting out the last entry after the "", usually files(*.mp3), removing
   dupes, coz after cutting 'last' each entry will be the same as the
   previous one(directory), and snipping off entries that contain
   1 up to 2 directories in path
3. cutting prefixes before the desired directory name, in this case
   starting from the 4th dir and allowing up to 30 dirs in path
4. removing extra 'space' entries in each string that occurred in song
   names(*.mp3)
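
the num[,to] part used in step 3 could be sketched in python like this.
it is only an illustration with a made up helper name (keep_range), and
it assumes the words are numbered from 1 and that both ends of the range
are included.

```python
def keep_range(line, char="/", first=1, last=None):
    """Keep the char-separated words numbered first..last (1-based, inclusive)."""
    words = line.split(char)
    # a 'last' beyond the end of the line simply keeps everything up to the end
    return char.join(words[first - 1:last])


path = "dl/share/music/space rock/band/album"
print(keep_range(path, "/", 4, 30))
```

here the three leading dirs are cut off and the line starts at the
4th one, 'space rock'.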



bye!


~also note that these scripts arent tested too deep as
i do not need them anymore, but i thought that maybe
someone else could find them useful too, so releasing.


Contents of util/rexx/stms.lha
 PERMSSN    UID  GID    PACKED    SIZE  RATIO METHOD CRC     STAMP          NAME
---------- ----------- ------- ------- ------ ---------- ------------ -------------
[generic]                  734    1584  46.3% -lh5- 4f96 Jul  9 20:12 stms/extractlist
[generic]               511806  511806 100.0% -lh0- 9c7a Jul  9 16:49 stms/Queue.xml.lzx
[generic]                 1539    3244  47.4% -lh5- c36f Jul 10 13:30 stms/readme.txt
[generic]                 1832    4458  41.1% -lh5- 776e Jul  9 20:14 stms/scbsc
---------- ----------- ------- ------- ------ ---------- ------------ -------------
 Total         4 files  515911  521092  99.0%            Jul 10 15:41
