[Rant] Rant: Advances in computer technology...? | ANN.lu |
Posted on 23-May-2000 14:52 GMT by Christian Kemp | 13 comments
A minor rant on why the advances in computer technology often do not matter much to the end user, because it is still impossible to accomplish simple but unusual tasks. Today's task: inject an 80 MB access.log into a database.
Computer technology has come a long way since I bought my first Amiga almost 10 years ago. Generation after generation, computers became faster, had more RAM and overall promised to get any work done faster.
Fast forward from 1990 to last Sunday. I have an old A3000 standing in a corner of a room, and below my desk there's a P450 running Windows and, less and less frequently, WinUAE. I had been planning to get my server logs from the last three months and inject them into a database, one month at a time.
Approximate size of my March server logs: 80 MB. Maximum RAM in my A3000 with WarpEngine: 64 MB. Maximum RAM in WinUAE emulation: Less than that. So no matter what I do, I have to use the PC, which is equipped with 128 MB.
So while the choice for the hardware has been made (and in fact, I already made that choice for my everyday-hardware platform a year ago), I still had to decide on the software.
The problem: there is no program available that will take an access.log and automatically inject it into a database, so that you can run queries on it. Access just plain rejects anything that is not in comma-separated-values format, and I don't have any other program that could deal with the amount of data I'd need to handle. A normal spreadsheet will not accept more than about 32000 or 64000 rows; I need half a million.
So the only solution, without writing a custom program, seems to be a text editor with macro capabilities. Should be relatively trivial, shouldn't it? Wrong. 128 MB RAM is apparently not enough to hold Windows and an 80 MB text file and still leave enough room to work in. The solution: not applying the macro to the whole document, but only to a selection at a time. The problem: even running the macro on 50,000 lines at a time takes half an hour per run. Saving the file takes five minutes, at least. Since the memory requirements are too high to multitask adequately, it's difficult to get this done. Either I sit in front of the computer even though I can't do much, or I go elsewhere and do not notice when it finishes or runs out of memory.
Even though I had an entire day and an evening, I didn't manage to run the macro ten times without screwing up the format of my server log. Since each attempt adds 30 minutes to the duration of this conversion, undoing the last modifications means losing half an hour of work. I have yet to do any queries in Access, but preliminary tests indicate that this, at least, should work.
What's the point, I hear you scream? My point is that, even though I've now seen ten years' worth of promises of better and more advanced computers, I still can't accomplish the simplest tasks in an adequate period of time.
Computers have indeed come a long way, but it sure doesn't show. While things might be flashier and overall look nicer than 5 or 10 years ago, it is still impossible to use computers for very specific tasks without getting your hands dirty by writing your own code.
I'm not sure if there will be any solution to this problem, ever. Will Amiga Inc. provide the framework for more intelligent computers? Will traditional operating systems be expanded to actually cater for specific user needs? Are scripting languages such as Perl, Rebol or VBS the solution for my kind of problem?
I'm not sure.
|
|
Rant: Advances in computer technology...? : Comment 6 of 13 | ANN.lu |
Posted by Brad Devlin on 22-May-2000 22:00 GMT | This is actually really easy to do Christian. Define your Access.log data structure as a text ODBC data source using the ODBC administrator. I would probably use VB Script and ADO to pump the data from the log file to your destination database using the add method. You could throw this script into any office document form and use a button, or script a schedule using that Windows scheduler thing (haven't used it). There are also a number of DLT (Data Load/Transform) tools available, but they would probably break the bank.
I know this is an Amiga group, but Christian is using Windows and the solution is there. |
|
Rant: Advances in computer technology...? : Comment 7 of 13 | ANN.lu |
Posted by John Block on 22-May-2000 22:00 GMT | In my view you are taking the wrong approach and need software specially designed for the job.
A free perl solution sits on your server and can be interrogated through your browser:
http://www.awsd.com/scripts/weblog/index.shtml
This extracts lots and lots of data.
If you want to spend some money, I was shown funnelweb from http://www.activeconcepts.com which will run on your PC for 219 pounds.
(Just got back from Internet World exhibition!)
John |
|
Rant: Advances in computer technology...? : Comment 1 of 13 | ANN.lu |
Posted by nOw2 on 22-May-2000 22:00 GMT | You know, SuperBase 4 on the Amiga can handle that amount of data
okay.. :)
Importing it is the tricky thing, but it shouldn't be too difficult to
configure the import filter. |
|
Rant: Advances in computer technology...? : Comment 2 of 13 | ANN.lu |
Posted by Artur Pietruk on 22-May-2000 22:00 GMT | 1. Take your A3000 and make sure there is enough room on your HD.
Install GG (GeekGadgets) with perl or something like it and write
a simple script which will output a file with inserts.
Or:
2. Install linux (or *bsd) with PostgreSQL somewhere (A3000, P450).
Write a script which will fill the database. |
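A minimal sketch of the "file with inserts" idea, written in Python rather than perl. It assumes the log is in the NCSA Common Log Format (the post never states the format), and the table name `access_log`, its column names and the function names are all made up for illustration. The file is streamed one line at a time, so the 80 MB never has to fit in RAM.

```python
import re

# ASSUMPTION: NCSA Common Log Format, e.g.
# 127.0.0.1 - - [10/Mar/2000:13:55:36 +0100] "GET /index.html HTTP/1.0" 200 2326
CLF = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)


def quote(value):
    """Quote a string as an SQL literal by doubling single quotes."""
    return "'" + value.replace("'", "''") + "'"


def line_to_insert(line, table="access_log"):
    """Return one INSERT statement for a log line, or None if it does not parse.

    The table and column names are hypothetical; adapt them to your schema.
    """
    m = CLF.match(line)
    if m is None:
        return None
    # A size of "-" means no bytes were sent; store it as NULL.
    size = "NULL" if m["size"] == "-" else m["size"]
    return (
        f"INSERT INTO {table} (host, time, request, status, size) VALUES ("
        f"{quote(m['host'])}, {quote(m['time'])}, {quote(m['request'])}, "
        f"{m['status']}, {size});"
    )


def convert(src_path, dst_path):
    """Stream src_path line by line and write one INSERT per parsed line."""
    written = 0
    with open(src_path, encoding="latin-1") as src, \
            open(dst_path, "w", encoding="latin-1") as dst:
        for line in src:
            stmt = line_to_insert(line)
            if stmt is not None:
                dst.write(stmt + "\n")
                written += 1
    return written
```

Calling `convert("access.log", "inserts.sql")` would produce a file that a database client could then execute. Lines that do not match the assumed format are silently skipped, which may or may not be what you want.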
|
Rant: Advances in computer technology...? : Comment 3 of 13 | ANN.lu |
Posted by John waller on 22-May-2000 22:00 GMT | If you do not mind writing a program, it is simple to do in Access. Access allows you to open and read a text file, line by line. So you create a table with your fields, and then parse your log one line at a time.
You can also do this with Superbase. (I have done it with both programs.)
You could also simply import into an Access table, then use a query or program to parse it into another table.
I would not advise using Superbase on the 3000, simply because of the speed. I tried processing a 25 meg database on it, and it took all night just to create an index...... The Amiga is nice, but, by today's standards it is slo-o-o-o-w :( |
|
Rant: Advances in computer technology...? : Comment 4 of 13 | ANN.lu |
Posted by jools on 22-May-2000 22:00 GMT | well, with the use of a little virtual
memory it really should be no problem. i have handled
vast amounts of data (around 400-500 meg of gfx data)
on my little 030 a1200 with just
8 meg fast ram with the use of virtual memory.
you could write a small program
to process the data into something
which access would like or just code
a bit of perl to search the data. i'm sure you like
to code perl :) |
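For the "small program to process the data into something which access would like" route, here is a rough Python sketch that rewrites a log as comma-separated values, the one format the post says Access accepts. The Common Log Format layout, the column names and the function name `log_to_csv` are assumptions, not anything stated in the thread.

```python
import csv
import re

# ASSUMPTION: NCSA Common Log Format; the post does not say which format is used.
CLF = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)


def log_to_csv(src, dst):
    """Copy parsed log lines from text stream src to text stream dst as CSV.

    Works one line at a time, so memory use does not grow with file size.
    Returns a tuple (rows written, lines skipped).
    """
    writer = csv.writer(dst)
    writer.writerow(["host", "time", "request", "status", "size"])
    written = skipped = 0
    for line in src:
        m = CLF.match(line)
        if m is None:
            skipped += 1
            continue
        # csv.writer takes care of quoting fields that contain commas.
        writer.writerow([m["host"], m["time"], m["request"],
                         m["status"], m["size"]])
        written += 1
    return written, skipped


if __name__ == "__main__":
    # newline="" lets the csv module control line endings itself.
    with open("access.log", encoding="latin-1") as src, \
            open("access.csv", "w", newline="", encoding="latin-1") as dst:
        print(log_to_csv(src, dst))
```

The skipped count is returned so that a run which silently drops most of the file (because the real format differs from the assumed one) is easy to spot.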
|
Rant: Advances in computer technology...? : Comment 5 of 13 | ANN.lu |
Posted by Someone on 22-May-2000 22:00 GMT | I think you've made two kinds of mistakes:
1) Overgeneralization. Just because Microsoft tools let you down, that doesn't
mean that computers can't do a good job. Try other software.
2) Except for clockspeed, your P450 running Windows IS NOT more advanced
technology than your 1990 Amiga 3000. They are both mid-late 1980s
personal computer technology. They're not quite identical (especially
in aesthetic terms), but awfully close technologically. It's not "ten
years of more advanced computers"; it's ten years of stagnation. The
reason that many of us still use Amigas is that nothing better has come
along.
BTW, I have some ideas for you:
1) MS Access is a toy. Get a real database. Even a 1984 copy of dBase III
running under MS-DOS 3.10 can trivially handle problems that are difficult
with Access. 1980s technology can handle this problem, you just have to
know how to use it.
2) If your queries are simple and you don't need to run many of them, then the
"Unix Way" of just passing the whole file as a stream to a series of filters
might work pretty well.
3) If your reason for not doing it on the Amiga is lack of RAM, then...
Have you tried running VMM on your Amiga? :) |
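The "Unix Way" in idea 2 can be imitated inside one program with generators, each stage pulling one line at a time from the previous one. This is only a sketch in Python: the stage names are borrowed loosely from grep, cut and sort/uniq, and the example query assumes Common Log Format lines, which the post does not confirm.

```python
def read_lines(path):
    """Yield lines one at a time so the whole file never sits in RAM."""
    with open(path, encoding="latin-1") as handle:
        for line in handle:
            yield line.rstrip("\n")


def grep(lines, needle):
    """Pass through only the lines containing needle (plain substring match)."""
    return (line for line in lines if needle in line)


def cut(lines, index, sep=" "):
    """Yield one field per line; index is 0-based, unlike cut -f."""
    for line in lines:
        fields = line.split(sep)
        if index < len(fields):
            yield fields[index]


def sort_uniq(items):
    """Return the distinct items in sorted order (like sort | uniq).

    Only the distinct values are kept in memory, not every line.
    """
    return sorted(set(items))


if __name__ == "__main__":
    # Which hosts triggered a 404? Assumes the status code appears as
    # '" 404 ' right after the quoted request, as in Common Log Format.
    hosts = sort_uniq(cut(grep(read_lines("access.log"), '" 404 '), 0))
    print("\n".join(hosts))
```

Nothing is read until the last stage asks for data, so the memory footprint stays small no matter how large the log is.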
|
Rant: Advances in computer technology...? : Comment 8 of 13 | ANN.lu |
Posted by Lee Bosch on 23-May-2000 22:00 GMT | In reply to Comment 3 (John waller): I'm wondering what kind of information in a log could be split out.
Would you want to parse the domain information?
I used a 1.5MB Amiga 1000 with a 100MB hard drive to parse the Item
Master file when I ported it to another program. With 12000 records
consisting of about 30 or so fields, I massaged the database into
something that COBOL wouldn't gag on in about two hours. The database
was indexed as it was imported on the 10 character item number.
From there I jammed it all through an awk script which generated a
highly specialized set of input records (eight lines per item with
implied decimals) which took about an hour and a half to process. I
ended up re-running the awk script on a 386SX20 Xenix box and it took
about 40 minutes to render.
I don't think I would try to use some fancy forms designer to do this
kind of stuff as I feel I had a lot better formatting control with C
style formatting as well as being able to do some math on certain fields
and shift things around for the implied decimal points.
SBase is decidedly sluggish and configuring it is very clumsy. dBASE on
an XT is much faster and for my money, the command line stuff is just
so much easier to tweak and use. When you get the output that you
want, you pipe it to a file instead of the screen.
I've used Access to do a simple contact database and I have to admit
that I'm baffled why they still market it. It is the second worst
database program I've ever had to endure (next to Works). I still do
most of my serious database work in SBase but now I use a 4.5MB A500 so
that I can put my import files in RAM:.
Lee Bosch |
|
Rant: Advances in computer technology...? : Comment 9 of 13 | ANN.lu |
Posted by Matthew on 23-May-2000 22:00 GMT | grep, sed, tr, perl, etc are your friends. |
|
Rant: Advances in computer technology...? : Comment 10 of 13 | ANN.lu |
Posted by Mark on 23-May-2000 22:00 GMT | 5 lines of perl, tops. 3 if you are good. |
|
Rant: Advances in computer technology...? : Comment 11 of 13 | ANN.lu |
Posted by bobbie sellers on 23-May-2000 22:00 GMT | I am not a code jockey nor do I originate scripts.
But I would use a splitting utility to break the access.log into
more reasonable sized chunks and then do the import process into
whatever data handling program you want to use.
I used to split up very large text files to manipulate them
in my text processor when I had an as yet unrealised memory problem
on my A2000 running at the time a mere 68000 with 8 Megabytes of
mixed but too much bad memory.
Later
bliss |
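If no splitting utility is at hand, the same thing can be sketched in a few lines of Python. The function name and the numbered-suffix naming scheme are invented for this example; the default of 50,000 lines per chunk simply mirrors the selection size mentioned in the article.

```python
def split_by_lines(path, lines_per_chunk=50000):
    """Split path into files named path.000, path.001, ... and return the names.

    The file is read in binary mode, one line at a time, so neither the
    encoding nor the total size of the log matters.
    """
    names = []
    out = None
    try:
        with open(path, "rb") as src:
            for number, line in enumerate(src):
                # Start a new chunk file every lines_per_chunk lines.
                if number % lines_per_chunk == 0:
                    if out is not None:
                        out.close()
                    name = f"{path}.{len(names):03d}"
                    names.append(name)
                    out = open(name, "wb")
                out.write(line)
    finally:
        if out is not None:
            out.close()
    return names
```

Because chunks always end on a line boundary, each piece can be imported on its own, and concatenating the pieces in order gives back the original file.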
|
Rant: Advances in computer technology...? : Comment 12 of 13 | ANN.lu |
Posted by Olivier Fabre on 24-May-2000 22:00 GMT | Depending on what you want to do, an ARexx script could be written in, say, 15 minutes that reads through the whole 80MB file line by line and displays whatever statistics you need at the end.
A very rough estimate of the time needed to process 80MB of data with a 68060 Amiga: 80 minutes (based on the time my AminetExtract.rexx script takes to read the Aminet INDEX, IIRC!). |
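The same single-pass approach, sketched in Python instead of ARexx. Which statistics are wanted is not stated, so this example just tallies status codes and bytes sent, assuming (as in Common Log Format) that these are the last two whitespace-separated fields of each line. The function name `summarize` is hypothetical.

```python
from collections import Counter


def summarize(lines):
    """Tally status codes and total bytes in one pass over an iterable of lines."""
    statuses = Counter()
    total_bytes = 0
    for line in lines:
        fields = line.split()
        # ASSUMPTION: the last two fields are the status code and the size.
        if len(fields) < 2:
            continue
        status, size = fields[-2], fields[-1]
        statuses[status] += 1
        # A size of "-" means nothing was sent; only count real numbers.
        if size.isdigit():
            total_bytes += int(size)
    return statuses, total_bytes


if __name__ == "__main__":
    # A file object yields one line at a time, so memory use stays flat.
    with open("access.log", encoding="latin-1") as log:
        statuses, total_bytes = summarize(log)
    for status, count in statuses.most_common():
        print(status, count)
    print("bytes sent:", total_bytes)
```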
|
Rant: Advances in computer technology...? : Comment 13 of 13 | ANN.lu |
Posted by Slim on 25-May-2000 22:00 GMT | Well Christian, |
|