The fantomas spyFetcher(TM) Module:
Automatic botBase Maintenance

SYSTEM REQUIREMENTS
INSTALLATION - UNIX
UNINSTALLING THE PROGRAM
WORKING WITH fantomas spyFetcher(TM)
CONFIGURATION OF CRON JOBS
ERROR HANDLING
KNOWN ISSUES
UPDATES + PROGRAM CHRONOLOGY
CONTACT + SUPPORT

======================================================================

SYSTEM REQUIREMENTS
-------------------

Language
--------
Perl 5

Module
------
GNU Wget (external program called by the script)
More info under: < http://www.gnu.org/software/wget/wget.html >

UNIX
----
The Unix system requires an installed web server.
Execution of CGI scripts must be enabled.
A directory for execution of CGI scripts must exist.
Usually, this will be the directory /cgi-bin/.

Tested under:
SuSE LINUX with Apache
Red Hat Linux with Apache
BSDI Unix with Apache

Browser
-------
The script is called and executed via web browser.
You will currently achieve best results under MS Internet Explorer 5+.
Netscape 4.7 may require adjustment of font size.

Tested under: IE 6+, Netscape 4.7, Netscape 7+, Opera 6+

======================================================================

INSTALLATION
------------

The following files are included:

spyfetcher-e.cgi   --- (program script)
fantomas.gif       --- (logo/graphics file)
sfehelp.txt        --- (documentation in TXT format)
                       <--- THIS FILE YOU ARE READING!
fa_license-e.txt   --- (License Agreement And Terms Of Usage - PLEASE READ!)

-----------------------------------
ADJUSTMENTS IN FILE "spyfetcher-e.cgi"
(please edit in ASCII or plain text editor like Notepad etc.)
-----------------------------------

UNIX
----
* Please check the path to the location of Perl.
  The default path in the script is "/usr/bin/perl".
  If you don't know this path, you can look it up in a Telnet
  session by entering the Unix command "whereis perl".
  You may have to adjust the first line in the script
  "spyfetcher-e.cgi" accordingly.

* The variables in the script "spyfetcher-e.cgi":
  "$stats_dir", "$robot_file", "$log_file", "$wget_cmd",
  "$sendmail", "$from_mail", "$to_mail", "$subject",
  "$cloak_for_google", "$user" and "$pw"
  may optionally be adjusted to your requirements.
  A comprehensive description of these variables can be found
  below in chapter "WORKING WITH fantomas spyFetcher(TM)".

* The script "spyfetcher-e.cgi", the file "fantomas.gif" and the
  help file "sfehelp.txt" must be copied into the Unix server's
  CGI directory.

* The CGI directory must be given the following permissions:
  "chmod 755" [drwxr-xr-x]

* Next, create the directory defined as variable "$stats_dir" with
  the following permissions:
  "chmod 777" [drwxrwxrwx]
  (Default name is "stats".)

* When uploading via FTP, make sure to transfer ALL files in ASCII
  mode.
  EXCEPTION: the graphics file "fantomas.gif", which must be
  transferred in BINARY or AUTOMATIC mode.

* Required file permissions:
  spyfetcher-e.cgi: "chmod 755" [-rwxr-xr-x]
  fantomas.gif:     "chmod 444" [-r--r--r--]
  sfehelp.txt:      "chmod 444" [-r--r--r--]

======================================================================

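The permission steps listed under INSTALLATION can be sketched as a small shell function. This is only a sketch under assumptions: the function name "install_spyfetcher_perms" is made up for illustration, the three files are assumed to be uploaded already, and the stats directory is assumed to sit inside the CGI directory under its default name "stats" unless a second argument says otherwise.

```shell
# Sketch only: function name and directory layout are assumptions.
# Usage: install_spyfetcher_perms CGI_DIR [STATS_DIR]
install_spyfetcher_perms() {
    cgi_dir=$1                        # CGI directory holding the uploaded files
    stats_dir=${2:-$cgi_dir/stats}    # directory named in $stats_dir (default "stats")

    chmod 755 "$cgi_dir" || return 1                    # drwxr-xr-x
    mkdir -p "$stats_dir" || return 1
    chmod 777 "$stats_dir" || return 1                  # drwxrwxrwx
    chmod 755 "$cgi_dir/spyfetcher-e.cgi" || return 1   # -rwxr-xr-x
    chmod 444 "$cgi_dir/fantomas.gif" "$cgi_dir/sfehelp.txt" || return 1  # -r--r--r--
}
```

It would be called with the CGI path valid for your own system, for instance "install_spyfetcher_perms /usr/www/htdocs/yourdomain/cgi-bin" if that example path matches your server.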
UNINSTALLING THE PROGRAM
------------------------

For a complete uninstall, delete the following:

spyfetcher-e.cgi
fantomas.gif
sfehelp.txt
The directory "stats" or whatever directory you defined under
"$stats_dir", including its contents.

======================================================================

WORKING WITH fantomas spyFetcher(TM)
------------------------------------

Program Description
-------------------
The fantomas spyFetcher(TM) is a script which allows you to get
the latest fantomas spiderSpy(TM) botBase as a packed archive
in .ZIP format.
The botBase will be unpacked and saved on your server in the
directory defined under "$stats_dir" with the file name defined
under "$robot_file".

---------------------------------
Customization of script variables
---------------------------------

The following variables may optionally be customized in script
"spyfetcher-e.cgi":

* $stats_dir
  This variable defines the directory where the spider robots list
  file shall reside, as an absolute path in this format:
  Example: "/usr/www/htdocs/yourdomain/cgi-bin/stats"

* $robot_file
  This variable defines the file name of the spider robots list
  file.
  Default file name is "spiderspy.txt".

* $log_file
  This variable defines the file name of the transfer log file.
  Default file name is "transfer.log".

* $wget_cmd
  This variable defines the command call for wget.
  Default configuration is "/usr/bin/wget".
  If you don't know this path, you can look it up in a Telnet
  session by entering the Unix command "whereis wget".
  Otherwise, please ask your system administrator.

Email Error Message
-------------------
If the script is executed in batch mode via cron job, an email
error message will be generated if the transfer of the fantomas
spiderSpy(TM) botBase fails.
For this email functionality you will need to specify the
following variables:

* $sendmail
  This variable defines the command call for the mail program.
  Default configuration is "/usr/lib/sendmail -t -n -oi".
  If you don't know this path, you can look it up in a Telnet
  session by entering the Unix command "whereis sendmail".
  Otherwise, please ask your system administrator.

* $from_mail
  This variable defines the email error message sender's address.

* $to_mail
  This variable defines where you want the email error message to
  be sent.

* $subject
  This variable defines the email error message's subject line.

* $cloak_for_google
  If you want to cloak for Google, please set
  "$cloak_for_google = 1" and the Google spider entries in the
  fantomas spiderSpy(TM) botBase will be activated.

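As an alternative to the "whereis" lookups mentioned for "$wget_cmd" and "$sendmail", the standard shell built-in "command -v" prints the path of a program found on the current PATH. The following is a minimal sketch; "find_cmd" is a hypothetical helper name. Note that sendmail is often installed outside PATH (the default above is "/usr/lib/sendmail"), so a miss here does not prove that it is absent.

```shell
# find_cmd NAME: print where program NAME is found on PATH, if anywhere.
find_cmd() {
    if path=$(command -v "$1" 2>/dev/null); then
        printf '%s: %s\n' "$1" "$path"
    else
        printf '%s: not found on PATH\n' "$1"
        return 1
    fi
}

# Look up the three programs the script variables refer to.
for prog in perl wget sendmail; do
    find_cmd "$prog" || true
done
```

Any path reported this way can then be entered in the first line of the script, in "$wget_cmd" or in "$sendmail".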
User Authentication
-------------------
After signing up for the spiderSpy service, you received your
user id and password for downloading the fantomas spiderSpy(TM)
botBase.

* $user
  This variable defines your user id (case sensitive).

* $pw
  This variable defines your password (case sensitive).

******************** VERY IMPORTANT! *********************
If the variables "$user" and "$pw" are not correct, the download
will fail because access is forbidden.
SO PLEASE MAKE SURE TO SPECIFY YOUR ID AND PW EXACTLY AS ISSUED
DURING SIGNUP!
******************** VERY IMPORTANT! *********************

ONLINE MODE
-----------
* The script is activated by entering the appropriate URL into the
  web browser's location/address field,
  e.g. "http://www.yourdomain.com/cgi-bin/spyfetcher-e.cgi".
  To start the download of the current version of the fantomas
  spiderSpy(TM) botBase, click the "Submit!" button.
  Once the botBase has been saved on your server, the next HTML
  page will display the message:
  "Transfer of fantomas spiderSpy(TM) botBase successful!"

BATCH MODE
----------
You can manage the transfers of the fantomas spiderSpy(TM) botBase
automatically by defining a cron job.

======================================================================

CONFIGURATION OF CRON JOBS
--------------------------
Cron is a mechanism for planning and scheduling batch jobs.
The daemon "crond" is started automatically on system boot up.
It runs one check per minute to see if there are any jobs to
execute.
The list of jobs to execute is created by the program "crontab".
The following commands work from the assumption that you are
either logged in by Telnet or locally on your Unix system.
Entering the command "crontab -l" will display a list of current
entries. By default, only entries owned by the logged-in user
will be displayed.
Existing lists can be removed/deleted with command "crontab -r".
To create a new list, it is recommended to read the entries from
a file using command "crontab filename".
The following examples will show you the format of this file.
The file itself is created with an ASCII text editor.
Example:
0 12 * * * /usr/www/htdocs/yourdomain/cgi-bin/spyfetcher-e.cgi start
This entry consists of six parameters. The first five parameters
define the time schedule, whereas the sixth parameter contains
the command for executing the job.
In our example above, this command consists of:
- the full path and file name of the script
- an argument
Parameters defining the time schedule are:
minute        (0-59)
hour          (0-23)
day of month  (1-31)
month         (1-12)
day of week   (0-6, 0 = Sun)
Hence, the schedule of the above sample entry: 0 12 * * *
can be translated as:
If Minute = 0 and Hour = 12, the script will be executed.
Because the last three scheduling parameters are defined by the
wildcard character "*", the job will be executed every day.
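The way the five schedule fields are read can be illustrated with a small shell sketch. It is not how cron itself is implemented: the function names are made up, and only the two forms used in this document are handled, namely "*" and a single number. Real cron also accepts lists, ranges and steps, and treats day of month and day of week specially when both are restricted.

```shell
# field_matches FIELD VALUE: true if FIELD is "*" or numerically equals VALUE.
field_matches() {
    [ "$1" = "*" ] || [ "$1" -eq "$2" ] 2>/dev/null
}

# cron_matches "MIN HOUR DOM MON DOW" minute hour day_of_month month day_of_week
# True if the given point in time satisfies all five schedule fields.
cron_matches() {
    schedule=$1
    shift
    # read splits the schedule on blanks without expanding "*" to file names
    read -r f_min f_hour f_dom f_mon f_dow <<EOF
$schedule
EOF
    field_matches "$f_min" "$1" && field_matches "$f_hour" "$2" &&
    field_matches "$f_dom" "$3" && field_matches "$f_mon" "$4" &&
    field_matches "$f_dow" "$5"
}

# 12:00 on the 15th of June, a Wednesday (day of week 3): matches "0 12 * * *"
if cron_matches "0 12 * * *" 0 12 15 6 3; then
    echo "job would run"
fi
```

With the same time of 12:30 instead of 12:00, the minute field no longer matches and the function returns false.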
Scheduling Week Days
--------------------
If you wish to run the script on Mondays only, the following entry
will do the trick:

0 12 * * 1 /usr/www/htdocs/yourdomain/cgi-bin/spyfetcher-e.cgi start

Scheduling Turn of Month
------------------------
You can schedule the turn of the month in this manner:

0 0 1 * * /usr/www/htdocs/yourdomain/cgi-bin/spyfetcher-e.cgi start

To Summarize
------------
Create a text file (e.g. "crontab.txt") and write the appropriate
command on one single line.

We recommend downloading the fantomas spiderSpy(TM) botBase once
per day.
The following syntax will generate (as explained above) a cron job
which will run once a day:

0 12 * * * /usr/www/htdocs/yourdomain/cgi-bin/spyfetcher-e.cgi start

IMPORTANT
=========
Please modify the TIME OF DAY argument specified for your cron job
to prevent all downloads from happening at the same time. With
hundreds of subscribers, this could overload our server.

Prevention of abuse: A maximum of six downloads of the botBase is
permitted per day. Beyond that, the downloading IP will be blocked
by our system.

Enter the absolute path for the script as valid for *YOUR*
specific system configuration.
The argument to use is "start", as shown in our example above.

Next, the command "crontab crontab.txt" will transmit this file
to crontab.

IMPORTANT
=========
If crontab has already been configured for prior jobs, you must
include them in the new file "crontab.txt" (example), as the
command "crontab crontab.txt" will override all previous cron jobs
owned by the specific user calling crontab!

For further online explanations under Unix, you can choose one of
the following commands:

man crontab
man 5 crontab
man cron

======================================================================

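The summary above, including the warning about keeping prior jobs, can be sketched as follows. This is a hedged sketch rather than a prescribed procedure: "build_crontab" and the file name "old.txt" are made-up names, and the hour and minute are passed in by you so that the download time can differ from the 12:00 example.

```shell
# build_crontab EXISTING_FILE SCRIPT_PATH HOUR MINUTE
# Prints the earlier cron entries followed by one daily spyFetcher entry.
build_crontab() {
    existing=$1
    script=$2
    hour=$3
    minute=$4
    if [ -f "$existing" ]; then
        # Keep prior jobs, but drop an older spyFetcher line to avoid duplicates.
        grep -v -F "$script" "$existing" || true
    fi
    printf '%s %s * * * %s start\n' "$minute" "$hour" "$script"
}

# Typical use, left as comments because it changes your real crontab:
#   crontab -l > old.txt 2>/dev/null
#   build_crontab old.txt /usr/www/htdocs/yourdomain/cgi-bin/spyfetcher-e.cgi 3 17 > crontab.txt
#   crontab crontab.txt
```

In the commented example the job would run daily at 03:17, which is an arbitrary time chosen only to show a schedule other than noon.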
ERROR HANDLING
--------------

This section covers individual error messages.

Stats directory
---------------
"Stats directory ... does not exist!"

Please create the stats directory or adjust the directory name
under variable "$stats_dir".

Download error
--------------
"Download of fantomas spiderSpy(TM) botBase failed!"

Possible issues:

* Call of wget is not functional.
  Solution: Please check your system's wget functionality.

* The access data specified (user id and password) for the botBase
  are invalid.
  Solution: Please check your user ID and password.

* New files could not be created in the stats directory (defined
  under $stats_dir).
  Solution: Please check the permissions of the directory:
  "chmod 777" i.e. [drwxrwxrwx]

Unzip error
-----------
"Unzip of fantomas spiderSpy(TM) botBase failed!"

Possible issues:

* Call of gunzip is not functional.
  Solution: Please check your system's gunzip functionality.

Change mode error
-----------------
"Change mode of fantomas spiderSpy(TM) botBase failed!"

Possible issues:

* Call of chmod is not functional.
  Solution: Please check your system's chmod functionality.

======================================================================

KNOWN ISSUES
------------

Graphics
--------
Graphics files uploaded to the CGI directory or to a directory
below it may not be displayed correctly under some web server
configurations.
In this case you may create a directory outside of the cgi-bin.
You can then define the "$graphics_dir" variable in program file
spyfetcher-e.cgi accordingly.
Example: $graphics_dir = "../graphics/";

Docs (Manual/Help files)
------------------------
If the help file is not displayed correctly, we recommend
uploading it to an alternate directory (outside of cgi-bin!) as
well. You can then define the "$doc_dir" variable in program file
spyfetcher-e.cgi accordingly.
Example: $doc_dir = "../docs/";

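The conditions listed under ERROR HANDLING can be checked by hand before the first transfer. The following is a minimal sketch, not part of the program: "preflight" is a hypothetical name, its messages are not the script's own error texts, and it cannot verify the user id and password.

```shell
# preflight STATS_DIR [WGET_CMD]
# Reports problems matching the documented error causes; returns 1 if any is found.
preflight() {
    stats_dir=$1
    wget_cmd=${2:-/usr/bin/wget}    # default path named in this documentation
    status=0
    if [ ! -d "$stats_dir" ]; then
        echo "Stats directory $stats_dir does not exist"
        status=1
    elif [ ! -w "$stats_dir" ]; then
        echo "Stats directory $stats_dir is not writable (check chmod 777)"
        status=1
    fi
    if [ ! -x "$wget_cmd" ]; then
        echo "wget is not executable at $wget_cmd"
        status=1
    fi
    for tool in gunzip chmod; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool not found on PATH"
            status=1
        fi
    done
    return $status
}
```

A call such as "preflight /usr/www/htdocs/yourdomain/cgi-bin/stats" would use the example stats path from this document together with the default wget location.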