ANet is a series of workhorse servers for a team of data miners. Each server has 32GB of memory, four dual core AMD64 CPUs, and 1.2TB of raw disk (4x300GB: three disks in RAID5 delivering 600GB, plus a hot spare), and is deployed for access over secure networks. The machine was installed with Debian GNU/Linux 4.0r0. The package vsftpd was installed as the ftp server for the host, ssh for secure file transfer, and ODBC for connection to a Teradata warehouse. XWin32 is used to access the server from the desktops of the data mining team, who run locked down MS/Windows standard operating environment workstations at multiple locations across the organisation.
A default install using the Etch (version 4.0r0) release of Debian
GNU/Linux, booting from DVD-RW, was performed (23 April 2007). The DVD
image was obtained using jigdo-file, running the command
jigdo-lite (see Section ??) and specifying
MD5 checksums confirmed the integrity of the downloaded DVD image, and
similarly for disks 2 and 3.
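The checksum step can be sketched as a small shell function. This is only an illustration: the image path and the expected checksum are placeholders to be replaced with the real downloaded file and the value published alongside it.

```shell
# Sketch: compare the MD5 checksum of a downloaded image with a published one.
# Both arguments are placeholders supplied by the caller.
verify_md5() {
    image="$1"       # path to the downloaded image
    expected="$2"    # checksum published alongside the image
    actual=$(md5sum "$image" | cut -d' ' -f1)
    if [ "$actual" = "$expected" ]; then
        echo "OK: $image"
    else
        echo "MISMATCH: $image" >&2
        return 1
    fi
}
```

The function returns non-zero on a mismatch, so it can be used to stop a script before a bad image is burnt to disc.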
Working: multiple dual core CPUs, X11 on a rack mounted console, CD/DVD, Ethernet, X11 from desktop Exceed through TELNET, but not yet through ssh.
Not tested: EMC2 SAN libraries, Teradata ODBC drivers, SAS/Enterprise Miner.
No issues or problems with the install! Base installation took 30 minutes. Package installation took 1 hour. Fine tuning took another 30 minutes. Total was 2 hours.
| Component | Details |
|:--|:--|
| Machine | Rack mounted HP ProLiant DL585 G1 |
| CPU | AMD Opteron 885 2.6GHz Dual Core x 4 |
| Memory | 32GB (16x2GB DIMM DDR Synchronous 400MHz) |
| Network | Broadcom NetXtreme BCM5704 Gigabit |
| Disk | Compaq Smart Array 5i/532, 3 x 146GB single RAID of 146GB |
A modular storage array is being used to deliver reliable storage. For each host, four 300GB drives are deployed: three for RAID5 storage, and the remaining drive as an automatic hot spare. 600GB of disk will be exposed from the storage array for the system. Because the modular storage arrays are dual bus, two servers can be independently supported by each array.
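The 600GB figure follows from RAID5 giving up one drive's worth of space to parity. A minimal sketch of the arithmetic, with a function name made up for this example:

```shell
# Usable RAID5 capacity: one drive's worth of space holds parity,
# so n drives of size s expose (n - 1) * s. The hot spare is not counted.
raid5_usable_gb() {
    drives="$1"     # number of drives in the RAID5 set
    size_gb="$2"    # capacity of each drive in GB
    echo $(( ($drives - 1) * $size_gb ))
}

raid5_usable_gb 3 300    # prints 600
```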
The initial installation was on a test machine with very restricted network access. The purpose was to test, configure and document the installation.
Standard install (see Section ??). Boot from DVD. Choose guided full repartition of the hard disk.
Install: lang=English, location=Australia, kb=American English, network=eth0 (also available were eth1, …, eth4), hostname=anet01, partition=Guided, automatic, entire disk.
The partitioning automatically chosen was:

| Mount point | Size | Device |
|:--|:--|:--|
| / | 279M | sda1 |
| /usr | 5.0G | sda5 |
| /var | 3.0G | sda6 |
| /home | 119G | sda9 |
| /tmp | 403M | sda8 |
| swap | 18G | sda7 |
Set the root password and a user account, then apt install from DVD with the tasksel selection of Desktop Environment, Web Server, File Server, SQL Database, and Standard System. The SMB install noted that WINS settings can be obtained from DHCP, so that option was chosen (there was a recommendation to then install dhcp3-client for this, but this was not done).
Reboot, and GNOME (GDM) started without a problem.
Continue installing from DVD to install wajig, configure sudo, and all the rest!
Installed Sun’s jdk 1.6.0:
    # mkdir /usr/local/sun
    # cd /usr/local/sun
    # sh /home/share/java-6u1-linux-amd64.bin

Agree to the license if you do - but beware it contains limitations.

    # update-alternatives --install /usr/bin/javac javac \
        /usr/local/sun/jdk1.6.0/bin/javac 120
    # update-alternatives --install /usr/bin/java java \
        /usr/local/sun/jdk1.6.0/bin/java 120
We should then really do the same for all of the other binaries in
/usr/local/sun/jdk1.6.0/bin, but a quick shortcut is to
simply link them all into a directory that is already on the PATH:

    # cd /usr/local/bin
    # ln -s /usr/local/sun/jdk1.6.0/bin/* .
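For the more thorough route of registering every JDK binary with update-alternatives, a loop along the following lines might do. This is a sketch only: the function name is made up, the priority of 120 simply mirrors the value used for java and javac, and the commands are printed rather than run so they can be reviewed first.

```shell
# Sketch: print an update-alternatives command for every executable in a
# JDK bin directory. Nothing is installed; the commands are only printed.
jdk_alternatives() {
    bindir="$1"             # e.g. /usr/local/sun/jdk1.6.0/bin
    priority="${2:-120}"    # assumed: same priority as java and javac
    for path in "$bindir"/*; do
        # Skip anything that is not a regular, executable file.
        [ -f "$path" ] && [ -x "$path" ] || continue
        name=$(basename "$path")
        echo "update-alternatives --install /usr/bin/$name $name $path $priority"
    done
}
```

Once the printed commands look right, the output of `jdk_alternatives /usr/local/sun/jdk1.6.0/bin` could be piped to a root shell to apply them.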
The bteq application is used to connect to a Teradata data warehouse. Its installation will confirm that the data warehouse connection can be established, and hence, SAS/ACCESS Teradata can establish a connection.
Teradata do not support Debian, but the driver works. Install the libraries provided for the i386 architecture:
    rpm2cpio tdicu-01.01.02.00-1.i386.rpm . | cpio -idv
    rpm2cpio TeraGSS_redhatlinux-i386-06.02.00.00-1.i386.rpm . | cpio -idv
    rpm2cpio piom-02.04.00.00-1.i386.rpm . | cpio -idv
    rpm2cpio cliv2-04.08.02.00-1.i386.rpm . | cpio -idv
    rpm2cpio bteq-08.02.04.00-1.i386.rpm . | cpio -idv
    sudo cp -R opt/teradata /opt
    sudo install usr/bin/bteq /usr/bin/
    sudo install usr/lib/* /usr/lib32
    sudo ln -s /usr/lib32/errmsg.cat /usr/lib/errmsg.cat
    sudo ln -s /usr/lib32/clispb.dat /usr/lib/clispb.dat
    sudo ln -s /opt/teradata/teragss/redhatlinux-i386/06.02.00.00 \
        /opt/teradata/teragss/redhatlinux-i386/client
    rm -rf ./opt ./usr
Then simply start bteq:
    $ bteq
    .LOGON hostname/user
A test driver was supplied by Teradata for RedHat. The package was installed under Debian using alien (via wajig):
$ wajig rpminstall tdodbc-03.06.00.00-1.x86_64.rpm
It complains that scripts won’t be generated unless the
--scripts option is used, but when it is used we get some script
errors that have not yet been explored. The library seems to be in the
right place, but it has not been tested as yet.
Sample configuration files appear in
A Debian package can be created with:
$ alien -d --scripts tdodbc-03.06.00.00-1.x86_64.rpm
This produces tdodbc-03.06.00.00-2-amd64.deb. An install of this,
though, complains many times about
scm:socal being an invalid user in a
chown. But we can look at the scripts and see what it is
trying to do.
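One way to look at what the scripts are doing is to unpack the package's control area and search it for chown calls. The helper below is a hypothetical sketch, and the directory it is given is whatever name was chosen when unpacking.

```shell
# Sketch: list every chown call in a directory of extracted maintainer
# scripts (preinst, postinst, and so on), with file name and line number.
list_chowns() {
    ctrl_dir="$1"    # directory holding the extracted control files
    grep -rn 'chown' "$ctrl_dir"
}
```

The control area can first be unpacked with `dpkg-deb -e tdodbc-03.06.00.00-2-amd64.deb ctrl` (the directory name `ctrl` is arbitrary), after which `list_chowns ctrl` should show where scm:socal is referenced.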
Testing will involve creating one’s own ~/.odbc.ini, and placing the
contents of the sample odbcinst.ini into
(is it required in that location?). But tdata.so complains that it
can’t find libodbcinst.so, which is there in
perhaps this is a problem with LD_LIBRARY_PATH things in R?
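To explore the LD_LIBRARY_PATH theory, it may help to check whether the directory holding libodbcinst.so is actually on the search path of the process loading tdata.so. The function below is a hypothetical helper for that check, and the directory passed to it is whatever location the library was found in.

```shell
# Sketch: succeed if the given directory appears as an entry in the
# colon-separated LD_LIBRARY_PATH of the current environment.
in_ld_library_path() {
    case ":${LD_LIBRARY_PATH:-}:" in
        *":$1:"*) return 0 ;;
        *)        return 1 ;;
    esac
}
```

Running `ldd` on tdata.so is a complementary check: any dependency the dynamic linker cannot resolve is reported as "not found".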
Copyright © 1995-2022 Graham.Williams@togaware.com Creative Commons Attribution-ShareAlike 4.0