Search Results

Search found 43 results on 2 pages for 'rhel4'.

Page 2/2 | < Previous Page | 1 2 

  • What's an easy way to install a bunch of OSS apps in my home dir on my company's aging Redhat VNC servers?

    - by subopt
    My company's IT group doesn't want to install lots of common stuff on our VNC servers, which run RHEL 4. The list of stuff i want/need includes graphviz, midnight commander, various python libs, sqlite, netbeans, antlr, jdk1.6, etc. They've hobbled rpm somehow, so i can't do rpm --relocate. And i don't want the RHEL4 compatible revs of this stuff anyway. I've probably got the room to build all this stuff in my home dir, but i don't want to take forever tracking down dependencies. Looking for an easy way to get the apps/tools i need, and nothing (or very little) else. Any suggestions?

    Read the article

  • How to setup a host as a sendmail relay for particular IP subnet

    - by Abhinav
    Hi, By default, sendmail (I have version 8.13 on an RHEL4) allows only local mails. I wanted to allow mails from a particular network to be relayed via the system, so I did the following based on suggestions from various places : /etc/mail/access : Added the subnet and the domain 8.37 RELAY mydomain.com RELAY (I assume this is the originating email's domain) This alone did not work, so I added the following to sendmail.mc FEATURE(access_db)dbl Now, the problem is that it is allowing access from other domains as well. To test it out, I removed 8.37 RELAY from the access, and changed the email from field to [email protected] However, I still receive the mail. What is the correct way to configure this, so that only mails from a particular subnet are relayed ?

    Read the article

  • How to tell if PAE is hurting me?

    - by James
    I have a couple of servers with 20-30 GB RAM that are running (a variant of) RHEL4. They are currently running the SMP i386 kernel, not x64, not even the hugemem kernel. This means LowMem is confined to < 1G, and thus dentry_cache and ext3_inode_cache to 100M or so each. How can I tell if this is a problem? Here's a typical vmstat report while it's compiling some Java: $ vmstat 10 procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu---- r b swpd free buff cache si so bi bo in cs us sy id wa 1 1 0 19493816 394740 922420 0 0 1058 2292 1491 1020 6 3 80 12 2 1 0 19519480 395244 850156 0 0 1179 1412 1329 1195 9 4 75 12 1 1 0 19557368 392616 828344 0 0 1783 1680 1498 1756 14 5 72 9 I don't like the way bi is nonzero when there is so much memory free. I imagine slabtop could point more directly to the problem but I don't really understand how to interpret its output.

    Read the article

  • connection refused with nfs server.

    - by someguy
    Hi All, I have an issue with NFS. I have two rhel4 servers. sys1 and sys2. I have a directory exported from sys1 and sys2 is the nfs client trying to access the mount. I'm able to mount, but after a certain amount of time I get "RPC: Remote system error: Connection refused." and then I restart nfs services, quotas, deamon, mountd and lockd. Then it works fine again. Any way to fix this. Thanks.

    Read the article

  • Logrotate Successful, original file goes back to original size

    - by drewrockshard
    Has anyone had any issues with logrotate before that causes a log file to get rotated and then go back to the same size it originally was? Here's my findings: Logrotate Script: /var/log/mylogfile.log { rotate 7 daily compress olddir /log_archives missingok notifempty copytruncate } Verbose Output of Logrotate: copying /var/log/mylogfile.log to /log_archives/mylogfile.log.1 truncating /var/log/mylogfile.log compressing log with: /bin/gzip removing old log /log_archives/mylogfile.log.8.gz Log file after truncate happens [root@server ~]# ls -lh /var/log/mylogfile.log -rw-rw-r-- 1 part1 part1 0 Jan 11 17:32 /var/log/mylogfile.log Literally Seconds Later: [root@server ~]# ls -lh /var/log/mylogfile.log -rw-rw-r-- 1 part1 part1 3.5G Jan 11 17:32 /var/log/mylogfile.log RHEL Version: [root@server ~]# cat /etc/redhat-release Red Hat Enterprise Linux ES release 4 (Nahant Update 4) Logrotate Version: [root@DAA21529WWW370 ~]# rpm -qa | grep logrotate logrotate-3.7.1-10.RHEL4 Few Notes: Service can't be restarted on the fly, so that's why I'm using copytruncate Logs are rotating every night, according to the olddir directory having log files in it from each night.

    Read the article

  • How to backup data from linux servers to linux server (incremental + snapshot)?

    - by wag2639
    We have a handful of hosted servers running RHEL4 and RHEL5 and would like to backup some key folders (I'm thinking /var /srv and /etc) to a local server we have in house. The local server is running Ubuntu 9.10 Server edition. I'm looking for a free (preferably OSS) way to grab (or push) incremental backups to my local server and once a month or so, make a new snapshot for incremental updates in between snapshots. Also, while I'm comfortable with using a command line, others may need to use the system in the future, and I would like some kind of graphical or web interface to browse the backup repository. Suggestions?

    Read the article

  • How to backup data from linux servers to linux server (incremental + snapshot)?

    - by wag2639
    We have a handful of hosted servers running RHEL4 and RHEL5 and would like to backup some key folders (I'm thinking /var /srv and /etc) to a local server we have in house. The local server is running Ubuntu 9.10 Server edition. I'm looking for a free (preferably OSS) way to grab (or push) incremental backups to my local server and once a month or so, make a new snapshot for incremental updates in between snapshots. Also, while I'm comfortable with using a command line, others may need to use the system in the future, and I would like some kind of graphical or web interface to browse the backup repository. Suggestions?

    Read the article

  • tail -f and then exit on matching string

    - by Patrick
    I am trying to configure a startup script which will startup tomcat, monitor the catalina.out for the string "Server startup", and then run another process. I have been trying various combinations of tail -f with grep and awk, but haven't got anything working yet. The main issue I am having seems to be with forcing the tail to die after grep or awk have matched the string. I have simplified to the following test case. test.sh is listed below: #!/bin/sh rm -f child.out ./child.sh > child.out & tail -f child.out | grep -q B child.sh is listed below: #!/bin/sh echo A sleep 20 echo B echo C sleep 40 echo D The behavior I am seeing is that grep exits after 20 seconds , however the tail will take a further 40 seconds to die. I understand why this is happening - tail will only notice that the pipe is gone when it writes to it which only happens when data gets appended to the file. This is compounded by the fact that tail is to be buffering the data and outputting the B and C characters as a single write (I confirmed this by strace). I have attempted to fix that with solutions I found elsewhere, such as using unbuffer command, but that didn't help. Anybody got any ideas for how to get this working how I expect it? Or ideas for waiting for successful Tomcat start (thinking about waiting for a TCP port to know it has started, but suspect that will become more complex that what I am trying to do now). I have managed to get it working with awk doing a "killall tail" on match, but I am not happy with that solution. Note I am trying to get this to work on RHEL4.

    Read the article

  • Linux RFID reader HID Device not matching driver

    - by blietaer
    Hello, I got a RFID reader (GigaTek PCR330A-00) that is meant to be recognized under linux/windows as a (Human Interface Device) keyboard/USB. I hate to say this but it is working as a charm under Win7 but not "really" under Linux. Under Debian-like distros (x/k/Ubuntu, Debian,..), or Gentoo, or... I just can't have the device working at all: the device scan well (it has its USB 5V, so it is happy/beeping/blinking) something happened in the dmesg, but no immediate screen display of the RFID Tag code as expected (and seen under win7) Support is claiming it is ok under RHEL or SLED "enterprises" distros... and I must admit I saw it working under a RHEL4... I tried stealing the driver but did not succeed having my reader working... My question is thus double: 1./ How can I hack the kernel to add support to my device (simply register PID/VID?) ? 2./ What is different at all in a "enterprise" proprietary distro? how can I re-use it? Thank you for any hint/help. Cheers,

    Read the article

  • ssh without password does not work for some users

    - by joshxdr
    I have a new RHEL4 Linux box that I am using to copy data to old Solaris 2.6 and RHEL3 Linux boxes with scp. I have found that with the same setup, it works for some users but not for others. For user jane, this works fine: jane@host1$ ssh -v remhost debug1: Next authentication method: publickey debug1: Trying private key: /mnt/home/osborjo/.ssh/identity debug1: Offering public key: /mnt/home/osborjo/.ssh/id_rsa debug1: Server accepts key: pkalg ssh-rsa blen 277 debug1: read PEM private key done: type RSA debug1: Authentication succeeded (publickey). for user jack it does not: jack@host1 ssh -v remhost debug1: Next authentication method: publickey debug1: Trying private key: /mnt/home/oper1/.ssh/identity debug1: Offering public key: /mnt/home/oper1/.ssh/id_rsa debug1: Authentications that can continue: publickey,password,keyboard-interactive I have looked at the permissions for all the keys and files, they look the same. Since I am using home directories mounted by NFS, the keys for both the remote host and the local host are in the same directory. This is how things look for jane: jane@host1$ ls -l $HOME/.ssh -rw-rw-r-- 1 jane operator 394 Jan 27 16:28 authorized_keys -rw------- 1 jane operator 1675 Jan 27 16:27 id_rsa -rw-r--r-- 1 jane operator 394 Jan 27 16:27 id_rsa.pub -rw-rw-r-- 1 jane operator 1205 Jan 27 16:46 known_hosts For user jack: jack@host1$ ls -l $HOME/.ssh -rw-rw-r-- 1 jack engineer 394 Jan 27 16:28 authorized_keys -rw------- 1 jack engineer 1675 Jan 27 16:27 id_rsa -rw-r--r-- 1 jack engineer 394 Jan 27 16:27 id_rsa.pub -rw-rw-r-- 1 jack engineer 1205 Jan 27 16:46 known_hosts As a last ditch effort, I copied the authorized_keys, id_rsa, and id_rsa.pub from jill to jack, and changed the username in authorized_keys and id_rsa.pub with vi. It still did not work. It seems there is something different between the two users but I cannot figure out what it is.

    Read the article

  • tail -f and then exit on matching string

    - by Patrick
    I am trying to configure a startup script which will startup tomcat, monitor the catalina.out for the string "Server startup", and then run another process. I have been trying various combinations of tail -f with grep and awk, but haven't got anything working yet. The main issue I am having seems to be with forcing the tail to die after grep or awk have matched the string. I have simplified to the following test case. test.sh is listed below: #!/bin/sh rm -f child.out ./child.sh > child.out & tail -f child.out | grep -q B child.sh is listed below: #!/bin/sh echo A sleep 20 echo B echo C sleep 40 echo D The behavior I am seeing is that grep exits after 20 seconds , however the tail will take a further 40 seconds to die. I understand why this is happening - tail will only notice that the pipe is gone when it writes to it which only happens when data gets appended to the file. This is compounded by the fact that tail is to be buffering the data and outputting the B and C characters as a single write (I confirmed this by strace). I have attempted to fix that with solutions I found elsewhere, such as using unbuffer command, but that didn't help. Anybody got any ideas for how to get this working how I expect it? Or ideas for waiting for successful Tomcat start (thinking about waiting for a TCP port to know it has started, but suspect that will become more complex that what I am trying to do now). I have managed to get it working with awk doing a "killall tail" on match, but I am not happy with that solution. Note I am trying to get this to work on RHEL4.

    Read the article

  • TGT validation fails, but only for one user

    - by wzzrd
    I'm seeing the weirdest thing here. I have a couple of RHEL3, 4 and 5 machines that validate user credentials through Kerberos with an Active Directoy domain controller as their KDC. This works for all of my users, save one. There is one account that is unable to log into RHEL3 Linux machines and generates the following errors there: May 31 13:53:19 mybox sshd(pam_unix)[7186]: authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=10.0.0.1 user=user May 31 13:53:20 mybox sshd[7186]: pam_krb5: TGT verification failed for `user' May 31 13:53:20 mybox sshd[7186]: pam_krb5: authentication fails for `user' Other accounts, like my own, are fine: May 31 17:25:30 mybox sshd(pam_unix)[12913]: authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=10.0.0.1 user=myuser May 31 17:25:31 mybox sshd[12913]: pam_krb5: TGT for myuser successfully verified May 31 17:25:31 mybox sshd[12913]: pam_krb5: authentication succeeds for `myuser' May 31 17:25:31 mybox sshd(pam_unix)[12915]: session opened for user myuser by (uid=0) As you can see, TGT validation fails. This only happens for this specific account, not for any other. The failing useraccount's password has been reset, I inspected both user objects in Active Directory, but I see nothing out of the ordinary. If I have the failing useraccount log into a RHEL4 or 5 box, there is not problem, so it must be RHEL3 specific, but the fact that only one account suffers from this, alludes me. Maybe someone has seen this before?

    Read the article

  • Linux RFID reader HID Device not matching driver

    - by blietaer
    Hello, I got a RFID reader (GigaTek PCR330A-00) that is meant to be recognized under linux/windows as a (Human Interface Device) keyboard/USB. I hate to say this but it is working as a charm under Win7 but not "really" under Linux. Under Debian-like distros (x/k/Ubuntu, Debian,..), or Gentoo, or... I just can't have the device working at all: the device scan well (it has its USB 5V, so it is happy/beeping/blinking) something happened in the dmesg, but no immediate screen display of the RFID Tag code as expected (and seen under win7) Support is claiming it is ok under RHEL or SLED "enterprises" distros... and I must admit I saw it working under a RHEL4... I tried stealing the driver but did not succeed having my reader working... My question is thus double: 1./ How can I hack the kernel to add support to my device (simply register PID/VID?) ? 2./ What is different at all in a "enterprise" proprietary distro? how can I re-use it? Thank you for any hint/help. Cheers,

    Read the article

  • Which tools to use and how to find file descriptors leaking from Glassfish?

    - by cclark
    We release new code to production every week and Glassfish hasn't had any problems. This weekend we had to move racks at our hosting provider. There were not any code changes (they just powered off, moved, re-racked and powered on) but we're on a new network infrastructure and suddenly we're leaking file descriptors like a sieve. So I'm guessing there is some sort of connection attempting to be made which now fails due to a network change. I'm running Glassfish v2ur2-b04/AS9.1_02 on RHEL4 with an embedded IMQ instance. After the move I started seeing: [#|2010-04-25T05:34:02.783+0000|SEVERE|sun-appserver9.1|javax.enterprise.system.container.web|_ThreadID=33;_ThreadName=SelectorThread-?4848;_RequestID=c4de6f6d-c1d6-416d-ac6e-49750b1a36ff;|WEB0756: Caught exception during HTTP processing. java.io.IOException: Too many open files at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method) ... [#|2010-04-25T05:34:03.327+0000|WARNING|sun-appserver9.1|javax.enterprise.system.stream.err|_ThreadID=34;_ThreadName=Timer-1;_RequestID=d27e1b94-d359-4d90-a6e3-c7ec49a0f383;|java.lang.NullPointerException at com.sun.jbi.management.system.AutoAdminTask.pollAutoDirectory(AutoAdminTask.java:1031) Using lsof I check the number of file descriptors and I see quite a few entries which look like: java 18510 root 8556u sock 0,4 1555182 can't identify protocol java 18510 root 8557u sock 0,4 1555320 can't identify protocol java 18510 root 8558u sock 0,4 1555736 can't identify protocol java 18510 root 8559u sock 0,4 1555883 can't identify protocol If I do a count of open file descriptors every minute I see it growing by 12 every minute. I have no idea what these sockets are. I've undeployed my application so there is only a plain Glassfish instance running and I still see it leaking 12 file descriptors a minute. So I think this leak is in Glassfish or potentially IMQ. What approach should I take to tracking down these sockets of unknown protocol? What tools can I use (or flags can I pass to lsof) to get more information about where to look? thanks, chuck

    Read the article

  • Passing enums to functions in C++

    - by rocknroll
    Hi all, I have a header file with all the enums listed (#ifndef #define #endif construct has been used to avoid multiple inclusion of the file) that I use in multiple cpp files in my application.One of the enums in the files is enum StatusSubsystem {ENABLED,INCORRECT_FRAME,INVALID_DATA,DISABLED}; There are functions in the application delcared as ShowStatus(const StatusSubsystem&); Earlier in the application when I made calls to the above function like ShowStatus(INCORRECT_FRAME); my application used to compile perfectly. But after some code was added The compilation halts giving the following error: File.cpp:71: error: invalid conversion from `int' to `StatusSubsystem' File.cpp:71: error: initializing argument 1 of `void Class::ShowStatus(const StatusSubsystem&) I checked the code for any conflicting enums in the new code and it looked fine. My Question is what is wrong with the function call that compiler shows as erroneous? For your reference the function definition is: void Class::ShowStatus(const StatusSubsystem& eStatus) { QPalette palette; mStatus=eStatus;//store current Communication status of system if(eStatus==DISABLED) { //select red color for label, if it is to be shown disabled palette.setColor(QPalette::Window,QColor(Qt::red)); mLabel->setText("SYSTEM"); } else if(eStatus==ENABLED) { //select green color for label,if it is to be shown enabled palette.setColor(QPalette::Window,QColor(Qt::green)); mLabel->setText("SYSTEM"); } else if(eStatus==INCORRECT_FRAME) { //select yellow color for label,to show that it is sending incorrect frames palette.setColor(QPalette::Window,QColor(Qt::yellow)); mLabel->setText("SYSTEM(I)"); } //Set the color on the Label mLabel->setPalette(palette); } A strange side effect of this situation is it compiles when I cast all the calls to ShowStatus() as ShowStatus((StatusSubsystem)INCORRECT_FRAME); Though this removes any compilation error, but a strange thing happens. Though I make call to INCORRECT_FRAME above but in function definition it matches with ENABLED. How on earth is that possible? Its like while passing INCORRECT_FRAME by reference, it magically converts to ENABLED, which should be impossible. This is driving me nuts. Can you find any flaw in what I am doing? or is it something else? The application is made using C++,Qt-4.2.1 on RHEL4. Thanks.

    Read the article

  • Weird seg fault problem

    - by bluedaemon
    Greetings, I'm having a weird seg fault problem. My application dumps a core file at runtime. After digging into it I found it died in this block: #include <lib1/c.h> ... x::c obj; obj.func1(); I defined class c in a library lib1: namespace x { struct c { c(); ~c(); void fun1(); vector<char *> _data; }; } x::c::c() { } x::c::~c() { for ( int i = 0; i < _data.size(); ++i ) delete _data[i]; } I could not figure it out for some time till I ran nm on the lib1.so file: there are more function definitions than I defined: x::c::c() x::c::c() x::c::~c() x::c::~c() x::c::func1() x::c::func2() After searching in code base I found someone else defined a class with same name in same namespace, but in another library lib2 as follows: namespace x { struct c { c(); ~c(); void func2(); vector<string> strs_; }; } x::c::c() { } x::c::~c() { } My application links to lib2, which has dependency on lib1. This interesting behavior brings several questions: Why would it even work? I would expect a "multiple definitions" error while linking against lib2 (which depends upon lib1) but never had such. The application seems to be doing what's defined in func1 except it dumps a core at runtime. After attaching debugger, I found my application calls the ctor of class c in lib2, then calls func1 (defined in lib1). When going out of scope it calls dtor of class c in lib2, where the seg fault occurs. Can anybody teach me how this could even occur? How can I prevent such problems from happening again? Is there any C++ syntax I can use? Forgot to mention I'm using g++ 4.1 on RHEL4, thank you very much!

    Read the article

  • ASMLib

    - by wcoekaer
    Oracle ASMlib on Linux has been a topic of discussion a number of times since it was released way back when in 2004. There is a lot of confusion around it and certainly a lot of misinformation out there for no good reason. Let me try to give a bit of history around Oracle ASMLib. Oracle ASMLib was introduced at the time Oracle released Oracle Database 10g R1. 10gR1 introduced a very cool important new features called Oracle ASM (Automatic Storage Management). A very simplistic description would be that this is a very sophisticated volume manager for Oracle data. Give your devices directly to the ASM instance and we manage the storage for you, clustered, highly available, redundant, performance, etc, etc... We recommend using Oracle ASM for all database deployments, single instance or clustered (RAC). The ASM instance manages the storage and every Oracle server process opens and operates on the storage devices like it would open and operate on regular datafiles or raw devices. So by default since 10gR1 up to today, we do not interact differently with ASM managed block devices than we did before with a datafile being mapped to a raw device. All of this is without ASMLib, so ignore that one for now. Standard Oracle on any platform that we support (Linux, Windows, Solaris, AIX, ...) does it the exact same way. You start an ASM instance, it handles storage management, all the database instances use and open that storage and read/write from/to it. There are no extra pieces of software needed, including on Linux. ASM is fully functional and selfcontained without any other components. In order for the admin to provide a raw device to ASM or to the database, it has to have persistent device naming. If you booted up a server where a raw disk was named /dev/sdf and you give it to ASM (or even just creating a tablespace without asm on that device with datafile '/dev/sdf') and next time you boot up and that device is now /dev/sdg, you end up with an error. Just like you can't just change datafile names, you can't change device filenames without telling the database, or ASM. persistent device naming on Linux, especially back in those days ways to say it bluntly, a nightmare. In fact there were a number of issues (dating back to 2004) : Linux async IO wasn't pretty persistent device naming including permissions (had to be owned by oracle and the dba group) was very, very difficult to manage system resource usage in terms of open file descriptors So given the above, we tried to find a way to make this easier on the admins, in many ways, similar to why we started working on OCFS a few years earlier - how can we make life easier for the admins on Linux. A feature of Oracle ASM is the ability for third parties to write an extension using what's called ASMLib. It is possible for any third party OS or storage vendor to write a library using a specific Oracle defined interface that gets used by the ASM instance and by the database instance when available. This interface offered 2 components : Define an IO interface - allow any IO to the devices to go through ASMLib Define device discovery - implement an external way of discovering, labeling devices to provide to ASM and the Oracle database instance This is similar to a library that a number of companies have implemented over many years called libODM (Oracle Disk Manager). ODM was specified many years before we introduced ASM and allowed third party vendors to implement their own IO routines so that the database would use this library if installed and make use of the library open/read/write/close,.. routines instead of the standard OS interfaces. PolyServe back in the day used this to optimize their storage solution, Veritas used (and I believe still uses) this for their filesystem. It basically allowed, in particular, filesystem vendors to write libraries that could optimize access to their storage or filesystem.. so ASMLib was not something new, it was basically based on the same model. You have libodm for just database access, you have libasm for asm/database access. Since this library interface existed, we decided to do a reference implementation on Linux. We wrote an ASMLib for Linux that could be used on any Linux platform and other vendors could see how this worked and potentially implement their own solution. As I mentioned earlier, ASMLib and ODMLib are libraries for third party extensions. ASMLib for Linux, since it was a reference implementation implemented both interfaces, the storage discovery part and the IO part. There are 2 components : Oracle ASMLib - the userspace library with config tools (a shared object and some scripts) oracleasm.ko - a kernel module that implements the asm device for /dev/oracleasm/* The userspace library is a binary-only module since it links with and contains Oracle header files but is generic, we only have one asm library for the various Linux platforms. This library is opened by Oracle ASM and by Oracle database processes and this library interacts with the OS through the asm device (/dev/asm). It can install on Oracle Linux, on SuSE SLES, on Red Hat RHEL,.. The library itself doesn't actually care much about the OS version, the kernel module and device cares. The support tools are simple scripts that allow the admin to label devices and scan for disks and devices. This way you can say create an ASM disk label foo on, currently /dev/sdf... So if /dev/sdf disappears and next time is /dev/sdg, we just scan for the label foo and we discover it as /dev/sdg and life goes on without any worry. Also, when the database needs access to the device, we don't have to worry about file permissions or anything it will be taken care of. So it's a convenience thing. The kernel module oracleasm.ko is a Linux kernel module/device driver. It implements a device /dev/oracleasm/* and any and all IO goes through ASMLib - /dev/oracleasm. This kernel module is obviously a very specific Oracle related device driver but it was released under the GPL v2 so anyone could easily build it for their Linux distribution kernels. Advantages for using ASMLib : A good async IO interface for the database, the entire IO interface is based on an optimal ASYNC model for performance A single file descriptor per Oracle process, not one per device or datafile per process reducing # of open filehandles overhead Device scanning and labeling built-in so you do not have to worry about messing with udev or devlabel, permissions or the likes which can be very complex and error prone. Just like with OCFS and OCFS2, each kernel version (major or minor) has to get a new version of the device drivers. We started out building the oracleasm kernel module rpms for many distributions, SLES (in fact in the early days still even for this thing called United Linux) and RHEL. The driver didn't make sense to get pushed into upstream Linux because it's unique and specific to the Oracle database. As it takes a huge effort in terms of build infrastructure and QA and release management to build kernel modules for every architecture, every linux distribution and every major and minor version we worked with the vendors to get them to add this tiny kernel module to their infrastructure. (60k source code file). The folks at SuSE understood this was good for them and their customers and us and added it to SLES. So every build coming from SuSE for SLES contains the oracleasm.ko module. We weren't as successful with other vendors so for quite some time we continued to build it for RHEL and of course as we introduced Oracle Linux end of 2006 also for Oracle Linux. With Oracle Linux it became easy for us because we just added the code to our build system and as we churned out Oracle Linux kernels whether it was for a public release or for customers that needed a one off fix where they also used asmlib, we didn't have to do any extra work it was just all nicely integrated. With the introduction of Oracle Linux's Unbreakable Enterprise Kernel and our interest in being able to exploit ASMLib more, we started working on a very exciting project called Data Integrity. Oracle (Martin Petersen in particular) worked for many years with the T10 standards committee and storage vendors and implemented Linux kernel support for DIF/DIX, data protection in the Linux kernel, note to those that wonder, yes it's all in mainline Linux and under the GPL. This basically gave us all the features in the Linux kernel to checksum a data block, send it to the storage adapter, which can then validate that block and checksum in firmware before it sends it over the wire to the storage array, which can then do another checksum and to the actual DISK which does a final validation before writing the block to the physical media. So what was missing was the ability for a userspace application (read: Oracle RDBMS) to write a block which then has a checksum and validation all the way down to the disk. application to disk. Because we have ASMLib we had an entry into the Linux kernel and Martin added support in ASMLib (kernel driver + userspace) for this functionality. Now, this is all based on relatively current Linux kernels, the oracleasm kernel module depends on the main kernel to have support for it so we can make use of it. Thanks to UEK and us having the ability to ship a more modern, current version of the Linux kernel we were able to introduce this feature into ASMLib for Linux from Oracle. This combined with the fact that we build the asm kernel module when we build every single UEK kernel allowed us to continue improving ASMLib and provide it to our customers. So today, we (Oracle) provide Oracle ASMLib for Oracle Linux and in particular on the Unbreakable Enterprise Kernel. We did the build/testing/delivery of ASMLib for RHEL until RHEL5 but since RHEL6 decided that it was too much effort for us to also maintain all the build and test environments for RHEL and we did not have the ability to use the latest kernel features to introduce the Data Integrity features and we didn't want to end up with multiple versions of asmlib as maintained by us. SuSE SLES still builds and comes with the oracleasm module and they do all the work and RHAT it certainly welcome to do the same. They don't have to rebuild the userspace library, it's really about the kernel module. And finally to re-iterate a few important things : Oracle ASM does not in any way require ASMLib to function completely. ASMlib is a small set of extensions, in particular to make device management easier but there are no extra features exposed through Oracle ASM with ASMLib enabled or disabled. Often customers confuse ASMLib with ASM. again, ASM exists on every Oracle supported OS and on every supported Linux OS, SLES, RHEL, OL withoutASMLib Oracle ASMLib userspace is available for OTN and the kernel module is shipped along with OL/UEK for every build and by SuSE for SLES for every of their builds ASMLib kernel module was built by us for RHEL4 and RHEL5 but we do not build it for RHEL6, nor for the OL6 RHCK kernel. Only for UEK ASMLib for Linux is/was a reference implementation for any third party vendor to be able to offer, if they want to, their own version for their own OS or storage ASMLib as provided by Oracle for Linux continues to be enhanced and evolve and for the kernel module we use UEK as the base OS kernel hope this helps.

    Read the article

  • What's up with OCFS2?

    - by wcoekaer
    On Linux there are many filesystem choices and even from Oracle we provide a number of filesystems, all with their own advantages and use cases. Customers often confuse ACFS with OCFS or OCFS2 which then causes assumptions to be made such as one replacing the other etc... I thought it would be good to write up a summary of how OCFS2 got to where it is, what we're up to still, how it is different from other options and how this really is a cool native Linux cluster filesystem that we worked on for many years and is still widely used. Work on a cluster filesystem at Oracle started many years ago, in the early 2000's when the Oracle Database Cluster development team wrote a cluster filesystem for Windows that was primarily focused on providing an alternative to raw disk devices and help customers with the deployment of Oracle Real Application Cluster (RAC). Oracle RAC is a cluster technology that lets us make a cluster of Oracle Database servers look like one big database. The RDBMS runs on many nodes and they all work on the same data. It's a Shared Disk database design. There are many advantages doing this but I will not go into detail as that is not the purpose of my write up. Suffice it to say that Oracle RAC expects all the database data to be visible in a consistent, coherent way, across all the nodes in the cluster. To do that, there were/are a few options : 1) use raw disk devices that are shared, through SCSI, FC, or iSCSI 2) use a network filesystem (NFS) 3) use a cluster filesystem(CFS) which basically gives you a filesystem that's coherent across all nodes using shared disks. It is sort of (but not quite) combining option 1 and 2 except that you don't do network access to the files, the files are effectively locally visible as if it was a local filesystem. So OCFS (Oracle Cluster FileSystem) on Windows was born. Since Linux was becoming a very important and popular platform, we decided that we would also make this available on Linux and thus the porting of OCFS/Windows started. The first version of OCFS was really primarily focused on replacing the use of Raw devices with a simple filesystem that lets you create files and provide direct IO to these files to get basically native raw disk performance. The filesystem was not designed to be fully POSIX compliant and it did not have any where near good/decent performance for regular file create/delete/access operations. Cache coherency was easy since it was basically always direct IO down to the disk device and this ensured that any time one issues a write() command it would go directly down to the disk, and not return until the write() was completed. Same for read() any sort of read from a datafile would be a read() operation that went all the way to disk and return. We did not cache any data when it came down to Oracle data files. So while OCFS worked well for that, since it did not have much of a normal filesystem feel, it was not something that could be submitted to the kernel mail list for inclusion into Linux as another native linux filesystem (setting aside the Windows porting code ...) it did its job well, it was very easy to configure, node membership was simple, locking was disk based (so very slow but it existed), you could create regular files and do regular filesystem operations to a certain extend but anything that was not database data file related was just not very useful in general. Logfiles ok, standard filesystem use, not so much. Up to this point, all the work was done, at Oracle, by Oracle developers. Once OCFS (1) was out for a while and there was a lot of use in the database RAC world, many customers wanted to do more and were asking for features that you'd expect in a normal native filesystem, a real "general purposes cluster filesystem". So the team sat down and basically started from scratch to implement what's now known as OCFS2 (Oracle Cluster FileSystem release 2). Some basic criteria were : Design it with a real Distributed Lock Manager and use the network for lock negotiation instead of the disk Make it a Linux native filesystem instead of a native shim layer and a portable core Support standard Posix compliancy and be fully cache coherent with all operations Support all the filesystem features Linux offers (ACL, extended Attributes, quotas, sparse files,...) Be modern, support large files, 32/64bit, journaling, data ordered journaling, endian neutral, we can mount on both endian /cross architecture,.. Needless to say, this was a huge development effort that took many years to complete. A few big milestones happened along the way... OCFS2 was development in the open, we did not have a private tree that we worked on without external code review from the Linux Filesystem maintainers, great folks like Christopher Hellwig reviewed the code regularly to make sure we were not doing anything out of line, we submitted the code for review on lkml a number of times to see if we were getting close for it to be included into the mainline kernel. Using this development model is standard practice for anyone that wants to write code that goes into the kernel and having any chance of doing so without a complete rewrite or.. shall I say flamefest when submitted. It saved us a tremendous amount of time by not having to re-fit code for it to be in a Linus acceptable state. Some other filesystems that were trying to get into the kernel that didn't follow an open development model had a lot harder time and a lot harsher criticism. March 2006, when Linus released 2.6.16, OCFS2 officially became part of the mainline kernel, it was accepted a little earlier in the release candidates but in 2.6.16. OCFS2 became officially part of the mainline Linux kernel tree as one of the many filesystems. It was the first cluster filesystem to make it into the kernel tree. Our hope was that it would then end up getting picked up by the distribution vendors to make it easy for everyone to have access to a CFS. Today the source code for OCFS2 is approximately 85000 lines of code. We made OCFS2 production with full support for customers that ran Oracle database on Linux, no extra or separate support contract needed. OCFS2 1.0.0 started being built for RHEL4 for x86, x86-64, ppc, s390x and ia64. For RHEL5 starting with OCFS2 1.2. SuSE was very interested in high availability and clustering and decided to build and include OCFS2 with SLES9 for their customers and was, next to Oracle, the main contributor to the filesystem for both new features and bug fixes. Source code was always available even prior to inclusion into mainline and as of 2.6.16, source code was just part of a Linux kernel download from kernel.org, which it still is, today. So the latest OCFS2 code is always the upstream mainline Linux kernel. OCFS2 is the cluster filesystem used in Oracle VM 2 and Oracle VM 3 as the virtual disk repository filesystem. Since the filesystem is in the Linux kernel it's released under the GPL v2 The release model has always been that new feature development happened in the mainline kernel and we then built consistent, well tested, snapshots that had versions, 1.2, 1.4, 1.6, 1.8. But these releases were effectively just snapshots in time that were tested for stability and release quality. OCFS2 is very easy to use, there's a simple text file that contains the node information (hostname, node number, cluster name) and a file that contains the cluster heartbeat timeouts. It is very small, and very efficient. As Sunil Mushran wrote in the manual : OCFS2 is an efficient, easily configured, quickly installed, fully integrated and compatible, feature-rich, architecture and endian neutral, cache coherent, ordered data journaling, POSIX-compliant, shared disk cluster file system. Here is a list of some of the important features that are included : Variable Block and Cluster sizes Supports block sizes ranging from 512 bytes to 4 KB and cluster sizes ranging from 4 KB to 1 MB (increments in power of 2). Extent-based Allocations Tracks the allocated space in ranges of clusters making it especially efficient for storing very large files. Optimized Allocations Supports sparse files, inline-data, unwritten extents, hole punching and allocation reservation for higher performance and efficient storage. File Cloning/snapshots REFLINK is a feature which introduces copy-on-write clones of files in a cluster coherent way. Indexed Directories Allows efficient access to millions of objects in a directory. Metadata Checksums Detects silent corruption in inodes and directories. Extended Attributes Supports attaching an unlimited number of name:value pairs to the file system objects like regular files, directories, symbolic links, etc. Advanced Security Supports POSIX ACLs and SELinux in addition to the traditional file access permission model. Quotas Supports user and group quotas. Journaling Supports both ordered and writeback data journaling modes to provide file system consistency in the event of power failure or system crash. Endian and Architecture neutral Supports a cluster of nodes with mixed architectures. Allows concurrent mounts on nodes running 32-bit and 64-bit, little-endian (x86, x86_64, ia64) and big-endian (ppc64) architectures. In-built Cluster-stack with DLM Includes an easy to configure, in-kernel cluster-stack with a distributed lock manager. Buffered, Direct, Asynchronous, Splice and Memory Mapped I/Os Supports all modes of I/Os for maximum flexibility and performance. Comprehensive Tools Support Provides a familiar EXT3-style tool-set that uses similar parameters for ease-of-use. The filesystem was distributed for Linux distributions in separate RPM form and this had to be built for every single kernel errata release or every updated kernel provided by the vendor. We provided builds from Oracle for Oracle Linux and all kernels released by Oracle and for Red Hat Enterprise Linux. SuSE provided the modules directly for every kernel they shipped. With the introduction of the Unbreakable Enterprise Kernel for Oracle Linux and our interest in reducing the overhead of building filesystem modules for every minor release, we decide to make OCFS2 available as part of UEK. There was no more need for separate kernel modules, everything was built-in and a kernel upgrade automatically updated the filesystem, as it should. UEK allowed us to not having to backport new upstream filesystem code into an older kernel version, backporting features into older versions introduces risk and requires extra testing because the code is basically partially rewritten. The UEK model works really well for continuing to provide OCFS2 without that extra overhead. Because the RHEL kernel did not contain OCFS2 as a kernel module (it is in the source tree but it is not built by the vendor in kernel module form) we stopped adding the extra packages to Oracle Linux and its RHEL compatible kernel and for RHEL. Oracle Linux customers/users obviously get OCFS2 included as part of the Unbreakable Enterprise Kernel, SuSE customers get it by SuSE distributed with SLES and Red Hat can decide to distribute OCFS2 to their customers if they chose to as it's just a matter of compiling the module and making it available. OCFS2 today, in the mainline kernel is pretty much feature complete in terms of integration with every filesystem feature Linux offers and it is still actively maintained with Joel Becker being the primary maintainer. Since we use OCFS2 as part of Oracle VM, we continue to look at interesting new functionality to add, REFLINK was a good example, and as such we continue to enhance the filesystem where it makes sense. Bugfixes and any sort of code that goes into the mainline Linux kernel that affects filesystems, automatically also modifies OCFS2 so it's in kernel, actively maintained but not a lot of new development happening at this time. We continue to fully support OCFS2 as part of Oracle Linux and the Unbreakable Enterprise Kernel and other vendors make their own decisions on support as it's really a Linux cluster filesystem now more than something that we provide to customers. It really just is part of Linux like EXT3 or BTRFS etc, the OS distribution vendors decide. Do not confuse OCFS2 with ACFS (ASM cluster Filesystem) also known as Oracle Cloud Filesystem. ACFS is a filesystem that's provided by Oracle on various OS platforms and really integrates into Oracle ASM (Automatic Storage Management). It's a very powerful Cluster Filesystem but it's not distributed as part of the Operating System, it's distributed with the Oracle Database product and installs with and lives inside Oracle ASM. ACFS obviously is fully supported on Linux (Oracle Linux, Red Hat Enterprise Linux) but OCFS2 independently as a native Linux filesystem is also, and continues to also be supported. ACFS is very much tied into the Oracle RDBMS, OCFS2 is just a standard native Linux filesystem with no ties into Oracle products. Customers running the Oracle database and ASM really should consider using ACFS as it also provides storage/clustered volume management. Customers wanting to use a simple, easy to use generic Linux cluster filesystem should consider using OCFS2. To learn more about OCFS2 in detail, you can find good documentation on http://oss.oracle.com/projects/ocfs2 in the Documentation area, or get the latest mainline kernel from http://kernel.org and read the source. One final, unrelated note - since I am not always able to publicly answer or respond to comments, I do not want to selectively publish comments from readers. Sometimes I forget to publish comments, sometime I publish them and sometimes I would publish them but if for some reason I cannot publicly comment on them, it becomes a very one-sided stream. So for now I am going to not publish comments from anyone, to be fair to all sides. You are always welcome to email me and I will do my best to respond to technical questions, questions about strategy or direction are sometimes not possible to answer for obvious reasons.

    Read the article

< Previous Page | 1 2