Search Results

Search found 92 results on 4 pages for 'infiniband'.

Page 1/4 | 1 2 3 4  | Next Page >

  • How to Avoid Your Next 12-Month Science Project

    - by constant
    While most customers immediately understand how the magic of Oracle's Hybrid Columnar Compression, intelligent storage servers and flash memory make Exadata uniquely powerful against home-grown database systems, some people think that Exalogic is nothing more than a bunch of x86 servers, a storage appliance and an InfiniBand (IB) network, built into a single rack. After all, isn't this exactly what the High Performance Computing (HPC) world has been doing for decades? On the surface, this may be true. And some people tried exactly that: They tried to put together their own version of Exalogic, but then they discover there's a lot more to building a system than buying hardware and assembling it together. IT is not Ikea. Why is that so? Could it be there's more going on behind the scenes than merely putting together a bunch of servers, a storage array and an InfiniBand network into a rack? Let's explore some of the special sauce that makes Exalogic unique and un-copyable, so you can save yourself from your next 6- to 12-month science project that distracts you from doing real work that adds value to your company. Engineering Systems is Hard Work! The backbone of Exalogic is its InfiniBand network: 4 times better bandwidth than even 10 Gigabit Ethernet, and only about a tenth of its latency. What a potential for increased scalability and throughput across the middleware and database layers! But InfiniBand is a beast that needs to be tamed: It is true that Exalogic uses a standard, open-source Open Fabrics Enterprise Distribution (OFED) InfiniBand driver stack. Unfortunately, this software has been developed by the HPC community with fastest speed in mind (which is good) but, despite the name, not many other enterprise-class requirements are included (which is less good). Here are some of the improvements that Oracle's InfiniBand development team had to add to the OFED stack to make it enterprise-ready, simply because typical HPC users didn't have the need to implement them: More than 100 bug fixes in the pieces that were not related to the Message Passing Interface Protocol (MPI), which is the protocol that HPC users use most of the time, but which is less useful in the enterprise. Performance optimizations and tuning across the whole IB stack: From Switches, Host Channel Adapters (HCAs) and drivers to low-level protocols, middleware and applications. Yes, even the standard HPC IB stack could be improved in terms of performance. Ethernet over IB (EoIB): Exalogic uses InfiniBand internally to reach high performance, but it needs to play nicely with datacenters around it. That's why Oracle added Ethernet over InfiniBand technology to it that allows for creating many virtual 10GBE adapters inside Exalogic's nodes that are aggregated and connected to Exalogic's IB gateway switches. While this is an open standard, it's up to the vendor to implement it. In this case, Oracle integrated the EoIB stack with Oracle's own IB to 10GBE gateway switches, and made it fully virtualized from the beginning. This means that Exalogic customers can completely rewire their server infrastructure inside the rack without having to physically pull or plug a single cable - a must-have for every cloud deployment. Anybody who wants to match this level of integration would need to add an InfiniBand switch development team to their project. Or just buy Oracle's gateway switches, which are conveniently shipped with a whole server infrastructure attached! IPv6 support for InfiniBand's Sockets Direct Protocol (SDP), Reliable Datagram Sockets (RDS), TCP/IP over IB (IPoIB) and EoIB protocols. Because no IPv6 = not very enterprise-class. HA capability for SDP. High Availability is not a big requirement for HPC, but for enterprise-class application servers it is. Every node in Exalogic's InfiniBand network is connected twice for redundancy. If any cable or port or HCA fails, there's always a replacement link ready to take over. This requires extra magic at the protocol level to work. So in addition to Weblogic's failover capabilities, Oracle implemented IB automatic path migration at the SDP level to avoid unnecessary failover operations at the middleware level. Security, for example spoof-protection. Another feature that is less important for traditional users of InfiniBand, but very important for enterprise customers. InfiniBand Partitioning and Quality-of-Service (QoS): One of the first questions we get from customers about Exalogic is: “How can we implement multi-tenancy?” The answer is to partition your IB network, which effectively creates many networks that work independently and that are protected at the lowest networking layer possible. In addition to that, QoS allows administrators to prioritize traffic flow in multi-tenancy environments so they can keep their service levels where it matters most. Resilient IB Fabric Management: InfiniBand is a self-managing network, so a lot of the magic lies in coming up with the right topology and in teaching the subnet manager how to properly discover and manage the network. Oracle's Infiniband switches come with pre-integrated, highly available fabric management with seamless integration into Oracle Enterprise Manager Ops Center. In short: Oracle elevated the OFED InfiniBand stack into an enterprise-class networking infrastructure. Many years and multiple teams of manpower went into the above improvements - this is something you can only get from Oracle, because no other InfiniBand vendor can give you these features across the whole stack! Exabus: Because it's not About the Size of Your Network, it's How You Use it! So let's assume that you somehow were able to get your hands on an enterprise-class IB driver stack. Or maybe you don't care and are just happy with the standard OFED one? Anyway, the next step is to actually leverage that InfiniBand performance. Here are the choices: Use traditional TCP/IP on top of the InfiniBand stack, Develop your own integration between your middleware and the lower-level (but faster) InfiniBand protocols. While more bandwidth is always a good thing, it's actually the low latency that enables superior performance for your applications when running on any networking infrastructure: The lower the latency, the faster the response travels through the network and the more transactions you can close per second. The reason why InfiniBand is such a low latency technology is that it gets rid of most if not all of your traditional networking protocol stack: Data is literally beamed from one region of RAM in one server into another region of RAM in another server with no kernel/drivers/UDP/TCP or other networking stack overhead involved! Which makes option 1 a no-go: Adding TCP/IP on top of InfiniBand is like adding training wheels to your racing bike. It may be ok in the beginning and for development, but it's not quite the performance IB was meant to deliver. Which only leaves option 2: Integrating your middleware with fast, low-level InfiniBand protocols. And this is what Exalogic's "Exabus" technology is all about. Here are a few Exabus features that help applications leverage the performance of InfiniBand in Exalogic: RDMA and SDP integration at the JDBC driver level (SDP), for Oracle Weblogic (SDP), Oracle Coherence (RDMA), Oracle Tuxedo (RDMA) and the new Oracle Traffic Director (RDMA) on Exalogic. Using these protocols, middleware can communicate a lot faster with each other and the Oracle database than by using standard networking protocols, Seamless Integration of Ethernet over InfiniBand from Exalogic's Gateway switches into the OS, Oracle Weblogic optimizations for handling massive amounts of parallel transactions. Because if you have an 8-lane Autobahn, you also need to improve your ramps so you can feed it with many cars in parallel. Integration of Weblogic with Oracle Exadata for faster performance, optimized session management and failover. As you see, “Exabus” is Oracle's word for describing all the InfiniBand enhancements Oracle put into Exalogic: OFED stack enhancements, protocols for faster IB access, and InfiniBand support and optimizations at the virtualization and middleware level. All working together to deliver the full potential of InfiniBand performance. Who else has 100% control over their middleware so they can develop their own low-level protocol integration with InfiniBand? Even if you take an open source approach, you're looking at years of development work to create, test and support a whole new networking technology in your middleware! The Extras: Less Hassle, More Productivity, Faster Time to Market And then there are the other advantages of Engineered Systems that are true for Exalogic the same as they are for every other Engineered System: One simple purchasing process: No headaches due to endless RFPs and no “Will X work with Y?” uncertainties. Everything has been engineered together: All kinds of bugs and problems have been already fixed at the design level that would have only manifested themselves after you have built the system from scratch. Everything is built, tested and integrated at the factory level . Less integration pain for you, faster time to market. Every Exalogic machine world-wide is identical to Oracle's own machines in the lab: Instant replication of any problems you may encounter, faster time to resolution. Simplified patching, management and operations. One throat to choke: Imagine finger-pointing hell for systems that have been put together using several different vendors. Oracle's Engineered Systems have a single phone number that customers can call to get their problems solved. For more business-centric values, read The Business Value of Engineered Systems. Conclusion: Buy Exalogic, or get ready for a 6-12 Month Science Project And here's the reason why it's not easy to "build your own Exalogic": There's a lot of work required to make such a system fly. In fact, anybody who is starting to "just put together a bunch of servers and an InfiniBand network" is really looking at a 6-12 month science project. And the outcome is likely to not be very enterprise-class. And it won't have Exalogic's performance either. Because building an Engineered System is literally rocket science: It takes a lot of time, effort, resources and many iterations of design/test/analyze/fix to build such a system. That's why InfiniBand has been reserved for HPC scientists for such a long time. And only Oracle can bring the power of InfiniBand in an enterprise-class, ready-to use, pre-integrated version to customers, without the develop/integrate/support pain. For more details, check the new Exalogic overview white paper which was updated only recently. P.S.: Thanks to my colleagues Ola, Paul, Don and Andy for helping me put together this article! var flattr_uid = '26528'; var flattr_tle = 'How to Avoid Your Next 12-Month Science Project'; var flattr_dsc = 'While most customers immediately understand how the magic of Oracle's Hybrid Columnar Compression, intelligent storage servers and flash memory make Exadata uniquely powerful against home-grown database systems, some people think that Exalogic is nothing more than a bunch of x86 servers, a storage appliance and an InfiniBand (IB) network, built into a single rack.After all, isn't this exactly what the High Performance Computing (HPC) world has been doing for decades?On the surface, this may be true. And some people tried exactly that: They tried to put together their own version of Exalogic, but then they discover there's a lot more to building a system than buying hardware and assembling it together. IT is not Ikea.Why is that so? Could it be there's more going on behind the scenes than merely putting together a bunch of servers, a storage array and an InfiniBand network into a rack? Let's explore some of the special sauce that makes Exalogic unique and un-copyable, so you can save yourself from your next 6- to 12-month science project that distracts you from doing real work that adds value to your company.'; var flattr_tag = 'Engineered Systems,Engineered Systems,Infiniband,Integration,latency,Oracle,performance'; var flattr_cat = 'text'; var flattr_url = 'http://constantin.glez.de/blog/2012/04/how-avoid-your-next-12-month-science-project'; var flattr_lng = 'en_GB'

    Read the article

  • allow infiniband for non root users

    - by user1219721
    I got Infiniband running on RHEL 6.3 [root@master ~]# ibv_devinfo hca_id: mthca0 transport: InfiniBand (0) fw_ver: 4.7.927 node_guid: 0017:08ff:ffd0:6f1c sys_image_guid: 0017:08ff:ffd0:6f1f vendor_id: 0x08f1 vendor_part_id: 25208 hw_ver: 0xA0 board_id: VLT0060010001 phys_port_cnt: 2 port: 1 state: PORT_ACTIVE (4) max_mtu: 2048 (4) active_mtu: 2048 (4) sm_lid: 2 port_lid: 3 port_lmc: 0x00 link_layer: InfiniBand port: 2 state: PORT_DOWN (1) max_mtu: 2048 (4) active_mtu: 512 (2) sm_lid: 0 port_lid: 0 port_lmc: 0x00 link_layer: InfiniBand but it's only working as root. when trying from a non-super user, I got nothing : [nicolas@master ~]$ ibv_devices device node GUID ------ ---------------- mthca0 001708ffffd06f1c So, how to allow regular users to use infiniband ?

    Read the article

  • switchless Infiniband between two servers on RHEL 6.3

    - by exfizik
    I have 2 servers running RHEL 6.3 which have 2 port Infiniband cards >lspci | grep -i infini 07:00.0 InfiniBand: QLogic Corp. IBA7322 QDR InfiniBand HCA (rev 02) I'm interested in connecting them directly to each other bypassing an Infiniband switch (which I don't have). Quick googling showed that at least in some configurations it's possible. I installed all RedHat Infiniband packages with yum groupinstall "Infiniband Support". However, ibv_devinfo shows that both ports in each card are down, which indicates that cables are not connected. But the cable is connected, although the LEDs are off on the cards (not a good sign). Another source of confusion for me is that according to this, RedHat doesn't come with OFED packages and I'm slightly hesitant to install them from source due to the lack of RedHat support for them... So where am I going with this? The questions I have are: is it possible to have a switchless/direct Infiniband connection between two servers the way I described above? If it's possible, do I have to use the OFED packages or can I configure everything with just the packages coming with RHEL. Why are the LEDs off on my servers even though the cable is connected? Any additional input/advice/pointers would be appreciated. P.S. I followed this guide for installation instructions. The Infiniband cards are clearly recognized by my OS and the rdma service is running. Update: I have opensm installed. When I run it it says: OpenSM 3.3.13 Command Line Arguments: Log File: /var/log/opensm.log ------------------------------------------------- OpenSM 3.3.13 Entering DISCOVERING state Using default GUID 0x1175000076e4c8 SM port is down and stays at that point.

    Read the article

  • InfiniBand Enabled Diskless PXE Boot

    - by Neeraj Gupta
    When you want to bring up a compute server in your environment and need InfiniBand connectivity, usually you go through various installation steps. This could involve operating systems like Linux, followed by a compatible InfiniBand software distribution, associated dependencies and configurations. What if you just want to run some InfiniBand diagnostics or troubleshooting tools from a test machine ? What if something happened to your primary machine and while recovering in rescue mode, you also need access to your InfiniBand network ? Often times we use opensource community supported small Linux distributions but they don't come with required InfiniBand support and tools. In this weblog, I am going to provide instructions on how to add InfniBand support to a specific Linux image - Parted Magic.This is a free to use opensource Linux distro often used to recover or rescue machines. The distribution itself will not be changed at all. Yes, you heard it right ! I have built an InfiniBand Add-on package that will be passed to the default kernel and initrd to get this all working. Pr-requisites You will need to have have a PXE server ready on your ethernet based network. The compute server you are trying to PXE boot should have a compatible IB HCA and must be connected to an active IB network. Required Downloads Download the Parted Magic small distribution for PXE from Parted Magic website. Download InfiniBand PXE Add On package. Right Click and Download from here. Do not extract contents of this file. You need to use it as is. Prepare PXE Server Extract the contents of downloaded pmagic distribution into a temporary directory. Inside the directory structure, you will see pmagic directory containing two files - bzImage and initrd.img. Copy this directory in your TFTP server's root directory. This is usually /tftpboot unless you have a different setup. For Example: cp pmagic_pxe_2012_2_27_x86_64.zip /tmp cd /tmp unzip pmagic_pxe_2012_2_27_x86_64.zip cd pmagic_pxe_2012_2_27_x86_64 # ls -l total 12 drwxr-xr-x  3 root root 4096 Feb 27 15:48 boot drwxr-xr-x  2 root root 4096 Mar 17 22:19 pmagic cp -r pmagic /tftpboot As I mentioned earlier, we dont change anything to the default pmagic distro. Simply provide the add-on package via PXE append options. If you are using a menu based PXE server, then add an entry to your menu. For example /tftpboot/pxelinux.cfg/default can be appended with following section. LABEL Diskless Boot With InfiniBand Support MENU LABEL Diskless Boot With InfiniBand Support KERNEL pmagic/bzImage APPEND initrd=pmagic/initrd.img,pmagic/ib-pxe-addon.cgz edd=off load_ramdisk=1 prompt_ramdisk=0 rw vga=normal loglevel=9 max_loop=256 TEXT HELP * A Linux Image which can be used to PXE Boot w/ IB tools ENDTEXT Note: Keep the line starting with "APPEND" as a single line. If you use host specific files in pxelinux.cfg, then you can use that specific file to add the above mentioned entry. Boot Computer over PXE Now boot your desired compute machine over PXE. This does not have to be over InfiniBand. Just use your standard ethernet interface and network. If using menus, then pick the new entry that you created in previous section. Enable IPoIB After a few minutes, you will be booted into Parted Magic environment. Open a terminal session and see if InfiniBand is enabled. You can use commands like: ifconfig -a ibstat ibv_devices ibv_devinfo If you are connected to InfiniBand network with an active Subnet Manager, then your IB interfaces must have come online by now. You can proceed and assign IP address to them. This will enable you at IPoIB layer. Example InfiniBand Diagnostic Tools I have added several InfiniBand Diagnistic tools in this add-on. You can use from following list: ibstat, ibstatus, ibv_devinfo, ibv_devices perfquery, smpquery ibnetdiscover, iblinkinfo.pl ibhosts, ibswitches, ibnodes Wrap Up This concludes this weblog. Here we saw how to bring up a computer with IPoIB and InfiniBand diagnostic tools without installing anything on it. Its almost like running diskless !

    Read the article

  • how Infiniband speed is related to processor speed

    - by user223231
    I have two exactly the same servers and very curious how to make Infiniband interconnection between them? Both servers' basic specs are: CPU: 32GHz = 2x Intel Xeon X5650, 6 core, 2.66GHz and RAM: 24GB per server (edited) How determine what speed of Infiniband will be enough for perfect interconnection? SDR, DDR, QDR or FDR? My logic is 32Ghz = 32Gb/s and 40Gb one is enough, am I right or it is not that simple?

    Read the article

  • infiniband network between 3 servers

    - by grumpf
    Let's say I have 3 different servers, each one with an infiniband card. Each card has 2 different ports. (I don't know about the model yet) Is it possible to create 3 different networks and to allow the 3 servers to communicate with each other without any problems? (and any spof). I guess I just have to setup the /etc/hosts correctly. I really don't know about infiniband, so please help me :) Thanks in advance. EDIT: Point is to NOT USE a switch!

    Read the article

  • Infiniband: a highperformance network fabric - Part I

    - by Karoly Vegh
    Introduction:At the OpenWorld this year I managed to chat with interesting people again - one of them answering Infiniband deepdive questions with ease by coffee turned out to be one of Oracle's IB engineers, Ted Kim, who actually actively participates in the Infiniband Trade Association and integrates Oracle solutions with this highspeed network. This is why I love attending OOW. He granted me an hour of his time to talk about IB. This post is mostly based on that tech interview.Start of the actual post: Traditionally datatransfer between servers and storage elements happens in networks with up to 10 gigabit/seconds or in SANs with up to 8 gbps fiberchannel connections. Happens. Well, data rather trickles through.But nowadays data amounts grow well over the TeraByte order of magnitude, and multisocket/multicore/multithread Servers hunger data that these transfer technologies just can't deliver fast enough, causing all CPUs of this world do one thing at the same speed - waiting for data. And once again, I/O is the bottleneck in computing. FC and Ethernet can't keep up. We have half-TB SSDs, dozens of TB RAM to store data to be modified in, but can't transfer it. Can't backup fast enough, can't replicate fast enough, can't synchronize fast enough, can't load fast enough. The bad news is, everyone is used to this, like back in the '80s everyone was used to start compile jobs and go for a coffee. Or on vacation. The good news is, there's an alternative. Not so-called "bleeding-edge" 8gbps, but (as of now) 56. Not layers of overhead, but low latency. And it is available now. It has been for a while, actually. Welcome to the world of Infiniband. Short history:Infiniband was born as a result of joint efforts of HPAQ, IBM, Intel, Sun and Microsoft. They planned to implement a next-generation I/O fabric, in the 90s. In the 2000s Infiniband (from now on: IB) was quite popular in the high-performance computing field, powering most of the top500 supercomputers. Then in the middle of the decade, Oracle realized its potential and used it as an interconnect backbone for the first Database Machine, the first Exadata. Since then, IB has been booming, Oracle utilizes and supports it in a large set of its HW products, it is the backbone of the famous Engineered Systems: Exadata, SPARC SuperCluster, Exalogic, OVCA and even the new DB backup/recovery box. You can also use it to make servers talk highspeed IP to eachother, or to a ZFS Storage Appliance. Following Oracle's lead, even IBM has jumped the wagon, and leverages IB in its PureFlex systems, their first InfiniBand Machines.IB Structural Overview: If you want to use IB in your servers, the first thing you will need is PCI cards, in IB terms Host Channel Adapters, or HCAs. Just like NICs for Ethernet, or HBAs for FC. In these you plug an IB cable, going to an IB switch providing connection to other IB HCAs. Of course you're going to need drivers for those in your OS. Yes, these are long-available for Solaris and Linux. Now, what protocols can you talk over IB? There's a range of choices. See, IB isn't accepting package loss like Ethernet does, and hence doesn't need to rely on TCP/IP as a workaround for resends. That is, you still can run IP over IB (IPoIB), and that is used in various cases for control functionality, but the datatransfer can run over more efficient protocols - like native IB. About PCI connectivity: IB cards, as you see are fast. They bring low latency, which is just as important as their bandwidth. Current IB cards run at 56 gbit/s. That is slightly more than double of the capacity of a PCI Gen2 slot (of ~25 gbit/s). And IB cards are equipped usually with two ports - that is, altogether you'd need 112 gbit/s PCI slots, to be able to utilize FDR IB cards in an active-active fashion. PCI Gen3 slots provide you with around ~50gbps. This is why the most IB cards are configured in an active-standby way if both ports are used. Once again the PCI slot is the bottleneck. Anyway, the new Oracle servers are equipped with Gen3 PCI slots, an the new IB HCAs support those too. Oracle utilizes the QDR HCAs, running at 40gbp/s brutto, which translates to a 32gbp/s net traffic due to the 10:8 signal-to-data information ratio. Consolidation techniques: Technology never stops to evolve. Mellanox is working on the 100 gbps (EDR) version already, which will be optical, since signal technology doesn't allow EDR to be copper. Also, I hear you say "100gbps? I will never use/need that much". Are you sure? Have you considered consolidation scenarios, where (for example with Oracle Virtual Network) you could consolidate your platform to a high densitiy virtualized solution providing many virtual 10gbps interfaces through that 100gbps? Technology never stops to evolve. I still remember when a 10mbps network was impressively fast. Back in those days, 16MB of RAM was a lot. Now we usually run servers with around 100.000 times more RAM. If network infrastrucure speends could grow as fast as main memory capacities, we'd have a different landscape now :) You can utilize SRIOV as well for consolidation. That is, if you run LDoms (aka Oracle VM Server for SPARC) you do not have to add physical IB cards to all your guest LDoms, and you do not need to run VIO devices through the hypervisor either (avoiding overhead). You can enable SRIOV on those IB cards, which practically virtualizes the PCI bus, and you can dedicate Physical- and Virtual Functions of the virtualized HCAs as native, physical HW devices to your guests. See Raghuram's excellent post explaining SRIOV. SRIOV for IB is supported since LDoms 3.1.  This post is getting lengthier, so I will rename it to Part I, and continue it in a second post. 

    Read the article

  • InfiniBand Enabled Diskless PXE Boot

    - by Neeraj Gupta
    If you ever need to bring up a computer with InfiniBand networking capabilities and diagnostic tools, without even going through any installation on its hard disk, then please read on. In this article, I am going to talk about how to boot a computer over the network using PXE and have IPoIB enabled. Of course, the computer must have a compatible InfiniBand Host Channel Adapter (HCA) installed and connected to your IB network already. [ Read More ]

    Read the article

  • How can I determine which switch the Infiniband subnet manager is running on?

    - by ajdecon
    I recently inherited an Infiniband network containing multiple switches, and I know that one of these switches is running the subnet manager. The rest supposedly have that feature turned off, or were never enabled. The trouble is, I have no idea which one it is... I'd like to replace the switch subnet manager with OpenSM running on a couple of my infrastructure servers. Is there any way, short of logging into each switch individually, to determine which switch is running the SM?

    Read the article

  • Areca 1880ix RAID hangs

    - by Dave
    Areca RAID controller ARC-1880ix-12 (firmware 1.50) hangs when on high load. My setup is: Chenbro 3U chassis Intel S5500BC mainboard Xeon 5603 CPU 16GB of RAM 12 Seagate SAS drives ST32000645SS (2 of them as hot spare, 10 as RAID10) Mellanox Infiniband HBA card This server is working as external infiniband storage for Xen VMs. When load is quite big Areca's firmware hangs - it becomes unreachable even from Areca's ethernet adapter. After resetting the server power it returns to normal operation. While Areca is hanged I can confirm that it is powered (ethernet link is active) and Infiniband HBA works Ok. Thanks in advance for any idea or suggestion where the problem might be!

    Read the article

  • Is Infiniband going to get squeezed by iWARP and external QPI?

    - by andy.grover
    The Inquirer certainly thinks so.However, I'm not so sure it makes sense to compare Infiniband to an as-yet-unannounced optical external QPI. QPI is currently a processor interconnect. CPUs, RAM, and devices connected by it are conceptually part of the same machine -- they run a single OS, for example. They are both "networks" or "fabrics" but they have very different design trade-offs.Another widely-used bus in the system is closer to Infiniband than QPI -- PCI Express. Isn't it more likely that PCIe could take on IB? There are companies already who have solutions that use external PCI Express for cluster interconnect, but these have not gained significant market share. Why would QPI, a technology whose sweet spot is even further from Infiniband's than PCIe, be able to challenge Infiniband? It's hard to speculate without much information, but right now it doesn't seem likely to me.The other prediction made in the article is that Intel's 10GbE iWARP card could squeeze IB on the low end, due to its greater compatibility and lower cost.It's definitely never a good idea to bet against Ethernet when it comes to mass-market, commodity networking. Ethernet will win. 10GbE will win. But, there are now two competing ways to implement the low-latency RDMA Verbs interface on top of Ethernet. iWARP is essentially RDMA over TCP/IP over Ethernet. The new alternative is IBoE (Infiniband over Ethernet, aka RoCEE, aka "Rocky"). This encapsulates the IB packet protocol directly in the Ethernet frame. It loses the layer 3 routability of iWARP, but better maintains software compatibility with existing apps that use IB, and is simpler to implement in both software and hardware. iWARP has a substantial head start, but I believe that IBoE silicon will eventually be cheaper, and more likely to be implemented in commodity Ethernet hardware.I think IBoE is going to take low-end market share from traditional IB, but I think this is a situation IB hardware vendors have no problem accepting. Commoditized IBoE NICs invite greater use of RDMA features, and when higher performance is needed, customers can upgrade to "real" IB, maintaining IB's justification for higher prices. (IB max interconnect speeds have historically been 2-4x higher than Ethernet, and I don't see that changing.)(ObDisclosure: My current employer now sells IB hardware. I previously also worked at Intel. My opinions are my own, duh.)

    Read the article

  • Oracle Database Machine and Exadata Storage Server

    - by jean-marc.gaudron(at)oracle.com
    Master Note for Oracle Database Machine and Exadata Storage Server (Doc ID 1187674.1)This Master Note is intended to provide an index and references to the most frequently used My Oracle Support Notes with respect to Oracle Exadata and Oracle Database Machine environments. This Master Note is subdivided into categories to allow for easy access and reference to notes that are applicable to your area of interest. This includes the following categories: • Database Machine and Exadata Storage Server Concepts and Overview• Database Machine and Exadata Storage Server Configuration and Administration• Database Machine and Exadata Storage Server Troubleshooting and Debugging• Database Machine and Exadata Storage Server Best Practices• Database Machine and Exadata Storage Server Patching• Database Machine and Exadata Storage Server Documentation and References• Database Machine and Exadata Storage Server Known Problems• ASM and RAC Documentation• Using My Oracle Support Effectively

    Read the article

  • How do I make dnsmasq serve IP addresses via IPoIB?

    - by Matt
    I have a cluster farm that I'm setting up. The nodes (computers in the farm) are connected via ethernet & IP over Infiniband. I'm needing to netboot the nodes and thought dnsmasq would fit well as it provides all the features including support for DHCP over IB and it works great for our ethernet setup. However, I can't seem to get it to provide IP addresses to the infiniband adaptors on the nodes. Each node is running an Ubuntu desktop 12.04 LTS. The dnsmasq server is running on ubuntu server 12.04LTS and has the following test config: dhcp-authoritative domain-needed bogus-priv expand-hosts no-hosts domain=local dhcp-range=eth0,10.0.0.10,10.0.0.255,12h dhcp-option=eth0,3,10.0.0.1 dhcp-range=ib0,10.1.1.10,10.1.1.255,12h dhcp-option=ib0,3,10.1.1.1 log-queries log-dhcp IPoIB works between nodes when configured statically but not with dhcp. On the nodes the file /etc/network/interfaces contains auto lo iface lo inet loopback auto ib0 iface ib0 inet dhcp #iface ib0 inet static #address 10.1.1.5 #netmask 255.0.0.0 up echo connected >`find /sys -name mode | grep ib0` Is there something I need to do on the client or server end to make this work?

    Read the article

  • Oracle... and InfiniBand.

    - by jenny.gelhausen
    Beginning Sunday, 14th March 2010 the OpenFabrics Alliance has been hosting its annual conference in Sonoma, California. On Monday morning, Tim Shetler - VP of Product Management at Oracle - addressed a conference room full to the brim with the industry's InfiniBand luminaries. That same afternoon, Sumanta Chatterjee, Senior Director of Development at Oracle, was publicly lauded by moderator Bill Boas for being a long-standing, pivotal driver and crucial member of the community. A testament to InfiniBand's building momentum, it is no surprise to find it at the core of Oracle's flagship product, the Sun Oracle Database Machine. var gaJsHost = (("https:" == document.location.protocol) ? "https://ssl." : "http://www."); document.write(unescape("%3Cscript src='" + gaJsHost + "google-analytics.com/ga.js' type='text/javascript'%3E%3C/script%3E")); try { var pageTracker = _gat._getTracker("UA-13185312-1"); pageTracker._trackPageview(); } catch(err) {}

    Read the article

  • How to correctly set up iWARP? Preferably on loopback

    - by ajdecon
    iWARP is a protocol for doing remote direct memory access (RDMA) on top of TCP/IP, so that it can work with Ethernet and other network types as opposed to Infiniband. It works with many of the standard IB interfaces - the IB verbs, for example - so it's all pretty transparent. I'm doing some IB-verbs programming (mostly for the sake of learning about how they work better), and it'd be wonderfully convenient for me if I could use iWARP to do RDMA over my loopback interface, so that I could test some of my code without getting on our IB-connected cluster. :-) But I cannot figure out how to get a "local development environment" set up: there are no tutorials I'm aware of for even setting up iWARP from scratch on a server or a network interface. Can anyone give me a tutorial or point me in the right direction? Environment is Fedora 16 running in VirtualBox.

    Read the article

  • Are there HP Infinihost III ex drivers for windows available?

    - by Matt
    I picked up some infiniband cards off ebay for development/testing purposes. I was hoping that there would be some drivers available under windows 7, but it doesn't seem to be recognised by the OFED software which would seem to contain windows drivers. They are however immediately picked up by Ubuntu and drivers are loaded. Are these cards supported under windows 7 at all? mstflint under linux reports: Image type: Failsafe FW Version: 4.8.930 I.S. Version: 1 Device ID: 25208 Chip Revision: A0 Description: Node Port1 Port2 Sys image GUIDs: 001a4bffff0c9374 001a4bffff0c9375 001a4bffff0c9376 001a4bffff0c9377 Board ID: (HP_0060000001) VSD: PSID: HP_0060000001 From what I can tell they are the following: HP IB 4X DDR PCI-e DUAL PORT HCA (HP part number 409376-B21)

    Read the article

  • What is bondib1 used for on SPARC SuperCluster with InfiniBand, Solaris 11 networking & Oracle RAC?

    - by user12620111
    A co-worker asked the following question about a SPARC SuperCluster InfiniBand network: > on the database nodes the RAC nodes communicate over the cluster_interconnect. This is the > 192.168.10.0 network on bondib0. (according to ./crs/install/crsconfig_params NETWORKS> setting) > What is bondib1 used for? Is it a HA counterpart in case bondib0 dies? This is my response: Summary: bondib1 is currently only being used for outbound cluster interconnect interconnect traffic. Details: bondib0 is the cluster_interconnect $ oifcfg getif            bondeth0  10.129.184.0  global  public bondib0  192.168.10.0  global  cluster_interconnect ipmpapp0  192.168.30.0  global  public bondib0 and bondib1 are on 192.168.10.1 and 192.168.10.2 respectively. # ipadm show-addr | grep bondi bondib0/v4static  static   ok           192.168.10.1/24 bondib1/v4static  static   ok           192.168.10.2/24 Hostnames tied to the IPs are node1-priv1 and node1-priv2  # grep 192.168.10 /etc/hosts 192.168.10.1    node1-priv1.us.oracle.com   node1-priv1 192.168.10.2    node1-priv2.us.oracle.com   node1-priv2 For the 4 node RAC interconnect: Each node has 2 private IP address on the 192.168.10.0 network. Each IP address has an active InfiniBand link and a failover InfiniBand link. Thus, the 4 node RAC interconnect is using a total of 8 IP addresses and 16 InfiniBand links. bondib1 isn't being used for the Virtual IP (VIP): $ srvctl config vip -n node1 VIP exists: /node1-ib-vip/192.168.30.25/192.168.30.0/255.255.255.0/ipmpapp0, hosting node node1 VIP exists: /node1-vip/10.55.184.15/10.55.184.0/255.255.255.0/bondeth0, hosting node node1 bondib1 is on bondib1_0 and fails over to bondib1_1: # ipmpstat -g GROUP       GROUPNAME   STATE     FDT       INTERFACES ipmpapp0    ipmpapp0    ok        --        ipmpapp_0 (ipmpapp_1) bondeth0    bondeth0    degraded  --        net2 [net5] bondib1     bondib1     ok        --        bondib1_0 (bondib1_1) bondib0     bondib0     ok        --        bondib0_0 (bondib0_1) bondib1_0 goes over net24 # dladm show-link | grep bond LINK                CLASS     MTU    STATE    OVER bondib0_0           part      65520  up       net21 bondib0_1           part      65520  up       net22 bondib1_0           part      65520  up       net24 bondib1_1           part      65520  up       net23 net24 is IB Partition FFFF # dladm show-ib LINK         HCAGUID         PORTGUID        PORT STATE  PKEYS net24        21280001A1868A  21280001A1868C  2    up     FFFF net22        21280001CEBBDE  21280001CEBBE0  2    up     FFFF,8503 net23        21280001A1868A  21280001A1868B  1    up     FFFF,8503 net21        21280001CEBBDE  21280001CEBBDF  1    up     FFFF On Express Module 9 port 2: # dladm show-phys -L LINK              DEVICE       LOC net21             ibp4         PCI-EM1/PORT1 net22             ibp5         PCI-EM1/PORT2 net23             ibp6         PCI-EM9/PORT1 net24             ibp7         PCI-EM9/PORT2 Outbound traffic on the 192.168.10.0 network will be multiplexed between bondib0 & bondib1 # netstat -rn Routing Table: IPv4   Destination           Gateway           Flags  Ref     Use     Interface -------------------- -------------------- ----- ----- ---------- --------- 192.168.10.0         192.168.10.2         U        16    6551834 bondib1   192.168.10.0         192.168.10.1         U         9    5708924 bondib0   There is a lot more traffic on bondib0 than bondib1 # /bin/time snoop -I bondib0 -c 100 > /dev/null Using device ipnet/bondib0 (promiscuous mode) 100 packets captured real        4.3 user        0.0 sys         0.0 (100 packets in 4.3 seconds = 23.3 pkts/sec) # /bin/time snoop -I bondib1 -c 100 > /dev/null Using device ipnet/bondib1 (promiscuous mode) 100 packets captured real       13.3 user        0.0 sys         0.0 (100 packets in 13.3 seconds = 7.5 pkts/sec) Half of the packets on bondib0 are outbound (from self). The remaining packet are split evenly, from the other nodes in the cluster. # snoop -I bondib0 -c 100 | awk '{print $1}' | sort | uniq -c Using device ipnet/bondib0 (promiscuous mode) 100 packets captured   49 node1-priv1.us.oracle.com   24 node2-priv1.us.oracle.com   14 node3-priv1.us.oracle.com   13 node4-priv1.us.oracle.com 100% of the packets on bondib1 are outbound (from self), but the headers in the packets indicate that they are from the IP address associated with bondib0: # snoop -I bondib1 -c 100 | awk '{print $1}' | sort | uniq -c Using device ipnet/bondib1 (promiscuous mode) 100 packets captured  100 node1-priv1.us.oracle.com The destination of the bondib1 outbound packets are split evenly, to node3 and node 4. # snoop -I bondib1 -c 100 | awk '{print $3}' | sort | uniq -c Using device ipnet/bondib1 (promiscuous mode) 100 packets captured   51 node3-priv1.us.oracle.com   49 node4-priv1.us.oracle.com Conclusion: bondib1 is currently only being used for outbound cluster interconnect interconnect traffic.

    Read the article

  • Exadata X3 Expandability Update

    - by Bandari Huang
    Exadata Database Machine X3-2 Data Sheet New 10/29/2012 Up to 18 racks can be connected without requiring additional InfiniBand switches. Exadata Database Machine X3-8 Data Sheet New 10/24/2012 Scale by connecting multiple Exadata Database Machine X3-8 racks or Exadata Storage Expansion Racks. Up to 18 racks can be connected by simply connecting via InfiniBand cables. Larger configurations can be built with additional InfiniBand switches.  

    Read the article

  • Security Access Control With Solaris Virtualization

    - by Thierry Manfe-Oracle
    Numerous Solaris customers consolidate multiple applications or servers on a single platform. The resulting configuration consists of many environments hosted on a single infrastructure and security constraints sometimes exist between these environments. Recently, a customer consolidated many virtual machines belonging to both their Intranet and Extranet on a pair of SPARC Solaris servers interconnected through Infiniband. Virtual Machines were mapped to Solaris Zones and one security constraint was to prevent SSH connections between the Intranet and the Extranet. This case study gives us the opportunity to understand how the Oracle Solaris Network Virtualization Technology —a.k.a. Project Crossbow— can be used to control outbound traffic from Solaris Zones. Solaris Zones from both the Intranet and Extranet use an Infiniband network to access a ZFS Storage Appliance that exports NFS shares. Solaris global zones on both SPARC servers mount iSCSI LU exported by the Storage Appliance.  Non-global zones are installed on these iSCSI LU. With no security hardening, if an Extranet zone gets compromised, the attacker could try to use the Storage Appliance as a gateway to the Intranet zones, or even worse, to the global zones as all the zones are reachable from this node. One solution consists in using Solaris Network Virtualization Technology to stop outbound SSH traffic from the Solaris Zones. The virtualized network stack provides per-network link flows. A flow classifies network traffic on a specific link. As an example, on the network link used by a Solaris Zone to connect to the Infiniband, a flow can be created for TCP traffic on port 22, thereby a flow for the ssh traffic. A bandwidth can be specified for that flow and, if set to zero, the traffic is blocked. Last but not least, flows are created from the global zone, which means that even with root privileges in a Solaris zone an attacker cannot disable or delete a flow. With the flow approach, the outbound traffic of a Solaris zone is controlled from outside the zone. Schema 1 describes the new network setting once the security has been put in place. Here are the instructions to create a Crossbow flow as used in Schema 1 : (GZ)# zoneadm -z zonename halt ...halts the Solaris Zone. (GZ)# flowadm add-flow -l iblink -a transport=TCP,remote_port=22 -p maxbw=0 sshFilter  ...creates a flow on the IB partition "iblink" used by the zone to connect to the Infiniband.  This IB partition can be identified by intersecting the output of the commands 'zonecfg -z zonename info net' and 'dladm show-part'.  The flow is created on port 22, for the TCP traffic with a zero maximum bandwidth.  The name given to the flow is "sshFilter". (GZ)# zoneadm -z zonename boot  ...restarts the Solaris zone now that the flow is in place.Solaris Zones and Solaris Network Virtualization enable SSH access control on Infiniband (and on Ethernet) without the extra cost of a firewall. With this approach, no change is required on the Infiniband switch. All the security enforcements are put in place at the Solaris level, minimizing the impact on the overall infrastructure. The Crossbow flows come in addition to many other security controls available with Oracle Solaris such as IPFilter and Role Based Access Control, and that can be used to tackle security challenges.

    Read the article

  • Unexpected advantage of Engineered Systems

    - by user12244672
    It's not surprising that Engineered Systems accelerate the debugging and resolution of customer issues. But what has surprised me is just how much faster issue resolution is with Engineered Systems such as SPARC SuperCluster. These are powerful, complex, systems used by customers wanting extreme database performance, app performance, and cost saving server consolidation. A SPARC SuperCluster consists or 2 or 4 powerful T4-4 compute nodes, 3 or 6 extreme performance Exadata Storage Cells, a ZFS Storage Appliance 7320 for general purpose storage, and ultra fast Infiniband switches.  Each with its own firmware. It runs Solaris 11, Solaris 10, 11gR2, LDoms virtualization, and Zones virtualization on the T4-4 compute nodes, a modified version of Solaris 11 in the ZFS Storage Appliance, a modified and highly tuned version of Oracle Linux running Exadata software on the Storage Cells, another Linux derivative in the Infiniband switches, etc. It has an Infiniband data network between the components, a 10Gb data network to the outside world, and a 1Gb management network. And customers can run whatever middleware and apps they want on it, clustered in whatever way they want. In one word, powerful.  In another, complex. The system is highly Engineered.  But it's designed to run general purpose applications. That is, the physical components, configuration, cabling, virtualization technologies, switches, firmware, Operating System versions, network protocols, tunables, etc. are all preset for optimum performance and robustness. That improves the customer experience as what the customer runs leverages our technical know-how and best practices and is what we've tested intensely within Oracle. It should also make debugging easier by fixing a large number of variables which would otherwise be in play if a customer or Systems Integrator had assembled such a complex system themselves from the constituent components.  For example, there's myriad network protocols which could be used with Infiniband.  Myriad ways the components could be interconnected, myriad tunable settings, etc. But what has really surprised me - and I've been working in this area for 15 years now - is just how much easier and faster Engineered Systems have made debugging and issue resolution. All those error opportunities for sub-optimal cabling, unusual network protocols, sub-optimal deployment of virtualization technologies, issues with 3rd party storage, issues with 3rd party multi-pathing products, etc., are simply taken out of the equation. All those error opportunities for making an issue unique to a particular set-up, the "why aren't we seeing this on any other system ?" type questions, the doubts, just go away when we or a customer discover an issue on an Engineered System. It enables a really honed response, getting to the root cause much, much faster than would otherwise be the case. Here's a couple of examples from the last month, one found in-house by my team, one found by a customer: Example 1: We found a node eviction issue running 11gR2 with Solaris 11 SRU 12 under extreme load on what we call our ExaLego test system (mimics an Exadata / SuperCluster 11gR2 Exadata Storage Cell set-up).  We quickly established that an enhancement in SRU12 enabled an 11gR2 process to query Infiniband's Subnet Manager, replacing a fallback mechanism it had used previously.  Under abnormally heavy load, the query could return results which were misinterpreted resulting in node eviction.  In several daily joint debugging sessions between the Solaris, Infiniband, and 11gR2 teams, the issue was fully root caused, evaluated, and a fix agreed upon.  That fix went back into all Solaris releases the following Monday.  From initial issue discovery to the fix being put back into all Solaris releases was just 10 days. Example 2: A customer reported sporadic performance degradation.  The reasons were unclear and the information sparse.  The SPARC SuperCluster Engineered Systems support teams which comprises both SPARC/Solaris and Database/Exadata experts worked to root cause the issue.  A number of contributing factors were discovered, including tunable parameters.  An intense collaborative investigation between the engineering teams identified the root cause to a CPU bound networking thread which was being starved of CPU cycles under extreme load.  Workarounds were identified.  Modifications have been put back into 11gR2 to alleviate the issue and a development project already underway within Solaris has been sped up to provide the final resolution on the Solaris side.  The fixed SPARC SuperCluster configuration greatly aided issue reproduction and dramatically sped up root cause analysis, allowing the correct workarounds and fixes to be identified, prioritized, and implemented.  The customer is now extremely happy with performance and robustness.  Since the configuration is common to other customers, the lessons learned are being proactively rolled out to other customers and incorporated into the installation procedures for future customers.  This effectively acts as a turbo-boost to performance and reliability for all SPARC SuperCluster customers.  If this had occurred in a "home grown" system of this complexity, I expect it would have taken at least 6 months to get to the bottom of the issue.  But because it was an Engineered System, known, understood, and qualified by both the Solaris and Database teams, we were able to collaborate closely to identify cause and effect and expedite a solution for the customer.  That is a key advantage of Engineered Systems which should not be underestimated.  Indeed, the initial issue mitigation on the Database side followed by final fix on the Solaris side, highlights the high degree of collaboration and excellent teamwork between the Oracle engineering teams.  It's a compelling advantage of the integrated Oracle Red Stack in general and Engineered Systems in particular.

    Read the article

  • New Whitepaper - Exalogic Virtualization Architecture

    - by Javier Puerta
    One of the key enhancements in the current generation of Oracle Exalogic systems—and the focus of this whitepaper—is Oracle’s incorporation of virtualized InfiniBand I/O interconnects using Single Root I/O Virtualization (SR-IOV) technology to permit the system to share the internal InfiniBand network and storage fabric between as many as 63 virtual machines per physical server node with near-native performance simultaneously allowing both high performance and high workload consolidation. Download it here: An Oracle White Paper - November 2012- Oracle Exalogic Elastic Cloud: Advanced I/O Virtualization Architecture for Consolidating High-Performance Workloads

    Read the article

  • Which is the fastest way to move 1Petabyte from one storage to a new one?

    - by marc.riera
    First of all, thanks for reading, and sorry for asking something related to my job. I understand that this is something that I should solve by myself but as you will see its something a bit difficult. A small description: Now Storage = 1PB using DDN S2A9900 storage for the OSTs, 4 OSS , 10 GigE network. (lustre 1.6) 100 compute nodes with 2x Infiniband 1 infiniband switch with 36 ports After Storage = Previous storage + another 1PB using DDN S2A 990 or LSI E5400 (still to decide) (lustre 2.0) 8 OSS , 10GigE network 100 compute nodes with 2x Infiniband Previous experience: transfered 120 TB in less than 3 days using following command: tar -C /old --record-size 2048 -b 2048 -cf - dir | tar -C /new --record-size 2048 -b 2048 -xvf - 2>&1 | tee /tmp/dir.log So , big problem here, using big mathematical equations I conclude that we are going to need 1 month to transfer the data from one side to the new one. During this time the researchers will need to step back, and I'm personally not happy with this. I'm telling you that we have infiniband connections because I think that may be there is a chance to use it to transfer the data using 18 compute nodes (18 * 2 IB = 36 ports) to transfer the data from one storage to the other. I'm trying to figure out if the IB switch will handle all the traffic but in case it just burn up will go faster than using 10GigE. Also, having lustre 1.6 and 2.0 agents on same server works quite well, with this there is no need to go by 1.8 to upgrade the metadata servers with two steps. Any ideas? Many thanks Note 1: Zoredache, we can divide it in two blocks (A)600Tb and (B)400Tb. The idea is to move (A) to new storage which is lustre2.0 formated, then format where (A) was with lustre2.0 and move (B) to this lustre2.0 block and extend with the space where (B) was. This way we will end with (A) and (B) on separate filesystems, with 1PB each.

    Read the article

  • Oracle OpenWorld 2011????WebLogic Server?????????|WebLogic Channel|??????

    - by ???02
    2011?10?2?~6??????????????????????Oracle OpenWorld 2011???????????????????????????????????????????????????WebLogic Server??????????????????????? Fusion Middleware?????? ???????? ?????????????????????????WebLogic Server??????????????????(???)????????????????WebLogic Server――Oracle OpenWorld 2011??????????·?????????????????????????????????????????WebLogic Server?????????????????????????????????WebLogic Server?????????????????????? Oracle OpenWorld 2011???????????????????????????????Oracle Fusion Financial Management????????????????Oracle Fusion Human Capital Management????????????????Oracle Fusion Supply Chain Management???????????????????Oracle Fusion Applications??????????????WebLogic Server????????????????????????????????????????? ?????????????????????·????????????????????????Oracle Fusion Middleware???????????????????????????????????????????????Development Tools???Cloud Application Foundation???Enterprise Management??3???????WebLogic Server??Oracle Coherence?Oracle Tuxedo?????Cloud Application Foundation???????????? ????????????????????????WebLogic Server?????????????????????????????????????????????????????――Oracle OpenWorld 2011??????????????????·????????Oracle Public Cloud????????????Oracle Public Cloud??????WebLogic Server????????????????? ??????????·????·???????Oracle Public Cloud???????·?????1????Java?????????????????????PaaS???WebLogic Server??????????????????????????????Java EE????????????????????????????????????????????????????????????????????????????????????????????????――WebLogic Server????????????????????????? ??????????Oracle Database??????????????WebLogic Server??Oracle Real Application Clusters(RAC)?????????????Oracle Database????????????????????Active GridLink for RAC???????????????????????????????????????WebLogic Server???????????????????????????????????????? ?????????????WebLogic Server???????????Java EE 6????????????????????????????·???????GlassFish 3???Java EE 6????????????WebLogic Server???????????????????????????????WebLogic Server?Java EE 6??????????????·??????????????Java EE 6?????????????????????????????????????? ??Java EE 6????????????????????????????????????????????Web???????????Java EE?????????Web Profile?????????????????WebLogic Server?????????????????????????????????????????????????????????????Oracle Exalogic Elastic Cloud――??????WebLogic Server??????????????·???????????Oracle Exalogic Elastic Cloud??????????????????????????????????? ???WebLogic Server???????????????????·????????????????????????????????????????????????????????????????????????????????????????????????Oracle VM 3.0 for Exalogic?????????????????????????????Exalogic????????????????????????????????????????????????????????????? ?????????????????????Exalogic Control????????????????????????????????????Oracle Enterprise Manager???????????????????????????????????? ???Java EE??????????????????????Virtual Assembly Builder??Exalogic????????????????????????Java EE???????????????????????????????1????????????Exalogic?????????????????? ?????????????????Exalogic??????????InfiniBand????????????Exabus???Oracle Coherence?Oracle Tuxedo??????????????????????Exabus????Exalogic???????????????????????????????????????――Exabus???????????????????????????????? ?????????I/O?????????????????????????·??????TCP/IP??????????????????????????????????????????????????????InfiniBand??????????????????????? ??????Exabus????????????????????????InfiniBand??????????????????????????????????????????????????????????????????????????????????4??????????(??)?6??1????????????Exabus??????API????Oracle Coherence???Exabus Java API?????Tuxedo???Exabus RDMA API????????――???????WebLogic Server?Exalogic???????Java EE?????????????????????????? WebLogic Server?????????Java EE 6??????????????????????????????????????????????????????????????????????? ???Exalogic??????????????Exabus???????????????????????????????????????????????????Java EE??????????????????????????????????Java EE???????????????????????????????????????????????

    Read the article

  • The SPARC SuperCluster

    - by Karoly Vegh
    Oracle has been providing a lead in the Engineered Systems business for quite a while now, in accordance with the motto "Hardware and Software Engineered to Work Together." Indeed it is hard to find a better definition of these systems.  Allow me to summarize the idea. It is:  Build a compute platform optimized to run your technologies Develop application aware, intelligently caching storage components Take an impressively fast network technology interconnecting it with the compute nodes Tune the application to scale with the nodes to yet unseen performance Reduce the amount of data moving via compression Provide this all in a pre-integrated single product with a single-pane management interface All these ideas have been around in IT for quite some time now. The real Oracle advantage is adding the last one to put these all together. Oracle has built quite a portfolio of Engineered Systems, to run its technologies - and run those like they never ran before. In this post I'll focus on one of them that serves as a consolidation demigod, a multi-purpose engineered system.  As you probably have guessed, I am talking about the SPARC SuperCluster. It has many great features inherited from its predecessors, and it adds several new ones. Allow me to pick out and elaborate about some of the most interesting ones from a technological point of view.  I. It is the SPARC SuperCluster T4-4. That is, as compute nodes, it includes SPARC T4-4 servers that we learned to appreciate and respect for their features: The SPARC T4 CPUs: Each CPU has 8 cores, each core runs 8 threads. The SPARC T4-4 servers have 4 sockets. That is, a single compute node can in parallel, simultaneously  execute 256 threads. Now, a full-rack SPARC SuperCluster has 4 of these servers on board. Remember the keyword demigod.  While retaining the forerunner SPARC T3's exceptional throughput, the SPARC T4 CPUs raise the bar with single performance too - a humble 5x better one than their ancestors.  actually, the SPARC T4 CPU cores run in both single-threaded and multi-threaded mode, and switch between these two on-the-fly, fulfilling not only single-threaded OR multi-threaded applications' needs, but even mixed requirements (like in database workloads!). Data security, anyone? Every SPARC T4 CPU core has a built-in encryption engine, that is, encryption algorithms cast into silicon.  A PCI controller right on the chip for customers who need I/O performance.  Built-in, no-cost Virtualization:  Oracle VM for SPARC (the former LDoms or Logical Domains) is not a server-emulation virtualization technology but rather a serverpartitioning one, the hypervisor runs in the server firmware, and all the VMs' HW resources (I/O, CPU, memory) are accessed natively, without performance overhead.  This enables customers to run a number of Solaris 10 and Solaris 11 VMs separated, independent of each other within a physical server II. For Database performance, it includes Exadata Storage Cells - one of the main reasons why the Exadata Database Machine performs at diabolic speed. What makes them important? They provide DB backend storage for your Oracle Databases to run on the SPARC SuperCluster, that is what they are built and tuned for DB performance.  These storage cells are SQL-aware.  That is, if a SPARC T4 database compute node executes a query, it doesn't simply request tons of raw datablocks from the storage, filters the received data, and throws away most of it where the statement doesn't apply, but provides the SQL query to the storage node too. The storage cell software speaks SQL, that is, it is able to prefilter and through that transfer only the relevant data. With this, the traffic between database nodes and storage cells is reduced immensely. Less I/O is a good thing - as they say, all the CPUs of the world do one thing just as fast as any other - and that is waiting for I/O.  They don't only pre-filter, but also provide data preprocessing features - e.g. if a DB-node requests an aggregate of data, they can calculate it, and handover only the results, not the whole set. Again, less data to transfer.  They support the magical HCC, (Hybrid Columnar Compression). That is, data can be stored in a precompressed form on the storage. Less data to transfer.  Of course one can't simply rely on disks for performance, there is Flash Storage included there for caching.  III. The low latency, high-speed backbone network: InfiniBand, that interconnects all the members with: Real High Speed: 40 Gbit/s. Full Duplex, of course. Oh, and a really low latency.  RDMA. Remote Direct Memory Access. This technology allows the DB nodes to do exactly that. Remotely, directly placing SQL commands into the Memory of the storage cells. Dodging all the network-stack bottlenecks, avoiding overhead, placing requests directly into the process queue.  You can also run IP over InfiniBand if you please - that's the way the compute nodes can communicate with each other.  IV. Including a general-purpose storage too: the ZFSSA, which is a unified storage, providing NAS and SAN access too, with the following features:  NFS over RDMA over InfiniBand. Nothing is faster network-filesystem-wise.  All the ZFS features onboard, hybrid storage pools, compression, deduplication, snapshot, replication, NFS and CIFS shares Storageheads in a HA-Cluster configuration providing availability of the data  DTrace Live Analytics in a web-based Administration UI Being a general purpose application data storage for your non-database applications running on the SPARC SuperCluster over whichever protocol they prefer, easily replicating, snapshotting, cloning data for them.  There's a lot of great technology included in Oracle's SPARC SuperCluster, we have talked its interior through. As for external scalability: you can start with a half- of full- rack SPARC SuperCluster, and scale out to several racks - that is, stacking not separate full-rack SPARC SuperClusters, but extending always one large instance of the size of several full-racks. Yes, over InfiniBand network. Add racks as you grow.  What technologies shall run on it? SPARC SuperCluster is a general purpose scaleout consolidation/cloud environment. You can run Oracle Databases with RAC scaling, or Oracle Weblogic (end enjoy the SPARC T4's advantages to run Java). Remember, Oracle technologies have been integrated with the Oracle Engineered Systems - this is the Oracle on Oracle advantage. But you can run other software environments such as SAP if you please too. Run any application that runs on Oracle Solaris 10 or Solaris 11. Separate them in Virtual Machines, or even Oracle Solaris Zones, monitor and manage those from a central UI. Here the key takeaways once again: The SPARC SuperCluster: Is a pre-integrated Engineered System Contains SPARC T4-4 servers with built-in virtualization, cryptography, dynamic threading Contains the Exadata storage cells that intelligently offload the burden of the DB-nodes  Contains a highly available ZFS Storage Appliance, that provides SAN/NAS storage in a unified way Combines all these elements over a high-speed, low-latency backbone network implemented with InfiniBand Can grow from a single half-rack to several full-rack size Supports the consolidation of hundreds of applications To summarize: All these technologies are great by themselves, but the real value is like in every other Oracle Engineered System: Integration. All these technologies are tuned to perform together. Together they are way more than the sum of all - and a careful and actually very time consuming integration process is necessary to orchestrate all these for performance. The SPARC SuperCluster's goal is to enable infrastructure operations and offer a pre-integrated solution that can be architected and delivered in hours instead of months of evaluations and tests. The tedious and most importantly time and resource consuming part of the work - testing and evaluating - has been done.  Now go, provide services.   -- charlie  

    Read the article

1 2 3 4  | Next Page >