Search Results

Search found 401 results on 17 pages for 'hadoop'.

alternatives to SGE

- by bmargulies

Once upon a time there was the free SGE from Sun: tricky to install and configure, but functional and free. Now we've got:
- open source packages on Ubuntu that don't quite work out of the box (details on request);
- the actual source behind them, with a build process that depends on the C shell and other obsolescences, available from two competing locations;
- a commercial packaging from Oracle;
- a commercial package from Univa.
What I am really wishing for is something with the basic capabilities of SGE that is simple to install and maintain. Heck, I'd take a front end to Hadoop that just queues and distributes simple shell-script-defined jobs.

Read the article
Are there any decent open-source search engine solutions?

- by Nazariy

A few weeks ago a friend asked me how hard it is to launch your own search engine service with a list of websites that are supposed to be crawled from time to time. The first thing that came to mind was Google Custom Search; however, its pricing policy is quite tricky and would drain your budget if you reach 500K queries per year. Another solution I found here was SearchBlox, which can be compared to the Google Mini service. It's quite a good solution if you are planning to cover search over a small number of websites, but for larger projects it is not very handy. I also found a few other search platforms like Lucene, Hadoop and Xapian, which seem to be quite powerful solutions for reaching Google search quality, and Nutch as a web crawler. Like most open-source projects, they share the same problem: a lack of comprehensive usage guidance and examples, and an expectation that you are already an expert in the subject. I'm wondering whether any of you are using these solutions, which of them you would recommend, and what I should be aware of.

Read the article
Automatically generated /etc/hosts is wrong

- by Niels Basjes

I've created a kickstart script to install CentOS 5.5 (32bit) in a fully automated way. The DNS/DHCP setup correctly gives the system the right hostname in both the forward and reverse lookups.

    dig node4.mydomain.com. +short
    10.10.10.64
    dig -x 10.10.10.64 +short
    node4.mydomain.com.

The state of the installed system right after the installation completed is as follows:

    cat /etc/sysconfig/network
    NETWORKING=yes
    NETWORKING_IPV6=yes
    GATEWAY=10.10.10.1
    HOSTNAME=node4.mydomain.com

    echo ${HOSTNAME}
    node4.mydomain.com

    cat /etc/hosts
    # Do not remove the following line, or various programs
    # that require network functionality will fail.
    127.0.0.1 localhost.localdomain localhost
    ::1 localhost6.localdomain6 localhost6
    10.10.10.64 node4

My problem is that this automatically generated hosts file is slightly different from the way I want it (or better: the way Hadoop wants it). The last line should look like this:

    10.10.10.64 node4.mydomain.com node4

What do I modify, and where, to fix this? Thanks.
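The question does not say which installer step writes the short-name entry, so the following is only a minimal sketch of a post-install rewrite, not a fix for the generator itself. It assumes the file can be rewritten after installation (for example from a kickstart %post section); the function names `hosts_entry` and `fix_hosts` are hypothetical and exist only for this illustration.

```python
def hosts_entry(ip: str, fqdn: str) -> str:
    """Build a hosts line in the order IP, fully qualified name, short alias."""
    short = fqdn.split(".", 1)[0]  # "node4.mydomain.com" -> "node4"
    return f"{ip} {fqdn} {short}"


def fix_hosts(text: str, ip: str, fqdn: str) -> str:
    """Replace every line whose first field is `ip` with the canonical entry.

    Comment lines and entries for other addresses are left untouched.
    """
    fixed = []
    for line in text.splitlines():
        fields = line.split()
        if fields and fields[0] == ip:
            fixed.append(hosts_entry(ip, fqdn))
        else:
            fixed.append(line)
    return "\n".join(fixed) + "\n"


if __name__ == "__main__":
    # Sample input copied from the generated file shown in the question.
    generated = (
        "127.0.0.1 localhost.localdomain localhost\n"
        "::1 localhost6.localdomain6 localhost6\n"
        "10.10.10.64 node4\n"
    )
    print(fix_hosts(generated, "10.10.10.64", "node4.mydomain.com"), end="")
```

Keeping the rewrite as a pure function over the file's text makes it easy to check against a sample before pointing it at a real /etc/hosts.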

Read the article
Monitoring System for the cloud?

- by Maxim Veksler

I need a monitoring system, much like Ganglia / Nagios, that is built for the cloud. I need it to support:
- Adding / removing nodes dynamically. (A node shutting down does not imply node failure...)
- Dynamic node-based categorization, meaning a node can identify itself as being part of group X (Ganglia gets this almost right, but lacks the dynamic part...)
- Operation without multicast support (generally not allowed in cloud-based setups)
- Plugins for recent cool stuff such as Hadoop, Cassandra, Mongo would be cool.
More features include: external API, web interface and co. I've looked at Ganglia and Munin, and they both seem to be almost there (but not exactly). I would also go for a reasonably priced Software as a Service solution. I'm currently doing research, so suggestions are highly appreciated. Thank you, Maxim
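The first two requirements can be made concrete with a small sketch. It assumes nodes push heartbeats to a central collector over unicast (for instance plain HTTP), which is an assumption of this example and not something the question specifies. `NodeRegistry` and its methods are hypothetical names and do not correspond to the API of Ganglia, Nagios or Munin.

```python
import time


class NodeRegistry:
    """Tracks self-announcing nodes; a planned shutdown is not a failure."""

    def __init__(self, timeout: float = 30.0):
        self.timeout = timeout  # seconds of silence before a node counts as failed
        self.nodes = {}         # node name -> (group, time of last heartbeat)

    def heartbeat(self, name, group, now=None):
        """Register or refresh a node; the node itself declares its group."""
        self.nodes[name] = (group, time.time() if now is None else now)

    def leave(self, name):
        """Planned shutdown: forget the node so it is never reported as failed."""
        self.nodes.pop(name, None)

    def failed(self, now=None):
        """Nodes that went silent without announcing a shutdown."""
        now = time.time() if now is None else now
        return sorted(n for n, (_, seen) in self.nodes.items()
                      if now - seen > self.timeout)

    def members(self, group):
        """Current members of a group, as declared by the nodes themselves."""
        return sorted(n for n, (g, _) in self.nodes.items() if g == group)


if __name__ == "__main__":
    reg = NodeRegistry(timeout=30)
    reg.heartbeat("node1", "hadoop", now=0)
    reg.heartbeat("node2", "hadoop", now=0)
    reg.heartbeat("node3", "cassandra", now=0)
    reg.leave("node2")                       # clean shutdown
    reg.heartbeat("node1", "hadoop", now=40)
    print(reg.failed(now=50))                # only the silent node
    print(reg.members("hadoop"))
```

Because group membership travels with each heartbeat, a node can change groups simply by announcing a different one, which is the "dynamic part" the question says is missing.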

Read the article
(Open Source) Cloud-Filesystem to run a Database on Top?

- by jens

Hello, what are the current "technologies and implementations" for getting a filesystem with unlimited capacity by combining single servers and their hard disks into a "grid/cloud filesystem"? I need unlimited space (by adding further servers), but it must be a filesystem that is capable of running a database on top. I know of Apache Hadoop, but that seems not to be ideal for running a DB on top of it (or am I wrong?). And iSCSI seems to be "remote/networked", but I do not know how, or if, it can be clustered. Thank you very much! jens

Read the article
Can't find nfsbooted for Kerrighed pxe boot with Ubuntu Lucid Server

- by Pengin

I'm following installation guides for PXE booting and Kerrighed. I can't find the package nfsbooted for Ubuntu 10.04. Where did it go? Context: At work I have access to 8 mini-ITX PCs and am trying to build a cluster. My plans include trying Condor, GridGain and Hadoop, and recently Kerrighed has caught my eye. (I realise these are all for different kinds of things, I'm just evaluating.) Ideally, I'd like to have all the nodes network boot from a single server, since that seems so much easier to manage, plus I can 'borrow' additional PCs for a while without touching their HD. I've been getting on great with Ubuntu Lucid Server (10.04), trying to follow the only guides I can find to get PXE booting (and ultimately Kerrighed) to work. This guide is for Ubuntu 8.04 and this one is for Debian. They both refer to a package I can't seem to find, nfsbooted. Has this package been replaced? Am I doing something daft?

Read the article
Big Data for Retail

- by David Dorf

Right up there with mobile, social, and cloud is the term "big data," which seems to be popping up lots in the press these days. Companies like Google, Yahoo, and Facebook have popularized a new class of data technologies meant to solve the problem of processing large amounts of data quickly. I first mentioned this in a posting back in March 2009. Put simply, big data implies datasets so large they can't normally be processed using a standard transactional database. The term "noSQL" is often used in this context as well. Actually, using parallel processing within the Oracle database combined with Exadata can achieve impressive results. Look for more from Oracle at OpenWorld as hinted by Jean-Pierre Dijcks.

McKinsey recently released a report on big data in which retail was specifically mentioned as an industry that can benefit from the new technologies. I won't rehash that report because my friend Rama already did such a good job in his posting, Impact of "Big Data" on Retail. The presentation below does a pretty good job of framing the problem, although it doesn't really get into the available technologies (e.g. Exadata, Hadoop, Cassandra, etc.) and isn't retail specific.

Determine the Right Analytic Database: A Survey of New Data Technologies

So when a retailer asks me about big data, here's what I say: Big data refers to a set of technologies for processing large volumes of structured and unstructured data. Imagine collecting everything uttered by your customers on Facebook and Twitter and combining it with all the data you can find about the products you sell (e.g. reviews, images, demonstration videos), including competitive data. Assuming you could process all that data, you could then personalize offers to specific customers based on their tastes, ensure prices are competitive, and implement better local assortments. It's really not that far off.

Read the article
Oracle Big Data Learning Library - Click on LEARN BY PRODUCT to Open Page

- by chberger

Oracle Big Data Learning Library... Learn about Oracle Big Data, Data Science, Learning Analytics, Oracle NoSQL Database, and more!

Oracle Big Data Essentials - Attend this Oracle University Course!
Using Oracle NoSQL Database - Attend this Oracle University class!
Oracle and Big Data on OTN - See the latest resource on OTN.

Oracle Big Data Appliance
Oracle Big Data and Data Science Basics
Meeting the Challenge of Big Data
Oracle Big Data Tutorial Video Series
Oracle MoviePlex - a Big Data End-to-End Series of Demonstrations
Oracle Big Data Overview
Oracle Big Data Essentials
Data Mining
Oracle NoSQL Database Tutorial Videos
Oracle NoSQL Database Tutorial Series
Oracle NoSQL Database Release 2 New Features
Using Oracle NoSQL Database
Exalytics
Enterprise Manager 12c R3: Manage Exalytics
Setting Up and Running Summary Advisor on an E s
Oracle R Enterprise
Oracle R Enterprise Tutorial Series
Oracle Big Data Connectors
Integrate All Your Data with Oracle Big Data Connectors
Using Oracle Direct Connector for HDFS to Read the Data from HDFS
Using Oracle R Connector for Hadoop to Analyze Data
Oracle NoSQL Database
Oracle NoSQL Database Tutorial Videos
Oracle NoSQL Database Tutorial Series
Oracle NoSQL Database Release 2 New Features
Using Oracle NoSQL Database
Oracle Business Intelligence Enterprise Edition
Oracle Business Intelligence
Oracle BI 11g R1: Create Analyses and Dashboards - 4 day class
Oracle BI Publisher 11g R1: Fundamentals - 3 day class
Oracle BI 11g R1: Build Repositories - 5 day class

Read the article
ArchBeat Link-o-Rama for 2012-06-19

- by Bob Rhubart

Discussion: Public, Private, and Hybrid Clouds
A conversation about the similarities and differences between public, private, and hybrid clouds; the connection between cows, condos, and cloud computing; and what architects need to know in order to take advantage of cloud computing. (OTN ArchBeat Podcast transcript)

InfoQ: Current Trends in Enterprise Mobility
Interesting infographics that show current developments and major trends in enterprise mobility.

Recap: EMEA User Group Leaders Meeting Latvia May 2012
Tom Scheirsen recaps the recent IOUC event in Riga.

Oracle Fusion Middleware Summer Camps in Lisbon: Includes Advanced ADF Training by Oracle Product Management
This is how IT people deal with the Summertime Blues.

Enterprise 2.0 Conference: Building Social Business | Oracle WebCenter Blog
Kellsey Ruppel shares a list of E2.0 conference sessions being presented by members of the Oracle community.

Linux 6 Transparent Huge Pages and Hadoop Workloads | Structured Data
Greg Rahn documents a problem.

BPM Standard Edition to start your BPM project
"BPM Standard Edition is an entry level BPM offering designed to help organisations implement their first few processes in order to prove the value of BPM within their own organisation."

Troubleshooting ADF Security 11g Login Page Failure | Andrejus Baranovskis
Oracle ACE Director Andrejus Baranovskis takes a deep dive into one of the most common ADF 11g Security issues.

It's Alive! - The Oracle OpenWorld Content Catalog
It's what you've been waiting for: the central repository for information on sessions, demos, labs, user groups, exhibitors, and more.

5 minutes or less: Indexing Attributes in OID | Andre Correa
Fusion Middleware A-Team blogger Andre Correa offers help for those who encounter issues when running searches with LDAP filters against OID (Oracle Internet Directory).

Condos and Clouds: Thinking about Cloud Computing by Looking at Condominiums | Pat Helland
In part two of the OTN ArchBeat Podcast Public, Private, and Hybrid Clouds, Oracle Cloud chief architect Mark Nelson mentions an analogy by Pat Helland that compares condos to cloud computing. After some digging I found the October 2011 presentation in which Helland explains that analogy.

Thought for the Day
"I have always found that plans are useless, but planning is indispensable." - Dwight Eisenhower (October 14, 1890 – March 28, 1969)
Source: Quotes for Software Engineers

Read the article
How I Work: A Cloud Developer's Workstation

- by BuckWoody

I've written here a little about how I work during the day, including things like using a stand-up desk (still doing that, by the way). Inspired by a Twitter conversation yesterday, I thought I might explain how I set up my computing environment.

First, a couple of important points. I work in Cloud Computing, specifically (but not limited to) Windows Azure. Windows Azure has features to run a Virtual Machine (IaaS), run code without having to control a Virtual Machine (PaaS) and use databases, video streaming, Hadoop and more (a kind of SaaS for tech pros). As such, my designs run the gamut of on-premises, VM's in the Cloud, and software that I write for a platform. I focus on data primarily, meaning that I design a lot of systems that use an RDBMS (like SQL Server or Windows Azure Databases) or a NoSQL approach (MongoDB on Azure or large-scale Key-Value Pairs in Table storage) and even Hadoop and R, and also Cloud Numerics in F#. All that being said, those things inform my choices below.

Hardware

I have a Lenovo X220 tablet/laptop which I really like a great deal - it's a light, tough, extremely fast system. When I travel, that's the system I take. It has 8GB of RAM, and an SSD drive. I sometimes use that to develop or work at a client's site, on the road, or in the living room when I'm not in my home office.

My main system is a GateWay DX430017 - I've maxed it out on RAM, and I have two 1TB drives in it. It's not only my workstation for work; I leave it on all the time and it streams our videos, music and books. I have about 3400 e-books, and I've just started using Calibre to stream the library. I run Windows 8 on it so I can set up Hyper-V images, since Windows Azure allows me to move regular Hyper-V disks back and forth to the Cloud. That's where all my "servers" are, when I have to use an IaaS approach.

The reason I use a desktop-style system rather than a laptop only approach is that a good part of my job is setting up architectures to solve really big, complex problems. That means I have to simulate entire networks on-premises, along with the Hybrid Cloud approach I use a lot. I need a lot of disk space and memory for that, and I use two huge monitors on my stand-up desk. I could probably use 10 monitors if I had the room for them. Also, since it's our home system as well, I leave it on all the time and it doesn't travel.

Software

For the software for my systems, it's important to keep in mind that I not only write code, but I design databases, teach, present, and create Linux and other environments.

Windows 8 - While the jury is out for me on the new interface, the context-sensitive search, integrated everything, and speed is just hands-down the right choice. I've evaluated a server OS, Linux, even an Apple, but I just am not as efficient on those as I am with Windows 8.

Visual Studio Ultimate - I develop primarily in .NET (C# and F# mostly) and I use the Team Foundation Server in the cloud, and I'm asked to do everything from UI to Services, so I need everything.

Windows Azure SDK, Windows Azure Training Kit - I need the first to set up my Azure PaaS coding, and the second has all the info I need for PaaS, IaaS and SaaS. This is primarily how I get paid. :)

SQL Server Developer Edition - While I might install Oracle, MySQL and Postgres on my VM's, the "outside" environment is SQL Server for an RDBMS. I install the Developer Edition because it has the same features as Enterprise Edition, and comes with all the client tools and documentation.

Microsoft Office - Even if I didn't work here, this is what I would use. I've just grown too accustomed to doing business this way to change, so my advice is always "use what works", and this does. The parts I use are:

OneNote (and a Math Add-in) - I do almost everything - and I mean everything in OneNote. I can code, do high-end math, present, design, collaborate and more. All my notebooks are on my Skydrive. I can use them from any system, anywhere. If you take the time to learn this program, you'll be hooked.

Excel with PowerPivot - Don't make that face. Excel is the world's database, and every Data Scientist I know - even the ones where I teach at the University of Washington - knows it, uses it, and loves it.

Outlook - Primary communications, CRM and contact tool. I have all of my social media hooked up to it, so when I get an e-mail from you, I see everything, see all the history we've had on e-mail, find you on a map and more.

Lync - I was fine with LiveMeeting, although it has its moments. For me, the Lync client is tres-awesome. I use this throughout my day, present on it, stay in contact with colleagues and the folks on the dev team (who wish I didn't have it) and more.

PowerPoint - Once again, don't make that face. Whenever I see someone complaining about PowerPoint, I have 100% of the time found they don't know how to use it. If you suck at presenting or creating content, don't blame PowerPoint. Works great on my machine. :)

Zoomit - Magnifier - On Windows 7 (and 8 as well) there's a built-in magnifier, but I install Zoomit out of habit. It enlarges the screen. If you don't use one of these tools (or their equivalent on some other OS) then you're presenting/teaching wrong, and you should stop presenting/teaching until you get them and learn how to show people what you can see on your tiny, tiny monitor. :)

Cygwin - Unix for Windows. OK, that's not true, but it's mostly that. I grew up on mainframes and Unix (IBM and HP, thank you) and I can't imagine life without sed, awk, grep, vim, and bash. I also tend to take a lot of the "Science" and "Development" and "Database" packages in it as well.

PuTTY - Speaking of Unix, when I need to connect to my Linux VM's in Windows Azure, I want to do it securely. This is the tool for that.

Notepad++ - Somewhere between torturing myself in vim and luxuriating in OneNote is Notepad++. Everyone has a favorite text editor; this one is mine. Too many features to name, and it's free.

Browsers - I install Chrome, Firefox and of course IE. I know it's in vogue to rant on IE, but I tend to think for myself a great deal, and I've had few (none) problems with it. The others I have for the haterz that make sites that won't run in IE.

Visio - I've used a lot of design packages, but none have the extreme meta-data edit capabilities of Visio. I don't use this all the time - it can be rather heavy, but what it does it does really well. I also present this way when I'm not using PowerPoint. Yup, I just bring up Visio and diagram away as I'm chatting with clients. Depending on what we're covering, this can be the right tool for that.

Tweetdeck - The AIR one, not that new disaster they came out with. I live on social media, since you, dear readers, are my cube-mates. When I get tired of you all, I close Tweetdeck. When I need help or someone needs help from me, or if I want to see a picture of a cat while I'm coding, I bring it up. It's up most all day and night.

Windows Media Player - I listen to Trance or Classical when I code, and I find music managers overbearing and extra. I just use what comes in the box, and it works great for me.

R - F# and Cloud Numerics now allow me to load in R libraries (yay!) and I use this for statistical work on big data loads.

Microsoft Math - One of the most amazing, free, rich, amazing, awesome, amazing calculators out there. I get the 64-bit version for quick math conversions, plots and formula-checks.

Python - I know, right? Who knew that the scientific community loved Python so much. But they do. I use 2.7; not as much runs with 3+. I also use IronPython in Visual Studio, or I edit in Notepad++.

Camstudio recorder - Windows PSR - In much of my training, and all of my teaching at the UW, I need to show a process on a screen. Camstudio records screen and voice, and it's free. If I need to make static training, I use the Windows PSR tool that's built right in. It's ostensibly for problem duplication, but I use it to record for training.

OK - your turn. Post a link to your blog entry below, and tell me how you set your system up.

Read the article
Discover the MySQL Connect Content Catalog!

- by Bertrand Matthelié

The MySQL Connect content catalog is now live! MySQL Connect offers you a unique opportunity to attend:

- Keynotes including "The State of the Dolphin", by Oracle's Chief Corporate Architect Edward Screven and VP of MySQL Engineering Tomas Ulin. An exciting panel on "Current MySQL Usage Models and Future Developments" with Davi Arnaut from LinkedIn, Daniel Austin from PayPal, Mark Callaghan from Facebook and Calvin Sun from Twitter.
- Over 65 conference sessions enabling you to hear from:
  - Oracle MySQL engineers on MySQL 5.6, InnoDB, replication, performance tuning, security, NoSQL, MySQL Cluster, Big Data...and more.
  - MySQL customers including the US Census Bureau, Big Fish Games, Booking.com, Ticketmaster, and Tumblr.
  - Internationally recognized MySQL community members and partners on topics such as performance, MySQL 5.6, backup, MySQL in the Cloud, OpenStack and Hadoop.
- 6 Birds-of-a-feather sessions about sharding, replication, backup, and other subjects.
- 8 Hands-On Labs designed to give you hands-on experience with MySQL replication, the MySQL Performance Schema, MySQL Cluster...and more.
- 6 Tutorials providing you with in-depth knowledge about MySQL Performance Tuning best practices, enhancing productivity with MySQL 5.6 new features, or the essentials to get started with MySQL (tutorials are available as an add-on package to MySQL Connect registrants).
- Demo pods and exhibitors, to learn more about partners' and Oracle's offerings.
- Receptions on both Saturday and Sunday nights, enabling you to ask all your questions to Oracle's MySQL engineers and to network with some of the world's best MySQL professionals.

Check out the MySQL Connect content catalog and find out about the amazing sessions you have the opportunity to attend.

Reminder: The early bird discount is running until July 19. Register Now to save US$500!

Plan to Attend Oracle OpenWorld or JavaOne? Add the MySQL Connect event to your Oracle OpenWorld or JavaOne registration for only US$100. Exhibit/Sponsorship opportunities are also available. We look forward to seeing you at MySQL Connect!

Read the article
The Windows Azure Software Development Kit (SDK) and the Windows Azure Training Kit (WATK)

- by BuckWoody

Windows Azure is a platform that allows you to write software, run software, or use software that we've already written. We provide lots of resources to help you do that - many can be found right here in this blog series. There are two primary resources you can use, and it's important to understand what they are and what they do.

The Windows Azure Software Development Kit (SDK)

Actually, this isn't one resource. We have SDK's for multiple development environments, such as Visual Studio and also Eclipse, along with SDK's for iOS, Android and other environments. Windows Azure is a "back end", so almost any technology or front end system can use it to solve a problem. The SDK's are primarily for development. In the case of Visual Studio, you'll get a runtime environment for Windows Azure which allows you to develop, test and even run code all locally - you do not have to be connected to Windows Azure at all, until you're ready to deploy. You'll also get a few samples and codeblocks, along with all of the libraries you need to code with Windows Azure in .NET, PHP, Ruby, Java and more. The SDK is updated frequently, so check this location to find the latest for your environment and language - just click the bar that corresponds to what you want: http://www.windowsazure.com/en-us/develop/downloads/

The Windows Azure Training Kit (WATK)

Whether you're writing code, using Windows Azure Virtual Machines (VM's) or working with Hadoop, you can use the WATK to get examples, code, PowerShell scripts, PowerPoint decks, training videos and much more. This should be your second download after the SDK. This is all of the training you need to get started, and even beyond. The WATK is updated frequently - and you can find the latest one here: http://www.windowsazure.com/en-us/develop/net/other-resources/training-kit/

There are many other resources - again, check the http://windowsazure.com site, the community newsletter (which introduces the latest features), and my blog for more.

Read the article
MySQL Connect in 4 Days - Sessions From Users and Customers

- by Bertrand Matthelié

Let’s review today the conference sessions where users and customers will describe their use of MySQL as well as best practices. Remember you can plan your schedule with Schedule Builder.

Saturday, 11.30 am, Room Golden Gate 7: MySQL and Hadoop - Chris Schneider, Ning.com
Saturday, 1.00 pm, Room Golden Gate 7: Thriving in a MySQL Replicated World - Ed Presz and Andrew Yee, Ticketmaster
Saturday, 1.00 pm, Room Golden Gate 8: Rick’s RoTs (Rules of Thumb) - Rick James, Yahoo!
Saturday, 2.30 pm, Room Golden Gate 3: Scaling Pinterest - Yashwanth Nelapati and Evrhet Milam, Pinterest
Saturday, 4.00 pm, Room Golden Gate 3: MySQL Pool Scanner: An Automated Service for Host Management - Levi Junkert, Facebook
Sunday, 10.15 am, Room Golden Gate 3: Big Data Is a Big Scam (Most of the Time) - Daniel Austin, PayPal
Sunday, 11.45 am, Room Golden Gate 3: MySQL at Twitter: Development and Deployment - Jeremy Cole and Davi Arnaut, Twitter
Sunday, 1.15 pm, Room Golden Gate 3: CERN’s MySQL-as-a-Service Deployment with Oracle VM: Empowering Users - Dawid Wojcik and Eric Grancher, DBA, CERN
Sunday, 2.45 pm, Room Golden Gate 3: Database Scaling at Mozilla - Sheeri Cabral, Mozilla
Sunday, 5.45 pm, Room Golden Gate 4: MySQL Cluster Carrier Grade Edition @ El Chavo, Latin America’s #1 Facebook Game - Carlos Morales, Playful Play

You can check out the full program here as well as in the September edition of the MySQL newsletter. Not registered yet? You can still save US$ 300 over the on-site fee – Register Now!

Read the article
Adopting Technologies for the Sake of Technologies

- by shiju

Unlike other engineering industries, the software engineering industry is really lacking maturity. The lack of maturity can see in different aspects of entire software development life cycle. I think other engineering industries are well organised and structured with common, proven engineering practices. The software engineering industry is greatly a diverse industry with different operating systems, and variety of development platforms, programming languages, frameworks and tools. Now these days, people are going behind the hypes and intellectual thoughts without understanding their core business problems and adopting technologies and practices for the sake of technologies and practices and simply becoming a “poster child” of technologies and practices. Understanding the core business problem and providing best, solid solution with a platform neutral approach, will give you more business values and ROI, instead of blindly adopting technologies and tailor-made your applications for the sake of technologies and practices. People have been simply migrating their solutions in favour of new technologies and different versions of frameworks without any business need. The “Pepsi Challenge” in the Software Development Pepsi Challenge marketing campaign of the 1980s was a popular and very interesting marketing promotion in which people taste one cup of Pepsi and another cup with Coca Cola. In the taste test, more than 50% of people were preferred Pepsi over Coca Cola. The success story behind the Pepsi was more sweetness contains in the Pepsi cola. They have simply added more sugar and more people preferred more sweet flavour. You can’t simply identify the better one after sipping one cup of cola based on the sweetness which contains. These things have been happening in the software industry for choosing development frameworks and technologies. 
People have been simply choosing frameworks based on the initial sugary feeling without understanding its core strengths and weakness. The sugary framework might be more harmful when you develop real-world systems. There is not any silver bullet for solving all kind of problems and frameworks and tools do have strengths and weakness. So it would be better to understand their strength and weakness. And please keep in mind that you have to develop real apps to understand the real capabilities and weakness of a framework. Evaluating a technology based on few blog posts will harm your projects and these bloggers might be lacking real-world experience with the framework. The Problem with Align a Development Practice with Tools Recently I have observed a discussion in a group where one guy asked suggestions for practicing Continuous Delivery (CD) as part of the agile based application engineering. Then the discussion quickly went to using and choosing a Continuous Integration (CI) tool and different people suggested different Continuous Integration (CI) tools for simply practicing Continuous Delivery. If you have worked with core agile engineering practices, you could clearly know that the real essence of agile is neither choosing a tool nor choosing a process. By simply choosing CI tool from a particular vendor will not ensure that you are delivering an evolving software based on customer feedback. You have to understand the real essence of a engineering practice and choose a right tool for practicing it instead of simply focus on a particular tool for a practicing an development practice. If you want to adopt a practice, you need a solid understanding on it with its real essence where tools are just helping us for better automation. Adopting New Technologies for the Sake of Technologies The another problem is that developers have been a tendency to adopt new technologies and simply migrating their existing apps to new technologies. 
It is okay if your existing system is having problem with a technology stack or or maintainability challenge with existing solution, and moving to new technology for solving the current problems. We have been adopting new technologies for solving new challenges like solving the scalability challenges when the application or user bases is growing unpredictably. Please keep in mind that all new technologies will become old after working with it for few years. The below Facebook status update of Janakiraman, expresses the attitude of a typical customer. For an example, Node.js is becoming a hottest buzzword in the software industry and many developers are trying to adopt Node.js for their apps. The important thing is that Node.js is a minimalist framework that does some great things for some problems, but it’s not a silver bullet. I have been also working with Node.js which is good for some problems, but really bad for choosing it for all kind of problems. By adopting new technologies for new projects is good if we could get real business values from it because newer framework would solve some existing well known problems and provide better solutions where it can incorporate good solutions for the latest challenges . But adopting a new technology for the sake of new technology is really bad idea. Another example is JavaScript is getting lot of attention so that lot of developers are developing heavy JavaScript centric web apps. First, they will adopt a client-side JavaScript MV* framework from AngularJS, Ember, Backbone etc, and develop a Single Page App(SPA) where they are repeating the mistakes we did in the past with server-side. The mistakes we did in the server-side is transforming to client-side. The problem is that people are just adopting new technologies, but not improving their solutions. I predict that many Single Page App will suck in the future. 
We need a hybrid approach that leverages both the server side and the client side for developing next-generation web apps. Another problem is the habit of using a framework you like for every kind of app. In the past, I knew some passionate Silverlight developers who tried to use that framework for all kinds of apps, including large line-of-business apps. These days developers are migrating their existing Silverlight apps in favour of the HTML5 buzzword. So the real question is: what business value do we get from these apps when we develop them for the sake of a particular technology instead of a business need? Another problem is that our solutions consultants try to provide unnecessary solutions for the sake of a particular technology or hype. For example, Big Data solutions are great for solving the problem of the three Vs: volume, velocity and variety. But trying to apply them to every application will cause problems. Say a small web site runs on a limited budget; insisting that it needs a recommendation engine built on a Hadoop-based solution with a 16-node cluster would be really horrible. If we really need a Hadoop-based solution, go for it, but forcing it onto every application would be a big disaster. It would be great if we could understand the core business problems first and then choose the right framework for the actual business problem, instead of trying to provide so many solutions.

The Problem with Being Tied to a Platform Vendor

Some organizations and teams are tied to a particular platform vendor and do not want to use any product from outside their preferred or existing vendor. They will accept any product the vendor provides, regardless of its capability.
This brings some benefits in the integration and collaboration of different products from the same vendor, but it costs you the opportunity to provide a better solution for your business problems. As a real-world scenario, a lot of companies use SAP for their ERP solutions. When they think about mobility or about developing hybrid mobile apps, they can easily find a framework from SAP, which provides a framework for HTML5-based UI development named SAPUI5. If you adopt that framework only because you prefer the existing platform vendor, you might lose opportunities to provide a better solution. Initially you might enjoy the sugary feeling provided by the platform vendor, but you have to think about developing apps that can meet future challenges. I am not saying that any framework is bad; I believe every framework is better than the others at solving at least one problem. My point is that we should not be tied to any specific platform vendor unless the organization has resource availability problems.

Being Polyglot for Providing Right Solutions

The modern software engineering industry is greatly diverse, with many different tools and platforms. A lot of open source frameworks and new programming languages are being released to the developer community, and choosing the right platform without bias is really difficult. But it would be great if we could develop a platform-neutral mindset and become polyglot developers, providing better solutions based on the actual business problems. IMHO, we should learn a new programming language and a new framework every year. This will improve our capabilities as developers and also improve our skills in our primary programming language.
Being polyglot gives individual developers and organizational teams greater opportunities, both for the developer experience and for the applications. Organizations can analyse their business problems without being tied to any technology and then provide solutions by choosing from different platforms and tools.

Summary

In this blog post, what I was trying to say is that we should not be tied to or biased towards any development platform, technology, vendor or programming language, and we should not adopt technologies and practices for their own sake. If we adopt a technology or a practice for the sake of it, we simply become a “poster child” of that technology or practice. We should not become poster children of other people’s intellectual thoughts and theories; instead we should become solutions developers and solutions consultants who provide better solutions for business problems. Being a polyglot developer is a good way to improve your skills, which lets you provide better solutions for business problems. The most important thing is to become platform-neutral developers whose passion is providing brilliant solutions. It would be great if we could provide minimalist, pragmatic business solutions. You can follow me on Twitter @shijucv

Read the article
ArchBeat Link-o-Rama for 2012-04-13

- by Bob Rhubart

TGIF!

Mobile Commerce and Engagement Stats | @digbymobile
www.digby.com
Solution architects take note: mobile is shaping your future.

OTN Architect Day - Reston, VA - May 16
www.oracle.com
The live one-day event in Reston, VA brings together architects from a broad range of disciplines and domains to share insights and expertise in the use of Oracle technologies to meet the challenges today’s solution architects regularly face. Registration is free, but seating is limited.

BPEL 11.1.1.6 Certified for Prebuilt E-Business Suite 12.1.3 SOA Integrations | Steven Chan
blogs.oracle.com
A load of links and useful information from Steven Chan.

OTN: There's an App for That
blogs.oracle.com
Get your OTN developer community content on the go with this free app for your mobile device.

Five Best Practices for Going Mobile | John Brunswick
blogs.oracle.com
John Brunswick offers some strategic considerations for delivering products, services, and information to mobile constituents.

Why My Slime Mold is Better than Your Hadoop Cluster | Todd Hoff
highscalability.com
What architects can learn from naturally occurring, self-propelled goop.

ADF version of "Modern" dialog windows | Martin Deh
blogs.oracle.com
Martin Deh describes how to use OOTB ADF components and CSS3 style elements to create iOS-style UI elements.

Perfect fit: The cloud and SOA -- but don't call it that | David Linthicum
www.infoworld.com
"The fact of the matter," says David Linthicum, "is that the best and most effective way to move to the cloud for an enterprise whose technology platforms reflect decades of enterprise IT neglect is to use SOA as an approach and process. Just don't call it 'SOA.'"

Thought for the Day
"There are two major products that come out of Berkeley: LSD and UNIX. We don't believe this to be a coincidence."
- Jeremy S. Anderson

Read the article
Podcast Show Notes: Redefining Information Management Architecture

- by Bob Rhubart-Oracle

Nothing in IT stands still, and this is certainly true of business intelligence and information management. Big Data has certainly had an impact, as have Hadoop and other technologies. That evolution was the catalyst for the collaborative effort behind a new Information Management Reference Architecture. The latest OTN ArchBeat series features a conversation with Andrew Bond, Stewart Bryson, and Mark Rittman, key players in that collaboration. These three gentlemen know each other quite well, which comes across in a conversation that is as lively and entertaining as it is informative. But don't take my word for it. Listen for yourself!

The Panelists (listed alphabetically)
- Andrew Bond, head of Enterprise Architecture at Oracle
- Oracle ACE Director Stewart Bryson, owner and Co-Founder of Red Pill Analytics
- Oracle ACE Director Mark Rittman, CIO and Co-Founder of Rittman Mead

The Conversation
- Listen to Part 1: The panel discusses how new thinking and new technologies were the catalyst for a new approach to business intelligence projects.
- Listen to Part 2: Why taking an "API" approach is important in building an agile data factory.
- Listen to Part 3: Shadow IT, "sandboxing," and how organizational changes are driving the evolution in information management architecture.

Additional Resources
The Reference Architecture that is the focus of this conversation is described in detail in these blog posts by Mark Rittman:
Introducing the Updated Oracle / Rittman Mead Information Management Reference Architecture
- Part 1: Information Architecture and the Data Factory
- Part 2: Delivering the Data Factory

Be a Guest Producer for an ArchBeat Podcast
Want to be a guest producer for an OTN ArchBeat podcast? Click here to learn how to make it happen.

Read the article
SQL Rally Pre-Con: Data Warehouse Modeling – Making the Right Choices

- by Davide Mauri

As you may have already learned from my old post or Adam’s or Kalen’s posts, there will be two SQL Rally events in Northern Europe. At the Stockholm SQL Rally, together with my friend Thomas Kejser, I’ll be delivering a pre-con on Data Warehouse Modeling:

Data warehouses play a central role in any BI solution. It's the back end upon which everything in years to come will be created. For this reason, it must be rock solid and yet flexible at the same time. To develop such a data warehouse, you must have a clear idea of its architecture, a thorough understanding of the concepts of Measures and Dimensions, and a proven engineered way to build it so that quality and stability can go hand-in-hand with cost reduction and scalability. In this workshop, Thomas Kejser and Davide Mauri will share all the information they have learned since they started working with data warehouses, giving you the guidance and tips you need to start your BI project in the best way possible: avoiding errors, making implementation effective and efficient, paving the way for a winning Agile approach, and helping you define how your team should work so that your BI solution will stand the test of time.

You'll learn:
- Data warehouse architecture and justification
- Agile methodology
- Dimensional modeling, including Kimball vs. Inmon, SCD1/SCD2/SCD3, Junk and Degenerate Dimensions, and Huge Dimensions
- Best practices, naming conventions, and lessons learned
- Loading the data warehouse, including loading Dimensions, loading Facts (Full Load, Incremental Load, Partitioned Load)
- Data warehouses and Big Data (Hadoop)
- Unit testing
- Tracking historical changes and managing large sizes

With all the Self-Service BI hype, the Data Warehouse is becoming more and more central every day: if everyone is able to analyze data using self-service tools, it’s better for them to rely on correct, uniform and coherent data.

50 people have already registered for the workshop and seats are limited, so don’t miss this opportunity to attend a workshop that is a unique combination of years and years of experience! http://www.sqlpass.org/sqlrally/2013/nordic/Agenda/PreconferenceSeminars.aspx See you there!

Read the article
Data Aggregation of CSV files java

- by royB

I have k CSV files (5, for example). Each file has m fields that form a key, and n values. I need to produce a single CSV file with the aggregated data. I'm looking for the most efficient solution to this problem, mainly in terms of speed. I don't think we will have memory issues, by the way. I would also like to know whether hashing is really a good solution, because we would have to use a 64-bit hash to reduce the chance of a collision to less than 1% (we have around 30000000 rows per aggregation).

For example:

file 1:
f1,f2,f3,v1,v2,v3,v4
a1,b1,c1,50,60,70,80
a3,b2,c4,60,60,80,90

file 2:
f1,f2,f3,v1,v2,v3,v4
a1,b1,c1,30,50,90,40
a3,b2,c4,30,70,50,90

result:
f1,f2,f3,v1,v2,v3,v4
a1,b1,c1,80,110,160,120
a3,b2,c4,90,130,130,180

The algorithms we have thought of so far:
- hashing (using a concurrent hash table)
- merge sorting the files
- DB: using mysql or hadoop or redis

The solution needs to be able to handle a huge amount of data (each file has more than two million rows).

A better example:

file 1:
country,city,peopleNum
england,london,1000000
england,coventry,500000

file 2:
country,city,peopleNum
england,london,500000
england,coventry,500000
england,manchester,500000

merged file:
country,city,peopleNum
england,london,1500000
england,coventry,1000000
england,manchester,500000

The key is: country,city. This is just an example; my real key has 6 columns and the data has 8 columns, for a total of 14 columns. We would like the solution to be the fastest in terms of data processing.
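The hashing option could look roughly like the sketch below. It is only an illustration under several assumptions not stated in the question: the key columns come first, all values are whole numbers, no field contains a quoted comma, and everything fits in memory. The names CsvAggregator and aggregate are hypothetical. The sketch uses the full key string as the map key, so rows whose keys merely share a hash code are still kept apart by equals(), and no separate 64-bit hash is needed for correctness.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: sums the value columns of rows that share the same key columns.
final class CsvAggregator {
    private CsvAggregator() {}

    /**
     * @param files      each element is one CSV file as a list of lines, header first
     * @param keyColumns number of leading columns that form the key (assumed to come first)
     * @return the merged CSV lines: the header, then one summed row per distinct key
     */
    static List<String> aggregate(List<List<String>> files, int keyColumns) {
        // LinkedHashMap keeps keys in first-seen order, so the output is deterministic.
        Map<String, long[]> sums = new LinkedHashMap<>();
        String header = null;
        for (List<String> lines : files) {
            if (lines.isEmpty()) {
                continue;
            }
            if (header == null) {
                header = lines.get(0); // assume every file has the same header
            }
            for (String line : lines.subList(1, lines.size())) {
                if (line.isEmpty()) {
                    continue;
                }
                String[] cells = line.split(",", -1);
                String key = String.join(",", Arrays.copyOfRange(cells, 0, keyColumns));
                long[] total = sums.computeIfAbsent(key, k -> new long[cells.length - keyColumns]);
                for (int i = 0; i < total.length; i++) {
                    total[i] += Long.parseLong(cells[keyColumns + i].trim());
                }
            }
        }
        List<String> out = new ArrayList<>();
        if (header != null) {
            out.add(header);
        }
        for (Map.Entry<String, long[]> entry : sums.entrySet()) {
            StringBuilder row = new StringBuilder(entry.getKey());
            for (long value : entry.getValue()) {
                row.append(',').append(value);
            }
            out.add(row.toString());
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> file1 = Arrays.asList(
                "country,city,peopleNum", "england,london,1000000", "england,coventry,500000");
        List<String> file2 = Arrays.asList(
                "country,city,peopleNum", "england,london,500000",
                "england,coventry,500000", "england,manchester,500000");
        // Prints the merged file from the second example, one line per row.
        aggregate(Arrays.asList(file1, file2), 2).forEach(System.out::println);
    }
}
```

For real files, the in-memory lists would probably be replaced by reading each file line by line (for example with a BufferedReader), which keeps only the map of running totals in memory rather than the input rows.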

Read the article
Big Data Appliance X4-2 Release Announcement

- by Jean-Pierre Dijcks

Today we are announcing the release of the 3rd generation Big Data Appliance. Read the Press Release here.

Software Focus

The focus for this 3rd generation of Big Data Appliance is:
- Comprehensive and Open - Big Data Appliance now includes all Cloudera Software, including Back-up and Disaster Recovery (BDR), Search, Impala, Navigator as well as the previously included components (like CDH, HBase and Cloudera Manager) and Oracle NoSQL Database (CE or EE).
- Lower TCO than DIY Hadoop Systems
- Simplified Operations while providing an open platform for the organization
- Comprehensive security including the new Audit Vault and Database Firewall software, Apache Sentry and Kerberos configured out-of-the-box

Hardware Update

A good place to start is to quickly review the hardware differences (no price changes!). On a per node basis the following is a comparison between old (X3-2) and new (X4-2) hardware:

             Big Data Appliance X3-2                      Big Data Appliance X4-2
CPU          2 x 8-Core Intel® Xeon® E5-2660 (2.2 GHz)    2 x 8-Core Intel® Xeon® E5-2650 V2 (2.6 GHz)
Memory       64GB                                         64GB
Disk         12 x 3TB High Capacity SAS                   12 x 4TB High Capacity SAS
InfiniBand   40Gb/sec                                     40Gb/sec
Ethernet     10Gb/sec                                     10Gb/sec

For all the details on the environmentals and other useful information, review the data sheet for Big Data Appliance X4-2. The larger disks give BDA X4-2 33% more capacity over the previous generation while adding faster CPUs. Memory for BDA is expandable to 512 GB per node, and the expansion can be done on a per-node basis, for example for NameNodes, for HBase region servers, or for NoSQL Database nodes.
Software Details

More details in terms of software and the current versions (note BDA follows a three-monthly update cycle for Cloudera and other software):

                   Big Data Appliance 2.2 Software Stack   Big Data Appliance 2.3 Software Stack
Linux              Oracle Linux 5.8 with UEK 1             Oracle Linux 6.4 with UEK 2
JDK                JDK 6                                   JDK 7
Cloudera CDH       CDH 4.3                                 CDH 4.4
Cloudera Manager   CM 4.6                                  CM 4.7

And like we said at the beginning, it is important to understand that all other Cloudera components are now included in the price of Oracle Big Data Appliance. They are fully supported by Oracle and available for all BDA customers.

For more information:
- Big Data Appliance Data Sheet
- Big Data Connectors Data Sheet
- Oracle NoSQL Database Data Sheet (CE | EE)
- Oracle Advanced Analytics Data Sheet

Read the article
How can a large, Fortran-based number crunching codebase be modernized?

- by Dave Mateer

A friend in academia asked me for advice (I'm a C# business application developer). He has a legacy codebase in the medical imaging field, which he wrote in Fortran. It does a huge amount of number crunching using vectors. He used a cluster (30ish cores) and has now moved to a single workstation with 500ish GPUs in it. The question is where to go next with the codebase so that:
- Other people can maintain it over the next 10 year cycle
- He can get faster at tweaking the software
- It can run on different infrastructures without recompiles

After some research from me (this is a super interesting area) some options are:
- Use Python and CUDA from Nvidia
- Rewrite in a functional language, for example F# or Haskell
- Go cloud based and use something like Hadoop and Java
- Learn C

What has been your experience with this? What should my friend be looking at to modernize his codebase?

UPDATE: Thanks @Mark and everyone who has answered. The reason my friend is asking this question is that it's a perfect time in the project's lifecycle to do a review. Bringing research assistants up to speed in Fortran takes time (I like C#, and especially the tooling, and can't imagine going back to older languages!!) I liked the suggestion of keeping the pure number crunching in Fortran, but wrapping it in something newer. Perhaps Python, as that seems to be gaining a foothold in academia as a general-purpose programming language that is fairly easy to pick up. See Medical Imaging and a guy who has written a Fortran wrapper for CUDA, Can I legally publish my Fortran 90 wrappers to Nvidias' CUFFT library (from the CUDA SDK)?.

Read the article
Oracle Cloud Solutions @ Cloud Expo East (June 10-12)

- by Gene Eun

Oracle is proud to be the Platinum Sponsor at next week's Cloud Expo East (June 10-12) at the Javits Center in New York City. This is the fourth consecutive year Oracle has sponsored Cloud Expo. As in years past, Oracle has a full schedule of sessions shown below. We'd love to have you be our guest at Cloud Expo East and have you attend one of our sessions and hear more about our thought leadership and leading solutions in the Cloud and Big Data. We'll also have booth #207, so please stop by and see a demo of many of our cloud offerings.

Tuesday, June 10, 4:40 pm - 5:15 pm
Top 5 Best Practices for your Application Platform As a Service
Track: Cloud Business and the API Economy | Deploying the Cloud
Room: TBD

Wednesday, June 11, 9:10 am - 10:10 am
Cloud Odyssey: A Hero’s Quest
Track: All Tracks (Keynote)
Room: Keynote Hall

Wednesday, June 11, 10:15 am - 10:45 am
Big Data Management System: Smart SQL Processing Across Hadoop and Your Data Warehouse
Track: All Tracks (General Session)
Room: Keynote Hall

Wednesday, June 11, 2:50 pm - 3:25 pm
Plug into the Cloud: Your Blueprint to Database as a Service
Track: Mobile | Hot Topics
Room: TBD

Wednesday, June 11, 2:50 pm - 3:25 pm
From Supply-led to Demand-led: Lead Your IT to Better Serve Your Users
Track: Cloud Business and the API Economy | Deploying the Cloud
Room: TBD

Thursday, June 12, 2:50 pm - 3:25 pm
Reduce Complexity and Accelerate Innovation with IaaS and PaaS
Track: Cloud Business and the API Economy | Deploying the Cloud
Room: TBD

At Cloud Expo East, you'll get to learn about and experience the latest in Cloud and Big Data. If you don't have a pass to Cloud Expo, no problem. Oracle is giving away FREE VIP Gold Passes! We would love to have you attend Cloud Expo on us. Just go to Oracle's Cloud Expo 2014 event registration page and follow the instructions for a complimentary pass. Stay tuned to this blog and follow us on Twitter (@OracleCloudZone) during and after Cloud Expo for more insight and observations about this year's conference.

Read the article
ArchBeat Link-o-Rama for 2012-06-21

- by Bob Rhubart

Software Architects Need Not Apply | Dustin Marx
"I think there is a place for software architecture," says Dustin Marx, "but a portion of our fellow software architects have harmed the reputation of the discipline." For another angle on this subject, check out Out of the Tower, Into the Trenches from the Nov/Dec edition of Oracle Magazine.

Oracle Data Integrator 11g - Faster Files | David Allan
David Allan illustrates "a big step for regular file processing on the way to super-charging big data files using Hadoop."

2012 Oracle Fusion Middleware Innovation Awards - Win a FREE Pass to Oracle OpenWorld 2012 in SF
Share your use of Oracle Fusion Middleware solutions and how they help your organization drive business innovation. You just might win a free pass to Oracle OpenWorld 2012 in San Francisco. Deadline for submissions is July 17, 2012.

WLST Domain creation using dry-run | Michel Schildmeijer
What to do "if you want to browse through your domain to check if settings you want to apply satisfy your requirements."

Cloud opens up new vistas for service orientation at Netflix | Joe McKendrick
"Many see service oriented architecture as laying the groundwork for cloud. But at one well-known company, cloud has instigated the move to SOA."

How to avoid the Portlet Skin mismatch | Martin Deh
Detailed how-to from WebCenter A-Team blogger Martin Deh.

Internationalize WebCenter Portal - Content Presenter | Stefan Krantz
Stefan Krantz explains "how to get Content Presenter and its editorials to comply with the current selected locale for the WebCenter Portal session."

Oracle Public Cloud Architecture | Tyler Jewell
Tyler Jewell discusses the multi-tenancy model and elasticity solution implemented by Oracle Cloud in this QCon presentation.

A Distributed Access Control Architecture for Cloud Computing
The authors of this InfoQ article discuss a distributed architecture based on the principles from security management and software engineering.

Thought for the Day
"Let us change our traditional attitude to the construction of programs. Instead of imagining that our main task is to instruct a computer what to do, let us concentrate rather on explaining to human beings what we want a computer to do."
- Donald Knuth
Source: Quotes for Software Engineers

Read the article
Oracle Endeca Information Discovery 3.1 is Now Available

- by p.anda

Oracle Endeca Information Discovery (OEID) 3.1 is a major release that incorporates significant new self-service discovery capabilities for business users. These include agile data mashup, extended support for unstructured analytics, and an even tighter integration with Oracle BI.

This release is available for download from:
- Oracle Delivery Cloud
- Oracle Technology Network

Some of the what's new highlights ...
- Self-service data mashup... enables access to a wider variety of personal and trusted enterprise data sources. Blend multiple data sets in a single app.
- Agile discovery dashboards... allow users to easily create, configure, and securely share discovery dashboards with intelligent defaults, intuitive wizards and drag-and-drop configuration.
- Deeper unstructured analysis ... enables users to enrich text using term extraction and whitelist tagging while the data is live.
- Enhanced integration with OBI... provides easier wizards for data selection and enables OBI Server as a self-service data source.
- Enterprise-class data discovery... offers faster performance, a trusted data connection library, improved auditing and increased data connectivity for Hadoop, web content and Oracle Data Integrator.

Find out more ... visit the OEID Overview page to download the What's New and related Data Sheet PDF documents. Have questions or want to share details about Oracle Endeca Information Discovery? The MOS Communities are a great first stop, and you can stop by the MOS OEID Community.

Read the article
What makes a person contribute to opensource? [duplicate]

- by Jibin

This question already has an answer here: Why develop free, open source programs? [closed] 14 answers

I know this is controversial, but... There are many great projects like Apache Webserver or Hadoop provided by the open source community. I often feel that the people who actually benefit (financially) from these projects are developers like me, sitting in India, working for MNCs, who have never contributed anything to any open source project so far, but earn handsomely thanks to basic googling skills and the community-provided documentation. Is it fair? I mean, no other industry in the world faces such a dilemma. Those who work get paid. I am almost starting to feel guilty about taking such advantage of something that I contribute nothing to. I had to do projects every semester in college (we could choose the projects) and I used to enjoy coding then. I want to contribute. But contributing to open source is not an assigned task [unlike college or office work]. And life is so busy... Are all these open source contributors really jobless? [just kidding..] Could someone please share some personal experience of how you started contributing, or any advice on why I should contribute? What attitude in life makes you set aside the time? Is it that you just crazily love writing code, or that you love to see your name in the contributors list, or do you have a local coding group you hang out with? Do you feel "I am destined to do this"? "This is my way of contributing back to the world"? What is the basic mentality that makes you want to contribute, while I just want to finish my work and go home? What makes you guys tick?

Read the article
