Search Results

Search found 59118 results on 2365 pages for 'data persistence'.

Page 9/2365 | < Previous Page | 5 6 7 8 9 10 11 12 13 14 15 16  | Next Page >

  • Big Data – Buzz Words: What is MapReduce – Day 7 of 21

    - by Pinal Dave
    In yesterday’s blog post we learned what is Hadoop. In this article we will take a quick look at one of the four most important buzz words which goes around Big Data – MapReduce. What is MapReduce? MapReduce was designed by Google as a programming model for processing large data sets with a parallel, distributed algorithm on a cluster. Though, MapReduce was originally Google proprietary technology, it has been quite a generalized term in the recent time. MapReduce comprises a Map() and Reduce() procedures. Procedure Map() performance filtering and sorting operation on data where as procedure Reduce() performs a summary operation of the data. This model is based on modified concepts of the map and reduce functions commonly available in functional programing. The library where procedure Map() and Reduce() belongs is written in many different languages. The most popular free implementation of MapReduce is Apache Hadoop which we will explore tomorrow. Advantages of MapReduce Procedures The MapReduce Framework usually contains distributed servers and it runs various tasks in parallel to each other. There are various components which manages the communications between various nodes of the data and provides the high availability and fault tolerance. Programs written in MapReduce functional styles are automatically parallelized and executed on commodity machines. The MapReduce Framework takes care of the details of partitioning the data and executing the processes on distributed server on run time. During this process if there is any disaster the framework provides high availability and other available modes take care of the responsibility of the failed node. As you can clearly see more this entire MapReduce Frameworks provides much more than just Map() and Reduce() procedures; it provides scalability and fault tolerance as well. A typical implementation of the MapReduce Framework processes many petabytes of data and thousands of the processing machines. How do MapReduce Framework Works? A typical MapReduce Framework contains petabytes of the data and thousands of the nodes. Here is the basic explanation of the MapReduce Procedures which uses this massive commodity of the servers. Map() Procedure There is always a master node in this infrastructure which takes an input. Right after taking input master node divides it into smaller sub-inputs or sub-problems. These sub-problems are distributed to worker nodes. A worker node later processes them and does necessary analysis. Once the worker node completes the process with this sub-problem it returns it back to master node. Reduce() Procedure All the worker nodes return the answer to the sub-problem assigned to them to master node. The master node collects the answer and once again aggregate that in the form of the answer to the original big problem which was assigned master node. The MapReduce Framework does the above Map () and Reduce () procedure in the parallel and independent to each other. All the Map() procedures can run parallel to each other and once each worker node had completed their task they can send it back to master code to compile it with a single answer. This particular procedure can be very effective when it is implemented on a very large amount of data (Big Data). The MapReduce Framework has five different steps: Preparing Map() Input Executing User Provided Map() Code Shuffle Map Output to Reduce Processor Executing User Provided Reduce Code Producing the Final Output Here is the Dataflow of MapReduce Framework: Input Reader Map Function Partition Function Compare Function Reduce Function Output Writer In a future blog post of this 31 day series we will explore various components of MapReduce in Detail. MapReduce in a Single Statement MapReduce is equivalent to SELECT and GROUP BY of a relational database for a very large database. Tomorrow In tomorrow’s blog post we will discuss Buzz Word – HDFS. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Big Data, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL

    Read the article

  • SQL Developer Data Modeler v3.3 Early Adopter: Search

    - by thatjeffsmith
    photo: Stuck in Customs via photopin cc The next version of Oracle SQL Developer Data Modeler is now available as an Early Adopter (read, beta) release. There are many new major feature enhancements to talk about, but today’s focus will be on the brand new Search mechanism. Data, data, data – SO MUCH data Google has made countless billions of dollars around a very efficient and intelligent search business. People have become accustomed to having their data accessible AND searchable. Data models can have thousands of entities or tables, each having dozens of attributes or columns. Imagine how hard it could be to find what you’re looking for here. This is the challenge we have tackled head-on in v3.3. Same location as the Search toolbar in Oracle SQL Developer (and most web browsers) Here’s how it works: Search as you type – wicked fast as the entire model is loaded into memory Supports regular expressions (regex) Results loaded to a new panel below Search across designs, models Search EVERYTHING, or filter by type Save your frequent searches Save your search results as a report Open common properties of object in search results and edit basic properties on-the-fly Want to just watch the video? We have a new Oracle Learning Library resource available now which introduces the new and improved Search mechanism in SQL Developer Data Modeler. Go watch the video and then come back. Some Screenshots This will be a pretty easy feature to pick up. Search is intuitive – we’ve already learned how to do search. Now we just have a better interface for it in SQL Developer Data Modeler. But just in case you need a couple of pointers… The SYS data dictionary in model form with Search Results If I type ‘translation’ in the search dialog, then the results will come up as hits are ‘resolved.’ By default, everything is searched, although I can filter the results after-the-fact. You can see where the search finds a match in the ‘Content’ column Save the Results as a Report If you limit the search results to a category and a model, then you can save the results as a report. All of the usual suspects You can optionally include the search string, which displays in the top of of the report as ‘PATTERN.’ You can save you common reporting setups as a template and reuse those as well. Here’s a sample HTML report: Yes, I like to search my search results report! Two More Ways to Search You can search ‘in context’ by opening the ‘Find’ dialog from an active design. You can do this using the ‘Search’ toolbar button or from a model context menu. Searching a specific model Instead of bringing up the old modal Find dialog, you now get to use the new and improved Search panel. Notice there’s no ‘Model’ drop-down to select and that the active Search form is now in the Search panel versus the search toolbar up top. What else is new in SQL Developer Data Modeler version 3.3? All kinds of goodies. You can send your model to Excel for quick edits/reviews and suck the changes back into your model, you can share objects between models, and much much more. You’ll find new videos and blog posts on the subject in the new few days and weeks. Enjoy! If you have any feedback or want to report bugs, please visit our forums.

    Read the article

  • Big Data – Operational Databases Supporting Big Data – Key-Value Pair Databases and Document Databases – Day 13 of 21

    - by Pinal Dave
    In yesterday’s blog post we learned the importance of the Relational Database and NoSQL database in the Big Data Story. In this article we will understand the role of Key-Value Pair Databases and Document Databases Supporting Big Data Story. Now we will see a few of the examples of the operational databases. Relational Databases (Yesterday’s post) NoSQL Databases (Yesterday’s post) Key-Value Pair Databases (This post) Document Databases (This post) Columnar Databases (Tomorrow’s post) Graph Databases (Tomorrow’s post) Spatial Databases (Tomorrow’s post) Key Value Pair Databases Key Value Pair Databases are also known as KVP databases. A key is a field name and attribute, an identifier. The content of that field is its value, the data that is being identified and stored. They have a very simple implementation of NoSQL database concepts. They do not have schema hence they are very flexible as well as scalable. The disadvantages of Key Value Pair (KVP) database are that they do not follow ACID (Atomicity, Consistency, Isolation, Durability) properties. Additionally, it will require data architects to plan for data placement, replication as well as high availability. In KVP databases the data is stored as strings. Here is a simple example of how Key Value Database will look like: Key Value Name Pinal Dave Color Blue Twitter @pinaldave Name Nupur Dave Movie The Hero As the number of users grow in Key Value Pair databases it starts getting difficult to manage the entire database. As there is no specific schema or rules associated with the database, there are chances that database grows exponentially as well. It is very crucial to select the right Key Value Pair Database which offers an additional set of tools to manage the data and provides finer control over various business aspects of the same. Riak Rick is one of the most popular Key Value Database. It is known for its scalability and performance in high volume and velocity database. Additionally, it implements a mechanism for collection key and values which further helps to build manageable system. We will further discuss Riak in future blog posts. Key Value Databases are a good choice for social media, communities, caching layers for connecting other databases. In simpler words, whenever we required flexibility of the data storage keeping scalability in mind – KVP databases are good options to consider. Document Database There are two different kinds of document databases. 1) Full document Content (web pages, word docs etc) and 2) Storing Document Components for storage. The second types of the document database we are talking about over here. They use Javascript Object Notation (JSON) and Binary JSON for the structure of the documents. JSON is very easy to understand language and it is very easy to write for applications. There are two major structures of JSON used for Document Database – 1) Name Value Pairs and 2) Ordered List. MongoDB and CouchDB are two of the most popular Open Source NonRelational Document Database. MongoDB MongoDB databases are called collections. Each collection is build of documents and each document is composed of fields. MongoDB collections can be indexed for optimal performance. MongoDB ecosystem is highly available, supports query services as well as MapReduce. It is often used in high volume content management system. CouchDB CouchDB databases are composed of documents which consists fields and attachments (known as description). It supports ACID properties. The main attraction points of CouchDB are that it will continue to operate even though network connectivity is sketchy. Due to this nature CouchDB prefers local data storage. Document Database is a good choice of the database when users have to generate dynamic reports from elements which are changing very frequently. A good example of document usages is in real time analytics in social networking or content management system. Tomorrow In tomorrow’s blog post we will discuss about various other Operational Databases supporting Big Data. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Big Data, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL

    Read the article

  • The data reader returned by the store data provider does not have enough columns

    - by molgan
    Hello I get the following error when I try to execute a stored procedure: "The data reader returned by the store data provider does not have enough columns" When I in the sql-manager execute it like this: DECLARE @return_value int, @EndDate datetime EXEC @return_value = [dbo].[GetSomeDate] @SomeID = 91, @EndDate = @EndDate OUTPUT SELECT @EndDate as N'@EndDate' SELECT 'Return Value' = @return_value GO It returns the value properly.... @SomeDate = '2010-03-24 09:00' And in my app I have: if (_entities.Connection.State == System.Data.ConnectionState.Closed) _entities.Connection.Open(); using (EntityCommand c = new EntityCommand("MyAppEntities.GetSomeDate", (EntityConnection)this._entities.Connection)) { c.CommandType = System.Data.CommandType.StoredProcedure; EntityParameter paramSomeID = new EntityParameter("SomeID", System.Data.DbType.Int32); paramSomeID.Direction = System.Data.ParameterDirection.Input; paramSomeID.Value = someID; c.Parameters.Add(paramSomeID); EntityParameter paramSomeDate = new EntityParameter("SomeDate", System.Data.DbType.DateTime); SomeDate.Direction = System.Data.ParameterDirection.Output; c.Parameters.Add(paramSomeDate); int retval = c.ExecuteNonQuery(); return (DateTime?)c.Parameters["SomeDate"].Value; Why does it complain about columns? I googled on error and someone said something about removing RETURN in sp, but I dont have any RETURN there. last like is like SELECT @SomeDate = D.SomeDate FROM .... /M

    Read the article

  • SQLAuthority News – Best Practices for Data Warehousing with SQL Server 2008 R2

    - by pinaldave
    An integral part of any BI system is the data warehouse—a central repository of data that is regularly refreshed from the source systems. The new data is transferred at regular intervals  by extract, transform, and load (ETL) processes. This whitepaper talks about what are best practices for Data Warehousing. This whitepaper discusses ETL, Analysis, Reporting as well relational database. The main focus of this whitepaper is on mainly ‘architecture’ and ‘performance’. Download Best Practices for Data Warehousing with SQL Server 2008 R2 Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: Best Practices, Data Warehousing, PostADay, SQL, SQL Authority, SQL Documentation, SQL Download, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology

    Read the article

  • Nagy dobás készül az Oracle adatányászati felületen, Oracle Data Mining

    - by Fekete Zoltán
    Ahogyan már a tavaly oszi Oracle OpenWorld hírekben és eloadásokban is láthattuk a beharangozót, az Oracle nagy dobásra készül az adatbányászati fronton (Oracle Data Mining), mégpedig a remekül használható adatbányászati motor grafikus felületének a kiterjesztésével. Ha jól megfigyeljük ezt az utóbbi linket, az eddigi grafikus felület már Oracle Data Miner Classic néven fut. Hogyan is lehet használni az Oracle Data Mining-ot? - Oracle Data Miner (ingyenesen letöltheto GUI az OTN-rol) - Java-ból és PL/SQL-bol, Oracle Data Mining JDeveloper and SQL Developer Extensions - Excel felületrol, Oracle Spreadsheet Add-In for Predictive Analytics - ODM Connector for mySAP BW Oracle Data Mining technikai információ.

    Read the article

  • "Oracle Certified Expert, Java Platform, Enterprise Edition 6 Java Persistence API Developer" Preparation

    - by Matt
    I have been working with Hibernate for a fews years now, and I want to solidify and demonstrate my knowledge by taking the Oracle JPA certification, also known as: "Oracle Certified Expert, Java Platform, Enterprise Edition 6 Java Persistence API Developer (CX-310-094)" There is a training course provided by Oracle: "Building Database Driven Applications with JPA (SL-370-EE6)" But this costs $1800 and I think it would be overkill for my needs. Ideally, I would like a self study guide that will cover everything in the exam. I have looked for books and these seem like possibilities: Pro JPA 2: Mastering the Java Persistence API (Expert's Voice in Java Technology) and Beginning Java EE 6 with GlassFish 3 2nd Edition (Expert's Voice in Java Technology) But these aren't checklist type study guides as far as I am aware. I found the official SCJP study guide very useful, but I think the equivalent text for the JPA exam isn't out yet. If anyone has taken this exam, I would be grateful to hear how you prepared for it. Thanks!

    Read the article

  • "Oracle Certified Expert, Java Platform, Enterprise Edition 6 Java Persistence API Developer" Preparation

    - by Matt
    I have been working with Hibernate for a fews years now, and I want to solidify and demonstrate my knowledge by taking the Oracle JPA certification, also known as: "Oracle Certified Expert, Java Platform, Enterprise Edition 6 Java Persistence API Developer (CX-310-094)" There is a training course provided by Oracle: "Building Database Driven Applications with JPA (SL-370-EE6)" But this costs $1800 and I think it would be overkill for my needs. Ideally, I would like a self study guide that will cover everything in the exam. I have looked for books and these seem like possibilities: Pro JPA 2: Mastering the Java Persistence API (Expert's Voice in Java Technology) and Beginning Java EE 6 with GlassFish 3 2nd Edition (Expert's Voice in Java Technology) But these aren't checklist type study guides as far as I am aware. I found the official SCJP study guide very useful, but I think the equivalent text for the JPA exam isn't out yet. If anyone has taken this exam, I would be grateful to hear how you prepared for it. Thanks!

    Read the article

  • Google I/O 2012 - Big Data: Turning Your Data Problem Into a Competitive Advantage

    Google I/O 2012 - Big Data: Turning Your Data Problem Into a Competitive Advantage Ju-kay Kwek, Navneet Joneja Can businesses get practical value from web-scale data without building proprietary web-scale infrastructure? This session will explore how new Google data services can be used to solve key data storage, transformation and analysis challenges. We will look at concrete case studies demonstrating how real life businesses have successfully used these solutions to turn data into a competitive business asset. For all I/O 2012 sessions, go to developers.google.com From: GoogleDevelopers Views: 1 0 ratings Time: 52:39 More in Science & Technology

    Read the article

  • EclipseLink 2.4 Released: RESTful Persistence, Tenant Isolation, NoSQL, and JSON

    - by arungupta
    EclipseLink 2.4 is released as part of Eclipse Juno release train. In addition to providing the Reference Implementation for JPA 2.0, the key features in the release are: RESTful Persistence - Expose Java Persistence units over REST using either JSON or XML Tenant Isolation - Manage entities for multiple tenants in the same application NoSQL - NoSQL support for MongoDB and Oracle NoSQL JSON - Marshaling and unmarshaling of JSON object Here is the complete list of bugs fixed in this release. The landing page provide the complete list of documentation and examples. Read Doug Clarke's blog for a color commentary as well. This release is already integrated in the latest GlassFish 4.0 promoted build. Try the functionality and give us feedback at GlassFish Forum or EclipseLink Forum.

    Read the article

  • Data-Driven SOA with Oracle Data Integrator

    - by Irem Radzik
    v\:* {behavior:url(#default#VML);} o\:* {behavior:url(#default#VML);} w\:* {behavior:url(#default#VML);} .shape {behavior:url(#default#VML);} Normal 0 false false false EN-US X-NONE X-NONE MicrosoftInternetExplorer4 /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin:0in; mso-para-margin-bottom:.0001pt; mso-pagination:widow-orphan; font-size:10.0pt; font-family:"Cambria","serif"; mso-fareast-font-family:"MS Mincho";} By Mike Eisterer, Data integration is more than simply moving data in bulk or in real-time, it is also about unifying information for improved business agility and integrating it in today’s service-oriented architectures. SOA enables organizations to easily define services which may then be discovered and leveraged by varying consumers. These consumers may be applications, customer facing portals, or complex business rules which are assembling services to automate process. Data as a foundational service provider is a key component of today’s successful SOA implementations. Oracle offers the broadest and most integrated portfolio of products to help you define, organize, orchestrate and consume data services. If you are attending Oracle OpenWorld next week, you will have ample opportunity to see the latest Oracle Data Integrator live in action and work with it yourself in two offered Hands-on Labs. Visit the hands-on lab to gain experience firsthand: Oracle Data Integrator and Oracle SOA Suite: Hands-on- Lab (HOL10480) Wed Oct 3rd 11:45AM Marriott Marquis- Salon 1/2 To learn more about Oracle Data Integrator, please visit our Introduction Hands-on LAB: Introduction to Oracle Data Integrator (HOL10481) Mon Oct 1st 3:15PM, Marriott Marquis- Salon 1/2 If you are not able to attend OpenWorld, please check out our latest resources for Data Integration.

    Read the article

  • Subsetting a data frame in a function using another data frame as parameter

    - by lecodesportif
    I would like to submit a data frame to a function and use it to subset another data frame. This is the basic data frame: foo <- data.frame(var1= c('1', '1', '1', '2', '2', '3'), var2=c('A', 'A', 'B', 'B', 'C', 'C')) I use the following function to find out the frequencies of var2 for specified values of var1. foobar <- function(x, y, z){ a <- subset(x, (x$var1 == y)) b <- subset(a, (a$var2 == z)) n=nrow(b) return(n) } Examples: foobar(foo, 1, "A") # returns 2 foobar(foo, 1, "B") # returns 1 foobar(foo, 3, "C") # returns 1 This works. But now I want to submit a data frame of values to foobar. Instead of the above examples, I would like to submit df to foobar and get the same results as above (2, 1, 1) df <- data.frame(var1=c('1','1','3'), var2=c("A", "B", "C")) When I change foobar to accept two arguments like foobar(foo, df) and use y[, c(var1)] and y[, c(var2)] instead of the two parameters x and y it still doesn't work. Which way is there to do this?

    Read the article

  • Managing Data Dependecies of Java Classes that Load Data from the Classpath at Runtime

    - by Martin Potthast
    What is the simplest way to manage dependencies of Java classes to data files present in the classpath? More specifically: How should data dependencies be annotated? Perhaps using Java annotations (e.g., @Data)? Or rather some build entries in a build script or a properties file? Is there build tool that integrates and evaluates such information (Ant, Scons, ...)? Do you have examples? Consider the following scenario: A few lines of Ant create a Jar from my sources that includes everything found on the classpath. Then jarjar is used to remove all .class files that are not necessary to execute, say, class Foo. The problem is that all the data files that class Bar depends upon are still there in the Jar. The ideal deployment script, however, would recognize that the data files on which only class Bar depends can be removed while data files on which class Foo depends must be retained. Any hints?

    Read the article

  • Best way to implement user-powered data validation

    - by vegetables
    I run a product recommendation engine and I'm hitting a few snags. I'm looking to see if anyone has any recommendations on what I should do to minimize these issues. Here's how the site works: Users come to the site and are presented with product recommendations based on some criteria. If a user knows of a product that is not in our system, they can add it by providing the product name and manufacturer. We take that information, and: Hit one API to gather all the product meta-data (and to validate the product spelling, etc). If the product is not in this first API, we do not allow it in our system. Use the information from step 1 to hit another API for pricing information (gathered from many places online). For the sake of discussion, assume that I am searching both APIs in the most efficient/successful manner possible. For the most part, this works very well. I'd say ~80% of our data is perfectly accurate, but there are a few issues: Sometimes the pricing API (Step 2) doesn't have any information for the product. The way the pricing API is built, it will always return something (theoretically, the closest possible match), and there's no guarantee that the product name is spelled exactly the same way in both APIs, so there's no automated way of knowing if it's the right product. When the pricing API finds the right product, occasionally it has outdated, or even invalid pricing data (e.g. if it screen-scraped the wrong price from a website). Since the site was fairly small at first, I was able to manually verify every product that was added to the website. However, the site has grown to the point where this is taking several hours per day, and is just not efficient use of my time. So, my question is: Aside from hiring someone (or getting an intern) to validate all the data manually, what would be the best system of letting my userbase self-manage the data. Specifically, how can I allow users to edit the data while minimizing the risk of someone ambushing my website, or accidentally setting the data incorrectly.

    Read the article

  • data handling with javascript

    - by Vincent Warmerdam
    Python has a very neat package called pandas which allows for quick data transformation; tables, aggregation, that sort of thing. A lot of these types of functionality can also be found in the python itertools module. The plyR package in R is also very similar. Usually one woud use this functionality to produce a table which is later visualized with a plot. I am personally very fond of d3, and I would like to allow the user to first indicate what type of data aggregation he wants on the dataset before it is visualized. The visualisation in question involves making a heatmap where the user gets to select the size of the bins of the heatmap beforehand (I want d3 to project this through leaflet). I want to visually select the ideal size of the bins for the heatmap. The way I work now is that I take the dataset, aggregate it with python and then manually load it in d3. This is a process that takes a lot of human effort and I was wondering if the data aggregation can be done through the javascript of the browser. I couldn't find a package for javascript specifically built for data, suggesting (to me) that this is a bad idea and that one should not use javascript for the data handling. Is there a good module/package for javascript to handle data aggregation? Is it a good/bad idea to do the data aggregation in javascript (performance wise)?

    Read the article

  • Customizing the NUnit GUI for data-driven testing

    - by rwong
    My test project consists of a set of input data files which is fed into a piece of legacy third-party software. Since the input data files for this software are difficult to construct (not something that can be done intentionally), I am not going to add new input data files. Each input data file will be subject to a set of "test functions". Some of the test functions can be invoked independently. Other test functions represent the stages of a sequential operation - if an earlier stage fails, the subsequent stages do not need to be executed. I have experimented with the NUnit parametrized test case (TestCaseAttribute and TestCaseSourceAttribute), passing in the list of data files as test cases. I am generally satisfied with the the ability to select the input data for testing. However, I would like to see if it is possible to customize its GUI's tree structure, so that the "test functions" become the children of the "input data". For example: File #1 CheckFileTypeTest GetFileTopLevelStructureTest CompleteProcessTest StageOneTest StageTwoTest StageThreeTest File #2 CheckFileTypeTest GetFileTopLevelStructureTest CompleteProcessTest StageOneTest StageTwoTest StageThreeTest This will be useful for identifying the stage that failed during the processing of a particular input file. Is there any tips and tricks that will enable the new tree layout? Do I need to customize NUnit to get this layout?

    Read the article

  • Test Data in a Distributed System

    - by Davin Tryon
    A question that has been vexing me lately has been about how to effectively test (end-to-end) features in a distributed system. Particuarly, how to effectively manage (through time) test data for feature testing. The system in question is a typical SOA setup. The composition is done in JavaScript when call to several REST APIs. Each service is built as an independent block. Each service has some kind of persistent storage (SQL Server in most cases). The main issue at the moment is how to approach test data when testing end-to-end features. Functional end-to-end testing occurs through the UI, and it is therefore necessary for test data to be set up before the test run (this could be manual or automated testing). As is typical in a distributed system, identifiers from one service are used as a link in another service. So, some level of synchronization needs to be present in the data to effectively test. What is the best way to manage and set up this data after a successful deployment to a test environment? For example, is it better to manage this test data inside each service? Or package it together with the testing suite? Does that testing suite exist as a separate project? I'm interested in design guidance about how to store and manage this test data as the application features evolve.

    Read the article

  • Data Management Business Continuity Planning

    Business Continuity Governance In order to ensure data continuity for an organization, they need to ensure they know how to handle a data or network emergency because all systems have the potential to fail. Data Continuity Checklist: Disaster Recovery Plan/Policy Backups Redundancy Trained Staff Business Continuity Policies In order to protect data in case of any emergency a company needs to put in place a Disaster recovery plan and policies that can be executed by IT staff to ensure the continuity of the existing data and/or limit the amount of data that is not contiguous.  A disaster recovery plan is a comprehensive statement of consistent actions to be taken before, during and after a disaster, according to Geoffrey H. Wold. He also states that the primary objective of disaster recovery planning is to protect the organization in the event that all or parts of its operations and/or computer services are rendered unusable. Furthermore, companies can mandate through policies that IT must maintain redundant hardware in case of any hardware failures and redundant network connectivity incase the primary internet service provider goes down.  Additionally, they can require that all staff be trained in regards to the Disaster recovery policy to ensure that all parties evolved are knowledgeable to execute the recovery plan. Business Continuity Procedures Business continuity procedure vary from organization to origination, however there are standard procedures that most originations should follow. Standard Business Continuity Procedures Backup and Test Backups to ensure that they work Hire knowledgeable and trainable staff  Offer training on new and existing systems Regularly monitor, test, maintain, and upgrade existing system hardware and applications Maintain redundancy regarding all data, and critical business functionality

    Read the article

  • How to choose how to store data?

    - by Eldros
    Give a man a fish and you feed him for a day. Teach a man to fish and you feed him for a lifetime. - Chinese Proverb I could ask what kind of data storage I should use for my actual project, but I want to learn to fish, so I don't need to ask for a fish each time I begin a new project. So, until I used two methods to store data on my non-game project: XML files, and relational databases. I know that there is also other kind of database, of the NoSQL kind. However I wouldn't know if there is more choice available to me, or how to choose in the first place, aside arbitrary picking one. So the question is the following: How should I choose the kind of data storage for a game project? And I would be interested on the following criterion when choosing: The size of the project. The platform targeted by the game. The complexity of the data structure. Added Portability of data amongst many project. Added How often should the data be accessed Added Multiple type of data for a same application Any other point you think is of interest when deciding what to use. EDIT I know about Would it be better to use XML/JSON/Text or a database to store game content?, but thought it didn't address exactly my point. Now if I am wrong, I would gladely be shown the error in my ways.

    Read the article

  • Is JPA persistence.xml classpath located?

    - by Vinnie
    Here's what I'm trying to do. I'm using JPA persistence in a web application, but I have a set of unit tests that I want to run outside of a container. I have my primary persistence.xml in the META_INF folder of my main app and it works great in the container (Glassfish). I placed a second persistence.xml in the META-INF folder of my test-classes directory. This contains a separate persistence unit that I want to use for test only. In eclipse, I placed this folder higher in the classpath than the default folder and it seems to work. Now when I run the maven build directly from the command line and it attempts to run the unit tests, the persistence.xml override is ignored. I can see the override in the META-INF folder of the maven generated test-classes directory and I expected the maven tests to use this file, but it isn't. My Spring test configuration overrides, achieved in a similar fashion are working. I'm confused at to whether the persistence.xml is located through the classpath. If it were, my override should work like the spring override since the maven surefire plugin explains "[The test class directory] will be included at the beginning the test classpath". Did I wrongly anticipate how the persistence.xml file is located? I could (and have) create a second persistence unit in the production persistence.xml file, but it feels dirty to place test configuration into this production file. Any other ideas on how to achieve my goal is welcome.

    Read the article

  • Data Masking Pack 12.1.0.3 Certified with E-Business Suite 12.1.3

    - by Elke Phelps (Oracle Development)
    I'm pleased to announce the certification of the E-Business Suite 12.1.3 Data Masking Template for the Data Masking Pack with Enterprise Manager Cloud Control 12.1.0.3. You can use the Oracle Data Masking Pack with Oracle Enterprise Manager Grid Control 12c to scramble sensitive data in cloned E-Business Suite environments.     You may scramble data in E-Business Suite cloned environments with EM12.1.0.3 using the following template: E-Business Suite 12.1.3 Data Masking Template for Data Masking Pack with EM12c (Patch 18462641) What does data masking do in E-Business Suite environments? Application data masking does the following: De-identify the data:  Scramble identifiers of individuals, also known as personally identifiable information or PII.  Examples include information such as name, account, address, location, and driver's license number. Mask sensitive data:  Mask data that, if associated with personally identifiable information (PII), would cause privacy concerns.  Examples include compensation, health and employment information.   Maintain data validity:  Provide a fully functional application.  How can EBS customers use data masking? The Oracle E-Business Suite Template for Data Masking Pack can be used in situations where confidential or regulated data needs to be shared with other non-production users who need access to some of the original data, but not necessarily every table.  Examples of non-production users include internal application developers or external business partners such as offshore testing companies, suppliers or customers.  Due to data dependencies, scrambling E-Business Suite data is not a trivial task.  The data needs to be scrubbed in such a way that allows the application to continue to function. The template works with the Oracle Data Masking Pack and Oracle Enterprise Manager to obscure sensitive E-Business Suite information that is copied from production to non-production environments.  The Oracle E-Business Suite Template for Data Masking Pack is applied to a non-production environment with the Enterprise Manager Grid Control Data Masking Pack.  When applied, the Oracle E-Business Suite Template for Data Masking Pack will create an irreversibly scrambled version of your production database for development and testing. Is there a charge for this? Yes. You must purchase licenses for the Oracle Data Masking Pack to use the Oracle E-Business Suite 12.1.3 template. The Oracle E-Business Suite 12.1.3 Template for the Data Masking Pack is included with the Oracle Data Masking Pack license.  You can contact your Oracle account manager for more details about licensing. References Additional details and requirements are provided in the following My Oracle Support Note: Using Oracle E-Business Suite Release 12.1.3 Template for the Data Masking Pack with Oracle Enterprise Manager 12.1 Data Masking Tool (Note 1481916.1) Masking Sensitive Data in the Oracle Database Real Application Testing User's Guide 11g Release 2 (11.2) Related Articles Scrambling Sensitive Data in E-Business Suite E-Business Suite 12.1.3 Data Masking Certified with Enterprise Manager 12c

    Read the article

  • Java EE 6 and NoSQL/MongoDB on GlassFish using JPA and EclipseLink 2.4 (TOTD #175)

    - by arungupta
    TOTD #166 explained how to use MongoDB in your Java EE 6 applications. The code in that tip used the APIs exposed by the MongoDB Java driver and so requires you to learn a new API. However if you are building Java EE 6 applications then you are already familiar with Java Persistence API (JPA). Eclipse Link 2.4, scheduled to release as part of Eclipse Juno, provides support for NoSQL databases by mapping a JPA entity to a document. Their wiki provides complete explanation of how the mapping is done. This Tip Of The Day (TOTD) will show how you can leverage that support in your Java EE 6 applications deployed on GlassFish 3.1.2. Before we dig into the code, here are the key concepts ... A POJO is mapped to a NoSQL data source using @NoSQL or <no-sql> element in "persistence.xml". A subset of JPQL and Criteria query are supported, based upon the underlying data store Connection properties are defined in "persistence.xml" Now, lets lets take a look at the code ... Download the latest EclipseLink 2.4 Nightly Bundle. There is a Installer, Source, and Bundle - make sure to download the Bundle link (20120410) and unzip. Download GlassFish 3.1.2 zip and unzip. Install the Eclipse Link 2.4 JARs in GlassFish Remove the following JARs from "glassfish/modules": org.eclipse.persistence.antlr.jar org.eclipse.persistence.asm.jar org.eclipse.persistence.core.jar org.eclipse.persistence.jpa.jar org.eclipse.persistence.jpa.modelgen.jar org.eclipse.persistence.moxy.jar org.eclipse.persistence.oracle.jar Add the following JARs from Eclipse Link 2.4 nightly build to "glassfish/modules": org.eclipse.persistence.antlr_3.2.0.v201107111232.jar org.eclipse.persistence.asm_3.3.1.v201107111215.jar org.eclipse.persistence.core.jpql_2.4.0.v20120407-r11132.jar org.eclipse.persistence.core_2.4.0.v20120407-r11132.jar org.eclipse.persistence.jpa.jpql_2.0.0.v20120407-r11132.jar org.eclipse.persistence.jpa.modelgen_2.4.0.v20120407-r11132.jar org.eclipse.persistence.jpa_2.4.0.v20120407-r11132.jar org.eclipse.persistence.moxy_2.4.0.v20120407-r11132.jar org.eclipse.persistence.nosql_2.4.0.v20120407-r11132.jar org.eclipse.persistence.oracle_2.4.0.v20120407-r11132.jar Start MongoDB Download latest MongoDB from here (2.0.4 as of this writing). Create the default data directory for MongoDB as: sudo mkdir -p /data/db/sudo chown `id -u` /data/db Refer to Quickstart for more details. Start MongoDB as: arungup-mac:mongodb-osx-x86_64-2.0.4 <arungup> ->./bin/mongod./bin/mongod --help for help and startup optionsMon Apr  9 12:56:02 [initandlisten] MongoDB starting : pid=3124 port=27017 dbpath=/data/db/ 64-bit host=arungup-mac.localMon Apr  9 12:56:02 [initandlisten] db version v2.0.4, pdfile version 4.5Mon Apr  9 12:56:02 [initandlisten] git version: 329f3c47fe8136c03392c8f0e548506cb21f8ebfMon Apr  9 12:56:02 [initandlisten] build info: Darwin erh2.10gen.cc 9.8.0 Darwin Kernel Version 9.8.0: Wed Jul 15 16:55:01 PDT 2009; root:xnu-1228.15.4~1/RELEASE_I386 i386 BOOST_LIB_VERSION=1_40Mon Apr  9 12:56:02 [initandlisten] options: {}Mon Apr  9 12:56:02 [initandlisten] journal dir=/data/db/journalMon Apr  9 12:56:02 [initandlisten] recover : no journal files present, no recovery neededMon Apr  9 12:56:02 [websvr] admin web console waiting for connections on port 28017Mon Apr  9 12:56:02 [initandlisten] waiting for connections on port 27017 Check out the JPA/NoSQL sample from SVN repository. The complete source code built in this TOTD can be downloaded here. Create Java EE 6 web app Create a Java EE 6 Maven web app as: mvn archetype:generate -DarchetypeGroupId=org.codehaus.mojo.archetypes -DarchetypeArtifactId=webapp-javaee6 -DgroupId=model -DartifactId=javaee-nosql -DarchetypeVersion=1.5 -DinteractiveMode=false Copy the model files from the checked out workspace to the generated project as: cd javaee-nosqlcp -r ~/code/workspaces/org.eclipse.persistence.example.jpa.nosql.mongo/src/model src/main/java Copy "persistence.xml" mkdir src/main/resources cp -r ~/code/workspaces/org.eclipse.persistence.example.jpa.nosql.mongo/src/META-INF ./src/main/resources Add the following dependencies: <dependency> <groupId>org.eclipse.persistence</groupId> <artifactId>org.eclipse.persistence.jpa</artifactId> <version>2.4.0-SNAPSHOT</version> <scope>provided</scope></dependency><dependency> <groupId>org.eclipse.persistence</groupId> <artifactId>org.eclipse.persistence.nosql</artifactId> <version>2.4.0-SNAPSHOT</version></dependency><dependency> <groupId>org.mongodb</groupId> <artifactId>mongo-java-driver</artifactId> <version>2.7.3</version></dependency> The first one is for the EclipseLink latest APIs, the second one is for EclipseLink/NoSQL support, and the last one is the MongoDB Java driver. And the following repository: <repositories> <repository> <id>EclipseLink Repo</id> <url>http://www.eclipse.org/downloads/download.php?r=1&amp;nf=1&amp;file=/rt/eclipselink/maven.repo</url> <snapshots> <enabled>true</enabled> </snapshots> </repository>  </repositories> Copy the "Test.java" to the generated project: mkdir src/main/java/examplecp -r ~/code/workspaces/org.eclipse.persistence.example.jpa.nosql.mongo/src/example/Test.java ./src/main/java/example/ This file contains the source code to CRUD the JPA entity to MongoDB. This sample is explained in detail on EclipseLink wiki. Create a new Servlet in "example" directory as: package example;import java.io.IOException;import java.io.PrintWriter;import javax.servlet.ServletException;import javax.servlet.annotation.WebServlet;import javax.servlet.http.HttpServlet;import javax.servlet.http.HttpServletRequest;import javax.servlet.http.HttpServletResponse;/** * @author Arun Gupta */@WebServlet(name = "TestServlet", urlPatterns = {"/TestServlet"})public class TestServlet extends HttpServlet { protected void processRequest(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException { response.setContentType("text/html;charset=UTF-8"); PrintWriter out = response.getWriter(); try { out.println("<html>"); out.println("<head>"); out.println("<title>Servlet TestServlet</title>"); out.println("</head>"); out.println("<body>"); out.println("<h1>Servlet TestServlet at " + request.getContextPath() + "</h1>"); try { Test.main(null); } catch (Exception ex) { ex.printStackTrace(); } out.println("</body>"); out.println("</html>"); } finally { out.close(); } } @Override protected void doGet(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException { processRequest(request, response); } @Override protected void doPost(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException { processRequest(request, response); }} Build the project and deploy it as: mvn clean packageglassfish3/bin/asadmin deploy --force=true target/javaee-nosql-1.0-SNAPSHOT.war Accessing http://localhost:8080/javaee-nosql/TestServlet shows the following messages in the server.log: connecting(EISLogin( platform=> MongoPlatform user name=> "" MongoConnectionSpec())) . . .Connected: User: Database: 2.7  Version: 2.7 . . .Executing MappedInteraction() spec => null properties => {mongo.collection=CUSTOMER, mongo.operation=INSERT} input => [DatabaseRecord( CUSTOMER._id => 4F848E2BDA0670307E2A8FA4 CUSTOMER.NAME => AMCE)]. . .Data access result: [{TOTALCOST=757.0, ORDERLINES=[{DESCRIPTION=table, LINENUMBER=1, COST=300.0}, {DESCRIPTION=balls, LINENUMBER=2, COST=5.0}, {DESCRIPTION=rackets, LINENUMBER=3, COST=15.0}, {DESCRIPTION=net, LINENUMBER=4, COST=2.0}, {DESCRIPTION=shipping, LINENUMBER=5, COST=80.0}, {DESCRIPTION=handling, LINENUMBER=6, COST=55.0},{DESCRIPTION=tax, LINENUMBER=7, COST=300.0}], SHIPPINGADDRESS=[{POSTALCODE=L5J1H7, PROVINCE=ON, COUNTRY=Canada, CITY=Ottawa,STREET=17 Jane St.}], VERSION=2, _id=4F848E2BDA0670307E2A8FA8,DESCRIPTION=Pingpong table, CUSTOMER__id=4F848E2BDA0670307E2A8FA7, BILLINGADDRESS=[{POSTALCODE=L5J1H8, PROVINCE=ON, COUNTRY=Canada, CITY=Ottawa, STREET=7 Bank St.}]}] You'll not see any output in the browser, just the output in the console. But the code can be easily modified to do so. Once again, the complete Maven project can be downloaded here. Do you want to try accessing relational and non-relational (aka NoSQL) databases in the same PU ?

    Read the article

  • Oracle Data Warehouse and Big Data Magazine MAY Edition for Customers + Partners

    - by KLaker
    Follow us on The latest edition of our monthly data warehouse and big data magazine for Oracle customers and partners is now available. The content for this magazine is taken from the various data warehouse and big data Oracle product management blogs, Oracle press releases, videos posted on Oracle Media Network and Oracle Facebook pages. Click here to view the May Edition Please share this link http://flip.it/fKOUS to our magazine with your customers and partners This magazine is optimized for display on tablets and smartphones using the Flipboard App which is available from the Apple App store and Google Play store

    Read the article

  • Version control and data provenance in charts, slides, and marketing materials that derive from code ouput

    - by EMS
    I develop as part of a small team that mostly does research and statistics stuff. But from the output of our code, other teams often create promotional materials, slides, presentations, etc. We run into a big problem because the marketing team (non-programmers) tend to use Excel, Adobe products, or other tools to carry out their work, and just want easy-to-use data formats from us. This leads to data provenance problems. We see email chains with attachments from 6 months ago and someone is saying "Hey, who generated this data. Can you generate more of it with the recent 6 months of results added in?" I want to help the other teams effectively use version control (my team uses it reasonably well for the code, but every other team classically comes up with many excuses to avoid it). For version controlling a software project where the participants are coders, I have some reasonable understanding of best practices and what to do. But for getting a team of marketing professionals to version control marketing materials and associate metadata about the software used to generate the data for the charts, I'm a bit at a loss. Some of the goals I'd like to achieve: Data that supported a material should never be associated with a person. As in, it should never be the case that someone says "Hey Person XYZ, I see you sent me this data as an attachment 6 months ago, can you update it for me?" Rather, data should be associated with the code and code-version of any code that was used to get it, and perhaps a team of many people who may maintain that code. Then references for data updates are about executing a specific piece of code, with a known version number. I'd like this to be a process that works easily with the tech that the marketing team already uses (e.g. Excel files, Adobe file, whatever). I don't want to burden them with needing to learn a bunch of new stuff just to use version control. They are capable folks, so learning something is fine. Ideally they could use our existing version control framework, but there are some issues around that. I think knowing some general best practices will be enough though, and I can handle patching that into the way our stuff works now. Are there any goals I am failing to think about? What are the time-tested ways to do something like this?

    Read the article

  • What does it mean to treat data as an asset?

    What does it mean to treat data as an asset? When considering this concept, we must define what data is and how it can be considered an asset. Data can easily be defined as a collection of stored truths that are open to interpretation and manipulation.  Expanding on this definition, data can be viewed as a set of captured facts, measurements, and ideas used to make decisions. Furthermore, InvestorsWords.com defines asset as any item of economic value owned by an individual or corporation. Now let’s apply this definition of asset to our definition of data, and ask the following question. Can facts, measurements and ideas be items that are of economic value owned by an individual or corporation? The obvious answer is yes; data can be bought and sold like commodities or analyzed to make smarter business decisions.  We can look at the economic value of data in one of two ways. First, data can be sold as a commodity that can take the form of goods like eBooks, Training, Music, Movies, and so on. Customers are willing to pay to gain access to this data for their consumption. This directly implies that there is an economic value for data in the form of a commodity because customers see a value in obtaining it.  Secondly data can be used in making smarter business decisions that allow for companies to become more profitable and/or reduce their potential for risk in regards to how they operate.  In the past I have worked at companies where we had to analyze previous sales activities in conjunction with current activities to determine how the company was preforming for the quarter.  In addition trends can be formulated based on existing data that allow companies to forecast data so that they can make strategic business decisions based sound forecasted data. Companies that truly value their data are constantly trying to grow and upgrade their data and supporting applications because it is the life blood of a company. If we look at an eBook retailer for example, imagine if they lost all of their data. They would be in essence forced out of business because they would have nothing to sell. In turn, if we look at a company that was using data to facilitate better decision making processes and they lost all of their data then they could be losing potential revenue and/ or increasing the company’s losses by making important business decisions virtually in the dark compared to when they were made on solid data.

    Read the article

< Previous Page | 5 6 7 8 9 10 11 12 13 14 15 16  | Next Page >