Search Results

Search found 58956 results on 2359 pages for 'data structures'.

Page 4/2359 | < Previous Page | 1 2 3 4 5 6 7 8 9 10 11 12 | Next Page >

Move data from others user accounts in my user account

- by user118136

I had problems with compiz setting and I make multiple accounts, now I want to transfer my information from all deleted users in my current account, some data I can not copy because I am not right to read, I type in terminal "sudo nautilus" and I get the permission for read, but the copied data is available only for superusers and I must charge the permissions for each file and each folder. How I can copy the information with out the superuser rights OR how I can charge the permissions for selected folder and all files and folders included in it?

Read the article
Should I keep investing into data structures and algorithms?

- by 4bu3li

These days, I'm investing heavily in data structures and algorithms and trying to solve some programming puzzles. I'm trying to code and solve with Java and Clojure. Am I wasting my time? should I invest more in technologies and frameworks that I already know in order to gain deeper knowledge (the ins and the outs) and be able to code with them more quickly? By studying data structures and algorithms, am I going to become a better programmer or those subjects are only important during college years?

Read the article
When and why are certain data structures used in the context of web development?

- by Ein Doofus

While browsing around the MSDN I came across: http://msdn.microsoft.com/en-us/library/aa287104%28v=vs.71%29 which lists various data structures such as: Queues Stacks Hashtables Binary Trees Binary Search Trees Graphs (I believe there are also Lists) and I was hoping to get a high-level overview of when these various data structures can be used in the broad context of web development, and when used, why one data structure is generally used instead of any other one.

Read the article
Should I keep investing into data structures and algorithms?

- by Chiron

These days, I'm investing heavily in data structures and algorithms and trying to solve some programming puzzles. I'm trying to code and solve with Java and Clojure. Am I wasting my time? should I invest more in technologies and frameworks that I already know in order to gain deeper knowledge (the ins and the outs) and be able to code with them more quickly? By studying data structures and algorithms, am I going to become a better programmer or those subjects are only important during college years?

Read the article
Explaining persistent data structures in simple terms

- by Jason Baker

I'm working on a library for Python that implements some persistent data structures (mainly as a learning exercise). However, I'm beginning to learn that explaining persistent data structures to people unfamiliar with them can be difficult. Can someone help me think of an easy (or at least the least complicated) way to describe persistent data structures to them? I've had a couple of people tell me that the documentation that I have is somewhat confusing. (And before anyone asks, no I don't mean persistent data structures as in persisted to the file system. Google persistent data structures if you're unclear on this.)

Read the article
SQL SERVER – Guest Post – Architecting Data Warehouse – Niraj Bhatt

- by pinaldave

Niraj Bhatt works as an Enterprise Architect for a Fortune 500 company and has an innate passion for building / studying software systems. He is a top rated speaker at various technical forums including Tech·Ed, MCT Summit, Developer Summit, and Virtual Tech Days, among others. Having run a successful startup for four years Niraj enjoys working on – IT innovations that can impact an enterprise bottom line, streamlining IT budgets through IT consolidation, architecture and integration of systems, performance tuning, and review of enterprise applications. He has received Microsoft MVP award for ASP.NET, Connected Systems and most recently on Windows Azure. When he is away from his laptop, you will find him taking deep dives in automobiles, pottery, rafting, photography, cooking and financial statements though not necessarily in that order. He is also a manager/speaker at BDOTNET, Asia’s largest .NET user group. Here is the guest post by Niraj Bhatt. As data in your applications grows it’s the database that usually becomes a bottleneck. It’s hard to scale a relational DB and the preferred approach for large scale applications is to create separate databases for writes and reads. These databases are referred as transactional database and reporting database. Though there are tools / techniques which can allow you to create snapshot of your transactional database for reporting purpose, sometimes they don’t quite fit the reporting requirements of an enterprise. These requirements typically are data analytics, effective schema (for an Information worker to self-service herself), historical data, better performance (flat data, no joins) etc. This is where a need for data warehouse or an OLAP system arises. A Key point to remember is a data warehouse is mostly a relational database. It’s built on top of same concepts like Tables, Rows, Columns, Primary keys, Foreign Keys, etc. Before we talk about how data warehouses are typically structured let’s understand key components that can create a data flow between OLTP systems and OLAP systems. There are 3 major areas to it: a) OLTP system should be capable of tracking its changes as all these changes should go back to data warehouse for historical recording. For e.g. if an OLTP transaction moves a customer from silver to gold category, OLTP system needs to ensure that this change is tracked and send to data warehouse for reporting purpose. A report in context could be how many customers divided by geographies moved from sliver to gold category. In data warehouse terminology this process is called Change Data Capture. There are quite a few systems that leverage database triggers to move these changes to corresponding tracking tables. There are also out of box features provided by some databases e.g. SQL Server 2008 offers Change Data Capture and Change Tracking for addressing such requirements. b) After we make the OLTP system capable of tracking its changes we need to provision a batch process that can run periodically and takes these changes from OLTP system and dump them into data warehouse. There are many tools out there that can help you fill this gap – SQL Server Integration Services happens to be one of them. c) So we have an OLTP system that knows how to track its changes, we have jobs that run periodically to move these changes to warehouse. The question though remains is how warehouse will record these changes? This structural change in data warehouse arena is often covered under something called Slowly Changing Dimension (SCD). While we will talk about dimensions in a while, SCD can be applied to pure relational tables too. SCD enables a database structure to capture historical data. This would create multiple records for a given entity in relational database and data warehouses prefer having their own primary key, often known as surrogate key. As I mentioned a data warehouse is just a relational database but industry often attributes a specific schema style to data warehouses. These styles are Star Schema or Snowflake Schema. The motivation behind these styles is to create a flat database structure (as opposed to normalized one), which is easy to understand / use, easy to query and easy to slice / dice. Star schema is a database structure made up of dimensions and facts. Facts are generally the numbers (sales, quantity, etc.) that you want to slice and dice. Fact tables have these numbers and have references (foreign keys) to set of tables that provide context around those facts. E.g. if you have recorded 10,000 USD as sales that number would go in a sales fact table and could have foreign keys attached to it that refers to the sales agent responsible for sale and to time table which contains the dates between which that sale was made. These agent and time tables are called dimensions which provide context to the numbers stored in fact tables. This schema structure of fact being at center surrounded by dimensions is called Star schema. A similar structure with difference of dimension tables being normalized is called a Snowflake schema. This relational structure of facts and dimensions serves as an input for another analysis structure called Cube. Though physically Cube is a special structure supported by commercial databases like SQL Server Analysis Services, logically it’s a multidimensional structure where dimensions define the sides of cube and facts define the content. Facts are often called as Measures inside a cube. Dimensions often tend to form a hierarchy. E.g. Product may be broken into categories and categories in turn to individual items. Category and Items are often referred as Levels and their constituents as Members with their overall structure called as Hierarchy. Measures are rolled up as per dimensional hierarchy. These rolled up measures are called Aggregates. Now this may seem like an overwhelming vocabulary to deal with but don’t worry it will sink in as you start working with Cubes and others. Let’s see few other terms that we would run into while talking about data warehouses. ODS or an Operational Data Store is a frequently misused term. There would be few users in your organization that want to report on most current data and can’t afford to miss a single transaction for their report. Then there is another set of users that typically don’t care how current the data is. Mostly senior level executives who are interesting in trending, mining, forecasting, strategizing, etc. don’t care for that one specific transaction. This is where an ODS can come in handy. ODS can use the same star schema and the OLAP cubes we saw earlier. The only difference is that the data inside an ODS would be short lived, i.e. for few months and ODS would sync with OLTP system every few minutes. Data warehouse can periodically sync with ODS either daily or weekly depending on business drivers. Data marts are another frequently talked about topic in data warehousing. They are subject-specific data warehouse. Data warehouses that try to span over an enterprise are normally too big to scope, build, manage, track, etc. Hence they are often scaled down to something called Data mart that supports a specific segment of business like sales, marketing, or support. Data marts too, are often designed using star schema model discussed earlier. Industry is divided when it comes to use of data marts. Some experts prefer having data marts along with a central data warehouse. Data warehouse here acts as information staging and distribution hub with spokes being data marts connected via data feeds serving summarized data. Others eliminate the need for a centralized data warehouse citing that most users want to report on detailed data. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: Best Practices, Business Intelligence, Data Warehousing, Database, Pinal Dave, PostADay, Readers Contribution, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology

Read the article
Data Mining Resources

- by Dejan Sarka

There are many different types of analyses, each one with its own pros and cons. Relational reports have a predefined structure, and end users cannot change it. They are simple to use for end users. Reports can use real-time data and snapshots of data to show the state of a report at specific points in time. One of the drawbacks is that report authoring is limited to IT pros and advanced users. Any kind of dynamic restructuring is very limited. If real-time data is used for a report, the report has a negative impact on the performance of the source system. Processing of the reports might be slow because the data comes from relational database management systems, which are not optimized for reporting only. If you create a semantic model of your data, your end users can create ad-hoc report structures. However, the development is more complex because a developer is needed to create these semantic models. For OLAP, you typically use specialized database management systems. You get lightning speed of analyses. End users can use rich and thin clients to interactively change the structure of the report. Typically, they do it graphically. However, the development of an OLAP system is many times quite complex. It involves the preparation and maintenance of an enterprise data warehouse and OLAP cubes. In order to exploit the possibility of real-time restructuring of reports, the users must be both active and educated. The data is usually stale, as it is loaded into data warehouses and OLAP cubes with a scheduled process. With data mining, a structure is not selected in advance; it searches for the structure. As a result, data mining can give you the most valuable results because you can discover patterns you did not expect. A data mining model structure is limited only by the attributes that you use to train the model. One of the drawbacks is that a lot of knowledge is needed for a successful data mining project. End users have to understand the results. Subject matter experts and IT professionals need to understand business problem thoroughly. The development might be sometimes even more complex than the development of OLAP cubes. Each type of analysis has its own place in an enterprise system. SQL Server has tools for all kinds of analyses. However, data mining is the most advanced way of analyzing the data; this is the “I” in BI. In order to get the most out of it, you need to learn quite a lot. In this blog post, I am gathering together resources for learning, including forthcoming events. Books Multiple authors: SQL Server MVP Deep Dives – I wrote an introductory data mining chapter there. Erik Veerman, Teo Lachev and Dejan Sarka: MCTS Self-Paced Training Kit (Exam 70-448): Microsoft SQL Server 2008 - Business Intelligence Development and Maintenance – you can find a good overview of a complete BI solution, including data mining, in this book. Jamie MacLennan, ZhaoHui Tang, and Bogdan Crivat: Data Mining with Microsoft SQL Server 2008 – can’t miss this book if you want to mine your data with SQL Server tools. Michael Berry, Gordon Linoff: Mastering Data Mining: The Art and Science of Customer Relationship Management – data mining from both, business and technical perspective. Dorian Pyle: Data Preparation for Data Mining – an in-depth book about data preparation. Thomas and Ronald Wonnacott: Introductory Statistics – if you thought that you could get away without statistics, then you are not serious about data mining. Jiawei Han and Micheline Kamber: Data Mining Concepts and Techniques – in-depth explanation of the most popular data mining algorithms. Michael Berry and Gordon Linoff: Data Mining Techniques – another book that explains data mining algorithms, more fro a business perspective. Paolo Guidici: Applied Data Mining – very mathematical book, only if you enjoy statistics and mathematics in general. Forthcoming presentations I am presenting two data mining related sessions during the PASS Summit in Charlotte, NC: Wednesday, October 16th, 2013 - Fraud Detection: Notes from the Field – I am showing how to use data mining for a specific business problem. The presentation is based on real-life projects. Friday, October 18th: Excel 2013 Advanced Analytics – I am focusing on Excel Data Mining Add-ins, and how to use them together with Power Pivot and other add-ins. This is the most you can get out of Excel. Sinergija 2013, Belgrade, Serbia Tuesday, October 22nd: Excel 2013 Analytics to the Max – another presentation focusing on the most advanced analytics you can get in Excel. SQL Rally Amsterdam, Netherlands Thursday, November 7th: Advanced Analytics in Excel 2013 – and again I am presenting about data mining in Excel. Why three different titles for the same presentation? I don’t know, I guess I forgot the name I proposed every time right after I sent the proposal. Courses Data Mining with SQL Server 2012 – I wrote a 3-day course for SolidQ. If you are interested in this course, which I could also deliver in a shorter seminar way, you can contact your closes SolidQ subsidiary, or, of course, me directly on addresses [email protected] or [email protected]. This course could also complement the existing courseware portfolio of training providers, which are welcome to contact me as well. OK, now you know: no more excuses, start learning data mining, get the most out of your data

Read the article
Master Data Services Employees Sample Model

- by Davide Mauri

I’ve been playing with Master Data Services quite a lot in those last days and I’m also monitoring the web for all available resources on it. Today I’ve found this freshly released sample available on MSDN Code Gallery: SQL Server Master Data Services Employee Sample Model http://code.msdn.microsoft.com/SSMDSEmployeeSample This sample shows how Recursive Hierarchies can be modeled in order to represent a typical organizational chart scenario where a self-relationship exists on the Employee entity. Share this post: email it! | bookmark it! | digg it! | reddit! | kick it! | live it!

Read the article
Data Integration 12c Raising the Big Data Roof at Oracle OpenWorld

- by Tanu Sood

Normal 0 false false false EN-US X-NONE X-NONE /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin:0in; mso-para-margin-bottom:.0001pt; mso-pagination:widow-orphan; font-family:"Times New Roman","serif"; mso-fareast-font-family:"MS Mincho";} Author: Dain Hansen, Director, Oracle It was an exciting OpenWorld 2013 for us in the Data Integration track. Our theme this year was all about ‘being future ready’ - previewing one of our biggest releases this year: Oracle Data Integration 12c. Just this week we followed up with this preview by announcing the general availability of 12c release for Oracle’s key data integration products: Oracle Data Integrator 12c and Oracle GoldenGate 12c. The new release delivers extreme performance, increase IT productivity, and simplify deployment, while helping IT organizations to keep pace with new data-oriented technology trends including cloud computing, big data analytics, real-time business intelligence. Normal 0 false false false EN-US X-NONE X-NONE /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin:0in; mso-para-margin-bottom:.0001pt; mso-pagination:widow-orphan; font-family:"Times New Roman","serif"; mso-fareast-font-family:"MS Mincho";} Mark Hurd's keynote on day one set the tone for the Data Integration sessions. Mark focused on big data analytics and the changing consumer expectations. Especially real-time insight is a key theme for Oracle overall and data integration products. In Mark Hurd's keynote we heard from key customers, such as Airbus and Thomson Reuters, how real-time analysis of operational data including machine data creates value, in some cases even saves lives. Thomas Kurian gave a deeper look into Oracle's big data and fast data solutions. In the initial lead Data Integration track session - Brad Adelberg, VP of Development, presented Oracle’s Data Integration 12c product strategy based on key trends from the initial OpenWorld keynotes. Brad talked about how Oracle's data integration products address the new data integration requirements that evolved with cloud computing, big data, and changing consumer expectations and how they set the key themes in our products’ road map. Brad explained why and how fast-time to value, high-performance and future-ready solutions is the top focus areas for product development. If you were not able to attend OpenWorld or this session I recommend reading the white paper: Five New Data Integration Requirements and How to Meet them with Oracle Data Integration, which provides an in-depth look into how Oracle addresses the new trends in the DI market. Following Brad’s session, Nick Wagner provided in depth review of Oracle GoldenGate’s latest features and roadmap. Nick discussed how Oracle GoldenGate’s tight integration with Oracle Database sets the product apart from the competition. We also heard that heterogeneity of the product is still a major focus for GoldenGate’s development and there will be more news on that front when there is a major release. Normal 0 false false false EN-US X-NONE X-NONE /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin:0in; mso-para-margin-bottom:.0001pt; mso-pagination:widow-orphan; font-family:"Times New Roman","serif"; mso-fareast-font-family:"MS Mincho";} After GoldenGate’s product strategy session, Denis Gray from the PM team presented Oracle Data Integrator’s product strategy session, talking about the latest and greatest on ODI. Another good session was delivered by long-time GoldenGate users, Comcast. Jason Hurd and Amit Patel of Comcast talked about the various use cases they deploy Oracle GoldenGate throughout their enterprise, from database upgrades, feeding reporting systems, to active-active database synchronization. The Comcast team shared many good tips on how to use GoldenGate for both zero downtime upgrades and active-active replication with conflict management requirement. One of our other important goals we had this year for the Data Integration track at OpenWorld was hearing from our customers. We ended day 1 on just that, with a wonderful award ceremony for Oracle Excellence Awards for Oracle Fusion Middleware Innovation. The ceremony was held in the Yerba Buena Center for the Arts. Congratulations to Royal Bank of Scotland and Yalumba Wine Company, the winners in the Data Integration category. You can find more information on the award and the winners in our previous blog post: 2013 Oracle Excellence Awards for Fusion Middleware Innovation… Selected for their innovation use of Oracle’s Data Integration products; the winners for the Data Integration Category are Royal Bank of Scotland and The Yalumba Wine Company. Congratulations!!! Royal Bank of Scotland’s Market and International Banking division provides clients across the globe with seamless trading and competitive pricing, underpinned by a deep knowledge of risk management across the full spectrum of financial products. They handle millions of transactions daily to keep the lifeblood of their clients’ businesses flowing – whether through payment management solutions or through bespoke trade finance solutions. Royal Bank of Scotland is leveraging Oracle GoldenGate and Oracle Data Integrator along with Oracle Business Intelligence Enterprise Edition and the Oracle Database for a variety of solutions. Mainly, Oracle GoldenGate and Oracle Data Integrator are used to feed their data warehouse – providing a real-time data integration solution that feeds transactional data to their analytics system in minutes to enable improved decision making with timely, accurate data for their business users. Oracle Data Integrator’s in-database transformation capabilities and its ability to integrate with Oracle GoldenGate for real-time data capture is the foundation of this implementation. This solution makes it such that changes happening in the analytics systems are available the same day they are deployed on the operational system with 100% data quality guaranteed. Additionally, the solution has helped to reduce their operational database size from 150GB to 10GB. Impressive! Now what if I told you this solution was built in 3 months and had a less than 6 month return on investment? That’s outstanding! The Yalumba Wine Company is situated in the Barossa Valley of Australia. It is the oldest family owned winery in Australia with a unique way of aging their wines in specially crafted 100 liter barrels. Did you know that “Yalumba” is Aboriginal for “all the land around”? The Yalumba Wine Company is growing rapidly, and was in need of introducing a more modern standard to the existing manufacturing processes to meet globalization demands, overall time-to-market, and better operational efficiency objectives of product development. The Yalumba Wine Company worked with a partner, Bristlecone to develop a unique solution whereby Oracle Data Integrator is leveraged to pull data from Salesforce.com and JD Edwards, in addition to their other pre-existing source systems, for consumption into their data warehouse. They have emphasized the overall ease of developing integration workflows with Oracle Data Integrator. The solution has brought better visibility for the business users, shorter data loading and transformation performance to their data warehouse with rapid incorporation of new data sources, and a solid future-proof foundation for their organization. Moving forward, they plan on leveraging more from Oracle’s Data Integration portfolio. Terrific! In addition to these two customers on Tuesday we featured many other important Oracle Data Integrator and Oracle GoldenGate customers. On Tuesday the GoldenGate panel included: Land O’Lakes, Smuckers, and Veolia Water. Besides giving us yummy nutrition and healthy water, these companies have another aspect in common. They all use GoldenGate to boost their ERP application. Please read the recap by Irem Radzik. On Wednesday, the ODI Panel included: Barry Ralston and Ryan Weber of Infinity Insurance, Paul Stracke of Paychex Inc., and Ian Wall of Vertex Pharmaceuticals for a session filled with interesting projects, use cases and approaches to leveraging Oracle Data Integrator. Please read the recap by Sandrine Riley for more. Thanks to everyone who joined with us and we hope to stay connected! To hear more about our Data Integration12c products join us in an upcoming webcast to learn more. Follow us www.twitter.com/ORCLGoldenGate or goto our website at www.oracle.com/goto/dataintegration

Read the article
Looking for Cutting-Edge Data Integration: 2010 Innovation Awards

- by dain.hansen

This year's Oracle Fusion Middleware Innovation Awards will honor customers and partners who are creatively using to various products across Oracle Fusion Middleware. Brand new to this year's awards is a category for Data Integration. Think you have something unique and innovative with one of our Oracle Data Integration products? We'd love to hear from you! Please submit today The deadline for the nomination is 5 p.m. PT Friday, August 6th 2010, and winning organizations will be notified by late August 2010. What you win! FREE pass to Oracle OpenWorld 2010 in San Francisco for select winners in each category. Honored by Oracle executives at awards ceremony held during Oracle OpenWorld 2010 in San Francisco. Oracle Middleware Innovation Award Winner Plaque 1-3 meetings with Oracle Executives during Oracle OpenWorld 2010 Feature article placement in Oracle Magazine and placement in Oracle Press Release Customer snapshot and video testimonial opportunity, to be hosted on oracle.com Podcast interview opportunity with Senior Oracle Executive

Read the article
Fast Data - Big Data's achilles heel

- by thegreeneman

At OOW 2013 in Mark Hurd and Thomas Kurian's keynote, they discussed Oracle's Fast Data software solution stack and discussed a number of customers deploying Oracle's Big Data / Fast Data solutions and in particular Oracle's NoSQL Database. Since that time, there have been a large number of request seeking clarification on how the Fast Data software stack works together to deliver on the promise of real-time Big Data solutions. Fast Data is a software solution stack that deals with one aspect of Big Data, high velocity. The software in the Fast Data solution stack involves 3 key pieces and their integration: Oracle Event Processing, Oracle Coherence, Oracle NoSQL Database. All three of these technologies address a high throughput, low latency data management requirement. Oracle Event Processing enables continuous query to filter the Big Data fire hose, enable intelligent chained events to real-time service invocation and augments the data stream to provide Big Data enrichment. Extended SQL syntax allows the definition of sliding windows of time to allow SQL statements to look for triggers on events like breach of weighted moving average on a real-time data stream. Oracle Coherence is a distributed, grid caching solution which is used to provide very low latency access to cached data when the data is too big to fit into a single process, so it is spread around in a grid architecture to provide memory latency speed access. It also has some special capabilities to deploy remote behavioral execution for "near data" processing. The Oracle NoSQL Database is designed to ingest simple key-value data at a controlled throughput rate while providing data redundancy in a cluster to facilitate highly concurrent low latency reads. For example, when large sensor networks are generating data that need to be captured while analysts are simultaneously extracting the data using range based queries for upstream analytics. Another example might be storing cookies from user web sessions for ultra low latency user profile management, also leveraging that data using holistic MapReduce operations with your Hadoop cluster to do segmented site analysis. Understand how NoSQL plays a critical role in Big Data capture and enrichment while simultaneously providing a low latency and scalable data management infrastructure thru clustered, always on, parallel processing in a shared nothing architecture. Learn how easily a NoSQL cluster can be deployed to provide essential services in industry specific Fast Data solutions. See these technologies work together in a demonstration highlighting the salient features of these Fast Data enabling technologies in a location based personalization service. The question then becomes how do these things work together to deliver an end to end Fast Data solution. The answer is that while different applications will exhibit unique requirements that may drive the need for one or the other of these technologies, often when it comes to Big Data you may need to use them together. You may have the need for the memory latencies of the Coherence cache, but just have too much data to cache, so you use a combination of Coherence and Oracle NoSQL to handle extreme speed cache overflow and retrieval. Here is a great reference to how these two technologies are integrated and work together. Coherence & Oracle NoSQL Database. On the stream processing side, it is similar as with the Coherence case. As your sliding windows get larger, holding all the data in the stream can become difficult and out of band data may need to be offloaded into persistent storage. OEP needs an extreme speed database like Oracle NoSQL Database to help it continue to perform for the real time loop while dealing with persistent spill in the data stream. Here is a great resource to learn more about how OEP and Oracle NoSQL Database are integrated and work together. OEP & Oracle NoSQL Database.

Read the article
Oracle Announces Oracle Big Data Appliance X3-2 and Enhanced Oracle Big Data Connectors

- by jgelhaus

Enables Customers to Easily Harness the Business Value of Big Data at Lower Cost Engineered System Simplifies Big Data for the Enterprise Oracle Big Data Appliance X3-2 hardware features the latest 8-core Intel® Xeon E5-2600 series of processors, and compared with previous generation, the 18 compute and storage servers with 648 TB raw storage now offer: 33 percent more processing power with 288 CPU cores; 33 percent more memory per node with 1.1 TB of main memory; and up to a 30 percent reduction in power and cooling Oracle Big Data Appliance X3-2 further simplifies implementation and management of big data by integrating all the hardware and software required to acquire, organize and analyze big data. It includes: Support for CDH4.1 including software upgrades developed collaboratively with Cloudera to simplify NameNode High Availability in Hadoop, eliminating the single point of failure in a Hadoop cluster; Oracle NoSQL Database Community Edition 2.0, the latest version that brings better Hadoop integration, elastic scaling and new APIs, including JSON and C support; The Oracle Enterprise Manager plug-in for Big Data Appliance that complements Cloudera Manager to enable users to more easily manage a Hadoop cluster; Updated distributions of Oracle Linux and Oracle Java Development Kit; An updated distribution of open source R, optimized to work with high performance multi-threaded math libraries Read More Data sheet: Oracle Big Data Appliance X3-2 Oracle Big Data Appliance: Datacenter Network Integration Big Data and Natural Language: Extracting Insight From Text Thomson Reuters Discusses Oracle's Big Data Platform Connectors Integrate Hadoop with Oracle Big Data Ecosystem Oracle Big Data Connectors is a suite of software built by Oracle to integrate Apache Hadoop with Oracle Database, Oracle Data Integrator, and Oracle R Distribution. Enhancements to Oracle Big Data Connectors extend these data integration capabilities. With updates to every connector, this release includes: Oracle SQL Connector for Hadoop Distributed File System, for high performance SQL queries on Hadoop data from Oracle Database, enhanced with increased automation and querying of Hive tables and now supported within the Oracle Data Integrator Application Adapter for Hadoop; Transparent access to the Hive Query language from R and introduction of new analytic techniques executing natively in Hadoop, enabling R developers to be more productive by increasing access to Hadoop in the R environment. Read More Data sheet: Oracle Big Data Connectors High Performance Connectors for Load and Access of Data from Hadoop to Oracle Database

Read the article
SQL SERVER – Introduction to Big Data – Guest Post

- by pinaldave

BIG Data – such a big word – everybody talks about this now a days. It is the word in the database world. In one of the conversation I asked my friend Jasjeet Sigh the same question – what is Big Data? He instantly came up with a very effective write-up. Jasjeet is working as a Technical Manager with Koenig Solutions. He leads the SQL domain, and holds rich IT industry experience. Talking about Koenig, it is a 19 year old IT training company that offers several certification choices. Some of its courses include SharePoint Training, Project Management certifications, Microsoft Trainings, Business Intelligence programs, Web Design and Development courses etc. Big Data, as the name suggests, is about data that is BIG in nature. The data is BIG in terms of size, and it is difficult to manage such enormous data with relational database management systems that are quite popular these days. Big Data is not just about being large in size, it is also about the variety of the data that differs in form or type. Some examples of Big Data are given below : Scientific data related to weather and atmosphere, Genetics etc Data collected by various medical procedures, such as Radiology, CT scan, MRI etc Data related to Global Positioning System Pictures and Videos Radio Frequency Data Data that may vary very rapidly like stock exchange information Apart from difficulties in managing and storing such data, it is difficult to query, analyze and visualize it. The characteristics of Big Data can be defined by four Vs: Volume: It simply means a large volume of data that may span Petabyte, Exabyte and so on. However it also depends organization to organization that what volume of data they consider as Big Data. Variety: As discussed above, Big Data is not limited to relational information or structured Data. It can also include unstructured data like pictures, videos, text, audio etc. Velocity: Velocity means the speed by which data changes. The higher is the velocity, the more efficient should be the system to capture and analyze the data. Missing any important point may lead to wrong analysis or may even result in loss. Veracity: It has been recently added as the fourth V, and generally means truthfulness or adherence to the truth. In terms of Big Data, it is more of a challenge than a characteristic. It is difficult to ascertain the truth out of the enormous amount of data and the one that has high velocity. There are always chances of having un-precise and uncertain data. It is a challenging task to clean such data before it is analyzed. Big Data can be considered as the next big thing in the IT sector in terms of innovation and development. If appropriate technologies are developed to analyze and use the information, it can be the driving force for almost all industrial segments. These include Retail, Manufacturing, Service, Finance, Healthcare etc. This will help them to automate business decisions, increase productivity, and innovate and develop new products. Thanks Jasjeet Singh for an excellent write up. Jasjeet Sign is working as a Technical Manager with Koenig Solutions. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: Database, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology Tagged: Big Data

Read the article
Creating a Corporate Data Hub

- by BuckWoody

The Windows Azure Marketplace has a rich assortment of data and software offerings for you to use – a type of Software as a Service (SaaS) for IT workers, not necessarily for end-users. Among those offerings is the “Data Hub” – a codename for a project that ironically actually does what the codename says. In many of our organizations, we have multiple data quality issues. Finding data is one problem, but finding it just once is often a bigger problem. Lots of departments and even individuals have stored the same data more than once, and in some cases, made changes to one of the copies. It’s difficult to know which location or version of the data is authoritative. Then there’s the problem of accessing the data. It’s fairly straightforward to publish a database, share or other location internally to store the data. But then you have to figure out who owns it, how it is controlled, and pass out the various connection strings to those who want to use it. And then you need to figure out how to let folks access the internal data externally – bringing up all kinds of security issues. Finally, in many cases our user community wants us to combine data from the internally sources with external data, bringing up the security, strings, and exploration features up all over again. Enter the Data Hub. This is an online offering, where you assign an administrator and data stewards. You import the data into the service, and it’s available to you - and only you and your organization if you wish. The basic steps for this service are to set up the portal for your company, assign administrators and permissions, and then you assign data areas and import data into them. From there you make them discoverable, and then you have multiple options that you or your users can access that data. You’re then able, if you wish, to combine that data with other data in one location. So how does all that work? What about security? Is it really that easy? And can you really move the data definition off to the Subject Matter Experts (SME’s) that know the particular data stack better than the IT team does? Well, nothing good is easy – but using the Data Hub is actually pretty simple. I’ll give you a link in a moment where you can sign up and try this yourself. Once you sign up, you assign an administrator. From there you’ll create data areas, and then use a simple interface to bring the data in. All of this is done in a portal interface – nothing to install, configure, update or manage. After the data is entered in, and you’ve assigned meta-data to describe it, your users have multiple options to access it. They can simply use the portal – which actually has powerful visualizations you can use on any platform, even mobile phones or tablets. Your users can also hit the data with Excel – which gives them ultimate flexibility for display, all while using an authoritative, single reference for the data. Since the service is online, they can do this wherever they are – given the proper authentication and permissions. You can also hit the service with simple API calls, like this one from C#: http://msdn.microsoft.com/en-us/library/hh921924 You can make HTTP calls instead of code, and the data can even be exposed as an OData Feed. As you can see, there are a lot of options. You can check out the offering here: http://www.microsoft.com/en-us/sqlazurelabs/labs/data-hub.aspx and you can read the documentation here: http://msdn.microsoft.com/en-us/library/hh921938

Read the article
Creating a Corporate Data Hub

- by BuckWoody

The Windows Azure Marketplace has a rich assortment of data and software offerings for you to use – a type of Software as a Service (SaaS) for IT workers, not necessarily for end-users. Among those offerings is the “Data Hub” – a codename for a project that ironically actually does what the codename says. In many of our organizations, we have multiple data quality issues. Finding data is one problem, but finding it just once is often a bigger problem. Lots of departments and even individuals have stored the same data more than once, and in some cases, made changes to one of the copies. It’s difficult to know which location or version of the data is authoritative. Then there’s the problem of accessing the data. It’s fairly straightforward to publish a database, share or other location internally to store the data. But then you have to figure out who owns it, how it is controlled, and pass out the various connection strings to those who want to use it. And then you need to figure out how to let folks access the internal data externally – bringing up all kinds of security issues. Finally, in many cases our user community wants us to combine data from the internally sources with external data, bringing up the security, strings, and exploration features up all over again. Enter the Data Hub. This is an online offering, where you assign an administrator and data stewards. You import the data into the service, and it’s available to you - and only you and your organization if you wish. The basic steps for this service are to set up the portal for your company, assign administrators and permissions, and then you assign data areas and import data into them. From there you make them discoverable, and then you have multiple options that you or your users can access that data. You’re then able, if you wish, to combine that data with other data in one location. So how does all that work? What about security? Is it really that easy? And can you really move the data definition off to the Subject Matter Experts (SME’s) that know the particular data stack better than the IT team does? Well, nothing good is easy – but using the Data Hub is actually pretty simple. I’ll give you a link in a moment where you can sign up and try this yourself. Once you sign up, you assign an administrator. From there you’ll create data areas, and then use a simple interface to bring the data in. All of this is done in a portal interface – nothing to install, configure, update or manage. After the data is entered in, and you’ve assigned meta-data to describe it, your users have multiple options to access it. They can simply use the portal – which actually has powerful visualizations you can use on any platform, even mobile phones or tablets. Your users can also hit the data with Excel – which gives them ultimate flexibility for display, all while using an authoritative, single reference for the data. Since the service is online, they can do this wherever they are – given the proper authentication and permissions. You can also hit the service with simple API calls, like this one from C#: http://msdn.microsoft.com/en-us/library/hh921924 You can make HTTP calls instead of code, and the data can even be exposed as an OData Feed. As you can see, there are a lot of options. You can check out the offering here: http://www.microsoft.com/en-us/sqlazurelabs/labs/data-hub.aspx and you can read the documentation here: http://msdn.microsoft.com/en-us/library/hh921938

Read the article
Accessing SQL Data Services via ADO.NET Data Service Client Library

- by Mehmet Aras

Is this possible? Basically I would like to use SQL Data Services REST interface and let the ADO.NET Data Service Client library handle communication details and generate the entities that I can use. I looked at the samples in February release of Azure services kit but the samples in there are using HttpWebRequest and HttpWebResponse to consume SQL Data Services RESTfully. I was hoping to use ADO.NET Data Service Client library to abstract low-level details away.

Read the article
Serializing persistent/functional data structures

- by Rob

Persistent data structures depend on the sharing of structure for efficiency. For an example, see here. How can I preserve the structure sharing when I serialize the data structures and write them to a file or database? If I just naively traverse the datastructures, I'll store the correct values, but I'll lose the structure sharing. I'd like to be able to save data-structures with shared components to a file, restore them, and still have most of the structure shared in the restored data.

Read the article
Bridging Two Worlds: Big Data and Enterprise Data

- by Dain C. Hansen

Normal 0 false false false EN-US X-NONE X-NONE MicrosoftInternetExplorer4 /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin:0in; mso-para-margin-bottom:.0001pt; mso-pagination:widow-orphan; font-size:11.0pt; font-family:"Calibri","sans-serif"; mso-ascii-font-family:Calibri; mso-ascii-theme-font:minor-latin; mso-fareast-font-family:"Times New Roman"; mso-fareast-theme-font:minor-fareast; mso-hansi-font-family:Calibri; mso-hansi-theme-font:minor-latin; mso-bidi-font-family:"Times New Roman"; mso-bidi-theme-font:minor-bidi;} The big data world is all the vogue in today’s IT conversations. It’s a world of volume, velocity, variety – tantalizing us with its untapped potential. It’s a world of transformational game-changing technologies that have already begun to alter the information management landscape. One of the reasons that big data is so compelling is that it’s a universal challenge that impacts every one of us. Whether it is healthcare, financial, manufacturing, government, retail - big data presents a pressing problem for many industries: how can so much information be processed so quickly to deliver the ‘bigger’ picture? With big data we’re tapping into new information that didn’t exist before: social data, weblogs, sensor data, complex content, and more. What also makes big data revolutionary is that it turns traditional information architecture on its head, putting into question commonly accepted notions of where and how data should be aggregated processed, analyzed, and stored. This is where Hadoop and NoSQL come in – new technologies which solve new problems for managing unstructured data. And now for some worst practices that I'd recommend that you please not follow: Worst Practice Lesson 1: Throw away everything that you already know about data management, data integration tools, and start completely over. One shouldn’t forget what’s already running in today’s IT. Today’s Business Analytics, Data Warehouses, Business Applications (ERP, CRM, SCM, HCM), and even many social, mobile, cloud applications still rely almost exclusively on structured data – or what we’d like to call enterprise data. This dilemma is what today’s IT leaders are up against: what are the best ways to bridge enterprise data with big data? And what are the best strategies for dealing with the complexities of these two unique worlds? Worst Practice Lesson 2: Throw away all of your existing business applications … because they don’t run on big data yet. Bridging the two worlds of big data and enterprise data means considering solutions that are complete, based on emerging Hadoop technologies (as well as traditional), and are poised for success through integrated design tools, integrated platforms that connect to your existing business applications, as well as and support real-time analytics. Leveraging these types of best practices translates to improved productivity, lowered TCO, IT optimization, and better business insights. Worst Practice Lesson 3: Separate out [and keep separate] your big data sandboxes from all the current enterprise IT systems. Don’t mix sand among playgrounds. We didn't tell you that you wouldn't get dirty doing this. Correlation between the two worlds is key. The real advantage to analyzing big data comes when you can correlate it with the existing data in your data warehouse or your current applications to make sense of the larger patterns. If you have not followed these worst practices 1-3 then you qualify for the first step of our journey: bridging the two worlds of enterprise data and big data. Over the next several weeks we’ll be discussing this topic along with several others around big data as it relates to data integration. We welcome you to join us in the conversation by following us on twitter on #BridgingBigData or download our latest white paper and resource kit: Big Data and Enterprise Data: Bridging Two Worlds.

Read the article
Optimal Data Structure for our own API

- by vermiculus

I'm in the early stages of writing an Emacs major mode for the Stack Exchange network; if you use Emacs regularly, this will benefit you in the end. In order to minimize the number of calls made to Stack Exchange's API (capped at 10000 per IP per day) and to just be a generally responsible citizen, I want to cache the information I receive from the network and store it in memory, waiting to be accessed again. I'm really stuck as to what data structure to store this information in. Obviously, it is going to be a list. However, as with any data structure, the choice must be determined by what data is being stored and what how it will be accessed. What, I would like to be able to store all of this information in a single symbol such as stack-api/cache. So, without further ado, stack-api/cache is a list of conses keyed by last update: `(<csite> <csite> <csite>) where <csite> would be (1362501715 . <site>) At this point, all we've done is define a simple association list. Of course, we must go deeper. Each <site> is a list of the API parameter (unique) followed by a list questions: `("codereview" <cquestion> <cquestion> <cquestion>) Each <cquestion> is, you guessed it, a cons of questions with their last update time: `(1362501715 <question>) (1362501720 . <question>) <question> is a cons of a question structure and a list of answers (again, consed with their last update time): `(<question-structure> <canswer> <canswer> <canswer> and ` `(1362501715 . <answer-structure>) This data structure is likely most accurately described as a tree, but I don't know if there's a better way to do this considering the language, Emacs Lisp (which isn't all that different from the Lisp you know and love at all). The explicit conses are likely unnecessary, but it helps my brain wrap around it better. I'm pretty sure a <csite>, for example, would just turn into (<epoch-time> <api-param> <cquestion> <cquestion> ...) Concerns: Does storing data in a potentially huge structure like this have any performance trade-offs for the system? I would like to avoid storing extraneous data, but I've done what I could and I don't think the dataset is that large in the first place (for normal use) since it's all just human-readable text in reasonable proportion. (I'm planning on culling old data using the times at the head of the list; each inherits its last-update time from its children and so-on down the tree. To what extent this cull should take place: I'm not sure.) Does storing data like this have any performance trade-offs for that which must use it? That is, will set and retrieve operations suffer from the size of the list? Do you have any other suggestions as to what a better structure might look like?

Read the article
SQL SERVER – Data Sources and Data Sets in Reporting Services SSRS

- by Pinal Dave

This example is from the Beginning SSRS by Kathi Kellenberger. Supporting files are available with a free download from the www.Joes2Pros.com web site. This example is from the Beginning SSRS. Supporting files are available with a free download from the www.Joes2Pros.com web site. Connecting to Your Data? When I was a child, the telephone book was an important part of my life. Maybe I was just a nerd, but I enjoyed getting a new book every year to page through to learn about the businesses in my small town or to discover where some of my school acquaintances lived. It was also the source of maps to my town’s neighborhoods and the towns that surrounded me. To make a phone call, I would need a telephone number. In order to find a telephone number, I had to know how to use the telephone book. That seems pretty simple, but it resembles connecting to any data. You have to know where the data is and how to interact with it. A data source is the connection information that the report uses to connect to the database. You have two choices when creating a data source, whether to embed it in the report or to make it a shared resource usable by many reports. Data Sources and Data Sets A few basic terms will make the upcoming choses make more sense. What database on what server do you want to connect to? It would be better to just ask… “what is your data source?” The connection you need to make to get your reports data is called a data source. If you connected to a data source (like the JProCo database) there may be hundreds of tables. You probably only want data from just a few tables. This means you want to write a specific query against this data source. A query on a data source to get just the records you need for an SSRS report is called a Data Set. Creating a local Data Source You can connect embed a connection from your report directly to your JProCo database which (let’s say) is installed on a server named Reno. If you move JProCo to a new server named Tampa then you need to update the Data Set. If you have 10 reports in one project that were all pointing to the JProCo database on the Reno server then they would all need to be updated at once. It’s possible to make a project level Data Source and have each report use that. This means one change can fix all 10 reports at once. This would be called a Shared Data Source. Creating a Shared Data Source The best advice I can give you is to create shared data sources. The reason I recommend this is that if a database moves to a new server you will have just one place in Report Manager to make the server name change. That one change will update the connection information in all the reports that use that data source. To get started, you will start with a fresh project. Go to Start > All Programs > SQL Server 2012 > Microsoft SQL Server Data Tools to launch SSDT. Once SSDT is running, click New Project to create a new project. Once the New Project dialog box appears, fill in the form, as shown in. Be sure to select Report Server Project this time – not the wizard. Click OK to dismiss the New Project dialog box. You should now have an empty project, as shown in the Solution Explorer. A report is meant to show you data. Where is the data? The first task is to create a Shared Data Source. Right-click on the Shared Data Sources folder and choose Add New Data Source. The Shared Data Source Properties dialog box will launch where you can fill in a name for the data source. By default, it is named DataSource1. The best practice is to give the data source a more meaningful name. It is possible that you will have projects with more than one data source and, by naming them, you can tell one from another. Type the name JProCo for the data source name and click the Edit button to configure the database connection properties. If you take a look at the types of data sources you can choose, you will see that SSRS works with many data platforms including Oracle, XML, and Teradata. Make sure SQL Server is selected before continuing. For this post, I am assuming that you are using a local SQL Server and that you can use your Windows account to log in to the SQL Server. If, for some reason you must use SQL Server Authentication, choose that option and fill in your SQL Server account credentials. Otherwise, just accept Windows Authentication. If your database server was installed locally and with the default instance, just type in Localhost for the Server name. Select the JProCo database from the database list. At this point, the connection properties should look like. If you have installed a named instance of SQL Server, you will have to specify the server name like this: Localhost\InstanceName, replacing the InstanceName with whatever your instance name is. If you are not sure about the named instance, launch the SQL Server Configuration Manager found at Start > All Programs > Microsoft SQL Server 2012 > Configuration Tools. If you have a named instance, the name will be shown in parentheses. A default instance of SQL Server will display MSSQLSERVER; a named instance will display the name chosen during installation. Once you get the connection properties filled in, click OK to dismiss the Connection Properties dialog box and OK again to dismiss the Shared Data Source properties. You now have a data source in the Solution Explorer. What’s next I really need to thank Kathi Kellenberger and Rick Morelan for sharing this material for this 5 day series of posts on SSRS. To get really comfortable with SSRS you will get to know the different SSDT windows, Build reports on your own (without the wizards), Add report headers and footers, Accept user input, create levels, charts, or even maps for visual appeal. You might be surprise to know a small 230 page book starts from the very beginning and covers the steps to do all these items. Beginning SSRS 2012 is a small easy to follow book so you can learn SSRS for less than $20. See Joes2Pros.com for more on this and other books. If you want to learn SSRS in easy to simple words – I strongly recommend you to get Beginning SSRS book from Joes 2 Pros. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL Tagged: Reporting Services, SSRS

Read the article
Space-efficient data structures for broad-phase collision detection

- by Marian Ivanov

As far as I know, these are three types of data structures that can be used for collision detection broadphase: Unsorted arrays: Check every object againist every object - O(n^2) time; O(log n) space. It's so slow, it's useless if n isn't really small. for (i=1;i<objects;i++){ for(j=0;j<i;j++) narrowPhase(i,j); }; Sorted arrays: Sort the objects, so that you get O(n^(2-1/k)) for k dimensions O(n^1.5) for 2d and O(n^1.67) for 3d and O(n) space. Assuming the space is 2D and sortedArray is sorted so that if the object begins in sortedArray[i] and another object ends at sortedArray[i-1]; they don't collide Heaps of stacks: Divide the objects between a heap of stacks, so that you only have to check the bucket, its children and its parents - O(n log n) time, but O(n^2) space. This is probably the most frequently used approach. Is there a way of having O(n log n) time with less space? When is it more efficient to use sorted arrays over heaps and vice versa?

Read the article
PostgreSQL to Data-Warehouse: Best approach for near-real-time ETL / extraction of data

- by belvoir

Background: I have a PostgreSQL (v8.3) database that is heavily optimized for OLTP. I need to extract data from it on a semi real-time basis (some-one is bound to ask what semi real-time means and the answer is as frequently as I reasonably can but I will be pragmatic, as a benchmark lets say we are hoping for every 15min) and feed it into a data-warehouse. How much data? At peak times we are talking approx 80-100k rows per min hitting the OLTP side, off-peak this will drop significantly to 15-20k. The most frequently updated rows are ~64 bytes each but there are various tables etc so the data is quite diverse and can range up to 4000 bytes per row. The OLTP is active 24x5.5. Best Solution? From what I can piece together the most practical solution is as follows: Create a TRIGGER to write all DML activity to a rotating CSV log file Perform whatever transformations are required Use the native DW data pump tool to efficiently pump the transformed CSV into the DW Why this approach? TRIGGERS allow selective tables to be targeted rather than being system wide + output is configurable (i.e. into a CSV) and are relatively easy to write and deploy. SLONY uses similar approach and overhead is acceptable CSV easy and fast to transform Easy to pump CSV into the DW Alternatives considered .... Using native logging (http://www.postgresql.org/docs/8.3/static/runtime-config-logging.html). Problem with this is it looked very verbose relative to what I needed and was a little trickier to parse and transform. However it could be faster as I presume there is less overhead compared to a TRIGGER. Certainly it would make the admin easier as it is system wide but again, I don't need some of the tables (some are used for persistent storage of JMS messages which I do not want to log) Querying the data directly via an ETL tool such as Talend and pumping it into the DW ... problem is the OLTP schema would need tweaked to support this and that has many negative side-effects Using a tweaked/hacked SLONY - SLONY does a good job of logging and migrating changes to a slave so the conceptual framework is there but the proposed solution just seems easier and cleaner Using the WAL Has anyone done this before? Want to share your thoughts?

Read the article
Reference Data Management and Master Data: Are Relation ?

- by Mala Narasimharajan

Submitted By: Rahul Kamath Oracle Data Relationship Management (DRM) has always been extremely powerful as an Enterprise Master Data Management (MDM) solution that can help manage changes to master data in a way that influences enterprise structure, whether it be mastering chart of accounts to enable financial transformation, or revamping organization structures to drive business transformation and operational efficiencies, or restructuring sales territories to enable equitable distribution of leads to sales teams following the acquisition of new products, or adding additional cost centers to enable fine grain control over expenses. Increasingly, DRM is also being utilized by Oracle customers for reference data management, an emerging solution space that deserves some explanation. What is reference data? How does it relate to Master Data? Reference data is a close cousin of master data. While master data is challenged with problems of unique identification, may be more rapidly changing, requires consensus building across stakeholders and lends structure to business transactions, reference data is simpler, more slowly changing, but has semantic content that is used to categorize or group other information assets – including master data – and gives them contextual value. In fact, the creation of a new master data element may require new reference data to be created. For example, when a European company acquires a US business, chances are that they will now need to adapt their product line taxonomy to include a new category to describe the newly acquired US product line. Further, the cross-border transaction will also result in a revised geo hierarchy. The addition of new products represents changes to master data while changes to product categories and geo hierarchy are examples of reference data changes.1 The following table contains an illustrative list of examples of reference data by type. Reference data types may include types and codes, business taxonomies, complex relationships & cross-domain mappings or standards. Types & Codes Taxonomies Relationships / Mappings Standards Transaction Codes Industry Classification Categories and Codes, e.g., North America Industry Classification System (NAICS) Product / Segment; Product / Geo Calendars (e.g., Gregorian, Fiscal, Manufacturing, Retail, ISO8601) Lookup Tables (e.g., Gender, Marital Status, etc.) Product Categories City à State à Postal Codes Currency Codes (e.g., ISO) Status Codes Sales Territories (e.g., Geo, Industry Verticals, Named Accounts, Federal/State/Local/Defense) Customer / Market Segment; Business Unit / Channel Country Codes (e.g., ISO 3166, UN) Role Codes Market Segments Country Codes / Currency Codes / Financial Accounts Date/Time, Time Zones (e.g., ISO 8601) Domain Values Universal Standard Products and Services Classification (UNSPSC), eCl@ss International Classification of Diseases (ICD) e.g., ICD9 à IC10 mappings Tax Rates Why manage reference data? Reference data carries contextual value and meaning and therefore its use can drive business logic that helps execute a business process, create a desired application behavior or provide meaningful segmentation to analyze transaction data. Further, mapping reference data often requires human judgment. Sample Use Cases of Reference Data Management Healthcare: Diagnostic Codes The reference data challenges in the healthcare industry offer a case in point. Part of being HIPAA compliant requires medical practitioners to transition diagnosis codes from ICD-9 to ICD-10, a medical coding scheme used to classify diseases, signs and symptoms, causes, etc. The transition to ICD-10 has a significant impact on business processes, procedures, contracts, and IT systems. Since both code sets ICD-9 and ICD-10 offer diagnosis codes of very different levels of granularity, human judgment is required to map ICD-9 codes to ICD-10. The process requires collaboration and consensus building among stakeholders much in the same way as does master data management. Moreover, to build reports to understand utilization, frequency and quality of diagnoses, medical practitioners may need to “cross-walk” mappings -- either forward to ICD-10 or backwards to ICD-9 depending upon the reporting time horizon. Spend Management: Product, Service & Supplier Codes Similarly, as an enterprise looks to rationalize suppliers and leverage their spend, conforming supplier codes, as well as product and service codes requires supporting multiple classification schemes that may include industry standards (e.g., UNSPSC, eCl@ss) or enterprise taxonomies. Aberdeen Group estimates that 90% of companies rely on spreadsheets and manual reviews to aggregate, classify and analyze spend data, and that data management activities account for 12-15% of the sourcing cycle and consume 30-50% of a commodity manager’s time. Creating a common map across the extended enterprise to rationalize codes across procurement, accounts payable, general ledger, credit card, procurement card (P-card) as well as ACH and bank systems can cut sourcing costs, improve compliance, lower inventory stock, and free up talent to focus on value added tasks. Change Management: Point of Sales Transaction Codes and Product Codes In the specialty finance industry, enterprises are confronted with usury laws – governed at the state and local level – that regulate financial product innovation as it relates to consumer loans, check cashing and pawn lending. To comply, it is important to demonstrate that transactions booked at the point of sale are posted against valid product codes that were on offer at the time of booking the sale. Since new products are being released at a steady stream, it is important to ensure timely and accurate mapping of point-of-sale transaction codes with the appropriate product and GL codes to comply with the changing regulations. Multi-National Companies: Industry Classification Schemes As companies grow and expand across geographies, a typical challenge they encounter with reference data represents reconciling various versions of industry classification schemes in use across nations. While the United States, Mexico and Canada conform to the North American Industry Classification System (NAICS) standard, European Union countries choose different variants of the NACE industry classification scheme. Multi-national companies must manage the individual national NACE schemes and reconcile the differences across countries. Enterprises must invest in a reference data change management application to address the challenge of distributing reference data changes to downstream applications and assess which applications were impacted by a given change. References 1 Master Data versus Reference Data, Malcolm Chisholm, April 1, 2006.

Read the article
Focus on Oracle Data Profiling and Data Quality 11g - 24/Fev/11

- by Claudia Costa

Thursday 24th February, 11am GMTOracle offers an integrated suite Data Quality software architected to discover and correct today's data quality problems and establish a platform prepared for tomorrow's yet unknown data challenges.Oracle Data Profiling provides data investigation, discovery, and profiling in support of quality, migration, integration, stewardship, and governance initiatives. It includes a broad range of features that expand upon basic profiling, including automated monitoring, business-rule validation, and trend analysis.Oracle Data Quality for Data Integrator provides cleansing, standardization, matching, address validation, location enrichment, and linking functions for global customer data and operational business data.It ensures that data adheres to established standards that are adaptable to fit each organization's specific needs. Both single - and double - byte data are processed in local languages to provide a unique and centralized view of customers, products and services. During this in-person briefing, Data Integration Solution Specialists will be providing a technical overview and a walkthrough.Agenda Oracle Data Integration Strategy overview A focus on Oracle Data Profiling and Oracle Data Quality for Data Integrator: Oracle Data Profiling Oracle Data Quality for Data Integrator Live demo Q&A This FREE online LIVE eSeminar will be delivered over the Web and Conference Call. Registrations received less than 24hours prior to start time may not receive confirmation to attend.To register click here.For any questions please contact [email protected]

Read the article
Can JSON be made easily and safely editable by the non-technical Excel crowd?

- by glitch

I'm looking for a data storage format that's very intuitive and easy to edit. It should be ideally targeted towards the same crowd as Excel. At the same time I would like the data structure to be a tree. Ideally this would be JSON, since it offers both the tree aspect and allows for more interesting constructs like arrays. That and parsing libraries for JSON are ubiquitous, so I don't have to reinvent the wheel. The problem is that, at least with a non-specialized text editor, JSON is a giant pain to edit for a non-technical user. I'm thinking along the lines of someone who might have used Excel in the past, but never a real text editor. Someone who might not be comfortable with the idea of preserving JSON syntax by hand. Are there data formats out there that would fit this profile? I'd very much prefer this to be a JSON actually, but then it would require a solid editing tool that would hide the underlying implementation from the user. Think Excel and how it abstracts CSV syntax from the user. The reason I'm looking for something like this is because the team has been working with pretty hierarchical data for a while now and we've hit the limits of how easy it is to represent in simple CSVs without having to create complex rules for how represent hierarchy semantics from each row. Any suggestions?

Read the article

Search Results

Search found 58956 results on 2359 pages for 'data structures'.

Page 4/2359 | < Previous Page | 1 2 3 4 5 6 7 8 9 10 11 12 | Next Page >

- by user118136

- by 4bu3li

- by Ein Doofus

- by Chiron

- by Jason Baker

- by pinaldave

- by Dejan Sarka

- by Davide Mauri

- by Tanu Sood

- by dain.hansen

- by thegreeneman

- by jgelhaus

- by pinaldave

- by BuckWoody

- by BuckWoody

- by Mehmet Aras

- by Rob

- by Dain C. Hansen

- by vermiculus

- by Pinal Dave

- by Marian Ivanov

- by belvoir

- by Mala Narasimharajan

- by Claudia Costa

- by glitch

< Previous Page | 1 2 3 4 5 6 7 8 9 10 11 12 | Next Page >