azure storage tables - Page 158

Need a rough estimate on how long LEFT JOIN on two 6 million row tables would take.

- by JobGovernor

Hello, well the title says it all. Does anyone have an idea? Would need to lock the table for that long on production site, so it is kinda important. Also, do you have any suggestion how such query could be optimized? Thanks!

Read the article

how can i achieve this 100% height effect with divs rather than tables in pure css?

- by significance

Read the article

CodePlex Daily Summary for Friday, February 25, 2011

CodePlex Daily Summary for Friday, February 25, 2011Popular ReleasesMono.Addins: Mono.Addins 0.6: The 0.6 release of Mono.Addins includes many improvements, bug fixes and new features: Add-in engine Add-in name and description can now be localized. There are new custom attributes for defining them, and can also be specified as xml elements in an add-in manifest instead of attributes. Support for custom add-in properties. It is now possible to specify arbitrary properties in add-ins, which can be queried at install time (using the Mono.Addins.Setup API) or at run-time. Custom extensio...patterns & practices: Project Silk: Project Silk Community Drop 3 - 25 Feb 2011: IntroductionWelcome to the third community drop of Project Silk. For this drop we are requesting feedback on overall application architecture, code review of the JavaScript Conductor and Widgets, and general direction of the application. Project Silk provides guidance and sample implementations that describe and illustrate recommended practices for building modern web applications using technologies such as HTML5, jQuery, CSS3 and Internet Explorer 9. This guidance is intended for experien...PhoneyTools: Initial Release (0.1): This is the 0.1 version for preview of the features.Minemapper: Minemapper v0.1.5: Now supports new Minecraft beta v1.3 map format, thanks to updated mcmap. Disabled biomes, until Minecraft Biome Extractor supports new format.Smartkernel: Smartkernel: ????，??????Document.Editor: 2011.7: Whats new for Document.Editor 2011.7: New Find dialog Improved Email dialog Improved Home tab Improved Format tab Minor Bug Fix's, improvements and speed upsChiave File Encryption: Chiave 0.9.1: Application for file encryption and decryption using 512 Bit rijndael encyrption algorithm with simple to use UI. Its written in C# and compiled in .Net version 3.5. It incorporates features of Windows 7 like Jumplists, Taskbar progress and Aero Glass. Change Log from 0.9 Beta to 0.9.1: ======================= >Added option for system shutdown, sleep, hibernate after operation completed. >Minor Changes to the UI. >Numerous Bug fixes. Feedbacks are Welcome!....Coding4Fun Tools: Coding4Fun.Phone.Toolkit v1.2: New control, Toast Prompt! Removed progress bar since Silverlight Toolkit Feb 2010 has it.Umbraco CMS: Umbraco 4.7: Service release fixing 31 issues. A full changelog will be available with the final stable release of 4.7 Important when upgradingUpgrade as if it was a patch release (update /bin, /umbraco and /umbraco_client). For general upgrade information follow the guide found at http://our.umbraco.org/wiki/install-and-setup/upgrading-an-umbraco-installation 4.7 requires the .NET 4.0 framework Web.Config changes Update the web web.config to include the 4 changes found in (they're clearly marked in...HubbleDotNet - Open source full-text search engine: V1.1.0.0: Add Sqlite3 DBAdapter Add App Report when Query Cache is Collecting. Improve the performance of index through Synchronize. Add top 0 feature so that we can only get count of the result. Improve the score calculating algorithm of match. Let the score of the record that match all items large then others. Add MySql DBAdapter Improve performance for multi-fields sort . Using hash table to access the Payload data. The version before used bin search. Using heap sort instead of qui...Silverlight????[???]: silverlight????[???]2.0: ???????，?????，????????silverlight??????。DBSourceTools: DBSourceTools_1.3.0.0: Release 1.3.0.0 Changed editors from FireEdit to ICSharpCode.TextEditor. Complete re-vamp of Intellisense ( further testing needed). Hightlight Field and Table Names in sql scripts. Added field dropdown on all tables and views in DBExplorer. Added data option for viewing data in Tables. Fixed comment / uncomment bug as reported by tareq. Included Synonyms in scripting engine ( nickt_ch ).IronPython: 2.7 Release Candidate 1: We are pleased to announce the first Release Candidate for IronPython 2.7. This release contains over two dozen bugs fixed in preparation for 2.7 Final. See the release notes for 60193 for details and what has already been fixed in the earlier 2.7 prereleases. - IronPython TeamCaliburn Micro: A Micro-Framework for WPF, Silverlight and WP7: Caliburn.Micro 1.0 RC: This is the official Release Candicate for Caliburn.Micro 1.0. The download contains the binaries, samples and VS templates. VS Templates The templates included are designed for situations where the Caliburn.Micro source needs to be embedded within a single project solution. This was targeted at government and other organizations that expressed specific requirements around using an open source project like this. NuGet This release does not have a corresponding NuGet package. The NuGet pack...Caliburn: A Client Framework for WPF and Silverlight: Caliburn 2.0 RC: This is the official Release Candidate for Caliburn 2.0. It contains all binaries, samples and generated code docs.Rawr: Rawr 4.0.20 Beta: Rawr is now web-based. The link to use Rawr4 is: http://elitistjerks.com/rawr.phpThis is the Cataclysm Beta Release. More details can be found at the following link http://rawr.codeplex.com/Thread/View.aspx?ThreadId=237262 As of the 4.0.16 release, you can now also begin using the new Downloadable WPF version of Rawr!This is a pre-alpha release of the WPF version, there are likely to be a lot of issues. If you have a problem, please follow the Posting Guidelines and put it into the Issue Trac...Azure Storage Samples: Version 1.0 (February 2011): These downloads contain source code. Each is a complete sample that fully exercises Windows Azure Storage across blobs, queues, and tables. The difference between the downloads is implementation approach. Storage DotNet CS.zip is a .NET StorageClient library implementation in the C# language. This library come with the Windows Azure SDK. Contains helper classes for accessing blobs, queues, and tables. Storage REST CS.zip is a REST implementation in the C# language. The code to implement R...PowerGUI Visual Studio Extension: PowerGUI VSX 1.3.2: New FeaturesPowerGUI Console Tool Window PowerShell Project Type PowerGUI 2.4 SupportMiniTwitter: 1.66: MiniTwitter 1.66 ???? ?? ?????????? 2 ??????????????????? User Streams ?????????Windows Phone 7 Isolated Storage Explorer: WP7 Isolated Storage Explorer v1.0 Beta: Current release features:WPF desktop explorer client Visual Studio integrated tool window explorer client (Visual Studio 2010 Professional and above) Supported operations: Refresh (isolated storage information), Add Folder, Add Existing Item, Download File, Delete Folder, Delete File Explorer supports operations running on multiple remote applications at the same time Explorer detects application disconnect (1-2 second delay) Explorer confirms operation completed status Explorer d...New ProjectsAgriscope: This is an open information visualization tool used to assist RADA and other Agriculture officers in retrieving and analyzing data in day to day tasks.AVCampos NF-e: Realizar a emissão e controle de nf-e, através de ambientes moveis.Babel Obfuscator NAnt Tasks: This is an NAnt task for Babel Obfuscator. Babel Obfuscator protect software components realized with Microsoft .NET Framework in order to make reverse engineering difficult. Babel Obfuscator can be downloaded at http://www.babelfor.netConcurrent Programming Library: Concurrent Programming Library provides an opportunity to develop a parallel programs using .net framework 2.0 and above. It includes an implementation of various parallel algorithms, thread-safe collections and patterns.EOrg: Gelistirme maksatli yaptigim çalismalar.Extend Grid View: Extend grid view is user control. It help paging a dataset is set on gridview.FinlogiK ReSharper Contrib: FinlogiK ReSharper Contrib is a plugin for ReSharper 5.1 which adds code cleanup and inspection options for static qualifiers.Game development with Playstation Move and Ogre3D: This project is a research aiming to develop a program which can handle the Playstation Move on PC. After that, we will implement a game based on it. The programming language is C++. The graphics is handled by Ogre3D.JAD: Projeto de software.JSARP: This tool allows describing and verifying Petri Nets with the support of a graphical interface. This tool, is being developed in Java.KangmoDB - A replacement for the storage engine of SQLite: KangmoDB claims to be a real-time storage engine that replaces the one in SQLite. KangmoDB tries to achieve the lowest latency time for a transaction with ACID properties. It will be mainly used for the stock market that requires lowest latency with highest stability. MetaprogrammingInDotNetBook: This project will contain code and other artifacts related to the "Metaprogramming in .NET" book that should be avaible in October 2011.munix workstation: The µnix project is an endeavour to create a complete workstation and UNIX-like OS using standard logic IC's and 8-bit AVR microcontrollers. The goal isn't to make something that will compete with a traditional workstation in computation but instead to have a great DIY project.PhoneyTools: Set of controls and utilities for WP7 development.Plist Builder: Serialize non-circular-referencing .NET objects to plist in .NET.Quake3.NET: A port of the Quake 3 engine to C#. This is not merely a port of Quake 3 to run in a managed environment, but a complete rewrite of the engine using C# 4.0's powerful language features.SecViz: Web server security attack graph alert correlation IDS SerialNome: This is a multiport serial applicationsprout sms: a wp7 cabbage clientUsing external assembly in Biztalk 2009 map: Using external assembly in Biztalk 2009 map.

Read the article

EM12c Release 4: Database as a Service Enhancements

- by Adeesh Fulay

Oracle Enterprise Manager 12.1.0.4 (or simply put EM12c R4) is the latest update to the product. As previous versions, this release provides tons of enhancements and bug fixes, attributing to improved stability and quality. One of the areas that is most exciting and has seen tremendous growth in the last few years is that of Database as a Service. EM12c R4 provides a significant update to Database as a Service. The key themes are: Comprehensive Database Service Catalog (includes single instance, RAC, and Data Guard) Additional Storage Options for Snap Clone (includes support for Database feature CloneDB) Improved Rapid Start Kits Extensible Metering and Chargeback Miscellaneous Enhancements 1. Comprehensive Database Service Catalog Before we get deep into implementation of a service catalog, lets first understand what it is and what benefits it provides. Per ITIL, a service catalog is an exhaustive list of IT services that an organization provides or offers to its employees or customers. Service catalogs have been widely popular in the space of cloud computing, primarily as the medium to provide standardized and pre-approved service definitions. There is already some good collateral out there that talks about Oracle database service catalogs. The two whitepapers i recommend reading are: Service Catalogs: Defining Standardized Database Service High Availability Best Practices for Database Consolidation: The Foundation for Database as a Service [Oracle MAA] EM12c comes with an out-of-the-box service catalog and self service portal since release 1. For the customers, it provides the following benefits: Present a collection of standardized database service definitions, Define standardized pools of hardware and software for provisioning, Role based access to cater to different class of users, Automated procedures to provision the predefined database definitions, Setup chargeback plans based on service tiers and database configuration sizes, etc Starting Release 4, the scope of services offered via the service catalog has been expanded to include databases with varying levels of availability - Single Instance (SI) or Real Application Clusters (RAC) databases with multiple data guard based standby databases. Some salient points of the data guard integration: Standby pools can now be defined across different datacenters or within the same datacenter as the primary (this helps in modelling the concept of near and far DR sites) The standby databases can be single instance, RAC, or RAC One Node databases Multiple standby databases can be provisioned, where the maximum limit is determined by the version of database software The standby databases can be in either mount or read only (requires active data guard option) mode All database versions 10g to 12c supported (as certified with EM 12c) All 3 protection modes can be used - Maximum availability, performance, security Log apply can be set to sync or async along with the required apply lag The different service levels or service tiers are popularly represented using metals - Platinum, Gold, Silver, Bronze, and so on. The Oracle MAA whitepaper (referenced above) calls out the various service tiers as defined by Oracle's best practices, but customers can choose any logical combinations from the table below: Primary Standby [1 or more] EM 12cR4 SI - SI SI RAC - RAC SI RAC RAC RON - RON RON where RON = RAC One Node is supported via custom post-scripts in the service template A sample service catalog would look like the image below. Here we have defined 4 service levels, which have been deployed across 2 data centers, and have 3 standardized sizes. Again, it is important to note that this is just an example to get the creative juices flowing. I imagine each customer would come up with their own catalog based on the application requirements, their RTO/RPO goals, and the product licenses they own. In the screenwatch titled 'Build Service Catalog using EM12c DBaaS', I walk through the complete steps required to setup this sample service catalog in EM12c. 2. Additional Storage Options for Snap Clone In my previous blog posts, i have described the snap clone feature in detail. Essentially, it provides a storage agnostic, self service, rapid, and space efficient approach to solving your data cloning problems. The net benefit is that you get incredible amounts of storage savings (on average 90%) all while cloning databases in a matter of minutes. Space and Time, two things enterprises would love to save on. This feature has been designed with the goal of providing data cloning capabilities while protecting your existing investments in server, storage, and software. With this in mind, we have pursued with the dual solution approach of Hardware and Software. In the hardware approach, we connect directly to your storage appliances and perform all low level actions required to rapidly clone your databases. While in the software approach, we use an intermediate software layer to talk to any storage vendor or any storage configuration to perform the same low level actions. Thus delivering the benefits of database thin cloning, without requiring you to drastically changing the infrastructure or IT's operating style. In release 4, we expand the scope of options supported by snap clone with the addition of database CloneDB. While CloneDB is not a new feature, it was first introduced in 11.2.0.2 patchset, it has over the years become more stable and mature. CloneDB leverages a combination of Direct NFS (or dNFS) feature of the database, RMAN image copies, sparse files, and copy-on-write technology to create thin clones of databases from existing backups in a matter of minutes. It essentially has all the traits that we want to present to our customers via the snap clone feature. For more information on cloneDB, i highly recommend reading the following sources: Blog by Tim Hall: Direct NFS (DNFS) CloneDB in Oracle Database 11g Release 2 Oracle OpenWorld Presentation by Cern: Efficient Database Cloning using Direct NFS and CloneDB The advantages of the new CloneDB integration with EM12c Snap Clone are: Space and time savings Ease of setup - no additional software is required other than the Oracle database binary Works on all platforms Reduce the dependence on storage administrators Cloning process fully orchestrated by EM12c, and delivered to developers/DBAs/QA Testers via the self service portal Uses dNFS to delivers better performance, availability, and scalability over kernel NFS Complete lifecycle of the clones managed by EM12c - performance, configuration, etc 3. Improved Rapid Start Kits DBaaS deployments tend to be complex and its setup requires a series of steps. These steps are typically performed across different users and different UIs. The Rapid Start Kit provides a single command solution to setup Database as a Service (DBaaS) and Pluggable Database as a Service (PDBaaS). One command creates all the Cloud artifacts like Roles, Administrators, Credentials, Database Profiles, PaaS Infrastructure Zone, Database Pools and Service Templates. Once the Rapid Start Kit has been successfully executed, requests can be made to provision databases and PDBs from the self service portal. Rapid start kit can create complex topologies involving multiple zones, pools and service templates. It also supports standby databases and use of RMAN image backups. The Rapid Start Kit in reality is a simple emcli script which takes a bunch of xml files as input and executes the complete automation in a matter of seconds. On a full rack Exadata, it took only 40 seconds to setup PDBaaS end-to-end. This kit works for both Oracle's engineered systems like Exadata, SuperCluster, etc and also on commodity hardware. One can draw parallel to the Exadata One Command script, which again takes a bunch of inputs from the administrators and then runs a simple script that configures everything from network to provisioning the DB software. Steps to use the kit: The kit can be found under the SSA plug-in directory on the OMS: EM_BASE/oracle/MW/plugins/oracle.sysman.ssa.oms.plugin_12.1.0.8.0/dbaas/setup It can be run from this default location or from any server which has emcli client installed For most scenarios, you would use the script dbaas/setup/database_cloud_setup.py For Exadata, special integration is provided to reduce the number of inputs even further. The script to use for this scenario would be dbaas/setup/exadata_cloud_setup.py The database_cloud_setup.py script takes two inputs: Cloud boundary xml: This file defines the cloud topology in terms of the zones and pools along with host names, oracle home locations or container database names that would be used as infrastructure for provisioning database services. This file is optional in case of Exadata, as the boundary is well know via the Exadata system target available in EM. Input xml: This file captures inputs for users, roles, profiles, service templates, etc. Essentially, all inputs required to define the DB services and other settings of the self service portal. Once all the xml files have been prepared, invoke the script as follows for PDBaaS: emcli @database_cloud_setup.py -pdbaas -cloud_boundary=/tmp/my_boundary.xml -cloud_input=/tmp/pdb_inputs.xml The script will prompt for passwords a few times for key users like sysman, cloud admin, SSA admin, etc. Once complete, you can simply log into EM as the self service user and request for databases from the portal. More information available in the Rapid Start Kit chapter in Cloud Administration Guide. 4. Extensible Metering and Chargeback Last but not the least, Metering and Chargeback in release 4 has been made extensible in all possible regards. The new extensibility features allow customer, partners, system integrators, etc to : Extend chargeback to any target type managed in EM Promote any metric in EM as a chargeback entity Extend list of charge items via metric or configuration extensions Model abstract entities like no. of backup requests, job executions, support requests, etc A slew of emcli verbs have also been added that allows administrators to create, edit, delete, import/export charge plans, and assign cost centers all via the command line. More information available in the Chargeback API chapter in Cloud Administration Guide. 5. Miscellaneous Enhancements There are other miscellaneous, yet important, enhancements that are worth a mention. These mostly have been asked by customers like you. These are: Custom naming of DB Services Self service users can provide custom names for DB SID, DB service, schemas, and tablespaces Every custom name is validated for uniqueness in EM 'Create like' of Service Templates Now creating variants of a service template is only a click away. This would be vital when you publish service templates to represent different database sizes or service levels. Profile viewer View the details of a profile like datafile, control files, snapshot ids, export/import files, etc prior to its selection in the service template Cleanup automation - for failed and successful requests Single emcli command to cleanup all remnant artifacts of a failed request Cleanup can be performed on a per request bases or by the entire pool As an extension, you can also delete successful requests Improved delete user workflow Allows administrators to reassign cloud resources to another user or delete all of them Support for multiple tablespaces for schema as a service In addition to multiple schemas, user can also specify multiple tablespaces per request I hope this was a good introduction to the new Database as a Service enhancements in EM12c R4. I encourage you to explore many of these new and existing features and give us feedback. Good luck! References: Cloud Management Page on OTN Cloud Administration Guide [Documentation] -- Adeesh Fulay (@adeeshf)

Read the article

WCF Bidirectional serialization fails

- by Gena Verdel

I'm trying to take advantage of Bidirectional serialization of some relational Linq-2-Sql generated entity classes. When using Unidirectional option everything works just fine, bu the moment I add IsReferenceType=true, objects fail to get transported over the tcp binding. Sample code: Entity class: [Table(Name="dbo.Blocks")] [DataContract()] public partial class Block : INotifyPropertyChanging, INotifyPropertyChanged { private static PropertyChangingEventArgs emptyChangingEventArgs = new PropertyChangingEventArgs(String.Empty); private long _ID; private int _StatusID; private string _Name; private bool _IsWithControlPoints; private long _DivisionID; private string _SHAPE; private EntitySet<BlockByWorkstation> _BlockByWorkstations; private EntitySet<PlanningPointAppropriation> _PlanningPointAppropriations; private EntitySet<Neighbor> _Neighbors; private EntitySet<Neighbor> _Neighbors1; private EntitySet<Task> _Tasks; private EntitySet<PlanningPointByBlock> _PlanningPointByBlocks; private EntitySet<ControlPointByBlock> _ControlPointByBlocks; private EntityRef<Division> _Division; private bool serializing; #region Extensibility Method Definitions partial void OnLoaded(); partial void OnValidate(System.Data.Linq.ChangeAction action); partial void OnCreated(); partial void OnIDChanging(long value); partial void OnIDChanged(); partial void OnStatusIDChanging(int value); partial void OnStatusIDChanged(); partial void OnNameChanging(string value); partial void OnNameChanged(); partial void OnIsWithControlPointsChanging(bool value); partial void OnIsWithControlPointsChanged(); partial void OnDivisionIDChanging(long value); partial void OnDivisionIDChanged(); partial void OnSHAPEChanging(string value); partial void OnSHAPEChanged(); #endregion public Block() { this.Initialize(); } [Column(Storage="_ID", AutoSync=AutoSync.OnInsert, DbType="BigInt NOT NULL IDENTITY", IsPrimaryKey=true, IsDbGenerated=true)] [DataMember(Order=1)] public override long ID { get { return this._ID; } set { if ((this._ID != value)) { this.OnIDChanging(value); this.SendPropertyChanging(); this._ID = value; this.SendPropertyChanged("ID"); this.OnIDChanged(); } } } [Column(Storage="_StatusID", DbType="Int NOT NULL")] [DataMember(Order=2)] public int StatusID { get { return this._StatusID; } set { if ((this._StatusID != value)) { this.OnStatusIDChanging(value); this.SendPropertyChanging(); this._StatusID = value; this.SendPropertyChanged("StatusID"); this.OnStatusIDChanged(); } } } [Column(Storage="_Name", DbType="NVarChar(255)")] [DataMember(Order=3)] public string Name { get { return this._Name; } set { if ((this._Name != value)) { this.OnNameChanging(value); this.SendPropertyChanging(); this._Name = value; this.SendPropertyChanged("Name"); this.OnNameChanged(); } } } [Column(Storage="_IsWithControlPoints", DbType="Bit NOT NULL")] [DataMember(Order=4)] public bool IsWithControlPoints { get { return this._IsWithControlPoints; } set { if ((this._IsWithControlPoints != value)) { this.OnIsWithControlPointsChanging(value); this.SendPropertyChanging(); this._IsWithControlPoints = value; this.SendPropertyChanged("IsWithControlPoints"); this.OnIsWithControlPointsChanged(); } } } [Column(Storage="_DivisionID", DbType="BigInt NOT NULL")] [DataMember(Order=5)] public long DivisionID { get { return this._DivisionID; } set { if ((this._DivisionID != value)) { if (this._Division.HasLoadedOrAssignedValue) { throw new System.Data.Linq.ForeignKeyReferenceAlreadyHasValueException(); } this.OnDivisionIDChanging(value); this.SendPropertyChanging(); this._DivisionID = value; this.SendPropertyChanged("DivisionID"); this.OnDivisionIDChanged(); } } } [Column(Storage="_SHAPE", DbType="Text", UpdateCheck=UpdateCheck.Never)] [DataMember(Order=6)] public string SHAPE { get { return this._SHAPE; } set { if ((this._SHAPE != value)) { this.OnSHAPEChanging(value); this.SendPropertyChanging(); this._SHAPE = value; this.SendPropertyChanged("SHAPE"); this.OnSHAPEChanged(); } } } [Association(Name="Block_BlockByWorkstation", Storage="_BlockByWorkstations", ThisKey="ID", OtherKey="BlockID")] [DataMember(Order=7, EmitDefaultValue=false)] public EntitySet<BlockByWorkstation> BlockByWorkstations { get { if ((this.serializing && (this._BlockByWorkstations.HasLoadedOrAssignedValues == false))) { return null; } return this._BlockByWorkstations; } set { this._BlockByWorkstations.Assign(value); } } [Association(Name="Block_PlanningPointAppropriation", Storage="_PlanningPointAppropriations", ThisKey="ID", OtherKey="MasterBlockID")] [DataMember(Order=8, EmitDefaultValue=false)] public EntitySet<PlanningPointAppropriation> PlanningPointAppropriations { get { if ((this.serializing && (this._PlanningPointAppropriations.HasLoadedOrAssignedValues == false))) { return null; } return this._PlanningPointAppropriations; } set { this._PlanningPointAppropriations.Assign(value); } } [Association(Name="Block_Neighbor", Storage="_Neighbors", ThisKey="ID", OtherKey="FirstBlockID")] [DataMember(Order=9, EmitDefaultValue=false)] public EntitySet<Neighbor> Neighbors { get { if ((this.serializing && (this._Neighbors.HasLoadedOrAssignedValues == false))) { return null; } return this._Neighbors; } set { this._Neighbors.Assign(value); } } [Association(Name="Block_Neighbor1", Storage="_Neighbors1", ThisKey="ID", OtherKey="SecondBlockID")] [DataMember(Order=10, EmitDefaultValue=false)] public EntitySet<Neighbor> Neighbors1 { get { if ((this.serializing && (this._Neighbors1.HasLoadedOrAssignedValues == false))) { return null; } return this._Neighbors1; } set { this._Neighbors1.Assign(value); } } [Association(Name="Block_Task", Storage="_Tasks", ThisKey="ID", OtherKey="BlockID")] [DataMember(Order=11, EmitDefaultValue=false)] public EntitySet<Task> Tasks { get { if ((this.serializing && (this._Tasks.HasLoadedOrAssignedValues == false))) { return null; } return this._Tasks; } set { this._Tasks.Assign(value); } } [Association(Name="Block_PlanningPointByBlock", Storage="_PlanningPointByBlocks", ThisKey="ID", OtherKey="BlockID")] [DataMember(Order=12, EmitDefaultValue=false)] public EntitySet<PlanningPointByBlock> PlanningPointByBlocks { get { if ((this.serializing && (this._PlanningPointByBlocks.HasLoadedOrAssignedValues == false))) { return null; } return this._PlanningPointByBlocks; } set { this._PlanningPointByBlocks.Assign(value); } } [Association(Name="Block_ControlPointByBlock", Storage="_ControlPointByBlocks", ThisKey="ID", OtherKey="BlockID")] [DataMember(Order=13, EmitDefaultValue=false)] public EntitySet<ControlPointByBlock> ControlPointByBlocks { get { if ((this.serializing && (this._ControlPointByBlocks.HasLoadedOrAssignedValues == false))) { return null; } return this._ControlPointByBlocks; } set { this._ControlPointByBlocks.Assign(value); } } [Association(Name="Division_Block", Storage="_Division", ThisKey="DivisionID", OtherKey="ID", IsForeignKey=true, DeleteOnNull=true, DeleteRule="CASCADE")] public Division Division { get { return this._Division.Entity; } set { Division previousValue = this._Division.Entity; if (((previousValue != value) || (this._Division.HasLoadedOrAssignedValue == false))) { this.SendPropertyChanging(); if ((previousValue != null)) { this._Division.Entity = null; previousValue.Blocks.Remove(this); } this._Division.Entity = value; if ((value != null)) { value.Blocks.Add(this); this._DivisionID = value.ID; } else { this._DivisionID = default(long); } this.SendPropertyChanged("Division"); } } } public event PropertyChangingEventHandler PropertyChanging; public event PropertyChangedEventHandler PropertyChanged; protected virtual void SendPropertyChanging() { if ((this.PropertyChanging != null)) { this.PropertyChanging(this, emptyChangingEventArgs); } } protected virtual void SendPropertyChanged(String propertyName) { if ((this.PropertyChanged != null)) { this.PropertyChanged(this, new PropertyChangedEventArgs(propertyName)); } } private void attach_BlockByWorkstations(BlockByWorkstation entity) { this.SendPropertyChanging(); entity.Block = this; } private void detach_BlockByWorkstations(BlockByWorkstation entity) { this.SendPropertyChanging(); entity.Block = null; } private void attach_PlanningPointAppropriations(PlanningPointAppropriation entity) { this.SendPropertyChanging(); entity.Block = this; } private void detach_PlanningPointAppropriations(PlanningPointAppropriation entity) { this.SendPropertyChanging(); entity.Block = null; } private void attach_Neighbors(Neighbor entity) { this.SendPropertyChanging(); entity.FirstBlock = this; } private void detach_Neighbors(Neighbor entity) { this.SendPropertyChanging(); entity.FirstBlock = null; } private void attach_Neighbors1(Neighbor entity) { this.SendPropertyChanging(); entity.SecondBlock = this; } private void detach_Neighbors1(Neighbor entity) { this.SendPropertyChanging(); entity.SecondBlock = null; } private void attach_Tasks(Task entity) { this.SendPropertyChanging(); entity.Block = this; } private void detach_Tasks(Task entity) { this.SendPropertyChanging(); entity.Block = null; } private void attach_PlanningPointByBlocks(PlanningPointByBlock entity) { this.SendPropertyChanging(); entity.Block = this; } private void detach_PlanningPointByBlocks(PlanningPointByBlock entity) { this.SendPropertyChanging(); entity.Block = null; } private void attach_ControlPointByBlocks(ControlPointByBlock entity) { this.SendPropertyChanging(); entity.Block = this; } private void detach_ControlPointByBlocks(ControlPointByBlock entity) { this.SendPropertyChanging(); entity.Block = null; } private void Initialize() { this._BlockByWorkstations = new EntitySet<BlockByWorkstation>(new Action<BlockByWorkstation>(this.attach_BlockByWorkstations), new Action<BlockByWorkstation>(this.detach_BlockByWorkstations)); this._PlanningPointAppropriations = new EntitySet<PlanningPointAppropriation>(new Action<PlanningPointAppropriation>(this.attach_PlanningPointAppropriations), new Action<PlanningPointAppropriation>(this.detach_PlanningPointAppropriations)); this._Neighbors = new EntitySet<Neighbor>(new Action<Neighbor>(this.attach_Neighbors), new Action<Neighbor>(this.detach_Neighbors)); this._Neighbors1 = new EntitySet<Neighbor>(new Action<Neighbor>(this.attach_Neighbors1), new Action<Neighbor>(this.detach_Neighbors1)); this._Tasks = new EntitySet<Task>(new Action<Task>(this.attach_Tasks), new Action<Task>(this.detach_Tasks)); this._PlanningPointByBlocks = new EntitySet<PlanningPointByBlock>(new Action<PlanningPointByBlock>(this.attach_PlanningPointByBlocks), new Action<PlanningPointByBlock>(this.detach_PlanningPointByBlocks)); this._ControlPointByBlocks = new EntitySet<ControlPointByBlock>(new Action<ControlPointByBlock>(this.attach_ControlPointByBlocks), new Action<ControlPointByBlock>(this.detach_ControlPointByBlocks)); this._Division = default(EntityRef<Division>); OnCreated(); } [OnDeserializing()] [System.ComponentModel.EditorBrowsableAttribute(EditorBrowsableState.Never)] public void OnDeserializing(StreamingContext context) { this.Initialize(); } [OnSerializing()] [System.ComponentModel.EditorBrowsableAttribute(EditorBrowsableState.Never)] public void OnSerializing(StreamingContext context) { this.serializing = true; } [OnSerialized()] [System.ComponentModel.EditorBrowsableAttribute(EditorBrowsableState.Never)] public void OnSerialized(StreamingContext context) { this.serializing = false; } } App.config: <?xml version="1.0" encoding="utf-8" ?> <configuration> <system.web> <compilation debug="true" /> </system.web>  <system.serviceModel> <services> <service behaviorConfiguration="debugging" name="DBServicesLibrary.DBService"> </service> </services> <behaviors> <serviceBehaviors> <behavior name="DBServicesLibrary.DBServiceBehavior">  <serviceMetadata httpGetEnabled="True"/>  <serviceDebug includeExceptionDetailInFaults="False" /> </behavior> <behavior name="debugging"> <serviceDebug includeExceptionDetailInFaults="true"/> </behavior> </serviceBehaviors> </behaviors> </system.serviceModel> </configuration> Host part: ServiceHost svh = new ServiceHost(typeof(DBService)); svh.AddServiceEndpoint( typeof(DBServices.Contract.IDBService), new NetTcpBinding(), "net.tcp://localhost:8000"); Client part: ChannelFactory<DBServices.Contract.IDBService> scf; scf = new ChannelFactory<DBServices.Contract.IDBService>(new NetTcpBinding(),"net.tcp://localhost:8000"); _serv = scf.CreateChannel(); ((IContextChannel)_serv).OperationTimeout = new TimeSpan(0, 5, 0);

Read the article

Talend Enterprise Data Integration overperforms on Oracle SPARC T4

- by Amir Javanshir

The SPARC T microprocessor, released in 2005 by Sun Microsystems, and now continued at Oracle, has a good track record in parallel execution and multi-threaded performance. However it was less suited for pure single-threaded workloads. The new SPARC T4 processor is now filling that gap by offering a 5x better single-thread performance over previous generations. Following our long-term relationship with Talend, a fast growing ISV positioned by Gartner in the “Visionaries” quadrant of the “Magic Quadrant for Data Integration Tools”, we decided to test some of their integration components with the T4 chip, more precisely on a T4-1 system, in order to verify first hand if this new processor stands up to its promises. Several tests were performed, mainly focused on: Single-thread performance of the new SPARC T4 processor compared to an older SPARC T2+ processor Overall throughput of the SPARC T4-1 server using multiple threads The tests consisted in reading large amounts of data --ten's of gigabytes--, processing and writing them back to a file or an Oracle 11gR2 database table. They are CPU, memory and IO bound tests. Given the main focus of this project --CPU performance--, bottlenecks were removed as much as possible on the memory and IO sub-systems. When possible, the data to process was put into the ZFS filesystem cache, for instance. Also, two external storage devices were directly attached to the servers under test, each one divided in two ZFS pools for read and write operations. Multi-thread: Testing throughput on the Oracle T4-1 The tests were performed with different number of simultaneous threads (1, 2, 4, 8, 12, 16, 32, 48 and 64) and using different storage devices: Flash, Fibre Channel storage, two stripped internal disks and one single internal disk. All storage devices used ZFS as filesystem and volume management. Each thread read a dedicated 1GB-large file containing 12.5M lines with the following structure: customerID;FirstName;LastName;StreetAddress;City;State;Zip;Cust_Status;Since_DT;Status_DT 1;Ronald;Reagan;South Highway;Santa Fe;Montana;98756;A;04-06-2006;09-08-2008 2;Theodore;Roosevelt;Timberlane Drive;Columbus;Louisiana;75677;A;10-05-2009;27-05-2008 3;Andrew;Madison;S Rustle St;Santa Fe;Arkansas;75677;A;29-04-2005;09-02-2008 4;Dwight;Adams;South Roosevelt Drive;Baton Rouge;Vermont;75677;A;15-02-2004;26-01-2007 […] The following graphs present the results of our tests: Unsurprisingly up to 16 threads, all files fit in the ZFS cache a.k.a L2ARC : once the cache is hot there is no performance difference depending on the underlying storage. From 16 threads upwards however, it is clear that IO becomes a bottleneck, having a good IO subsystem is thus key. Single-disk performance collapses whereas the Sun F5100 and ST6180 arrays allow the T4-1 to scale quite seamlessly. From 32 to 64 threads, the performance is almost constant with just a slow decline. For the database load tests, only the best IO configuration --using external storage devices-- were used, hosting the Oracle table spaces and redo log files. Using the Sun Storage F5100 array allows the T4-1 server to scale up to 48 parallel JVM processes before saturating the CPU. The final result is a staggering 646K lines per second insertion in an Oracle table using 48 parallel threads. Single-thread: Testing the single thread performance Seven different tests were performed on both servers. Given the fact that only one thread, thus one file was read, no IO bottleneck was involved, all data being served from the ZFS cache. Read File ? Filter ? Write File: Read file, filter data, write the filtered data in a new file. The filter is set on the “Status” column: only lines with status set to “A” are selected. This limits each output file to about 500 MB. Read File ? Load Database Table: Read file, insert into a single Oracle table. Average: Read file, compute the average of a numeric column, write the result in a new file. Division & Square Root: Read file, perform a division and square root on a numeric column, write the result data in a new file. Oracle DB Dump: Dump the content of an Oracle table (12.5M rows) into a CSV file. Transform: Read file, transform, write the result data in a new file. The transformations applied are: set the address column to upper case and add an extra column at the end, which is the concatenation of two columns. Sort: Read file, sort a numeric and alpha numeric column, write the result data in a new file. The following table and graph present the final results of the tests: Throughput unit is thousand lines per second processed (K lines/second). Improvement is the % of improvement between the T5140 and T4-1. Test T4-1 (Time s.) T5140 (Time s.) Improvement T4-1 (Throughput) T5140 (Throughput) Read/Filter/Write 125 806 645% 100 16 Read/Load Database 195 1111 570% 64 11 Average 96 557 580% 130 22 Division & Square Root 161 1054 655% 78 12 Oracle DB Dump 164 945 576% 76 13 Transform 159 1124 707% 79 11 Sort 251 1336 532% 50 9 The improvement of single-thread performance is quite dramatic: depending on the tests, the T4 is between 5.4 to 7 times faster than the T2+. It seems clear that the SPARC T4 processor has gone a long way filling the gap in single-thread performance, without sacrifying the multi-threaded capability as it still shows a very impressive scaling on heavy-duty multi-threaded jobs. Finally, as always at Oracle ISV Engineering, we are happy to help our ISV partners test their own applications on our platforms, so don't hesitate to contact us and let's see what the SPARC T4-based systems can do for your application! "As describe in this benchmark, Talend Enterprise Data Integration has overperformed on T4. I was generally happy to see that the T4 gave scaling opportunities for many scenarios like complex aggregations. Row by row insertion in Oracle DB is faster with more than 650,000 rows per seconds without using any bulk Oracle capabilities !" Cedric Carbone, Talend CTO.

Read the article

Unexpected advantage of Engineered Systems

- by user12244672

It's not surprising that Engineered Systems accelerate the debugging and resolution of customer issues. But what has surprised me is just how much faster issue resolution is with Engineered Systems such as SPARC SuperCluster. These are powerful, complex, systems used by customers wanting extreme database performance, app performance, and cost saving server consolidation. A SPARC SuperCluster consists or 2 or 4 powerful T4-4 compute nodes, 3 or 6 extreme performance Exadata Storage Cells, a ZFS Storage Appliance 7320 for general purpose storage, and ultra fast Infiniband switches. Each with its own firmware. It runs Solaris 11, Solaris 10, 11gR2, LDoms virtualization, and Zones virtualization on the T4-4 compute nodes, a modified version of Solaris 11 in the ZFS Storage Appliance, a modified and highly tuned version of Oracle Linux running Exadata software on the Storage Cells, another Linux derivative in the Infiniband switches, etc. It has an Infiniband data network between the components, a 10Gb data network to the outside world, and a 1Gb management network. And customers can run whatever middleware and apps they want on it, clustered in whatever way they want. In one word, powerful. In another, complex. The system is highly Engineered. But it's designed to run general purpose applications. That is, the physical components, configuration, cabling, virtualization technologies, switches, firmware, Operating System versions, network protocols, tunables, etc. are all preset for optimum performance and robustness. That improves the customer experience as what the customer runs leverages our technical know-how and best practices and is what we've tested intensely within Oracle. It should also make debugging easier by fixing a large number of variables which would otherwise be in play if a customer or Systems Integrator had assembled such a complex system themselves from the constituent components. For example, there's myriad network protocols which could be used with Infiniband. Myriad ways the components could be interconnected, myriad tunable settings, etc. But what has really surprised me - and I've been working in this area for 15 years now - is just how much easier and faster Engineered Systems have made debugging and issue resolution. All those error opportunities for sub-optimal cabling, unusual network protocols, sub-optimal deployment of virtualization technologies, issues with 3rd party storage, issues with 3rd party multi-pathing products, etc., are simply taken out of the equation. All those error opportunities for making an issue unique to a particular set-up, the "why aren't we seeing this on any other system ?" type questions, the doubts, just go away when we or a customer discover an issue on an Engineered System. It enables a really honed response, getting to the root cause much, much faster than would otherwise be the case. Here's a couple of examples from the last month, one found in-house by my team, one found by a customer: Example 1: We found a node eviction issue running 11gR2 with Solaris 11 SRU 12 under extreme load on what we call our ExaLego test system (mimics an Exadata / SuperCluster 11gR2 Exadata Storage Cell set-up). We quickly established that an enhancement in SRU12 enabled an 11gR2 process to query Infiniband's Subnet Manager, replacing a fallback mechanism it had used previously. Under abnormally heavy load, the query could return results which were misinterpreted resulting in node eviction. In several daily joint debugging sessions between the Solaris, Infiniband, and 11gR2 teams, the issue was fully root caused, evaluated, and a fix agreed upon. That fix went back into all Solaris releases the following Monday. From initial issue discovery to the fix being put back into all Solaris releases was just 10 days. Example 2: A customer reported sporadic performance degradation. The reasons were unclear and the information sparse. The SPARC SuperCluster Engineered Systems support teams which comprises both SPARC/Solaris and Database/Exadata experts worked to root cause the issue. A number of contributing factors were discovered, including tunable parameters. An intense collaborative investigation between the engineering teams identified the root cause to a CPU bound networking thread which was being starved of CPU cycles under extreme load. Workarounds were identified. Modifications have been put back into 11gR2 to alleviate the issue and a development project already underway within Solaris has been sped up to provide the final resolution on the Solaris side. The fixed SPARC SuperCluster configuration greatly aided issue reproduction and dramatically sped up root cause analysis, allowing the correct workarounds and fixes to be identified, prioritized, and implemented. The customer is now extremely happy with performance and robustness. Since the configuration is common to other customers, the lessons learned are being proactively rolled out to other customers and incorporated into the installation procedures for future customers. This effectively acts as a turbo-boost to performance and reliability for all SPARC SuperCluster customers. If this had occurred in a "home grown" system of this complexity, I expect it would have taken at least 6 months to get to the bottom of the issue. But because it was an Engineered System, known, understood, and qualified by both the Solaris and Database teams, we were able to collaborate closely to identify cause and effect and expedite a solution for the customer. That is a key advantage of Engineered Systems which should not be underestimated. Indeed, the initial issue mitigation on the Database side followed by final fix on the Solaris side, highlights the high degree of collaboration and excellent teamwork between the Oracle engineering teams. It's a compelling advantage of the integrated Oracle Red Stack in general and Engineered Systems in particular.

Read the article

Inheritance Mapping Strategies with Entity Framework Code First CTP5: Part 3 – Table per Concrete Type (TPC) and Choosing Strategy Guidelines

- by mortezam

This is the third (and last) post in a series that explains different approaches to map an inheritance hierarchy with EF Code First. I've described these strategies in previous posts: Part 1 – Table per Hierarchy (TPH) Part 2 – Table per Type (TPT)In today’s blog post I am going to discuss Table per Concrete Type (TPC) which completes the inheritance mapping strategies supported by EF Code First. At the end of this post I will provide some guidelines to choose an inheritance strategy mainly based on what we've learned in this series. TPC and Entity Framework in the Past Table per Concrete type is somehow the simplest approach suggested, yet using TPC with EF is one of those concepts that has not been covered very well so far and I've seen in some resources that it was even discouraged. The reason for that is just because Entity Data Model Designer in VS2010 doesn't support TPC (even though the EF runtime does). That basically means if you are following EF's Database-First or Model-First approaches then configuring TPC requires manually writing XML in the EDMX file which is not considered to be a fun practice. Well, no more. You'll see that with Code First, creating TPC is perfectly possible with fluent API just like other strategies and you don't need to avoid TPC due to the lack of designer support as you would probably do in other EF approaches. Table per Concrete Type (TPC)In Table per Concrete type (aka Table per Concrete class) we use exactly one table for each (nonabstract) class. All properties of a class, including inherited properties, can be mapped to columns of this table, as shown in the following figure: As you can see, the SQL schema is not aware of the inheritance; effectively, we’ve mapped two unrelated tables to a more expressive class structure. If the base class was concrete, then an additional table would be needed to hold instances of that class. I have to emphasize that there is no relationship between the database tables, except for the fact that they share some similar columns. TPC Implementation in Code First Just like the TPT implementation, we need to specify a separate table for each of the subclasses. We also need to tell Code First that we want all of the inherited properties to be mapped as part of this table. In CTP5, there is a new helper method on EntityMappingConfiguration class called MapInheritedProperties that exactly does this for us. Here is the complete object model as well as the fluent API to create a TPC mapping: public abstract class BillingDetail { public int BillingDetailId { get; set; } public string Owner { get; set; } public string Number { get; set; } } public class BankAccount : BillingDetail { public string BankName { get; set; } public string Swift { get; set; } } public class CreditCard : BillingDetail { public int CardType { get; set; } public string ExpiryMonth { get; set; } public string ExpiryYear { get; set; } } public class InheritanceMappingContext : DbContext { public DbSet<BillingDetail> BillingDetails { get; set; } protected override void OnModelCreating(ModelBuilder modelBuilder) { modelBuilder.Entity<BankAccount>().Map(m => { m.MapInheritedProperties(); m.ToTable("BankAccounts"); }); modelBuilder.Entity<CreditCard>().Map(m => { m.MapInheritedProperties(); m.ToTable("CreditCards"); }); } } The Importance of EntityMappingConfiguration ClassAs a side note, it worth mentioning that EntityMappingConfiguration class turns out to be a key type for inheritance mapping in Code First. Here is an snapshot of this class: namespace System.Data.Entity.ModelConfiguration.Configuration.Mapping { public class EntityMappingConfiguration<TEntityType> where TEntityType : class { public ValueConditionConfiguration Requires(string discriminator); public void ToTable(string tableName); public void MapInheritedProperties(); } } As you have seen so far, we used its Requires method to customize TPH. We also used its ToTable method to create a TPT and now we are using its MapInheritedProperties along with ToTable method to create our TPC mapping. TPC Configuration is Not Done Yet!We are not quite done with our TPC configuration and there is more into this story even though the fluent API we saw perfectly created a TPC mapping for us in the database. To see why, let's start working with our object model. For example, the following code creates two new objects of BankAccount and CreditCard types and tries to add them to the database: using (var context = new InheritanceMappingContext()) { BankAccount bankAccount = new BankAccount(); CreditCard creditCard = new CreditCard() { CardType = 1 }; context.BillingDetails.Add(bankAccount); context.BillingDetails.Add(creditCard); context.SaveChanges(); } Running this code throws an InvalidOperationException with this message: The changes to the database were committed successfully, but an error occurred while updating the object context. The ObjectContext might be in an inconsistent state. Inner exception message: AcceptChanges cannot continue because the object's key values conflict with another object in the ObjectStateManager. Make sure that the key values are unique before calling AcceptChanges. The reason we got this exception is because DbContext.SaveChanges() internally invokes SaveChanges method of its internal ObjectContext. ObjectContext's SaveChanges method on its turn by default calls AcceptAllChanges after it has performed the database modifications. AcceptAllChanges method merely iterates over all entries in ObjectStateManager and invokes AcceptChanges on each of them. Since the entities are in Added state, AcceptChanges method replaces their temporary EntityKey with a regular EntityKey based on the primary key values (i.e. BillingDetailId) that come back from the database and that's where the problem occurs since both the entities have been assigned the same value for their primary key by the database (i.e. on both BillingDetailId = 1) and the problem is that ObjectStateManager cannot track objects of the same type (i.e. BillingDetail) with the same EntityKey value hence it throws. If you take a closer look at the TPC's SQL schema above, you'll see why the database generated the same values for the primary keys: the BillingDetailId column in both BankAccounts and CreditCards table has been marked as identity. How to Solve The Identity Problem in TPC As you saw, using SQL Server’s int identity columns doesn't work very well together with TPC since there will be duplicate entity keys when inserting in subclasses tables with all having the same identity seed. Therefore, to solve this, either a spread seed (where each table has its own initial seed value) will be needed, or a mechanism other than SQL Server’s int identity should be used. Some other RDBMSes have other mechanisms allowing a sequence (identity) to be shared by multiple tables, and something similar can be achieved with GUID keys in SQL Server. While using GUID keys, or int identity keys with different starting seeds will solve the problem but yet another solution would be to completely switch off identity on the primary key property. As a result, we need to take the responsibility of providing unique keys when inserting records to the database. We will go with this solution since it works regardless of which database engine is used. Switching Off Identity in Code First We can switch off identity simply by placing DatabaseGenerated attribute on the primary key property and pass DatabaseGenerationOption.None to its constructor. DatabaseGenerated attribute is a new data annotation which has been added to System.ComponentModel.DataAnnotations namespace in CTP5: public abstract class BillingDetail { [DatabaseGenerated(DatabaseGenerationOption.None)] public int BillingDetailId { get; set; } public string Owner { get; set; } public string Number { get; set; } } As always, we can achieve the same result by using fluent API, if you prefer that: modelBuilder.Entity<BillingDetail>() .Property(p => p.BillingDetailId) .HasDatabaseGenerationOption(DatabaseGenerationOption.None); Working With The Object Model Our TPC mapping is ready and we can try adding new records to the database. But, like I said, now we need to take care of providing unique keys when creating new objects: using (var context = new InheritanceMappingContext()) { BankAccount bankAccount = new BankAccount() { BillingDetailId = 1 }; CreditCard creditCard = new CreditCard() { BillingDetailId = 2, CardType = 1 }; context.BillingDetails.Add(bankAccount); context.BillingDetails.Add(creditCard); context.SaveChanges(); } Polymorphic Associations with TPC is Problematic The main problem with this approach is that it doesn’t support Polymorphic Associations very well. After all, in the database, associations are represented as foreign key relationships and in TPC, the subclasses are all mapped to different tables so a polymorphic association to their base class (abstract BillingDetail in our example) cannot be represented as a simple foreign key relationship. For example, consider the the domain model we introduced here where User has a polymorphic association with BillingDetail. This would be problematic in our TPC Schema, because if User has a many-to-one relationship with BillingDetail, the Users table would need a single foreign key column, which would have to refer both concrete subclass tables. This isn’t possible with regular foreign key constraints. Schema Evolution with TPC is Complex A further conceptual problem with this mapping strategy is that several different columns, of different tables, share exactly the same semantics. This makes schema evolution more complex. For example, a change to a base class property results in changes to multiple columns. It also makes it much more difficult to implement database integrity constraints that apply to all subclasses. Generated SQLLet's examine SQL output for polymorphic queries in TPC mapping. For example, consider this polymorphic query for all BillingDetails and the resulting SQL statements that being executed in the database: var query = from b in context.BillingDetails select b; Just like the SQL query generated by TPT mapping, the CASE statements that you see in the beginning of the query is merely to ensure columns that are irrelevant for a particular row have NULL values in the returning flattened table. (e.g. BankName for a row that represents a CreditCard type). TPC's SQL Queries are Union Based As you can see in the above screenshot, the first SELECT uses a FROM-clause subquery (which is selected with a red rectangle) to retrieve all instances of BillingDetails from all concrete class tables. The tables are combined with a UNION operator, and a literal (in this case, 0 and 1) is inserted into the intermediate result; (look at the lines highlighted in yellow.) EF reads this to instantiate the correct class given the data from a particular row. A union requires that the queries that are combined, project over the same columns; hence, EF has to pad and fill up nonexistent columns with NULL. This query will really perform well since here we can let the database optimizer find the best execution plan to combine rows from several tables. There is also no Joins involved so it has a better performance than the SQL queries generated by TPT where a Join is required between the base and subclasses tables. Choosing Strategy GuidelinesBefore we get into this discussion, I want to emphasize that there is no one single "best strategy fits all scenarios" exists. As you saw, each of the approaches have their own advantages and drawbacks. Here are some rules of thumb to identify the best strategy in a particular scenario: If you don’t require polymorphic associations or queries, lean toward TPC—in other words, if you never or rarely query for BillingDetails and you have no class that has an association to BillingDetail base class. I recommend TPC (only) for the top level of your class hierarchy, where polymorphism isn’t usually required, and when modification of the base class in the future is unlikely. If you do require polymorphic associations or queries, and subclasses declare relatively few properties (particularly if the main difference between subclasses is in their behavior), lean toward TPH. Your goal is to minimize the number of nullable columns and to convince yourself (and your DBA) that a denormalized schema won’t create problems in the long run. If you do require polymorphic associations or queries, and subclasses declare many properties (subclasses differ mainly by the data they hold), lean toward TPT. Or, depending on the width and depth of your inheritance hierarchy and the possible cost of joins versus unions, use TPC. By default, choose TPH only for simple problems. For more complex cases (or when you’re overruled by a data modeler insisting on the importance of nullability constraints and normalization), you should consider the TPT strategy. But at that point, ask yourself whether it may not be better to remodel inheritance as delegation in the object model (delegation is a way of making composition as powerful for reuse as inheritance). Complex inheritance is often best avoided for all sorts of reasons unrelated to persistence or ORM. EF acts as a buffer between the domain and relational models, but that doesn’t mean you can ignore persistence concerns when designing your classes. SummaryIn this series, we focused on one of the main structural aspect of the object/relational paradigm mismatch which is inheritance and discussed how EF solve this problem as an ORM solution. We learned about the three well-known inheritance mapping strategies and their implementations in EF Code First. Hopefully it gives you a better insight about the mapping of inheritance hierarchies as well as choosing the best strategy for your particular scenario. Happy New Year and Happy Code-Firsting! References ADO.NET team blog Java Persistence with Hibernate book a { color: #5A99FF; } a:visited { color: #5A99FF; } .title { padding-bottom: 5px; font-family: Segoe UI; font-size: 11pt; font-weight: bold; padding-top: 15px; } .code, .typeName { font-family: consolas; } .typeName { color: #2b91af; } .padTop5 { padding-top: 5px; } .padTop10 { padding-top: 10px; } .exception { background-color: #f0f0f0; font-style: italic; padding-bottom: 5px; padding-left: 5px; padding-top: 5px; padding-right: 5px; }

Read the article

Heaps of Trouble?

- by Paul White NZ

If you’re not already a regular reader of Brad Schulz’s blog, you’re missing out on some great material. In his latest entry, he is tasked with optimizing a query run against tables that have no indexes at all. The problem is, predictably, that performance is not very good. The catch is that we are not allowed to create any indexes (or even new statistics) as part of our optimization efforts. In this post, I’m going to look at the problem from a slightly different angle, and present an alternative solution to the one Brad found. Inevitably, there’s going to be some overlap between our entries, and while you don’t necessarily need to read Brad’s post before this one, I do strongly recommend that you read it at some stage; he covers some important points that I won’t cover again here. The Example We’ll use data from the AdventureWorks database, copied to temporary unindexed tables. A script to create these structures is shown below: CREATE TABLE #Custs ( CustomerID INTEGER NOT NULL, TerritoryID INTEGER NULL, CustomerType NCHAR(1) COLLATE SQL_Latin1_General_CP1_CI_AI NOT NULL, ); GO CREATE TABLE #Prods ( ProductMainID INTEGER NOT NULL, ProductSubID INTEGER NOT NULL, ProductSubSubID INTEGER NOT NULL, Name NVARCHAR(50) COLLATE SQL_Latin1_General_CP1_CI_AI NOT NULL, ); GO CREATE TABLE #OrdHeader ( SalesOrderID INTEGER NOT NULL, OrderDate DATETIME NOT NULL, SalesOrderNumber NVARCHAR(25) COLLATE SQL_Latin1_General_CP1_CI_AI NOT NULL, CustomerID INTEGER NOT NULL, ); GO CREATE TABLE #OrdDetail ( SalesOrderID INTEGER NOT NULL, OrderQty SMALLINT NOT NULL, LineTotal NUMERIC(38,6) NOT NULL, ProductMainID INTEGER NOT NULL, ProductSubID INTEGER NOT NULL, ProductSubSubID INTEGER NOT NULL, ); GO INSERT #Custs ( CustomerID, TerritoryID, CustomerType ) SELECT C.CustomerID, C.TerritoryID, C.CustomerType FROM AdventureWorks.Sales.Customer C WITH (TABLOCK); GO INSERT #Prods ( ProductMainID, ProductSubID, ProductSubSubID, Name ) SELECT P.ProductID, P.ProductID, P.ProductID, P.Name FROM AdventureWorks.Production.Product P WITH (TABLOCK); GO INSERT #OrdHeader ( SalesOrderID, OrderDate, SalesOrderNumber, CustomerID ) SELECT H.SalesOrderID, H.OrderDate, H.SalesOrderNumber, H.CustomerID FROM AdventureWorks.Sales.SalesOrderHeader H WITH (TABLOCK); GO INSERT #OrdDetail ( SalesOrderID, OrderQty, LineTotal, ProductMainID, ProductSubID, ProductSubSubID ) SELECT D.SalesOrderID, D.OrderQty, D.LineTotal, D.ProductID, D.ProductID, D.ProductID FROM AdventureWorks.Sales.SalesOrderDetail D WITH (TABLOCK); The query itself is a simple join of the four tables: SELECT P.ProductMainID AS PID, P.Name, D.OrderQty, H.SalesOrderNumber, H.OrderDate, C.TerritoryID FROM #Prods P JOIN #OrdDetail D ON P.ProductMainID = D.ProductMainID AND P.ProductSubID = D.ProductSubID AND P.ProductSubSubID = D.ProductSubSubID JOIN #OrdHeader H ON D.SalesOrderID = H.SalesOrderID JOIN #Custs C ON H.CustomerID = C.CustomerID ORDER BY P.ProductMainID ASC OPTION (RECOMPILE, MAXDOP 1); Remember that these tables have no indexes at all, and only the single-column sampled statistics SQL Server automatically creates (assuming default settings). The estimated query plan produced for the test query looks like this (click to enlarge): The Problem The problem here is one of cardinality estimation – the number of rows SQL Server expects to find at each step of the plan. The lack of indexes and useful statistical information means that SQL Server does not have the information it needs to make a good estimate. Every join in the plan shown above estimates that it will produce just a single row as output. Brad covers the factors that lead to the low estimates in his post. In reality, the join between the #Prods and #OrdDetail tables will produce 121,317 rows. It should not surprise you that this has rather dire consequences for the remainder of the query plan. In particular, it makes a nonsense of the optimizer’s decision to use Nested Loops to join to the two remaining tables. Instead of scanning the #OrdHeader and #Custs tables once (as it expected), it has to perform 121,317 full scans of each. The query takes somewhere in the region of twenty minutes to run to completion on my development machine. A Solution At this point, you may be thinking the same thing I was: if we really are stuck with no indexes, the best we can do is to use hash joins everywhere. We can force the exclusive use of hash joins in several ways, the two most common being join and query hints. A join hint means writing the query using the INNER HASH JOIN syntax; using a query hint involves adding OPTION (HASH JOIN) at the bottom of the query. The difference is that using join hints also forces the order of the join, whereas the query hint gives the optimizer freedom to reorder the joins at its discretion. Adding the OPTION (HASH JOIN) hint results in this estimated plan: That produces the correct output in around seven seconds, which is quite an improvement! As a purely practical matter, and given the rigid rules of the environment we find ourselves in, we might leave things there. (We can improve the hashing solution a bit – I’ll come back to that later on). Faster Nested Loops It might surprise you to hear that we can beat the performance of the hash join solution shown above using nested loops joins exclusively, and without breaking the rules we have been set. The key to this part is to realize that a condition like (A = B) can be expressed as (A <= B) AND (A >= B). Armed with this tremendous new insight, we can rewrite the join predicates like so: SELECT P.ProductMainID AS PID, P.Name, D.OrderQty, H.SalesOrderNumber, H.OrderDate, C.TerritoryID FROM #OrdDetail D JOIN #OrdHeader H ON D.SalesOrderID >= H.SalesOrderID AND D.SalesOrderID <= H.SalesOrderID JOIN #Custs C ON H.CustomerID >= C.CustomerID AND H.CustomerID <= C.CustomerID JOIN #Prods P ON P.ProductMainID >= D.ProductMainID AND P.ProductMainID <= D.ProductMainID AND P.ProductSubID = D.ProductSubID AND P.ProductSubSubID = D.ProductSubSubID ORDER BY D.ProductMainID OPTION (RECOMPILE, LOOP JOIN, MAXDOP 1, FORCE ORDER); I’ve also added LOOP JOIN and FORCE ORDER query hints to ensure that only nested loops joins are used, and that the tables are joined in the order they appear. The new estimated execution plan is: This new query runs in under 2 seconds. Why Is It Faster? The main reason for the improvement is the appearance of the eager Index Spools, which are also known as index-on-the-fly spools. If you read my Inside The Optimiser series you might be interested to know that the rule responsible is called JoinToIndexOnTheFly. An eager index spool consumes all rows from the table it sits above, and builds a index suitable for the join to seek on. Taking the index spool above the #Custs table as an example, it reads all the CustomerID and TerritoryID values with a single scan of the table, and builds an index keyed on CustomerID. The term ‘eager’ means that the spool consumes all of its input rows when it starts up. The index is built in a work table in tempdb, has no associated statistics, and only exists until the query finishes executing. The result is that each unindexed table is only scanned once, and just for the columns necessary to build the temporary index. From that point on, every execution of the inner side of the join is answered by a seek on the temporary index – not the base table. A second optimization is that the sort on ProductMainID (required by the ORDER BY clause) is performed early, on just the rows coming from the #OrdDetail table. The optimizer has a good estimate for the number of rows it needs to sort at that stage – it is just the cardinality of the table itself. The accuracy of the estimate there is important because it helps determine the memory grant given to the sort operation. Nested loops join preserves the order of rows on its outer input, so sorting early is safe. (Hash joins do not preserve order in this way, of course). The extra lazy spool on the #Prods branch is a further optimization that avoids executing the seek on the temporary index if the value being joined (the ‘outer reference’) hasn’t changed from the last row received on the outer input. It takes advantage of the fact that rows are still sorted on ProductMainID, so if duplicates exist, they will arrive at the join operator one after the other. The optimizer is quite conservative about introducing index spools into a plan, because creating and dropping a temporary index is a relatively expensive operation. It’s presence in a plan is often an indication that a useful index is missing. I want to stress that I rewrote the query in this way primarily as an educational exercise – I can’t imagine having to do something so horrible to a production system. Improving the Hash Join I promised I would return to the solution that uses hash joins. You might be puzzled that SQL Server can create three new indexes (and perform all those nested loops iterations) faster than it can perform three hash joins. The answer, again, is down to the poor information available to the optimizer. Let’s look at the hash join plan again: Two of the hash joins have single-row estimates on their build inputs. SQL Server fixes the amount of memory available for the hash table based on this cardinality estimate, so at run time the hash join very quickly runs out of memory. This results in the join spilling hash buckets to disk, and any rows from the probe input that hash to the spilled buckets also get written to disk. The join process then continues, and may again run out of memory. This is a recursive process, which may eventually result in SQL Server resorting to a bailout join algorithm, which is guaranteed to complete eventually, but may be very slow. The data sizes in the example tables are not large enough to force a hash bailout, but it does result in multiple levels of hash recursion. You can see this for yourself by tracing the Hash Warning event using the Profiler tool. The final sort in the plan also suffers from a similar problem: it receives very little memory and has to perform multiple sort passes, saving intermediate runs to disk (the Sort Warnings Profiler event can be used to confirm this). Notice also that because hash joins don’t preserve sort order, the sort cannot be pushed down the plan toward the #OrdDetail table, as in the nested loops plan. Ok, so now we understand the problems, what can we do to fix it? We can address the hash spilling by forcing a different order for the joins: SELECT P.ProductMainID AS PID, P.Name, D.OrderQty, H.SalesOrderNumber, H.OrderDate, C.TerritoryID FROM #Prods P JOIN #Custs C JOIN #OrdHeader H ON H.CustomerID = C.CustomerID JOIN #OrdDetail D ON D.SalesOrderID = H.SalesOrderID ON P.ProductMainID = D.ProductMainID AND P.ProductSubID = D.ProductSubID AND P.ProductSubSubID = D.ProductSubSubID ORDER BY D.ProductMainID OPTION (MAXDOP 1, HASH JOIN, FORCE ORDER); With this plan, each of the inputs to the hash joins has a good estimate, and no hash recursion occurs. The final sort still suffers from the one-row estimate problem, and we get a single-pass sort warning as it writes rows to disk. Even so, the query runs to completion in three or four seconds. That’s around half the time of the previous hashing solution, but still not as fast as the nested loops trickery. Final Thoughts SQL Server’s optimizer makes cost-based decisions, so it is vital to provide it with accurate information. We can’t really blame the performance problems highlighted here on anything other than the decision to use completely unindexed tables, and not to allow the creation of additional statistics. I should probably stress that the nested loops solution shown above is not one I would normally contemplate in the real world. It’s there primarily for its educational and entertainment value. I might perhaps use it to demonstrate to the sceptical that SQL Server itself is crying out for an index. Be sure to read Brad’s original post for more details. My grateful thanks to him for granting permission to reuse some of his material. Paul White Email: [email protected] Twitter: @PaulWhiteNZ

Read the article

SQL SERVER – Importance of User Without Login – T-SQL Demo Script

- by pinaldave

Earlier I wrote a blog post about SQL SERVER – Importance of User Without Login and my friend and SQL Expert Vinod Kumar has written excellent follow up blog post about Contained Databases inside SQL Server 2012. Now lots of people asked me if I can also explain the same concept again so here is the small demonstration for it. Let me show you how login without user can help. Before we continue on this subject I strongly recommend that you read my earlier blog post here. In following demo I am going to demonstrate following situation. Login using the System Admin account Create a user without login Checking Access Impersonate the user without login Checking Access Revert Impersonation Give Permission to user without login Impersonate the user without login Checking Access Revert Impersonation Clean up USE [AdventureWorks2012] GO -- Step 1 : Login using the SA -- Step 2 : Create Login Less User CREATE USER [testguest] 9ITHOUT LOGIN WITH DEFAULT_SCHEMA=[dbo] GO -- Step 3 : Checking access to Tables SELECT * FROM sys.tables; -- Step 4 : Changing the execution contest EXECUTE AS USER = 'testguest'; GO -- Step 5 : Checking access to Tables SELECT * FROM sys.tables; GO -- Step 6 : Reverting Permissions REVERT; -- Step 7 : Giving more Permissions to testguest user GRANT SELECT ON [dbo].[ErrorLog] TO [testguest]; GRANT SELECT ON [dbo].[DatabaseLog] TO [testguest]; GO -- Step 8 : Changing the execution contest EXECUTE AS USER = 'testguest'; GO -- Step 9 : Checking access to Tables SELECT * FROM sys.tables; GO -- Step 10 : Reverting Permissions REVERT; GO -- Step 11: Clean up DROP USER [testguest]Step 3 GO Here is the step 9 we will be able to notice that how a user without login gets access to some of the data/object which we gave permission. What I am going to prove with this example? Well there can be different rights with different account. Once the login is authenticated it makes sense for impersonating a user with only necessary permissions to be used for further operation. Again this is very basic and fundamental example. There are lots of more points to be discussed as we go in future posts. Just do not take this blog post as a template and implement everything as it is. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: PostADay, SQL, SQL Authority, SQL Query, SQL Security, SQL Server, SQL Tips and Tricks, T SQL, Technology

Read the article

Fusion Concepts: Fusion Database Schemas

- by Vik Kumar

You often read about FUSION and FUSION_RUNTIME users while dealing with Fusion Applications. There is one more called FUSION_DYNAMIC. Here are some details on the difference between these three and the purpose of each type of schema. FUSION: It can be considered as an Administrator of the Fusion Applications with all the corresponding rights and powers such as owning tables and objects, providing grants to FUSION_RUNTIME. It is used for patching and has grants to many internal DBMS functions. FUSION_RUNTIME: Used to run the Applications. Contains no DB objects. FUSION_DYNAMIC: This schema owns the objects that are created dynamically through ADM_DDL. ADM_DDL is a package that acts as a wrapper around the DDL statement. ADM_DDL support operations like truncate table, create index etc. As the above statements indicate that FUSION owns the tables and objects including FND tables so using FUSION to run applications is insecure. It would be possible to modify security policies and other key information in the base tables (like FND) to break the Fusion Applications security via SQL injection etc. Other possibilities would be to write a logon DB trigger and steal credentials etc. Thus, to make Fusion Applications secure FUSION_RUNTIME is granted privileges to execute DMLs only on APPS tables. Another benefit of having separate users is achieving Separation of Duties (SODs) at schema level which is required by auditors. Below are the roles and privileges assigned to FUSION, FUSION_RUNTIME and FUSION_DYNAMIC schema: FUSION It has the following privileges: Create SESSION Do all types of DDL owned by FUSION. Additionally, some specific priveleges on other schemas is also granted to FUSION. EXECUTE ON various EDN_PUBLISH_EVENT It has the following roles: CTXAPP for managing Oracle Text Objects AQ_SER_ROLE and AQ_ADMINISTRATOR_ROLE for managing Advanced Queues (AQ) FUSION_RUNTIME It has the following privileges: CREATE SESSION CHANGE NOTIFICATION EXECUTE ON various EDN_PUBLISH_EVENT It has the following roles: FUSION_APPS_READ_WRITE for performing DML (Select, Insert, Delete) on Fusion Apps tables FUSION_APPS_EXECUTE for performing execute on objects such as procedures, functions, packages etc. AQ_SER_ROLE and AQ_ADMINISTRATOR_ROLE for managing Advanced Queues (AQ) FUSION_DYNAMIC It has following privileges: CREATE SESSION, PROCEDURE, TABLE, SEQUENCE, SYNONYM, VIEW UNLIMITED TABLESPACE ANALYZE ANY CREATE MINING MODEL EXECUTE on specific procedure, function or package and SELECT on specific tables. This depends on the objects identified by product teams that ADM_DDL needs to have access in order to perform dynamic DDL statements. There is one more role FUSION_APPS_READ_ONLY which is not attached to any user and has only SELECT privilege on all the Fusion objects. FUSION_RUNTIME does not have any synonyms defined to access objects owned by FUSION schema. A logon trigger is defined in FUSION_RUNTIME which sets the current schema to FUSION and eliminates the need of any synonyms. What it means for developers? Fusion Application developers should be using FUSION_RUNTIME for testing and running Fusion Applications UI, BC and to connect to any SQL front end like SQL *PLUS, SQL Loader etc. For testing ADFbc using AM tester while using FUSION_RUNTIME you may hit the following error: oracle.jbo.JboException: JBO-29000: Unexpected exception caught: java.sql.SQLException, msg=invalid name pattern: FUSION.FND_TABLE_OF_VARCHAR2_255 The fix is to add the below JVM parameter in the Run/Debug client property in the Model project properties -Doracle.jdbc.createDescriptorUseCurrentSchemaForSchemaName=true More details are discussed in this forum thread for it.

Read the article

.NETTER Code Starter Pack v1.0.beta Released

- by Mohammad Ashraful Alam

.NETTER Code Starter Pack contains a gallery of Visual Studio 2010 solutions leveraging latest and new technologies released by Microsoft. Each Visual Studio solution included here is focused to provide a very simple starting point for cutting edge development technologies and framework, using well known Northwind database. The current release of this project includes starter samples for the following technologies: ASP.NET Dynamic Data QuickStart (TBD) Azure Service Platform Windows Azure Hello World Windows Azure Storage Simple CRUD Database Scripts Entity Framework 4.0 (TBD) SharePoint 2010 Visual Web Part Linq QuickStart Silverlight Business App Hello World WCF RIA Services QuickStart Utility Framework MEF Moq QuickStart T-4 QuickStart Unity QuickStart WCF WCF Data Services QuickStart WCF Hello World WorkFlow Foundation Web API Facebook Toolkit QuickStart Download link: http://codebox.codeplex.com/releases/view/57382 Technorati Tags: release,new release,asp.net,mef,unity,sharepoint,wcf

Read the article

SQL Monitor’s data repository

- by Chris Lambrou

As one of the developers of SQL Monitor, I often get requests passed on by our support people from customers who are looking to dip into SQL Monitor’s own data repository, in order to pull out bits of information that they’re interested in. Since there’s clearly interest out there in playing around directly with the data repository, I thought I’d write some blog posts to start to describe how it all works. The hardest part for me is knowing where to begin, since the schema of the data repository is pretty big. Hmmm… I guess it’s tricky for anyone to write anything but the most trivial of queries against the data repository without understanding the hierarchy of monitored objects, so perhaps my first post should start there. I always imagine that whenever a customer fires up SSMS and starts to explore their SQL Monitor data repository database, they become immediately bewildered by the schema – that was certainly my experience when I did so for the first time. The following query shows the number of different object types in the data repository schema: SELECT type_desc, COUNT(*) AS [count] FROM sys.objects GROUP BY type_desc ORDER BY type_desc; type_desccount 1DEFAULT_CONSTRAINT63 2FOREIGN_KEY_CONSTRAINT181 3INTERNAL_TABLE3 4PRIMARY_KEY_CONSTRAINT190 5SERVICE_QUEUE3 6SQL_INLINE_TABLE_VALUED_FUNCTION381 7SQL_SCALAR_FUNCTION2 8SQL_STORED_PROCEDURE100 9SYSTEM_TABLE41 10UNIQUE_CONSTRAINT54 11USER_TABLE193 12VIEW124 With 193 tables, 124 views, 100 stored procedures and 381 table valued functions, that’s quite a hefty schema, and when you browse through it using SSMS, it can be a bit daunting at first. So, where to begin? Well, let’s narrow things down a bit and only look at the tables belonging to the data schema. That’s where all of the collected monitoring data is stored by SQL Monitor. The following query gives us the names of those tables: SELECT sch.name + '.' + obj.name AS [name] FROM sys.objects obj JOIN sys.schemas sch ON sch.schema_id = obj.schema_id WHERE obj.type_desc = 'USER_TABLE' AND sch.name = 'data' ORDER BY sch.name, obj.name; This query still returns 110 tables. I won’t show them all here, but let’s have a look at the first few of them: name 1data.Cluster_Keys 2data.Cluster_Machine_ClockSkew_UnstableSamples 3data.Cluster_Machine_Cluster_StableSamples 4data.Cluster_Machine_Keys 5data.Cluster_Machine_LogicalDisk_Capacity_StableSamples 6data.Cluster_Machine_LogicalDisk_Keys 7data.Cluster_Machine_LogicalDisk_Sightings 8data.Cluster_Machine_LogicalDisk_UnstableSamples 9data.Cluster_Machine_LogicalDisk_Volume_StableSamples 10data.Cluster_Machine_Memory_Capacity_StableSamples 11data.Cluster_Machine_Memory_UnstableSamples 12data.Cluster_Machine_Network_Capacity_StableSamples 13data.Cluster_Machine_Network_Keys 14data.Cluster_Machine_Network_Sightings 15data.Cluster_Machine_Network_UnstableSamples 16data.Cluster_Machine_OperatingSystem_StableSamples 17data.Cluster_Machine_Ping_UnstableSamples 18data.Cluster_Machine_Process_Instances 19data.Cluster_Machine_Process_Keys 20data.Cluster_Machine_Process_Owner_Instances 21data.Cluster_Machine_Process_Sightings 22data.Cluster_Machine_Process_UnstableSamples 23… There are two things I want to draw your attention to: The table names describe a hierarchy of the different types of object that are monitored by SQL Monitor (e.g. clusters, machines and disks). For each object type in the hierarchy, there are multiple tables, ending in the suffixes _Keys, _Sightings, _StableSamples and _UnstableSamples. Not every object type has a table for every suffix, but the _Keys suffix is especially important and a _Keys table does indeed exist for every object type. In fact, if we limit the query to return only those tables ending in _Keys, we reveal the full object hierarchy: SELECT sch.name + '.' + obj.name AS [name] FROM sys.objects obj JOIN sys.schemas sch ON sch.schema_id = obj.schema_id WHERE obj.type_desc = 'USER_TABLE' AND sch.name = 'data' AND obj.name LIKE '%_Keys' ORDER BY sch.name, obj.name; name 1data.Cluster_Keys 2data.Cluster_Machine_Keys 3data.Cluster_Machine_LogicalDisk_Keys 4data.Cluster_Machine_Network_Keys 5data.Cluster_Machine_Process_Keys 6data.Cluster_Machine_Services_Keys 7data.Cluster_ResourceGroup_Keys 8data.Cluster_ResourceGroup_Resource_Keys 9data.Cluster_SqlServer_Agent_Job_History_Keys 10data.Cluster_SqlServer_Agent_Job_Keys 11data.Cluster_SqlServer_Database_BackupType_Backup_Keys 12data.Cluster_SqlServer_Database_BackupType_Keys 13data.Cluster_SqlServer_Database_CustomMetric_Keys 14data.Cluster_SqlServer_Database_File_Keys 15data.Cluster_SqlServer_Database_Keys 16data.Cluster_SqlServer_Database_Table_Index_Keys 17data.Cluster_SqlServer_Database_Table_Keys 18data.Cluster_SqlServer_Error_Keys 19data.Cluster_SqlServer_Keys 20data.Cluster_SqlServer_Services_Keys 21data.Cluster_SqlServer_SqlProcess_Keys 22data.Cluster_SqlServer_TopQueries_Keys 23data.Cluster_SqlServer_Trace_Keys 24data.Group_Keys The full object type hierarchy looks like this: Cluster Machine LogicalDisk Network Process Services ResourceGroup Resource SqlServer Agent Job History Database BackupType Backup CustomMetric File Table Index Error Services SqlProcess TopQueries Trace Group Okay, but what about the individual objects themselves represented at each level in this hierarchy? Well that’s what the _Keys tables are for. This is probably best illustrated by way of a simple example – how can I query my own data repository to find the databases on my own PC for which monitoring data has been collected? Like this: SELECT clstr._Name AS cluster_name, srvr._Name AS instance_name, db._Name AS database_name FROM data.Cluster_SqlServer_Database_Keys db JOIN data.Cluster_SqlServer_Keys srvr ON db.ParentId = srvr.Id -- Note here how the parent of a Database is a Server JOIN data.Cluster_Keys clstr ON srvr.ParentId = clstr.Id -- Note here how the parent of a Server is a Cluster WHERE clstr._Name = 'dev-chrisl2' -- This is the hostname of my own PC ORDER BY clstr._Name, srvr._Name, db._Name; cluster_nameinstance_namedatabase_name 1dev-chrisl2SqlMonitorData 2dev-chrisl2master 3dev-chrisl2model 4dev-chrisl2msdb 5dev-chrisl2mssqlsystemresource 6dev-chrisl2tempdb 7dev-chrisl2sql2005SqlMonitorData 8dev-chrisl2sql2005TestDatabase 9dev-chrisl2sql2005master 10dev-chrisl2sql2005model 11dev-chrisl2sql2005msdb 12dev-chrisl2sql2005mssqlsystemresource 13dev-chrisl2sql2005tempdb 14dev-chrisl2sql2008SqlMonitorData 15dev-chrisl2sql2008master 16dev-chrisl2sql2008model 17dev-chrisl2sql2008msdb 18dev-chrisl2sql2008mssqlsystemresource 19dev-chrisl2sql2008tempdb These results show that I have three SQL Server instances on my machine (a default instance, one named sql2005 and one named sql2008), and each instance has the usual set of system databases, along with a database named SqlMonitorData. Basically, this is where I test SQL Monitor on different versions of SQL Server, when I’m developing. There are a few important things we can learn from this query: Each _Keys table has a column named Id. This is the primary key. Each _Keys table has a column named ParentId. A foreign key relationship is defined between each _Keys table and its parent _Keys table in the hierarchy. There are two exceptions to this, Cluster_Keys and Group_Keys, because clusters and groups live at the root level of the object hierarchy. Each _Keys table has a column named _Name. This is used to uniquely identify objects in the table within the scope of the same shared parent object. Actually, that last item isn’t always true. In some cases, the _Name column is actually called something else. For example, the data.Cluster_Machine_Services_Keys table has a column named _ServiceName instead of _Name (sorry for the inconsistency). In other cases, a name isn’t sufficient to uniquely identify an object. For example, right now my PC has multiple processes running, all sharing the same name, Chrome (one for each tab open in my web-browser). In such cases, multiple columns are used to uniquely identify an object within the scope of the same shared parent object. Well, that’s it for now. I’ve given you enough information for you to explore the _Keys tables to see how objects are stored in your own data repositories. In a future post, I’ll try to explain how monitoring data is stored for each object, using the _StableSamples and _UnstableSamples tables. If you have any questions about this post, or suggestions for future posts, just submit them in the comments section below.

Read the article

Currency Conversion in Oracle BI applications

- by Saurabh Verma

Authored by Vijay Aggarwal and Hichem Sellami A typical data warehouse contains Star and/or Snowflake schema, made up of Dimensions and Facts. The facts store various numerical information including amounts. Example; Order Amount, Invoice Amount etc. With the true global nature of business now-a-days, the end-users want to view the reports in their own currency or in global/common currency as defined by their business. This presents a unique opportunity in BI to provide the amounts in converted rates either by pre-storing or by doing on-the-fly conversions while displaying the reports to the users. Source Systems OBIA caters to various source systems like EBS, PSFT, Sebl, JDE, Fusion etc. Each source has its own unique and intricate ways of defining and storing currency data, doing currency conversions and presenting to the OLTP users. For example; EBS stores conversion rates between currencies which can be classified by conversion rates, like Corporate rate, Spot rate, Period rate etc. Siebel stores exchange rates by conversion rates like Daily. EBS/Fusion stores the conversion rates for each day, where as PSFT/Siebel store for a range of days. PSFT has Rate Multiplication Factor and Rate Division Factor and we need to calculate the Rate based on them, where as other Source systems store the Currency Exchange Rate directly. OBIA Design The data consolidation from various disparate source systems, poses the challenge to conform various currencies, rate types, exchange rates etc., and designing the best way to present the amounts to the users without affecting the performance. When consolidating the data for reporting in OBIA, we have designed the mechanisms in the Common Dimension, to allow users to report based on their required currencies. OBIA Facts store amounts in various currencies: Document Currency: This is the currency of the actual transaction. For a multinational company, this can be in various currencies. Local Currency: This is the base currency in which the accounting entries are recorded by the business. This is generally defined in the Ledger of the company. Global Currencies: OBIA provides five Global Currencies. Three are used across all modules. The last two are for CRM only. A Global currency is very useful when creating reports where the data is viewed enterprise-wide. Example; a US based multinational would want to see the reports in USD. The company will choose USD as one of the global currencies. OBIA allows users to define up-to five global currencies during the initial implementation. The term Currency Preference is used to designate the set of values: Document Currency, Local Currency, Global Currency 1, Global Currency 2, Global Currency 3; which are shared among all modules. There are four more currency preferences, specific to certain modules: Global Currency 4 (aka CRM Currency) and Global Currency 5 which are used in CRM; and Project Currency and Contract Currency, used in Project Analytics. When choosing Local Currency for Currency preference, the data will show in the currency of the Ledger (or Business Unit) in the prompt. So it is important to select one Ledger or Business Unit when viewing data in Local Currency. More on this can be found in the section: Toggling Currency Preferences in the Dashboard. Design Logic When extracting the fact data, the OOTB mappings extract and load the document amount, and the local amount in target tables. It also loads the exchange rates required to convert the document amount into the corresponding global amounts. If the source system only provides the document amount in the transaction, the extract mapping does a lookup to get the Local currency code, and the Local exchange rate. The Load mapping then uses the local currency code and rate to derive the local amount. The load mapping also fetches the Global Currencies and looks up the corresponding exchange rates. The lookup of exchange rates is done via the Exchange Rate Dimension provided as a Common/Conforming Dimension in OBIA. The Exchange Rate Dimension stores the exchange rates between various currencies for a date range and Rate Type. Two physical tables W_EXCH_RATE_G and W_GLOBAL_EXCH_RATE_G are used to provide the lookups and conversions between currencies. The data is loaded from the source system’s Ledger tables. W_EXCH_RATE_G stores the exchange rates between currencies with a date range. On the other hand, W_GLOBAL_EXCH_RATE_G stores the currency conversions between the document currency and the pre-defined five Global Currencies for each day. Based on the requirements, the fact mappings can decide and use one or both tables to do the conversion. Currency design in OBIA also taps into the MLS and Domain architecture, thus allowing the users to map the currencies to a universal Domain during the implementation time. This is especially important for companies deploying and using OBIA with multiple source adapters. Some Gotchas to Look for It is necessary to think through the currencies during the initial implementation. 1) Identify various types of currencies that are used by your business. Understand what will be your Local (or Base) and Documentation currency. Identify various global currencies that your users will want to look at the reports. This will be based on the global nature of your business. Changes to these currencies later in the project, while permitted, but may cause Full data loads and hence lost time. 2) If the user has a multi source system make sure that the Global Currencies and Global Rate Types chosen in Configuration Manager do have the corresponding source specific counterparts. In other words, make sure for every DW specific value chosen for Currency Code or Rate Type, there is a source Domain mapping already done. Technical Section This section will briefly mention the technical scenarios employed in the OBIA adaptors to extract data from each source system. In OBIA, we have two main tables which store the Currency Rate information as explained in previous sections. W_EXCH_RATE_G and W_GLOBAL_EXCH_RATE_G are the two tables. W_EXCH_RATE_G stores all the Currency Conversions present in the source system. It captures data for a Date Range. W_GLOBAL_EXCH_RATE_G has Global Currency Conversions stored at a Daily level. However the challenge here is to store all the 5 Global Currency Exchange Rates in a single record for each From Currency. Let’s voyage further into the Source System Extraction logic for each of these tables and understand the flow briefly. EBS: In EBS, we have Currency Data stored in GL_DAILY_RATES table. As the name indicates GL_DAILY_RATES EBS table has data at a daily level. However in our warehouse we store the data with a Date Range and insert a new range record only when the Exchange Rate changes for a particular From Currency, To Currency and Rate Type. Below are the main logical steps that we employ in this process. (Incremental Flow only) – Cleanup the data in W_EXCH_RATE_G. Delete the records which have Start Date > minimum conversion date Update the End Date of the existing records. Compress the daily data from GL_DAILY_RATES table into Range Records. Incremental map uses $$XRATE_UPD_NUM_DAY as an extra parameter. Generate Previous Rate, Previous Date and Next Date for each of the Daily record from the OLTP. Filter out the records which have Conversion Rate same as Previous Rates or if the Conversion Date lies within a single day range. Mark the records as ‘Keep’ and ‘Filter’ and also get the final End Date for the single Range record (Unique Combination of From Date, To Date, Rate and Conversion Date). Filter the records marked as ‘Filter’ in the INFA map. The above steps will load W_EXCH_RATE_GS. Step 0 updates/deletes W_EXCH_RATE_G directly. SIL map will then insert/update the GS data into W_EXCH_RATE_G. These steps convert the daily records in GL_DAILY_RATES to Range records in W_EXCH_RATE_G. We do not need such special logic for loading W_GLOBAL_EXCH_RATE_G. This is a table where we store data at a Daily Granular Level. However we need to pivot the data because the data present in multiple rows in source tables needs to be stored in different columns of the same row in DW. We use GROUP BY and CASE logic to achieve this. Fusion: Fusion has extraction logic very similar to EBS. The only difference is that the Cleanup logic that was mentioned in step 0 above does not use $$XRATE_UPD_NUM_DAY parameter. In Fusion we bring all the Exchange Rates in Incremental as well and do the cleanup. The SIL then takes care of Insert/Updates accordingly. PeopleSoft:PeopleSoft does not have From Date and To Date explicitly in the Source tables. Let’s look at an example. Please note that this is achieved from PS1 onwards only. 1 Jan 2010 – USD to INR – 45 31 Jan 2010 – USD to INR – 46 PSFT stores records in above fashion. This means that Exchange Rate of 45 for USD to INR is applicable for 1 Jan 2010 to 30 Jan 2010. We need to store data in this fashion in DW. Also PSFT has Exchange Rate stored as RATE_MULT and RATE_DIV. We need to do a RATE_MULT/RATE_DIV to get the correct Exchange Rate. We generate From Date and To Date while extracting data from source and this has certain assumptions: If a record gets updated/inserted in the source, it will be extracted in incremental. Also if this updated/inserted record is between other dates, then we also extract the preceding and succeeding records (based on dates) of this record. This is required because we need to generate a range record and we have 3 records whose ranges have changed. Taking the same example as above, if there is a new record which gets inserted on 15 Jan 2010; the new ranges are 1 Jan to 14 Jan, 15 Jan to 30 Jan and 31 Jan to Next available date. Even though 1 Jan record and 31 Jan have not changed, we will still extract them because the range is affected. Similar logic is used for Global Exchange Rate Extraction. We create the Range records and get it into a Temporary table. Then we join to Day Dimension, create individual records and pivot the data to get the 5 Global Exchange Rates for each From Currency, Date and Rate Type. Siebel: Siebel Facts are dependent on Global Exchange Rates heavily and almost none of them really use individual Exchange Rates. In other words, W_GLOBAL_EXCH_RATE_G is the main table used in Siebel from PS1 release onwards. As of January 2002, the Euro Triangulation method for converting between currencies belonging to EMU members is not needed for present and future currency exchanges. However, the method is still available in Siebel applications, as are the old currencies, so that historical data can be maintained accurately. The following description applies only to historical data needing conversion prior to the 2002 switch to the Euro for the EMU member countries. If a country is a member of the European Monetary Union (EMU), you should convert its currency to other currencies through the Euro. This is called triangulation, and it is used whenever either currency being converted has EMU Triangulation checked. Due to this, there are multiple extraction flows in SEBL ie. EUR to EMU, EUR to NonEMU, EUR to DMC and so on. We load W_EXCH_RATE_G through multiple flows with these data. This has been kept same as previous versions of OBIA. W_GLOBAL_EXCH_RATE_G being a new table does not have such needs. However SEBL does not have From Date and To Date columns in the Source tables similar to PSFT. We use similar extraction logic as explained in PSFT section for SEBL as well. What if all 5 Global Currencies configured are same? As mentioned in previous sections, from PS1 onwards we store Global Exchange Rates in W_GLOBAL_EXCH_RATE_G table. The extraction logic for this table involves Pivoting data from multiple rows into a single row with 5 Global Exchange Rates in 5 columns. As mentioned in previous sections, we use CASE and GROUP BY functions to achieve this. This approach poses a unique problem when all the 5 Global Currencies Chosen are same. For example – If the user configures all 5 Global Currencies as ‘USD’ then the extract logic will not be able to generate a record for From Currency=USD. This is because, not all Source Systems will have a USD->USD conversion record. We have _Generated mappings to take care of this case. We generate a record with Conversion Rate=1 for such cases. Reusable Lookups Before PS1, we had a Mapplet for Currency Conversions. In PS1, we only have reusable Lookups- LKP_W_EXCH_RATE_G and LKP_W_GLOBAL_EXCH_RATE_G. These lookups have another layer of logic so that all the lookup conditions are met when they are used in various Fact Mappings. Any user who would want to do a LKP on W_EXCH_RATE_G or W_GLOBAL_EXCH_RATE_G should and must use these Lookups. A direct join or Lookup on the tables might lead to wrong data being returned. Changing Currency preferences in the Dashboard: In the 796x series, all amount metrics in OBIA were showing the Global1 amount. The customer needed to change the metric definitions to show them in another Currency preference. Project Analytics started supporting currency preferences since 7.9.6 release though, and it published a Tech note for other module customers to add toggling between currency preferences to the solution. List of Currency Preferences Starting from 11.1.1.x release, the BI Platform added a new feature to support multiple currencies. The new session variable (PREFERRED_CURRENCY) is populated through a newly introduced currency prompt. This prompt can take its values from the xml file: userpref_currencies_OBIA.xml, which is hosted in the BI Server installation folder, under :< home>\instances\instance1\config\OracleBIPresentationServicesComponent\coreapplication_obips1\userpref_currencies.xml This file contains the list of currency preferences, like“Local Currency”, “Global Currency 1”,…which customers can also rename to give them more meaningful business names. There are two options for showing the list of currency preferences to the user in the dashboard: Static and Dynamic. In Static mode, all users will see the full list as in the user preference currencies file. In the Dynamic mode, the list shown in the currency prompt drop down is a result of a dynamic query specified in the same file. Customers can build some security into the rpd, so the list of currency preferences will be based on the user roles…BI Applications built a subject area: “Dynamic Currency Preference” to run this query, and give every user only the list of currency preferences required by his application roles. Adding Currency to an Amount Field When the user selects one of the items from the currency prompt, all the amounts in that page will show in the Currency corresponding to that preference. For example, if the user selects “Global Currency1” from the prompt, all data will be showing in Global Currency 1 as specified in the Configuration Manager. If the user select “Local Currency”, all amount fields will show in the Currency of the Business Unit selected in the BU filter of the same page. If there is no particular Business Unit selected in that filter, and the data selected by the query contains amounts in more than one currency (for example one BU has USD as a functional currency, the other has EUR as functional currency), then subtotals will not be available (cannot add USD and EUR amounts in one field), and depending on the set up (see next paragraph), the user may receive an error. There are two ways to add the Currency field to an amount metric: In the form of currency code, like USD, EUR…For this the user needs to add the field “Apps Common Currency Code” to the report. This field is in every subject area, usually under the table “Currency Tag” or “Currency Code”… In the form of currency symbol ($ for USD, € for EUR,…) For this, the user needs to format the amount metrics in the report as a currency column, by specifying the currency tag column in the Column Properties option in Column Actions drop down list. Typically this column should be the “BI Common Currency Code” available in every subject area. Select Column Properties option in the Edit list of a metric. In the Data Format tab, select Custom as Treat Number As. Enter the following syntax under Custom Number Format: [$:currencyTagColumn=Subjectarea.table.column] Where Column is the “BI Common Currency Code” defined to take the currency code value based on the currency preference chosen by the user in the Currency preference prompt.

Read the article

re-enabling a table for mysql replication

- by jessieE

We were able to setup mysql master-slave replication with the following version on both master/slave: mysqld Ver 5.5.28-29.1-log for Linux on x86_64 (Percona Server (GPL), Release 29.1) One day, we noticed that replication has stopped, we tried skipping over the entries that caused the replication errors. The errors persisted so we decided to skip replication for the 4 problematic tables. The slave has now caught up with the master except for the 4 tables. What is the best way to enable replication again for the 4 tables? This is what I have in mind but I don't know if it will work: 1) Modify slave config to enable replication again for the 4 tables 2) stop slave replication 3) for each of the 4 tables, use pt-table-sync --execute --verbose --print --sync-to-master h=localhost,D=mydb,t=mytable 4) restart slave database to reload replication configuration 5) start slave replication

Read the article

BPM 11g and Human Workflow Shadow Rows by Adam Desjardin

- by JuergenKress

During the OFM Forum last week, there were a few discussions around the relationship between the Human Workflow (WF_TASK*) tables in the SOA_INFRA schema and BPMN processes. It is important to know how these are related because it can have a performance impact. We have seen this performance issue several times when BPMN processes are used to model high volume system integrations without knowing all of the implications of using BPMN in this pattern. Most people assume that BPMN instances and their related data are stored in the CUBE_*, DLV_*, and AUDIT_* tables in the same way that BPEL instances are stored, with additional data in the BPM_* tables as well. The group of tables that is not usually considered though is the WF* tables that are used for Human Workflow. The WFTASK table is used by all BPMN processes in order to support features such as process level comments and attachments, whether those features are currently used in the process or not. For a standard human task that is created from a BPMN process, the following data is stored in the WFTASK table: One row per human task that is created The COMPONENTTYPE = "Workflow" TASKDEFINITIONID = Human Task ID (partition/CompositeName!Version/TaskName) ACCESSKEY = NULL Read the complete article here. SOA & BPM Partner Community For regular information on Oracle SOA Suite become a member in the SOA & BPM Partner Community for registration please visit www.oracle.com/goto/emea/soa (OPN account required) If you need support with your account please contact the Oracle Partner Business Center. Blog Twitter LinkedIn Facebook Wiki

Read the article

Why is String Templating Better Than String Concatenation from an Engineering Perspective?

- by stephen

I once read (I think it was in "Programming Pearls") that one should use templates instead of building the string through the use of concatenation. For example, consider the template below (using C# razor library) <in a properties file> Browser Capabilities Type = @Model.Type Name = @Model.Browser Version = @Model.Version Supports Frames = @Model.Frames Supports Tables = @Model.Tables Supports Cookies = @Model.Cookies Supports VBScript = @Model.VBScript Supports Java Applets = @Model.JavaApplets Supports ActiveX Controls = @Model.ActiveXControls and later, in a separate code file private void Button1_Click(object sender, System.EventArgs e) { BrowserInfoTemplate = Properties.Resources.browserInfoTemplate; // see above string browserInfo = RazorEngine.Razor.Parse(BrowserInfoTemplate, browser); ... } From a software engineering perspective, how is this better than an equivalent string concatentation, like below: private void Button1_Click(object sender, System.EventArgs e) { System.Web.HttpBrowserCapabilities browser = Request.Browser; string s = "Browser Capabilities\n" + "Type = " + browser.Type + "\n" + "Name = " + browser.Browser + "\n" + "Version = " + browser.Version + "\n" + "Supports Frames = " + browser.Frames + "\n" + "Supports Tables = " + browser.Tables + "\n" + "Supports Cookies = " + browser.Cookies + "\n" + "Supports VBScript = " + browser.VBScript + "\n" + "Supports JavaScript = " + browser.EcmaScriptVersion.ToString() + "\n" + "Supports Java Applets = " + browser.JavaApplets + "\n" + "Supports ActiveX Controls = " + browser.ActiveXControls + "\n" ... }

Read the article

Entity Framework with large systems - how to divide models?

- by jkohlhepp

I'm working with a SQL Server database with 1000+ tables, another few hundred views, and several thousand stored procedures. We are looking to start using Entity Framework for our newer projects, and we are working on our strategy for doing so. The thing I'm hung up on is how best to split the tables into different models (EDMX or DbContext if we go code first). I can think of a few strategies right off the bat: Split by schema We have our tables split across probably a dozen schemas. We could do one model per schema. This isn't perfect, though, because dbo still ends up being very large, with 500+ tables / views. Another problem is that certain units of work will end up having to do transactions that span multiple models, which adds to complexity, although I assume EF makes this fairly straightforward. Split by intent Instead of worrying about schemas, split the models by intent. So we'll have different models for each application, or project, or module, or screen, depending on how granular we want to get. The problem I see with this is that there are certain tables that inevitably have to be used in every case, such as User or AuditHistory. Do we add those to every model (violates DRY I think), or are those in a separate model that is used by every project? Don't split at all - one giant model This is obviously simple from a development perspective but from my research and my intuition this seems like it could perform terribly, both at design time, compile time, and possibly run time. What is the best practice for using EF against such a large database? Specifically what strategies do people use in designing models against this volume of DB objects? Are there options that I'm not thinking of that work better than what I have above? Also, is this a problem in other ORMs such as NHibernate? If so have they come up with any better solutions than EF?

Read the article

Big Data – Data Mining with Hive – What is Hive? – What is HiveQL (HQL)? – Day 15 of 21

- by Pinal Dave

In yesterday’s blog post we learned the importance of the operational database in Big Data Story. In this article we will understand what is Hive and HQL in Big Data Story. Yahoo started working on PIG (we will understand that in the next blog post) for their application deployment on Hadoop. The goal of Yahoo to manage their unstructured data. Similarly Facebook started deploying their warehouse solutions on Hadoop which has resulted in HIVE. The reason for going with HIVE is because the traditional warehousing solutions are getting very expensive. What is HIVE? Hive is a datawarehouseing infrastructure for Hadoop. The primary responsibility is to provide data summarization, query and analysis. It supports analysis of large datasets stored in Hadoop’s HDFS as well as on the Amazon S3 filesystem. The best part of HIVE is that it supports SQL-Like access to structured data which is known as HiveQL (or HQL) as well as big data analysis with the help of MapReduce. Hive is not built to get a quick response to queries but it it is built for data mining applications. Data mining applications can take from several minutes to several hours to analysis the data and HIVE is primarily used there. HIVE Organization The data are organized in three different formats in HIVE. Tables: They are very similar to RDBMS tables and contains rows and tables. Hive is just layered over the Hadoop File System (HDFS), hence tables are directly mapped to directories of the filesystems. It also supports tables stored in other native file systems. Partitions: Hive tables can have more than one partition. They are mapped to subdirectories and file systems as well. Buckets: In Hive data may be divided into buckets. Buckets are stored as files in partition in the underlying file system. Hive also has metastore which stores all the metadata. It is a relational database containing various information related to Hive Schema (column types, owners, key-value data, statistics etc.). We can use MySQL database over here. What is HiveSQL (HQL)? Hive query language provides the basic SQL like operations. Here are few of the tasks which HQL can do easily. Create and manage tables and partitions Support various Relational, Arithmetic and Logical Operators Evaluate functions Download the contents of a table to a local directory or result of queries to HDFS directory Here is the example of the HQL Query: SELECT upper(name), salesprice FROM sales; SELECT category, count(1) FROM products GROUP BY category; When you look at the above query, you can see they are very similar to SQL like queries. Tomorrow In tomorrow’s blog post we will discuss about very important components of the Big Data Ecosystem – Pig. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Big Data, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL

Read the article

Blueprints for Oracle NoSQL Database

- by dan.mcclary

I think that some of the most interesting analytic problems are graph problems. I'm always interested in new ways to store and access graphs. As such, I really like the work being done by Tinkerpop to create Open Source Software to make property graphs more accessible over a wide variety of datastores. Since key-value stores like Oracle NoSQL Database are well-suited to storing property graphs, I decided to extend the Blueprints API to work with it. Below I'll discuss some of the implementation details, but you can check out the finished product here: http://github.com/dwmclary/blueprints-oracle-nosqldb. What's in a Property Graph? In the most general sense, a graph is just a collection of vertices and edges. Vertices and edges can have properties: weights, names, or any number of other traits. In an undirected graph, edges connect vertices without direction. A directed graph specifies that all edges have a head and a tail --- a direction. A multi-graph allows multiple edges to connect two vertices. A "property graph" encompasses all of these traits. Key-Value Stores for Property Graphs Key-Value stores like Oracle NoSQL Database tend to be ideal for implementing property graphs. First, if any vertex or edge can have any number of traits, we can treat it as a hash map. For example: Vertex["name"] = "Mary" Vertex["age"] = 28 Vertex["ID"] = 12345 and so on. This is a natural key-value relationship: the key "name" maps to the value "Mary." Moreover if we maintain two hash maps, one for vertex objects and one for edge objects, we've essentially captured the graph. As such, any scalable key-value store is fertile ground for planting graphs. Oracle NoSQL Database as a Scalable Graph Database While Oracle NoSQL Database offers useful features like tunable consistency, what lends it to storing property graphs is the storage guarantees around its key structure. Keys in Oracle NoSQL Database are divided into two parts: a major key and a minor key. The storage guarantee is simple. Major keys will be distributed across storage nodes, which could encompass a large number of servers. However, all minor keys which are children of a given major key are guaranteed to be stored on the same storage node. For example, the vertices: /Personnel/Vertex/1 and /Personnel/Vertex/2 May be stored on different servers, but /Personnel/Vertex/1-/name and /Personnel/Vertex/1-/age will always be on the same server. This means that we can structure our graph database such that retrieving all the properties for a vertex or edge requires I/O from only a single storage node. Moreover, Oracle NoSQL Database provides a storeIterator which allows us to store a huge number of vertices and edges in a scalable fashion. By storing the vertices and edges as major keys, we guarantee that they are distributed evenly across all storage nodes. At the same time we can use a partial major key to iterate over all the vertices or edges (e.g. we search over /Personnel/Vertex to iterate over all vertices). Fork It! The Blueprints API and Oracle NoSQL Database present a great way to get started using a scalable key-value database to store and access graph data. However, a graph store isn't useful without a good graph to work on. I encourage you to fork or pull the repository, store some data, and try using Gremlin or any other language to explore.

Read the article

Has anyone tried using Google Wave for collaborative open-source projects?

- by Platinum Azure

The title says it all, really... Is Google Wave remotely feasible for open-source code projects, and has anyone tried it at all? I'm thinking not at this point, but I'm curious as to the thoughts of the community.

Read the article

Alternatives for comparing data from different databases

- by Alex

I have two huge tables on separate databases. One of them has the information of all the SMS that passed through the company's servers while the other one has the information of the actual billing of those SMS. My job is to compare samples of both of these tables (for example, the records between 1 and 2 pm) to see if there are any differences: SMS that were sent but not charged to the user for whatever reason that may be happening. The columns I will be using to compare are the remitent's phone number and the exact date the SMS was sent. An issue here is that dates usually are the same on both sides, but in many cases differ by 1 or 2 seconds. I have, so far, two alternatives to do this: (PL/SQL) Create two tables where i'm going to temporarily store all the records of that 1hour sample. One for each of the main tables. Then, for each distinct phone number, select the time of every SMS sent from that phone from both my temporary tables and start comparing one by one using cursors. In this case, the procedure would be ran on the server where one of the sources is so the contents of the other one would be looked up using a dblink. (sqlplus + c++) Instead of storing the 1hour samples in new tables, output the query to a text file. I will have two text files, one for each source. Then, open the first file and load all of it's content on a hash_map (key-value) using c++, where the key will be the phone number and the value a list of times of SMS sent from that phone. Finally, open the second file, grab each line (in this format: numberX timeX), look for numberX's entry on the hash_map (wich will be a list of times) and then check if timeX is on that list. If it isn't, save it somewhere to finally store it on a "uncharged" table (this would also be the final step on case 1) My main concern is efficiency. These samples have about 2 million records on each source, so just grabbing one record on one side and looking it up on the other would not be possible. That's the reason I wanted to use hash_maps Which do you think is a better option?

Read the article

Oracle 10.2.0.1 --> 10.2.0.4 patchset errors on Advanced Queuing tables. Serious or not?

- by hurfdurf

We're running Oracle on RHEL 5.4 64-bit. We recently did an upgrade from 10.2.0.1 to 10.2.0.4. Many errors were generated during the upgrade (sample listed below from trace.log) but during application testing afterward everything seemed fine (clean EXP, inserts, updates, deletes, etc.). The errors look like they are all related to Advanced Queuing tables and views. We are not using replication at all, this is a simple single instance db. ORA-24002: QUEUE_TABLE SYS.AQ_EVENT_TABLE does not exist ORA-24032: object AQ$_AQ_SRVNTFN_TABLE_T exists, index could not be created ORA-24032: object AQ$_ALERT_QT_S exists, index could not be created for queue ORA-06512: at "SYS.DBMS_AQADM_SYSCALLS", line 117 ORA-06512: at "SYS.DBMS_AQADM_SYS", line 5116 Is this worth worrying about, and if so, how do I go about cleaning up/recreating the corrupted and/or missing objects?

Read the article

SQL Down Under Show 51 - Guest Conor Cunningham - Now online

- by Greg Low

Late last night I got to record an interview with Conor Cunningham.Most people that know Conor have come across him as the product team wizard that knows so much about query processing and optimization in SQL Server. Conor is currently spending quite a lot of time working on Windows Azure SQL Database, which we used to know as SQL Azure. I'm still trying to think of a good way to say "WASD". I suppose I'll pronounce it like "wassid". Windows Azure SQL Reporting is easier. I think it just needs to be pronounced like "wazza" with a very Australian accent.In the show, we've spent time on the current state of the platform, on dispelling a number of common misbeliefs about the product, and hopefully on answering most of the common questions that seem to get asked about it. We then ventured into Federations, Data Sync, and Reporting.You'll find the show (and previous shows) here: http://www.sqldownunder.com/Resources/Podcast.aspxEnjoy!PS: For those that like transcripts, we've got the process for producing them much improved now and the transcript should also be up within a few days.

Read the article

How to securely store and update backup on remote server via ssh/rsync

- by Sergey P. aka azure

I have about 200 Gb of pictures (let's say about 1 mb/file, 200k files) on my desktop. I have access (including root access) to remote linux server. And I want to have updateable backup of my pictures on remote server. rsync seems to be the right tool for such kind of job. But other people also have access (including root access) to this server and I want to keep my pictures private. So the question is: what is the best way to keep private files on remote "shared" linux server securely?

Search Results

Search found 14693 results on 588 pages for 'azure storage tables'.

Page 158/588 | < Previous Page | 154 155 156 157 158 159 160 161 162 163 164 165 | Next Page >

- by JobGovernor

- by significance

- by Adeesh Fulay

- by Gena Verdel

- by Amir Javanshir

- by user12244672

- by mortezam

- by Paul White NZ

- by pinaldave

- by Vik Kumar

- by Mohammad Ashraful Alam

- by Chris Lambrou

- by Saurabh Verma

- by jessieE

- by JuergenKress

- by stephen

- by jkohlhepp

- by Pinal Dave

- by dan.mcclary

- by Platinum Azure

- by Alex

- by hurfdurf

- by Greg Low

- by Sergey P. aka azure

< Previous Page | 154 155 156 157 158 159 160 161 162 163 164 165 | Next Page >