Search Results

Search found 41365 results on 1655 pages for 'oracle database 11g advan'.

Page 361/1655 | < Previous Page | 357 358 359 360 361 362 363 364 365 366 367 368 | Next Page >

New Packt Books: APEX & JRockit

- by [email protected]

I have received these 2 ebooks from Packt Publkishing and I am currently reviewing them. Both of them look great so far. Oracle Application Express 3.2 - The Essentials and More First of all, I have to mention that I am new to APEX. I was interested on this product which is a development tool for Web applications on the Oracle Database. As I support JDeveloper and ADF, which are products that work very closely with the Oracle Database and are a rapid development tool as well, it is always interesting and useful to know complementary tools. APEX looks very useful and the book includes many working examples. A more complete review of this book is coming soon. Further information about this book can be seen at Packt. Oracle JRockit: The Definitive Guide Many of our Oracle Coherence customers run their caches and clusters using JRockit. This JVM has helped us to solve lots of Service Requests. It is a really reliable, fast and stable JVM. It works great on both development and production environments with big amounts of data, concurrency, multi-threading and many other factors that can make a JVM crash. I must also mention JRockit Mission Control (JRMC), which is a great tool for management and monitoring. I really recommend it. As a matter of fact, some months ago, I created a document entitled "How to Monitor Coherence-Based Applications using JRockit Mission Control" (Doc Id 961617.1) on My Oracle Support. Also, the JRockit Runtime Analyzer (JRA) and it successor of newer versions, the JRockit Flight Recorder (JFR) are deeply reviewed. This book contains very clear and complete information about all this and more. I will post an entry with a more complete review soon (and will probably post an entry about Coherence monitoring with JRMC soon too). Further information about this book can be seen at Packt.

Read the article
BI Applications 7.9.6.3 and EBS 12.1.3 Vision: Integrated Demo Environments

- by Mike.Hallett(at)Oracle-BI&EPM

If you need a combined BI-Applications + eBusiness Suite Applications demonstration environment, or for proof of concept work for your customers, then these versions of images on Oracle Virtual Box are now available for partners to download and use. To get access to these images, Partners must be OPN members, specialised in OBI or BI-Apps. This is an integrated Demo/Test Drive/POC/Self Enablement environment including two separate images (in English) representing the entire Oracle Stack – Applications, Middleware, Database, Operating System and Virtual Machine. Minimum Hardware requirements for each image to run separately 4GB RAM Minimum Hardware requirements for both images to run concurrently 8GB RAM Dual CPU 64 bit OS BI Applications 7.9.6.3 Linux based and running on Oracle Virtual Box and compatible with OVM Image Content: BI Application Analytics demo data extracted from EBS 12.1.3 Vision for Financials and HR using EBS 12.1.3 Vision (image supplied) Built Integration to EBS 12.1.3 Vision image (provided). Fully functional BI Applications 7.9.6.3 software install and configuration Image can be connected to load any data from any other compatible source system. BI Apps Demo data is based on OOTB EBS Vision 12.1.3 Configured to run BI Apps data load for all other modules of EBS 12.1.3 Vision. Includes OBIEE Sample demonstration content Documented scripts for running presentations, demonstrations and Test Drives Image Size: 34GB zip, 84GB unzip. Min Hardware 4GB RAM EBS Vision 12.1.3 Linux based and running on Oracle Virtual Box and compatible with OVM Image Content: eBusiness Suite (EBS) Applications Vision 12.1.3 Standard Vision instance with all given setups, configurations and data Source system for BI Apps 7.9.6.3 Image Size: 76GB zip, 300GB unzip. Min Hardware 4GB RAM Distribution: The Virtual Box images are posted on an external FTP server @ BI Applications 7.9.6.3 EBS12.1.3 To download, Partners need to request the current password to access the images. To request the current ftp.oracle.com password and the password required to unzip the images, please email Marek Winiarski Support Contact = Marek Winiarski: Oracle Partner Solution Consultant

Read the article
NRF Week - Disney Store Tour

- by sarah.taylor(at)oracle.com

Disney has created a real buzz at this year's NRF event. Yesterday morning we began the Oracle Retail Exchange program with a visit to the flagship Disney store in Times Square. Additionally Oracle made a key announcement with Disney on Oracle Retail's Point of Sale implementation in 330 stores worldwide. Today Disney's Steve Finney gave a super session on The Magic of Disney at the NRF Big Show. We also saw Disney making an exclusive news announcement about their plans for Global store openings at the Oracle trade show stand - with a little help from Mickey and Minnie Mouse. Disney Stores have been entirely reinvented since the company in 2008 took ownership after previously franchising the retail arm of the business. They have subsequently been a strong Oracle partner and technology has played a key role in their re imagination of the store environment. The new Imagination stores have a 20% higher footfall and margins are up 25%. The Disney brand is synonymous with magical and memorable experiences for children of all ages. The company is achieving a unique retail experience that delights children and shareholders alike! Technology is a key pillar in helping to deliver on both a strong operating model and a unique customer experience - the best thirty minutes in a child's day is their aim. Steve Finney this morning said their technology has to be as reliable as a theme park ride. Store experiences are much more enjoyable when there are short waiting times and children can interact with their favourite characters through magic mirrors, mobile point of sale, touch screens and custom animations that are digitally transmitted to stores globally. The Oracle Retail Point of Sale with iPad touch screens reduces check out times, stores customer data, ensures that promotions are delivered accurately and reduces losses. This means higher levels of guest conversion, increased availability and convenience for customers who want to check availability at other locations. Disney is a pioneer. At NRF's 100th show, we had the privilege of learning from a retailer using technology as a creative force to drive their business forward.

Read the article
Use Enterprise Manager Cloud Control to monitor OBIEE 11.1.1.7.x Dashboards

- by Torben Hein -Oracle

(in via Senthil ) If your OBIEE 11.1.1.7.x is set up in the following way: The OBIEE repository is an Oracle Database and is set up as a data warehouse Usage tracking is enabled in OBIEE. ( For information on how to enable usage tracking in OBIEE, refer to the following link: Setting Up Usage Tracking in Oracle BI 11g ) The OBIEE instance is discovered in EM Cloud Control. ( For information on how to discover an OBIEE instance in Cloud Control, refer to the following link: Discovering Oracle Business Intelligence Instance and Oracle Essbase Targets ) The OBIEE repository is discovered in EM Cloud Control. ( For information on how to discover an Oracle database, refer to the following link: Discovering, Promoting, and Adding Database Targets ) then we've got news for you: KM Article: OBIEE 11g: How To Diagnose Slowly Performing Dashboards using Enterprise Manager Cloud Control (Doc ID 1668236.1) takes you step by step through monitoring the SQL query performance behind your OBIEE dashboard. This Diagnostic approach ... .. will help you piece together information on BI dashboard performance, e.g. processing time from the different layers of the BI system including the repository. .. should enable you to get to the bottom of slow dashboards by using the wealth of information available in EM Cloud Control on OBIEE and Oracle DB. .. will NOT fix any performance issues on its own, but will help identify bottlenecks while processing dashboard requests. (layout and post: Torben, authorized: Lia)

Read the article
Patch Set Update: Hyperion Essbase 11.1.2.3.502

- by Paul Anderson -Oracle

A Patch Set Update (PSU) for Oracle Hyperion Essbase 11.1.2.3.x . The PSU downloads are from the My Oracle Support | Patches & Updates section. Hyperion Essbase Server 11.1.2.3.502 Patch 18950479: Essbase Server Hyperion Essbase Client 11.2.3.502 Patch 18950453: Essbase RTC Patch 18950474: Essbase Client Patch 18950482: Essbase MSI Hyperion Essbase Administration Services (EAS) 11.1.2.3.502 Patch 17767626: Essbase Server Patch 17767628: Essbase Console MSI Hyperion Analytic Provider Services (APS) 11.1.2.3.502 Patch 18907738: APS Services Hyperion Essbase Studio 11.1.2.3.502 Patch 18907980: Essbase Studio Server Patch 18907987: Essbase Studio Console MSI Refer to the Readme file prior to proceeding with this PSU implementation for important information that includes a full list of the defects fixed, along with additional support information, prerequisites, details for applying patch and troubleshooting FAQ's. It is important to ensure that the requirements and support paths to this patch are met as outlined within the Readme file. The Readme file is available from the Patches & Updates download screen. To locate the latest Essbase Patch Sets and Patch Set Updates at anytime visit the My Oracle Support (MOS) Knowledge Article: Available Patch Sets and Patch Set Updates for Oracle Hyperion Essbase Doc ID 1396084.1 Why not share your experience about installing this patch ... In the MOS | Patches & Updates screen simply click the "Start a Discussion" and submit your review. The patch install reviews and other patch related information is available within the My Oracle Support Communities. Visit the Oracle Hyperion EPM sub-space: Hyperion Patch Reviews

Read the article
EM12c: Using the LIST verb in emcli

- by SubinDaniVarughese

Many of us who use EM CLI to write scripts and automate our daily tasks should not miss out on the new list verb released with Oracle Enterprise Manager 12.1.0.3.0. The combination of list and Jython based scripting support in EM CLI makes it easier to achieve automation for complex tasks with just a few lines of code. Before I jump into a script, let me highlight the key attributes of the list verb and why it’s simply excellent! 1. Multiple resources under a single verb:A resource can be set of users or targets, etc. Using the list verb, you can retrieve information about a resource from the repository database.Here is an example which retrieves the list of administrators within EM.Standard mode$ emcli list -resource="Administrators" Interactive modeemcli>list(resource="Administrators")The output will be the same as standard mode.Standard mode$ emcli @myAdmin.pyEnter password : ******The output will be the same as standard mode.Contents of myAdmin.py scriptlogin()print list(resource="Administrators",jsonout=False).out()To get a list of all available resources use$ emcli list -helpWith every release of EM, more resources are being added to the list verb. If you have a resource which you feel would be valuable then go ahead and contact Oracle Support to log an enhancement request with product development. Be sure to say how the resource is going to help improve your daily tasks. 2. Consistent Formatting:It is possible to format the output of any resource consistently using these options: –column This option is used to specify which columns should be shown in the output. Here is an example which shows the list of administrators and their account status$ emcli list -resource="Administrators" -columns="USER_NAME,REPOS_ACCOUNT_STATUS" To get a list of columns in a resource use:$ emcli list -resource="Administrators" -help You can also specify the width of the each column. For example, here the column width of user_type is set to 20 and department to 30. $ emcli list -resource=Administrators -columns="USER_NAME,USER_TYPE:20,COST_CENTER,CONTACT,DEPARTMENT:30"This is useful if your terminal is too small or you need to fine tune a list of specific columns for your quick use or improved readability. –colsize This option is used to resize column widths.Here is the same example as above, but using -colsize to define the width of user_type to 20 and department to 30.$ emcli list -resource=Administrators -columns="USER_NAME,USER_TYPE,COST_CENTER,CONTACT,DEPARTMENT" -colsize="USER_TYPE:20,DEPARTMENT:30" The existing standard EMCLI formatting options are also available in list verb. They are: -format="name:pretty" | -format="name:script” | -format="name:csv" | -noheader | -scriptThere are so many uses depending on your needs. Have a look at the resources and columns in each resource. Refer to the EMCLI book in EM documentation for more information.3. Search:Using the -search option in the list verb makes it is possible to search for a specific row in a specific column within a resource. This is similar to the sqlplus where clause. The following operators are supported: = != > < >= <= like is (Must be followed by null or not null)Here is an example which searches for all EM administrators in the marketing department located in the USA.$emcli list -resource="Administrators" -search="DEPARTMENT ='Marketing'" -search="LOCATION='USA'" Here is another example which shows all the named credentials created since a specific date. $emcli list -resource=NamedCredentials -search="CredCreatedDate > '11-Nov-2013 12:37:20 PM'"Note that the timestamp has to be in the format DD-MON-YYYY HH:MI:SS AM/PM Some resources need a bind variable to be passed to get output. A bind variable is created in the resource and then referenced in the command. For example, this command will list all the default preferred credentials for target type oracle_database.Here is an example$ emcli list -resource="PreferredCredentialsDefault" -bind="TargetType='oracle_database'" -colsize="SetName:15,TargetType:15" You can provide multiple bind variables. To verify if a column is searchable or requires a bind variable, use the –help option. Here is an example:$ emcli list -resource="PreferredCredentialsDefault" -help 4. Secure accessWhen list verb collects the data, it only displays content for which the administrator currently logged into emcli, has access. For example consider this usecase:AdminA has access only to TargetA. AdminA logs into EM CLIExecuting the list verb to get the list of all targets will only show TargetA.5. User defined SQLUsing the –sql option, user defined sql can be executed. The SQL provided in the -sql option is executed as the EM user MGMT_VIEW, which has read-only access to the EM published MGMT$ database views in the SYSMAN schema. To get the list of EM published MGMT$ database views, go to the Extensibility Programmer's Reference book in EM documentation. There is a chapter about Using Management Repository Views. It’s always recommended to reference the documentation for the supported MGMT$ database views. Consider you are using the MGMT$ABC view which is not in the chapter. During upgrade, it is possible, since the view was not in the book and not supported, it is likely the view might undergo a change in its structure or the data in it. Using a supported view ensures that your scripts using -sql will continue working after upgrade.Here’s an example $ emcli list -sql='select * from mgmt$target' 6. JSON output support JSON (JavaScript Object Notation) enables data to be displayed in a collection of name/value pairs. There is lot of reading material about JSON on line for more information.As an example, we had a requirement where an EM administrator had many 11.2 databases in their test environment and the developers had requested an Administrator to change the lifecycle status from Test to Production which meant the admin had to go to the EM “All targets” page and identify the set of 11.2 databases and then to go into each target database page and manually changes the property to Production. Sounds easy to say, but this Administrator had numerous targets and this task is repeated for every release cycle.We told him there is an easier way to do this with a script and he can reuse the script whenever anyone wanted to change a set of targets to a different Lifecycle status. Here is a jython script which uses list and JSON to change all 11.2 database target’s LifeCycle Property value.If you are new to scripting and Jython, I would suggest visiting the basic chapters in any Jython tutorials. Understanding Jython is important to write the logic depending on your usecase.If you are already writing scripts like perl or shell or know a programming language like java, then you can easily understand the logic.Disclaimer: The scripts in this post are subject to the Oracle Terms of Use located here. 1 from emcli import * 2 search_list = ['PROPERTY_NAME=\'DBVersion\'','TARGET_TYPE= \'oracle_database\'','PROPERTY_VALUE LIKE \'11.2%\''] 3 if len(sys.argv) == 2: 4 print login(username=sys.argv[0]) 5 l_prop_val_to_set = sys.argv[1] 6 l_targets = list(resource="TargetProperties", search=search_list, columns="TARGET_NAME,TARGET_TYPE,PROPERTY_NAME") 7 for target in l_targets.out()['data']: 8 t_pn = 'LifeCycle Status' 9 print "INFO: Setting Property name " + t_pn + " to value " + l_prop_val_to_set + " for " + target['TARGET_NAME'] 10 print set_target_property_value(property_records= target['TARGET_NAME']+":"+target['TARGET_TYPE']+":"+ t_pn+":"+l_prop_val_to_set) 11 else: 12 print "\n ERROR: Property value argument is missing" 13 print "\n INFO: Format to run this file is filename.py <username> <Database Target LifeCycle Status Property Value>" You can download the script from here. I could not upload the file with .py extension so you need to rename the file to myScript.py before executing it using emcli.A line by line explanation for beginners: Line 1 Imports the emcli verbs as functions 2 search_list is a variable to pass to the search option in list verb. I am using escape character for the single quotes. In list verb to pass more than one value for the same option, you should define as above comma separated values, surrounded by square brackets. 3 This is an “if” condition to ensure the user does provide two arguments with the script, else in line #15, it prints an error message. 4 Logging into EM. You can remove this if you have setup emcli with autologin. For more details about setup and autologin, please go the EM CLI book in EM documentation. 5 l_prop_val_to_set is another variable. This is the property value to be set. Remember we are changing the value from Test to Production. The benefit of this variable is you can reuse the script to change the property value from and to any other values. 6 Here the output of the list verb is stored in l_targets. In the list verb I am passing the resource as TargetProperties, search as the search_list variable and I only need these three columns – target_name, target_type and property_name. I don’t need the other columns for my task. 7 This is a for loop. The data in l_targets is available in JSON format. Using the for loop, each pair will now be available in the ‘target’ variable. 8 t_pn is the “LifeCycle Status” variable. If required, I can have this also as an input and then use my script to change any target property. In this example, I just wanted to change the “LifeCycle Status”. 9 This a message informing the user the script is setting the property value for dbxyz. 10 This line shows the set_target_property_value verb which sets the value using the property_records option. Once it is set for a target pair, it moves to the next one. In my example, I am just showing three dbs, but the real use is when you have 20 or 50 targets. The script is executed as:$ emcli @myScript.py subin Production The recommendation is to first test the scripts before running it on a production system. We tested on a small set of targets and optimizing the script for fewer lines of code and better messaging.For your quick reference, the resources available in Enterprise Manager 12.1.0.4.0 with list verb are:$ emcli list -helpWatch this space for more blog posts using the list verb and EM CLI Scripting use cases. I hope you enjoyed reading this blog post and it has helped you gain more information about the list verb. Happy Scripting!!Disclaimer: The scripts in this post are subject to the Oracle Terms of Use located here. Stay Connected: Twitter | Facebook | YouTube | Linkedin | Newsletter mt=8">Download the Oracle Enterprise Manager 12c Mobile app

Read the article
Endeca Information Discovery 3-Day Hands-on Training Boot-Camp

- by Mike.Hallett(at)Oracle-BI&EPM

For Oracle Partners, on October 15-17, 2012 in Paris, France: Register here. The Oracle Endeca Information Discovery (OEID) Boot-Camp is designed to give partners an understanding of OEID’s features, and how it complements the existing Oracle Business Intelligence suite. Participants will learn how to develop & implement solutions using a Data Discovery method. Training is in English. What will be covered? The Oracle Endeca Information Discovery (OEID) Boot Camp is a three-day class with a combination of lecture and hands-on exercises, tailored to make participants aware of the Oracle Endeca Information Discovery platform, and to gain valuable skills for the implementation of projects. Prerequisites You must bring a laptop with you for the Hands-on labs: Attendees should have experience and familiarity with the basic concepts of business intelligence and be OPN Partners with Gold or above membership. This training is free to OPN Partners. Click here for more information. Where and When ? Monday, October 15th until Wednesday, October 17th included 9:00 - 18:00 Oracle France 15, boulevard Charles de Gaulle 92715 Colombes: Access Venue Map Register here : NOTE there is a Limited number of seats, you will get confirmation within 2 weeks.

Read the article
Advanced Exalytics, Endeca and OBI Training for Partners

- by Mike.Hallett(at)Oracle-BI&EPM

Normal 0 false false false EN-GB X-NONE X-NONE MicrosoftInternetExplorer4 In Sweden and in Saudi Arabia there will be the next set of hands-on advanced Exalytics, Endeca and OBI Training workshops for Partners: partners from any country may attend these. These are free of charge to OPN Specialised member partners, but are subject to availability (please do not attend unless you have received a confirmation from Oracle to do so). Check the workshop agendas and pre-requisites, and register from the links below: Exalytics and Business Intelligence Hands-on Workshop (3-days) August 24-26, 2013: Oracle Riyadh, Saudi Arabia September 3-5, 2013: Oracle Kista, Sweden Endeca Information Discovery Hands-on Workshop (3-days) October 22-24, 2013: Oracle Kista, Sweden Advanced Oracle Business Intelligence Hands-on Workshop (3-days) August 19-21, 2013: Oracle Riyadh, Saudi Arabia /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0cm 5.4pt 0cm 5.4pt; mso-para-margin-top:0cm; mso-para-margin-right:0cm; mso-para-margin-bottom:10.0pt; mso-para-margin-left:0cm; line-height:115%; mso-pagination:widow-orphan; font-size:11.0pt; font-family:"Calibri","sans-serif"; mso-ascii-font-family:Calibri; mso-ascii-theme-font:minor-latin; mso-fareast-font-family:"Times New Roman"; mso-fareast-theme-font:minor-fareast; mso-hansi-font-family:Calibri; mso-hansi-theme-font:minor-latin; mso-bidi-font-family:"Times New Roman"; mso-bidi-theme-font:minor-bidi;}

Read the article
CBO????????

- by Liu Maclean(???)

???Itpub????????CBO??????????, ????????: SQL> create table maclean1 as select * from dba_objects; Table created. SQL> update maclean1 set status='INVALID' where owner='MACLEAN'; 2 rows updated. SQL> commit; Commit complete. SQL> create index ind_maclean1 on maclean1(status); Index created. SQL> exec dbms_stats.gather_table_stats('SYS','MACLEAN1',cascade=>true); PL/SQL procedure successfully completed. SQL> explain plan for select * from maclean1 where status='INVALID'; Explained. SQL> set linesize 140 pagesize 1400 SQL> select * from table(dbms_xplan.display()); PLAN_TABLE_OUTPUT --------------------------------------------------------------------------- Plan hash value: 987568083 ------------------------------------------------------------------------------ | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------ | 0 | SELECT STATEMENT | | 11320 | 1028K| 85 (0)| 00:00:02 | |* 1 | TABLE ACCESS FULL| MACLEAN1 | 11320 | 1028K| 85 (0)| 00:00:02 | ------------------------------------------------------------------------------ Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter("STATUS"='INVALID') 13 rows selected. 10053 trace Access path analysis for MACLEAN1 *************************************** SINGLE TABLE ACCESS PATH Single Table Cardinality Estimation for MACLEAN1[MACLEAN1] Column (#10): STATUS( AvgLen: 7 NDV: 2 Nulls: 0 Density: 0.500000 Table: MACLEAN1 Alias: MACLEAN1 Card: Original: 22639.000000 Rounded: 11320 Computed: 11319.50 Non Adjusted: 11319.50 Access Path: TableScan Cost: 85.33 Resp: 85.33 Degree: 0 Cost_io: 85.00 Cost_cpu: 11935345 Resp_io: 85.00 Resp_cpu: 11935345 Access Path: index (AllEqRange) Index: IND_MACLEAN1 resc_io: 185.00 resc_cpu: 8449916 ix_sel: 0.500000 ix_sel_with_filters: 0.500000 Cost: 185.24 Resp: 185.24 Degree: 1 Best:: AccessPath: TableScan Cost: 85.33 Degree: 1 Resp: 85.33 Card: 11319.50 Bytes: 0 ?????10053????????????,?????Density = 0.5 ?? 1/ NDV ??? ??????????????STATUS='INVALID"???????????, ????????????????? ????”STATUS”=’INVALID’ condition???2?,?status??????,??????dbms_stats?????????????,???CBO????INDEX Range ind_maclean1,???????,??????opitimizer?????? ?????????????????????????,????????,??????????status=’INVALID’???????card??,????????: [oracle@vrh4 ~]$ sqlplus / as sysdba SQL*Plus: Release 11.2.0.2.0 Production on Mon Oct 17 19:15:45 2011 Copyright (c) 1982, 2010, Oracle. All rights reserved. Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production With the Partitioning, OLAP, Data Mining and Real Application Testing options SQL> select * from v$version; BANNER -------------------------------------------------------------------------------- Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production PL/SQL Release 11.2.0.2.0 - Production CORE 11.2.0.2.0 Production TNS for Linux: Version 11.2.0.2.0 - Production NLSRTL Version 11.2.0.2.0 - Production SQL> show parameter optimizer_fea NAME TYPE VALUE ------------------------------------ ----------- ------------------------------ optimizer_features_enable string 11.2.0.2 SQL> select * from global_name; GLOBAL_NAME -------------------------------------------------------------------------------- www.oracledatabase12g.com & www.askmaclean.com SQL> drop table maclean; Table dropped. SQL> create table maclean as select * from dba_objects; Table created. SQL> update maclean set status='INVALID' where owner='MACLEAN'; 2 rows updated. SQL> commit; Commit complete. SQL> create index ind_maclean on maclean(status); Index created. SQL> exec dbms_stats.gather_table_stats('SYS','MACLEAN',cascade=>true, method_opt=>'FOR ALL COLUMNS SIZE 2'); PL/SQL procedure successfully completed. ???????2?bucket????, ??????????????? ???Quest???Guy Harrison???????FREQUENCY????????,??????: rem rem Generate a histogram of data distribution in a column as recorded rem in dba_tab_histograms rem rem Guy Harrison Jan 2010 : www.guyharrison.net rem rem hexstr function is from From http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:707586567563 set pagesize 10000 set lines 120 set verify off col char_value format a10 heading "Endpoint|value" col bucket_count format 99,999,999 heading "bucket|count" col pct format 999.99 heading "Pct" col pct_of_max format a62 heading "Pct of|Max value" rem col endpoint_value format 9999999999999 heading "endpoint|value" CREATE OR REPLACE FUNCTION hexstr (p_number IN NUMBER) RETURN VARCHAR2 AS l_str LONG := TO_CHAR (p_number, 'fm' || RPAD ('x', 50, 'x')); l_return VARCHAR2 (4000); BEGIN WHILE (l_str IS NOT NULL) LOOP l_return := l_return || CHR (TO_NUMBER (SUBSTR (l_str, 1, 2), 'xx')); l_str := SUBSTR (l_str, 3); END LOOP; RETURN (SUBSTR (l_return, 1, 6)); END; / WITH hist_data AS ( SELECT endpoint_value,endpoint_actual_value, NVL(LAG (endpoint_value) OVER (ORDER BY endpoint_value),' ') prev_value, endpoint_number, endpoint_number, endpoint_number - NVL (LAG (endpoint_number) OVER (ORDER BY endpoint_value), 0) bucket_count FROM dba_tab_histograms JOIN dba_tab_col_statistics USING (owner, table_name,column_name) WHERE owner = '&owner' AND table_name = '&table' AND column_name = '&column' AND histogram='FREQUENCY') SELECT nvl(endpoint_actual_value,endpoint_value) endpoint_value , bucket_count, ROUND(bucket_count*100/SUM(bucket_count) OVER(),2) PCT, RPAD(' ',ROUND(bucket_count*50/MAX(bucket_count) OVER()),'*') pct_of_max FROM hist_data; WITH hist_data AS ( SELECT endpoint_value,endpoint_actual_value, NVL(LAG (endpoint_value) OVER (ORDER BY endpoint_value),' ') prev_value, endpoint_number, endpoint_number, endpoint_number - NVL (LAG (endpoint_number) OVER (ORDER BY endpoint_value), 0) bucket_count FROM dba_tab_histograms JOIN dba_tab_col_statistics USING (owner, table_name,column_name) WHERE owner = '&owner' AND table_name = '&table' AND column_name = '&column' AND histogram='FREQUENCY') SELECT hexstr(endpoint_value) char_value, bucket_count, ROUND(bucket_count*100/SUM(bucket_count) OVER(),2) PCT, RPAD(' ',ROUND(bucket_count*50/MAX(bucket_count) OVER()),'*') pct_of_max FROM hist_data ORDER BY endpoint_value; ?????,??????????FREQUENCY?????: ??dbms_stats ?????STATUS=’INVALID’ bucket count=9 percent = 0.04 ,??????10053 trace????????: SQL> explain plan for select * from maclean where status='INVALID'; Explained. SQL> select * from table(dbms_xplan.display()); PLAN_TABLE_OUTPUT ------------------------------------- Plan hash value: 3087014066 ------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 9 | 837 | 2 (0)| 00:00:01 | | 1 | TABLE ACCESS BY INDEX ROWID| MACLEAN | 9 | 837 | 2 (0)| 00:00:01 | |* 2 | INDEX RANGE SCAN | IND_MACLEAN | 9 | | 1 (0)| 00:00:01 | ------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - access("STATUS"='INVALID') ??????????????CBO???????STATUS=’INVALID’?cardnality?? , ??????????? ,??index range scan??Full table scan? ????????????????10053 trace: SQL> alter system flush shared_pool; System altered. SQL> oradebug setmypid; Statement processed. SQL> oradebug event 10053 trace name context forever ,level 1; Statement processed. SQL> explain plan for select * from maclean where status='INVALID'; Explained. SINGLE TABLE ACCESS PATH Single Table Cardinality Estimation for MACLEAN[MACLEAN] Column (#10): NewDensity:0.000199, OldDensity:0.000022 BktCnt:22640, PopBktCnt:22640, PopValCnt:2, NDV:2 ???NewDensity= bucket_count / SUM(bucket_count) /2 Column (#10): STATUS( AvgLen: 7 NDV: 2 Nulls: 0 Density: 0.000199 Histogram: Freq #Bkts: 2 UncompBkts: 22640 EndPtVals: 2 Table: MACLEAN Alias: MACLEAN Card: Original: 22640.000000 Rounded: 9 Computed: 9.00 Non Adjusted: 9.00 Access Path: TableScan Cost: 85.30 Resp: 85.30 Degree: 0 Cost_io: 85.00 Cost_cpu: 10804625 Resp_io: 85.00 Resp_cpu: 10804625 Access Path: index (AllEqRange) Index: IND_MACLEAN resc_io: 2.00 resc_cpu: 20763 ix_sel: 0.000398 ix_sel_with_filters: 0.000398 Cost: 2.00 Resp: 2.00 Degree: 1 Best:: AccessPath: IndexRange Index: IND_MACLEAN Cost: 2.00 Degree: 1 Resp: 2.00 Card: 9.00 Bytes: 0 ???????????2 bucket?????CBO????????????,???????????????????,???dbms_stats.DEFAULT_METHOD_OPT????????????????????? ???dbms_stats?????????????????????col_usage$??????predicate???????,??col_usage$??<????????SMON??(?):??col_usage$????>? ??????????dbms_stats????????,col_usage$????????????predicate???,??dbms_stats??????????????????, ?: SQL> drop table maclean; Table dropped. SQL> create table maclean as select * from dba_objects; Table created. SQL> update maclean set status='INVALID' where owner='MACLEAN'; 2 rows updated. SQL> commit; Commit complete. SQL> create index ind_maclean on maclean(status); Index created. ??dbms_stats??method_opt??maclean? SQL> exec dbms_stats.gather_table_stats('SYS','MACLEAN'); PL/SQL procedure successfully completed. @histogram.sql Enter value for owner: SYS old 12: WHERE owner = '&owner' new 12: WHERE owner = 'SYS' Enter value for table: MACLEAN old 13: AND table_name = '&table' new 13: AND table_name = 'MACLEAN' Enter value for column: STATUS old 14: AND column_name = '&column' new 14: AND column_name = 'STATUS' no rows selected ????col_usage$?????,????????status????? declare begin for i in 1..500 loop execute immediate ' alter system flush shared_pool'; DBMS_STATS.FLUSH_DATABASE_MONITORING_INFO; execute immediate 'select count(*) from maclean where status=''INVALID'' ' ; end loop; end; / PL/SQL procedure successfully completed. SQL> select obj# from obj$ where name='MACLEAN'; OBJ# ---------- 97215 SQL> select * from col_usage$ where OBJ#=97215; OBJ# INTCOL# EQUALITY_PREDS EQUIJOIN_PREDS NONEQUIJOIN_PREDS RANGE_PREDS LIKE_PREDS NULL_PREDS TIMESTAMP ---------- ---------- -------------- -------------- ----------------- ----------- ---------- ---------- --------- 97215 1 1 0 0 0 0 0 17-OCT-11 97215 10 499 0 0 0 0 0 17-OCT-11 SQL> exec dbms_stats.gather_table_stats('SYS','MACLEAN'); PL/SQL procedure successfully completed. @histogram.sql Enter value for owner: SYS Enter value for table: MACLEAN Enter value for column: STATUS Endpoint bucket Pct of value count Pct Max value ---------- ----------- ------- -------------------------------------------------------------- INVALI 2 .04 VALIC3 5,453 99.96 *************************************************

Read the article
SQL SERVER – Automated Type Conversion using Expressor Studio

- by pinaldave

Recently I had an interesting situation during my consultation project. Let me share to you how I solved the problem using Expressor Studio. Consider a situation in which you need to read a field, such as customer_identifier, from a text file and pass that field into a database table. In the source file’s metadata structure, customer_identifier is described as a string; however, in the target database table, customer_identifier is described as an integer. Legitimately, all the source values for customer_identifier are valid numbers, such as “109380”. To implement this in an ETL application, you probably would have hard-coded a type conversion function call, such as: output.customer_identifier=stringToInteger(input.customer_identifier) That wasn’t so bad, was it? For this instance, programming this hard-coded type conversion function call was relatively easy. However, hard-coding, whether type conversion code or other business rule code, almost always means that the application containing hard-coded fields, function calls, and values is: a) specific to an instance of use; b) is difficult to adapt to new situations; and c) doesn’t contain many reusable sub-parts. Therefore, in the long run, applications with hard-coded type conversion function calls don’t scale well. In addition, they increase the overall level of effort and degree of difficulty to write and maintain the ETL applications. To get around the trappings of hard-coding type conversion function calls, developers need an access to smarter typing systems. Expressor Studio product offers this feature exactly, by providing developers with a type conversion automation engine based on type abstraction. The theory behind the engine is quite simple. A user specifies abstract data fields in the engine, and then writes applications against the abstractions (whereas in most ETL software, developers develop applications against the physical model). When a Studio-built application is run, Studio’s engine automatically converts the source type to the abstracted data field’s type and converts the abstracted data field’s type to the target type. The engine can do this because it has a couple of built-in rules for type conversions. So, using the example above, a developer could specify customer_identifier as an abstract data field with a type of integer when using Expressor Studio. Upon reading the string value from the text file, Studio’s type conversion engine automatically converts the source field from the type specified in the source’s metadata structure to the abstract field’s type. At the time of writing the data value to the target database, the engine doesn’t have any work to do because the abstract data type and the target data type are just the same. Had they been different, the engine would have automatically provided the conversion. ?Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: Database, Pinal Dave, SQL, SQL Authority, SQL Query, SQL Scripts, SQL Server, SQL Tips and Tricks, SQLAuthority News, T SQL, Technology Tagged: SSIS

Read the article
How to find and fix performance problems in ORM powered applications

- by FransBouma

Once in a while we get requests about how to fix performance problems with our framework. As it comes down to following the same steps and looking into the same things every single time, I decided to write a blogpost about it instead, so more people can learn from this and solve performance problems in their O/R mapper powered applications. In some parts it's focused on LLBLGen Pro but it's also usable for other O/R mapping frameworks, as the vast majority of performance problems in O/R mapper powered applications are not specific for a certain O/R mapper framework. Too often, the developer looks at the wrong part of the application, trying to fix what isn't a problem in that part, and getting frustrated that 'things are so slow with <insert your favorite framework X here>'. I'm in the O/R mapper business for a long time now (almost 10 years, full time) and as it's a small world, we O/R mapper developers know almost all tricks to pull off by now: we all know what to do to make task ABC faster and what compromises (because there are almost always compromises) to deal with if we decide to make ABC faster that way. Some O/R mapper frameworks are faster in X, others in Y, but you can be sure the difference is mainly a result of a compromise some developers are willing to deal with and others aren't. That's why the O/R mapper frameworks on the market today are different in many ways, even though they all fetch and save entities from and to a database. I'm not suggesting there's no room for improvement in today's O/R mapper frameworks, there always is, but it's not a matter of 'the slowness of the application is caused by the O/R mapper' anymore. Perhaps query generation can be optimized a bit here, row materialization can be optimized a bit there, but it's mainly coming down to milliseconds. Still worth it if you're a framework developer, but it's not much compared to the time spend inside databases and in user code: if a complete fetch takes 40ms or 50ms (from call to entity object collection), it won't make a difference for your application as that 10ms difference won't be noticed. That's why it's very important to find the real locations of the problems so developers can fix them properly and don't get frustrated because their quest to get a fast, performing application failed. Performance tuning basics and rules Finding and fixing performance problems in any application is a strict procedure with four prescribed steps: isolate, analyze, interpret and fix, in that order. It's key that you don't skip a step nor make assumptions: these steps help you find the reason of a problem which seems to be there, and how to fix it or leave it as-is. Skipping a step, or when you assume things will be bad/slow without doing analysis will lead to the path of premature optimization and won't actually solve your problems, only create new ones. The most important rule of finding and fixing performance problems in software is that you have to understand what 'performance problem' actually means. Most developers will say "when a piece of software / code is slow, you have a performance problem". But is that actually the case? If I write a Linq query which will aggregate, group and sort 5 million rows from several tables to produce a resultset of 10 rows, it might take more than a couple of milliseconds before that resultset is ready to be consumed by other logic. If I solely look at the Linq query, the code consuming the resultset of the 10 rows and then look at the time it takes to complete the whole procedure, it will appear to me to be slow: all that time taken to produce and consume 10 rows? But if you look closer, if you analyze and interpret the situation, you'll see it does a tremendous amount of work, and in that light it might even be extremely fast. With every performance problem you encounter, always do realize that what you're trying to solve is perhaps not a technical problem at all, but a perception problem. The second most important rule you have to understand is based on the old saying "Penny wise, Pound Foolish": the part which takes e.g. 5% of the total time T for a given task isn't worth optimizing if you have another part which takes a much larger part of the total time T for that same given task. Optimizing parts which are relatively insignificant for the total time taken is not going to bring you better results overall, even if you totally optimize that part away. This is the core reason why analysis of the complete set of application parts which participate in a given task is key to being successful in solving performance problems: No analysis -> no problem -> no solution. One warning up front: hunting for performance will always include making compromises. Fast software can be made maintainable, but if you want to squeeze as much performance out of your software, you will inevitably be faced with the dilemma of compromising one or more from the group {readability, maintainability, features} for the extra performance you think you'll gain. It's then up to you to decide whether it's worth it. In almost all cases it's not. The reason for this is simple: the vast majority of performance problems can be solved by implementing the proper algorithms, the ones with proven Big O-characteristics so you know the performance you'll get plus you know the algorithm will work. The time taken by the algorithm implementing code is inevitable: you already implemented the best algorithm. You might find some optimizations on the technical level but in general these are minor. Let's look at the four steps to see how they guide us through the quest to find and fix performance problems. Isolate The first thing you need to do is to isolate the areas in your application which are assumed to be slow. For example, if your application is a web application and a given page is taking several seconds or even minutes to load, it's a good candidate to check out. It's important to start with the isolate step because it allows you to focus on a single code path per area with a clear begin and end and ignore the rest. The rest of the steps are taken per identified problematic area. Keep in mind that isolation focuses on tasks in an application, not code snippets. A task is something that's started in your application by either another task or the user, or another program, and has a beginning and an end. You can see a task as a piece of functionality offered by your application. Analyze Once you've determined the problem areas, you have to perform analysis on the code paths of each area, to see where the performance problems occur and which areas are not the problem. This is a multi-layered effort: an application which uses an O/R mapper typically consists of multiple parts: there's likely some kind of interface (web, webservice, windows etc.), a part which controls the interface and business logic, the O/R mapper part and the RDBMS, all connected with either a network or inter-process connections provided by the OS or other means. Each of these parts, including the connectivity plumbing, eat up a part of the total time it takes to complete a task, e.g. load a webpage with all orders of a given customer X. To understand which parts participate in the task / area we're investigating and how much they contribute to the total time taken to complete the task, analysis of each participating task is essential. Start with the code you wrote which starts the task, analyze the code and track the path it follows through your application. What does the code do along the way, verify whether it's correct or not. Analyze whether you have implemented the right algorithms in your code for this particular area. Remember we're looking at one area at a time, which means we're ignoring all other code paths, just the code path of the current problematic area, from begin to end and back. Don't dig in and start optimizing at the code level just yet. We're just analyzing. If your analysis reveals big architectural stupidity, it's perhaps a good idea to rethink the architecture at this point. For the rest, we're analyzing which means we collect data about what could be wrong, for each participating part of the complete application. Reviewing the code you wrote is a good tool to get deeper understanding of what is going on for a given task but ultimately it lacks precision and overview what really happens: humans aren't good code interpreters, computers are. We therefore need to utilize tools to get deeper understanding about which parts contribute how much time to the total task, triggered by which other parts and for example how many times are they called. There are two different kind of tools which are necessary: .NET profilers and O/R mapper / RDBMS profilers. .NET profiling .NET profilers (e.g. dotTrace by JetBrains or Ants by Red Gate software) show exactly which pieces of code are called, how many times they're called, and the time it took to run that piece of code, at the method level and sometimes even at the line level. The .NET profilers are essential tools for understanding whether the time taken to complete a given task / area in your application is consumed by .NET code, where exactly in your code, the path to that code, how many times that code was called by other code and thus reveals where hotspots are located: the areas where a solution can be found. Importantly, they also reveal which areas can be left alone: remember our penny wise pound foolish saying: if a profiler reveals that a group of methods are fast, or don't contribute much to the total time taken for a given task, ignore them. Even if the code in them is perhaps complex and looks like a candidate for optimization: you can work all day on that, it won't matter. As we're focusing on a single area of the application, it's best to start profiling right before you actually activate the task/area. Most .NET profilers support this by starting the application without starting the profiling procedure just yet. You navigate to the particular part which is slow, start profiling in the profiler, in your application you perform the actions which are considered slow, and afterwards you get a snapshot in the profiler. The snapshot contains the data collected by the profiler during the slow action, so most data is produced by code in the area to investigate. This is important, because it allows you to stay focused on a single area. O/R mapper and RDBMS profiling .NET profilers give you a good insight in the .NET side of things, but not in the RDBMS side of the application. As this article is about O/R mapper powered applications, we're also looking at databases, and the software making it possible to consume the database in your application: the O/R mapper. To understand which parts of the O/R mapper and database participate how much to the total time taken for task T, we need different tools. There are two kind of tools focusing on O/R mappers and database performance profiling: O/R mapper profilers and RDBMS profilers. For O/R mapper profilers, you can look at LLBLGen Prof by hibernating rhinos or the Linq to Sql/LLBLGen Pro profiler by Huagati. Hibernating rhinos also have profilers for other O/R mappers like NHibernate (NHProf) and Entity Framework (EFProf) and work the same as LLBLGen Prof. For RDBMS profilers, you have to look whether the RDBMS vendor has a profiler. For example for SQL Server, the profiler is shipped with SQL Server, for Oracle it's build into the RDBMS, however there are also 3rd party tools. Which tool you're using isn't really important, what's important is that you get insight in which queries are executed during the task / area we're currently focused on and how long they took. Here, the O/R mapper profilers have an advantage as they collect the time it took to execute the query from the application's perspective so they also collect the time it took to transport data across the network. This is important because a query which returns a massive resultset or a resultset with large blob/clob/ntext/image fields takes more time to get transported across the network than a small resultset and a database profiler doesn't take this into account most of the time. Another tool to use in this case, which is more low level and not all O/R mappers support it (though LLBLGen Pro and NHibernate as well do) is tracing: most O/R mappers offer some form of tracing or logging system which you can use to collect the SQL generated and executed and often also other activity behind the scenes. While tracing can produce a tremendous amount of data in some cases, it also gives insight in what's going on. Interpret After we've completed the analysis step it's time to look at the data we've collected. We've done code reviews to see whether we've done anything stupid and which parts actually take place and if the proper algorithms have been implemented. We've done .NET profiling to see which parts are choke points and how much time they contribute to the total time taken to complete the task we're investigating. We've performed O/R mapper profiling and RDBMS profiling to see which queries were executed during the task, how many queries were generated and executed and how long they took to complete, including network transportation. All this data reveals two things: which parts are big contributors to the total time taken and which parts are irrelevant. Both aspects are very important. The parts which are irrelevant (i.e. don't contribute significantly to the total time taken) can be ignored from now on, we won't look at them. The parts which contribute a lot to the total time taken are important to look at. We now have to first look at the .NET profiler results, to see whether the time taken is consumed in our own code, in .NET framework code, in the O/R mapper itself or somewhere else. For example if most of the time is consumed by DbCommand.ExecuteReader, the time it took to complete the task is depending on the time the data is fetched from the database. If there was just 1 query executed, according to tracing or O/R mapper profilers / RDBMS profilers, check whether that query is optimal, uses indexes or has to deal with a lot of data. Interpret means that you follow the path from begin to end through the data collected and determine where, along the path, the most time is contributed. It also means that you have to check whether this was expected or is totally unexpected. My previous example of the 10 row resultset of a query which groups millions of rows will likely reveal that a long time is spend inside the database and almost no time is spend in the .NET code, meaning the RDBMS part contributes the most to the total time taken, the rest is compared to that time, irrelevant. Considering the vastness of the source data set, it's expected this will take some time. However, does it need tweaking? Perhaps all possible tweaks are already in place. In the interpret step you then have to decide that further action in this area is necessary or not, based on what the analysis results show: if the analysis results were unexpected and in the area where the most time is contributed to the total time taken is room for improvement, action should be taken. If not, you can only accept the situation and move on. In all cases, document your decision together with the analysis you've done. If you decide that the perceived performance problem is actually expected due to the nature of the task performed, it's essential that in the future when someone else looks at the application and starts asking questions you can answer them properly and new analysis is only necessary if situations changed. Fix After interpreting the analysis results you've concluded that some areas need adjustment. This is the fix step: you're actively correcting the performance problem with proper action targeted at the real cause. In many cases related to O/R mapper powered applications it means you'll use different features of the O/R mapper to achieve the same goal, or apply optimizations at the RDBMS level. It could also mean you apply caching inside your application (compromise memory consumption over performance) to avoid unnecessary re-querying data and re-consuming the results. After applying a change, it's key you re-do the analysis and interpretation steps: compare the results and expectations with what you had before, to see whether your actions had any effect or whether it moved the problem to a different part of the application. Don't fall into the trap to do partly analysis: do the full analysis again: .NET profiling and O/R mapper / RDBMS profiling. It might very well be that the changes you've made make one part faster but another part significantly slower, in such a way that the overall problem hasn't changed at all. Performance tuning is dealing with compromises and making choices: to use one feature over the other, to accept a higher memory footprint, to go away from the strict-OO path and execute queries directly onto the RDBMS, these are choices and compromises which will cross your path if you want to fix performance problems with respect to O/R mappers or data-access and databases in general. In most cases it's not a big issue: alternatives are often good choices too and the compromises aren't that hard to deal with. What is important is that you document why you made a choice, a compromise: which analysis data, which interpretation led you to the choice made. This is key for good maintainability in the years to come. Most common performance problems with O/R mappers Below is an incomplete list of common performance problems related to data-access / O/R mappers / RDBMS code. It will help you with fixing the hotspots you found in the interpretation step. SELECT N+1: (Lazy-loading specific). Lazy loading triggered performance bottlenecks. Consider a list of Orders bound to a grid. You have a Field mapped onto a related field in Order, Customer.CompanyName. Showing this column in the grid will make the grid fetch (indirectly) for each row the Customer row. This means you'll get for the single list not 1 query (for the orders) but 1+(the number of orders shown) queries. To solve this: use eager loading using a prefetch path to fetch the customers with the orders. SELECT N+1 is easy to spot with an O/R mapper profiler or RDBMS profiler: if you see a lot of identical queries executed at once, you have this problem. Prefetch paths using many path nodes or sorting, or limiting. Eager loading problem. Prefetch paths can help with performance, but as 1 query is fetched per node, it can be the number of data fetched in a child node is bigger than you think. Also consider that data in every node is merged on the client within the parent. This is fast, but it also can take some time if you fetch massive amounts of entities. If you keep fetches small, you can use tuning parameters like the ParameterizedPrefetchPathThreshold setting to get more optimal queries. Deep inheritance hierarchies of type Target Per Entity/Type. If you use inheritance of type Target per Entity / Type (each type in the inheritance hierarchy is mapped onto its own table/view), fetches will join subtype- and supertype tables in many cases, which can lead to a lot of performance problems if the hierarchy has many types. With this problem, keep inheritance to a minimum if possible, or switch to a hierarchy of type Target Per Hierarchy, which means all entities in the inheritance hierarchy are mapped onto the same table/view. Of course this has its own set of drawbacks, but it's a compromise you might want to take. Fetching massive amounts of data by fetching large lists of entities. LLBLGen Pro supports paging (and limiting the # of rows returned), which is often key to process through large sets of data. Use paging on the RDBMS if possible (so a query is executed which returns only the rows in the page requested). When using paging in a web application, be sure that you switch server-side paging on on the datasourcecontrol used. In this case, paging on the grid alone is not enough: this can lead to fetching a lot of data which is then loaded into the grid and paged there. Keep note that analyzing queries for paging could lead to the false assumption that paging doesn't occur, e.g. when the query contains a field of type ntext/image/clob/blob and DISTINCT can't be applied while it should have (e.g. due to a join): the datareader will do DISTINCT filtering on the client. this is a little slower but it does perform paging functionality on the data-reader so it won't fetch all rows even if the query suggests it does. Fetch massive amounts of data because blob/clob/ntext/image fields aren't excluded. LLBLGen Pro supports field exclusion for queries. You can exclude fields (also in prefetch paths) per query to avoid fetching all fields of an entity, e.g. when you don't need them for the logic consuming the resultset. Excluding fields can greatly reduce the amount of time spend on data-transport across the network. Use this optimization if you see that there's a big difference between query execution time on the RDBMS and the time reported by the .NET profiler for the ExecuteReader method call. Doing client-side aggregates/scalar calculations by consuming a lot of data. If possible, try to formulate a scalar query or group by query using the projection system or GetScalar functionality of LLBLGen Pro to do data consumption on the RDBMS server. It's far more efficient to process data on the RDBMS server than to first load it all in memory, then traverse the data in-memory to calculate a value. Using .ToList() constructs inside linq queries. It might be you use .ToList() somewhere in a Linq query which makes the query be run partially in-memory. Example: var q = from c in metaData.Customers.ToList() where c.Country=="Norway" select c; This will actually fetch all customers in-memory and do an in-memory filtering, as the linq query is defined on an IEnumerable<T>, and not on the IQueryable<T>. Linq is nice, but it can often be a bit unclear where some parts of a Linq query might run. Fetching all entities to delete into memory first. To delete a set of entities it's rather inefficient to first fetch them all into memory and then delete them one by one. It's more efficient to execute a DELETE FROM ... WHERE query on the database directly to delete the entities in one go. LLBLGen Pro supports this feature, and so do some other O/R mappers. It's not always possible to do this operation in the context of an O/R mapper however: if an O/R mapper relies on a cache, these kind of operations are likely not supported because they make it impossible to track whether an entity is actually removed from the DB and thus can be removed from the cache. Fetching all entities to update with an expression into memory first. Similar to the previous point: it is more efficient to update a set of entities directly with a single UPDATE query using an expression instead of fetching the entities into memory first and then updating the entities in a loop, and afterwards saving them. It might however be a compromise you don't want to take as it is working around the idea of having an object graph in memory which is manipulated and instead makes the code fully aware there's a RDBMS somewhere. Conclusion Performance tuning is almost always about compromises and making choices. It's also about knowing where to look and how the systems in play behave and should behave. The four steps I provided should help you stay focused on the real problem and lead you towards the solution. Knowing how to optimally use the systems participating in your own code (.NET framework, O/R mapper, RDBMS, network/services) is key for success as well as knowing what's going on inside the application you built. I hope you'll find this guide useful in tracking down performance problems and dealing with them in a useful way.

Read the article
Big Data’s Killer App…

- by jean-pierre.dijcks

Recently Keith spent some time talking about the cloud on this blog and I will spare you my thoughts on the whole thing. What I do want to write down is something about the Big Data movement and what I think is the killer app for Big Data... Where is this coming from, ok, I confess... I spent 3 days in cloud land at the Cloud Connect conference in Santa Clara and it was quite a lot of fun. One of the nice things at Cloud Connect was that there was a track dedicated to Big Data, which prompted me to some extend to write this post. What is Big Data anyways? The most valuable point made in the Big Data track was that Big Data in itself is not very cool. Doing something with Big Data is what makes all of this cool and interesting to a business user! The other good insight I got was that a lot of people think Big Data means a single gigantic monolithic system holding gazillions of bytes or documents or log files. Well turns out that most people in the Big Data track are talking about a lot of collections of smaller data sets. So rather than thinking "big = monolithic" you should be thinking "big = many data sets". This is more than just theoretical, it is actually relevant when thinking about big data and how to process it. It is important because it means that the platform that stores data will most likely consist out of multiple solutions. You may be storing logs on something like HDFS, you may store your customer information in Oracle and you may store distilled clickstream information in some distilled form in MySQL. The big question you will need to solve is not what lives where, but how to get it all together and get some value out of all that data. NoSQL and MapReduce Nope, sorry, this is not the killer app... and no I'm not saying this because my business card says Oracle and I'm therefore biased. I think language is important, but as with storage I think pragmatic is better. In other words, some questions can be answered with SQL very efficiently, others can be answered with PERL or TCL others with MR. History should teach us that anyone trying to solve a problem will use any and all tools around. For example, most data warehouses (Big Data 1.0?) get a lot of data in flat files. Everyone then runs a bunch of shell scripts to massage or verify those files and then shoves those files into the database. We've even built shell script support into external tables to allow for this. I think the Big Data projects will do the same. Some people will use MapReduce, although I would argue that things like Cascading are more interesting, some people will use Java. Some data is stored on HDFS making Cascading the way to go, some data is stored in Oracle and SQL does do a good job there. As with storage and with history, be pragmatic and use what fits and neither NoSQL nor MR will be the one and only. Also, a language, while important, does in itself not deliver business value. So while cool it is not a killer app... Vertical Behavioral Analytics This is the killer app! And you are now thinking: "what does that mean?" Let's decompose that heading. First of all, analytics. I would think you had guessed by now that this is really what I'm after, and of course you are right. But not just analytics, which has a very large scope and means many things to many people. I'm not just after Business Intelligence (analytics 1.0?) or data mining (analytics 2.0?) but I'm after something more interesting that you can only do after collecting large volumes of specific data. That all important data is about behavior. What do my customers do? More importantly why do they behave like that? If you can figure that out, you can tailor web sites, stores, products etc. to that behavior and figure out how to be successful. Today's behavior that is somewhat easily tracked is web site clicks, search patterns and all of those things that a web site or web server tracks. that is where the Big Data lives and where these patters are now emerging. Other examples however are emerging, and one of the examples used at the conference was about prediction churn for a telco based on the social network its members are a part of. That social network is not about LinkedIn or Facebook, but about who calls whom. I call you a lot, you switch provider, and I might/will switch too. And that just naturally brings me to the next word, vertical. Vertical in this context means per industry, e.g. communications or retail or government or any other vertical. The reason for being more specific than just behavioral analytics is that each industry has its own data sources, has its own quirky logic and has its own demands and priorities. Of course, the methods and some of the software will be common and some will have both retail and service industry analytics in place (your corner coffee store for example). But the gist of it all is that analytics that can predict customer behavior for a specific focused group of people in a specific industry is what makes Big Data interesting. Building a Vertical Behavioral Analysis System Well, that is going to be interesting. I have not seen much going on in that space and if I had to have some criticism on the cloud connect conference it would be the lack of concrete user cases on big data. The telco example, while a step into the vertical behavioral part is not really on big data. It used a sample of data from the customers' data warehouse. One thing I do think, and this is where I think parts of the NoSQL stuff come from, is that we will be doing this analysis where the data is. Over the past 10 years we at Oracle have called this in-database analytics. I guess we were (too) early? Now the entire market is going there including companies like SAS. In-place btw does not mean "no data movement at all", what it means that you will do this on data's permanent home. For SAS that is kind of the current problem. Most of the inputs live in a data warehouse. So why move it into SAS and back? That all worked with 1 TB data warehouses, but when we are looking at 100TB to 500 TB of distilled data... Comments? As it is still early days with these systems, I'm very interested in seeing reactions and thoughts to some of these thoughts...

Read the article
Azure, don't give me multiple VMs, give me one elastic VM

- by FransBouma

Yesterday, Microsoft revealed new major features for Windows Azure (see ScottGu's post). It all looks shiny and great, but after reading most of the material describing the new features, I still find the overall idea behind all of it flawed: why should I care on how much VMs my web app runs? Isn't that a problem to solve for the Windows Azure engineers / software? And what if I need the file system, why can't I simply get a virtual filesystem ? To illustrate my point, let's use a real example: a product website with a customer system/database and next to it a support site with accompanying database. Both are written in .NET, using ASP.NET and use a SQL Server database each. The product website offers files to download by customers, very simple. You have a couple of options to host these websites: Buy a server, place it in a rack at an ISP and run the sites on that server Use 'shared hosting' with an ISP, which means your sites' appdomains are running on the same machine, as well as the files stored, and the databases are hosted in the same server as the other shared databases. Hire a VM, install your OS of choice at an ISP, and host the sites on that VM, basically the same as the first option, except you don't have a physical server At some cloud-vendor, either host the sites 'shared' or in a VM. See above. With all of those options, scalability is a problem, even the cloud-based ones, though not due to the same reasons: The physical server solution has the obvious problem that if you need more power, you need to buy a bigger server or more servers which requires you to add replication and other overhead Shared hosting solutions are almost always capped on memory usage / traffic and database size: if your sites get too big, you have to move out of the shared hosting environment and start over with one of the other solutions The VM solution, be it a VM at an ISP or 'in the cloud' at e.g. Windows Azure or Amazon, in theory allows scaling out by simply instantiating more VMs, however that too introduces the same overhead problems as with the physical servers: suddenly more than 1 instance runs your sites. If a cloud vendor offers its services in the form of VMs, you won't gain much over having a VM at some ISP: the main problems you have to work around are still there: when you spin up more than one VM, your application must be completely stateless at any moment, including the DB sub system, because what's in memory in instance 1 might not be in memory in instance 2. This might sounds trivial but it's not. A lot of the websites out there started rather small: they were perfectly runnable on a single machine with normal memory and CPU power. After all, you don't need a big machine to run a website with even thousands of users a day. Moving these sites to a multi-VM environment will cause a problem: all the in-memory state they use, all the multi-page transitions they use while keeping state across the transition, they can't do that anymore like they did that on a single machine: state is something of the past, you have to store every byte of state in either a DB or in a viewstate or in a cookie somewhere so with the next request, all state information is available through the request, as nothing is kept in-memory. Our example uses a bunch of files in a file system. Using multiple VMs will require that these files move to a cloud storage system which is mounted in each VM so we don't have to store the files on each VM. This might require different file paths, but this change should be minor. What's perhaps less minor is the maintenance procedure in place on the new type of cloud storage used: instead of ftp-ing into a VM, you might have to update the files using different ways / tools. All in all this makes moving an existing website which was written for an environment that's based around a VM (namely .NET with its CLR) overly cumbersome and problematic: it forces you to refactor your website system to be able to be used 'in the cloud', which is caused by the limited way how e.g. Windows Azure offers its cloud services: in blocks of VMs. Offer a scalable, flexible VM which extends with my needs Instead, cloud vendors should offer simply one VM to me. On that VM I run the websites, store my DB and my files. As it's a virtual machine, how this machine is actually ran on physical hardware (e.g. partitioned), I don't care, as that's the problem for the cloud vendor to solve. If I need more resources, e.g. I have more traffic to my server, way more visitors per day, the VM stretches, like I bought a bigger box. This frees me from the problem which comes with multiple VMs: I don't have any refactoring to do at all: I can simply build my website as if it runs on my local hardware server, upload it to the VM offered by the cloud vendor, install it on the VM and I'm done. "But that might require changes to windows!" Yes, but Microsoft is Windows. Windows Azure is their service, they can make whatever change to what they offer to make it look like it's windows. Yet, they're stuck, like Amazon, in thinking in VMs, which forces developers to 'think ahead' and gamble whether they would need to migrate to a cloud with multiple VMs in the future or not. Which comes down to: gamble whether they should invest time in code / architecture which they might never need. (YAGNI anyone?) So the VM we're talking about, is that a low-level VM which runs a guest OS, or is that VM a different kind of VM? The flexible VM: .NET's CLR ? My example websites are ASP.NET based, which means they run inside a .NET appdomain, on the .NET CLR, which is a VM. The only physical OS resource the sites need is the file system, however this too is accessed through .NET. In short: all the websites see is what .NET allows the websites to see, the world as the websites know it is what .NET shows them and lets them access. How the .NET appdomain is run physically, that's the concern of .NET, not mine. This begs the question why Windows Azure doesn't offer virtual appdomains? Or better: .NET environments which look like one machine but could be physically multiple machines. In such an environment, no change has to be made to the websites to migrate them from a local machine or own server to the cloud to get proper scaling: the .NET VM will simply scale with the need: more memory needed, more CPU power needed, it stretches. What it offers to the application running inside the appdomain is simply increasing, but not fragmented: all resources are available to the application: this means that the problem of how to scale is back to where it should be: with the cloud vendor. "Yeah, great, but what about the databases?" The .NET application communicates with the database server through a .NET ADO.NET provider. Where the database is located is not a problem of the appdomain: the ADO.NET provider has to solve that. I.o.w.: we can host the databases in an environment which offers itself as a single resource and is accessible through one connection string without replication overhead on the outside, and use that environment inside the .NET VM as if it was a single DB. But what about memory replication and other problems? This environment isn't simple, at least not for the cloud vendor. But it is simple for the customer who wants to run his sites in that cloud: no work needed. No refactoring needed of existing code. Upload it, run it. Perhaps I'm dreaming and what I described above isn't possible. Yet, I think if cloud vendors don't move into that direction, what they're offering isn't interesting: it doesn't solve a problem at all, it simply offers a way to instantiate more VMs with the guest OS of choice at the cost of me needing to refactor my website code so it can run in the straight jacket form factor dictated by the cloud vendor. Let's not kid ourselves here: most of us developers will never build a website which needs a truck load of VMs to run it: almost all websites created by developers can run on just a few VMs at most. Yet, the most expensive change is right at the start: moving from one to two VMs. As soon as you have refactored your website code to run across multiple VMs, adding another one is just as easy as clicking a mouse button. But that first step, that's the problem here and as it's right there at the beginning of scaling the website, it's particularly strange that cloud vendors refuse to solve that problem and leave it to the developers to solve that. Which makes migrating 'to the cloud' particularly expensive.

Read the article
Who is Jeremiah Owyang?

- by Michael Hylton

12.00 Normal 0 false false false EN-US X-NONE X-NONE MicrosoftInternetExplorer4 /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin:0in; mso-para-margin-bottom:.0001pt; mso-pagination:widow-orphan; font-size:10.0pt; font-family:"Calibri","sans-serif"; mso-bidi-font-family:"Times New Roman";} Q: What’s your current role and what career path brought you here? J.O.: I'm currently a partner and one of the founding team members at Altimeter Group. I'm currently the Research Director, as well as wear the hat of Industry Analyst. Prior to joining Altimeter, I was an Industry Analyst at Forrester covering Social Computing, and before that, deployed and managed the social media program at Hitachi Data Systems in Santa Clara. Around that time, I started a career blog called Web Strategy which focused on how companies were using the web to connect with customers --and never looked back. Q: As an industry analyst, what are you focused on these days? J.O.: There are three trends that I'm focused my research on at this time: 1) The Dynamic Customer Journey: Individuals (both b2c and b2b) are given so many options in their sources of data, channels to choose from and screens to consume them on that we've found that at each given touchpoint there are 75 potential permutations. Companies that can map this, then deliver information to individuals when they need it will have a competitive advantage and we want to find out who's doing this. 2) One of the sub themes that supports this trend is Social Performance. Yesterday's social web was disparate engagement of humans, but the next phase will be data driven, and soon new technologies will emerge to help all those that are consuming, publishing, and engaging on the social web to be more efficient with their time through forms of automation. As you might expect, this comes with upsides and downsides. 3) The Sentient World is our research theme that looks out the furthest as the world around us (even inanimate objects) become 'self aware' and are able to talk back to us via digital devices and beyond. Big data, internet of things, mobile devices will all be this next set. Q: People cite that the line between work and life is getting more and more blurred. Do you see your personal life influencing your professional work? J.O.: The lines between our work and personal lives are dissolving, and this leads to a greater upside of being always connected and have deeper relationships with those that are not. It also means a downside of society expectations that we're always around and available for colleagues, customers, and beyond. In the future, a balance will be sought as we seek to achieve the goals of family, friends, work, and our own personal desires. All of this is being ironically written at 430 am on a Sunday am. Q: How can people keep up with what you’re working on? J.O.: A great question, thanks. There are a few sources of information to find out, I'll lead with the first which is my blog at web-strategist.com. A few times a week I'll publish my industry insights (hires, trends, forces, funding, M&A, business needs) as well as on twitter where I'll point to all the news that's fit to print @jowyang. As my research reports go live (we publish them for all to read --called Open Research-- at no cost) they'll emerge on my blog, or checkout the research tab to find out more now. http://www.web-strategist.com/blog/research/ Q: Recently, you’ve been working with us here at Oracle on something exciting coming up later this week. What’s on the horizon? J.O.: Absolutely! This coming Thursday, September 13th, I’m doing a webcast with Oracle on “Managing Social Relationships for the Enterprise”. This is going to be a great discussion with Reggie Bradford, Senior Vice President of Product Development at Oracle and Christian Finn, Senior Director of Product Management for Oracle WebCenter. I’m looking forward to a great discussion around all those issues that so many companies are struggling with these days as they realize how much social media is impacting their business. It’s changing the way your customers and employees interact with your brand. Today it’s no longer a matter of when to become a social-enabled enterprise, but how to become a successful one. 12.00 Normal 0 false false false EN-US X-NONE X-NONE MicrosoftInternetExplorer4 /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin:0in; mso-para-margin-bottom:.0001pt; mso-pagination:widow-orphan; font-size:10.0pt; font-family:"Calibri","sans-serif"; mso-bidi-font-family:"Times New Roman";} Q: You’ve been very actively pursued for media interviews and conference and company speaking engagements – anything you’d like to share to give us a sneak peak of what to expect on Thursday’s webcast? J.O.: Below is a 15 minute video which encapsulates Altimeter’s themes on the Dynamic Customer Journey and the Sentient World. I’m really proud to have taken an active role in the first ever LeWeb outside of Paris. This one, which was featured in downtown London across the street from Westminster Abbey was sold out. If you’ve not heard of LeWeb, this is a global Internet conference hosted by Loic and Geraldine Le Meur, a power couple that stem from Paris but are also living in Silicon Valley, this is one of my favorite conferences to connect with brands, technology innovators, investors and friends. Altimeter was able to play a minor role in suggesting the theme for the event “Faster Than Real Time” which stems off previous LeWebs that focused on the “Real time web”. In this radical state, companies are able to anticipate the needs of their customers by using data, technology, and devices and deliver meaningful experiences before customers even know they need it. I explore two of three of Altimeter’s research themes, the Dynamic Customer Journey, and the Sentient World in my speech, but due to time, did not focus on Adaptive Organization.

Read the article
SQL Server v.Next (Denali) : More on contained databases and "contained users"

- by AaronBertrand

One of the reasons for contained databases (see my previous post ) is to allow for a more seamless transition when moving a database from one server to another. One of the biggest complications in doing so is making sure that all of the logins are in place on the new server. Contained databases help solve this issue by creating a new type of user: a database-level user with a password. I want to stress that this is not the same concept as a user without a login , which serves a completely different...(read more)

Read the article
Today’s Performance Tip: Views are for Convenience, Not Performance!

- by Jonathan Kehayias

I tweeted this last week on twitter and got a lot of retweets so I thought that I’d blog the story behind the tweet. Most vendor databases have views in them, and when people want to retrieve data from a database, it seems like the most common first stop they make are the vendor supplied Views. This post is in no way a bash against the usage or creation of Views in a SQL Server Database, I have created them before to simplify code and compartmentalize commonly required queries so that there...(read more)

Read the article
Is it wise to store a big lump of json on a database row

- by Ieyasu Sawada

I have this project which stores product details from amazon into the database. Just to give you an idea on how big it is: [{"title":"Genetic Engineering (Opposing Viewpoints)","short_title":"Genetic Engineering ...","brand":"","condition":"","sales_rank":"7171426","binding":"Book","item_detail_url":"http://localhost/wordpress/product/?asin=0737705124","node_list":"Books > Science & Math > Biological Sciences > Biotechnology","node_category":"Books","subcat":"","model_number":"","item_url":"http://localhost/wordpress/wp-content/ecom-plugin-redirects/ecom_redirector.php?id=128","details_url":"http://localhost/wordpress/product/?asin=0737705124","large_image":"http://localhost/wordpress/wp-content/plugins/ecom/img/large-notfound.png","medium_image":"http://localhost/wordpress/wp-content/plugins/ecom/img/medium-notfound.png","small_image":"http://localhost/wordpress/wp-content/plugins/ecom/img/small-notfound.png","thumbnail_image":"http://localhost/wordpress/wp-content/plugins/ecom/img/thumbnail-notfound.png","tiny_img":"http://localhost/wordpress/wp-content/plugins/ecom/img/tiny-notfound.png","swatch_img":"http://localhost/wordpress/wp-content/plugins/ecom/img/swatch-notfound.png","total_images":"6","amount":"33.70","currency":"$","long_currency":"USD","price":"$33.70","price_type":"List Price","show_price_type":"0","stars_url":"","product_review":"","rating":"","yellow_star_class":"","white_star_class":"","rating_text":" of 5","reviews_url":"","review_label":"","reviews_label":"Read all ","review_count":"","create_review_url":"http://localhost/wordpress/wp-content/ecom-plugin-redirects/ecom_redirector.php?id=132","create_review_label":"Write a review","buy_url":"http://localhost/wordpress/wp-content/ecom-plugin-redirects/ecom_redirector.php?id=19186","add_to_cart_action":"http://localhost/wordpress/wp-content/ecom-plugin-redirects/add_to_cart.php","asin":"0737705124","status":"Only 7 left in stock.","snippet_condition":"in_stock","status_class":"ninstck","customer_images":["http://localhost/wordpress/wp-content/uploads/2013/10/ecom_images/51M2vvFvs2BL.jpg","http://localhost/wordpress/wp-content/uploads/2013/10/ecom_images/31FIM-YIUrL.jpg","http://localhost/wordpress/wp-content/uploads/2013/10/ecom_images/51M2vvFvs2BL.jpg","http://localhost/wordpress/wp-content/uploads/2013/10/ecom_images/51M2vvFvs2BL.jpg"],"disclaimer":"","item_attributes":[{"attr":"Author","value":"Greenhaven Press"},{"attr":"Binding","value":"Hardcover"},{"attr":"EAN","value":"9780737705126"},{"attr":"Edition","value":"1"},{"attr":"ISBN","value":"0737705124"},{"attr":"Label","value":"Greenhaven Press"},{"attr":"Manufacturer","value":"Greenhaven Press"},{"attr":"NumberOfItems","value":"1"},{"attr":"NumberOfPages","value":"224"},{"attr":"ProductGroup","value":"Book"},{"attr":"ProductTypeName","value":"ABIS_BOOK"},{"attr":"PublicationDate","value":"2000-06"},{"attr":"Publisher","value":"Greenhaven Press"},{"attr":"SKU","value":"G0737705124I2N00"},{"attr":"Studio","value":"Greenhaven Press"},{"attr":"Title","value":"Genetic Engineering (Opposing Viewpoints)"}],"customer_review_url":"http://localhost/wordpress/wp-content/ecom-customer-reviews/0737705124.html","flickr_results":["http://localhost/wordpress/wp-content/uploads/2013/10/ecom_images/5105560852_06c7d06f14_m.jpg"],"freebase_text":"No around the web data available yet","freebase_image":"http://localhost/wordpress/wp-content/plugins/ecom/img/freebase-notfound.jpg","ebay_related_items":[{"title":"Genetic Engineering (Introducing Issues With Opposing Viewpoints), , Good Book","image":"http://localhost/wordpress/wp-content/uploads/2013/10/ecom_images/140.jpg","url":"http://localhost/wordpress/wp-content/ecom-plugin-redirects/ecom_redirector.php?id=12165","currency_id":"$","current_price":"26.2"},{"title":"Genetic Engineering Opposing Viewpoints by DAVID BENDER - 1964 Hardcover","image":"http://localhost/wordpress/wp-content/uploads/2013/10/ecom_images/140.jpg","url":"http://localhost/wordpress/wp-content/ecom-plugin-redirects/ecom_redirector.php?id=130","currency_id":"AUD","current_price":"11.99"}],"no_follow":"rel=\"nofollow\"","new_tab":"target=\"_blank\"","related_products":[],"super_saver_shipping":"","shipping_availability":"","total_offers":"7","added_to_cart":""}] So the structure for the table is: asin title details (the product details in json) Will the performance suffer if I have to store like 10,000 products? Is there any other way of doing this? I'm thinking of the following, but the current setup is really the most convenient one since I also have to use the data on the client side: store the product details in a file. So something like ASIN123.json store the product details in one big file. (I'm guessing it will be a drag to extract data from this file) store each of the fields in the details in its own table field Thanks in advance!

Read the article
Is inline SQL still classed as bad practice now that we have Micro ORMs?

- by Grofit

This is a bit of an open ended question but I wanted some opinions, as I grew up in a world where inline SQL scripts were the norm, then we were all made very aware of SQL injection based issues, and how fragile the sql was when doing string manipulations all over the place. Then came the dawn of the ORM where you were explaining the query to the ORM and letting it generate its own SQL, which in a lot of cases was not optimal but was safe and easy. Another good thing about ORMs or database abstraction layers were that the SQL was generated with its database engine in mind, so I could use Hibernate/Nhibernate with MSSQL, MYSQL and my code never changed it was just a configuration detail. Now fast forward to current day, where Micro ORMs seem to be winning over more developers I was wondering why we have seemingly taken a U-Turn on the whole in-line sql subject. I must admit I do like the idea of no ORM config files and being able to write my query in a more optimal manner but it feels like I am opening myself back up to the old vulnerabilities such as SQL injection and I am also tying myself to one database engine so if I want my software to support multiple database engines I would need to do some more string hackery which seems to then start to make code unreadable and more fragile. (Just before someone mentions it I know you can use parameter based arguments with most micro orms which offers protection in most cases from sql injection) So what are peoples opinions on this sort of thing? I am using Dapper as my Micro ORM in this instance and NHibernate as my regular ORM in this scenario, however most in each field are quite similar. What I term as inline sql is SQL strings within source code. There used to be design debates over SQL strings in source code detracting from the fundamental intent of the logic, which is why statically typed linq style queries became so popular its still just 1 language, but with lets say C# and Sql in one page you have 2 languages intermingled in your raw source code now. Just to clarify, the SQL injection is just one of the known issues with using sql strings, I already mention you can stop this from happening with parameter based queries, however I highlight other issues with having SQL queries ingrained in your source code, such as the lack of DB Vendor abstraction as well as losing any level of compile time error capturing on string based queries, these are all issues which we managed to side step with the dawn of ORMs with their higher level querying functionality, such as HQL or LINQ etc (not all of the issues but most of them). So I am less focused on the individual highlighted issues and more the bigger picture of is it now becoming more acceptable to have SQL strings directly in your source code again, as most Micro ORMs use this mechanism. Here is a similar question which has a few different view points, although is more about the inline sql without the micro orm context: http://stackoverflow.com/questions/5303746/is-inline-sql-hard-coding

Read the article
Security Controls on data for P6 Analytics

- by Jeffrey McDaniel

The Star database and P6 Analytics calculates security based on P6 security using OBS, global, project, cost, and resource security considerations. If there is some concern that users are not seeing expected data in P6 Analytics here are some areas to review: 1. Determining if a user has cost security is based on the Project level security privileges - either View Project Costs/Financials or Edit EPS Financials. If expecting to see costs make sure one of these permissions are allocated. 2. User must have OBS access on a Project. Not WBS level. WBS level security is not supported. Make sure user has OBS on project level. 3. Resource Access is determined by what is granted in P6. Verify the resource access granted to this user in P6. Resource security is hierarchical. Project access will override Resource access based on the way security policies are applied. 4. Module access must be given to a P6 user for that user to come over into Star/P6 Analytics. For earlier version of RDB there was a report_user_flag on the Users table. This flag field is no longer used after P6 Reporting Database 2.1. 5. For P6 Reporting Database versions 2.2 and higher, the Extended Schema Security service must be run to calculate all security. Any changes to privileges or security this service must be rerun before any ETL. 6. In P6 Analytics 2.0 or higher, a Weblogic user must exist that matches the P6 username. For example user Tim must exist in P6 and Weblogic users for Tim to be able to log into P6 Analytics and access data based on P6 security. In earlier versions the username needed to exist in RPD. 7. Cache in OBI is another area that can sometimes make it seem a user isn't seeing the data they expect. While cache can be beneficial for performance in OBI. If the data is outdated it can retrieve older, stale data. Clearing or turning off cache when rerunning a query can determine if the returned result set was from cache or from the database.

Read the article
Evaluating Solutions to Manage Product Compliance? Don't Wait Much Longer

- by Kerrie Foy

Depending on severity, product compliance issues can cause all sorts of problems from run-away budgets to business closures. But effective policies and safeguards can create a strong foundation for innovation, productivity, market penetration and competitive advantage. If you’ve been putting off a systematic approach to product compliance, it is time to reconsider that decision, or indecision. Why now? No matter what industry, companies face a litany of worldwide and regional regulations that require proof of product compliance and environmental friendliness for market access. For example, Restriction of Hazardous Substances (RoHS) is a regulation that restricts the use of six dangerous materials used in the manufacture of electronic and electrical equipment. ROHS was originally adopted by the European Union in 2003 for implementation in 2006, and it has evolved over time through various regional versions for North America, China, Japan, Korea, Norway and Turkey. In addition, the RoHS directive allowed for material exemptions used in Medical Devices, but that exemption ends in 2014. Additional regulations worth watching are the Battery Directive, Waste Electrical and Electronic Equipment (WEEE), and Registration, Evaluation, Authorization and Restriction of Chemicals (REACH) directives. Additional evolving regulations are coming from governing bodies like the Food and Drug Administration (FDA) and the International Organization for Standardization (ISO). Corporate sustainability initiatives are also gaining urgency and influencing product design. In a survey of 405 corporations in the Global 500 by Carbon Disclosure Project, co-written by PwC (CDP Global 500 Climate Change Report 2012 entitled Business Resilience in an Uncertain, Resource-Constrained World), 48% of the respondents indicated they saw potential to create new products and business services as a response to climate change. Just 21% reported a dedicated budget for the research. However, the report goes on to explain that those few companies are winning over new customers and driving additional profits by exploiting their abilities to adapt to environmental needs. The article cites Dell as an example – Dell has invested in research to develop new products designed to reduce its customers’ emissions by more than 10 million metric tons of CO2e per year. This reduction in emissions should save Dell’s customers over $1billion per year as a result! Over time we expect to see many additional companies prove that eco-design provides marketplace benefits through differentiation and direct customer value. How do you meet compliance requirements and also successfully invest in eco-friendly designs? No doubt companies struggle to answer this question. After all, the journey to get there may involve transforming business models, go-to-market strategies, supply networks, quality assurance policies and compliance processes per the rapidly evolving global and regional directives. There may be limited executive focus on the initiative, inability to quantify noncompliance, or not enough resources to justify investment. To make things even more difficult to address, compliance responsibility can be a passionate topic within an organization, making the prospect of change on an enterprise scale problematic and time-consuming. Without a single source of truth for product data and without proper processes in place, ensuring product compliance burgeons into a crushing task that is cost-prohibitive and overwhelming to an organization. With all the overhead, certain markets or demographics become simply inaccessible. Therefore, the risk to consumer goodwill and satisfaction, revenue, business continuity, and market potential is too great not to solve the compliance challenge. Companies are beginning to adapt and even thrive in today’s highly regulated and transparent environment by implementing systematic approaches to product compliance that are more than functional bandages but revenue-generating engines. Consider partnering with Oracle to help you address your compliance needs. Many of the world’s most innovative leaders and pioneers are leveraging Oracle’s Agile Product Lifecycle Management (PLM) portfolio of enterprise applications to manage the product value chain, centralize product data, automate processes, and launch more eco-friendly products to market faster. Particularly, the Agile Product Governance & Compliance (PG&C) solution provides out-of-the-box functionality to integrate actionable regulatory information into the enterprise product record from the ideation to the disposal/recycling phase. Agile PG&C makes it possible to efficiently manage compliance per corporate green initiatives as well as regional and global directives. Options are critical, but so is ease-of-use. Anyone who’s grappled with compliance policy knows legal interpretation plays a major role in determining how an organization responds to regulation. Agile PG&C gives you the freedom to configure product compliance per your needs, while maintaining rigorous control over the product record in an easy-to-use interface that facilitates adoption efforts. It allows you to assign regulations as specifications for a part or BOM roll-up. Each specification has a threshold value that alerts you to a non-compliance issue if the threshold value is exceeded. Set however many regulations as specifications you need to make sure a product can be sold in your target countries. Another option is to implement like one of our leading consumer electronics customers and define your own “catch-all” specification to ensure compliance in all markets. You can give your suppliers secure access to enter their component data or integrate a third party’s data. With Agile PG&C you are able to design compliance earlier into your products to reduce cost and improve quality downstream when stakes are higher. Agile PG&C is a comprehensive solution that makes product compliance more reliable and efficient. Throughout product lifecycles, use the solution to support full material disclosures, efficiently manage declarations with your suppliers, feed compliance data into a corrective action if a product must be changed, and swiftly satisfy audits by showing all due diligence tracked in one solution. Given the compounding regulation and consumer focus on urgent environmental issues, now is the time to act. Implementing an enterprise, systematic approach to product compliance is a competitive investment. From the start, Agile Product Governance & Compliance enables companies to confidently design for compliance and sustainability, reduce the cost of compliance, minimize the risk of business interruption, deliver responsible products, and inspire new innovation. Don’t wait any longer! To find out more about Agile Product Governance & Compliance download the data sheet, contact your sales representative, or call Oracle at 1-800-633-0738. Many thanks to Shane Goodwin, Senior Manager, Oracle Agile PLM Product Management, for contributions to this article.

Read the article
SQL SERVER – Storing 64-bit Unsigned Integer Value in Database

- by Pinal Dave

Here is a very interesting question I received in an email just another day. Some questions just are so good that it makes me wonder how come I have not faced it first hand. Anyway here is the question - “Pinal, I am migrating my database from MySQL to SQL Server and I have faced unique situation. I have been using Unsigned 64-bit integer in MySQL but when I try to migrate that column to SQL Server, I am facing an issue as there is no datatype which I find appropriate for my column. It is now too late to change the datatype and I need immediate solution. One chain of thought was to change the data type of the column from Unsigned 64-bit (BIGINT) to VARCHAR(n) but that will just change the data type for me such that I will face quite a lot of performance related issues in future. In SQL Server we also have the BIGINT data type but that is Signed 64-bit datatype. BIGINT datatype in SQL Server have range of -2^63 (-9,223,372,036,854,775,808) to 2^63-1 (9,223,372,036,854,775,807). However, my digit is much larger than this number. Is there anyway, I can store my big 64-bit Unsigned Integer without loosing much of the performance of by converting it to VARCHAR.” Very interesting question, for the sake of the argument, we can ask user that there should be no need of such a big number or if you are taking about identity column I really doubt that if your table will grow beyond this table. Here the real question which I found interesting was how to store 64-bit unsigned integer value in SQL Server without converting it to String data type. After thinking a bit, I found a fairly simple answer. I can use NUMERIC data type. I can use NUMERIC(20) datatype for 64-bit unsigned integer value, NUMERIC(10) datatype for 32-bit unsigned integer value and NUMERIC(5) datatype for 16-bit unsigned integer value. Numeric datatype supports 38 maximum of 38 precision. Now here is another thing to keep in mind. Using NUMERIC datatype will indeed accept the 64-bit unsigned integer but in future if you try to enter negative value, it will also allow the same. Hence, you will need to put any additional constraint over column to only accept positive integer there. Here is another big concern, SQL Server will store the number as numeric and will treat that as a positive integer for all the practical purpose. You will have to write in your application logic to interpret that as a 64-bit Unsigned Integer. On another side if you are using unsigned integers in your application, there are good chance that you already have logic taking care of the same. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology Tagged: SQL Datatype

Read the article
Software center is broken and can not be repaired or reinstalled

- by Michal

When I open the software center, I am told that I can not use it, for it is broken, and needs to be repaired. First I try to do this automatically, as I was offered. I enter a root password, and then the installation fails. installArchives() failed: reconfiguring packages... reconfiguring packages... reconfiguring packages... reconfiguring packages... (Reading database ... (Reading database ... 5% (Reading database ... 10% (Reading database ... 15% (Reading database ... 20% (Reading database ... 25% (Reading database ... 30% (Reading database ... 35% (Reading database ... 40% (Reading database ... 45% (Reading database ... 50% (Reading database ... 55% (Reading database ... 60% (Reading database ... 65% (Reading database ... 70% (Reading database ... 75% (Reading database ... 80% (Reading database ... 85% (Reading database ... 90% (Reading database ... 95% (Reading database ... 100% (Reading database ... 275048 files and directories currently installed.) Unpacking wine1.4-i386 (from .../wine1.4-i386_1.4-0ubuntu4.1_i386.deb) ... dpkg: error processing /var/cache/apt/archives/wine1.4-i386_1.4-0ubuntu4.1_i386.deb (--unpack): trying to overwrite '/usr/bin/wine', which is also in package wine1.5 1.5.5-0ubuntu1~ppa1~oneiric1+pulse17 No apport report written because MaxReports is reached already dpkg-deb: error: subprocess paste was killed by signal (Broken pipe) Errors were encountered while processing: /var/cache/apt/archives/wine1.4-i386_1.4-0ubuntu4.1_i386.deb Error in function: dpkg: dependency problems prevent configuration of wine1.4-common: wine1.4-common depends on wine1.4 (= 1.4-0ubuntu4.1); however: Package wine1.4 is not installed. dpkg: error processing wine1.4-common (--configure): dependency problems - leaving unconfigured What should I do now? First of all, I've tried reinstalling the center, but it failed due to the same 1.4 dependency as is laid out here. I've googled for help and although I don't understand linux at all, I've tried some suggestions: I've tried editing the dpkg status in /var/lib/dpkg/status which failed because the file could not be found. I've tried purging wine/* but that had unresolved dependencies as well. It's a giant mess.

Read the article
Designing a completely new database/gui solution for my compnay

- by user1277304

I'm no expert when it come to everything Visual Studio 2010 and utilizing SQL server 2008. I'm sure some of my personal projects I've built for personal use would get laughed off the face of the planet, but SQLCe has been the solution I was looking for those home type of projects. And they work, flawlessly. Now I feel it's time to step up to the big league. I want to develop a complete, unified and module based solution for my company that I'm working for. We're still using stuff from the 80s for goodness sake! I use Excel and query the ancient database on my own because I can't stand the GUI. Nothing against people of age, but the IDE our programmers are using is from the stone age, and they use APL of all things with it. I've yet to see a radio button control anywhere in the GUI where it would make sense. Anyway, I want to do this right from the ground up. I'm by no means a newbie when it comes to programming in .NET 2010, however, I want the entire solution to be professionally done. I want version control, test projects, project flow, SQL 2008 integration and all the bells and whistles that come with that. I know for a fact that if we had something like that running, not only would development costs and time be slashed four fold, but the possibilities for expansion and performance would sky rocket. (Between the GUI an our DB engine, it can only use ONE CORE! ONE! It's 2012 for goodness sake!) Our business is growing and our current ancient solution just can't keep up, and I'd hate to see our business go down in flames because our programmer is stuck in the 80's and refuses to use anything current. So I ask you guys, the experts and know-it-alls, where do I start? Are there any gems of good books out there in the haystack of all "This for dummies" type of deals? I already have several people backing me in this endeavor, and while it may seem brash to just usurp the current programmers, I'm doing this for the company as a whole.

Read the article
Master-slave vs. peer-to-peer archictecture: benefits and problems

- by Ashok_Ora

Normal 0 false false false EN-US X-NONE X-NONE Almost two decades ago, I was a member of a database development team that introduced adaptive locking. Locking, the most popular concurrency control technique in database systems, is pessimistic. Locking ensures that two or more conflicting operations on the same data item don’t “trample” on each other’s toes, resulting in data corruption. In a nutshell, here’s the issue we were trying to address. In everyday life, traffic lights serve the same purpose. They ensure that traffic flows smoothly and when everyone follows the rules, there are no accidents at intersections. As I mentioned earlier, the problem with typical locking protocols is that they are pessimistic. Regardless of whether there is another conflicting operation in the system or not, you have to hold a lock! Acquiring and releasing locks can be quite expensive, depending on how many objects the transaction touches. Every transaction has to pay this penalty. To use the earlier traffic light analogy, if you have ever waited at a red light in the middle of nowhere with no one on the road, wondering why you need to wait when there’s clearly no danger of a collision, you know what I mean. The adaptive locking scheme that we invented was able to minimize the number of locks that a transaction held, by detecting whether there were one or more transactions that needed conflicting eyou could get by without holding any lock at all. In many “well-behaved” workloads, there are few conflicts, so this optimization is a huge win. If, on the other hand, there are many concurrent, conflicting requests, the algorithm gracefully degrades to the “normal” behavior with minimal cost. We were able to reduce the number of lock requests per TPC-B transaction from 178 requests down to 2! Wow! This is a dramatic improvement in concurrency as well as transaction latency. The lesson from this exercise was that if you can identify the common scenario and optimize for that case so that only the uncommon scenarios are more expensive, you can make dramatic improvements in performance without sacrificing correctness. So how does this relate to the architecture and design of some of the modern NoSQL systems? NoSQL systems can be broadly classified as master-slave sharded, or peer-to-peer sharded systems. NoSQL systems with a peer-to-peer architecture have an interesting way of handling changes. Whenever an item is changed, the client (or an intermediary) propagates the changes synchronously or asynchronously to multiple copies (for availability) of the data. Since the change can be propagated asynchronously, during some interval in time, it will be the case that some copies have received the update, and others haven’t. What happens if someone tries to read the item during this interval? The client in a peer-to-peer system will fetch the same item from multiple copies and compare them to each other. If they’re all the same, then every copy that was queried has the same (and up-to-date) value of the data item, so all’s good. If not, then the system provides a mechanism to reconcile the discrepancy and to update stale copies. So what’s the problem with this? There are two major issues: First, IT’S HORRIBLY PESSIMISTIC because, in the common case, it is unlikely that the same data item will be updated and read from different locations at around the same time! For every read operation, you have to read from multiple copies. That’s a pretty expensive, especially if the data are stored in multiple geographically separate locations and network latencies are high. Second, if the copies are not all the same, the application has to reconcile the differences and propagate the correct value to the out-dated copies. This means that the application program has to handle discrepancies in the different versions of the data item and resolve the issue (which can further add to cost and operation latency). Resolving discrepancies is only one part of the problem. What if the same data item was updated independently on two different nodes (copies)? In that case, due to the asynchronous nature of change propagation, you might land up with different versions of the data item in different copies. In this case, the application program also has to resolve conflicts and then propagate the correct value to the copies that are out-dated or have incorrect versions. This can get really complicated. My hunch is that there are many peer-to-peer-based applications that don’t handle this correctly, and worse, don’t even know it. Imagine have 100s of millions of records in your database – how can you tell whether a particular data item is incorrect or out of date? And what price are you willing to pay for ensuring that the data can be trusted? Multiple network messages per read request? Discrepancy and conflict resolution logic in the application, and potentially, additional messages? All this overhead, when all you were trying to do was to read a data item. Wouldn’t it be simpler to avoid this problem in the first place? Master-slave architectures like the Oracle NoSQL Database handles this very elegantly. A change to a data item is always sent to the master copy. Consequently, the master copy always has the most current and authoritative version of the data item. The master is also responsible for propagating the change to the other copies (for availability and read scalability). Client drivers are aware of master copies and replicas, and client drivers are also aware of the “currency” of a replica. In other words, each NoSQL Database client knows how stale a replica is. This vastly simplifies the job of the application developer. If the application needs the most current version of the data item, the client driver will automatically route the request to the master copy. If the application is willing to tolerate some staleness of data (e.g. a version that is no more than 1 second out of date), the client can easily determine which replica (or set of replicas) can satisfy the request, and route the request to the most efficient copy. This results in a dramatic simplification in application logic and also minimizes network requests (the driver will only send the request to exactl the right replica, not many). So, back to my original point. A well designed and well architected system minimizes or eliminates unnecessary overhead and avoids pessimistic algorithms wherever possible in order to deliver a highly efficient and high performance system. If you’ve every programmed an Oracle NoSQL Database application, you’ll know the difference! /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin-top:0in; mso-para-margin-right:0in; mso-para-margin-bottom:10.0pt; mso-para-margin-left:0in; line-height:115%; mso-pagination:widow-orphan; font-size:11.0pt; font-family:"Calibri","sans-serif"; mso-ascii-font-family:Calibri; mso-ascii-theme-font:minor-latin; mso-fareast-font-family:"Times New Roman"; mso-fareast-theme-font:minor-fareast; mso-hansi-font-family:Calibri; mso-hansi-theme-font:minor-latin;}

Read the article
Managing Data Growth in SQL Server

'Help, my database ate my disk drives!'. Many DBAs spend most of their time dealing with variations of the problem of database processes consuming too much disk space. This happens because of errors such as incorrect configurations for recovery models, data growth for large objects and queries that overtax TempDB resources. Rodney describes, with some feeling, the errors that can lead to this sort of crisis for the working DBA, and their solution.

Read the article

< Previous Page | 357 358 359 360 361 362 363 364 365 366 367 368 | Next Page >