Search Results

Search found 45150 results on 1806 pages for 'oracle technology database'.

Page 394/1806 | < Previous Page | 390 391 392 393 394 395 396 397 398 399 400 401 | Next Page >

CBO????????

- by Liu Maclean(???)

???Itpub????????CBO??????????, ????????: SQL> create table maclean1 as select * from dba_objects; Table created. SQL> update maclean1 set status='INVALID' where owner='MACLEAN'; 2 rows updated. SQL> commit; Commit complete. SQL> create index ind_maclean1 on maclean1(status); Index created. SQL> exec dbms_stats.gather_table_stats('SYS','MACLEAN1',cascade=>true); PL/SQL procedure successfully completed. SQL> explain plan for select * from maclean1 where status='INVALID'; Explained. SQL> set linesize 140 pagesize 1400 SQL> select * from table(dbms_xplan.display()); PLAN_TABLE_OUTPUT --------------------------------------------------------------------------- Plan hash value: 987568083 ------------------------------------------------------------------------------ | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------ | 0 | SELECT STATEMENT | | 11320 | 1028K| 85 (0)| 00:00:02 | |* 1 | TABLE ACCESS FULL| MACLEAN1 | 11320 | 1028K| 85 (0)| 00:00:02 | ------------------------------------------------------------------------------ Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter("STATUS"='INVALID') 13 rows selected. 10053 trace Access path analysis for MACLEAN1 *************************************** SINGLE TABLE ACCESS PATH Single Table Cardinality Estimation for MACLEAN1[MACLEAN1] Column (#10): STATUS( AvgLen: 7 NDV: 2 Nulls: 0 Density: 0.500000 Table: MACLEAN1 Alias: MACLEAN1 Card: Original: 22639.000000 Rounded: 11320 Computed: 11319.50 Non Adjusted: 11319.50 Access Path: TableScan Cost: 85.33 Resp: 85.33 Degree: 0 Cost_io: 85.00 Cost_cpu: 11935345 Resp_io: 85.00 Resp_cpu: 11935345 Access Path: index (AllEqRange) Index: IND_MACLEAN1 resc_io: 185.00 resc_cpu: 8449916 ix_sel: 0.500000 ix_sel_with_filters: 0.500000 Cost: 185.24 Resp: 185.24 Degree: 1 Best:: AccessPath: TableScan Cost: 85.33 Degree: 1 Resp: 85.33 Card: 11319.50 Bytes: 0 ?????10053????????????,?????Density = 0.5 ?? 1/ NDV ??? ??????????????STATUS='INVALID"???????????, ????????????????? ????”STATUS”=’INVALID’ condition???2?,?status??????,??????dbms_stats?????????????,???CBO????INDEX Range ind_maclean1,???????,??????opitimizer?????? ?????????????????????????,????????,??????????status=’INVALID’???????card??,????????: [oracle@vrh4 ~]$ sqlplus / as sysdba SQL*Plus: Release 11.2.0.2.0 Production on Mon Oct 17 19:15:45 2011 Copyright (c) 1982, 2010, Oracle. All rights reserved. Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production With the Partitioning, OLAP, Data Mining and Real Application Testing options SQL> select * from v$version; BANNER -------------------------------------------------------------------------------- Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production PL/SQL Release 11.2.0.2.0 - Production CORE 11.2.0.2.0 Production TNS for Linux: Version 11.2.0.2.0 - Production NLSRTL Version 11.2.0.2.0 - Production SQL> show parameter optimizer_fea NAME TYPE VALUE ------------------------------------ ----------- ------------------------------ optimizer_features_enable string 11.2.0.2 SQL> select * from global_name; GLOBAL_NAME -------------------------------------------------------------------------------- www.oracledatabase12g.com & www.askmaclean.com SQL> drop table maclean; Table dropped. SQL> create table maclean as select * from dba_objects; Table created. SQL> update maclean set status='INVALID' where owner='MACLEAN'; 2 rows updated. SQL> commit; Commit complete. SQL> create index ind_maclean on maclean(status); Index created. SQL> exec dbms_stats.gather_table_stats('SYS','MACLEAN',cascade=>true, method_opt=>'FOR ALL COLUMNS SIZE 2'); PL/SQL procedure successfully completed. ???????2?bucket????, ??????????????? ???Quest???Guy Harrison???????FREQUENCY????????,??????: rem rem Generate a histogram of data distribution in a column as recorded rem in dba_tab_histograms rem rem Guy Harrison Jan 2010 : www.guyharrison.net rem rem hexstr function is from From http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:707586567563 set pagesize 10000 set lines 120 set verify off col char_value format a10 heading "Endpoint|value" col bucket_count format 99,999,999 heading "bucket|count" col pct format 999.99 heading "Pct" col pct_of_max format a62 heading "Pct of|Max value" rem col endpoint_value format 9999999999999 heading "endpoint|value" CREATE OR REPLACE FUNCTION hexstr (p_number IN NUMBER) RETURN VARCHAR2 AS l_str LONG := TO_CHAR (p_number, 'fm' || RPAD ('x', 50, 'x')); l_return VARCHAR2 (4000); BEGIN WHILE (l_str IS NOT NULL) LOOP l_return := l_return || CHR (TO_NUMBER (SUBSTR (l_str, 1, 2), 'xx')); l_str := SUBSTR (l_str, 3); END LOOP; RETURN (SUBSTR (l_return, 1, 6)); END; / WITH hist_data AS ( SELECT endpoint_value,endpoint_actual_value, NVL(LAG (endpoint_value) OVER (ORDER BY endpoint_value),' ') prev_value, endpoint_number, endpoint_number, endpoint_number - NVL (LAG (endpoint_number) OVER (ORDER BY endpoint_value), 0) bucket_count FROM dba_tab_histograms JOIN dba_tab_col_statistics USING (owner, table_name,column_name) WHERE owner = '&owner' AND table_name = '&table' AND column_name = '&column' AND histogram='FREQUENCY') SELECT nvl(endpoint_actual_value,endpoint_value) endpoint_value , bucket_count, ROUND(bucket_count*100/SUM(bucket_count) OVER(),2) PCT, RPAD(' ',ROUND(bucket_count*50/MAX(bucket_count) OVER()),'*') pct_of_max FROM hist_data; WITH hist_data AS ( SELECT endpoint_value,endpoint_actual_value, NVL(LAG (endpoint_value) OVER (ORDER BY endpoint_value),' ') prev_value, endpoint_number, endpoint_number, endpoint_number - NVL (LAG (endpoint_number) OVER (ORDER BY endpoint_value), 0) bucket_count FROM dba_tab_histograms JOIN dba_tab_col_statistics USING (owner, table_name,column_name) WHERE owner = '&owner' AND table_name = '&table' AND column_name = '&column' AND histogram='FREQUENCY') SELECT hexstr(endpoint_value) char_value, bucket_count, ROUND(bucket_count*100/SUM(bucket_count) OVER(),2) PCT, RPAD(' ',ROUND(bucket_count*50/MAX(bucket_count) OVER()),'*') pct_of_max FROM hist_data ORDER BY endpoint_value; ?????,??????????FREQUENCY?????: ??dbms_stats ?????STATUS=’INVALID’ bucket count=9 percent = 0.04 ,??????10053 trace????????: SQL> explain plan for select * from maclean where status='INVALID'; Explained. SQL> select * from table(dbms_xplan.display()); PLAN_TABLE_OUTPUT ------------------------------------- Plan hash value: 3087014066 ------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 9 | 837 | 2 (0)| 00:00:01 | | 1 | TABLE ACCESS BY INDEX ROWID| MACLEAN | 9 | 837 | 2 (0)| 00:00:01 | |* 2 | INDEX RANGE SCAN | IND_MACLEAN | 9 | | 1 (0)| 00:00:01 | ------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - access("STATUS"='INVALID') ??????????????CBO???????STATUS=’INVALID’?cardnality?? , ??????????? ,??index range scan??Full table scan? ????????????????10053 trace: SQL> alter system flush shared_pool; System altered. SQL> oradebug setmypid; Statement processed. SQL> oradebug event 10053 trace name context forever ,level 1; Statement processed. SQL> explain plan for select * from maclean where status='INVALID'; Explained. SINGLE TABLE ACCESS PATH Single Table Cardinality Estimation for MACLEAN[MACLEAN] Column (#10): NewDensity:0.000199, OldDensity:0.000022 BktCnt:22640, PopBktCnt:22640, PopValCnt:2, NDV:2 ???NewDensity= bucket_count / SUM(bucket_count) /2 Column (#10): STATUS( AvgLen: 7 NDV: 2 Nulls: 0 Density: 0.000199 Histogram: Freq #Bkts: 2 UncompBkts: 22640 EndPtVals: 2 Table: MACLEAN Alias: MACLEAN Card: Original: 22640.000000 Rounded: 9 Computed: 9.00 Non Adjusted: 9.00 Access Path: TableScan Cost: 85.30 Resp: 85.30 Degree: 0 Cost_io: 85.00 Cost_cpu: 10804625 Resp_io: 85.00 Resp_cpu: 10804625 Access Path: index (AllEqRange) Index: IND_MACLEAN resc_io: 2.00 resc_cpu: 20763 ix_sel: 0.000398 ix_sel_with_filters: 0.000398 Cost: 2.00 Resp: 2.00 Degree: 1 Best:: AccessPath: IndexRange Index: IND_MACLEAN Cost: 2.00 Degree: 1 Resp: 2.00 Card: 9.00 Bytes: 0 ???????????2 bucket?????CBO????????????,???????????????????,???dbms_stats.DEFAULT_METHOD_OPT????????????????????? ???dbms_stats?????????????????????col_usage$??????predicate???????,??col_usage$??<????????SMON??(?):??col_usage$????>? ??????????dbms_stats????????,col_usage$????????????predicate???,??dbms_stats??????????????????, ?: SQL> drop table maclean; Table dropped. SQL> create table maclean as select * from dba_objects; Table created. SQL> update maclean set status='INVALID' where owner='MACLEAN'; 2 rows updated. SQL> commit; Commit complete. SQL> create index ind_maclean on maclean(status); Index created. ??dbms_stats??method_opt??maclean? SQL> exec dbms_stats.gather_table_stats('SYS','MACLEAN'); PL/SQL procedure successfully completed. @histogram.sql Enter value for owner: SYS old 12: WHERE owner = '&owner' new 12: WHERE owner = 'SYS' Enter value for table: MACLEAN old 13: AND table_name = '&table' new 13: AND table_name = 'MACLEAN' Enter value for column: STATUS old 14: AND column_name = '&column' new 14: AND column_name = 'STATUS' no rows selected ????col_usage$?????,????????status????? declare begin for i in 1..500 loop execute immediate ' alter system flush shared_pool'; DBMS_STATS.FLUSH_DATABASE_MONITORING_INFO; execute immediate 'select count(*) from maclean where status=''INVALID'' ' ; end loop; end; / PL/SQL procedure successfully completed. SQL> select obj# from obj$ where name='MACLEAN'; OBJ# ---------- 97215 SQL> select * from col_usage$ where OBJ#=97215; OBJ# INTCOL# EQUALITY_PREDS EQUIJOIN_PREDS NONEQUIJOIN_PREDS RANGE_PREDS LIKE_PREDS NULL_PREDS TIMESTAMP ---------- ---------- -------------- -------------- ----------------- ----------- ---------- ---------- --------- 97215 1 1 0 0 0 0 0 17-OCT-11 97215 10 499 0 0 0 0 0 17-OCT-11 SQL> exec dbms_stats.gather_table_stats('SYS','MACLEAN'); PL/SQL procedure successfully completed. @histogram.sql Enter value for owner: SYS Enter value for table: MACLEAN Enter value for column: STATUS Endpoint bucket Pct of value count Pct Max value ---------- ----------- ------- -------------------------------------------------------------- INVALI 2 .04 VALIC3 5,453 99.96 *************************************************

Read the article
How to find and fix performance problems in ORM powered applications

- by FransBouma

Once in a while we get requests about how to fix performance problems with our framework. As it comes down to following the same steps and looking into the same things every single time, I decided to write a blogpost about it instead, so more people can learn from this and solve performance problems in their O/R mapper powered applications. In some parts it's focused on LLBLGen Pro but it's also usable for other O/R mapping frameworks, as the vast majority of performance problems in O/R mapper powered applications are not specific for a certain O/R mapper framework. Too often, the developer looks at the wrong part of the application, trying to fix what isn't a problem in that part, and getting frustrated that 'things are so slow with <insert your favorite framework X here>'. I'm in the O/R mapper business for a long time now (almost 10 years, full time) and as it's a small world, we O/R mapper developers know almost all tricks to pull off by now: we all know what to do to make task ABC faster and what compromises (because there are almost always compromises) to deal with if we decide to make ABC faster that way. Some O/R mapper frameworks are faster in X, others in Y, but you can be sure the difference is mainly a result of a compromise some developers are willing to deal with and others aren't. That's why the O/R mapper frameworks on the market today are different in many ways, even though they all fetch and save entities from and to a database. I'm not suggesting there's no room for improvement in today's O/R mapper frameworks, there always is, but it's not a matter of 'the slowness of the application is caused by the O/R mapper' anymore. Perhaps query generation can be optimized a bit here, row materialization can be optimized a bit there, but it's mainly coming down to milliseconds. Still worth it if you're a framework developer, but it's not much compared to the time spend inside databases and in user code: if a complete fetch takes 40ms or 50ms (from call to entity object collection), it won't make a difference for your application as that 10ms difference won't be noticed. That's why it's very important to find the real locations of the problems so developers can fix them properly and don't get frustrated because their quest to get a fast, performing application failed. Performance tuning basics and rules Finding and fixing performance problems in any application is a strict procedure with four prescribed steps: isolate, analyze, interpret and fix, in that order. It's key that you don't skip a step nor make assumptions: these steps help you find the reason of a problem which seems to be there, and how to fix it or leave it as-is. Skipping a step, or when you assume things will be bad/slow without doing analysis will lead to the path of premature optimization and won't actually solve your problems, only create new ones. The most important rule of finding and fixing performance problems in software is that you have to understand what 'performance problem' actually means. Most developers will say "when a piece of software / code is slow, you have a performance problem". But is that actually the case? If I write a Linq query which will aggregate, group and sort 5 million rows from several tables to produce a resultset of 10 rows, it might take more than a couple of milliseconds before that resultset is ready to be consumed by other logic. If I solely look at the Linq query, the code consuming the resultset of the 10 rows and then look at the time it takes to complete the whole procedure, it will appear to me to be slow: all that time taken to produce and consume 10 rows? But if you look closer, if you analyze and interpret the situation, you'll see it does a tremendous amount of work, and in that light it might even be extremely fast. With every performance problem you encounter, always do realize that what you're trying to solve is perhaps not a technical problem at all, but a perception problem. The second most important rule you have to understand is based on the old saying "Penny wise, Pound Foolish": the part which takes e.g. 5% of the total time T for a given task isn't worth optimizing if you have another part which takes a much larger part of the total time T for that same given task. Optimizing parts which are relatively insignificant for the total time taken is not going to bring you better results overall, even if you totally optimize that part away. This is the core reason why analysis of the complete set of application parts which participate in a given task is key to being successful in solving performance problems: No analysis -> no problem -> no solution. One warning up front: hunting for performance will always include making compromises. Fast software can be made maintainable, but if you want to squeeze as much performance out of your software, you will inevitably be faced with the dilemma of compromising one or more from the group {readability, maintainability, features} for the extra performance you think you'll gain. It's then up to you to decide whether it's worth it. In almost all cases it's not. The reason for this is simple: the vast majority of performance problems can be solved by implementing the proper algorithms, the ones with proven Big O-characteristics so you know the performance you'll get plus you know the algorithm will work. The time taken by the algorithm implementing code is inevitable: you already implemented the best algorithm. You might find some optimizations on the technical level but in general these are minor. Let's look at the four steps to see how they guide us through the quest to find and fix performance problems. Isolate The first thing you need to do is to isolate the areas in your application which are assumed to be slow. For example, if your application is a web application and a given page is taking several seconds or even minutes to load, it's a good candidate to check out. It's important to start with the isolate step because it allows you to focus on a single code path per area with a clear begin and end and ignore the rest. The rest of the steps are taken per identified problematic area. Keep in mind that isolation focuses on tasks in an application, not code snippets. A task is something that's started in your application by either another task or the user, or another program, and has a beginning and an end. You can see a task as a piece of functionality offered by your application. Analyze Once you've determined the problem areas, you have to perform analysis on the code paths of each area, to see where the performance problems occur and which areas are not the problem. This is a multi-layered effort: an application which uses an O/R mapper typically consists of multiple parts: there's likely some kind of interface (web, webservice, windows etc.), a part which controls the interface and business logic, the O/R mapper part and the RDBMS, all connected with either a network or inter-process connections provided by the OS or other means. Each of these parts, including the connectivity plumbing, eat up a part of the total time it takes to complete a task, e.g. load a webpage with all orders of a given customer X. To understand which parts participate in the task / area we're investigating and how much they contribute to the total time taken to complete the task, analysis of each participating task is essential. Start with the code you wrote which starts the task, analyze the code and track the path it follows through your application. What does the code do along the way, verify whether it's correct or not. Analyze whether you have implemented the right algorithms in your code for this particular area. Remember we're looking at one area at a time, which means we're ignoring all other code paths, just the code path of the current problematic area, from begin to end and back. Don't dig in and start optimizing at the code level just yet. We're just analyzing. If your analysis reveals big architectural stupidity, it's perhaps a good idea to rethink the architecture at this point. For the rest, we're analyzing which means we collect data about what could be wrong, for each participating part of the complete application. Reviewing the code you wrote is a good tool to get deeper understanding of what is going on for a given task but ultimately it lacks precision and overview what really happens: humans aren't good code interpreters, computers are. We therefore need to utilize tools to get deeper understanding about which parts contribute how much time to the total task, triggered by which other parts and for example how many times are they called. There are two different kind of tools which are necessary: .NET profilers and O/R mapper / RDBMS profilers. .NET profiling .NET profilers (e.g. dotTrace by JetBrains or Ants by Red Gate software) show exactly which pieces of code are called, how many times they're called, and the time it took to run that piece of code, at the method level and sometimes even at the line level. The .NET profilers are essential tools for understanding whether the time taken to complete a given task / area in your application is consumed by .NET code, where exactly in your code, the path to that code, how many times that code was called by other code and thus reveals where hotspots are located: the areas where a solution can be found. Importantly, they also reveal which areas can be left alone: remember our penny wise pound foolish saying: if a profiler reveals that a group of methods are fast, or don't contribute much to the total time taken for a given task, ignore them. Even if the code in them is perhaps complex and looks like a candidate for optimization: you can work all day on that, it won't matter. As we're focusing on a single area of the application, it's best to start profiling right before you actually activate the task/area. Most .NET profilers support this by starting the application without starting the profiling procedure just yet. You navigate to the particular part which is slow, start profiling in the profiler, in your application you perform the actions which are considered slow, and afterwards you get a snapshot in the profiler. The snapshot contains the data collected by the profiler during the slow action, so most data is produced by code in the area to investigate. This is important, because it allows you to stay focused on a single area. O/R mapper and RDBMS profiling .NET profilers give you a good insight in the .NET side of things, but not in the RDBMS side of the application. As this article is about O/R mapper powered applications, we're also looking at databases, and the software making it possible to consume the database in your application: the O/R mapper. To understand which parts of the O/R mapper and database participate how much to the total time taken for task T, we need different tools. There are two kind of tools focusing on O/R mappers and database performance profiling: O/R mapper profilers and RDBMS profilers. For O/R mapper profilers, you can look at LLBLGen Prof by hibernating rhinos or the Linq to Sql/LLBLGen Pro profiler by Huagati. Hibernating rhinos also have profilers for other O/R mappers like NHibernate (NHProf) and Entity Framework (EFProf) and work the same as LLBLGen Prof. For RDBMS profilers, you have to look whether the RDBMS vendor has a profiler. For example for SQL Server, the profiler is shipped with SQL Server, for Oracle it's build into the RDBMS, however there are also 3rd party tools. Which tool you're using isn't really important, what's important is that you get insight in which queries are executed during the task / area we're currently focused on and how long they took. Here, the O/R mapper profilers have an advantage as they collect the time it took to execute the query from the application's perspective so they also collect the time it took to transport data across the network. This is important because a query which returns a massive resultset or a resultset with large blob/clob/ntext/image fields takes more time to get transported across the network than a small resultset and a database profiler doesn't take this into account most of the time. Another tool to use in this case, which is more low level and not all O/R mappers support it (though LLBLGen Pro and NHibernate as well do) is tracing: most O/R mappers offer some form of tracing or logging system which you can use to collect the SQL generated and executed and often also other activity behind the scenes. While tracing can produce a tremendous amount of data in some cases, it also gives insight in what's going on. Interpret After we've completed the analysis step it's time to look at the data we've collected. We've done code reviews to see whether we've done anything stupid and which parts actually take place and if the proper algorithms have been implemented. We've done .NET profiling to see which parts are choke points and how much time they contribute to the total time taken to complete the task we're investigating. We've performed O/R mapper profiling and RDBMS profiling to see which queries were executed during the task, how many queries were generated and executed and how long they took to complete, including network transportation. All this data reveals two things: which parts are big contributors to the total time taken and which parts are irrelevant. Both aspects are very important. The parts which are irrelevant (i.e. don't contribute significantly to the total time taken) can be ignored from now on, we won't look at them. The parts which contribute a lot to the total time taken are important to look at. We now have to first look at the .NET profiler results, to see whether the time taken is consumed in our own code, in .NET framework code, in the O/R mapper itself or somewhere else. For example if most of the time is consumed by DbCommand.ExecuteReader, the time it took to complete the task is depending on the time the data is fetched from the database. If there was just 1 query executed, according to tracing or O/R mapper profilers / RDBMS profilers, check whether that query is optimal, uses indexes or has to deal with a lot of data. Interpret means that you follow the path from begin to end through the data collected and determine where, along the path, the most time is contributed. It also means that you have to check whether this was expected or is totally unexpected. My previous example of the 10 row resultset of a query which groups millions of rows will likely reveal that a long time is spend inside the database and almost no time is spend in the .NET code, meaning the RDBMS part contributes the most to the total time taken, the rest is compared to that time, irrelevant. Considering the vastness of the source data set, it's expected this will take some time. However, does it need tweaking? Perhaps all possible tweaks are already in place. In the interpret step you then have to decide that further action in this area is necessary or not, based on what the analysis results show: if the analysis results were unexpected and in the area where the most time is contributed to the total time taken is room for improvement, action should be taken. If not, you can only accept the situation and move on. In all cases, document your decision together with the analysis you've done. If you decide that the perceived performance problem is actually expected due to the nature of the task performed, it's essential that in the future when someone else looks at the application and starts asking questions you can answer them properly and new analysis is only necessary if situations changed. Fix After interpreting the analysis results you've concluded that some areas need adjustment. This is the fix step: you're actively correcting the performance problem with proper action targeted at the real cause. In many cases related to O/R mapper powered applications it means you'll use different features of the O/R mapper to achieve the same goal, or apply optimizations at the RDBMS level. It could also mean you apply caching inside your application (compromise memory consumption over performance) to avoid unnecessary re-querying data and re-consuming the results. After applying a change, it's key you re-do the analysis and interpretation steps: compare the results and expectations with what you had before, to see whether your actions had any effect or whether it moved the problem to a different part of the application. Don't fall into the trap to do partly analysis: do the full analysis again: .NET profiling and O/R mapper / RDBMS profiling. It might very well be that the changes you've made make one part faster but another part significantly slower, in such a way that the overall problem hasn't changed at all. Performance tuning is dealing with compromises and making choices: to use one feature over the other, to accept a higher memory footprint, to go away from the strict-OO path and execute queries directly onto the RDBMS, these are choices and compromises which will cross your path if you want to fix performance problems with respect to O/R mappers or data-access and databases in general. In most cases it's not a big issue: alternatives are often good choices too and the compromises aren't that hard to deal with. What is important is that you document why you made a choice, a compromise: which analysis data, which interpretation led you to the choice made. This is key for good maintainability in the years to come. Most common performance problems with O/R mappers Below is an incomplete list of common performance problems related to data-access / O/R mappers / RDBMS code. It will help you with fixing the hotspots you found in the interpretation step. SELECT N+1: (Lazy-loading specific). Lazy loading triggered performance bottlenecks. Consider a list of Orders bound to a grid. You have a Field mapped onto a related field in Order, Customer.CompanyName. Showing this column in the grid will make the grid fetch (indirectly) for each row the Customer row. This means you'll get for the single list not 1 query (for the orders) but 1+(the number of orders shown) queries. To solve this: use eager loading using a prefetch path to fetch the customers with the orders. SELECT N+1 is easy to spot with an O/R mapper profiler or RDBMS profiler: if you see a lot of identical queries executed at once, you have this problem. Prefetch paths using many path nodes or sorting, or limiting. Eager loading problem. Prefetch paths can help with performance, but as 1 query is fetched per node, it can be the number of data fetched in a child node is bigger than you think. Also consider that data in every node is merged on the client within the parent. This is fast, but it also can take some time if you fetch massive amounts of entities. If you keep fetches small, you can use tuning parameters like the ParameterizedPrefetchPathThreshold setting to get more optimal queries. Deep inheritance hierarchies of type Target Per Entity/Type. If you use inheritance of type Target per Entity / Type (each type in the inheritance hierarchy is mapped onto its own table/view), fetches will join subtype- and supertype tables in many cases, which can lead to a lot of performance problems if the hierarchy has many types. With this problem, keep inheritance to a minimum if possible, or switch to a hierarchy of type Target Per Hierarchy, which means all entities in the inheritance hierarchy are mapped onto the same table/view. Of course this has its own set of drawbacks, but it's a compromise you might want to take. Fetching massive amounts of data by fetching large lists of entities. LLBLGen Pro supports paging (and limiting the # of rows returned), which is often key to process through large sets of data. Use paging on the RDBMS if possible (so a query is executed which returns only the rows in the page requested). When using paging in a web application, be sure that you switch server-side paging on on the datasourcecontrol used. In this case, paging on the grid alone is not enough: this can lead to fetching a lot of data which is then loaded into the grid and paged there. Keep note that analyzing queries for paging could lead to the false assumption that paging doesn't occur, e.g. when the query contains a field of type ntext/image/clob/blob and DISTINCT can't be applied while it should have (e.g. due to a join): the datareader will do DISTINCT filtering on the client. this is a little slower but it does perform paging functionality on the data-reader so it won't fetch all rows even if the query suggests it does. Fetch massive amounts of data because blob/clob/ntext/image fields aren't excluded. LLBLGen Pro supports field exclusion for queries. You can exclude fields (also in prefetch paths) per query to avoid fetching all fields of an entity, e.g. when you don't need them for the logic consuming the resultset. Excluding fields can greatly reduce the amount of time spend on data-transport across the network. Use this optimization if you see that there's a big difference between query execution time on the RDBMS and the time reported by the .NET profiler for the ExecuteReader method call. Doing client-side aggregates/scalar calculations by consuming a lot of data. If possible, try to formulate a scalar query or group by query using the projection system or GetScalar functionality of LLBLGen Pro to do data consumption on the RDBMS server. It's far more efficient to process data on the RDBMS server than to first load it all in memory, then traverse the data in-memory to calculate a value. Using .ToList() constructs inside linq queries. It might be you use .ToList() somewhere in a Linq query which makes the query be run partially in-memory. Example: var q = from c in metaData.Customers.ToList() where c.Country=="Norway" select c; This will actually fetch all customers in-memory and do an in-memory filtering, as the linq query is defined on an IEnumerable<T>, and not on the IQueryable<T>. Linq is nice, but it can often be a bit unclear where some parts of a Linq query might run. Fetching all entities to delete into memory first. To delete a set of entities it's rather inefficient to first fetch them all into memory and then delete them one by one. It's more efficient to execute a DELETE FROM ... WHERE query on the database directly to delete the entities in one go. LLBLGen Pro supports this feature, and so do some other O/R mappers. It's not always possible to do this operation in the context of an O/R mapper however: if an O/R mapper relies on a cache, these kind of operations are likely not supported because they make it impossible to track whether an entity is actually removed from the DB and thus can be removed from the cache. Fetching all entities to update with an expression into memory first. Similar to the previous point: it is more efficient to update a set of entities directly with a single UPDATE query using an expression instead of fetching the entities into memory first and then updating the entities in a loop, and afterwards saving them. It might however be a compromise you don't want to take as it is working around the idea of having an object graph in memory which is manipulated and instead makes the code fully aware there's a RDBMS somewhere. Conclusion Performance tuning is almost always about compromises and making choices. It's also about knowing where to look and how the systems in play behave and should behave. The four steps I provided should help you stay focused on the real problem and lead you towards the solution. Knowing how to optimally use the systems participating in your own code (.NET framework, O/R mapper, RDBMS, network/services) is key for success as well as knowing what's going on inside the application you built. I hope you'll find this guide useful in tracking down performance problems and dealing with them in a useful way.

Read the article
Big Data’s Killer App…

- by jean-pierre.dijcks

Recently Keith spent some time talking about the cloud on this blog and I will spare you my thoughts on the whole thing. What I do want to write down is something about the Big Data movement and what I think is the killer app for Big Data... Where is this coming from, ok, I confess... I spent 3 days in cloud land at the Cloud Connect conference in Santa Clara and it was quite a lot of fun. One of the nice things at Cloud Connect was that there was a track dedicated to Big Data, which prompted me to some extend to write this post. What is Big Data anyways? The most valuable point made in the Big Data track was that Big Data in itself is not very cool. Doing something with Big Data is what makes all of this cool and interesting to a business user! The other good insight I got was that a lot of people think Big Data means a single gigantic monolithic system holding gazillions of bytes or documents or log files. Well turns out that most people in the Big Data track are talking about a lot of collections of smaller data sets. So rather than thinking "big = monolithic" you should be thinking "big = many data sets". This is more than just theoretical, it is actually relevant when thinking about big data and how to process it. It is important because it means that the platform that stores data will most likely consist out of multiple solutions. You may be storing logs on something like HDFS, you may store your customer information in Oracle and you may store distilled clickstream information in some distilled form in MySQL. The big question you will need to solve is not what lives where, but how to get it all together and get some value out of all that data. NoSQL and MapReduce Nope, sorry, this is not the killer app... and no I'm not saying this because my business card says Oracle and I'm therefore biased. I think language is important, but as with storage I think pragmatic is better. In other words, some questions can be answered with SQL very efficiently, others can be answered with PERL or TCL others with MR. History should teach us that anyone trying to solve a problem will use any and all tools around. For example, most data warehouses (Big Data 1.0?) get a lot of data in flat files. Everyone then runs a bunch of shell scripts to massage or verify those files and then shoves those files into the database. We've even built shell script support into external tables to allow for this. I think the Big Data projects will do the same. Some people will use MapReduce, although I would argue that things like Cascading are more interesting, some people will use Java. Some data is stored on HDFS making Cascading the way to go, some data is stored in Oracle and SQL does do a good job there. As with storage and with history, be pragmatic and use what fits and neither NoSQL nor MR will be the one and only. Also, a language, while important, does in itself not deliver business value. So while cool it is not a killer app... Vertical Behavioral Analytics This is the killer app! And you are now thinking: "what does that mean?" Let's decompose that heading. First of all, analytics. I would think you had guessed by now that this is really what I'm after, and of course you are right. But not just analytics, which has a very large scope and means many things to many people. I'm not just after Business Intelligence (analytics 1.0?) or data mining (analytics 2.0?) but I'm after something more interesting that you can only do after collecting large volumes of specific data. That all important data is about behavior. What do my customers do? More importantly why do they behave like that? If you can figure that out, you can tailor web sites, stores, products etc. to that behavior and figure out how to be successful. Today's behavior that is somewhat easily tracked is web site clicks, search patterns and all of those things that a web site or web server tracks. that is where the Big Data lives and where these patters are now emerging. Other examples however are emerging, and one of the examples used at the conference was about prediction churn for a telco based on the social network its members are a part of. That social network is not about LinkedIn or Facebook, but about who calls whom. I call you a lot, you switch provider, and I might/will switch too. And that just naturally brings me to the next word, vertical. Vertical in this context means per industry, e.g. communications or retail or government or any other vertical. The reason for being more specific than just behavioral analytics is that each industry has its own data sources, has its own quirky logic and has its own demands and priorities. Of course, the methods and some of the software will be common and some will have both retail and service industry analytics in place (your corner coffee store for example). But the gist of it all is that analytics that can predict customer behavior for a specific focused group of people in a specific industry is what makes Big Data interesting. Building a Vertical Behavioral Analysis System Well, that is going to be interesting. I have not seen much going on in that space and if I had to have some criticism on the cloud connect conference it would be the lack of concrete user cases on big data. The telco example, while a step into the vertical behavioral part is not really on big data. It used a sample of data from the customers' data warehouse. One thing I do think, and this is where I think parts of the NoSQL stuff come from, is that we will be doing this analysis where the data is. Over the past 10 years we at Oracle have called this in-database analytics. I guess we were (too) early? Now the entire market is going there including companies like SAS. In-place btw does not mean "no data movement at all", what it means that you will do this on data's permanent home. For SAS that is kind of the current problem. Most of the inputs live in a data warehouse. So why move it into SAS and back? That all worked with 1 TB data warehouses, but when we are looking at 100TB to 500 TB of distilled data... Comments? As it is still early days with these systems, I'm very interested in seeing reactions and thoughts to some of these thoughts...

Read the article
Azure, don't give me multiple VMs, give me one elastic VM

- by FransBouma

Yesterday, Microsoft revealed new major features for Windows Azure (see ScottGu's post). It all looks shiny and great, but after reading most of the material describing the new features, I still find the overall idea behind all of it flawed: why should I care on how much VMs my web app runs? Isn't that a problem to solve for the Windows Azure engineers / software? And what if I need the file system, why can't I simply get a virtual filesystem ? To illustrate my point, let's use a real example: a product website with a customer system/database and next to it a support site with accompanying database. Both are written in .NET, using ASP.NET and use a SQL Server database each. The product website offers files to download by customers, very simple. You have a couple of options to host these websites: Buy a server, place it in a rack at an ISP and run the sites on that server Use 'shared hosting' with an ISP, which means your sites' appdomains are running on the same machine, as well as the files stored, and the databases are hosted in the same server as the other shared databases. Hire a VM, install your OS of choice at an ISP, and host the sites on that VM, basically the same as the first option, except you don't have a physical server At some cloud-vendor, either host the sites 'shared' or in a VM. See above. With all of those options, scalability is a problem, even the cloud-based ones, though not due to the same reasons: The physical server solution has the obvious problem that if you need more power, you need to buy a bigger server or more servers which requires you to add replication and other overhead Shared hosting solutions are almost always capped on memory usage / traffic and database size: if your sites get too big, you have to move out of the shared hosting environment and start over with one of the other solutions The VM solution, be it a VM at an ISP or 'in the cloud' at e.g. Windows Azure or Amazon, in theory allows scaling out by simply instantiating more VMs, however that too introduces the same overhead problems as with the physical servers: suddenly more than 1 instance runs your sites. If a cloud vendor offers its services in the form of VMs, you won't gain much over having a VM at some ISP: the main problems you have to work around are still there: when you spin up more than one VM, your application must be completely stateless at any moment, including the DB sub system, because what's in memory in instance 1 might not be in memory in instance 2. This might sounds trivial but it's not. A lot of the websites out there started rather small: they were perfectly runnable on a single machine with normal memory and CPU power. After all, you don't need a big machine to run a website with even thousands of users a day. Moving these sites to a multi-VM environment will cause a problem: all the in-memory state they use, all the multi-page transitions they use while keeping state across the transition, they can't do that anymore like they did that on a single machine: state is something of the past, you have to store every byte of state in either a DB or in a viewstate or in a cookie somewhere so with the next request, all state information is available through the request, as nothing is kept in-memory. Our example uses a bunch of files in a file system. Using multiple VMs will require that these files move to a cloud storage system which is mounted in each VM so we don't have to store the files on each VM. This might require different file paths, but this change should be minor. What's perhaps less minor is the maintenance procedure in place on the new type of cloud storage used: instead of ftp-ing into a VM, you might have to update the files using different ways / tools. All in all this makes moving an existing website which was written for an environment that's based around a VM (namely .NET with its CLR) overly cumbersome and problematic: it forces you to refactor your website system to be able to be used 'in the cloud', which is caused by the limited way how e.g. Windows Azure offers its cloud services: in blocks of VMs. Offer a scalable, flexible VM which extends with my needs Instead, cloud vendors should offer simply one VM to me. On that VM I run the websites, store my DB and my files. As it's a virtual machine, how this machine is actually ran on physical hardware (e.g. partitioned), I don't care, as that's the problem for the cloud vendor to solve. If I need more resources, e.g. I have more traffic to my server, way more visitors per day, the VM stretches, like I bought a bigger box. This frees me from the problem which comes with multiple VMs: I don't have any refactoring to do at all: I can simply build my website as if it runs on my local hardware server, upload it to the VM offered by the cloud vendor, install it on the VM and I'm done. "But that might require changes to windows!" Yes, but Microsoft is Windows. Windows Azure is their service, they can make whatever change to what they offer to make it look like it's windows. Yet, they're stuck, like Amazon, in thinking in VMs, which forces developers to 'think ahead' and gamble whether they would need to migrate to a cloud with multiple VMs in the future or not. Which comes down to: gamble whether they should invest time in code / architecture which they might never need. (YAGNI anyone?) So the VM we're talking about, is that a low-level VM which runs a guest OS, or is that VM a different kind of VM? The flexible VM: .NET's CLR ? My example websites are ASP.NET based, which means they run inside a .NET appdomain, on the .NET CLR, which is a VM. The only physical OS resource the sites need is the file system, however this too is accessed through .NET. In short: all the websites see is what .NET allows the websites to see, the world as the websites know it is what .NET shows them and lets them access. How the .NET appdomain is run physically, that's the concern of .NET, not mine. This begs the question why Windows Azure doesn't offer virtual appdomains? Or better: .NET environments which look like one machine but could be physically multiple machines. In such an environment, no change has to be made to the websites to migrate them from a local machine or own server to the cloud to get proper scaling: the .NET VM will simply scale with the need: more memory needed, more CPU power needed, it stretches. What it offers to the application running inside the appdomain is simply increasing, but not fragmented: all resources are available to the application: this means that the problem of how to scale is back to where it should be: with the cloud vendor. "Yeah, great, but what about the databases?" The .NET application communicates with the database server through a .NET ADO.NET provider. Where the database is located is not a problem of the appdomain: the ADO.NET provider has to solve that. I.o.w.: we can host the databases in an environment which offers itself as a single resource and is accessible through one connection string without replication overhead on the outside, and use that environment inside the .NET VM as if it was a single DB. But what about memory replication and other problems? This environment isn't simple, at least not for the cloud vendor. But it is simple for the customer who wants to run his sites in that cloud: no work needed. No refactoring needed of existing code. Upload it, run it. Perhaps I'm dreaming and what I described above isn't possible. Yet, I think if cloud vendors don't move into that direction, what they're offering isn't interesting: it doesn't solve a problem at all, it simply offers a way to instantiate more VMs with the guest OS of choice at the cost of me needing to refactor my website code so it can run in the straight jacket form factor dictated by the cloud vendor. Let's not kid ourselves here: most of us developers will never build a website which needs a truck load of VMs to run it: almost all websites created by developers can run on just a few VMs at most. Yet, the most expensive change is right at the start: moving from one to two VMs. As soon as you have refactored your website code to run across multiple VMs, adding another one is just as easy as clicking a mouse button. But that first step, that's the problem here and as it's right there at the beginning of scaling the website, it's particularly strange that cloud vendors refuse to solve that problem and leave it to the developers to solve that. Which makes migrating 'to the cloud' particularly expensive.

Read the article
Who is Jeremiah Owyang?

- by Michael Hylton

12.00 Normal 0 false false false EN-US X-NONE X-NONE MicrosoftInternetExplorer4 /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin:0in; mso-para-margin-bottom:.0001pt; mso-pagination:widow-orphan; font-size:10.0pt; font-family:"Calibri","sans-serif"; mso-bidi-font-family:"Times New Roman";} Q: What’s your current role and what career path brought you here? J.O.: I'm currently a partner and one of the founding team members at Altimeter Group. I'm currently the Research Director, as well as wear the hat of Industry Analyst. Prior to joining Altimeter, I was an Industry Analyst at Forrester covering Social Computing, and before that, deployed and managed the social media program at Hitachi Data Systems in Santa Clara. Around that time, I started a career blog called Web Strategy which focused on how companies were using the web to connect with customers --and never looked back. Q: As an industry analyst, what are you focused on these days? J.O.: There are three trends that I'm focused my research on at this time: 1) The Dynamic Customer Journey: Individuals (both b2c and b2b) are given so many options in their sources of data, channels to choose from and screens to consume them on that we've found that at each given touchpoint there are 75 potential permutations. Companies that can map this, then deliver information to individuals when they need it will have a competitive advantage and we want to find out who's doing this. 2) One of the sub themes that supports this trend is Social Performance. Yesterday's social web was disparate engagement of humans, but the next phase will be data driven, and soon new technologies will emerge to help all those that are consuming, publishing, and engaging on the social web to be more efficient with their time through forms of automation. As you might expect, this comes with upsides and downsides. 3) The Sentient World is our research theme that looks out the furthest as the world around us (even inanimate objects) become 'self aware' and are able to talk back to us via digital devices and beyond. Big data, internet of things, mobile devices will all be this next set. Q: People cite that the line between work and life is getting more and more blurred. Do you see your personal life influencing your professional work? J.O.: The lines between our work and personal lives are dissolving, and this leads to a greater upside of being always connected and have deeper relationships with those that are not. It also means a downside of society expectations that we're always around and available for colleagues, customers, and beyond. In the future, a balance will be sought as we seek to achieve the goals of family, friends, work, and our own personal desires. All of this is being ironically written at 430 am on a Sunday am. Q: How can people keep up with what you’re working on? J.O.: A great question, thanks. There are a few sources of information to find out, I'll lead with the first which is my blog at web-strategist.com. A few times a week I'll publish my industry insights (hires, trends, forces, funding, M&A, business needs) as well as on twitter where I'll point to all the news that's fit to print @jowyang. As my research reports go live (we publish them for all to read --called Open Research-- at no cost) they'll emerge on my blog, or checkout the research tab to find out more now. http://www.web-strategist.com/blog/research/ Q: Recently, you’ve been working with us here at Oracle on something exciting coming up later this week. What’s on the horizon? J.O.: Absolutely! This coming Thursday, September 13th, I’m doing a webcast with Oracle on “Managing Social Relationships for the Enterprise”. This is going to be a great discussion with Reggie Bradford, Senior Vice President of Product Development at Oracle and Christian Finn, Senior Director of Product Management for Oracle WebCenter. I’m looking forward to a great discussion around all those issues that so many companies are struggling with these days as they realize how much social media is impacting their business. It’s changing the way your customers and employees interact with your brand. Today it’s no longer a matter of when to become a social-enabled enterprise, but how to become a successful one. 12.00 Normal 0 false false false EN-US X-NONE X-NONE MicrosoftInternetExplorer4 /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin:0in; mso-para-margin-bottom:.0001pt; mso-pagination:widow-orphan; font-size:10.0pt; font-family:"Calibri","sans-serif"; mso-bidi-font-family:"Times New Roman";} Q: You’ve been very actively pursued for media interviews and conference and company speaking engagements – anything you’d like to share to give us a sneak peak of what to expect on Thursday’s webcast? J.O.: Below is a 15 minute video which encapsulates Altimeter’s themes on the Dynamic Customer Journey and the Sentient World. I’m really proud to have taken an active role in the first ever LeWeb outside of Paris. This one, which was featured in downtown London across the street from Westminster Abbey was sold out. If you’ve not heard of LeWeb, this is a global Internet conference hosted by Loic and Geraldine Le Meur, a power couple that stem from Paris but are also living in Silicon Valley, this is one of my favorite conferences to connect with brands, technology innovators, investors and friends. Altimeter was able to play a minor role in suggesting the theme for the event “Faster Than Real Time” which stems off previous LeWebs that focused on the “Real time web”. In this radical state, companies are able to anticipate the needs of their customers by using data, technology, and devices and deliver meaningful experiences before customers even know they need it. I explore two of three of Altimeter’s research themes, the Dynamic Customer Journey, and the Sentient World in my speech, but due to time, did not focus on Adaptive Organization.

Read the article
SQL Server v.Next (Denali) : More on contained databases and "contained users"

- by AaronBertrand

One of the reasons for contained databases (see my previous post ) is to allow for a more seamless transition when moving a database from one server to another. One of the biggest complications in doing so is making sure that all of the logins are in place on the new server. Contained databases help solve this issue by creating a new type of user: a database-level user with a password. I want to stress that this is not the same concept as a user without a login , which serves a completely different...(read more)

Read the article
Today’s Performance Tip: Views are for Convenience, Not Performance!

- by Jonathan Kehayias

I tweeted this last week on twitter and got a lot of retweets so I thought that I’d blog the story behind the tweet. Most vendor databases have views in them, and when people want to retrieve data from a database, it seems like the most common first stop they make are the vendor supplied Views. This post is in no way a bash against the usage or creation of Views in a SQL Server Database, I have created them before to simplify code and compartmentalize commonly required queries so that there...(read more)

Read the article
Is it wise to store a big lump of json on a database row

- by Ieyasu Sawada

I have this project which stores product details from amazon into the database. Just to give you an idea on how big it is: [{"title":"Genetic Engineering (Opposing Viewpoints)","short_title":"Genetic Engineering ...","brand":"","condition":"","sales_rank":"7171426","binding":"Book","item_detail_url":"http://localhost/wordpress/product/?asin=0737705124","node_list":"Books > Science & Math > Biological Sciences > Biotechnology","node_category":"Books","subcat":"","model_number":"","item_url":"http://localhost/wordpress/wp-content/ecom-plugin-redirects/ecom_redirector.php?id=128","details_url":"http://localhost/wordpress/product/?asin=0737705124","large_image":"http://localhost/wordpress/wp-content/plugins/ecom/img/large-notfound.png","medium_image":"http://localhost/wordpress/wp-content/plugins/ecom/img/medium-notfound.png","small_image":"http://localhost/wordpress/wp-content/plugins/ecom/img/small-notfound.png","thumbnail_image":"http://localhost/wordpress/wp-content/plugins/ecom/img/thumbnail-notfound.png","tiny_img":"http://localhost/wordpress/wp-content/plugins/ecom/img/tiny-notfound.png","swatch_img":"http://localhost/wordpress/wp-content/plugins/ecom/img/swatch-notfound.png","total_images":"6","amount":"33.70","currency":"$","long_currency":"USD","price":"$33.70","price_type":"List Price","show_price_type":"0","stars_url":"","product_review":"","rating":"","yellow_star_class":"","white_star_class":"","rating_text":" of 5","reviews_url":"","review_label":"","reviews_label":"Read all ","review_count":"","create_review_url":"http://localhost/wordpress/wp-content/ecom-plugin-redirects/ecom_redirector.php?id=132","create_review_label":"Write a review","buy_url":"http://localhost/wordpress/wp-content/ecom-plugin-redirects/ecom_redirector.php?id=19186","add_to_cart_action":"http://localhost/wordpress/wp-content/ecom-plugin-redirects/add_to_cart.php","asin":"0737705124","status":"Only 7 left in stock.","snippet_condition":"in_stock","status_class":"ninstck","customer_images":["http://localhost/wordpress/wp-content/uploads/2013/10/ecom_images/51M2vvFvs2BL.jpg","http://localhost/wordpress/wp-content/uploads/2013/10/ecom_images/31FIM-YIUrL.jpg","http://localhost/wordpress/wp-content/uploads/2013/10/ecom_images/51M2vvFvs2BL.jpg","http://localhost/wordpress/wp-content/uploads/2013/10/ecom_images/51M2vvFvs2BL.jpg"],"disclaimer":"","item_attributes":[{"attr":"Author","value":"Greenhaven Press"},{"attr":"Binding","value":"Hardcover"},{"attr":"EAN","value":"9780737705126"},{"attr":"Edition","value":"1"},{"attr":"ISBN","value":"0737705124"},{"attr":"Label","value":"Greenhaven Press"},{"attr":"Manufacturer","value":"Greenhaven Press"},{"attr":"NumberOfItems","value":"1"},{"attr":"NumberOfPages","value":"224"},{"attr":"ProductGroup","value":"Book"},{"attr":"ProductTypeName","value":"ABIS_BOOK"},{"attr":"PublicationDate","value":"2000-06"},{"attr":"Publisher","value":"Greenhaven Press"},{"attr":"SKU","value":"G0737705124I2N00"},{"attr":"Studio","value":"Greenhaven Press"},{"attr":"Title","value":"Genetic Engineering (Opposing Viewpoints)"}],"customer_review_url":"http://localhost/wordpress/wp-content/ecom-customer-reviews/0737705124.html","flickr_results":["http://localhost/wordpress/wp-content/uploads/2013/10/ecom_images/5105560852_06c7d06f14_m.jpg"],"freebase_text":"No around the web data available yet","freebase_image":"http://localhost/wordpress/wp-content/plugins/ecom/img/freebase-notfound.jpg","ebay_related_items":[{"title":"Genetic Engineering (Introducing Issues With Opposing Viewpoints), , Good Book","image":"http://localhost/wordpress/wp-content/uploads/2013/10/ecom_images/140.jpg","url":"http://localhost/wordpress/wp-content/ecom-plugin-redirects/ecom_redirector.php?id=12165","currency_id":"$","current_price":"26.2"},{"title":"Genetic Engineering Opposing Viewpoints by DAVID BENDER - 1964 Hardcover","image":"http://localhost/wordpress/wp-content/uploads/2013/10/ecom_images/140.jpg","url":"http://localhost/wordpress/wp-content/ecom-plugin-redirects/ecom_redirector.php?id=130","currency_id":"AUD","current_price":"11.99"}],"no_follow":"rel=\"nofollow\"","new_tab":"target=\"_blank\"","related_products":[],"super_saver_shipping":"","shipping_availability":"","total_offers":"7","added_to_cart":""}] So the structure for the table is: asin title details (the product details in json) Will the performance suffer if I have to store like 10,000 products? Is there any other way of doing this? I'm thinking of the following, but the current setup is really the most convenient one since I also have to use the data on the client side: store the product details in a file. So something like ASIN123.json store the product details in one big file. (I'm guessing it will be a drag to extract data from this file) store each of the fields in the details in its own table field Thanks in advance!

Read the article
Is inline SQL still classed as bad practice now that we have Micro ORMs?

- by Grofit

This is a bit of an open ended question but I wanted some opinions, as I grew up in a world where inline SQL scripts were the norm, then we were all made very aware of SQL injection based issues, and how fragile the sql was when doing string manipulations all over the place. Then came the dawn of the ORM where you were explaining the query to the ORM and letting it generate its own SQL, which in a lot of cases was not optimal but was safe and easy. Another good thing about ORMs or database abstraction layers were that the SQL was generated with its database engine in mind, so I could use Hibernate/Nhibernate with MSSQL, MYSQL and my code never changed it was just a configuration detail. Now fast forward to current day, where Micro ORMs seem to be winning over more developers I was wondering why we have seemingly taken a U-Turn on the whole in-line sql subject. I must admit I do like the idea of no ORM config files and being able to write my query in a more optimal manner but it feels like I am opening myself back up to the old vulnerabilities such as SQL injection and I am also tying myself to one database engine so if I want my software to support multiple database engines I would need to do some more string hackery which seems to then start to make code unreadable and more fragile. (Just before someone mentions it I know you can use parameter based arguments with most micro orms which offers protection in most cases from sql injection) So what are peoples opinions on this sort of thing? I am using Dapper as my Micro ORM in this instance and NHibernate as my regular ORM in this scenario, however most in each field are quite similar. What I term as inline sql is SQL strings within source code. There used to be design debates over SQL strings in source code detracting from the fundamental intent of the logic, which is why statically typed linq style queries became so popular its still just 1 language, but with lets say C# and Sql in one page you have 2 languages intermingled in your raw source code now. Just to clarify, the SQL injection is just one of the known issues with using sql strings, I already mention you can stop this from happening with parameter based queries, however I highlight other issues with having SQL queries ingrained in your source code, such as the lack of DB Vendor abstraction as well as losing any level of compile time error capturing on string based queries, these are all issues which we managed to side step with the dawn of ORMs with their higher level querying functionality, such as HQL or LINQ etc (not all of the issues but most of them). So I am less focused on the individual highlighted issues and more the bigger picture of is it now becoming more acceptable to have SQL strings directly in your source code again, as most Micro ORMs use this mechanism. Here is a similar question which has a few different view points, although is more about the inline sql without the micro orm context: http://stackoverflow.com/questions/5303746/is-inline-sql-hard-coding

Read the article
Security Controls on data for P6 Analytics

- by Jeffrey McDaniel

The Star database and P6 Analytics calculates security based on P6 security using OBS, global, project, cost, and resource security considerations. If there is some concern that users are not seeing expected data in P6 Analytics here are some areas to review: 1. Determining if a user has cost security is based on the Project level security privileges - either View Project Costs/Financials or Edit EPS Financials. If expecting to see costs make sure one of these permissions are allocated. 2. User must have OBS access on a Project. Not WBS level. WBS level security is not supported. Make sure user has OBS on project level. 3. Resource Access is determined by what is granted in P6. Verify the resource access granted to this user in P6. Resource security is hierarchical. Project access will override Resource access based on the way security policies are applied. 4. Module access must be given to a P6 user for that user to come over into Star/P6 Analytics. For earlier version of RDB there was a report_user_flag on the Users table. This flag field is no longer used after P6 Reporting Database 2.1. 5. For P6 Reporting Database versions 2.2 and higher, the Extended Schema Security service must be run to calculate all security. Any changes to privileges or security this service must be rerun before any ETL. 6. In P6 Analytics 2.0 or higher, a Weblogic user must exist that matches the P6 username. For example user Tim must exist in P6 and Weblogic users for Tim to be able to log into P6 Analytics and access data based on P6 security. In earlier versions the username needed to exist in RPD. 7. Cache in OBI is another area that can sometimes make it seem a user isn't seeing the data they expect. While cache can be beneficial for performance in OBI. If the data is outdated it can retrieve older, stale data. Clearing or turning off cache when rerunning a query can determine if the returned result set was from cache or from the database.

Read the article
Evaluating Solutions to Manage Product Compliance? Don't Wait Much Longer

- by Kerrie Foy

Depending on severity, product compliance issues can cause all sorts of problems from run-away budgets to business closures. But effective policies and safeguards can create a strong foundation for innovation, productivity, market penetration and competitive advantage. If you’ve been putting off a systematic approach to product compliance, it is time to reconsider that decision, or indecision. Why now? No matter what industry, companies face a litany of worldwide and regional regulations that require proof of product compliance and environmental friendliness for market access. For example, Restriction of Hazardous Substances (RoHS) is a regulation that restricts the use of six dangerous materials used in the manufacture of electronic and electrical equipment. ROHS was originally adopted by the European Union in 2003 for implementation in 2006, and it has evolved over time through various regional versions for North America, China, Japan, Korea, Norway and Turkey. In addition, the RoHS directive allowed for material exemptions used in Medical Devices, but that exemption ends in 2014. Additional regulations worth watching are the Battery Directive, Waste Electrical and Electronic Equipment (WEEE), and Registration, Evaluation, Authorization and Restriction of Chemicals (REACH) directives. Additional evolving regulations are coming from governing bodies like the Food and Drug Administration (FDA) and the International Organization for Standardization (ISO). Corporate sustainability initiatives are also gaining urgency and influencing product design. In a survey of 405 corporations in the Global 500 by Carbon Disclosure Project, co-written by PwC (CDP Global 500 Climate Change Report 2012 entitled Business Resilience in an Uncertain, Resource-Constrained World), 48% of the respondents indicated they saw potential to create new products and business services as a response to climate change. Just 21% reported a dedicated budget for the research. However, the report goes on to explain that those few companies are winning over new customers and driving additional profits by exploiting their abilities to adapt to environmental needs. The article cites Dell as an example – Dell has invested in research to develop new products designed to reduce its customers’ emissions by more than 10 million metric tons of CO2e per year. This reduction in emissions should save Dell’s customers over $1billion per year as a result! Over time we expect to see many additional companies prove that eco-design provides marketplace benefits through differentiation and direct customer value. How do you meet compliance requirements and also successfully invest in eco-friendly designs? No doubt companies struggle to answer this question. After all, the journey to get there may involve transforming business models, go-to-market strategies, supply networks, quality assurance policies and compliance processes per the rapidly evolving global and regional directives. There may be limited executive focus on the initiative, inability to quantify noncompliance, or not enough resources to justify investment. To make things even more difficult to address, compliance responsibility can be a passionate topic within an organization, making the prospect of change on an enterprise scale problematic and time-consuming. Without a single source of truth for product data and without proper processes in place, ensuring product compliance burgeons into a crushing task that is cost-prohibitive and overwhelming to an organization. With all the overhead, certain markets or demographics become simply inaccessible. Therefore, the risk to consumer goodwill and satisfaction, revenue, business continuity, and market potential is too great not to solve the compliance challenge. Companies are beginning to adapt and even thrive in today’s highly regulated and transparent environment by implementing systematic approaches to product compliance that are more than functional bandages but revenue-generating engines. Consider partnering with Oracle to help you address your compliance needs. Many of the world’s most innovative leaders and pioneers are leveraging Oracle’s Agile Product Lifecycle Management (PLM) portfolio of enterprise applications to manage the product value chain, centralize product data, automate processes, and launch more eco-friendly products to market faster. Particularly, the Agile Product Governance & Compliance (PG&C) solution provides out-of-the-box functionality to integrate actionable regulatory information into the enterprise product record from the ideation to the disposal/recycling phase. Agile PG&C makes it possible to efficiently manage compliance per corporate green initiatives as well as regional and global directives. Options are critical, but so is ease-of-use. Anyone who’s grappled with compliance policy knows legal interpretation plays a major role in determining how an organization responds to regulation. Agile PG&C gives you the freedom to configure product compliance per your needs, while maintaining rigorous control over the product record in an easy-to-use interface that facilitates adoption efforts. It allows you to assign regulations as specifications for a part or BOM roll-up. Each specification has a threshold value that alerts you to a non-compliance issue if the threshold value is exceeded. Set however many regulations as specifications you need to make sure a product can be sold in your target countries. Another option is to implement like one of our leading consumer electronics customers and define your own “catch-all” specification to ensure compliance in all markets. You can give your suppliers secure access to enter their component data or integrate a third party’s data. With Agile PG&C you are able to design compliance earlier into your products to reduce cost and improve quality downstream when stakes are higher. Agile PG&C is a comprehensive solution that makes product compliance more reliable and efficient. Throughout product lifecycles, use the solution to support full material disclosures, efficiently manage declarations with your suppliers, feed compliance data into a corrective action if a product must be changed, and swiftly satisfy audits by showing all due diligence tracked in one solution. Given the compounding regulation and consumer focus on urgent environmental issues, now is the time to act. Implementing an enterprise, systematic approach to product compliance is a competitive investment. From the start, Agile Product Governance & Compliance enables companies to confidently design for compliance and sustainability, reduce the cost of compliance, minimize the risk of business interruption, deliver responsible products, and inspire new innovation. Don’t wait any longer! To find out more about Agile Product Governance & Compliance download the data sheet, contact your sales representative, or call Oracle at 1-800-633-0738. Many thanks to Shane Goodwin, Senior Manager, Oracle Agile PLM Product Management, for contributions to this article.

Read the article
How do I restrict concurrent statistics gathering to a small set of tables from a single schema?

- by Maria Colgan

I got an interesting question from one of my colleagues in the performance team last week about how to restrict a concurrent statistics gather to a small subset of tables from one schema, rather than the entire schema. I thought I would share the solution we came up with because it was rather elegant, and took advantage of concurrent statistics gathering, incremental statistics, and the not so well known “obj_filter_list” parameter in DBMS_STATS.GATHER_SCHEMA_STATS procedure. You should note that the solution outline below with “obj_filter_list” still applies, even when concurrent statistics gathering and/or incremental statistics gathering is disabled. The reason my colleague had asked the question in the first place was because he wanted to enable incremental statistics for 5 large partitioned tables in one schema. The first time you gather statistics after you enable incremental statistics on a table, you have to gather statistics for all of the existing partitions so that a synopsis may be created for them. If the partitioned table in question is large and contains a lot of partition, this could take a considerable amount of time. Since my colleague only had the Exadata environment at his disposal overnight, he wanted to re-gather statistics on 5 partition tables as quickly as possible to ensure that it all finished before morning. Prior to Oracle Database 11g Release 2, the only way to do this would have been to write a script with an individual DBMS_STATS.GATHER_TABLE_STATS command for each partition, in each of the 5 tables, as well as another one to gather global statistics on the table. Then, run each script in a separate session and manually manage how many of this session could run concurrently. Since each table has over one thousand partitions that would definitely be a daunting task and would most likely keep my colleague up all night! In Oracle Database 11g Release 2 we can take advantage of concurrent statistics gathering, which enables us to gather statistics on multiple tables in a schema (or database), and multiple (sub)partitions within a table concurrently. By using concurrent statistics gathering we no longer have to run individual statistics gathering commands for each partition. Oracle will automatically create a statistics gathering job for each partition, and one for the global statistics on each partitioned table. With the use of concurrent statistics, our script can now be simplified to just five DBMS_STATS.GATHER_TABLE_STATS commands, one for each table. This approach would work just fine but we really wanted to get this down to just one command. So how can we do that? You may be wondering why we didn’t just use the DBMS_STATS.GATHER_SCHEMA_STATS procedure with the OPTION parameter set to ‘GATHER STALE’. Unfortunately the statistics on the 5 partitioned tables were not stale and enabling incremental statistics does not mark the existing statistics stale. Plus how would we limit the schema statistics gather to just the 5 partitioned tables? So we went to ask one of the statistics developers if there was an alternative way. The developer told us the advantage of the “obj_filter_list” parameter in DBMS_STATS.GATHER_SCHEMA_STATS procedure. The “obj_filter_list” parameter allows you to specify a list of objects that you want to gather statistics on within a schema or database. The parameter takes a collection of type DBMS_STATS.OBJECTTAB. Each entry in the collection has 5 feilds; the schema name or the object owner, the object type (i.e., ‘TABLE’ or ‘INDEX’), object name, partition name, and subpartition name. You don't have to specify all five fields for each entry. Empty fields in an entry are treated as if it is a wildcard field (similar to ‘*’ character in LIKE predicates). Each entry corresponds to one set of filter conditions on the objects. If you have more than one entry, an object is qualified for statistics gathering as long as it satisfies the filter conditions in one entry. You first must create the collection of objects, and then gather statistics for the specified collection. It’s probably easier to explain this with an example. I’m using the SH sample schema but needed a couple of additional partitioned table tables to get recreate my colleagues scenario of 5 partitioned tables. So I created SALES2, SALES3, and COSTS2 as copies of the SALES and COSTS table respectively (setup.sql). I also deleted statistics on all of the tables in the SH schema beforehand to more easily demonstrate our approach. Step 0. Delete the statistics on the tables in the SH schema. Step 1. Enable concurrent statistics gathering. Remember, this has to be done at the global level. Step 2. Enable incremental statistics for the 5 partitioned tables. Step 3. Create the DBMS_STATS.OBJECTTAB and pass it to the DBMS_STATS.GATHER_SCHEMA_STATS command. Here, you will notice that we defined two variables of DBMS_STATS.OBJECTTAB type. The first, filter_lst, will be used to pass the list of tables we want to gather statistics on, and will be the value passed to the obj_filter_list parameter. The second, obj_lst, will be used to capture the list of tables that have had statistics gathered on them by this command, and will be the value passed to the objlist parameter. In Oracle Database 11g Release 2, you need to specify the objlist parameter in order to get the obj_filter_list parameter to work correctly due to bug 14539274. Will also needed to define the number of objects we would supply in the obj_filter_list. In our case we ere specifying 5 tables (filter_lst.extend(5)). Finally, we need to specify the owner name and object name for each of the objects in the list. Once the list definition is complete we can issue the DBMS_STATS.GATHER_SCHEMA_STATS command. Step 4. Confirm statistics were gathered on the 5 partitioned tables. Here are a couple of other things to keep in mind when specifying the entries for the obj_filter_list parameter. If a field in the entry is empty, i.e., null, it means there is no condition on this field. In the above example , suppose you remove the statement Obj_filter_lst(1).ownname := ‘SH’; You will get the same result since when you have specified gather_schema_stats so there is no need to further specify ownname in the obj_filter_lst. All of the names in the entry are normalized, i.e., uppercased if they are not double quoted. So in the above example, it is OK to use Obj_filter_lst(1).objname := ‘sales’;. However if you have a table called ‘MyTab’ instead of ‘MYTAB’, then you need to specify Obj_filter_lst(1).objname := ‘”MyTab”’; As I said before, although we have illustrated the usage of the obj_filter_list parameter for partitioned tables, with concurrent and incremental statistics gathering turned on, the obj_filter_list parameter is generally applicable to any gather_database_stats, gather_dictionary_stats and gather_schema_stats command. You can get a copy of the script I used to generate this post here. +Maria Colgan

Read the article
Software center is broken and can not be repaired or reinstalled

- by Michal

When I open the software center, I am told that I can not use it, for it is broken, and needs to be repaired. First I try to do this automatically, as I was offered. I enter a root password, and then the installation fails. installArchives() failed: reconfiguring packages... reconfiguring packages... reconfiguring packages... reconfiguring packages... (Reading database ... (Reading database ... 5% (Reading database ... 10% (Reading database ... 15% (Reading database ... 20% (Reading database ... 25% (Reading database ... 30% (Reading database ... 35% (Reading database ... 40% (Reading database ... 45% (Reading database ... 50% (Reading database ... 55% (Reading database ... 60% (Reading database ... 65% (Reading database ... 70% (Reading database ... 75% (Reading database ... 80% (Reading database ... 85% (Reading database ... 90% (Reading database ... 95% (Reading database ... 100% (Reading database ... 275048 files and directories currently installed.) Unpacking wine1.4-i386 (from .../wine1.4-i386_1.4-0ubuntu4.1_i386.deb) ... dpkg: error processing /var/cache/apt/archives/wine1.4-i386_1.4-0ubuntu4.1_i386.deb (--unpack): trying to overwrite '/usr/bin/wine', which is also in package wine1.5 1.5.5-0ubuntu1~ppa1~oneiric1+pulse17 No apport report written because MaxReports is reached already dpkg-deb: error: subprocess paste was killed by signal (Broken pipe) Errors were encountered while processing: /var/cache/apt/archives/wine1.4-i386_1.4-0ubuntu4.1_i386.deb Error in function: dpkg: dependency problems prevent configuration of wine1.4-common: wine1.4-common depends on wine1.4 (= 1.4-0ubuntu4.1); however: Package wine1.4 is not installed. dpkg: error processing wine1.4-common (--configure): dependency problems - leaving unconfigured What should I do now? First of all, I've tried reinstalling the center, but it failed due to the same 1.4 dependency as is laid out here. I've googled for help and although I don't understand linux at all, I've tried some suggestions: I've tried editing the dpkg status in /var/lib/dpkg/status which failed because the file could not be found. I've tried purging wine/* but that had unresolved dependencies as well. It's a giant mess.

Read the article
Designing a completely new database/gui solution for my compnay

- by user1277304

I'm no expert when it come to everything Visual Studio 2010 and utilizing SQL server 2008. I'm sure some of my personal projects I've built for personal use would get laughed off the face of the planet, but SQLCe has been the solution I was looking for those home type of projects. And they work, flawlessly. Now I feel it's time to step up to the big league. I want to develop a complete, unified and module based solution for my company that I'm working for. We're still using stuff from the 80s for goodness sake! I use Excel and query the ancient database on my own because I can't stand the GUI. Nothing against people of age, but the IDE our programmers are using is from the stone age, and they use APL of all things with it. I've yet to see a radio button control anywhere in the GUI where it would make sense. Anyway, I want to do this right from the ground up. I'm by no means a newbie when it comes to programming in .NET 2010, however, I want the entire solution to be professionally done. I want version control, test projects, project flow, SQL 2008 integration and all the bells and whistles that come with that. I know for a fact that if we had something like that running, not only would development costs and time be slashed four fold, but the possibilities for expansion and performance would sky rocket. (Between the GUI an our DB engine, it can only use ONE CORE! ONE! It's 2012 for goodness sake!) Our business is growing and our current ancient solution just can't keep up, and I'd hate to see our business go down in flames because our programmer is stuck in the 80's and refuses to use anything current. So I ask you guys, the experts and know-it-alls, where do I start? Are there any gems of good books out there in the haystack of all "This for dummies" type of deals? I already have several people backing me in this endeavor, and while it may seem brash to just usurp the current programmers, I'm doing this for the company as a whole.

Read the article
Master-slave vs. peer-to-peer archictecture: benefits and problems

- by Ashok_Ora

Normal 0 false false false EN-US X-NONE X-NONE Almost two decades ago, I was a member of a database development team that introduced adaptive locking. Locking, the most popular concurrency control technique in database systems, is pessimistic. Locking ensures that two or more conflicting operations on the same data item don’t “trample” on each other’s toes, resulting in data corruption. In a nutshell, here’s the issue we were trying to address. In everyday life, traffic lights serve the same purpose. They ensure that traffic flows smoothly and when everyone follows the rules, there are no accidents at intersections. As I mentioned earlier, the problem with typical locking protocols is that they are pessimistic. Regardless of whether there is another conflicting operation in the system or not, you have to hold a lock! Acquiring and releasing locks can be quite expensive, depending on how many objects the transaction touches. Every transaction has to pay this penalty. To use the earlier traffic light analogy, if you have ever waited at a red light in the middle of nowhere with no one on the road, wondering why you need to wait when there’s clearly no danger of a collision, you know what I mean. The adaptive locking scheme that we invented was able to minimize the number of locks that a transaction held, by detecting whether there were one or more transactions that needed conflicting eyou could get by without holding any lock at all. In many “well-behaved” workloads, there are few conflicts, so this optimization is a huge win. If, on the other hand, there are many concurrent, conflicting requests, the algorithm gracefully degrades to the “normal” behavior with minimal cost. We were able to reduce the number of lock requests per TPC-B transaction from 178 requests down to 2! Wow! This is a dramatic improvement in concurrency as well as transaction latency. The lesson from this exercise was that if you can identify the common scenario and optimize for that case so that only the uncommon scenarios are more expensive, you can make dramatic improvements in performance without sacrificing correctness. So how does this relate to the architecture and design of some of the modern NoSQL systems? NoSQL systems can be broadly classified as master-slave sharded, or peer-to-peer sharded systems. NoSQL systems with a peer-to-peer architecture have an interesting way of handling changes. Whenever an item is changed, the client (or an intermediary) propagates the changes synchronously or asynchronously to multiple copies (for availability) of the data. Since the change can be propagated asynchronously, during some interval in time, it will be the case that some copies have received the update, and others haven’t. What happens if someone tries to read the item during this interval? The client in a peer-to-peer system will fetch the same item from multiple copies and compare them to each other. If they’re all the same, then every copy that was queried has the same (and up-to-date) value of the data item, so all’s good. If not, then the system provides a mechanism to reconcile the discrepancy and to update stale copies. So what’s the problem with this? There are two major issues: First, IT’S HORRIBLY PESSIMISTIC because, in the common case, it is unlikely that the same data item will be updated and read from different locations at around the same time! For every read operation, you have to read from multiple copies. That’s a pretty expensive, especially if the data are stored in multiple geographically separate locations and network latencies are high. Second, if the copies are not all the same, the application has to reconcile the differences and propagate the correct value to the out-dated copies. This means that the application program has to handle discrepancies in the different versions of the data item and resolve the issue (which can further add to cost and operation latency). Resolving discrepancies is only one part of the problem. What if the same data item was updated independently on two different nodes (copies)? In that case, due to the asynchronous nature of change propagation, you might land up with different versions of the data item in different copies. In this case, the application program also has to resolve conflicts and then propagate the correct value to the copies that are out-dated or have incorrect versions. This can get really complicated. My hunch is that there are many peer-to-peer-based applications that don’t handle this correctly, and worse, don’t even know it. Imagine have 100s of millions of records in your database – how can you tell whether a particular data item is incorrect or out of date? And what price are you willing to pay for ensuring that the data can be trusted? Multiple network messages per read request? Discrepancy and conflict resolution logic in the application, and potentially, additional messages? All this overhead, when all you were trying to do was to read a data item. Wouldn’t it be simpler to avoid this problem in the first place? Master-slave architectures like the Oracle NoSQL Database handles this very elegantly. A change to a data item is always sent to the master copy. Consequently, the master copy always has the most current and authoritative version of the data item. The master is also responsible for propagating the change to the other copies (for availability and read scalability). Client drivers are aware of master copies and replicas, and client drivers are also aware of the “currency” of a replica. In other words, each NoSQL Database client knows how stale a replica is. This vastly simplifies the job of the application developer. If the application needs the most current version of the data item, the client driver will automatically route the request to the master copy. If the application is willing to tolerate some staleness of data (e.g. a version that is no more than 1 second out of date), the client can easily determine which replica (or set of replicas) can satisfy the request, and route the request to the most efficient copy. This results in a dramatic simplification in application logic and also minimizes network requests (the driver will only send the request to exactl the right replica, not many). So, back to my original point. A well designed and well architected system minimizes or eliminates unnecessary overhead and avoids pessimistic algorithms wherever possible in order to deliver a highly efficient and high performance system. If you’ve every programmed an Oracle NoSQL Database application, you’ll know the difference! /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin-top:0in; mso-para-margin-right:0in; mso-para-margin-bottom:10.0pt; mso-para-margin-left:0in; line-height:115%; mso-pagination:widow-orphan; font-size:11.0pt; font-family:"Calibri","sans-serif"; mso-ascii-font-family:Calibri; mso-ascii-theme-font:minor-latin; mso-fareast-font-family:"Times New Roman"; mso-fareast-theme-font:minor-fareast; mso-hansi-font-family:Calibri; mso-hansi-theme-font:minor-latin;}

Read the article
Managing Data Growth in SQL Server

'Help, my database ate my disk drives!'. Many DBAs spend most of their time dealing with variations of the problem of database processes consuming too much disk space. This happens because of errors such as incorrect configurations for recovery models, data growth for large objects and queries that overtax TempDB resources. Rodney describes, with some feeling, the errors that can lead to this sort of crisis for the working DBA, and their solution.

Read the article
my probleme about "The installation or removal of a software package failed"

- by tulipelle

Recently, when I open Ubuntu software center, it ask me repair package Then I found this message . installArchives() failed: (Reading database ... (Reading database ... 5% (Reading database ... 10% (Reading database ... 15% (Reading database ... 20% (Reading database ... 25% (Reading database ... 30% (Reading database ... 35% (Reading database ... 40% (Reading database ... 45% (Reading database ... 50% (Reading database ... 55% (Reading database ... 60% (Reading database ... 65% (Reading database ... 70% (Reading database ... 75% (Reading database ... 80% (Reading database ... 85% (Reading database ... 90% (Reading database ... 95% (Reading database ... 100% (Reading database ... 569135 files and directories currently installed.) Unpacking linux-image-3.5.0-42-generic (from .../linux-image-3.5.0-42-generic_3.5.0-42.65~precise1_amd64.deb) ... Done. dpkg: error processing /var/cache/apt/archives/linux-image-3.5.0-42-generic_3.5.0-42.65~precise1_amd64.deb (--unpack): failed in write on buffer copy for backend dpkg-deb during `./boot/vmlinuz-3.5.0-42-generic': No space left on device No apport report written because the error message indicates a disk full error dpkg-deb: error: subprocess paste was killed by signal (Broken pipe) Examining /etc/kernel/postrm.d . run-parts: executing /etc/kernel/postrm.d/initramfs-tools 3.5.0-42-generic /boot/vmlinuz-3.5.0-42-generic run-parts: executing /etc/kernel/postrm.d/zz-update-grub 3.5.0-42-generic /boot/vmlinuz-3.5.0-42-generic Errors were encountered while processing: /var/cache/apt/archives/linux-image-3.5.0-42-generic_3.5.0-42.65~precise1_amd64.deb Error in function: dpkg: dependency problems prevent configuration of linux-image-generic-lts-quantal: linux-image-generic-lts-quantal depends on linux-image-3.5.0-42-generic; however: Package linux-image-3.5.0-42-generic is not installed. dpkg: error processing linux-image-generic-lts-quantal (--configure): dependency problems - leaving unconfigured dpkg: dependency problems prevent configuration of linux-generic-lts-quantal: linux-generic-lts-quantal depends on linux-image-generic-lts-quantal; however: Package linux-image-generic-lts-quantal is not configured yet. dpkg: error processing linux-generic-lts-quantal (--configure): dependency problems - leaving unconfigured

Read the article
Notifications for Expiring DBSNMP Passwords

- by Courtney Llamas

Most user accounts these days have a password profile on them that automatically expires the password after a set number of days. Depending on your company’s security requirements, this may be as little as 30 days or as long as 365 days, although typically it falls between 60-90 days. For a normal user, this can cause a small interruption in your day as you have to go get your password reset by an admin. When this happens to privileged accounts, such as the DBSNMP account that is responsible for monitoring database availability, it can cause bigger problems. In Oracle Enterprise Manager 12c you may notice the error message “ORA-28002: the password will expire within 5 days” when you connect to a target, or worse you may get “ORA-28001: the password has expired". If you wait too long, your monitoring will fail because the password is locked out. Wouldn’t it be nice if we could get an alert 10 days before our DBSNMP password expired? Thanks to Oracle Enterprise Manager 12c Metric Extensions (ME), you can! See the Oracle Enterprise Manager Cloud Control Administrator’s Guide for more information on Metric Extensions. To create a metric extension, select Enterprise / Monitoring / Metric Extensions, and then click on Create. On the General Properties screen select either Cluster Database or Database Instance, depending on which target you need to monitor. If you have both RAC and Single instance you may need to create one for each. In this example we will create a Cluster Database metric. Enter a Name for the ME and a Display Name. Then select SQL for the Adapter. Adjust the Collection Schedule as desired, for this example we will collect this metric every 1 day. Notice for metric collected every day, we can determine the exact time we want to collect. On the Adapter page, enter the query that you wish to execute. In this example we will use the query below that specifically checks for the DBSNMP user that is expiring within 10 days. Of course, you can adjust this query to alert for any user that can cause an outage such as an application account or service account such as RMAN. select username, account_status, trunc(expiry_date-sysdate) days_to_expirefrom dba_userswhere username = 'DBSNMP'and expiry_date is not null; The next step is to create the columns to store the data returned from the query. Click Add and add a column for each of the fields in the same order that data is returned. The table below will help you complete the column additions. Name Display Name Column Type Value Type Metric Category Unit Username User Name Key String Security AccountStatus Account Status Data String Security DaysToExpire Days Until Expiration Data Number Security Days When creating the DaysToExpire column, you can add a default threshold here for Warning and Critical (say < 10 and 5). When all columns have been added, click Next. On the Credentials page, you can choose to use the default monitoring credentials or specify new credentials. We will use the default credentials established for our target (dbsnmp). The next step is to test your Metric Extension. Click on Add to select a target for testing, then click Select. Now click the button Run Test to execute the test against the selected target(s). We can see in the example below that the Metric Extension has executed and returned a value of 68 days to expire. Click Next to proceed. Review the metric extension in the final screen and click Finish. The metric will be created in Editable status. Select the metric, click Actions and select Deployable Draft. You can do this once more to move to Published. Finally, we want to apply this metric to a target. When managing many targets, it’s best to add your metric to a template, for details on adding a Metric Extension to a template see the Administrator’s Guide. For this example, we will deploy this to a target directly. Select Actions / Deploy to Targets. Click Add and select the target you wish to deploy to and click Submit. Once deployment is complete, we can go to the target and view the Metric & Collection Settings to see the new metric and its thresholds. After some time, you will find the metric has collected and the days to expiration for DBSNMP user can be seen in the All Metrics view. For metrics collected once per day, you may have to wait up to 24 hours to see the metric and current severity. In the example below, the current severity is Clear (green check) as it is not scheduled to expire within 10 days. To test the notification, we can edit the thresholds for the new metric so they trigger an alert. Our password expires in 139 days, so we’ll change our Warning to 140 and leave Critical at 5, in our example we also changed the collection time to every 5 minutes. At the next collection, you’ll find that the current severity changes to a Warning and any related Incident Rules would be triggered to create an Incident or Notification as desired. Now that you get a notification that your DBSNMP passwords is about to expire, you can use OEM Command Line Interface (EM CLI) verb update_db_password to change it at both the database target and the OEM target in one step. The caveat is you must know the existing password to use the update_db_password command. To learn more about EM CLI, see the Oracle Enterprise Manager Command Line Interface Guide. Below is an example of changing the password with the update_db_password verb. $ ./emcli update_db_password -target_name=emrep -target_type=oracle_database -user_name=dbsnmp -change_at_target=yes -change_all_references=yes Enter value for old_password :Enter value for new_password :Enter value for retype_new_password :Successfully submitted a job to change the password in Enterprise Manager and on the target database: "emrep"Execute "emcli get_jobs -job_id=FA66C1C4D663297FE0437656F20ACC84" to check the status of the job.Search for job name "CHANGE_PWD_JOB_FA66C1C4D662297FE0437656F20ACC84" on the Jobs home page to check job execution details. The subsequent job created will typically run quickly enough that a blackout is not needed, however if you submit a script with many targets to change, your job may run slower so adding a blackout to the script is recommended. $ ./emcli get_jobs -job_id=FA66C1C4D663297FE0437656F20ACC84 Name Type Job ID Execution ID Scheduled Completed TZ Offset Status Status ID Owner Target Type Target Name CHANGE_PWD_JOB_FA66C1C4D662297FE0437656F20ACC84 ChangePassword FA66C1C4D663297FE0437656F20ACC84 FA66C1C4D665297FE0437656F20ACC84 2014-05-28 09:39:12 2014-05-28 09:39:18 GMT-07:00 Succeeded 5 SYSMAN oracle_database emrep After implementing the above Metric Extension and using the EM CLI update_db_password verb, you will be able to stay on top of your DBSNMP password changes without experiencing an unplanned monitoring outage.

Read the article
Designing a completly new database/gui solution for my compnay

- by user1277304

I'm no expert when it come to Everything Visual Studio 2010 and utilizing SQL server 2008. I'm sure some of my personal projects I've built for personal use would get laughed off the face of the planet, but SQLCe has been the solution I was looking for those home type of projects. And they work, flawlessly. Now I feel it's time to step up to the big league. I want to develop a complete, unified and module based solution for my compnay that I'm working for. We're still using stuff from the 80s for goodness sake! I use Excel and query the ancient database on my own because I can't stand the GUI. Nothing against people of age, but the IDE our programmers are using is from the stone age, and they use APL of all things with it. I've yet to see a radio buttton control anywhere in the GUI where it would make sense. Anyway, I want to do this right from the ground up. I'm by no means a newbie when it comes to programming in .NET 2010, however, I want the entire solution to be professionaly done. I want version control, test projects, project flow, SQL 2008 integration and all the bells and whistles that come with that. I know for a fact that if we had something like that runnning, not only would development costs and time be slashed four fold, but the possibilities for expansion and performance would sky rocket. (Between the GUI an our DB engine, it can only use ONE CORE! ONE! It's 2012 for goodness sake!) Our buisness is growing and our current ancient solution just can't keep up, and I'd hate to see our buisness go down in flames because our programmer is stuck in the 80's and refuses to use anything current. So I ask you guys, the experts and know-it-alls, where do I start? Are there any gems of good books out there in the haystack of all "This for dummies" type of deals? I already have several people backing me in this endevour, and while it may seem brash to just usurp the current programmers, I'm doing this for the company as a whole. Thank you guys for your time.

Read the article
Prolog - How do you distinguish between just a string, and a variable? [closed]

- by Mr Prolog

When you are querying a Prolog database, often you will use terms that start with an uppercase letter as your variables. However, let's say that one of the constraints on your query is that a person's location must be "Dallas", and you want to query all the information in the database who meet those specifications. How would do you correctly make sure that Dallas is not interpreted as a variable to store a value in, and is interpreted as a string instead, for usage as a constraint on the query?

Read the article
XMLHttpRequest not working, trying to test database connection [closed]

- by Frederick Marcoux

I'm currently creating my own CMS for personnal use but I'm blocked at a code. I'm trying to make a installation script but the AJAX request to test if database works, doesn't work... There's my JS code: function testDB() { "use strict"; var host = document.getElementById('host').value; var username = document.getElementById('username').value; var password = document.getElementById('password').value; var db = document.getElementById('db_name').value; var xmlhttp = new XMLHttpRequest(); var url = "test_db.php"; var params = "host="+host+"&username="+username+"&password="+password+"&db="+db; xmlhttp.open("POST", url, true); xmlhttp.setRequestHeader("Content-type", "application/x-www-form-urlencoded"); xmlhttp.setRequestHeader("Content-length", params.length); xmlhttp.setRequestHeader("Connection", "close"); xmlhttp.send(params); $('#loader').removeAttr('style'); if (xmlhttp.responseText !== '') { if (xmlhttp.readyState===4 && xmlhttp.status===200) { $('#next').removeAttr('disabled'); $('#test').attr('disabled', 'disabled'); $('#test').text('Connection Successful!'); $('#test').addClass('btn-success'); $('#login').addClass('success'); $('#login1').addClass('success'); $('#db').addClass('success'); $('#loader').attr('style', 'display: none;'); } else { $('#next').attr('disabled', 'disabled'); $('#test').removeClass('btn-success'); $('#test').removeAttr('disabled'); $('#test').text('Test Connection'); $('#login').removeClass('success'); $('#login1').removeClass('success'); $('#db').removeClass('success'); $('#loader').attr('style', 'display: none;'); } } else { $('#next').attr('disabled', 'disabled'); $('#next').attr('disabled', 'disabled'); $('#test').removeClass('btn-success'); $('#test').removeAttr('disabled'); $('#test').text('Test Connection'); $('#login').removeClass('success'); $('#login1').removeClass('success'); $('#db').removeClass('success'); $('#loader').attr('style', 'display: none;'); } } And there's my PHP code: <?php $link = mysql_connect($_POST['host'], $_POST['username'], $_POST['password']); if (!$link) { echo ''; } else { if (mysql_select_db($_POST['db'])) { echo 'Connection Successful!'; } else { echo ''; } } mysql_close($link); ?> I don't know why it doesn't work but I tried with JQuery $.ajax, $.get, $.post but nothing work...

Read the article
PASS Summit 2010 Presentation Feedback

- by andyleonard

Introduction It's always an honor to present anywhere. Presenting at the PASS Summit is a special honor. I delivered three presentations last month: Database Design for Developers SSIS Design Patterns, Part 2 A Lightning Talk on SSIS Database Design for Developers First, a bit of explanation (defense): I submitted this abstract to the PASS Abstracts folks by mistake . I kid you not. Inspired by Adam Machanic ( Blog | @AdamMachanic ) I maintain a document of current presentations. I've recently published...(read more)

Read the article
Taking a Projects Development to the Next Level

- by user1745022

I have been looking for some advice for a while on how to handle a project I am working on, but to no avail. I am pretty much on my fourth iteration of improving an "application" I am working on; the first two times were in Excel, the third Time in Access, and now in Visual Studio. The field is manufacturing. The basic idea is I am taking read-only data from a massive Sybase server, filtering it and creating much smaller tables in Access daily (using delete and append Queries) and then doing a bunch of stuff. More specifically, I use a series of queries to either combine data from multiple tables or group data in specific ways (aggregate functions), and then I place this data into a table (so I can sort and manipulate data using DAO.recordset and run multiple custom algorithms). This process is then repeated multiple times throughout the database until a set of relevant tables are created. Many times I will create a field in a query with a value such as 1.1 so that when I append it to a table I can store information in the field from the algorithms. So as the process continues the number of fields for the tables change. The overall application consists of 4 "back-end" databases linked together on a shared drive, with various output (either front-end access applications or Excel). So my question is is this how many data driven applications that solve problems essentially work? Each backend database is updated with fresh data daily and updating each takes around 10 seconds (for three) and 2 minutes(for 1). Project Objectives. I want/am moving to SQL Server soon. Front End will be a Web Application (I know basic web-development and like the administration flexibility) and visual-studio will be IDE with c#/.NET. Should these algorithms be run "inside the database," or using a series of C# functions on each server request. I know you're not supposed to store data in a database unless it is an actual data point, and in Access I have many columns that just hold calculations from algorithms in vba. The truth is, I have seen multiple professional Access applications, and have never seen one that has the complexity or does even close to what mine does (for better or worse). But I know some professional software applications are 1000 times better then mine. So Please Please Please give me a suggestion of some sort. I have been completely on my own and need some guidance on how to approach this project the right way.

Read the article
Auditing DDL Changes in SQL Server databases

Even where Source Control isn't being used by developers, it is still possible to automate the process of tracking the changes being made to a database and put those into Source Control, in order to track what changed and when. You can even get an email alert when it happens. With suitable scripting, you can even do it if you don't have direct access to the live database. Grant shows how easy this is with SQL Compare.

Read the article
Help with DB Structure, vOD site

- by Chud37

I have a video on demand style site that hosts series of videos under different modules. However with the way I have designed the database it is proving to be very slow. I have asked this question before and someone suggested indexing, but i cannot seem to get my head around it. But I would like someone to help with the structure of the database here to see if it can be improved. The core table is Videos: ID bigint(20) (primary key, auto-increment) pID text airdate text title text subject mediumtext url mediumtext mID int(11) vID int(11) sID int(11) pID is a unique 5 digit string to each video that is a shorthand identifier. Airdate is the TS, (stored in text format, right there maybe I should change that to TIMESTAMP AUTO UPDATE), title is self explanatory, subject is self explanatory, url is the hard link on the site to the video, mID is joined to another table for the module title, vID is joined to another table for the language of the video, (english, russian, etc) and sID is the summary for the module, a paragraph stored in an external database. The slowest part of the website is the logging part of it. I store the data in another table called 'Hits': id mediumint(10) (primary key, auto-increment) progID text ts int(10) Again, here (this was all made a while ago) but my Timestamp (ts) is an INT instead of ON UPDATE CURRENT TIMESTAMP, which I guess it should be. However This table is now 47,492 rows long and the script that I wrote to process it is very very slow, so slow in fact that it times out. A row is added to this table each time a user clicks 'Play' on the website and then so the progID is the same as the pID, and it logs the php time() timestamp in ts. Basically I load the entire database of 'Hits' into an array and count the hits in each day using the TS column. I am guessing (i'm quite slow at all this, but I had no idea this would happen when I built the thing) that this is possibly the worst way to go about this. So my questions are as follows: Is there a better way of structuring the 'Videos' table, is so, what do you suggest? Is there a better way of structuring 'hits', if so, please help/tell me! Or is it the fact that my tables are fine and the PHP coding is crappy?

Read the article

Search Results

Search found 45150 results on 1806 pages for 'oracle technology database'.

Page 394/1806 | < Previous Page | 390 391 392 393 394 395 396 397 398 399 400 401 | Next Page >

- by Liu Maclean(???)

- by FransBouma

- by jean-pierre.dijcks

- by FransBouma

- by Michael Hylton

- by AaronBertrand

- by Jonathan Kehayias

- by Ieyasu Sawada

- by Grofit

- by Jeffrey McDaniel

- by Kerrie Foy

- by Maria Colgan

- by Michal

- by user1277304

- by Ashok_Ora

- by tulipelle

- by Courtney Llamas

- by user1277304

- by Mr Prolog

- by Frederick Marcoux

- by andyleonard

- by user1745022

- by Chud37

< Previous Page | 390 391 392 393 394 395 396 397 398 399 400 401 | Next Page >